Skip to content

Conversation

@jameshadfield
Copy link
Member

@jameshadfield jameshadfield commented Aug 19, 2024

The provision-users command requires us to provide both a username and email for each user we want to add. A common use case is being given a (long) list of emails and wanting to check if a user already exists for each email, which is exactly what this script does.

Once this PR is merged i'll add instructions to the wiki docs

@nextstrain-bot nextstrain-bot temporarily deployed to nextstrain-s-james-grou-fhxfbv August 19, 2024 02:24 Inactive
@victorlin
Copy link
Member

This seems useful! Instead of a separate script, how about adding it as a validation step within the provision-group script? Example: 434dff7

$ ./scripts/provision-group test --members new-members.yaml

PROGRAMMING ERROR: Caught unhandled rejection of 1 promise:
  Promise {
    <rejected> Error: Email [email protected] doesn't yet exist but the username 'victorlin-test' is already associated with '[email protected]'
        at queryUsernames (file:///Users/vlin/Dropbox/repos/nextstrain/nextstrain.org/scripts/provision-group.js:487:17)
        at readMembersFile (file:///Users/vlin/Dropbox/repos/nextstrain/nextstrain.org/scripts/provision-group.js:123:3)
        at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
        at async main (file:///Users/vlin/Dropbox/repos/nextstrain/nextstrain.org/scripts/provision-group.js:69:7)
  }

@jameshadfield
Copy link
Member Author

Instead of a separate script, how about adding it as a validation step within the provision-group script?

There's a few reasons I chose not to go down this road. My script needs to list all the cognito users (as you can't query by email, AFAIK), which is unnecessary overhead for most use cases of provision-group. Secondly, I allow the username to be left out, whereas provision-group enforces it (could be fixed). Thirdly, I wanted to use the script with confidence no changes would be made on the cognito side; I sketched out adding --dry-run to provision-group (which I think would be a very good thing) but ultimately found it easier to make a separate script. Finally I deemed the bar for addition to provision-group to be higher than a script like this (ensuring all exceptions are handled etc), as their purposes and importance are quite different. If you want to take this direction on then by all means please do - there are a few emails with multiple usernames in the production pool, which I don't think is ideal.

Copy link
Member

@victorlin victorlin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. I only checked against the testing user pool so I didn't realize the overhead on the production user pool. I understand your decision better now and feel less strongly about adding to provision-group.

Thread for tangential discussion below.

Copy link
Member

@victorlin victorlin Aug 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are a few emails with multiple usernames in the production pool, which I don't think is ideal.

The ability to do so seems intentional. Skimming through the code, everything is keyed on username (also apparent on https://nextstrain.org/login). Many services allow multiple accounts for a single email. This can be useful for things such as one person having separate accounts for different levels of access.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - for sure cognito is keyed off username, and by extension nextstrain is as well.

This can be useful for things such as one person having separate accounts for different levels of access

I guess so? Outside of testing purposes I don't quite see the point of one person maintaining different access levels for the same group, but I'm sure there are use cases.

In the few examples I've seen (and thus anecdotal) I think we've created a username/email combo for one group, then for a subsequent group been given a list of emails and used the provision-groups script to add a new username for the same email. From the users perspective that's not ideal (need to login with different usernames to see different groups), but for the users in question I believe their usage is very low at the moment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Example use case for multiple usernames under the same email.

I don't doubt that some of the existing many-to-one username/email instances are not ideal, so maybe it'd be good to print a warning such as

WARNING: Email <email> exists with username(s) <name> and you are asking for the new username <name> to be created. Confirm that this is intentional.

In the future when user creation is self-service, it would be good to add an extra confirmation step to this scenario.

@tsibley
Copy link
Contributor

tsibley commented Aug 20, 2024

My script needs to list all the cognito users (as you can't query by email, AFAIK),

You can absolutely search by email, e.g.
https://docs.aws.amazon.com/cognito/latest/developerguide/how-to-manage-user-accounts.html#manage-user-accounts-searching-user-attributes

Copy link
Member

@victorlin victorlin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just used this for 88c3092 – very useful! I'm good for this to be merged pending one small change below.

It would be nice to run this automatically as part of validation in provision-group. But given the long-term plan to allow self-service, maybe no need to invest much additional effort here.

@victorlin
Copy link
Member

victorlin commented Nov 14, 2024

Bump: It's been helpful to run this while handling Groups requests. Currently I switch to the PR branch just to run it.

Edit: also used for 11d083d

@victorlin
Copy link
Member

victorlin commented Apr 8, 2025

Bump: I'm now doing this

git restore -s origin/james/groups-helper-script scripts/provision-group-check-users.js
chmod a+x scripts/provision-group-check-users.js
AWS_PROFILE=victorlin-admin-read-only CONFIG_FILE=env/production/config.json ./scripts/provision-group-check-users.js --members members.yaml

@jameshadfield
Copy link
Member Author

jameshadfield commented Nov 11, 2025

I moved this forward today with three major improvements:

We now check the roles we're attempting to add against an existing users roles, which helps correctly identify no-op situations and (more importantly) avoids the previous issue where a prospective change in user role wouldn't be considered and so the script would say "no changes needed here" (or similar)

The verbosity is also increased. I find this really helpful when I'm only working on this part of the ecosystem from time to time.

Finally we now check emails against other emails in a case-insensitive way which closes #1258

P.S. If you run this against the testing pool you'll see I've added a "duplicate" email with different casing. I'll remove this when I merge this PR.

@jameshadfield
Copy link
Member Author

Comment on lines 179 to 181
for (const [_key, emails] of Object.entries(db)) {
if (emails.length>1) {
console.log(`WARNING: Multiple emails with different casings present in pool: ${emails.join(', ')}`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this warning is necessary – it seems out of scope for this script which should focus on just the entries in the given members YAML file. For entries in the file that have multiple casings, those are already reported later which should be sufficient:

console.log(`\tWARNING: The provided email ${member.email} wasn't an exact match but there are` +
`multiple different (case-sensitive) matches: ${emailMatch.candidates.join(", ")}`)
console.log(`\tACTION: try again with a case-corrected email or (better) unify these various emails before proceeding`)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get this, but assuming (!) we don't have any ~duplicates at the moment (or if we did then we can sort them out ASAP) then I think informing the developer who's in this part of the ecosystem that there are some other issues is helpful as it may prompt them to sort them out while they're here. Ideally it'll never happen, of course. But if you think it's too annoying then I can drop it...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked – we don't have any duplicates in the production user pool at the moment. With the new warning when adding them, duplicates that make it into the user pool would likely be intentional, leading to noisy warnings whenever provisioning members for other groups. While I don't see the benefit, I'm fine with keeping it if you would find it beneficial.

The other motivation, though not as important, is simpler code as sketched out in #987 (comment).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went a different direction and this ended up being dropped, but

duplicates that make it into the user pool would likely be intentional

I don't think we should ever have duplicates-which-differ-by-case-only, either for email or username. I can see the value in multiple users with the same email (e.g. as we do this in our testing pool), and so we don't loudly warn against this anymore.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I don't think case-insensitive duplicates should be allowed. The warnings in script are a good step towards avoiding them, but I think the proper way would be to fail instead of creating the duplicates as part of scripts/provision-group.

* candidates: string[] cognito URLs which are identical to the provided email
* except for potential different casing
*/
function searchEmails(emailDb, email) {
Copy link
Member

@victorlin victorlin Nov 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the warning that's given as a byproduct of creating emailDb is not needed (#987 (comment)), the code could be simplified to remove the entire emailDb intermediate data structure and instead do case-insensitive searches on-the-fly. Example: a1a599f

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion -- went more in this direction :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(non-blocking) Another follow-up suggested simplification: 9332153

The provision-users command requires us to provide both a username and
email for each user we want to add. Commonly we don't know the username
for a given email, so one usage of this script is designed around
listing existing users (and their roles) for a given email. A second
use of the script is cross-referencing the proposed username/email/role
against the existing pool.

Note that emails & usernames in our pools are _case sensitive_
<#1258> which adds
a lot of complexity to this script, as one aim is to avoid us
provisioning new users which differ from an existing user only by case.

The output is perhaps overly verbose, but given the frequency we add
users I find this verbosity clarifying rather than annoying.
@jameshadfield jameshadfield force-pushed the james/groups-helper-script branch from dfa4982 to c6db4d5 Compare November 13, 2025 22:43
A YAML file describing the members to add to the Group.
The file must be an array of objects (i.e. dict/map/hash).
Each object must contain an "email" key which is case-sensitive.
The "username" key is optional for this script, but is required when you actually create provision the group membership.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why allow "username" to be optional in this script? I think it should be required to match the other script, and also simplify the code by removing the extra code paths that handle its absence.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original motivation for this script was the common case of being given an email to add to a group and no username. In fact, I think this was the situation every time I fielded a request to add a new "user" to a group.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh... for requests without usernames I've always taken it from the email, but that's probably just me. We can keep both code paths!

* candidates: string[] cognito URLs which are identical to the provided email
* except for potential different casing
*/
function searchEmails(emailDb, email) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(non-blocking) Another follow-up suggested simplification: 9332153

*/
async function crossReferenceUsernameAndEmail(group, member, usersByEmail, usersByUsername) {
console.log()
console.log(`Membership request for user "${member.username}" (${member.email}). Proposed role: ${member.role}`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally on board with the level of verbosity here. I think it can be reworded for clarity. Suggestion:

from

Membership request for user "testUser2" ([email protected]). Proposed role: owners
        No matching usernames in the pool (case insensitive) but the email did match.
        There is one email which matches exactly. Associated users:
                testUser (existing group role: owners)
        ACTION: Use a username from above and cross-reference with existing roles.

to

{ username: 'testUser2', email: '[email protected]', role: 'test/owners' }
    • Username is unique
    • Email matches 1 existing user:
          { username: 'testUser', email: '[email protected]', role: 'test/owners' }
    ↳ ACTION: Use a username from above and cross-reference with existing roles.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can change the formatting to be fancier, but I don't find this clearer, e.g. "Username is unique" vs "No matching usernames in the pool (case insensitive)" so I'll put my own twist on it

Comment on lines +226 to +230
const emailMsg = cognitoUsername===member.email ?
'(exact match)' :
cognitoUsername.toUpperCase()===member.email.toUpperCase() ?
'(case differs)' :
'(not a match with the provided email)';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: username compared against email

Suggested change
const emailMsg = cognitoUsername===member.email ?
'(exact match)' :
cognitoUsername.toUpperCase()===member.email.toUpperCase() ?
'(case differs)' :
'(not a match with the provided email)';
const emailMsg = associatedEmail===member.email ?
'(exact match)' :
associatedEmail.toUpperCase()===member.email.toUpperCase() ?
'(case differs)' :
'(not a match with the provided email)';

@victorlin victorlin linked an issue Nov 14, 2025 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Case sensitivity in nextstrain.org usernames and emails

6 participants