Identity dupe resolution guide first draft #29308

Draft
wants to merge 1 commit into
base: main
Member

@banks banks commented Jan 7, 2025

Description

First draft of the Identity duplicate resolution docs. The features described are not all merged yet, but this document is so important that we want to start getting feedback on it ASAP.

TODO only if you're a HashiCorp employee

  • Backport Labels: If this fix needs to be backported, use the appropriate backport/ label that matches the desired release branch. Note that in the CE repo, the latest release branch will look like backport/x.x.x, but older release branches will be backport/ent/x.x.x+ent.
    • LTS: If this fixes a critical security vulnerability or severity 1 bug, it will also need to be backported to the current LTS versions of Vault. To ensure this, use all available enterprise labels.
  • ENT Breakage: If this PR either 1) removes a public function OR 2) changes the signature
    of a public function, even if that change is in a CE file, double check that
    applying the patch for this PR to the ENT repo and running tests doesn't
    break any tests. Sometimes ENT only tests rely on public functions in CE
    files.
  • Jira: If this change has an associated Jira, it's referenced either
    in the PR description, commit message, or branch name.
  • RFC: If this change has an associated RFC, please link it in the description.
  • ENT PR: If this change has an associated ENT PR, please link it in the
    description. Also, make sure the changelog is in this PR, not in your ENT PR.

@banks banks added this to the 1.19.0-rc milestone Jan 7, 2025
@github-actions github-actions bot added the hashicorp-contributed-pr If the PR is HashiCorp (i.e. not-community) contributed label Jan 7, 2025
@digital-content-events

📄 Content Checks

Updated: Tue, 07 Jan 2025 17:04:30 GMT

Found 3 error(s)

content/docs/upgrading/identity-deduplication.mdx

| Position | Description | Rule |
| --- | --- | --- |
| 137:5-138:117 | Unexpected fully-qualified link to developer.hashicorp.com: https://developer.hashicorp.com/vault/api-docs/secret/identity/entity-alias#delete-entity-alias-by-id. Replace with a relative path internal to Developer. Possibly: /vault/api-docs/secret/identity/entity-alias#delete-entity-alias-by-id. | ensure-valid-link-format |
| 233:12-234:22 | Unexpected folder-relative link found: . Ensure this link is an absolute Developer path. | ensure-valid-link-format |

content/docs/upgrading/upgrade-to-1.19.x.mdx

| Position | Description | Rule |
| --- | --- | --- |
| 58:55-58:75 | Unexpected folder-relative link found: . Ensure this link is an absolute Developer path. | ensure-valid-link-format |

github-actions bot commented Jan 7, 2025

CI Results:
All Go tests succeeded! ✅


# Vault identity duplicate resolution guide

Some users may have duplicate identity resources (entities, aliases, groups) in

Collaborator
@digivava digivava Jan 7, 2025

How about linking the word "identity" to the introductory identity docs page, so that people who come to this page from the 1.19 upgrade guide and aren't very familiar with entities/aliases/groups can do some initial reading to remind themselves what these things are first before they continue.

whether this de-duplication is required. No behavior will change until the
operator activates the relevant flag via the API.

To understand if you need to take action, consult the [resolution guide]() which

Collaborator

I assume this is meant to link to the other page and you mean to fill it in later, but here's a comment so you don't forget!

@@ -42,6 +42,23 @@ based on the table below.
| CE | true | any value other than sha2-512 | An error is returned | Pure Ed25519 |
| CE | true | sha2-512 | An error is returned (not supported on CE) | Pure Ed25519 |

### Identity System Duplicate Cleanup

**Users should review their server logs after upgrading to see if identity

Collaborator

You explained the 1.19 changes/recommendations in this section in a simple and concise way, great job! 😀

Only recommendation I have is that I don't think we need this bold part because we already essentially say the same thing down below in the "also includes improved reporting" paragraph, or if you want to keep it, at least just moving it to inside or after that paragraph, so all mention of the server logs is in the same place.

Otherwise the bolded line coming first kind of looks like we're saying we've added identity duplicates to 1.19 as some kind of regression.

Collaborator
@digivava digivava Jan 8, 2025

Y'know, I wrote this comment at the beginning of my review, and now I'm seeing it again after having read the rest of the doc, and I'm not sure I agree with my own recommendation. Choose what makes the most sense to you!


Some users may have duplicate identity resources (entities, aliases, groups) in
their cluster's storage due to historical bugs. These can cause unexpected
behavior as they are outside of Vault's typical expectations and test scenarios.

Collaborator

nit: clarity around "these" and "they" here--maybe:

"Having duplicates can cause unexpected behavior, as Vault expects there to be just one of an entity, alias, or group with a given name."

To identify whether your cluster has duplicates you need to follow these steps:

1. Ensure you have upgraded to Vault 1.19.0 or higher on all cluster nodes.
2. Check the Vault logs on the active node and locate it's last unseal operation between the following log lines:

Collaborator

it's -> its

```

**If you don't see this line, you have no further action to take on this
cluster**. Ensure you repeat this process on any Performance Replication

Collaborator
@digivava digivava Jan 7, 2025

great callout! PR clusters would be easy to miss


1. Ensure all clusters (primary and secondaries) are upgraded to 1.19.0 or
higher.
1. On the primary, address each duplicate reported using the type-specific

Collaborator
@digivava digivava Jan 8, 2025

nit: just because using "type" without a qualifier in the land of computers can be kinda unclear, I might slightly revise this to "... address each reported duplicate using the guidance specific to each duplicate type below"

(in this and the other spots that phrasing is used -- I won't be offended if you think "type-specific" is clear enough though)

1. On the primary, address each duplicate reported using the type-specific
guidance below.
2. On any Performance Replication secondaries, address each _local alias_ duplicate reported using the guidance below.
3. Activate the `force-identity-deduplication` activation flag.

Collaborator

Maybe you were already going to do this and just hadn't yet, but I recommend linking this to the Activating the flag heading you've got below so there's an immediate direction after people read it and go "wha, how do I do that though!"


```
[WARN] identity: 2 different-case entity alias duplicates found (potential security risk)
[WARN] identity: entity-alias "alias-case" with mount accessor "auth_userpass_34aca7ec" duplicates 1 others: id=df3568a4-3b65-4104-9481-1129ecbed72f canonical_id=5f013d99-a6c7-9a00-6ad5-4ad724b14f60 force_deduplication="would merge into entity 7da76b0d-fe9b-a125-3362-2a8ff055dcf8"

Collaborator

Just a question: Feel free to take this offline as it's not related to this PR and probably more to my overall noobery, but I'm just curious what a canonical ID is vs the regular ID. In this example log line I was first trying to find the corresponding entity 7da7 and didn't realize it was the canonical_id I was supposed to be looking at, not the id.

Is there a way for users to query identity resources by canonical ID to inspect them, and see "aha, they're right, it's merged"? Or can you only query by id? I'm just curious from a UX perspective.
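
For anyone triaging these warnings in bulk, here is a minimal sketch of pulling the fields out of the per-duplicate line quoted above. The regular expression is inferred from that one sample line, so the field layout is an assumption that may not hold for every duplicate type or release. `parse_duplicate_line` is a hypothetical helper, not part of Vault.

```python
import re

# Pattern for the per-duplicate warning line quoted in this thread. The field
# layout is inferred from that single sample and is an assumption.
DUPLICATE_LINE = re.compile(
    r'identity: (?P<kind>[\w-]+) "(?P<name>[^"]+)" '
    r'with mount accessor "(?P<accessor>[^"]+)" '
    r'duplicates (?P<count>\d+) others: '
    r'id=(?P<id>\S+) canonical_id=(?P<canonical_id>\S+) '
    r'force_deduplication="(?P<action>[^"]*)"'
)


def parse_duplicate_line(line):
    """Return the fields of one duplicate report as a dict, or None."""
    match = DUPLICATE_LINE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["count"] = int(fields["count"])
    return fields


# The sample line from the diff above, split only for readability.
sample = (
    '[WARN] identity: entity-alias "alias-case" with mount accessor '
    '"auth_userpass_34aca7ec" duplicates 1 others: '
    'id=df3568a4-3b65-4104-9481-1129ecbed72f '
    'canonical_id=5f013d99-a6c7-9a00-6ad5-4ad724b14f60 '
    'force_deduplication="would merge into entity '
    '7da76b0d-fe9b-a125-3362-2a8ff055dcf8"'
)
report = parse_duplicate_line(sample)
print(report["name"], report["id"], report["canonical_id"], report["action"])
```

Printing `id`, `canonical_id`, and the `force_deduplication` text side by side makes it easier to see which identifier each part of the message refers to. Summary lines such as the "2 different-case entity alias duplicates found" line do not match the pattern and return `None`.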

1. Address each duplicate reported using the type specific guidance below.
2. Activate the `force-identity-deduplication` activation flag.

### Resolving different-case entity alias duplicates

Collaborator

My guess is no, but is there not a way to make these h3-level headings show up on the right navigation sidebar?

Collaborator
@digivava digivava Jan 8, 2025

I'll defer to what @schavis thinks but I think given the length of this page and that there's so many headings, most of which are worth being aware of and able to jump to quickly, I'd recommend doing the unorthodox thing of bumping the h2s to h1s, and h3s to h2s.

avoid accidentally merging two entities that might not be the same user.

<Note> These are the most critical duplicates to resolve because there may be
security impact when enabling the `force-identity-deduplication` flag. </Note>

Collaborator

As in, because of potential increases in permissions that were given to one entity's policy but not the other? Is it worth spelling out somewhere (maybe we do, still reading) that activating the flag should be paired with a review of the policies that apply to the detected duplicates shown in the server logs?

Member Author

Yeah, this whole guide is basically trying to say what you said? The overview of the process at the start was my attempt at introducing the idea that you need to review the report first before using the feature but if that wasn't clear enough then I'll take another look.




#### Resolving templated ACL policies

Collaborator
@digivava digivava Jan 8, 2025

Are this and the Terraform section only relevant to the previous section, and that's why it's a subsection? If it's relevant to all duplicate types, it may be best they're the same level of heading as the other "Resolving" ones.

Member Author

It was meant to be a subsection since this is the "detail" part of the more general overview directly above for those who are impacted... but open to feedback - the point you made about h3 headings not showing in the overview was something I noticed too and didn't love.

the cluster that reports them is a Performance Replication secondary, you will
need to perform any resolution steps necessary against that cluster and not the
Primary.

Collaborator
@digivava digivava Jan 8, 2025

What happens to same-case entity alias duplicates? I don't see a section for that. Is it so simple that it can be covered right away before jumping into all these different-case scenarios? Or is there no such thing?

Member Author

Yeah, I left it out because it's nuanced: basically those are already auto-merged. I've alluded to the alias merging in a few places but was kinda trying to avoid spelling out the complicated history of that here for brevity. I guess it's a natural question though, so I'll see if I can work in a simple statement to indicate that they won't be found, without going into the details.
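
To illustrate the distinction discussed here, a small sketch of what "different-case" means in practice: names under the same mount accessor that collide only when compared case-insensitively. This is not Vault's implementation. The function name and the `Alias-Case` and `bob` entries are made up for illustration, and the accessor is borrowed from the log sample earlier in the thread.

```python
from collections import defaultdict


def different_case_duplicates(aliases):
    """Group (mount_accessor, name) pairs whose names differ only by case."""
    groups = defaultdict(set)
    for accessor, name in aliases:
        # Key on the lowercased name so spellings that differ only by case collide.
        groups[(accessor, name.lower())].add(name)
    # Keep only keys that were reached through more than one distinct spelling.
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


# Hypothetical alias records: (mount accessor, alias name).
aliases = [
    ("auth_userpass_34aca7ec", "alias-case"),
    ("auth_userpass_34aca7ec", "Alias-Case"),
    ("auth_userpass_34aca7ec", "bob"),
    ("auth_userpass_34aca7ec", "bob"),  # same-case repeat, not flagged here
]
print(different_case_duplicates(aliases))
```

Only the `alias-case` / `Alias-Case` pair is flagged. The repeated `bob` entries share one spelling, so this sketch does not flag them, in line with the reply above that same-case alias duplicates are already auto-merged.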

* Only one of them will be returned when looking up or listing by name
* Before Vault 1.19.0, which one is returned might vary depending on which
server is hit or even after a seal and unseal on the same server
* Lookup by ID _will_ work for all duplicates

Collaborator

Why is "will" italicized here?

Member Author

Originally because this is a list of "unexpected behavior", and this item on its own is not unexpected; it's more meant to clarify, since the other forms of lookup don't work as expected. I think it might be better just to leave it out though.

* Before Vault 1.19.0, which one is returned might vary depending on which
server is hit or even after a seal and unseal on the same server
* Lookup by ID _will_ work for all duplicates
* Listing by IS will return _all_ duplicates

Collaborator
@digivava digivava Jan 8, 2025

What is IS in this context? Is this a typo for ID?

Member Author

A typo for ID 😄

will be unaffected and continue to work the same way. The exceptions are
noted below.

There are two know edge cases that you should be aware of before activating the

Collaborator

know -> known

the resource is renamed. The same could apply to other external references to
a named entity or group that use the name-based API methods to read or
manage. [Resolution details are
provided below.]()

Collaborator

link missing to the section below

In practice this is overwhelmingly unlikely to result in unintended access being
granted to an existing resource since the new name contains a UUID which should
never collide with another resource path. The rename _could_ potentially cause
the entity to _loose_ access to a resource they previously had access to

Collaborator

loose -> lose

After activating the `force-identity-deduplication` flag, one of these entities
will no longer have the specified name and so Terraform will attempt to rename
it and be unable to because that violates the case-insensitive name constraint
that has be re-introduced. An apply would result in an error such as:

Collaborator

"that has been reintroduced in more recent versions of Vault", right? (for clarity and a typo there)

Member Author

Good call! In this case it means "reintroduced when you activated the flag because that's what this flag does". But agree that's not obvious and I'll call it out.

or group by name. Simply changing the reference to use the new name is the
simplest resolution.

### Activating the `force-identity-deduplication` flag

Collaborator

I know it was said somewhere way up above, but once you write this part I think it might be good to make sure there's another warning here that this is a one-time thing, so they should review the resolution information for their relevant duplicate types above before activating the flag.
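
As a placeholder until this section is written, a hedged sketch of what the activation call might look like over HTTP. The endpoint path is an assumption based on the existing activation-flags API shape and should be checked against the final API docs. The address and token are hypothetical, and `build_activation_request` is an illustrative helper. The request is only built, not sent, because activation is described in this thread as a one-time step to take only after reviewing the reported duplicates.

```python
import urllib.request

# Assumed endpoint path for the activation flag; confirm against the API docs.
ACTIVATE_PATH = "/v1/sys/activation-flags/force-identity-deduplication/activate"


def build_activation_request(vault_addr, token):
    """Build (but do not send) the HTTP request that would activate the flag."""
    return urllib.request.Request(
        url=vault_addr.rstrip("/") + ACTIVATE_PATH,
        method="PUT",
        headers={"X-Vault-Token": token},
    )


# Hypothetical address and token, for illustration only.
request = build_activation_request("https://vault.example.com:8200", "s.example")
print(request.get_method(), request.full_url)
# Sending is left as a deliberate manual step, e.g. urllib.request.urlopen(request).
```

Keeping the build and send steps separate makes it harder to activate the flag by accident while testing a script.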
