Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correct issues and bugs with neighbor traversal #2020

Merged
merged 1 commit into from
Nov 2, 2023
Merged

Conversation

mtdowling
Copy link
Member

This commit fixes a few fundamental issues with how neighbors and model graph traversal was implemented. These changes are bug fixes and correctness fixes, though they could potentially impact existing selectors of a very small set of users (e.g., there is no known impact to Amazon's very large corpus of Smithy models).

The theme of this change is more accurately exposing the Smithy model as a graph, making graph traversal unambiguous. There are a few major issues addressed in this CR.

  1. The bound relationship is no longer emitted or documented. It can still be used if referred to explicitly in a directed neighbor selector.
  2. instanceOperation relationship is now "synthetic", deprecated, and not documented. It is no longer emitted from a NeighborVisitor but can be used if referred to explicitly in a directed neighbor selector.
  3. The operation relationship now only includes operations bound to resources and service via the operations property.
  4. The collectionOperation relationship now only includes operations bound to a resource vai the collectionOperations property.
  5. Recursive neighbor traversal from a member no longer includes the containing shape.

Further background and justification can be found below.

Bound relationships are confusing and buggy

Smithy today documents a “bound” relationship as:

resource → bound: The service or resource to which the resource is bound.
operation → bound: The service or resource to which the operation is bound.

However, the implementation of bound is inconsistent and buggy, and worse, the idea itself is confusing: there should be no relationship from a shape to shapes that refer to it. The only reason the “bound” relationship exists is that we added it before reverse neighbor selectors were added to Smithy.

Consider the following model that is commented with each neighbor we previously emitted:

$version: "2.0"

namespace smithy.example

// [Relationship shape="smithy.example#Foo" type="RESOURCE" neighbor="smithy.example#Bar"]
// [Relationship shape="smithy.example#Bar" type="BOUND" neighbor="smithy.example#Foo"]
service Foo {
    resources: [Bar]
}

// [Relationship shape="smithy.example#Bar" type="OPERATION" neighbor="smithy.example#GetBar"]
// [Relationship shape="smithy.example#Bar" type="READ" neighbor="smithy.example#GetBar"]
// [Relationship shape="smithy.example#Bar" type="INSTANCE_OPERATION" neighbor="smithy.example#GetBar"]
// [Relationship shape="smithy.example#GetBar" type="BOUND" neighbor="smithy.example#Bar"]
// [Relationship shape="smithy.example#Foo" type="RESOURCE" neighbor="smithy.example#Bar"]
// [Relationship shape="smithy.example#Bar" type="BOUND" neighbor="smithy.example#Foo"]
resource Bar {
    read: GetBar
}

// [Relationship shape="smithy.example#GetBar" type="INPUT" neighbor="smithy.example#GetBarInput"]
@readonly
operation GetBar {
    // [Relationship shape="smithy.example#GetBarInput" type="STRUCTURE_MEMBER" neighbor="smithy.example#GetBarInput$greeting"]
    input := {
        // [Relationship shape="smithy.example#GetBarInput$greeting" type="MEMBER_CONTAINER" neighbor="smithy.example#GetBarInput"]
        // [Relationship shape="smithy.example#GetBarInput$greeting" type="MEMBER_TARGET" neighbor="smithy.example#Greeting"]
        greeting: Greeting
    }
}

// No relationships
string Greeting

Let’s walk through each shape and document what the relationships currently are, what they should be according to documentation, and what they are after removing BOUND.

Previously, we emitted:

  • Foo -[resource]→ Bar
    • This is correct.
  • Bar -[bound]→ Foo
    • This is wrong in both docs and the ideal. Emitting the bound relationship here is wrong since the documentation states it’s between the resource and a service, so looking at relationships from a service should not emit BOUND.

This shape now only emits:

  • Foo -[resource]→ Bar

Previously, we emitted:

  • Bar -[read]→ GetBar
    • This is correct and expected.
  • Bar -[instanceOperation]→ GetBar
    • This is correct, though it should be a “synthetic” relationship that selectors know about but not actually emitted from NeighborVisitor.
  • Bar -[operation]→ GetBar
    • This is correct according to the current documentation, but is confusing because we don’t know if the operation was bound using read or the “operations” property. (see next topic).
  • Foo -[resource]→ Bar
    • This is completely wrong and not supported by documentation. It’s unquestionably a bug.
  • GetBar -[bound]→ Bar
    • This is wrong since according to the documentation, this relationship should come from GetBar not Bar.
  • Bar -[bound]→ Foo
    • This is correct according to the documentation, but shouldn’t be emitted at all ideally since this shape has no defined edges from it to Foo and these kinds of inverted edges are confusing.

This shape now only emits:

  • Bar -[read]→ GetBar

Today, we emit (and continue to emit after this change):

  • GetBar -[input] → GetBarInput

According to documentation, we should emit:

  • GetBar -[input]→ GetBarInput
  • GetBar -[bound] → Bar

However, given bound relationships are buggy and counter-intuitive, we will continue to not emit a bound relationship here.

Because these neighbors are implemented so inconsistently, using them with selectors is confusing and buggy to the point of the feature being unusable.

This selector works as expected and sees that GetBar is connected to Bar using a READ relationship.

operation <

Returns:

  • smithy.example#Bar

The documentation says that operations have a BOUND relationship to any resource or service they are bound to, but that doesn't work today:

operation -[bound]->

Returns: []

operation <-[bound]-

Returns []

If we check if a resource has a bound relationship from the operation, it surprisingly works. This is very wrong considering there's no edge found from the operation to the resource, but the resource can find the edge (this is due to not implementing the reverse lookups in neighbor provider).

resource <-[bound]-

Returns:

  • smithy.example#GetBar

Strangely, if we check the bound relationships from resources, we get the resource itself back:

resource -[bound]->

Returns:

  • smithy.example#Bar
    • No resource is bound to another resource in this model, so that's clearly wrong. What's happening here is that due to a bug, the resource emits a bound relationship back to itself.
  • smithy.example#Foo

Just getting all edges using ">" and "<" returns the same results:

operation >

Incorrectly (according to docs previously) returns:

  • smithy.example#GetBarInput
operation <

Correctly (according to the docs previously) returns:

  • smithy.example#Bar

However, getting all directed edges of a resource is strange:

resource >

Returns:

  • smithy.example#Bar
    • Bad, and due to the BOUND edge from the service to the resource
  • smithy.example#Foo
    • Strange, given the resource doesn't have a direct edge to the service other than BOUND.
  • smithy.example#GetBar

Getting even weirder due to bugs in the current implementation is getting reverse edges of a resource:

resource <

Returns:

  • smithy.example#Foo (good, the service refers to it)
  • smithy.example#GetBar (bad, because there is no corresponding edge when you get directed neighbors from GetBar)

Making this all more confusing, using a Walker will include service shapes a resource is bound to:

walker.walkShapeIds(model.expectShape(ShapeId.from("smithy.example#Bar")))

Or, using a selector:

resource ~>

Returns:

  • smithy.example#Foo (bad since it traversed up)
  • smithy.example#Bar ✅ (walker always includes the given shape)
  • smithy.example#GetBar ✅
  • smithy.example#GetBarInput ✅
  • smithy.example#GetBarInput$greeting ✅
  • smithy.example#Greeting ✅

However, if we check GetBar, we don't crawl up to the resource! (this is actually better behavior but different from the above).

operation ~>

Returns:

  • smithy.example#GetBar
  • smithy.example#GetBarInput
  • smithy.example#GetBarInput$greeting
  • smithy.example#Greeting

Operation relationships were ambiguous

When it comes to traversing the graph of shapes to operations, the current implementation leaves things ambiguous. We emit the following kinds of relationships today (omitting BOUND relationships since they’re covered above):

  • Service “operations”
    • Service → OPERATION → Operation shape
  • Resource “operations”
    • Resource → OPERATION → Operation shape
    • Resource → INSTANCE_OPERATION → Operation shape
  • Resource “collectionOperations”
    • Resource → OPERATION → Operation shape
    • Resource → COLLECTION_OPERATION → Operation shape
  • Resource “create”
    • Resource → CREATE → Operation shape
    • Resource → COLLECTION_OPERATION → Operation shape
    • Resource → OPERATION → Operation shape
  • Resource “list”
    • Resource → LIST → Operation shape
    • Resource → COLLECTION_OPERATION → Operation shape
    • Resource → OPERATION → Operation shape
  • Resource “put”
    • Resource → PUT → Operation shape
    • Resource → INSTANCE_OPERATION → Operation shape
    • Resource → OPERATION → Operation shape
  • Resource “read”
    • Resource → READ → Operation shape
    • Resource → INSTANCE_OPERATION → Operation shape
    • Resource → OPERATION → Operation shape
  • Resource “update”
    • Resource → UPDATE → Operation shape
    • Resource → INSTANCE_OPERATION → Operation shape
    • Resource → OPERATION → Operation shape
  • Resource “delete”
    • Resource → DELETE → Operation shape
    • Resource → INSTANCE_OPERATION → Operation shape
    • Resource → OPERATION → Operation shape

These relationships are problematic for several reasons:

  • Emitting an OPERATION relationship for every operation makes it impossible to determine how an operation was bound to a resource. Was it bound through a lifecycle? Through “operations”? Through “collectionOperations”?
  • Emitting an INSTANCE_OPERATION relationship is unnecessary and could become a synthetic kind of relationship (that is, rather than actually emit it, just implement special handling for resource -[instanceOperation]->.
  • Emitting COLLECTION_OPERATION for list and create is problematic because you can’t tell if everything returned from resource -[collectionOperation]-> was bound through collectionOperations or not without convoluted :not clauses.

We now only emit the following events from resources and services for operation bindings:

  • Service “operations”
    • Service → OPERATION → Operation shape
  • Resource “operations”
    • Resource → OPERATION → Operation shape
  • Resource “collectionOperations”
    • Resource → COLLECTION_OPERATION → Operation shape
  • Resource “create”
    • Resource → CREATE → Operation shape
  • Resource “list”
    • Resource → LIST → Operation shape
  • Resource “put”
    • Resource → PUT → Operation shape
  • Resource “read”
    • Resource → READ → Operation shape
  • Resource “update”
    • Resource → UPDATE → Operation shape
  • Resource “delete”
    • Resource → DELETE → Operation shape

The "instanceOperation" relationship is now deprecated, no longer emitted, and no longer documented. Because "collectionOperation" is now fixed to only include operations actually bound through "collectionOperations" it would be confusing to have "instanceOperation" act as a synthetic grouping of relationships while "collectionOperation" does not, despite being the other kind of resource binding than instance operations.

You can still refer to it using -[instanceOperation]-> but a relationship isn't emitted via NeighborVisitor.

Recursive member traversal crawled up to its container

When you use a recursive neighbor selector on a member shape, it crawls up to the containing shape. This is wrong because it makes getting the closure of a single member is impossible. For example:

structure Foo {
    bar: String
    baz: Integer
}

Today, using member [id|member = 'bar'] ~> would return:

  • Foo
  • Bar
  • Baz
  • String
  • Integer

The intended result was just String. The previous result was wrong because the edge between the member and its container wasn't filtered out of recursive neighbor traversal, causing all shapes in the containing shape to be returned in the closure.

Issue #, if available:

Description of changes:

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

This commit fixes a few fundamental issues with how neighbors and
model graph traversal was implemented. These changes are bug fixes and
correctness fixes, though they could potentially impact existing
selectors of a very small set of users (e.g., there is no known impact
to Amazon's very large corpus of Smithy models).

The theme of this change is more accurately exposing the Smithy model
as a graph, making graph traversal unambiguous. There are a few major
issues addressed in this CR.

1. The bound relationship is no longer emitted or documented. It can
   still be used if referred to explicitly in a directed neighbor
   selector.
2. instanceOperation relationship is now "synthetic", deprecated, and
   not documented. It is no longer emitted from a NeighborVisitor but
   can be used if referred to explicitly in a directed neighbor selector.
3. The operation relationship now only includes operations bound to
   resources and service via the operations property.
4. The collectionOperation relationship now only includes operations
   bound to a resource vai the collectionOperations property.
5. Recursive neighbor traversal from a member no longer includes the
   containing shape.

Further background and justification can be found below.

**Bound relationships are confusing and buggy**

Smithy today documents a “bound” relationship as:

> resource → bound: The service or resource to which the resource is bound.
> operation → bound: The service or resource to which the operation is bound.

However, the implementation of bound is inconsistent and buggy, and worse,
the idea itself is confusing: there should be no relationship from a shape to
shapes that refer to it. The only reason the “bound” relationship exists is that
we added it before reverse neighbor selectors were added to Smithy.

Consider the following model that is commented with each neighbor we previously
emitted:

```
$version: "2.0"

namespace smithy.example

// [Relationship shape="smithy.example#Foo" type="RESOURCE" neighbor="smithy.example#Bar"]
// [Relationship shape="smithy.example#Bar" type="BOUND" neighbor="smithy.example#Foo"]
service Foo {
    resources: [Bar]
}

// [Relationship shape="smithy.example#Bar" type="OPERATION" neighbor="smithy.example#GetBar"]
// [Relationship shape="smithy.example#Bar" type="READ" neighbor="smithy.example#GetBar"]
// [Relationship shape="smithy.example#Bar" type="INSTANCE_OPERATION" neighbor="smithy.example#GetBar"]
// [Relationship shape="smithy.example#GetBar" type="BOUND" neighbor="smithy.example#Bar"]
// [Relationship shape="smithy.example#Foo" type="RESOURCE" neighbor="smithy.example#Bar"]
// [Relationship shape="smithy.example#Bar" type="BOUND" neighbor="smithy.example#Foo"]
resource Bar {
    read: GetBar
}

// [Relationship shape="smithy.example#GetBar" type="INPUT" neighbor="smithy.example#GetBarInput"]
@readonly
operation GetBar {
    // [Relationship shape="smithy.example#GetBarInput" type="STRUCTURE_MEMBER" neighbor="smithy.example#GetBarInput$greeting"]
    input := {
        // [Relationship shape="smithy.example#GetBarInput$greeting" type="MEMBER_CONTAINER" neighbor="smithy.example#GetBarInput"]
        // [Relationship shape="smithy.example#GetBarInput$greeting" type="MEMBER_TARGET" neighbor="smithy.example#Greeting"]
        greeting: Greeting
    }
}

// No relationships
string Greeting
```

Let’s walk through each shape and document what the relationships currently are,
what they should be according to documentation, and what they are after removing
BOUND.

Previously, we emitted:

- Foo -[resource]→ Bar
  - This is correct.
- Bar -[bound]→ Foo
  - This is wrong in both docs and the ideal. Emitting the bound relationship
    here is wrong since the documentation states it’s between the resource and a
    service, so looking at relationships from a service should not emit BOUND.

This shape now only emits:

- Foo -[resource]→ Bar

Previously, we emitted:

- Bar -[read]→ GetBar
  - This is correct and expected.
- Bar -[instanceOperation]→ GetBar
  - This is correct, though it should be a “synthetic” relationship that
    selectors know about but not actually emitted from `NeighborVisitor`.
- Bar -[operation]→ GetBar
  - This is correct according to the current documentation, but is confusing
    because we don’t know if the operation was bound using read or the
    “operations” property. (see next topic).
- Foo -[resource]→ Bar
  - This is completely wrong and not supported by documentation. It’s
    unquestionably a bug.
- GetBar -[bound]→ Bar
  - This is wrong since according to the documentation, this relationship should
    come from GetBar not Bar.
- Bar -[bound]→ Foo
  - This is correct according to the documentation, but shouldn’t be emitted at
    all ideally since this shape has no defined edges from it to Foo and these
    kinds of inverted edges are confusing.

This shape now only emits:

- Bar -[read]→ GetBar

Today, we emit (and continue to emit after this change):

- GetBar -[input] → GetBarInput

According to documentation, we should emit:

- GetBar -[input]→ GetBarInput
- GetBar -[bound] → Bar

However, given bound relationships are buggy and counter-intuitive,
we will continue to not emit a bound relationship here.

Because these neighbors are implemented so inconsistently, using them with
selectors is confusing and buggy to the point of the feature being unusable.

This selector works as expected and sees that `GetBar` is connected to `Bar`
using a `READ` relationship.

```
operation <
```

Returns:

- smithy.example#Bar

The documentation says that operations have a BOUND relationship to any resource
or service they are bound to, but that doesn't work today:

```
operation -[bound]->
```

Returns: []

```
operation <-[bound]-
```

Returns []

If we check if a resource has a `bound` relationship from the operation, it
surprisingly works. This is very wrong considering there's no edge found from
the operation to the resource, but the resource can find the edge (this is due
to not implementing the reverse lookups in neighbor provider).

```
resource <-[bound]-
```

Returns:

- smithy.example#GetBar

Strangely, if we check the bound relationships from resources, we get the
resource itself back:

```
resource -[bound]->
```

Returns:

- smithy.example#Bar
  - No resource is bound to another resource in this model, so that's clearly
    wrong. What's happening here is that due to a bug, the resource emits
    a bound relationship back to itself.
- smithy.example#Foo

Just getting all edges using ">" and "<" returns the same results:

```
operation >
```

Incorrectly (according to docs previously) returns:

- smithy.example#GetBarInput

```
operation <
```

Correctly (according to the docs previously) returns:

- smithy.example#Bar

However, getting all directed edges of a resource is strange:

```
resource >
```

Returns:

- smithy.example#Bar
  - Bad, and due to the BOUND edge from the service to the resource
- smithy.example#Foo
  - Strange, given the resource doesn't have a direct edge to the service other
    than BOUND.
- smithy.example#GetBar

Getting even weirder due to bugs in the current implementation is getting
reverse edges of a resource:

```
resource <
```

Returns:

- smithy.example#Foo (good, the service refers to it)
- smithy.example#GetBar (bad, because there is no corresponding edge when you
  get directed neighbors from GetBar)

Making this all more confusing, using a Walker will include service shapes a
resource is bound to:

```
walker.walkShapeIds(model.expectShape(ShapeId.from("smithy.example#Bar")))
```

Or, using a selector:

```
resource ~>
```

Returns:

- smithy.example#Foo (bad since it traversed up)
- smithy.example#Bar ✅ (walker always includes the given shape)
- smithy.example#GetBar ✅
- smithy.example#GetBarInput ✅
- smithy.example#GetBarInput$greeting ✅
- smithy.example#Greeting ✅

However, if we check `GetBar`, we don't crawl up to the resource! (this is
actually better behavior but different from the above).

```
operation ~>
```

Returns:

- smithy.example#GetBar
- smithy.example#GetBarInput
- smithy.example#GetBarInput$greeting
- smithy.example#Greeting

**Operation relationships were ambiguous**

When it comes to traversing the graph of shapes to operations, the
current implementation leaves things ambiguous. We emit the following
kinds of relationships today (omitting BOUND relationships since
they’re covered above):

- Service “operations”
  - Service → OPERATION → Operation shape
- Resource “operations”
  - Resource → OPERATION → Operation shape
  - Resource → INSTANCE_OPERATION → Operation shape
- Resource “collectionOperations”
  - Resource → OPERATION → Operation shape
  - Resource → COLLECTION_OPERATION → Operation shape
- Resource “create”
  - Resource → CREATE → Operation shape
  - Resource → COLLECTION_OPERATION → Operation shape
  - Resource → OPERATION → Operation shape
- Resource “list”
  - Resource → LIST → Operation shape
  - Resource → COLLECTION_OPERATION → Operation shape
  - Resource → OPERATION → Operation shape
- Resource “put”
  - Resource → PUT → Operation shape
  - Resource → INSTANCE_OPERATION → Operation shape
  - Resource → OPERATION → Operation shape
- Resource “read”
  - Resource → READ → Operation shape
  - Resource → INSTANCE_OPERATION → Operation shape
  - Resource → OPERATION → Operation shape
- Resource “update”
  - Resource → UPDATE → Operation shape
  - Resource → INSTANCE_OPERATION → Operation shape
  - Resource → OPERATION → Operation shape
- Resource “delete”
  - Resource → DELETE → Operation shape
  - Resource → INSTANCE_OPERATION → Operation shape
  - Resource → OPERATION → Operation shape

These relationships are problematic for several reasons:

- Emitting an OPERATION relationship for every operation makes it impossible to
  determine how an operation was bound to a resource. Was it bound through a
  lifecycle? Through “operations”? Through “collectionOperations”?
- Emitting an INSTANCE_OPERATION relationship is unnecessary and could become a
  synthetic kind of relationship (that is, rather than actually emit it, just
  implement special handling for `resource -[instanceOperation]->`.
- Emitting COLLECTION_OPERATION for `list` and `create` is problematic because
  you can’t tell if everything returned from `resource -[collectionOperation]->`
  was bound through `collectionOperations` or not without convoluted `:not`
  clauses.

We now only emit the following events from resources and services for
operation bindings:

- Service “operations”
  - Service → OPERATION → Operation shape
- Resource “operations”
  - Resource → OPERATION → Operation shape
- Resource “collectionOperations”
  - Resource → COLLECTION_OPERATION → Operation shape
- Resource “create”
  - Resource → CREATE → Operation shape
- Resource “list”
  - Resource → LIST → Operation shape
- Resource “put”
  - Resource → PUT → Operation shape
- Resource “read”
  - Resource → READ → Operation shape
- Resource “update”
  - Resource → UPDATE → Operation shape
- Resource “delete”
  - Resource → DELETE → Operation shape

The "instanceOperation" relationship is now deprecated, no longer
emitted, and no longer documented. Because "collectionOperation" is
now fixed to only include operations actually bound through
"collectionOperations" it would be confusing to have
"instanceOperation" act as a synthetic grouping of relationships
while "collectionOperation" does not, despite being the other kind of
resource binding than instance operations.

You can still refer to it using `-[instanceOperation]->` but a
relationship isn't emitted via NeighborVisitor.

**Recursive member traversal crawled up to its container**

When you use a recursive neighbor selector on a member shape, it crawls up to
the containing shape. This is wrong because it makes getting the closure of a
single member is impossible. For example:

```
structure Foo {
    bar: String
    baz: Integer
}
```

Today, using `member [id|member = 'bar'] ~>` would return:

- Foo
- Bar
- Baz
- String
- Integer

The intended result was just `String`. The previous result was wrong
because the edge between the member and its container wasn't filtered
out of recursive neighbor traversal, causing all shapes in the
containing shape to be returned in the closure.
@mtdowling mtdowling requested a review from a team as a code owner October 25, 2023 22:05
@JordonPhillips JordonPhillips self-assigned this Nov 2, 2023
@mtdowling mtdowling merged commit 0fc277a into main Nov 2, 2023
11 checks passed
@mtdowling mtdowling deleted the fix-relationships branch September 22, 2024 00:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants