Fix __typename inside router response #6009

IvanGoncharov · 2024-09-17T00:24:37Z

This PR fixes the following edge cases:

1. The initial response of the query containing `@defer` should be an empty object

For a query like this:

{
  ... @defer { me { name } }
}

The initial response should have data set to an empty object, as in this graphql-js test case:
https://github.com/graphql/graphql-js/blob/2b42a70191243d0ca0e0e4f1d580d6794718fbd5/src/execution/__tests__/defer-test.ts#L371-L384
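
A minimal sketch of the intended behavior, assuming a toy `Value` enum and a simplified `deep_merge` (both are stand-ins written for illustration, not the router's actual types or merge logic): merging an empty object into a null primary value yields an empty object, while data that is already present is left untouched.

```rust
use std::collections::btree_map::Entry;
use std::collections::BTreeMap;

/// Toy stand-in for a JSON value (assumption: the router's real `Value` is richer).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    String(String),
    Object(BTreeMap<String, Value>),
}

impl Value {
    /// Simplified deep merge: objects are merged key by key, a null on the
    /// right-hand side is ignored, and anything else overwrites the slot.
    pub fn deep_merge(&mut self, other: Value) {
        match (self, other) {
            (Value::Object(a), Value::Object(b)) => {
                for (key, value) in b {
                    match a.entry(key) {
                        Entry::Occupied(mut e) => e.get_mut().deep_merge(value),
                        Entry::Vacant(e) => {
                            e.insert(value);
                        }
                    }
                }
            }
            (_, Value::Null) => {}
            (slot, other) => *slot = other,
        }
    }
}
```

With this toy model, a primary value of `Value::Null` becomes `Value::Object({})` after merging an empty object, which is the shape the graphql-js test expects for the initial payload.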

2. `data: null` semantics should be maintained

According to the GraphQL spec, null can propagate all the way up to data: https://spec.graphql.org/draft/#sel-FANTNRCAACGBrwa
But the fix:

router/apollo-router/src/spec/query.rs

Lines 241 to 257 in b443ad0

    
    Some(Value::Null) => {
        // Detect if root __typename is asked in the original query (the qp doesn't put root __typename in subselections)
        // cf https://github.com/apollographql/router/issues/1677
        let operation_kind_if_root_typename = original_operation.and_then(|op| {
            op.selection_set
                .iter()
                .any(|f| f.is_typename_field())
                .then(|| *op.kind())
        });
        response.data = match operation_kind_if_root_typename {
            Some(operation_kind) => {
                let mut output = Object::default();
                output.insert(TYPENAME, operation_kind.default_type_name().into());
                Some(output.into())
            }
            None => Some(Value::default()),
        };

Added in #2253, it caused data: null to be replaced with data: { __typename: "<some typename>" } for queries like this:

{
  nonNullFieldThatErrors
  __typename
}
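
The property this item asks for can be sketched as follows. This is only an illustration under stated assumptions: `Value` is a toy enum and `format_root_data` is a hypothetical helper, neither taken from the router. The point is that a null that has propagated to the root stays null, even when the query selects a root `__typename`.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for a JSON value (assumption, not the router's `Value`).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    String(String),
    Object(BTreeMap<String, Value>),
}

/// Hypothetical helper: compute the `data` entry of the final response.
pub fn format_root_data(executed: Value, root_type_name: &str, wants_root_typename: bool) -> Value {
    match executed {
        // A null propagated up to the root must be kept as `data: null`.
        Value::Null => Value::Null,
        Value::Object(mut fields) => {
            if wants_root_typename {
                // The supergraph's root type name is used for root `__typename`.
                fields.insert(
                    "__typename".to_string(),
                    Value::String(root_type_name.to_string()),
                );
            }
            Value::Object(fields)
        }
        other => other,
    }
}
```

For the query above, if `nonNullFieldThatErrors` nullifies the root, this helper returns `Value::Null` rather than an object containing only `__typename`.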

3. Handle cases where a query with `@defer` also has `__typename` inside a fragment

The fix added in #2253, here:

router/apollo-router/src/spec/query.rs

Lines 241 to 257 in b443ad0

    
    Some(Value::Null) => {
        // Detect if root __typename is asked in the original query (the qp doesn't put root __typename in subselections)
        // cf https://github.com/apollographql/router/issues/1677
        let operation_kind_if_root_typename = original_operation.and_then(|op| {
            op.selection_set
                .iter()
                .any(|f| f.is_typename_field())
                .then(|| *op.kind())
        });
        response.data = match operation_kind_if_root_typename {
            Some(operation_kind) => {
                let mut output = Object::default();
                output.insert(TYPENAME, operation_kind.default_type_name().into());
                Some(output.into())
            }
            None => Some(Value::default()),
        };

And here:

router/apollo-router/src/spec/query.rs

Lines 149 to 160 in b443ad0

    
    // Detect if root __typename is asked in the original query (the qp doesn't put root __typename in subselections)
    // cf https://github.com/apollographql/router/issues/1677
    let operation_kind_if_root_typename =
        original_operation.and_then(|op| {
            op.selection_set
                .iter()
                .any(|f| f.is_typename_field())
                .then(|| *op.kind())
        });
    if let Some(operation_kind) = operation_kind_if_root_typename {
        output.insert(TYPENAME, operation_kind.default_type_name().into());
    }

This only worked when __typename is specified directly in the topmost selection set; it didn't work for __typename wrapped in fragments, e.g.:

{
  ... { __typename }
  ... @defer {
    me { name }
  }
}
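
To illustrate the limitation, here is a small sketch on a toy selection model. `Selection`, `has_typename_shallow`, and `has_typename_through_fragments` are hypothetical names introduced for this example. The shallow check mirrors the `.iter().any(|f| f.is_typename_field())` pattern quoted above; the recursive variant is shown only for contrast and is not what this PR does (the PR removes this code path instead).

```rust
/// Toy selection model (assumption: the router's real AST has more variants).
pub enum Selection {
    /// A leaf field, identified by its name only.
    Field { name: String },
    /// An inline fragment (or spread) whose selections apply to the same parent type.
    InlineFragment { selections: Vec<Selection> },
}

/// Looks only at the topmost selection set, like the removed fix did.
pub fn has_typename_shallow(selections: &[Selection]) -> bool {
    selections
        .iter()
        .any(|s| matches!(s, Selection::Field { name } if name.as_str() == "__typename"))
}

/// Follows fragments, which still apply to the root type.
pub fn has_typename_through_fragments(selections: &[Selection]) -> bool {
    selections.iter().any(|s| match s {
        Selection::Field { name } => name.as_str() == "__typename",
        Selection::InlineFragment { selections } => has_typename_through_fragments(selections),
    })
}
```

For the query above, the shallow check reports no root `__typename` because it sits inside an inline fragment, while the recursive walk finds it.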

Because of #1, #2, and #3 above, the fix added in #2253 was removed, but the fix for #4 also fixes the issues with @defer.

4. Remapping subgraph's root type names to supergraph names didn't work for __typename wrapped in fragments

Subgraphs can name their root types differently than the supergraph does.
Moreover, different subgraphs can use different names for their root types.

QP removes __typename from the topmost selection set, so in combination with:

router/apollo-router/src/spec/query.rs

Lines 840 to 843 in b443ad0

    
    } else if name.as_str() == TYPENAME {
        if !output.contains_key(field_name_str) {
            output.insert(field_name.clone(), Value::String(root_type_name.into()));
        }

This issue is fixed for queries like this:

{
  __typename
  me { name }
}

But QP doesn't delete __typename wrapped in fragments like so (the same is true for named fragments):

{
  ... { __typename }
  me { name }
}

That means

router/apollo-router/src/spec/query.rs

Lines 840 to 843 in b443ad0

    
    } else if name.as_str() == TYPENAME {
        if !output.contains_key(field_name_str) {
            output.insert(field_name.clone(), Value::String(root_type_name.into()));
        }

can't be reached, because the subgraph will return a __typename value for the root, and that triggers this "if" instead:

router/apollo-router/src/spec/query.rs

Line 814 in b443ad0

if let Some(input_value) = input.get_mut(field_name_str) {

That's why I changed the order of these if branches, so the subgraph's value of __typename for the root type is always ignored, and the supergraph's root type names are always used. Also, the exact same order of if branches is already used inside apply_selection_set.

However, this code is not reachable for __typenames wrapped in fragments because apply_root_selection_set incorrectly calls apply_selection_set for selection sets wrapped in fragments:

router/apollo-router/src/spec/query.rs

Lines 873 to 882 in b443ad0

    
    self.apply_selection_set(
        selection_set,
        parameters,
        input,
        output,
        path,
        // FIXME: use `ast::Name` everywhere so fallible conversion isn’t needed
        #[allow(clippy::unwrap_used)]
        &FieldType::new_named(type_condition.try_into().unwrap()).0,
    )?;

router/apollo-router/src/spec/query.rs

Lines 909 to 918 in b443ad0

    
    self.apply_selection_set(
        &fragment.selection_set,
        parameters,
        input,
        output,
        path,
        // FIXME: use `ast::Name` everywhere so fallible conversion isn’t needed
        #[allow(clippy::unwrap_used)]
        &FieldType::new_named(root_type_name.try_into().unwrap()).0,
    )?;

If root fields are wrapped in fragments, they are still applied on the root type, e.g.:

{
  ... {
    a: __typename # applied on query root
    ... {
      b: __typename # still query root

      me {
        __typename # not a root
      }
    }
  }
}

So I switched those calls to apply_root_selection_set.
All of the above steps in combination fixed #4.
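
The combined effect of the branch reordering and the recursion change can be sketched on a toy model. Everything here is an assumption made for illustration: `Selection`, the flat string maps standing in for subgraph input and formatted output, and the simplified function body. Only the name `apply_root_selection_set` and the `contains_key`-style guard come from the code quoted above.

```rust
use std::collections::BTreeMap;

/// Toy selection model (assumption: the real AST carries much more data).
pub enum Selection {
    /// A leaf field with its response key (alias or name) and its field name.
    Field { response_key: String, name: String },
    /// An inline or named fragment; its selections still apply to the root type.
    Fragment(Vec<Selection>),
}

pub fn apply_root_selection_set(
    selections: &[Selection],
    root_type_name: &str,
    input: &BTreeMap<String, String>,
    output: &mut BTreeMap<String, String>,
) {
    for selection in selections {
        match selection {
            Selection::Field { response_key, name } => {
                if name.as_str() == "__typename" {
                    // Checked first, so the subgraph's value is never consulted.
                    output
                        .entry(response_key.clone())
                        .or_insert_with(|| root_type_name.to_string());
                } else if let Some(value) = input.get(response_key) {
                    output.insert(response_key.clone(), value.clone());
                }
            }
            // Stay in "root" mode when descending into fragments.
            Selection::Fragment(inner) => {
                apply_root_selection_set(inner, root_type_name, input, output)
            }
        }
    }
}
```

In this sketch, a subgraph that answers `SomeQueryRoot` for every `__typename` alias still produces `Query` in the output, at any fragment depth.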

5. Incorrect errors on inline fragments using root's interfaces

A root type can implement interfaces, and it's legal to use those interfaces in queries, like so:

interface Foo {
   foo: String
}

type Query implements Foo {
   foo: String
}

with query:

{
  ... on Foo {
    foo
  }
}

But before my change, this query would error here:

router/apollo-router/src/spec/query.rs

Lines 865 to 867 in b443ad0

    
    if type_condition.as_str() != root_type_name {
        return Err(InvalidValue);
    }
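
One plausible shape for a relaxed check is sketched below. This is an assumption for illustration, not the change made in this PR: `ToySchema` and `type_condition_applies` are hypothetical names, and the real schema API differs. The idea is that a type condition on the root is acceptable when it names the root type itself or an interface the root type implements.

```rust
use std::collections::{HashMap, HashSet};

/// Toy schema: maps an object type name to the interfaces it implements.
pub struct ToySchema {
    pub implements: HashMap<String, HashSet<String>>,
}

impl ToySchema {
    /// Returns true when a fragment with `type_condition` can apply to `object_type`.
    pub fn type_condition_applies(&self, type_condition: &str, object_type: &str) -> bool {
        type_condition == object_type
            || self
                .implements
                .get(object_type)
                .map_or(false, |interfaces| interfaces.contains(type_condition))
    }
}
```

With `Query` implementing `Foo`, the condition `on Foo` is accepted for the root, whereas the strict string comparison quoted above would reject it.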

6. __typename with alias returned by subgraph is not validated

__typename is validated here:

router/apollo-router/src/spec/query.rs

Lines 547 to 561 in b443ad0

    
    if let Some(input_type) =
        input_object.get(TYPENAME).and_then(|val| val.as_str())
    {
        // If there is a __typename, make sure the pointed type is a valid type of the schema. Otherwise, something is wrong, and in case we might
        // be inadvertently leaking some data for an @inacessible type or something, nullify the whole object. However, do note that due to `@interfaceObject`,
        // some subgraph can have returned a __typename that is the name of an interface in the supergraph, and this is fine (that is, we should not
        // return such a __typename to the user, but as long as it's not returned, having it in the internal data is ok and sometimes expected).
        let Some(ExtendedType::Object(_) | ExtendedType::Interface(_)) =
            parameters.schema.types.get(input_type)
        else {
            parameters.nullified.push(Path::from_response_slice(path));
            *output = Value::Null;
            return Ok(());
        };
    }

And is used to assign current_type here:

router/apollo-router/src/spec/query.rs

Lines 574 to 586 in b443ad0

    
    let current_type = if parameters
        .schema
        .get_interface(field_type.inner_named_type())
        .is_some()
        || parameters
            .schema
            .get_union(field_type.inner_named_type())
            .is_some()
    {
        typename.as_ref().unwrap_or(field_type)
    } else {
        field_type
    };

But aliased __typename is handled here (the same code handles non-aliased ones, but those are already validated by the code above):

router/apollo-router/src/spec/query.rs

Lines 574 to 586 in b443ad0

    
    let current_type = if parameters
        .schema
        .get_interface(field_type.inner_named_type())
        .is_some()
        || parameters
            .schema
            .get_union(field_type.inner_named_type())
            .is_some()
    {
        typename.as_ref().unwrap_or(field_type)
    } else {
        field_type
    };

This code guarantees that the aliased __typename contains the name of the object type from the supergraph schema.
But it doesn't guarantee that all __typenames (aliased and not) have the same value that is compatible with current_type.

As #4 proves, we can't trust the subgraph to return the correct names, and QP doesn't guarantee that either.
So, since we already use current_type to track the type of the current selection set, it makes sense to use it as the source of truth for all __typenames, but fall back to the subgraph's __typename if current_type is not an object type (which shouldn't happen unless QP has bugs).
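
The rule described above can be sketched as a small pure function. `resolve_typename` and the `object_types` set are hypothetical, introduced only for this example; the real code works with the router's schema and `FieldType` values.

```rust
use std::collections::HashSet;

/// Picks the value to emit for any `__typename` (aliased or not) in a selection set.
/// `object_types` holds the names of the supergraph's object types.
pub fn resolve_typename<'a>(
    object_types: &HashSet<String>,
    current_type: &'a str,
    subgraph_typename: Option<&'a str>,
) -> Option<&'a str> {
    if object_types.contains(current_type) {
        // The tracked type is a concrete object type: treat it as the source of truth.
        Some(current_type)
    } else {
        // Fallback: `current_type` is an interface or union, so use the subgraph's value.
        subgraph_typename
    }
}
```

Under this rule every `__typename` in the same selection set resolves to the same value whenever `current_type` is an object type, regardless of what the subgraph returned for each alias.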


apollo-router/tests/integration/typename.rs

apollo-router/src/spec/query.rs

IvanGoncharov · 2024-09-17T00:59:21Z

apollo-router/src/spec/query.rs

                    if include_skip.should_skip(parameters.variables) {
                        continue;
                    }

-                    self.apply_selection_set(


We are still inside root selection, even if we go inside an inline fragment.
So it should be apply_root_selection_set

this sounds reasonable but looks like a dangerous change. Do you have a test that would show what happens before and after that change?

Yes, it is tested here:

router/apollo-router/tests/integration/subgraph_response.rs

Lines 34 to 82 in e871c93

async fn test_subgraph_returning_different_typename_on_query_root() -> Result<(), BoxError> {
    let mut router = IntegrationTest::builder()
        .config(CONFIG)
        .responder(ResponseTemplate::new(200).set_body_json(json!({
            "data": {
                "topProducts": null,
                "__typename": "SomeQueryRoot",
                "aliased": "SomeQueryRoot",
                "inside_fragment": "SomeQueryRoot",
                "inside_inline_fragment": "SomeQueryRoot"
            }
        })))
        .build()
        .await;
    router.start().await;
    router.assert_started().await;

    let query = r#"
        {
            topProducts { name }
            __typename
            aliased: __typename
            ...TypenameFragment
            ... {
                inside_inline_fragment: __typename
            }
        }

        fragment TypenameFragment on Query {
            inside_fragment: __typename
        }
    "#;
    let (_trace_id, response) = router.execute_query(&json!({ "query": query })).await;
    assert_eq!(response.status(), 200);
    assert_eq!(
        response.json::<serde_json::Value>().await?,
        json!({
            "data": {
                "topProducts": null,
                "__typename": "Query",
                "aliased": "Query",
                "inside_fragment": "Query",
                "inside_inline_fragment": "Query"
            }
        })
    );
    Ok(())
}

apollo-router/src/spec/query.rs

apollo-router/src/services/supergraph/tests.rs

IvanGoncharov · 2024-09-17T02:25:25Z

apollo-router/src/spec/query.rs

-                            .unwrap_or_else(|| {
-                                Value::String(ByteString::from(
-                                    current_type.inner_named_type().as_str().to_owned(),
-                                ))


Here, we tried to use __typename values returned by the subgraph but didn't validate them. I assume (not tested) it can even be the name of an interface if the subgraph uses @interfaceObject.

According to the GraphQL spec, __typename should always be an object type name (interfaces and unions are forbidden), so I changed the code to first use current_type (which is always valid) and then fall back to the subgraph's value only if current_type is not an object type.

this should be checked with what the output rewriting code in the QP is doing

Not sure what to do here.
Realistically, I can't inspect QP in the scope of this PR.
I can revert this particular change, because it is separate from the other fixes.
@Geal Do you think it's worth reverting?

apollo-router/src/spec/query.rs

IvanGoncharov · 2024-09-17T06:57:08Z

apollo-router/src/query_planner/execution.rs

@@ -311,6 +311,8 @@ impl PlanNode {
                            let _ = primary_sender.send((value.clone(), errors.clone()));
                        } else {
                            let _ = primary_sender.send((value.clone(), errors.clone()));
+                            // primary response should be an empty object
+                            value.deep_merge(Value::Object(Default::default()));


here is a test case from graphql-js:
https://github.com/graphql/graphql-js/blob/2b42a70191243d0ca0e0e4f1d580d6794718fbd5/src/execution/__tests__/defer-test.ts#L371-L384

if the data field should be an empty object (unless it contained a non-nullable field that was nullified), then I think this should be fixed here:

router/apollo-router/src/query_planner/execution.rs

Line 80 in 3436380

&initial_value.unwrap_or_default(),

to set it to an object instead of Value::Null by default

You're right, that should be correct.
I hesitated to do it since I'm less familiar with how QP executes, and I also don't have the capacity to test all possible scenarios in the context of this PR.
So I chose to do it only inside this specific branch, where I'm 100% sure it is correct.

@Geal Do you think it's safe to make this change for the entire QP execution?
Or should I create an issue for it and do it in a separate PR?

maybe in a separate PR, considering the timing

I created a router issue for it.

Geal · 2024-09-18T11:49:54Z

apollo-router/src/query_planner/execution.rs

@@ -311,6 +311,8 @@ impl PlanNode {
                            let _ = primary_sender.send((value.clone(), errors.clone()));
                        } else {
                            let _ = primary_sender.send((value.clone(), errors.clone()));
+                            // primary response should be an empty object
+                            value.deep_merge(Value::Object(Default::default()));


if the data field should be an empty object (unless it contained a non-nullable field that was nullified), then I think this should be fixed here:

router/apollo-router/src/query_planner/execution.rs

Line 80 in 3436380

&initial_value.unwrap_or_default(),

to set it to an object instead of Value::Null by default

apollo-router/src/services/supergraph/tests.rs

apollo-router/tests/integration/typename.rs

apollo-router/tests/integration/subgraph_response.rs

Geal · 2024-09-18T12:08:00Z

apollo-router/tests/integration/subgraph_response.rs

+    assert_eq!(response.status(), 200);
+    assert_eq!(
+        serde_json::from_str::<serde_json::Value>(&response.text().await?)?,
+        json!({ "data": null })


why should { __typename topProducts { name } } return a null data if the subgraph returns null? topProducts is a nullable field, right?
Is it for the case where the subgraph would nullify because there was a non nullable field?

You are right, it was mainly to test null propagation that goes all the way up to data.
However, GraphQL servers can also return data: null if execution started but no resolvers were executed:
https://spec.graphql.org/October2021/#sel-FAPHPHCAACEBr6C

For example, here is the test from graphql-js:
https://github.com/graphql/graphql-js/blob/993d7ce2ee6db2a13973b037817f3eac60607b8a/src/execution/__tests__/executor-test.ts#L971-L981

These cases are rare but could happen, so we should keep the semantics.

I guess the "right" way would be to detect from errors if we got a null because a non null field was nullified by the subgraph, but that may be too complex

If we switch the default value passed to execute_recursively to an object, as discussed earlier, we would get an empty object in most cases.
But we should inspect QP execution to see all possible scenarios.

Geal · 2024-09-18T12:11:11Z

apollo-router/tests/integration/subgraph_response.rs

+}
+
+#[tokio::test(flavor = "multi_thread")]
+async fn test_subgraph_returning_different_typename_on_query_root() -> Result<(), BoxError> {


are we sure in this test that the __typename selections got all the way to the subgraph? If that's the case, we are supposed to have some field rewriting driven by the query plan fetch nodes (input_rewrites and output_rewrites). Is that test exercising that rewriting, or what is happening in query formatting?

It happens in query formatting; that's the test that answers your comment here: #6009 (comment)

By switching to apply_root_selection_set we always execute this code (even if parts of the root selection are wrapped in fragments):

router/apollo-router/src/spec/query.rs

Lines 783 to 786 in ddeed8d

if name.as_str() == TYPENAME {
    if !output.contains_key(field_name_str) {
        output.insert(field_name.clone(), Value::String(root_type_name.into()));
    }

So now we are 100% sure that __typename on the root type is valid, regardless of what QP returns.

that's good to make sure of that, but we probably need a test that exercises the entire pipeline, not only the formatting

yes, it's done in this test:

router/apollo-router/tests/integration/subgraph_response.rs

Lines 34 to 82 in e871c93

async fn test_subgraph_returning_different_typename_on_query_root() -> Result<(), BoxError> {
    let mut router = IntegrationTest::builder()
        .config(CONFIG)
        .responder(ResponseTemplate::new(200).set_body_json(json!({
            "data": {
                "topProducts": null,
                "__typename": "SomeQueryRoot",
                "aliased": "SomeQueryRoot",
                "inside_fragment": "SomeQueryRoot",
                "inside_inline_fragment": "SomeQueryRoot"
            }
        })))
        .build()
        .await;
    router.start().await;
    router.assert_started().await;

    let query = r#"
        {
            topProducts { name }
            __typename
            aliased: __typename
            ...TypenameFragment
            ... {
                inside_inline_fragment: __typename
            }
        }

        fragment TypenameFragment on Query {
            inside_fragment: __typename
        }
    "#;
    let (_trace_id, response) = router.execute_query(&json!({ "query": query })).await;
    assert_eq!(response.status(), 200);
    assert_eq!(
        response.json::<serde_json::Value>().await?,
        json!({
            "data": {
                "topProducts": null,
                "__typename": "Query",
                "aliased": "Query",
                "inside_fragment": "Query",
                "inside_inline_fragment": "Query"
            }
        })
    );
    Ok(())
}

It tests all possible scenarios for how __typename can be nested and checks that __typename values returned by the subgraph are ignored.

apollo-router/src/spec/query.rs

Geal · 2024-09-18T12:21:33Z

apollo-router/src/spec/query.rs

-                            .unwrap_or_else(|| {
-                                Value::String(ByteString::from(
-                                    current_type.inner_named_type().as_str().to_owned(),
-                                ))


this should be checked with what the output rewriting code in the QP is doing

Geal · 2024-09-18T12:25:19Z

apollo-router/src/spec/query.rs

                    if include_skip.should_skip(parameters.variables) {
                        continue;
                    }

-                    self.apply_selection_set(


this sounds reasonable but looks like a dangerous change. Do you have a test that would show what happens before and after that change?

@Geal

All credits goes to @Geal

Fix __typename inside router response

7402a1b


apollo-bot2 assigned IvanGoncharov Sep 17, 2024