Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

verification failure: Clients should error if response is missing required fields #123

Open
k-simons opened this issue May 19, 2020 · 15 comments

Comments

@k-simons
Copy link
Contributor

k-simons commented May 19, 2020

What happened?

What did you want to happen?

  • Clients error if receiving a response missing required keys
@nmiyake
Copy link
Contributor

nmiyake commented May 20, 2020

Yes, this is unfortunate. The actual issue is also more general -- since Conjure treats non-optional fields as "required" generally, this behavior should also be enforced when doing a standard unmarshal from JSON into the Conjure-generated struct.

I propose the following fix:

  • For every Conjure-generated struct, we generate a custom UnmarshalJSON function that does the following:
    • Defines a struct type that matches the Conjure struct, except that every non-optional non-collection field is generated as an optional field (becomes a pointer)
    • Performs an unmarshal into custom struct type
    • Verify that every non-optional field is set (value != nil)
    • If any value is nil, fail
    • Otherwise, copy all values from the unmarshaled optional-supporting struct type into the actual destination struct (just a simple copy assignment with pointer dereference)

As an example, for the type:

type Basic struct {
	Data string `json:"data"`
}

The generated unmarshal would be:

func (o *Basic) UnmarshalJSON(b []byte) error {
	basicStrict := struct {
		Data *string `json:"data"`
	}{}
	if err := json.Unmarshal(b, &basicStrict); err != nil {
		return err
	}
	if basicStrict.Data == nil {
		return fmt.Errorf(`required field "data" is missing`)
	}
	o.Data = *basicStrict.Data
	return nil
}

@nmiyake
Copy link
Contributor

nmiyake commented May 20, 2020

One concern for the above would be that if any current code that uses Conjure types implicitly depends on supporting missing values, this will clearly break. Since some code uses Conjure types for serialization (and not just REST APIs), this is more of a risk.

@nmiyake
Copy link
Contributor

nmiyake commented May 20, 2020

@bmoylan for feedback/thoughts on the above.

@bmoylan
Copy link
Contributor

bmoylan commented May 22, 2020

Overall +1. I'm sympathetic to the behavior break but I think we can communicate this out to warn people that required fields are really required.

Some notes on the suggested generated code above:

While we're changing this, it might be worth more correctly implementing wire spec items Forward compatible clients, Servers reject unknown fields, Round-trip of unknown variants if possible, though this probably requires more thought.

@k-simons
Copy link
Contributor Author

k-simons commented May 27, 2020

Yup, specifically was looking into https://github.com/palantir/conjure/blob/master/docs/spec/wire.md#42-servers-reject-unknown-fields and https://github.com/palantir/conjure/blob/master/docs/spec/wire.md#41-forward-compatible-clients and the best way to handle this.

Erroring on extra fields is a bit tricky. The simplest thing is to potentially use a map[string]interface{} and then assign the values into a struct.
We could also have different "JSON structs" for clients/servers in which this ^^ map idea would only be needed server side

@k-simons
Copy link
Contributor Author

@nmiyake
Copy link
Contributor

nmiyake commented May 27, 2020

Yup that's a more recent addition, but believe that (or something like it) would be the most correct way to handle this.

@k-simons
Copy link
Contributor Author

k-simons commented May 27, 2020

Played around with this a bit and came up with:

package pkg

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"testing"

	"github.com/palantir/pkg/safejson"
	"github.com/stretchr/testify/assert"
)

type OriginalConjureStruct struct {
	Foo string  `json:"foo"`
	Bar *string `json:"bar"`
	Baz int     `json:"baz"`
}

type optionalGeneratedOriginalConjureStruct struct {
	Foo *string `json:"foo"`
	Bar *string `json:"bar"`
	Baz *int    `json:"baz"`
}

var allPresent = []byte(`{
  "foo":"foo",
  "bar":"bar",
  "baz":1
}`)

var hasExtra = []byte(`{
  "foo":"foo",
  "bar":"bar",
  "c": "c",
  "baz":1
}`)

var missingRequired = []byte(`{
  "baz":1
}`)

var missingOptional = []byte(`{
  "foo":"foo",
  "baz":1
}`)

func TestServer(t *testing.T) {
	for _, tc := range []struct {
		name     string
		input    []byte
		expected *OriginalConjureStruct
	}{
		{
			name:  "allPresent",
			input: allPresent,
			expected: &OriginalConjureStruct{
				Foo: "foo",
				Bar: str("bar"),
				Baz: 1,
			},
		},
		{
			name:  "missingOptional",
			input: missingOptional,
			expected: &OriginalConjureStruct{
				Foo: "foo",
				Baz: 1,
			},
		},
		{
			name:  "hasExtra",
			input: hasExtra,
		},
		{
			name:  "missingRequired",
			input: missingRequired,
		},
	} {
		t.Run(tc.name, func(t *testing.T) {
			actual, err := functionGenerated(bytes.NewReader(tc.input), true)
			if tc.expected == nil {
				assert.Nil(t, actual)
				assert.Error(t, err)
			} else {
				assert.NoError(t, err)
				assert.Equal(t, actual, tc.expected)
			}
		})
	}
}

func TestClient(t *testing.T) {
	for _, tc := range []struct {
		name     string
		input    []byte
		expected *OriginalConjureStruct
	}{
		{
			name:  "allPresent",
			input: allPresent,
			expected: &OriginalConjureStruct{
				Foo: "foo",
				Bar: str("bar"),
				Baz: 1,
			},
		},
		{
			name:  "missingOptional",
			input: missingOptional,
			expected: &OriginalConjureStruct{
				Foo: "foo",
				Baz: 1,
			},
		},
		{
			name:  "hasExtra",
			input: hasExtra,
			expected: &OriginalConjureStruct{
				Foo: "foo",
				Bar: str("bar"),
				Baz: 1,
			},
		},
		{
			name:  "missingRequired",
			input: missingRequired,
		},
	} {
		t.Run(tc.name, func(t *testing.T) {
			actual, err := functionGenerated(bytes.NewReader(tc.input), false)
			if tc.expected == nil {
				assert.Nil(t, actual)
				assert.Error(t, err)
			} else {
				assert.NoError(t, err)
				assert.Equal(t, actual, tc.expected)
			}
		})
	}
}

// Generated: functionGenerated
func functionGenerated(r io.Reader, strict bool) (*OriginalConjureStruct, error) {
	var optionalA optionalGeneratedOriginalConjureStruct
	dec := getDecoder(r, strict)
	err := dec.Decode(&optionalA)
	if err != nil {
		return nil, err
	}
	if optionalA.Foo == nil {
		return nil, fmt.Errorf(`required field "foo" is missing`)
	}
	if optionalA.Baz == nil {
		return nil, fmt.Errorf(`required field "baz" is missing`)
	}
	return &OriginalConjureStruct{
		Foo: *optionalA.Foo,
		Bar: optionalA.Bar,
		Baz: *optionalA.Baz,
	}, nil
}

// We can either generate this or generate 2 versions of the function above
func getDecoder(r io.Reader, strict bool) *json.Decoder {
	if strict {
		return DecodeStrict(r)
	}
	return safejson.Decoder(r)
}

// This moves into safejson in some way. Similar to safejson.Decoder
func DecodeStrict(r io.Reader) *json.Decoder {
	decoder := safejson.Decoder(r)
	decoder.DisallowUnknownFields()
	return decoder
}

// Just a helper for the test
func str(s string) *string {
	return &s
}

It is a similar thought to what you proposed, however it avoids using a custom Unmarshal function because once you invoke this you will by-pass DisallowUnknownFields. Thoughts?

@nmiyake
Copy link
Contributor

nmiyake commented May 27, 2020

So there are 2 issues here:

  1. Conjure structs should require non-optional fields
  2. Conjure servers should reject Conjure structs with unknown fields

If we want to do (1) correctly, then I think that has to be part of the unmarshal, since we would want to catch this even when no server interactions are required (unless we think this is too risky due to backcompat and just want to enforce this for servers)

For (2), this does need to be on the server side, and agree using the decoder is the best approach.

There is an interesting question of whether if we do (1) in the unmarshal, whether using a custom decoder is honored/propagated through the custom unmarshal. This is something we'd need to investigate -- if that doesn't work, then we may have to use the approach you outlined, or actually hard-code decoder settings as part of the custom unmarshal.

It looks like your proposal above is to handle both (1) and (2) only for arguments provided to server code, and to basically generate the with-optional object in the server. That's certainly a valid approach.

However, would just want to clarify our thought process around non-optional fields for Conjure structs in the context other than Conjure REST calls -- I think it's more correct to provide enforcement that these values can't be missing in JSON, in which case it would make sense to handle (1) as part of the Unmarshal.

@k-simons
Copy link
Contributor Author

Yes so I am saying that unless you define the custom decoder in the actual Unmarshal function, it will not work. So something like:

func (o *OriginalConjureStruct) UnmarshalJSON(b []byte) error {
	basicStrict := struct {
		Foo *string `json:"foo"`
		Bar *string `json:"bar"`
		Baz *int    `json:"baz"`
	}{}

	decoder := safejson.Decoder(bytes.NewReader(b))
	decoder.DisallowUnknownFields()

	if err := decoder.Decode(&basicStrict); err != nil {
		return err
	}
	if basicStrict.Foo == nil {
		return fmt.Errorf(`required field "data" is missing`)
	}
	if basicStrict.Baz == nil {
		return fmt.Errorf(`required field "data" is missing`)
	}
	o.Foo = *basicStrict.Foo
	o.Baz = *basicStrict.Baz
	o.Bar = basicStrict.Bar
	return nil
}

will work.

But:

func (o *OriginalConjureStruct) UnmarshalJSON(b []byte) error {
	basicStrict := struct {
		Foo *string `json:"foo"`
		Bar *string `json:"bar"`
		Baz *int    `json:"baz"`
	}{}
	if err := json.Unmarshal(b, &basicStrict); err != nil {
		return err
	}
	if basicStrict.Foo == nil {
		return fmt.Errorf(`required field "data" is missing`)
	}
	if basicStrict.Baz == nil {
		return fmt.Errorf(`required field "data" is missing`)
	}
	o.Foo = *basicStrict.Foo
	o.Baz = *basicStrict.Baz
	o.Bar = basicStrict.Bar
	return nil
}

// Generated: functionGenerated
func functionGenerated(r io.Reader, strict bool) (*OriginalConjureStruct, error) {
	var originalConjureStruct OriginalConjureStruct
	dec := getDecoder(r, strict)
	err := dec.Decode(&originalConjureStruct)
	if err != nil {
		return nil, err
	}
	return &originalConjureStruct, nil
}

// We can either generate this or generate 2 versions of the function above
func getDecoder(r io.Reader, strict bool) *json.Decoder {
	if strict {
		return DecodeStrict(r)
	}
	return safejson.Decoder(r)
}

^^ with a strict decoder will not work. I do not want to stick the decoder in the UnmarshalJSON so opted to not override it. We would use similar generated code for unmarshalling client side (minus the strict part)

@k-simons
Copy link
Contributor Author

Additionally, I have confirmed that overriding UnmarshalJSON will break the current decoder configuration that we use:

func Decoder(r io.Reader) *json.Decoder {
	decoder := json.NewDecoder(r)
	decoder.UseNumber()
	return decoder
}

specifically the UseNumber will not be used

@nmiyake
Copy link
Contributor

nmiyake commented May 27, 2020

Yeah looks like the general issue is that defining a custom Unmarshal function causes all decoder configuration to be ignored, which is unfortunate.

Given that, I'm on-board with the proposal @k-simons posted above that will enforce the "require non-optional fields" behavior on the client and server, and the "reject unknown fields" behavior on server.

The once case this doesn't handle is requiring non-optional fields from an unmarshal performed manually from JSON (for example, if JSON is persisted to disk or encoded and decoded in-memory or across non-Conjure clients), but I think that's an acceptable trade-off (and actually makes the backcompat story easier too)

@bmoylan
Copy link
Contributor

bmoylan commented May 27, 2020

Makes sense to me! I think using safejson in the generated code is probably not worth the indirection and we could drop the dependency and just create the plain decoder with UseNumber as well.

Are we going to have to generate two different structs for the client and server sides due to the divergent UnmarshalJSON behavior, or do we implement two different unmarshal methods on the same struct and have the generated client/server call the right one?

Also it's fine to keep these examples short but want to make sure my concerns about the returned error are handled in the eventual implementation.

Thanks for taking this on!

@k-simons
Copy link
Contributor Author

k-simons commented May 27, 2020

@bmoylan yup noted. Will make sure that

  1. We capture all missing fields
  2. Error will be a conjure error

And we should be able to use the same struct. If you look in #123 (comment) (which is the table tests for client/server) they each just call functionGenerated which handles the "missing required fields". The "no extra fields" is then determined by the decoder. You'll notice that hasExtra passes for client but fails for server

@k-simons
Copy link
Contributor Author

@bmoylan @nmiyake have been playing around with this a bit, the UnmarshalJSON side looks fine but MarshalJSON is proving a bit tricky:

type TopLevel2 struct {
	StrArg                 string   `json:"strArg"`
	StrOptionalArg   *string `json:"strOptionalArg"`
}
func TestBigSad(t *testing.T) {
	myTopLevelArg := TopLevel2{}
	a, _ := json.Marshal(myTopLevelArg)
	fmt.Println(string(a))
}
{"strArg":"","strOptionalArg":null}

When we Marshal fields with empty values the keys are preserved, which when we receive these bytes, we will see these keys and assume they were set/present with empty values.

You could add the omitempty tags but then it makes it impossible to actually pass in the null values (0/“”/false).

Currently the only solution that I have came up with is to

  1. Change all non-collection fields to be *$field as opposed to $field. So everything is a pointer.
  2. Add the omit empty tags

This forces object creation to actually set the fields they need to. There becomes some ambiguity for optionals, a required string field and an optional string field would each be *string, with the server/client doing the validation that the required string is set. However, I'm not sure if this actually matters... you could make it more formal by generating Optional like wrapper structs in this case which would make it more explicit, but I'm not sure that is better.

Thoughts?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants