Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question: non-interned strings returned by JsonParser #24

Closed
mraasvel opened this issue Feb 6, 2024 · 7 comments · Fixed by #28
Closed

Question: non-interned strings returned by JsonParser #24

mraasvel opened this issue Feb 6, 2024 · 7 comments · Fixed by #28

Comments

@mraasvel
Copy link

mraasvel commented Feb 6, 2024

I had some trouble reproducing this issue, as in a smaller scale test the JVM likely optimized and accidentally produced the correct behavior. But I ran into this issue: converted an existing JsonNode to a TreeTraversingParser using ObjectMapper#treeAsTokens. This returns a JsonParser, which I passed later to the curioswitch marshaller.

Result:

  • InvalidProtoculBufferException: Cannot find field: <field-name> in message <message-name> even though the field was present in the message.

After some debugging I found that the generated parsing code does reference comparisons and thus relies on the parser to produce interned field names as shown in the comment of DoParse.java:

String fieldName = parser.getCurrentName();
// JsonParser returns interned field names, which works well with our class constants.
if (fieldName == "fieldOne" || fieldName == "field_one") 
...

The issue was indeed fixed after I wrote a parser wrapper that interned the getCurrentName() method.

My question is how would you approach fixing it? To me it seems like the JsonParser is not guaranteed to return interned strings, so no assumption should be made that they are.

@chokoswitch
Copy link
Contributor

Hi @mraasvel - thanks for the report and sorry for the issue. I agree that this needs to be fixed. It's been a while but I recall using reference comparison had reasonable impact on benchmarks (most comparisons will be "not equals" in that loop), but I will confirm it and see if it makes sense to use normal comparison all the time, or perhaps have an inspection of the JsonParser implementation that are known to intern.

@chokoswitch
Copy link
Contributor

@mraasvel I'd like to examine the failing case. I tried a simple test but as you found also I guess it doesn't reproduce the failure

https://github.com/curioswitch/protobuf-jackson/blob/main/src/testDatabind/java/org/curioswitch/common/protobuf/json/ObjectMapperTest.java#L125

Would you be able to contribute a test case that triggers the bug?

@mraasvel
Copy link
Author

mraasvel commented Feb 7, 2024

We get a Value message in, we then parse it as a different message using a TokenBuffer in between:

    @Test
    void testConvertProtoType() throws IOException {
        MessageMarshaller marshaller = MessageMarshaller.builder()
                .register(SimpleMessage.class)
                .build();

        // original JsonNode is not parsed from actual json
        Value original = Value.newBuilder()
                .setStructValue(Struct.newBuilder()
                        // Allocate field name with new String(), meaning the key is not interned automatically
                        .putFields(new String("data"), Value.newBuilder().setStringValue("value").build())
                        .build())
                .build();

        ObjectMapper mapper = new ObjectMapper()
            .registerModule(MessageMarshallerModule.of(marshaller));

        // Use a jackson TokenBuffer to convert the message to a JsonNode
        TokenBuffer buffer = new TokenBuffer(mapper.getFactory().getCodec(), false);
        mapper.writeValue(buffer, original);
        JsonParser parser = buffer.asParser();

        // Parse the value into its actual message type
        SimpleMessage value = mapper.readValue(parser, SimpleMessage.class);
        System.out.println(value);
    }

This crashes because now the key is created via new String("data"), and thus won't be a constant or interned since it's dynamic. I would expect this test case to work as the JsonParser is valid.

@mraasvel
Copy link
Author

mraasvel commented Feb 8, 2024

Is the test case sufficient for you to pick it up from here? Or do you need anything else from my side?

@chokoswitch
Copy link
Contributor

Thanks a lot for the great repro @mraasvel! I was able to understand the situation better. I created a simple workaround for now that solves your case by checking if the parser is a jackson-core parser or not, but does not actually handle the case where interning is disabled explicitly. After many years, I haven't heard of anyone running into an issue because of explicit interning, and for now I am hoping that this feature is never disabled. Though it would be better to handle it properly so I'm also checking with Jackson team FasterXML/jackson-core#1211

@mraasvel
Copy link
Author

mraasvel commented Feb 9, 2024

Thanks for the quick responses and help! I didn't notice the base parser class being different was what caused it. Still it seems to be an instance of com.fasterxml.jackson.core.base.ParserMinimalBase, which is part of jackson core by the looks of it?

I think it might be better for us to write a small wrapper around the parser that returns interned field names instead.

@chokoswitch
Copy link
Contributor

@mraasvel Yeah as you've noticed, this is a workaround and not a precise fix - we ran into a corner case here but I am hoping this is enough for now, and it looks like Jackson is receptive to potentially adding more help for us here so will look at pursuing that for a truer fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants