Allow using precise floats in logs #2005

Open: 6 commits to merge into base: main

Conversation

@kasmarian (Member) commented Jan 10, 2025

Description

This PR allows using precise floats in logs.

Several JSON body filters in Logbook use JsonGenerator to reconstruct the payload.
The generator.copyCurrentEvent method has a flaw when writing floating-point numbers: the logged value can differ from the value that was actually in the payload.

This change adds an optional boolean flag to all JSON body filters to use copyCurrentEventExact instead. That may come with a performance impact (not sure how significant).
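
As a minimal sketch of the kind of precision loss described above (this is not Logbook or Jackson code; the class and method names are made up for illustration), round-tripping a numeric token through a double drops digits that a BigDecimal round-trip keeps. The sample value is the one used in this PR's tests.

```java
import java.math.BigDecimal;

public class FloatPrecisionDemo {

    // Round-trips a numeric token through a double, the way a copy that
    // reads floats as double would.
    static String viaDouble(String token) {
        return Double.toString(Double.parseDouble(token));
    }

    // Round-trips the same token through BigDecimal, which keeps every digit.
    static String viaBigDecimal(String token) {
        return new BigDecimal(token).toString();
    }

    public static void main(String[] args) {
        String token = "0.40000000000000002";
        System.out.println(viaDouble(token));     // 0.4
        System.out.println(viaBigDecimal(token)); // 0.40000000000000002
    }
}
```

The double path prints 0.4 because 0.40000000000000002 and 0.4 map to the same double, and Java prints the shortest decimal that identifies it.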

Motivation and Context

#1993

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.

@kasmarian kasmarian added the bugfix Bug fixes and patches label Jan 10, 2025
Collaborator

@msdousti msdousti left a comment

Meta comment: Jackson can encode floats as double, BigDecimal or String. The last option might be both efficient and precise.

@@ -27,7 +38,11 @@ public String compact(final String json) throws IOException {
final JsonGenerator generator = factory.createGenerator(output)) {

while (parser.nextToken() != null) {
generator.copyCurrentEvent(parser);
if (usePreciseFloats) {
Collaborator

Can this logic be refactored?

Member Author

Do you mean avoiding the 2 if blocks? Did it in 27515ed.

Collaborator

Thanks, but we now have three places where private void copyCurrentEvent() is defined. What I meant is, could we define it once, and use it in all places?

Member Author

These 3 places are not really connected to each other. In another language I could potentially do it via an extension function, but here the only way would be to extract it further into a static method that takes 3 arguments instead of 2. I'm not sure I like the idea. Please suggest a better way if you have one in mind.

Member Author

Wrapped everything in strategy as suggested by @whiskeysierra in #2005 (comment).

Not sure if I like the overcomplication (in my view), but I'm OK with keeping it this way if everyone is in favor. Please have a look.

Member Author

The reason is mostly that a JsonGenerator has to be created every time the filter method is called, as it's created per output. At the same time, the JSON BodyFilter classes are created once, when the Logbook is created.
So to preserve the intent (precise floats or not), we need to pass this information in at class creation time, but apply it (call different JsonGenerator methods depending on the strategy) on every filter call, with a different generator instance each time.

Collaborator

@whiskeysierra whiskeysierra Jan 17, 2025

I'd have the strategy invoked at the point where a factory (and output) is given and a generator is needed.
The default returns factory.createGenerator(output).
The string variant returns:

factory.rebuild()
    .enable(JsonWriteFeature.WRITE_NUMBERS_AS_STRINGS)
    .build()
    .createGenerator(output)
    .enable(JsonGenerator.Feature.WRITE_NUMBERS_AS_STRINGS)

(Do we actually need both features there?)

The precise variant returns a custom JsonGenerator that wraps around an existing one and delegates everything, except copyCurrentEvent which it implements by calling copyCurrentEventExact on the underlying generator. You can create an abstract ForwardingJsonGenerator that just delegates every method as-is and have the PreciseFloatJsonGenerator only override the single method. Makes it a bit easier to read.
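
The forwarding idea above can be sketched without Jackson on the classpath. Everything in this block is a hypothetical stand-in: TokenWriter models only the two copy methods of a generator, BufferWriter fakes their lossy and exact behaviour with double and BigDecimal, and none of these names exist in Logbook or Jackson. It only shows the shape of an abstract forwarding class plus a subclass that overrides a single method.

```java
import java.math.BigDecimal;

public class ForwardingDemo {

    // Hypothetical stand-in for the part of a generator that matters here.
    interface TokenWriter {
        void copyCurrentEvent(String floatToken);      // lossy: goes through double
        void copyCurrentEventExact(String floatToken); // exact: keeps all digits
        String output();
    }

    // Plain writer that appends tokens to an in-memory buffer.
    static final class BufferWriter implements TokenWriter {
        private final StringBuilder out = new StringBuilder();

        @Override public void copyCurrentEvent(String t) { out.append(Double.parseDouble(t)); }
        @Override public void copyCurrentEventExact(String t) { out.append(new BigDecimal(t)); }
        @Override public String output() { return out.toString(); }
    }

    // Delegates every method as-is; subclasses override only what they change.
    abstract static class ForwardingWriter implements TokenWriter {
        protected final TokenWriter delegate;

        ForwardingWriter(TokenWriter delegate) { this.delegate = delegate; }

        @Override public void copyCurrentEvent(String t) { delegate.copyCurrentEvent(t); }
        @Override public void copyCurrentEventExact(String t) { delegate.copyCurrentEventExact(t); }
        @Override public String output() { return delegate.output(); }
    }

    // Overrides the single method: a plain copy is routed to the exact copy.
    static final class PreciseFloatWriter extends ForwardingWriter {
        PreciseFloatWriter(TokenWriter delegate) { super(delegate); }

        @Override public void copyCurrentEvent(String t) { delegate.copyCurrentEventExact(t); }
    }

    public static void main(String[] args) {
        TokenWriter plain = new BufferWriter();
        plain.copyCurrentEvent("0.40000000000000002");
        System.out.println(plain.output());   // 0.4

        TokenWriter precise = new PreciseFloatWriter(new BufferWriter());
        precise.copyCurrentEvent("0.40000000000000002");
        System.out.println(precise.output()); // 0.40000000000000002
    }
}
```

Callers keep calling copyCurrentEvent and do not need to know which variant they were handed, which is the readability benefit argued for above.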

Member Author

> (Do we actually need both features there?)

No, it's a transition from one way of setting the feature to another; JsonGenerator.Feature.WRITE_NUMBERS_AS_STRINGS is deprecated. But in another comment (#2005 (comment)), I'm trying to convince @msdousti to give up on the idea of using the WRITE_NUMBERS_AS_STRINGS feature altogether, as it doesn't preserve the precision. I'd just leave precise and non-precise floats for now.

> I'd have the strategy invoked at the point where a factory (and output) is given and a generator is needed.
> The default returns factory.createGenerator(output).

But that's what is happening. Or am I missing something? I can keep the boolean flag in the BodyFilter class constructors to decide which strategy to call, but then it's no better than the previous version.

Collaborator

> But that's what is happening. Or am I missing something?

Almost, yes. You have this dedicated wrapper class around the generator, I'm assuming to reduce the interface size? I'd just get rid of that and the duality of having a wrapper and a creator for each case. You can get away with just a creator in two out of three cases and only for one of them you need a custom generator.

Member Author

I made the wrappers stateless (they now require a JsonGenerator to be supplied directly on each method call) and left only one layer in 2499c37. Please have a look.
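
A rough sketch of that stateless shape, assuming nothing about the actual classes in this PR: the strategy is fixed when the filter is constructed, holds no state, and receives the per-call output on every invocation. FloatCopyStrategy, Filter, and the StringBuilder standing in for a generator are all hypothetical.

```java
import java.math.BigDecimal;

public class StatelessStrategyDemo {

    // Hypothetical stateless strategy: the per-call output is passed in,
    // so one instance can safely be shared across all filter calls.
    interface FloatCopyStrategy {
        void copy(StringBuilder output, String floatToken);
    }

    static final FloatCopyStrategy DEFAULT = (out, t) -> out.append(Double.parseDouble(t));
    static final FloatCopyStrategy PRECISE = (out, t) -> out.append(new BigDecimal(t));

    // Hypothetical filter: created once, called many times.
    static final class Filter {
        private final FloatCopyStrategy strategy;

        Filter(FloatCopyStrategy strategy) {
            this.strategy = strategy;
        }

        String filter(String floatToken) {
            // A fresh output per call, like a generator created per output.
            StringBuilder output = new StringBuilder();
            strategy.copy(output, floatToken);
            return output.toString();
        }
    }

    public static void main(String[] args) {
        System.out.println(new Filter(DEFAULT).filter("0.40000000000000002")); // 0.4
        System.out.println(new Filter(PRECISE).filter("0.40000000000000002")); // 0.40000000000000002
    }
}
```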

@kasmarian
Member Author

> Meta comment: Jackson can encode floats as double, BigDecimal or String. The last option might be both efficient and precise.

I like the idea, but wouldn't this end up showing floats as strings in the output? This may also lead to confusion as it won't represent the actual payload.

@msdousti
Collaborator

> This may also lead to confusion as it won't represent the actual payload.

We can give the library user the option to choose between float representations: Double, BigDecimal, or String. This is IMHO better than the boolean flag.

  • If they want speed + precision, and don't care about number represented as string, they use String.
  • speed but not precision ==> Double
  • low speed, high precision ==> BigDecimal

WDYT?

@kasmarian
Member Author

> > This may also lead to confusion as it won't represent the actual payload.
>
> We can give the library user the option to choose between float representations: Double, BigDecimal, or String. This is IMHO better than the boolean flag.
>
> * If they want speed + precision, and don't care about number represented as string, they use String.
> * speed but not precision ==> Double
> * low speed, high precision ==> BigDecimal
>
> WDYT?

Can you give me a hint how to actually use floats as strings when JsonGenerator is used? All I'm finding so far is a way to annotate DTO classes to use this feature, but I can't find a low-level approach that would fit the current implementation.

@whiskeysierra
Collaborator

whiskeysierra commented Jan 16, 2025 via email

@msdousti
Collaborator

> Can you give me a hint how to actually use floats as strings when JsonGenerator is used? All I'm finding so far is a way to annotate DTO classes to use this feature, but I can't find a low-level approach that would fit the current implementation.

Haven't tested it, but from their wiki:

JsonFactory f = JsonFactory.builder()
   .enable(JsonWriteFeature.WRITE_NUMBERS_AS_STRINGS)
   .build();

...because no one likes boolean flags anymore ¯\_(ツ)_/¯
void shouldLogFloatAsString() {
final String filtered = new CompactingJsonBodyFilter(new NumberAsStringJsonGeneratorWrapperCreator())
.filter("application/custom+json", pretty);
final String compactedWithFloatAsString = "{\"root\":{\"child\":\"text\",\"float_child\":\"0.40000000000000002\"}}";
Member Author

@msdousti regarding the WRITE_NUMBERS_AS_STRINGS feature, it looks like it doesn't work the way we thought.

The JsonToken type is still ID_NUMBER_FLOAT, and JsonGenerator calls _copyCurrentFloatValue, which calls
WriterBasedJsonGenerator.writeNumber(double d), so the double value is written to the writer, just wrapped in a string.

I left this failing test, as well as the NumberAsStringJsonGeneratorWrapperCreator implementation, as a showcase.
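
The effect described above, where the value is converted to a double first and only then wrapped in quotes, can be reproduced with plain JDK code. This is an illustration only, with made-up method names, and it does not call Jackson: quoting after the conversion cannot bring back digits that the conversion already dropped, whereas quoting the original token text keeps them.

```java
public class QuoteAfterConversionDemo {

    // Mimics "convert to double, then write as a string".
    static String quoteAfterDouble(String token) {
        return "\"" + Double.parseDouble(token) + "\"";
    }

    // Quotes the token text as it appeared in the input.
    static String quoteRawText(String token) {
        return "\"" + token + "\"";
    }

    public static void main(String[] args) {
        String token = "0.40000000000000002";
        System.out.println(quoteAfterDouble(token)); // "0.4"
        System.out.println(quoteRawText(token));     // "0.40000000000000002"
    }
}
```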

Collaborator

@msdousti msdousti Jan 19, 2025

@kasmarian

I must admit that enable(JsonWriteFeature.WRITE_NUMBERS_AS_STRINGS) was a bad suggestion, as it applies when writing an object to JSON, not when reading a JSON string.

Unfortunately, I could not find a similar JsonReadFeature.

Alternatively, I came up with the following code. The compact function can be adapted to look like this or something similar.

import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonGenerator;
import com.fasterxml.jackson.core.JsonParser;

import java.io.CharArrayWriter;

import static com.fasterxml.jackson.core.JsonToken.VALUE_NUMBER_FLOAT;

enum JsonNumberFormat {
    AS_FLOAT,
    AS_DOUBLE,
    AS_BIG_DECIMAL,
    AS_STRING,
    AS_DEFAULT,
}

public class Example {


    private static String readJson(String json, JsonNumberFormat format) throws Exception {
        final JsonFactory factory = new JsonFactory();

        try (
                final CharArrayWriter output = new CharArrayWriter(json.length());
                final JsonParser parser = factory.createParser(json);
                final JsonGenerator generator = factory.createGenerator(output)) {

            while (parser.nextToken() != null) {
                if (parser.getCurrentToken() == VALUE_NUMBER_FLOAT) {
                    switch (format) {
                        case AS_FLOAT -> generator.writeNumber(parser.getFloatValue());
                        case AS_DOUBLE -> generator.writeNumber(parser.getValueAsDouble());
                        case AS_BIG_DECIMAL -> generator.writeNumber(parser.getDecimalValue());
                        case AS_STRING -> generator.writeString(parser.getValueAsString());
                        default -> generator.copyCurrentEvent(parser);
                    }
                } else {
                    generator.copyCurrentEvent(parser);
                }
            }

            generator.flush();
            return output.toString();
        }
    }

    public static void main(String[] args) throws Exception {
        final String json = """
            {
                "x": [0.40000000000000002, 1, "a"]
            }
            """;

        for (JsonNumberFormat format : JsonNumberFormat.values()) {
            System.out.println(format + " --> " + readJson(json, format));
        }
    }
}

Output:

AS_FLOAT --> {"x":[0.4,1,"a"]}
AS_DOUBLE --> {"x":[0.4,1,"a"]}
AS_BIG_DECIMAL --> {"x":[0.40000000000000002,1,"a"]}
AS_STRING --> {"x":["0.40000000000000002",1,"a"]}
AS_DEFAULT --> {"x":[0.4,1,"a"]}

Member Author

Good suggestion to use plain parser.getValueAsString()! It's straightforward, and the functionality is on the same level that these filters operate on, as opposed to factory feature flags, which I didn't like as much. I applied it in NumberAsStringJsonGeneratorWrapper. Please have a look.
If you think we should switch from the strategy approach to feature flags (enums), let's discuss it in the other thread, please.

Labels
bugfix Bug fixes and patches
4 participants