Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change include of json.h in json-builder.h. Change %g to %f because Python.json doesn't like scientific notation #6

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

cabeaulac
Copy link

Changed include to use quotes instead of angle brackets.
This allows the include to be found in the local include path.

  Changed include to use quotes instead of angle brackets.
  This allows the include to be found in the local include path.
@cabeaulac cabeaulac changed the title json-builder.h Change include of json.h in json-builder.h Oct 6, 2014
The Python json module does not like scientific notation when loading a string using the json.loads() method.
@cabeaulac cabeaulac changed the title Change include of json.h in json-builder.h Change include of json.h in json-builder.h. Change %g to %f because Python.json doesn't like scientific notation Oct 8, 2014
@cabeaulac
Copy link
Author

I just added a commit to this pull request. Changed %g to %f when using scanf to convert double into a string. %g creates doubles in scientific notation for large values and Python.json.loads(my_json) does not play nicely with scientific notation.

@LB--
Copy link
Member

LB-- commented Oct 9, 2014

The JSON standard explicitly allows for scientific notation.

Also it's not a good idea to combine two separate things into one pull request.

@cabeaulac
Copy link
Author

It does but Python.json isn't parsing it correctly. What shall we do? Change to %f so the Python binding works or maybe there's a way to tell the Python.json.loads function to take scientific notation that I don't know about yet. What do you think?

On Oct 8, 2014, at 5:02 PM, LB-- [email protected] wrote:

The JSON standard explicitly allows for scientific notation.


Reply to this email directly or view it on GitHub.

@jamesamcl
Copy link
Collaborator

Could make it an option?

@cabeaulac
Copy link
Author

It's changed in my version to %f so the JSON coming from an embedded C
application to Python will parse with the Python.json module. If you want
to make it an option, that sounds great to me. I don't know how you'd want
to do that. If you want to do it, I'll pull down the latest after you push.
Otherwise, you can tell me how you'd like it done and I'll add it to my
pull request.

Thanks,
Chad

On Thu, Oct 9, 2014 at 2:16 AM, James McLaughlin [email protected]
wrote:

Could make it an option?


Reply to this email directly or view it on GitHub
#6 (comment).

@LB--
Copy link
Member

LB-- commented Oct 9, 2014

Probably by adding an option to json-builder.h#L113-L145

#define json_serialize_opt_no_scientific_notation  (1 << 6)

And then changing the implementation to check for that option.

json-builder.h
  Added the json_serialize_opt_no_scientific_notation option.
  This changes the serialization code to use %f instead of %g when it's set.
@cabeaulac
Copy link
Author

I pushed a change to implement the json_serialize_opt_no_scientific_notation. I'm running with it in my code and in my CUnit tests now. It's working good so far.

total += snprintf (NULL, 0, "%g", value->u.dbl);
}
if (value->u.dbl - floor (value->u.dbl) < 0.001)
total += 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these lines are 1 space too shallow - James uses 3-space indents.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't change the 2 space indent code. That was already there.

Sent from my iPhone

On Oct 9, 2014, at 3:05 PM, LB-- [email protected] wrote:

In json-builder.c:

@@ -663,11 +670,16 @@ size_t json_measure_ex (json_value * value, json_serialize_opts opts)
break;

      case json_double:

- total += snprintf (NULL, 0, "%g", value->u.dbl);

  •        if (value->u.dbl - floor (value->u.dbl) < 0.001)
    
  •            total += 2;
    
  •       if (no_scientific_notation)
    
  •       {
    
  •          total += snprintf (NULL, 0, "%f", value->u.dbl);
    
  •       }
    
  •       else
    
  •       {
    
  •          total += snprintf (NULL, 0, "%g", value->u.dbl);  
    
  •       }
    
  •       if (value->u.dbl - floor (value->u.dbl) < 0.001)
    
  •          total += 2;
    
    I think these lines are 1 space too shallow - James uses 3-space indents.


Reply to this email directly or view it on GitHub.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compare the original (deleted) lines 669 and 670 to your new 681 and 682

@cabeaulac
Copy link
Author

Hmm. I'm not seeing it. Please elaborate. I'm brand new to this codebase and can be easily missing something.
Original

 if (value->u.dbl - floor (value->u.dbl) < 0.001)
                total += 2;

New

if (value->u.dbl - floor (value->u.dbl) < 0.001)
              total += 2;

@jamesamcl
Copy link
Collaborator

I think this would be more concise as a ternary if on the format anyway?

---- Chad Beaulac wrote ----

Hmm. I'm not seeing it. Please elaborate. I'm brand new to this codebase and can be easily missing something.
Original

if (value->u.dbl - floor (value->u.dbl) < 0.001) total += 2;

New

if (value->u.dbl - floor (value->u.dbl) < 0.001) total += 2;


Reply to this email directly or view it on GitHub.

@LB--
Copy link
Member

LB-- commented Oct 10, 2014

Hmm. I'm not seeing it.

image

I don't know how strict @udp is on formatting, I just know he prefers 3 spaces per indent.

@cabeaulac
Copy link
Author

Ternary operator sounds good.

Sent from my iPhone

On Oct 10, 2014, at 8:48 AM, James McLaughlin [email protected] wrote:

I think this would be more concise as a ternary if on the format anyway?

---- Chad Beaulac wrote ----

Hmm. I'm not seeing it. Please elaborate. I'm brand new to this codebase and can be easily missing something.
Original

if (value->u.dbl - floor (value->u.dbl) < 0.001) total += 2;

New

if (value->u.dbl - floor (value->u.dbl) < 0.001) total += 2;


Reply to this email directly or view it on GitHub.

Reply to this email directly or view it on GitHub.

@cabeaulac
Copy link
Author

Ok. It seems like there's definitely something wrong. Some code is returning shorter values for json_measure_ex than the string returned by json_serialize_ex. I'm writing unit tests to address it now.

@cabeaulac
Copy link
Author

My Python unittest is working with my scientific notation. So, I believe the issue I was originally having is due to json_measure and json_serialize returning different length strings. I'll figure it out and post resolution. I'll probably back out the json_serialize_opt_no_scientific_notation option since it appears that wasn't the issue.

@cabeaulac
Copy link
Author

In json_serialize_ex, can somebody tell me the purpose of adding two spaces to the total if the subtraction condition is met? Thanks

         case json_double:

            total += snprintf (NULL, 0, "%g", value->u.dbl);

            if (value->u.dbl - floor (value->u.dbl) < 0.001)
                total += 2;

@jamesamcl
Copy link
Collaborator

It was added in 4885ca3. It's to allow room for serialize adding .0 when integer double values get serialized without a fractional part (which it does to make sure they're read back as doubles, rather than integers).

@cabeaulac
Copy link
Author

Oh. Ok. That makes total sense. I'll write some CUnit tests that target
that area of the code. The json_measure function is returning a value 2
characters longer than necessary sometimes. I'm trying to figure out why.
It's only when using floats. And, two extra zeros at the end of a byte
array makes the Python json.loads(my_doc) raise and exception.

On Sat, Oct 11, 2014 at 2:32 AM, James McLaughlin [email protected]
wrote:

It was added in 4885ca3
4885ca3.
It's to allow room for serialize adding .0 when integer double values get
serialized without a fractional part (which it does to make sure they're
read back as doubles, rather than integers).


Reply to this email directly or view it on GitHub
#6 (comment).

@LB--
Copy link
Member

LB-- commented Oct 11, 2014

%g doesn't seem to work as expected: http://ideone.com/Z3wT6A

I think it would take less effort to just try both %f and %e manually and choose one instead of using %g and then trying to manually append .0

http://en.cppreference.com/w/cpp/io/c/fprintf

Unless alternative representation is requested the trailing zeros are removed, also the decimal point character is removed if no fractional part is left.

I don't know what it means by "alternative representation".

@cabeaulac
Copy link
Author

How about this?

  1. We use %f by default
  2. Let user pass in the %f format string. Like "8.2". If the user provides a string that says x.0, we change the zero to a one.
  3. Let client specify %e as an option. Format string is not used if this option is specified.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants