Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stop relying on ujson (fix #589, #587, #507) #637

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

ziirish
Copy link
Collaborator

@ziirish ziirish commented May 6, 2019

ujson does not seem to be maintained anymore. Some bugs reported months (or years ago) start to affect our users.
Since we don't ship with ujson in our requirements and this was just a "optional performance improvement" I think it is safe to remove it.

Note: this reverts #318

@ziirish ziirish added this to the 0.12.1 milestone May 6, 2019
@ziirish ziirish requested a review from SteadBytes May 6, 2019 08:32
@ziirish ziirish modified the milestones: 0.12.1, 0.12.2 May 6, 2019
@coveralls
Copy link

coveralls commented May 6, 2019

Coverage Status

Coverage increased (+0.008%) to 96.836% when pulling 912920e on fix-drop-ujson into e911078 on master.

@ziirish
Copy link
Collaborator Author

ziirish commented May 6, 2019

Coverall seems drunk.
It noticed there is one line less than in the previous commit dropping the number of covered lines from 11 to 10... right, but it did not notice there are several lines less than in the previous commit.

@ziirish
Copy link
Collaborator Author

ziirish commented May 6, 2019

As discussed with @noirbizarre we give people the ability to choose their own json serializer through a new dedicated RESTPLUS_JSON_SERIALIZER config option.
The default being to use python's build-in serializer by default.

This way if they are not affected by a ujson bug they can still use it as an opt-in option, or they can even use whatever serializer they like (there are plenty nowadays)

Copy link
Collaborator

@j5awry j5awry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Future work -- this solution does the import on each run of output_json. That might slow things down. Suggest splitting this into a separate module that's instantiate on app instantiation, and passed around...or some other option that keeps the choice around for the length of the app and reuses

@ziirish
Copy link
Collaborator Author

ziirish commented May 6, 2019

@j5awry actually the import only occurs once (during the first call) then the serializer is cached within the serializer global variable.

I just ran some tests to confirm the caching is working as it should and it turns out it does =)

@j5awry
Copy link
Collaborator

j5awry commented May 6, 2019

@j5awry actually the import only occurs once (during the first call) then the serializer is cached within the serializer global variable.

I just ran some tests to confirm the caching is working as it should and it turns out it does =)

Sweet! i'd still like it as a separate module for clarity, but that's a personal thing. If it sticks around cached in the global, then we're good! Again a personal thing with an aversion to global :)

@SteadBytes
Copy link
Collaborator

Does using the default Python json module have any performance implications over the current usage of ujson?

@ziirish
Copy link
Collaborator Author

ziirish commented May 9, 2019

I don't know whether this is representative or not but here are the benchmark results.

Without ujson:

>> Run benchmarks
Test session starts (platform: linux, Python 3.6.8, pytest 3.2.3, pytest-sugar 0.9.0)
benchmark: 3.1.1 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=2 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /opt/workspace/flask-restplus, inifile: setup.cfg
plugins: sugar-0.9.0, profiling-1.2.11, mock-1.6.3, flask-0.10.0, faker-2.0.0, cov-2.5.1, benchmark-3.1.1

 tests/benchmarks/bench_marshalling.py ✓✓✓✓                                                                                                                                                                                      67% ██████▋   
 tests/benchmarks/bench_swagger.py ✓✓                                                                                                                                                                                           100% ██████████

------------------------------------------------------------------------------------------ benchmark 'marshalling': 4 tests -----------------------------------------------------------------------------------------
Name (time in us)                         Min                    Max                  Mean              StdDev                Median                IQR            Outliers         OPS            Rounds  Iterations
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
bench_marshal_simple                 429.9954 (1.0)       2,721.3283 (1.44)       447.7427 (1.0)       49.8622 (1.0)        447.1317 (1.0)      13.0665 (1.0)        16;106  2,233.4255 (1.0)        2220           1
bench_marshal_simple_with_mask     1,173.6453 (2.73)      1,895.1111 (1.0)      1,216.0383 (2.72)      72.7597 (1.46)     1,200.3686 (2.68)     21.7594 (1.67)        54;70    822.3425 (0.37)       1396           1
bench_marshal_nested               1,844.3316 (4.29)      2,925.9734 (1.54)     1,885.2151 (4.21)      49.8984 (1.00)     1,877.7251 (4.20)     26.5837 (2.03)        62;74    530.4434 (0.24)       1021           1
bench_marshal_nested_with_mask     2,606.2354 (6.06)     20,920.8205 (11.04)    2,762.5558 (6.17)     783.2512 (15.71)    2,675.0267 (5.98)     71.7286 (5.49)         8;79    361.9836 (0.16)        736           1
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------ benchmark 'swagger': 2 tests -----------------------------------------------------------------------------------------
Name (time in us)                     Min                   Max                  Mean              StdDev                Median                 IQR            Outliers         OPS            Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
bench_swagger_specs_cached       732.1090 (1.0)      1,028.4223 (1.0)        753.2495 (1.0)       36.4551 (1.0)        738.8182 (1.0)       18.8295 (1.0)         30;33  1,327.5814 (1.0)         276           1
bench_swagger_specs            6,795.6112 (9.28)     7,992.8115 (7.77)     6,943.3125 (9.22)     143.5110 (3.94)     6,859.9358 (9.29)     217.2068 (11.54)        36;4    144.0235 (0.11)        342           1
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Legend:
  Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
  OPS: Operations Per Second, computed as 1 / Mean

Results (12.57s):
       6 passed

With app.config['RESTPLUS_JSON_SERIALIZER'] = 'ujson':

>> Run benchmarks
Test session starts (platform: linux, Python 3.6.8, pytest 3.2.3, pytest-sugar 0.9.0)
benchmark: 3.1.1 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=2 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /opt/workspace/flask-restplus, inifile: setup.cfg
plugins: sugar-0.9.0, profiling-1.2.11, mock-1.6.3, flask-0.10.0, faker-2.0.0, cov-2.5.1, benchmark-3.1.1

 tests/benchmarks/bench_marshalling.py ✓✓✓✓                                                                                                                                                                                      67% ██████▋   
 tests/benchmarks/bench_swagger.py ✓✓                                                                                                                                                                                           100% ██████████

----------------------------------------------------------------------------------------- benchmark 'marshalling': 4 tests -----------------------------------------------------------------------------------------
Name (time in us)                         Min                   Max                  Mean              StdDev                Median                IQR            Outliers         OPS            Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
bench_marshal_simple                 431.2173 (1.0)      1,028.7687 (1.0)        443.5333 (1.0)       16.7665 (1.0)        441.4711 (1.0)       8.2292 (1.0)       187;235  2,254.6223 (1.0)        2248           1
bench_marshal_simple_with_mask     1,174.5617 (2.72)     2,034.2432 (1.98)     1,216.6918 (2.74)      79.1666 (4.72)     1,196.5148 (2.71)     23.9285 (2.91)       74;111    821.9008 (0.36)       1377           1
bench_marshal_nested               1,799.6989 (4.17)     2,629.2317 (2.56)     1,847.2305 (4.16)      50.8462 (3.03)     1,838.7847 (4.17)     26.0845 (3.17)        56;63    541.3510 (0.24)       1026           1
bench_marshal_nested_with_mask     2,561.9529 (5.94)     4,902.0723 (4.76)     2,656.1316 (5.99)     144.1454 (8.60)     2,619.2032 (5.93)     51.0439 (6.20)        45;66    376.4874 (0.17)        643           1
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------- benchmark 'swagger': 2 tests -------------------------------------------------------------------------------------------
Name (time in us)                     Min                    Max                  Mean                StdDev                Median                 IQR            Outliers         OPS            Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
bench_swagger_specs_cached       730.5555 (1.0)         789.5790 (1.0)        738.7847 (1.0)          8.5986 (1.0)        735.3090 (1.0)        3.2894 (1.0)         52;53  1,353.5743 (1.0)         284           1
bench_swagger_specs            6,796.8778 (9.30)     43,136.9580 (54.63)    7,136.4213 (9.66)     1,976.2746 (229.84)   6,988.0933 (9.50)     234.0488 (71.15)        1;28    140.1263 (0.10)        339           1
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Legend:
  Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
  OPS: Operations Per Second, computed as 1 / Mean

Results (11.57s):
       6 passed

@SteadBytes
Copy link
Collaborator

SteadBytes commented May 10, 2019

Looks like for the average case (mean) the two are fairly similar, however ujson has a much smaller standard deviation for the marshalling benchmarks - interesting! I'm glad it's not a horrendous slow down or anything 😅, so I'm happy with dropping ujson default.

Is it worth adding test to ensure that importing the configured RESTPLUS_JSON_SERIALIZER is correct? The test_representations test is ensuring that the serialization produces correct outputs but not that the correct serializer is imported and used.

On a separate note, the addition of the configuration options documentation is nice 👍 The SWAGGER_SUPPORTED_SUBMIT_METHODS option that was merged the other day should be in there too now.

I'll create a PR to this PR to add that and the serializer import tests so we can get this merged in 😄

EDIT: @ziirish whilst adding some tests for the serializer import, I have found quite a significant issue - take a look at my comment https://github.com/noirbizarre/flask-restplus/pull/637/files#r282764702 and let me know your thoughts 😄

@ziirish ziirish force-pushed the fix-drop-ujson branch 2 times, most recently from 7e885a3 to 9325cc6 Compare May 10, 2019 14:29
# then enforce a custom serializer
current_app.config['RESTPLUS_JSON_SERIALIZER'] = 'ujson'
r2 = rep.output_json(payload, 200)
assert uloads(r2.get_data(True)) == uloads(udumps(payload))
Copy link
Collaborator

@SteadBytes SteadBytes May 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is testing several different things within a single test case - how do you feel about splitting it up into several test cases so that it's clearer to see the individual assertions? Something like:

def test_representations_serialization_output_correct():
    # original test comparing get_data and dumps outputs

def test_config_custom_serializer_is_module():
    # current_app.config['RESTPLUS_JSON_SERIALIZER'] = 'ujson'
    # test .dumps method is used 

def test_config_custom_serializer_is_function():
    # current_app.config['RESTPLUS_JSON_SERIALIZER'] = 'my_custom_json.module.dump_method'
    # test correct function imported

def test_config_custom_serializer_fallback():
    # test fallback to json.dumps

This makes the intended behaviour clearer and is easier to see from a test failure where the issue is 😄

I think it's a good idea to at least log a message when a fallback to json.dumps takes place. It can fallback without breaking, but the user's should be aware that it's happening (adding it to the docs is still good though so thank you for that). Perhaps something like:

import warnings

# after failing to import user's custom serializer
warnings.warn(
    "unable to import RESTPLUS_JSON_SERIALIZER={}, falling back to json.dumps".format(
        current_app.config["RESTPLUS_JSON_SERIALIZER"]
    )
)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

I have spit the test.
I'm not quite sure about the warnings.warn yet though. I think we should define a communication with upper apps policy first so people know how to deal with those.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great thank you! Sorry, I'm not sure what you mean by

communication with upper apps policy

Could you please clarify? Did you mean define a way in which Flask-RESTPlus should interact with other applications?

Using warnings/warn is a common practice for Flask extensions that wish to communicate misconfiguration/possible error to the user. For example:

Or using logger.warning is also a common approach.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please clarify? Did you mean define a way in which Flask-RESTPlus should interact with other applications?

I mean we should document what kind of communication people using FLask-RESTPlus should expect.
I know warnings.warn is a common practice, but we should document this.
The reason is the default filter might hide some messages hence this doesn't provide any help to the users:

import warnings

warnings.warn('You should not use this', DeprecationWarning)
warnings.warn('Hey, something went wrong', UserWarning)

Result in:

$ python /tmp/warn.py
/tmp/warn.py:4: UserWarning: Hey, something went wrong
  warnings.warn('Hey, something went wrong', UserWarning)

So by default, unless people know they should expect a DeprecationWarning they won't notice it.

Copy link
Collaborator

@SteadBytes SteadBytes May 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes I understand - thank you for clarifying 😄 Yes I agree that this should definitely be documented - it's good to be clear to the users so thank you for pointing that out 👍 .

How about changing this:
https://github.com/noirbizarre/flask-restplus/pull/637/files#diff-dce4048f760a2254db20f3a78d7fcfa1R362 to say something like:

Flask-RESTPlus will always fallback to the default json.dumps serializer if it cannot manage to import the one you configured. This is indicated by a warning during initialisation.

And then include an example of the output when starting the application, for example using logging:

* Serving Flask app "t" (lazy loading)
 * Environment: production
   WARNING: Do not use the development server in a production environment.
   Use a production WSGI server instead.
 * Debug mode: on
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
 * Restarting with stat
unable to import RESTPLUS_JSON_SERIALIZER=broken_seralizer.dumps, falling back to json.dumps
 * Debugger is active!
 * Debugger PIN: 627-628-298

Or using warnings:

 * Serving Flask app "t" (lazy loading)
 * Environment: production
   WARNING: Do not use the development server in a production environment.
   Use a production WSGI server instead.
 * Debug mode: on
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
 * Restarting with stat
/home/ben/Documents/flask-restplus/flask_restplus/api.py:213: UserWarning: unable to import RESTPLUS_JSON_SERIALIZER=broken_seralizer.dumps, falling back to json.dumps
  warnings.warn("unable to import RESTPLUS_JSON_SERIALIZER=broken_seralizer.dumps, falling back to json.dumps")
 * Debugger is active!
 * Debugger PIN: 627-628-298

I'm just very hesitant to fallback completely silently (even if it is documented) as the user will not be able to tell when it has happened.

What do you think? 😄

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only problem I see with this is the code path in which this warning will be raised is not deterministic (it will happen only after some route returns and if it needs marshalling).

Maybe the solution is to load the serializer within the Api constructor.

This way, people know when/where to catch the warnings.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added the warnings.warn as suggested.
This should happen early in the initialization of the Api unless you pass it a Blueprint instead of a Flask app (in which case the warning will be raised when calling a route that needs serialization.
The loaded serializer is cached within the app.config. I don't think this is the best place to do so, but I didn't find any other...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants