stop relying on ujson (fix #589, #587, #507) #637

ziirish · 2019-05-06T08:32:22Z

ujson does not seem to be maintained anymore. Some bugs reported months (or years ago) start to affect our users.
Since we don't ship with ujson in our requirements and this was just a "optional performance improvement" I think it is safe to remove it.

Note: this reverts #318

coveralls · 2019-05-06T08:41:34Z

Coverage increased (+0.008%) to 96.836% when pulling 912920e on fix-drop-ujson into e911078 on master.

ziirish · 2019-05-06T08:47:54Z

Coverall seems drunk.
It noticed there is one line less than in the previous commit dropping the number of covered lines from 11 to 10... right, but it did not notice there are several lines less than in the previous commit.

ziirish · 2019-05-06T12:09:54Z

As discussed with @noirbizarre we give people the ability to choose their own json serializer through a new dedicated RESTPLUS_JSON_SERIALIZER config option.
The default being to use python's build-in serializer by default.

This way if they are not affected by a ujson bug they can still use it as an opt-in option, or they can even use whatever serializer they like (there are plenty nowadays)

j5awry

Future work -- this solution does the import on each run of output_json. That might slow things down. Suggest splitting this into a separate module that's instantiate on app instantiation, and passed around...or some other option that keeps the choice around for the length of the app and reuses

ziirish · 2019-05-06T16:24:48Z

@j5awry actually the import only occurs once (during the first call) then the serializer is cached within the serializer global variable.

I just ran some tests to confirm the caching is working as it should and it turns out it does =)

j5awry · 2019-05-06T17:14:50Z

@j5awry actually the import only occurs once (during the first call) then the serializer is cached within the serializer global variable.

I just ran some tests to confirm the caching is working as it should and it turns out it does =)

Sweet! i'd still like it as a separate module for clarity, but that's a personal thing. If it sticks around cached in the global, then we're good! Again a personal thing with an aversion to global :)

SteadBytes · 2019-05-08T16:21:30Z

Does using the default Python json module have any performance implications over the current usage of ujson?

ziirish · 2019-05-09T08:13:39Z

I don't know whether this is representative or not but here are the benchmark results.

Without ujson:

>> Run benchmarks
Test session starts (platform: linux, Python 3.6.8, pytest 3.2.3, pytest-sugar 0.9.0)
benchmark: 3.1.1 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=2 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /opt/workspace/flask-restplus, inifile: setup.cfg
plugins: sugar-0.9.0, profiling-1.2.11, mock-1.6.3, flask-0.10.0, faker-2.0.0, cov-2.5.1, benchmark-3.1.1

 tests/benchmarks/bench_marshalling.py ✓✓✓✓                                                                                                                                                                                      67% ██████▋   
 tests/benchmarks/bench_swagger.py ✓✓                                                                                                                                                                                           100% ██████████

------------------------------------------------------------------------------------------ benchmark 'marshalling': 4 tests -----------------------------------------------------------------------------------------
Name (time in us)                         Min                    Max                  Mean              StdDev                Median                IQR            Outliers         OPS            Rounds  Iterations
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
bench_marshal_simple                 429.9954 (1.0)       2,721.3283 (1.44)       447.7427 (1.0)       49.8622 (1.0)        447.1317 (1.0)      13.0665 (1.0)        16;106  2,233.4255 (1.0)        2220           1
bench_marshal_simple_with_mask     1,173.6453 (2.73)      1,895.1111 (1.0)      1,216.0383 (2.72)      72.7597 (1.46)     1,200.3686 (2.68)     21.7594 (1.67)        54;70    822.3425 (0.37)       1396           1
bench_marshal_nested               1,844.3316 (4.29)      2,925.9734 (1.54)     1,885.2151 (4.21)      49.8984 (1.00)     1,877.7251 (4.20)     26.5837 (2.03)        62;74    530.4434 (0.24)       1021           1
bench_marshal_nested_with_mask     2,606.2354 (6.06)     20,920.8205 (11.04)    2,762.5558 (6.17)     783.2512 (15.71)    2,675.0267 (5.98)     71.7286 (5.49)         8;79    361.9836 (0.16)        736           1
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------ benchmark 'swagger': 2 tests -----------------------------------------------------------------------------------------
Name (time in us)                     Min                   Max                  Mean              StdDev                Median                 IQR            Outliers         OPS            Rounds  Iterations
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
bench_swagger_specs_cached       732.1090 (1.0)      1,028.4223 (1.0)        753.2495 (1.0)       36.4551 (1.0)        738.8182 (1.0)       18.8295 (1.0)         30;33  1,327.5814 (1.0)         276           1
bench_swagger_specs            6,795.6112 (9.28)     7,992.8115 (7.77)     6,943.3125 (9.22)     143.5110 (3.94)     6,859.9358 (9.29)     217.2068 (11.54)        36;4    144.0235 (0.11)        342           1
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Legend:
  Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
  OPS: Operations Per Second, computed as 1 / Mean

Results (12.57s):
       6 passed

With app.config['RESTPLUS_JSON_SERIALIZER'] = 'ujson':

>> Run benchmarks
Test session starts (platform: linux, Python 3.6.8, pytest 3.2.3, pytest-sugar 0.9.0)
benchmark: 3.1.1 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=2 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /opt/workspace/flask-restplus, inifile: setup.cfg
plugins: sugar-0.9.0, profiling-1.2.11, mock-1.6.3, flask-0.10.0, faker-2.0.0, cov-2.5.1, benchmark-3.1.1

 tests/benchmarks/bench_marshalling.py ✓✓✓✓                                                                                                                                                                                      67% ██████▋   
 tests/benchmarks/bench_swagger.py ✓✓                                                                                                                                                                                           100% ██████████

----------------------------------------------------------------------------------------- benchmark 'marshalling': 4 tests -----------------------------------------------------------------------------------------
Name (time in us)                         Min                   Max                  Mean              StdDev                Median                IQR            Outliers         OPS            Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
bench_marshal_simple                 431.2173 (1.0)      1,028.7687 (1.0)        443.5333 (1.0)       16.7665 (1.0)        441.4711 (1.0)       8.2292 (1.0)       187;235  2,254.6223 (1.0)        2248           1
bench_marshal_simple_with_mask     1,174.5617 (2.72)     2,034.2432 (1.98)     1,216.6918 (2.74)      79.1666 (4.72)     1,196.5148 (2.71)     23.9285 (2.91)       74;111    821.9008 (0.36)       1377           1
bench_marshal_nested               1,799.6989 (4.17)     2,629.2317 (2.56)     1,847.2305 (4.16)      50.8462 (3.03)     1,838.7847 (4.17)     26.0845 (3.17)        56;63    541.3510 (0.24)       1026           1
bench_marshal_nested_with_mask     2,561.9529 (5.94)     4,902.0723 (4.76)     2,656.1316 (5.99)     144.1454 (8.60)     2,619.2032 (5.93)     51.0439 (6.20)        45;66    376.4874 (0.17)        643           1
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

------------------------------------------------------------------------------------------- benchmark 'swagger': 2 tests -------------------------------------------------------------------------------------------
Name (time in us)                     Min                    Max                  Mean                StdDev                Median                 IQR            Outliers         OPS            Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
bench_swagger_specs_cached       730.5555 (1.0)         789.5790 (1.0)        738.7847 (1.0)          8.5986 (1.0)        735.3090 (1.0)        3.2894 (1.0)         52;53  1,353.5743 (1.0)         284           1
bench_swagger_specs            6,796.8778 (9.30)     43,136.9580 (54.63)    7,136.4213 (9.66)     1,976.2746 (229.84)   6,988.0933 (9.50)     234.0488 (71.15)        1;28    140.1263 (0.10)        339           1
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Legend:
  Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
  OPS: Operations Per Second, computed as 1 / Mean

Results (11.57s):
       6 passed

SteadBytes · 2019-05-10T05:35:27Z

Looks like for the average case (mean) the two are fairly similar, however ujson has a much smaller standard deviation for the marshalling benchmarks - interesting! I'm glad it's not a horrendous slow down or anything 😅, so I'm happy with dropping ujson default.

Is it worth adding test to ensure that importing the configured RESTPLUS_JSON_SERIALIZER is correct? The test_representations test is ensuring that the serialization produces correct outputs but not that the correct serializer is imported and used.

On a separate note, the addition of the configuration options documentation is nice 👍 The SWAGGER_SUPPORTED_SUBMIT_METHODS option that was merged the other day should be in there too now.

I'll create a PR to this PR to add that and the serializer import tests so we can get this merged in 😄

EDIT: @ziirish whilst adding some tests for the serializer import, I have found quite a significant issue - take a look at my comment https://github.com/noirbizarre/flask-restplus/pull/637/files#r282764702 and let me know your thoughts 😄

flask_restplus/representations.py

SteadBytes · 2019-05-11T17:05:35Z

tests/test_representations.py

+    # then enforce a custom serializer
+    current_app.config['RESTPLUS_JSON_SERIALIZER'] = 'ujson'
+    r2 = rep.output_json(payload, 200)
+    assert uloads(r2.get_data(True)) == uloads(udumps(payload))


This is testing several different things within a single test case - how do you feel about splitting it up into several test cases so that it's clearer to see the individual assertions? Something like:

def test_representations_serialization_output_correct(): # original test comparing get_data and dumps outputs def test_config_custom_serializer_is_module(): # current_app.config['RESTPLUS_JSON_SERIALIZER'] = 'ujson' # test .dumps method is used def test_config_custom_serializer_is_function(): # current_app.config['RESTPLUS_JSON_SERIALIZER'] = 'my_custom_json.module.dump_method' # test correct function imported def test_config_custom_serializer_fallback(): # test fallback to json.dumps

This makes the intended behaviour clearer and is easier to see from a test failure where the issue is 😄

I think it's a good idea to at least log a message when a fallback to json.dumps takes place. It can fallback without breaking, but the user's should be aware that it's happening (adding it to the docs is still good though so thank you for that). Perhaps something like:

import warnings # after failing to import user's custom serializer warnings.warn( "unable to import RESTPLUS_JSON_SERIALIZER={}, falling back to json.dumps".format( current_app.config["RESTPLUS_JSON_SERIALIZER"] ) )

Thanks!

I have spit the test.
I'm not quite sure about the warnings.warn yet though. I think we should define a communication with upper apps policy first so people know how to deal with those.

Great thank you! Sorry, I'm not sure what you mean by

communication with upper apps policy

Could you please clarify? Did you mean define a way in which Flask-RESTPlus should interact with other applications?

Using warnings/warn is a common practice for Flask extensions that wish to communicate misconfiguration/possible error to the user. For example:

Flask-Login

Flask-Debugtoolbar

Flask-Limiter

Or using logger.warning is also a common approach.

Could you please clarify? Did you mean define a way in which Flask-RESTPlus should interact with other applications?

I mean we should document what kind of communication people using FLask-RESTPlus should expect.
I know warnings.warn is a common practice, but we should document this.
The reason is the default filter might hide some messages hence this doesn't provide any help to the users:

import warnings warnings.warn('You should not use this', DeprecationWarning) warnings.warn('Hey, something went wrong', UserWarning)

Result in:

$ python /tmp/warn.py /tmp/warn.py:4: UserWarning: Hey, something went wrong warnings.warn('Hey, something went wrong', UserWarning)

So by default, unless people know they should expect a DeprecationWarning they won't notice it.

Ah yes I understand - thank you for clarifying 😄 Yes I agree that this should definitely be documented - it's good to be clear to the users so thank you for pointing that out 👍 .

How about changing this:
https://github.com/noirbizarre/flask-restplus/pull/637/files#diff-dce4048f760a2254db20f3a78d7fcfa1R362 to say something like:

Flask-RESTPlus will always fallback to the default json.dumps serializer if it cannot manage to import the one you configured. This is indicated by a warning during initialisation.

And then include an example of the output when starting the application, for example using logging:

* Serving Flask app "t" (lazy loading) * Environment: production WARNING: Do not use the development server in a production environment. Use a production WSGI server instead. * Debug mode: on * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit) * Restarting with stat unable to import RESTPLUS_JSON_SERIALIZER=broken_seralizer.dumps, falling back to json.dumps * Debugger is active! * Debugger PIN: 627-628-298

Or using warnings:

* Serving Flask app "t" (lazy loading) * Environment: production WARNING: Do not use the development server in a production environment. Use a production WSGI server instead. * Debug mode: on * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit) * Restarting with stat /home/ben/Documents/flask-restplus/flask_restplus/api.py:213: UserWarning: unable to import RESTPLUS_JSON_SERIALIZER=broken_seralizer.dumps, falling back to json.dumps warnings.warn("unable to import RESTPLUS_JSON_SERIALIZER=broken_seralizer.dumps, falling back to json.dumps") * Debugger is active! * Debugger PIN: 627-628-298

I'm just very hesitant to fallback completely silently (even if it is documented) as the user will not be able to tell when it has happened.

What do you think? 😄

The only problem I see with this is the code path in which this warning will be raised is not deterministic (it will happen only after some route returns and if it needs marshalling).

Maybe the solution is to load the serializer within the Api constructor.

This way, people know when/where to catch the warnings.

I have added the warnings.warn as suggested.
This should happen early in the initialization of the Api unless you pass it a Blueprint instead of a Flask app (in which case the warning will be raised when calling a route that needs serialization.
The loaded serializer is cached within the app.config. I don't think this is the best place to do so, but I didn't find any other...

…N_SERIALIZER config option

…h a warnings.warn(UserWarning)

stop relying on ujson (fix #589, #587, #507)

bee950a

ziirish added this to the 0.12.1 milestone May 6, 2019

ziirish requested a review from SteadBytes May 6, 2019 08:32

ziirish modified the milestones: 0.12.1, 0.12.2 May 6, 2019

ziirish force-pushed the fix-drop-ujson branch from 25b9242 to c3d8741 Compare May 6, 2019 12:03

j5awry approved these changes May 6, 2019

View reviewed changes

ziirish force-pushed the fix-drop-ujson branch from c3d8741 to 22cfa56 Compare May 6, 2019 16:09

ziirish force-pushed the fix-drop-ujson branch from 22cfa56 to 1b58e13 Compare May 8, 2019 08:57

SteadBytes reviewed May 10, 2019

View reviewed changes

flask_restplus/representations.py Outdated Show resolved Hide resolved

ziirish force-pushed the fix-drop-ujson branch 2 times, most recently from 7e885a3 to 9325cc6 Compare May 10, 2019 14:29

SteadBytes reviewed May 11, 2019

View reviewed changes

let the user choose a custom json serializer through the RESTPLUS_JSO…

813ac5d

…N_SERIALIZER config option

ziirish force-pushed the fix-drop-ujson branch from 9325cc6 to 813ac5d Compare May 13, 2019 10:48

notify the user in case we couldn't load the custom serializer throug…

912920e

…h a warnings.warn(UserWarning)

ziirish force-pushed the fix-drop-ujson branch from bb1006c to 912920e Compare May 27, 2019 21:35

ziirish mentioned this pull request Nov 14, 2019

Prep 0.14.0 Release #743

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

stop relying on ujson (fix #589, #587, #507) #637

stop relying on ujson (fix #589, #587, #507) #637

ziirish commented May 6, 2019

coveralls commented May 6, 2019 •

edited

Loading

ziirish commented May 6, 2019

ziirish commented May 6, 2019

j5awry left a comment

ziirish commented May 6, 2019

j5awry commented May 6, 2019

SteadBytes commented May 8, 2019

ziirish commented May 9, 2019

SteadBytes commented May 10, 2019 •

edited

Loading

SteadBytes May 11, 2019 •

edited

Loading

ziirish May 13, 2019

SteadBytes May 14, 2019

ziirish May 14, 2019

SteadBytes May 15, 2019 •

edited

Loading

ziirish May 15, 2019

ziirish May 27, 2019

stop relying on ujson (fix #589, #587, #507) #637

Are you sure you want to change the base?

stop relying on ujson (fix #589, #587, #507) #637

Conversation

ziirish commented May 6, 2019

coveralls commented May 6, 2019 • edited Loading

ziirish commented May 6, 2019

ziirish commented May 6, 2019

j5awry left a comment

Choose a reason for hiding this comment

ziirish commented May 6, 2019

j5awry commented May 6, 2019

SteadBytes commented May 8, 2019

ziirish commented May 9, 2019

SteadBytes commented May 10, 2019 • edited Loading

SteadBytes May 11, 2019 • edited Loading

Choose a reason for hiding this comment

ziirish May 13, 2019

Choose a reason for hiding this comment

SteadBytes May 14, 2019

Choose a reason for hiding this comment

ziirish May 14, 2019

Choose a reason for hiding this comment

SteadBytes May 15, 2019 • edited Loading

Choose a reason for hiding this comment

ziirish May 15, 2019

Choose a reason for hiding this comment

ziirish May 27, 2019

Choose a reason for hiding this comment

coveralls commented May 6, 2019 •

edited

Loading

SteadBytes commented May 10, 2019 •

edited

Loading

SteadBytes May 11, 2019 •

edited

Loading

SteadBytes May 15, 2019 •

edited

Loading