-
-
Notifications
You must be signed in to change notification settings - Fork 2.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Investigating GSon #8668
Comments
Can I work on this issue? Is your meaning is using GSon library to replace now library. |
You can work on it. I don't know whether gson will replace current JSON library. What I want is to make the test and see whether it will improve loading time at all, and potentially other long operations. Even creating a note, when the note type is complex, or changing a note type, can be improved when the JSON library is efficient, as accessing template is required often You can test over http://milchior.fr/anki/ which is my real life collection and take a dozen of seconds to load. That seemed silly (at least to me), but each second saved is actually really great for the confort of use of the app. |
I will try to do it. It is the first time I working on anki. How does it work when import a deck? |
It's not a deck but an entire collection. You'll need to use the desktop version to import it. And you should create a new profile, as otherwise it would remove the current content of your collection. |
Jackson is supposed to be much faster than gson. |
Script a set of tests that exercise both cases on the same set of easily-available test data, then post benchmarks that others can attempt to reproduce, or it's just talk ;-) |
Challenge Accepted 😃 , should this benchmark be done in android context or in the JVM environment is enoght? |
Users only care about the performance they see in the real world on their real device, right? It has to be a practical test, like of the Anking decks or I think Arthur has his decks up for testing on https://milchior.fr but I'm not sure where? |
Sorry - meant to actually tag @Arthur-Milchior - Arthur - do you have your decks available somewhere for people looking to stress test Anki things? I seem to recall you did, if not sorry for the bother |
I think this is the link: http://milchior.fr/anki/. I found this repo with some benchmarks. https://github.com/fabienrenaud/java-json-benchmark, and Jackson is among the top. Jackson provides databinding which will be super helpful to covert models and configs from json to java classes easily. |
Benchmark Resultsdevice: Huawei P30 Using Arthur's collection (1.3 GB) and this repo https://github.com/TarekkMA/android_json_benchmarks decks json (500 KB)
models json (34.1 MB 🤯 )
GsonI found this issue about gson [Question] : is gson still maintained? #1821 jackson_jrSupposed to be a more lightweight version of Jackson data binding, but it does lack some features. using default config I couldn't parse moshislowest of them all jacksonhave lots of features I only used data binding because it's the easiest. also they have set of annotations that can change how serialization works. jackson streamingSlightly faster than data binding but not very convenient to write. I think the memory usage may be better in streaming. |
Nice! GSON definitely out as unmaintained - good find on that issue. Ingesting unmaintained repos is always a risk, and always best avoided as you see doing real benchmarks takes time and are needed for performance areas jackson_jr already has identified problems, it's out moshi seems slow and jackson apppears to support other things that have interest (data binding), so between those I'd go with jackson What's the APK size difference? Relatively easy to examine by turning minify on for debug build and comparing output...it does not appear to contain native libraries so it should not be too bad but there is nothing like knowing instead of guessing :-) |
gson was already added as a dependency, so I replaced it with Jackson and tried to my best to use jackson the same as gson so that the minifer would only remove unused stuff. Gson (streaming) was used here and here Gson (previous version)JacksonAdditional notesfrom my research jackson seems to be very large library in term of method count and in the number of modules/extensions that can be added to it |
I have used jackson streaming and added the result of the benchmark to my previous comment. jackson databinding includes streaming (core) and annotation package. if we want to be save size we can do everything using streaming rather than data binding. it will be faster than databinding, but very verbose and tedious. |
That's quite the performance increase... It looks worth it to me |
If we go with streaming only the size would be exactly the same as the previous version with gson, but will have to write everything by hand and no more one-liners Streaming API example form my benchmark: (The same would go for writing the json back using Should we switch to streaming only? |
Another candidate is https://github.com/ngs-doo/dsl-json , will try to get some benchmarks on it Edit: mmm dsl-json is about 250 ms which is comparable to moshi, maybe I'm doing something wrong. but anyways the library not very active on GitHub. |
I guess jackson is the way to go, but should we go streaming or databinding 🤔 |
less code I think - not having the JSON schema embedded in our parsing code like that seems like a win |
I assume "old" meant what we use right now, right? Thanks for finding my deck so quickly. Please don't hesitate to ping me on discord if needed. That would have been realy quick to answer, and would be a shame to block you. It's hard to know what to answer first on github. |
Interesting. I didn't even knew we used gson. This issue was referring to reading Decks/Models from the collection. But clearly, this is interesting too. Note that default:
jp.skipChildren(); would not be acceptable. If there are properties we don't know, we should keep them. Indeed, when we recreate the string and put it in the database, we want to put all properties we didn't change as they used to be |
You are right, that is just for the benchmarking code since the streaming API is very verbose I only handled some elements not all. we will be using the databinding API which will have lots of cool tricks and things we could use like the following snippint of code: The following code will serialize and deserialize unknow fields into/from public class Model {
private Map<String, Object> extraProperties = new HashMap<>();
@JsonAnyGetter
public Map<String, Object> getExtraProperties() {
return extraProperties;
}
@JsonAnySetter
public void setExtraProperty(String name, Object value) {
extraProperties.put(name, value);
}
} |
Great. I'd clearly be curious, if you have the courage to go into git history, can you find and explain why we used two different libraries? Great to see that the extra properties where taken into account. Thanks. |
Gson is used from 2013, I don't know the exact reason why but maybe the authors needed to do json streaming with gson as seen in Anki-Android/AnkiDroid/src/main/java/com/ichi2/async/CollectionTask.java Lines 1473 to 1484 in 0711bcd
If I found an interesting git message I will be sure to put it here |
Replying to #8799 (comment)
If it's faster, then please go for it. But if you can replace our own JSON library, that would be even better. It's only role is to catch the checked exception and replace them by unchecked exception. It simplifies a lot the remaining of the code, but there is a small cost in efficiency. |
@Arthur-Milchior jackson has a tree model that can replace current json classes. but I think the goal is the serilize/deserilize json from/to a type safe and expressive class instead of using string keys everywhere. #8810 I tried migrating |
#8817 I have just finished converting the old And the results are very very amazing, Benchmarks for parsing 34MB of modelsCurrent Appreach with
|
To be honest, I'm fine with the 9ms cost of keeping current approach compared to using directly a map :p Yeah, you're right, that's amazing. I'm kind of sorry for you, because there used to me changes that can save dozens of seconds, but most are gone, and now you have to deal with changes that same 300ms. I really like ensuring that the application is as fluid as possible (and I suspect it will also uses less memory for temporary variables), and it makes the app more pleasurable. But you missed last year progress, that was more spectacular: https://www.reddit.com/r/Anki/comments/iu24ph/ankidroid_one_year_of_change_in_term_of_speed/ |
Wow the difference you made is incridible, I hope I will be able to make the app more pleasurable for the end-user to use and for the developer to contribute and understand/test the code. |
I think you already have! |
Hello 👋, this issue has been opened for more than 2 months with no activity on it. If the issue is still here, please keep in mind that we need community support and help to fix it! Just comment something like still searching for solutions and if you found one, please open a pull request! You have 7 days until this gets closed automatically |
As far as I remember, some of the long tasks, such as loading the collection when the application start, takes a few second - on big collection - processing JSon. If I remember correctly, one of the bottleneck was that each JSon get operation uses a try/catch, which is is an expansive operation. The only trouble is that the JSONException are checked exception and that we don't want to
throws
them everywhere in the codebase.I think it might be interesting to check GSon, another library for JSon. I suspect we would win time some precious seconds. Indeed:
Most of the change can be automatized, but that's not entirely trivial. Indeed, the functions do not have the same name even if it has the same signature than in current implementation. And if we want to save this second of loading time, it's not ideal to create an intermediate class just to change the function name.
It may be useless if most of the work moves from java to rust, those try/catch will disappear with it anyway
The text was updated successfully, but these errors were encountered: