How to manage molecule data set? #109

Open
Denz1994 opened this issue Nov 21, 2019 · 4 comments

Comments

@Denz1994
Contributor

Molecule data is stored in a rather large data set. See collectionMoleculesData, otherMoleculesData, and structuresData. This data contains all the available molecules that can be built in the play area.

With this data, a built version of the sim is 8.1 MB, which is considerably larger than other sims. Also, note that downloading the sim via Chrome compresses the .html file to 2.3 MB. That is roughly 28% of the original size (a little over 1/4), but it is still larger than other sims.
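
To check figures like these for a given build, one could compare the raw and gzip-compressed sizes of the built file. The sketch below is a minimal example and not part of the sim's build tooling: the function name `gzip_ratio` and the sample payload are hypothetical, and gzip is assumed as a stand-in for whatever compression is actually applied on download.

```python
import gzip


def gzip_ratio(data: bytes) -> float:
    """Return compressed size divided by raw size (smaller is better)."""
    if not data:
        raise ValueError("data must be non-empty")
    return len(gzip.compress(data)) / len(data)


# Ratio implied by the sizes quoted above (2.3 MB compressed, 8.1 MB raw).
reported_ratio = 2.3 / 8.1

if __name__ == "__main__":
    # Hypothetical stand-in payload; in practice, read the built .html file.
    sample = b"C2H6|ethane|6324\n" * 10_000
    print(f"sample ratio:   {gzip_ratio(sample):.3f}")
    print(f"reported ratio: {reported_ratio:.3f}")
```

Running the same measurement on the builds of other sims would give a like-for-like comparison of how much of the size difference survives compression.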

If this isn't a problem, then there is nothing left to be done with this issue and we can close it. Otherwise, there are a few ways to handle this data set. After brainstorming with @jonathanolson, we came up with some options:

  • Reduce the number of available atoms/kits/kitCollections.
    Pro: This will, in turn, reduce the number of possible molecules and truncate our data set.
    Con: The set of data will need to be regenerated. This is not a straightforward process. It
    requires some digging into the Java code and sets a precedent for porting this code and adding
    documentation.

  • We could serialize the data set into a JSON object.
    Pro: This should make startup time a bit faster (if that is an issue) and slightly increase performance.
    Con: This comes at the cost of space. A JSON object with fields and keys will be larger than the data generated by the Java code.

  • Use a post-processing step to generate molecule data via the network.
    Pro: Should decrease size of the initial set of molecule data.
    Con: Assumes a network connection. Intermediate steps involving recovering molecule data during or after bonding could hinder performance.
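
The space trade-off in the JSON option can be estimated before committing to a format. The following is a rough sketch under stated assumptions: the record fields (`formula`, `name`, `cid`) and the pipe-delimited layout are hypothetical and do not reflect the actual format of the molecule data files.

```python
import gzip
import json

# Hypothetical records; the real data files use a different layout.
records = [
    {"formula": f"C{n}H{2 * n + 2}", "name": f"alkane-{n}", "cid": 1000 + n}
    for n in range(1, 501)
]


def as_json(rows: list[dict]) -> bytes:
    """Serialize with explicit keys, as a JSON array of objects."""
    return json.dumps(rows, separators=(",", ":")).encode("utf-8")


def as_compact(rows: list[dict]) -> bytes:
    """Serialize as one pipe-delimited line per record, with no keys."""
    lines = (f"{r['formula']}|{r['name']}|{r['cid']}" for r in rows)
    return "\n".join(lines).encode("utf-8")


def sizes(payload: bytes) -> tuple[int, int]:
    """Return (raw size, gzip-compressed size) in bytes."""
    return len(payload), len(gzip.compress(payload))


if __name__ == "__main__":
    for label, payload in (("json", as_json(records)), ("compact", as_compact(records))):
        raw, zipped = sizes(payload)
        print(f"{label:8s} raw={raw:7d} B  gzip={zipped:7d} B")
```

Repeated keys tend to compress well, so the gap after gzip may be smaller than the raw gap; measuring both helps show whether the JSON overhead matters for the downloaded size.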

Thoughts @ariel-phet on how to proceed?

@arouinfar
Contributor

There is so much flexibility in the third screen, and it would be really unfortunate to reduce the number of kits/atoms. My (unsolicited) vote would be against that option.

@Denz1994
Contributor Author

Reduce the number of available atoms/kits/kitCollections.

Just for clarification, any change to the number of kits/atoms would have the same con of data regeneration noted in the above comment.

@ariel-phet

As discussed today, we might see if there is some compression that can be done, but I don't think we need to reduce the data set at all.

@Denz1994
Contributor Author

Denz1994 commented Feb 7, 2020

@ariel-phet mentioned in a Zoom meeting on 02/07/20:

We can revisit this while #158 is being worked on. There may be other options for compression.

Deferring this issue until then.

@Denz1994 Denz1994 removed their assignment Feb 7, 2020