Grapher could support computed traces as a display option. #181
Bump. I just deleted a computed trace from a scan that I often use and merged this change to master. This measurement takes a lot of time and it is useful to watch the data come in. This would be a really useful feature.
Yeah, this would be a really useful feature, but it will take serious thinking to implement properly, as it might need changes to the datavault server, the format of storage on the datavault, and the grapher itself. @btchiaro what is the actual formula for the data you would want to have plotted? There may be a way to add part of this feature on the frontend, but it would be nice to know what you need to plot.
I'm storing the data from each tomography phase as a data column, but what …
@joshmutus I think we could do this by adding properties to the dataset. No need for modifications to the datavault server. This is just a rendering issue. It seems totally reasonable to read a parameter from the dataset, which could even just be a bit of js that says how to compute the extra curves. If we don't like allowing arbitrary code to be run from user data (which does indeed sound kinda jank), we could parametrize some of the most common curves.
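A minimal sketch of what the parametrized variant could look like from the Python side, assuming the standard pylabrad data vault client; the `computed_traces` parameter name, its schema, and the directory/dataset numbers are hypothetical, invented here just to illustrate a whitelist-based spec rather than arbitrary code:

```python
import json

import labrad

# Connect to LabRAD and open the dataset of interest.
cxn = labrad.connect()
dv = cxn.data_vault
dv.cd(['', 'Test', 'rapid_ramsey'])  # hypothetical directory
dv.open(42)                          # hypothetical dataset number

# Hypothetical 'computed_traces' parameter: a spec the grapher could
# interpret using a few whitelisted operations, instead of executing
# arbitrary user-supplied code.
spec = [{
    'name': 'envelope',
    'op': 'quadrature_sum',          # i.e. sqrt(a**2 + b**2)
    'inputs': ['X phase', 'Y phase'],
}]
dv.add_parameter('computed_traces', json.dumps(spec))
```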
Another example is when I make the noise spectrum measurements: the raw 1,0 output is not human readable at all. It would be nice to save the raw 1,0 time series, but plot the power spectral density using our code in pyle. Ideally, we could access whatever processing parameters are exposed in the pyle functions, e.g. 'frequency_average' for the rapid rto data. This would be extremely useful to me. This is also kind of an extreme case, since this type of processing requires other datasets as inputs (spectroscopy_z_func data to calibrate frequency noise to flux noise). I'm not sure I like using a dataset parameter; I think we want this to be mutable. I can easily imagine people wanting to plot the data in different ways, for example as above. I think it would be really good to allow the user to call pyle processing code for the plotting.
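For context on the kind of processing involved, here is a hedged sketch of turning a raw 0/1 time series into a power spectral density with numpy/scipy; the sampling interval and the random stand-in data are placeholders, not the actual rapid RTO pipeline in pyle:

```python
import numpy as np
from scipy import signal

dt = 1e-6                                   # placeholder sampling interval, seconds
raw = np.random.randint(0, 2, size=2**16)   # stand-in for the raw 1,0 series

# Welch's method: average periodograms over overlapping segments
# to get a smoother PSD estimate than a single FFT would give.
freqs, psd = signal.welch(raw.astype(float), fs=1.0 / dt, nperseg=4096)
```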
The idea is that you could have a dataset parameter that would dictate the default way it was plotted. You could still do whatever you wanted with it later.
@btchiaro It sounds like you want to put the pyle analysis code into the web browser. Is this because you're looking for a graphical intuitive way to browse through your processed data? Let's try to distill the thing you want without the context of the grapher, and then we can decide how to implement it.
@btchiaro we should probably chat about this Friday after group meeting. I can't think of an easy way to implement pyle in a browser.
@DanielSank I'd like to be able to watch my data come in in some human readable form.
This is an interesting question! I think @btchiaro is saying that it would be nice to easily associate processed data and plots with the raw data entries in the data vault, and view them in the same application as the grapher. Right now I'm not really sure how to do that, although I do have an interesting idea: put the analyzed data in Drive (slides or otherwise) and link to that document in the comments box in the grapher. At this point I would like to formally say to @maffoo that he was right about hyperlinks being a good reason to rewrite the grapher using the web. You were right and I was wrong :)
Instead of dealing with the difficulty (and security risk) of running arbitrary (possibly Python) code in the grapher, I would propose the following: …
I think that there is a pretty limited security risk if we restrict the …
@btchiaro, note that scalabrad-web is a public project which other groups are using. As such, the approach I could see for what you suggest is having an environment variable specifying external projects to use for plotting. @joshmutus, is this even possible?
Yea, what I'm thinking is that we have our own instance of the grapher …
So @maffoo and I talked about this briefly, and what you're talking about is making scalabrad-web into a fully featured analysis tool, which is waaaay beyond the scope of this project. Generally speaking, adding simple features introduces all sorts of interactions and bugs and makes a project super hard to maintain. Adding a complex feature like this is particularly daunting. We can talk about it in detail later. Basically, numpy doesn't exist for javascript, and all the things you take for granted in pyle don't exist in the browser.

I don't see the problem of storing a computed trace in this case. You have all the code on your end to compute it and adding it is trivial. It's not like we're hard up for hard drive space, and the live update is a useful feature. Why do we have to kill ourselves for the DRY principle here? @DanielSank @maffoo
Storing computed data in the datavault is certainly possible, and in particular there's no way to stop people from doing that. However, I wouldn't. I prefer to keep the data I collected in the experiment well separated from everything else in the project. In my mind, data has a very special role as completely immutable and un-erasable. I prefer to use more user-friendly tools like Google Drive, which support link sharing, editing, commenting, etc., for my analysis and general "lab notebook" style work.

tl;dr: I recommend using the data vault for storing raw data and nothing else. Use more appropriate tools for analysis. See IPython notebooks, for example.
But that (ipython notebook) doesn't solve the use case of live-view.
Neither does storing processed data. We can use derived traces to make live-view a little nicer, but I think processed data is a totally separate issue (already commented on it in my previous post).
I don't understand how storing processed data doesn't solve the live-view use case. You create a separate processed dependent and it live updates as you take data?
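Concretely, a sketch of that pattern with the pylabrad data vault client — the directory, column labels, and the `acquire_points` stub are hypothetical, but the point is that the derived column is appended row by row along with the raw ones, so the live view tracks it:

```python
import math

import labrad

def acquire_points():
    """Stand-in for the real acquisition loop (hypothetical)."""
    for i in range(100):
        t = 10.0 * i
        decay = math.exp(-t / 500.0)
        yield t, decay * math.cos(0.05 * t), decay * math.sin(0.05 * t)

cxn = labrad.connect()
dv = cxn.data_vault
dv.cd(['', 'Test', 'rapid_ramsey'])  # hypothetical directory

# One independent plus raw and derived dependents in the same dataset.
dv.new('rapid ramsey',
       [('time', 'ns')],
       [('X phase', '', 'prob'),
        ('Y phase', '', 'prob'),
        ('envelope', '', 'prob')])

for t, x, y in acquire_points():
    env = (x**2 + y**2) ** 0.5   # derived value, computed as data arrives
    dv.add([t, x, y, env])       # each new row updates the live view
```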
I would say there are multiple separate live-view cases: … In (b) and (c), it's probably best to use pyle for plotting so long as we can refresh the cache (martinisgroup/pyle#1285). I guess the question here is what to do about (a). Am I right @btchiaro @joshmutus?
Yes, thank you for clarifying @jwenner
I'm personally OK with storing processed data along with the raw data.
@btchiaro does your processing update in the live view or does the whole dataset have to be taken first?
I had been storing the computed trace point by point and it was present in the live view.
I would be ok with having a computed saved column in a dataset. This kind of reminds me of the idea of denormalizing data in a database (basically, storing redundant or derived data to improve read performance, or in our case, to avoid having to recompute it all the time). One simple example would be storing both …

That said, I think this should be used judiciously, for things where there is one "obvious" way to compute the derived data column, because once it is stored it is immutable. If you're trying to compute and store something and then later decide that the computation needs to be modified slightly, then we're going to have a problem because all that old data is fixed. I don't know the specific case that @btchiaro is referring to, so I don't know what to think about that. @btchiaro, can you give some specifics?
Disclaimer: I'm totally unfamiliar with @btchiaro's code. We store "processed" data all the time in, say, T1, where we convert from IQ point to one-state probability. What's the difference here?
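For comparison, that T1-style conversion is usually just a nearest-center assignment against calibrated |0>/|1> IQ centers, averaged over shots — a sketch with made-up center values, not the actual pyle routine:

```python
import numpy as np

# Calibrated IQ centers for |0> and |1> (placeholder values).
CENTER_0 = 1.00 + 0.50j
CENTER_1 = -0.30 + 1.20j

def one_state_prob(iq_shots):
    """Assign each IQ shot to the nearer center and return the |1> fraction."""
    iq = np.asarray(iq_shots)
    is_one = np.abs(iq - CENTER_1) < np.abs(iq - CENTER_0)
    return is_one.mean()

# Example: 1000 noisy shots clustered around the |1> center.
shots = CENTER_1 + 0.2 * (np.random.randn(1000) + 1j * np.random.randn(1000))
print(one_state_prob(shots))  # close to 1.0
```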
Can the computed column be tied to a commit so we know what code was used to compute it?
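One lightweight way to get that traceability, sketched here, would be to record the analysis repo's HEAD commit as a dataset parameter when the computed column is written; the parameter name, dataset number, and repo path are placeholders:

```python
import subprocess

import labrad

def current_commit(repo_path):
    """Return the HEAD commit hash of the given git working tree."""
    return subprocess.check_output(
        ['git', '-C', repo_path, 'rev-parse', 'HEAD']).decode().strip()

cxn = labrad.connect()
dv = cxn.data_vault
dv.cd(['', 'Test', 'rapid_ramsey'])  # hypothetical directory
dv.open(42)                          # hypothetical dataset number
dv.add_parameter('processing_commit', current_commit('/path/to/pyle'))
```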
@maffoo the use case that brought this all up is the rapid Ramsey scan that I wrote. This is just a Ramsey scan where the stats are sampled without qubit reset at a user-defined sampling interval. The issue there was that I was storing data from each tomography phase and also storing the Ramsey fringe envelope computed from those phases. During code review it was suggested that this is redundant information that should be computed by the grapher rather than stored as its own trace.

The other use case that I think would be useful is to save the raw binary data stream that is generated by the rapid RTO, but also save the noise spectrum that is revealed through Fourier analysis. In this case the raw data is not human readable at all and requires fairly significant processing to get it into a useful form. That said, there is a pretty clear "right" way to plot this data, and I think that it would be useful to store that processed trace for easy access through the grapher.

@joshmutus, the T1 example occurred to me too and I considered it for a while. I had thought that the logical implication of the no-processed-data viewpoint was that we should only ever save raw IQ data, and so even the idea of averaging over stats should be forbidden. Thinking about it more, though, I think that the issue is redundancy. Recording p1 data is not redundant unless you're also storing the raw IQ. You kind of get to set the "initial resolution" of your raw data without breaking the redundancy policy.

All that said, I think that storing processed data alongside the raw data can be useful both from a convenience standpoint and from a preservation standpoint. It often happens that the format of a scan and/or its associated processing code is changed. When this happens it can be difficult to go back in time and plot an old dataset if all you have is the raw (potentially human-unreadable) trace. Having a processed trace right alongside the raw data can save a lot of time when stuff like this happens and you want to quickly compare with an old dataset.
It would be really useful if the grapher could plot computed traces. For example, in the ramsey function we store the envelope as a data column, although this is redundant data. Another example is the rapid RTO, where the raw 1,0 timeseries output is not human readable and not worth displaying in the grapher. It would be nice to save the raw binary data, but have the grapher plot the spectrum after Fourier analysis.