Proof of Concept for Pepys in Jupyter #1078
Comments
@IanMayo Here are some initial demos of a very simple notebook interface: There are loads of problems with this interface, but it's just an idea of what is possible with just a few lines of code. I'll put up a PR shortly so you can see the actual notebook code, and then I'll move on to some of the other stuff we wanted to demo.
See #1096 for a PR including this notebook code. I've also included some static and interactive plots of other variables. Notably, at the moment we have to work around pandas' incompatibility with SQLAlchemy 2.0. This means that the SQLAlchemy 'engine' that we create in the Pepys DataStore won't work with pandas, as we create it using
Ah yes, one more thing: Do you have any really good, realistic (ideally actually real, but not sensitive) data that I could use for playing around with developing analysis capabilities in Jupyter? Part of the reason I built the UI for selecting a platform and plotting the points was so that I could see if I could find a realistic-looking track. A lot of the data on TracStor is obviously test data. The best I found was this
Aah, @robintw - from the depths of my memory I remembered where I'd seen a sample dataset, it's in the CSV files here: Some tracks appeared to have up to 3k points. Obvs you'll either have to produce a parser to get the data into Pepys, or do some Excel column fiddling to make it look like an existing format which we parse. The "unknown platform" handling will be great for this data :-D
Here's another source of AIS data @robin - it's a huge dataset, hopefully they're long tracks rather than just lots of small ones.
Thanks @IanMayo. That's an interesting task, and slightly different to what I was expecting. I'll have a ponder and do some experimentation and get back to you.
🐞 Overview
Produce a proof-of-concept for viewing Pepys data in a Jupyter notebook.
This will de-risk the future use of Jupyter notebooks both in Pepys and in general usage by analysts, offering lessons learned in data connectivity, data processing, and visualisation.
Time-permitting, to include:
State data for a period of time from one or more platforms
🔗 Feature
This represents an alternate solution for #859
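As a rough illustration of the overview item about pulling State data for a period of time from one or more platforms, here is a minimal sketch using only Python's built-in `sqlite3` module. The table and column names (`platforms`, `states`, `platform_id`, `time`, `latitude`, `longitude`) and the sample rows are hypothetical, chosen for the example; they are not taken from the Pepys schema, and a real notebook would go through the Pepys DataStore instead.

```python
import sqlite3


def states_in_window(conn, platform_names, start, end):
    """Return (platform, time, lat, lon) rows for the named platforms
    with start <= time < end. Times are ISO-8601 strings, which sort
    correctly as plain text."""
    if not platform_names:
        return []
    # One "?" placeholder per platform name, so values are never
    # interpolated into the SQL text.
    placeholders = ",".join("?" for _ in platform_names)
    sql = f"""
        SELECT p.name, s.time, s.latitude, s.longitude
        FROM states AS s
        JOIN platforms AS p ON p.platform_id = s.platform_id
        WHERE p.name IN ({placeholders})
          AND s.time >= ? AND s.time < ?
        ORDER BY p.name, s.time
    """
    return conn.execute(sql, [*platform_names, start, end]).fetchall()


# Hypothetical in-memory data, purely for demonstration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE platforms (platform_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE states (
        platform_id INTEGER, time TEXT, latitude REAL, longitude REAL
    );
""")
conn.executemany(
    "INSERT INTO platforms VALUES (?, ?)",
    [(1, "ALPHA"), (2, "BRAVO"), (3, "CHARLIE")],
)
conn.executemany(
    "INSERT INTO states VALUES (?, ?, ?, ?)",
    [
        (1, "2020-01-01T10:00:00", 50.0, -1.0),
        (1, "2020-01-01T12:00:00", 50.1, -1.1),
        (2, "2020-01-01T11:00:00", 51.0, -2.0),
        (3, "2020-01-01T11:30:00", 52.0, -3.0),
    ],
)

rows = states_in_window(
    conn, ["ALPHA", "BRAVO"], "2020-01-01T10:30:00", "2020-01-01T12:30:00"
)
for row in rows:
    print(row)
```

The half-open time window (`>= start`, `< end`) means consecutive windows can be queried without returning the same row twice.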
🔢 Acceptance criteria
Machine Learning
SciKit provides capable clustering algorithms, but we need to think of an application of this method to Pepys data.
Offline mapping
Pepys will frequently be used without an Internet connection, and so without access to an OpenStreetMap backdrop. It would be useful to consider how a similar capability could be provided offline, covering these areas in descending order of importance:
I guess some options are:
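One possibility, offered as an assumption rather than something proposed in this thread, is to pre-download map tiles for the areas of interest while online and serve them locally. The sketch below uses the standard OpenStreetMap "slippy map" tile numbering to work out which tiles cover a bounding box at a given zoom level. The function names are hypothetical, and the sketch assumes the box does not cross the antimeridian.

```python
import math


def tile_for(lat, lon, zoom):
    """Return the (x, y) slippy-map tile containing a lat/lon in degrees."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = math.floor((lon + 180.0) / 360.0 * n)
    y = math.floor(
        (1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n
    )
    # Clamp so that points on the map edge or near the poles stay in range.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_for_bbox(south, west, north, east, zoom):
    """List the (zoom, x, y) tiles covering a bounding box.

    Tile y increases southwards, so the north-west corner gives the
    smallest x and y, and the south-east corner the largest.
    """
    x_min, y_min = tile_for(north, west, zoom)
    x_max, y_max = tile_for(south, east, zoom)
    return [
        (zoom, x, y)
        for x in range(x_min, x_max + 1)
        for y in range(y_min, y_max + 1)
    ]


if __name__ == "__main__":
    # How many tiles would need caching for a small box around the origin?
    for zoom in (1, 5, 8):
        print(zoom, len(tiles_for_bbox(-1.0, -1.0, 1.0, 1.0, zoom)))
```

Counting tiles per zoom level this way gives a quick estimate of how much storage an offline backdrop would need before committing to any particular area or level of detail.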
Sample analysis task #
Extended analysis task, considering bulk data #
Prioritised subsequent tasks #