-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Streamline getting started documentation #636
Conversation
for more information, see https://pre-commit.ci
Hi @hrlai , I've had a first go at it. Would love to hear your thoughts - if it would be easier to talk on Teams would be happy to set up a call. |
@sallymatson The |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## develop #636 +/- ##
========================================
Coverage 94.69% 94.69%
========================================
Files 73 73
Lines 4582 4582
========================================
Hits 4339 4339
Misses 243 243 ☔ View full report in Codecov by Sentry. |
Yes, that would be my fault for not running |
It should be running automatically when you |
It does. The instructions on the doc were perfect ! Do you have any thoughts about the one build that failed? |
Yup - our CI workflow runs the pre-commit |
@@ -20,7 +20,7 @@ language_info: | |||
name: python | |||
nbconvert_exporter: python | |||
pygments_lexer: ipython3 | |||
version: 3.11.9 | |||
version: 3.12.0rc3 | |||
--- | |||
|
|||
# Getting started |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the first mention of pip
or a command, should we assume less coding experience of the users and say explicitly where to run it? For example, before pip install
below, say something like "...open up the Terminal and type:"
@@ -56,18 +56,36 @@ configuration and data files to run a model. | |||
ve_run --install-example /path/ | |||
``` | |||
|
|||
You can then run the model itself: | |||
You can then run the model itself. If you have already run the simulation you will need | |||
to delete or rename the output files, as previously generated output can prevent the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though one could just remove the files using file explorer, should we give an example code to remove the files (e.g., using rm
) for those who have little experience with the Terminal?
If so, probably write it after the ve_run
chunk for a better flow...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the other file I just realised that @jacobcook1995 had already written something, maybe ditto them here for completeness?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's generally not great practice to use bash scripts to delete directories, because you might end up deleting something by mistake. It's useful in that example because the files are being stored in a kind of weird location (in the user directory) which might be hard to find using Finder. But I personally would encourage users to keep the data files in the virtual_ecosystem
directory, because they will likely need to access them while working, and it will all stay a lot more organized that way. That is why I used this method - and in addition, I just think it overcomplicates the task to run a bash script, when the files should be easy to find.
Let me know what you think about that reasoning!!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All good. I agree that it is unnecessary :) The only thing left to decide if whether to list the filenames to be deleted, but this is not always the same (e.g., the initial states may not be saved depending on user setting.) When I first bumped into this I had to ask around which files to remove.
updated at each time step. | ||
* `final_state.nc`: The model data state at the end of the final step. | ||
|
||
These files are written to the standard NetCDF data file format. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe insert example code to remove them here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above - I think it's better in this tutorial to not have a script for that. LMK if you feel strongly about including it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me, replied above :)
Ecosystem Output](virtual_ecosystem_in_use.md) tutorial, which walks you through basic | ||
graphs using model inputs and outputs. | ||
* The [Example Data](./example_data.md) pages provides a detailed description of the | ||
contents of the `ve_example` directory. Here you can dig into the strucutre of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo "strucutre" -> "structure".
delete the existing virtual ecosystem example directory, as previously generated files | ||
can prevent the example simulation from running successfully. That can be done as | ||
follows. | ||
The following commands allow you to run the simulation from a Jupyter Notebook. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is great to remind the users that they can run the python codes using Jupyter Notebook. However, I struggle to even comprehend where to type the following commands after launching Jupyter Notebook, sorry! Most ecologists or biologists would be more familiar with R than python, so it would be a shame to turn them away because we didn't ease them into python.
That said, I struggle between providing too much details about python coding here (some users may already know how and find it boring) versus hand-carrying new users. Should we dedicate a new page just for using python codes in Jupyter, or at least point the users to an external tutorial page?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great start towards a clearer start-up tutorial! I have added my two cents, and the most major one is possibly to hand-carry the users more around how to run python codes using Jupyter Notebook...
slideshow: | ||
slide_type: '' | ||
tags: [] | ||
--- | ||
%%bash | ||
# Remove any existing VE data directory in the /tmp/ directory | ||
if [ -d /tmp/ve_example ]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be simplified as rm -rf /tmp/ve_example
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh - I don't think of the -f
flag that way, but it does explicitly not raise an error if the file is missing. We should note that is why we're using it, but it is less complex than the explicit test.
--- | ||
editable: true | ||
slideshow: | ||
slide_type: '' | ||
tags: [] | ||
--- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not figured out why Jupyter sometimes chucks some of these tags in or how to filter them out. I think we can do so with the myst filters in jupytext, and we definitely should because we do not want this!
slideshow: | ||
slide_type: '' | ||
tags: [] | ||
--- | ||
# Load the generated data files | ||
initial_state = xarray.load_dataset("/tmp/ve_example/out/initial_state.nc") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If a user install the develop branch using pip
they probably won't have the initial state file by default settings, see #544 I can confirm that this is still the case today on my computer.
``` | ||
|
||
+++ {"editable": true, "slideshow": {"slide_type": ""}, "tags": []} | ||
|
||
### Initial state and input data | ||
|
||
The `initial_state.nc` file contains all of the data required to run the model. For some |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the plot below, we reshape the vector output of elevation to be a matrix of dimension 9 x 9. Can we avoid hard-coding the row and column numbers from 9 to something from the netCDF output? I imagine someone to change the grid numbers eventually so this will help to automate the plotting a bit more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's actually going to need to be a new function to reverse the internal representation of cells (which is a 1D array) back onto the grid. So that's going to be a general helper function (convert_cell_data_to_grid
or similar).
But your point is a good one!
ETA - Actually that get's interesting. You need the grid configuration to go back to the 2D, so plotting spatial data will need a helper class (SpatialPlotter
) that is created using the Grid config and then has a method (to_spatial
) that maps the cell_id back onto the spatial layout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, something like convert_cell_data_to_grid
would be a great addition indeed. Is the grid configuration (e.g., number of rows and columns) stored anywhere in the output files?
Going to close this. I think the issue still stands, but I think from this conversation / other chats in meetings and over email, it seems like we need a bit of a deeper think about what is most useful for the introduction to VE docs. I still have this on my radar to work on soon, just going to wait a bit to make sure I do it right! |
Description
Updating the getting started documentation to remove redundancies between the Getting Started page and Using the VE page. For now, I have separated them out so that the Getting started is focused on installation/running the simulation for the first time, and the "Using the VE page" (which I renamed to "Exploring the VE outputs") is more to go through a few ways to interact with the outputs / visualize.
Fixes #631
Key checklist
pre-commit
checks:$ pre-commit run -a
$ poetry run pytest
Further checks