-
Notifications
You must be signed in to change notification settings - Fork 5
Add Python module. Issue #47 #49
Conversation
Fix notebook to call from Python module. Remove notebook functions.
Marked this draft for general comments first. Once the approach is okay, I'll mark it ready for review and we can work through some of the details. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple style changes here to to start. You'll also want to run it through something that'll validate PEP-8. http://pep8online.com/checkresult
You'll also need to update the Dockerfile
to include the library. Otherwise, it's missing from the build and won't run. See here: https://mybinder.org/v2/gh/archivesunleashed/auk-notebooks/issue-47?filepath=auk-notebook.ipynb
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
<ipython-input-2-8eb457572b52> in <module>
----> 1 from auknb import Notebook
ModuleNotFoundError: No module named 'auknb'
You can put a COPY auknb.py $HOME
on line 18 https://github.com/archivesunleashed/auk-notebooks/blob/master/Dockerfile#L18
Notebook itself:
- Probably want to change "
# Change to alter path.
" to something a bit more descriptive. Maybe "Change value to full path to your data, including trailing slash." # Setup Archives Unleashed Cloud data.
you can get rid of that comment in the required packages section.- Put the library import in with all the others.
- Probably need to change the intro language and section header for "User Configuration"
auknb.py
Outdated
return(result) | ||
|
||
if __name__ == "__main__": | ||
print ("This module is intended for use with Archives Unleashed Cloud") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can put all this in a header comment.
auknb.py
Outdated
from nltk.sentiment.vader import SentimentIntensityAnalyzer | ||
from nltk.corpus import stopwords | ||
|
||
class Notebook(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's change the class name to AuNotebook
.
Remove print statements in favor of docstrings.
- class variables are lower case as per PEP8 - add auknb.py to Dockerfile - fix explanations for function calls etc. - add instructions to view functions and class documentation. x
I think the latest commit fixes the above review fixes. I also included mention that help(nb.function_name()) will provide instructions. Once I get word from @ianmilligan1 that the approach is okay, I will take this out of draft and work on fine tuning. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works on my end as a user, but I'd want to make sure @ruebot green-lights this as he knows exponentially more than I do.
Thanks @ianmilligan1 @ruebot what are your thoughts on unit tests here? I can build them, but am not sure if they belong in the auk-notebooks folder. |
Out of scope for now. We'll cross that bridge when we publish it in it's own repo. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notebook review:
AUK_PATH
comment has two full-stops.- What's the purpose of displaying:
help(AuNotebook.clean_domain)
? - Change:
The following cell sets up AuNotebook
toThe following cell sets up the notebook
- Cell 13 has an output error
# Get a list of the top words in the collection (regardless of year).
Did you clear the notebook and re-run it? - You should update the library filename to reflect the class name change. Make sure you update the Dockerfile as well.
- remove help cell (was only for testing). - change module name to aunb. - fix output errors (i.e. be more patient when waiting for notebook to run.)
Latest commit should be ready to go. Moving this out of draft. |
This PR replaces the function cell in the notebook with a python module and revises the function to work with the Notebook class.
A few notes:
fun(x, min_length="2")
- now it just uses the MINIMUM_WORD_LENGTH variable, so