Open
Description
write a script you can run on your local machine (like a .py script you can run in PyCharm) that automates publication of datasets.
The following are some general things to think about, but I suggest breaking them out into your own tasks in this story
Some steps to investigate:
- whether or not you can pull the download links to the datasets programmatically, by webscraping with BeautifulSoup or a similar Python package
- how to write the metadata to .json files (use the json package)
- how to read in the metadata files and data files from your local machine programmatically (I suggest keeping them all in a folder that you walk through using os or something similar)
- how to map the metadata files to the data files
- how to pass arguments to the script so you can run it easily from the commandline (I suggest argparse) -- an example of an argument you might want to pass in would be the path to the directory containing the data
- The ultimate goal is to be able to just run the script and publish these datasets with as little human labor as possible. So if there's something you can code to make less human work in the future, do that something! :)
if there does not appear to be a way to read in the title and related information from the metadata, then create a new issue to add that to foundry and amend publish(). But first, see if you can include it in the metadata