All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- include an article about handling data with null cells using Python
- updated dataset 10.7910_DVN_SKP9IB
- updated dataset 10.7910_DVN_SKP9IB
- add link to https://gist.github.com/kuhlaid/a8de3160fceb9a72248d3a28c39f761c
- changing the main menu link to `> Main menu` to make it more obvious
- switching to using port 8080 (to try and make OpenShift happy on build attempts)
- switching package manager to `[email protected]+sha512.a6b2f7906b721bba3d67d4aff083df04dad64c399707841b7acf00f6b133b7ac24255f2652fa22ae3534329dc6180534e98d17432037ff6fd140556e2bb3137e` in the package.json file to see if that forces OpenShift to build using yarn instead of npm (testing this since it updates packages and lockfile); the package.json file was apparently using the wrong package version, but the actual cause of the OpenShift problems is that the image only supports npm and does not allow the use of Yarn
- moving sidebar definitions to the sidebars JS and out of the Markdown and category files to centralize the management of the sidebar
- adding https://fontawesome.com/ icon library
- tried to use Node v16 but Vercel notifies that it is end of life so reverting back to v20
- update `intro` document with links and more information
- removed default Docusaurus styling
- create python packages for handling basic API functions
- added cheerio resolutions to the package.json file to correct an issue with local search
- adding additional documentation for a data curation checklist and steps to help automate a curation log
- adding some starter documents on data curation
- adding dates and authors to blogs on docs
- updated Docusaurus to version 3.5.2
- testing orama search plugin, but both the local and cloud versions have an error that needs to be fixed in oramasearch/orama#728
- testing DocSearch crawler https://docsearch.algolia.com/docs/legacy/run-your-own/#run-the-crawl-from-the-docker-image and setting up an .env.prod set of environment variables, but it keeps throwing an `Unreachable hosts` error
- adding a basic `cmfcmf/docusaurus-search-local` search plugin (needs configuration changes but is a good starter search tool)
- try to query the Dataverse API to retrieve a list of datasets to help autogenerate a dataset listing README file that links to the individual README and curation files for each dataset (remove dataset `_category_.json` file and replace with this)
- create a Jupyter notebook similar to the README generator that will pull dataset data from the Dataverse for a specified DOI and Dataverse alias, so API users can select the datasets they want to download instead of doing it manually; maybe save the downloaded datasets to DOI directories, then maybe include code to link datasets together; create a simple JS page that lists all datasets so users can check the datasets they want and get an autogenerated DOI list that they can copy into their Jupyter notebook configuration to specify which datasets to download
- write instructions on how to link this documentation site builder with the Jupyter notebooks used for dataset curation to publish the curation logs to this site under dataset specific directories/pages (using shutil commands to copy files from the notebook directories to this site builder)
- add Algolia search once the site is production ready and write up a document regarding how users can best search for data
- write a document which contains the code and resources to build a curation log generator and how it compares to the curation log for https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/SKP9IB
- `cmfcmf/docusaurus-search-local` search plugin needs better configuration
- local search linking to pages such as `https://jocoknow.vercel.app/docs/jocoknow-extras/translate-your-site/docs/jocoknow-extras/translate-your-site#translate-a-doc` which do not exist
- add resources for developing a data management plan (https://dmponline.dcc.ac.uk/) and data curation logs
- add a template for logging data curation steps
- write instructions for a curation log example using JSON and code to parse the file into Markdown; include methods for adding ToDo items to the log to help encourage users to make use of the log; add ability to include question or checklist types to a curation task IMPORTANT
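
As a starting point for the JSON curation log item above, here is a minimal Python sketch of parsing a log into Markdown. The JSON schema (`dataset`, `tasks`, and the `todo`, `checklist`, and `question` task types), the function name `curation_log_to_markdown`, and the example tasks are all assumptions made for illustration; none of them come from an existing log format in this project.

```python
import json


def curation_log_to_markdown(log: dict) -> str:
    """Render a curation log (hypothetical schema) as a Markdown list."""
    lines = [f"# Curation log: {log['dataset']}", ""]
    for task in log.get("tasks", []):
        kind = task.get("type", "note")
        if kind == "todo":
            # ToDo items become Markdown task-list checkboxes.
            box = "x" if task.get("done") else " "
            lines.append(f"- [{box}] {task['title']}")
        elif kind == "checklist":
            # A checklist is a titled item with nested checkboxes.
            lines.append(f"- {task['title']}")
            for item in task.get("items", []):
                box = "x" if item.get("done") else " "
                lines.append(f"  - [{box}] {item['text']}")
        elif kind == "question":
            # Questions show the answer only once one has been recorded.
            lines.append(f"- **Q:** {task['title']}")
            if task.get("answer"):
                lines.append(f"  - **A:** {task['answer']}")
        else:
            # Any other type is rendered as a plain note.
            lines.append(f"- {task['title']}")
    return "\n".join(lines) + "\n"


# Example log; the task content is made up for illustration.
example = json.loads("""
{
  "dataset": "doi:10.7910/DVN/SKP9IB",
  "tasks": [
    {"type": "todo", "title": "Check for null cells", "done": true},
    {"type": "checklist", "title": "Review variables",
     "items": [{"text": "Units documented", "done": false}]},
    {"type": "question", "title": "Is the codebook complete?"}
  ]
}
""")

markdown = curation_log_to_markdown(example)
print(markdown)
```

Keeping the log as JSON and generating the Markdown from it means ToDo, checklist, and question entries can be added as new task types without changing how existing logs are stored.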