Default to sample
nkelber committed Nov 8, 2022
1 parent c486ee9 commit 11e6a66
Showing 1 changed file with 17 additions and 9 deletions.
26 changes: 17 additions & 9 deletions exploring-metadata.ipynb
@@ -116,17 +116,17 @@
 },
 "outputs": [],
 "source": [
-"# Download the full dataset (up to a limit of 25,000 documents),\n",
-"# request it first in the builder environment. See the Constellate Client\n",
-"# documentation at: https://constellate.org/docs/constellate-client\n",
-"# Then use the `constellate.download` method show below.\n",
-"dataset_metadata = constellate.download(dataset_id, 'metadata')\n",
-"\n",
 "# Pull in the sampled (1500 items) dataset CSV using\n",
 "# The .get_metadata() method downloads the CSV file for our metadata\n",
 "# to the /data folder and returns a string for the file name and location\n",
 "# dataset_metadata will be a string containing that file name and location\n",
-"# dataset_metadata = constellate.get_metadata(dataset_id)"
+"dataset_metadata = constellate.get_metadata(dataset_id)\n",
+"\n",
+"# Download the full dataset (up to a limit of 25,000 documents),\n",
+"# request it first in the builder environment. See the Constellate Client\n",
+"# documentation at: https://constellate.org/docs/constellate-client\n",
+"# Then use the `constellate.download` method show below.\n",
+"#dataset_metadata = constellate.download(dataset_id, 'metadata')"
 ]
 },
 {
@@ -450,7 +450,10 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"We can now download the full dataset file (JSON-L). See [more information about Constellate dataset types](https://constellate.org/docs/what-format-are-jstor-portico-datasets)."
+"We can now download the sampled (1500 items) dataset (or the full dataset if it has been requested). See more:\n",
+"* [Constellate dataset types](https://constellate.org/docs/what-format-are-jstor-portico-datasets)\n",
+"* [Dataset options](https://constellate.org/docs/dataset-options)\n",
+"* [Constellate client](https://constellate.org/docs/constellate-client)\n"
 ]
 },
 {
@@ -459,8 +462,13 @@
 "metadata": {},
 "outputs": [],
 "source": [
+"# Pull in the sampled (1500 items) dataset JSON using\n",
+"# The .get_dataset() method downloads the sampled JSON file\n",
+"dataset_file = constellate.get_dataset(dataset_id)\n",
+"\n",
 "# Download the full dataset (JSON-L file)\n",
-"dataset_file = constellate.download(dataset_id, 'jsonl')"
+"# The full dataset must be requested first\n",
+"#dataset_file = constellate.download(dataset_id, 'jsonl')"
 ]
 },
 {