Commit 90e994f

Cleaning Internet Archive Data
Here are many steps in my cleaning of my data
1 parent 0f97722 commit 90e994f

10 files changed, +56290 -56203 lines changed

.ipynb_checkpoints/GetIdentifierFromIA-checkpoint.ipynb

+56,222 -22
Large diffs are not rendered by default.
@@ -0,0 +1,9 @@
+{
+ "metadata": {
+  "name": "",
+  "signature": "sha256:83f73ae63a663f0c64e21077ce3b5fb0116a865103a311c8895931d5b066bcc1"
+ },
+ "nbformat": 3,
+ "nbformat_minor": 0,
+ "worksheets": []
+}

Full text journals available.xlsx

718 KB
Binary file not shown.

GetIdentifierFromIA.ipynb

+26 -46,834
Large diffs are not rendered by default.

IAresults.csv

-9,347
Large diffs are not rendered by default.

InternetArchiveResults-ORdata.xls

1.89 MB
Binary file not shown.

InternetArchiveResults-ORdata2.xls

1.99 MB
Binary file not shown.

InternetArchiveResults-ORdataXX.xls

1.99 MB
Binary file not shown.
1.68 MB
Binary file not shown.

Untitled0.ipynb

+33
@@ -0,0 +1,33 @@
+{
+ "metadata": {
+  "name": "",
+  "signature": "sha256:f94a2af3d321cf8fa734a9669c42d111e05436b67d82d3fcf3e4ead98f6a4ea6"
+ },
+ "nbformat": 3,
+ "nbformat_minor": 0,
+ "worksheets": [
+  {
+   "cells": [
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    },
+    {
+     "cell_type": "code",
+     "collapsed": false,
+     "input": [
+      "#Search given fields for any instances of string similar to given string"
+     ],
+     "language": "python",
+     "metadata": {},
+     "outputs": []
+    }
+   ],
+   "metadata": {}
+  }
+ ]
+}
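
The second cell in Untitled0.ipynb holds only a comment describing a search over given fields for strings similar to a given string; the commit does not include an implementation. One possible reading of that comment is sketched below, using the standard library's `difflib.SequenceMatcher`. The function name `find_similar`, the `threshold` parameter, the record layout (a list of dicts), and the sample titles and identifiers are all assumptions made for illustration, not taken from the notebook. The sketch is written for Python 3.

```python
from difflib import SequenceMatcher


def find_similar(records, fields, query, threshold=0.8):
    """Return (row_index, field, value, score) for values resembling query.

    Hypothetical helper: `records` is assumed to be a list of dicts,
    `fields` the keys to inspect, and `threshold` a similarity cutoff
    between 0 and 1. None of these names come from the notebook.
    """
    query_norm = query.strip().lower()
    matches = []
    for index, record in enumerate(records):
        for field in fields:
            value = record.get(field)
            if not value:
                continue  # skip missing or empty fields
            candidate = str(value).strip().lower()
            # ratio() returns a similarity score in [0, 1]
            score = SequenceMatcher(None, query_norm, candidate).ratio()
            if score >= threshold:
                matches.append((index, field, value, score))
    return matches


# Made-up sample rows, standing in for cleaned search results.
records = [
    {"title": "Journal of Botany", "identifier": "journalofbotany01"},
    {"title": "Jounal of Botany", "identifier": "jounalofbotany02"},
    {"title": "Annals of Physics", "identifier": "annalsofphysics03"},
]

hits = find_similar(records, ["title"], "journal of botany")
for index, field, value, score in hits:
    print(index, field, value, round(score, 2))
```

Comparing lowercased, stripped strings keeps capitalization and stray whitespace from hiding a match; the threshold would need tuning against the real data, since a value that is too low starts matching unrelated titles.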
