docs: fix cross references
Ingerid committed Mar 7, 2024
1 parent 7c0e1d7 commit 066ddbc
Showing 4 changed files with 34 additions and 12 deletions.
docs/source/conf.py: 8 changes (5 additions & 3 deletions)
@@ -71,26 +71,28 @@

autodoc2_packages = [
    "../../dhlab/api",
    "../../dhlab/future",
    "../../dhlab/images",
    "../../dhlab/ngram",
    "../../dhlab/metadata",
    "../../dhlab/text",
    "../../dhlab/visualize",
    "../../dhlab/wordbank",
    {
        "path": "../../dhlab",
        "path": "../../dhlab/__init__.py",
        "exclude_files": [
            "ngram/ngram.py",
            "graph_networkx_louvain.py",
            "module_update.py",
            "nbpictures.py",
            "constants.py",
            "nbtext.py",
            "nbtokenizer.py",
            "text/nbtokenizer.py",
            "token_map.py",
            "__init__.py",
        ],
        "exclude_dirs": [
            "legacy",
            "future",
            "css_style_sheets",
        ],
        "auto_mode": True,
docs/source/docs_example_use.md: 20 changes (20 additions & 0 deletions)
@@ -1,3 +1,15 @@
---
jupytext:
  formats: md:myst
  text_representation:
    extension: .md
    format_name: myst
kernelspec:
  display_name: Python 3
  language: python
  name: python3
---

# Examples of use

The Python package calls the [DHLAB API](https://api.nb.no/dhlab/) to retrieve and present data from the digital texts.
@@ -8,6 +20,7 @@ Analyses can be performed on both a single document, and on a larger corpus.

Here are some of the text mining and automatic analyses you can do with `dhlab`:

(example_corpus)=
## Build a corpus

Build a [corpus](#text.corpus.Corpus) from bibliographic metadata about publications, e.g. books published between 1980 and 2005:
@@ -18,6 +31,7 @@ import dhlab as dh
corpus = dh.Corpus(doctype="digibok", from_year=1980, to_year=2005)
```
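
A quick way to inspect the result, assuming the metadata is exposed as a pandas DataFrame on a `frame` attribute (not confirmed by this diff):

```python
# Minimal sketch: inspect the corpus metadata
# (the .frame attribute is an assumption, not confirmed by this diff)
import dhlab as dh

corpus = dh.Corpus(doctype="digibok", from_year=1980, to_year=2005)
corpus.frame.head()  # bibliographic metadata, one row per publication
```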

(example_count)=
## Word frequencies

Retrieve word (token) [frequencies](#text.corpus.Corpus.count) from a corpus:
@@ -27,6 +41,7 @@ Retrieve word (token) [frequencies](#text.corpus.Corpus.count) from a corpus:
corpus.count()
```

(example_chunks)=
## Bags of words

Fetch [chunks of text](#text.chunking.Chunks) (paragraphs) as bag of words from a specific publication:
@@ -41,6 +56,7 @@ c.chunks[0] # The first bag-of-words is the title
c.chunks[1] # Second bag-of-words is a paragraph, with word counts
```
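
The construction of `c` falls outside the visible hunk; a minimal sketch of what it might look like, assuming `Chunks` takes `urn` and `chunks` parameters (the URN is reused from the NER example below):

```python
# Minimal sketch: paragraph-sized bags of words for one publication
# (the urn/chunks parameter names are assumptions)
import dhlab as dh

c = dh.Chunks(urn="URN:NBN:no-nb_digibok_2007091701028", chunks="para")
c.chunks[0]  # the first bag-of-words is the title
c.chunks[1]  # the second is a paragraph, with word counts
```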

(example_concordance)=
## Concordance

Extract [concordances](#text.conc_coll.Concordance) from the corpus:
@@ -53,6 +69,7 @@ concs.concordance
# including links to the concordance's positions in books on nb.no
```
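
The creation of `concs` is also outside the visible hunk; a sketch of how it could be built from a corpus, assuming a `query` parameter and a hypothetical search term:

```python
# Minimal sketch: concordances for a search term over a corpus
# (the query parameter name and the search term "frihet" are assumptions)
import dhlab as dh

corpus = dh.Corpus(doctype="digibok", from_year=1980, to_year=2005)
concs = dh.Concordance(corpus=corpus, query="frihet")
concs.concordance  # table of hits, with links to positions in books on nb.no
```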

(example_collocations)=
## Collocations

Compute [collocations](#text.conc_coll.Collocations), a ranking of relevant words to a given word:
@@ -73,6 +90,7 @@ lte 3
ein 3
```
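
A sketch of how such a ranking might be produced, assuming `Collocations` takes the corpus and a `words` argument for the target word and exposes the result on a `coll` attribute:

```python
# Minimal sketch: collocations around a target word
# (the words parameter and the .coll result attribute are assumptions)
import dhlab as dh

corpus = dh.Corpus(doctype="digibok", from_year=1980, to_year=2005)
colls = dh.Collocations(corpus=corpus, words="jul")
colls.coll.head(10)  # the highest-ranked collocates
```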

(example_ngram)=
## N-grams

Retrieve [n-gram](#ngram.nb_ngram.nb_ngram) frequencies per year in a time period.
@@ -88,6 +106,7 @@ The `plot` method gives us this graph:

Check out our [N-gram app](https://www.nb.no/ngram/#1_1_1__1_1_3_1810%2C2022_2_2_2_12_2) for an online visual graph of all uni-, bi-, and trigrams in the National Library's digitized collection of publications.
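
For orientation, a minimal sketch of what an n-gram query might look like; the `corpus` and `years` parameter names and the example word are assumptions:

```python
# Minimal sketch: yearly frequencies for one word, plotted as a trend line
# (the corpus/years parameter names are assumptions)
from dhlab.ngram.nb_ngram import nb_ngram

frequencies = nb_ngram("demokrati", corpus="bok", years=(1950, 2020))
frequencies.plot()  # pandas plotting of the yearly frequency series
```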

(example_ner)=
## Named Entity Recognition

Extract occurrences of [named entities](#text.parse.NER), for example place names:
@@ -99,6 +118,7 @@ docid = 'URN:NBN:no-nb_digibok_2007091701028'
ner = NER(docid)
```
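
The import and the inspection of the result fall outside the visible hunk; a sketch, assuming the recognised entities end up on a `ner` attribute:

```python
# Minimal sketch: named entity recognition on a single publication
# (the .ner result attribute is an assumption)
from dhlab.text.parse import NER

docid = "URN:NBN:no-nb_digibok_2007091701028"
ner = NER(docid)
ner.ner  # table of recognised entities, e.g. place names
```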

(example_dispersion)=
## Word dispersions

Plot narrative graphs of word [dispersions](#text.dispersion.Dispersion) in a publication, for instance in "Kristin Lavransdatter":
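
The code for this example is collapsed in the diff; a sketch of what a dispersion plot could look like, assuming `Dispersion` takes `urn`, `wordbag`, `window`, and `pr` parameters (the URN below is a placeholder reused from the NER example, not Kristin Lavransdatter):

```python
# Minimal sketch: narrative dispersion of selected words through one book
# (the urn/wordbag/window/pr parameter names are assumptions; the URN is a placeholder)
from dhlab.text.dispersion import Dispersion

d = Dispersion(
    urn="URN:NBN:no-nb_digibok_2007091701028",
    wordbag=["Kristin", "Erlend"],
    window=1000,  # tokens per window
    pr=100,       # step between windows
)
d.plot()
```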
docs/source/docs_functionality.md: 16 changes (8 additions & 8 deletions)
@@ -3,12 +3,12 @@
<!-- start docs-functionality -->
Here are some of the text mining and automatic analyses you can do with `dhlab`:

- Build a [corpus](#dhlab.Corpus) from bibliographic metadata about publications.
- Retrieve word (token) [frequencies](#dhlab.Corpus.count) from a corpus.
- Fetch [chunks of text](#dhlab.Chunks) (paragraphs) as bag of words from a specific publication.
- Extract [concordances](#dhlab.Concordance)
- [collocations](#dhlab.Collocations)
- Retrieve [n-gram](#dhlab.ngram.nb_ngram) frequencies per year in a time period.
- Extract occurrences of [named entities](#dhlab.NER).
- Plot narrative graphs of word [dispersions](#dhlab.text.dispersion.Dispersion) in a publication.
- Build a [corpus](#example_corpus) from bibliographic metadata about publications.
- Retrieve word (token) [frequencies](#example_count) from a corpus.
- Fetch [chunks of text](#example_chunks) (paragraphs) as bag of words from a specific publication.
- Extract [concordances](#example_concordance)
- [collocations](#example_collocations)
- Retrieve [n-gram](#example_ngram) frequencies per year in a time period.
- Extract occurrences of [named entities](#example_ner).
- Plot narrative graphs of word [dispersions](#example_dispersion) in a publication.
<!-- end docs-functionality -->
docs/source/index_home.md: 2 changes (1 addition & 1 deletion)
@@ -24,7 +24,7 @@ pip install -U dhlab

```{include} ./docs_functionality.md
:heading-offset: 1
relative-docs: docs/source/
:relative-docs: docs/
```

Try some of our [examples](./docs_example_use.md) to get started.
