Commit

Add documentation
emreozcan committed Nov 9, 2024
1 parent 634a3d2 commit 24ccb0e
Showing 8 changed files with 835 additions and 47 deletions.
3 changes: 2 additions & 1 deletion .gitignore
@@ -1,4 +1,5 @@
.pdm-python
/docs/apidocs/
/.pdm-python

# Default name for profiling stat file
profile.stats
119 changes: 74 additions & 45 deletions README.md
@@ -1,80 +1,109 @@
# tdk-py

Python API for the Turkish Language Foundation
*Python API for the Turkish Language Foundation*

tdk-py is a Python package that allows for simple access to
[Turkish dictionaries] made available by
the [TDK], the Turkish Language Society.
tdk-py aims to be easy to use and internally queries the TDK and parses its
response into easy to use Python class objects.
`tdk-py` is a Python package allowing access to
[Turkish dictionaries] of the [TDK], the Turkish Language Society.

`tdk-py` provides both synchronous and asynchronous interfaces to the TDK's
APIs and parses their responses into Python class objects based on Pydantic,
so you can do things like
[`.model_dump_json()`][model_dump_json]
them, or use them in your API endpoints and generate beautiful schemas.

[Turkish dictionaries]: https://sozluk.gov.tr
[TDK]: https://www.tdk.gov.tr
[model_dump_json]: https://docs.pydantic.dev/2.9/api/base_model/#pydantic.BaseModel.model_dump_json

## Installation

tdk-py is supported on Python 3.6+. The recommended way to install is
via *pip* which comes with Python.
tdk-py is supported on Python 3.10+.

pip install tdk-py
```bash
poetry add tdk-py
pipenv install tdk-py
pip install tdk-py
```

If your machine doesn't have Python and pip installed you can download
it from [The Python Software Foundation's
website](https://www.python.org/downloads/).
```python
# in Python
import tdk
```

## Sample usage
## Examples

`tdk.gts` is used to access TDK's GTS, the up-to-date Turkish dictionary
(Güncel Türkçe Sözlük).

```python-repl
>>> import tdk.gts
>>> tdk.gts.search("merkeziyetçilik")
[<Entry 41635 (merkeziyetçilik)>]
### Searching

```python
import tdk
results = tdk.gts.search_sync("merkeziyetçilik")
print(results[0].meanings[0].meaning)
```
```{code-block}
Otoritenin ve işin tek bir merkezde toplanmasını amaçlayan görüş; merkeziyet, merkezcilik
```

`tdk.gts.search` returns a list because it is possible for there to be
more than one word with the exact same spelling.
`tdk.gts.search` (and its `search_sync` counterpart) returns a list because it
is possible for there to be more than one word with the exact same spelling.

```python-repl
>>> for number, entry in enumerate(tdk.gts.search("bar")):
...     for meaning in entry.meanings:
...         print(number+1, entry.entry, meaning.meaning)
...
```python
import tdk
for number, entry in enumerate(tdk.gts.search_sync("bar")):
    for meaning in entry.meanings:
        print(number+1, entry.entry, meaning.meaning)
```
```{code-block}
1 bar Anadolu'nun doğu ve kuzey bölgesinde, en çok Artvin ve Erzurum yörelerinde el ele tutuşularak oynanan, ağır ritimli bir halk oyunu
2 bar Danslı, içkili eğlence yeri
2 bar Ayaküstü içki içilen eğlence yeri
2 bar Amerikan bar
2 bar Amerikan bar
3 bar Hava basıncı birimi
4 bar Ateşten, mide bozukluğundan, ağızda, dil ve dişlerde meydana gelen acılık, pas
4 bar Ateşten, mide bozukluğundan, ağızda, dil ve dişlerde meydana gelen acılık; pas
4 bar Sirke, pekmez gibi sıvıların üzerinde sonradan oluşan köpük veya küf
4 bar Su kaplarında su etkisiyle oluşan tortu veya kir
5 bar Halter sporunda ağırlığı oluşturan kiloları birbirine bağlayan metal çubuk
>>> # 5 different words! One of them (#2) has multiple meanings!
```

5 different words! Two of them (#2 and #4) have multiple meanings!

### Generating suggestions

You can query suggestions for misspelt words or for other similar words.

```python-repl
>>> from difflib import get_close_matches
>>> get_close_matches("feldispat", tdk.gts.index())
['feldspat', 'ispat', 'fesat']
```python
from difflib import get_close_matches
import tdk

# Calculate suggestions locally using the index:
suggestions = get_close_matches("feldispat", tdk.gts.get_index_sync())
# assert suggestions == ['feldspat', 'ispat', 'fesat']

# Use the TDK API: (sometimes errors out)
suggestions = tdk.gts.get_suggestions_sync("feldispat")
# assert suggestions == ['feldspat', 'felekiyat', 'ispat']
```

### Performing complex analyses

You can perform complex analyses very easily. Let's see the distribution
of entries by the maximum number of consecutive consonants.

```python-repl
>>> from tdk.tools import max_streak
>>> from tdk.alphabet import CONSONANTS
>>> annotated_dict = {}
>>> for entry in tdk.gts.index():
...     streaks = max_streak(entry)
...     if streaks not in annotated_dict:
...         annotated_dict[streaks] = [entry]
...     else:
...         annotated_dict[streaks].append(entry)
>>> for i in set(annotated_dict):
...     print(i, len(annotated_dict[i]))
...
```python
import tdk
annotated_dict = {}
for entry in tdk.gts.get_index_sync():
    streaks = tdk.etc.tools.max_streak(entry)
    if streaks not in annotated_dict:
        annotated_dict[streaks] = [entry]
    else:
        annotated_dict[streaks].append(entry)
for i in set(annotated_dict):
    print(i, len(annotated_dict[i]))
```
```{code-block}
0 19
1 15199
2 73511
@@ -89,4 +118,4 @@ tdk-py's source code is provided under the [MIT License]

[MIT License]: https://github.com/EmreOzcan/tdk-py/blob/master/LICENSE

Copyright © 2021-2023 Emre Özcan
Copyright © 2021-2024 Emre Özcan
20 changes: 20 additions & 0 deletions docs/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = .
BUILDDIR = _build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
72 changes: 72 additions & 0 deletions docs/conf.py
@@ -0,0 +1,72 @@
import sys
from pathlib import Path

sys.path.insert(0, str(Path('..', 'src').resolve()))

# Configuration file for the Sphinx documentation builder.
#
# For the full list of built-in configuration values, see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html

# -- Project information -----------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information

project = 'tdk-py'
copyright = '2024, Emre Özcan'
author = 'Emre Özcan'
release = __import__('tdk').__version__

# -- General configuration ---------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration

extensions = [
    'sphinx.ext.intersphinx',
    'sphinx.ext.autodoc',
    'myst_parser',
    'sphinx_copybutton',
    'autodoc2',
    'sphinx_inline_tabs',
]

templates_path = ['_templates']
exclude_patterns = []

# -- MystParser

myst_enable_extensions = [
    'attrs_block',
    'colon_fence',
    'fieldlist',
]
myst_heading_anchors = 3

# -- Internationalization
# https://www.sphinx-doc.org/en/master/usage/configuration.html#confval-gettext_uuid

gettext_uuid = True
gettext_compact = False

# -- Options for HTML output -------------------------------------------------
# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output

html_theme = 'furo'
html_static_path = ['_static']

# -- Options for autodoc2

autodoc2_packages = [
    '../src/tdk',
]
autodoc2_render_plugin = "myst"
autodoc2_module_all_regexes = [
    # r'tdk\..*',
]

# -- Options for intersphinx
# https://www.sphinx-doc.org/en/master/usage/extensions/intersphinx.html#configuration

intersphinx_mapping = {
    'python': ('https://docs.python.org/3', None),
    'pydantic': ('https://docs.pydantic.dev/2.9', None),
    'aiohttp': ('https://docs.aiohttp.org/en/v3.10.10/', None),
}
137 changes: 137 additions & 0 deletions docs/index.md
@@ -0,0 +1,137 @@
# tdk-py

*Python API for the Turkish Language Foundation*

---

`tdk-py` is a Python package allowing access to
[Turkish dictionaries] of the [TDK], the Turkish Language Society.

`tdk-py` provides both synchronous and asynchronous interfaces to the TDK's
APIs and parses their responses into Python class objects based on Pydantic,
so you can do things like
[`.model_dump_json()`](<inv:#pydantic.BaseModel.model_dump_json>)
them, or use them in your API endpoints and generate beautiful schemas.

[Turkish dictionaries]: https://sozluk.gov.tr
[TDK]: https://www.tdk.gov.tr

## Installation

tdk-py is supported on Python 3.10+.

:::{tab} poetry
```bash
poetry add tdk-py
```
:::
:::{tab} pipenv
```bash
pipenv install tdk-py
```
:::
:::{tab} pip
```bash
pip install tdk-py
```
:::

```python
# in Python
import tdk
```

## Examples

`tdk.gts` is used to access TDK's GTS, the up-to-date Turkish dictionary
(Güncel Türkçe Sözlük).

### Searching

```python
import tdk
results = tdk.gts.search_sync("merkeziyetçilik")
print(results[0].meanings[0].meaning)
```
```{code-block}
:caption: Output
Otoritenin ve işin tek bir merkezde toplanmasını amaçlayan görüş; merkeziyet, merkezcilik
```

`tdk.gts.search` (and its `search_sync` counterpart) returns a list because it
is possible for there to be more than one word with the exact same spelling.

```python
import tdk
for number, entry in enumerate(tdk.gts.search_sync("bar")):
    for meaning in entry.meanings:
        print(number+1, entry.entry, meaning.meaning)
```
```{code-block}
:caption: Output
1 bar Anadolu'nun doğu ve kuzey bölgesinde, en çok Artvin ve Erzurum yörelerinde el ele tutuşularak oynanan, ağır ritimli bir halk oyunu
2 bar Danslı, içkili eğlence yeri
2 bar Ayaküstü içki içilen eğlence yeri
2 bar ► Amerikan bar
3 bar Hava basıncı birimi
4 bar Ateşten, mide bozukluğundan, ağızda, dil ve dişlerde meydana gelen acılık; pas
4 bar Sirke, pekmez gibi sıvıların üzerinde sonradan oluşan köpük veya küf
4 bar Su kaplarında su etkisiyle oluşan tortu veya kir
5 bar Halter sporunda ağırlığı oluşturan kiloları birbirine bağlayan metal çubuk
```

5 different words! Two of them (#2 and #4) have multiple meanings!

### Generating suggestions

You can query suggestions for misspelt words or for other similar words.

```python
from difflib import get_close_matches
import tdk

# Calculate suggestions locally using the index:
suggestions = get_close_matches("feldispat", tdk.gts.get_index_sync())
# assert suggestions == ['feldspat', 'ispat', 'fesat']

# Use the TDK API: (sometimes errors out)
suggestions = tdk.gts.get_suggestions_sync("feldispat")
# assert suggestions == ['feldspat', 'felekiyat', 'ispat']
```
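
The local variant relies only on the standard library's `difflib.get_close_matches`, so its behaviour can be tried without network access. The sketch below is a minimal illustration under assumptions: `sample_index` is a small hypothetical stand-in for the list returned by `tdk.gts.get_index_sync()`, and `suggest` is a helper name invented here, not part of `tdk-py`.

```python
from difflib import get_close_matches

# Hypothetical stand-in for the full index returned by tdk.gts.get_index_sync().
sample_index = ["feldspat", "ispat", "fesat", "felek", "kitap"]


def suggest(word, index, limit=3, cutoff=0.6):
    """Return up to `limit` entries of `index` that resemble `word`.

    `cutoff` is the minimum similarity ratio (0.0 to 1.0) an entry needs
    in order to be considered a match; results are ordered best first.
    """
    return get_close_matches(word, index, n=limit, cutoff=cutoff)


suggestions = suggest("feldispat", sample_index)
print(suggestions)  # ['feldspat', 'ispat', 'fesat']

# A stricter cutoff keeps only near-identical spellings.
print(suggest("feldispat", sample_index, cutoff=0.9))  # ['feldspat']
```

Raising `cutoff` trades recall for precision, which may be useful when the full index is large and loose matches are noisy.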

### Performing complex analyses

You can perform complex analyses very easily. Let's see the distribution
of entries by the maximum number of consecutive consonants.

```python
import tdk
annotated_dict = {}
for entry in tdk.gts.get_index_sync():
    streaks = tdk.etc.tools.max_streak(entry)
    if streaks not in annotated_dict:
        annotated_dict[streaks] = [entry]
    else:
        annotated_dict[streaks].append(entry)
for i in set(annotated_dict):
    print(i, len(annotated_dict[i]))
```
```
```{code-block}
:caption: Output
0 19
1 15199
2 73511
3 3605
4 68
5 5
```
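
To make the idea behind this analysis concrete without querying the TDK, the same kind of distribution can be computed over a handful of words. This is only a sketch under assumptions: `max_consonant_streak` is a hypothetical local helper written for illustration (it is not `tdk.etc.tools.max_streak`, whose implementation is not shown here), the vowel set is an assumption, and `sample_words` stands in for the real index.

```python
from collections import Counter

# Assumption: these letters count as vowels; any other letter is a consonant.
VOWELS = set("aeıioöuüâîû")


def max_consonant_streak(word):
    """Return the length of the longest run of consecutive consonants.

    Assumes `word` is already lowercase.
    """
    longest = current = 0
    for ch in word:
        if ch.isalpha() and ch not in VOWELS:
            current += 1
            longest = max(longest, current)
        else:
            # A vowel (or a non-letter such as a space) ends the current run.
            current = 0
    return longest


# Hypothetical sample standing in for tdk.gts.get_index_sync().
sample_words = ["bar", "ispat", "feldspat", "merkeziyetçilik", "aa"]

# Count how many words fall into each streak length.
distribution = Counter(max_consonant_streak(w) for w in sample_words)
for streak in sorted(distribution):
    print(streak, distribution[streak])
```

Using `Counter` keeps only the counts per streak length; the dictionary-of-lists approach above is the better fit when the words themselves are needed afterwards.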

```{toctree}
:maxdepth: 2
:caption: Contents:
apidocs/index.rst
```