Commit 651b652

Improve docs.
1 parent 6072bc8 commit 651b652

2 files changed

+113
-83
lines changed

c/examples/haploid_wright_fisher.c

+3-1
@@ -53,7 +53,9 @@ simulate(tsk_table_collection_t *tables, int N, int T, int simplify_interval, gs
5353
if (t % simplify_interval == 0) {
5454
printf("Simplify at generation %d: (%d nodes %d edges)", t,
5555
tables->nodes.num_rows, tables->edges.num_rows);
56-
ret = tsk_table_collection_sort(tables, 0, 0); /* FIXME; should take position. */
56+
/* Note: Edges must be sorted for simplify to work, and we use a brute force
57+
* approach of sorting each time here for simplicity. This is inefficient. */
58+
ret = tsk_table_collection_sort(tables, NULL, 0);
5759
check_tsk_error(ret);
5860
ret = tsk_table_collection_simplify(tables, children, N, 0, NULL);
5961
check_tsk_error(ret);

docs/c-api.rst

+110-82
@@ -4,83 +4,41 @@
44
C API
55
=====
66

7-
.. warning::
8-
**This section is under construction, incomplete and experiemental!!**
7+
This is the documentation for the ``tskit`` C API, a low-level library
8+
for manipulating and processing :ref:`tree sequence data <sec_data_model>`.
9+
The library is written using the C99 standard and is fully thread safe.
10+
Tskit uses `kastore <https://kastore.readthedocs.io/>`_ to define a
11+
simple storage format for the tree sequence data.
12+
13+
To see the API in action, please see the :ref:`sec_c_api_examples` section.
914

1015
********
1116
Overview
1217
********
1318

14-
The ``tskit`` C API is a low-level library for manipulating and processing
15-
:ref:`tree sequence data <sec_data_model>`.
16-
17-
For high level operations
18-
and operations that are not performance sensitive, the :ref:`sec_python_api`
19-
is much more useful. The Python code uses this C API under the hood and
20-
so there's often no real performance penalty for using Python, and Python
21-
is *much* more convenient than C. This C API is useful in the following
22-
situtations:
23-
24-
- You want to use the ``tskit`` API in a larger C/C++ application;
25-
- You need to do lots of tree traversals/loops etc to analyse some data.
26-
27-
The library is written using the C99 standard and is fully thread safe.
28-
29-
---------------------------
30-
Using tskit in your project
31-
---------------------------
32-
33-
Tskit is intended to be embedded directly into applications that use it.
34-
That is, rather than linking against a shared ``tskit`` library, applications
35-
compile and embed their own copy of ``tskit``. As ``tskit`` is quite small, consisting
36-
of only a handful of C files, this is much more convenient and avoids many
37-
of the headaches caused by shared libraries.
38-
39-
The simplest way to include ``tskit`` in your C/C++ project is to
40-
use git submodule.
41-
42-
.. todo:: Set up an example project repo on GitHub and go through
43-
the steps of getting the submodule set up properly.
44-
45-
46-
If you don't use git (or prefer not to use submodules), then you can simply
47-
copy the ``tskit`` and ``kastore`` source files into your own repository.
48-
Please ensure that the files you use correspond to a **released version**
49-
of the API by checking out the appropriate tag on git.
50-
51-
We may distribute ``tskit`` as a shared library in the future, however.
52-
53-
-----------------
54-
Code organisation
55-
-----------------
56-
57-
Tskit is organised in a modular way, allowing users to pick and choose which
58-
parts of the library that they compile into their application. The functionality
59-
defined in each header file corresponds to one C file, giving fine-grained access
60-
to the functionality that is required for different applications.
61-
62-
Core functionality such as error handling required by all of ``tskit`` is
63-
defined in ``tsk_core.[c,h]``. Client code should not need to include ``tsk_core.h``
64-
as it is included in all other ``tskit`` headers.
65-
66-
The :ref:`sec_c_api_tables_api` is defined in ``tsk_tables.[c,h]``. Tree sequence
67-
and tree :ref:`functionality <sec_c_api_tree_sequences>` is defined in
68-
``tsk_trees.[c,h]``.
19+
--------------------
20+
Do I need the C API?
21+
--------------------
6922

70-
.. todo:: When the remaining types have been finalised and documented add the
71-
descriptions in here.
23+
The ``tskit`` C API is generally useful in the following situations:
7224

73-
For convenience, there is also a ``tskit.h`` header file that includes all
74-
of the functionality in ``tskit``.
25+
- You want to use the ``tskit`` API in a larger C/C++ application (e.g.,
26+
in order to output data in the ``.trees`` format);
27+
- You need to perform lots of tree traversals/loops etc to analyse some
28+
data that is in tree sequence form.
7529

30+
For high level operations that are not performance sensitive, the :ref:`sec_python_api`
31+
is generally more useful. Python is *much* more convenient than C,
32+
and since the ``tskit`` Python module is essentially a wrapper for this
33+
C library, there's often no real performance penalty for using it.
7634

7735
.. _sec_c_api_overview_structure:
7836

7937
-------------
8038
API structure
8139
-------------
8240

83-
Tskit uses a set of conventions to provide pseudo object oriented API. Each
41+
Tskit uses a set of conventions to provide a pseudo object oriented API. Each
8442
'object' is represented by a C struct and has a set of 'methods'. This is
8543
most easily explained by an example:
8644

@@ -109,8 +67,9 @@ and :c:func:`tsk_table_collection_copy` which automatically initialise
10967
the object by default for convenience). The free
11068
method must always be called to avoid leaking memory, even in the
11169
case of an error occurring during initialisation. If ``class_name_init`` has
112-
been called, we say the object has been "initialised"; if not,
113-
it is "uninitialised".
70+
been called successfully, we say the object has been "initialised"; if not,
71+
it is "uninitialised". After ``class_name_free`` has been called,
72+
the object is again uninitialised.
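
As a rough, self-contained sketch of this convention, the following code
defines a hypothetical ``demo_vector_t`` 'class' with its own init and free
methods. The type and function names are invented for illustration and are
not part of ``tskit``; the point is only that init zeroes the struct before
allocating, so that calling free is safe even when initialisation fails.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical 'class' following the init/free convention; NOT a tskit type. */
typedef struct {
    double *values; /* internal memory owned by the object */
    size_t num_values;
} demo_vector_t;

/* Initialise the object. Only *internal* memory is allocated here; the
 * caller provides the memory for the struct itself.
 * Returns 0 on success and a negative value on failure. */
int
demo_vector_init(demo_vector_t *self, size_t num_values)
{
    assert(self != NULL);
    /* Zero the struct first so that demo_vector_free is safe to call
     * even if the allocation below fails. */
    memset(self, 0, sizeof(*self));
    /* Allocate at least one element so that a NULL result always means failure. */
    self->values = calloc(num_values > 0 ? num_values : 1, sizeof(*self->values));
    if (self->values == NULL) {
        return -1;
    }
    self->num_values = num_values;
    return 0;
}

/* Release internal memory. Afterwards the object is uninitialised again. */
int
demo_vector_free(demo_vector_t *self)
{
    assert(self != NULL);
    free(self->values); /* free(NULL) is a no-op, so this is always safe */
    self->values = NULL;
    self->num_values = 0;
    return 0;
}
```

A caller would typically declare ``demo_vector_t v;`` on the stack, call
``demo_vector_init(&v, 4)``, check the return value, and always finish with
``demo_vector_free(&v)``.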
11473

11574
It is important to note that the init methods only allocate *internal* memory;
11675
the memory for the instance itself must be allocated either on the
@@ -134,24 +93,86 @@ heap or the stack:
13493
Error handling
13594
--------------
13695

137-
Every function in ``tskit`` (except for trivial accessor methods) returns
96+
C does not have a mechanism for propagating exceptions, and great care
97+
must be taken to ensure that errors are correctly and safely handled.
98+
The convention adopted in ``tskit`` is that
99+
every function (except for trivial accessor methods) returns
138100
an integer. If this return value is negative an error has occurred which
139-
must be handled.
101+
must be handled. A description of the error that occurred can be obtained
102+
using the :c:func:`tsk_strerror` function. The following example illustrates
103+
the key conventions around error handling in ``tskit``:
140104

141105
.. literalinclude:: ../c/examples/error_handling.c
142106
:language: c
143107

144108
In this example we load a tree sequence from file and print out a summary
145109
of the number of nodes and edges it contains. After calling
146-
:c:func:`tsk_treeseq_load` we check it's return value ``ret`` to see
147-
if an error occured. If an error happens we with an error message produced with
148-
:c:func:`tsk_strerror`. Note that in this example we call
149-
``tsk_treeseq_free`` whether or not an error occurs: in general,
150-
once ``X_alloc`` (or ``load`` here) is called ``X_free`` must also
110+
:c:func:`tsk_treeseq_load` we check the return value ``ret`` to see
111+
if an error occurred. If an error has occurred we exit with an error
112+
message produced by :c:func:`tsk_strerror`. Note that in this example we call
113+
:c:func:`tsk_treeseq_free` whether or not an error occurs: in general,
114+
once a function that initialises an object (e.g., ``X_init``, ``X_copy``
115+
or ``X_load``) is called, then ``X_free`` must
151116
be called to ensure that memory is not leaked.
152117

153-
Most functions in ``tskit`` can return an error status, and we
154-
**strongly** recommend that every return value is checked.
118+
Most functions in ``tskit`` return an error status; we recommend that **every**
119+
return value is checked.
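
The general shape of this convention can be sketched without ``tskit``
itself. In the following minimal example, all of the names (``demo_mean``,
``demo_strerror``, ``demo_run`` and the ``DEMO_ERR_*`` codes) are hypothetical
and exist only for illustration. It shows a function that returns a negative
integer on error, a helper that maps codes to messages, and a caller that
checks the return value and exits through a single cleanup label.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical error codes: negative values signal errors, 0 is success. */
#define DEMO_ERR_NO_MEMORY (-1)
#define DEMO_ERR_BAD_PARAM (-2)

/* Map an error code to a human-readable description. */
const char *
demo_strerror(int err)
{
    switch (err) {
        case 0:
            return "Success";
        case DEMO_ERR_NO_MEMORY:
            return "Out of memory";
        case DEMO_ERR_BAD_PARAM:
            return "Bad parameter value";
        default:
            return "Unknown error";
    }
}

/* Compute the mean of n values, storing it in *result.
 * The return value is reserved for the error status. */
int
demo_mean(const double *values, size_t n, double *result)
{
    double sum = 0;
    size_t j;

    if (values == NULL || result == NULL || n == 0) {
        return DEMO_ERR_BAD_PARAM;
    }
    for (j = 0; j < n; j++) {
        sum += values[j];
    }
    *result = sum / (double) n;
    return 0;
}

/* Caller: check every return value and leave through a single exit point. */
int
demo_run(const double *values, size_t n)
{
    int ret;
    double mean;

    ret = demo_mean(values, n, &mean);
    if (ret < 0) {
        fprintf(stderr, "Error: %s\n", demo_strerror(ret));
        goto out;
    }
    printf("mean = %f\n", mean);
out:
    /* Any objects initialised above would be freed here, error or not. */
    return ret;
}
```

Reserving the return value for the error status means that results are passed
back through pointer arguments, which keeps the checking pattern the same
across every call.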
120+
121+
---------------------------
122+
Using tskit in your project
123+
---------------------------
124+
125+
Tskit is built as a standard C library and so there are many different ways
126+
in which it can be included in downstream projects. It is possible to
127+
install ``tskit`` onto a system (i.e., installing a shared library and
128+
header files to standard locations on Unix) and link against it,
129+
but there are many different ways in which this can go wrong. In the
130+
interest of simplicity and improving the end-user experience we recommend
131+
embedding ``tskit`` directly into your applications.
132+
133+
There are many different build systems and approaches to compiling
134+
code, and so it's not possible to give definitive documentation on
135+
how ``tskit`` should be included in downstream projects. Please
136+
see the `build examples <https://github.com/tskit-dev/tskit-build-examples>`_
137+
repo for some examples of how to incorporate ``tskit`` into
138+
different project structures and build systems.
139+
140+
Tskit uses the `meson <https://mesonbuild.com>`_ build system internally,
141+
and supports being used as a `meson subproject <https://mesonbuild.com/Subprojects.html>`_.
142+
We show an `example <https://github.com/tskit-dev/tskit-build-examples/tree/master/meson>`_
143+
in which this is combined with
144+
`git submodules <https://git-scm.com/book/en/v2/Git-Tools-Submodules>`_ to neatly
145+
abstract many details of cross platform C development.
146+
147+
Some users may choose to check the source for ``tskit`` (and ``kastore``)
148+
directly into their source control repositories. If you wish to do this,
149+
the code is in the ``c`` subdirectory of the
150+
`tskit <https://github.com/tskit-dev/tskit/tree/master/c>`_ and
151+
`kastore <https://github.com/tskit-dev/kastore/tree/master/c>`__ repos.
152+
The following header files should be placed in the search path:
153+
``kastore.h``, ``tskit.h``, and ``tskit/*.h``.
154+
The C files ``kastore.c`` and ``tskit*.c`` should be compiled.
155+
For those who wish to minimise the size of their compiled binaries,
156+
``tskit`` is quite modular, and C files can be omitted if not needed.
157+
For example, if you are just using the :ref:`sec_c_api_tables_api` then
158+
only the files ``tskit/core.[c,h]`` and ``tskit/tables.[c,h]`` are
159+
needed.
160+
161+
However you include ``tskit`` in your project, please
162+
ensure that it is a **released version**. Released versions are
163+
tagged on GitHub using the convention ``C_{VERSION}``. The code
164+
can either be downloaded from GitHub on the `releases page
165+
<https://github.com/tskit-dev/tskit/releases>`_ or checked out
166+
using git. For example, to check out the ``C_0.99.0`` release::
167+
168+
$ git clone https://github.com/tskit-dev/tskit.git
169+
$ cd tskit
170+
$ git checkout C_0.99.0
171+
172+
Git submodules may also be considered---see the
173+
`example <https://github.com/tskit-dev/tskit-build-examples/tree/master/meson>`_
174+
for how to set these up and to check out at a specific release.
175+
155176

156177
***********
157178
Basic Types
@@ -161,20 +182,20 @@ Basic Types
161182
.. doxygentypedef:: tsk_size_t
162183
.. doxygentypedef:: tsk_flags_t
163184

185+
**************
186+
Common options
187+
**************
188+
189+
.. doxygengroup:: TABLES_API_FUNCTION_OPTIONS
190+
:content-only:
191+
164192
.. _sec_c_api_tables_api:
165193

166194
**********
167195
Tables API
168196
**********
169197

170-
The tables API section of ``tskit`` is defined in ``tsk_tables.h``.
171-
172-
--------------
173-
Common options
174-
--------------
175-
176-
.. doxygengroup:: TABLES_API_FUNCTION_OPTIONS
177-
:content-only:
198+
The tables API section of ``tskit`` is defined in the ``tskit/tables.h`` header.
178199

179200
-----------------
180201
Table collections
@@ -358,6 +379,7 @@ File format errors
358379

359380

360381

382+
.. _sec_c_api_examples:
361383

362384
********
363385
Examples
@@ -368,7 +390,13 @@ Basic forwards simulator
368390
------------------------
369391

370392
This is an example of using the tables API to define a simple
371-
haploid Wright-Fisher simulator.
393+
haploid Wright-Fisher simulator. Because this simple example
394+
repeatedly sorts the edge data, it is quite inefficient and
395+
should not be used as the basis of a large-scale simulator.
396+
397+
.. todo::
398+
Give a pointer to an example that caches and flushes edge data efficiently.
399+
Probably using the C++ API?
372400

373401
.. literalinclude:: ../c/examples/haploid_wright_fisher.c
374402
:language: c
