diff --git a/docs/analysis.rst b/docs/analysis.rst
index ceb7f1e6..f2db2c57 100644
--- a/docs/analysis.rst
+++ b/docs/analysis.rst
@@ -424,6 +424,69 @@ contacts to link definitions from :ref:`monomer library <CCD_etc>`
 and to connections (LINK, SSBOND) from the structure.
 If you find it useful, please contact the author.
 
+Matthews coefficient
+====================
+
+The Matthews coefficient V\ :sub:`M` is defined as the crystal volume
+per unit of protein molecular weight. Typically, the molecular weight
+for V\ :sub:`M` is calculated from the sequence,
+and that is what most of this section covers.
+
+First, let's read a structure and get a protein sequence:
+
+.. doctest::
+
+  >>> st = gemmi.read_structure('../tests/5cvz_final.pdb')
+  >>> st.setup_entities()  # it should sort out chain parts
+  >>> list(st[0])
+  [<gemmi.Chain A with 141 res>]
+  >>> # we have just a single chain, which makes this example simpler
+  >>> chain = st[0]['A']
+  >>> chain.get_polymer()
+  <gemmi.ResidueSpan of 141: Axp [17(ALA) 18(ALA) 19(ALA) ... 157(SER)]>
+  >>> st.get_entity_of(_)  # doctest: +ELLIPSIS
+  <gemmi.Entity 'A' polymer polypeptide(L) object at 0x...>
+  >>> sequence = _.full_sequence
+
+Gemmi provides a simple function to calculate molecular weight
+from the sequence using the built-in table of popular residues:
+
+.. doctest::
+
+  >>> weight = gemmi.calculate_sequence_weight(sequence)
+  >>> # Now we can calculate the Matthews coefficient
+  >>> st.cell.volume_per_image() / weight
+  3.1983428753317003
+
+We can continue and calculate the solvent content, assuming a protein
+density of 1.35 g/cm\ :sup:`3` (the other constants below are the Avogadro
+number and Å\ :sup:`3`/cm\ :sup:`3` = 10\ :sup:`-24`):
+
+.. doctest::
+
+  >>> protein_fraction = 1. / (6.02214e23 * 1e-24 * 1.35 * _)
+  >>> print('Solvent content: {:.1f}%'.format(100 * (1 - protein_fraction)))
+  Solvent content: 61.5%
+
+If the sequence includes rare chemical components
+(outside of the top 300+ most popular components in the PDB), you may
+specify the average weight of the components that are not tabulated:
+
+.. doctest::
+
+  >>> sequence = ['DSN', 'ALA', 'N2C', 'MVA', 'DSN', 'ALA', 'NCY', 'MVA']
+  >>> gemmi.calculate_sequence_weight(sequence, unknown=130.0)
+  784.6114543066407
+
+The weights are assumed to be of unbonded residues. Therefore, the chain weight
+is calculated as a sum of all components minus
+(*N*--1) × weight of H\ :sub:`2`\ O.
+
+.. note::
+
+    Gemmi includes a program that calculates the Matthews coefficient
+    and the solvent content: :ref:`gemmi-contents <gemmi-contents>`.
+
 Superposition
 =============
 
@@ -1131,89 +1194,3 @@ where
 
 
 TBC
-
-.. _pdb_dir:
-
-Local copy of the PDB archive
-=============================
-
-Some of the examples in this documentation work with a local copy
-of the Protein Data Bank archive. This subsection describes
-the assumed setup.
-
-Like in BioJava, we assume that the `$PDB_DIR` environment variable
-points to a directory that contains `structures/divided/mmCIF` -- the same
-arrangement as on the
-`PDB's FTP <ftp://ftp.wwpdb.org/pub/pdb/data/structures/>`_ server.
-
-.. code-block:: console
-
-    $ cd $PDB_DIR
-    $ du -sh structures/*/*  # as of Jun 2017
-    34G    structures/divided/mmCIF
-    25G    structures/divided/pdb
-    101G   structures/divided/structure_factors
-    2.6G   structures/obsolete/mmCIF
-
-A traditional way to keep an up-to-date local archive is to rsync it
-once a week:
-
-.. code-block:: shell
-
-    #!/bin/sh -x
-    set -u  # PDB_DIR must be defined
-    rsync_subdir() {
-      mkdir -p "$PDB_DIR/$1"
-      # Using PDBe (UK) here, can be replaced with RCSB (USA) or PDBj (Japan),
-      # see https://www.wwpdb.org/download/downloads
-      rsync -rlpt -v -z --delete \
-	  rsync.ebi.ac.uk::pub/databases/pdb/data/$1/ "$PDB_DIR/$1/"
-    }
-    rsync_subdir structures/divided/mmCIF
-    #rsync_subdir structures/obsolete/mmCIF
-    #rsync_subdir structures/divided/pdb
-    #rsync_subdir structures/divided/structure_factors
-
-Gemmi has a helper function for using the local archive copy.
-It takes a PDB code (case insensitive) and a symbol denoting what file
-is requested: P for PDB, M for mmCIF, S for SF-mmCIF.
-
-.. doctest::
-
-  >>> os.environ['PDB_DIR'] = '/copy'
-  >>> gemmi.expand_if_pdb_code('1ABC', 'P') # PDB file
-  '/copy/structures/divided/pdb/ab/pdb1abc.ent.gz'
-  >>> gemmi.expand_if_pdb_code('1abc', 'M') # mmCIF file
-  '/copy/structures/divided/mmCIF/ab/1abc.cif.gz'
-  >>> gemmi.expand_if_pdb_code('1abc', 'S') # SF-mmCIF file
-  '/copy/structures/divided/structure_factors/ab/r1abcsf.ent.gz'
-
-If the first argument is not in the PDB code format (4 characters for now)
-the function returns the first argument.
-
-.. doctest::
-
-  >>> arg = 'file.cif'
-  >>> gemmi.is_pdb_code(arg)
-  False
-  >>> gemmi.expand_if_pdb_code(arg, 'M')
-  'file.cif'
-
-Multiprocessing
-===============
-
-(Python-specific)
-
-Most of the gemmi objects cannot be pickled. Therefore, they cannot be
-passed between processes when using the multiprocessing module.
-Currently, the only picklable classes (with protocol >= 2) are:
-UnitCell and SpaceGroup.
-
-Usually, it is possible to organize multiprocessing in such a way that
-gemmi objects are not passed between processes. The example script below
-traverses subdirectories and asynchronously analyzes coordinate files,
-using 4 worker processes in parallel.
-
-.. literalinclude:: ../examples/multiproc.py
-   :language: python
-   :lines: 4-
diff --git a/docs/chemistry.rst b/docs/chemistry.rst
index 353974fc..5c513061 100644
--- a/docs/chemistry.rst
+++ b/docs/chemistry.rst
@@ -476,49 +476,3 @@ The `logging` argument above is described in the next section.
 
 TBC
 
-
-.. _logger:
-
-Logger
-======
-
-Gemmi Logger is a tiny helper class for passing messages from a gemmi function
-to the calling function. It doesn't belong in this section, but it's
-documented here because it's used in the previous subsection and I haven't found
-a better spot for it.
-
-The messages being passed are usually info or warnings that a command-line
-program would print to stdout or stderr.
-
-The Logger has two member variables:
-
-.. literalinclude:: ../include/gemmi/logger.hpp
-   :language: cpp
-   :start-at: ///
-   :end-at: int threshold
-
-and a few member functions for sending messages.
-
-When a function takes a Logger argument, we can pass:
-
-**C++**
-
-* `{&Logger::to_stderr}` to redirect messages to stderr
-  (to_stderr() calls fprintf),
-* `{&Logger::to_stdout}` to redirect messages to stdout,
-* `{&Logger::to_stdout, 3}` to print only warnings (threshold=3),
-* `{nullptr, 0}` to disable all messages,
-* `{}` to throw errors and ignore other messages (the default, see Quirk above),
-* `{[](const std::string& s) { do_anything(s);}}` to do anything else.
-
-**Python**
-
-* `sys.stderr` or `sys.stdout` or any other stream (an object with `write`
-  and `flush` methods), to redirect messages to that stream,
-* `(sys.stdout, 3)` to print only warnings (threshold=3),
-* `(None, 0)` to disable all messages,
-* `None` to throw errors and ignore other messages (the default, see Quirk above),
-* a function that takes a message string as its only argument
-  (e.g. `lambda s: print(s.upper())`).
-
-
diff --git a/docs/conf.py b/docs/conf.py
index a785fcbf..ecdf8008 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -21,7 +21,8 @@
             version = _line.split()[2].strip('"')
 release = version
 
-exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store']
+# not sure if we'll use headers.rst again, disable it for now
+exclude_patterns = ['_build', 'Thumbs.db', '.DS_Store', 'headers.rst']
 pygments_style = 'sphinx'
 todo_include_todos = False
 highlight_language = 'cpp'
@@ -43,6 +44,30 @@
 html_show_sourcelink = False
 html_copy_source = False
 
+def setup(app):
+    app.connect("builder-inited", monkey_patching_furo)
+
+def monkey_patching_furo(app):
+    if app.builder.name != 'html':
+        return
+
+    import furo
+    def _compute_navigation_tree(context: "Dict[str, Any]") -> str:
+        # The navigation tree, generated from the sphinx-provided ToC tree.
+        if "toctree" in context:
+            toctree = context["toctree"]
+            toctree_html = toctree(
+                collapse=False,
+                titles_only=False,
+                maxdepth=2,
+                includehidden=True,
+            )
+        else:
+            toctree_html = ""
+        return furo.get_navigation_tree(toctree_html)
+
+    furo._compute_navigation_tree = _compute_navigation_tree
+
 # -- Options for LaTeX output ---------------------------------------------
 
 latex_elements = {
diff --git a/docs/hkl.rst b/docs/hkl.rst
index 62475caf..2eb9bd32 100644
--- a/docs/hkl.rst
+++ b/docs/hkl.rst
@@ -1001,6 +1001,11 @@ program documentation for details.
   >>> # and convert it back
   >>> cif_string = gemmi.MtzToCif().write_cif_to_string(_)
 
+XDS_ASCII
+=========
+
+TODO: document functions from `xds_ascii.hpp`
+
 
 SX hkl CIF
 ==========
diff --git a/docs/index.rst b/docs/index.rst
index c9617963..ae9733e0 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -1,11 +1,15 @@
 .. meta::
    :google-site-verification: LsEfb1rjo2RL8WOSZGigV11Kgyhtk9v1Vb-6GZFnHKo
 
-GEMMI - library for structural biology
-======================================
+Overview
+########
 
-Gemmi is a library, accompanied by a set of programs,
-developed primarily for use in **macromolecular crystallography** (MX).
+What is it for?
+===============
+
+Gemmi is a library, accompanied by a :ref:`set of programs <program>`,
+developed primarily for use in **structural biology**,
+and in particular in **macromolecular crystallography** (MX).
 For working with:
 
 * macromolecular models (content of PDB, PDBx/mmCIF and mmJSON files),
@@ -53,22 +57,109 @@ Source code repository: https://github.com/project-gemmi/gemmi
 .. _me: wojdyr+gemmi@gmail.com
 
 Contents
---------
+========
 
 .. toctree::
-   :maxdepth: 2
+   :maxdepth: 1
 
-   Introduction <self>
+   Overview <self>
    install
+   program
+
+.. toctree::
+   :caption: Prerequisites
+   :maxdepth: 2
+
    cif
    symmetry
    cell
+   misc
+
+.. toctree::
+   :caption: Working with Molecules
+   :maxdepth: 2
+
    chemistry
    mol
    analysis
+
+.. toctree::
+   :caption: Working with Data
+   :maxdepth: 2
+
    grid
    hkl
    scattering
-   program
+
+.. toctree::
+   :caption: Other Docs
+
+   ChangeLog <https://github.com/project-gemmi/gemmi/releases>
    Python API reference <https://project-gemmi.github.io/python-api/>
    C++ API reference <https://project-gemmi.github.io/cxx-api/>
+
+Credits
+=======
+
+This project uses code from a number of third-party open-source projects.
+
+Projects used in the C++ library, included under
+`include/gemmi/third_party/` (if used in headers) or `third_party/`:
+
+* `PEGTL <https://github.com/taocpp/PEGTL/>`_ -- library for creating PEG
+  parsers. License: MIT.
+* `sajson <https://github.com/chadaustin/sajson>`_ -- high-performance
+  JSON parser. License: MIT.
+* `PocketFFT <https://gitlab.mpcdf.mpg.de/mtr/pocketfft>`_ -- FFT library.
+  License: 3-clause BSD.
+* `stb_sprintf <https://github.com/nothings/stb>`_ -- locale-independent
+  snprintf() implementation. License: Public Domain.
+* `fast_float <https://github.com/fastfloat/fast_float>`_ -- locale-independent
+  number parsing. License: Apache 2.0.
+* `tinydir <https://github.com/cxong/tinydir>`_ -- directory (filesystem)
+  reader. License: 2-clause BSD.
+
+Code derived from the following projects is used in the library:
+
+* `ksw2 <https://github.com/lh3/ksw2>`_ -- sequence alignment in
+  `seqalign.hpp` is based on the ksw_gg function from ksw2. License: MIT.
+* `QCProt <https://theobald.brandeis.edu/qcp/>`_ -- superposition method
+  in `qcp.hpp` is taken from QCProt and adapted to our project. License: BSD.
+* `Larch <https://github.com/xraypy/xraylarch>`_ -- calculation of f' and f"
+  in `fprime.cpp` is based on CromerLiberman code from Larch.
+  License: 2-clause BSD.
+
+Projects included under `third_party/` that are not used in the library
+itself, but are used in command-line utilities, Python bindings or tests:
+
+* `zpp serializer <https://github.com/eyalz800/serializer>`_ --
+  serialization framework. License: MIT.
+* `The Lean Mean C++ Option Parser <http://optionparser.sourceforge.net/>`_ --
+  command-line option parser. License: MIT.
+* `doctest <https://github.com/onqtam/doctest>`_ -- testing framework.
+  License: MIT.
+* `linalg.h <http://github.com/sgorsten/linalg/>`_ -- linear algebra library.
+  License: Public Domain.
+* `zlib <https://github.com/madler/zlib>`_ -- a subset of the zlib library
+  for decompressing gz files, used as a fallback when the zlib library
+  is not found in the system. License: zlib.
+
+Not distributed with Gemmi:
+
+* `nanobind <https://github.com/wjakob/nanobind>`_ -- used for creating
+  Python bindings. License: 3-clause BSD.
+* `zlib-ng <https://github.com/zlib-ng/zlib-ng>`_ -- optional, can be used
+  instead of zlib for faster reading of gzipped files.
+* `cctbx <https://github.com/cctbx/cctbx_project>`_ -- used in tests
+  (if cctbx is not present, these tests are skipped) and
+  in scripts that generated space group data and 2-fold twinning operations.
+  License: 3-clause BSD.
+
+Mentions:
+
+* `NLOpt <https://github.com/stevengj/nlopt>`_
+  was used to try out various optimization methods for class Scaling.
+  License: MIT.
+
+Email me if I forgot about something.
+
diff --git a/docs/install.rst b/docs/install.rst
index f2495264..cf44e921 100644
--- a/docs/install.rst
+++ b/docs/install.rst
@@ -225,76 +225,3 @@ We also have *Python doctest* tests in the documentation,
 and a few other test routines.
 All the commands used for testing are listed in the `run-tests.sh`
 script in the repository.
-
-Credits
--------
-
-This project is using code from a number of third-party open-source projects.
-
-Projects used in the C++ library, included under
-`include/gemmi/third_party/` (if used in headers) or `third_party/`:
-
-* `PEGTL <https://github.com/taocpp/PEGTL/>`_ -- library for creating PEG
-  parsers. License: MIT.
-* `sajson <https://github.com/chadaustin/sajson>`_ -- high-performance
-  JSON parser. License: MIT.
-* `PocketFFT <https://gitlab.mpcdf.mpg.de/mtr/pocketfft>`_ -- FFT library.
-  License: 3-clause BSD.
-* `stb_sprintf <https://github.com/nothings/stb>`_ -- locale-independent
-  snprintf() implementation. License: Public Domain.
-* `fast_float <https://github.com/fastfloat/fast_float>`_ -- locale-independent
-  number parsing. License: Apache 2.0.
-* `tinydir <https://github.com/cxong/tinydir>`_ -- directory (filesystem)
-  reader. License: 2-clause BSD.
-
-Code derived from the following projects is used in the library:
-
-* `ksw2 <https://github.com/lh3/ksw2>`_ -- sequence alignment in
-  `seqalign.hpp` is based on the ksw_gg function from ksw2. License: MIT.
-* `QCProt <https://theobald.brandeis.edu/qcp/>`_ -- superposition method
-  in `qcp.hpp` is taken from QCProt and adapted to our project. License: BSD.
-* `Larch <https://github.com/xraypy/xraylarch>`_ -- calculation of f' and f"
-  in `fprime.cpp` is based on CromerLiberman code from Larch.
-  License: 2-clause BSD.
-
-Projects included under `third_party/` that are not used in the library
-itself, but are used in command-line utilities, python bindings or tests:
-
-* `zpp serializer <https://github.com/eyalz800/serializer>`_ --
-  serialization framework. License: MIT.
-* `The Lean Mean C++ Option Parser <http://optionparser.sourceforge.net/>`_ --
-  command-line option parser. License: MIT.
-* `doctest <https://github.com/onqtam/doctest>`_ -- testing framework.
-  License: MIT.
-* `linalg.h <http://github.com/sgorsten/linalg/>`_ -- linear algebra library.
-  License: Public Domain.
-* `zlib <https://github.com/madler/zlib>`_ -- a subset of the zlib library
-  for decompressing gz files, used as a fallback when the zlib library
-  is not found in the system. License: zlib.
-
-Not distributed with Gemmi:
-
-* `nanobind <https://github.com/wjakob/nanobind>`_ -- used for creating
-  Python bindings. License: 3-clause BSD.
-* `zlib-ng <https://github.com/zlib-ng/zlib-ng>`_ -- optional, can be used
-  instead of zlib for faster reading of gzipped files.
-* `cctbx <https://github.com/cctbx/cctbx_project>`_ -- used in tests
-  (if cctbx is not present, these tests are skipped) and
-  in scripts that generated space group data and 2-fold twinning operations.
-  License: 3-clause BSD.
-
-Mentions:
-
-* `NLOpt <https://github.com/stevengj/nlopt>`_
-  was used to try out various optimization methods for class Scaling.
-  License: MIT.
-
-Email me if I forgot about something.
-
-List of C++ headers
--------------------
-
-Here is a list of C++ headers in `gemmi/include/`.
-This list also provides an overview of the library.
-
-.. include:: headers.rst
diff --git a/docs/misc.rst b/docs/misc.rst
new file mode 100644
index 00000000..fba497b8
--- /dev/null
+++ b/docs/misc.rst
@@ -0,0 +1,149 @@
+Miscellaneous utils
+###################
+
+FASTA and PIR reader
+====================
+
+Gemmi provides a function to parse two sequence file formats, FASTA and PIR.
+The function takes a string containing the file's content as an argument:
+
+.. doctest::
+
+  >>> with open('P0C805.fasta') as f:
+  ...     fasta_str = f.read()
+  >>> gemmi.read_pir_or_fasta(fasta_str)  #doctest: +ELLIPSIS
+  [<gemmi.FastaSeq object at 0x...>]
+
+The string must start with a header line that begins with `>`.
+In the case of the PIR format, which starts with `>P1;` (or F1, DL, DC, RL, RC,
+or XX instead of P1), the next line is also part of the header.
+The sequence file may contain multiple sequences, each preceded by a header.
+Whitespace in a sequence is ignored, except for blank lines,
+which are only allowed between sequences.
+A sequence can contain letters, dashes, and residue names in parentheses.
+The latter is an extension inspired by the format used in mmCIF files,
+in which non-standard residues are given in parentheses, e.g., `MA(MSE)GVN`.
+The sequence may end with `*`.
+
+`FastaSeq` objects, returned from `read_pir_or_fasta()`,
+contain only two strings:
+
+.. doctest::
+
+  >>> (fasta_seq,) = _
+  >>> fasta_seq.header
+  'sp|P0C805|PSMA3_STAA8 Phenol-soluble modulin alpha 3 peptide OS=Staphylococcus aureus (strain NCTC 8325 / PS 47) OX=93061 GN=psmA3 PE=1 SV=1'
+  >>> fasta_seq.seq
+  'MEFVAKLFKFFKDLLGKFLGNN'
+
+.. _logger:
+
+Logger
+======
+
+Gemmi Logger is a tiny helper class for passing messages from a gemmi function
+to the calling function. It is documented among the miscellaneous utilities
+because it is not tied to any single part of the library and is accepted
+as an argument by functions described in other sections.
+
+The messages being passed are usually info or warnings that a command-line
+program would print to stdout or stderr.
+
+The Logger has two member variables:
+
+.. literalinclude:: ../include/gemmi/logger.hpp
+   :language: cpp
+   :start-at: ///
+   :end-at: int threshold
+
+and a few member functions for sending messages.
+
+When a function takes a Logger argument, we can pass:
+
+**C++**
+
+* `{&Logger::to_stderr}` to redirect messages to stderr
+  (to_stderr() calls fprintf),
+* `{&Logger::to_stdout}` to redirect messages to stdout,
+* `{&Logger::to_stdout, 3}` to print only warnings (threshold=3),
+* `{nullptr, 0}` to disable all messages,
+* `{}` to throw errors and ignore other messages (the default),
+* `{[](const std::string& s) { do_anything(s);}}` to do anything else.
+
+**Python**
+
+* `sys.stderr` or `sys.stdout` or any other stream (an object with `write`
+  and `flush` methods), to redirect messages to that stream,
+* `(sys.stdout, 3)` to print only warnings (threshold=3),
+* `(None, 0)` to disable all messages,
+* `None` to throw errors and ignore other messages (the default),
+* a function that takes a message string as its only argument
+  (e.g. `lambda s: print(s.upper())`).
+
+
+.. _pdb_dir:
+
+Copy of the PDB archive
+=======================
+
+Some of the examples in this documentation work with a local copy
+of the Protein Data Bank archive. This subsection describes
+the assumed setup and functions for working with this setup.
+
+Like in BioJava, we assume that the `$PDB_DIR` environment variable
+points to a directory that contains `structures/divided/mmCIF` -- the same
+arrangement as on the
+`PDB's FTP <ftp://ftp.wwpdb.org/pub/pdb/data/structures/>`_ server.
+
+.. code-block:: console
+
+    $ cd $PDB_DIR
+    $ du -sh structures/*/*  # as of Jun 2017
+    34G    structures/divided/mmCIF
+    25G    structures/divided/pdb
+    101G   structures/divided/structure_factors
+    2.6G   structures/obsolete/mmCIF
+
+A traditional way to keep an up-to-date local archive is to rsync it
+once a week:
+
+.. code-block:: shell
+
+    #!/bin/sh -x
+    set -u  # PDB_DIR must be defined
+    rsync_subdir() {
+      mkdir -p "$PDB_DIR/$1"
+      # Using PDBe (UK) here, can be replaced with RCSB (USA) or PDBj (Japan),
+      # see https://www.wwpdb.org/download/downloads
+      rsync -rlpt -v -z --delete \
+	  rsync.ebi.ac.uk::pub/databases/pdb/data/$1/ "$PDB_DIR/$1/"
+    }
+    rsync_subdir structures/divided/mmCIF
+    #rsync_subdir structures/obsolete/mmCIF
+    #rsync_subdir structures/divided/pdb
+    #rsync_subdir structures/divided/structure_factors
+
+Gemmi has a helper function for using the local archive copy.
+It takes a PDB code (case insensitive) and a symbol denoting what file
+is requested: P for PDB, M for mmCIF, S for SF-mmCIF.
+
+.. doctest::
+
+  >>> os.environ['PDB_DIR'] = '/copy'
+  >>> gemmi.expand_if_pdb_code('1ABC', 'P') # PDB file
+  '/copy/structures/divided/pdb/ab/pdb1abc.ent.gz'
+  >>> gemmi.expand_if_pdb_code('1abc', 'M') # mmCIF file
+  '/copy/structures/divided/mmCIF/ab/1abc.cif.gz'
+  >>> gemmi.expand_if_pdb_code('1abc', 'S') # SF-mmCIF file
+  '/copy/structures/divided/structure_factors/ab/r1abcsf.ent.gz'
+
+If the first argument is not in the PDB code format (4 characters for now),
+the function returns it unchanged.
+
+.. doctest::
+
+  >>> arg = 'file.cif'
+  >>> gemmi.is_pdb_code(arg)
+  False
+  >>> gemmi.expand_if_pdb_code(arg, 'M')
+  'file.cif'
diff --git a/docs/mol.rst b/docs/mol.rst
index 04268be8..446bc436 100644
--- a/docs/mol.rst
+++ b/docs/mol.rst
@@ -24,9 +24,9 @@ Reading coordinate files
 
 Gemmi support the following coordinate file formats:
 
-    * mmCIF (PDBx/mmCIF),
-    * PDB (with popular extensions),
-    * mmJSON.
+* mmCIF (PDBx/mmCIF),
+* PDB (with popular extensions),
+* mmJSON.
 
 It can also read coordinates from the chemical components dictionary
 (CCD) and from Refmac monomer library -- these are not really coordinate
@@ -1885,95 +1885,6 @@ way around, if we know the kind of residues encoded with single letters:
   ['DSN', 'ALA', 'N2C', 'MVA', 'DSN', 'ALA', 'NCY', 'MVA']
 
 
-Molecular weight
-----------------
-
-Gemmi provides a simple function to calculate molecular weight
-from the sequence. It uses the same built-in table of popular residues.
-Since in this example we have two rare components that are not tabulated,
-we must specify the average weight of unknown residue:
-
-.. doctest::
-
-  >>> gemmi.calculate_sequence_weight(seq, unknown=130.0)
-  784.6114543066407
-
-In such case the result is not accurate, but this is not a typical case.
-
-Now we will take a PDB file with standard residues
-and calculate the Matthews coefficient:
-
-.. doctest::
-
-  >>> st = gemmi.read_structure('../tests/5cvz_final.pdb')
-  >>> list(st[0])
-  [<gemmi.Chain A with 141 res>]
-  >>> # we have just a single chain, which makes this example simpler
-  >>> chain = st[0]['A']
-  >>> chain.get_polymer()
-  <gemmi.ResidueSpan of 0: []>
-  >>> # Not good. The chain parts where not assigned automatically,
-  >>> # because of the missing TER record in this file. We need to call:
-  >>> st.setup_entities()  # it should sort out chain parts
-  >>> chain.get_polymer()
-  <gemmi.ResidueSpan of 141: Axp [17(ALA) 18(ALA) 19(ALA) ... 157(SER)]>
-  >>> st.get_entity_of(_)  # doctest: +ELLIPSIS
-  <gemmi.Entity 'A' polymer polypeptide(L) object at 0x...>
-  >>> weight = gemmi.calculate_sequence_weight(_.full_sequence)
-  >>> # Now we can calculate Matthews coefficient
-  >>> st.cell.volume_per_image() / weight
-  3.1983428753317003
-
-We could continue and calculate the solvent content, assuming the protein
-density of 1.35 g/cm\ :sup:`3` (the other constants below are the Avogadro
-number and Å\ :sup:`3`/cm\ :sup:`3` = 10\ :sup:`-24`):
-
-.. doctest::
-
-  >>> protein_fraction = 1. / (6.02214e23 * 1e-24 * 1.35 * _)
-  >>> print('Solvent content: {:.1f}%'.format(100 * (1 - protein_fraction)))
-  Solvent content: 61.5%
-
-Gemmi also includes a program that calculates the solvent content:
-:ref:`gemmi-contents <gemmi-contents>`.
-
-FASTA and PIR
--------------
-
-The coordinate files can contain sequences internally.
-Nevertheless, we may need to use a sequence from UniProt or another source.
-Gemmi provides a function to parse two sequence file formats, FASTA and PIR.
-The function takes a string containing the file's content as an argument:
-
-.. doctest::
-
-  >>> with open('P0C805.fasta') as f:
-  ...     fasta_str = f.read()
-  >>> gemmi.read_pir_or_fasta(fasta_str)  #doctest: +ELLIPSIS
-  [<gemmi.FastaSeq object at 0x...>]
-
-The string must start with a header line that begins with `>`.
-In the case of PIR format, which starts with `>P1;` (or F1, DL, DC, RL, RC,
-or XX instead of P1), the next line is also part of the header.
-The sequence file may contain multiple sequences, each preceded by a header.
-Whitespace in a sequence is ignored, except for blank lines,
-which are only allowed between sequences.
-A sequence can contain letters, dashes, and residue names in parentheses.
-The latter is an extension inspired by the format used in mmCIF files,
-in which non-standard residues are given in parentheses, e.g., `MA(MSE)GVN`.
-The sequence may end with `*`.
-
-FastaSeq objects, returned from `read_pir_or_fasta()`,
-contain only two strings:
-
-.. doctest::
-
-  >>> (fasta_seq,) = _
-  >>> fasta_seq.header
-  'sp|P0C805|PSMA3_STAA8 Phenol-soluble modulin alpha 3 peptide OS=Staphylococcus aureus (strain NCTC 8325 / PS 47) OX=93061 GN=psmA3 PE=1 SV=1'
-  >>> fasta_seq.seq
-  'MEFVAKLFKFFKDLLGKFLGNN'
-
 .. _sequence-alignment:
 
 Sequence alignment
@@ -3072,3 +2983,15 @@ rainbow-colored chain:
     :scale: 100
     :target: https://www.rcsb.org/3d-view/5XG2/
 
+
+Multiprocessing
+---------------
+
+(Python-specific)
+
+The example script below traverses subdirectories and asynchronously
+analyzes coordinate files, using 4 worker processes in parallel.
+
+.. literalinclude:: ../examples/multiproc.py
+   :language: python
+   :lines: 4-
diff --git a/docs/program.rst b/docs/program.rst
index 77735613..f647fc72 100644
--- a/docs/program.rst
+++ b/docs/program.rst
@@ -1,5 +1,7 @@
 .. highlight:: console
 
+.. _program:
+
 Gemmi program
 #############
 
diff --git a/include/gemmi/seqtools.hpp b/include/gemmi/seqtools.hpp
index 64f5602e..5ae031de 100644
--- a/include/gemmi/seqtools.hpp
+++ b/include/gemmi/seqtools.hpp
@@ -13,7 +13,7 @@ namespace gemmi {
 constexpr double h2o_weight() { return 2 * 1.00794 + 15.9994; }
 
 inline double calculate_sequence_weight(const std::vector<std::string>& seq,
-                                        double unknown=0.) {
+                                        double unknown=100.) {
   double weight = 0.;
   for (const std::string& item : seq) {
     ResidueInfo res_info = find_tabulated_residue(Entity::first_mon(item));
diff --git a/tests/disulf.cpp b/tests/disulf.cpp
index 899e6dec..34c2e84b 100644
--- a/tests/disulf.cpp
+++ b/tests/disulf.cpp
@@ -8,8 +8,7 @@
 #include <gemmi/neighbor.hpp>
 #include <gemmi/contact.hpp>
 #include <gemmi/model.hpp>
-#include <gemmi/mmread.hpp>
-#include <gemmi/gz.hpp>
+#include <gemmi/mmread_gz.hpp>
 #include <gemmi/dirwalk.hpp>
 #include <stdexcept>  // for runtime_error
 #include <chrono>
@@ -115,7 +114,7 @@ static std::vector<BondInfo> find_disulfide_bonds2(Model& model,
 static void check_disulf(const std::string& path) {
   if (verbose)
     printf("path: %s\n", path.c_str());
-  Structure st = read_structure(MaybeGzipped(path));
+  Structure st = read_structure_gz(path);
   Model& model = st.first_model();
   using Clock = std::chrono::steady_clock;
   auto start = Clock::now();