Entangled

Entangled is a solution for Literate Programming, a technique in which the programmer writes a human narrative first, then implements the program in code blocks. Literate programming was introduced by Donald Knuth in 1984 and has since seen several surges in popularity. One thing holding back the popularity of literate programming is the lack of maintainability under increasing program complexity. Entangled solves this issue by offering a two-way synchronisation mechanism. You can edit and debug your code as normal in your favourite IDE or text editor. Entangled will make sure that your Markdown files stay up-to-date with your code and vice versa. Because Entangled works with Markdown, you can use it with most static document generators. To summarise, you keep using:

  • your favourite editor: Entangled runs as a daemon in the background, keeping your text files synchronised.
  • your favourite programming language: Entangled is agnostic to programming languages.
  • your favourite document generator: Entangled is configurable to any dialect of Markdown.

We’re trying to increase the visibility of Entangled. If you like Entangled, please consider adding this badge to the appropriate location in your project:

[![Entangled badge](https://img.shields.io/badge/entangled-Use%20the%20source!-%2300aeff)](https://entangled.github.io/)

Get started

To install Entangled, all you need is a Python (version ≥3.11) installation. If you use Poetry and are starting a new project, run:

poetry init 
poetry add entangled-cli

The poetry init command will create a pyproject.toml file and a virtual environment to install Python dependencies in. To activate the virtual environment, run poetry shell inside the project directory.

Or, if you prefer plain old pip,

pip install entangled-cli

Use

Run the entangled watch daemon in the root of your project folder. By default all Markdown files are monitored for fenced code blocks like so:

``` {.rust #hello file="src/world.rs"}
...
```

The syntax of code block properties is the same as CSS properties: #hello gives the block the hello identifier, .rust adds the rust class and the file attribute is set to src/world.rs (quotes are optional). For Entangled to know how to tangle this block, you need to specify a language and a target file. However, now comes the cool stuff. We can split our code into meaningful components by cross-references.
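To make the property syntax concrete, here is a minimal Python sketch that splits such a header into an identifier, classes and attributes. The `Properties` class and `parse_properties` function are hypothetical names used for illustration only; this is not Entangled's own parser, and it only covers the three forms described above.

```python
import shlex
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Properties:
    """Parsed code block header (hypothetical structure, for illustration)."""
    identifier: Optional[str] = None
    classes: list = field(default_factory=list)
    attributes: dict = field(default_factory=dict)


def parse_properties(header: str) -> Properties:
    """Parse a header such as '{.rust #hello file="src/world.rs"}'."""
    props = Properties()
    # Drop the surrounding braces, then split on whitespace while honouring quotes.
    for token in shlex.split(header.strip().strip("{}")):
        if token.startswith("#"):
            props.identifier = token[1:]
        elif token.startswith("."):
            props.classes.append(token[1:])
        elif "=" in token:
            key, value = token.split("=", 1)
            props.attributes[key] = value
    return props


props = parse_properties('{.rust #hello file="src/world.rs"}')
print(props.identifier, props.classes, props.attributes)
```

Because `shlex.split` removes the quotes, `file="src/world.rs"` and `file=src/world.rs` give the same attribute value in this sketch, which matches the remark that quotes are optional.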

Hello World in C++

The combined code blocks in this example compose compilable source code for "Hello World". For didactic reasons we don't always give the listing of an entire source file in one go. Instead, we use a system of references known as noweb (after Ramsey 1994).

Inside source fragments you may encounter a line with <<...>> marks like,

``` {.cpp file=hello_world.cc}
#include <cstdlib>
#include <iostream>

<<example-main-function>>
```

which is then elsewhere specified. Order doesn't matter,

``` {.cpp #hello-world}
std::cout << "Hello, World!" << std::endl;
```

So we can reference the <<hello-world>> code block later on.

``` {.cpp #example-main-function}
int main(int argc, char **argv)
{
    <<hello-world>>
}
```

A definition can be appended with more code as follows (in this case, order does matter!):

``` {.cpp #hello-world}
return EXIT_SUCCESS;
```

These blocks of code can be tangled into source files.
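What tangling does with the blocks above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Entangled's implementation: the `tangle` function and the `blocks` dictionary are hypothetical, references must stand alone on a line, and the comment annotations that Entangled needs for stitching changes back are left out.

```python
import re

# A reference is a line that contains only <<name>>, possibly indented.
REFERENCE = re.compile(r"^(?P<indent>\s*)<<(?P<name>[\w-]+)>>\s*$")


def tangle(source: str, blocks: dict) -> str:
    """Recursively expand <<name>> references, keeping their indentation."""
    out = []
    for line in source.splitlines():
        m = REFERENCE.match(line)
        if m is None:
            out.append(line)
            continue
        # Blocks sharing a name are concatenated in order of definition.
        for part in blocks[m["name"]]:
            for expanded in tangle(part, blocks).splitlines():
                out.append(m["indent"] + expanded if expanded else expanded)
    return "\n".join(out)


# The named blocks from the example; "hello-world" was defined twice.
blocks = {
    "hello-world": [
        'std::cout << "Hello, World!" << std::endl;',
        "return EXIT_SUCCESS;",
    ],
    "example-main-function": [
        "int main(int argc, char **argv)\n{\n    <<hello-world>>\n}",
    ],
}
root = "#include <cstdlib>\n#include <iostream>\n\n<<example-main-function>>"

tangled = tangle(root, blocks)
print(tangled)
```

The order in which differently named blocks are defined has no effect on the result, while the two hello-world fragments are emitted in the order they were written.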

Configuring

Entangled is configured by putting an entangled.toml in the root of your project.

# required: the minimum version of Entangled
version = "2.0"            

# default watch_list is ["**/*.md"]
watch_list = ["docs/**/*.md"]

# default ignore_list is ["**/README.md"]
ignore_list = ["docs/**/examples.md"]
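As a rough illustration of how the two lists interact, the following Python sketch selects files that match a watch pattern but no ignore pattern. It assumes plain `pathlib` glob semantics relative to the project root; whether Entangled resolves patterns in exactly this way is an assumption, and `select_files` is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def select_files(root: Path, watch_list: list, ignore_list: list) -> list:
    """Return files matched by watch_list but not by ignore_list, relative to root."""
    watched = {p for pattern in watch_list for p in root.glob(pattern)}
    ignored = {p for pattern in ignore_list for p in root.glob(pattern)}
    return sorted(p.relative_to(root).as_posix() for p in watched - ignored)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # A throwaway project layout, created only for this demonstration.
    for name in ["README.md", "docs/index.md", "docs/guide/examples.md"]:
        path = root / name
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text("# placeholder\n")
    selected = select_files(root, ["docs/**/*.md"], ["docs/**/examples.md"])
    print(selected)
```

With the configuration shown above, only docs/index.md would be picked up in this layout: README.md is outside the watch pattern and examples.md is ignored.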

You may add languages as follows:

[[languages]]
name = "Java"
identifiers = ["java"]
comment = { open = "//" }

# Some languages have comments that are not terminated by
# newlines, like XML or CSS.
[[languages]]
name = "XML"
identifiers = ["xml", "html", "svg"]
comment = { open = "<!--", close = "-->" }

The identifiers are the tags that you may use in your code block header to identify the language. Using the above config, you should be able to write:

``` {.html file=index.html}
<!DOCTYPE html>
<html lang="en">
    <<header>>
    <<body>>
</html>
```

And so on...

Reading from pyproject.toml

If you have a pyproject.toml file, either because you use poetry to set up Entangled or because you're actually developing a Python project, you may want to put the configuration in pyproject.toml instead. Add a tool.entangled table like so:

[tool.entangled]
version = "2.0"
watch_list = ["docs/**/*.md"]

To add languages in your pyproject.toml, add tool.entangled.languages sections. Be aware that these should be lists, not tables, so you will need to use double brackets, like so:

[[tool.entangled.languages]]
name = "Java"
identifiers = ["java"]
comment = { open = "//" }

Working with Git

When using Entangled in conjunction with Git, there are a few tricks that you may want to know about.

Restoring files when both Markdown and code have changed

If you have edited both Markdown and code without the daemon running, you may need to do some tricks to get back into a consistent state.

git add .
git commit -m 'fixed everything'   # save everything you did
entangled tangle --force           # overwrites some changes you made
git restore src/brilliant_code.c   # retrieve from latest commit
entangled stitch                   # apply changes back to markdown
git add .
git commit --amend                 # amend your commit to perfection

There may be better/faster ways to do this.

Entangled conflicts after merging branches

Entangled can get confused when you merge, and there is a conflict on .entangled/filedb.json. This file keeps track of which files are sources for Entangled and which ones are generated by Entangled. That way, Entangled will never overwrite files it isn't supposed to, and the other way around, when you rename a target, the old one gets removed. It is very hard to merge this file though. When you need to, you can regenerate this file using:

entangled tangle -r

This will perform the tangle as if it is the first time, but it won't actually write files.

Hooks

Entangled has a system of hooks: these add actions to the tangling process:

  • build trigger actions in a generated Makefile
  • brei trigger actions (or tasks) using Brei, which is automatically installed along with Entangled. This is now preferred over the build hook.
  • quarto_attributes add attributes to the code block in Quarto style with #| comments at the top of the code block.
  • shebang takes the first line if it starts with #! and puts it at the top of the file.

build hook

You can enable this hook in entangled.toml:

version = "2.0"
watch_list = ["docs/**/*.md"]
hooks = ["build"]

Then in your Markdown, you may enter code tagged with the .build tag.

``` {.python .build target=docs/fig/plot.svg}
from matplotlib import pyplot as plt
import numpy as np

x = np.linspace(-np.pi, np.pi, 100)
y = np.sin(x)
plt.plot(x, y)
plt.savefig("docs/fig/plot.svg")
```

This code will be saved into a Python script in the .entangled/build directory, or in another location if you specify the file= attribute. In addition, a Makefile is generated in .entangled/build, which can be invoked as follows:

make -f .entangled/build/Makefile

You may configure how code from different languages is evaluated in entangled.toml. For example, to add Gnuplot support, and also make Julia code run through DaemonMode.jl, you may do the following:

[hook.build.runners]
Gnuplot = "gnuplot {script}"
Julia = "julia --project=. --startup-file=no -e 'using DaemonMode; runargs()' {script}"

Once you have the code in place to generate figures and markdown tables, you can use the syntax at your disposal to include those into your Markdown. In this example that would be

![My awesome plot](fig/plot.svg)

In the case of tables or other rich content, standard Markdown (or CommonMark) has no syntax for including other Markdown files, so you'll have to check with your own document generator how to do that. In MkDocs, you could use mkdocs-macros-plugin, Pandoc has pandoc-include, etc.

You can also specify intermediate data generation like so:

``` {.python .build target="data/result.csv"}
import numpy as np
import pandas as pd

result = np.random.normal(0.0, 1.0, (100, 2))
df = pd.DataFrame(result, columns=["x", "y"])
df.to_csv("data/result.csv", index=False)  # skip the row index so it is not read back as a column
```

``` {.python .build target="fig/plot.svg" deps="data/result.csv"}
import pandas as pd

df = pd.read_csv("data/result.csv")
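ax = df.plot()                            # DataFrame.plot returns a matplotlib Axes
ax.get_figure().savefig("fig/plot.svg")   # savefig is a method of the Figure, not the Axes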
```

The snippet for generating the data is given as a dependency for that data; to generate the figure, both result.csv and the code snippet are dependencies.

quarto_attributes hook

Sometimes using the build hook (or the brei hook, see below) leads to long header lines. It is then better to specify attributes in a header section of your code. The Quarto project came up with a syntax in which the header is indicated by a comment with a vertical bar, e.g. #| or //| etc. The quarto_attributes hook reads those attributes and adds them to the properties of the code block.

Example with the brei hook:

``` {.python .task}
#| description: Draw a triangle
#| creates: docs/fig/triangle.svg
#| collect: figures
from matplotlib import pyplot as plt
plt.plot([-1, 1, 0, -1], [-0.5, -0.5, 1, -0.5])  # x and y coordinates of the closed outline
plt.savefig("docs/fig/triangle.svg")
```

![](fig/triangle.svg)

Using these attributes it is possible to write in Entangled using completely standard Markdown syntax. The following configuration disables the curly braces altogether, though currently the Quarto tags encoding the metadata will end up in the tangled code.

#| file: entangled.toml
version="2.0"
watch_list=["*.typ"]
hooks=["quarto_attributes"]

[markers]
open="^(?P<indent>\\s*)```(?P<properties>.*)$"
close="^(?P<indent>\\s*)```\\s*$"

Then you can write code like so:

```python
#| id: hello
print("Hello, World!")
```

```python
#| file: test.py
if __name__ == "__main__":
    <<hello>>
```

The id attribute is reserved for the code's identifier (normally indicated with #) and the classes attribute can be used to indicate a list of classes in addition to the language class already given.
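The idea of reading attributes from a comment header can be sketched as follows. This is a deliberately simplified illustration: `split_quarto_header` is a hypothetical function, it only handles flat `key: value` lines, and it separates the header from the code, which (as noted above) Entangled currently does not do when tangling.

```python
def split_quarto_header(source: str, comment: str = "#"):
    """Split leading '<comment>| key: value' lines from the rest of the code."""
    prefix = comment + "|"
    attributes = {}
    lines = source.splitlines()
    count = 0
    for line in lines:
        if not line.startswith(prefix):
            break  # the header ends at the first line without the marker
        key, _, value = line[len(prefix):].partition(":")
        attributes[key.strip()] = value.strip()
        count += 1
    return attributes, "\n".join(lines[count:])


SOURCE = """#| description: Draw a triangle
#| creates: docs/fig/triangle.svg
#| collect: figures
from matplotlib import pyplot as plt
"""

attributes, body = split_quarto_header(SOURCE)
print(attributes)
print(body)
```

For a language with a different comment marker, the same function can be called with, for example, `comment="//"`.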

Brei

Entangled has a small build engine (similar to GNU Make) embedded, called Brei. You may give it a list of tasks (specified in TOML) that may depend on one another. Brei will run these when dependencies are newer than the target. Execution is lazy and in parallel. Brei supports:

  • Running tasks by passing a script to any configured interpreter, e.g. Bash, Python, Lua etc.
  • Redirecting stdout or stdin to or from files.
  • Defining so called "phony" targets.
  • Defining templates for programmable reuse.
  • Including other Brei files, even ones that need to be generated by another task.
  • Variable substitution, including writing stdout to variables.

Brei is available as a separate package, see the Brei documentation.
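The rule that a task runs when its dependencies are newer than its target is the same one GNU Make uses, and can be sketched with file modification times. The `needs_rebuild` function below is a hypothetical illustration of that rule, not Brei's actual code, and the timestamps are set by hand so the outcome does not depend on the clock.

```python
import os
import tempfile
from pathlib import Path


def needs_rebuild(target: Path, dependencies: list) -> bool:
    """True if the target is missing or older than any of its dependencies."""
    if not target.exists():
        return True
    target_time = target.stat().st_mtime
    return any(dep.stat().st_mtime > target_time for dep in dependencies)


with tempfile.TemporaryDirectory() as tmp:
    dep = Path(tmp) / "data.csv"
    target = Path(tmp) / "plot.svg"
    dep.write_text("x,y\n")

    missing = needs_rebuild(target, [dep])   # the target does not exist yet

    target.write_text("<svg/>")
    os.utime(dep, (1_000, 1_000))            # dependency older than the target
    os.utime(target, (2_000, 2_000))
    fresh = needs_rebuild(target, [dep])

    os.utime(dep, (3_000, 3_000))            # dependency modified afterwards
    stale = needs_rebuild(target, [dep])

print(missing, fresh, stale)
```

Only the first and last situations call for running the task again; an up-to-date target is skipped.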

Examples

To write out "Hello, World!" to a file msg.txt, we may do the following. First, a task writes the ROT13-encoded message to secret.txt:

[[task]]
stdout = "secret.txt"
language = "Python"
script = """
print("Uryyb, Jbeyq!")
"""

To have this message decoded define a pattern,

[pattern.rot13]
stdout = "{stdout}"
stdin = "{stdin}"
language = "Bash"
script = """
tr a-zA-Z n-za-mN-ZA-M
"""

[[call]]
pattern = "rot13"
  [call.args]
  stdin = "secret.txt"
  stdout = "msg.txt"
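To check what should end up in msg.txt, the same ROT13 substitution that the `tr` command performs can be reproduced with Python's standard `codecs` module. This only verifies the encoding used in the example; it does not run Brei.

```python
import codecs

secret = "Uryyb, Jbeyq!"                  # what the first task prints
message = codecs.decode(secret, "rot13")  # same mapping as tr a-zA-Z n-za-mN-ZA-M

print(message)
```

ROT13 is its own inverse, so applying the template a second time would turn the message back into the secret.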

To define a phony target "all",

[[task]]
name = "all"
requires = ["msg.txt"]

The brei hook

The following example uses both the brei and quarto_attributes hooks. To add a Brei task, tag a code block with the .task class.

First we generate some data.

``` {.python #some-functions}
# define some functions
```

Now we show what that data would look like:

``` {.python .task}
#| description: Generate data
#| creates: data/data.npy

<<some-functions>>

# generate and save data
```

Then we plot in another task.

``` {.python .task}
#| description: Plot data
#| creates: docs/fig/plot.svg
#| requires: data/data.npy
#| collect: figures

# load data and plot
```

The collect attribute tells the Brei hook to add the docs/fig/plot.svg target to the figures collection. To render all figures, put the following in entangled.toml:

version = "2.0"
watch_list = ["docs/**/*.md"]
hooks = ["quarto_attributes", "brei"]

[brei]
include = [".entangled/tasks.json"]

And run

entangled brei figures

You can use ${variable} syntax inside Brei tasks just as you would in a stand-alone Brei script.

Support for Document Generators

Entangled has been used successfully with the following document generators. Note that some of these examples were built using older versions of Entangled, but they should work just the same.

Pandoc

Pandoc is a very versatile tool for converting documents in any format. It specifically has very wide support for different forms of Markdown syntax out in the wild, including a filter system that lets you extend the workings of Pandoc. Those filters can be written in any language through an API, for instance Python filters can be written using panflute, but there is also native support for Lua.

To work with Entangled style literate documents, there is a set of Pandoc filters available. The major downside of Pandoc is that it offers no help in making your output HTML look beautiful. One option is to use the Bootstrap template, but you may want to try out others as well, or design your own. These days a lot can be done with a single well designed CSS file.

  • ➕ dynamic
  • ➕ supports most Markdown syntax out of the box
  • ➕ excellent for science: citation, numbered figures, tables and equations
  • ➕ support for LaTeX
  • ➖ harder to set up
  • ➖ takes work to make look good

Example: Hello World in C++

MkDocs

MkDocs is specifically tailored towards converting Markdown into good looking, easy to navigate HTML, especially when used in combination with the mkdocs-material theme. To use Entangled style code blocks with MkDocs, you'll need to install the mkdocs-entangled-plugin as well.

  • ➕ specifically designed for Markdown to HTML, i.e. software documentation
  • ➕ pretty output, out of the box
  • ➕ easy to install
  • ➖ not intended for scientific use: numbering and referencing equations, figures and tables is hard to set up
  • ➖ documentation is on par with most Python projects: Ok for most things, but really hard if you want specifics

Example: TBD

Typst

Typst has a syntax that is similar to Markdown when it comes to code blocks. Set the code block markers in entangled.toml like so:

version="2.0"
watch_list=["*.typ"]
hooks=["quarto_attributes"]

[markers]
open="^(?P<indent>\\s*)```(?P<properties>.*)$"
close="^(?P<indent>\\s*)```\\s*$"
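The two marker patterns are ordinary regular expressions with named groups, and they can be tried out with Python's `re` module. The sketch below assumes the TOML strings are unescaped (so `\\s` becomes `\s`) and matched line by line; how Entangled applies them internally is not shown here.

```python
import re

FENCE = "`" * 3  # three backticks, built indirectly to keep this listing readable

# The open and close markers from the configuration, after TOML unescaping.
OPEN = re.compile(r"^(?P<indent>\s*)" + FENCE + r"(?P<properties>.*)$")
CLOSE = re.compile(r"^(?P<indent>\s*)" + FENCE + r"\s*$")

opening = OPEN.match("  " + FENCE + "python")
closing = CLOSE.match("  " + FENCE)

print(repr(opening["indent"]), repr(opening["properties"]))
print(closing is not None)
```

Note that the opening pattern also matches a bare fence with empty properties, so anything using these patterns has to keep track of whether it is currently inside a code block.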

Documenter.jl

Documenter.jl is the standard tool for writing Julia documentation. It has internal support for evaluating code block contents.

Example: Intro to code generation in Julia

PDoc

PDoc is a tool for documenting smaller Python projects. It grabs all documentation from the doc-strings in your Python library and generates a page from that. To have it include its own literate source, I had to use some very ugly hacks.

Example: check-deps, a Universal dependency checker in Python

Docsify

Docsify serves the Markdown files and does the conversion to HTML in a JavaScript library (in the browser).

Example: Guide to C++ on the web through WASM

History

This is a rewrite of Entangled in Python. Older versions were written in Haskell. The rewrite in Python was motivated by ease of installation, larger community and quite frankly, a fit of mental derangement.

Contributing

If you have an idea for improving Entangled, please file an issue before creating a pull request. Code in this repository is formatted using black and type checked using mypy.

License

Copyright 2023 Netherlands eScience Center, written by Johan Hidding, licensed under the Apache 2 license, see LICENSE.