Commit 4aea522

Implement a persist marker. (#28)
1 parent b5aa10b commit 4aea522

11 files changed, +318 -22 lines changed

.pre-commit-config.yaml

+1 -1
@@ -29,7 +29,7 @@ repos:
       - id: blacken-docs
         additional_dependencies: [black]
   - repo: https://gitlab.com/pycqa/flake8
-    rev: 3.8.3
+    rev: 3.8.4
     hooks:
       - id: flake8
         additional_dependencies: [

docs/changes.rst

+1
@@ -13,6 +13,7 @@ all releases are available on `Anaconda.org <https://anaconda.org/pytask/pytask>
 - :gh:`26` makes commands return the correct exit codes.
 - :gh:`27` implements the ``pytask_collect_task_teardown`` hook specification to perform
   checks after a task is collected.
+- :gh:`28` implements the ``@pytask.mark.persist`` decorator.


 0.0.6 - 2020-09-12

docs/explanations/why_another_build_system.rst

+18
@@ -66,6 +66,24 @@ Cons
 - Bus factor of 1.


+`Luigi <https://github.com/spotify/luigi>`_
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A build system written by Spotify.
+
+Derivatives:
+
+- `sciluigi <https://github.com/pharmbio/sciluigi>`_
+
+
+`scipipe <https://github.com/scipipe/scipipe>`_
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Cons
+
+- written in Go.
+
+
 `Scons <https://github.com/SCons/scons>`_
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

@@ -0,0 +1,90 @@
+How to make tasks persist
+=========================
+
+You are able to create persisting tasks with pytask. It means that if all dependencies
+and products exist, the task will not be executed even though a dependency, the task's
+source file or a product has changed. Instead, the state of the dependencies, the source
+file and the products is updated in the database such that the next execution will skip
+the task successfully.
+
+When is this useful?
+--------------------
+
+1. You ran a formatter like Black against your project and there are some expensive
+   tasks which should not be executed.
+
+2. You want to integrate a task which you have already run elsewhere. Place the
+   dependencies and products and the task definition in the correct place and make the
+   task persist.
+
+
+.. caution::
+
+    This feature can corrupt the integrity of your project. Ideally, document why you
+    have applied the decorator as a courtesy to yourself and your contributors.
+
+
+How to do it?
+-------------
+
+To create a persisting task, apply the correct decorator and, et voilà, it is done.
+
+Let us take the second scenario as an example. First, we define the task, the
+dependency and the product and save everything in the same folder.
+
+.. code-block:: python
+
+    # Content of task_file.py
+
+    import pytask
+
+
+    @pytask.mark.persist
+    @pytask.mark.depends_on("input.md")
+    @pytask.mark.produces("output.md")
+    def task_make_input_bold(depends_on, produces):
+        produces.write_text("**" + depends_on.read_text() + "**")
+
+
+.. code-block::
+
+    # Content of input.md. Do not copy this line.
+
+    Here is the text.
+
+
+.. code-block::
+
+    # Content of output.md. Do not copy this line.
+
+    **Here is the text.**
+
+
+If you run pytask in this folder, you get the following output.
+
+.. code-block:: console
+
+    $ pytask demo
+    ========================= Start pytask session =========================
+    Platform: win32 -- Python 3.8.5, pytask 0.0.6, pluggy 0.13.1
+    Root: xxx/demo
+    Collected 1 task(s).
+
+    p
+    ====================== 1 succeeded in 1 second(s) ======================
+
+The green p signals that the task persisted. Another execution will show the following.
+
+.. code-block:: console
+
+    $ pytask demo
+    ========================= Start pytask session =========================
+    Platform: win32 -- Python 3.8.5, pytask 0.0.6, pluggy 0.13.1
+    Root: xxx/demo
+    Collected 1 task(s).
+
+    s
+    ====================== 1 succeeded in 1 second(s) ======================
+
+Now, the task is skipped successfully because nothing has changed compared to the
+previous run.
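
The three outcomes described in this how-to (normal execution, persisted, skipped) can be illustrated with a small simulation. This is only a sketch under stated assumptions, not pytask's implementation: `run_task` is a hypothetical helper, the dictionary `db` stands in for pytask's database, and file state is approximated by the modification time.

```python
import os
import tempfile
from pathlib import Path


def run_task(nodes, product, db, persist):
    """Simulate one run and return "executed", "persisted" or "skipped".

    ``nodes`` are all files the task touches, ``db`` maps each path to the
    modification time recorded by the previous run (hypothetical stand-in
    for the real database).
    """
    if all(path.exists() for path in nodes):
        current = {str(path): path.stat().st_mtime for path in nodes}
        if current == db:
            # Nothing changed since the last run.
            return "skipped"
        if persist:
            # Something changed, but only the recorded state is refreshed.
            db.clear()
            db.update(current)
            return "persisted"

    # A node is missing, or the task is not persisting: run the task body.
    product.write_text("**Here is the text.**")
    db.clear()
    db.update({str(path): path.stat().st_mtime for path in nodes})
    return "executed"


with tempfile.TemporaryDirectory() as tmp:
    dependency = Path(tmp) / "input.md"
    product = Path(tmp) / "output.md"
    dependency.write_text("Here is the text.")
    nodes = [dependency, product]
    db = {}

    # First run: the product is missing, so the task executes normally.
    first = run_task(nodes, product, db, persist=True)

    # Simulate an external change to the product by shifting its timestamp.
    stat = product.stat()
    os.utime(product, (stat.st_atime, stat.st_mtime + 10))

    second = run_task(nodes, product, db, persist=True)
    third = run_task(nodes, product, db, persist=True)

outcomes = [first, second, third]
```

With these assumptions, the second run corresponds to the green ``p`` and the third run to the ``s`` in the console output above.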

docs/tutorials/index.rst

+1
@@ -16,3 +16,4 @@ organize and start your own project.
    how_to_configure_pytask
    how_to_select_tasks
    how_to_clean
+   how_to_make_tasks_persist

src/_pytask/cli.py

+2
@@ -46,6 +46,7 @@ def pytask_add_hooks(pm):
     from _pytask import logging
     from _pytask import mark
     from _pytask import parametrize
+    from _pytask import persist
     from _pytask import resolve_dependencies
     from _pytask import skipping

@@ -59,6 +60,7 @@ def pytask_add_hooks(pm):
     pm.register(logging)
     pm.register(mark)
     pm.register(parametrize)
+    pm.register(persist)
     pm.register(resolve_dependencies)
     pm.register(skipping)

src/_pytask/outcomes.py

+4
@@ -16,3 +16,7 @@ class SkippedDependencyNotFound(PytaskOutcome):

 class SkippedUnchanged(PytaskOutcome):
     """Outcome if task has run before and is unchanged."""
+
+
+class Persisted(PytaskOutcome):
+    """Outcome if task should persist."""
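
The new outcome is raised like an exception (`raise Persisted` in `src/_pytask/persist.py`), so it ends a task's setup early and is later turned into a success. The following is a minimal, self-contained sketch of that control flow. `Report`, `run_setup` and `process_report` are hypothetical stand-ins rather than pytask's own classes, and `PytaskOutcome` is assumed to derive from `Exception`.

```python
import sys


class PytaskOutcome(Exception):
    """Stand-in for the base class of all outcomes (assumed to be an exception)."""


class Persisted(PytaskOutcome):
    """Outcome if task should persist."""


class Report:
    """Hypothetical report which stores exception info and a success flag."""

    def __init__(self, exc_info=None):
        self.exc_info = exc_info
        self.success = exc_info is None


def run_setup(setup):
    """Call ``setup`` and capture any raised exception in a report."""
    try:
        setup()
    except Exception:
        return Report(sys.exc_info())
    return Report()


def process_report(report):
    """Treat a ``Persisted`` outcome as a success instead of a failure."""
    if report.exc_info and isinstance(report.exc_info[1], Persisted):
        report.success = True


def persisting_setup():
    raise Persisted


def failing_setup():
    raise ValueError("a genuine error")


persisted_report = run_setup(persisting_setup)
process_report(persisted_report)

failed_report = run_setup(failing_setup)
process_report(failed_report)
```

Only the report carrying `Persisted` is flipped to success; the report carrying the `ValueError` stays a failure.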

src/_pytask/persist.py

+65
@@ -0,0 +1,65 @@
+"""Implement the ability for tasks to persist."""
+import click
+from _pytask.config import hookimpl
+from _pytask.dag import node_and_neighbors
+from _pytask.enums import ColorCode
+from _pytask.exceptions import NodeNotFoundError
+from _pytask.mark import get_specific_markers_from_task
+from _pytask.outcomes import Persisted
+
+
+@hookimpl
+def pytask_parse_config(config):
+    """Add the marker to the configuration."""
+    config["markers"]["persist"] = (
+        "Prevent execution of a task if all products exist and even if something has "
+        "changed (dependencies, source file, products). This decorator might be useful "
+        "for expensive tasks where only the formatting of the file has changed. The "
+        "state of the files which have changed will also be remembered and another run "
+        "will skip the task with success."
+    )
+
+
+@hookimpl
+def pytask_execute_task_setup(session, task):
+    """Exit persisting tasks early.
+
+    The decorator needs to be set and all nodes need to exist.
+
+    """
+    if get_specific_markers_from_task(task, "persist"):
+        try:
+            for name in node_and_neighbors(session.dag, task.name):
+                node = (
+                    session.dag.nodes[name].get("task")
+                    or session.dag.nodes[name]["node"]
+                )
+                node.state()
+        except NodeNotFoundError:
+            all_nodes_exist = False
+        else:
+            all_nodes_exist = True
+
+        if all_nodes_exist:
+            raise Persisted
+
+
+@hookimpl
+def pytask_execute_task_process_report(report):
+    """Set task status to success.
+
+    Do not return ``True`` so that states will be updated in database.
+
+    """
+    if report.exc_info and isinstance(report.exc_info[1], Persisted):
+        report.success = True
+
+
+@hookimpl
+def pytask_execute_task_log_end(report):
+    """Log a persisting task with a green p."""
+    if report.success:
+        if report.exc_info:
+            if isinstance(report.exc_info[1], Persisted):
+                click.secho("p", fg=ColorCode.SUCCESS.value, nl=False)
+                return True

tests/test_persist.py

+125
@@ -0,0 +1,125 @@
+import textwrap
+
+import pytest
+from _pytask.database import create_database
+from _pytask.database import State
+from _pytask.outcomes import Persisted
+from _pytask.outcomes import SkippedUnchanged
+from _pytask.persist import pytask_execute_task_process_report
+from pony import orm
+from pytask import main
+
+
+class DummyClass:
+    pass
+
+
+@pytest.mark.end_to_end
+def test_persist_marker_is_set(tmp_path):
+    session = main({"paths": tmp_path})
+    assert "persist" in session.config["markers"]
+
+
@pytest.mark.end_to_end
24+
def test_multiple_runs_with_persist(tmp_path):
25+
"""Perform multiple consecutive runs and check intermediate outcomes with persist.
26+
27+
1. The product is missing which should result in a normal execution of the task.
28+
2. Change the product, check that run is successful and state in database has
29+
changed.
30+
3. Run the task another time. Now, the task is skipped successfully.
31+
32+
"""
33+
source = """
34+
import pytask
35+
36+
@pytask.mark.persist
37+
@pytask.mark.depends_on("in.txt")
38+
@pytask.mark.produces("out.txt")
39+
def task_dummy(depends_on, produces):
40+
produces.write_text(depends_on.read_text())
41+
"""
42+
tmp_path.joinpath("task_dummy.py").write_text(textwrap.dedent(source))
43+
tmp_path.joinpath("in.txt").write_text("I'm not the reason you care.")
44+
45+
session = main({"paths": tmp_path})
46+
47+
assert session.exit_code == 0
48+
assert len(session.execution_reports) == 1
49+
assert session.execution_reports[0].success
50+
assert session.execution_reports[0].exc_info is None
51+
assert tmp_path.joinpath("out.txt").exists()
52+
53+
tmp_path.joinpath("out.txt").write_text("Never again in despair.")
54+
55+
session = main({"paths": tmp_path})
56+
57+
assert session.exit_code == 0
58+
assert len(session.execution_reports) == 1
59+
assert session.execution_reports[0].success
60+
assert isinstance(session.execution_reports[0].exc_info[1], Persisted)
61+
62+
orm.db_session.__enter__()
63+
64+
create_database(
65+
"sqlite", tmp_path.joinpath(".pytask.sqlite3").as_posix(), True, False
66+
)
67+
task_id = tmp_path.joinpath("task_dummy.py").as_posix() + "::task_dummy"
68+
node_id = tmp_path.joinpath("out.txt").as_posix()
69+
70+
state = State[task_id, node_id].state
71+
assert float(state) == tmp_path.joinpath("out.txt").stat().st_mtime
72+
73+
orm.db_session.__exit__()
74+
75+
session = main({"paths": tmp_path})
76+
77+
assert session.exit_code == 0
78+
assert len(session.execution_reports) == 1
79+
assert session.execution_reports[0].success
80+
assert isinstance(session.execution_reports[0].exc_info[1], SkippedUnchanged)
81+
82+
83+
@pytest.mark.end_to_end
84+
def test_migrating_a_whole_task_with_persist(tmp_path):
85+
source = """
86+
import pytask
87+
88+
@pytask.mark.persist
89+
@pytask.mark.depends_on("in.txt")
90+
@pytask.mark.produces("out.txt")
91+
def task_dummy(depends_on, produces):
92+
produces.write_text(depends_on.read_text())
93+
"""
94+
tmp_path.joinpath("task_dummy.py").write_text(textwrap.dedent(source))
95+
for name in ["in.txt", "out.txt"]:
96+
tmp_path.joinpath(name).write_text(
97+
"They say oh my god I see the way you shine."
98+
)
99+
100+
session = main({"paths": tmp_path})
101+
102+
assert session.exit_code == 0
103+
assert len(session.execution_reports) == 1
104+
assert session.execution_reports[0].success
105+
assert isinstance(session.execution_reports[0].exc_info[1], Persisted)
106+
107+
108+
@pytest.mark.unit
109+
@pytest.mark.parametrize(
110+
"exc_info, expected",
111+
[
112+
(None, None),
113+
((None, None, None), None),
114+
((None, Persisted(), None), True),
115+
],
116+
)
117+
def test_pytask_execute_task_process_report(exc_info, expected):
118+
report = DummyClass()
119+
report.exc_info = exc_info
120+
result = pytask_execute_task_process_report(report)
121+
122+
if expected:
123+
assert report.success
124+
else:
125+
assert result is None
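
The assertion `float(state) == tmp_path.joinpath("out.txt").stat().st_mtime` in `test_multiple_runs_with_persist` suggests that a file's recorded state is its modification time. A small sketch of that comparison follows, assuming the state is stored as the string form of the modification time; `file_state` and `is_unchanged` are hypothetical helpers and not part of pytask.

```python
import os
import tempfile
from pathlib import Path


def file_state(path):
    """Return the modification time as a string (assumed storage format)."""
    return str(Path(path).stat().st_mtime)


def is_unchanged(path, recorded_state):
    """Compare the recorded state with the file's current modification time."""
    return float(recorded_state) == Path(path).stat().st_mtime


with tempfile.TemporaryDirectory() as tmp:
    product = Path(tmp) / "out.txt"
    product.write_text("Never again in despair.")

    recorded = file_state(product)
    unchanged_before = is_unchanged(product, recorded)

    # Simulate a later modification by moving the timestamp forward.
    stat = product.stat()
    os.utime(product, (stat.st_atime, stat.st_mtime + 10))
    unchanged_after = is_unchanged(product, recorded)
```

Here the recorded state matches until the timestamp moves, which is the situation the second run of the test provokes by rewriting `out.txt`.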
