Try installing stuff to a virtualenv (#162)
sampsyo authored Mar 24, 2024
2 parents 5e10abc + 52bf91c commit 77f44ec
Showing 4 changed files with 90 additions and 84 deletions.
42 changes: 18 additions & 24 deletions .github/workflows/build.yml
@@ -14,43 +14,37 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4

- uses: actions/setup-python@v4
- uses: actions/setup-python@v5
with:
python-version: '3.11'
cache: pip
cache-dependency-path: |
mygfa/pyproject.toml
pollen_data_gen/pyproject.toml
pollen_py/pyproject.toml
slow_odgi/pyproject.toml
- name: Install Flit
run: pip install flit
python-version: '3.12'

- name: Install mygfa
run: cd mygfa ; flit install --symlink
- name: Install pollen_data_gen
run: cd pollen_data_gen ; flit install --symlink
- name: Install pollen_py
run: cd pollen_py ; flit install --symlink
- name: Install slow_odgi
run: cd slow_odgi ; flit install --symlink
# Set up and use uv.
- uses: actions/cache@v4
id: cache-uv
with:
path: ~/.cache/uv
key: ${{ runner.os }}-python-${{ matrix.python-version }}-uv
- name: Create and activate virtualenv with uv
run: |
curl -LsSf https://astral.sh/uv/install.sh | sh
uv venv
echo "VIRTUAL_ENV=.venv" >> $GITHUB_ENV
echo "$PWD/.venv/bin" >> $GITHUB_PATH
- name: Install Python tools
run: uv pip install -r requirements.txt

# Set up for tests.
- name: Install Turnt
run: pip install turnt

run: uv pip install turnt
- name: Problem matcher
run: echo '::add-matcher::.github/tap-matcher.json'

- name: Fetch test data
run: make fetch SMALL=1

- name: Pull odgi container
run: |
docker pull quay.io/biocontainers/odgi:0.8.3--py310h6cc9453_0
docker tag quay.io/biocontainers/odgi:0.8.3--py310h6cc9453_0 odgi
- name: Install odgi alias
run: |
mkdir -p $HOME/.local/bin
83 changes: 46 additions & 37 deletions README.md
@@ -3,35 +3,52 @@
<img src="https://github.com/cucapra/pollen/blob/main/pollen_icon_transparent.png">
</h1>

Pangenome Graph Queries in Calyx
================================
Accelerated Pangenome Graph Queries
===================================

Pollen is a nascent project to accelerate queries on pangenomic graphs.
We are designing a graph-manipulating DSL that exposes functionality that pangenomicists care about.
Our DSL will support graph queries in the vein of the [odgi][] project.
We will compile programs written in this DSL into the [Calyx][] IR and then leverage Calyx to generate hardware accelerators.
We will compile programs written in this DSL into fast query code.
Eventually, we aim to generate custom hardware accelerators for these queries via the [Calyx][] compiler.

There are several things in this repository:

* `mygfa`, a simple Python library for parsing, processing, and emitting [GFA][] files.
* `slow_odgi`, a reference implementation of several GFA queries from the [odgi][] tool using `mygfa`.
* A proof-of-concept Calyx-based hardware accelerator generator for a single GFA query (`odgi depth`) and a data generator for this hardware.
* FlatGFA, an experimental fast binary format for representing and analyzing GFA files.


`mygfa` and `slow_odgi`
-----------------------

The `mygfa` library is an extremely simple Python library for representing (and parsing and emitting) GFA files. It emphasizes clarity over efficiency. Similarly, `slow_odgi` is a set of GFA analyses based on `mygfa`; it's meant to act as a *reference implementation* of the much faster functionality in [odgi][]. Check out [the slow_odgi README](slow_odgi/) for more details.
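
To give a feel for the kind of data a library like `mygfa` handles, here is a minimal sketch of reading the three most common GFA record types (`S` segments, `L` links, `P` paths). It is not `mygfa`'s actual API: the function `parse_gfa` and the tiny example graph are hypothetical, and real GFA files carry more fields (overlaps, tags) than this sketch keeps.

```python
def parse_gfa(text):
    """Parse a tab-separated GFA string into segments, links, and paths.

    Hypothetical helper for illustration only; not part of mygfa.
    """
    segments = {}  # segment name -> nucleotide sequence
    links = []     # (from_name, from_orient, to_name, to_orient)
    paths = {}     # path name -> list of (segment name, orientation)
    for line in text.splitlines():
        fields = line.split("\t")
        kind = fields[0]
        if kind == "S":
            segments[fields[1]] = fields[2]
        elif kind == "L":
            links.append((fields[1], fields[2], fields[3], fields[4]))
        elif kind == "P":
            # Each step looks like "1+" or "2-": a name plus an orientation.
            steps = [(step[:-1], step[-1]) for step in fields[2].split(",")]
            paths[fields[1]] = steps
        # Header (H) and any other record types are ignored in this sketch.
    return segments, links, paths


# A made-up two-segment graph with one link and one path.
EXAMPLE_GFA = (
    "H\tVN:Z:1.0\n"
    "S\t1\tAAT\n"
    "S\t2\tGC\n"
    "L\t1\t+\t2\t+\t0M\n"
    "P\tx\t1+,2+\t*\n"
)

segments, links, paths = parse_gfa(EXAMPLE_GFA)
print(segments)
print(links)
print(paths)
```

Keeping the three record types in plain dictionaries and lists mirrors the "clarity over efficiency" goal stated above; a faster representation would trade that readability away.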

To use them, try using [uv][]:

$ uv venv
$ uv pip install -r requirements.txt
$ source .venv/bin/activate

Now type `slow_odgi --help` to see if everything's working.
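
If `slow_odgi --help` is not found, the usual cause is that the virtualenv was not activated in the current shell. A small diagnostic sketch, assuming only the Python standard library (the helper names `in_virtualenv` and `find_tool` are hypothetical, not part of Pollen):

```python
import shutil
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtualenv."""
    # Inside a venv, sys.prefix points at the venv while sys.base_prefix
    # still points at the interpreter the venv was created from.
    return sys.prefix != sys.base_prefix


def find_tool(name):
    """Return the path of an executable on PATH, or None if it is missing."""
    return shutil.which(name)


print("virtualenv active:", in_virtualenv())
print("slow_odgi found at:", find_tool("slow_odgi"))
```

Run it with the `python` from the environment you expect to be active; if the first line prints `False`, source `.venv/bin/activate` again.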

[uv]: https://github.com/astral-sh/uv

Running using Docker
--------------------
Running Pollen is easy if you use our Docker [package][]:
```
docker run -it --rm ghcr.io/cucapra/pollen:latest
```
If you prefer to install locally, we point you to the somewhat more involved instructions [below](#installing-pollen-locally).


Aside: Slow Odgi
----------------
Proof-of-Concept Hardware Generator
-----------------------------------

`slow_odgi` is a reference implementation of a subset of odgi commands.
It is written purely in Python, with correctness and clarity as goals and speed as a non-goal.
While independent of Pollen proper, it has been an aid to us during the process of designing the DSL and understanding the domain.
See [here](slow_odgi/) for more!
This repository contains a proof-of-concept hardware accelerator generator for a simple GFA query. This section contains some guides for trying out this generator.

### The Docker Image

Getting Started with Pollen
---------------------------
Running the hardware generator is easy if you use our [Docker image][package]:

docker run -it --rm ghcr.io/cucapra/pollen:latest

If you prefer to install locally, we point you to the somewhat more involved instructions [below](#installing-locally).

### Generating an Accelerator: Quick

@@ -48,7 +65,6 @@ exine depth -a -r <filename.og> --tmpdir <path>
```
The node depth accelerator will be saved at `<path>/<filename.futil>` and the input data will be saved at `<path>/<filename.data>`.


### Generating an Accelerator: Full Walkthrough

Take [depth][] as an example. To generate and run a node depth accelerator for the graph `k.og`, first navigate to the root directory of this repository. Then run
@@ -94,20 +110,12 @@ Fifth, we run our hardware accelerator. The following code simulates the Calyx c
exine depth -r depth.data -x depth.futil
```

Testing
-------

Navigate to the root directory of the pollen repository and run `make test`.
Warning: the tests take approximately 2 hours to complete.


Installing Pollen locally
-------------------------
### Installing Locally

You will need [Flit][] version 3.7.1 and [Turnt][] version 1.11.0.
We will guide you through the installation of our major dependencies, [Calyx][] and [odgi][], and then show you how to install Pollen itself.

### Calyx
#### Calyx

Below we show you how to build Calyx from source and set it up for our use.
If you are curious, this tracks the "[installing from source][calyx-install-src]" and "[installing the command-line driver][calyx-install-fud]" sections of the Calyx documentation.
@@ -125,8 +133,7 @@ If you are curious, this tracks the "[installing from source][calyx-install-src]

You will be warned that `synth-verilog` and `vivado-hls` were not installed correctly; this is fine for our purposes.


### Odgi
#### Odgi

We recommend that you build odgi from source, as described [here][odgi-from-source].
To check that this worked, run `odgi` from the command line.
@@ -137,15 +144,17 @@ To verify that this worked, open up a Python shell and try `import odgi`.
If it succeeds quietly, great!
If it segfaults, try the preload step explained [here][odgi-preload].

#### Pollen

### Pollen
Clone this repository:

Clone this repository using
```
git clone https://github.com/cucapra/pollen.git
```
and run `cd pollen_py && flit install -s --user`.
git clone https://github.com/cucapra/pollen.git

And then install the Python tools using [uv][]:

$ uv venv
$ uv pip install -r requirements.txt
$ source .venv/bin/activate

[calyx]: https://calyxir.org
[odgi]: https://odgi.readthedocs.io/en/latest/
5 changes: 5 additions & 0 deletions requirements.txt
@@ -0,0 +1,5 @@
-e ./mygfa
-e ./slow_odgi
-e ./pollen_py
-e ./pollen_data_gen
turnt
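
For readers unfamiliar with the `-e` entries: each one requests an editable install of a local package directory, so edits to the source take effect without reinstalling. A minimal sketch, assuming the five lines above are the whole file, that separates the local editable paths from ordinary package names (the helper `split_requirements` is hypothetical and handles only these two simple forms, not the full requirements-file syntax):

```python
def split_requirements(text):
    """Split requirements text into (editable_paths, package_names)."""
    editable, packages = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        if line.startswith("-e "):
            editable.append(line[3:].strip())
        else:
            packages.append(line)
    return editable, packages


# Contents copied from the requirements.txt added in this commit.
REQUIREMENTS = """\
-e ./mygfa
-e ./slow_odgi
-e ./pollen_py
-e ./pollen_data_gen
turnt
"""

editable, packages = split_requirements(REQUIREMENTS)
print("editable:", editable)
print("packages:", packages)
```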
44 changes: 21 additions & 23 deletions slow_odgi/README.md
@@ -1,50 +1,48 @@
# `slow_odgi`

### Overview

`slow_odgi` is a reference implementation of [`odgi`](https://github.com/pangenome/odgi). It is written purely in Python, with correctness and clarity as goals and speed as a non-goal.
`slow_odgi` is a reference implementation of [odgi][]. It is written purely in Python, with correctness and clarity as goals and speed as a non-goal.
While independent of Pollen proper, it has been an aid to us during the process of designing the DSL and understanding the domain.
Think of it as a code-forward spec for `odgi` commands.

### Installation
[odgi]: https://github.com/pangenome/odgi

## Installation

*These instructions assume that you are in this directory, i.e. `path/to/pollen/slow_odgi/`.*
One easy way to install everything in the Pollen repo is to use [uv][]:

If you don't care for an executable, it is possible to skip installation and just run `PYTHONPATH=../mygfa python3 -m slow_odgi`.
You can use this phrase (adjusting relative paths as necessary) wherever we use the `slow_odgi` executable in the sections that follow.
$ uv venv
$ uv pip install -r requirements.txt
$ source .venv/bin/activate

To install the `slow_odgi` executable:
1. Ensure you have [`setuptools`](https://packaging.python.org/en/latest/tutorials/installing-packages/#ensure-pip-setuptools-and-wheel-are-up-to-date).
2. Run `python3 -m pip install --user -e ../mygfa .`.
[uv]: https://github.com/astral-sh/uv

Alternately,
1. Ensure you have [`flit`](https://flit.pypa.io/en/latest/#install).
1. Change directories to `../mygfa` and run `flit install --user --symlink`.
2. Change directories back to `slow_odgi` (this directory) and run `flit install --user --symlink`.
## Try it!

### Try it!
1. Change to the root directory `pollen/`.
2. Run `make fetch`; this downloads a set of pangenome graphs for us to play with.
3. Try `slow_odgi chop test/note5.gfa -n 3`; this runs `chop` on the graph `note5.gfa` with parameter `3`.
4. Play with the other commands that we support! See below for a full listing.

### Testing
## Testing

To test `slow_odgi`, we treat `odgi` as an oracle and compare our outputs against theirs. We mostly test against a set of pangenome graphs available in the `odgi` repository, and, in a few cases, supplement these with short hand-rolled GFA files of our own.

To run these tests, you will need
1. `odgi`; see [here](https://github.com/pangenome/odgi). Our tests were run against a built-from-source copy of `odgi` (commit 34f006f).
2. `turnt`; see [here](https://github.com/cucapra/turnt).
To run these tests, you will need:

With these in place, run `make test-slow-odgi`. The "oracle" files will be generated first, and this will toss up a large number of warnings which can all be ignored. Then the tests will begin to run, and the `ok`/`not-ok` signals there are actually of interest.
1. [Odgi][]. Our tests were run against a built-from-source copy of odgi (commit `34f006f`).
2. [Turnt][]. This is installed automatically if you use `requirements.txt` as above.

With these in place, run `make test-slow-odgi`. The "oracle" files will be generated first, and this will toss up a large number of warnings which can all be ignored. Then the tests will begin to run, and the `ok`/`not ok` signals there are actually of interest.
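
Because the interesting signal is buried under the oracle-generation warnings, it can help to filter the output down to the `ok`/`not ok` lines. A minimal sketch, assuming the test output follows the TAP convention of one `ok N - name` or `not ok N - name` line per test; the function `summarize_tap` and the sample output are hypothetical, not captured from a real run:

```python
import re

# Matches TAP result lines such as "ok 1 - name" or "not ok 2 - name".
TAP_LINE = re.compile(r"^(not ok|ok)\b\s*(\d+)?\s*-?\s*(.*)$")


def summarize_tap(output):
    """Return (number of passing tests, list of failing test names)."""
    passed = 0
    failed = []
    for line in output.splitlines():
        match = TAP_LINE.match(line.strip())
        if match is None:
            continue  # warnings, the "1..N" plan line, and other noise
        if match.group(1) == "ok":
            passed += 1
        else:
            failed.append(match.group(3))
    return passed, failed


# Made-up output for illustration only.
SAMPLE = """\
1..3
warning: something noisy from oracle generation
ok 1 - test/note5.gfa chop
not ok 2 - test/note5.gfa flip
ok 3 - test/flip4.gfa chop
"""

passed, failed = summarize_tap(SAMPLE)
print(f"{passed} passed, {len(failed)} failed")
for name in failed:
    print("FAILED:", name)
```

In practice you could pipe the output of `make test-slow-odgi` into a script like this, keeping in mind the known `flip` divergences described below.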

There are two known points of divergence versus `odgi`, both having to do with the command `flip`.
The reasons are subtly related, but are documented independently:
1. We disagree against graph note5.gfa; see https://github.com/cucapra/pollen/pull/52#issuecomment-1513958802
2. We disagree against the handmade graph flip4.gfa; see https://github.com/pangenome/odgi/issues/496.

1. We disagree against graph note5.gfa; see [Pollen PR #52](https://github.com/cucapra/pollen/pull/52#issuecomment-1513958802).
2. We disagree against the handmade graph flip4.gfa; see [odgi issue #496](https://github.com/pangenome/odgi/issues/496).

[turnt]: https://github.com/cucapra/turnt

### Explanation of Commands
## Explanation of Commands

The remainder of this document will explain, in some detail, the eleven commands that we have implemented. Below we sometimes elide graph information that is inconsequential to the explanation. Unless specified, this is meant to be read as "don't care" and not as absence.
