Getting Started with Dynamatic

Overview

Welcome to the Dynamatic project!

This project is the natural successor of Dynamatic (often referred to as legacy Dynamatic throughout this repository), an academic, open-source high-level synthesis compiler based on LLVM IR that produces synthesizable synchronous dynamically-scheduled circuits (elastic circuits) from C/C++ code. Dynamatic enabled numerous scientific publications in top conferences over the years and is actively being used by many researchers in various research groups. Dynamatic aims to achieve the same goals (and more!) but uses the more recent and powerful MLIR ecosystem instead of LLVM IR. Furthermore, this project puts a lot more emphasis on code quality, reliability, and interoperability, while remaining as user-friendly as possible.

The project welcomes contributions from the open-source community. In addition, given its academic nature, Dynamatic also aims to include technical contributions from students as part of academic projects. As students may be at various levels of study and have varying degrees of expertise with compiler technology, Dynamatic provides support to them in the form of high-level design documentation (this document and others in docs), tutorials, and the possibility of committing experimental work to the repository.

Building the project

We use the CMake build system to build and test Dynamatic. You can find instructions on how to build the project, including its software dependencies, in the README.md file at the top level of the repository. You can also check out our advanced build instructions to personalize the build to your needs.

Software architecture

This section provides an overview of the software architecture of the project and is meant as an entry-point for users who would like to start digging into the codebase. It describes the project's directory structure, our software dependencies (i.e., git submodules), and our testing infrastructure.

Directory structure

This section gives an overview of the project's directory structure and of what each directory contains, to help new users find specific parts of the implementation. Note that the superproject is structured very similarly to LLVM/MLIR, so this overview is useful for navigating that repository as well. For exploring/editing the codebase, we strongly encourage the use of an IDE with a go-to-reference/implementation feature (e.g., VSCode) to easily navigate between header/source files. Below is a visual representation of a subset of the project's directory structure, with basic information on what each directory contains.

├── bin # Symbolic links to commonly used binaries after build (untracked)
├── build # Files generated during build (untracked)
│   └── bin # Binaries generated by the superproject
│       └── include
│           └── dynamatic # Compiled TableGen headers (*.h.inc)
├── docs # Documentation and tutorials, where this file lies
├── experimental # Experimental passes and tools
├── include
│   └── dynamatic # All header files (*.h)
├── integration-test # Integration tests
├── lib # Implementation of compiler passes (*.cpp)
│   ├── Conversion # Implementation of conversion passes (*.cpp)
│   └── Transforms # Implementation of transform passes (*.cpp)
├── polygeist # Polygeist repository (submodule)
│   └── llvm-project # LLVM/MLIR repository (submodule)
├── test # Unit tests
├── tools # Implementation of executables generated during build
│   └── dynamatic-opt # Dynamatic optimizer
├── tutorials # Dynamatic tutorials
├── visual-dataflow # Interactive dataflow visualizer (depends on Godot)
├── build.sh # Build script to build the entire project
└── CMakeLists.txt # Top level CMake file for building the superproject

Dependencies

Dynamatic uses git submodules to manage its software dependencies (all hosted on GitHub). We depend on Polygeist, a C/C++ frontend for MLIR which itself depends on LLVM/MLIR through a git submodule. The project is set up so that you can include LLVM/MLIR headers directly from Dynamatic code without having to specify their path through Polygeist. We also depend on godot-cpp, the official C++ bindings for the Godot game engine which we use as the frontend to our interactive dataflow circuit visualizer. Finally, we inherit two MLIR dialects (Handshake and HW) from the CIRCT project (details below).

Polygeist

Polygeist is a C/C++ frontend for MLIR that features polyhedral and parallel optimizations. It is thus responsible for the first step of our compilation process, that is, bringing source code written in C/C++ into the MLIR ecosystem. In particular, we care that our entry point to MLIR is at a very high semantic level, namely, at a level where polyhedral analysis is possible. The latter allows us to identify dependencies between memory accesses in source programs very accurately, which is key to optimizing the allocation of memory interfaces and resources in our elastic circuits down the line. Polygeist is able to emit MLIR code in the Affine dialect, which is perfectly suited for this kind of analysis.

CIRCT

CIRCT is "an (experimental!) effort looking to apply MLIR and the LLVM development methodology to the domain of hardware design tools". It is a very actively developed project from which we only use a small portion of the codebase (though we inherit a lot from their robust development methodologies and code infrastructure), namely, the Handshake dialect with its associated compiler passes and the HW dialect. Handshake allows us to model our dataflow circuits (achieving the same goal as what .dot files written in the DOT language achieved in legacy Dynamatic) within MLIR, letting us leverage the full power of the compiler infrastructure to optimize dataflow circuits before emitting them to synthesizable RTL designs. HW serves as an intermediate lowering step between Handshake and RTL, essentially allowing us to model a netlist in MLIR.

Working with submodules

Having a project with submodules means that you have to pay attention to a couple of additional things when pulling/pushing code to the project to keep it in sync with the submodules. If you are unfamiliar with submodules, you can learn more about how to work with them here. Below is a very short and incomplete description of how our submodules are managed by our repository as well as a few pointers on how to perform simple git-related tasks in this context.

Alongside the history of Dynamatic's directory structure and file contents, the repository (in this context, called the superproject) stores, for each submodule, the hash of a specific commit in that submodule's repository. This hash identifies the version of the subproject that the superproject currently depends on. These commit hashes are added and committed the same way as any other modification to the repository, and can thus evolve as development moves forward, allowing us to use more recent versions of our submodules as they are pushed to their respective repositories. Here are a few concrete things you need to keep in mind while using the repository that may differ from your usual submodule-free workflow.

  • Clone the repository with git clone --recurse-submodules git@github.com:EPFL-LAP/dynamatic.git to instruct git to also pull and check out the version of the submodules referenced in the latest commit of Dynamatic's main branch.

  • When pulling the latest commit(s), use git pull --recurse-submodules from the top level repository to also update the commit checked out in each submodule in case the superproject changed the subproject commits it is tracking.

  • To commit changes made to files within Polygeist from the superproject (which is possible thanks to the fact that we use a fork of Polygeist), you first need to commit these changes to the Polygeist fork, and then update the Polygeist commit tracked by the superproject. More precisely,

    1. cd to the polygeist subdirectory,
    2. git add your changes and git commit them to the Polygeist fork,
    3. cd back to the top level directory,
    4. git add polygeist to tell the superproject to track your new Polygeist commit and git commit to Dynamatic.

    If you want to push these changes to remote, note that you will need to git push twice, once from the polygeist subdirectory (the Polygeist commit) and once from the top level directory (the Dynamatic commit).
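
The Polygeist workflow above can be sketched as a small set of shell helpers. This is a minimal sketch and not part of Dynamatic: the function names and the run/DRY_RUN dry-run wrapper are hypothetical, and git add -A (stage everything in the submodule) is an assumption about which changes you want to commit. The helpers are meant to be sourced and called from the top level of the repository.

```shell
#!/bin/sh
# Hypothetical helper: prints the command when DRY_RUN=1, executes it otherwise.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Show the commit hash currently checked out for each submodule
# (including nested ones such as polygeist/llvm-project).
show_tracked_commits() {
  run git submodule status --recursive
}

# Pull the superproject and update the submodule checkouts it tracks.
update_checkout() {
  run git pull --recurse-submodules
}

# Commit changes made inside polygeist, then record the new Polygeist
# commit in the superproject (two commits in total).
commit_polygeist_change() {
  msg=$1
  run git -C polygeist add -A &&                 # assumption: stage all changes
    run git -C polygeist commit -m "$msg" &&     # commit to the Polygeist fork
    run git add polygeist &&                     # track the new Polygeist commit
    run git commit -m "Update Polygeist: $msg"   # commit to Dynamatic
}

# Push both commits: the submodule first, then the superproject.
push_polygeist_change() {
  run git -C polygeist push &&
    run git push
}
```

For example, DRY_RUN=1 commit_polygeist_change "Fix typo" prints the four git commands without running them. Pushing the submodule first means the remote superproject never references a Polygeist commit that is missing from the remote fork.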

Testing

Dynamatic features unit tests that evaluate the behavior of a small part of the implementation (typically, one compiler pass) against an expected output. All files within the test directory with the .mlir extension are automatically treated as unit test files. They can be run/checked all at once by running ninja check-dynamatic from a terminal within the top level build directory. We use the FileCheck LLVM utility to compare the actual output of the implementation with the expected one. docs/Testing.md describes the structure of FileCheck unit test files and explains how to create your own unit tests.

Dynamatic also contains integration tests that assess the whole flow by going from C to VHDL. Each folder containing C source code inside the integration-test directory is a separate integration test.
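
As a rough illustration of these two conventions, the sketch below enumerates both kinds of tests in a checkout. The function names are hypothetical, and the only assumptions are the ones stated above: unit tests are .mlir files under test, and every folder under integration-test that contains C source (assumed here to mean *.c files) is one integration test.

```shell
#!/bin/sh
# list_unit_tests ROOT: print every .mlir file under ROOT/test.
list_unit_tests() {
  find "$1/test" -type f -name '*.mlir' | sort
}

# list_integration_tests ROOT: print each folder under ROOT/integration-test
# that contains at least one C source file (one line per folder).
list_integration_tests() {
  find "$1/integration-test" -type f -name '*.c' -exec dirname {} \; | sort -u
}
```

Calling list_unit_tests . and list_integration_tests . from the top level of the repository gives a quick inventory of what ninja check-dynamatic and the integration flow are expected to cover.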

Contributing

Dynamatic welcomes contributions from the open-source community and from students as part of academic projects. We generally follow the LLVM and MLIR community practices, and currently use GitHub issues and pull requests to handle bug reports/design proposals and code contributions, respectively. Here are some high-level guidelines (inspired by CIRCT's guidelines):

  • Please use clang-format in the LLVM style to format the code (see .clang-format). There are good plugins for common editors like VSCode that can be set up to format each file on save, or you can run it manually. This makes code easier to read and understand, and more uniform throughout the codebase.
  • Please pay attention to warnings from clang-tidy (see .clang-tidy). Not all of them necessarily need to be acted upon, but in the majority of cases they help identify code smells.
  • Please follow the LLVM Coding Standards.
  • Please practice incremental development, preferring to send a small series of incremental patches rather than large patches. There are other policies in the LLVM Developer Policy document that are worth skimming.
  • Please create an issue if you run into a bug or problem with Dynamatic.
  • Please create a PR to get a code review. For reviewers, it is good to look at the primary author of the code you are touching to make sure they are at least CC'd on the PR.
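
If you prefer running clang-format manually over an editor plugin, something along the following lines may help. This is only a sketch under stated assumptions: the function names are hypothetical, the script is meant to be run from the top level of the repository with clang-format on your PATH, and it only considers *.h and *.cpp files that differ from HEAD. The --style=file option tells clang-format to use the .clang-format file found in the repository.

```shell
#!/bin/sh
# Keep only C++ headers and sources (*.h, *.cpp) from a list of paths on stdin.
select_cpp_files() {
  grep -E '\.(h|cpp)$'
}

# Reformat, in place, every tracked C++ file that differs from HEAD.
# Deleted files are excluded by the diff filter; untracked files are not listed.
format_changed_files() {
  git diff --name-only --diff-filter=ACMR HEAD |
    select_cpp_files |
    while IFS= read -r file; do
      clang-format -i --style=file "$file"
    done
}
```

Running format_changed_files before committing keeps a patch consistent with the repository style without touching unrelated files.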

GitHub Issues & Pull requests

The project uses GitHub issues and pull requests (PRs) to handle contributions from the community. If you are unfamiliar with those, here are some guidelines on how to use them productively:

  • Use meaningful titles and descriptions for issues and PRs you create. Titles should be short yet specific and descriptions should give a good sense of what you are bringing forward, be it a bug report or code contribution.
  • If you intend to contribute a large chunk of code to the project, it may be a good idea to first open a GitHub issue to describe the high-level design of your contribution there and leave it up for discussion. This can only increase the likelihood of your work eventually being merged, as the community will have had a chance to discuss the design before you propose your implementation in a PR (e.g., if the contribution is deemed too large, the community may advise splitting it up into several incremental patches). This is especially advisable for first-time contributors to open-source projects and/or compiler development beginners.
  • Use "Squash and Merge" in PRs when they are approved - we don't need the intra-change history in the repository history.

Experimental work

One of Dynamatic's priorities is to keep the repository's main branch stable at all times, with a high code quality throughout the project. At the same time, as an academic project we also receive regular code contributions from students with widely different backgrounds and fields of expertise. These contributions are often part of research-oriented academic projects, and are thus very "experimental" in nature. They will generally result in code that doesn't quite match the standard of quality (less tested, reliable, interoperable) that we expect in the repository. Yet, we still want to keep track of these efforts on the main branch to make them visible to and usable by the community, and encourage future contributions to the more experimental parts of the codebase.

To achieve these dual and slightly conflicting goals, Dynamatic supports experimental contributions to the repository. These will still have to go through a PR but will be merged more easily (i.e., with slightly less regard for code quality) compared to non-experimental contributions. We offer this possibility as a way to push for the integration of research work inside the project, with the ultimate goal of having these contributions graduate to full non-experimental work. We still strongly encourage developers to make their submitted code contributions as clean and reliable as possible regardless of whether they are classified as experimental. It can only increase their chance of acceptance.

To clearly separate them from the rest, all experimental contributions should exist within the experimental directory which is located at the top level of the repository. The latter's internal structure is identical to the one at the top level (see the repository's structure) with an include folder for all headers, a lib folder for pass implementations, etc. All public code entities defined within experimental work should live under the dynamatic::experimental C++ namespace for clear separation with non-experimental publicly defined entities.