See also the accompanying system and samples repositories for this system.
Differential privacy is the gold standard definition of privacy protection. This project aims to connect theoretical solutions from the academic community with the practical lessons learned from real-world deployments, to make differential privacy broadly accessible to future deployments. Specifically, we provide several basic building blocks that can be used by people involved with sensitive data, with implementations based on vetted and mature differential privacy research. Here in the Core, we provide a pluggable open-source library of differentially private algorithms and mechanisms for releasing privacy-preserving queries and statistics, as well as APIs for defining an analysis and a validator for evaluating these analyses and composing the total privacy loss on a dataset.
The mechanisms library provides a fast, memory-safe native runtime for validating and running differentially private analyses. The runtime and validator are built in Rust, while Python support is available and R support is forthcoming.
Differentially private computations are specified as an analysis graph that can be validated and executed to produce differentially private releases of data. Releases include metadata about accuracy of outputs and the complete privacy cost of the analysis.
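As a rough illustration of the idea, and not the library's actual API, an analysis can be pictured as a small graph of components in which each node names its inputs and the privacy budget it spends. The sketch below is plain Python; `Component`, `validate`, `execution_order` and `total_epsilon` are hypothetical names, and the total privacy cost assumes basic sequential composition (summing the epsilons).

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """One node of a toy analysis graph (hypothetical, not the library's type)."""
    name: str
    inputs: list = field(default_factory=list)  # ids of upstream nodes
    epsilon: float = 0.0  # budget spent here; 0 for non-private transforms


def validate(graph):
    """Reject graphs with dangling inputs or negative budgets."""
    for node_id, node in graph.items():
        if node.epsilon < 0:
            raise ValueError(f"node {node_id} has a negative epsilon")
        for parent in node.inputs:
            if parent not in graph:
                raise ValueError(f"node {node_id} reads missing node {parent}")
    return True


def execution_order(graph):
    """Order node ids so each node comes after its inputs (assumes no cycles)."""
    order, seen = [], set()

    def visit(node_id):
        if node_id in seen:
            return
        seen.add(node_id)
        for parent in graph[node_id].inputs:
            visit(parent)
        order.append(node_id)

    for node_id in graph:
        visit(node_id)
    return order


def total_epsilon(graph):
    """Basic sequential composition: per-node epsilons add up."""
    return sum(node.epsilon for node in graph.values())


graph = {
    0: Component("data"),
    1: Component("clamp", inputs=[0]),
    2: Component("dp_mean", inputs=[1], epsilon=0.5),
    3: Component("dp_count", inputs=[0], epsilon=0.25),
}

validate(graph)
print(execution_order(graph))  # node ids in a valid evaluation order
print(total_epsilon(graph))    # summed privacy cost of the two releases
```

Keeping the budget on each node is one simple way to let a single pass over the graph report the complete privacy cost alongside the release.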
- More about the Core Repository
- Installation
- Getting Started
- Communication
- Releases and Contributing
- Contributing Team
The primary releases available in the library, and the mechanisms for generating these releases, are enumerated below. For a full listing of the extensive set of components available in the library see this documentation.
Statistics | Mechanisms | Utilities |
---|---|---|
Count | Gaussian | Cast |
Histogram | Geometric | Clamping |
Mean | Laplace | Digitize |
Quantiles | | Filter |
Sum | | Imputation |
Variance/Covariance | | Transform |
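To show how a statistic, a utility and a mechanism from the table combine, here is a minimal sketch of a differentially private mean: values are clamped to known bounds and Laplace noise is added. It assumes the number of records is public and that neighboring datasets differ in one record, so the sensitivity of the mean is (upper - lower) / n. The function names are illustrative and this is not the library's implementation; in particular, naive floating-point noise sampling like this is not safe for production use.

```python
import random


def clamp(values, lower, upper):
    """Force every value into [lower, upper] so the sensitivity is bounded."""
    return [min(max(v, lower), upper) for v in values]


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_mean(values, lower, upper, epsilon, rng=None):
    """Release a clamped mean with Laplace noise calibrated to epsilon."""
    if not values:
        raise ValueError("values must be non-empty")
    if epsilon <= 0 or upper <= lower:
        raise ValueError("need epsilon > 0 and lower < upper")
    rng = rng or random.Random()
    bounded = clamp(values, lower, upper)
    # Changing one record moves the clamped mean by at most this much.
    sensitivity = (upper - lower) / len(bounded)
    scale = sensitivity / epsilon
    return sum(bounded) / len(bounded) + laplace_noise(scale, rng)


ages = [23, 35, 47, 51, 62, 140]  # 140 is clamped to 100 before averaging
print(dp_mean(ages, lower=0, upper=100, epsilon=1.0))
```

The output differs on every run because fresh noise is drawn each time; a smaller epsilon gives a larger noise scale and a less accurate release.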
There are three sub-projects that address individual architectural concerns. These sub-projects communicate via protobuf messages that encode a graph description of an arbitrary computation, called an analysis.
- Location: /validator-rust
The core library is the validator, which provides a suite of utilities for checking and deriving sufficient conditions for an analysis to be differentially private. This includes checking whether specific properties have been met for each component; deriving sensitivities, noise scales and accuracies for various definitions of privacy; building reports; and dynamically validating individual components. This library is written in Rust.
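The kind of derivation the validator performs can be sketched for the Laplace mechanism, using standard textbook formulas rather than the library's code: the noise scale is sensitivity / epsilon, and the noise stays within scale * ln(1 / alpha) of zero with probability 1 - alpha. The function names below are hypothetical.

```python
import math


def laplace_scale(sensitivity, epsilon):
    """Noise scale that makes a Laplace release epsilon-differentially private."""
    if sensitivity < 0 or epsilon <= 0:
        raise ValueError("need sensitivity >= 0 and epsilon > 0")
    return sensitivity / epsilon


def laplace_accuracy(scale, alpha):
    """Half-width a such that |noise| <= a with probability 1 - alpha."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return scale * math.log(1.0 / alpha)


scale = laplace_scale(sensitivity=1.0, epsilon=0.5)
print(scale)                                # 2.0
print(laplace_accuracy(scale, alpha=0.05))  # about 5.99
```

Read the second number as: with 95% probability, the released value is within about 5.99 of the true statistic.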
- Location: /runtime-rust
There must also be a medium to execute the analysis, called a runtime. There is a reference runtime written in Rust, but runtimes may be written using any computation framework, such as SQL, Spark or Dask, to address your individual data needs.
- Python Bindings: core-python
- R Bindings (in progress): core-R
- Rust Bindings (in progress): core-Rust
Finally, there are helper libraries for building analyses, called bindings. Bindings may be written for any language, and are thin wrappers over the validator and/or runtime(s). Language bindings are currently available for Python, with support for at least R, Rust and SQL forthcoming.
- Location: /validator-rust/prototypes
Communication among projects is handled via Protocol Buffer definitions in the /validator-rust/prototypes directory. All three sub-projects implement:
- Protobuf code generation
- Protobuf serialization/deserialization
- Communication over FFI
- Handling of distributable packaging
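The serialize, pass and deserialize pattern listed above can be sketched as follows. This is only an illustration under loud assumptions: JSON stands in for Protocol Buffers so the example needs no third-party packages, the receiving function is ordinary Python standing in for a native library entry point, and all names are hypothetical. What it does show is the usual shape of a C-style call, where a message crosses the boundary as an address of bytes plus a length.

```python
import ctypes
import json


def serialize(analysis):
    """Stand-in for protobuf encoding: a dict becomes a byte string."""
    return json.dumps(analysis, sort_keys=True).encode("utf-8")


def deserialize(payload):
    """Stand-in for protobuf decoding: bytes become a dict again."""
    return json.loads(payload.decode("utf-8"))


def native_component_count(address, length):
    """Pretend native entry point: reads `length` bytes starting at `address`."""
    payload = ctypes.string_at(address, length)
    return len(deserialize(payload)["components"])


def call_across_boundary(analysis):
    """Serialize, copy into a C buffer, then hand over its address and length."""
    payload = serialize(analysis)
    buffer = (ctypes.c_ubyte * len(payload)).from_buffer_copy(payload)
    return native_component_count(ctypes.addressof(buffer), len(payload))


analysis = {"components": {"0": {"name": "data"}, "1": {"name": "dp_count"}}}
print(call_across_boundary(analysis))  # 2
```

Passing an explicit length matters because serialized messages are binary and may contain zero bytes, so the receiver cannot rely on a terminator.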
The projects have previously compiled cross-platform, though more testing is needed. The validator and reference runtime compile to standalone libraries that may be linked into your project, allowing communication over C foreign function interfaces.
Refer to troubleshooting.md for install problems.
Refer to core-python, which contains the Python bindings, including links to PyPI packages.
The crates are intended for library consumers.
The Rust Validator and Runtime are available as crates:
The source install is intended for library developers.
You may find it easier to use the library with this repository set up as a submodule of some set of language bindings. In this case, switch to the language bindings setup. You can still push commits and branches from the core submodule of whatever bindings language you prefer.
- Clone the repository
git clone git@github.com:opendifferentialprivacy/whitenoise-core.git
- Install system dependencies (rust, gcc)
Mac:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
xcode-select --install
Linux:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
sudo apt-get install diffutils gcc make m4
Windows: Install WSL and refer to the Linux instructions.
- In a new terminal:
Build crate
cargo build
Test crate
cargo test
Document crate
cargo rustdoc --open
Build production docs
./build_docs.sh
There are crates in validator-rust and runtime-rust, and a virtual crate in the root that runs commands on both.
Switch between crates via cd, or by setting the manifest path with --manifest-path=validator-rust/Cargo.toml.
We have numerous Jupyter notebooks demonstrating the use of the Core library and validator through our Python bindings. These are in our accompanying samples repository which has exemplars, notebooks and sample code demonstrating most facets of this project.
The Rust documentation includes full documentation on all pieces of the library and validator, including extensive component by component descriptions with examples.
- Please use GitHub issues for bug reports, feature requests, install issues, and ideas.
- Gitter is available for general chat and online discussions.
- For other requests, please contact us at [email protected].
- Note: We encourage you to use GitHub issues, especially for bugs.
Please let us know if you encounter a bug by creating an issue.
We appreciate all contributions. We welcome pull requests with bug-fixes without prior discussion.
If you plan to contribute new features, utility functions or extensions to the core, please first open an issue and discuss the feature with us.
- Sending a PR without discussion may result in its rejection, because we may be taking the core in a direction you are not aware of.
Joshua Allen, Christian Covington, Eduardo de Leon, Ira Globus-Harris, James Honaker, Jason Huang, Saniya Movahed, Michael Phelan, Raman Prasad, Michael Shoemate, You?