GSoC 2020 project to integrate RDKit and MongoDB

vvrubel/mongo-rdkit

 
 

mongo-rdkit

Mongo-rdkit is an integration between MongoDB, a NoSQL database platform, and RDKit, a collection of cheminformatics and machine-learning software. This package contains tools to create and manipulate a chemically-intelligent database, as well as methods for high-performance searches on the database that leverage native MongoDB features.

Useful links:

  • Documentation
  • Jupyter Notebooks and resources for getting started, in the docs folder on GitHub.

Installation

macOS and Linux:

Ensure that you have either Anaconda or Miniconda installed and that conda has been added to PATH.

Clone the repository into your desired directory.

Navigate so that your current working directory is mongo-rdkit.

Create a conda environment called mongo_rdkit that includes all dependencies needed for this package:

conda env create --quiet --force --file env.yml

Activate the environment (on recent conda versions, conda activate mongo_rdkit is the preferred form):

source activate mongo_rdkit

Install a local copy of mongo-rdkit by running this from the same directory as setup.py (mongo-rdkit is not yet published to PyPI):

pip install -e .

You can now import mongordkit in your Python interpreter or run all tests using the pytest command.
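As a quick sanity check after installation, the sketch below tests whether a package can be located on the current interpreter's import path without actually importing it. The helper name is_importable is hypothetical and not part of mongordkit; the only assumption about the package is its import name, mongordkit, as given above.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located on the import path."""
    # find_spec returns None when a top-level module cannot be found.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "mongordkit" is the import name stated in this README.
    status = "found" if is_importable("mongordkit") else "not found"
    print(f"mongordkit: {status}")
```

If this reports "not found", the most likely causes are that the conda environment is not active or that the pip install step was run from a different directory.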

Windows:

As on macOS and Linux, ensure that Anaconda or Miniconda is installed and that conda has been added to PATH.

Clone the repository into your desired directory and navigate into it.

Create a conda environment called mongo_rdkit that includes dependencies:

conda env create --quiet --force --file env.yml

Activate this conda environment:

call activate mongo_rdkit

Check that you are able to import mongordkit:

python -c "import mongordkit"

If this fails, you may need to add the current directory manually to PYTHONPATH:

set PYTHONPATH=%PYTHONPATH%;C:.

You can now use mongordkit in your interpreter and run tests using python -m pytest.
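If setting PYTHONPATH in the shell is inconvenient, much the same effect can be had for a single session from inside Python. This is a minimal sketch: the helper add_to_import_path is hypothetical, and it assumes you pass the path to your cloned mongo-rdkit directory.

```python
import os
import sys


def add_to_import_path(directory: str) -> str:
    """Put directory at the front of sys.path (once) and return its absolute path."""
    absolute = os.path.abspath(directory)
    if absolute not in sys.path:
        # sys.path is searched in order, so position 0 is checked first.
        sys.path.insert(0, absolute)
    return absolute


if __name__ == "__main__":
    # "." assumes the interpreter was started from the repository root.
    print(add_to_import_path("."))
```

Unlike the PYTHONPATH variable, this change lasts only for the running interpreter, so it has to be repeated in each new session or script.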

Package Contents

Modules

mongordkit has two main modules, each exposing a variety of importable functions and classes. Database handles writing and registering data. Search handles setting up and performing substructure and similarity searches. Detailed walkthroughs can be found in the notebooks listed below.
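To make the idea of similarity search concrete, here is a library-free sketch. It assumes fingerprints are represented as sets of "on" bit indices and uses the Tanimoto coefficient, a common similarity measure in cheminformatics. This README does not say which metric or fingerprint the Search module uses, and the function names and toy data below are hypothetical, not mongordkit's API.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient: shared on-bits divided by total distinct on-bits."""
    union = len(a | b)
    # Two empty fingerprints share nothing; define their similarity as 0.0.
    return len(a & b) / union if union else 0.0


def similarity_search(
    query: Set[int], library: Dict[str, Set[int]], threshold: float
) -> List[Tuple[str, float]]:
    """Return (id, score) pairs with score >= threshold, best match first."""
    scored = [(mol_id, tanimoto(query, fp)) for mol_id, fp in library.items()]
    hits = [(mol_id, score) for mol_id, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # Toy fingerprints with made-up bit indices, not real molecules.
    library = {"mol_a": {1, 2, 3, 4}, "mol_b": {1, 2, 9}, "mol_c": {7, 8}}
    print(similarity_search({1, 2, 3}, library, threshold=0.5))
```

With the toy data, mol_a scores 0.75 and mol_b scores 0.5, while mol_c shares no bits with the query and is dropped. A database-backed search would presumably filter candidates on the server rather than scoring every record in Python as this sketch does.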

Notebooks

  • Creating and Writing to MongoDB: documentation and demos for creating and modifying mongo-rdkit databases.
  • Similarity and Substructure Search: documentation and demos for similarity and substructure search.
  • Similarity Benchmarking: documentation for reproducing similarity benchmarking.
  • Substructure Benchmarking: documentation for reproducing substructure benchmarking.

Configuration

  • azure_pipelines.yml: CI/CD pipeline configurations.
  • conftest.py: pytest configurations.
  • env.yml: required dependencies.
  • setup.py: Python package setup, including pip dependencies.

License

Code released under the BSD License.
