
GSoC 2024 Project Ideas

Cédric Bouysset edited this page Feb 3, 2024 · 22 revisions
Google Summer of Code 2023

This page will be updated for GSoC 2024 if MDAnalysis is accepted as a GSoC Organization. (Some links below are outdated/broken and will be updated as soon as we know that MDAnalysis will host GSoC 2024 Contributors.)

Hello, and welcome to MDAnalysis!

Please read our blog post for important official information. Note that the blog post link will be updated if MDAnalysis is accepted as an Organization for GSoC 2024.

Please see our Google Summer of Code wiki page for general information, including advice on application writing, and our GSoC FAQ for commonly asked questions.

If you just found out about the MDAnalysis Python package from the GSoC website, you can have a look at the MDAnalysis 2021 Trailer [YouTube] to get an overview of the scope of the MDAnalysis package.

Prerequisites

MDAnalysis is a Python library for the analysis of computer simulations of many-body systems at the molecular scale, spanning use cases from interactions of drugs with proteins to novel materials. Our GSoC projects therefore require basic knowledge of, and hands-on experience with, molecular dynamics simulations and the associated analyses, or equivalent experience in simulations and modeling of molecular systems (physics, biophysics, chemistry, or materials). For our suggested projects, please check the project descriptions carefully to see the associated desirable skills.

To Prospective Applicants

If you are interested in taking part, please do get in touch on the GSoC with MDAnalysis Discussion Forum. Given the GSoC program structure (short, medium, and long projects), letting us know of your intentions to apply and getting acquainted with the project early will be very helpful.

To Prospective Mentors

MDAnalysis welcomes new mentors. Please do get in touch in the developer forum if you are interested in taking part. We typically expect mentors to be familiar with our development process, as evidenced by contributions to the code base and interactions on the developer forum.

Overview

A list of project ideas for Google Summer of Code 2024.

The currently proposed projects are:

  1. Generalize groups
  2. Extend MDAnalysis interoperability
  3. Benchmarking and performance optimization
  4. On the fly transformations
  5. Deepchem converter
  6. 2D visualization for small molecules
  7. Better interfacing of Blender and MDAnalysis

Or work on your own idea! Get in contact with us to propose an idea and we will work with you to flesh it out into a full project. Raise an issue in the Issue Tracker or contact us via the GSoC with MDAnalysis Discussion Forum.

Look at the list of all available mentors for MDAnalysis for potential mentors for your project. Please send all communications to the discussion forum (and don't contact mentors privately). You can certainly ask for the opinion of a specific mentor if you know that their expertise is particularly suitable for your project.


Project summary

The table summarizes the project ideas; long descriptions follow the table (or click on the links under each project name). The difficulty is a somewhat subjective ranking: easy means that we know pretty much what needs to be done; medium requires some additional research into the best solutions as part of the project; and hard is high risk/high reward, where we think a solution exists but we will have to work with the student to find and implement it. Project sizes are 90 h (short), 175 h (medium), or 350 h (long).

| # | project name | difficulty | project size | description | skills | mentors |
|---|---|---|---|---|---|---|
| 1 | Generalise Groups | hard | 350 hours | Generalise concept of groups | Python, NetworkX, Molecular modeling | @richardjgowers, @yuxuanzhuang |
| 2 | Extend MDAnalysis Interoperability | medium | 350 hours | Extend converters module to other relevant packages | Python, Molecular modeling | @hmacdope, @yuxuanzhuang |
| 3 | Benchmarking and performance optimization | medium/hard | 175 hours | Write benchmarks for automated performance analysis and address performance bottlenecks | Python, Molecular modeling | @hmacdope |
| 4 | On the fly transformations | easy, medium, hard | 90, 175, 350 hours | Description | Skills | @richardjgowers, @cbouy |
| 5 | Deepchem converter | easy, medium, hard | 90, 175, 350 hours | Description | Skills | @richardjgowers |
| 6 | 2D visualization for small molecules | easy | 90 hours | Add basic 2D visualization functionalities for small molecule groups in notebooks | Python, basic knowledge of MDAnalysis and RDKit | @cbouy |
| 7 | Better interfacing of Blender and MDAnalysis | medium | 350 hours | Add functionality to visualize simple MDAnalysis results in Blender | Python, basic knowledge of MDAnalysis, familiarity with Blender ideal | @BradyAJohnston, @yuxuanzhuang |

Project 1: Bead and Ring Groups

It is common to want to consider a group of atoms as a single site/particle, for example defining the position of a water molecule (or a larger solvent) as its center of mass. It then follows that it is useful to consider many such groupings as an array of quasi-particles, leading to something like an AtomGroup-Group, e.g. a Group representing a solvent where each item in the Group is a single molecule. The goal of this project is to make two such groupings, BeadGroup and RingGroup:

  • BeadGroup: groups of atoms that can be represented as a single site/particle. This could be used for analysis purposes, as well as to define coarse-grained beads.
  • RingGroup: aromatic rings (e.g. benzene, nucleobases, etc.) can be defined by their position (the geometric center of the ring) and their normal vector (the direction they are facing). This class would be implemented as a special case of BeadGroup that also defines a directionality.
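
To make the idea concrete, here is a minimal pure-Python sketch of what a container of atom groupings could look like. It is only an illustration under stated assumptions: `Atom`, `BeadGroup`, `masses`, and `positions` are hypothetical names chosen for this example, not an existing MDAnalysis API, and the eventual design is part of the project itself.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Vector = Tuple[float, float, float]


@dataclass
class Atom:
    """Toy stand-in for an atom: a mass and a Cartesian position."""
    mass: float
    position: Vector


class BeadGroup:
    """Hypothetical container in which each item is a grouping of atoms."""

    def __init__(self, beads: Sequence[Sequence[Atom]]):
        self.beads: List[List[Atom]] = [list(bead) for bead in beads]

    def __len__(self) -> int:
        return len(self.beads)

    @property
    def masses(self) -> List[float]:
        """Total mass of each bead."""
        return [sum(atom.mass for atom in bead) for bead in self.beads]

    @property
    def positions(self) -> List[Vector]:
        """One position per bead: the center of mass of its atoms."""
        centers = []
        for bead, total in zip(self.beads, self.masses):
            centers.append(tuple(
                sum(atom.mass * atom.position[i] for atom in bead) / total
                for i in range(3)
            ))
        return centers


# Two toy "molecules", each treated as a single site.
solvent = BeadGroup([
    [Atom(1.0, (0.0, 0.0, 0.0)), Atom(3.0, (4.0, 0.0, 0.0))],
    [Atom(2.0, (0.0, 2.0, 0.0)), Atom(2.0, (0.0, 4.0, 0.0))],
])
```

Here `solvent.positions` yields one center of mass per molecule, so the group can be handled like an array of quasi-particles.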

Objectives

  1. Design and implement a BeadGroup class to represent a container of many groupings of atoms
  2. Generalise existing methods (e.g. center_of_mass) to BeadGroup
  3. Implement RingGroup, as a special case of BeadGroup
  4. Implement ring finding functions to quickly define these groups
  5. Implement basic RingGroup analysis functions, e.g. angle between rings, π-stacking identification.
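
As a rough illustration of the last objective, the sketch below computes a ring's geometric center, its unit normal, and the angle between two rings in plain Python. The function names are hypothetical and chosen for this example only. The normal is computed with Newell's method, which is one reasonable choice rather than a prescribed one, and it assumes the atom coordinates are supplied in ring order. Because the sign of a ring normal is arbitrary, the angle is folded into the range 0° to 90°.

```python
import math
from typing import Sequence, Tuple

Vector = Tuple[float, float, float]


def ring_center(coords: Sequence[Vector]) -> Vector:
    """Geometric center of the ring atoms."""
    n = len(coords)
    return tuple(sum(p[i] for p in coords) / n for i in range(3))


def ring_normal(coords: Sequence[Vector]) -> Vector:
    """Unit normal of the ring plane (Newell's method, atoms in ring order)."""
    nx = ny = nz = 0.0
    following = list(coords[1:]) + [coords[0]]
    for (x1, y1, z1), (x2, y2, z2) in zip(coords, following):
        nx += (y1 - y2) * (z1 + z2)
        ny += (z1 - z2) * (x1 + x2)
        nz += (x1 - x2) * (y1 + y2)
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0.0:
        raise ValueError("degenerate ring: atoms are collinear or coincident")
    return (nx / length, ny / length, nz / length)


def angle_between_rings(a: Sequence[Vector], b: Sequence[Vector]) -> float:
    """Angle between two ring planes in degrees, folded into [0, 90]."""
    na, nb = ring_normal(a), ring_normal(b)
    dot = abs(sum(x * y for x, y in zip(na, nb)))
    return math.degrees(math.acos(min(1.0, dot)))


# Idealized hexagons: one in the xy-plane, one shifted copy, one in the xz-plane.
angles = [math.radians(60 * k) for k in range(6)]
flat = [(math.cos(t), math.sin(t), 0.0) for t in angles]
shifted = [(x, y, z + 3.5) for x, y, z in flat]
upright = [(math.cos(t), 0.0, math.sin(t)) for t in angles]
```

A π-stacking check would typically combine this angle with the distance between the two ring centers; the cutoffs to use are left open here, since the page does not specify them.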

Relevant skills

  • Python
  • Graph theory (e.g. the NetworkX package)
  • Molecular modeling
  • Chemistry

Related issues:

Mentors

  • @richardjgowers
  • @yuxuanzhuang

Project 2: Extend interoperability

MDAnalysis has been pushing towards interoperability objectives. In pursuit of this aim, we have already added converters to the ParmEd and RDKit libraries. We aim to continue this direction by focusing on other relevant packages such as MDTraj, pytraj, OpenBabel, and Psi4.

Objectives

  • Create converter classes that convert between MDAnalysis and your chosen package(s), in both directions
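
One detail that any two-way converter has to handle is units: MDAnalysis works in ångström for lengths, whereas MDTraj, for instance, uses nanometers. The sketch below shows a round trip between two toy containers with that conversion applied. It is a stand-alone illustration: `AngstromAtoms`, `NanometerAtoms`, `to_nanometer`, and `from_nanometer` are hypothetical names, not part of the MDAnalysis converters module or of any target package.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vector = Tuple[float, float, float]

ANGSTROM_PER_NM = 10.0


@dataclass
class AngstromAtoms:
    """Toy stand-in for the source side: positions in angstrom."""
    names: List[str]
    positions: List[Vector]


@dataclass
class NanometerAtoms:
    """Toy stand-in for a target package that stores positions in nanometers."""
    names: List[str]
    xyz: List[Vector]


def to_nanometer(atoms: AngstromAtoms) -> NanometerAtoms:
    """Convert to the target representation, rescaling angstrom to nm."""
    xyz = [tuple(c / ANGSTROM_PER_NM for c in p) for p in atoms.positions]
    return NanometerAtoms(names=list(atoms.names), xyz=xyz)


def from_nanometer(other: NanometerAtoms) -> AngstromAtoms:
    """Convert back, rescaling nm to angstrom."""
    positions = [tuple(c * ANGSTROM_PER_NM for c in p) for p in other.xyz]
    return AngstromAtoms(names=list(other.names), positions=positions)


original = AngstromAtoms(names=["O", "H1", "H2"],
                         positions=[(12.5, 3.0, 7.25),
                                    (13.4, 3.0, 7.25),
                                    (12.2, 3.9, 7.25)])
round_trip = from_nanometer(to_nanometer(original))
```

A round-trip check like this one (convert, convert back, compare within a floating-point tolerance) is also a natural starting point for testing a real converter.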

Relevant skills

  • Python
  • Any other language relevant to your chosen package (likely C++)
  • Basic knowledge of the chosen package(s)
  • Molecular modeling
  • Molecular Dynamics/Cheminformatics/Quantum Chemistry (depending on the chosen package)

Mentors

  • @hmacdope
  • @yuxuanzhuang

Project 3: Benchmarking and performance optimization

The performance of the MDAnalysis library is assessed by automated benchmarks with ASV. The benchmarks are publicly available and are updated every night.

The goal of this project is to increase the performance assessment coverage and identify code that should be improved.

Objectives

  1. Write benchmark cases.
  2. Analyze the performance history to identify code that needs to be improved.
  3. Optimize the code for at least one of the discovered performance bottlenecks.
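
For orientation, an ASV benchmark is an ordinary Python class: ASV times every method whose name starts with `time_`, calls `setup` beforehand, and runs the benchmark once per entry in `params`. The sketch below follows that convention but benchmarks a toy pure-Python function. `center_of_geometry` and `CenterOfGeometryBench` are hypothetical names for this example, and a real benchmark would call into MDAnalysis instead.

```python
import random


def center_of_geometry(positions):
    """Toy workload standing in for the library call being benchmarked."""
    n = len(positions)
    return tuple(sum(p[i] for p in positions) / n for i in range(3))


class CenterOfGeometryBench:
    """ASV-style benchmark class for the toy workload above."""

    # Each value is passed to setup() and to every time_* method.
    params = [100, 1_000, 10_000]
    param_names = ["n_atoms"]

    def setup(self, n_atoms):
        # Build the input outside the timed region, with a fixed seed
        # so that every run measures the same data.
        rng = random.Random(0)
        self.positions = [
            (rng.random(), rng.random(), rng.random())
            for _ in range(n_atoms)
        ]

    def time_center_of_geometry(self, n_atoms):
        # Only the body of this method is timed.
        center_of_geometry(self.positions)
```

Keeping data construction in `setup` means the recorded timings reflect the operation of interest and not the cost of generating inputs.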

Relevant skills

  • Python
  • Molecular modeling

Mentors

  • @hmacdope

Project 4: On the fly transformations

Add detailed description here

Objectives

Add objectives here

Relevant skills

Add relevant skills here

Mentors

  • @richardjgowers

Project 5: Deepchem converter

Add detailed description here

Objectives

Add objectives here

Relevant skills

Add relevant skills here

Mentors

  • @richardjgowers

Project 6: 2D visualization for small molecules

MDAnalysis currently lacks visualization functionality. While it is possible to use compatible 3D visualization libraries such as NGLView to depict entire molecular systems, this provides only limited information (atoms and their connectivity), which may not be sufficient for small molecules such as drug-like compounds. Since the addition of the RDKit converter, MDAnalysis can reuse functionality from RDKit, a popular cheminformatics library, to depict molecules by simply converting MDAnalysis atom groups to RDKit objects. This project can range from a basic 2D visualization to more enhanced depictions that include metadata or a heatmap from atom-level data.

Objectives

  • Add functionality such that small molecule groups can be easily visualized in notebooks

Relevant skills

  • Python
  • Basic knowledge of MDAnalysis
  • Basic knowledge of RDKit

Mentors

  • @cbouy

Project 7: Better interfacing of Blender and MDAnalysis

Blender is an industry-leading 3D modelling and animation package. Through the Molecular Nodes add-on, MDAnalysis universes can be imported into the 3D scene, enabling advanced rendering of molecular dynamics trajectories that is not possible in any other molecular viewer. Currently there is initial support for streaming MD trajectories into the 3D viewport, as well as the ability to update selections and visualize some basic analysis results inside Blender.

Objectives

  • Add functionality such that some simple MDAnalysis results can be visualized interactively inside Blender

Relevant skills

  • Python
  • Basic knowledge of MDAnalysis
  • Familiarity with Blender is ideal

Mentors

  • @BradyAJohnston
  • @yuxuanzhuang