Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Abstract_Site_Representations #523

Open
wants to merge 3 commits into
base: old_main
Choose a base branch
from

Conversation

CalCraven
Copy link
Contributor

@CalCraven CalCraven commented Mar 11, 2021

The goal of of this PR is to allow support of different site classes beyond the currently implemented Atom Class.

This PR is linked to Issue #495, and further discussion can be found at this link.

Checklist:

  • Adjust Atom class to it refers to atom specific behavior, and allows for the addition of site variants.
  • Add a Bead class that behaves similarly to Atom class.
  • Cover flexible cases where a Bead site will have different behavior from an Atom site.
  • Add a residue attribute to all site variants
  • Add tests for implementation of a topology composed of coarse-grained beads.
  • Add a Virtual_Site class that is drawn from a grouping of Bead or Atom classes.
  • Implement relevant usages for information drawn from Virtual_Site.
  • Add tests for implementation of a topology that includes a virtual site.
  • Consider use cases for additional abstract representations in the Topology.

If anyone can link me to some specific use cases, that would be extremely helpful as I start to get develop some of these classes. As discussion for this PR builds, I'll also include a summary of what I think are relevant questions to address and possible solutions.

List of Questions:

  1. What are some examples of commonly used coarse-grained forcefields used?
  • Martini...

@codecov
Copy link

codecov bot commented Mar 11, 2021

Codecov Report

Merging #523 (9943f31) into main (3a387c7) will decrease coverage by 1.29%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##             main     #523      +/-   ##
==========================================
- Coverage   90.79%   89.49%   -1.30%     
==========================================
  Files          56       57       +1     
  Lines        4626     4693      +67     
==========================================
  Hits         4200     4200              
- Misses        426      493      +67     
Impacted Files Coverage Δ
gmso/core/bead.py 0.00% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3a387c7...9943f31. Read the comment docs.

@lgtm-com
Copy link
Contributor

lgtm-com bot commented Mar 11, 2021

This pull request introduces 3 alerts when merging 7764272 into fee3df2 - view on LGTM.com

new alerts:

  • 2 for Unused local variable
  • 1 for Unused import

@umesh-timalsina
Copy link
Member

@chrisjonesBSU @jennyfothergill deferring this to you guys' expertise.

Copy link
Collaborator

@jennyfothergill jennyfothergill left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgive me if this review is not helpful--I'm not familiar with gmso.
These ideas are really interesting to me:

Add a Virtual_Site class that is drawn from a grouping of Bead or Atom classes.
Implement relevant usages for information drawn from Virtual_Site.
Add tests for implementation of a topology that includes a virtual site.
I like the idea of using the Bead class to store a coarse grain mapping. (e.g., a thiophene moiety, smiles 'c1cscc1', is mapped to a single bead and maybe the Bead class could store that mapping.)

As for what properties I would use of a bead, I can think of type, position, mass, charge...


bead_type_: Optional[AtomType] = Field(
None,
description='AtomType associated with the atom'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
description='AtomType associated with the atom'
description='AtomType associated with the bead'

should this be bead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good catch @jennyfothergill. I'm going to go through and try to write some more implementation for the bead class. Based on the properties you mentioned above, looks like it will act relatively similarly to the current site class.

@bc118
Copy link
Contributor

bc118 commented Apr 8, 2021

I would request that the residues be a part of the atom/bead in all cases, preferably as an attribute. I outline some of the major reasons below.

For all GOMC and many other bio-type simulations the residues are required to conduct a standard simulation and analysis.

For GOMC the residue names are required for the following reasons:

  • The residues are essential a molecule type, and the residues names are required to build the systems in MoSDeF. Currently, in GOMC, a molecule with multiple residues will not work (i.e., a protein with all the amino acids named as different residues), but none the less the residue names are still required.
  • The special MC moves require the molecule to be specified by the residue names along with other molecule characteristics.
  • When using the CHARMM file writers in MoSDeF, the FFs are specified on a per residue basis (i.e., per molecule as GOMC is currently designed), allowing unique FF xml files to be utilized for individual molecules as the user sees fit. This also minimizes errors in editing a large FF file, when the user can simply create one for each residue/molecule, if the residues are utilized to specify the FF selections.

Other bio simulations in MoSDeFs future:

  • There are multiple residues per molecules in a protein, with all the amino acids named as different residues. This is the standard method for running these bio simulations.
  • All the currently protein/bio analysis codes are specifically designed to select atoms based on their residues. therefore, without this attribute the bio people will in all likelihood avoid any code that does not include it.

Specific coarse grain FF example:

  • Martini FF (http://cgmartini.nl) which can be used by for bio simulation and hopefully we will be able to use it in GOMC in the future.
  • I'm purely guessing here, but some of these Martini FF's coarse grain beads may or could be residue specific.

@jennyfothergill
Copy link
Collaborator

jennyfothergill commented Apr 9, 2021

I wonder if residue/virtual site could be grouped into one attribute. It seems to me that they would both represent a grouping of particles and could contain which particles the residue/virtual site contains, a type name, and an id/index? Is there anything further that a bio reside would need?

@bc118
Copy link
Contributor

bc118 commented Apr 9, 2021

As long as GOMC or bio engines can access this residue for building the engines input files (psf, pdb, and FF files) info we "should" be OK. For example, the residue name will need to be tied to each atom number/ID.

There could be some bonding between the same type of residues in a protein, or if side group is added. Therefore, all the residues may not be exactly the same molecularly speaking.

Example: A residue at the beginning or end of a protein will be "capped" and have a slightly different structure that the same residue in the middle of the protein. This is the same concept as building a polymer from monomers with the ends being different, except the monomer is the amino acids in the bio case.

I just think we should make the residue information flexible as described above so bio application are easily accommodated.

Eventually, it would be nice if we can eventually use the crystallography protein data (pdb), or at least a modified mol2 version to input the protein data and keeping all the original information, mol ID, residue names... However, I believe this is a good deal of work.

@CalCraven
Copy link
Contributor Author

I would request that the residues be a part of the atom/bead in all cases, preferably as an attribute. I outline some of the major reasons below.

I agree with these reasons. I think particular for integration with commonly used bio simulation tools, such as Chimera or Avogadro that rely on residues for most of their tools and analysis, I think this makes sense to implement within this PR.

I wonder if residue/virtual site could be grouped into one attribute. It seems to me that they would both represent a grouping of particles and could contain which particles the residue/virtual site contains, a type name, and an id/index? Is there anything further that a bio reside would need?

I see what you're saying @jennyfothergill, but as far as I can tell there might be differences that are worth implementing them separately. I think it makes sense to include in each atom or bead class an attribute called Residue. In my experience, Gromacs uses these for a lot of the selection syntax, and it replaces the necessity of carrying around bulky groups.ndx as an extra specification to gmx grompp. Lammps also has a molecule-tag that could pull from the residue attribute, although it looks like it will only take integers for that specification. For engines that don't need residue information, that attribute just wouldn't be recorded in the output files.
However, I think virtual sites would be used on a less frequent basis. Maybe you want to keep track of the center of mass of a group of atoms, so you can specify a virtual site made up of that group. That may be from all atoms that make up a single residue, but I can see use cases where you want information from a more flexible grouping. I think including a secondary virtual_site class would be a good way to treat these situations.

As far as actually creating this virtual_sites class, I think the most difficult thing will be making sure the group is properly translated to various engines, and will certainly require some software specific knowledge of how these are treated in the engines we want to support. That might have to be done in a separate PR for each engine. But I think some basic knowledge of what that would look like will help direct the way they are built in GMSO.

@lgtm-com
Copy link
Contributor

lgtm-com bot commented Jun 9, 2022

This pull request introduces 3 alerts when merging 9943f31 into 3a387c7 - view on LGTM.com

new alerts:

  • 2 for Unused local variable
  • 1 for Unused import

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants