-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lammps Writer Changes For Updated Topology Class #701
Conversation
This pull request introduces 4 alerts when merging 086dbe4 into 4f00c99 - view on LGTM.com new alerts:
Heads-up: LGTM.com's PR analysis will be disabled on the 5th of December, and LGTM.com will be shut down ⏻ completely on the 16th of December 2022. Please enable GitHub code scanning, which uses the same CodeQL engine ⚙️ that powers LGTM.com. For more information, please check out our post on the GitHub blog. |
Possible method suggested by @daico007 for iterating through atomtypes and connection types. I will test this for the speedups compared to just iterating through all of the connections during the writing step. # Create an atom type map to do indexing
atypes = list(topology.atom_types())
atypes_map = dict()
for i, uatype in enumerate(unique_atypes):
matched_idx = list()
for idx, atype in enumerate(atypes):
if atype == uatype:
atypes_map[atype] = i
matched_idx.append(idx)
for idx in reversed(matched_idx):
atypes.pop(idx)
# Make connection types map to extract index later
uc_types = topology.connection_types(PotentialFilters.UNIQUE_NAME_CLASS)
c_types = [ctype for ctype in topology.connection_types]
c_typesmap = dict()
for uc_type in uc_types:
matched_idx = list()
for idx, c_type in enumerate(c_types):
if c_type == uc_type:
c_typesmap[c_type] = uc_type
matched_idx.append(idx)
for idx in reversed(matched_idx):
c_types.pop(idx) |
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## main #701 +/- ##
==========================================
- Coverage 91.98% 91.91% -0.07%
==========================================
Files 67 67
Lines 6460 6818 +358
==========================================
+ Hits 5942 6267 +325
- Misses 518 551 +33
☔ View full report in Codecov by Sentry. |
This pull request introduces 4 alerts when merging a1d8ba8 into 6bcdfcc - view on LGTM.com new alerts:
Heads-up: LGTM.com's PR analysis will be disabled on the 5th of December, and LGTM.com will be shut down ⏻ completely on the 16th of December 2022. Please enable GitHub code scanning, which uses the same CodeQL engine ⚙️ that powers LGTM.com. For more information, please check out our post on the GitHub blog. |
This pull request introduces 4 alerts when merging 9413d01 into fe321e2 - view on LGTM.com new alerts:
Heads-up: LGTM.com's PR analysis will be disabled on the 5th of December, and LGTM.com will be shut down ⏻ completely on the 16th of December 2022. It looks like GitHub code scanning with CodeQL is already set up for this repo, so no further action is needed 🚀. For more information, please check out our post on the GitHub blog. |
7c2a488
to
6119f1b
Compare
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CodeQL found more than 10 potential problems in the proposed changes. Check the Files changed tab for more details.
…t matching with lammps output from mbuild writer.
…nique potential types
Co-authored-by: Co Quach <[email protected]>
Co-authored-by: Co Quach <[email protected]>
Co-authored-by: Co Quach <[email protected]>
… appearance in the topology. Copies of connection types are made, which must be filtered using potential filters to get whatever subset is considered unique.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great overall! I only have a few comments about code style for consistency (between units on comment line and actually being written out), and maybe some missing using conversion.
I pulled this PR and reviewed/tested it and things seem to be functional, so I am in favor of merging it soon. There are a few things, as we discussed last time, regarding the necessary of LAMMPS specific harmonic bonds/angles, but I think that can be addressed in a different. Another thing would be using the centralize unit conversing units (since HOOMD and GROMACS is using same/similar functionality.
for potStr in exp_unitsDict: | ||
potContainer = getattr(self, potStr + "_types") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like this method only convert all the potentials. Should we also convert the atoms' position too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the initial goal for this method was just to be used on the parameters of the potential expressions, which is why it's missing things. Pairpotentials come to mind as an example of that. However, I ended up converting the expressions on writing out, instead of upfront in the lammpswriter, and mostly relying on methods that are living in lammpsdata.py because of that. I still think the best way to do the conversion would be:
- Grab the set of base UnitSystem. Presumably, this would be from a library, or passed directly to whatever function by the user. I can see a useful added input of also passing units specific to parameters in the topology, such as: {"angle_types":{"theta":u.degrees}}, which would override the value for that UnitSystem.
- Determine if the unit changes should happen in place or to a copy of the topology. This would be mostly for speedups, and prevent issues with overwriting whatever you pass to the function.
- Iterate through all components of the topology. Each component should be passed to a generic unit_conversion method with the base units and the conversion should be done. These components include: [box, sites, atomtypes, bondtypes, angletypes, dihedraltypes, impropertypes, pairpotentials]. Each of these may have their own specific methods, but should be accessible through a single api.
- Once the conversions are made, then the topology attributes can be written out directly as floats with the to_value() method. Rounding should also occur on writing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To summarize, I think this function is not used currently at all. I can either delete it, or leave it as a starting place for the next PR that looks to normalize and standardize all of the unit handling in GMSO.
for site in top.sites: | ||
site.atom_type = ff.atom_types[site.name] | ||
|
||
top.update_atom_types() | ||
site.atom_type = ff.atom_types[site.atom_type.name] | ||
|
||
for bond in top.bonds: | ||
bond.bond_type = bond.connection_type = ff.bond_types[ | ||
"opls_111~opls_112" | ||
] | ||
|
||
top.update_bond_types() | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could remove this section completely at this point. It was there when the apply
method was not yet available.
reg = UnitRegistry() | ||
dim = u.dimensions.current_mks * u.dimensions.time | ||
conversion = 1 * getattr(u.physical_constants, "elementary_charge").value | ||
reg.add( | ||
"elementary_charge", | ||
base_value=conversion, | ||
dimensions=dim, | ||
tex_repr=r"\rm{e}", | ||
) | ||
conversion = 1 * getattr(u.physical_constants, "boltzmann_constant_mks").value | ||
dim = u.dimensions.energy / u.dimensions.temperature | ||
reg.add( | ||
"kb", base_value=conversion, dimensions=dim, tex_repr=r"\rm{kb}" | ||
) # boltzmann temperature | ||
conversion = ( | ||
4 | ||
* np.pi | ||
* getattr(u.physical_constants, "reduced_planck_constant").value ** 2 | ||
* getattr(u.physical_constants, "eps_0").value | ||
/ ( | ||
getattr(u.physical_constants, "electron_charge").value ** 2 | ||
* getattr(u.physical_constants, "electron_mass").value | ||
) | ||
) | ||
dim = u.dimensions.length | ||
reg.add( | ||
"a0", base_value=conversion, dimensions=dim, tex_repr=r"\rm{a0}" | ||
) # bohr radius | ||
conversion = ( | ||
getattr(u.physical_constants, "reduced_planck_constant").value ** 2 | ||
/ u.Unit("a0", registry=reg).base_value ** 2 | ||
/ getattr(u.physical_constants, "electron_mass").value | ||
) | ||
dim = u.dimensions.energy | ||
reg.add( | ||
"Ehartree", base_value=conversion, dimensions=dim, tex_repr=r"\rm{Ehartree}" | ||
) # Hartree energy | ||
conversion = np.sqrt( | ||
10**9 / (4 * np.pi * getattr(u.physical_constants, "eps_0").value) | ||
) | ||
dim = u.dimensions.charge | ||
reg.add( | ||
"Statcoulomb_charge", | ||
base_value=conversion, | ||
dimensions=dim, | ||
tex_repr=r"\rm{Statcoulomb_charge}", | ||
) # Static charge |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be cleaner to put this inside _unit_style_factory
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Upon chatting with @daico007, I think all of this is going to get moved to the units module in an upcoming PR. So, I'll leave this here for now, as it's not worth the effort to move this right before it will all get moved again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great overall! I only have a few comments about code style for consistency (between units on comment line and actually being written out), and maybe some missing using conversion.
I pulled this PR and reviewed/tested it and things seem to be functional, so I am in favor of merging it soon (after addressing the failing test and comments above). There are a few things, as we discussed last time, regarding the necessity of LAMMPS specific harmonic bonds/angles json, but I think that can be addressed in a different. Another thing would be using the centralize unit conversing units (since HOOMD and GROMACS is using same/similar functionality.
Note: we still have one failing tests at
|
gmso/formats/lammpsdata.py
Outdated
_parameter_converted_to_float( | ||
improper_type.parameters["k"], base_unyts, cfactorsDict | ||
), | ||
improper_type.parameters["chieq"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
improper_type.parameters["chieq"] | |
improper_type.parameters["phi_eq"] |
Currently, in the PeriodicImproperPotential
json, the three variables are k
(energy), n
(dimensionless) and phi_eq
(angle). I think the chieq
could be a typo. Also, why don't the n
get written out somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess, this is supposed to write out harmonic improper, in which case, we need to adjust the list in _accepted_potential
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the docs (https://docs.lammps.org/improper_style.html), looks like our PeriodicImproper
is called fourier
in lammps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah so the naming is slightly weird, that's on of the reasons I was tagging things with LAMMPS as the potential template name.
As far as I can tell:
GMSO PeriodicImproperPotential = lammps modified improper_style CVFF (not sure if there's a 1-1 conversion here
GMSO HarmonicImproperPotential = 1/2* LAMMPS improper_style harmonic
GMSO FourierTorsionPotential != LAMMPS improper_style fourier (we would probably just need a separate template for it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The equations for PeriodicImproper and and lammps improper_style fourier, are similar, but missing the phi_eq term.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I am looking back here: https://github.com/mosdef-hub/mbuild/blob/main/mbuild/formats/lammpsdata.py.
It seems like that the current mbuild version support LAMMPS improper harmonic
(coming fromstructure.impropers
) and cvff
(coming from structure.dihedrals
where dihedral.improper == True
). So I think, in GMSO, that would map GMSO harmonic
--> LAMMPS harmonic
and GMSO periodic
--> LAMMPS cvff
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to add a test for this, for some reason it's not being caught at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work, I only read through the code and made comments. I'll download and perform testing when it passes the tests.
gmso/core/topology.py
Outdated
Parameters | ||
---------- | ||
expressionMap : dict, default={} | ||
map with keys of the potential type and the potential to change to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a little confusing and only made sense after looking at the example. Maybe:
Map where the keys represent the current potential type and the corresponding values represent the desired potential type.
""" | ||
from gmso.utils.conversions import convert_topology_expressions | ||
|
||
return convert_topology_expressions(self, expressionMap) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why make a method for convert_potential_styles
instead of just using convert_topology_expressions
. Regardless of whether this method remains, the import should be at the top of the module considering that it is an in-package import and not an optional external dependency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After discussing with @daico007, all of this will get refactored so that the 3 engine writers are using similar methods. This method is analagous to what HOOMD and GROMACS Top writers are currently using locally, so that code will end up being extracted to this topology method.
gmso/formats/lammpsdata.py
Outdated
) * get_units(unit_style, "length") | ||
type_list[i].parameters["epsilon"] = float( | ||
pair.split()[1] | ||
) * get_units(unit_style, "energy") | ||
elif len(pair.split()) == 4: | ||
warnings.warn("Currently not reading in mixing rules") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think this should be added then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, there seems to be no direct way to get this from the lammpsdata file. It is usually set in the lammps simulation input file.
pair_style lj/cut rcut
pair_modify mix combining_rule
read_data data.lammps
So I think we're handcuffed here, I think this warning could be changed though to tell the user they need to manually specify the mixing rule as topology.combining_rule = "geometric" or "lorentz". The default in GMSO is lorentz.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This len(pair.split()) == 4
bit is actually referring to the pair_style lj/cut if a specific cutoff is given for the pair https://docs.lammps.org/pair_lj.html. GMSO does not keep track of rcut values as of now, as that is typically part of the input file, not the topology files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm going to raise this as an issue, to see if someone would want to make a site specific or pair potential with these parameters for them.
) | ||
) | ||
out_file.write("{:d} atoms\n".format(top.n_sites)) | ||
if atom_style in ["full", "molecular"]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How will this handle hybrid atom_styles that add bonds and angles to styles such as DPD or ellipsoid (the former from personal use and the latter is in the example directory for lammps).
Also "bond", "angle", and "amoeba" have native bond and angle capability so the lists here need to be updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also "bond", "angle", and "amoeba" have native bond and angle capability so the lists here need to be updated.
The only atom styles supported at this point are the ones we supported in the mbuild lammpswriter. So for now, we just have these from the doc string:
atom_style : str, optional, default='full'
Defines the style of atoms to be saved in a LAMMPS data file.
The following atom styles are currently supported: 'full', 'atomic', 'charge', 'molecular'
see http://lammps.sandia.gov/doc/atom_style.html for more information on atom styles.
How will this handle hybrid atom_styles that add bonds and angles to styles such as DPD or ellipsoid (the former from personal use and the latter is in the example directory for lammps).
Hybrid is going to be interesting, to say the least. According to the lammps read_data docs, hybrid will look like:
atom-ID atom-type x y z sub-style1 sub-style2
So the way to handle that would be for something in the full topology:
if len(atom_style) > 0 and atom_style[0] == "hybrid":
style_formatsList = get_styleformats(atom_style[1:], top, section_type)
sectionStr = "" # or put default info here
for style_format in style_formatsList:
sectionStr+=style_format(top)
out_file.write(sectionStr)
Something that you iterate through the topology for might look like:
if len(atom_style) > 0 and atom_style[0] == "hybrid":
style_formatsList = get_styleformats(atom_style[1:], top, section_type)
for site in top.sites:
sectionStr = ""
for style_format in style_formatsList:
sectionStr+=style_format(site)
out_file.write(style_format(site))
where atom_style isn't a string, but a list of styles you pass to the writer function.
i.e. atom_style = ["hybrid", "charge", "full"]
I think this is something that can be added as we add some of these specific styles into the writer.
def _write_site_data(out_file, top, atom_style, base_unyts, cfactorsDict): | ||
"""Write atomic positions and charges to LAMMPS file..""" | ||
out_file.write("\nAtoms\n\n") | ||
if atom_style == "atomic": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about other styles (e.g. DPD)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will be addressed in later PR's on a case by case basis.
expected_units_dim = dict( | ||
zip(["c0", "c1", "c2", "c3", "c4", "c5"], ["energy"] * 6) | ||
) | ||
base_units = u.UnitSystem("atomic", "Å", "mp", "fs", "nK", "rad") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will a user ever need to type the symbol, Å? if so, using this nonstandard character could be a concern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The string: "angstrom"
should work just as well here and provide equivalency. So we shouldn't run into the issue of being forced to use the Å to create the unyt.Unit.
…ort locations from conversions.py
…ing, added ljunitclass that will always be unitless
…on degrees regardless of unitsystem.
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this PR is ready for now. There several changes that would optimize/clean things up a bit, but those make more sense in follow up PRs per ongoing discussions.
This PR aims to refactor the current lammpswriter, which was questionably functional in its previous state and not entirely compatible with modern changes to a topology.
In terms of making sure the writer is nice and flexible. I'm pulling out some of the functions for conversions and making those available in the Topology class. That way users can make sure they know what functional forms are being used before writing, and there isn't a lot of conversions happening under the hood, which causes mistakes when people aren't sure how something is working.
See the relevant tests in test_lammpswriter.py for the unit_styles and atom_styles we aim to support. Lot's of different functional forms are technically possible, most of which are actually set in the input lammps file, not in the data file. So we don't currently have in place a lot of testing for them.
Current workflow:
PR Checklist