Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add restricted insertions attribute for moves #47

Merged
merged 25 commits into from
Apr 17, 2020

Conversation

rmatsum836
Copy link
Collaborator

@rmatsum836 rmatsum836 commented Apr 1, 2020

Based on numerous discussions with @rsdefever, restricted_insertions is a useful argument to run GCMC or GEMC simulations of slit pores.

This PR adds restricted_insert as an attribute of the moves object. This is a work in progress.

  • Add checks to ensure restricted_insertions are valid
  • Add unit tests
  • Decide on appropriate data structures for this attribute

@rmatsum836 rmatsum836 added the WIP label Apr 1, 2020
@rsdefever
Copy link
Collaborator

@rmatsum836 this looks like a great start! Let me know if I can help! I'm excited to get this feature supported.

@rsdefever rsdefever added this to the Version 0.2.0 release milestone Apr 1, 2020
@rmatsum836
Copy link
Collaborator Author

Based on the conversation we had regarding #49, I made the arguments for restricted insertions lists. The black formatting test is failing (not surprised), but all of the tests are passing locally for me. I would also like to add a few more tests to ensure this addition is correctly integrated into the current code.

@rmatsum836
Copy link
Collaborator Author

I need to rework this in order to specify restrictions by components in the box. Will try to have an update later today.

@rmatsum836
Copy link
Collaborator Author

This is still a little rough around the edges but I think it's ready for a first review. Some things I'm unsure about:

  • Is the check to validate the restricted insertions vs box sufficient in inp_functions.py?
  • Are there any additional checks that need to be done?

Copy link
Collaborator

@rsdefever rsdefever left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this looks good, and a very nice addition. A few overall comments:

Major:
(1) Can you add an example with restricted insertion to the examples directory and then add a test for the example.
(2) Create a moves object, and add a restricted_type, but don't add the restricted_value. Then try mc.print_inputfile(). You'll see it errors out when trying to write the input file. I think we should try to catch this error before then.
(3) I know you were following the "template" with the way I've handled other things associated with the moves object, but in this case would it be better to have an add_restricted_insertion or restrict_insertion function to add these restrictions? Somehow it feels more natural in my head. Interested in your opinion.

Minor:
(1) In the future, try to apply black formatting as you go (and ideally before commits), so we don't end up with a bunch of commit messages that are Apply black formatting.

len(restricted_type)))
if len(restricted_type[0]) != self._n_species:
raise ValueError("{} species specified"
' but restricted_typed for only {}'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The word "only" only makes sense if len > n_species. I'd just remove it.

if restricted_value:
if len(self.restricted_type) != len(restricted_value):
raise ValueError("'restricted_type' is a list of length"
" {} but 'restricted_value' is a list of length"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logically the comparison makes sense, but I think the error message could be a bit clearer. The underlying problem is that the user is specifying restricted values for the wrong number of boxes.

)
if not isinstance(restricted_value[0], (list, tuple)):
raise TypeError("'restricted_value' specified for"
" a box must be an iterable.")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "must be an iterable" terminology, while technically correct, is, perhaps a bit confusing and not to-the-point. Something about a list or tuple with one element per species would be more clear IMO.

mosdef_cassandra/core/moves.py Show resolved Hide resolved
run_length=500,
temperature=300.0,
pressure=1,
chemical_potentials=["none", 10.0],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to specify a chemical potential for GEMC. In fact we should add an error if someone tries for future.

run_length=500,
temperature=300.0,
pressure=1,
chemical_potentials=["none", 10.0],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Repeat about chemical potential

)

if typ == 'interface':
assert "\nrestricted_insertion {} {} {}\n".format(typ, value[0], value[1]) in inp_data
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you done a check to confirm that this (and the other sections where you are testing the input writers) actually works for Cassandra? I.e., the formatting is actually correct. It looks correct but would rather be safe than sorry.

@@ -241,7 +241,7 @@ def generate_input(system, moves, run_type, run_length, temperature, **kwargs):
box_dims = np.hstack((box.lengths, box.angles))
box_matrix = convert_box.convert_to_boxmatrix(box_dims)
boxes.append(box_matrix)
inp_data += get_box_info(boxes)
inp_data += get_box_info(boxes, moves)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have an opinion about passing the full moves object vs. just the restricted insertion details? I'm just thinking from a readability of the code standpoint it might be clearer to see exactly what part of the moves is need in writing the box info?

@@ -1054,6 +1116,7 @@ def get_move_probability_info(**kwargs):
"volume",
"insert",
"swap",
"restricted_insertion"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add docstrings?

@rsdefever rsdefever removed the WIP label Apr 13, 2020
@rmatsum836
Copy link
Collaborator Author

Thanks @rsdefever for the comments, I'll get to work on addressing them.

I know you were following the "template" with the way I've handled other things associated with the moves object, but in this case would it be better to have an add_restricted_insertion or restrict_insertion function to add these restrictions? Somehow it feels more natural in my head. Interested in your opinion.

In the comment above, are you referring to the way that restricted insertions get set in the Moves object? So would add_restricted_insertions be a method of Moves that adds restricted insertions into the object?

@rsdefever
Copy link
Collaborator

So would add_restricted_insertions be a method of Moves that adds restricted insertions into the object?

Yeah that was the thought anyway...it might be a bad idea. The interface could perhaps look something like:

moves.add_restricted_insert(species=species, box=box, restricted_type='sphere', radius=0.3)

But...this also starts to encroach on how to best handle the species and box spec (e.g., #49).

I'm also OK with leaving it as is for now and adding it to the list of things to think about. In my head, editing a move probability with something like moves.prob_swap = 0.2 makes good sense, but needing to specify the restricted_types and then separately restricted_values feels a bit more cumbersome. I'm not entirely sure what the difference is in my head? Maybe it's that the other attributes of the moves object come with default values. As much as anything I wanted to get some other feedback on these thoughts.

@rmatsum836
Copy link
Collaborator Author

I kinda like this idea. I agree it's cumbersome to add both restricted_type and restricted_value and have to check that both attributes are compatible with each other. Similar to what you had above, I was thinking something like:

moves.add_restricted_insertion(species=species, restricted_types='sphere', values=0.3)

def add_restricted_insertion(species, restricted_types, values):
    self._restricted_types = restricted_types
    self._restricted_values = values

This makes sense to me because the restricted insertions seem like a very optional argument that won't get used most of the time. Also this allows a user to pass in the type and value of restricted insertions at the same time, but the values get set as separate attributes under the hood. The attributes will also be "private" so that a user doesn't try to directly specify self._restricted_types or self._restricted_values. Does that work?

@rsdefever
Copy link
Collaborator

That all sounds good. Questions:

(1) How do we specify the species and verify that it is valid (i.e., part of the species list)
(2) Same as (1), but for the box.

If we can cleanly address those two issues, I prefer have an add_restricted_insertion method. If those present currently insurmountable challenges, I am fine with pushing ahead by forcing the user to edit both attributes (i.e., the current implementation).

@rmatsum836
Copy link
Collaborator Author

That I'm not sure about. Currently I'm trying to follow the current implementation in that if species is defined as [water, methane] and then the first index of restricted_type would correspond to water and the second would correspond to methane. We could specify and verify the species from species_topologies that exists as an attribute in the moves object.

For boxes however, I don't think moves carries box information so this would be a little bit tougher. moves does contain information for number of boxes though, so currently I'm just checking to make sure that the restricted insertions information matches up with the number of boxes.

@rmatsum836
Copy link
Collaborator Author

I went ahead and added the method add_restricted_insertions to moves. I personally like this much better than the previous implementation, mainly because it's much easier to check that the restricted_type and restricted_value passed are compatible with each other.

I think I addressed your comments, including the addition of a gcmc_restricted example. If you like the add_resricted_insertions method as well, I will go ahead and delete the old implementation in moves.py that is currently commented out.

@rsdefever
Copy link
Collaborator

I like the new interface (i.e., method-based) much more. I think it is safe to delete the commented code. It looks like you've covered most of my requested changes. Once you're comfortable I'll give it one more look-over and then LGTM. 👍

I think long run I may need to consider overhauling System and Moves somewhat. But that is a separate issue and this looks great!

@codecov-io
Copy link

Codecov Report

Merging #47 into master will decrease coverage by 0.70%.
The diff coverage is 84.42%.

@@            Coverage Diff             @@
##           master      #47      +/-   ##
==========================================
- Coverage   95.48%   94.78%   -0.71%     
==========================================
  Files          20       21       +1     
  Lines        1463     1572     +109     
==========================================
+ Hits         1397     1490      +93     
- Misses         66       82      +16     

@rmatsum836
Copy link
Collaborator Author

moves.py has been cleaned up and I added more tests in test_moves.py to make sure the ValueErrors are raised as intended. I think this is ready for a second review.

One possible source of error I noticed is that we can create a moves object with a specific ensemble, add restricted insertions, and then change the ensemble like so:

moves = mc.Moves("gcmc", [methane_oplsaa])
moves.add_restricted_insertions([methane_oplsaa], [[typ]], value)
moves.ensemble = "nvt"

This doesn't cause an error within the package, but will cause the simulation to fail. We should consider ways to prevent this from happening. Maybe prevent the user from setting the ensemble outside of initializing the moves object.

@rmatsum836
Copy link
Collaborator Author

Just realized I need to account for the restricted option in probability swap as well.

@rmatsum836
Copy link
Collaborator Author

I think the PR is ready to be reviewed, I'm happy where it's at.

Copy link
Collaborator

@rsdefever rsdefever left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few small picks, then LGTM.

(1) If I have a restricted insertion already added, then add another one, it appears that the first restricted insertion is replaced. Can we throw a warning?
(2) See comments about other errors and docstrings.

Once those are resolved I will merge! LGTM today!

Comment on lines 193 to 196
restricted_type : list
list of restricted insertion types, with one species per element.
restricted_value : list
list of restricted insertion values, with one species per element.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these could be more clear; i.e. WRT the nesting of lists. One list per box and then one list per species.

Comment on lines 208 to 221
for box in restricted_type:
if len(box) != len(species_topologies):
raise ValueError(
"Length of 'species' and "
" length of species list in 'restricted_type'"
" must match."
)
for box in restricted_value:
if len(box) != len(species_topologies):
raise ValueError(
"Length of 'species' and "
" length of species list in 'restricted_value'"
" must match."
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps these error messages can be made more verbose. If I do this:

species_list = [methane]
moves = mc.Moves('gcmc', species_list)
moves.add_restricted_insertions(species_list, ['sphere'], [5.0])

The error message I get is:

ValueError: Length of 'species' and  length of species list in 'restricted_type' must match.

I'm not sure that points me towards fixing my problem.

@rsdefever rsdefever merged commit 335a458 into MaginnGroup:master Apr 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants