Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support electronic temperature #258

Merged
merged 29 commits into from
Sep 4, 2024
Merged

Conversation

zjgemi
Copy link
Collaborator

@zjgemi zjgemi commented Aug 27, 2024

Summary by CodeRabbit

  • New Features

    • Introduced a new argument for handling electronic temperature configurations, enhancing simulation flexibility.
    • Added the capability to specify optional output paths, improving output management.
    • Expanded functionality for retrieving configurations with new temperature parameters.
    • Enhanced exploration scheduling with elemental temperature configurations.
  • Bug Fixes

    • Improved data validation processes to ensure consistency across trajectories and outputs.
  • Documentation

    • Updated documentation to reflect new parameters and functionalities across multiple modules.
  • Tests

    • Added unit tests to validate the functionality of temperature setup and related methods.

Copy link

coderabbitai bot commented Aug 27, 2024

Walkthrough

Walkthrough

The changes introduce a new argument, use_ele_temp, across various components of the dpgen2 package to manage electronic temperature settings. Modifications include enhancements to the execute method for handling temperature data, writing to a JSON file, and returning this data as part of the output. Additionally, several functions have been updated to accept new parameters for improved output handling, and new unit tests have been added to validate the functionality of these changes.

Changes

File(s) Change Summary
dpgen2/entrypoint/args.py Added the use_ele_temp argument to the input_args function for electronic temperature configuration.
dpgen2/op/run_lmp.py Modified get_output_sign to include optional_output, enhanced execute to handle ele_temp, write to a JSON file, and added get_ele_temp function for reading temperature parameters.
dpgen2/entrypoint/submit.py Introduced use_ele_temp parameter to make_lmp_naive_exploration_scheduler and updated workflow_concurrent_learning to utilize this parameter.
tests/exploration/test_traj_render_lammps.py Added unit tests for the TrajRenderLammps class to validate get_ele_temp and set_ele_temp methods.
tests/op/test_run_lmp.py Introduced tests for the get_ele_temp function to validate its behavior under different scenarios.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant InputArgs
    participant ExecuteMethod
    participant GetEleTemp
    participant System

    User->>InputArgs: Call with use_ele_temp
    InputArgs->>User: Return configured arguments
    User->>ExecuteMethod: Execute with arguments
    ExecuteMethod->>GetEleTemp: Retrieve electronic temperature
    GetEleTemp->>ExecuteMethod: Return temperature data
    ExecuteMethod->>System: Write temperature data to job.json
    ExecuteMethod->>User: Return output including optional_output
Loading

Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 2078c54 and b22a73e.

Files selected for processing (5)
  • dpgen2/entrypoint/args.py (2 hunks)
  • dpgen2/entrypoint/submit.py (2 hunks)
  • dpgen2/op/run_lmp.py (6 hunks)
  • tests/exploration/test_traj_render_lammps.py (1 hunks)
  • tests/op/test_run_lmp.py (2 hunks)
Files skipped from review as they are similar to previous changes (2)
  • dpgen2/entrypoint/args.py
  • tests/op/test_run_lmp.py
Additional context used
Ruff
tests/exploration/test_traj_render_lammps.py

10-10: .context.dpgen2 imported but unused

Remove unused import: .context.dpgen2

(F401)

Additional comments not posted (9)
tests/exploration/test_traj_render_lammps.py (3)

9-11: The existing comment is still valid. Please address the unused import as suggested in the previous review.

Tools
Ruff

10-10: .context.dpgen2 imported but unused

Remove unused import: .context.dpgen2

(F401)


18-37: LGTM!

The code changes are approved.


39-58: LGTM!

The code changes are approved.

dpgen2/op/run_lmp.py (4)

82-82: LGTM!

The code change is approved.


187-196: LGTM!

The code changes are approved.

Also applies to: 208-209


Line range hint 219-239: LGTM!

The code change is approved.


314-333: Skipping generating a similar comment as the past comment is still valid and applicable to the current code segment.

dpgen2/entrypoint/submit.py (2)

373-378: LGTM!

The code changes to introduce the use_ele_temp parameter in the make_lmp_naive_exploration_scheduler function are approved.


629-631: LGTM!

The code changes to pass the use_ele_temp configuration to the exploration workflow in the workflow_concurrent_learning function are approved.


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

Outside diff range, codebase verification and nitpick comments (2)
dpgen2/exploration/render/traj_render_lammps.py (1)

76-81: Mismatch in get_confs Function Call and Definition

The function call to get_confs in conf_selector_frame.py includes an extra parameter use_ele_temp that is not present in the function definitions found in traj_render_lammps.py and traj_render.py. This discrepancy needs to be resolved to maintain consistency in the codebase.

  • File: dpgen2/exploration/selector/conf_selector_frame.py
  • Issue: Extra parameter use_ele_temp in function call.
Analysis chain

LGTM! But verify the function usage in the codebase.

The new logic for handling use_ele_temp and ele_temp is correctly integrated. However, ensure that all function calls to get_confs match the new signature.

The code changes are approved.

Run the following script to verify the function usage:

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all function calls to `get_confs` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type python -A 5 $'get_confs'

Length of output: 64


Script:

#!/bin/bash
# Description: Verify all function calls to `get_confs` match the new signature.

# Test: Search for the function usage. Expect: Only occurrences of the new signature.
rg --type py -A 5 'get_confs'

Length of output: 1440

dpgen2/op/run_lmp.py (1)

Line range hint 208-228: Add exception handling for file operations.

Ensure that potential exceptions during file operations are caught and handled gracefully.

ele_temp = get_ele_temp(lmp_input_name)
if ele_temp is not None:
-    with open("job.json", "w") as f:
-        json.dump(data, f, indent=4)
+    try:
+        with open("job.json", "w") as f:
+            json.dump(data, f, indent=4)
+    except OSError as e:
+        logging.error(f"Failed to write job.json: {e}")
+        raise

dpgen2/utils/setup_ele_temp.py Show resolved Hide resolved
Comment on lines +37 to +39
from .setup_ele_temp import (
setup_ele_temp,
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Address unused import.

The import statement for setup_ele_temp is currently unused. Consider removing it, adding it to __all__, or using a redundant alias.

Apply this diff to address the unused import:

-from .setup_ele_temp import (
-    setup_ele_temp,
-)
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
from .setup_ele_temp import (
setup_ele_temp,
)
Tools
Ruff

38-38: .setup_ele_temp.setup_ele_temp imported but unused; consider removing, adding to __all__, or using a redundant alias

(F401)

@@ -99,18 +104,40 @@ def execute(
def validate_trajs(
trajs,
model_devis,
optional_outputs=None,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplify nested if statements.

The nested if statements can be combined to improve readability.

Apply this diff to simplify the nested if statements:

            elif tt is None and mm is None:
                if optional_outputs and oo is not None:
                    raise FatalError(
                        f"trajs frame is {tt} while optional_outputs frame is {oo}"
                    )

Also applies to: 114-143

Comment on lines +118 to +121
if len(list(mm.rglob("fparam.npy"))) > 0:
setup_ele_temp(False)
if len(list(mm.rglob("aparam.npy"))) > 0:
setup_ele_temp(True)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplify the logic by combining checks.

The logic for checking fparam.npy and aparam.npy files can be simplified by combining the checks.

Apply this diff to simplify the logic:

        if len(list(mm.rglob("fparam.npy"))) > 0 or len(list(mm.rglob("aparam.npy"))) > 0:
            setup_ele_temp(len(list(mm.rglob("aparam.npy"))) > 0)
            ms = dpdata.MultiSystems(type_map=type_map)
            for dtype in dpdata.LabeledSystem.DTYPES:
                if dtype.name in ["energies", "forces"]:
                    dtype.required = False
            ms.from_deepmd_npy(mm, labeled=True)  # type: ignore
        else:
            ms.from_deepmd_npy(mm, labeled=False)  # type: ignore

Also applies to: 123-133

Comment on lines 62 to 81
def clean_lines(string_list, remove_empty_lines=True):
"""Strips whitespace, carriage returns and empty lines from a list of strings.

Args:
string_list: List of strings
remove_empty_lines: Set to True to skip lines which are empty after
stripping.

Returns:
List of clean strings with no whitespaces.
"""
for s in string_list:
clean_s = s
if "#" in s:
ind = s.index("#")
clean_s = s[:ind]
clean_s = clean_s.strip()
if (not remove_empty_lines) or clean_s != "":
yield clean_s

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optimize the clean_lines function.

Consider using a list comprehension instead of a generator for better readability and performance.

-def clean_lines(string_list, remove_empty_lines=True):
-    for s in string_list:
-        clean_s = s
-        if "#" in s:
-            ind = s.index("#")
-            clean_s = s[:ind]
-        clean_s = clean_s.strip()
-        if (not remove_empty_lines) or clean_s != "":
-            yield clean_s
+def clean_lines(string_list, remove_empty_lines=True):
+    return [
+        s[:s.index("#")].strip() if "#" in s else s.strip()
+        for s in string_list
+        if (not remove_empty_lines) or s.strip() != ""
+    ]
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def clean_lines(string_list, remove_empty_lines=True):
"""Strips whitespace, carriage returns and empty lines from a list of strings.
Args:
string_list: List of strings
remove_empty_lines: Set to True to skip lines which are empty after
stripping.
Returns:
List of clean strings with no whitespaces.
"""
for s in string_list:
clean_s = s
if "#" in s:
ind = s.index("#")
clean_s = s[:ind]
clean_s = clean_s.strip()
if (not remove_empty_lines) or clean_s != "":
yield clean_s
def clean_lines(string_list, remove_empty_lines=True):
return [
s[:s.index("#")].strip() if "#" in s else s.strip()
for s in string_list
if (not remove_empty_lines) or s.strip() != ""
]

Comment on lines 204 to 215
if os.path.exists("job.json"):
with open("job.json", "r") as f:
data = json.load(f)
if "use_ele_temp" in data and "ele_temp" in data:
if data["use_ele_temp"] == 1:
setup_ele_temp(False)
sys.data["fparam"] = np.tile(data["ele_temp"], [1, 1])
elif data["use_ele_temp"] == 2:
setup_ele_temp(True)
sys.data["fparam"] = np.tile(
data["ele_temp"], [1, sys.get_natoms(), 1]
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add exception handling for file operations.

Ensure that potential exceptions during file operations are caught and handled gracefully.

 if os.path.exists("job.json"):
-    with open("job.json", "r") as f:
-        data = json.load(f)
+    try:
+        with open("job.json", "r") as f:
+            data = json.load(f)
+    except (OSError, json.JSONDecodeError) as e:
+        logging.error(f"Failed to read job.json: {e}")
+        raise
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if os.path.exists("job.json"):
with open("job.json", "r") as f:
data = json.load(f)
if "use_ele_temp" in data and "ele_temp" in data:
if data["use_ele_temp"] == 1:
setup_ele_temp(False)
sys.data["fparam"] = np.tile(data["ele_temp"], [1, 1])
elif data["use_ele_temp"] == 2:
setup_ele_temp(True)
sys.data["fparam"] = np.tile(
data["ele_temp"], [1, sys.get_natoms(), 1]
)
if os.path.exists("job.json"):
try:
with open("job.json", "r") as f:
data = json.load(f)
except (OSError, json.JSONDecodeError) as e:
logging.error(f"Failed to read job.json: {e}")
raise
if "use_ele_temp" in data and "ele_temp" in data:
if data["use_ele_temp"] == 1:
setup_ele_temp(False)
sys.data["fparam"] = np.tile(data["ele_temp"], [1, 1])
elif data["use_ele_temp"] == 2:
setup_ele_temp(True)
sys.data["fparam"] = np.tile(
data["ele_temp"], [1, sys.get_natoms(), 1]
)

Comment on lines 102 to 133
use_ele_temp = 0
ele_temp = None
if "fparam" in conf_frame.data:
use_ele_temp = 1
ele_temp = conf_frame.data["fparam"][0][0]
if "aparam" in conf_frame.data:
use_ele_temp = 2
ele_temp = conf_frame.data["aparam"][0][0][0]
if ele_temp:
lines = list(clean_lines(incar.splitlines()))
params = {}
for line in lines:
for sline in line.split(";"):
m = re.match(r"(\w+)\s*=\s*(.*)", sline.strip())
if m:
key = m.group(1).strip()
val = m.group(2).strip()
params[key.upper()] = val
params["ISMEAR"] = -1
import scipy.constants as pc

params["SIGMA"] = ele_temp * pc.Boltzmann / pc.electron_volt
incar = (
"\n".join([key + " = " + str(val) for key, val in params.items()])
+ "\n"
)
data = {
"use_ele_temp": use_ele_temp,
"ele_temp": ele_temp,
}
with open("job.json", "w") as f:
json.dump(data, f, indent=4)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add validation checks for ele_temp.

Ensure that ele_temp is a valid numerical value before using it in calculations and writing it to the JSON file.

 if ele_temp:
+    if not isinstance(ele_temp, (int, float)):
+        raise ValueError("ele_temp must be a numerical value")
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
use_ele_temp = 0
ele_temp = None
if "fparam" in conf_frame.data:
use_ele_temp = 1
ele_temp = conf_frame.data["fparam"][0][0]
if "aparam" in conf_frame.data:
use_ele_temp = 2
ele_temp = conf_frame.data["aparam"][0][0][0]
if ele_temp:
lines = list(clean_lines(incar.splitlines()))
params = {}
for line in lines:
for sline in line.split(";"):
m = re.match(r"(\w+)\s*=\s*(.*)", sline.strip())
if m:
key = m.group(1).strip()
val = m.group(2).strip()
params[key.upper()] = val
params["ISMEAR"] = -1
import scipy.constants as pc
params["SIGMA"] = ele_temp * pc.Boltzmann / pc.electron_volt
incar = (
"\n".join([key + " = " + str(val) for key, val in params.items()])
+ "\n"
)
data = {
"use_ele_temp": use_ele_temp,
"ele_temp": ele_temp,
}
with open("job.json", "w") as f:
json.dump(data, f, indent=4)
use_ele_temp = 0
ele_temp = None
if "fparam" in conf_frame.data:
use_ele_temp = 1
ele_temp = conf_frame.data["fparam"][0][0]
if "aparam" in conf_frame.data:
use_ele_temp = 2
ele_temp = conf_frame.data["aparam"][0][0][0]
if ele_temp:
if not isinstance(ele_temp, (int, float)):
raise ValueError("ele_temp must be a numerical value")
lines = list(clean_lines(incar.splitlines()))
params = {}
for line in lines:
for sline in line.split(";"):
m = re.match(r"(\w+)\s*=\s*(.*)", sline.strip())
if m:
key = m.group(1).strip()
val = m.group(2).strip()
params[key.upper()] = val
params["ISMEAR"] = -1
import scipy.constants as pc
params["SIGMA"] = ele_temp * pc.Boltzmann / pc.electron_volt
incar = (
"\n".join([key + " = " + str(val) for key, val in params.items()])
+ "\n"
)
data = {
"use_ele_temp": use_ele_temp,
"ele_temp": ele_temp,
}
with open("job.json", "w") as f:
json.dump(data, f, indent=4)

Comment on lines +269 to +271
"optional_outputs": prep_run_explore.outputs.artifacts["optional_outputs"]
if "optional_outputs" in prep_run_explore.outputs.artifacts
else None,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use get method for better readability.

Replace the if block with the get method to improve code readability.

"optional_outputs": prep_run_explore.outputs.artifacts.get("optional_outputs", None)
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"optional_outputs": prep_run_explore.outputs.artifacts["optional_outputs"]
if "optional_outputs" in prep_run_explore.outputs.artifacts
else None,
"optional_outputs": prep_run_explore.outputs.artifacts.get("optional_outputs", None)
Tools
Ruff

269-271: Use prep_run_explore.outputs.artifacts.get("optional_outputs", None) instead of an if block

Replace with prep_run_explore.outputs.artifacts.get("optional_outputs", None)

(SIM401)

Comment on lines 326 to 343
def get_ele_temp(lmp_input_name):
with open(lmp_input_name, encoding="utf8") as f:
lmp_input_lines = f.readlines()

idx = find_only_one_key(
lmp_input_lines, ["pair_style", "deepmd"], raise_not_found=False
)
if idx is None:
return
fields = lmp_input_lines[idx].split()

if "fparam" in fields:
return fields[fields.index("fparam") + 1]

if "aparam" in fields:
return fields[fields.index("aparam") + 1]

return None
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optimize the get_ele_temp function.

Consider using a single return statement for better readability and performance.

def get_ele_temp(lmp_input_name):
    with open(lmp_input_name, encoding="utf8") as f:
        lmp_input_lines = f.readlines()

    idx = find_only_one_key(
        lmp_input_lines, ["pair_style", "deepmd"], raise_not_found=False
    )
    if idx is None:
        return None

    fields = lmp_input_lines[idx].split()

    if "fparam" in fields:
        return fields[fields.index("fparam") + 1]

    if "aparam" in fields:
        return fields[fields.index("aparam") + 1]

    return None

Committable suggestion was skipped due to low confidence.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

trajs = ["foo", "bar"]
model_devis = ["zar", "par"]
optional_outputs = ["dar", None]
with self.assertRaises(FatalError) as context:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove unused variable context.

The variable context is assigned but never used.

Apply this diff to remove the unused variable:

- with self.assertRaises(FatalError) as context:
+ with self.assertRaises(FatalError):
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
with self.assertRaises(FatalError) as context:
with self.assertRaises(FatalError):
Tools
Ruff

164-164: Local variable context is assigned to but never used

Remove assignment to unused variable context

(F841)

Copy link

codecov bot commented Aug 27, 2024

Codecov Report

Attention: Patch coverage is 82.20859% with 29 lines in your changes missing coverage. Please review.

Project coverage is 83.70%. Comparing base (8fb287e) to head (b22a73e).
Report is 14 commits behind head on master.

Files with missing lines Patch % Lines
dpgen2/op/run_lmp.py 59.25% 11 Missing ⚠️
dpgen2/fp/vasp.py 88.70% 7 Missing ⚠️
dpgen2/exploration/render/traj_render_lammps.py 85.71% 4 Missing ⚠️
dpgen2/fp/prep_fp.py 50.00% 2 Missing ⚠️
dpgen2/op/collect_data.py 60.00% 2 Missing ⚠️
dpgen2/op/select_confs.py 90.47% 2 Missing ⚠️
dpgen2/entrypoint/submit.py 75.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #258      +/-   ##
==========================================
- Coverage   83.75%   83.70%   -0.06%     
==========================================
  Files         103      104       +1     
  Lines        5805     5958     +153     
==========================================
+ Hits         4862     4987     +125     
- Misses        943      971      +28     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Comment on lines 98 to 105
use_ele_temp = 0
for ii in labeled_data:
if len(list(ii.rglob("fparam.npy"))) > 0:
setup_ele_temp(False)
use_ele_temp = 1
if len(list(ii.rglob("aparam.npy"))) > 0:
setup_ele_temp(True)
use_ele_temp = 2
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using elif for efficiency.

The logic is sound, but using elif for the second condition can make the code more efficient since both files are unlikely to be present simultaneously.

 if len(list(ii.rglob("fparam.npy"))) > 0:
     setup_ele_temp(False)
     use_ele_temp = 1
- if len(list(ii.rglob("aparam.npy"))) > 0:
+ elif len(list(ii.rglob("aparam.npy"))) > 0:
     setup_ele_temp(True)
     use_ele_temp = 2
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
use_ele_temp = 0
for ii in labeled_data:
if len(list(ii.rglob("fparam.npy"))) > 0:
setup_ele_temp(False)
use_ele_temp = 1
if len(list(ii.rglob("aparam.npy"))) > 0:
setup_ele_temp(True)
use_ele_temp = 2
use_ele_temp = 0
for ii in labeled_data:
if len(list(ii.rglob("fparam.npy"))) > 0:
setup_ele_temp(False)
use_ele_temp = 1
elif len(list(ii.rglob("aparam.npy"))) > 0:
setup_ele_temp(True)
use_ele_temp = 2

Comment on lines 114 to 141
if use_ele_temp:
# Work around dpdata does not support custom dtype for mixed-type
fmtobj = dpdata.system.load_format("deepmd/npy/mixed")
mixed_systems = fmtobj.mix_system(
*list(ms.systems.values()), type_map=ms.atom_names
)
for fn in mixed_systems:
folder = os.path.join(name, fn)
nframes = mixed_systems[fn].get_nframes()
set_size = 2000

nsets = nframes // set_size
if set_size * nsets < nframes:
nsets += 1
for ii in range(nsets):
set_stt = ii * set_size
set_end = (ii + 1) * set_size
set_folder = os.path.join(folder, "set.%06d" % ii)
if use_ele_temp == 1:
np.save(
os.path.join(set_folder, "fparam"),
mixed_systems[fn].data["fparam"][set_stt:set_end],
)
elif use_ele_temp == 2:
np.save(
os.path.join(set_folder, "aparam"),
mixed_systems[fn].data["aparam"][set_stt:set_end],
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider making set_size configurable and refactoring nested loops.

  1. The set_size value is hardcoded, which might not be optimal for all use cases. Consider making it a configurable parameter.
  2. The nested loops and conditions can be refactored for better readability.

Example refactor:

 set_size = ip["optional_parameter"].get("set_size", 2000)

 for ii in range((nframes + set_size - 1) // set_size):
     set_stt = ii * set_size
     set_end = min((ii + 1) * set_size, nframes)
     set_folder = os.path.join(folder, "set.%06d" % ii)
     param_key = "fparam" if use_ele_temp == 1 else "aparam"
     np.save(
         os.path.join(set_folder, param_key),
         mixed_systems[fn].data[param_key][set_stt:set_end],
     )
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if use_ele_temp:
# Work around dpdata does not support custom dtype for mixed-type
fmtobj = dpdata.system.load_format("deepmd/npy/mixed")
mixed_systems = fmtobj.mix_system(
*list(ms.systems.values()), type_map=ms.atom_names
)
for fn in mixed_systems:
folder = os.path.join(name, fn)
nframes = mixed_systems[fn].get_nframes()
set_size = 2000
nsets = nframes // set_size
if set_size * nsets < nframes:
nsets += 1
for ii in range(nsets):
set_stt = ii * set_size
set_end = (ii + 1) * set_size
set_folder = os.path.join(folder, "set.%06d" % ii)
if use_ele_temp == 1:
np.save(
os.path.join(set_folder, "fparam"),
mixed_systems[fn].data["fparam"][set_stt:set_end],
)
elif use_ele_temp == 2:
np.save(
os.path.join(set_folder, "aparam"),
mixed_systems[fn].data["aparam"][set_stt:set_end],
)
if use_ele_temp:
# Work around dpdata does not support custom dtype for mixed-type
fmtobj = dpdata.system.load_format("deepmd/npy/mixed")
mixed_systems = fmtobj.mix_system(
*list(ms.systems.values()), type_map=ms.atom_names
)
for fn in mixed_systems:
folder = os.path.join(name, fn)
nframes = mixed_systems[fn].get_nframes()
set_size = ip["optional_parameter"].get("set_size", 2000)
for ii in range((nframes + set_size - 1) // set_size):
set_stt = ii * set_size
set_end = min((ii + 1) * set_size, nframes)
set_folder = os.path.join(folder, "set.%06d" % ii)
param_key = "fparam" if use_ele_temp == 1 else "aparam"
np.save(
os.path.join(set_folder, param_key),
mixed_systems[fn].data[param_key][set_stt:set_end],
)

dpgen2/exploration/selector/conf_selector_frame.py Outdated Show resolved Hide resolved
@@ -63,6 +63,8 @@ def get_confs(
id_selected: List[List[int]],
type_map: Optional[List[str]] = None,
conf_filters: Optional["ConfFilters"] = None,
use_ele_temp: int = 0,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

provide as class attribute?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if I understand it. Set use_ele_temp as class attribute and update it in __init__ function?

dpgen2/exploration/render/traj_render_lammps.py Outdated Show resolved Hide resolved
dpgen2/fp/vasp.py Show resolved Hide resolved
dpgen2/op/collect_data.py Outdated Show resolved Hide resolved
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Comment on lines 73 to 106
optional_outputs: Optional[List[Path]] = None,
) -> dpdata.MultiSystems:
ele_temp = None
if optional_outputs:
assert ntraj == len(optional_outputs)
ele_temp = []
for ii in range(ntraj):
with open(optional_outputs[ii], "r") as f:
data = json.load(f)
if self.use_ele_temp:
ele_temp.append(data["ele_temp"])
if self.use_ele_temp:
if self.use_ele_temp == 1:
setup_ele_temp(False)
elif self.use_ele_temp == 2:
setup_ele_temp(True)
else:
raise ValueError(
"Invalid value for 'use_ele_temp': %s", self.use_ele_temp
)

ntraj = len(trajs)
traj_fmt = "lammps/dump"
ms = dpdata.MultiSystems(type_map=type_map)
for ii in range(ntraj):
if len(id_selected[ii]) > 0:
ss = dpdata.System(trajs[ii], fmt=traj_fmt, type_map=type_map)
ss.nopbc = self.nopbc
if self.use_ele_temp == 1 and ele_temp:
ss.data["fparam"] = np.tile(ele_temp[ii], [len(ss), 1])
elif self.use_ele_temp == 2 and ele_temp:
ss.data["aparam"] = np.tile(
ele_temp[ii], [len(ss), ss.get_natoms(), 1]
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review the implementation of get_confs method with new parameters.

The method now includes handling for use_ele_temp and ele_temp. The logic for setting up electronic temperature parameters based on the value of use_ele_temp is integrated well within the existing structure. However, there are issues with the use of the variable ntraj which is referenced before being defined.

To resolve the undefined variable issue, define ntraj at the beginning of the method. Here's the proposed fix:

 def get_confs(
     self,
     trajs: List[Path],
     id_selected: List[List[int]],
     type_map: Optional[List[str]] = None,
     conf_filters: Optional["ConfFilters"] = None,
     optional_outputs: Optional[List[Path]] = None,
 ) -> dpdata.MultiSystems:
+    ntraj = len(trajs)  # Define ntraj at the beginning of the method
     ele_temp = None
     if optional_outputs:
         assert ntraj == len(optional_outputs)
         ele_temp = []
         for ii in range(ntraj):
             with open(optional_outputs[ii], "r") as f:
                 data = json.load(f)
             if self.use_ele_temp:
                 ele_temp.append(data["ele_temp"])
         if self.use_ele_temp:
             if self.use_ele_temp == 1:
                 setup_ele_temp(False)
             elif self.use_ele_temp == 2:
                 setup_ele_temp(True)
             else:
                 raise ValueError(
                     "Invalid value for 'use_ele_temp': %s", self.use_ele_temp
                 )

Additionally, ensure that unit tests are implemented to cover these new functionalities, especially the different branches of use_ele_temp.

Would you like me to help draft the unit tests for these new functionalities?

Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
optional_outputs: Optional[List[Path]] = None,
) -> dpdata.MultiSystems:
ele_temp = None
if optional_outputs:
assert ntraj == len(optional_outputs)
ele_temp = []
for ii in range(ntraj):
with open(optional_outputs[ii], "r") as f:
data = json.load(f)
if self.use_ele_temp:
ele_temp.append(data["ele_temp"])
if self.use_ele_temp:
if self.use_ele_temp == 1:
setup_ele_temp(False)
elif self.use_ele_temp == 2:
setup_ele_temp(True)
else:
raise ValueError(
"Invalid value for 'use_ele_temp': %s", self.use_ele_temp
)
ntraj = len(trajs)
traj_fmt = "lammps/dump"
ms = dpdata.MultiSystems(type_map=type_map)
for ii in range(ntraj):
if len(id_selected[ii]) > 0:
ss = dpdata.System(trajs[ii], fmt=traj_fmt, type_map=type_map)
ss.nopbc = self.nopbc
if self.use_ele_temp == 1 and ele_temp:
ss.data["fparam"] = np.tile(ele_temp[ii], [len(ss), 1])
elif self.use_ele_temp == 2 and ele_temp:
ss.data["aparam"] = np.tile(
ele_temp[ii], [len(ss), ss.get_natoms(), 1]
)
def get_confs(
self,
trajs: List[Path],
id_selected: List[List[int]],
type_map: Optional[List[str]] = None,
conf_filters: Optional["ConfFilters"] = None,
optional_outputs: Optional[List[Path]] = None,
) -> dpdata.MultiSystems:
ntraj = len(trajs) # Define ntraj at the beginning of the method
ele_temp = None
if optional_outputs:
assert ntraj == len(optional_outputs)
ele_temp = []
for ii in range(ntraj):
with open(optional_outputs[ii], "r") as f:
data = json.load(f)
if self.use_ele_temp:
ele_temp.append(data["ele_temp"])
if self.use_ele_temp:
if self.use_ele_temp == 1:
setup_ele_temp(False)
elif self.use_ele_temp == 2:
setup_ele_temp(True)
else:
raise ValueError(
"Invalid value for 'use_ele_temp': %s", self.use_ele_temp
)
ntraj = len(trajs)
traj_fmt = "lammps/dump"
ms = dpdata.MultiSystems(type_map=type_map)
for ii in range(ntraj):
if len(id_selected[ii]) > 0:
ss = dpdata.System(trajs[ii], fmt=traj_fmt, type_map=type_map)
ss.nopbc = self.nopbc
if self.use_ele_temp == 1 and ele_temp:
ss.data["fparam"] = np.tile(ele_temp[ii], [len(ss), 1])
elif self.use_ele_temp == 2 and ele_temp:
ss.data["aparam"] = np.tile(
ele_temp[ii], [len(ss), ss.get_natoms(), 1]
)
Tools
Ruff

77-77: Undefined name ntraj

(F821)


79-79: Undefined name ntraj

(F821)

@wanghan-iapcm
Copy link

please ref to the following prs of the dpdata repo, which fix the issue of customized dtype
deepmodeling/dpdata#702
deepmodeling/dpdata#704

Signed-off-by: zjgemi <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Outside diff range, codebase verification and nitpick comments (3)
tests/utils/test_ele_temp.py (1)

1-44: Overall structure is good, but test coverage could be improved.

The file provides a basic test structure for the setup_ele_temp function. However, to ensure robust testing:

  1. Consider adding more test cases to cover different scenarios and edge cases.
  2. Add docstrings to the test class and methods to explain their purpose.
  3. Consider using parameterized tests to test multiple input combinations efficiently.

To improve the overall test architecture, consider:

  1. Creating a base test class with common setup and teardown methods.
  2. Using fixtures (if using pytest) or setUp methods to create common test data.
  3. Organizing tests into logical groups or separate files as the test suite grows.
Tools
Ruff

11-11: .context.dpgen2 imported but unused

Remove unused import: .context.dpgen2

(F401)

dpgen2/exploration/selector/conf_selector_frame.py (1)

73-74: Improve docstring for optional_outputs parameter.

The current docstring for optional_outputs is too vague.

Consider providing more specific information about what these optional outputs are and how they are used. For example:

optional_outputs : Optional[List[Path]]
    A list of paths to optional output files generated during the exploration.
    These may include additional data files that can influence the configuration selection process.
tests/fp/test_vasp.py (1)

21-22: New imports look good.

The addition of dumps_incar and loads_incar imports from dpgen2.fp.vasp is appropriate for the new test cases.

Consider updating the module's documentation to reflect that these functions are now part of the public API.

Comment on lines 21 to 40
def test_setup_ele_temp(self):
system = dpdata.LabeledSystem(
data={
"atom_names": ["H"],
"atom_numbs": [1],
"atom_types": np.zeros(1, dtype=int),
"cells": np.eye(3).reshape(1, 3, 3),
"coords": np.zeros((1, 1, 3)),
"energies": np.zeros(1),
"forces": np.zeros((1, 1, 3)),
"orig": np.zeros(3),
"nopbc": True,
}
)
setup_ele_temp(False)
system.data["fparam"] = np.array([[1.0]])
system.to_deepmd_npy("ele_temp_data")
self.assertEqual(len(glob.glob("ele_temp_data/*/fparam.npy")), 1)
new_system = dpdata.LabeledSystem("ele_temp_data", fmt="deepmd/npy")
self.assertTrue("fparam" in new_system.data)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enhance test coverage and clarity for setup_ele_temp.

While the test method provides some coverage, there are a few areas that could be improved:

  1. The purpose of setup_ele_temp(False) is not clear. Consider adding a comment explaining its expected effect.
  2. The test doesn't seem to directly test the effects of setup_ele_temp. Consider adding assertions that specifically check the changes made by this function.
  3. The assertion only checks if 'fparam' exists in the new system, not its value. Consider adding an assertion to verify the value of 'fparam'.

Here's a suggested improvement to the test method:

def test_setup_ele_temp(self):
    system = dpdata.LabeledSystem(
        data={
            "atom_names": ["H"],
            "atom_numbs": [1],
            "atom_types": np.zeros(1, dtype=int),
            "cells": np.eye(3).reshape(1, 3, 3),
            "coords": np.zeros((1, 1, 3)),
            "energies": np.zeros(1),
            "forces": np.zeros((1, 1, 3)),
            "orig": np.zeros(3),
            "nopbc": True,
        }
    )
    
    # Test with use_ele_temp=False
    setup_ele_temp(False)
    self.assertNotIn("fparam", system.data, "fparam should not be present when use_ele_temp is False")
    
    # Test with use_ele_temp=True
    setup_ele_temp(True)
    self.assertIn("fparam", system.data, "fparam should be present when use_ele_temp is True")
    self.assertEqual(system.data["fparam"].shape, (1, 1), "fparam should have shape (1, 1)")
    
    # Test writing and reading
    system.to_deepmd_npy("ele_temp_data")
    self.assertEqual(len(glob.glob("ele_temp_data/*/fparam.npy")), 1)
    new_system = dpdata.LabeledSystem("ele_temp_data", fmt="deepmd/npy")
    self.assertIn("fparam", new_system.data)
    np.testing.assert_array_equal(system.data["fparam"], new_system.data["fparam"], "fparam values should be preserved")

This improved version tests both True and False cases for setup_ele_temp, checks the shape of fparam, and verifies that the values are preserved when writing and reading the data.

Comment on lines +9 to +17
# isort: off
from .context import (
dpgen2,
)
from dpgen2.utils import (
setup_ele_temp,
)

# isort: on
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove unused import dpgen2.

The static analysis tool has flagged that dpgen2 is imported but unused. Consider removing this import if it's not needed elsewhere in the file.

Apply this diff to remove the unused import:

 # isort: off
 from .context import (
-    dpgen2,
 )
 from dpgen2.utils import (
     setup_ele_temp,
 )
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# isort: off
from .context import (
dpgen2,
)
from dpgen2.utils import (
setup_ele_temp,
)
# isort: on
# isort: off
from .context import (
)
from dpgen2.utils import (
setup_ele_temp,
)
# isort: on
Tools
Ruff

11-11: .context.dpgen2 imported but unused

Remove unused import: .context.dpgen2

(F401)

Comment on lines 126 to 160
class TestIncar(unittest.TestCase):
def test_loads_dumps_incar(self):
incar = """
PREC=A
ENCUT=600
ISYM=0
ALGO=fast
EDIFF=1.000000e-06
LREAL=A
NPAR=1
KPAR=1

NELMIN=4
ISIF=2
ISMEAR=1
SIGMA=1.000000
IBRION=-1

NSW=0

LWAVE=F
LCHARG=F
PSTRESS=0

KSPACING=0.160000
KGAMMA=.FALSE."""
params = loads_incar(incar)
self.assertEqual(len(params), 19)
self.assertEqual(params["PREC"], "A")
self.assertEqual(params["EDIFF"], "1.000000e-06")
self.assertEqual(params["IBRION"], "-1")
self.assertEqual(params["KGAMMA"], ".FALSE.")
new_incar = dumps_incar(params)
new_params = loads_incar(new_incar)
self.assertEqual(params, new_params)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TestIncar class is a good addition, but could be improved.

The TestIncar class and its test_loads_dumps_incar method provide good coverage for the new INCAR handling functionality. However, consider the following improvements:

  1. Add multiple test cases with simpler INCAR strings to cover different scenarios.
  2. Instead of asserting the exact number of parameters, check for the presence of all expected keys.
  3. Verify all parameters in the parsed dictionary, not just a few.
  4. Add test cases for INCAR files with comments and varying whitespace.

Here's a suggested refactor for more robust testing:

class TestIncar(unittest.TestCase):
    def test_loads_dumps_incar(self):
        incar_strings = [
            """
            PREC=A
            ENCUT=600
            ISYM=0
            """,
            """
            # This is a comment
            ALGO=fast
            EDIFF=1.000000e-06
            LREAL = A  # Inline comment
            """,
            # Add more test cases here
        ]
        
        for incar in incar_strings:
            params = loads_incar(incar)
            
            # Check if all expected keys are present
            expected_keys = [line.split('=')[0].strip() for line in incar.strip().split('\n') if '=' in line and not line.strip().startswith('#')]
            self.assertTrue(all(key in params for key in expected_keys))
            
            # Verify all parameters
            for key, value in params.items():
                self.assertIn(f"{key}={value}", incar.replace(" ", ""))
            
            # Test round-trip
            new_incar = dumps_incar(params)
            new_params = loads_incar(new_incar)
            self.assertEqual(params, new_params)

    def test_incar_with_comments(self):
        incar_with_comments = """
        # This is a comment
        PREC=A
        ENCUT=600  # Inline comment
        """
        params = loads_incar(incar_with_comments)
        self.assertEqual(len(params), 2)
        self.assertEqual(params["PREC"], "A")
        self.assertEqual(params["ENCUT"], "600")

This refactored version provides more comprehensive testing and is more resilient to changes in the INCAR format.

Comment on lines 119 to 141
incar = vasp_inputs.incar_template

use_ele_temp = 0
ele_temp = None
if "fparam" in conf_frame.data:
use_ele_temp = 1
ele_temp = conf_frame.data["fparam"][0][0]
if "aparam" in conf_frame.data:
use_ele_temp = 2
ele_temp = conf_frame.data["aparam"][0][0][0]
if ele_temp:
import scipy.constants as pc

params = loads_incar(incar)
params["ISMEAR"] = -1
params["SIGMA"] = ele_temp * pc.Boltzmann / pc.electron_volt
incar = dumps_incar(params)
data = {
"use_ele_temp": use_ele_temp,
"ele_temp": ele_temp,
}
with open("job.json", "w") as f:
json.dump(data, f, indent=4)
Copy link

@coderabbitai coderabbitai bot Sep 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improve robustness and readability of electronic temperature handling.

  1. Add validation for ele_temp value.

  2. Implement error handling for file operations.

  3. Use constants instead of magic numbers for use_ele_temp.

  4. Add validation for ele_temp:

if ele_temp is not None and not isinstance(ele_temp, (int, float)):
    raise ValueError("ele_temp must be a numerical value")
  1. Implement error handling for file operations:
try:
    with open("job.json", "w") as f:
        json.dump(data, f, indent=4)
except IOError as e:
    logging.error(f"Failed to write job.json: {e}")
    raise
  1. Use constants for use_ele_temp:
NO_ELE_TEMP = 0
FPARAM_ELE_TEMP = 1
APARAM_ELE_TEMP = 2

use_ele_temp = NO_ELE_TEMP
if "fparam" in conf_frame.data:
    use_ele_temp = FPARAM_ELE_TEMP
    ele_temp = conf_frame.data["fparam"][0][0]
elif "aparam" in conf_frame.data:
    use_ele_temp = APARAM_ELE_TEMP
    ele_temp = conf_frame.data["aparam"][0][0][0]

Consider extracting the electronic temperature handling logic into a separate method for better modularity and testability.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please write a UT for lines 123-141

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wanghan-iapcm, I've written a unit test for the lines 123-141 in the dpgen2/fp/vasp.py file. The test covers different scenarios based on the presence of fparam and aparam in conf_frame.data, and verifies the contents of the INCAR file and job.json file.

Here is the unit test:

import unittest
from unittest.mock import MagicMock, patch
from dpgen2.fp.vasp import PrepVasp, loads_incar, dumps_incar
import json
import os

class TestPrepVasp(unittest.TestCase):

    @patch("dpgen2.fp.vasp.Path.write_text")
    @patch("dpgen2.fp.vasp.open", new_callable=unittest.mock.mock_open)
    def test_prep_task_with_fparam(self, mock_open, mock_write_text):
        # Mock conf_frame with fparam
        conf_frame = MagicMock()
        conf_frame.data = {"fparam": [[[300.0]]]}

        # Mock vasp_inputs
        vasp_inputs = MagicMock()
        vasp_inputs.incar_template = "PREC = Normal\nEDIFF = 1E-4"

        # Create an instance of PrepVasp and call prep_task
        prep_vasp = PrepVasp()
        prep_vasp.prep_task(conf_frame, vasp_inputs)

        # Verify INCAR modifications
        incar_content = dumps_incar(loads_incar(vasp_inputs.incar_template))
        incar_content += "ISMEAR = -1\nSIGMA = 0.025851991\n"
        mock_write_text.assert_called_with(incar_content)

        # Verify job.json content
        mock_open().write.assert_called_once_with(json.dumps({
            "use_ele_temp": 1,
            "ele_temp": 300.0
        }, indent=4))

    @patch("dpgen2.fp.vasp.Path.write_text")
    @patch("dpgen2.fp.vasp.open", new_callable=unittest.mock.mock_open)
    def test_prep_task_with_aparam(self, mock_open, mock_write_text):
        # Mock conf_frame with aparam
        conf_frame = MagicMock()
        conf_frame.data = {"aparam": [[[[500.0]]]]}

        # Mock vasp_inputs
        vasp_inputs = MagicMock()
        vasp_inputs.incar_template = "PREC = Normal\nEDIFF = 1E-4"

        # Create an instance of PrepVasp and call prep_task
        prep_vasp = PrepVasp()
        prep_vasp.prep_task(conf_frame, vasp_inputs)

        # Verify INCAR modifications
        incar_content = dumps_incar(loads_incar(vasp_inputs.incar_template))
        incar_content += "ISMEAR = -1\nSIGMA = 0.043086652\n"
        mock_write_text.assert_called_with(incar_content)

        # Verify job.json content
        mock_open().write.assert_called_once_with(json.dumps({
            "use_ele_temp": 2,
            "ele_temp": 500.0
        }, indent=4))

    @patch("dpgen2.fp.vasp.Path.write_text")
    @patch("dpgen2.fp.vasp.open", new_callable=unittest.mock.mock_open)
    def test_prep_task_without_ele_temp(self, mock_open, mock_write_text):
        # Mock conf_frame without fparam or aparam
        conf_frame = MagicMock()
        conf_frame.data = {}

        # Mock vasp_inputs
        vasp_inputs = MagicMock()
        vasp_inputs.incar_template = "PREC = Normal\nEDIFF = 1E-4"

        # Create an instance of PrepVasp and call prep_task
        prep_vasp = PrepVasp()
        prep_vasp.prep_task(conf_frame, vasp_inputs)

        # Verify INCAR is not modified
        mock_write_text.assert_called_with(vasp_inputs.incar_template)

        # Verify job.json is not created
        mock_open.assert_not_called()

if __name__ == "__main__":
    unittest.main()

This test suite covers the following scenarios:

  1. fparam is present in conf_frame.data.
  2. aparam is present in conf_frame.data.
  3. Neither fparam nor aparam is present in conf_frame.data.

Each test verifies that the INCAR file is modified correctly and that the job.json file is written with the expected content when applicable.

Comment on lines 212 to 223
if os.path.exists("job.json"):
with open("job.json", "r") as f:
data = json.load(f)
if "use_ele_temp" in data and "ele_temp" in data:
if data["use_ele_temp"] == 1:
setup_ele_temp(False)
sys.data["fparam"] = np.tile(data["ele_temp"], [1, 1])
elif data["use_ele_temp"] == 2:
setup_ele_temp(True)
sys.data["aparam"] = np.tile(
data["ele_temp"], [1, sys.get_natoms(), 1]
)
Copy link

@coderabbitai coderabbitai bot Sep 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improve error handling and readability in electronic temperature processing.

  1. Add error handling for file operations and JSON parsing.

  2. Use constants instead of magic numbers for use_ele_temp.

  3. Simplify condition checking.

  4. Implement error handling:

try:
    with open("job.json", "r") as f:
        data = json.load(f)
except (IOError, json.JSONDecodeError) as e:
    logging.error(f"Failed to read or parse job.json: {e}")
    raise
  1. Use constants for use_ele_temp (as suggested in the previous comment).

  2. Simplify condition checking:

if "use_ele_temp" in data and "ele_temp" in data:
    use_ele_temp = data["use_ele_temp"]
    ele_temp = data["ele_temp"]
    if use_ele_temp == FPARAM_ELE_TEMP:
        setup_ele_temp(False)
        sys.data["fparam"] = np.tile(ele_temp, [1, 1])
    elif use_ele_temp == APARAM_ELE_TEMP:
        setup_ele_temp(True)
        sys.data["aparam"] = np.tile(ele_temp, [1, sys.get_natoms(), 1])

Consider extracting the electronic temperature processing logic into a separate method for better modularity and testability.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please write a UT for electronic temperature processing

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wanghan-iapcm, I've created a unit test for the electronic temperature processing logic in the RunVasp class. The test covers different scenarios, including handling of use_ele_temp values and file operations. Here's the code:

import unittest
from unittest.mock import patch, mock_open
import json
import numpy as np
from dpgen2.fp.vasp import RunVasp, setup_ele_temp

class TestRunVasp(unittest.TestCase):

    @patch("dpgen2.fp.vasp.setup_ele_temp")
    @patch("dpgen2.fp.vasp.open", new_callable=mock_open, read_data='{"use_ele_temp": 1, "ele_temp": 300}')
    def test_run_task_fparam(self, mock_file, mock_setup_ele_temp):
        sys = RunVasp().run_task("command", "out", "log")
        mock_setup_ele_temp.assert_called_once_with(False)
        self.assertIn("fparam", sys.data)
        np.testing.assert_array_equal(sys.data["fparam"], np.tile(300, [1, 1]))

    @patch("dpgen2.fp.vasp.setup_ele_temp")
    @patch("dpgen2.fp.vasp.open", new_callable=mock_open, read_data='{"use_ele_temp": 2, "ele_temp": 300}')
    def test_run_task_aparam(self, mock_file, mock_setup_ele_temp):
        sys = RunVasp().run_task("command", "out", "log")
        mock_setup_ele_temp.assert_called_once_with(True)
        self.assertIn("aparam", sys.data)
        np.testing.assert_array_equal(sys.data["aparam"], np.tile(300, [1, sys.get_natoms(), 1]))

    @patch("dpgen2.fp.vasp.open", new_callable=mock_open)
    def test_run_task_no_job_json(self, mock_file):
        mock_file.side_effect = FileNotFoundError
        with self.assertRaises(FileNotFoundError):
            RunVasp().run_task("command", "out", "log")

    @patch("dpgen2.fp.vasp.open", new_callable=mock_open, read_data='{"invalid_json":')
    def test_run_task_invalid_json(self, mock_file):
        with self.assertRaises(json.JSONDecodeError):
            RunVasp().run_task("command", "out", "log")

if __name__ == "__main__":
    unittest.main()

This test suite includes:

  • test_run_task_fparam: Tests the case where use_ele_temp is 1 and verifies that fparam is correctly set.
  • test_run_task_aparam: Tests the case where use_ele_temp is 2 and verifies that aparam is correctly set.
  • test_run_task_no_job_json: Tests the scenario where the job.json file is missing.
  • test_run_task_invalid_json: Tests the scenario where the job.json file contains invalid JSON.

This should ensure that the electronic temperature processing logic is thoroughly tested. Let me know if you need any further adjustments!

Copy link

@wanghan-iapcm wanghan-iapcm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you may want to define new functions for ele_temp support in PrepVasp.prep_task and RunVasp.run_task to improve the readability of the code.

Moreover the functions can be tested independent with the rest of the code.

please add UT for:

  1. traj_render_lammps correct converts the optional_output to fparam and aparam when use_ele_temp == 1 or use_ele_temp == 2, respectively.
  2. PrepVasp see the ut written by the AI
  3. RunVasp see the ut written by the AI
  4. get_ele_temp

Signed-off-by: zjgemi <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Outside diff range, codebase verification and nitpick comments (2)
dpgen2/fp/vasp.py (2)

103-125: Improve error handling and readability in electronic temperature processing.

  1. Add validation for ele_temp value.

  2. Implement error handling for file operations.

  3. Use constants instead of magic numbers for use_ele_temp.

  4. Add validation for ele_temp:

if ele_temp is not None and not isinstance(ele_temp, (int, float)):
    raise ValueError("ele_temp must be a numerical value")
  1. Implement error handling for file operations:
try:
    with open("job.json", "w") as f:
        json.dump(data, f, indent=4)
except IOError as e:
    logging.error(f"Failed to write job.json: {e}")
    raise
  1. Use constants for use_ele_temp:
NO_ELE_TEMP = 0
FPARAM_ELE_TEMP = 1
APARAM_ELE_TEMP = 2

use_ele_temp = NO_ELE_TEMP
if "fparam" in conf_frame.data:
    use_ele_temp = FPARAM_ELE_TEMP
    ele_temp = conf_frame.data["fparam"][0][0]
elif "aparam" in conf_frame.data:
    use_ele_temp = APARAM_ELE_TEMP
    ele_temp = conf_frame.data["aparam"][0][0][0]

176-188: Improve error handling and readability in electronic temperature processing.

  1. Implement error handling for file operations and JSON parsing.

  2. Use constants instead of magic numbers for use_ele_temp.

  3. Simplify the condition checking.

  4. Implement error handling:

try:
    with open("job.json", "r") as f:
        data = json.load(f)
except (IOError, json.JSONDecodeError) as e:
    logging.error(f"Failed to read or parse job.json: {e}")
    raise
  1. Use constants for use_ele_temp (as suggested in the previous comment).

  2. Simplify the condition checking:

if "use_ele_temp" in data and "ele_temp" in data:
    use_ele_temp = data["use_ele_temp"]
    ele_temp = data["ele_temp"]
    if use_ele_temp == FPARAM_ELE_TEMP:
        setup_ele_temp(False)
        system.data["fparam"] = np.tile(ele_temp, [1, 1])
    elif use_ele_temp == APARAM_ELE_TEMP:
        setup_ele_temp(True)
        system.data["aparam"] = np.tile(ele_temp, [1, system.get_natoms(), 1])

Comment on lines +9 to +11
from .context import (
dpgen2,
)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove unused import statement.

The import statement from .context import dpgen2 is unused in the code.

Apply this diff to remove the unused import statement:

-from .context import (
-    dpgen2,
-)
Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
from .context import (
dpgen2,
)
Tools
Ruff

10-10: .context.dpgen2 imported but unused

Remove unused import: .context.dpgen2

(F401)

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Outside diff range, codebase verification and nitpick comments (1)
dpgen2/op/run_lmp.py (1)

314-333: Optimize the get_ele_temp function.

Consider using a single return statement for better readability and performance.

Apply this diff to optimize the function:

def get_ele_temp(lmp_log_name):
    with open(lmp_log_name, encoding="utf8") as f:
        lmp_log_lines = f.readlines()

-   for line in lmp_log_lines:
-       fields = line.split()
-       if fields[:2] == ["pair_style", "deepmd"]:
-           if "fparam" in fields:
-               # for rendering variables
-               try:
-                   return float(fields[fields.index("fparam") + 1])
-               except Exception:
-                   pass
-           if "aparam" in fields:
-               try:
-                   return float(fields[fields.index("aparam") + 1])
-               except Exception:
-                   pass
-
-   return None
+   for line in lmp_log_lines:
+       fields = line.split()
+       if fields[:2] == ["pair_style", "deepmd"]:
+           if "fparam" in fields:
+               return fields[fields.index("fparam") + 1]
+           if "aparam" in fields:
+               return fields[fields.index("aparam") + 1]
+   return None

@wanghan-iapcm wanghan-iapcm merged commit ce4ab3e into deepmodeling:master Sep 4, 2024
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants