Reduce crashing inside Phenix refine, reported by Tom Terwilliger #52

Open
russell-taylor opened this issue Oct 29, 2023 · 6 comments
@russell-taylor
Collaborator

Tom is getting a crash (after running for about six hours) in reduce when he issues the following command:

phenix.refine 3kz4_biomt.pdb 3kz4-sf.mtz \
   refinement.main.number_of_macro_cycles=0 \
   refinement.main.nqh_flips=False

This is the command line that causes the crash: molprobity.reduce -oh -his -flip -keep -allalt -limit120 -pen9999 -

The crash dumps the stdin file to reduce_fail.pdb. When he runs the command line directly, feeding it the name of the input file rather than feeding the file from standard input, reduce finishes in minutes and does not crash.

Note1: Reduce behaves differently when it reads a file from stdin than when it is given a filename, so that may be the difference.
Note2: I've recently fixed a number of crash bugs in Reduce that may not be in the Phenix version yet; one of them may be causing this.
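One way to check Note1 is to run the same command both ways on the dumped file and compare exit codes, output sizes, and run times. The following is a minimal sketch using only the Python standard library; the helper name compare_input_modes is hypothetical, and it assumes the command accepts either a file name or '-' as its last argument, as described above.

```python
import subprocess
import time


def compare_input_modes(base_cmd, pdb_path):
    """Run base_cmd with a file name, then with '-' and the file on stdin.

    Returns a dict with one entry per mode holding the return code,
    the number of bytes written to stdout, and the elapsed seconds.
    """
    with open(pdb_path, "rb") as handle:
        data = handle.read()

    results = {}
    # Mode 1: the file name is the last command-line argument.
    # Mode 2: '-' is the last argument and the same bytes arrive on stdin.
    for mode, args, stdin_bytes in (
        ("filename", base_cmd + [pdb_path], None),
        ("stdin", base_cmd + ["-"], data),
    ):
        start = time.monotonic()
        proc = subprocess.run(args, input=stdin_bytes, capture_output=True)
        results[mode] = {
            "returncode": proc.returncode,
            "stdout_bytes": len(proc.stdout),
            "seconds": time.monotonic() - start,
        }
    return results


# Example call, using the flags from the report (not executed here):
# compare_input_modes(
#     ["molprobity.reduce", "-oh", "-his", "-flip", "-keep",
#      "-allalt", "-limit120", "-pen9999"],
#     "reduce_fail.pdb")
```

If the two modes differ in return code or output size on reduce_fail.pdb, that would point at the stdin code path rather than at the model itself.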

@russell-taylor russell-taylor self-assigned this Oct 29, 2023
@russell-taylor
Collaborator Author

russell-taylor commented Oct 29, 2023

When I run molprobity.reduce on my Windows machine giving it the file name on the command line, it completes in a few minutes and writes the reduced output to standard output.

When I run molprobity.reduce with the '-' option to read from standard input, it says that it is reading from standard input and then completes in a few minutes and writes the reduced output to standard output.

@russell-taylor
Collaborator Author

The molprobity.reduce command line I am using runs reduce version 3.7.2 from December 2020, which should still have the crash issues. Perhaps they happen on Linux rather than Windows? On Linux, I'm running a version that is using reduce 4.14.

When I build version 3.7.2 on Linux and then run the command-line version that reads from standard input, it says that it is reading from standard input and then runs, growing its virtual memory footprint to 4.4GB with a resident size of 4.1GB. After a few minutes, it writes its output to standard output.

@russell-taylor
Collaborator Author

When I run reduce2 on this model, I notice that during model loading it takes 5GB of RAM, and during hydrogen addition it takes more than 12GB. It is possible that phenix.refine is using a lot of memory and together they are using up all memory. I've asked Tom to try watching 'top' while running on this model to see if this is what is happening. It would explain the slowness as the system starts to thrash.
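As an alternative to watching 'top' by hand, the peak resident memory of a finished subprocess can be read back from the operating system. This is a minimal sketch for Linux or macOS using the standard resource module; the helper name peak_child_rss_mb is hypothetical. It assumes it runs in a fresh interpreter, because RUSAGE_CHILDREN reports the maximum over every child that has already been waited for.

```python
import resource
import subprocess
import sys


def peak_child_rss_mb(cmd, stdin_path=None):
    """Run cmd to completion; return (return code, peak child RSS in MB).

    If stdin_path is given, that file is connected to the child's stdin.
    """
    stdin_handle = open(stdin_path, "rb") if stdin_path else None
    try:
        proc = subprocess.run(cmd, stdin=stdin_handle,
                              stdout=subprocess.DEVNULL)
    finally:
        if stdin_handle is not None:
            stdin_handle.close()

    # Largest resident set size among terminated, waited-for children.
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return proc.returncode, peak / divisor


# Example call (not executed here):
# peak_child_rss_mb(["molprobity.reduce", "-oh", "-his", "-flip", "-keep",
#                    "-allalt", "-limit120", "-pen9999", "-"],
#                   stdin_path="reduce_fail.pdb")
```

This only reports the peak after the process exits, so it complements rather than replaces a live view when the machine starts to thrash.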

@russell-taylor
Collaborator Author

russell-taylor commented Oct 29, 2023

Running the phenix command line Tom provided on the source files he provided on Linux uses an increasing amount of memory (50G virtual, 30G resident) then starts thrashing to disk on my home server that has 32GB RAM. This is even before we see reduce show up as a subprocess. This seems to make the memory-exceeded theory probable.

@russell-taylor
Collaborator Author

russell-taylor commented Oct 30, 2023

Tom ran on Anaconda, which has 1TB of RAM. The job took 53GB, which was nowhere near the available memory.

(nope) See if I can reproduce the crash on my laptop. The job runs for a few minutes, then quits with a 0-length log file. It produces no other output in this directory.

(nope) See if I can reproduce on my desktop server, to see if it finishes even with the memory thrashing. It crashed while running when trying to allocate more memory.

(nope) See if I can reproduce on Anaconda.

  • Build phenix from current master bootstrap.py on 12/18/2023
  • Run molprobity.reduce -oh -his -flip -keep -allalt -limit120 -pen9999 - < reduce_fail.pdb
  • It runs to completion in several minutes

@russell-taylor
Collaborator Author

russell-taylor commented Dec 18, 2023

(done) mmtbx/utils/__init__.py has a run_reduce_with_timeout() function that may be the one being called by Phenix. It takes the input to send as stdin_lines. Try running a Python script that reads the file into lines and then calls this function.

  • Reproduce the behavior found in phenix/pdb_tools/add_hydrogens.py using run_reduce_with_timeout() with the arguments provided above. Note: The time limit appears to be 120 seconds, which is a much shorter time than it takes to fully optimize.
  • It produces a seemingly-correct output file.

Here is the script:

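from mmtbx.utils import run_reduce_with_timeout
from six.moves import cStringIO as StringIO

# Read the model that Phenix dumped when reduce failed.
with open('reduce_fail.pdb', 'r') as pdb_file:
  lines = pdb_file.readlines()

# Feed the lines on standard input, with the same arguments Phenix used.
ero = run_reduce_with_timeout(stdin_lines=lines,
  parameters='-oh -his -flip -keep -allalt -limit120 -pen9999 -')

# Collect reduce's standard output and print it.
out = StringIO()
ero.show_stdout(out=out)
print(out.getvalue())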
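To tell a wall-clock timeout apart from a real crash when feeding lines on standard input, a stand-alone check along these lines may help. It is a sketch built on the standard subprocess module, not the mmtbx implementation; the helper name classify_run and its status strings are hypothetical.

```python
import subprocess


def classify_run(cmd, stdin_lines, timeout_s):
    """Feed stdin_lines to cmd and report how the run ended.

    Returns (status, stdout), where status is one of 'ok', 'timeout',
    'signal' (killed by a signal, for example a segfault or the OOM
    killer), or 'error' (nonzero exit code).
    """
    try:
        proc = subprocess.run(cmd, input="".join(stdin_lines),
                              capture_output=True, text=True,
                              timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return "timeout", None
    if proc.returncode < 0:
        # On POSIX, a negative return code is the negated signal number.
        return "signal", proc.stdout
    if proc.returncode != 0:
        return "error", proc.stdout
    return "ok", proc.stdout


# Example call (not executed here):
# with open("reduce_fail.pdb") as pdb_file:
#     status, text = classify_run(
#         ["molprobity.reduce", "-oh", "-his", "-flip", "-keep",
#          "-allalt", "-limit120", "-pen9999", "-"],
#         pdb_file.readlines(), timeout_s=600)
```

A 'signal' result on the failing machine would be consistent with the process being killed while allocating memory, whereas 'timeout' would point at the time limit instead.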
