adds script to import cctbx data to rs #264

Merged Sep 20, 2024 (39 commits)
a2a2343
adds script to import cctbx data to rs
dermen Aug 10, 2024
36b5873
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Aug 10, 2024
592007b
removes cctbx dependency
dermen Aug 11, 2024
8db20f9
merged precommit
dermen Aug 11, 2024
cd94855
merged precommit 2
dermen Aug 11, 2024
1233625
rejig to io
dermen Aug 20, 2024
fa209fe
adds mpi support
dermen Aug 20, 2024
f19bae4
addresses review
dermen Aug 25, 2024
0106d9d
removes comment
dermen Aug 25, 2024
45b8ddd
addresses review pt2
dermen Aug 26, 2024
c52719f
uses better names
dermen Aug 26, 2024
d1c4424
cleans up io __init__
dermen Aug 27, 2024
c2d84fd
adds support for more columns
dermen Aug 27, 2024
b2afcb5
adds verbose flag for read_dials_stills
dermen Aug 28, 2024
f315274
more debug statements
dermen Aug 28, 2024
45707ca
Merge remote-tracking branch 'upstream/main'
dermen Sep 3, 2024
5f850e5
unit tests for refl table reader
dermen Sep 3, 2024
b28bd46
adds back in the comma
dermen Sep 3, 2024
c71dd06
cleanup
dermen Sep 3, 2024
d30d18f
more cleanup
dermen Sep 4, 2024
a1e07bb
use tempfile, remove __main__ for production
Sep 10, 2024
37ce079
refactor, add mpi test with dummy comms
Sep 12, 2024
36c3376
Merge pull request #1 from kmdalton/stills
dermen Sep 17, 2024
53f6021
get ray_context
dermen Sep 17, 2024
493630c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 17, 2024
cb7570b
Update common.py
dermen Sep 17, 2024
479104d
Update common.py
dermen Sep 17, 2024
7f7dd26
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 17, 2024
7fd1208
allow nan inference for float64
Sep 18, 2024
a365d34
remove dtype inference from read_dials_stills
Sep 18, 2024
cc1117b
make cell/sg optional. improve docstring
Sep 18, 2024
73e1ed4
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 18, 2024
8e32b56
fix docstring
Sep 19, 2024
af61760
make dtype inference optional
Sep 19, 2024
bce6f80
test dtype inference toggle
Sep 19, 2024
58b862d
test mtz_dtypes flag and mtz writing
Sep 19, 2024
1fc6f7a
no need for a list of files
Sep 19, 2024
6c248e4
separate test for mtz io
Sep 19, 2024
1d0b88d
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Sep 19, 2024
38 changes: 26 additions & 12 deletions reciprocalspaceship/io/dials.py
@@ -84,7 +84,7 @@ def _concat(refl_data):
ds = refl_data
else:
refl_data = [ds for ds in refl_data if ds is not None]
- ds = rs.concat(refl_data)
+ ds = rs.concat(refl_data, check_isomorphous=False)
expt_ids = set(ds.BATCH)
LOGGER.debug(f"Found {len(ds)} refls from {len(expt_ids)} expts.")
LOGGER.debug("Mapping batch column.")
@@ -191,30 +191,44 @@ def _read_dials_stills_ray(fnames, unitcell, spacegroup, numjobs=10, extra_cols=
@spacegroupify
def read_dials_stills(
fnames,
- unitcell,
- spacegroup,
+ unitcell=None,
+ spacegroup=None,
numjobs=10,
parallel_backend=None,
extra_cols=None,
verbose=False,
comm=None,
):
"""
+ Read reflections from still images processed by DIALS from fnames and return
+ them as a DataSet. This method does not convert columns to native rs MTZ dtypes.
+
Parameters
----------
- fnames: filenames
- unitcell: unit cell tuple, Gemmi unit cell obj
- spacegroup: space group symbol eg P4
- numjobs: if backend==ray, specify the number of jobs (ignored if backend==mpi)
- parallel_backend: ray, mpi, or None
- extra_cols: list of additional column names to extract from the refltables. By default, this method will search for
+ fnames : list or tuple
+ A list or tuple of filenames (strings).
+ unitcell : gemmi.UnitCell or similar (optional)
+ The unit cell assigned to the returned dataset.
+ spacegroup : gemmi.SpaceGroup or similar (optional)
+ The spacegroup assigned to the returned dataset.
+ numjobs : int
+ If backend==ray, specify the number of jobs (ignored if backend==mpi).
+ parallel_backend : string (optional)
+ "ray", "mpi", or None for serial.
+ extra_cols : list (optional)
+ Optional list of additional column names to extract from the refltables. By default, this method will search for
miller_index, id, s1, xyzcal.px, intensity.sum.value, intensity.sum.variance, delpsical.rad
- verbose: whether to print stdout
- comm: optionally override the communicator used by backend='mpi'
+ verbose : bool
+ Whether to print logging info to stdout
+ comm : mpi4py.MPI.Comm
+ Optionally override the communicator used by backend='mpi'

Returns
-------
- rs dataset (pandas Dataframe)
+ ds : rs.DataSet
+ The dataset containing reflection info aggregated from fnames. This method will not convert any of the
+ columns to native rs MTZ dtypes. DIALS data are natively double precision (64-bit). Converting to MTZ
+ will downcast them to 32-bit. Use ds.infer_mtz_dtypes() to convert to native rs dtypes if required.
"""
_set_logger(verbose)
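
Note on the docstring's precision remark: the new Returns section says DIALS values are 64-bit and that converting to MTZ downcasts them to 32-bit. The snippet below is a minimal sketch of what that downcast does to a single value. It uses only the standard library's `struct` module, not reciprocalspaceship or DIALS, and the helper name `to_float32` is hypothetical.

```python
import struct


def to_float32(x: float) -> float:
    """Round-trip a 64-bit Python float through IEEE 754 single precision."""
    # Packing with "<f" stores the value in 4 bytes; unpacking widens it
    # back to a 64-bit float, keeping whatever rounding the 32-bit step caused.
    return struct.unpack("<f", struct.pack("<f", x))[0]


value = 0.1                   # held as a 64-bit double
downcast = to_float32(value)  # same value after a 32-bit round trip

print(value == downcast)             # the round trip is lossy for 0.1
print(abs(value - downcast) < 1e-7)  # but the error is small
print(to_float32(0.5) == 0.5)        # exactly representable values survive
```

This is why leaving the columns as float64 by default is a reasonable choice for a reader: the caller decides when the loss of precision is acceptable.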

8 changes: 7 additions & 1 deletion tests/io/test_dials.py
@@ -106,7 +106,7 @@ def make_refls(unit_cell, sg, seed=8675309, file_prefix=""):
def test_dials_reader(parallel_backend, verbose=False):

unit_cell = 78, 78, 235, 90, 90, 120
- sg = "P6522"
+ sg = "P 65 2 2"
comm = None
if parallel_backend == "mpi":
comm = DummyComm()
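
Note on the dummy communicator: the test above passes a `DummyComm()` in place of a real MPI communicator. Its definition is not shown in this diff, so the following is only a sketch of the general idea, assuming a single-rank stand-in that mimics a few mpi4py-style method names (`Get_rank`, `Get_size`, `bcast`, `gather`). The class `SerialComm` and the function `read_my_share` are hypothetical and are not the code in this PR.

```python
class SerialComm:
    """Hypothetical single-rank stand-in for an MPI communicator."""

    def Get_rank(self):
        return 0  # the only rank

    def Get_size(self):
        return 1  # one process in total

    def bcast(self, obj, root=0):
        return obj  # nothing to broadcast to, hand the object back

    def gather(self, obj, root=0):
        return [obj]  # a gather over one rank is a one-element list


def read_my_share(fnames, comm):
    """Split fnames across ranks, process the local share, gather on rank 0."""
    rank, size = comm.Get_rank(), comm.Get_size()
    mine = fnames[rank::size]                # round-robin split by rank
    local = [name.upper() for name in mine]  # placeholder for reading a file
    gathered = comm.gather(local, root=0)
    if rank == 0:
        # Flatten the per-rank lists into one result.
        return [item for chunk in gathered for item in chunk]
    return None


print(read_my_share(["a.refl", "b.refl"], SerialComm()))
```

With a stand-in like this, the MPI code path can be exercised in a unit test without launching multiple processes or installing mpi4py.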
@@ -162,6 +162,12 @@ def test_dials_reader(parallel_backend, verbose=False):
assert np.allclose(df_m.I, df_m["intensity.sum.value"])
assert np.allclose(df_m.varI, df_m["intensity.sum.sigma"] ** 2)

+ # Test that you don't need cell and symmetry to load the tables
+ ds = read_dials_stills(
+ pack_names, parallel_backend=None, numjobs=1, verbose=verbose
+ )
+ assert ds.cell is None
+ assert ds.spacegroup is None

def test_verbosity():
with tempfile.TemporaryDirectory() as tdir: