Update performance.rst
Grammar-only
delucchi-cmu committed Jun 14, 2024
1 parent cb822ff commit 3e56c90
Showing 1 changed file with 10 additions and 9 deletions.
docs/performance.rst: 10 additions & 9 deletions
@@ -8,17 +8,18 @@ the `HiPSCat <https://github.com/astronomy-commons/hipscat>`_ data format,
efficient algorithms,
and `Dask <https://dask.org/>`_ framework for parallel computing.

-Here we demonstrate the results of the performance tests of LSDB for the cross-matching operations,
-performed on `Bridges2 cluster at Pittsburgh Supercomputing Center <https://www.psc.edu/resources/bridges-2/>`_ using single node with 128 cores and 256 GB of memory.
+Here we demonstrate the results of the performance tests of LSDB for cross-matching operations,
+performed on the `Bridges2 cluster at Pittsburgh Supercomputing Center <https://www.psc.edu/resources/bridges-2/>`_
+using a single node with 128 cores and 256 GB of memory.

Cross-matching performance overhead
-----------------------------------

-We compare I/O speed and cross-matching performance of LSDB on an example of cross-matching of
+We compare I/O speed and cross-matching performance of LSDB on an example cross-matching of
ZTF DR14 (metadata only, 1.2B rows, 60GB)
and Gaia DR3 (1.8B rows, 972GB) catalogs.
The cross-matching took 46 minutes and produced a catalog of 498GB.
-While the cross-matching optimized reading of Gaia and didn't load southern part of the Gaia which doesn't overlap with ZTF,
+While the cross-matching optimized reading of Gaia (and didn't load the southern part of Gaia, which doesn't overlap with ZTF),
we judge the performance by the output size, which gives a cross-matching speed of 185 MB/s.
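
For orientation, here is a minimal LSDB sketch of this kind of cross-match
(the catalog paths are placeholders, and keyword names may differ between
LSDB versions):

.. code-block:: python

    import lsdb

    # Lazily open the two HiPSCat catalogs; only metadata is read here.
    ztf = lsdb.read_hipscat("path/to/ztf_dr14")
    gaia = lsdb.read_hipscat("path/to/gaia_dr3")

    # Plan a nearest-neighbor cross-match within 1 arcsecond, then execute
    # the plan in parallel with Dask.
    matched = ztf.crossmatch(gaia, radius_arcsec=1, n_neighbors=1)
    result = matched.compute()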

We compare it to simply copying both catalogs with the `cp -r` command, which took 86 minutes and produced 1030GB of data,
@@ -38,8 +39,8 @@ and `smatch <https://github.com/esheldon/smatch>`_ package.
We use the same ZTF DR14 and Gaia DR3 catalogs as in the previous test, but use only the coordinate columns for the cross-matching.
With each algorithm, we perform the cross-matching of the catalogs within a 1 arcsecond radius, selecting the closest match.
To compare the performance on different scales,
-we select subsets of the catalogs with cone search around an arbitrary point,
-ranging the radius from 1 arcminute to 25 degrees.
+we select subsets of the catalogs with a cone search around an arbitrary point,
+increasing the radius from 1 arcminute to 25 degrees.
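
As a rough illustration of the single-machine baselines (not the benchmark
code itself: the random points below stand in for the cone-searched subsets,
and the smatch call follows its README), the 1-arcsecond closest-match step
looks roughly like this:

.. code-block:: python

    import numpy as np
    import smatch
    from astropy import units as u
    from astropy.coordinates import SkyCoord, match_coordinates_sky

    # Placeholder coordinate arrays in degrees, standing in for the
    # cone-searched ZTF and Gaia subsets.
    rng = np.random.default_rng(42)
    ra1, dec1 = rng.uniform(0, 5, 200_000), rng.uniform(-5, 0, 200_000)
    ra2, dec2 = rng.uniform(0, 5, 200_000), rng.uniform(-5, 0, 200_000)

    # astropy: SkyCoord construction is timed as a separate step.
    coords1 = SkyCoord(ra=ra1 * u.deg, dec=dec1 * u.deg)
    coords2 = SkyCoord(ra=ra2 * u.deg, dec=dec2 * u.deg)
    idx, sep2d, _ = match_coordinates_sky(coords1, coords2)
    good = sep2d < 1 * u.arcsec  # keep only pairs within 1 arcsecond

    # smatch: closest match within 1 arcsecond (the radius is in degrees).
    matches = smatch.match(ra1, dec1, 1.0 / 3600.0, ra2, dec2,
                           nside=4096, maxmatch=1)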

The analysis has the following steps:

@@ -64,13 +65,13 @@ The results of the analysis are shown in the following plot:

Some observations from the plot:

-* Construction of the ``SkyCoord`` objects in astropy is the most time-consuming step, on this step spherical coordinates are converted to Cartesian, so `match_coordinates_sky` has less work to do comparing to other algorithms. So if your analysis doesn't require the ``SkyCoord`` objects anywhere else, it would be more fair to add up the time of the `SkyCoord` objects construction and the `match_coordinates_sky` execution.
+* Construction of the ``SkyCoord`` objects in astropy is the most time-consuming step; in this step spherical coordinates are converted to Cartesian, so `match_coordinates_sky` has less work to do compared to other algorithms. If your analysis doesn't require the ``SkyCoord`` objects anywhere else, it would be fairer to add up the time of ``SkyCoord`` construction and `match_coordinates_sky` execution.
* All algorithms but LSDB show a nearly linear dependency on the number of rows in the input catalogs, starting from a small number of rows. LSDB has a constant overhead associated with graph construction and Dask scheduling, which is negligible for large catalogs, where the time starts to grow linearly.
* LSDB is the only method that allows parallelizing the cross-matching operation, so we run it with 1, 4, 16, and 64 workers (see the Dask sketch after this list).
* The 16- and 64-worker cases show the same performance, which indicates the limits of parallelization, at least with the hardware setup used in the analysis.
-* Despite the fact that LSDB's crossmatching algorithm does more job converting spherical coordinates to Cartesian, it's getting faster than astropy's algorithm for the large catalogs, even with a single worker. This is probably due to the fact that LSDB utilises batching approach, which allows to grow less deep k-D trees for each partition of the data, and thus less time is spent on the tree traversal.
+* Although LSDB's cross-matching algorithm does similar work converting spherical coordinates to Cartesian, it becomes faster than astropy's algorithm for larger catalogs, even with a single worker. This is probably because LSDB uses a batching approach, which constructs shallower k-D trees for each partition of the data, so less time is spent on tree traversal.
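
A sketch of how the worker counts above would be set up with a local Dask
cluster (the values shown are illustrative):

.. code-block:: python

    from dask.distributed import Client

    # The benchmark varied the level of parallelism; with a local cluster
    # the worker count is set like this (1, 4, 16, or 64 in the analysis).
    client = Client(n_workers=16, threads_per_worker=1)

    # ... run the LSDB cross-match while the client is active ...

    client.close()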

In summary, the cross-matching approach implemented in LSDB is competitive with existing tools and is more efficient for large catalogs, starting at roughly one million rows.
-Also, LSDB allows to work with out-of-memory datasets, which is not possible with astropy and smatch, and not demonstrated in the analysis.
+Also, LSDB allows working with out-of-memory datasets, which is not possible with astropy or smatch and is not demonstrated in this analysis.

The complete code of the analysis is available `here <https://github.com/lincc-frameworks/notebooks_lf/tree/main/sprints/2024/05_30/xmatch_bench>`_.
