GalacticDynamics-Oxford/GaiaTools

This is a collection of tools for analyzing the data from the Gaia satellite.

The package includes an expanded catalogue of clusters with distances
computed by Baumgardt&Vasiliev(2021), line-of-sight velocities taken
from Baumgardt et al.(2019, MNRAS, 482, 5138), and mean parallaxes and
proper motions (PM) computed using Gaia EDR3 by Vasiliev&Baumgardt(2021).
The fitting procedure is a slightly updated version of the method used by
Vasiliev(2019a, MNRAS, 484, 1525), and not identical to the one used in
the most recent analysis pipeline (the latter is too hand-tuned and
cumbersome). Nevertheless, it produces results that are quite close to
the published ones.

The description below follows the order in which these tools should be used
(although not all steps are necessary for a particular task: for instance,
if you only need to compute the orbital properties of clusters using the
existing table of PM, skip to the last step).

    get_mean_pm.py:
A small module providing routines for computing the mean parallax and PM
of a star cluster (and optionally the internal PM dispersion), with or
without accounting for spatially correlated systematic errors, as described
in the Appendix of
Vasiliev(2019b, MNRAS, 489, 623).
This module is intended to be used internally by run_fit.py, but can also
be run as a main program as a test, illustrating the procedure on mock data.
DEPENDENCIES: numpy, scipy.
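
For illustration, here is a minimal sketch of the purely statistical part of
this computation (an inverse-covariance weighted mean PM, ignoring both the
spatially correlated systematics and the internal dispersion that the actual
module handles):

    import numpy as np

    def mean_pm_weighted(pmra, pmdec, pmra_e, pmdec_e, pm_corr):
        # build the 2x2 covariance matrix of each star's PM measurement
        cov = np.empty((len(pmra), 2, 2))
        cov[:, 0, 0] = pmra_e**2
        cov[:, 1, 1] = pmdec_e**2
        cov[:, 0, 1] = cov[:, 1, 0] = pm_corr * pmra_e * pmdec_e
        icov = np.linalg.inv(cov)              # per-star inverse covariances
        icov_sum = icov.sum(axis=0)            # total information matrix
        rhs = np.einsum('nij,nj->i', icov, np.column_stack((pmra, pmdec)))
        mean = np.linalg.solve(icov_sum, rhs)  # weighted mean (pmra, pmdec)
        mean_cov = np.linalg.inv(icov_sum)     # covariance of the mean PM
        return mean, mean_cov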

    input.txt:
List of globular clusters in the Milky Way (input for the remaining scripts).
The coordinates are taken from the Harris(2010) catalogue with a few more
recent additions and corrections, and the remaining data -- from the three
papers listed above.
sigma is the central velocity dispersion [km/s];
rmax is the maximum distance [arcmin] from cluster center used to query
the Gaia archive.
The file result.txt generated by run_fit.py has the same format, but its
content will differ slightly, because the fitting approach is not identical.
If you want to use the results from the published catalogue, rename or copy
input.txt to result.txt instead of running the fit from scratch.
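
The table can be read with numpy, e.g. (assuming a whitespace-separated file
with a header row giving the column names; inspect the file itself for the
actual layout):

    import numpy as np

    tab = np.genfromtxt('input.txt', dtype=None, names=True, encoding='utf-8')
    print(tab.dtype.names)            # check the actual column names
    print(tab['sigma'], tab['rmax'])  # dispersion [km/s], search radius [arcmin]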

    McMillan17.ini:
The Milky Way potential from McMillan(2017), used in the orbit integrations
performed by run_orbits.py.
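
A minimal sketch of loading this potential with the Agama library (which is
needed anyway for run_orbits.py):

    import agama

    # working units: 1 kpc, 1 km/s, 1 Msun (1 time unit is then ~0.978 Gyr)
    agama.setUnits(length=1, velocity=1, mass=1)
    pot = agama.Potential('McMillan17.ini')
    print(pot.potential((8.2, 0, 0)))  # potential at roughly the solar radius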

    query_gaia_archive.py:
Retrieve the data from the Gaia archive (all sources satisfying the maximum
distance from cluster center and a simple parallax cut).
Source data for each cluster is stored in a separate numpy zip file:
"data/[cluster_name].npz".
DEPENDENCIES: numpy, astroquery (astropy-affiliated package);
optionally zero_point (parallax zero-point correction from Lindegren+2020).
RESOURCES: run time: a few minutes (depending on internet speed);
disk space: ~100 MB to store the downloaded data.
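
The query itself is an ADQL cone search; a sketch of what it may look like
(the centre coordinates, search radius, parallax cut and output file name are
illustrative, and the actual script retrieves more columns and optionally
applies the parallax zero-point correction):

    from astroquery.gaia import Gaia
    import numpy as np

    ra0, dec0, rmax_deg = 250.42, 36.46, 0.3  # e.g. NGC 6205 (M13); radius in degrees
    query = """
        SELECT source_id, ra, dec, parallax, parallax_error,
               pmra, pmdec, pmra_error, pmdec_error, pmra_pmdec_corr,
               phot_g_mean_mag, bp_rp
        FROM gaiaedr3.gaia_source
        WHERE 1 = CONTAINS(POINT('ICRS', ra, dec),
                           CIRCLE('ICRS', %g, %g, %g))
          AND parallax < 1.0
        """ % (ra0, dec0, rmax_deg)
    tab = Gaia.launch_job_async(query).get_results()
    np.savez_compressed('data/NGC_6205.npz',   # illustrative file name
                        **{col: np.array(tab[col]) for col in tab.colnames})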

    run_fit.py:
The main script performing the membership determination and measuring the
mean PM for each cluster, as described in the Appendix of Vasiliev(2019a).
It reads the data previously stored in "data/*.npz" by query_gaia_archive.py,
performs the fit, estimates the uncertainties on the mean PM (optionally
taking into account systematic errors), writes the summary for each cluster
to the file "result.txt" (same columns as "input.txt", updating the following
ones:
mean pmra, pmdec, parallax, their uncertainties, and the PM correlation
coefficient).
Additionally, the data for all stars from each cluster are written to a file
"data/[cluster_name].txt":
ra,dec are the celestial coordinates;
x,y are orthogonally projected coordinates w.r.t. cluster center (in degrees);
pmra, pmdec are the PM components (mas/yr);
pmra_e, pmdec_e, pm_corr are their uncertainties and correlation coefficient;
g_mag is the G-band magnitude; bp_rp is the BP-RP colour;
filter is the flag (0/1) specifying whether the star passed the quality
filters on the initial sample (i.e., astrometric noise and photometry);
prob is the cluster membership probability (only for stars with filter==1).
DEPENDENCIES: numpy, scipy; optionally autograd (used only for computing the
statistical uncertainties; since the systematic errors are almost always
larger, autograd can usually be omitted).
RESOURCES: run time: 30-60 minutes; memory: a few gigabytes;
disk space: ~200 MB to store the results for all stars in all clusters.
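
A minimal sketch of reading this per-star output and selecting likely members
(assuming a header row with the column names listed above; the cluster name
and the probability threshold are illustrative):

    import numpy as np

    stars = np.genfromtxt('data/NGC_6205.txt', names=True)
    # prob may be empty/NaN for stars with filter==0; NaN > 0.9 evaluates to False
    members = (stars['filter'] == 1) & (stars['prob'] > 0.9)
    print('%d probable members out of %d stars' % (members.sum(), len(stars)))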

    run_orbits.py:
Convert the sky coordinates, distances, mean PM and line-of-sight velocities
of all clusters produced by run_fit.py (result.txt) to Galactocentric
Cartesian coordinates, sampling from the uncertainty covariance matrix of
all parameters.
(Hint: one may use input.txt in place of result.txt without running the fit).
Produces the file "posvel.txt" which contains bootstrapped samples (by default
100 for each cluster) of positions and velocities.
After this first step, it computes the Galactic orbit for each of these
samples, obtains the peri/apocentre distances, orbital energy and actions,
and stores the median values and 68% confidence intervals of these quantities
in the file "result_orbits.txt".
This second step uses the fiducial potential from McMillan 2017, and employs
the Agama library ( https://github.com/GalacticDynamics-Oxford/Agama ) for
computing the orbits and actions. Check out the 'data/' folder in the Agama
distribution for other possible potential choices.
For many clusters, the confidence intervals reported in "result_orbits.txt"
are small enough to realistically represent the uncertainties; however, the
distribution of these parameters is often significantly correlated and
elongated, and does not resemble an ellipse at all, so these results may
only serve as a rough guide.
DEPENDENCIES: numpy, agama.
RESOURCES: run time: ~30 CPU minutes (parallelized, so the wall-clock time
is lower).
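
A minimal sketch of the second step, assuming that the first six numeric
columns of "posvel.txt" are the Galactocentric Cartesian positions [kpc] and
velocities [km/s] (check the actual file layout):

    import agama, numpy

    agama.setUnits(length=1, velocity=1, mass=1)   # kpc, km/s, Msun
    pot = agama.Potential('McMillan17.ini')
    posvel = numpy.loadtxt('posvel.txt', usecols=range(6))
    # integrate each sampled orbit (time=10 here is ~9.8 Gyr in these units)
    orbits = agama.orbit(potential=pot, ic=posvel, time=10., trajsize=1000)
    for time, traj in orbits:
        r = numpy.sum(traj[:, 0:3]**2, axis=1)**0.5
        print('peri=%6.2f  apo=%6.2f kpc' % (r.min(), r.max()))
    # actions (Jr, Jz, Jphi) for each sample in the same potential
    Jr, Jz, Jphi = agama.ActionFinder(pot)(posvel).T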
