This module provides the stellar classification setup for the TESS Asteroseismic Science Operations Center (TASOC).
The code is available through our GitHub organisation (https://github.com/tasoc/starclass) and full documentation for this code can be found on https://tasoc.dk/code/.
Note
Even though the full code and documentation are freely available, we strongly encourage users not to attempt to use the code to generate their own photometry or classifications from TESS. Instead we encourage you to use the fully processed data products from the full TASOC pipeline, which are available from TASOC and MAST. If you are interested in working on the details of the processing, we welcome you to join the T'DA working group.
The overall strategy of the classification pipeline is to run several different classifiers on the same data and pass the results from the individual classifiers to an overall "meta-classifier", which assigns the final classifications based on the inputs from all classifiers.
Classification is done in two levels (1 and 2), where the first level separates stars into overall classes of stars that exhibit similar lightcurves. In level 2, these classes are further separated into the individual pulsation classes.
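To make the idea of combining several classifiers concrete, the following is a minimal sketch, not the TASOC implementation. The classifier names, class labels and probabilities are all made up for illustration, and plain averaging is used only as a stand-in for whatever model the actual meta-classifier uses.

```python
from collections import defaultdict


def combine_classifiers(predictions):
    """Average per-class probabilities from several classifiers.

    `predictions` maps a (hypothetical) classifier name to a dict of
    class label -> probability for a single star.
    Returns the winning label and the averaged probabilities.
    """
    totals = defaultdict(float)
    for probs in predictions.values():
        for label, p in probs.items():
            totals[label] += p
    n = len(predictions)
    averaged = {label: total / n for label, total in totals.items()}
    best = max(averaged, key=averaged.get)
    return best, averaged


# Made-up outputs of three hypothetical classifiers for one star
predictions = {
    "classifier_a": {"solar-like": 0.7, "eclipsing": 0.2, "constant": 0.1},
    "classifier_b": {"solar-like": 0.5, "eclipsing": 0.4, "constant": 0.1},
    "classifier_c": {"solar-like": 0.6, "eclipsing": 0.1, "constant": 0.3},
}
label, averaged = combine_classifiers(predictions)
print(label)
```

In this toy case all three classifiers favour the same class, so the combined label simply follows them. A trained meta-classifier could instead learn which classifier to trust for which class.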
Start by making sure that you have Git Large File Storage (LFS) installed. You can verify that it is installed by running the command:
>>> git lfs version
Go to the directory where you want the Python code to be installed and simply download it or clone it via git as:
>>> git clone https://github.com/tasoc/starclass.git .
All dependencies can be installed using the following command. It is recommended to do this in a dedicated virtualenv or similar:
>>> pip install -r requirements.txt
You can test your installation by going to the root directory where you cloned the repository and running the command:
>>> pytest
The first step is to train the classifier. The classifier can be trained through the following command:
>>> python run_training.py -tf 0.2
The option -tf specifies the fraction of data to use for testing. The training set can be set through -t; by default this is currently set to keplerq9v3. The full list of possible arguments can be found in run_training.py. The actual classifiers are saved, together with their performance metrics, to ~/starclass/starclass/data/L1/{keplerq9v3}/.
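As an illustration of what a testing fraction of 0.2 means, here is a small sketch that holds out 20% of a list of targets. It is an assumption-laden toy: the function name, the seed and the simple random shuffle are not taken from run_training.py, which may split the training set differently (for example per class).

```python
import random


def split_train_test(items, test_fraction=0.2, seed=42):
    """Shuffle `items` and hold out `test_fraction` of them for testing."""
    if not 0.0 <= test_fraction < 1.0:
        raise ValueError("test_fraction must be in [0, 1)")
    shuffled = list(items)
    # A seeded generator keeps the split reproducible between runs
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


star_ids = list(range(100))  # placeholder identifiers, not real targets
train, test = split_train_test(star_ids, test_fraction=0.2)
print(len(train), len(test))
```

With 100 placeholder targets this leaves 80 for training and 20 for testing, and no target appears in both sets.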
Once the classifier has been trained, it can be applied to a new data set. The first step is to create a database that will list all the light curves that have to be classified, and to which the results and features will afterwards be saved.
>>> python run_create_todo_list.py {/input/folder/with/all/the/light/curves/}
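To give a feel for what listing the light curves in a folder involves, the sketch below walks a directory tree and collects files whose names match a regular expression. The function name and the default pattern are assumptions chosen for illustration; this is not the code used by the command above, and the database it builds is not reproduced here.

```python
import os
import re
import tempfile


def find_light_curves(root, pattern=r".*\.fits(\.gz)?$"):
    """Return sorted paths under `root` whose file name matches `pattern`."""
    regex = re.compile(pattern)
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if regex.match(name):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


# Demonstrate on a throwaway folder with made-up file names
with tempfile.TemporaryDirectory() as root:
    for name in ("star1.fits", "star2.fits.gz", "notes.txt"):
        open(os.path.join(root, name), "w").close()
    found = [os.path.basename(p) for p in find_light_curves(root)]
print(found)
```

Only the two FITS files are picked up; passing a different pattern would select light curves stored under other naming schemes.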
In case the light curves are not stored in .fits or .fits.gz format, the regex pattern needs to be specified through --pattern. The full list of possible arguments can be found in run_create_todolist.py.
Now that we have created an input database, we can actually run the classifier to classify all the light curves. In case you want to run starclass on a large scale multi-core computer, you need to make sure MPI is installed and also install the Python library mpi4py.
>>> python run_starclass.py {/input/folder/with/light/curves/}
>>> mpiexec -n 4 python run_starclass_mpi.py {/input/folder/with/light/curves/}
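Before launching the MPI variant, it can be useful to confirm that mpi4py is importable in the active environment. The helper below is a hypothetical convenience check, not part of starclass, and it only tests for the Python package; it does not verify that an MPI installation or mpiexec is present.

```python
import importlib.util


def mpi4py_available():
    """Return True if the mpi4py package can be found in this environment."""
    return importlib.util.find_spec("mpi4py") is not None


if mpi4py_available():
    print("mpi4py found: the MPI runner can be used")
else:
    print("mpi4py missing: install it or use the single-process runner")
```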
The full list of possible arguments together with additional information can be found in run_starclass.py and run_starclass_mpi.py. The results are all saved to the todo.sqlite file we created in the previous step.
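Since the results end up in an SQLite file, they can be inspected with Python's built-in sqlite3 module. The sketch below only lists the tables in a database and makes no assumption about the schema that starclass writes; the function name is hypothetical, and the demonstration runs on a throwaway file with a made-up table rather than on a real todo.sqlite.

```python
import os
import sqlite3
import tempfile


def list_tables(db_path):
    """Return the names of all tables in the SQLite file at `db_path`."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


# Demonstrate on a temporary database containing one made-up table
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "todo.sqlite")
    conn = sqlite3.connect(demo)
    conn.execute("CREATE TABLE example (id INTEGER PRIMARY KEY)")
    conn.commit()
    conn.close()
    tables = list_tables(demo)
print(tables)
```

To look at real results, call list_tables with the path to your own todo.sqlite file. Note that sqlite3.connect creates an empty file if the path does not exist, so it is worth double-checking the path first.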