Accessing Truth Catalog Information#
Owner: Scott Daniel (@danielsf)
Last Run: 2018-10-17 (by @yymao)
Learning Objectives: We have implemented a catalog containing truth information for DC2 Run 1.1 and 1.2. This notebook will introduce the contents of that notebook as well as how to query them.
Note: Currently, the truth catalog only contains static sources. We still need to design a way to efficiently deliver light curves for all of the transient and variable sources in DC2. Galaxies with time-varying AGN and variable stars in this catalog report a magntiude which is the baseline magnitude on top of which the variability model is applied.
To run this notebook, follow the instructions to setup Jupyter-dev at NERSC: https://confluence.slac.stanford.edu/x/1_ubDQ
# import packages and methods that will be used in this notebook
import healpy
import numpy as np
import GCRCatalogs
from astropy.coordinates import SkyCoord
import matplotlib
%matplotlib inline
import matplotlib.pyplot as plt
First, load the catalog.
truth_catalog = GCRCatalogs.load_catalog('dc2_truth_run1.2_static')
You can list the quantities contained in the catalog using the same API as any other GCR-based catalog.
truth_catalog.list_all_quantities(include_native=True)
The get_quantity_info
method will give you access to descriptions of all of the native quantities in the catalog. Note that the quantities ugrizy
map directly to `mag_true_[ugrizy]’.
for qty in ['object_id', 'ra', 'dec', 'redshift', 'healpix_2048',
'agn', 'star', 'sprinkled',
'u', 'g', 'r', 'i', 'z', 'y']:
info_dict = truth_catalog.get_quantity_info(qty)
print(qty,info_dict)
In the cells below, we will query the catalog for all of the bright stars and galaxies in a 0.2 degree disc centered on RA=54.6
, Dec=-28.0
.
In the next cell, will define some methods (filter_on_healpix
and filter_on_dist
) needed to make that selection. These methods will be passed into the catalog’s get_quantities
method with the filters
kwarg. The methods should accept numpy arrays and return a numpy array of booleans indicating whether or not each object in the input arrays passed the test.
We will use healpy
’s query_disc
method to find which healpixels overlap our region of interest.
center_ra = 54.6
center_dec = -28.0
radius = 0.2
center_ra_rad = np.radians(center_ra)
center_dec_rad = np.radians(center_dec)
center_vec = np.array([np.cos(center_dec_rad)*np.cos(center_ra_rad),
np.cos(center_dec_rad)*np.sin(center_ra_rad),
np.sin(center_dec_rad)])
list_of_healpix = healpy.query_disc(2048, center_vec, np.radians(radius), nest=True, inclusive=True)
def filter_on_healpix(hp):
return np.array([hh in list_of_healpix for hh in hp])
def angularSeparation(ra, dec, center_ra, center_dec):
return SkyCoord(ra, dec, unit='deg').separation(SkyCoord(center_ra, center_dec, unit='deg')).deg
def filter_on_dist(ra, dec):
return angularSeparation(ra, dec, center_ra, center_dec) < radius
Now we will query the catalog or all stars (star==1
) in the region of interest with magnitude r<17.0
. This query functions like any other GRC-based catalog query with one exception. The truth catalog is ultimately stored as a sqlite database. This means that all native_filters
(filters applied directly to the catalog’s native quantities), should be phrased as strings which could be inserted into an SQL WHERE
clause. Because the native_filters
are applied when the quantities loaded into memory (as opposed to filters
, which are applied after the quantities have been loaded), we want these to be as restrictive as possible so as to limit the memory consumption of the loaded catalog. The sqlite databases are indexed on star
, agn
, sprinkled
, healpix_2048
and object_id
.
Note: We are aware that nside==2048
healpixels may be too fine a resolution to helpfully limit the catalog query. We are open to the idea of using a coarser resolution in future truth catalogs.
After the coarse spatial limits applied by the native_filter
on healpix_2048
, we use the filter_on_healpix
and filter_on_dist
methods to actually get sources in our region of interest.
bright_stars = truth_catalog.get_quantities(['ra', 'dec', 'mag_true_r'],
native_filters=['star==1',
'r<17.0',
'healpix_2048<=%d' % list_of_healpix.max(),
'healpix_2048>=%d' % list_of_healpix.min()],
filters=[(filter_on_healpix, 'healpix_2048'),
(filter_on_dist, 'ra', 'dec')])
plt.scatter(bright_stars['ra'], bright_stars['dec'])
Now we will performa similar spatial query for galaxies (star==0
) which were not added by the sprinkler (sprinkled==1
) with magnitudes r<21
.
*Note: “the sprinkler” is a piece of code used to add an extra population of AGN and supernovae into the Utral Deep Drilling Field of DC2. Objects added by the sprinkler will have no counterpart in the underlying protoDC2 extragalactic catalog, which is why we have added a sprinkled
flag to the catalog.
bright_unsprinkled_galaxies = truth_catalog.get_quantities(['ra', 'dec', 'mag_true_r'],
native_filters=['star==0', 'sprinkled==0',
'r<21.0',
'healpix_2048<=%d' % list_of_healpix.max(),
'healpix_2048>=%d' % list_of_healpix.min()],
filters=[(filter_on_healpix, 'healpix_2048'),
(filter_on_dist, 'ra', 'dec')])
plt.scatter(bright_unsprinkled_galaxies['ra'], bright_unsprinkled_galaxies['dec'])
print(bright_unsprinkled_galaxies['mag_true_r'].min())