Matching catalogs based on proximity (simple)
=============================================
Matching two catalogs based on proximity based on a configuration
dictionary
.. code:: ipython3
%load_ext autoreload
%autoreload 2
ClCatalogs
----------
Given some input data
.. code:: ipython3
import numpy as np
from astropy.table import Table
input1 = Table({
'ID': [f'CL{i}' for i in range(5)],
'RA': [0.0, 0.0001, 0.00011, 25, 20],
'DEC': [0.0, 0.0, 0.0, 0.0, 0.0],
'Z': [0.2, 0.3, 0.25, 0.4, 0.35],
'MASS': [10**13.5, 10**13.4, 10**13.3, 10**13.8, 10**14],
'RADIUS_ARCMIN': [1.0, 1.0, 1.0, 1.0, 1.0],
})
input2 = Table({
'ID': ['CL0', 'CL1', 'CL2', 'CL3'],
'RA': [0.0, 0.0001, 0.00011, 25],
'DEC': [0.0, 0, 0, 0],
'Z': [0.3, 0.2, 0.25, 0.4],
'MASS': [10**13.3, 10**13.4, 10**13.5, 10**13.8],
'RADIUS_ARCMIN': [1.0, 1.0, 1.0, 1.0],
})
display(input1)
display(input2)
.. raw:: html
Table length=5
ID | RA | DEC | Z | MASS | RADIUS_ARCMIN |
str3 | float64 | float64 | float64 | float64 | float64 |
CL0 | 0.0 | 0.0 | 0.2 | 31622776601683.793 | 1.0 |
CL1 | 0.0001 | 0.0 | 0.3 | 25118864315095.82 | 1.0 |
CL2 | 0.00011 | 0.0 | 0.25 | 19952623149688.83 | 1.0 |
CL3 | 25.0 | 0.0 | 0.4 | 63095734448019.43 | 1.0 |
CL4 | 20.0 | 0.0 | 0.35 | 100000000000000.0 | 1.0 |
.. raw:: html
Table length=4
ID | RA | DEC | Z | MASS | RADIUS_ARCMIN |
str3 | float64 | float64 | float64 | float64 | float64 |
CL0 | 0.0 | 0.0 | 0.3 | 19952623149688.83 | 1.0 |
CL1 | 0.0001 | 0.0 | 0.2 | 25118864315095.82 | 1.0 |
CL2 | 0.00011 | 0.0 | 0.25 | 31622776601683.793 | 1.0 |
CL3 | 25.0 | 0.0 | 0.4 | 63095734448019.43 | 1.0 |
Create two ``ClCatalog`` objects, they have the same properties of
``astropy`` tables with additional functionality. For the proximity
matching, the main columns to be included are: -
``id - if not included, one will be assigned -``\ ra\ ``(in degrees) - necessary -``\ dec\ ``(in degrees) - necessary -``\ z\ ``- necessary if used as matching criteria or for angular to physical convertion -``\ mass\ ``(or mass proxy) - necessary if used as preference criteria for unique matches -``\ radius\ ``- necessary if used as a criteria of matching (also requires``\ radius_unit\`
to be passed)
.. code:: ipython3
from clevar.catalog import ClCatalog
c1 = ClCatalog('Cat1', id=input1['ID'], ra=input1['RA'], dec=input1['DEC'], z=input1['Z'], mass=input1['MASS'])
c2 = ClCatalog('Cat2', id=input2['ID'], ra=input2['RA'], dec=input2['DEC'], z=input2['Z'], mass=input2['MASS'])
# Format for nice display
for c in ('ra', 'dec', 'z'):
c1[c].info.format = '.2f'
c2[c].info.format = '.2f'
for c in ('mass',):
c1[c].info.format = '.2e'
c2[c].info.format = '.2e'
display(c1)
display(c2)
.. raw:: html
Cat1
tags: id(id), ra(ra), dec(dec), z(z), mass(mass)
Radius unit: None
ClData length=5
id | ra | dec | z | mass | mt_self | mt_other | mt_multi_self | mt_multi_other |
str3 | float64 | float64 | float64 | float64 | object | object | object | object |
CL0 | 0.00 | 0.00 | 0.20 | 3.16e+13 | None | None | [] | [] |
CL1 | 0.00 | 0.00 | 0.30 | 2.51e+13 | None | None | [] | [] |
CL2 | 0.00 | 0.00 | 0.25 | 2.00e+13 | None | None | [] | [] |
CL3 | 25.00 | 0.00 | 0.40 | 6.31e+13 | None | None | [] | [] |
CL4 | 20.00 | 0.00 | 0.35 | 1.00e+14 | None | None | [] | [] |
.. raw:: html
Cat2
tags: id(id), ra(ra), dec(dec), z(z), mass(mass)
Radius unit: None
ClData length=4
id | ra | dec | z | mass | mt_self | mt_other | mt_multi_self | mt_multi_other |
str3 | float64 | float64 | float64 | float64 | object | object | object | object |
CL0 | 0.00 | 0.00 | 0.30 | 2.00e+13 | None | None | [] | [] |
CL1 | 0.00 | 0.00 | 0.20 | 2.51e+13 | None | None | [] | [] |
CL2 | 0.00 | 0.00 | 0.25 | 3.16e+13 | None | None | [] | [] |
CL3 | 25.00 | 0.00 | 0.40 | 6.31e+13 | None | None | [] | [] |
The ``ClCatalog`` object can also be read directly from a file, for
details, see catalogs.ipynb.
Matching
--------
Import the ``ProximityMatch`` and create a object for matching
.. code:: ipython3
from clevar.match import ProximityMatch
mt = ProximityMatch()
Prepare the configuration. The main values are:
- ``type``: Type of matching to be considered. Can be a simple match of
ClCatalog1->ClCatalog2 (``cat1``), ClCatalog2->ClCatalog1 (``cat2``)
or cross matching.
- ``which_radius``: Given a pair of clusters, which radius will be used
for the matching.
- ``preference``: In cases where there are multiple matched, how the
best candidate will be chosen.
- ``verbose``: Print result for individual matches
(default=\ ``True``).
We also need to provide some specific configuration for each catalog
with:
- ``delta_z``: Defines redshift window for matching. The possible
values are:
- ``'cat'``: uses redshift properties of the catalog
- ``'spline.filename'``: interpolates data in ``'filename'``
assuming (z, zmin, zmax) format
- ``float``: uses ``delta_z*(1+z)``
- ``None``: does not use z
- ``match_radius``: Radius of the catalog to be used in the matching.
If ``'cat'`` uses the radius in the catalog, else must be in format
``'value unit'``. (ex: ``'1 arcsec'``, ``'1 Mpc'``)
In this case, because one of the configuraion radius has physical units,
we need a cosmology object to convert it to angular size (this is done
internally).
.. code:: ipython3
match_config = {
'type': 'cross', # options are cross, cat1, cat2
'which_radius': 'max', # Case of radius to be used, can be: cat1, cat2, min, max
'preference': 'angular_proximity', # options are more_massive, angular_proximity or redshift_proximity
'catalog1': {'delta_z':.2,
'match_radius': '1 mpc'
},
'catalog2': {'delta_z':.2,
'match_radius': '10 arcsec'
}
}
from clevar.cosmology import AstroPyCosmology
cosmo = AstroPyCosmology()
Once the configuration is prepared, the whole process can be done with
one call:
.. code:: ipython3
%%time
mt.match_from_config(c1, c2, match_config, cosmo=cosmo)
.. parsed-literal::
## ClCatalog 1
## Prep mt_cols
* zmin|zmax from config value
* ang radius from set scale
## ClCatalog 2
## Prep mt_cols
* zmin|zmax from config value
* ang radius from set scale
## Multiple match (catalog 1)
Finding candidates (Cat1)
* 4/5 objects matched.
## Multiple match (catalog 2)
Finding candidates (Cat2)
* 4/4 objects matched.
## Finding unique matches of catalog 1
Unique Matches (Cat1)
* 4/5 objects matched.
## Finding unique matches of catalog 2
Unique Matches (Cat2)
* 4/4 objects matched.
Cross Matches (Cat1)
* 4/5 objects matched.
Cross Matches (Cat2)
* 4/4 objects matched.
CPU times: user 97.2 ms, sys: 1.01 ms, total: 98.2 ms
Wall time: 96.7 ms
This will fill the matching columns in the catalogs: -
``mt_multi_self``: Multiple matches found - ``mt_multi_other``: Multiple
matches found by the other catalog - ``mt_self``: Best candidate found -
``mt_other``: Best candidate found by the other catalog - ``mt_cross``:
Best candidate found in both directions
.. code:: ipython3
display(c1)
display(c2)
.. raw:: html
Cat1
tags: id(id), ra(ra), dec(dec), z(z), mass(mass)
Radius unit: None
ClData length=5
| mt_input |
---|
id | ra | dec | z | mass | mt_self | mt_other | mt_multi_self | mt_multi_other | mt_cross | zmin | zmax | ang |
str3 | float64 | float64 | float64 | float64 | object | object | object | object | object | float64 | float64 | float64 |
CL0 | 0.00 | 0.00 | 0.20 | 3.16e+13 | CL0 | CL0 | ['CL0', 'CL1', 'CL2'] | ['CL0', 'CL1', 'CL2'] | CL0 | -0.04 | 0.44 | 0.08418388522320427 |
CL1 | 0.00 | 0.00 | 0.30 | 2.51e+13 | CL1 | CL1 | ['CL0', 'CL1', 'CL2'] | ['CL0', 'CL1', 'CL2'] | CL1 | 0.04 | 0.56 | 0.062361611333396835 |
CL2 | 0.00 | 0.00 | 0.25 | 2.00e+13 | CL2 | CL2 | ['CL0', 'CL1', 'CL2'] | ['CL0', 'CL1', 'CL2'] | CL2 | 0.00 | 0.50 | 0.0710414327593546 |
CL3 | 25.00 | 0.00 | 0.40 | 6.31e+13 | CL3 | CL3 | ['CL3'] | ['CL3'] | CL3 | 0.12 | 0.68 | 0.05169945411341919 |
CL4 | 20.00 | 0.00 | 0.35 | 1.00e+14 | None | None | [] | [] | None | 0.08 | 0.62 | 0.05623291641697765 |
.. raw:: html
Cat2
tags: id(id), ra(ra), dec(dec), z(z), mass(mass)
Radius unit: None
ClData length=4
| mt_input |
---|
id | ra | dec | z | mass | mt_self | mt_other | mt_multi_self | mt_multi_other | mt_cross | zmin | zmax | ang |
str3 | float64 | float64 | float64 | float64 | object | object | object | object | object | float64 | float64 | float64 |
CL0 | 0.00 | 0.00 | 0.30 | 2.00e+13 | CL0 | CL0 | ['CL0', 'CL1', 'CL2'] | ['CL0', 'CL1', 'CL2'] | CL0 | 0.04 | 0.56 | 0.002777777777777778 |
CL1 | 0.00 | 0.00 | 0.20 | 2.51e+13 | CL1 | CL1 | ['CL0', 'CL1', 'CL2'] | ['CL0', 'CL1', 'CL2'] | CL1 | -0.04 | 0.44 | 0.002777777777777778 |
CL2 | 0.00 | 0.00 | 0.25 | 3.16e+13 | CL2 | CL2 | ['CL0', 'CL1', 'CL2'] | ['CL0', 'CL1', 'CL2'] | CL2 | 0.00 | 0.50 | 0.002777777777777778 |
CL3 | 25.00 | 0.00 | 0.40 | 6.31e+13 | CL3 | CL3 | ['CL3'] | ['CL3'] | CL3 | 0.12 | 0.68 | 0.002777777777777778 |
The steps of matching are stored in the catalogs and can be checked:
.. code:: ipython3
c1.show_mt_hist()
.. parsed-literal::
prep_cat_for_match(cat='Cat1', delta_z=0.2, match_radius='1 mpc', n_delta_z=1, n_match_radius=1,
cosmo='AstroPyCosmology(H0=70.0, Omega_dm0=0.25, Omega_b0=0.05, Omega_k0=0.0)')
prep_cat_for_match(cat='Cat2', delta_z=0.2, match_radius='10 arcsec', n_delta_z=1, n_match_radius=1,
cosmo='AstroPyCosmology(H0=70.0, Omega_dm0=0.25, Omega_b0=0.05, Omega_k0=0.0)')
multiple(cat1='Cat1', cat2='Cat2', radius_selection='max')
multiple(cat1='Cat2', cat2='Cat1', radius_selection='max')
unique(cat1='Cat1', cat2='Cat2', preference='angular_proximity')
unique(cat1='Cat2', cat2='Cat1', preference='angular_proximity')
Save and Load
-------------
The results of the matching can easily be saved and load using
``ClEvaR`` tools:
.. code:: ipython3
mt.save_matches(c1, c2, out_dir='temp', overwrite=True)
.. code:: ipython3
mt.load_matches(c1, c2, out_dir='temp')
display(c1)
display(c2)
.. parsed-literal::
Cat1
<< ClEvar used in matching: 0.13.2 >>
* Total objects: 5
* multiple (self): 4
* multiple (other): 4
* unique (self): 4
* unique (other): 4
* cross: 4
Cat2
<< ClEvar used in matching: 0.13.2 >>
* Total objects: 4
* multiple (self): 4
* multiple (other): 4
* unique (self): 4
* unique (other): 4
* cross: 4
.. raw:: html
Cat1
tags: id(id), ra(ra), dec(dec), z(z), mass(mass)
Radius unit: None
ClData length=5
| mt_input |
---|
id | ra | dec | z | mass | mt_self | mt_other | mt_multi_self | mt_multi_other | mt_cross | zmin | zmax | ang |
str3 | float64 | float64 | float64 | float64 | object | object | object | object | object | float64 | float64 | float64 |
CL0 | 0.00 | 0.00 | 0.20 | 3.16e+13 | CL0 | CL0 | ['CL0', 'CL1', 'CL2'] | ['CL0', 'CL1', 'CL2'] | CL0 | -0.04 | 0.44 | 0.08418388522320427 |
CL1 | 0.00 | 0.00 | 0.30 | 2.51e+13 | CL1 | CL1 | ['CL0', 'CL1', 'CL2'] | ['CL0', 'CL1', 'CL2'] | CL1 | 0.04 | 0.56 | 0.062361611333396835 |
CL2 | 0.00 | 0.00 | 0.25 | 2.00e+13 | CL2 | CL2 | ['CL0', 'CL1', 'CL2'] | ['CL0', 'CL1', 'CL2'] | CL2 | 0.00 | 0.50 | 0.0710414327593546 |
CL3 | 25.00 | 0.00 | 0.40 | 6.31e+13 | CL3 | CL3 | ['CL3'] | ['CL3'] | CL3 | 0.12 | 0.68 | 0.05169945411341919 |
CL4 | 20.00 | 0.00 | 0.35 | 1.00e+14 | None | None | [] | [] | None | 0.08 | 0.62 | 0.05623291641697765 |
.. raw:: html
Cat2
tags: id(id), ra(ra), dec(dec), z(z), mass(mass)
Radius unit: None
ClData length=4
| mt_input |
---|
id | ra | dec | z | mass | mt_self | mt_other | mt_multi_self | mt_multi_other | mt_cross | zmin | zmax | ang |
str3 | float64 | float64 | float64 | float64 | object | object | object | object | object | float64 | float64 | float64 |
CL0 | 0.00 | 0.00 | 0.30 | 2.00e+13 | CL0 | CL0 | ['CL0', 'CL1', 'CL2'] | ['CL0', 'CL1', 'CL2'] | CL0 | 0.04 | 0.56 | 0.002777777777777778 |
CL1 | 0.00 | 0.00 | 0.20 | 2.51e+13 | CL1 | CL1 | ['CL0', 'CL1', 'CL2'] | ['CL0', 'CL1', 'CL2'] | CL1 | -0.04 | 0.44 | 0.002777777777777778 |
CL2 | 0.00 | 0.00 | 0.25 | 3.16e+13 | CL2 | CL2 | ['CL0', 'CL1', 'CL2'] | ['CL0', 'CL1', 'CL2'] | CL2 | 0.00 | 0.50 | 0.002777777777777778 |
CL3 | 25.00 | 0.00 | 0.40 | 6.31e+13 | CL3 | CL3 | ['CL3'] | ['CL3'] | CL3 | 0.12 | 0.68 | 0.002777777777777778 |
Getting Matched Pairs
---------------------
There is functionality inbuilt in ``clevar`` to plot some results of the
matching, such as: - Recovery rates - Distances (anguar and redshift) of
cluster centers - Scaling relations (mass, redshift, …) for those cases,
check the match_metrics.ipynb and match_metrics_advanced.ipynb
notebooks.
If those do not provide your needs, you can get directly the matched
pairs of clusters:
.. code:: ipython3
from clevar.match import get_matched_pairs
mt1, mt2 = get_matched_pairs(c1, c2, 'cross')
These will be catalogs with the corresponding matched pairs:
.. code:: ipython3
import pylab as plt
plt.scatter(mt1['mass'], mt2['mass'])
.. parsed-literal::
.. image:: proximity_matching_files/proximity_matching_23_1.png
Outputing matched catalogs
--------------------------
To save the current catalogs, you can use the ``write`` inbuilt
function:
.. code:: ipython3
c1.write('c1_temp.fits', overwrite=True)
This will allow you to save the catalog with its current labels and
matching information.
Outputing matching information to original catalogs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If your input data came from initial files, ``clevar`` also provides
functions create output files that combine all the information on them
with the matching results.
To add the matching information to an input catalog, use:
::
from clevar.match import output_catalog_with_matching
output_catalog_with_matching('input_catalog.fits', 'output_catalog.fits', c1)
- note: ``input_catalog.fits`` must have the same number of rows that
``c1``.
To create a matched catalog containig all columns of both input
catalogs, use:
::
from clevar.match import output_matched_catalog
output_matched_catalog('input_catalog1.fits', 'input_catalog2.fits',
'output_catalog.fits', c1, c2, matching_type='cross')
where ``matching_type`` must be ``cross``, ``cat1`` or ``cat2``.
- note: ``input_catalog1.fits`` must have the same number of rows that
``c1`` (and the same for ``c2``).