Matching catalogs based on proximity (detailed)
===============================================
Here we show the specific steps of matching two catalogs based on
proximity
.. raw:: html
Table of Contents
.. raw:: html
.. container:: toc
.. raw:: html
.. raw:: html
-
1 ClCatalogs
.. raw:: html
.. raw:: html
-
2 Matching
.. raw:: html
.. raw:: html
-
2.1 Prepare the catalogs
.. raw:: html
.. raw:: html
-
2.2 Multiple matching
.. raw:: html
.. raw:: html
-
2.3 Unique matching
.. raw:: html
.. raw:: html
-
2.4 Cross matching
.. raw:: html
.. raw:: html
.. raw:: html
.. raw:: html
-
3 Save and Load
.. raw:: html
.. raw:: html
-
4 Getting Matched Pairs
.. raw:: html
.. raw:: html
-
5 Outputing matched catalogs
.. raw:: html
.. raw:: html
-
5.1 Outputing matching information to original catalogs
.. raw:: html
.. raw:: html
.. raw:: html
.. raw:: html
.. code:: ipython3
%load_ext autoreload
%autoreload 2
ClCatalogs
----------
Given some input data
.. code:: ipython3
import numpy as np
from astropy.table import Table
input1 = Table({
'ID': [f'CL{i}' for i in range(5)],
'RA': [0.0, 0.0001, 0.00011, 25, 20],
'DEC': [0.0, 0.0, 0.0, 0.0, 0.0],
'Z': [0.2, 0.3, 0.25, 0.4, 0.35],
'MASS': [10**13.5, 10**13.4, 10**13.3, 10**13.8, 10**14],
'RADIUS_ARCMIN': [1.0, 1.0, 1.0, 1.0, 1.0],
})
input2 = Table({
'ID': ['CL0', 'CL1', 'CL2', 'CL3'],
'RA': [0.0, 0.0001, 0.00011, 25],
'DEC': [0.0, 0, 0, 0],
'Z': [0.3, 0.2, 0.25, 0.4],
'MASS': [10**13.3, 10**13.4, 10**13.5, 10**13.8],
'RADIUS_ARCMIN': [1.0, 1.0, 1.0, 1.0],
})
display(input1)
display(input2)
.. raw:: html
Table length=5
ID | RA | DEC | Z | MASS | RADIUS_ARCMIN |
str3 | float64 | float64 | float64 | float64 | float64 |
CL0 | 0.0 | 0.0 | 0.2 | 31622776601683.793 | 1.0 |
CL1 | 0.0001 | 0.0 | 0.3 | 25118864315095.82 | 1.0 |
CL2 | 0.00011 | 0.0 | 0.25 | 19952623149688.83 | 1.0 |
CL3 | 25.0 | 0.0 | 0.4 | 63095734448019.43 | 1.0 |
CL4 | 20.0 | 0.0 | 0.35 | 100000000000000.0 | 1.0 |
.. raw:: html
Table length=4
ID | RA | DEC | Z | MASS | RADIUS_ARCMIN |
str3 | float64 | float64 | float64 | float64 | float64 |
CL0 | 0.0 | 0.0 | 0.3 | 19952623149688.83 | 1.0 |
CL1 | 0.0001 | 0.0 | 0.2 | 25118864315095.82 | 1.0 |
CL2 | 0.00011 | 0.0 | 0.25 | 31622776601683.793 | 1.0 |
CL3 | 25.0 | 0.0 | 0.4 | 63095734448019.43 | 1.0 |
Create two ``ClCatalog`` objects, they have the same properties of
``astropy`` tables with additional functionality. For the proximity
matching, the main columns to be included are: -
``id - if not included, one will be assigned -``\ ra\ ``(in degrees) - necessary -``\ dec\ ``(in degrees) - necessary -``\ z\ ``- necessary if used as matching criteria or for angular to physical convertion -``\ mass\ ``(or mass proxy) - necessary if used as preference criteria for unique matches -``\ radius\ ``- necessary if used as a criteria of matching (also requires``\ radius_unit\`
to be passed)
.. code:: ipython3
from clevar.catalog import ClCatalog
c1 = ClCatalog('Cat1', id=input1['ID'], ra=input1['RA'], dec=input1['DEC'], z=input1['Z'], mass=input1['MASS'])
c2 = ClCatalog('Cat2', id=input2['ID'], ra=input2['RA'], dec=input2['DEC'], z=input2['Z'], mass=input2['MASS'])
# Format for nice display
for c in ('ra', 'dec', 'z'):
c1[c].info.format = '.2f'
c2[c].info.format = '.2f'
for c in ('mass',):
c1[c].info.format = '.2e'
c2[c].info.format = '.2e'
display(c1)
display(c2)
.. raw:: html
Cat1
tags: id(id), ra(ra), dec(dec), z(z), mass(mass)
Radius unit: None
ClData length=5
id | ra | dec | z | mass | mt_self | mt_other | mt_multi_self | mt_multi_other |
str3 | float64 | float64 | float64 | float64 | object | object | object | object |
CL0 | 0.00 | 0.00 | 0.20 | 3.16e+13 | None | None | [] | [] |
CL1 | 0.00 | 0.00 | 0.30 | 2.51e+13 | None | None | [] | [] |
CL2 | 0.00 | 0.00 | 0.25 | 2.00e+13 | None | None | [] | [] |
CL3 | 25.00 | 0.00 | 0.40 | 6.31e+13 | None | None | [] | [] |
CL4 | 20.00 | 0.00 | 0.35 | 1.00e+14 | None | None | [] | [] |
.. raw:: html
Cat2
tags: id(id), ra(ra), dec(dec), z(z), mass(mass)
Radius unit: None
ClData length=4
id | ra | dec | z | mass | mt_self | mt_other | mt_multi_self | mt_multi_other |
str3 | float64 | float64 | float64 | float64 | object | object | object | object |
CL0 | 0.00 | 0.00 | 0.30 | 2.00e+13 | None | None | [] | [] |
CL1 | 0.00 | 0.00 | 0.20 | 2.51e+13 | None | None | [] | [] |
CL2 | 0.00 | 0.00 | 0.25 | 3.16e+13 | None | None | [] | [] |
CL3 | 25.00 | 0.00 | 0.40 | 6.31e+13 | None | None | [] | [] |
The ``ClCatalog`` object can also be read directly from a file, for
details, see catalogs.ipynb.
Matching
--------
Import the ``ProximityMatch`` and create a object for matching
.. code:: ipython3
from clevar.match import ProximityMatch
mt = ProximityMatch()
Prepare the catalogs
~~~~~~~~~~~~~~~~~~~~
The first step is to prepare each catalog with the matching
configuration:
- ``delta_z``: Defines redshift window for matching. The possible
values are:
- ``'cat'``: uses redshift properties of the catalog
- ``'spline.filename'``: interpolates data in ``'filename'``
assuming (z, zmin, zmax) format
- ``float``: uses ``delta_z*(1+z)``
- ``None``: does not use z
- ``match_radius``: Radius of the catalog to be used in the matching.
If ``'cat'`` uses the radius in the catalog, else must be in format
``'value unit'``. (ex: ``'1 arcsec'``, ``'1 Mpc'``)
In this case, because one of the configuraion radius has physical units,
we also need a cosmology (``cosmo``) object to convert it to angular
size (this is done internally).
.. code:: ipython3
from clevar.cosmology import AstroPyCosmology
mt_config1 = {'delta_z':.2,
'match_radius': '1 mpc',
'cosmo':AstroPyCosmology()}
mt_config2 = {'delta_z':.2,
'match_radius': '1 arcsec'}
mt.prep_cat_for_match(c1, **mt_config1)
mt.prep_cat_for_match(c2, **mt_config2)
.. parsed-literal::
## Prep mt_cols
* zmin|zmax from config value
* ang radius from set scale
## Prep mt_cols
* zmin|zmax from config value
* ang radius from set scale
This will add values to the ``mt_input`` attribute of the catalogs:
.. code:: ipython3
display(c1.mt_input)
display(c2.mt_input)
.. raw:: html
ClData length=5
zmin | zmax | ang |
float64 | float64 | float64 |
-0.04 | 0.44 | 0.08418388522320427 |
0.04 | 0.56 | 0.062361611333396835 |
0.00 | 0.50 | 0.0710414327593546 |
0.12 | 0.68 | 0.05169945411341919 |
0.08 | 0.62 | 0.05623291641697765 |
.. raw:: html
ClData length=4
zmin | zmax | ang |
float64 | float64 | float64 |
0.04 | 0.56 | 0.0002777777777777778 |
-0.04 | 0.44 | 0.0002777777777777778 |
0.00 | 0.50 | 0.0002777777777777778 |
0.12 | 0.68 | 0.0002777777777777778 |
Multiple matching
~~~~~~~~~~~~~~~~~
The next step is to match the catalogs and store all candidates that
pass the matching criteria. You can also pass the argument: -
``radius_selection``: Given a pair of clusters, which radius will be
used for the matching.
.. code:: ipython3
%%time
mt.multiple(c1, c2)
mt.multiple(c2, c1)
.. parsed-literal::
Finding candidates (Cat1)
* 4/5 objects matched.
Finding candidates (Cat2)
* 4/4 objects matched.
CPU times: user 28.5 ms, sys: 539 µs, total: 29.1 ms
Wall time: 28.4 ms
This will fill the ``mt_multi_self`` and ``mt_multi_other`` columns:
.. code:: ipython3
display(c1)
display(c2)
.. raw:: html
Cat1
tags: id(id), ra(ra), dec(dec), z(z), mass(mass)
Radius unit: None
ClData length=5
| mt_input |
---|
id | ra | dec | z | mass | mt_self | mt_other | mt_multi_self | mt_multi_other | zmin | zmax | ang |
str3 | float64 | float64 | float64 | float64 | object | object | object | object | float64 | float64 | float64 |
CL0 | 0.00 | 0.00 | 0.20 | 3.16e+13 | None | None | ['CL2', 'CL1', 'CL0'] | ['CL2', 'CL1', 'CL0'] | -0.04 | 0.44 | 0.08418388522320427 |
CL1 | 0.00 | 0.00 | 0.30 | 2.51e+13 | None | None | ['CL2', 'CL1', 'CL0'] | ['CL2', 'CL1', 'CL0'] | 0.04 | 0.56 | 0.062361611333396835 |
CL2 | 0.00 | 0.00 | 0.25 | 2.00e+13 | None | None | ['CL2', 'CL1', 'CL0'] | ['CL2', 'CL1', 'CL0'] | 0.00 | 0.50 | 0.0710414327593546 |
CL3 | 25.00 | 0.00 | 0.40 | 6.31e+13 | None | None | ['CL3'] | ['CL3'] | 0.12 | 0.68 | 0.05169945411341919 |
CL4 | 20.00 | 0.00 | 0.35 | 1.00e+14 | None | None | [] | [] | 0.08 | 0.62 | 0.05623291641697765 |
.. raw:: html
Cat2
tags: id(id), ra(ra), dec(dec), z(z), mass(mass)
Radius unit: None
ClData length=4
| mt_input |
---|
id | ra | dec | z | mass | mt_self | mt_other | mt_multi_self | mt_multi_other | zmin | zmax | ang |
str3 | float64 | float64 | float64 | float64 | object | object | object | object | float64 | float64 | float64 |
CL0 | 0.00 | 0.00 | 0.30 | 2.00e+13 | None | None | ['CL2', 'CL1', 'CL0'] | ['CL2', 'CL1', 'CL0'] | 0.04 | 0.56 | 0.0002777777777777778 |
CL1 | 0.00 | 0.00 | 0.20 | 2.51e+13 | None | None | ['CL2', 'CL1', 'CL0'] | ['CL2', 'CL1', 'CL0'] | -0.04 | 0.44 | 0.0002777777777777778 |
CL2 | 0.00 | 0.00 | 0.25 | 3.16e+13 | None | None | ['CL2', 'CL1', 'CL0'] | ['CL2', 'CL1', 'CL0'] | 0.00 | 0.50 | 0.0002777777777777778 |
CL3 | 25.00 | 0.00 | 0.40 | 6.31e+13 | None | None | ['CL3'] | ['CL3'] | 0.12 | 0.68 | 0.0002777777777777778 |
Unique matching
~~~~~~~~~~~~~~~
Once all candidates are stored in each catalog, we can find the best
candidates. You can also pass the argument: - ``preference``: In cases
where there are multiple matched, how the best candidate will be chosen.
.. code:: ipython3
%%time
mt.unique(c1, c2, preference='angular_proximity')
mt.unique(c2, c1, preference='angular_proximity')
.. parsed-literal::
Unique Matches (Cat1)
* 4/5 objects matched.
Unique Matches (Cat2)
* 4/4 objects matched.
CPU times: user 57.6 ms, sys: 5.88 ms, total: 63.5 ms
Wall time: 62 ms
This will fill the ``mt_self`` and ``mt_other`` columns:
.. code:: ipython3
display(c1)
display(c2)
.. raw:: html
Cat1
tags: id(id), ra(ra), dec(dec), z(z), mass(mass)
Radius unit: None
ClData length=5
| mt_input |
---|
id | ra | dec | z | mass | mt_self | mt_other | mt_multi_self | mt_multi_other | zmin | zmax | ang |
str3 | float64 | float64 | float64 | float64 | object | object | object | object | float64 | float64 | float64 |
CL0 | 0.00 | 0.00 | 0.20 | 3.16e+13 | CL0 | CL0 | ['CL2', 'CL1', 'CL0'] | ['CL2', 'CL1', 'CL0'] | -0.04 | 0.44 | 0.08418388522320427 |
CL1 | 0.00 | 0.00 | 0.30 | 2.51e+13 | CL1 | CL1 | ['CL2', 'CL1', 'CL0'] | ['CL2', 'CL1', 'CL0'] | 0.04 | 0.56 | 0.062361611333396835 |
CL2 | 0.00 | 0.00 | 0.25 | 2.00e+13 | CL2 | CL2 | ['CL2', 'CL1', 'CL0'] | ['CL2', 'CL1', 'CL0'] | 0.00 | 0.50 | 0.0710414327593546 |
CL3 | 25.00 | 0.00 | 0.40 | 6.31e+13 | CL3 | CL3 | ['CL3'] | ['CL3'] | 0.12 | 0.68 | 0.05169945411341919 |
CL4 | 20.00 | 0.00 | 0.35 | 1.00e+14 | None | None | [] | [] | 0.08 | 0.62 | 0.05623291641697765 |
.. raw:: html
Cat2
tags: id(id), ra(ra), dec(dec), z(z), mass(mass)
Radius unit: None
ClData length=4
| mt_input |
---|
id | ra | dec | z | mass | mt_self | mt_other | mt_multi_self | mt_multi_other | zmin | zmax | ang |
str3 | float64 | float64 | float64 | float64 | object | object | object | object | float64 | float64 | float64 |
CL0 | 0.00 | 0.00 | 0.30 | 2.00e+13 | CL0 | CL0 | ['CL2', 'CL1', 'CL0'] | ['CL2', 'CL1', 'CL0'] | 0.04 | 0.56 | 0.0002777777777777778 |
CL1 | 0.00 | 0.00 | 0.20 | 2.51e+13 | CL1 | CL1 | ['CL2', 'CL1', 'CL0'] | ['CL2', 'CL1', 'CL0'] | -0.04 | 0.44 | 0.0002777777777777778 |
CL2 | 0.00 | 0.00 | 0.25 | 3.16e+13 | CL2 | CL2 | ['CL2', 'CL1', 'CL0'] | ['CL2', 'CL1', 'CL0'] | 0.00 | 0.50 | 0.0002777777777777778 |
CL3 | 25.00 | 0.00 | 0.40 | 6.31e+13 | CL3 | CL3 | ['CL3'] | ['CL3'] | 0.12 | 0.68 | 0.0002777777777777778 |
Cross matching
~~~~~~~~~~~~~~
If you want to make sure the same pair was found in both directions:
.. code:: ipython3
c1.cross_match()
c2.cross_match()
This will fill the ``mt_cross`` column:
.. code:: ipython3
display(c1)
display(c2)
.. raw:: html
Cat1
tags: id(id), ra(ra), dec(dec), z(z), mass(mass)
Radius unit: None
ClData length=5
| mt_input |
---|
id | ra | dec | z | mass | mt_self | mt_other | mt_multi_self | mt_multi_other | mt_cross | zmin | zmax | ang |
str3 | float64 | float64 | float64 | float64 | object | object | object | object | object | float64 | float64 | float64 |
CL0 | 0.00 | 0.00 | 0.20 | 3.16e+13 | CL0 | CL0 | ['CL2', 'CL1', 'CL0'] | ['CL2', 'CL1', 'CL0'] | CL0 | -0.04 | 0.44 | 0.08418388522320427 |
CL1 | 0.00 | 0.00 | 0.30 | 2.51e+13 | CL1 | CL1 | ['CL2', 'CL1', 'CL0'] | ['CL2', 'CL1', 'CL0'] | CL1 | 0.04 | 0.56 | 0.062361611333396835 |
CL2 | 0.00 | 0.00 | 0.25 | 2.00e+13 | CL2 | CL2 | ['CL2', 'CL1', 'CL0'] | ['CL2', 'CL1', 'CL0'] | CL2 | 0.00 | 0.50 | 0.0710414327593546 |
CL3 | 25.00 | 0.00 | 0.40 | 6.31e+13 | CL3 | CL3 | ['CL3'] | ['CL3'] | CL3 | 0.12 | 0.68 | 0.05169945411341919 |
CL4 | 20.00 | 0.00 | 0.35 | 1.00e+14 | None | None | [] | [] | None | 0.08 | 0.62 | 0.05623291641697765 |
.. raw:: html
Cat2
tags: id(id), ra(ra), dec(dec), z(z), mass(mass)
Radius unit: None
ClData length=4
| mt_input |
---|
id | ra | dec | z | mass | mt_self | mt_other | mt_multi_self | mt_multi_other | mt_cross | zmin | zmax | ang |
str3 | float64 | float64 | float64 | float64 | object | object | object | object | object | float64 | float64 | float64 |
CL0 | 0.00 | 0.00 | 0.30 | 2.00e+13 | CL0 | CL0 | ['CL2', 'CL1', 'CL0'] | ['CL2', 'CL1', 'CL0'] | CL0 | 0.04 | 0.56 | 0.0002777777777777778 |
CL1 | 0.00 | 0.00 | 0.20 | 2.51e+13 | CL1 | CL1 | ['CL2', 'CL1', 'CL0'] | ['CL2', 'CL1', 'CL0'] | CL1 | -0.04 | 0.44 | 0.0002777777777777778 |
CL2 | 0.00 | 0.00 | 0.25 | 3.16e+13 | CL2 | CL2 | ['CL2', 'CL1', 'CL0'] | ['CL2', 'CL1', 'CL0'] | CL2 | 0.00 | 0.50 | 0.0002777777777777778 |
CL3 | 25.00 | 0.00 | 0.40 | 6.31e+13 | CL3 | CL3 | ['CL3'] | ['CL3'] | CL3 | 0.12 | 0.68 | 0.0002777777777777778 |
Save and Load
-------------
The results of the matching can easily be saved and load using
``ClEvaR`` tools:
.. code:: ipython3
mt.save_matches(c1, c2, out_dir='temp', overwrite=True)
.. code:: ipython3
mt.load_matches(c1, c2, out_dir='temp')
display(c1)
display(c2)
.. parsed-literal::
Cat1
<< ClEvar used in matching: 0.13.2 >>
* Total objects: 5
* multiple (self): 4
* multiple (other): 4
* unique (self): 4
* unique (other): 4
* cross: 4
Cat2
<< ClEvar used in matching: 0.13.2 >>
* Total objects: 4
* multiple (self): 4
* multiple (other): 4
* unique (self): 4
* unique (other): 4
* cross: 4
.. raw:: html
Cat1
tags: id(id), ra(ra), dec(dec), z(z), mass(mass)
Radius unit: None
ClData length=5
| mt_input |
---|
id | ra | dec | z | mass | mt_self | mt_other | mt_multi_self | mt_multi_other | mt_cross | zmin | zmax | ang |
str3 | float64 | float64 | float64 | float64 | object | object | object | object | object | float64 | float64 | float64 |
CL0 | 0.00 | 0.00 | 0.20 | 3.16e+13 | CL0 | CL0 | ['CL2', 'CL1', 'CL0'] | ['CL2', 'CL1', 'CL0'] | CL0 | -0.04 | 0.44 | 0.08418388522320427 |
CL1 | 0.00 | 0.00 | 0.30 | 2.51e+13 | CL1 | CL1 | ['CL2', 'CL1', 'CL0'] | ['CL2', 'CL1', 'CL0'] | CL1 | 0.04 | 0.56 | 0.062361611333396835 |
CL2 | 0.00 | 0.00 | 0.25 | 2.00e+13 | CL2 | CL2 | ['CL2', 'CL1', 'CL0'] | ['CL2', 'CL1', 'CL0'] | CL2 | 0.00 | 0.50 | 0.0710414327593546 |
CL3 | 25.00 | 0.00 | 0.40 | 6.31e+13 | CL3 | CL3 | ['CL3'] | ['CL3'] | CL3 | 0.12 | 0.68 | 0.05169945411341919 |
CL4 | 20.00 | 0.00 | 0.35 | 1.00e+14 | None | None | [] | [] | None | 0.08 | 0.62 | 0.05623291641697765 |
.. raw:: html
Cat2
tags: id(id), ra(ra), dec(dec), z(z), mass(mass)
Radius unit: None
ClData length=4
| mt_input |
---|
id | ra | dec | z | mass | mt_self | mt_other | mt_multi_self | mt_multi_other | mt_cross | zmin | zmax | ang |
str3 | float64 | float64 | float64 | float64 | object | object | object | object | object | float64 | float64 | float64 |
CL0 | 0.00 | 0.00 | 0.30 | 2.00e+13 | CL0 | CL0 | ['CL2', 'CL1', 'CL0'] | ['CL2', 'CL1', 'CL0'] | CL0 | 0.04 | 0.56 | 0.0002777777777777778 |
CL1 | 0.00 | 0.00 | 0.20 | 2.51e+13 | CL1 | CL1 | ['CL2', 'CL1', 'CL0'] | ['CL2', 'CL1', 'CL0'] | CL1 | -0.04 | 0.44 | 0.0002777777777777778 |
CL2 | 0.00 | 0.00 | 0.25 | 3.16e+13 | CL2 | CL2 | ['CL2', 'CL1', 'CL0'] | ['CL2', 'CL1', 'CL0'] | CL2 | 0.00 | 0.50 | 0.0002777777777777778 |
CL3 | 25.00 | 0.00 | 0.40 | 6.31e+13 | CL3 | CL3 | ['CL3'] | ['CL3'] | CL3 | 0.12 | 0.68 | 0.0002777777777777778 |
Getting Matched Pairs
---------------------
There is functionality inbuilt in ``clevar`` to plot some results of the
matching, such as: - Recovery rates - Distances (anguar and redshift) of
cluster centers - Scaling relations (mass, redshift, …) for those cases,
check the match_metrics.ipynb and match_metrics_advanced.ipynb
notebooks.
If those do not provide your needs, you can get directly the matched
pairs of clusters:
.. code:: ipython3
from clevar.match import get_matched_pairs
mt1, mt2 = get_matched_pairs(c1, c2, 'cross')
These will be catalogs with the corresponding matched pairs:
.. code:: ipython3
import pylab as plt
plt.scatter(mt1['mass'], mt2['mass'])
.. parsed-literal::
.. image:: proximity_matching_detailed_files/proximity_matching_detailed_32_1.png
Outputing matched catalogs
--------------------------
To save the current catalogs, you can use the ``write`` inbuilt
function:
.. code:: ipython3
c1.write('c1_temp.fits', overwrite=True)
This will allow you to save the catalog with its current labels and
matching information.
Outputing matching information to original catalogs
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Assuming your input data came from initial files, ``clevar`` also
provides functions create output files that combine all the information
on them with the matching results.
To add the matching information to an input catalog, use:
::
from clevar.match import output_catalog_with_matching
output_catalog_with_matching('input_catalog.fits', 'output_catalog.fits', c1)
- note: ``input_catalog.fits`` must have the same number of rows that
``c1``.
To create a matched catalog containig all columns of both input
catalogs, use:
::
from clevar.match import output_matched_catalog
output_matched_catalog('input_catalog1.fits', 'input_catalog2.fits',
'output_catalog.fits', c1, c2, matching_type='cross')
where ``matching_type`` must be ``cross``, ``cat1`` or ``cat2``.
- note: ``input_catalog1.fits`` must have the same number of rows that
``c1`` (and the same for ``c2``).