Metrics of matching (simple)

Examples of the functions in match_metrics that plot metrics of the matching.

%load_ext autoreload
%autoreload 2
import numpy as np
import pylab as plt

Generate random data and add to catalog

# For reproducibility
np.random.seed(1)
from support import gen_cluster
input1, input2 = gen_cluster(ra_min=0, ra_max=30, dec_min=0, dec_max=30)
Initial number of clusters (logM>12.48): 2,740
Clusters in catalog1: 835
Clusters in catalog2: 928
from clevar import ClCatalog
c1 = ClCatalog('Cat1', ra=input1['RA'], dec=input1['DEC'], z=input1['Z'], mass=input1['MASS'],
            mass_err=input1['MASS_ERR'], z_err=input1['Z_ERR'])
c2 = ClCatalog('Cat2', ra=input2['RA'], dec=input2['DEC'], z=input2['Z'], mass=input2['MASS'],
            mass_err=input2['MASS_ERR'], z_err=input2['Z_ERR'])
# Format for nice display
for c in ('ra', 'dec', 'z', 'z_err'):
    c1[c].info.format = '.2f'
    c2[c].info.format = '.2f'
for c in ('mass', 'mass_err'):
    c1[c].info.format = '.2e'
    c2[c].info.format = '.2e'
/home/aguena/.local/lib/python3.9/site-packages/clevar-0.13.2-py3.9.egg/clevar/catalog.py:267: UserWarning: id column missing, additional one is being created.
  warnings.warn(

Match catalogs

from clevar.match import ProximityMatch
from clevar.cosmology import AstroPyCosmology

match_config = {
    'type': 'cross', # options are cross, cat1, cat2
    'which_radius': 'max', # Case of radius to be used, can be: cat1, cat2, min, max
    'preference': 'angular_proximity', # options are more_massive, angular_proximity or redshift_proximity
    'catalog1': {'delta_z':.2,
                'match_radius': '1 mpc'
                },
    'catalog2': {'delta_z':.2,
                'match_radius': '10 arcsec'
                }
}

cosmo = AstroPyCosmology()
mt = ProximityMatch()
mt.match_from_config(c1, c2, match_config, cosmo=cosmo)
## ClCatalog 1
## Prep mt_cols
* zmin|zmax from config value
* ang radius from set scale

## ClCatalog 2
## Prep mt_cols
* zmin|zmax from config value
* ang radius from set scale

## Multiple match (catalog 1)
Finding candidates (Cat1)
* 719/835 objects matched.

## Multiple match (catalog 2)
Finding candidates (Cat2)
* 721/928 objects matched.

## Finding unique matches of catalog 1
Unique Matches (Cat1)
* 719/835 objects matched.

## Finding unique matches of catalog 2
Unique Matches (Cat2)
* 720/928 objects matched.
Cross Matches (Cat1)
* 719/835 objects matched.
Cross Matches (Cat2)
* 719/928 objects matched.
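
The log above goes through three stages: multiple match, unique match, and cross match. The following is a minimal sketch of how a cross match can be derived from two unique-match tables, assuming that a pair counts as a cross match only when each cluster is the unique match of the other. The dictionaries and ids are hypothetical and are not ClEvaR's internal data structures.

```python
# Hypothetical unique matches: cluster id -> id of its unique match (None if unmatched).
unique_1to2 = {"a1": "b1", "a2": "b2", "a3": None}
unique_2to1 = {"b1": "a1", "b2": "a9", "b3": "a3"}


def cross_matches(m12, m21):
    """Keep pairs (i, j) where i's unique match is j and j's unique match is i."""
    return {i: j for i, j in m12.items()
            if j is not None and m21.get(j) == i}


cross = cross_matches(unique_1to2, unique_2to1)
print(cross)  # {'a1': 'b1'}
```

Only the pair (a1, b1) survives: a2 points to b2, but b2 points to a different cluster, and a3 has no unique match at all. This is why the cross-match counts can never exceed the unique-match counts.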

Recovery rate

Recovery rates are computed in mass and redshift bins. They can be displayed in several ways:

  • Single panel with multiple lines

  • Multiple panels

  • 2D color map

from clevar.match_metrics import recovery

Simple plot

The recovery rates are shown as a function of redshift in mass bins. They can be displayed as a continuous line or with steps:

zbins = np.linspace(0, 2, 21)
mbins = np.logspace(13, 14, 5)
info = recovery.plot(c1, 'cross', zbins, mbins, shape='steps')
plt.show()
info = recovery.plot(c1, 'cross', zbins, mbins, shape='line')
plt.show()
../_images/match_metrics_12_0.png ../_images/match_metrics_12_1.png

You can also smooth the lines of the plot:

info = recovery.plot(c1, 'cross', zbins, mbins, shape='line',
                     plt_kwargs={'n_increase':3})
plt.show()
../_images/match_metrics_14_0.png

They can also be transposed to be shown as a function of mass in redshift bins.

zbins = np.linspace(0, 2, 5)
mbins = np.logspace(13, 14, 20)
info = recovery.plot(c1, 'cross', zbins, mbins,
                     shape='line', transpose=True)
../_images/match_metrics_16_0.png

The full information of the recovery rate histogram is returned in a dictionary containing:

  • data: Binned data used in the plot. It has the sections:

    • recovery: Recovery rate binned with (bin1, bin2). Bins where no cluster was found have a nan value.

    • edges1: The bin edges along the first dimension.

    • edges2: The bin edges along the second dimension.

    • counts: Counts of all clusters in bins.

    • matched: Counts of matched clusters in bins.

info['data'].keys()
dict_keys(['recovery', 'edges1', 'edges2', 'matched', 'counts'])
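
To make the meaning of these keys concrete, here is a minimal NumPy sketch that builds a dictionary with the same keys from a mock catalog. It assumes the recovery rate is simply the number of matched clusters divided by the number of all clusters in each bin. The arrays `z`, `mass` and `is_matched` are hypothetical stand-ins, and this is not ClEvaR's own implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical catalog: redshift, mass, and a boolean flag marking matched clusters.
z = rng.uniform(0, 2, 500)
mass = 10 ** rng.uniform(13, 14, 500)
is_matched = rng.random(500) < 0.8

zbins = np.linspace(0, 2, 5)
mbins = np.logspace(13, 14, 3)

# Count all clusters and matched clusters in the same (z, mass) bins.
counts, edges1, edges2 = np.histogram2d(z, mass, bins=(zbins, mbins))
matched, _, _ = np.histogram2d(z[is_matched], mass[is_matched], bins=(zbins, mbins))

# Recovery rate = matched / counts; bins without clusters are left as nan.
recovery_rate = np.full(counts.shape, np.nan)
np.divide(matched, counts, out=recovery_rate, where=counts > 0)

data = {"recovery": recovery_rate, "edges1": edges1, "edges2": edges2,
        "counts": counts, "matched": matched}
print(data["recovery"].shape)  # (4, 2)
```

The first axis follows `edges1` (redshift here) and the second follows `edges2` (mass), so the array has one fewer entry than the number of edges along each dimension.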

Panels plots

You can also have a panel for each bin:

zbins = np.linspace(0, 2, 21)
mbins = np.logspace(13, 14, 5)
info = recovery.plot_panel(c1, 'cross', zbins, mbins)

zbins = np.linspace(0, 2, 5)
mbins = np.logspace(13, 14, 20)
info = recovery.plot_panel(c1, 'cross', zbins, mbins, transpose=True)
../_images/match_metrics_20_0.png ../_images/match_metrics_20_1.png

2D plots

zbins = np.linspace(0, 2, 10)
mbins = np.logspace(13, 14, 5)

info = recovery.plot2D(c1, 'cross', zbins, mbins)
plt.show()
info = recovery.plot2D(c1, 'cross', zbins, mbins,
                       add_num=True, num_kwargs={'fontsize':15})
../_images/match_metrics_22_0.png ../_images/match_metrics_22_1.png

Sky plots

The recovery rate can also be plotted by position on the sky, using a healpix pixelization:

info = recovery.skyplot(c1, 'cross', nside=16, ra_lim=[-5, 35], dec_lim=[-5, 35])
../_images/match_metrics_24_1.png

Distances of matching

Here we evaluate the distances between matched clusters, both in central position and in redshift. These distances can be shown for all matched clusters, or in bins:

from clevar.match_metrics import distances
info = distances.central_position(
    c1, c2, 'cross', radial_bins=20, radial_bin_units='degrees')
../_images/match_metrics_27_0.png
info = distances.central_position(
    c1, c2, 'cross', radial_bins=20, radial_bin_units='degrees',
    quantity_bins='mass', bins=mbins, log_quantity=True)
../_images/match_metrics_28_0.png
info = distances.central_position(
    c1, c2, 'cross', radial_bins=20, radial_bin_units='degrees',
    quantity_bins='z', bins=zbins[::2], log_quantity=False)
../_images/match_metrics_29_0.png
info = distances.redshift(c1, c2, 'cross', redshift_bins=20, normalize='cat1')
../_images/match_metrics_30_0.png
info = distances.redshift(
    c1, c2, 'cross', redshift_bins=20, normalize='cat1',
    quantity_bins='mass', bins=mbins, log_quantity=True)
../_images/match_metrics_31_0.png
info = distances.redshift(
    c1, c2, 'cross', redshift_bins=20, normalize='cat1',
    quantity_bins='z', bins=zbins[::2], log_quantity=False)
../_images/match_metrics_32_0.png

The full information of the distances is returned in a dictionary containing:

  • distances: values of distances.

  • data: Binned data used in the plot. It has the sections:

    • hist: Binned distances with (distance_bins, bin2). Bins where no cluster was found have a nan value.

    • distance_bins: The bin edges for distances.

    • bins2 (optional): The bin edges along the second dimension.

info.keys()
dict_keys(['distances', 'data', 'ax'])
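
As an illustration of what the distances and hist entries hold for the central-position case, the sketch below computes the angular separation of matched pairs with the haversine formula and bins it with NumPy. The coordinate arrays are hypothetical, and the choice of the haversine formula is an assumption made for this example, not necessarily the computation used inside ClEvaR.

```python
import numpy as np


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(np.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = np.sin((dec2 - dec1) / 2)
    sin_dra = np.sin((ra2 - ra1) / 2)
    a = sin_ddec**2 + np.cos(dec1) * np.cos(dec2) * sin_dra**2
    # Clip guards against rounding pushing the argument slightly outside [0, 1].
    return np.degrees(2 * np.arcsin(np.sqrt(np.clip(a, 0, 1))))


# Hypothetical matched pairs: row i of each array belongs to the same match.
ra1 = np.array([10.00, 20.00, 15.00])
dec1 = np.array([5.00, 25.00, 0.00])
ra2 = np.array([10.01, 20.00, 15.00])
dec2 = np.array([5.00, 25.02, 0.00])

sep = angular_separation_deg(ra1, dec1, ra2, dec2)

# 20 equal-width bins, analogous to radial_bins=20 in the calls above.
hist, distance_bins = np.histogram(sep, bins=20)
print(sep)
```

Note that an offset in right ascension shrinks by cos(dec) on the sky, so the first pair is separated by slightly less than 0.01 degrees, while the pure declination offset of the second pair gives 0.02 degrees.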

You can also smooth the lines of the plot:

info = distances.central_position(
    c1, c2, 'cross', radial_bins=20, radial_bin_units='degrees',
    shape='line', plt_kwargs={'n_increase':3})
../_images/match_metrics_36_0.png

Scaling Relations

from clevar.match_metrics import scaling

Redshift plots

Simple plot

info = scaling.redshift(c1, c2, 'cross')
../_images/match_metrics_41_0.png

Color points by \(\log(M)\) value

info = scaling.redshift_masscolor(c1, c2, 'cross', add_err=True)
../_images/match_metrics_43_0.png

Color points by their density in the plot

info = scaling.redshift_density(c1, c2, 'cross', add_err=True)
../_images/match_metrics_45_0.png

Split data into mass bins

info = scaling.redshift_masspanel(c1, c2, 'cross', add_err=True)
../_images/match_metrics_47_0.png

Split data into mass bins and color by density

info = scaling.redshift_density_masspanel(c1, c2, 'cross', add_err=True)
../_images/match_metrics_49_0.png

Evaluate metrics of the distribution

info = scaling.redshift_metrics(c1, c2, 'cross')
../_images/match_metrics_51_0.png
info = scaling.redshift_density_metrics(c1, c2, 'cross', ax_rotation=45)
../_images/match_metrics_52_0.png
info = scaling.redshift_density_dist(c1, c2, 'cross', ax_rotation=45, add_err=False)
../_images/match_metrics_53_0.png

All of these scatter-plot functions can also fit a relation:

info = scaling.redshift_density_metrics(
    c1, c2, 'cross', ax_rotation=45,
    add_fit=True, fit_bins1=20)
../_images/match_metrics_55_0.png

The full information of the scaling relation is returned in a dictionary containing:

  • binned_data (optional): input data for fitting, with values:

    • x: x values in fit (log of values if log=True).

    • y: y values in fit (log of values if log=True).

    • y_err: errorbar on y values (error_log if log=True).

  • fit (optional): fitting output dictionary, with values:

    • pars: fitted parameters.

    • cov: covariance of fitted parameters.

    • func: fitting function with fitted parameters.

    • func_plus: fitting function with fitted parameters plus 1x scatter.

    • func_minus: fitting function with fitted parameters minus 1x scatter.

    • func_scat: scatter of fitted function.

    • func_dist: P(y|x), the probability of having y given a value of x. It assumes a normal distribution and uses the scatter of the fitted function.

    • func_scat_interp: interpolated scatter from data.

    • func_dist_interp: P(y|x) using interpolated scatter.

  • plots (optional): additional plots:

    • fit: fitted data

    • errorbar: binned data

info['fit']['pars']
array([ 1.00479763, -0.00409297])
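
The sketch below shows, under simplifying assumptions, how entries such as binned_data, pars, cov, func, func_scat and func_dist could be produced: the data are binned in x, a straight line is fitted to the binned means with `np.polyfit`, and a normal P(y|x) is built from the scatter around the line. The mock data and every name in the code are hypothetical, and the fitting procedure is a stand-in rather than the one ClEvaR uses.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical matched redshifts: catalog 2 scatters around catalog 1.
x = rng.uniform(0.05, 1.5, 400)
y = x + rng.normal(0.0, 0.02, x.size)

# Bin in x and use the mean and standard deviation of y in each bin.
edges = np.linspace(0.05, 1.5, 11)
idx = np.digitize(x, edges[1:-1])  # bin index 0..9 for every point
x_bin = np.array([x[idx == i].mean() for i in range(10)])
y_bin = np.array([y[idx == i].mean() for i in range(10)])
y_err = np.array([y[idx == i].std() for i in range(10)])

# Straight-line fit to the binned points: y = pars[0] * x + pars[1].
pars, cov = np.polyfit(x_bin, y_bin, deg=1, cov=True)


def func(xv):
    """Fitted relation evaluated at xv."""
    return np.polyval(pars, xv)


# Scatter of the data around the fitted line.
scat = np.std(y - func(x))


def func_dist(yv, xv):
    """Normal P(y|x) centered on the fitted line with width equal to the scatter."""
    return np.exp(-0.5 * ((yv - func(xv)) / scat) ** 2) / (scat * np.sqrt(2 * np.pi))


print(pars, scat)
```

With this construction, the func_plus and func_minus entries would correspond to `func(x) + scat` and `func(x) - scat`.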

Evaluate the distribution

See how the redshift is distributed in each bin for one of the catalogs:

%%time
info = scaling.redshift_dist_self(
    c2, redshift_bins_dist=21,
    mass_bins=[10**13.0, 10**13.2, 10**13.5, 1e15],
    redshift_bins=4, shape='line',
    fig_kwargs={'figsize':(15, 6)})
CPU times: user 196 ms, sys: 161 ms, total: 357 ms
Wall time: 134 ms
../_images/match_metrics_59_1.png

Compare with the distribution in the other catalog:

%%time
info = scaling.redshift_dist(
    c1, c2, 'cross', redshift_bins_dist=21,
    mass_bins=[10**13.0, 10**13.2, 10**13.5, 1e15],
    redshift_bins=4, shape='line',
    fig_kwargs={'figsize':(15, 6)})
CPU times: user 231 ms, sys: 195 ms, total: 427 ms
Wall time: 173 ms
../_images/match_metrics_61_1.png

Mass plots

Simple plot

info = scaling.mass(c1, c2, 'cross', add_err=True)
../_images/match_metrics_64_0.png

Color points by redshift value

info = scaling.mass_zcolor(c1, c2, 'cross', add_err=True)
../_images/match_metrics_66_0.png

Color points by their density in the plot

info = scaling.mass_density(c1, c2, 'cross', add_err=True)
../_images/match_metrics_68_0.png

Split data into redshift bins

info = scaling.mass_zpanel(c1, c2, 'cross', add_err=True)
for ax in info['axes'].flatten():
    ax.set_ylim(.8e13, 2.2e15)
../_images/match_metrics_70_0.png

Split data into redshift bins and color by density

info = scaling.mass_density_zpanel(c1, c2, 'cross', add_err=True)
for ax in info['axes'].flatten():
    ax.set_ylim(.8e13, 2.2e15)
../_images/match_metrics_72_0.png

Evaluate metrics of the distribution

info = scaling.mass_metrics(c1, c2, 'cross')
../_images/match_metrics_74_0.png
info = scaling.mass_density_metrics(c1, c2, 'cross', ax_rotation=45)
../_images/match_metrics_75_0.png
info = scaling.mass_density_dist(c1, c2, 'cross', ax_rotation=45,
                                 add_err=False, plt_kwargs={'s':5})
../_images/match_metrics_76_0.png

All of these scatter-plot functions can also fit a relation:

info = scaling.mass_density_metrics(
    c1, c2, 'cross', ax_rotation=45,
    add_fit=True, fit_bins1=8)
../_images/match_metrics_78_0.png

The full information of the scaling relation is returned in a dictionary containing:

  • fit (optional): fitting output dictionary, with values:

    • pars: fitted parameters.

    • cov: covariance of fitted parameters.

    • func: fitting function with fitted parameters.

    • func_plus: fitting function with fitted parameters plus 1x scatter.

    • func_minus: fitting function with fitted parameters minus 1x scatter.

    • func_scat: scatter of fitted function.

    • func_chi: sqrt of chi_square(x, y) for the fitted function.

  • plots (optional): additional plots:

    • fit: fitted data

    • errorbar: binned data

Evaluate the distribution

See how the mass is distributed in each bin for one of the catalogs:

%%time
info = scaling.mass_dist_self(
    c2, mass_bins_dist=21,
    mass_bins=[10**13.0, 10**13.2, 10**13.5, 1e14, 1e15],
    redshift_bins=4, shape='line',
    fig_kwargs={'figsize':(15, 6)})
CPU times: user 224 ms, sys: 170 ms, total: 394 ms
Wall time: 158 ms
../_images/match_metrics_81_1.png

Compare with the distribution in the other catalog:

%%time
info = scaling.mass_dist(
    c1, c2, 'cross', mass_bins_dist=21,
    mass_bins=[10**13.0, 10**13.2, 10**13.5, 1e14, 1e15],
    redshift_bins=4, shape='line',
    fig_kwargs={'figsize':(15, 6)})
CPU times: user 436 ms, sys: 175 ms, total: 611 ms
Wall time: 375 ms
../_images/match_metrics_83_1.png