Usage

Currently, only data vector concealment is implemented in Smokescreen. Posterior-level concealment is under development.

Data Vector Concealment (blinding)

The Smokescreen library provides a method for blinding data vectors, based on the data-vector blinding method of Muir et al. (2021).
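
As a rough illustration of the idea behind this kind of data-vector blinding, the sketch below adds the difference between two theory predictions (one at a shifted cosmology, one at the reference cosmology) to the observed data vector. The `toy_theory` function, the helper name `conceal_additive`, and all numbers are hypothetical stand-ins chosen for this example; this is not Smokescreen's internal code.

```python
def toy_theory(params):
    """Hypothetical stand-in for a theory prediction (three data points)."""
    amp = params["sigma8"] ** 2 * params["Omega_c"]
    return [amp * scale for scale in (1.0, 0.5, 0.25)]


def conceal_additive(data, ref_params, shifted_params, theory=toy_theory):
    """Add (theory at shifted cosmology - theory at reference cosmology) to data."""
    t_ref = theory(ref_params)
    t_shift = theory(shifted_params)
    return [d + (ts - tr) for d, ts, tr in zip(data, t_shift, t_ref)]


# Illustrative numbers only
reference = {"Omega_c": 0.25, "sigma8": 0.80}
shifted = {"Omega_c": 0.30, "sigma8": 0.85}
observed = [0.17, 0.08, 0.05]

concealed = conceal_additive(observed, reference, shifted)
```

When the shifted cosmology equals the reference cosmology, the two theory vectors cancel and the data vector is returned unchanged.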

To conceal a data-vector you need the following elements:

  • A CCL cosmology object

  • A dictionary of the nuisance parameters used in the likelihood (soon to be deprecated)

  • A Firecrown Likelihood, which takes a SACC data-vector (see more below). It can be either a path to the python file containing the likelihood or the module itself.

  • A dictionary of cosmological parameters to be shifted in the format:

    # for a random uniform parameter shift:
    {'PARAM_Y': (Y_MIN, Y_MAX), 'PARAM_Z': (Z_MIN, Z_MAX)}
    # or for a deterministic shift (used for debugging):
    {'PARAM_Y': Y_VALUE, 'PARAM_Z': Z_VALUE}
    
    # for a Gaussian parameter shift:
    {'PARAM_Y': (MEAN_Y, STD_Y), 'PARAM_Z': (MEAN_Z, STD_Z)}
    
  • A SACC data-vector

  • A random seed as int or string
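
To make the three shift formats above concrete, here is a minimal sketch of how such a dictionary could be turned into one number per parameter from a seed. The function `draw_shifts` is hypothetical and uses only the Python standard library; how Smokescreen itself draws and applies the shifts may differ.

```python
import random


def draw_shifts(shifts_dict, seed, shift_distr="flat"):
    """Return one drawn value per parameter (illustrative only)."""
    rng = random.Random(seed)  # random.Random accepts int or str seeds
    drawn = {}
    for name, spec in shifts_dict.items():
        if isinstance(spec, (tuple, list)):
            first, second = spec
            if shift_distr == "flat":
                drawn[name] = rng.uniform(first, second)  # (MIN, MAX)
            elif shift_distr == "gaussian":
                drawn[name] = rng.gauss(first, second)  # (MEAN, STD)
            else:
                raise ValueError(f"unknown shift_distr: {shift_distr!r}")
        else:
            drawn[name] = float(spec)  # deterministic: value used as given
    return drawn


flat_draw = draw_shifts({"Omega_c": (0.20, 0.42), "sigma8": (0.67, 0.92)}, seed=2112)
fixed_draw = draw_shifts({"Omega_c": 0.30}, seed="any-string-seed")
```

Calling the function twice with the same seed gives the same draws, which is the property that lets a concealed data vector be reproduced later.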

Attention

Likelihood Requirements

The blinding module requires the Firecrown likelihood to be built with certain requirements. First, we must be able to build the likelihood by providing a sacc object with the measurements for the data-vector:

def build_likelihood(build_parameters):
    """
    This is a generic likelihood theory model
    for a generic data vector.
    """
    sacc_data = build_parameters['sacc_data']

This is similar to what is currently done in TXPipe.

The likelihood module also must have a method .compute_theory_vector(ModellingTools) which calls for the calculation of the theory vector inside the likelihood object.

Danger

Likelihoods with hardcoded sacc files:

If you provide a Firecrown likelihood with a hardcoded path to a sacc file as the data-vector, Smokescreen will conceal the hardcoded sacc file and not the one you provided, because the likelihood is built from the hardcoded path. Firecrown currently has no checks to prevent a hardcoded sacc file in the build_likelihood(...) function. To avoid this, please build the likelihood as described above.

The likelihood can be provided either as a path to the python file containing the build_likelihood function or as a python module. In the latter case, the module must be imported.
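
The two ways of providing the likelihood can be handled uniformly with the standard library. The sketch below is an assumption about how such a step could look, not Smokescreen's actual loader: the helper name `resolve_likelihood` and the demo file are hypothetical.

```python
import importlib.util
import tempfile
import types
from pathlib import Path


def resolve_likelihood(likelihood):
    """Return a module object from either a module or a path to a .py file."""
    if isinstance(likelihood, types.ModuleType):
        module = likelihood
    else:
        path = Path(likelihood)
        spec = importlib.util.spec_from_file_location(path.stem, path)
        if spec is None or spec.loader is None:
            raise ImportError(f"cannot load a module from {path}")
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
    if not callable(getattr(module, "build_likelihood", None)):
        raise AttributeError("likelihood module must define build_likelihood(...)")
    return module


# Demo with a throwaway file standing in for a real likelihood module
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo_likelihood.py"
    demo_path.write_text(
        "def build_likelihood(build_parameters):\n"
        "    return build_parameters['sacc_data']\n"
    )
    from_path = resolve_likelihood(demo_path)    # given a path
    from_module = resolve_likelihood(from_path)  # given a module
```

Checking for `build_likelihood` up front gives a clear error before any concealment is attempted.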

TL;DR: Check the Smokescreen notebooks folder for a couple of examples.

From the command line

To blind the data-vector measurements from the command line, run the module as follows:

python -m smokescreen --config configuration_file.yaml

From Smokescreen version 1.3.0 onward, you should call the module as:

smokescreen datavector --config configuration_file.yaml

An example configuration file looks like this:

path_to_sacc: "./cosmicshear_sacc.fits"
likelihood_path: "./cosmicshear_likelihood.py"
systematics:
    trc1_delta_z: 0.1
    trc0_delta_z: 0.1
shifts_dict:
    Omega_c: [0.20, 0.42]
    sigma8: [0.67, 0.92]
seed: 2112
shift_distribution: "flat"
# only needed if you want a different reference cosmology
# than ccl.VanillaLCDM
reference_cosmology:
    sigma8: 0.85
keep_original_sacc: true

Warning

By default, the original SACC file is deleted after the encryption. If you want to keep the original SACC file, you can set the `keep_original_sacc` parameter to `true` in the configuration file.

Or you can use the following command to create a template configuration file:

python -m smokescreen --print_config > template_config.yaml
# or in version 1.3.0+
smokescreen datavector --print_config > template_config.yaml

Note that the reference_cosmology is optional. If not provided, the CCL VanillaLCDM reference cosmology will be the one used to compute the data vector.
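
The optional `reference_cosmology` entry can be thought of as a set of overrides on top of a default cosmology. The sketch below shows that merge on plain dictionaries; the default values are placeholders for illustration and are not read from pyccl, and the helper name `resolve_reference_cosmology` is hypothetical.

```python
# Placeholder defaults standing in for the reference cosmology.
# These numbers are illustrative and are NOT read from pyccl.
PLACEHOLDER_DEFAULTS = {
    "Omega_c": 0.25,
    "Omega_b": 0.05,
    "h": 0.7,
    "n_s": 0.96,
    "sigma8": 0.8,
}


def resolve_reference_cosmology(config, defaults=PLACEHOLDER_DEFAULTS):
    """Merge optional overrides from the config on top of the defaults."""
    overrides = config.get("reference_cosmology") or {}
    return {**defaults, **overrides}


with_override = resolve_reference_cosmology({"reference_cosmology": {"sigma8": 0.85}})
without_override = resolve_reference_cosmology({"seed": 2112})
```

Only the keys listed under `reference_cosmology` change; every other parameter keeps its default.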

From a notebook/your code

To blind the data-vector measurements from a notebook or your own code, use the smokescreen module as follows:

# import the modules
import pyccl as ccl
import sacc
from smokescreen import ConcealDataVector
# import the likelihood that contains the model and data vector
[...]
import my_likelihood

# create the cosmology ccl object
cosmo = ccl.Cosmology(Omega_c=0.27,
                      Omega_b=0.045,
                      h=0.67,
                      sigma8=0.8,
                      n_s=0.96,
                      transfer_function='bbks')
# load a sacc object with the data vector [FIXME: this is a placeholder, the sacc object should be loaded from the likelihood]
sacc_data = sacc.Sacc.load_fits('path/to/data_vector.sacc')
# create a dictionary of the necessary firecrown nuisance parameters
syst_dict = {
            "ia_a_1": 1.0,
            "ia_a_2": 0.5,
            "ia_a_d": 0.5,
            "lens0_bias": 2.0,
            "lens0_b_2": 1.0,
            "lens0_b_s": 1.0,
            "lens0_mag_bias": 1.0,
            "src0_delta_z": 0.000,
            "lens0_delta_z": 0.000,}
# create the smokescreen object
smoke = ConcealDataVector(cosmo, syst_dict, sacc_data, my_likelihood,
                          {'Omega_c': (0.22, 0.32), 'sigma8': (0.7, 0.9)}, shift_distr='flat')
# conceals (blinds) the data vector
smoke.calculate_concealing_factor()
concealed_dv = smoke.apply_concealing_to_likelihood_datavec()

# create the smokescreen object with Gaussian shifts
smoke_gaussian = ConcealDataVector(cosmo, syst_dict, sacc_data, my_likelihood,
                                   {'Omega_c': (0.27, 0.05), 'sigma8': (0.8, 0.02)}, shift_distr='gaussian')
# conceals (blinds) the data vector with Gaussian shifts
smoke_gaussian.calculate_concealing_factor()
concealed_dv_gaussian = smoke_gaussian.apply_concealing_to_likelihood_datavec()

To encrypt the original sacc file, follow the instructions in the next section.

Encrypting and Decrypting SACC files

From Smokescreen version 1.3.0, you can encrypt and decrypt SACC files. This is useful when you want to share the data vector with someone else but you don’t want them to see the data. The encryption is done using the cryptography library. It is important to note that the encryption is done using a symmetric key, so the person you are sharing the data with must have the key to decrypt the file.
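
To illustrate what symmetric-key encryption means in practice, the sketch below round-trips some bytes with the Fernet recipe from the cryptography library. Using Fernet here is an assumption made for the example; the text above only states that the cryptography library and a symmetric key are used. The helper names are hypothetical.

```python
from cryptography.fernet import Fernet


def encrypt_bytes(data):
    """Encrypt bytes with a freshly generated symmetric key."""
    key = Fernet.generate_key()
    token = Fernet(key).encrypt(data)
    return token, key


def decrypt_bytes(token, key):
    """Recover the original bytes; requires the same key used to encrypt."""
    return Fernet(key).decrypt(token)


payload = b"pretend these are the bytes of a SACC file"
token, key = encrypt_bytes(payload)
recovered = decrypt_bytes(token, key)
```

The same key both encrypts and decrypts, so anyone who should read the file needs a copy of the key.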

When running the data vector concealment module, encryption is performed by default. The decryption key is saved in a file with the same name as the original file but a .key extension. The key is saved in the same directory as the encrypted file.

Warning

By default, the original SACC file is deleted after the encryption. If you want to keep the original SACC file, you can set the `keep_original_sacc` parameter to `true` in the configuration file or set the flag `--keep_original true` on the command line.

Encrypting files

To encrypt a sacc file (or any file), you can use the following command:

smokescreen encrypt --path_to_sacc path/to/sacc.fits --path_to_save path/to/save/the/file/ [--keep_original true]

This will generate an encrypted file with the extension .encrpt and a key file with the extension .key in the same directory as the encrypted file or in the directory specified by --path_to_save.

You can also encrypt a file from a notebook/your code:

from smokescreen.encryption import encrypt_sacc
encrypt_sacc('path/to/sacc.fits', 'path/to/save/the/file/', save_file=True, keep_original=False)

Decrypting files

To decrypt the file, you can use the following command:

smokescreen decrypt --path_to_sacc [path_to_encrypted_sacc] --path_to_key [path_to_file_with_key]

or from a notebook/your code:

from smokescreen.encryption import decrypt_sacc
decrypt_sacc('path/to/encrypted_sacc.encrpt', 'path/to/key.key', save_file=True)

The save_file parameter is optional and is set to True by default. If set to False, the decrypted file will not be saved to disk.

Posterior Concealment (blinding)

Warning

UNDER DEVELOPMENT