Usage
Currently, only data vector concealment is implemented in Smokescreen. Posterior-level concealment is under development.
Data Vector Concealment (blinding)
The Smokescreen library provides a method for blinding data vectors. This method is based on the Muir et al. (2021) data-vector blinding method.
To conceal a data-vector you need the following elements:
A CCL cosmology object
A dictionary of the nuisance parameters used in the likelihood (soon to be deprecated)
A Firecrown Likelihood, which takes a SACC data-vector (see more below). It can be either a path to the python file containing the likelihood or the module itself.
A dictionary of cosmological parameters to be shifted in the format:
# for a random uniform parameter shift:
{'PARAM_Y': (Y_MIN, Y_MAX), 'PARAM_Z': (Z_MIN, Z_MAX)}
# or for a deterministic shift (used for debugging):
{'PARAM_Y': Y_VALUE, 'PARAM_Z': Z_VALUE}
# for a Gaussian parameter shift:
{'PARAM_Y': (MEAN_Y, STD_Y), 'PARAM_Z': (MEAN_Z, STD_Z)}
A SACC data-vector
A random seed as int or string
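The three shift formats listed above can be illustrated with a short sketch. The helper `draw_shifts` is hypothetical and is not Smokescreen's implementation. It uses only Python's standard `random` module and assumes that a pair of numbers means bounds for a flat shift or (mean, std) for a Gaussian shift, while a single number is a deterministic shift. It also shows why passing an int or string seed makes the draw reproducible.

```python
import random


def draw_shifts(shifts_dict, seed, shift_distr="flat"):
    """Draw one shifted value per parameter (illustrative helper, not Smokescreen's API)."""
    # random.Random accepts both int and str seeds, so the draw is reproducible.
    rng = random.Random(seed)
    shifted = {}
    for name, spec in shifts_dict.items():
        if isinstance(spec, (tuple, list)):
            first, second = spec
            if shift_distr == "flat":
                # (MIN, MAX): uniform draw between the two bounds
                shifted[name] = rng.uniform(first, second)
            elif shift_distr == "gaussian":
                # (MEAN, STD): normal draw around the mean
                shifted[name] = rng.gauss(first, second)
            else:
                raise ValueError(f"Unknown shift distribution: {shift_distr!r}")
        else:
            # single number: deterministic shift
            shifted[name] = float(spec)
    return shifted


flat = draw_shifts({"Omega_c": (0.20, 0.42), "sigma8": (0.67, 0.92)}, seed=2112)
gauss = draw_shifts({"Omega_c": (0.27, 0.05)}, seed="my_seed", shift_distr="gaussian")
fixed = draw_shifts({"sigma8": 0.85}, seed=2112)
```

Calling `draw_shifts` twice with the same seed returns the same values, which is the property that lets a concealed data vector be reproduced later by whoever knows the seed.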
Attention
Likelihood Requirements
The blinding module requires the Firecrown likelihood to be built with certain requirements. First, we must be able to build the likelihood by providing a sacc object with the measurements for the data-vector:
def build_likelihood(build_parameters):
    """
    This is a generic likelihood theory model
    for a generic data vector.
    """
    sacc_data = build_parameters['sacc_data']
This is similar to what is currently done in TXPipe.
The likelihood module must also have a method .compute_theory_vector(ModellingTools), which triggers the calculation of the theory vector inside the likelihood object.
Danger
Likelihoods with hardcoded sacc files:
If you provide a Firecrown likelihood with a hardcoded path to a sacc file as the data-vector, Smokescreen will conceal the hardcoded sacc file and not the one you provided, because the likelihood is built with the hardcoded path. Firecrown currently has no checks to prevent a hardcoded sacc file in the build_likelihood(...) function. To avoid this, please build the likelihood as described above.
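Since no automatic check exists, a rough pre-flight test can catch the most obvious cases. The sketch below is a heuristic only: `find_hardcoded_sacc_paths` is a hypothetical helper, not part of Smokescreen or Firecrown, and it merely flags string literals inside `build_likelihood` that end in `.sacc` or `.fits`. It parses the source without executing it, so it will miss paths built at run time.

```python
import ast


def find_hardcoded_sacc_paths(source, suffixes=(".sacc", ".fits")):
    """Return string literals inside build_likelihood that look like data-file paths."""
    tree = ast.parse(source)
    hits = []
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == "build_likelihood":
            for child in ast.walk(node):
                if (isinstance(child, ast.Constant)
                        and isinstance(child.value, str)
                        and child.value.endswith(suffixes)):
                    hits.append(child.value)
    return hits


# Likelihood that takes the data vector from build_parameters (recommended)
good = '''
def build_likelihood(build_parameters):
    sacc_data = build_parameters['sacc_data']
    return sacc_data
'''

# Likelihood with a hardcoded file path (to be avoided)
bad = '''
def build_likelihood(build_parameters):
    sacc_data = load("./cosmicshear_sacc.fits")
    return sacc_data
'''

print(find_hardcoded_sacc_paths(good))  # []
print(find_hardcoded_sacc_paths(bad))   # ['./cosmicshear_sacc.fits']
```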
The likelihood can be provided either as a path to the python file containing the build_likelihood
function or as a python module. In the latter case, the module must be imported.
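The two ways of supplying a likelihood can be reduced to one with the standard `importlib` machinery. The function `load_likelihood_module` below is a hypothetical helper written for illustration, and the module name `user_likelihood` is arbitrary; how Smokescreen resolves the likelihood internally may differ.

```python
import importlib.util
import os
import tempfile
import types


def load_likelihood_module(likelihood):
    """Return a module exposing build_likelihood, given a module or a path to a .py file."""
    if isinstance(likelihood, types.ModuleType):
        module = likelihood
    else:
        # Import the python file located at the given path
        spec = importlib.util.spec_from_file_location("user_likelihood", likelihood)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
    if not callable(getattr(module, "build_likelihood", None)):
        raise AttributeError("The likelihood must define a build_likelihood function")
    return module


# Usage with a throwaway likelihood file
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_likelihood.py")
    with open(path, "w") as f:
        f.write("def build_likelihood(build_parameters):\n"
                "    return build_parameters['sacc_data']\n")
    from_path = load_likelihood_module(path)         # path to the python file
    from_module = load_likelihood_module(from_path)  # already-imported module
```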
TL;DR: Check the Smokescreen notebooks folder for a couple of examples.
From the command line
To blind the data-vector measurements from the command line, run the module as follows:
python -m smokescreen --config configuration_file.yaml
From Smokescreen version 1.3.0 you should call the module as:
smokescreen datavector --config configuration_file.yaml
You can find an example of a configuration file here:
path_to_sacc: "./cosmicshear_sacc.fits"
likelihood_path: "./cosmicshear_likelihood.py"
systematics:
  trc1_delta_z: 0.1
  trc0_delta_z: 0.1
shifts_dict:
  Omega_c: [0.20, 0.42]
  sigma8: [0.67, 0.92]
seed: 2112
shift_distribution: "flat"
# only needed if you want a different reference cosmology
# than ccl.VanillaLCDM
reference_cosmology:
  sigma8: 0.85
keep_original_sacc: true
Warning
By default, the original SACC file is deleted after the encryption. If you want to keep the original SACC file, you can set the `keep_original_sacc` parameter to `true` in the configuration file.
Or you can use the following command to create a template configuration file:
python -m smokescreen --print_config > template_config.yaml
# or in version 1.3.0+
smokescreen datavector --print_config > template_config.yaml
Note that the reference_cosmology is optional. If not provided, the CCL VanillaLCDM reference cosmology will be the one used to compute the data vector.
From a notebook/your code
To blind the data-vector measurements from a notebook or your own code, use the smokescreen module as follows:
# import the modules
import pyccl as ccl
import sacc
from smokescreen import ConcealDataVector
# import the likelihood that contains the model and data vector
[...]
import my_likelihood
# create the cosmology ccl object
cosmo = ccl.Cosmology(Omega_c=0.27,
                      Omega_b=0.045,
                      h=0.67,
                      sigma8=0.8,
                      n_s=0.96,
                      transfer_function='bbks')
# load a sacc object with the data vector [FIXME: this is a placeholder, the sacc object should be loaded from the likelihood]
sacc_data = sacc.Sacc.load_fits('path/to/data_vector.sacc')
# create a dictionary of the necessary firecrown nuisance parameters
syst_dict = {
    "ia_a_1": 1.0,
    "ia_a_2": 0.5,
    "ia_a_d": 0.5,
    "lens0_bias": 2.0,
    "lens0_b_2": 1.0,
    "lens0_b_s": 1.0,
    "lens0_mag_bias": 1.0,
    "src0_delta_z": 0.000,
    "lens0_delta_z": 0.000,
}
# create the smokescreen object
smoke = ConcealDataVector(cosmo, syst_dict, sacc_data, my_likelihood,
                          {'Omega_c': (0.22, 0.32), 'sigma8': (0.7, 0.9)},
                          shift_distr='flat')
# conceals (blinds) the data vector
smoke.calculate_concealing_factor()
concealed_dv = smoke.apply_concealing_to_likelihood_datavec()
# create the smokescreen object with Gaussian shifts
smoke_gaussian = ConcealDataVector(cosmo, syst_dict, sacc_data, my_likelihood,
                                   {'Omega_c': (0.27, 0.05), 'sigma8': (0.8, 0.02)},
                                   shift_distr='gaussian')
# conceals (blinds) the data vector with Gaussian shifts
smoke_gaussian.calculate_concealing_factor()
concealed_dv_gaussian = smoke_gaussian.apply_concealing_to_likelihood_datavec()
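To make the idea behind the concealing factor concrete, here is a toy sketch of the additive form of the Muir et al. scheme: the factor is the difference between the theory vector at the shifted cosmology and at the reference cosmology, and it is added to the measured data vector. Everything in the block is an assumption for illustration: `toy_theory_vector` stands in for the likelihood's theory prediction, the numbers are made up, and whether Smokescreen applies the factor additively by default is not stated here.

```python
def toy_theory_vector(params):
    """Stand-in for the likelihood's theory prediction (purely illustrative)."""
    return [params["sigma8"] * params["Omega_c"] * ell for ell in (1.0, 2.0, 3.0)]


def additive_concealing_factor(theory, reference, shifted):
    """f = t(shifted) - t(reference), element by element."""
    t_ref = theory(reference)
    t_shift = theory(shifted)
    return [s - r for s, r in zip(t_shift, t_ref)]


reference = {"Omega_c": 0.25, "sigma8": 0.8}   # reference cosmology
shifted = {"Omega_c": 0.30, "sigma8": 0.8}     # cosmology after the hidden shift
data_vector = [0.21, 0.39, 0.62]               # made-up measurements

factor = additive_concealing_factor(toy_theory_vector, reference, shifted)
concealed = [d + f for d, f in zip(data_vector, factor)]
```

If the shifted cosmology equals the reference one, the factor is zero and the data vector is unchanged, which is a quick sanity check for any implementation of this scheme.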
To encrypt the original sacc file, follow the instructions in the next section.
Encrypting and Decrypting SACC files
From Smokescreen version 1.3.0, you can encrypt and decrypt SACC files. This is useful when you want to share the data vector with someone else but you don’t want them to see the data. The encryption is done using the cryptography library. It is important to note that the encryption is done using a symmetric key, so the person you are sharing the data with must have the key to decrypt the file.
When running the data vector concealment module, encryption is performed by default. The decryption key is saved in a file with the same name as the original file but a .key extension. The key is saved in the same directory as the encrypted file.
Warning
By default, the original SACC file is deleted after the encryption. If you want to keep the original SACC file, you can set the `keep_original_sacc` parameter to `true` in the configuration file or set the flag `--keep_original true` via the command line.
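For readers unfamiliar with symmetric encryption, the round trip looks like the sketch below. The text above says only that the cryptography library and a symmetric key are used; choosing `Fernet`, that library's standard symmetric recipe, is an assumption made for this example, and the payload is a placeholder rather than a real SACC file.

```python
from cryptography.fernet import Fernet

# Generate a symmetric key; whoever holds it can decrypt the data.
key = Fernet.generate_key()
cipher = Fernet(key)

# Placeholder bytes standing in for the contents of a SACC file
payload = b"pretend these are the bytes of a SACC file"
token = cipher.encrypt(payload)

# The same key recovers the original bytes.
recovered = Fernet(key).decrypt(token)
```

Because one key both encrypts and decrypts, the .key file must be stored and shared as carefully as the data it protects.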
Encrypting files
To encrypt a sacc file (or any file), you can use the following command:
smokescreen encrypt --path_to_sacc path/to/sacc.fits --path_to_save path/to/save/the/file/ [--keep_original true]
This will generate an encrypted file with the extension .encrpt and a key file with the extension .key in the same directory as the encrypted file or in the directory specified by --path_to_save.
You can also encrypt a file from a notebook/your code:
from smokescreen.encryption import encrypt_sacc
encrypt_sacc('path/to/sacc.fits', 'path/to/save/the/file/', save_file=True, keep_original=False)
Decrypting files
To decrypt the file, you can use the following command:
smokescreen decrypt --path_to_sacc [path_to_encrypted_sacc] --path_to_key [path_to_file_with_key]
or from a notebook/your code:
from smokescreen.encryption import decrypt_sacc
decrypt_sacc('path/to/encrypted_sacc.encrpt', 'path/to/key.key', save_file=True)
The save_file parameter is optional and is set to True by default. If set to False, the decrypted file will not be saved to disk.
Posterior Concealment (blinding)
Warning
UNDER DEVELOPMENT