The dregs CLI
The DESC data registry also comes with a Command Line Interface (CLI) tool, dregs, which can perform some simple actions.
See the tutorials section for a demonstration of its usage.
dregs
The data registry CLI interface
usage: dregs [-h] {ls,register,delete} ...
options
- -h, --help
show this help message and exit
dregs delete
usage: dregs delete [-h] {dataset} ...
- -h, --help
show this help message and exit
dregs delete dataset
usage: dregs delete dataset [-h] [--config_file CONFIG_FILE]
[--root_dir ROOT_DIR] [--site SITE]
[--schema SCHEMA]
dataset_id
- dataset_id
The dataset_id you wish to delete
- -h, --help
show this help message and exit
- --config_file <config_file>
Location of data registry config file
- --root_dir <root_dir>
Location of the root_dir
- --site <site>
Get the root_dir through a pre-defined ‘site’
- --schema <schema>
Which schema to connect to
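As an illustration of how the usage string above fits together, the following minimal Python sketch assembles the argument list for `dregs delete dataset`. The helper name `build_delete_command` and the example id and schema name are hypothetical; only the flag names and their order are taken from the usage text.

```python
import shlex


def build_delete_command(dataset_id, config_file=None, root_dir=None,
                         site=None, schema=None):
    """Assemble an argv list for `dregs delete dataset` (hypothetical helper)."""
    cmd = ["dregs", "delete", "dataset"]
    # Optional connection flags are added only when a value was supplied.
    optional = (
        ("--config_file", config_file),
        ("--root_dir", root_dir),
        ("--site", site),
        ("--schema", schema),
    )
    for flag, value in optional:
        if value is not None:
            cmd += [flag, str(value)]
    # The positional dataset_id comes last, matching the usage string.
    cmd.append(str(dataset_id))
    return cmd


# The id and schema name below are made up for illustration.
print(shlex.join(build_delete_command(42, schema="my_schema")))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` on a machine where dregs is installed; building it as a list rather than a single string avoids shell quoting problems.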
dregs ls
usage: dregs ls [-h] [--owner OWNER] [--owner_type {user,group,production}]
[--all] [--config_file CONFIG_FILE] [--root_dir ROOT_DIR]
[--site SITE] [--schema SCHEMA]
- -h, --help
show this help message and exit
- --owner <owner>
List datasets for a given owner
- --owner_type {user,group,production}
List datasets for a given owner type
- --all
List all datasets
- --config_file <config_file>
Location of data registry config file
- --root_dir <root_dir>
Location of the root_dir
- --site <site>
Get the root_dir through a pre-defined ‘site’
- --schema <schema>
Which schema to connect to
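The filters for `dregs ls` can be combined. The sketch below, which assumes nothing beyond the flags listed above, builds such a command and rejects owner types that the `ls` usage string does not list (note that `project` appears under `dregs register dataset` but not here). The helper name `build_ls_command` and the owner name are hypothetical.

```python
# Choices copied from the `dregs ls` usage string.
LS_OWNER_TYPES = ("user", "group", "production")


def build_ls_command(owner=None, owner_type=None, show_all=False, schema=None):
    """Assemble an argv list for `dregs ls` (hypothetical helper)."""
    if owner_type is not None and owner_type not in LS_OWNER_TYPES:
        raise ValueError(
            f"owner_type must be one of {LS_OWNER_TYPES}, got {owner_type!r}"
        )
    cmd = ["dregs", "ls"]
    if owner is not None:
        cmd += ["--owner", owner]
    if owner_type is not None:
        cmd += ["--owner_type", owner_type]
    if show_all:
        # --all is a plain switch and takes no value.
        cmd.append("--all")
    if schema is not None:
        cmd += ["--schema", schema]
    return cmd


# "alice" is a made-up owner name.
print(build_ls_command(owner="alice", owner_type="user"))
```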
dregs register
usage: dregs register [-h] {dataset} ...
- -h, --help
show this help message and exit
dregs register dataset
usage: dregs register dataset [-h] [--name NAME]
[--version_suffix VERSION_SUFFIX]
[--creation_date CREATION_DATE]
[--access_API ACCESS_API] [--owner OWNER]
[--owner_type {user,group,project,production}]
[--description DESCRIPTION]
[--execution_id EXECUTION_ID]
[--is_overwritable]
[--location_type {dataregistry,external,dummy}]
[--url URL] [--contact_email CONTACT_EMAIL]
[--old_location OLD_LOCATION] [--make_symlink]
[--execution_name EXECUTION_NAME]
[--execution_description EXECUTION_DESCRIPTION]
[--execution_start EXECUTION_START]
[--execution_site EXECUTION_SITE]
[--execution_configuration EXECUTION_CONFIGURATION]
[--input_datasets INPUT_DATASETS [INPUT_DATASETS ...]]
[--config_file CONFIG_FILE]
[--root_dir ROOT_DIR] [--site SITE]
[--schema SCHEMA]
relative_path version
- relative_path
Destination for the dataset within the data registry. Path is relative to <registry root>/<owner_type>/<owner>.
- version
Semantic version string of the format MAJOR.MINOR.PATCH, or one of the special flags “patch”, “minor” or “major”. When a special flag is used, the corresponding part of the version is bumped automatically for you (see examples for more details).
- -h, --help
show this help message and exit
- --name <name>
Any convenient, evocative, human-readable name. Note that the combination of name, version and version_suffix must be unique. If None, the name is generated from the relative path.
- --version_suffix <version_suffix>
Optional version suffix to place at the end of the version string. Cannot be used for production datasets.
- --creation_date <creation_date>
Dataset creation date
- --access_API <access_api>
Describes the software that can read the dataset (e.g., ‘gcr-catalogs’, ‘skyCatalogs’)
- --owner <owner>
Owner of the dataset (defaults to $USER)
- --owner_type {user,group,project,production}
Dataset's owner type; can be ‘user’, ‘group’, ‘project’ or ‘production’. (default=user)
- --description <description>
User-provided, human-readable description of the dataset
- --execution_id <execution_id>
Execution this dataset is linked to
- --is_overwritable
True means this dataset can be overwritten in the future
- --location_type {dataregistry,external,dummy}
What is the physical location of the data? ‘dataregistry’ means the data is located within <root_dir> and managed by the dataregistry. ‘external’ means the data is not managed by the dataregistry, either because it is off-site or because it is stored outside <root_dir>, so only a database entry is created (in this case a url or contact_email must be provided during registration). ‘dummy’ is a dataset for testing purposes only (only a database entry is created in this case). (default=dataregistry)
- --url <url>
URL that points to the data (used in the case of external datasets, i.e., location_type=‘external’).
- --contact_email <contact_email>
Contact information for someone regarding the dataset.
- --old_location <old_location>
Absolute location of dataset to copy. If None, the dataset should already be at the correct relative_path.
- --make_symlink
Flag to make symlink to data rather than copy any files.
- --execution_name <execution_name>
Typically pipeline name or program name
- --execution_description <execution_description>
Human-readable description of the execution
- --execution_start <execution_start>
Date the execution started
- --execution_site <execution_site>
Where was the execution performed?
- --execution_configuration <execution_configuration>
Path to text file used to configure the execution
- --input_datasets <input_datasets>
List of dataset ids that were the input to this execution
- --config_file <config_file>
Location of data registry config file
- --root_dir <root_dir>
Location of the root_dir
- --site <site>
Get the root_dir through a pre-defined ‘site’
- --schema <schema>
Which schema to connect to
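Two of the register arguments above describe small computations: the special version flags and the destination path. The sketch below illustrates both under stated assumptions. It assumes a bump follows the usual semantic versioning convention of resetting the lower components to zero, and that the destination is simply the root, owner type, owner and relative path joined in that order; dregs performs these steps itself and its actual rules may differ. The function names, paths and owner name are hypothetical.

```python
from pathlib import PurePosixPath


def bump_version(latest, flag):
    """Return the next MAJOR.MINOR.PATCH string for a special flag.

    Assumption: conventional semantic versioning, where bumping one
    component resets the components to its right to zero.
    """
    major, minor, patch = (int(part) for part in latest.split("."))
    if flag == "major":
        return f"{major + 1}.0.0"
    if flag == "minor":
        return f"{major}.{minor + 1}.0"
    if flag == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown flag {flag!r}; expected major, minor or patch")


def destination(registry_root, owner_type, owner, relative_path):
    """Join <registry root>/<owner_type>/<owner>/<relative_path>."""
    return PurePosixPath(registry_root) / owner_type / owner / relative_path


# Made-up values for illustration.
print(bump_version("1.2.3", "minor"))
print(destination("/data/registry", "user", "alice", "my/dataset"))
```

On the command line the same intent would be expressed by passing `minor` as the version positional, for example `dregs register dataset my/dataset minor`, leaving the arithmetic to the tool.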