The `dregs` CLI

The DESC data registry also comes with a Command Line Interface (CLI) tool, `dregs`, which can list, register, modify, and delete datasets and show the available keywords.

See the tutorials section for a demonstration of its usage.

dregs

The data registry command line interface

usage: dregs [-h] {show,ls,modify,register,delete} ...

options

-h, --help: show this help message and exit

dregs delete

usage: dregs delete [-h] {dataset_by_id,dataset} ...

-h, --help: show this help message and exit

dregs delete dataset

usage: dregs delete dataset [-h] [--config_file CONFIG_FILE]
                            [--root_dir ROOT_DIR] [--site SITE]
                            [--schema SCHEMA] [--namespace NAMESPACE]
                            [--entry_mode ENTRY_MODE]
                            name version_string owner owner_type

name: The dataset name for the dataset you wish to delete

version_string: The dataset version_string for the dataset you wish to delete

owner: The dataset owner for the dataset you wish to delete

owner_type: The dataset owner_type for the dataset you wish to delete

-h, --help: show this help message and exit

--config_file <config_file>: Location of data registry config file

--root_dir <root_dir>: Location of the root_dir

--site <site>: Get the root_dir through a pre-defined ‘site’

--schema <schema>: Which schema to connect to

--namespace <namespace>: Which namespace to connect to

--entry_mode <entry_mode>: Which schema to default to in the namespace (working is default)
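The four positional arguments must be supplied in the order given in the usage line. As a minimal sketch, the command can be assembled from Python as an argument list. The dataset name, version, and owner below are hypothetical values, and the command is only printed, not executed.

```python
import shlex


def delete_dataset_cmd(name, version_string, owner, owner_type):
    """Build the argument list for `dregs delete dataset`.

    The order of the positionals follows the usage line:
    name, version_string, owner, owner_type.
    """
    return ["dregs", "delete", "dataset", name, version_string, owner, owner_type]


# Hypothetical dataset identifiers, for illustration only.
cmd = delete_dataset_cmd("my_catalog", "1.0.0", "alice", "user")

# Print the shell-safe command line instead of running it.
print(shlex.join(cmd))
```

Once the printed command has been checked, the list could be passed to `subprocess.run(cmd, check=True)`.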

dregs delete dataset_by_id

usage: dregs delete dataset_by_id [-h] [--config_file CONFIG_FILE]
                                  [--root_dir ROOT_DIR] [--site SITE]
                                  [--schema SCHEMA] [--namespace NAMESPACE]
                                  [--entry_mode ENTRY_MODE]
                                  dataset_id

dataset_id: The dataset_id you wish to delete

-h, --help: show this help message and exit

--config_file <config_file>: Location of data registry config file

--root_dir <root_dir>: Location of the root_dir

--site <site>: Get the root_dir through a pre-defined ‘site’

--schema <schema>: Which schema to connect to

--namespace <namespace>: Which namespace to connect to

--entry_mode <entry_mode>: Which schema to default to in the namespace (working is default)
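The connection options (`--config_file`, `--root_dir`, `--site`, `--schema`, `--namespace`, `--entry_mode`) recur on every subcommand, so a small helper can turn them into arguments. The sketch below is illustrative only: the helper name `connection_args`, the dataset id, the config path, and the namespace name are all hypothetical.

```python
import shlex


def connection_args(**options):
    """Convert keyword options into `--key value` pairs, skipping None values."""
    args = []
    for key, value in options.items():
        if value is not None:
            args += [f"--{key}", str(value)]
    return args


# Hypothetical values: dataset_id 42, a config path, and a namespace name.
cmd = (
    ["dregs", "delete", "dataset_by_id"]
    + connection_args(
        config_file="/path/to/config_file",
        schema=None,  # left unset, so no --schema argument is emitted
        namespace="my_namespace",
    )
    + ["42"]
)

# Print the command for review rather than executing the deletion.
print(shlex.join(cmd))
```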

dregs ls

usage: dregs ls [-h] [--owner OWNER]
                [--owner_type {user,group,production,project}] [--name NAME]
                [--return_cols RETURN_COLS [RETURN_COLS ...]]
                [--max_rows MAX_ROWS] [--max_chars MAX_CHARS]
                [--keyword KEYWORD] [--config_file CONFIG_FILE]
                [--root_dir ROOT_DIR] [--site SITE] [--schema SCHEMA]
                [--namespace NAMESPACE] [--entry_mode ENTRY_MODE]

-h, --help: show this help message and exit

--owner <owner>: List datasets for a given owner. By default owner is $USER. Selecting ‘–owner none’ will return results from all owners.

--owner_type {user,group,production,project}: List datasets for a given owner type

--name <name>: Only return datasets with a given name (wildcard support)

--return_cols <return_cols>: List of columns to return in the query

--max_rows <max_rows>: Maximum number of rows to print (default 500)

--max_chars <max_chars>: Maximum number of characters to print in a column (default 40)

--keyword <keyword>: Keyword to filter by

--config_file <config_file>: Location of data registry config file

--root_dir <root_dir>: Location of the root_dir

--site <site>: Get the root_dir through a pre-defined ‘site’

--schema <schema>: Which schema to connect to

--namespace <namespace>: Which namespace to connect to

--entry_mode <entry_mode>: Which schema to default to in the namespace (working is default)
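Because `--owner` defaults to `$USER`, a query across all owners has to pass `--owner none` explicitly. The following sketch builds two such queries from Python. The helper `ls_cmd` and the keyword `simulation` are hypothetical, and the commands are printed rather than executed.

```python
import shlex

# Choices taken from the usage line for --owner_type.
OWNER_TYPES = {"user", "group", "production", "project"}


def ls_cmd(owner=None, owner_type=None, keyword=None, max_rows=None):
    """Build the argument list for `dregs ls` from optional filters."""
    cmd = ["dregs", "ls"]
    if owner is not None:
        cmd += ["--owner", owner]
    if owner_type is not None:
        if owner_type not in OWNER_TYPES:
            raise ValueError(f"owner_type must be one of {sorted(OWNER_TYPES)}")
        cmd += ["--owner_type", owner_type]
    if keyword is not None:
        cmd += ["--keyword", keyword]
    if max_rows is not None:
        cmd += ["--max_rows", str(max_rows)]
    return cmd


# "none" lifts the default restriction to $USER.
all_owners = ls_cmd(owner="none", max_rows=20)

# "simulation" is a hypothetical keyword used only for illustration.
production = ls_cmd(owner="none", owner_type="production", keyword="simulation")

print(shlex.join(all_owners))
print(shlex.join(production))
```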

dregs modify

usage: dregs modify [-h] {dataset} ...

-h, --help: show this help message and exit

dregs modify dataset

usage: dregs modify dataset [-h] [--config_file CONFIG_FILE]
                            [--root_dir ROOT_DIR] [--site SITE]
                            [--schema SCHEMA] [--namespace NAMESPACE]
                            [--entry_mode ENTRY_MODE]
                            dataset_id column new_value

dataset_id: dataset_id of dataset to modify

column: Column in the dataset table to modify

new_value: Updated value

-h, --help: show this help message and exit

--config_file <config_file>: Location of data registry config file

--root_dir <root_dir>: Location of the root_dir

--site <site>: Get the root_dir through a pre-defined ‘site’

--schema <schema>: Which schema to connect to

--namespace <namespace>: Which namespace to connect to

--entry_mode <entry_mode>: Which schema to default to in the namespace (working is default)
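A new value that contains spaces has to reach `dregs` as a single argument. The sketch below shows one way to guarantee that from Python by keeping each argument as a separate list element. The dataset id is hypothetical, and it is an assumption that `description` is a column that may be modified.

```python
import shlex


def modify_dataset_cmd(dataset_id, column, new_value):
    """Build the argument list for `dregs modify dataset`."""
    return ["dregs", "modify", "dataset", str(dataset_id), column, str(new_value)]


# Hypothetical id and value; `description` is assumed to be modifiable.
cmd = modify_dataset_cmd(42, "description", "Reprocessed with updated masks")

# shlex.join quotes the multi-word value so a shell would pass it as one argument.
print(shlex.join(cmd))
```

When typing the command directly in a shell, the same effect is obtained by quoting the value.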

dregs register

usage: dregs register [-h] {dataset} ...

-h, --help: show this help message and exit

dregs register dataset

usage: dregs register dataset [-h] [--relative_path RELATIVE_PATH]
                              [--creation_date CREATION_DATE]
                              [--access_api ACCESS_API] [--owner OWNER]
                              [--owner_type {user,group,project,production}]
                              [--description DESCRIPTION]
                              [--execution_id EXECUTION_ID]
                              [--is_overwritable]
                              [--location_type {dataregistry,external,meta_only,dummy}]
                              [--url URL] [--contact_email CONTACT_EMAIL]
                              [--old_location OLD_LOCATION]
                              [--execution_name EXECUTION_NAME]
                              [--execution_description EXECUTION_DESCRIPTION]
                              [--execution_start EXECUTION_START]
                              [--execution_site EXECUTION_SITE]
                              [--execution_configuration EXECUTION_CONFIGURATION]
                              [--input_datasets INPUT_DATASETS [INPUT_DATASETS ...]]
                              [--keywords KEYWORDS [KEYWORDS ...]]
                              [--config_file CONFIG_FILE]
                              [--root_dir ROOT_DIR] [--site SITE]
                              [--schema SCHEMA] [--namespace NAMESPACE]
                              [--entry_mode ENTRY_MODE]
                              name version

name: Any convenient, evocative name for the human. Note the combination of name and version must be unique.

version: Semantic version string of the format MAJOR.MINOR.PATCH or a specialflag “patch”, “minor” or “major”. When a special flag is used itautomatically bumps the relative version for you (see examples for moredetails).

-h, --help: show this help message and exit

--relative_path <relative_path>: Path where the data is stored, relative to <root_dir>. If None, it is generated from the name and version_string.

--creation_date <creation_date>: Dataset creation date

--access_api <access_api>: Describes the software that can read the dataset (e.g., ‘GCRCatalogs’, ‘skyCatalogs’)

--owner <owner>: Owner of the dataset (defaults to $USER)

--owner_type {user,group,project,production}: The dataset’s owner type, can be ‘user’, ‘group’, ‘project’ or ‘production’. (default=user)

--description <description>: User provided human-readable description of the dataset

--execution_id <execution_id>: Execution this dataset is linked to

--is_overwritable: True means this dataset can be overwritten in the future

--location_type {dataregistry,external,meta_only,dummy}: What is the physical location of the data? ‘dataregistry’ means the data is located within <root_dir> and managed by the dataregistry. ‘external’ means the data is not managed by the dataregistry, either because it is off-site or because it is stored outside <root_dir>, so there is only a database entry (in this case a url or contact_email must be provided during registration). ‘meta_only’ is for a legitimate entry involving no actual data, but possibly referring to other entries which do directly reference managed data, as may happen for some GCRCatalogs entries. ‘dummy’ is a dataset for internal testing purposes only. The data registry will only attempt to manage data created with this field set to ‘dataregistry’. (default=dataregistry)

--url <url>: URL that points to the data (used in the case of external datasets, i.e., location_type=‘external’).

--contact_email <contact_email>: Contact information for someone regarding the dataset.

--old_location <old_location>: Absolute location of dataset to copy. If None, the dataset should already be at the correct relative_path.

--execution_name <execution_name>: Typically pipeline name or program name

--execution_description <execution_description>: Human-readable description of execution

--execution_start <execution_start>: Date the execution started

--execution_site <execution_site>: Where was the execution performed?

--execution_configuration <execution_configuration>: Path to text file used to configure the execution

--input_datasets <input_datasets>: List of dataset ids that were the input to this execution

--keywords <keywords>: List of (predefined) keywords to tag dataset with

--config_file <config_file>: Location of data registry config file

--root_dir <root_dir>: Location of the root_dir

--site <site>: Get the root_dir through a pre-defined ‘site’

--schema <schema>: Which schema to connect to

--namespace <namespace>: Which namespace to connect to

--entry_mode <entry_mode>: Which schema to default to in the namespace (working is default)
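Two rules from the descriptions above can be checked before calling `dregs`: the version must be MAJOR.MINOR.PATCH or one of the bump flags, and an external dataset needs a url or contact_email. The sketch below applies both and builds the command. The helper names, dataset names, keyword, and URL are hypothetical, and the version pattern is an approximation of whatever validation the tool itself performs. It places the positional arguments first so that the multi-value `--keywords` option cannot absorb them, which is a precaution that assumes standard argparse parsing.

```python
import re
import shlex

SEMVER = re.compile(r"\d+\.\d+\.\d+")
BUMP_FLAGS = {"major", "minor", "patch"}


def is_valid_version(version):
    """True for MAJOR.MINOR.PATCH strings or one of the bump flags."""
    return version in BUMP_FLAGS or SEMVER.fullmatch(version) is not None


def register_dataset_cmd(name, version, description=None, keywords=(),
                         location_type=None, url=None, contact_email=None):
    """Build the argument list for `dregs register dataset`."""
    if not is_valid_version(version):
        raise ValueError(f"bad version: {version!r}")
    if location_type == "external" and not (url or contact_email):
        raise ValueError("external datasets need a url or contact_email")
    # Positionals first, so --keywords (one or more values) cannot swallow them.
    cmd = ["dregs", "register", "dataset", name, version]
    if description is not None:
        cmd += ["--description", description]
    if location_type is not None:
        cmd += ["--location_type", location_type]
    if url is not None:
        cmd += ["--url", url]
    if contact_email is not None:
        cmd += ["--contact_email", contact_email]
    if keywords:
        cmd += ["--keywords", *keywords]
    return cmd


# All names, keywords, and URLs below are hypothetical.
first = register_dataset_cmd("my_catalog", "1.0.0", description="First release")
bumped = register_dataset_cmd("my_catalog", "patch", keywords=["simulation"])
external = register_dataset_cmd(
    "remote_catalog", "0.1.0",
    location_type="external", url="https://example.org/data",
)

for cmd in (first, bumped, external):
    print(shlex.join(cmd))
```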

dregs show

usage: dregs show [-h] {keywords} ...

-h, --help: show this help message and exit

dregs show keywords

usage: dregs show keywords [-h] [--config_file CONFIG_FILE]
                           [--root_dir ROOT_DIR] [--site SITE]
                           [--schema SCHEMA] [--namespace NAMESPACE]
                           [--entry_mode ENTRY_MODE]

-h, --help: show this help message and exit

--config_file <config_file>: Location of data registry config file

--root_dir <root_dir>: Location of the root_dir

--site <site>: Get the root_dir through a pre-defined ‘site’

--schema <schema>: Which schema to connect to

--namespace <namespace>: Which namespace to connect to

--entry_mode <entry_mode>: Which schema to default to in the namespace (working is default)
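Since this subcommand only displays information, it is a reasonable candidate for calling from a script. The sketch below runs it when `dregs` is found on the PATH and otherwise reports the command it would have run. The helper name `show_keywords` is hypothetical, and the output is returned as raw text because its format is not specified here.

```python
import shutil
import subprocess


def show_keywords(schema=None):
    """Run `dregs show keywords` if available; return (command, stdout or None)."""
    cmd = ["dregs", "show", "keywords"]
    if schema is not None:
        cmd += ["--schema", schema]
    if shutil.which("dregs") is None:
        # dregs is not installed in this environment, so nothing is executed.
        return cmd, None
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return cmd, result.stdout


cmd, output = show_keywords()
if output is None:
    print("dregs not found on PATH; would run:", " ".join(cmd))
else:
    print(output)
```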
