The DYAMOND Data Library
The Library Structure
Because the DYAMOND data sets comprise several petabytes, most of the data is stored in DKRZ’s tape archive. To access the data sets associated with the project, we provide the get_dyamond_summer and get_dyamond_winter tools (see below). Retrieved files are stored on the Levante disks and shared among all DYAMOND users. The quota for this storage is limited, so keep this in mind when downloading large amounts of data. To save space, data sets are removed from the disks after two weeks without access.
For any questions, please contact us at dyamond@esiwace.eu.
Searching for a Data Set
Load the hsm-tools module
To use get_dyamond_summer and get_dyamond_winter, you first need to load the hsm-tools module. Run the following commands, or add them to your .bash_profile to load the module automatically when you log in. Because different versions of slk may conflict with each other, we recommend unloading the packems module before proceeding:
module use /work/k20200/k202134/hsm-tools/outtake/module
module load hsm-tools/unstable
Use slk login to authenticate with the tape library:
slk login
If you do not have access to the tape library, please see the documentation for slk login and known issues on slk.
Search and retrieve your files
get_dyamond_summer and get_dyamond_winter use the same syntax. For brevity, we only show examples for get_dyamond_summer here; the same commands apply to get_dyamond_winter. Search patterns are regular expressions (for example, .* matches any sequence of characters). Downloaded files are printed to stdout so that they can be piped into and reused by other commands; any other output goes to stderr. Here are some examples:
# Get the description files of all runs
get_dyamond_summer datadescription.txt
# Get all files containing MPAS-3.75km/history.2016-08-02
get_dyamond_summer MPAS-3.75km/history.2016-08-02
# A looser search using .* to fill a gap in the file name
# (quoted so that the shell does not try to expand the * itself)
get_dyamond_summer 'FV3-3.25km.*v200_C3072_144x72.fre.nc'
# Anything from FV3-3.25km; 2>&1 redirects stderr to stdout so that less shows all output
get_dyamond_summer FV3-3.25km 2>&1 | less
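Because file paths go to stdout and everything else goes to stderr, the two streams can be captured separately. The following is a minimal sketch of that pattern, not part of the official tooling. The stand-in function and its made-up paths are hypothetical and exist only so the snippet runs outside Levante; on Levante the real get_dyamond_summer is used instead.

```shell
# Hypothetical stand-in so this sketch also runs where hsm-tools is not loaded.
# It is NOT the real tool: it prints two made-up paths to stdout and one
# status line to stderr.
if ! command -v get_dyamond_summer >/dev/null 2>&1; then
    get_dyamond_summer() {
        echo "searching for: $1" >&2
        printf '%s\n' \
            "/hypothetical/scratch/run_a/datadescription.txt" \
            "/hypothetical/scratch/run_b/datadescription.txt"
    }
fi

# stdout (file paths) goes into the list, stderr (status messages) into a log.
filelist=$(mktemp)
get_dyamond_summer datadescription.txt > "$filelist" 2> "${filelist}.log"

# Count the paths and loop over them, one path per line.
n_files=$(wc -l < "$filelist" | tr -d ' ')
echo "found $n_files downloaded file(s)"
while IFS= read -r path; do
    echo "would process: $path"
done < "$filelist"
```

Replacing the echo in the loop with a processing command of your choice turns this into a simple per-file pipeline.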
The above commands will return a list of files that match the search string. To retrieve the files, add the --get flag to the command and use sbatch to submit the job to the batch system:
sbatch get_dyamond_summer --get MPAS-3.75km/history.2016-08-02
sbatch get_dyamond_summer --get 'FV3-3.25km.*v200_C3072_144x72.fre.nc'
The download runs in the background and may take hours to days to complete. You can check the status of the running job with squeue --me and by looking at the log file it creates: get_dyamond_summer.log$SLURM_JOB_ID. Call sbatch from a writable directory so that the log file can be created; otherwise, the job will crash immediately without any message. Run the command again without the --get flag to see which files have already been downloaded.
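Since a non-writable submit directory makes the job fail silently, it can help to check the directory before calling sbatch. The sketch below shows one way to do that. The helper name can_write_log_here is hypothetical, and the sbatch call is left as a comment so the snippet runs anywhere.

```shell
# Hypothetical helper: succeed only if the directory exists and is writable,
# so that the job's log file can be created there.
can_write_log_here() {
    [ -d "$1" ] && [ -w "$1" ]
}

if can_write_log_here "$PWD"; then
    submit_status="ok"
    # On Levante, the real submission would go here, for example:
    # job_id=$(sbatch --parsable get_dyamond_summer --get 'MPAS-3.75km/history.2016-08-02')
    # tail -f "get_dyamond_summer.log${job_id}"
else
    submit_status="not writable"
    echo "Directory $PWD is not writable; change to a writable one first." >&2
fi
echo "submit check: $submit_status"
```

The commented lines assume sbatch --parsable, which prints the job ID so that the log file name can be built from it.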
Warning
Files in the scratch directory that get_dyamond_summer downloads to are deleted after two weeks without access. Download and process your data promptly.
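To see which files are approaching the two-week limit, you can list files by their last access time. This is only a sketch: it assumes that the cleanup is based on the access time (atime) reported by the file system, which is not confirmed here, and the function name list_stale_files is hypothetical.

```shell
# Hypothetical helper: list regular files below a directory whose last access
# lies more than N whole days in the past (default 10).
# find rounds the age down to whole days, so +10 means 11 days or more.
list_stale_files() {
    find "$1" -type f -atime +"${2:-10}" -print
}

# Example call on the current directory; replace it with the directory
# that get_dyamond_summer reports for your files.
list_stale_files "$PWD" 10
```

Reading a file normally refreshes its access time, so files that you are actively processing should not show up in this list.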
Note
get_dyamond_summer interprets search strings as regular expressions, not as shell wildcards. To match any sequence of characters, use a dot followed by an asterisk (.*) rather than an asterisk alone (*).
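The difference between .* and a bare * can be tried locally with grep -E. This sketch assumes that the tool's regex flavor behaves like extended regular expressions for these simple patterns. The file names are made up for illustration.

```shell
# Made-up file names, for illustration only.
names='FV3-3.25km/example_v200_C3072_144x72.fre.nc
FV3-3.25km/example_u200_C3072_144x72.fre.nc
MPAS-3.75km/history.2016-08-02'

# Regex style: ".*" matches any sequence of characters between the two parts.
regex_hits=$(printf '%s\n' "$names" | grep -E 'FV3-3.25km.*v200_C3072_144x72.fre.nc')

# Wildcard style: in a regex, "m*" means "zero or more m", so nothing is
# allowed between "km" and "v200" and no name matches.
glob_style_hits=$(printf '%s\n' "$names" | grep -E 'FV3-3.25km*v200_C3072_144x72.fre.nc' || true)

echo "regex pattern matched:    ${regex_hits:-<nothing>}"
echo "wildcard pattern matched: ${glob_style_hits:-<nothing>}"
```

Quoting the pattern, as done here, also keeps the shell from expanding the asterisk against files in the current directory.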