(dyamond-library)=
# The DYAMOND Data Library

## The Library structure

The DYAMOND Data Library is located in {file}`/fastdata/ka1081/DYAMOND/`, while most of the data is archived in the DKRZ tape system. The library is structured as follows:

- {file}`data/` contains DYAMOND data sets according to the latest data requests
- {file}`scripts_DYAMOND_summer/` contains some example post-processing scripts for **DYAMOND Summer** data sets
- {file}`indices/` contains the lists of the data archived in DKRZ's tape archive
- {file}`requests/` -- in this folder, requests for data can be made (deprecated)

Since the DYAMOND data sets contain hundreds of terabytes of data, most of the DYAMOND Summer data is archived in DKRZ's tape archive. To access the data sets associated with the project, you will need to use the `get_dyamond_summer` tool. The files will then be downloaded to the {file}`/scratch/k/k202134/INTAKE_CACHE/` folder, which is accessible to all DYAMOND users. Please keep in mind that the quota for this folder is limited when downloading large amounts of data. To stay within the quota, unused data sets are deleted from the {file}`/scratch/k/k202134/INTAKE_CACHE/` folder on a regular basis.

For any questions, please contact us via .

## Accessing DYAMOND data stored in the DKRZ tape system

**DYAMOND Summer** and **DYAMOND Winter** files that are available on disk can be found in {file}`data/`. If some files or even a whole data set are missing, please search for them with our new `get_dyamond_summer` tool and then retrieve them as described below.

:::{warning}
Please note that the tools used in the instructions below are still under development and may change in the future. If you run into any issues, please contact us via .
:::

### Searching for a data set

#### **Load hsm-tools module**

In order to use the `get_dyamond_summer` tool, you first need to load the `hsm-tools` module.
You can run the following commands to load the module and get access to the `get_dyamond_summer` tool, or put them into your {file}`.bash_profile` file to load the module automatically when you log in. Since there may currently be conflicts between different versions of `slk`, we recommend unloading `packems` from your modules before proceeding:

```bash
# Location of the hsm-tools installation
export hsm_tools='/work/k20200/k202134/hsm-tools'
# Make the hsm-tools module files visible, then load the module
module use $hsm_tools/outtake/module
module load hsm-tools/unstable
```

Then check if your StrongLink token is valid:

```bash
slk_helpers session
```

If you do not have access to the tape library, please see the documentation for [slk login](https://docs.dkrz.de/doc/datastorage/hsm/cli.html#slk-login) and [known issues on slk](https://docs.dkrz.de/doc/datastorage/hsm/known_issues.html#ldap-user-not-known-to-stronglink-prior-to-first-login).

#### **Search and retrieve your files using get_dyamond_summer**

To search for files with `get_dyamond_summer`, pass a search pattern to the command:

```bash
get_dyamond_summer MPAS-3.75km/history.2016-08-02
get_dyamond_summer FV3-3.25km.*v200_C3072_144x72.fre.nc
get_dyamond_summer FV3-3.25km.*  # Yields many results
```

The above commands will return a list of files that match the search string. If you want to retrieve the files, add the `--get` flag and prefix the command with `sbatch` to submit it as a job to the batch system:

```bash
sbatch get_dyamond_summer --get MPAS-3.75km/history.2016-08-02
sbatch get_dyamond_summer --get FV3-3.25km.*v200_C3072_144x72.fre.nc
```

The download will then run in the background and **will take hours to days to complete**. You can check the status of the running job via `squeue --me` and by looking at the log file it creates, {file}`get_dyamond_summer.log${SLURM_JOB_ID}`.

**You need to call `sbatch` from a writable directory to ensure the successful creation of the log file.
Otherwise the job will crash immediately and without any messages.**

The data will be downloaded to the {file}`/scratch/k/k202134/INTAKE_CACHE/` folder. You can run the command again without the `--get` flag to see the files that are already downloaded.

:::{warning}
Files in the {file}`scratch` directory that {file}`get_dyamond_summer` downloads to are deleted after two weeks without access, so download and process your data promptly.
:::

:::{note}
`slk` requires regular expressions to search for files. Therefore, please use `regex` syntax when searching for patterns. For example, instead of a bare asterisk (`*`), you need to put a dot before it (`.*`) to match any sequence of characters.
:::
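
The difference between a shell-style `*` and the regular expression `.*` can be tried out locally before running a search. The following is only a minimal sketch: it uses `grep -E` as a stand-in regex matcher, not `slk` itself, so the exact dialect and anchoring rules of `slk` may differ. The file names and the helper functions `sample_files` and `match` are hypothetical and exist only for this illustration.

```bash
#!/usr/bin/env bash
# Illustration only: grep -E stands in for a regex matcher; it is not slk.
# The file names below are made up for this example.

sample_files() {
  printf '%s\n' \
    'FV3-3.25km/pl_v200_C3072_144x72.fre.nc' \
    'FV3-3.25km/pl_u200_C3072_144x72.fre.nc' \
    'MPAS-3.75km/history.2016-08-02_00.00.00.nc'
}

# Print the sample names matched by the extended regular expression in $1.
# "|| true" keeps the exit status zero when nothing matches.
match() {
  sample_files | grep -E -- "$1" || true
}

echo "regex with '.*':"
match 'FV3-3.25km.*v200_C3072_144x72.fre.nc'

echo "shell-style '*' read as a regex:"
match 'FV3-3.25km*v200_C3072_144x72.fre.nc'
```

With these sample names, the first pattern prints only the `v200` file, while the second prints nothing: read as a regular expression, `m*` means "zero or more `m`" rather than "any characters", so the text between `km` and `v200` is never matched.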