.. _intake-query_yaml-command-line:

Finding files on the command line with query_yaml
==================================================

To load :code:`query_yaml`, use

.. code:: bash

    module use /work/k20200/k202134/hsm-tools/outtake/module
    module load hsm-tools/unstable

Then you can search for files with :code:`query_yaml`. Calling it without any arguments displays a tree view of the nextGEMS catalog. Adding names of sub-trees narrows the search (e.g. :code:`query_yaml ICON`). Once you have narrowed it down to one dataset, the contents of that dataset are listed (:code:`query_yaml ICON ngc4008`).

In general, using :code:`--cdo` with :code:`--var NAME` on one specific dataset is a good choice if you want to use the output of :code:`query_yaml` with :code:`cdo`.

The full list of options can be obtained from the help function

.. literalinclude:: query_yaml_help

Dealing with dataset variants
-----------------------------

zarr datasets with various variants
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Variants are indicated in parentheses after the dataset name, e.g. :code:`ngc4008 (time, zoom)`.
- :code:`query_yaml` will be fast.
- Use :code:`--search_args`, e.g. :code:`--search_args time=PT3H zoom=5`, to select the desired file set.
- Combine with :code:`--cdo` to get the decorations needed for opening with cdo (or other libnetcdf-based utilities).
- Note that the resulting dataset will still contain a lot of variables, so don't just feed it into :code:`cdo -timmean`.

Datasets spread over various netCDF/files (no kerchunk)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- Without :code:`--uri`, :code:`query_yaml` will be slow to show the contents of the dataset, as it has to open all files to check their contents.
- Using :code:`query_yaml` with :code:`--uri` but without :code:`--var NAME` lists all files of the dataset, regardless of which variable you are interested in (this may or may not be useful).
- Combine :code:`--uri` with :code:`--var` to get the files for a specific variable: :code:`query_yaml FESOM IFS_4.4-FESOM_5-cycle3 2D_1h_native --uri --var sst`

Datasets represented via kerchunk (some netCDF, FDB/GRIB)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- :code:`query_yaml` will be fast.
- Plain :code:`--uri` will lead you to the index.
- Use :code:`--cdo` with :code:`--var NAME` to get the actual file names.
- File names are sorted alphabetically as a best guess. Whether this matches the correct order in time depends on how the person who created the files named them.

see also

.. _link: ../healpix/regridding
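
As a rough sketch of the zarr workflow described above, the following script assembles one query and, if both tools are available, passes the result to :code:`cdo sinfov` for a quick look at the selected data. The variable name :code:`tas` is only a placeholder. It is also an assumption that the :code:`--cdo` output can be used directly as a :code:`cdo` input argument and that further options may follow :code:`--search_args` in this order; check the help output before relying on either.

```shell
#!/usr/bin/env bash
# Sketch only: select one variable from one dataset variant and inspect it.
# "tas" is a placeholder variable name, not taken from the catalog.

# Build the command as an array so every argument stays a separate word.
query=(query_yaml ICON ngc4008 --search_args time=PT3H zoom=5 --cdo --var tas)

# Show what would be run.
echo "${query[*]}"

# Only call the tools if both are on the PATH (e.g. after "module load").
if command -v query_yaml >/dev/null 2>&1 && command -v cdo >/dev/null 2>&1; then
    # Assumption: the --cdo output is directly usable as cdo input.
    # Left unquoted on purpose so several file names become several arguments.
    cdo sinfov $("${query[@]}")
fi
```

Restricting the query to a single variable before calling an operator such as :code:`cdo -timmean` avoids processing every variable in the dataset.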