Selecting and Reindexing of Area of Interest#

The elegant way of dealing with large output#

Before working through this script, it helps to have a look at Triangular Meshes and Basic Plotting to get a basic understanding of plotting on a triangular grid.

Three-dimensional global ICON output requires many times more memory than two-dimensional output, and handling such large amounts of data can quickly exhaust the available memory. If you are sitting right next to a supercomputer, it is tempting to just request more RAM and go for it. However, there are elegant alternatives to this brute-force approach, and since a lot of RAM also means a lot of electricity and a lot of coolant, the elegant way also saves limited resources.

In many cases we do not look at the complete global output but at a specific, selected area. It is therefore advisable to cut out only this area. For this purpose it is very helpful to reduce the grid information from the global grid file to the area of interest, in such a way that the new indexing is consistent: it starts at 0 and counts up continuously. The advantage is that we generate a new local grid file, which looks like the global grid file but is much smaller and therefore easier and faster to handle.

So let’s do something for the climate 🌍 and first of all load the necessary libraries and the global grid-file:

import xarray as xr
import numpy as np

Constructing New Grid with Selected Cells, Vertices and Edges#

Wow, that’s already great! We have collected a lot of information about our green window in the form of index arrays. We merge them into one dataset so that everything is compact:

It could be that we need more variables for future calculations. Therefore we create a dictionary with further interesting variables, which we reindex as a precaution.

vars_to_renumber = {
    "cell": [
        "adjacent_cell_of_edge",
        "cells_of_vertex",
        "neighbor_cell_index",
        "cell_index",
    ],
    "vertex": ["vertex_of_cell", "edge_vertices", "vertices_of_vertex"],
    "edge": ["edge_of_cell", "edges_of_vertex"],
}
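Not every grid file necessarily contains every variable listed here, and a function that looks up each name in the grid would fail on a missing one. As a small, optional sketch, the dictionary can be filtered against the names that are actually available. The helper name keep_available and the example names passed to it are assumptions made for illustration, not part of the original script.

```python
def keep_available(vars_to_renumber, available_names):
    """Return a copy of the dictionary without names missing from available_names."""
    available = set(available_names)
    return {
        dim: [name for name in names if name in available]
        for dim, names in vars_to_renumber.items()
    }


# Hypothetical example: only two of the four listed variables are available.
example = {
    "cell": ["cell_index", "neighbor_cell_index"],
    "vertex": ["vertex_of_cell"],
    "edge": ["edge_of_cell"],
}
filtered = keep_available(example, ["neighbor_cell_index", "vertex_of_cell"])
print(filtered)
```

With an xarray dataset, the available names could be taken from its data variables, for example by passing grid.data_vars as the second argument.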

We now come to the heart of this script: the reindexing.
Several things happen here, which is why it is best to define a function. This function, reindex_grid(), takes 3 inputs and returns 1 output. The inputs are the original, complete grid and the two objects that describe what should be reindexed: indices and vars_to_renumber. The indices define the selected cells, vertices and edges. The vars_to_renumber are all additional variables that we are still interested in; their entries refer to cells, vertices or edges. The output of our function is a new_grid containing all indices and variables for our green window around the birthplace of Max Planck, renumbered so that the counting starts again at the beginning.

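Let’s go through it step by step. The line numbers count statements of the function shown below, so the multi-line .isel() call counts as line 2: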

Line 1: We define a function with 3 input variables.

Line 2: We define as new_grid the area in the old grid that contains the indices we selected at the beginning of this script for the cells, vertices and edges. For this we use the .load() function, which loads the 17GB file into memory and processes it there: this is a little faster.

Line 3: We open a for-loop that accesses the coordinates and the entries of the array selected_indices.

Line 4: We create an array called renumbering that has the length of the corresponding dimension in the original, old grid and is filled only with -2 (a sentinel value, like nan but as an integer).

Line 5: At the positions of the long renumbering array that belong to the selected dark green area, we write the numbers 0, 1, 2, ... up to the length of the short, previously selected index array minus one. What we get is an array with the length of the original grid dimension (20971520 cells, 10485762 vertices, 31457280 edges) that contains a value other than -2 only at the positions that lie inside the dark green selected window.

Line 6: We open another for-loop over the remaining variables in vars_to_renumber that are to be reindexed.

Line 7: For each variable stored in the dictionary under a particular dimension (cell, vertex, edge), we access its values in new_grid (line 2) and subtract 1 to get into Python's 0-based system; this happens inside the square brackets on the right side of the equal sign. We use these values as positions in the renumbering array to look up the new numbers, and finally add 1 so that new_grid follows the same 1-based convention as the original grid.

Line 8: We output the new_grid.

def reindex_grid(grid, indices, vars_to_renumber):
    new_grid = grid.load().isel(
        cell=indices.cell, vertex=indices.vertex, edge=indices.edge
    )
    for dim, idx in indices.coords.items():
        renumbering = np.full(grid.sizes[dim], -2, dtype="int")
        renumbering[idx] = np.arange(len(idx))
        for name in vars_to_renumber[dim]:
            new_grid[name].data = renumbering[new_grid[name].data - 1] + 1
    return new_grid
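To see the lookup-table idea from lines 4, 5 and 7 in isolation, here is a minimal NumPy sketch on a made-up grid. The sizes and index values are invented for illustration only; the real grid has millions of entries.

```python
import numpy as np

# Hypothetical toy grid: 6 vertices globally, of which we keep 3.
n_global = 6
selected = np.array([1, 3, 4])  # 0-based positions of the kept vertices

# Line 4: lookup table filled with the sentinel value -2.
renumbering = np.full(n_global, -2, dtype=int)

# Line 5: the kept positions receive their new 0-based numbers.
renumbering[selected] = np.arange(len(selected))
print(renumbering)  # [-2  0 -2  1  2 -2]

# A made-up 1-based connectivity array, as it would be stored in a grid file.
# The last entry (6) points to a vertex outside the selection.
old_refs = np.array([2, 4, 5, 4, 6])

# Line 7: shift to 0-based, look up the new number, shift back to 1-based.
new_refs = renumbering[old_refs - 1] + 1
print(new_refs)  # [ 1  2  3  2 -1]
```

References to elements outside the selected window end up as -1 (the sentinel -2 plus 1), which makes them easy to spot later.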

After long theory we want to use our function and create the actual new_grid:
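Before running this on the real 17GB grid, the function can be tried on a tiny synthetic grid. The following is only a sketch under stated assumptions: the toy grid (three triangles in a strip, five vertices, seven edges), the variable vlon used to give the dataset a vertex dimension, and all toy_ names are made up for illustration. The function body repeats the logic shown above, reading the dimension lengths from Dataset.sizes.

```python
import numpy as np
import xarray as xr


def reindex_grid(grid, indices, vars_to_renumber):
    # Same logic as the function defined above.
    new_grid = grid.load().isel(
        cell=indices.cell, vertex=indices.vertex, edge=indices.edge
    )
    for dim, idx in indices.coords.items():
        renumbering = np.full(grid.sizes[dim], -2, dtype="int")
        renumbering[idx.values] = np.arange(len(idx))
        for name in vars_to_renumber[dim]:
            new_grid[name].data = renumbering[new_grid[name].data - 1] + 1
    return new_grid


# Toy grid with 1-based connectivity, cells (1,2,3), (2,3,4) and (3,4,5).
toy_grid = xr.Dataset(
    {
        "vertex_of_cell": (("nv", "cell"), np.array([[1, 2, 3], [2, 3, 4], [3, 4, 5]])),
        "edge_vertices": (
            ("nc", "edge"),
            np.array([[1, 1, 2, 2, 3, 3, 4], [2, 3, 3, 4, 4, 5, 5]]),
        ),
        "vlon": (("vertex",), np.arange(5) * 0.1),
    }
)

# Keep the last two cells together with their vertices and edges (0-based positions).
toy_indices = xr.Dataset(
    coords={"cell": [1, 2], "vertex": [1, 2, 3, 4], "edge": [2, 3, 4, 5, 6]}
)
toy_vars = {"cell": [], "vertex": ["vertex_of_cell", "edge_vertices"], "edge": []}

toy_new_grid = reindex_grid(toy_grid, toy_indices, toy_vars)
print(np.unique(toy_new_grid.vertex_of_cell.values))  # [1 2 3 4]
```

The four remaining vertices are numbered 1 to 4 in the toy result, which is the behaviour we expect from the real grid as well.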

Let’s see if everything worked as we wanted. We pick a reindexed variable such as .vertex_of_cell and list its sorted unique values:

np.unique(new_grid.vertex_of_cell)

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
       18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
       35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50])
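The visual check can also be written as a small test. This is a sketch with a hypothetical helper name, is_contiguous_one_based; it ignores entries of -1, which the reindexing produces for references that point outside the selected window.

```python
import numpy as np


def is_contiguous_one_based(values):
    """Return True if the positive entries are exactly 1, 2, ..., N."""
    unique = np.unique(values)
    unique = unique[unique > 0]  # drop -1 entries that point outside the window
    return np.array_equal(unique, np.arange(1, len(unique) + 1))


# Made-up examples: the first is gap-free, the second skips the number 3.
print(is_contiguous_one_based(np.array([[1, 2, 3], [2, 3, -1]])))  # True
print(is_contiguous_one_based(np.array([1, 2, 4])))  # False
```

Applied to the notebook's data, the call would be is_contiguous_one_based(new_grid.vertex_of_cell.values).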

Voilà! That worked: we have built a new grid file tailored to our area of interest, with a new, continuous indexing.

For further processing, the two datasets selected_indices and new_grid can be saved so that they are quickly accessible for later calculations.

selected_indices.to_netcdf(
    f"selected_indices_region_{bottom_bound}-{top_bound}_{left_bound}-{right_bound}.nc",
    mode="w",
)
new_grid.to_netcdf(
    f"new_grid_region_{bottom_bound}-{top_bound}_{left_bound}-{right_bound}.nc",
    mode="w",
)
