--- file_format: mystnb kernelspec: name: python3 display_name: Python 3 execution: timeout: 600 --- # Remapping a dataset In this tutorial we will remap a dataset from the native ICON grid to HEALPix. The dataset used as an example is from the EERIE project and is available online. It consists of five 2D variables and covers a period of 23 years, with a time step of six hours. ```{admonition} Note :class: note Please note that all of the methods used in this example are generic and can be applied to any combination of source and target grids. ``` ```{code-cell} ipython3 import healpix as hp import numpy as np import xarray as xr import easygems.healpix as egh import easygems.remap as egr ds = xr.open_dataset( "https://eerie.cloud.dkrz.de/datasets/icon-esm-er.eerie-control-1950.v20240618.atmos.native.2d_6h_inst/kerchunk", engine="zarr", chunks={}, ).squeeze() ds ``` The first step is to create a HEALPix grid that is close to the resolution of our source grid. Here we will choose an order of 9 (also known as "zoom") and "nest" ordering. ```{code-cell} ipython3 order = zoom = 9 nside = hp.order2nside(order) pixels = np.arange(hp.nside2npix(nside)) hp_lon, hp_lat = hp.pix2ang(nside=nside, ipix=pixels, lonlat=True, nest=True) ``` Next, we can use our defined source and target grids to compute interpolation weights. The `easygems` package provides a function to compute these weights using the [Delaunay triangulation](https://docs.scipy.org/doc/scipy/tutorial/spatial.html#delaunay-triangulations) method. ```{code-cell} ipython3 weights = egr.compute_weights_delaunay( points=(ds.lon, ds.lat), xi=(hp_lon, hp_lat) ) ``` ````{tip} For larger grids, computing the remapping weights can be quite resource-intensive. However, you can save the calculated weights for future use: ```{code-block} python weights.to_netcdf("healpix_weights.nc") ``` ```` These weights can be applied to single fields directly: ```{code-cell} ipython3 egh.healpix_show(egr.apply_weights(ds.ts.isel(time=0), **weights)); ``` Alternatively, we can use xarray's [apply_ufunc()](https://docs.xarray.dev/en/stable/generated/xarray.apply_ufunc.html) function to lift the function onto a full dataset. This requires a coupled of additional information from the user, e.g. the input dimension along which the function should be applied, and the resulting output dimensions name and size. ```{code-cell} ipython3 ds_remap = xr.apply_ufunc( egr.apply_weights, ds, kwargs=weights, keep_attrs=True, input_core_dims=[["ncells"]], output_core_dims=[["cell"]], output_dtypes=["f4"], vectorize=True, dask="parallelized", dask_gufunc_kwargs={ "output_sizes": {"cell": weights.sizes["tgt_idx"]}, }, ) ds_remap ``` The resulting dataset can then be used as usual and the remapping is performed on demand. ```{code-cell} ipython3 egh.healpix_show(ds_remap.ts.sel(time="1991-01-01").mean("time")); ``` ## Attaching a coordinate reference system It is good practice to store map projection information in the Coordinate Reference Systems (CRS). By making the `crs` coordinate non-dimensional it will "stick" to the dataset even if individual variables are subselected. ```{code-cell} ipython3 crs = xr.DataArray( name="crs", attrs={ "grid_mapping_name": "healpix", "healpix_nside": nside, "healpix_order": "nest", }, ) ds_remap = ds_remap.assign_coords(crs=crs) ```