--- file_format: mystnb kernelspec: name: python3 execution: timeout: 180 --- # Remapping a dataset to HEALPix In this tutorial we will remap a dataset from the native ICON grid to HEALPix. The dataset used as an example is from the EERIE project and is available online. It consists of five 2d variables and covers 23 years with a 6 hour time step. ```{code-cell} ipython3 import healpix as hp import numpy as np import xarray as xr import easygems.healpix as egh import easygems.remap as egr ds = xr.open_dataset( "https://eerie.cloud.dkrz.de/datasets/icon-esm-er.eerie-control-1950.v20240618.atmos.native.2d_6h_inst/kerchunk", engine="zarr", chunks={}, ) ds ``` The first step is to create a HEALPix grid that is close to the resolution of our source grid. Here we will choose an order of 9 (also known as "zoom") and "nest" ordering. ```{code-cell} ipython3 order = zoom = 9 nside = hp.order2nside(order) npix = hp.nside2npix(nside) hp_lon, hp_lat = hp.pix2ang(nside=nside, ipix=np.arange(npix), lonlat=True, nest=True) hp_lon = (hp_lon + 180) % 360 - 180 # [-180, 180) hp_lon += 360 / (4 * nside) / 4 # shift quarter-width ``` Next, we can use our defined source and target grids to compute interpolation weights. The `easygems` package provides a function to compute these weights using the [Delaunay triangulation](https://docs.scipy.org/doc/scipy/tutorial/spatial.html#delaunay-triangulations) method. ```{code-cell} ipython3 weights = egr.compute_weights_delaunay((ds.lon, ds.lat), (hp_lon, hp_lat)) # You can also save the calculated weights for future use # weights.to_netcdf("healpix_weights.nc") ``` These weights can be applied to single fields directly: ```{code-cell} ipython3 egh.healpix_show(egr.apply_weights(ds.ts.isel(time=0), **weights)) ``` Alternatively, we can use xarray's [apply_ufunc()](https://docs.xarray.dev/en/stable/generated/xarray.apply_ufunc.html) function to lift the function onto a full dataset. This requires a coupled of additional information from the user, e.g. the input dimension along which the function should be applied, and the resulting output dimensions name and size. ```{code-cell} ipython3 ds_remap = xr.apply_ufunc( egr.apply_weights, ds, kwargs=weights, keep_attrs=True, input_core_dims=[["ncells"]], output_core_dims=[["cell"]], output_dtypes=["f4"], vectorize=True, dask="parallelized", dask_gufunc_kwargs={ "output_sizes": {"cell": npix}, }, ) ds_remap ``` The resulting dataset can then be used as usual and the remapping is performed on demand. ```{code-cell} ipython3 egh.healpix_show(ds_remap.ts.sel(time="1991-01-01").mean("time")) ``` ## Storing the coordinate reference system It is good practice to store map projection information in the Coordinate Reference Systems (CRS). By making the `crs` coordinate non-dimensional it will "stick" to the dataset even if individual variables are subselected. ```{code-cell} ipython3 ds_remap["crs"] = xr.DataArray( name="crs", data=0, attrs={ "grid_mapping_name": "healpix", "healpix_nside": 2**zoom, "healpix_order": "nest", }, ) ```