--- file_format: mystnb kernelspec: name: python3 display_name: Python 3 execution: timeout: 60 --- # Remapping a dataset In this tutorial we will remap a dataset from the native ICON grid to HEALPix. The example dataset consists of five 2D variables sampled at six-hourly intervals. ```{admonition} Note :class: note Please note that all of the methods used in this example are generic and can be applied to any combination of source and target grids. ``` ```{code-cell} ipython3 import healpix as hp import intake import numpy as np import xarray as xr import easygems.healpix as egh import easygems.remap as egr cat = intake.open_catalog("https://data.nextgems-h2020.eu/online.yaml") ds = cat.tutorial["ICON.native.2d_PT6H_inst"](chunks={}).to_dask() ds ``` The first step is to create a HEALPix grid that is close to the resolution of our source grid. Here we will choose an order of 8 (also known as zoom or refinement level) and nested ordering. ```{code-cell} ipython3 order = zoom = 8 nside = hp.order2nside(order) pixels = np.arange(hp.nside2npix(nside)) hp_lon, hp_lat = hp.pix2ang(nside=nside, ipix=pixels, lonlat=True, nest=True) ``` Next, we can use our defined source and target grids to compute interpolation weights. The `easygems` package provides a function to compute these weights using the [Delaunay triangulation](https://docs.scipy.org/doc/scipy/tutorial/spatial.html#delaunay-triangulations) method. ```{code-cell} ipython3 weights = egr.compute_weights_delaunay( points=(ds.lon, ds.lat), xi=(hp_lon, hp_lat) ) ``` ````{tip} For larger grids, computing the remapping weights can be quite resource-intensive. However, you can save the calculated weights for future use: ```{code-block} python weights.to_netcdf("healpix_weights.nc") ``` ```` These weights can be applied to single fields directly: ```{code-cell} ipython3 egh.healpix_show(egr.apply_weights(ds.ts.isel(time=0), **weights)); ``` Alternatively, we can use xarray's [apply_ufunc()](https://docs.xarray.dev/en/stable/generated/xarray.apply_ufunc.html) function to lift the function onto a full dataset. This requires a coupled of additional information from the user, e.g. the input dimension along which the function should be applied, and the resulting output dimensions name and size. ```{code-cell} ipython3 ds_remap = xr.apply_ufunc( egr.apply_weights, ds, kwargs=weights, keep_attrs=True, input_core_dims=[["ncells"]], output_core_dims=[["cell"]], output_dtypes=["f4"], vectorize=True, dask="parallelized", dask_gufunc_kwargs={ "output_sizes": {"cell": weights.sizes["tgt_idx"]}, }, ) ds_remap ``` The resulting dataset can then be used as usual and the remapping is performed on demand. ```{code-cell} ipython3 egh.healpix_show(ds_remap.ts.isel(time=0)); ``` ## Attaching a coordinate reference system It is good practice to store map projection information in the [Coordinate Reference Systems (CRS)](https://cfconventions.org/Data/cf-conventions/cf-conventions-1.13/cf-conventions.html#grid-mappings-and-projections). By making the `crs` coordinate non-dimensional it will "stick" to the dataset even if individual variables are subselected. ```{code-cell} ipython3 crs = xr.DataArray( name="crs", attrs={ "grid_mapping_name": "healpix", "refinement_level": order, "indexing_scheme": "nested", }, ) cell = xr.DataArray( pixels, dims=("cell",), attrs={"standard_name": "healpix_index"} ) ds_remap_cf = ds_remap.assign_coords(crs=crs, cell=cell) ds_remap_cf ```