{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Workflow Example\n", "This notebook illustrates a complete workflow consisting of the following steps:\n", "- Data loading\n", "- Converting the data to CF format\n", "- Preprocessing the data\n", "- Running a diagnostic\n", "- Visualizing the results\n", "\n", "## Imports" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from pathlib import Path\n", "\n", "import xarray as xr\n", "from datatree import DataTree\n", "\n", "import valenspy as vp  # The Valenspy package\n", "from valenspy.inputconverter_functions import _non_convertor" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Input Converters\n", "\n", "Input converters are used to convert the data to CF format.\n", "Their main component is a function that takes a file and returns the data following the CF conventions.\n", "See `inputconverter_functions.py` for examples.\n", "\n", "`InputConverter` is a class that does the following:\n", "- Converts the data\n", "- Checks whether the converted data meets the CF conventions\n", "- ..." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# Input converter: _non_convertor leaves the data unchanged.\n", "ic = vp.InputConverter(_non_convertor)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Loading datasets\n", "Load the data and convert it to CF format if necessary.\n", "\n", "In this illustration we load E-OBS data and CMIP6 historical and future data." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<xarray.Dataset> Size: 1MB\n",
"Dimensions: (lat: 20, lon: 20, time: 730)\n",
"Coordinates:\n",
" * lat (lat) float64 160B 49.05 49.15 49.25 49.35 ... 50.75 50.85 50.95\n",
" * lon (lon) float64 160B 3.05 3.15 3.25 3.35 3.45 ... 4.65 4.75 4.85 4.95\n",
" * time (time) datetime64[ns] 6kB 1953-01-01 1953-01-02 ... 1954-12-31\n",
"Data variables:\n",
" tas (time, lat, lon) float32 1MB dask.array<chunksize=(26, 20, 20), meta=np.ndarray>\n",
"Attributes:\n",
" E-OBS_version: 29.0e\n",
" Conventions: CF-1.4\n",
" References: http://surfobs.climate.copernicus.eu/dataaccess/access_eo...\n",
" history: Fri Mar 22 09:55:59 2024: ncks --no-abc -d time,0,27027 /...\n",
 NCO: netCDF Operators version 5.1.4 (Homepage = http://nco.sf....\n",
"<xarray.Dataset> Size: 13MB\n",
"Dimensions: (time: 24, bnds: 2, lat: 256, lon: 512)\n",
"Coordinates:\n",
" * time (time) datetime64[ns] 192B 1953-01-16T12:00:00 ... 1954-12-16T...\n",
" * lat (lat) float64 2kB -89.46 -88.77 -88.07 ... 88.07 88.77 89.46\n",
" * lon (lon) float64 4kB 0.0 0.7031 1.406 2.109 ... 357.9 358.6 359.3\n",
" height float64 8B 2.0\n",
"Dimensions without coordinates: bnds\n",
"Data variables:\n",
" time_bnds (time, bnds) datetime64[ns] 384B dask.array<chunksize=(12, 2), meta=np.ndarray>\n",
" lat_bnds (time, lat, bnds) float64 98kB dask.array<chunksize=(12, 256, 2), meta=np.ndarray>\n",
" lon_bnds (time, lon, bnds) float64 197kB dask.array<chunksize=(12, 512, 2), meta=np.ndarray>\n",
" tas (time, lat, lon) float32 13MB dask.array<chunksize=(12, 256, 512), meta=np.ndarray>\n",
"Attributes: (12/46)\n",
" Conventions: CF-1.7 CMIP-6.2\n",
" activity_id: CMIP\n",
" branch_method: standard\n",
" branch_time_in_child: 0.0\n",
" branch_time_in_parent: 29219.0\n",
" contact: cmip6-data@ec-earth.org\n",
" ... ...\n",
" variant_label: r1i1p1f1\n",
" license: CMIP6 model data produced by EC-Earth...\n",
" cmor_version: 3.4.0\n",
" tracking_id: hdl:21.14100/18af2970-6a17-45fe-b629-...\n",
" history: 2019-06-06T07:27:13Z ; CMOR rewrote d...\n",
 latest_applied_cmor_fixer_version: v3.0\n",
"<xarray.Dataset> Size: 13MB\n",
"Dimensions: (time: 24, bnds: 2, lat: 256, lon: 512)\n",
"Coordinates:\n",
" * time (time) datetime64[ns] 192B 2015-01-16T12:00:00 ... 2016-12-16T...\n",
" * lat (lat) float64 2kB -89.46 -88.77 -88.07 ... 88.07 88.77 89.46\n",
" * lon (lon) float64 4kB 0.0 0.7031 1.406 2.109 ... 357.9 358.6 359.3\n",
" height float64 8B 2.0\n",
"Dimensions without coordinates: bnds\n",
"Data variables:\n",
" time_bnds (time, bnds) datetime64[ns] 384B dask.array<chunksize=(12, 2), meta=np.ndarray>\n",
" lat_bnds (time, lat, bnds) float64 98kB dask.array<chunksize=(12, 256, 2), meta=np.ndarray>\n",
" lon_bnds (time, lon, bnds) float64 197kB dask.array<chunksize=(12, 512, 2), meta=np.ndarray>\n",
" tas (time, lat, lon) float32 13MB dask.array<chunksize=(12, 256, 512), meta=np.ndarray>\n",
"Attributes: (12/45)\n",
" Conventions: CF-1.7 CMIP-6.2\n",
" activity_id: ScenarioMIP\n",
" branch_method: standard\n",
" branch_time_in_child: 60265.0\n",
" branch_time_in_parent: 60265.0\n",
" contact: cmip6-data@ec-earth.org\n",
" ... ...\n",
" variable_id: tas\n",
" variant_label: r1i1p1f1\n",
" license: CMIP6 model data produced by EC-Earth-Consortium ...\n",
" cmor_version: 3.4.0\n",
" tracking_id: hdl:21.14100/697b3a82-4ffc-49ce-b070-2ca86ce8a06f\n",
 history: 2019-06-29T08:25:09Z ; CMOR rewrote data to be co...\n",
"<xarray.DatasetView> Size: 0B\n",
"Dimensions: ()\n",
"Data variables:\n",
 *empty*\n",
"<xarray.DatasetView> Size: 0B\n",
"Dimensions: ()\n",
"Data variables:\n",
 *empty*\n",
"<xarray.DataArray 'tas' (lat: 20, lon: 20)> Size: 2kB\n",
"dask.array<sub, shape=(20, 20), dtype=float32, chunksize=(20, 20), chunktype=numpy.ndarray>\n",
"Coordinates:\n",
" height float64 8B 2.0\n",
" * lon (lon) float64 160B 3.05 3.15 3.25 3.35 3.45 ... 4.65 4.75 4.85 4.95\n",
" * lat (lat) float64 160B 49.05 49.15 49.25 49.35 ... 50.75 50.85 50.95