ESMValTool Integration
======================
This guide explains how to use ACCESS-MOPPy as a transparent CMORisation
pre-processor for `ESMValTool `_ and
`ESMValCore `_.
.. contents:: Table of Contents
:local:
:depth: 2
Overview
--------
ESMValTool assumes that the data it reads is already in CMIP-compliant
format. Raw ACCESS-ESM1.6 output uses UM STASH codes, MOM5 variable
names, and a non-standard directory structure, so ESMValTool cannot use it
directly.
The ``access_moppy.esmval`` subpackage bridges this gap **without
modifying ESMValCore or ESMValTool**. It:
1. Parses an ESMValTool recipe YAML to find which variables and time ranges
are needed.
2. Locates the corresponding raw ACCESS-ESM1.6 files on disk.
3. Runs ACCESS-MOPPy's CMORisation pipeline and writes CMIP DRS-structured
NetCDF output to a local cache directory.
4. Generates an ESMValCore 2.14+ ``LocalDataSource`` config file and places
it in the ESMValCore user config directory
(``~/.config/esmvaltool/``) so ESMValTool finds the data automatically
— no ``--config`` flag required.
it is simply reading well-formed CMIP6 data.
Architecture
------------
.. code-block:: text
┌──────────────────────────────────────────────────────────────┐
│ User runs: moppy-esmval-run my_recipe.yml │
└──────────────────────────┬───────────────────────────────────┘
│
┌────────────────▼──────────────────┐
│ CMORiseOrchestrator │
│ │
│ 1. Parse recipe YAML │
│ 2. Extract ACCESS-ESM1-6 datasets│
│ 3. Map (mip, short_name) → │
│ compound_name │
│ 4. Locate raw ACCESS files │
│ 5. Run ACCESS_ESM_CMORiser │
│ (skip if cached & current) │
│ 6. Write CMIP DRS output tree │
└────────────────┬──────────────────┘
│
┌────────────────▼──────────────────┐
│ ~/.config/esmvaltool/ │
│ moppy-esmval-data.yml │
│ │
│ projects: │
│ CMIP6: │
│ data: │
│ moppy-cache: │
│ type: LocalDataSource │
│ rootpath: ~/.cache/… │
└────────────────┬──────────────────┘
│ (auto-loaded by ESMValCore)
┌────────────────▼──────────────────┐
│ esmvaltool run my_recipe.yml │
│ (no --config flag needed) │
└───────────────────────────────────┘
Installation
------------
Install ACCESS-MOPPy with the ESMValTool integration extras::
pip install "access_moppy[esmval]"
This pulls in ``esmvalcore>=2.14`` as an optional dependency. ESMValTool
itself is not strictly required (only ESMValCore is needed to run recipes
using the generated config overlay); install it separately if needed::
pip install ESMValTool
Quick Start
-----------
**Step 1 — Prepare a recipe**
Write or obtain a normal ESMValTool recipe that references
``dataset: ACCESS-ESM1-6`` and ``project: CMIP6``:
.. code-block:: yaml
# my_recipe.yml
documentation:
title: ACCESS-ESM1-6 surface temperature example
description: Minimal recipe demonstrating ACCESS-MOPPy ESMValTool integration.
authors:
- anonymous
datasets:
- {dataset: ACCESS-ESM1-6, project: CMIP6, exp: historical,
ensemble: r1i1p1f1, grid: gn, timerange: '2000/2005'}
diagnostics:
temperature_bias:
variables:
tas:
mip: Amon
scripts:
plot:
script: examples/plot_map.py
**Step 2 — CMORise and run (two-step)**
.. code-block:: bash
# CMORise required data and write ESMValCore config
# (written to ~/.config/esmvaltool/moppy-esmval-data.yml automatically):
moppy-esmval-prepare my_recipe.yml \
--input-root /g/data/p73/archive/.../MyRun \
--cache-dir ~/.cache/moppy-esmval
# Run ESMValTool — config is picked up automatically, no --config flag:
esmvaltool run my_recipe.yml
**Step 3 — Or use the one-step wrapper**
.. code-block:: bash
moppy-esmval-run my_recipe.yml \
--input-root /g/data/p73/archive/.../MyRun \
--cache-dir ~/.cache/moppy-esmval
This is equivalent to running both commands above sequentially.
**Step 4 — Or via the esmvaltool CLI**
Once ACCESS-MOPPy is installed alongside ESMValCore, it registers a new
sub-command on the ``esmvaltool`` executable::
esmvaltool cmorise my_recipe.yml \
--input-root /g/data/p73/archive/.../MyRun \
--cache-dir ~/.cache/moppy-esmval
esmvaltool run my_recipe.yml
Command Reference
-----------------
``moppy-esmval-prepare``
~~~~~~~~~~~~~~~~~~~~~~~~
CMORise the data required by a recipe without invoking ESMValTool.
.. code-block:: text
usage: moppy-esmval-prepare [-h]
RECIPE
--input-root PATH
--cache-dir PATH
[--model-id ID]
[--config FILE]
[--output-config FILE]
[--workers N]
[--dry-run]
[--pattern COMPOUND_NAME:GLOB]
[-v]
.. list-table::
:header-rows: 1
:widths: 25 75
* - Argument
- Description
* - ``RECIPE``
- Path to the ESMValTool YAML recipe (required).
* - ``--input-root``
- Root directory of the raw ACCESS-ESM1.6 archive (required).
* - ``--cache-dir``
- Directory where CMORised files will be written in CMIP DRS structure (required).
* - ``--model-id``
- ACCESS-MOPPy model identifier. Default: ``ACCESS-ESM1.6``.
* - ``--config``
- Path to any existing file in the user's ESMValCore config directory.
The MOPPy data-source file is written into the same directory.
* - ``--output-config``
- Where to write the generated ESMValCore data-source config file
(default: ``~/.config/esmvaltool/moppy-esmval-data.yml``).
* - ``--workers``
- Number of parallel CMORisation workers. Default: ``1``.
* - ``--dry-run``
- Log what would be done without running CMORisation.
* - ``--pattern``
- Raw-file glob pattern override for a specific variable, e.g.
``Amon.tas:/output*/atmosphere/netCDF/*mon.nc``. Can be repeated.
* - ``-v / --verbose``
- Enable DEBUG-level logging.
``moppy-esmval-run``
~~~~~~~~~~~~~~~~~~~~
CMORise and immediately invoke ``esmvaltool run``. Accepts all the same
arguments as ``moppy-esmval-prepare`` plus:
.. list-table::
:header-rows: 1
:widths: 25 75
* - Argument
- Description
* - ``--esmvaltool-args``
- Extra arguments forwarded verbatim to ``esmvaltool run`` (quoted string).
``esmvaltool cmorise``
~~~~~~~~~~~~~~~~~~~~~~
Registered via the ``esmvaltool_commands`` entry-point group. Performs
the same preparation step as ``moppy-esmval-prepare``. All parameters
are the same, passed as keyword arguments because the ``esmvaltool`` CLI
uses Google `fire `_ for dispatch.
Python API
----------
All components are importable directly for use in scripts and notebooks:
.. code-block:: python
from access_moppy.esmval import CMORiseOrchestrator, RecipeReader, VariableIndex
# Parse recipe
reader = RecipeReader("my_recipe.yml")
print(reader.tasks) # list of CMORTask objects
# Check which variables are supported
idx = VariableIndex()
print(idx.is_supported("Amon", "tas")) # True
# Run orchestrator
orch = CMORiseOrchestrator(
input_root="/g/data/p73/archive/.../MyRun",
cache_dir="~/.cache/moppy-esmval",
)
results = orch.prepare_recipe("my_recipe.yml")
CMORiseOrchestrator.summarise(results)
# Write ESMValCore 2.14+ config (placed in ~/.config/esmvaltool/ by default)
from access_moppy.esmval.config_gen import write_esmval_config
cfg = write_esmval_config("~/.cache/moppy-esmval")
print(f"Config written to: {cfg}")
# esmvaltool run my_recipe.yml # no --config flag needed
File Pattern Overrides
----------------------
By default the file finder uses broad component-level glob patterns
relative to ``--input-root``. When the default patterns do not match
your archive layout you can override them per variable:
.. code-block:: bash
moppy-esmval-prepare my_recipe.yml \
--input-root /data/archive/MyRun \
--cache-dir ~/.cache/moppy-esmval \
--pattern "Amon.tas:/output[0-4]*/atmosphere/netCDF/*mon.nc" \
--pattern "Omon.tos:/output[0-4]*/ocean/ocean-2d-surface_temp*.nc"
Or in Python:
.. code-block:: python
orch = CMORiseOrchestrator(
input_root="/data/archive/MyRun",
cache_dir="~/.cache/moppy-esmval",
pattern_overrides={
"Amon.tas": "output[0-4]*/atmosphere/netCDF/*mon.nc",
"Omon.tos": "output[0-4]*/ocean/ocean-2d-surface_temp*.nc",
},
)
The default patterns for each component are:
.. list-table::
:header-rows: 1
:widths: 20 80
* - Component
- Default glob patterns (relative to ``--input-root``)
* - ``atmosphere``, ``land``, ``aerosol``
- ``output[0-9]*/atmosphere/netCDF/*mon.nc`` (plus ``*dai.nc``, ``*3hr.nc``, ``*6hr.nc``)
* - ``ocean``
- ``output[0-9]*/ocean/ocean-{1,2,3}d-*.nc``
* - ``sea_ice``
- ``output[0-9]*/ice/iceh-1monthly-mean*.nc``
Caching
-------
The orchestrator caches results: if a CMORised file already exists under
``--cache-dir`` *and* is newer than all raw input files, the variable is
skipped and reported as ``"cached"``. To force re-CMORisation, either
delete the relevant files from the cache or use ``--dry-run`` first to
inspect what is cached.
HPC Usage (NCI Gadi)
---------------------
On NCI Gadi the raw ACCESS-ESM1.6 output typically lives under
``/g/data/p73/archive/`` or ``/g/data/access/ACCESS-ESM1-6/``. A typical
workflow uses the ``--workers`` flag together with a pre-existing PBS
interactive session:
.. code-block:: bash
# Inside a PBS interactive session or a Gadi login node
moppy-esmval-run my_recipe.yml \
--input-root /g/data/p73/archive/CMIP7/ACCESS-ESM1-6/.../MyRun \
--cache-dir /scratch/tm70/$USER/moppy-esmval-cache \
--workers 8 \
--esmvaltool-args "--max-parallel-tasks 4"
For very large variable sets, combine with the existing ``moppy-cmorise``
PBS batch system to CMORise first, then run ESMValTool against the output:
.. code-block:: bash
# 1. CMORise with PBS (schedules a job per variable)
moppy-cmorise my_batch_config.yml
# 2. Once jobs finish, point ESMValTool at the output
moppy-esmval-prepare my_recipe.yml \
--input-root /g/data/.../MyRun \
--cache-dir /scratch/tm70/$USER/moppy-output \
--dry-run # nothing to CMORise — will just write config file
esmvaltool run my_recipe.yml
Troubleshooting
---------------
**"No supported ACCESS-ESM datasets found in recipe"**
Check that your recipe includes ``project: CMIP6`` and
``dataset: ACCESS-ESM1-5`` or ``dataset: ACCESS-ESM1-6`` in the datasets block.
**"No raw files found for 'Amon.xxx'"**
The default glob patterns may not match your archive layout. Use
``--pattern`` to override, and add ``-v`` to see the exact paths being
searched.
**"No mapping found for 'Amon.xxx'"**
The variable is not yet supported in the ACCESS-ESM1.6 mapping file.
Check ``access_moppy.esmval.VariableIndex().all_compound_names()`` for
the full list of supported variables.
**ESMValTool cannot find the CMORised data**
Verify that ``--cache-dir`` in the prepare step matches the ``rootpath``
in ``~/.config/esmvaltool/moppy-esmval-data.yml``. If you wrote the
config file to a non-default location, set the ``ESMVALTOOL_CONFIG_DIR``
environment variable to that directory before calling
``esmvaltool run``.
API Reference
-------------
.. autoclass:: access_moppy.esmval.recipe_reader.RecipeReader
:members:
:undoc-members:
.. autoclass:: access_moppy.esmval.recipe_reader.CMORTask
:members:
.. autoclass:: access_moppy.esmval.variable_mapper.VariableIndex
:members:
:undoc-members:
.. autoclass:: access_moppy.esmval.file_finder.RawFileFinder
:members:
:undoc-members:
.. autoclass:: access_moppy.esmval.orchestrator.CMORiseOrchestrator
:members:
:undoc-members:
.. autofunction:: access_moppy.esmval.config_gen.write_esmval_config
.. autofunction:: access_moppy.esmval.config_gen.merge_into_existing_config