ESMValTool Integration

This guide explains how to use ACCESS-MOPPy as a transparent CMORisation pre-processor for ESMValTool and ESMValCore.

Overview

ESMValTool assumes that the data it reads is already in CMIP-compliant format. Raw ACCESS-ESM1.6 output uses UM STASH codes, MOM5 variable names, and a non-standard directory structure, so ESMValTool cannot use it directly.

The access_moppy.esmval subpackage bridges this gap without modifying ESMValCore or ESMValTool. It:

  1. Parses an ESMValTool recipe YAML to find which variables and time ranges are needed.

  2. Locates the corresponding raw ACCESS-ESM1.6 files on disk.

  3. Runs ACCESS-MOPPy’s CMORisation pipeline and writes CMIP DRS-structured NetCDF output to a local cache directory.

  4. Generates an ESMValCore 2.14+ LocalDataSource config file and places it in the ESMValCore user config directory (~/.config/esmvaltool/) so ESMValTool finds the data automatically — no --config flag required.

it is simply reading well-formed CMIP6 data.

Architecture

┌──────────────────────────────────────────────────────────────┐
│  User runs:  moppy-esmval-run my_recipe.yml                  │
└──────────────────────────┬───────────────────────────────────┘
                           │
          ┌────────────────▼──────────────────┐
          │  CMORiseOrchestrator               │
          │                                   │
          │  1. Parse recipe YAML             │
          │  2. Extract ACCESS-ESM1-6 datasets│
          │  3. Map (mip, short_name) →       │
          │     compound_name                 │
          │  4. Locate raw ACCESS files       │
          │  5. Run ACCESS_ESM_CMORiser       │
          │     (skip if cached & current)    │
          │  6. Write CMIP DRS output tree    │
          └────────────────┬──────────────────┘
                           │
          ┌────────────────▼──────────────────┐
          │  ~/.config/esmvaltool/            │
          │  moppy-esmval-data.yml             │
          │                                   │
          │  projects:                        │
          │    CMIP6:                         │
          │      data:                        │
          │        moppy-cache:               │
          │          type: LocalDataSource    │
          │          rootpath: ~/.cache/…     │
          └────────────────┬──────────────────┘
                           │ (auto-loaded by ESMValCore)
          ┌────────────────▼──────────────────┐
          │  esmvaltool run my_recipe.yml      │
          │  (no --config flag needed)         │
          └───────────────────────────────────┘

Installation

Install ACCESS-MOPPy with the ESMValTool integration extras:

pip install "access_moppy[esmval]"

This pulls in esmvalcore>=2.14 as an optional dependency. ESMValTool itself is not strictly required (only ESMValCore is needed to run recipes using the generated config overlay); install it separately if needed:

pip install ESMValTool

Quick Start

Step 1 — Prepare a recipe

Write or obtain a normal ESMValTool recipe that references dataset: ACCESS-ESM1-6 and project: CMIP6:

# my_recipe.yml
documentation:
  title: ACCESS-ESM1-6 surface temperature example
  description: Minimal recipe demonstrating ACCESS-MOPPy ESMValTool integration.
  authors:
    - anonymous

datasets:
  - {dataset: ACCESS-ESM1-6, project: CMIP6, exp: historical,
     ensemble: r1i1p1f1, grid: gn, timerange: '2000/2005'}

diagnostics:
  temperature_bias:
    variables:
      tas:
        mip: Amon
    scripts:
      plot:
        script: examples/plot_map.py

Step 2 — CMORise and run (two-step)

# CMORise required data and write ESMValCore config
# (written to ~/.config/esmvaltool/moppy-esmval-data.yml automatically):
moppy-esmval-prepare my_recipe.yml \
    --input-root /g/data/p73/archive/.../MyRun \
    --cache-dir ~/.cache/moppy-esmval

# Run ESMValTool — config is picked up automatically, no --config flag:
esmvaltool run my_recipe.yml

Step 3 — Or use the one-step wrapper

moppy-esmval-run my_recipe.yml \
    --input-root /g/data/p73/archive/.../MyRun \
    --cache-dir ~/.cache/moppy-esmval

This is equivalent to running both commands above sequentially.

Step 4 — Or via the esmvaltool CLI

Once ACCESS-MOPPy is installed alongside ESMValCore, it registers a new sub-command on the esmvaltool executable:

esmvaltool cmorise my_recipe.yml \
    --input-root /g/data/p73/archive/.../MyRun \
    --cache-dir ~/.cache/moppy-esmval

esmvaltool run my_recipe.yml

Command Reference

moppy-esmval-prepare

CMORise the data required by a recipe without invoking ESMValTool.

usage: moppy-esmval-prepare [-h]
    RECIPE
    --input-root PATH
    --cache-dir PATH
    [--model-id ID]
    [--config FILE]
    [--output-config FILE]
    [--workers N]
    [--dry-run]
    [--pattern COMPOUND_NAME:GLOB]
    [-v]

Argument

Description

RECIPE

Path to the ESMValTool YAML recipe (required).

--input-root

Root directory of the raw ACCESS-ESM1.6 archive (required).

--cache-dir

Directory where CMORised files will be written in CMIP DRS structure (required).

--model-id

ACCESS-MOPPy model identifier. Default: ACCESS-ESM1.6.

--config

Path to any existing file in the user’s ESMValCore config directory. The MOPPy data-source file is written into the same directory.

--output-config

Where to write the generated ESMValCore data-source config file (default: ~/.config/esmvaltool/moppy-esmval-data.yml).

--workers

Number of parallel CMORisation workers. Default: 1.

--dry-run

Log what would be done without running CMORisation.

--pattern

Raw-file glob pattern override for a specific variable, e.g. Amon.tas:/output*/atmosphere/netCDF/*mon.nc. Can be repeated.

-v / --verbose

Enable DEBUG-level logging.

moppy-esmval-run

CMORise and immediately invoke esmvaltool run. Accepts all the same arguments as moppy-esmval-prepare plus:

Argument

Description

--esmvaltool-args

Extra arguments forwarded verbatim to esmvaltool run (quoted string).

esmvaltool cmorise

Registered via the esmvaltool_commands entry-point group. Performs the same preparation step as moppy-esmval-prepare. All parameters are the same, passed as keyword arguments because the esmvaltool CLI uses Google fire for dispatch.

Python API

All components are importable directly for use in scripts and notebooks:

from access_moppy.esmval import CMORiseOrchestrator, RecipeReader, VariableIndex

# Parse recipe
reader = RecipeReader("my_recipe.yml")
print(reader.tasks)   # list of CMORTask objects

# Check which variables are supported
idx = VariableIndex()
print(idx.is_supported("Amon", "tas"))  # True

# Run orchestrator
orch = CMORiseOrchestrator(
    input_root="/g/data/p73/archive/.../MyRun",
    cache_dir="~/.cache/moppy-esmval",
)
results = orch.prepare_recipe("my_recipe.yml")
CMORiseOrchestrator.summarise(results)

# Write ESMValCore 2.14+ config (placed in ~/.config/esmvaltool/ by default)
from access_moppy.esmval.config_gen import write_esmval_config
cfg = write_esmval_config("~/.cache/moppy-esmval")
print(f"Config written to: {cfg}")
# esmvaltool run my_recipe.yml   # no --config flag needed

File Pattern Overrides

By default the file finder uses broad component-level glob patterns relative to --input-root. When the default patterns do not match your archive layout you can override them per variable:

moppy-esmval-prepare my_recipe.yml \
    --input-root /data/archive/MyRun \
    --cache-dir ~/.cache/moppy-esmval \
    --pattern "Amon.tas:/output[0-4]*/atmosphere/netCDF/*mon.nc" \
    --pattern "Omon.tos:/output[0-4]*/ocean/ocean-2d-surface_temp*.nc"

Or in Python:

orch = CMORiseOrchestrator(
    input_root="/data/archive/MyRun",
    cache_dir="~/.cache/moppy-esmval",
    pattern_overrides={
        "Amon.tas": "output[0-4]*/atmosphere/netCDF/*mon.nc",
        "Omon.tos": "output[0-4]*/ocean/ocean-2d-surface_temp*.nc",
    },
)

The default patterns for each component are:

Component

Default glob patterns (relative to --input-root)

atmosphere, land, aerosol

output[0-9]*/atmosphere/netCDF/*mon.nc (plus *dai.nc, *3hr.nc, *6hr.nc)

ocean

output[0-9]*/ocean/ocean-{1,2,3}d-*.nc

sea_ice

output[0-9]*/ice/iceh-1monthly-mean*.nc

Caching

The orchestrator caches results: if a CMORised file already exists under --cache-dir and is newer than all raw input files, the variable is skipped and reported as "cached". To force re-CMORisation, either delete the relevant files from the cache or use --dry-run first to inspect what is cached.

HPC Usage (NCI Gadi)

On NCI Gadi the raw ACCESS-ESM1.6 output typically lives under /g/data/p73/archive/ or /g/data/access/ACCESS-ESM1-6/. A typical workflow uses the --workers flag together with a pre-existing PBS interactive session:

# Inside a PBS interactive session or a Gadi login node
moppy-esmval-run my_recipe.yml \
    --input-root /g/data/p73/archive/CMIP7/ACCESS-ESM1-6/.../MyRun \
    --cache-dir /scratch/tm70/$USER/moppy-esmval-cache \
    --workers 8 \
    --esmvaltool-args "--max-parallel-tasks 4"

For very large variable sets, combine with the existing moppy-cmorise PBS batch system to CMORise first, then run ESMValTool against the output:

# 1. CMORise with PBS (schedules a job per variable)
moppy-cmorise my_batch_config.yml

# 2. Once jobs finish, point ESMValTool at the output
moppy-esmval-prepare my_recipe.yml \
    --input-root /g/data/.../MyRun \
    --cache-dir /scratch/tm70/$USER/moppy-output \
    --dry-run   # nothing to CMORise — will just write config file

esmvaltool run my_recipe.yml

Troubleshooting

“No supported ACCESS-ESM datasets found in recipe”

Check that your recipe includes project: CMIP6 and dataset: ACCESS-ESM1-5 or dataset: ACCESS-ESM1-6 in the datasets block.

“No raw files found for ‘Amon.xxx’”

The default glob patterns may not match your archive layout. Use --pattern to override, and add -v to see the exact paths being searched.

“No mapping found for ‘Amon.xxx’”

The variable is not yet supported in the ACCESS-ESM1.6 mapping file. Check access_moppy.esmval.VariableIndex().all_compound_names() for the full list of supported variables.

ESMValTool cannot find the CMORised data

Verify that --cache-dir in the prepare step matches the rootpath in ~/.config/esmvaltool/moppy-esmval-data.yml. If you wrote the config file to a non-default location, set the ESMVALTOOL_CONFIG_DIR environment variable to that directory before calling esmvaltool run.

API Reference