Variable Mapping Reference
==========================
.. contents:: Table of Contents
:local:
:depth: 3
This page is intended for **developers** who need to understand how ACCESS-MOPPy maps
raw ACCESS model output variables to CMIP-compliant output, or who want to add support
for new variables or models.
Overview
--------
ACCESS-MOPPy uses **JSON mapping files** to describe how raw model variables (e.g.
UM STASH codes such as ``fld_s02i208``, or MOM5/MOM6 diagnostics such as ``temp``)
correspond to CMIP output variables (e.g. ``Amon.rsds``).
At runtime, :func:`~access_moppy.utilities.load_model_mappings` reads the appropriate
JSON file, finds the requested CMIP variable, and returns the mapping dictionary.
The relevant :class:`~access_moppy.base.CMORiser` subclass then uses the mapping to
load, transform, and write the data.
The mapping system also handles CMIP7 compound names transparently: a CMIP7 name is
first resolved to its CMIP6 equivalent via a separate translation table, and the
CMIP6 mapping is then applied as normal.
Mapping Files
-------------
Location
^^^^^^^^
All mapping files live inside the installed package under::
src/access_moppy/mappings/
The files shipped with ACCESS-MOPPy are:
.. list-table::
:widths: 35 65
:header-rows: 1
* - File
- Description
* - ``ACCESS-ESM1.6_mappings.json``
- Primary mapping file for ACCESS-ESM1.6 (atmosphere, ocean, land, sea ice, aerosol)
* - ``ACCESS-CM3_mappings.json``
- Mapping file for ACCESS-CM3 (atmosphere, ocean)
* - ``ACCESS-OM3_mappings.json``
- Mapping file for ACCESS-OM3 (ocean, time-invariant)
* - ``cmip7_to_cmip6_compound_name_mapping.json``
- Translation table: CMIP7 branded name → CMIP6 ``table.variable``
* - ``cmip6_to_cmip7_compound_name_mapping.json``
- Translation table: CMIP6 ``table.variable`` → CMIP7 branded name
Selecting a mapping file
^^^^^^^^^^^^^^^^^^^^^^^^
The mapping file to use is determined by the ``model_id`` argument of
``ACCESS_ESM_CMORiser`` (default: ``"ACCESS-ESM1.6"``).
:func:`~access_moppy.utilities.load_model_mappings` constructs the filename as
``{model_id}_mappings.json`` and looks for it inside the ``access_moppy.mappings``
package resource directory.
Top-level Structure of a Mapping File
--------------------------------------
Each mapping file is a JSON object with the following top-level keys:
.. code-block:: json
{
"model_info": { ... },
"aerosol": { "var1": { ... }, "var2": { ... } },
"atmosphere": { "var1": { ... }, "var2": { ... } },
"land": { "var1": { ... }, "var2": { ... } },
"landIce": { "var1": { ... }, "var2": { ... } },
"ocean": { "var1": { ... }, "var2": { ... } },
"sea_ice": { "var1": { ... }, "var2": { ... } }
}
``model_info``
A metadata block describing the model and which components have mappings.
.. code-block:: json
{
"model_id": "ACCESS-ESM1.6",
"components": ["aerosol", "atmosphere", "land", "landIce", "ocean", "sea_ice"],
"description": "Variable mappings for ACCESS-ESM1.6 Earth System Model"
}
Each **component** key (``aerosol``, ``atmosphere``, etc.) maps CMIP variable names to
their entry dictionaries. When :func:`~access_moppy.utilities.load_model_mappings` is
called with, say, ``compound_name="Amon.rsds"``, it extracts the CMIP name ``rsds``
and searches each component in turn until it finds the entry.
Variable Entry Fields
---------------------
Each variable entry inside a component block shares the same set of optional and
required fields:
.. list-table::
:widths: 25 15 60
:header-rows: 1
* - Field
- Required
- Description
* - ``CF standard Name``
- Yes
- The `CF conventions `_ standard name for the output variable.
May be an empty string ``""`` when no standard name has been assigned.
* - ``dimensions``
- Yes
- Ordered dictionary that maps **model dimension names** (keys) to **CMIP dimension names** (values).
This tells the CMORiser how to rename coordinates. Example::
"dimensions": {"time": "time", "lat": "lat", "lon": "lon"}
* - ``units``
- Yes
- Expected physical units of the CMIP output variable (e.g. ``"W m-2"``, ``"kg m-2 s-1"``).
* - ``positive``
- Yes
- Sign convention: ``"up"``, ``"down"``, or ``null`` if not applicable.
* - ``model_variables``
- Yes
- List of raw model variable names that must be loaded from the input files.
These are passed by name into the calculation context.
``null`` is allowed for ``internal`` calculations that produce data without any input file.
* - ``calculation``
- Yes
- Dictionary that specifies *how* to derive the output variable.
See :ref:`calculation-types` below.
* - ``zaxis``
- No
- Present for variables on vertical levels. Describes the vertical coordinate type
and the variables needed to reconstruct it.
See :ref:`zaxis-field` below.
* - ``ressource_file``
- No
- Name of a bundled NetCDF resource file (stored under ``src/access_moppy/resources/``)
that should be used instead of (or in addition to) user-provided input data.
When this field is set and no ``input_data`` is passed to ``ACCESS_ESM_CMORiser``,
the bundled file is used automatically.
Example — simple direct variable
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. code-block:: json
"rldscs": {
"CF standard Name": "surface_downwelling_longwave_flux_in_air_assuming_clear_sky",
"dimensions": {"time": "time", "lat": "lat", "lon": "lon"},
"units": "W m-2",
"positive": "down",
"model_variables": ["fld_s02i208"],
"calculation": {
"type": "direct",
"formula": "fld_s02i208"
}
}
.. _calculation-types:
Calculation Types
-----------------
The ``calculation`` dictionary always contains a ``"type"`` key. The five supported
types are described below.
``direct``
^^^^^^^^^^
The output variable is taken straight from one model variable with no transformation.
.. code-block:: json
"calculation": {
"type": "direct",
"formula": ""
}
``formula``
^^^^^^^^^^^
The output is derived by calling a **registered function** from the
:data:`~access_moppy.derivations.custom_functions` registry (see
:ref:`custom-functions`).
.. code-block:: json
"calculation": {
"type": "formula",
"operation": "",
"args": ["", "", ...],
"kwargs": {"": ""}
}
- ``args`` is a list of positional arguments. Each item is either a string
(variable name looked up in the input context), a number (used as-is), or a
nested expression dictionary (see :ref:`expression-language` below).
- ``kwargs`` is a dictionary of keyword arguments. Values follow the same rules as
``args`` items.
- Alternatively, ``operands`` may be used instead of ``args`` for legacy entries —
both are treated identically by the expression evaluator.
Optional operands example (ocean ``hfds``):
.. code-block:: json
"calculation": {
"type": "formula",
"operation": "calc_hfds",
"args": ["sfc_hflux_from_runoff", "sfc_hflux_coupler", "sfc_hflux_pme"],
"kwargs": {
"frazil_3d_int_z": {"optional": "frazil_3d_int_z"},
"frazil_2d": {"optional": "frazil_2d"}
}
}
Wrapping a value in ``{"optional": ""}`` means the variable is passed as
``None`` if it is absent from the input dataset, instead of raising a
``KeyError``.
``operation``
^^^^^^^^^^^^^
A shorthand for common two-argument arithmetic operations. Functionally equivalent
to ``formula`` but expressed more compactly:
.. code-block:: json
"calculation": {
"type": "operation",
"operation": "",
"args": ["", ""]
}
Supported ``operation`` values: ``"add"``, ``"subtract"``, ``"multiply"``,
``"divide"``, ``"power"``.
Example (land ``npp`` — net primary productivity divided by tile fraction):
.. code-block:: json
"calculation": {
"type": "operation",
"operation": "divide",
"args": ["fld_s03i262", "fld_s03i395"]
}
``dataset_function``
^^^^^^^^^^^^^^^^^^^^
Calls a more complex **dataset-level function** that receives the entire
xarray Dataset and may modify dimensions or coordinates (e.g. interpolating
from hybrid-height levels to physical height levels).
.. code-block:: json
"calculation": {
"type": "dataset_function",
"function": "",
"kwargs": {}
}
Available ``dataset_function`` values: ``"cl_level_to_height"``,
``"cli_level_to_height"``, ``"clw_level_to_height"``, ``"level_to_height"``.
These functions are defined in
:mod:`access_moppy.derivations.calc_atmos` and registered in
:data:`~access_moppy.derivations.custom_functions`.
``internal``
^^^^^^^^^^^^
The output variable is computed entirely internally from ancillary information
(grid geometry, etc.) without reading any user-provided input file.
.. code-block:: json
"calculation": {
"type": "internal",
"function": "",
"args": []
}
Currently the only available function is ``"calculate_areacella"`` (atmospheric
grid-cell area, computed from latitude/longitude coordinate arrays).
Variables that use this type do **not** require ``input_data`` to be passed to
``ACCESS_ESM_CMORiser``.
.. _expression-language:
Expression Language
-------------------
The ``formula`` calculation type uses a small recursive expression language that is
evaluated by :func:`~access_moppy.derivations.evaluate_expression`.
An expression can be one of:
.. list-table::
:widths: 30 70
:header-rows: 1
* - Expression form
- Meaning
* - ``""``
- Look up the named variable in the input context (an xarray DataArray).
* - ````
- A literal numeric value (integer or float).
* - ``{"literal": }``
- Explicit literal — useful when the value might be a string or ambiguous.
* - ``{"optional": ""}``
- Look up the variable; return ``None`` if absent instead of raising an error.
* - ``{"operation": "", "args": [...], "kwargs": {...}}``
- Nested function call: recursively evaluate ``args``/``kwargs``, then call the
registered function ````.
Expressions can be arbitrarily nested, allowing compound derivations to be expressed
in a single JSON structure.
.. _custom-functions:
Custom Functions Registry
--------------------------
All functions available to the ``formula``, ``operation``, and ``dataset_function``
calculation types are registered in the dictionary
:data:`access_moppy.derivations.custom_functions`.
Built-in operations
^^^^^^^^^^^^^^^^^^^
.. list-table::
:widths: 30 70
:header-rows: 1
* - Name
- Description
* - ``add``
- Sum of any number of arguments: ``a + b + c + ...``
* - ``subtract``
- Difference: ``a - b``
* - ``multiply``
- Product: ``a * b``
* - ``divide``
- Ratio: ``a / b``
* - ``power``
- Exponentiation: ``a ** b``
* - ``sum``
- ``xarray.DataArray.sum(**kwargs)``
* - ``mean``
- Arithmetic mean of multiple arguments
* - ``kelvin_to_celsius``
- ``x - 273.15``
* - ``celsius_to_kelvin``
- ``x + 273.15``
* - ``isel``
- Select a single index slice: ``x.isel(**kwargs)``
* - ``calculate_monthly_minimum``
- Resample to monthly minimum
* - ``calculate_monthly_maximum``
- Resample to monthly maximum
* - ``drop_axis``
- Drop a named dimension/axis
* - ``drop_time_axis``
- Drop the time dimension (for time-invariant fields stored in time-varying files)
* - ``squeeze_axis``
- Squeeze (remove) size-1 dimensions
Atmosphere functions
^^^^^^^^^^^^^^^^^^^^
Defined in :mod:`access_moppy.derivations.calc_atmos`.
.. list-table::
:widths: 35 65
:header-rows: 1
* - Name
- Description
* - ``cl_level_to_height``
- Convert cloud fraction from hybrid-height levels to physical height levels
* - ``cli_level_to_height``
- Convert cloud ice content from hybrid-height levels to physical height levels
* - ``clw_level_to_height``
- Convert cloud liquid water from hybrid-height levels to physical height levels
* - ``level_to_height``
- Generic hybrid-height level → physical height conversion
* - ``calculate_areacella``
- Compute atmospheric grid-cell area from lat/lon coordinates
Aerosol functions
^^^^^^^^^^^^^^^^^
Defined in :mod:`access_moppy.derivations.calc_aerosol`.
.. list-table::
:widths: 35 65
:header-rows: 1
* - Name
- Description
* - ``optical_depth``
- Sum spectral band optical depths to produce a broadband aerosol optical depth
Land functions
^^^^^^^^^^^^^^
Defined in :mod:`access_moppy.derivations.calc_land`.
.. list-table::
:widths: 35 65
:header-rows: 1
* - Name
- Description
* - ``calc_topsoil``
- Extract top-soil layer diagnostic
* - ``calc_landcover``
- Derive land cover fractions from tile data
* - ``extract_tilefrac``
- Extract a specific tile fraction
* - ``weighted_tile_sum``
- Weighted sum over surface tiles
* - ``calc_carbon_pool_kg_m2``
- Convert carbon pool units to kg m⁻²
* - ``calc_cland_with_wood_products``
- Total land carbon including wood products
* - ``calc_mass_pool_kg_m2``
- Convert mass pool to kg m⁻²
* - ``calc_nitrogen_pool_kg_m2``
- Convert nitrogen pool units to kg m⁻²
* - ``calc_mrsfl``
- Compute frozen soil moisture
* - ``calc_mrsll``
- Compute liquid soil moisture
* - ``calc_mrsol``
- Compute total soil moisture
* - ``calc_tsl``
- Compute soil temperature profile
Ocean functions
^^^^^^^^^^^^^^^
Defined in :mod:`access_moppy.derivations.calc_ocean`.
.. list-table::
:widths: 35 65
:header-rows: 1
* - Name
- Description
* - ``calc_areacello``
- Compute ocean grid-cell area
* - ``calc_hfds``
- Downward ocean heat flux (composite of runoff, coupler, P-E terms, plus optional frazil)
* - ``calc_hfgeou``
- Upward geothermal heat flux
* - ``calc_msftbarot``
- Barotropic mass streamfunction
* - ``calc_overturning_streamfunction``
- Meridional overturning circulation streamfunction
* - ``calc_rsdoabsorb``
- Shortwave radiation absorbed in ocean
* - ``calc_global_ave_ocean``
- Volume-weighted global ocean average
* - ``calc_total_mass_transport``
- Total mass transport across an ocean section
* - ``calc_umo_corrected``
- Zonal mass transport corrected for barotropic flow
* - ``calc_vmo_corrected``
- Meridional mass transport corrected for barotropic flow
* - ``calc_zostoga``
- Global mean thermosteric sea level change
* - ``ocean_floor``
- Extract ocean floor (bottom-cell) values
Sea ice functions
^^^^^^^^^^^^^^^^^
Defined in :mod:`access_moppy.derivations.calc_seaice`.
.. list-table::
:widths: 35 65
:header-rows: 1
* - Name
- Description
* - ``calc_seaice_extent``
- Sea ice extent (area where concentration > 15 %)
* - ``calc_hemi_seaice``
- Hemisphere-specific sea ice aggregate
* - ``calc_siarean`` / ``calc_siareas``
- Northern/southern hemisphere sea ice area
* - ``calc_sivoln`` / ``calc_sivols``
- Northern/southern hemisphere sea ice volume
* - ``calc_sisnmassn`` / ``calc_sisnmasss``
- Northern/southern hemisphere sea ice snow mass
* - ``calc_siextentn`` / ``calc_siextents``
- Northern/southern hemisphere sea ice extent
.. _zaxis-field:
Vertical Axis (``zaxis``) Field
---------------------------------
For variables defined on vertical levels the mapping entry may include a ``zaxis``
block that describes the vertical coordinate:
.. code-block:: json
"zaxis": {
"type": "hybrid_height",
"coordinate_variables": {
"sigma_theta": "b",
"surface_altitude": "orog",
"theta_level_height": "lev"
},
"formula": "z = a + b*orog"
}
- ``type``: currently always ``"hybrid_height"`` (UM eta-based hybrid height coordinate).
- ``coordinate_variables``: mapping from the UM variable name (key) to the CMIP output
coordinate name (value).
- ``formula``: human-readable label for the vertical coordinate reconstruction formula.
The actual vertical interpolation is carried out by the ``dataset_function``
registered functions (e.g. ``level_to_height``) using the auxiliary variables
identified in ``coordinate_variables``.
Resource Files
--------------
Some variables (e.g. ``areacello``, ``zfull``) are derived from static ancillary
data that is bundled with ACCESS-MOPPy rather than read from user-supplied files.
These are listed in the ``ressource_file`` field (note the non-standard spelling,
kept for historical compatibility).
Bundled resource files live under::
src/access_moppy/resources/
When ``ressource_file`` is set and no ``input_data`` is provided to
``ACCESS_ESM_CMORiser``, the bundled file is resolved via
:func:`importlib.resources.files` and used automatically.
CMIP7 Compound Name Translation
--------------------------------
CMIP7 uses a longer "branded" compound name format:
``realm.variable.operation.frequency.domain``
(e.g. ``atmos.tas.tavg-h2m-hxy-u.mon.glb``).
The files ``cmip7_to_cmip6_compound_name_mapping.json`` and
``cmip6_to_cmip7_compound_name_mapping.json`` provide a bidirectional look-up
table between these names and the familiar CMIP6 ``table.variable`` form.
These mappings are generated from the official CMIP7 Data Request API and contain
~1 974 entries. The function
:func:`~access_moppy.utilities._get_cmip7_to_cmip6_mapping` resolves a CMIP7 name
to its CMIP6 equivalent (with support for regex patterns when a single exact match
is not available).
The resolved CMIP6 name is then passed to :func:`~access_moppy.utilities.load_model_mappings`
as usual, so the variable-level mapping files only need to be maintained in CMIP6
terms.
Adding New Mappings
-------------------
To add support for a new variable, open the relevant model mapping JSON file and add
an entry under the appropriate component key.
Checklist
^^^^^^^^^
1. Identify the correct **component** (``atmosphere``, ``ocean``, etc.) based on the
model realm.
2. Use the **CMIP6 variable short name** as the JSON key.
3. Fill in all required fields: ``CF standard Name``, ``dimensions``, ``units``,
``positive``, ``model_variables``, ``calculation``.
4. Choose the simplest applicable ``calculation.type``:
- Single variable, no transform → ``direct``
- Arithmetic on two variables → ``operation``
- Custom function with ≥ 1 argument → ``formula``
- Dataset-level level interpolation → ``dataset_function``
- No input data needed → ``internal``
5. If the function you need does not yet exist in
:data:`~access_moppy.derivations.custom_functions`, implement it in the
appropriate ``calc_*.py`` module under :mod:`access_moppy.derivations`, import
it in :mod:`access_moppy.derivations.__init__`, and register it in the
``custom_functions`` dictionary.
6. Run the test suite to ensure no regressions.
Example — adding a new atmosphere variable
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Suppose you want to add ``huss`` (near-surface specific humidity, ``fld_s03i237``):
.. code-block:: json
"huss": {
"CF standard Name": "specific_humidity",
"dimensions": {"time": "time", "lat": "lat", "lon": "lon"},
"units": "1",
"positive": null,
"model_variables": ["fld_s03i237"],
"calculation": {
"type": "direct",
"formula": "fld_s03i237"
}
}
Adding a new model
^^^^^^^^^^^^^^^^^^
1. Create ``src/access_moppy/mappings/_mappings.json`` following the
same top-level structure (``model_info`` + component keys).
2. Pass ``model_id=""`` to ``ACCESS_ESM_CMORiser`` to activate the new
mapping file.
3. If the model uses a different CMORiser class (e.g. a new ocean component), implement
a :class:`~access_moppy.base.CMORiser` subclass and wire it up in
:mod:`access_moppy.driver`.