Variable Mapping Reference
This page is intended for developers who need to understand how ACCESS-MOPPy maps raw ACCESS model output variables to CMIP-compliant output, or who want to add support for new variables or models.
Overview
ACCESS-MOPPy uses JSON mapping files to describe how raw model variables (e.g.
UM STASH codes such as fld_s02i208, or MOM5/MOM6 diagnostics such as temp)
correspond to CMIP output variables (e.g. Amon.rsds).
At runtime, load_model_mappings() reads the appropriate
JSON file, finds the requested CMIP variable, and returns the mapping dictionary.
The relevant CMORiser subclass then uses the mapping to
load, transform, and write the data.
The mapping system also handles CMIP7 compound names transparently: a CMIP7 name is first resolved to its CMIP6 equivalent via a separate translation table, and the CMIP6 mapping is then applied as normal.
Mapping Files
Location
All mapping files live inside the installed package under:
src/access_moppy/mappings/
The files shipped with ACCESS-MOPPy are:
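| File | Description |
|---|---|
| ACCESS-ESM1.6_mappings.json | Primary mapping file for ACCESS-ESM1.6 (atmosphere, ocean, land, sea ice, aerosol) |
| ACCESS-CM3_mappings.json | Mapping file for ACCESS-CM3 (atmosphere, ocean) |
| ACCESS-OM3_mappings.json | Mapping file for ACCESS-OM3 (ocean, time-invariant) |
| cmip7_to_cmip6_compound_name_mapping.json | Translation table: CMIP7 branded name → CMIP6 table.variable |
| cmip6_to_cmip7_compound_name_mapping.json | Translation table: CMIP6 table.variable → CMIP7 branded name |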
Selecting a mapping file
The mapping file to use is determined by the model_id argument of
ACCESS_ESM_CMORiser (default: "ACCESS-ESM1.6").
load_model_mappings() constructs the filename as
{model_id}_mappings.json and looks for it inside the access_moppy.mappings
package resource directory.
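The filename construction can be sketched as follows. This is a minimal illustration only: the helper `load_mapping_file` and its `mappings_dir` parameter are hypothetical, and it reads from an ordinary directory, whereas the real load_model_mappings() looks inside the package resource directory.

```python
import json
from pathlib import Path


def mapping_filename(model_id: str = "ACCESS-ESM1.6") -> str:
    """Build the mapping filename from the {model_id}_mappings.json pattern."""
    return f"{model_id}_mappings.json"


def load_mapping_file(mappings_dir: str, model_id: str = "ACCESS-ESM1.6") -> dict:
    """Hypothetical helper: read the mapping file for model_id from a directory."""
    path = Path(mappings_dir) / mapping_filename(model_id)
    if not path.is_file():
        # An unknown model_id has no matching file, so fail with a clear message.
        raise FileNotFoundError(f"No mapping file for model_id={model_id!r}: {path}")
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

An unknown model_id therefore surfaces as a missing file rather than as an empty mapping.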
Top-level Structure of a Mapping File
Each mapping file is a JSON object with the following top-level keys:
{
"model_info": { ... },
"aerosol": { "var1": { ... }, "var2": { ... } },
"atmosphere": { "var1": { ... }, "var2": { ... } },
"land": { "var1": { ... }, "var2": { ... } },
"landIce": { "var1": { ... }, "var2": { ... } },
"ocean": { "var1": { ... }, "var2": { ... } },
"sea_ice": { "var1": { ... }, "var2": { ... } }
}
model_info
A metadata block describing the model and which components have mappings.
{
"model_id": "ACCESS-ESM1.6",
"components": ["aerosol", "atmosphere", "land", "landIce", "ocean", "sea_ice"],
"description": "Variable mappings for ACCESS-ESM1.6 Earth System Model"
}
Each component key (aerosol, atmosphere, etc.) maps CMIP variable names to
their entry dictionaries. When load_model_mappings() is
called with, say, compound_name="Amon.rsds", it extracts the CMIP name rsds
and searches each component in turn until it finds the entry.
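The component search described above can be sketched as below. The function name `find_variable_entry` is hypothetical and the code only mirrors the behaviour described here; it is not the actual body of load_model_mappings().

```python
def find_variable_entry(mapping: dict, compound_name: str):
    """Return (component, entry) for a CMIP6 compound name such as 'Amon.rsds'."""
    # The CMIP variable name is the part after the last dot: 'Amon.rsds' -> 'rsds'.
    cmip_name = compound_name.rsplit(".", 1)[-1]
    for component, entries in mapping.items():
        if component == "model_info":
            continue  # metadata block, not a component
        if cmip_name in entries:
            return component, entries[cmip_name]
    raise KeyError(f"No mapping found for {compound_name!r}")


# Tiny illustrative mapping (not real ACCESS-MOPPy content).
example_mapping = {
    "model_info": {"model_id": "ACCESS-ESM1.6"},
    "atmosphere": {"rsds": {"units": "W m-2"}},
    "ocean": {"tos": {"units": "degC"}},
}
component, entry = find_variable_entry(example_mapping, "Amon.rsds")
```

Because components are searched in turn, the first component that contains the variable name wins.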
Variable Entry Fields
Each variable entry inside a component block shares the same set of optional and required fields:
| Field | Required | Description |
|---|---|---|
| CF standard Name | Yes | The CF conventions standard name for the output variable. May be an empty string. |
| dimensions | Yes | Ordered dictionary that maps model dimension names (keys) to CMIP dimension names (values). This tells the CMORiser how to rename coordinates. Example: "dimensions": {"time": "time", "lat": "lat", "lon": "lon"} |
| units | Yes | Expected physical units of the CMIP output variable (e.g. "W m-2"). |
| positive | Yes | Sign convention (e.g. "down"), or null when none applies. |
| model_variables | Yes | List of raw model variable names that must be loaded from the input files. These are passed by name into the calculation context. |
| calculation | Yes | Dictionary that specifies how to derive the output variable. See Calculation Types below. |
| zaxis | No | Present for variables on vertical levels. Describes the vertical coordinate type and the variables needed to reconstruct it. See Vertical Axis (zaxis) Field below. |
| ressource_file | No | Name of a bundled NetCDF resource file (stored under src/access_moppy/resources/). |
Example — simple direct variable
"rldscs": {
"CF standard Name": "surface_downwelling_longwave_flux_in_air_assuming_clear_sky",
"dimensions": {"time": "time", "lat": "lat", "lon": "lon"},
"units": "W m-2",
"positive": "down",
"model_variables": ["fld_s02i208"],
"calculation": {
"type": "direct",
"formula": "fld_s02i208"
}
}
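When writing entries by hand, a quick structural check can catch missing fields early. The sketch below is an illustration only; `validate_entry` is a hypothetical helper, not part of ACCESS-MOPPy, and it checks just the required fields and calculation types listed on this page.

```python
REQUIRED_FIELDS = (
    "CF standard Name",
    "dimensions",
    "units",
    "positive",
    "model_variables",
    "calculation",
)
CALCULATION_TYPES = {"direct", "formula", "operation", "dataset_function", "internal"}


def validate_entry(name: str, entry: dict) -> list:
    """Return a list of human-readable problems; an empty list means the entry looks complete."""
    problems = [
        f"{name}: missing required field {field!r}"
        for field in REQUIRED_FIELDS
        if field not in entry
    ]
    if "calculation" in entry:
        calc_type = entry["calculation"].get("type")
        if calc_type not in CALCULATION_TYPES:
            problems.append(f"{name}: unknown calculation type {calc_type!r}")
    return problems


rldscs = {
    "CF standard Name": "surface_downwelling_longwave_flux_in_air_assuming_clear_sky",
    "dimensions": {"time": "time", "lat": "lat", "lon": "lon"},
    "units": "W m-2",
    "positive": "down",
    "model_variables": ["fld_s02i208"],
    "calculation": {"type": "direct", "formula": "fld_s02i208"},
}
problems = validate_entry("rldscs", rldscs)
```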
Calculation Types
The calculation dictionary always contains a "type" key. The five supported
types are described below.
direct
The output variable is taken straight from one model variable with no transformation.
"calculation": {
"type": "direct",
"formula": "<model_variable_name>"
}
formula
The output is derived by calling a registered function from the
custom_functions registry (see
Custom Functions Registry).
"calculation": {
"type": "formula",
"operation": "<function_name>",
"args": ["<var1>", "<var2>", ...],
"kwargs": {"<key>": "<var_or_literal>"}
}
- args is a list of positional arguments. Each item is either a string (a variable name looked up in the input context), a number (used as-is), or a nested expression dictionary (see Expression Language below).
- kwargs is a dictionary of keyword arguments. Values follow the same rules as args items.

Alternatively, operands may be used instead of args for legacy entries; both are treated identically by the expression evaluator.
Optional operands example (ocean hfds):
"calculation": {
"type": "formula",
"operation": "calc_hfds",
"args": ["sfc_hflux_from_runoff", "sfc_hflux_coupler", "sfc_hflux_pme"],
"kwargs": {
"frazil_3d_int_z": {"optional": "frazil_3d_int_z"},
"frazil_2d": {"optional": "frazil_2d"}
}
}
Wrapping a value in {"optional": "<var>"} means the variable is passed as
None if it is absent from the input dataset, instead of raising a
KeyError.
operation
A shorthand for common two-argument arithmetic operations. Functionally equivalent
to formula but expressed more compactly:
"calculation": {
"type": "operation",
"operation": "<op_name>",
"args": ["<var1>", "<var2>"]
}
Supported operation values: "add", "subtract", "multiply",
"divide", "power".
Example (land npp — net primary productivity divided by tile fraction):
"calculation": {
"type": "operation",
"operation": "divide",
"args": ["fld_s03i262", "fld_s03i395"]
}
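One plausible way to picture the operation type is a lookup table from operation name to a binary function. The sketch below uses Python's operator module and a hypothetical `apply_operation` helper; the real implementation goes through the custom_functions registry and works on xarray objects rather than plain floats.

```python
import operator

# Assumed one-to-one correspondence between operation names and Python operators.
ARITHMETIC_OPS = {
    "add": operator.add,
    "subtract": operator.sub,
    "multiply": operator.mul,
    "divide": operator.truediv,
    "power": operator.pow,
}


def apply_operation(calculation: dict, context: dict):
    """Evaluate a two-argument 'operation' calculation against loaded variables."""
    func = ARITHMETIC_OPS[calculation["operation"]]
    # Strings are variable names in the context; numbers are used as-is.
    left, right = (
        context[arg] if isinstance(arg, str) else arg for arg in calculation["args"]
    )
    return func(left, right)


calc = {"type": "operation", "operation": "divide", "args": ["fld_s03i262", "fld_s03i395"]}
# Scalar stand-ins for the two model fields, purely for illustration.
result = apply_operation(calc, {"fld_s03i262": 6.0, "fld_s03i395": 0.5})
```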
dataset_function
Calls a more complex dataset-level function that receives the entire xarray Dataset and may modify dimensions or coordinates (e.g. interpolating from hybrid-height levels to physical height levels).
"calculation": {
"type": "dataset_function",
"function": "<function_name>",
"kwargs": {}
}
Available dataset_function values: "cl_level_to_height",
"cli_level_to_height", "clw_level_to_height", "level_to_height".
These functions are defined in
access_moppy.derivations.calc_atmos and registered in
custom_functions.
internal
The output variable is computed entirely internally from ancillary information (grid geometry, etc.) without reading any user-provided input file.
"calculation": {
"type": "internal",
"function": "<function_name>",
"args": []
}
Currently the only available function is "calculate_areacella" (atmospheric
grid-cell area, computed from latitude/longitude coordinate arrays).
Variables that use this type do not require input_data to be passed to
ACCESS_ESM_CMORiser.
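For orientation, the area of a regular latitude/longitude cell on a sphere is R² · Δλ · (sin φ_north − sin φ_south). The sketch below applies that standard formula to cell edges given in degrees; it is not the body of calculate_areacella, and the Earth radius value and the use of cell edges (rather than cell centres) are assumptions.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius in metres


def cell_areas(lat_edges_deg, lon_edges_deg, radius=EARTH_RADIUS_M):
    """Return a nested list [lat][lon] of grid-cell areas in m2 from cell edges in degrees."""
    areas = []
    for south, north in zip(lat_edges_deg[:-1], lat_edges_deg[1:]):
        # Area of a latitude band per radian of longitude, divided by radius**2.
        band = math.sin(math.radians(north)) - math.sin(math.radians(south))
        row = [
            radius**2 * band * math.radians(east - west)
            for west, east in zip(lon_edges_deg[:-1], lon_edges_deg[1:])
        ]
        areas.append(row)
    return areas


areas = cell_areas([-90, -30, 0, 30, 90], [0, 90, 180, 270, 360])
total_area = sum(sum(row) for row in areas)
```

A useful sanity check for any such routine is that the cell areas of a global grid sum to the surface area of the sphere, 4πR².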
Expression Language
The formula calculation type uses a small recursive expression language that is
evaluated by evaluate_expression().
An expression can be one of:
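| Expression form | Meaning |
|---|---|
| String | Look up the named variable in the input context (an xarray DataArray). |
| Number | A literal numeric value (integer or float). |
| Literal dictionary | Explicit literal; useful when the value might be a string or otherwise ambiguous. |
| {"optional": "<var>"} | Look up the variable; return None if it is absent. |
| Nested expression dictionary | Nested function call: recursively evaluate the arguments, then call the named function. |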
Expressions can be arbitrarily nested, allowing compound derivations to be expressed in a single JSON structure.
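A minimal recursive evaluator in the spirit of this description might look like the sketch below. It is not the actual evaluate_expression(): the function name `evaluate`, the explicit `functions` argument, and the assumption that a nested call uses the same "operation"/"args"/"kwargs" keys as a formula calculation are all illustrative. The explicit-literal form is left out because its exact key is not shown here.

```python
def evaluate(expr, context: dict, functions: dict):
    """Recursively evaluate an expression against a variable context."""
    # Numbers are literals and are returned unchanged.
    if isinstance(expr, (int, float)):
        return expr
    # Strings name a variable in the input context (KeyError if absent).
    if isinstance(expr, str):
        return context[expr]
    if isinstance(expr, dict):
        # Optional variables resolve to None instead of raising KeyError.
        if "optional" in expr:
            return context.get(expr["optional"])
        # Nested call: evaluate arguments first, then call the named function.
        if "operation" in expr:
            func = functions[expr["operation"]]
            raw_args = expr.get("args", expr.get("operands", []))
            args = [evaluate(arg, context, functions) for arg in raw_args]
            kwargs = {
                key: evaluate(value, context, functions)
                for key, value in expr.get("kwargs", {}).items()
            }
            return func(*args, **kwargs)
    raise TypeError(f"Unsupported expression: {expr!r}")


functions = {"add": lambda *xs: sum(xs), "multiply": lambda a, b: a * b}
context = {"x": 2.0, "y": 3.0}
nested = {"operation": "add", "args": ["x", {"operation": "multiply", "args": ["y", 2]}]}
value = evaluate(nested, context, functions)
```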
Custom Functions Registry
All functions available to the formula, operation, and dataset_function
calculation types are registered in the dictionary
access_moppy.derivations.custom_functions.
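Conceptually, the registry is a plain dictionary from function name to callable. The sketch below uses a local stand-in dictionary with a hypothetical derivation `calc_net_flux` and a hypothetical `call_registered` helper; it does not import or modify the real access_moppy.derivations.custom_functions.

```python
def calc_net_flux(down, up):
    """Hypothetical derivation: net downward flux from two components."""
    return down - up


# Local stand-in for the real registry: name -> callable.
custom_functions = {"calc_net_flux": calc_net_flux}


def call_registered(name, *args, **kwargs):
    """Look up a function by its registered name and call it."""
    try:
        func = custom_functions[name]
    except KeyError:
        raise KeyError(f"Function {name!r} is not registered") from None
    return func(*args, **kwargs)


net = call_registered("calc_net_flux", 5.0, 2.0)
```

A mapping entry refers to a function only by its registered name, so a function that is defined but not added to the dictionary cannot be used from a mapping file.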
Built-in operations
| Name | Description |
|---|---|
| add | Sum of any number of arguments |
| subtract | Difference |
| multiply | Product |
| divide | Ratio |
| power | Exponentiation |
|  | Arithmetic mean of multiple arguments |
|  | Select a single index slice |
|  | Resample to monthly minimum |
|  | Resample to monthly maximum |
|  | Drop a named dimension/axis |
|  | Drop the time dimension (for time-invariant fields stored in time-varying files) |
|  | Squeeze (remove) size-1 dimensions |
Atmosphere functions
Defined in access_moppy.derivations.calc_atmos.
| Name | Description |
|---|---|
| cl_level_to_height | Convert cloud fraction from hybrid-height levels to physical height levels |
| cli_level_to_height | Convert cloud ice content from hybrid-height levels to physical height levels |
| clw_level_to_height | Convert cloud liquid water from hybrid-height levels to physical height levels |
| level_to_height | Generic hybrid-height level → physical height conversion |
| calculate_areacella | Compute atmospheric grid-cell area from lat/lon coordinates |
Aerosol functions
Defined in access_moppy.derivations.calc_aerosol.
| Name | Description |
|---|---|
|  | Sum spectral band optical depths to produce a broadband aerosol optical depth |
Land functions
Defined in access_moppy.derivations.calc_land.
| Name | Description |
|---|---|
|  | Extract top-soil layer diagnostic |
|  | Derive land cover fractions from tile data |
|  | Extract a specific tile fraction |
|  | Weighted sum over surface tiles |
|  | Convert carbon pool units to kg m⁻² |
|  | Total land carbon including wood products |
|  | Convert mass pool to kg m⁻² |
|  | Convert nitrogen pool units to kg m⁻² |
|  | Compute frozen soil moisture |
|  | Compute liquid soil moisture |
|  | Compute total soil moisture |
|  | Compute soil temperature profile |
Ocean functions
Defined in access_moppy.derivations.calc_ocean.
| Name | Description |
|---|---|
|  | Compute ocean grid-cell area |
| calc_hfds | Downward ocean heat flux (composite of runoff, coupler, P-E terms, plus optional frazil) |
|  | Upward geothermal heat flux |
|  | Barotropic mass streamfunction |
|  | Meridional overturning circulation streamfunction |
|  | Shortwave radiation absorbed in ocean |
|  | Volume-weighted global ocean average |
|  | Total mass transport across an ocean section |
|  | Zonal mass transport corrected for barotropic flow |
|  | Meridional mass transport corrected for barotropic flow |
|  | Global mean thermosteric sea level change |
|  | Extract ocean floor (bottom-cell) values |
Sea ice functions
Defined in access_moppy.derivations.calc_seaice.
| Name | Description |
|---|---|
|  | Sea ice extent (area where concentration > 15 %) |
|  | Hemisphere-specific sea ice aggregate |
|  | Northern/southern hemisphere sea ice area |
|  | Northern/southern hemisphere sea ice volume |
|  | Northern/southern hemisphere sea ice snow mass |
|  | Northern/southern hemisphere sea ice extent |
Vertical Axis (zaxis) Field
For variables defined on vertical levels the mapping entry may include a zaxis
block that describes the vertical coordinate:
"zaxis": {
"type": "hybrid_height",
"coordinate_variables": {
"sigma_theta": "b",
"surface_altitude": "orog",
"theta_level_height": "lev"
},
"formula": "z = a + b*orog"
}
- type: currently always "hybrid_height" (UM eta-based hybrid height coordinate).
- coordinate_variables: mapping from the UM variable name (key) to the CMIP output coordinate name (value).
- formula: human-readable label for the vertical coordinate reconstruction formula.
The actual vertical interpolation is carried out by the dataset_function
registered functions (e.g. level_to_height) using the auxiliary variables
identified in coordinate_variables.
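The formula label z = a + b*orog can be made concrete with a small worked example. The sketch below uses plain nested lists and assumes that a is the per-level height term, b the per-level sigma term, and orog the two-dimensional surface altitude; the function name `hybrid_height` is hypothetical and this is not the library's level_to_height implementation.

```python
def hybrid_height(a, b, orog):
    """Return z[k][j][i] = a[k] + b[k] * orog[j][i] as nested lists (metres)."""
    return [
        [[a_k + b_k * height for height in row] for row in orog]
        for a_k, b_k in zip(a, b)
    ]


a = [20.0, 100.0]      # per-level height term (assumed metres)
b = [1.0, 0.5]         # per-level sigma term (dimensionless)
orog = [[0.0, 200.0]]  # surface altitude on a 1 x 2 grid (metres)
z = hybrid_height(a, b, orog)
# Over the sea point (orog = 0) the level heights equal a; over the 200 m
# point the lowest level follows the terrain fully and the next one half as much.
```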
Resource Files
Some variables (e.g. areacello, zfull) are derived from static ancillary
data that is bundled with ACCESS-MOPPy rather than read from user-supplied files.
These are listed in the ressource_file field (note the non-standard spelling,
kept for historical compatibility).
Bundled resource files live under:
src/access_moppy/resources/
When ressource_file is set and no input_data is provided to
ACCESS_ESM_CMORiser, the bundled file is resolved via
importlib.resources.files() and used automatically.
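Resolving a packaged file with importlib.resources.files() can be sketched as below. The helper `resolve_bundled_resource` is hypothetical, and the package name "access_moppy.resources" mentioned afterwards is an assumption inferred from the directory path above.

```python
from importlib.resources import files


def resolve_bundled_resource(package: str, filename: str):
    """Return a Traversable for a file shipped inside an importable package."""
    resource = files(package).joinpath(filename)
    if not resource.is_file():
        raise FileNotFoundError(f"{filename!r} is not bundled with package {package!r}")
    return resource


# Demonstrated on a standard-library package so the sketch runs anywhere.
decoder_source = resolve_bundled_resource("json", "decoder.py")
```

With ACCESS-MOPPy installed, the same call pattern would take the resources package (assumed to be "access_moppy.resources") and the filename given in ressource_file.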
CMIP7 Compound Name Translation
CMIP7 uses a longer “branded” compound name format:
realm.variable.operation.frequency.domain
(e.g. atmos.tas.tavg-h2m-hxy-u.mon.glb).
The files cmip7_to_cmip6_compound_name_mapping.json and
cmip6_to_cmip7_compound_name_mapping.json provide a bidirectional look-up
table between these names and the familiar CMIP6 table.variable form.
These mappings are generated from the official CMIP7 Data Request API and contain
approximately 1,974 entries. The function
_get_cmip7_to_cmip6_mapping() resolves a CMIP7 name
to its CMIP6 equivalent (with support for regex patterns when a single exact match
is not available).
The resolved CMIP6 name is then passed to load_model_mappings()
as usual, so the variable-level mapping files only need to be maintained in CMIP6
terms.
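The look-up with a regex fallback can be sketched as follows. This is an assumption-laden illustration, not the body of _get_cmip7_to_cmip6_mapping(): it assumes the requested name is tried as a regular expression against the table keys when no exact key exists, the helper name `resolve_cmip7_name` is hypothetical, and the two table entries are illustrative samples rather than values copied from the shipped JSON file.

```python
import re


def resolve_cmip7_name(cmip7_name: str, table: dict) -> str:
    """Resolve a CMIP7 branded name (or pattern) to a CMIP6 table.variable name."""
    # 1. An exact match always wins.
    if cmip7_name in table:
        return table[cmip7_name]
    # 2. Otherwise treat the request as a regular expression over the keys.
    matches = {cmip6 for key, cmip6 in table.items() if re.fullmatch(cmip7_name, key)}
    if len(matches) == 1:
        return matches.pop()
    raise LookupError(f"{cmip7_name!r} matched {len(matches)} CMIP6 names, expected 1")


# Illustrative sample entries only.
sample_table = {
    "atmos.tas.tavg-h2m-hxy-u.mon.glb": "Amon.tas",
    "atmos.pr.tavg-u-hxy-u.mon.glb": "Amon.pr",
}
exact = resolve_cmip7_name("atmos.tas.tavg-h2m-hxy-u.mon.glb", sample_table)
by_pattern = resolve_cmip7_name(r"atmos\.pr\..*\.mon\.glb", sample_table)
```

Requiring a unique result keeps an overly broad pattern from silently selecting the wrong variable.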
Adding New Mappings
To add support for a new variable, open the relevant model mapping JSON file and add an entry under the appropriate component key.
Checklist
1. Identify the correct component (atmosphere, ocean, etc.) based on the model realm.
2. Use the CMIP6 variable short name as the JSON key.
3. Fill in all required fields: CF standard Name, dimensions, units, positive, model_variables, calculation.
4. Choose the simplest applicable calculation.type:
   - Single variable, no transform → direct
   - Arithmetic on two variables → operation
   - Custom function with ≥ 1 argument → formula
   - Dataset-level level interpolation → dataset_function
   - No input data needed → internal
5. If the function you need does not yet exist in custom_functions, implement it in the appropriate calc_*.py module under access_moppy.derivations, import it in access_moppy.derivations.__init__, and register it in the custom_functions dictionary.
6. Run the test suite to ensure no regressions.
Example — adding a new atmosphere variable
Suppose you want to add huss (near-surface specific humidity, fld_s03i237):
"huss": {
"CF standard Name": "specific_humidity",
"dimensions": {"time": "time", "lat": "lat", "lon": "lon"},
"units": "1",
"positive": null,
"model_variables": ["fld_s03i237"],
"calculation": {
"type": "direct",
"formula": "fld_s03i237"
}
}
Adding a new model
1. Create src/access_moppy/mappings/<MODEL_ID>_mappings.json following the same top-level structure (model_info + component keys).
2. Pass model_id="<MODEL_ID>" to ACCESS_ESM_CMORiser to activate the new mapping file.
3. If the model uses a different CMORiser class (e.g. a new ocean component), implement a CMORiser subclass and wire it up in access_moppy.driver.