Volume 11, Issue 12 p. 4513-4558
Research Article
Open Access

UKESM1: Description and Evaluation of the U.K. Earth System Model

Alistair A. Sellar

Corresponding Author

Alistair A. Sellar

Met Office Hadley Centre, Exeter, UK

Correspondence to: A. A. Sellar,

[email protected]

Search for more papers by this author
Colin G. Jones

Colin G. Jones

National Centre for Atmospheric Science, UK

School of Earth and Environment, University of Leeds, Leeds, UK

Search for more papers by this author
Jane P. Mulcahy

Jane P. Mulcahy

Met Office Hadley Centre, Exeter, UK

Search for more papers by this author
Yongming Tang

Yongming Tang

Met Office Hadley Centre, Exeter, UK

Search for more papers by this author
Andrew Yool

Andrew Yool

National Oceanography Centre, Southampton, UK

Search for more papers by this author
Andy Wiltshire

Andy Wiltshire

Met Office Hadley Centre, Exeter, UK

Search for more papers by this author
Fiona M. O'Connor

Fiona M. O'Connor

Met Office Hadley Centre, Exeter, UK

Search for more papers by this author
Marc Stringer

Marc Stringer

National Centre for Atmospheric Science, UK

Department of Meteorology, University of Reading, Reading, UK

Search for more papers by this author
Richard Hill

Richard Hill

Met Office Hadley Centre, Exeter, UK

Search for more papers by this author
Julien Palmieri

Julien Palmieri

National Oceanography Centre, Southampton, UK

Search for more papers by this author
Stephanie Woodward

Stephanie Woodward

Met Office Hadley Centre, Exeter, UK

Search for more papers by this author
Lee de Mora

Lee de Mora

Plymouth Marine Laboratory, Plymouth, UK

Search for more papers by this author
Till Kuhlbrodt

Till Kuhlbrodt

National Centre for Atmospheric Science, UK

Department of Meteorology, University of Reading, Reading, UK

Search for more papers by this author
Steven T. Rumbold

Steven T. Rumbold

National Centre for Atmospheric Science, UK

Department of Meteorology, University of Reading, Reading, UK

Search for more papers by this author
Douglas I. Kelley

Douglas I. Kelley

UK Centre for Ecology and Hydrology, Wallingford, UK

Search for more papers by this author
Rich Ellis

Rich Ellis

UK Centre for Ecology and Hydrology, Wallingford, UK

Search for more papers by this author
Colin E. Johnson

Colin E. Johnson

National Centre for Atmospheric Science, UK

School of Earth and Environment, University of Leeds, Leeds, UK

Search for more papers by this author
Jeremy Walton

Jeremy Walton

Met Office Hadley Centre, Exeter, UK

Search for more papers by this author
Nathan Luke Abraham

Nathan Luke Abraham

National Centre for Atmospheric Science, UK

Department of Chemistry, University of Cambridge, Cambridge, UK

Search for more papers by this author
Martin B. Andrews

Martin B. Andrews

Met Office Hadley Centre, Exeter, UK

Search for more papers by this author
Timothy Andrews

Timothy Andrews

Met Office Hadley Centre, Exeter, UK

Search for more papers by this author
Alex T. Archibald

Alex T. Archibald

National Centre for Atmospheric Science, UK

Department of Chemistry, University of Cambridge, Cambridge, UK

Search for more papers by this author
Ségolène Berthou

Ségolène Berthou

Met Office Hadley Centre, Exeter, UK

Search for more papers by this author
Eleanor Burke

Eleanor Burke

Met Office Hadley Centre, Exeter, UK

Search for more papers by this author
Ed Blockley

Ed Blockley

Met Office Hadley Centre, Exeter, UK

Search for more papers by this author
Ken Carslaw

Ken Carslaw

National Centre for Atmospheric Science, UK

School of Earth and Environment, University of Leeds, Leeds, UK

Search for more papers by this author
Mohit Dalvi

Mohit Dalvi

Met Office Hadley Centre, Exeter, UK

Search for more papers by this author
John Edwards

John Edwards

Met Office Hadley Centre, Exeter, UK

Search for more papers by this author
Gerd A. Folberth

Gerd A. Folberth

Met Office Hadley Centre, Exeter, UK

Search for more papers by this author
Nicola Gedney

Nicola Gedney

Met Office Hadley Centre, Exeter, UK

Search for more papers by this author
Paul T. Griffiths

Paul T. Griffiths

National Centre for Atmospheric Science, UK

Department of Chemistry, University of Cambridge, Cambridge, UK

Search for more papers by this author
Anna B. Harper

Anna B. Harper

Department of Mathematics, University of Exeter, Exeter, UK

Search for more papers by this author
Maggie A. Hendry

Maggie A. Hendry

Met Office Hadley Centre, Exeter, UK

Search for more papers by this author
Alan J. Hewitt

Alan J. Hewitt

Met Office Hadley Centre, Exeter, UK

Search for more papers by this author
Ben Johnson

Ben Johnson

Met Office Hadley Centre, Exeter, UK

Search for more papers by this author
Andy Jones

Andy Jones

Met Office Hadley Centre, Exeter, UK

Search for more papers by this author
Chris D. Jones

Chris D. Jones

Met Office Hadley Centre, Exeter, UK

Search for more papers by this author
James Keeble

James Keeble

National Centre for Atmospheric Science, UK

Department of Chemistry, University of Cambridge, Cambridge, UK

Search for more papers by this author
Spencer Liddicoat

Spencer Liddicoat

Met Office Hadley Centre, Exeter, UK

Search for more papers by this author
Olaf Morgenstern

Olaf Morgenstern

NIWA, Wellington, New Zealand

Search for more papers by this author
Robert J. Parker

Robert J. Parker

National Centre for Earth Observation, Leicester, UK

Earth Observation Science, School of Physics and Astronomy, University of Leicester, Leicester, UK

Search for more papers by this author
Valeriu Predoi

Valeriu Predoi

National Centre for Atmospheric Science, UK

Department of Meteorology, University of Reading, Reading, UK

Search for more papers by this author
Eddy Robertson

Eddy Robertson

Met Office Hadley Centre, Exeter, UK

Search for more papers by this author
Antony Siahaan

Antony Siahaan

British Antarctic Survey, Cambridge, UK

Search for more papers by this author
Robin S. Smith

Robin S. Smith

National Centre for Atmospheric Science, UK

Department of Meteorology, University of Reading, Reading, UK

Search for more papers by this author
Ranjini Swaminathan

Ranjini Swaminathan

Department of Meteorology, University of Reading, Reading, UK

National Centre for Earth Observation, Leicester, UK

Search for more papers by this author
Matthew T. Woodhouse

Matthew T. Woodhouse

CSIRO Climate Science Centre, Aspendale, Australia

Search for more papers by this author
Guang Zeng Mohamed Zerroukat

Mohamed Zerroukat

Met Office Hadley Centre, Exeter, UK

Search for more papers by this author
First published: 30 October 2019
Citations: 429

Abstract

We document the development of the first version of the U.K. Earth System Model UKESM1. The model represents a major advance on its predecessor HadGEM2-ES, with enhancements to all component models and new feedback mechanisms. These include a new core physical model with a well-resolved stratosphere; terrestrial biogeochemistry with coupled carbon and nitrogen cycles and enhanced land management; tropospheric-stratospheric chemistry allowing the holistic simulation of radiative forcing from ozone, methane, and nitrous oxide; two-moment, five-species, modal aerosol; and ocean biogeochemistry with two-way coupling to the carbon cycle and atmospheric aerosols. The complexity of coupling between the ocean, land, and atmosphere physical climate and biogeochemical cycles in UKESM1 is unprecedented for an Earth system model. We describe in detail the process by which the coupled model was developed and tuned to achieve acceptable performance in key physical and Earth system quantities and discuss the challenges involved in mitigating biases in a model with complex connections between its components. Overall, the model performs well, with a stable pre-industrial state and good agreement with observations in the latter period of its historical simulations. However, global mean surface temperature exhibits stronger-than-observed cooling from 1950 to 1970, followed by rapid warming from 1980 to 2014. Metrics from idealized simulations show a high climate sensitivity relative to previous generations of models: Equilibrium climate sensitivity is 5.4 K, transient climate response ranges from 2.68 to 2.85 K, and transient climate response to cumulative emissions is 2.49 to 2.66 K TtC−1.

Key Points

  • UKESM1 represents a major advance over its predecessor HadGEM2-ES, both in the complexity of its components and its internal coupling
  • The complex coupling presents challenges to the model development; we document the tuning process employed to obtain acceptable performance
  • UKESM1 performs well, having a stable pre-industrial state and showing good agreement with observations in a wide variety of contexts

Plain Language Summary

We describe the development and behavior of UKESM1, a novel climate model that includes improved representations of processes in the atmosphere, ocean, and on land. These processes are inter-related: For example, dust is produced on the land and blown up into the atmosphere where it affects the amount of sunlight falling on Earth. Dust can also be dissolved in the ocean, where it affects marine life. This in turn changes both the amount of carbon dioxide absorbed by the ocean and the material emitted from the surface into the atmosphere, which has an affect on the formation of clouds. UKESM1 includes many processes and interactions such as these, giving it a high level of complexity. Ensuring realistic process behavior is a major challenge in the development of our model, and we have carefully tested this. UKESM1 performs well, correctly exhibiting stable results from a continuous pre-industrial simulation (used to provide a reference for future experiments) and showing good agreement with observations toward the end of its historical simulations. Results for some properties—including the degree to which average surface temperature changes with increased amounts of carbon dioxide in the atmosphere—are examined in detail.

1 Introduction

Government and society require reliable guidance on how the Earth's climate will evolve in the coming century and how this can be mitigated through changes in human activity. This requires detailed understanding of how the climate will respond to future emissions of pollutants, as well as to anthropogenic land use. In addition to the physical climate response, there is a growing need to understand the response of the global biosphere and associated biogeochemical cycles, both to a changing climate and to increasing concentrations of atmospheric CO2. Changes in the efficiency of the Earth's natural carbon cycle may provide a strong feedback onto future climate warming, reducing available carbon emissions to realize key policy targets, such as limiting global warming to less than 2 °C above pre-industrial levels (Friedlingstein, 2015). Warming and increased absorption of CO2 by the global oceans are expected to have negative impacts on seawater chemistry and ocean ecosystems (Gehlen et al., 2014), while warming on land may trigger vegetation change (Walther, 2010) as well as permafrost thaw and release of ancient carbon stores (Burke et al., 2018; Chadburn et al., 2017). Changes in atmospheric composition and temperature have the potential to alter the rate and balance of chemical reactions with implications for stratospheric ozone recovery (Chipperfield et al., 2017) and future air quality (Silva et al., 2017). To address these challenges, the U.K.'s Met Office and Natural Environment Research Council (NERC) have jointly developed the U.K. Earth System Model (UKESM1) as a successor to the HadGEM2-ES model (Collins et al., 2011). UKESM1 will deliver simulations to the Coupled Model Intercomparison Project Phase 6 (CMIP6; Eyring et al., 2016) and will be a community research tool for studying past and future climate.

UKESM1 uses the coupled climate model HadGEM3-GC3.1 (GC3.1 hereafter, Kuhlbrodt et al., 2018; Williams et al., 2018) as its physical core, with the following components interactively coupled to GC3.1: terrestrial carbon and nitrogen cycles, including dynamic vegetation and representation of agricultural land use change; ocean biogeochemistry (BGC) with prognostic diatom and non-diatom concentrations and a unified troposphere-stratosphere chemistry model, tightly coupled to a multi-species modal aerosol scheme. These additional Earth system components, as well as couplings across components and coupling with the physical climate model, allow a number of important policy and science questions to be addressed that would not be possible with GC3.1. Primary among these are estimates of available carbon emission budgets commensurate with limiting global warming below specific targets (Matthews et al., 2017). Introducing prognostic atmospheric chemistry into GC3.1 allows an assessment of the mitigation potential from reducing short lived climate forcers such as methane, black carbon, and tropospheric ozone (Collins et al., 2018; Stohl et al., 2015), while coupling atmospheric chemistry to terrestrial biogeochemistry opens up the potential to study the risk of permafrost thaw, methane release, and their impacts on Arctic warming. A full atmosphere treatment of ozone chemistry supports a joined-up study of stratospheric ozone recovery and climate change as well as analysis of interactions between climate warming, tropospheric ozone, and vegetation (Sadiq et al., 2017). In the marine domain, interactive treatment of ocean heat and carbon uptake, carbonate chemistry, and ocean biology enables future risks around ocean acidification and warming to be assessed (Bopp et al., 2013). Finally, feedbacks from changes in Earth system components of the model on future climate warming can also be investigated.

To provide a model capable of addressing the above questions, in developing UKESM1, we emphasized Earth system process-completeness, particularly with respect to couplings between prognostic model components, across different model domains (e.g., atmosphere, ocean, land, and ice) and between different process parametrizations. We aimed to make UKESM1 internally consistent, using the model to predict processes as much as possible, rather than prescribe them from an external data source. This treatment increases the model degrees of freedom when predicting future changes in the coupled Earth system. As one example we highlight some of the interactions involved in the UKESM1 representation of dust; similar interactions exist for numerous other model processes and variables. Dust is produced from the land surface, depending on model predicted vegetation type, soil moisture, and surface winds. The dust produced is then advected by the model atmosphere and interacts with the radiation parametrization. Atmospheric dust can be deposited into the model ocean, thousands of kilometers from where it was produced, and act as a source of soluble iron influencing ocean biological activity and thus marine uptake of CO2. Surface chlorophyll, resulting from this marine biological activity, feeds into surface emission parametrizations of dimethyl sulphide (DMS) and primary marine organic aerosol, both of which increase cloud condensation nuclei in the model atmosphere, influencing the model's radiation and cloud microphysical schemes and thus the physical climate response. Retaining such interactions in a fully prognostic sense means that UKESM1 is one of the most process-complete Earth system models available today.

Developing an Earth system model that retains this degree of process realism, process couplings, and prognostic treatment comes with risks. Primary among these is constraining the performance of the new Earth system components that often do not have the same degree of observational cover as more traditional physical climate processes. In addition, the increased degree of coupling between processes, as well as the preferential use of model predicted fields as input to dependent parametrizations (e.g., predicted vegetation type, seawater DMS, and atmospheric chemical oxidants) rather than prescribed, temporally constant fields, increases the risk that errors in one model component will negatively impact dependent variables or parametrizations in another part of the model, with the danger of a cascade of biased interactions driving the overall simulated climate to be significantly in error. In developing UKESM1, when a new Earth system process was included or made prognostic when it had previously been constrained by prescribed input, we carefully assessed the performance of the full model as well as the dependent model components. Our aim was to ensure new processes performed as realistically as possible and that dependent processes as well as the full model retained an acceptable level of skill when assessed against a basket of performance indicators. If the introduction of a new Earth system process or prognostic coupling was neutral or positive for model performance, then it was included. If the impact was negative, the degree of performance degradation was weighed against the potential benefit of more realistic future projections. Wherever possible we leaned toward inclusion of increased process realism or coupling.

In section 2 we describe the component models of UKESM1 and how they are coupled to one another. Section 3 discusses the methodology for tuning the fully coupled Earth system model to achieve acceptable performance. In section 4 we evaluate the model against observations and examine a few of its emergent properties, and finally in section 5 we summarize and discuss some of the challenges arising from the development of UKESM1. Appendix Appendix C describes some non-standard configurations of the model: first, configurations of reduced complexity intended for carbon cycle research and second, the configuration which runs with prescribed emissions (rather than concentrations) of CO2. Appendix Appendix D lists the code versions and simulation identifiers for experiments used in this work.

2 Component Models and Coupling

UKESM1 is built upon component models which have been developed to study climate processes for specific science domains such as ocean biogeochemistry or atmospheric composition. Each of these has been evaluated and tuned in uncoupled configurations (atmosphere-only, ocean-only, or offline land), before their inclusion in the coupled Earth system model. Its physical core GC3.1 has been tested in coupled atmosphere-land-ocean-sea ice mode, but without feedbacks from any of the additional Earth system components in UKESM1. The components themselves have been developed as models in their own right, with corresponding descriptions in the literature. Below we list the component models and their primary references; in Appendix Appendix A we give more detailed descriptions.
  • Physical core atmosphere-land-ocean-sea ice model: HadGEM3-GC3.1 (Kuhlbrodt et al., 2018; Williams et al., 2018)
  • Terrestrial biogeochemistry: JULES (Joint UK Land Environment Simulator; Clark et al., 2011), with developments to plant physiology and functional types (Harper et al., 2016, 2018), land use, and the introduction of nitrogen cycling.
  • Ocean biogeochemistry: MEDUSA (Model of Ecosystem Dynamics, nutrient Utilisation, Sequestration and Acidification; Yool et al., 2013), with the addition of a diagnostic submodel for DMS surface seawater concentration (Anderson et al., 2001) to support interactive DMS emissions to the atmosphere.
  • Atmospheric composition: UKCA (United Kingdom Chemistry and Aerosol model; Archibald et al., 2019; Mulcahy et al., 2018) with unified stratospheric-tropospheric chemistry and close coupling between chemistry and aerosols.

UKESM1 is a successor to HadGEM2-ES (Collins et al., 2011), and many of its components, namely, the land surface, atmospheric chemistry, and the core physical model, have been developed from that model's components. The most significant developments for century-scale climate projections are (a) the inclusion of interactive nitrogen cycling in the terrestrial carbon cycle, which addresses a common bias across CMIP5 models to simulate excessive land carbon uptake, and (b) the interactive stratospheric chemistry, which enables us to predict the radiative forcing of ozone, methane, and nitrous oxide in the full model domain and allows stratospheric ozone recovery to evolve with climate change. Additionally, aerosol is represented by prognostic mass and number, allowing better representation of aerosol size distribution and hence cloud-aerosol forcing.

Of prime importance for an Earth system model (ESM) is the coupling between its components, since these couplings determine which feedbacks the model is capable of simulating and hence the potential applications of the model. Table 1 lists the couplings which are active in the default configuration of UKESM1, with a summary of the rationale for their inclusion. UKESM1 uses the OASIS3-MCT coupler (Craig et al., 2017) as the interface between marine and atmospheric components. Coupling information is exchanged between the ocean and atmosphere every 3 hr. Between the land and atmosphere the coupling frequency is every dynamical model timestep (20 min). Couplings involving aerosols, chemistry, or radiation are performed once per hour. Ocean physics, sea ice, and BGC are coupled every ocean model timestep (45 min).

Table 1. Coupled Interactions Between Earth System Components
Components Coupling Reason for inclusion
Land-ocean-atmosphere Carbon pools coupled to atmospheric CO2 concentration and climate Simulate changes in natural carbon sinks and sources
Land Carbon cycle coupled to nitrogen availability Improve carbon uptake process representation and enable future coupling to atmospheric nitrogen chemistry
Chemistry-radiation Radiation uses 3D concentrations of N2O, CH4 and O3 from chemistry Improve representation of GHG radiative forcing and simulate radiative response to changes in chemistry
Chemistry-water vapor Two-way coupling between water vapor and chemistry* Improve representation of H2O sources and simulate radiative response to changes in chemistry
Aerosol-radiation-cloud Prognostic aerosol size distribution is used by radiation and cloud microphysics* Improve representation of aerosol radiative forcing
Chemistry-aerosol Oxidation processes for secondary organic* and sulphate aerosol coupled to chemistry Improve representation of aerosol radiative forcing
Clouds-chemistry Photolytic reaction rates depend on 3D radiative intensity* Simulate chemical response to cloud and ozone changes
Vegetation-chemistry-aerosol Interactive emissions of biogenic volatile organic compounds* Simulate chemical and aerosol response to changes in vegetation productivity and distribution
Vegetation-aerosol-ocean Dust emissions are coupled to simulated bare soil fraction Include dependence of airborne dust on vegetation distribution
Aerosol-ocean Deposition of iron in mineral dust Simulate biological response to dust production and transport
Ocean-aerosol Emissions of PMOA* & DMS depend on ocean BGC Changes in ocean biology can impact aerosol-cloud feedbacks
  • Note. Asterisk (*) indicates an enhancement in capability relative to HadGEM2-ES.

3 Performance Tuning of Final Coupled Model

In the final stages of development of any coupled climate model or ESM, there is a phase of fine-tuning model parameters or settings in order to minimize model biases and satisfy high-level performance metrics such as global TOA radiation balance, water, and energy conservation. For discussion of these issues and a summary of common practices, see, for example, Mauritsen et al. (2012), Hourdin et al. (2016), Schmidt et al. (2017). This process is generally more complex for an ESM because of the large number of interactions between component models, and the concomitant potential for biases in one component to degrade the performance of other parts of the coupled system. In this section we summarize all of the parameters tuned in the development of the fully coupled UKESM1 and the rationale for that tuning. Note that prior to coupled testing, the component models were developed and tuned in uncoupled mode, and this tuning is described in the documentation papers for those components (Harper et al., 2016, 2018; Morgenstern et al., 2009; Mulcahy et al., 2018; O’Connor et al., 2014; Walters et al., 2019; Williams et al., 2018; Yool et al., 2013). The parameter values from that component model tuning acted as the starting point for the tuning of UKESM1; in the case of physical model parameters, the starting point was the values used in GC3.1 (the physical core of UKESM1).

The goals of the UKESM1 tuning were twofold: first, to keep long-term global mean carbon fluxes and net TOA radiation close to zero under pre-industrial forcing, to avoid significant drifts in either the climate or carbon pools, and second, to avoid large biases between observations and the model state. Despite the majority of reference observations being taken in the last 40 years, most of the UKESM1 coupled model tuning was performed using pre-industrial tests with perpetual 1850 forcing, during the model's spin-up phase. Of course, present-day tests would allow a cleaner comparison to observations, but such tests are complicated by the fact that present-day climate and the carbon cycle are not in equilibrium with their forcing and by the difficulty of choosing initial conditions. The only truly reliable way to test the present-day impact of changes to ESM components is to allow the change to spin-up in a pre-industrial configuration and then run transient historical tests through to the present-day. However, this is prohibitively slow to perform iteratively and instead we made pragmatic judgements to account for expected differences between the pre-industrial and present-day model state.

A wide range of model outputs were used to inform the tuning process. These included metrics (single numbers such as global mean TOA radiation or global total plant productivity) as well as maps of key model variables. In Appendix Appendix B, we list the main metrics and variables which were used during the tuning process to identify priority biases and monitor the stability of the pre-industrial state. While these summarize the primary tuning targets, other variables were examined from time to time, particularly when a particular bias or process needed checked in more detail.

Now that we have completed pre-industrial spin-up and historical simulations with the tuned model, it is possible to assess the accuracy of these judgements with hindsight. To this end we have performed two “untuned” experiments parallel to the pre-industrial and historical simulations described in section 4. The untuned configuration is identical to the final model except that the parameters described below have been reset to their original pre-tuned values. The pre-industrial untuned run branched from the beginning of the UKESM1 pre-industrial control (i.e., from the end of the spin-up described in section 4), while the untuned historical run branched from the year 1990 of a single member of the UKESM1 historical ensemble. Table 2 presents a set of metrics summarizing the biases which were tackled by this tuning exercise. The first 5 years have been discarded to exclude initial spin-up effects, and there is minimal drift in these metrics during the 20-year period over which they are calculated. In practice, the tuning was guided by a wide array of information including the pattern of biases across multiple quantities as described above; the metrics in Table 2 serve only to illustrate the impact of the tuning at a headline level.

Table 2. Metrics Illustrating Biases Targeted by the Coupled Model Tuning, Showing the Impact of This Tuning on Pre-Industrial and Present-Day Performance
Pre-industrial Present-day Present-day
Untuned Tuned Untuned Tuned Observations
Net TOA down (W m−2) −0.81 −0.09 0.23 0.81 0.61–0.81
Snow SW metrica (W m−2) 44.4 52.5 45.1 54.3 50.5–53.6
Bare soil Aus. (106km2) 5.2 3.0 4.7 2.8 2.5–4.0
Dust errorb (unit-less) 0.89 0.44 0.82 0.47 0
GPP (GtC year−1) 122 117 139 130 113–125
  • Note. Pre-industrial refers to years 5–25 of the pre-industrial control. Present-day refers to years 1995–2014 of the fifth ensemble member of the historical experiment. See text for observations details and section 4 for model experiment definitions. RMS = root-mean-square.
  • a TOA outgoing clear-sky SW over land, 30°N–60°N mean.
  • b RMS( urn:x-wiley:jame:media:jame20999:jame20999-math-0001) for dust concentration and AOD observations (oi) and their model counterparts (mi).

3.1 Snow-Vegetation Interactions

The most significant tuning of the coupled ESM concerned the interaction between shortwave (SW) radiation, snow, and vegetation. We found it necessary to tune the parameters governing this interaction in order to reduce a substantial SW radiation bias. The extent to which vegetation is covered by snow makes an enormous difference to surface albedo, and there is a large diversity in how models parametrize this process (Boisier et al., 2012; Bright et al., 2015; Ménard et al., 2014). JULES represents the bending and partial burying of vegetation by snow (Ménard et al., 2014) on a per-PFT basis, by reducing the leaf area index (LAI) used in the albedo calculation as snow depth increases (Walters et al., 2019). When the depth of snow (d) is less than the canopy height hi of a given PFT, the partial burying is described by the equation
urn:x-wiley:jame:media:jame20999:jame20999-math-0002(1)
where i denotes the PFT, Li is the true LAI for each PFT, and Li is the “exposed” LAI used in the albedo calculation. The parameter ni thus governs the rate at which vegetation is covered by snow as snow depth increases, but it is not well constrained by observations and must be tuned for a given LAI distribution. The LAI in GC3.1 is prescribed using a seasonal climatology derived from MODIS satellite observations, while the phenology in UKESM1 is simulated interactively using the scheme of Clark et al. (2011). Figure 1 shows LAI for C3 grass in the two models in JJA and DJF. GC3.1 shows an almost complete loss of LAI in winter in central Asia and North America. This seasonal reduction is more extreme than seen in other observational products (Garrigues et al., 2008) and may in part result from the obscuring of vegetation by snow as seen by the satellite data which is used in constructing the LAI climatology. Even the needleleaf tree PFT in this climatology sees LAI reductions of 80–95% in regions dominated by evergreen needleleaf trees (not shown). Therefore, in GC3.1 there is a little need for the parametrization in equation 1, since LAI is already very small and ni is accordingly tuned to very low values.
Details are in the caption following the image
Seasonal changes in leaf area index. Left: prescribed LAI in GC3.1, right: simulated LAI in UKESM1 1995–2014 (mean of nine historical simulations), top row: June–August, middle row: December–February, bottom row: JJA minus DJF expressed as a percentage of the JJA value.

In contrast, the LAI in UKESM1 is more similar between summer and winter for grasses and evergreen PFTs, with reductions of 20–35%. This wintertime LAI may be too high because UKESM1 does not fully represent the phenology of grasses, but the large difference from the LAI climatology used in GC3.1 requires a different tuning for the snow-vegetation parametrization. Accordingly ni is given higher values in UKESM1, particularly for grasses, so that the radiative effect of snow partially covering vegetation is taken into account. The specific values of ni were chosen to maximize the similarity in grid box mean wintertime surface albedo between UKESM1 and GC3.1, to avoid large differences in regional mean radiative fluxes. Using the GC3.1 values of ni in UKESM1 produces a widespread negative bias in DJF clear-sky outgoing SW across northern mid-latitude land masses (Figure 2c). The tuning of ni increases the wintertime albedo in seasonally snow-covered regions, increasing the TOA outgoing clear-sky SW (Figure 2b), which reduces the negative bias but does introduce positive bias in some regions (Figure 2d). The net impact of this tuning can be summarized by the second metric in Table 2, defined as TOA outgoing clear-sky SW mean over land between 30°N and 60°N for DJF, representing the clear-sky radiative impact of snow on TOA radiation. A crude observational range of 50.5–53.6 W m−2 has been derived by calculating the same metric for two satellite data sets: CERES-EBAF Ed4.0 (Loeb et al., 2018) and ISCCP-FD (Zhang et al., 2004), using the 2000–2010 period common to both data sets. Table 2 shows that the impact of the tuning was to increase this metric from well below the observational range to just above it.

Details are in the caption following the image
Clear-sky outgoing SW (W m−2) during DJF from UKESM1 historical simulations (1995–2015). (a) Tuned UKESM1 model climatology, (b) UKESM1 minus parallel simulation with original GC3.1 values of ni, (c) bias of untuned simulation relative to CERES-EBAF, (d) bias of tuned UKESM1 model relative to CERES-EBAF.

3.2 Bare Soil and Dust Emissions

In UKESM1, emissions of mineral dust are tightly coupled to the fraction of bare soil in a grid box and hence are very sensitive to biases in vegetation fraction in arid and semi-arid areas. In particular the model, prior to tuning, was prone to excessive emissions of dust in Australia, India, and central Asia. We tuned bare soil fraction in these regions by increasing the vegetation dynamics parameter laimin (Clark et al., 2011) for grass PFTs. When LAI for a given PFT exceeds this laimin, some excess productivity is allocated to horizontal spread of the PFT within the grid box. For the same productivity, smaller values of laimin allow greater vegetation fractions and hence smaller bare soil area. Two satellite land cover data sets were used as a reference for this tuning: European Space Agency (ESA) Climate Change Initiative Land Cover data (CCI-LC; Poulter et al., 2015) and that from the International Geosphere-Biosphere Programme Land Use and Cover Change project (IGBP-LUCC; Loveland et al., 2000). The impact of increasing laimin is illustrated by considering the total area of Australian bare soil in the model, which is reduced by nearly a factor of two to 2.8 ×106km2, covering around one third of the continent. The resulting bare soil area lies between the equivalent values for IGBP-LUCC and CCI-LC of 2.5 ×106km2 and 4.0×106km2, respectively. Bare soil areas in central Asia were similarly reduced, but the same is not true for India where the model suffers from a deficit in monsoon rainfall (Williams et al., 2018), which leads to such low plant productivity that the laimin parameter has little or no impact.

Even with reduced biases in vegetation cover, there was also a need to tune the parameters of the dust emission scheme itself. These have been described by Woodward (2011), and in summary they control the emission response to soil moisture and friction velocity and include an overall scaling factor for the total emission. These were tuned to maximize agreement with in situ surface concentration measurements from the University of Miami network (18 sites), and Aerosol Robotic Network (AERONET; Holben et al., 2001) total aerosol optical depth (AOD) measurements at eight dust-dominated sites. While regional metrics are used in the tuning process (see section 4.6), overall performance is summarized by a single global dust error metric, defined as the RMS difference between urn:x-wiley:jame:media:jame20999:jame20999-math-0003 and urn:x-wiley:jame:media:jame20999:jame20999-math-0004 for all of these University of Miami and AERONET observations. The square root is used as a weighting function to increase the contribution of lower concentrations at remote sites to the global metric. The calculation uses seasonal means of the model and observations; each site receives equal weight; thus the concentration observations contribute more strongly than AOD: 18 versus 8 sites. The final UKESM1 configuration has a global dust error of 0.47 (unit-less), which is just over half the error of the untuned configuration; the reduction comes from a combination of the changes to dust parameters and the reduction in bare soil area described above. For context, the value of this metric is 0.51 for GC3.1 and was 0.98 for HadGEM2-ES.

3.3 Terrestrial Productivity

Carbon fluxes and stores are key performance indicators for an Earth system model, due to their importance in regulating carbon feedbacks. The global total GPP of plants in UKESM1 was tuned downward to better agree with observations by reducing the quantum efficiency of photosynthesis. Quantum efficiency is specified separately for each PFT, and the UKESM1 values were adjusted to the lower end of the ranges found by the literature review of Skillman (2008). To inform this tuning we combined GPP from our pre-industrial tests with an estimate of the likely pre-industrial-to-present-day change based on previous models and compared this against the present-day GPP estimated by the FLUXNET model tree ensembles (MTE) data product of Jung et al. (2011). The goal was to improve both the annual mean and seasonal cycle of the global total GPP. The global total GPP for the last 20 years of the resulting UKESM1 historical simulations is 130 GtC year−1, a little higher than the Jung et al. (2011) 5–95% range of 119±6 GtC year−1.

3.4 Surface Chlorophyll

The ocean-atmosphere coupling of PMOA emissions described in section A5 makes aerosol emission dependent upon the interactively simulated surface chlorophyll concentrations, which allows the model to capture potential feedbacks between ocean biogeochemistry and climate. However, ocean models typically have larger biases in ocean chlorophyll than other marine biogeochemistry properties (Kwiatkowski et al., 2014; Yool et al., 2013), posing the risk of introducing a large bias in aerosol load. We mitigate this risk by multiplying the chlorophyll concentration by a global scaling factor. A value of 0.5 is used for this factor, chosen to produce a similar global mean outgoing SW radiation flux at TOA to that in the atmosphere-only implementation of the PMOA emission scheme, which was driven by the observation-based GlobColour data set (Maritorena et al., 2010). This scaling factor reflects the fact that MEDUSA chlorophyll concentrations are higher than those of GlobColour. Note that this scaling of surface chlorophyll is performed only within the PMOA emission parametrization: There is no impact on the prognostic chlorophyll simulated by the ocean model.

3.5 Energy Balance

The physical core of UKESM1 was already well-tuned before adding ESM components (Kuhlbrodt et al., 2018; Williams et al., 2018), and with the exception of the snow-vegetation interaction above it was not deemed necessary to re-tune the physical model parameters. The TOA radiation balance was however altered by the addition of ESM components, through a combination of changes in cloud, surface albedo, and radiatively active gases. Without further tuning the net downward radiation at TOA in 1850 was −0.81 W m−2, which would have resulted in a significant downward drift in model temperatures. We brought the TOA radiation into balance by tuning parameters in the Anderson et al. (2001) parametrization of DMS seawater concentration to permit lower minimum values of DMS. The standard configuration of this parametrization has a prescribed minimum value of 2.29 nM for seawater DMS concentration, while gridded observational DMS data sets, such as those of Kettle et al. (1999) or Lana et al. (2011), contain significantly smaller values than this over large regions of the ocean. Anderson et al. (2001) themselves point out that the data from which their parametrization is derived is likely to contain a sampling bias toward higher values, so a lower minimum is reasonable. We reduced the minimum from 2.29 to 1.00 nM, extending the Anderson et al. (2001) linear relationship (between DMS concentration and log10 of chlorophyll concentration, surface SW, and nutrient availability) to lower values of DMS. The ensuing reduction in DMS gave widespread decreases in cloud droplet number (and thus cloud albedo) across the Southern Ocean and stratocumulus regions, resulting in a drop in reflected SW at TOA of 2 to 5 W m−2 over large areas of the ocean.

Using the ocean DMS parametrization to balance the TOA, rather than for example cloud parameters, allows greater traceability between UKESM1 and its physical core GC3.1, by minimizing the number of physical model parameter changes between the two models. While the goal of this tuning was to balance the pre-industrial TOA, we note that the mean TOA over the last 20 years of the tuned historical run is within the Johnson et al. (2016) present-day observational estimate of 0.61–0.81 W m−2 (5–95% confidence).

3.6 Effectiveness of Model Tuning

Despite the difficulties noted at the start of this section, the tuning exercise was successful in reducing the biases it sought to address. The metrics in table 2 which summarize these biases are all improved relative to the untuned equivalents; that is, the pre-industrial net TOA is closer to zero, and present-day values of the other metrics are closer to their observed values. Recall that the tuning decisions were based mainly on pre-industrial tests, so the present-day performance could not be assured until historical simulations were completed. In the case of the snow-vegetation interactions affecting the outgoing SW metric, the tuning possibly went too far since the present-day value of the “snow SW” metric changed from low-biased to high-biased, though still closer to the observations. This illustrates the imprecise nature of the tuning exercise.

Finally we note that, while the parallel runs presented in Table 2 give a good indication of the magnitude of the direct impact of tuning, they are likely to underestimate the full impact. This is because these tests are designed to be relatively short (25 years), in order to minimize confounding internal variability. While there is minimal drift after the first 5 years, the increased GPP and dust emissions in particular would produce slow responses in terrestrial carbon stores and ocean biogeochemistry respectively which could take more than a century to equilibrate. Assessing the full impact of these changes would require the untuned pre-industrial configuration to be run to equilibrium, before initializing a new ensemble of historical simulations; such an exercise is too costly to perform at this time.

We have not yet assessed the impact of this tuning on emergent properties such as radiative forcing or climate sensitivity (as done for example by Mauritsen et al., 2012); this will be considered in future work.

4 Evaluation and Characterization of Model Behavior

In this section we characterize the behavior of the model, using coupled simulations that form the core set of experiments for CMIP6 (Eyring et al., 2016):
  1. piControl: a pre-industrial control with 1850 forcing; over 1,100 years of this simulation have been completed.
  2. abrupt4xCO2: as piControl but with CO2 concentration instantaneously quadrupled; 150 years duration.
  3. 1pctCO2: as piControl but with CO2 concentration increasing at 1% per year; 150 years duration.
  4. historical: simulating the evolving climate since the pre-industrial era, with transient forcing from 1850 to 2014.

The pre-industrial control is initialized from the end of a 500-year coupled spin-up. The coupled spin-up was itself preceded by 5,000 years of ocean-only simulation which used an asynchronous coupling procedure, and by 1,000 years of land-only spin-up. The spin-up methodology and evolution will be documented in a companion paper in this special issue. Historical results in the present paper make use of nine members of an initial condition ensemble, with each member initialized from a different date in the pre-industrial control. Unless otherwise stated, results from the historical simulation use a mean of these nine ensemble members.

In section 4.1 we explore the drift and variability of the pre-industrial control. Sections 4.2-4.6 evaluate the latter period of the historical simulations with reference to observations. Finally in section 4.7 we examine emergent properties of the model diagnosed from the full historical period and from the idealized CO2 experiments.

4.1 Pre-Industrial Trends and Variability

The pre-industrial control simulation is important for quantifying the model's natural variability and any long-term drifts in the model state. Significant drifts in climate or the carbon cycle would compromise the model's utility for many applications, particularly in assessing future Earth system responses to anthropogenic emissions. Figure 3 shows how five key properties of the model vary over the pre-industrial control. Generally drift is very small relative to internal variability.

Details are in the caption following the image
Time series of key model properties for the first 1,125 years of the pre-industrial control. From top to bottom: global mean net TOA radiation; global mean 1.5 m air temperature; hemispheric sea ice extent (area of model grid boxes with >15% ice cover); global total area of aggregated plant functional types (note reduced scale for shrubs); cumulative carbon uptake (dashed lines are ±0.1GtCyear−1). In the upper three panels, thin lines represent annual means and thick lines a centered 21-year rolling mean. In the lower two panels, all lines represent annual means or accumulations.

There is a relatively constant outgassing of CO2 from the ocean at a rate of ≈0.02GtCyear−1, slowing slightly as the run progresses, and a release of soil carbon of a similar magnitude. Both of these, and the total carbon uptake, are well below the ±0.1GtCyear−1 threshold set by C4MIP (Jones et al., 2016) as a guide for acceptable drift in carbon pools, shown as dotted lines in Figure 3.

Net TOA radiation is very close to zero, which ensures that there are no discernible drifts in surface temperature or sea ice extent. TOA radiation does drift very slowly over the course of the run with a mean of 0.009 W m−2 over the first 500 years, and 0.038 W m−2 over years 500 to 1,000.

Surface air temperature shows strong multi-century variability, most notably during years 400 to 900, with swings of up to 0.3 K over a 100-year period (Figure 3). This is linked to episodes of oceanic deep convection in the Southern Ocean which bring warm saline deep Atlantic water to the surface, leading to a significant reduction in sea ice and loss of accumulated ocean heat to the model atmosphere. The warm, convecting, phases of these cycles coincide with reductions in Antarctic sea ice cover.

Aggregated vegetation types show variability on a range of timescales. Grasses expand and retract into regions of bare soil on interannual and decadal timescales in response to local variations in rainfall and surface climate, resulting in a clear anti-correlation between grass and bare soil in Figure 3. Tree coverage evolves on a much slower timescale, with a small decrease of around 1% during the first 600 years of the run, followed by a slight recovery, mostly at the expense of shrubs.

The UKESM1 historical simulations are initialized using atmosphere-sea ice-land-ocean conditions taken at different points in the pre-industrial control simulation. Ideally, these initial conditions will sample the model's multi-decadal internal variability so that each historical run is initialized using well-separated states of the pre-industrial coupled model. To achieve this, we consider two large-scale modes of sea surface temperature (SST) multi-decadal variability, the Inter-decadal Pacific Oscillation (IPO; Power et al., 1999; Zhang et al., 1997) and the Atlantic Multi-decadal Oscillation (AMO; Kerr, 2000).

The IPO pattern is the largest multi-decadal quasi-periodic SST variability over the Pacific basin. It is represented by the first area-weighted Empirical Orthogonal Function (EOF) of deseasonalized, detrended, and low-pass filtered monthly SSTs. Figure 4 shows the IPO pattern for HadISST1.1 (1870–2017; Rayner et al., 2003) and for the UKESM1 pre-industrial control simulation. The IPO index is the principal component of this EOF. The AMO pattern is a quasi-periodic oscillation of North Atlantic SSTs. The AMO index is calculated as the area-weighted mean time series of the deseasonalized, detrended, and low-pass filtered North Atlantic SSTs. Low-pass filtering the data minimizes the influence of the El Nino/Southern Oscillation (ENSO). Observed and modeled AMO pattern are shown in Figure 5.

Details are in the caption following the image
First area-weighted EOF of Pacific SST, representing the spatial pattern of the observed IPO, scaled by the standard deviation of the IPO index. (a) HadISST1.1 and (b) pre-industrial control. Monthly SST and sea ice data are used in both cases. Grid points containing sea ice during any point in the data set are masked prior to calculating the mode patterns.
Details are in the caption following the image
AMO pattern, calculated as the North Atlantic SST correlation with the area-weighted mean North Atlantic SST time series, scaled by the standard deviation of the AMO index. (a) HadISST1.1 and (b) pre-industrial control. Monthly SST and sea ice data are used in both cases. Grid points containing sea ice during any point in the data set are masked prior to calculating the mode patterns.

The observed and simulated patterns are similar, for both modes of variability. The most notable difference between the two IPO patterns is the westward extent of the equatorial warm region, which extends further toward the maritime continent in the simulation. The AMO patterns differ in terms of the northerly position of the warmest region. Nevertheless, the similarity of these climate mode patterns provides confidence that the simulation is, to first order, replicating the observed multi-decadal variability in the Pacific and North Atlantic basins.

The phase space spanned by the observed and modeled IPO and AMO indices is shown in Figure 6. The observations show no clear relationship between the IPO and AMO and populate all four sectors of phase space. The model has a much longer time series and hence explores the phase space more fully. Consistent with the observations, the IPO and AMO appear to be independent of each other in the model simulation. We use the simulation phase space to choose well-separated initial conditions for each historical run (marked in Figure 6), thus ensuring that the beginning of the historical ensemble samples the multi-decadal variability of the ocean. For the same reason, we also prescribed that initial conditions for historical members were separated by at least 30 years in the pre-industrial control.

Details are in the caption following the image
The phase space of IPO versus AMO monthly indices, showing time series of these indices expressed as deviations from their mean value, normalized by their respective standard deviation. Left: observations (148 years); right: the first 833 years of the pre-industrial control simulation. The initial conditions chosen for each historical run are marked with a red star.

The model also exhibits a realistic simulation of the El Nino/Southern Oscillation (ENSO). The addition of ESM components has made no significant difference to the ENSO behavior of the physical model GC3.1 documented in Kuhlbrodt et al. (2018). We have repeated the ENSO analysis of Kuhlbrodt et al. (2018), and the results are statistically indistinguishable (not shown).

4.2 Present-Day Surface Climate

In Figure 7 we evaluate climatological seasonal mean sea surface temperature against the HadISST1.1 observation data set (Rayner et al., 2003). As is common in ocean models of this resolution (Flato et al., 2013), UKESM1 exhibits a strong cold bias in the NW Atlantic, related to the too-zonal representation of the Gulf Stream and North Atlantic current. Again, similar to other low-resolution models, we see a year-round cold bias in the north Pacific, and a too-westward extension of the equatorial Pacific cold tongue during JJA. Overall the pattern of SST bias is similar to that of GC3.1 (Kuhlbrodt et al., 2018), including a warm bias of 0.5–2 K across much of the Southern Ocean.

Details are in the caption following the image
Seasonal mean sea surface temperature (1995–2014) for HadISST1.1 observations (Rayner et al., 2003, top), UKESM1 historical ensemble mean (middle), and UKESM1 minus HadISST (bottom). Left column: DJF; right column: JJA.

Figures 8 and 9 make similar seaonal comparisons for climatological 1.5 m air temperature over land and precipitation, against the observation data sets of CRUTEM4.6 (Jones et al., 2012) and GPCP2.3 (Adler et al., 2003), respectively. UKESM1 shows a wintertime cold bias over North America, Europe, and NW Russia. One driver of this cold bias relates to the SST biases noted above, influencing land temperatures through atmospheric advection. These continental cold biases are larger than the mean upwind SST bias, suggesting a potential feedback involving snow cover, snow albedo, and surface temperatures whereby a cold surface temperature bias is maintained in the model through excess snow cover, both spatially and temporally through the winter season.

Details are in the caption following the image
Seasonal mean air temperature at 1.5 m (1995–2014) for CRUTEM4.6 observations (Jones et al., 2012, top), UKESM1 historical ensemble mean (middle), and UKESM1 minus CRUTEM (bottom). Left column: DJF; right column: JJA.
Details are in the caption following the image
Seasonal mean precipitation (1995–2014) for GPCP2.3 (Adler et al., 2003, top), UKESM1 historical ensemble mean (middle), and UKESM1 minus GPCP (bottom). Left column: DJF; right column: JJA.

In common with the CMIP5 generation of models (Flato et al., 2013), UKESM1 exhibits a double intertropical convergence zone in the tropical Pacific, linked to the Equatorial cold tongue bias. Tropical Atlantic precipitation is too far south, and in DJF this pattern extends westward over South America. In JJA UKESM1 exhibits the same dry Indian monsoon bias described by Williams et al. (2018) for GC3.1. This dry bias leads to a vegetation deficit (described in section 4.3) and some of the Earth system feedbacks discussed by Martin and Levine (2012).

While our emphasis in this paper is on evaluating UKESM1 at larger spatial scales (i.e., continental or ocean basin scales) and longer time scales (multi-decadal to centennial), UKESM1 will be used to make a set of future projections following the CMIP6 scenarioMIP protocol (O’Neill et al., 2016). Output from these projections will be used to drive impacts models (e.g., to assess future risks to food security or water resources). Such models require high temporal and spatial resolution surface climate data as forcing. Furthermore, assessing a model's higher time frequency statistics increases confidence in the simulated long-term means, variability, and future changes, all of which are based on the underlying high time frequency behavior. As an initial evaluation of UKESM1, Figure 10 shows simulated and observed daily timescale precipitation for four distinct climatic regions, and Figure 11 shows a similar analysis for monthly mean near-surface air temperature.

Details are in the caption following the image
Daily precipitation contributions to mean precipitation rate for the period 1980–2014, for UKESM1 members and observational data sets, for four regions and different seasons (see labels in each panel). The extent of each region is shown in gray in the top panel: AMZ, Amazon; EUR, Europe; SEA, South-East Asia; EAS, East Asia. Precipitation rates are binned with pseudo-exponential bin sizes (see Klingaman et al., 2017) to account for more frequent rain events at low rain rates. Each bin frequency is then multiplied by the mean bin rate, which gives the precipitation contribution from each bin. The x axis is plotted with a log scale so the area under the curve is approximately the mean precipitation.
Details are in the caption following the image
Frequency of occurrence of monthly mean near-surface air temperature over the period 1961–2014 binned into 1°C wide classes for the same four regions as in Figure 10. Observations are from the CRUTS4.02 data set (thick black line), and nine UKESM1 historical ensemble members are shown (thin colored lines). Only grid points that include a fraction of land greater than 0.995 are included in the spatial means.

For each of the regions, Figure 10 plots the fractional contribution from different daily precipitation rates to the total seasonal accumulation, calculated for the seasons specified in the figure labels. Observations include the GPCC/HOAPS v1 global daily precipitation (Andersson et al., 2010; Schamm et al., 2014), based on both gauge and satellite observations, the Global Precipitation Climatology Project (GPCPv1.1; Adler et al., 2003), and the Tropical Rainfall Measuring Mission 3B42v6 product (TRMM; Huffman et al., 2007). All observational data were first re-gridded onto the UKESM1 grid. TRMM data are only used over the SEA and AMZ regions, whereas GPCC is not used over the predominantly ocean area of SEA.

Over EUR we show winter (DJF, full lines) and summer (JJA, dashed lines) precipitation, with the nine UKESM1 members in black and the observations in different shades of blue. In both seasons the distributions are skewed toward a long tail of light precipitation (<1 mm day−1) that contributes to the seasonal accumulation. UKESM1 captures this skewness quite well. In DJF the model slightly overestimates precipitation contributions from moderate to heavy daily rates (3 to 10 mm day−1) but does capture the heavy rain tail of the distribution. Conversely, in JJA moderate to heavy rainfall is underestimated by the model while very heavy events (>20 mm day−1) contribute too much to the seasonal accumulation. As the moderate to heavy daily rates contribute more to the seasonal accumulation (i.e., they occur more frequently) than the very heavy rates, the underestimate of the former results in a modest dry bias over EUR in JJA, primarily in central and south-east Europe (Figure 9).

Turning to the EAS region, where precipitation is dominated by the summer monsoon, we plot distributions for the summer (wet) season (JJAS) and the winter (dry) season (NDJFM). In JJAS UKESM1 significantly overestimates the contribution of rainfall events >30 mm day−1, with some unrealistic daily rates in excess of 100 mm day−1. Figure 9 indicates that this results in a significant wet bias, primarily over the ocean part of EAS (i.e., the west Pacific warm pool region). In NDJFM seasonal precipitation is significantly lower than JJAS in both observations and the model, although the model systematically overestimates precipitation from most of the bins. Again Figure 9 suggests this overestimate is mainly along the coastal ocean in EAS. To the south of the EAS region, over the maritime continent and west Pacific warm pool (SEA) the rainfall distribution in the main wet months (NDJFM) is very well captured, with only a modest overestimate in the model across most bins. Finally, the AMZ region shows precipitation over the Amazon rainforest. While the overall spread of the observed distribution is captured by the model, there is a significant underestimate of moderate to heavy rainfall days (approximately 8 to 30 mm day−1) that leads to the seasonal mean dry bias seen over the central Amazon basin in Figure 9, which is surrounded to the south by a wet bias of equivalent magnitude.

In Figure 11 for the same four regions we compare spatially averaged 1.5 m monthly mean air temperatures between UKESM1 and CRUTS4.02 observations (Harris et al., 2014). From this we wish to see if UKESM1 captures the spread of cold/warm months around its climatological mean value, providing an evaluation of the inter-annual variability of near-surface climate in the model. For all regions we first interpolate the 0.5° CRUTS data onto the UKESM1 grid and then mask out ocean and coastal points, using the UKESM1 land-sea mask and a criterion that a grid point has to have fractional land amount >0.995 to be included in the spatial average.

Considering EUR, in winter (DJF) a systematic bias is evident in the model distribution, with mode values shifted relative to CRUTS by 3–4°C. Nevertheless, the spread of cold/warm months around the mode of the distribution, as well as the shape (skewness), are well captured in the model. Figure 8 indicates the bulk of the cold bias in EUR is over Scandinavia and European Russia, as noted above. In the summer (JJA) a smaller cold bias remains in the model distribution, which continues to be located in the northern part of EUR. A similar cold bias is seen over EAS during boreal winter, while the summer distribution is well simulated, albeit with a slight tendency for too frequent cold months (surface temperatures <18°C) not seen in CRUTS. Over AMZ, the mode of the UKESM1 distribution is quite accurate, although the model now has a bias toward extreme warm months (mean temperatures >27°C) not seen in CRUTS. These are likely associated with the dry precipitation bias over the region (Figure 9), shifting the surface energy balance from predominantly evaporative toward a larger sensible heat component. Finally, the SEA region shows a cold bias in the model distribution. It should be noted for SEA that due to the small number of land points used in the spatial means and the extreme topography of islands in this region, neither the model nor CRUTS data are likely a robust representation of true land temperatures in this region. We plan to extend our analysis of UKESM1 near-surface temperatures in a future study, considering daily temperature variability, as well as daily maximum and minimum values to better isolate model processes controlling biases in regional surface temperature.

Figure 12 compares the areal extent of snow with the GlobSnow observation product (Takala et al., 2011) for four different snow depths. In addition to the area for observed monthly mean snow depths, the equivalent for monthly maximum depth is shown as an indicator of inter-annual variability. The model has too much very thin snow compared with the observations, but a comparable amount of deeper snow. The deeper snow tends to thaw slightly later than the observations in the spring but increase at a similar rate to the observations in the autumn.

Details are in the caption following the image
Area covered by snow greater than a threshold depth of snow, expressed as snow water equivalent (SWE). Thresholds are 2, 10, 40, and 100 mm. Model: UKESM1 historical ensemble mean 1990–2009 (red line) and ensemble spread (red shading). Observations: GlobSnow (Takala et al., 2011) monthly mean SWE (black) and monthly maximum SWE (gray). Model fields are masked where observations have missing data.

Figure 13 compares permafrost extent diagnosed from the model against the data set of Brown et al. (2002). The model has permafrost in the majority of the regions where the Brown et al. (2002) data set suggest there is continuous permafrost. There is permafrost for approximately 47% of the land surface where the air temperature is less than zero. This falls within the range of observations which is 40% to 72%.

Details are in the caption following the image
Comparison of permafrost extent in the model against the data set of Brown et al. (2002). Purple regions show permafrost diagnosed from the model: defined as grid points for which the maximum monthly mean soil temperature at 2 m depth is below 0°C for the period 1990–2009. Gray regions are where modeled annual mean air temperature is below 0°C. The green and orange lines are, respectively, the limits of continuous and discontinuous permafrost from Brown et al. (2002).

Figure 14 compares September and March sea ice concentration/extent, in both hemispheres, against the HadISST.2.2.0.0 data set of Titchner and Rayner (2014) for the period 1990–2009. Also shown are the modeled and observed sea ice extent (the total area of ocean with at least 15% ice cover) and the corresponding Integrated Ice Edge Error (IIEE). The IIEE metric, introduced by Goessling et al. (2016), is a measure of the error in ice edge placement and is essentially the area integral of all grid cells where the model and observations disagree about whether sea ice concentration is above 15% (see Goessling et al., 2016, for further details).

Details are in the caption following the image
UKESM1 historical ensemble mean 1990–2009 Arctic (top) and Antarctic (bottom) sea ice concentration (shading) for September (left) and March (right). Areas with sea ice less than 15% are masked out. Also shown (orange lines) are corresponding 15% sea ice concentration contours from the HadISST.2.2.0.0 data set (Titchner & Rayner, 2014). Inlaid boxes show the observed and modeled sea ice and the corresponding Integrated Ice Edge Error in units of 106km2 (IIEE; Goessling et al., 2016).

In general there is very good agreement between the modeled (shading) and the observed (orange lines) location of the ice edge as depicted by the 15% concentration contour. In the Arctic the model sea ice is, overall, a little more extensive than the observations, particularly so in the north Pacific (Bering Sea and Sea of Okhotsk) in winter (March). The model also has too much winter sea ice north of Iceland and does not reproduce the Oden ice tongue, which is captured in the HadISST data set. In the summer the Arctic sea ice is slightly over extensive within the Arctic Ocean basin, except for in the Kara Sea where the sea ice extent is much smaller than observations. In the Antarctic the sea ice is slightly less extensive than observed, particularly in the Ross Sea and east of the Weddell Sea sector south of Africa.

In response to the cool northern hemisphere surface temperature reported above and in section 4.7, the present-day sea ice is much thicker than observational estimates. Mean September sea ice thickness for 1990–2009 is over 4.5 m throughout the central Arctic (not shown). However, despite the high modeled thicknesses, the spatial extent of sea ice in UKESM1 agrees well with the HadISST observational data set.

4.3 Terrestrial Biogeochemistry

To have confidence in the processes responsible for terrestrial carbon uptake, one needs to evaluate model simulated vegetation structure, surface carbon fluxes, and carbon stores. If a model performed well for one or two of these properties, but not all three, it would suggest that the model was getting the right answer for the wrong reasons. In this section we evaluate and characterize the behavior of each of these three aspects.

UKESM1 represents vegetation structure with nine natural PFTs and four crop/pasture PFTs, and the fractional coverage of each PFT within a grid box is determined by a competition hierarchy. We evaluate the vegetation distribution using two complementary approaches: first with spatial maps of aggregated PFTs (Figure 15) and second by aggregating fractions within biomes (Figure 16). The biomes are regions of the world defined by Olson et al. (2006) based on their common ecosystem characteristics. Within each of these regions Figure 16 shows the fractional coverage of aggregated PFTs for UKESM1 and the IGBP-LUCC and CCI-LC land cover data sets; the same region definition is used for all three data sets. Broadly the model does a very good job in representing vegetation structure of the biomes. Notable biases include an excess of C3 grass in tundra regions which the observations indicate should contain more bare ground; this can be seen in the northern Siberian C3 grass fraction in Figure 15. Some of this bias is a side effect of the laimin tuning described in section 3, which improved the bare soil fraction in dust-producing regions but has allowed grass to spread too extensively in northern Russia and Canada. In contrast, Saharan bare soil extends too far south, with a deficit of grassland in the Sahel. Similarly there is excess bare soil in eastern India. Both of these biases result from precipitation deficits in these regions, associated with errors in the position and intensity of monsoon rains (see Williams et al., 2018). There is a small overestimation of tree fraction in savanna biomes, most notably on the southern edge of the Amazon forest. This is due to the lack of fire disturbance, the inclusion of which would be expected to improve vegetation structure in these regions (Burton et al., 2019).

Details are in the caption following the image
UKESM1 aggregated PFT fractions compared to the IGBP and CCI land cover observation-based data sets. Model data are for the year 2005 to coincide with the validity time of the CCI-LC data, using the mean vegetation fraction of the historical ensemble.
Details are in the caption following the image
Fractional coverage of aggregated PFTs within seven “biome” regions defined by Olson et al. (2006). Each biome shows three columns: UKESM1 (historical ensemble mean for the year 2005), IGBP land cover, and CCI land cover.

Figure 17 shows the evolution of terrestrial carbon fluxes in the ensemble mean of the historical simulations. From 1850 to 1980, decadal-mean GPP slowly increases from 118 to 122 Pg year−1, after which it increases rapidly to 134 Pg year−1 in the last decade of the experiment. GPP in UKESM1 is slightly higher than observational estimates, which are around 120 Pg year−1 in the last few decades (Jung et al., 2011). Net ecosystem productivity (NEP), the balance of GPP and respiration, also increases more rapidly after 1980. NEP is balanced by a small land use change emission flux and a much larger crop harvest flux; the sum of these three fluxes is the net biome productivity (NBP), the total carbon flux from the atmosphere to the terrestrial biosphere. The land is a net source of carbon prior to 1980: cumulative NBP is −40 Pg from 1850 to 1979. After 1980 the land becomes a sink of atmospheric CO2: cumulative NBP between 1980 and 2015 is 22 Pg.

Details are in the caption following the image
Global total carbon fluxes. Gross primary productivity (GPP), net primary productivity (NPP), heterotrophic respiration (Rh), net ecosystem productivity (NEP), land use emissions (fLuc), crop harvest carbon flux (fHarvest) and net biome productivity (NBP). A 10-year running mean has been applied to all data.

Generally the model has an excellent zonal distribution of vegetation carbon (Figure 18), lying between the two observational estimates. The exception is a high-bias between 25°S and 10°S, reflecting the excess of trees in savanna biomes noted above. Global total vegetation carbon is 551 Pg in year 2000 and 583 Pg in 1850 which is consistent with observation-based estimates (450-650 Pg in 1750; Ciais et al., 2013); 72% of vegetation carbon is stored in tropical forests and 26% in extra-tropical forests, with the remaining 2% being stored in shrubs and grasses.

Details are in the caption following the image
Zonal mean carbon stores for model (black, 1995–2014) and observation data sets (colors). Left: Vegetation carbon, observations: GEOCARBON (Avitabile et al., 2016); Saatchi et al. (2011). Right: Soil carbon, observations: WISE30sec (Batjes, 2016); IGBP-DIS (Global Soil Data Task Group, 2002); Carvalhais et al. (2014, thin green lines show ensemble inter-quartile range).

At 1,780 GtC, the UKESM1 global total soil carbon is less than that estimated by the Carvalhais et al. (2014) data set for the whole soil profile (2,450 GtC) and slightly less than that estimated by Batjes (2016) for the top 2 m (1,970 GtC). The model represents the top 3 m of soil. There is not enough soil carbon simulated for the northern high latitudes (Figure 18); this is partly because turnover times are too fast (not shown) but also because there is not yet a representation of permafrost carbon included within UKESM1. The simulated peak of soil carbon in the tropics is slightly further south than in any of the observational data sets.

4.4 Ocean Biogeochemistry

Ocean biochemistry in UKESM1 plays a fundamental role in the carbon cycle, acting as a source or sink of atmospheric CO2 and in the aerosol simulation via natural emissions of DMS and PMOA. In this section we focus on these two roles, evaluating the present-day surface flux of CO2 and the productivity which underpins the aerosol emissions.

Figure 19 shows simulated net primary production, averaged over the period 2000–2009, compared with three satellite-derived observational estimates for the same period (Behrenfeld & Falkowski, 1997; Carr et al., 2006; Westberry et al., 2008, these three estimates are averaged here to a single field, as per Yool et al., 2013). In general, the model simulates the main geographical patterns of high productivity in upwelling regions, low production in subtropical oligotrophic areas, and seasonally high productivity in high latitude areas. However, modeled productivity is more tightly focused on the equator, especially in the Pacific Ocean. Oligotrophic areas exhibit lower production in the model, in part due to unrepresented mesoscale processes, and high latitude productivity is elevated above that observed, especially in the Southern Ocean.

Details are in the caption following the image
Mean ocean net primary production for 2000–2009. Left: an average of three satellite-derived observational estimates (see text). Right: UKESM1 historical ensemble mean.

Figure 20 shows simulated net air-sea CO2 flux, averaged for the period 2000–2009, and compared with the observational estimate of Rödenbeck et al. (2013). The model captures the main characteristics of the broad pattern of CO2, with outgassing (blue) in the equatorial and tropical band, especially in the Pacific Ocean, and ingassing (red) at higher latitudes, in particular the North Atlantic. The model does show some discrepancies, for instance excessive outgassing in the upwelling areas associated with the Humboldt and Benguela Currents, and excess ingassing close to Antarctica linked to a deficit in sea ice cover in spring and summer.

Details are in the caption following the image
Net air-to-sea flux of CO2, 2000–2009 mean. Left: Rödenbeck et al. (2013) observation-based estimate. Right: UKESM1 historical ensemble mean.

4.5 Atmospheric Chemistry

A thorough evaluation of the chemistry performance of UKESM1 can be found in Archibald et al. (2019); here we focus on those aspects which represent major enhancements relative to HadGEM2-ES, namely, stratospheric chemistry, interactive photolysis, and biogenic emissions of VOCs.

First we assess the quality of the simulation of total column ozone (TCO), which is dominated by the stratospheric ozone layer, versus the NIWA-Bodeker Scientific TCO database (version3.4; Bodeker et al., 2005) covering 1979–2016. We also use the TCO record measured at South Pole using a Dobson photospectrometer (Rosen et al., 1993; WOUDC, 2014).

Figure 21 shows the zonal-mean TCO averaged over 1995–2014, averaged over two historical UKESM1 simulations, and the difference versus the climatology. In the extra-polar regions, there is a general overestimation of ozone of around 20 to 40 Dobson Units (DU), whereas in the Antarctic, the model underestimates ozone relative to the climatology by up to −60 DU in November and December. Since satellite-based observation of TCO is particularly difficult at high latitudes, we compare this result directly to the Dobson long-term measurements at South Pole (Figure 22). The comparison shows a good agreement with the Dobson record in October, but in November and December, the model only tracks the most stable, deepest ozone holes and does not reproduce the observed interannual variability. Linear trends during the period of increasing ozone depleting substances (1979–1997) mirror this finding (Figure 23). In the tropics the model does not produce significant trends, in agreement with the climatology, but in both polar regions the model produces too large negative trends during this period during spring and summer in both hemispheres. A separate analysis of trends during the ozone recovery period (1998–2014) shows generally non-significant trends, in agreement with the observations.

Details are in the caption following the image
Left: Zonal- and multi-annual mean total column ozone (Dobson Units, DU), averaged over two historical simulations, for 1995–2014. Right: Corresponding model bias versus the NIWA-Bodeker Scientific TCO database version 3.4 (updated from Bodeker et al., 2005) for the same years.
Details are in the caption following the image
Monthly mean total column ozone (DU) at the South Pole, for the spring and summer months. Colors: Two UKESM1 historical simulations. (thin lines: full variability; thick lines: smoothed with a 7-year boxcar filter). Symbols: Monthly means derived from the South Pole Dobson measurements starting in 1964 (WOUDC, 2014). All monthly means are shown that are derived from measurements taken both before and after mid-month (hence March and September are excluded).
Details are in the caption following the image
(Left) Trends in zonal-mean total column ozone (DU year−1) averaged over two historical UKESM1 simulations, for 1979–1997 (19 years). Stippling indicates non-significance at 95% confidence. (Right) Bias in the trend versus the NIWA-Bodeker Scientific climatology for the same years.

An analysis of the vertical distribution of ozone and its trends (not shown) reveals a generally good correspondence with observations but too strong a downdraught of mesosphere air during polar winter which contributes to low TCO during spring and also the general overestimation of ozone in the extra-polar ozone layer discussed above.

The good correspondence in October suggests that the sensitivity of ozone to anthropogenic chemical ozone depletion is generally correct, broadly in agreement with scientific consensus on the underlying gas-phase chemistry (World Meteorological Organization, 2018), but the underestimation of variability in November and December suggest insufficient, and insufficiently variable, meridional heat transport toward both poles which would serve to inject variability into TCO and would shorten, on average, the lifetime of the polar vortex (Garfinkel et al., 2013). This is consistent with a stratospheric cold bias of 5–8 K over the Antarctic and a polar night jet that is too strong by more than 10 m s−1 (not shown).

The general overestimation of ozone in the extra-polar regions would be partly addressed by a more comprehensive inclusion of heterogeneous reactions on sulphate aerosols involving bromine compounds (Dennison et al., 2019), which is not part of UKESM1. The nine CMIP5 models that did simulate stratospheric ozone interactively exhibited substantial biases (Eyring et al., 2013). Relative to this previous generation of models, UKESM1 compares well to the reference climatology.

Turning now to the troposphere, Figure 24 shows that for the historical simulation of UKESM1, the burden of ozone in the troposphere (defined as in Tilmes et al., 2016) has increased steadily over the period 1850–2014. The burden of ozone has recently been evaluated for the present-day (2014–2016) by Gaudel et al. (2018) as 300±12 Tg from 60°S–60°N. The UKESM1 tropospheric ozone burden, subset over the same latitude range for the year 2014 is 314 Tg, in excellent agreement with the observations. UKESM1 simulates an increase over the historical period (1850–2014) of more than 60 Tg of ozone in the troposphere, an increase of approximately 23% from a pre-industrial baseline of 260±4 Tg (1850–1875) to 320±7 Tg (1995–2010).

Details are in the caption following the image
Time series of the evolution of the tropospheric ozone burden in six members of the historical ensemble of the UKESM1 simulations (colored lines). The black line shows the ensemble mean. Tropospheric ozone burden is calculated as the total mass of ozone between the surface and the tropopause, using a tropopause based on monthly mean ozone mixing ratio of 150 ppb.

Methane lifetime is a core metric for atmospheric chemistry models due to the role methane plays in multiple chemical cycles, with impacts on ozone, aerosols (via oxidant concentrations), as well as exerting a strong direct radiative effect in its own right. The methane lifetime in UKESM1 is 8.4 years (calculated as total atmospheric burden divided by chemical loss and deposition for the period 1995–2010). This lies within the 9.1±0.9 year range estimated by Prather et al. (2012) based on constraints from methyl chloroform, and the 7.6 to 14 year range of SPARC (2013). Table 3 summarizes methane lifetime along with five other metrics of the global chemical state of the model.

Table 3. Metrics Summarizing the State of Atmospheric Chemistry in the Latter Part of the Historical Period (1995–2010)
Metric UKESM1
CH4 lifetime (whole atm) 8.4 years
OH mean concentration 1.25×106 molecules per cm3
OH ratio NH:SH 1.3
CO burden 304 Tg
CH4 burden (trop.) 4,134 Tg
O3 burden (trop.) 320 Tg

Figure 25 shows the global distribution of isoprene and monoterpene emissions simulated by the interactive BVOC scheme in UKESM1. The emissions in this figure represent the mean over the last 10 years (2005–2014) of the historical run. Annual global total isoprene and monoterpene emissions amount to approximately 520 and 120 TgC year−1, respectively. The spatial pattern of emissions is broadly consistent with state-of-the-art BVOC emission models (e.g., Arneth et al., 2008; Guenther et al., 2012; Messina et al., 2016; Müller et al., 2008; Sindelarova et al., 2014; Stavrakou et al., 2009; Wiedinmyer et al., 2006; Young et al., 2009).

Details are in the caption following the image
Annual mean emissions (TgC year−1) of monoterpene (top) and isoprene (bottom) from the UKESM1 historical simulations (2005–2014).

4.6 Aerosols

Figure 26 shows the UKESM1 ensemble annual mean AOD at 440 nm for the period 1992 to 2008. Annual mean observations of AOD covering the same period from the ground-based, global AERONET (Holben et al., 2001) are overlaid on the simulated global AOD map. Overall, the model captures the spatial distribution of AOD well. This is reflected in the good correlation found in the point-wise comparison shown in Figure 26 (r2=0.63). AOD over continental polluted regions and tropical biomass regions is generally well-represented in the model, including the east-west contrast in continental North America. There are fewer observations in remote marine regions, but where observations exist, the model is able to capture the low AOD values in the Pacific Ocean and downwind of anthropogenic pollution source regions in the Atlantic Ocean.

Details are in the caption following the image
Annual mean aerosol optical depth at 440 nm from UKESM1 ensemble mean 1992–2008. Left: Model field (contours) overlaid with circles depicting the observed AOD at 67 ground-based AERONET stations. Right: Scatterplot of AERONET versus UKESM1 at the AERONET station locations. AERONET = Aerosol Robotic Network; AOD = aerosol optical depth; RMSE = root-mean-square error.

The model AOD in Figure 26 is biased low at sites close to dust sources in West Africa and the tropical Atlantic. This is examined in more detail in Figure 27 which compares the model against seasonal mean AERONET AOD at dust-dominated sites and near-surface dust concentrations from the University of Miami observing network. The AODs in this figure are consistent with the annual mean result, with seasonal mean AOD lower than the observations by 39% on average. We note that these stations are all at similar latitudes in a region strongly affected by the West African Monsoon, which is challenging to simulate, and small biases in wind-speed or precipitation in this area would strongly affect local dust production. Thus the bias in AODs may not be representative of the wider dust simulation. Indeed, dust concentrations over the Atlantic are generally well-simulated (Figure 27). Further afield, there is some high bias in concentration over the Pacific and low bias over the Southern Ocean.

Details are in the caption following the image
Scatterplot of 1995–2014 mean seasonal mean near-surface dust concentration and AODs at dust-dominated stations from the UKESM1 historical simulations versus data from the University of Miami network and AERONET stations. Colors indicate season: black DJF; blue MAM; green JJA; and red SON. Symbols indicate areas for concentrations: asterisks Atlantic; squares north Pacific; triangles south Pacific; diamonds Southern Ocean; and pluses AOD. The model:obs ratio meaned over stations and seasons for each area are as follows: Atlantic 1.10; north Pacific 2.07; south Pacific 2.03; Southern Ocean 0.48; AOD 0.61.

4.7 Emergent Behavior of the Coupled Model

We consider some emergent properties of UKESM1 using metrics of global climate sensitivity and carbon uptake. First, equilibrium climate sensitivity (ECS) estimates the equilibrium response of global mean surface air temperature to a doubling of CO2. While this cannot be directly evaluated against observations, it is a useful metric for comparing the response of models to CO2 perturbations across multiple generations of models. Figure 28 shows the temporal evolution of surface air temperature in response to an abrupt CO2 quadrupling (the CMIP6 abrupt4xCO2 experiment; Eyring et al., 2016). We calculate effective climate sensitivity as in Gregory et al. (2004) by regressing the change in TOA net radiation against surface temperature over the first 150 years of the run. The projected equilibrium temperature change from the abrupt4xCO2 experiment is divided by 2 to correspond to a CO2 doubling. The resulting ECS for UKESM1 is 5.4 K.

Details are in the caption following the image
Change in global mean surface air temperature in the abrupt4xCO2 (red) and 1pctCO2 (blue) experiments. The parallel portion of the pre-industrial control has been subtracted from each experiment to remove any background temperature drift.

Transient climate response (TCR) is calculated from the CMIP6 1pctCO2 experiment in which CO2 is increased at 1% per year. TCR is defined as the temperature change from the pre-industrial control after CO2 has doubled, averaged over years 61–80. A four-member initial condition ensemble for UKESM1 (Figure 28) have TCR in the range 2.68–2.85 K. Transient climate response to cumulative emission (TCRE; Gillett et al., 2013) extends the TCR concept to use carbon feedbacks from an ESM in order to project the warming due to a cumulative CO2 emission of 1 TtC. Following Gillett et al. (2013) we calculate TCRE as TCR (as defined above) divided by the cumulative compatible anthropogenic emission up to year 70 of the same 1pctCO2 experiment. CO2 concentration is prescribed in these experiments, but the anthropogenic emission compatible with the model's carbon uptake can be calculated as the change in the total mass of carbon in the ocean, land, and atmosphere. To remove the (very small) background drift in ocean and land carbon stores (Figure 3) we calculate the compatible emission as the difference between the 1pctCO2 store at year 70 and that of pre-industrial control in the same year, yielding a range of compatible emissions from 1.07 to 1.08 TtC and TCRE from 2.49 to 2.66 K TtC−1.

The UKESM1 values of ECS, TCR, and TCRE are all higher than those of CMIP5 models (respectively 2.1 K to 4.7 K, 1.0 K to 2.6 K, and 0.8 K TtC−1 to 2.4 K TtC−1; Andrews et al., 2012; Gillett et al., 2013). Elsewhere in this special issue, Bodas-Salcedo et al. (2019) analyze the increase in atmospheric climate feedbacks in HadGEM3-GC3.1 relative to the previous version of HadGEM3, whose ECS (3.2 K; Senior et al., 2016) is within the range of CMIP5 models. They find that the feedbacks have become more positive as a result of improvements to cloud microphysics and cloud-aerosol interactions. Both of these developments reduce errors in the present-day simulation of clouds relative to observations. These stronger feedbacks contribute to increases in all three of these metrics; TCRE is additionally increased relative to CMIP5 models by the inclusion of nitrogen limitation which reduces the terrestrial carbon sink for a given CO2 atmospheric concentration and hence increases the CO2 concentration and corresponding warming compatible with a 1TtC cumulative emission. More detailed analysis of the climate sensitivity and carbon feedbacks in UKESM1 will be reported in subsequent papers.

Carbon uptake can also be diagnosed from historical, rather than idealized, simulations and thus compared against estimates rooted in historical observations. In Table 4 we compare cumulative 1850–2014 carbon uptake and land use emissions against observation-based estimates from the Global Carbon Budget (GCB; Le Quere et al., 2018). The ocean and land sinks are both within the GCB range. UKESM1 land use emissions in Table 4 represent the carbon flux due to clearance of above-ground biomass, including crop harvest and deforestation. The model's total emission due to land use includes an additional contribution from decomposition of below-ground biomass which we have not diagnosed. The total land use emissions will hence will be closer to, or above, the GCB central estimate.

Table 4. Cumulative 1850–2014 Carbon Uptake and Land Use Emissions for the UKESM1 Historical Ensemble Mean and Estimated by GCB
Cumulative flux (GtC) UKESM1 GCB
Ocean sink 135 150±20
Land sink 158 185±50
Land use emissions 152 195±75
  • Note. UKESM1 land use emissions in this table include only the CO2 flux from above-ground biomass; GCB land use emission estimate includes both above- and below-ground biomass.

Surface temperature is one of the few variables for which reliable observations cover the full period of the 1850–2014 historical simulation, which allows us to evaluate the model's first-order climate response to the evolving forcing over this period. The UKESM1 global mean surface temperature anomaly in the historical ensemble is shown in Figure 29, alongside the HadCRUT4 observation data set (Morice et al., 2012). The observations represent only a single realization of the internal variability of the climate system, so one should not expect the model ensemble to be centered on the observations, but rather that the range of observational uncertainty overlaps the ensemble range (under the assumption that the model ensemble is large enough to sample the relevant internal variability). Most ensemble members begin to warm in the early 20th century, then cool strongly between 1950 and 1970 before warming rapidly through to the end of the simulation. The observations show a limited cooling of 0.1 to 0.2 K during 1940 to 1970, but the model ensemble mean cools by nearly 0.4 K over the same period. All ensemble members also show a stronger cooling response to the large volcanic eruptions of 1883, 1963, and 1991 than is seen in the observations.

Details are in the caption following the image
Global and hemispheric mean surface temperature anomaly, relative to 1851–1880 mean. Thin blue lines are ensemble members of the historical experiment; the thick blue line is their ensemble mean. The HadCRUT4.6.0.0 median and 5–95% uncertainty range are shown in black and gray, respectively. For consistency with the observations, model fields are constructed using a combination of 1.5 m air temperature over land points and SST at ocean points and masked to match the temporally evolving coverage of HadCRUT. Model time series are de-trended using the linear trend of the pre-industrial control over the equivalent period.

Figure 29 also separates the mean surface temperature into northern and southern hemisphere time series. The stronger-than-observed cooling is restricted to the northern hemisphere which, together with its temporal evolution, points to either aerosol or land use forcing as the prime driver. Further investigation into this discrepancy will be the subject of future work. In the southern hemisphere the model ensemble overlaps with the observational uncertainty for the entire duration of the experiment, with the exception of the dip following the Mt. Pinatubo eruption in 1991 noted above.

5 Discussion and Conclusions

In this paper we describe and evaluate the new Earth system model UKESM1, which has been jointly developed by NERC and the Met Office. UKESM1 represents a major advance on its predecessor model HadGEM2-ES, both in the sophistication of its component models and in the complexity of the couplings between these components. The most important advances are
  1. inclusion of nitrogen cycling within the terrestrial carbon cycle, and increased ocean biological complexity, in order to better predict allowable anthropogenic emissions;
  2. major advances in the treatment of land management, with the separation of crop and pasture areas, and the introduction of a harvest carbon flux;
  3. a well-resolved stratosphere with interactive stratospheric chemistry to assess the impact of changes in ozone depleting substances;
  4. a state-of-the-art aerosol scheme with prognostic size distribution for a better understanding of aerosol radiative forcing;
  5. coupling between the ocean biogeochemistry and atmospheric composition via interactive emissions of primary marine organic aerosol;
  6. coupling between the terrestrial biosphere and atmospheric composition in the form of interactive BVOC emissions from vegetation; and
  7. coupling between atmospheric chemistry and physics through water vapor changes arising from chemical reactions.

Achieving acceptable performance in the presence of this additional model complexity posed major challenges during the development and tuning of UKESM1. The greatest difficulties arose where one component model was sensitive to biases in the variables it received from another component. One example of this is the dependence of dust emissions on bare soil fraction, and the sensitivity of bare soil to the model's climate in semi-arid regions. Through careful tuning of vegetation and dust parameters we were able to significantly reduce both bare soil and dust errors. However, in some regions (particularly western India) the climate biases were such that we could not obtain a good simulation of vegetation. In this respect we still have not answered some of the difficult questions posed by Martin and Levine (2012) regarding coupling and component model biases in Earth system models, though we have made significant steps forward.

Generally, the model performs very well. The pre-industrial control is stable, with acceptably small drifts in carbon pools and virtually no drift in surface climate over 1,100 years. The Southern Ocean exhibits large multi-century oscillations in surface temperature and sea ice extent. Further work is in progress to understand the mechanisms behind this variability, but there is a growing body of evidence which indicates that such oscillations may be a feature of the natural state of the Earth system (e.g., Kurtakoti et al., 2018; Reintges et al., 2017; Zhang et al., 2018).

Evaluation of the latter portion of the historical experiments against observations is generally positive. Particular highlights are vegetation structure, which has reduced or eliminated biases in the Boreal forests and Australia that were seen in the predecessor model HadGEM2-ES (Collins et al., 2011), and stratospheric ozone, which exhibits smaller biases than the previous generation of coupled chemistry-climate models with interactive stratospheric chemistry (Eyring et al., 2013).

The most significant deficiency in the model's performance is its stronger-than-observed global mean surface cooling in the mid-20th century, followed by stronger-than-observed warming after 1990. The cooling is most likely related to aerosols and/or land use change, suggesting that the combined radiative forcing of these processes is excessive during this period. While these processes will not have a first-order impact on century-scale projections at a global scale (which are dominated by greenhouse gas forcing), understanding and improving this bias will be a priority for future configurations of UKESM1. The strong warming toward the end of the century may be due to the reduction of aerosol emissions in Europe and North America, potentially adding weight to the hypothesis that aerosol forcing is too strong in this model prior to 1990. Or it could be a result of the model's high climate sensitivity (discussed below) or a combination of both. There are large uncertainties in the magnitude of the true historical forcing due to aerosols and land use change, and these results present an opportunity to better understand and constrain the relevant processes, with the potential to improve ESMs more generally.

In the introduction to this special issue, Senior et al. (2020) outline the development strategy for UKESM1 and its physical core HadGEM3-GC3.1, which included a step (before model finalization) of estimating effective radiative forcing (ERF) due to key anthropogenic forcing agents (GHGs, aerosols, and land use change), in order to avoid unrealistic historical performance at a global mean scale. See Mulcahy et al. (2018) in this special issue for action which was taken on aerosol forcing as a result of this strategy. These ERF experiments assessed only the 1850–2000 forcing, but as Figure 29 demonstrates, it is possible to have good agreement in global mean surface temperature toward the end of the historical period (implying that the integrated historical forcing is accurate, perhaps through a compensation of errors) and still have significant deviations prior to this. This brings interesting new evidence to bear on the questions about model development strategy which have been discussed by, for example, Hourdin et al. (2016) and Schmidt et al. (2017), and will prompt an evolution of the strategy summarized in Senior et al. (2020).

The high sensitivity of the model to CO2 concentrations and emissions, as evidenced by the high values of ECS, TCR, and TCRE will also be a focus of further study. Bodas-Salcedo et al. (2019) show that some of the changes responsible for the high sensitivity represent significant improvements in process evaluation, but better observations and more in-depth analysis are required to assess all the processes involved. Note that these climate sensitivity and carbon feedback metrics are emergent properties of the model and were not subject to tuning during the development process.

Near- to medium-term plans for model development in the UKESM project include new model configurations with additional capabilities. First will come a model with interactive ice sheets on Greenland and Antarctica, with two-way coupling between the ice sheets and both the atmosphere and ocean. Following that, we are developing a hybrid-resolution functionality to allow the most expensive model components (e.g., atmospheric composition) to run at a reduced horizontal resolution relative to the model physics. These ambitious developments will be reported in future papers.

Other priorities for further work include evaluation of the interactions between component models. The parametrizations used in coupling schemes (such as those for emissions or deposition of gases and aerosols) are generally based on good physical understanding and empirical relationships derived from in situ observation, but it is hard to evaluate these relationships at the scale of a model grid box, let alone regional or global scales. For example, is the response of the model's bare soil fraction to its climate more or less sensitive than that of the real world? Such questions have an important bearing not only for obtaining a good present-day simulation but also for interpreting its response in model projections.

The complexity of UKESM1, both in terms of its component models and its internal coupling, is unprecedented for an Earth system model. The sophisticated interactions were included to facilitate the exploration of future feedbacks but increase the difficulty of obtaining skilful simulation of the observed period. In the face of this challenge, the generally successful performance of the model is a major accomplishment.

Acknowledgments

A. Se., J. M., Y. T., A. W., F. O.'C., R. H., S. W., J. W., C. E. J., M. A., T. A., S. B., E. B., E. B., M. D., J. E., M. H., A. J. H., B. J., A. J., C. D. J., S. L., E. R., and M. Z. were supported by the Met Office Hadley Centre Climate Programme funded by BEIS and Defra. C. G. J., T. K., R. S. S., M. S., S. R., V. P., C. E. H., R. P., R. S., A. Y., J. P., L. dM., A. Si., R. E., and D. K. were supported by the National Environmental Research Council (NERC) National Capability Science Multi-Centre (NCSMC) funding for the U.K. Earth System Modelling project: C. G. J., T. K., R. S. S., M. S., S. R., V. P., and C. E. J. through Grant NE/N017978/1; R. P. and R. S. through Grant NE/N018079/1; A. Y., J. P., and L. dM. through Grant NE/N018036/1; A. Si. through Grant NE/N01801X/1; R. E. and D. K. through Grant NE/N017951/1. C. G. J., T. K., S. R., A. X. Y., J. P., F. O.'C., and G. F. additionally acknowledge the EU Horizon 2020 CRESCENDO project, Grant 641816. O. M. acknowledges support by the New Zealand government under the Strategic Science Investment Fund and from the Deep South National Science Challenge (www.deepsouthchallenge.co.nz). M. T. W. is supported by the Earth System and Climate Change Hub of the Australian Government's National Environmental Science Programme (NESP). A. A., N. L. A., P. G., and J. K. acknowledge support from NERC and NCAS for funding aspects of UKCA model development and evaluation. We acknowledge use of the Monsoon2 system, a collaborative facility supplied under the Joint Weather and Climate Research Programme, a strategic partnership between the Met Office and the Natural Environment Research Council. We are grateful to J. M. Prospero and D. L. Savoie from the Rosenstiel School of Marine and Atmospheric Science, University of Miami, for making available observational data from the University of Miami Aerosol Network for the dust evaluation. The simulation data used in this study are archived at the Met Office and are available for research purposes through the JASMIN platform (www.jasmin.ac.uk) maintained by the Centre for Environmental Data Analysis (CEDA); for details please contact [email protected] referencing this paper. Information on the model versions and simulation identifiers are provided in Appendix Appendix D

    Appendix A: Component Model Details

    In this appendix we summarize the component models of UKESM1. We do not include a detailed description of each component; rather we refer the reader to previously published descriptions of each component and focus instead on developments which have been made since these publications, and on summarizing the key model elements from an Earth system perspective.

    A1 Underlying Physical Model

    The components of an ESM are dependent on the core physical model for accurate simulation of winds and ocean currents (which advect the biogeochemical tracers), atmospheric moist and turbulent processes (which control chemical reaction rates and aerosol-cloud interactions), and temperature, precipitation, and soil moisture (upon which aerosol and gas removal processes, vegetation, and CO2 uptake depend). Biases in the physical model which are tolerable in their own right can impact disproportionately on Earth system components (e.g., Hardiman et al., 2015; Martin & Levine, 2012). Simulated ocean-land surface temperature and surface and top-of-atmosphere (TOA) radiative fluxes retain their primary importance, even for the ESM components, because of their role in feedbacks.

    The Earth system components of UKESM1 are built upon the state-of-the-art physical climate model HadGEM3-GC3.1 (Kuhlbrodt et al., 2018; Williams et al., 2018). The ocean component of GC3.1 (Storkey et al., 2018) uses the NEMO dynamical ocean on a nominally 1° tripolar grid with 75 vertical levels and an explicit nonlinear free surface. The sea ice (Ridley et al., 2018) is modeled by CICE on the same horizontal grid as NEMO, with five ice thickness categories, multi-layer thermodynamics with four layers of ice and one of snow, and elastic-viscous-plastic rheology. The atmosphere (Walters et al., 2019) uses the Met Office Unified Model (UM) with 85 vertical levels on a terrain-following hybrid height coordinate, with an 85 km model top. The land surface is modeled using JULES (Best et al., 2011). Land and atmosphere share the same horizontal grid: a regular latitude-longitude grid with 1.25°×1.875° resolution. Surface and sub-surface runoff is transported to the ocean using the TRIP river routing scheme (Total Runoff Integrating Pathways; Oki & Sud, 2006). Comparing UKESM1 to its predecessor HadGEM2-ES, the ocean and sea ice models have been completely replaced, and the atmosphere has a new dynamical core (Wood et al., 2014), a well-resolved stratosphere, major advances in cloud physics including prognostic cloud, condensate and rain (Wilson et al., 2008), and a new modal aerosol scheme (Mann et al., 2010). The coupled physical model as a whole is evaluated in Kuhlbrodt et al. (2018) and has been shown to represent a substantial improvement upon its predecessors.

    We believe that we have achieved our goal of keeping the physical core of UKESM1 as close to GC3.1 as possible. The differences are restricted to the following components:
    1. enhancements to aerosol emissions to improve their process realism (adding couplings between the biosphere and atmospheric composition which were previously missing), described in section A5;
    2. tuning parameters which control dust emissions and snow-vegetation interactions, the motivation for which is explained in section 3;
    3. a change to the soil hydrology scheme to address a problem with soil moisture. GC3.1, as in recent versions of HadGEM3, uses an adaptation of the van Genuchten (1980) empirical equations for soil hydraulic conductivity properties (Wiltshire et al., 2019). During early tests of UKESM1, it was discovered that the implementation of these equations in JULES produced unrealistic soil moisture profiles in the Earth system model. In some regions the soil moisture profile was inverted relative to observed profiles, with less moisture in the deepest soil layer (1–3 m) than the surface layer (0–10 cm). Such a lack of moisture at depth would limit the availability of moisture to vegetation during dry periods and had the potential to distort the dynamic simulation of vegetation distribution. To remedy this, UKESM1 instead uses equations based on the work of Brooks and Corey (1964) with consistent ancillaries files for soil properties, as employed in HadGEM2. This results in much higher soil hydraulic conductivity and thus more moisture in the deeper soil layers. The change of scheme makes little difference to the surface energy budget. Work is in progress to refine the van Genuchten implementation for future versions of HadGEM3 in order to remedy this issue;
    4. the land surface characteristics of the Greenland and Antarctic ice sheets are represented using the sub-gridscale elevation class tiling scheme and albedo and snow pack physics described in Shannon et al. (2019); we do not however use the glacier-specific tunings or precipitation and surface wind downscaling from that work. In UKESM1 we use 10 irregularly spaced elevation classes and allow for up to 10 discrete layers in the snow pack. These schemes provide significant improvements to the surface mass balance of the ice sheets simulated by UKESM1 compared to GC3.1.

    Additionally, the physical simulation in UKESM1 is affected by its Earth system components through the radiative impact of changes in atmospheric composition and through the impact of vegetation structure on surface energy and momentum fluxes. These couplings are described alongside the Earth system component models in the rest of this appendix.

    A2 Terrestrial Biogeochemistry

    The terrestrial biogeochemistry in UKESM1 is based largely on that of HadGEM2-ES (Collins et al., 2011), though the underlying code has been migrated from MOSES (Met Office Surface Exchange Scheme) to JULES (Best et al., 2011; Clark et al., 2011), and major enhancements have been developed for UKESM1. As in HadGEM2-ES, fractional cover and canopy height of PFTs are simulated by the TRIFFID vegetation dynamics (Cox, 2001), and the resulting carbon fluxes drive the RothC soil carbon model (Coleman & Jenkinson, 1999). The model simulates the fractional extent of wetland within each grid box, which is used to derive an emission of CH4; in the standard configuration of the model, this emission is diagnostic, but the model has the capability to run with interactive CH4 emissions.

    Parameters related to photosynthesis, respiration, and leaf turnover have been updated based on the TRY database (Kattge et al., 2011). These include nitrogen content of leaves, roots, and stems, leaf mass per unit area, leaf growth rate, and the relationship between leaf nitrogen and the maximum rate of carboxylation of the enzyme Rubisco at 25°C (Harper et al., 2016, 2018). The number of natural PFTs was increased from five to nine to represent the distinction between evergreen and deciduous plants and between tropical and temperate evergreen trees. The canopy scheme was updated so that the exponential decay of both photosynthetically available radiation and leaf nitrogen through the canopy now depend on the total leaf area index. In offline land-only tests, these developments improved model simulation of gross and net primary productivity (GPP and NPP, respectively) at the site level and globally (Harper et al., 2016). In particular, the NPP of C4 grasses was reduced and GPP and NPP of trees was increased.

    Plants still compete for space based on their NPP and phenology as in Cox (2001), but the competition is now based on a pure height ranking, where the tallest trees get access to space in a grid box first. In uncoupled component model tests, the new dynamic vegetation and PFTs yield a closer match to observed vegetation distribution, with particular improvements to tropical and boreal forests and the high latitudes, where previously the model overestimated the distribution of shrubs (Harper et al., 2018).

    Figure A1 illustrates the overall distribution of these PFTs with the dominant vegetation type, that is, the PFT having the greatest fractional coverage in each grid box. The additional PFT sub-divisions added by Harper et al. (2016) can be seen most clearly in the Boreal forests, with needleleaf deciduous trees generally dominating over needleleaf evergreen trees in more northerly continental climates. Tropical broadleaf evergreen trees dominate most of South America, sub-Saharan Africa, SE Asia, and the maritime continent. In contrast, there are very few areas in which broadleaf deciduous trees or temperate broadleaf evergreen dominate, as these tend to grow sparsely in regions dominated by grasses or shrubs.

    Details are in the caption following the image
    Dominant PFT within each model grid box for the year 2005, using the mean vegetation fraction of the historical ensemble. BLD = broadleaf deciduous tree; BLETr = tropical broadleaf evergreen tree; BLETe = temperate broadleaf evergreen tree; NLD = needleleaf deciduous tree; NLE = needleleaf evergreen tree; C3G = natural C3 grass; C4G = natural C4 grass; Sh = Shrub; Cr = Crop (C3+C4 grass); Pa = Pasture (C3+C4 grass); BS = bare soil.

    Terrestrial carbon uptake is limited by the availability of nitrogen and other nutrients. An ecosystem becomes limited when insufficient nitrogen is available for plants to allocate net photosynthate to growth. In such cases photosynthesis is down-regulated to match the available inorganic nitrogen. In UKESM1, nitrogen does not directly affect photosynthetic capacity through leaf nitrogen concentrations but acts indirectly by controlling the biomass and leaf area index within the TRIFFID vegetation dynamics. A second mechanism acts through the soil by limiting the decomposition of litter into soil carbon in the RothC model. This occurs when insufficient nitrogen is available to decompose high C:N litter into low C:N soil organic matter. The vegetation model includes retranslocation of nitrogen during senescence of leaves and roots into a labile pool to supply nutrients for the following season's leaf growth. The soil model simulates mineralization and immobilization, with mineralized nitrogen becoming available for plant uptake and ecosystem loss. Inorganic nitrogen is represented by a single grid box pool to which all PFTs have equal access. As PFTs have differing nitrogen demands and turnover, the inclusion of nitrogen processes influences vegetation competition and therefore distribution.

    To simulate the impact of agriculture and anthropogenic land use on the climate (via biophysical effects) and carbon cycle (via biogeochemical effects), a representation of land use and land management is implemented. Going beyond the HadGEM2-ES treatment of all agriculture as “disturbed” in a common way (Jones et al., 2011), UKESM1 distinguishes crop and pasture land. This is done by reserving a fraction of each grid box for crops and a fraction for pasture; these land use fractions are prescribed as an external forcing and may vary in time. Within the crop (pasture) fraction only the C3 crop (pasture) and C4 crop (pasture) PFTs are allowed to grow, with the area of each crop PFT determined by TRIFFID. Only the nine natural PFTs (including natural C3 and C4 grass) may grow the remainder of the grid box. When the prescribed crop or pasture fraction increases, natural vegetation is removed from the portion of the grid box into which agriculture has expanded, representing anthropogenic land clearance. The vegetation carbon thus removed is distributed as follows: Root biomass enters the soil carbon pools, and above-ground biomass is distributed to three product pools which release CO2 to the atmosphere at turnover rates of 1, 10, and 100 years to represent a range of fast-, medium-, and slow-decay wood products. Conversely, when crop and pasture areas are reduced, the natural PFTs are allowed to recolonize the vacated grid box fraction.

    The crop and pasture PFTs are physiologically identical to the natural grasses, but simple representations of fertilizer application and harvesting are applied to the crop PFTs. The crop PFTs are prevented from being nitrogen-limited by a fertilizer flux, which is calculated to exactly meet the nitrogen demand which cannot be met by available soil nitrogen. A crop harvest carbon flux is included by preventing 30% of crop PFT litter from entering the soil; instead this carbon flux enters the atmosphere, representing the rapid turnover of crop products. The harvest flux prevents unrealistic accumulation of soil carbon in croplands.

    Finally, JULES now simulates biogenic emissions of VOCs, described in section A4.

    A3 Ocean Biogeochemistry

    The Model of Ecosystem Dynamics, nutrient Utilisation, Sequestration and Acidification (MEDUSA; Yool et al., 2013) is an intermediate-complexity plankton ecosystem model designed to incorporate sufficient complexity to address key feedbacks between anthropogenically-driven changes (climate and acidification) and oceanic biogeochemistry. MEDUSA-2 resolves a dual size-structured ecosystem of small (nanophytoplankton and microzooplankton) and large (microphytoplankton and mesozooplankton) components. This explicitly includes the biogeochemical cycles of nitrogen, silicon, and iron nutrients as well as the cycles of carbon, alkalinity, and dissolved oxygen. Large phytoplankton are treated as diatoms and utilize silicic acid in addition to nitrogen, iron, and carbon.

    Similar to its living components, MEDUSA-2's detrital components are split into two size classes. Small, slow-sinking detritus is represented explicitly with separate nitrogen and carbon tracers; a fixed Fe:N ratio is assumed. Large, fast-sinking detritus is represented implicitly, and created and remineralized within a single timestep. Fast-sinking detritus consists of organic nitrogen, carbon, and iron, together with silicon and calcium carbonate biominerals that are involved in a ballast parameterization (Armstrong et al., 2001). At the seafloor, MEDUSA-2 resolves five reservoirs to store sinking organic material reaching the sediment.

    The model's nitrogen, silicon and alkalinity cycles are closed and conservative (e.g., no riverine inputs), while the other three cycles (carbon, iron, and oxygen) are open. The iron cycle includes aeolian and benthic sources and is depleted by scavenging. The ocean's carbon cycle exchanges CO2 with the atmosphere. The oxygen cycle exchanges with the atmosphere, and dissolved oxygen is additionally created by primary production and depleted by remineralization. The various elemental cycles include both fixed and variable stoichiometry: Iron has a fixed ratio with nitrogen throughout; nitrogen and carbon have fixed (but different) ratios in phytoplankton and zooplankton, and variable ratios in detritus; diatom silicon has a variable ratio with nitrogen; calcium carbonate (cf. alkalinity) is produced at a variable rate relative to organic carbon; oxygen production and consumption reflects the C:N ratio of organic matter produced and consumed.

    During its development for UKESM1, several changes have occurred to the science and code of MEDUSA-2 compared to that originally described in Yool et al. (2013). The principal science changes are as follows:
    1. The carbonate chemistry submodel in MEDUSA-2 has been upgraded to MOCSY-2.0 (Orr & Epitalon, 2015), increasing the calculation accuracy of pCO2 and thus CO2 air-sea flux.
    2. A diagnostic submodel for DMS surface seawater concentration (Anderson et al., 2001) was added to support interactive DMS emissions to the atmosphere (see section 4.6).
    3. For consistency with earlier work, the diel light cycle within UKESM1 is time-averaged for MEDUSA-2.

    A4 Atmospheric Chemistry

    Atmospheric composition in UKESM1 is simulated by the U.K. Chemistry and Aerosols (UKCA) model. UKCA has several chemistry configurations available; the specific configuration used in UKESM1 is described in Archibald et al. (2019) and comprises a combination of the stratospheric scheme of Morgenstern et al. (2009) (with minor updates to reaction rates as in Morgenstern et al., 2017) and the “TropIsop” tropospheric chemistry of O’Connor et al. (2014), which includes extensions to the volatile organic compound (VOC) representation relative to the tropospheric chemistry in HadGEM2-ES. The merged scheme thus simulates interactive chemistry from the surface to the top of the model (85 km), enabling a holistic treatment of ozone, the third most important radiatively active gas in the atmosphere and the primary source of the hydroxyl radical (OH) in the troposphere. In total, the chemistry solves 291 thermal and photolytic reactions which describe the chemistry of 81 species. This includes the oxidation reactions which are a source of sulphate and secondary organic aerosol, and hence the chemistry is tightly coupled to aerosol production.

    In UKESM1, prognostic chemistry influences radiation through the three-dimensional concentrations of four gases: O3, CH4, N2O, and H2O. CH4 and N2O are forced with globally constant surface concentrations, and above the surface are treated as interactive chemical tracers. This allows the model to represent the spatiotemporal variability of these species (in particular their decay in the upper troposphere) while constraining their global scale evolution to match past observations or a particular forcing scenario. The model has the capability to run with interactive CH4 emissions, but in the standard configuration surface concentrations are prescribed as above.

    In contrast, O3 is fully interactive and is forced by surface emissions of ozone precursors such as VOCs, three-dimensional emissions of NOx (including emissions from aircraft and interactive production by lightning), and concentrations of ozone depleting substances. Chemical reactions which produce or destroy water vapor increment the humidity tracer of the dynamical core. This allows interactive simulation of chemical water vapor feedbacks, the most significant of which is due to stratospheric methane oxidation (see e.g., Hegglin et al., 2014; Hurst et al., 2011; Rohs et al., 2006). Concentrations of ozone depleting substances (CFCs, HCFCs, and Halons) are treated as in Morgenstern et al. (2009) and thus are not coupled between UKCA and radiation.

    As an inert gas, CO2 is not modified by UKCA chemistry and is specified as a global constant in the default configuration of UKESM1. See Appendix C2 for a description of the CO2 emission-driven configuration.

    The photolysis scheme that is used to drive the photochemistry in UKCA is that of Fast-JX (Neu et al., 2007; Telford et al., 2013) v6.4, augmented by the offline scheme after Lary and Pyle (1991) above 60 km for wavelengths between 113 and 177 nm. Fast-JX includes scattering calculations over a broad wavelength range (177–850 nm) to cover the stratosphere and lower mesosphere. This allows photolysis rates to vary with the optical depth associated with cloud liquid and frozen water as well as both the surface albedo calculated by the model's radiation scheme (Edwards & Slingo, 1996) and UKCA's own prognostic ozone distribution. However, there is no coupling to the aerosol scheme, and so the interactive photolysis rates are independent of aerosol loading.

    The main purpose of this interactive calculation is to allow photolysis and hence methane lifetime to respond to stratospheric ozone recovery. This limitation made the future methane lifetime projections of HadGEM2-ES (which used fixed photolysis rates) anomalous relative to other models in the Atmospheric Chemistry and Climate Model Intercomparison Project (Voulgarakis et al., 2013). Figure A2 compares interactive rates in UKESM1 against the offline tabulated rates used by HadGEM2-ES, for July 2005–2014. The morphology and magnitude of the photolysis rates match reasonably well, and there is a strong correlation between the offline and online rates (r2>0.96). The Fast-JX scheme gives rise to photolysis rates that are more variable, and higher than the offline rates by a factor of 1.2–1.3. This difference persists throughout the annual cycle (not shown) and leads to the methane lifetime being approximately 25% shorter with interactive photolysis than that with offline photolysis (O’Connor et al., 2014; Telford et al., 2013). For UKESM1, such a reduction in methane lifetime brings it within the observational range, highlighting the importance of the improved, interactive, scheme.

    Details are in the caption following the image
    Comparison of 2005–2014 July mean photolysis rates from a single historical run, for O1D and NO2 below 20 km between the offline photolysis scheme used in the CMIP5 configuration of HadGEM2-ES and the Fast-JX scheme in UKESM1.

    Biogenic volatile organic compounds (BVOCs) play an important role in atmospheric composition and consequently in climate. BVOCs act as ozone precursors, impact strongly on the chemical lifetime of methane, and their oxidized derivatives contribute to the formation of secondary organic aerosols (SOA). The most important BVOCs for climate and air quality are isoprene and terpenes, which typically contribute 50–70% of the total BVOC emission flux in contemporary chemistry-climate models. In UKESM1, isoprene feeds into the chemical mechanism for isoprene photooxidation, while a monoterpene tracer (which oxidizes and condenses to form SOA) is used to represent terpenes.

    In UKESM1, biogenic emissions of isoprene and monoterpene are calculated interactively by means of the iBVOC model (Pacifico et al., 2012, 2015), which applies the isoprene emission scheme of Pacifico et al. (2011) and the terpene emission parametrization from Guenther et al. (1995). These parametrizations take as inputs the modeled temperature and gross primary productivity (GPP) and hence add an important coupling between climate, atmospheric composition, and vegetation dynamics. In order to minimize differences in organic carbon aerosol load between UKESM1 and GC3.1, the per-PFT monoterpene emission capacity parameters were adjusted within observed ranges to achieve a present-day global total monoterpene emission of approximately 125 TgC year−1, to match the emission data set of Guenther et al. (1995) used by GC3.1. Similarly, isoprene emission factors were adjusted in line with published values to achieve a present-day global total emission of approximately 500 TgC year−1 (Arneth et al., 2008; Guenther et al., 2012; Lathière et al., 2006; Messina et al., 2016).

    A5 Aerosols

    The aerosol configuration of UKESM1 is the same as that of GC3.1 and is described by Mulcahy et al. (2018). It uses the UKCA-GLOMAP modal scheme (Mann et al., 2010) for sulphate, black carbon, organic carbon, and sea salt, and a variant of the Woodward (2011) bin scheme for mineral dust. UKESM1 does however differ from GC3.1 in its treatment of natural emissions of aerosols and aerosol precursors, as follows:
    1. Biogenic emissions of monoterpene, a precursor of SOA, are interactively calculated in the land surface scheme as described in section A4, whereas GC3.1 prescribes a climatological emission which is representative of the present-day.
    2. DMS emissions are coupled to ocean biogeochemistry, using the Anderson et al. (2001) parametrization of seawater DMS concentration rather than prescribed using a present-day observational climatology.
    3. UKESM1 introduces an emission of primary marine organic aerosol (PMOA), using the parametrization of Gantt et al. (2011, 2012), acting as a source of Aitken mode organic aerosol. This parametrization is coupled to the interactive surface chlorophyll concentration of MEDUSA. GC3.1 has no explicit PMOA representation but instead applies a scale factor of 1.7 to its DMS emissions as a proxy for PMOA (Mulcahy et al., 2018). This DMS scale factor is not applied in UKESM1 since PMOA is explicitly simulated.

    These three couplings allow the model aerosol and cloud-radiative behavior to respond directly to changes in both the marine and terrestrial biospheres, as well as indirectly through the impact of climate on the biosphere. A detailed assessment of the impact of these emissions developments on the aerosol and radiation performance of the model is in progress.

    Another difference in the treatment of aerosol between UKESM1 and GC3.1 is that the latter prescribes time-invariant concentrations of the chemical oxidants involved in the formation of sulphate and secondary organic aerosol, namely, O3, OH, NO3, HO2, and H2O2. In UKESM1 these oxidants are interactively simulated by the chemistry component and are themselves depleted by the oxidation reactions. This oxidant coupling means that changes in methane and other ozone precursors can affect the model's aerosol loading and aerosol radiative forcing.

    Emissions of mineral dust in UKESM1 are dependent upon the modeled bare soil fraction as well as on soil moisture and near-surface wind. Deposition of dust to the ocean in turn acts as a source of iron nutrient to MEDUSA, fertilizing phytoplankton growth wherever iron is limited. Dust therefore acts as a unique link between the terrestrial and marine biospheres. The dust scheme in UKESM1 is similar to that in GC3.1 and is described in Woodward (2011), with the following changes. First, in UKESM1 the bare soil fraction (which dictates the area available for dust emissions) is interactively simulated by the TRIFFID vegetation dynamics, rather than prescribed. This significant difference in driving data necessitated tuning of the dust emission parameters, as described in section 3. Second, dust emission from seasonally vegetated sources was disabled in UKESM1 in order to reduce the complexity of the coupling to TRIFFID, so that dust emissions depend on bare soil fraction but not on simulated leaf area index. Note that following a careful tuning of the vegetation simulation, also described in section 3, it was not found necessary to impose a map of preferential dust source regions such as those used in earlier models (e.g., Collins et al., 2011; Ginoux et al., 2001; Tegen et al., 2002; Zender et al., 2003).

    Appendix B: Metrics and Variables Used in Model Tuning

    Table B1 lists the primary metrics which were monitored during the tuning and spin-up of UKESM1. These were calculated based on annual mean model outputs (in some cases with longer term meaning applied to smooth interannual noise) and plotted as time series. These time series plots were updated routinely as test runs progressed, to rapidly understand the model's response to parameter changes. Priority 1 metrics were checked more frequently than the priority 2 metrics.

    Table B1. Priority Metrics Monitored During the Tuning of UKESM1
    Model variable Location or averaging domain
    Priority 1
    TOA radiation: SW, LW, and net flux Global and latitude bands
    Cloud fraction Global and latitude bands
    Surface air temperature Global and latitude bands
    AMOC strength 26°N
    ACC strength Drake passage
    Sea surface temperature Global and ocean basins
    Depth-mean water temperature Global and ocean basins
    CO2 air-sea flux Global
    Priority 2
    Sea ice extent and thickness Arctic and Antarctic
    Surface emissions of DMS, PMOA, BVOCs, dust Global and latitude bands
    Soil and vegetation carbon Global
    Ocean mixed-layer depth Global and ocean basins
    Ocean surface DIC, iron, nitrate Global
    Primary productivity Global land and ocean
    • Note. Latitude bands are 90°S–30°S, 30°S–30°N, 30°N–90°N. Priority ocean regions monitored were North Atlantic, Southern Ocean, Weddell Sea, and Ross Sea.

    In addition to these time series, we also periodically examined seasonal mean, long-term averaged maps of these and other quantities to ensure there was not error cancelation in the spatial means, and to understand better the processes which were likely responsible for driving differences between test runs.

    Appendix C: Special Configurations of the Model

    C1 Faster Configurations: UKESM1-CN and UKESM1-C

    Because the chemistry scheme in UKESM1 is particularly expensive, we have implemented a cheaper configuration of the model, known as UKESM1-CN (a carbon-nitrogen cycle model) which reduces computational expense by utilizing the simpler “offline oxidant” chemistry mechanism of GC3.1 (Walters et al., 2019). This scheme simulates sulphur chemistry for the production of sulphate aerosol using prescribed oxidants but excludes other interactive chemical processes. In addition, UKESM1-CN does not have interactive emissions of BVOCs. Apart from these two changes, UKESM1-CN is identical to the full UKESM1 configuration. It runs approximately twice as fast as UKESM1. UKESM1-CN will not be used for submissions to CMIP6 but is intended for use in carbon-nitrogen cycle research and other applications which do not depend upon interactive chemistry.

    Because it is lacking fully interactive chemistry, the UKESM1-CN configuration needs additional input files of the following variables:
    1. O3, for radiation and as an oxidant for aerosol chemistry,
    2. other oxidants for aerosol chemistry (OH, NO3, HO2, and H2O2),
    3. and biogenic emissions of monoterpene.

    For consistency with the full model, each of these inputs are derived from the interactive model fields produced by a prior run with UKESM1. Experience thus far indicates that headline metrics such as ECS are consistent between UKESM1 and UKESM1-CN. Future work will study the traceability between these configurations of differing complexity.

    A further simplification incorporates only the carbon cycle (UKESM1-C), primarily for understanding the impact of the nitrogen processes. The nitrogen processes do not add significant cost, so UKESM1-C runs at the same speed as UKESM1-CN.

    C2 CO2 Emission-Driven Configuration

    All of the results in section 4, and most of the simulations that will be submitted to CMIP6, are driven by CO2 concentrations rather than CO2 emissions. When UKESM1 runs in CO2 concentration-driven mode, atmospheric CO2 concentrations are externally prescribed as a global constant, and the biogeochemical fluxes of CO2 are purely diagnostic. From these natural fluxes, compatible anthropogenic emissions can be derived, as per Friedlingstein et al. (2006).

    When UKESM1 runs in CO2 emission-driven mode, atmospheric CO2 is held as an advected 3D tracer, updated by the natural surface fluxes and anthropogenic emissions. In this mode CO2 concentrations are free to evolve and thus drive a radiative response the model's carbon cycle feedbacks. The 3D nature of the free tracer means that (unlike in concentration-driven mode) the model resolves the spatial structure of CO2, including hemispheric and stratosphere-troposphere differences, and its seasonal cycle.

    Two emission-driven simulations have been performed: a pre-industrial control with zero anthropogenic emissions and a historical run from 1850 with evolving CMIP6 anthropogenic emissions. In Figure C1 we compare the seasonal cycle of the historical run against observations from the Carbon Dioxide Information Analysis Centre (CDIAC; Keeling et al., 1994). These show that the timing of the model's seasonal cycle compares well against the observations, though the summer draw-down begins too early. The seasonal range is weaker than observations at Mauna Loa by approximately 40% and stronger at Barrow by approximately 20%.

    Details are in the caption following the image
    Climatological seasonal cycle of near-surface atmospheric CO2 concentration at Mauna Loa, Hawaii (upper) and Point Barrow, Alaska (lower) for the period 1960–2008. Observations are from CDIAC. Model values are from the UKESM1 emission-driven historical experiment.

    Figure C2 shows the temporal evolution of CO2 for these experiments at Mauna Loa, again compared to historical observations from CDIAC. The pre-industrial control concentrations vary on decadal timescales within the range 280–288 ppm, which encompasses the observed value of 284 ppm prescribed in concentration-driven experiments (Meinshausen et al., 2017). The historical experiment drifts slightly higher than the observations, reaching a bias of about 10 ppm by 2000. Such close agreement with the observations is a good result given the number of degrees of freedom in the emission-driven configuration and compares well with the CMIP5 generation of models.

    Details are in the caption following the image
    Monthly time series of near-surface CO2 at Mauna Loa, Hawaii, from UKESM1 emission-driven piControl and historical experiments, and CDIAC observations.

    It is critical that the fully coupled model conserves carbon to a high accuracy: A significant loss of conservation would confound the analysis of carbon feedbacks. The UKESM1 atmospheric convection and boundary layer schemes only conserve tracer mass to within order 0.1 Gt of CO2 per year, which is not sufficiently accurate for emission-driven simulations. An atmospheric mass correction has therefore been implemented, as was required in HadGEM2-ES (Jones et al., 2011). In UKESM1 this correction uses an implementation of the Priestley (2002) conservation algorithm to distribute the correction increments globally. The ocean and land components did not require any such correction scheme to achieve this level of conservation.

    The finalized model conserves carbon to high accuracy. Over the first 64 years of the emission-driven pre-industrial control and historical simulations, the model gained an average of −1.3×10−3 GtC year−1and1.2×10−4 GtC year−1, respectively. These rates are two or more orders of magnitude smaller than the C4MIP-recommended 0.1 Gt year−1 threshold for acceptable drift in carbon pools in CO2 concentration-driven experiments (Jones et al., 2016), so we consider this an acceptable loss of conservation for CO2 emission-driven experiments.

    Appendix D: Simulation Identifiers

    All simulations used in this work were performed using version 10.9 of the UM, version 5.0 of JULES, NEMO version 3.6, CICE version 5.1.2, and OASIS3-MCT version 3.0. Simulation identifiers are as follows:
    1. piControl: u-aw310
    2. 4xCO2: u-bb446
    3. 1pctCO2: u-bb448, u-bd334, u-bd335, u-bd336
    4. historical: u-bc179, u-bc292, u-bc370, u-bb075, u-az513, u-az515, u-az524, u-bb277, u-bc370, u-bc470
    5. untuned pre-industrial test: u-bd881 (the control is u-aw310)
    6. untuned present-day test: u-bd880 (the control is u-az513)