Reviews of Geophysics, Volume 58, Issue 2, e2019RG000657
Review Article, Open Access

The Structure of Climate Variability Across Scales

Christian L. E. Franzke (Corresponding Author)
Meteorological Institute, University of Hamburg, Hamburg, Germany
Center for Earth System Research and Sustainability, University of Hamburg, Hamburg, Germany
Correspondence to: C. L. E. Franzke, [email protected]

Susana Barbosa
Centre for Information Systems and Computer Graphics, INESC TEC, Porto, Portugal

Richard Blender
Meteorological Institute, University of Hamburg, Hamburg, Germany
Center for Earth System Research and Sustainability, University of Hamburg, Hamburg, Germany

Hege-Beate Fredriksen
Department of Physics and Technology, UiT-The Arctic University of Norway, Tromso, Norway

Thomas Laepple
Alfred Wegener Institute, Helmholtz Centre for Polar and Marine Research, Potsdam, Germany

Fabrice Lambert
Geography Institute, Pontificia Universidad Catolica de Chile, Santiago, Chile

Tine Nilsen
Department of Mathematics and Statistics, UiT-The Arctic University of Norway, Tromso, Norway
Department of Geography, Justus Liebig University of Giessen, Giessen, Germany

Kristoffer Rypdal
Department of Mathematics and Statistics, UiT-The Arctic University of Norway, Tromso, Norway

Martin Rypdal
Department of Mathematics and Statistics, UiT-The Arctic University of Norway, Tromso, Norway

Manuel G. Scotto
CEMAT and Department of Mathematics, IST, University of Lisbon, Lisbon, Portugal

Stéphane Vannitsem
Royal Meteorological Institute of Belgium, Brussels, Belgium

Nicholas W. Watkins
London School of Economics, London, UK
School of Engineering and Innovation, The Open University, Milton Keynes, UK
Centre for Fusion, Space and Astrophysics, University of Warwick, Coventry, UK

Lichao Yang
Center for Earth System Research and Sustainability, University of Hamburg, Hamburg, Germany
Department of Atmospheric and Oceanic Sciences, Peking University, Beijing, China

Naiming Yuan
Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing, China

First published: 05 March 2020

Abstract

One of the most intriguing facets of the climate system is that it exhibits variability across all temporal and spatial scales; pronounced examples are temperature and precipitation. The structure of this variability, however, is not arbitrary. Over certain spatial and temporal ranges, it can be described by scaling relationships in the form of power laws in probability density distributions and autocorrelation functions. These scaling relationships can be quantified by scaling exponents which measure how the variability changes across scales and how the intensity changes with frequency of occurrence. Scaling determines the relative magnitudes and persistence of natural climate fluctuations. Here, we review various scaling mechanisms and their relevance for the climate system. We show observational evidence of scaling and discuss the application of scaling properties and methods in trend detection, climate sensitivity analyses, and climate prediction.

Key Points

  • Climate variability operates on a continuum of spatial and temporal scales in such a way that the variability exhibits scaling relationships
  • Climatologically relevant imprints of scaling include long-range dependence and non-Gaussian fluctuations
  • Scaling has implications for trend detection, climate sensitivity, and predictability

Plain Language Summary

Climate variables are related over long times and large distances. This shows up as correlations between averages over long intervals or between distant areas. An important finding is that the majority of correlations in the climate system can be described by a simple mathematical relationship. We present such correlations for temperature on long time scales. Similarly, the intensity of precipitation events depends on their frequency in a simple manner. A useful concept is scaling, where a scale denotes the width of an average. Scaling means that averages on different scales are related by a simple function; mathematically, this is a power law with the scaling exponent as a characteristic number. Scaling has impacts on predictability, temperature trends, and the assessment of future climate changes caused by anthropogenic forcing.

1 Introduction

An emerging topic in climate science is the systematic change of the temporal and spatial structures of climate variability seen across a multitude of spatial and temporal scales, in particular power law behavior (e.g., Graves et al., 2017; Hurst, 1951; Huybers & Curry, 2006; Lovejoy & Schertzer, 2013; Mandelbrot & Wallis, 1968). The intensity distribution of climate variables in relation to their frequency of occurrence also shows such power law behavior. Improving our understanding of the underlying structure of climate variability is important since it may not only improve our predictive capabilities but also contribute to a better understanding of the complex Earth system as a whole. The presence of power law behavior in both the temporal and spatial domains and in intensities can reveal aspects of the underlying dynamics of the Earth system such as climate sensitivity and predictability.

This behavior can be illustrated with two climatological time series (Figure 1). Our choice of precipitation data (Figure 1a) exhibits the typical intermittent behavior with no or only very little precipitation on most days interspersed with an occasional extreme event. Hence, precipitation is a climatological variable that is highly episodic. Consequently, the distribution of precipitation is much more heavy tailed than a Gaussian distribution (Figure 1b). Thus, large values are much more likely than in the case of variables that are Gaussian distributed; the Gaussian distribution decays much faster than a power law. The tails of many precipitation distributions, as well as of other climatological quantities, decay according to a power law (see section 2 for details). This power law relation between intensity and probability of occurrence constitutes a scaling relationship.

Figure 1. (a) Daily precipitation at Xichang, China. (b) Probability density function of precipitation (red dashed line: power law fit with exponent 4.97; black dashed line: exponential probability density function with parameter 8.21). (c) Annual mean Central England Temperature (CET); red line: nonlinear trend, magenta line: 11-year running mean, and blue line: decadal-scale fluctuations as derived from an empirical mode decomposition (EMD). (d) Detrended fluctuation analysis (DFA) plot with d=0.25; circles: fluctuation function, red line: straight line with slope 0.25. (e) Autocorrelation function of CET (black line); the red dashed line indicates a power law decay.

As a second time series we present the Central England Temperature (CET) (Parker et al., 1992) for the period 1772-2017. The CET consists of observations from stations located throughout central England. In Figure 1c we show the annual mean time series overlaid by an 11-year running mean and the nonlinearly filtered decadal-scale CET data using empirical mode decomposition (Franzke, 2009; Huang et al., 1998; Huang & Wu, 2008). Empirical mode decomposition allows for a systematic decomposition of time series into dynamically relevant oscillatory modes and a nonlinear trend. The CET time series exhibits decadal-scale variations about an instantaneous mean (Franzke, 2009). This observed decadal-scale variability is a visible imprint of scaling and long-range dependence (e.g., Gil-Alana, 2008; Graves et al., 2015). Intuitively, long-range dependence means that spatially coherent anomalies persist for a long time; for example, heat waves or droughts may last for many years (Cook et al., 2015). It implies that positive (negative) anomalies are very likely followed by positive (negative) anomalies for long periods of time, and that serial correlations decay more slowly than exponentially, for example, as a power law. The power law decay of serial correlations in long-range dependent systems (Figures 1d and 1e) can be demonstrated using detrended fluctuation analysis (DFA) (see section 2). This approach provides more robust estimates than the standard autocorrelation function, which can be noisy at long lags (Figure 1e). In brief, DFA computes the variance of fluctuations in moving windows of different sizes, which yields a scaling relationship describing the correlation strength of values at different times.
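The DFA procedure just sketched can be written in a few lines of Python. The following is our minimal illustration, not the implementation used in the paper; the window sizes, the first-order (linear) detrending, and the white-noise test series are arbitrary choices. For an uncorrelated series, the log-log slope of the fluctuation function should be close to 0.5.

```python
import random
import math

def linear_residuals(y):
    """Least-squares linear detrend of y; returns the residuals."""
    n = len(y)
    xbar = (n - 1) / 2.0
    ybar = sum(y) / n
    sxy = sum((x - xbar) * (yi - ybar) for x, yi in enumerate(y))
    sxx = sum((x - xbar) ** 2 for x in range(n))
    slope = sxy / sxx
    return [yi - (ybar + slope * (x - xbar)) for x, yi in enumerate(y)]

def dfa(series, window_sizes):
    """First-order DFA: returns (s, F(s)) pairs, where F(s) is the
    root-mean-square residual of the linearly detrended profile."""
    mean = sum(series) / len(series)
    profile, total = [], 0.0
    for v in series:            # cumulative sum of anomalies ("profile")
        total += v - mean
        profile.append(total)
    result = []
    for s in window_sizes:
        sq = []
        for start in range(0, len(profile) - s + 1, s):
            res = linear_residuals(profile[start:start + s])
            sq.extend(r * r for r in res)
        result.append((s, math.sqrt(sum(sq) / len(sq))))
    return result

def loglog_slope(pairs):
    """Least-squares slope in a double logarithmic representation."""
    pts = [(math.log(s), math.log(f)) for s, f in pairs]
    xbar = sum(p[0] for p in pts) / len(pts)
    ybar = sum(p[1] for p in pts) / len(pts)
    num = sum((x - xbar) * (y - ybar) for x, y in pts)
    den = sum((x - xbar) ** 2 for x, _ in pts)
    return num / den

# White noise has no memory, so its DFA exponent should be near 0.5.
random.seed(1)
noise = [random.gauss(0.0, 1.0) for _ in range(20000)]
slope = loglog_slope(dfa(noise, [8, 16, 32, 64, 128, 256]))
```

For a long-range dependent series the same diagnostic would give a slope above 0.5 (e.g., near 0.75 for d=0.25, as in Figure 1d of the paper's convention d + 0.5).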

To summarize, many climatological time series exhibit power law behavior in their amplitudes or their autocorrelations or both. This behavior is an imprint of scaling, which is a fundamental property of many physical and biological systems and has also been discovered in financial and socioeconomic data as well as in information networks (Ball, 2003; Clauset et al., 2009; Mandelbrot, 1963; Mantegna & Stanley, 1999; Saichev et al., 2009; Willinger et al., 2004). While the power law has no preferred scale, the exponential function, also ubiquitous in physical and biological systems, does have a preferred scale, namely, the e-folding scale, that is, the scale over which its magnitude decays by a factor of e. For example, the average height of humans is a good predictor for the height of the next person you meet, as there are no humans that are 10 times larger or smaller than you. However, the average wealth of people is not a good predictor for the wealth of the next person you meet, as there are people who can be more than 1,000 times richer or poorer than you are. Hence, the height of people is well described by a Gaussian distribution, while the wealth of people follows a power law (Newman, 2005).

Furthermore, a fascinating aspect of scaling in the climate system is that it occurs in many different characteristics of climate variables. As demonstrated above, it exists in time and intensity and, as we will discuss below, in space. For instance, negative vorticity anomalies, such as blocking, can be very persistent (e.g., Feldstein & Franzke, 2017), while positive vorticity anomalies, such as storms, have a heavy-tailed probability distribution of intensities (Blender et al., 2016; Corral et al., 2010) and heavy-tailed waiting time distributions (Franzke, 2013; Yang, Franzke & Fu, 2019). Persistence and heavy-tailed distributions are both described by scaling relationships, and the scaling properties in intensity, time, and space are likely caused by different dynamical regimes. In section 3 we discuss potential physical mechanisms which can explain scaling in the climate system. While many mechanisms have been discussed in the literature (e.g., Beran, 1994; Beran et al., 2013), their applicability to the climate system is still an open question.

While the existence of scaling has been known for a long time and across many scientific areas, it had been largely ignored for an almost equally long time in the analysis of climate data, with some exceptions (e.g., Becker et al., 2014; Blender & Fraedrich, 2003; Dangendorf et al., 2014; Gil-Alana, 2003; Franzke, 2012; Koscielny-Bunde et al., 1998; Mann, 2011; Vyushin et al., 2004). Only recently has its usefulness been more widely appreciated in climate science, partly due to its inclusion in text books (e.g., Chandler & Scott, 2011; Lovejoy & Schertzer, 2013; Mudelsee, 2013; Schmitt & Huang, 2016) and partly due to the establishment of working groups such as Climate Variability Across Scales, part of Past Global Changes, who employ scaling approaches to improve our understanding of the complexities of the Earth system (see, e.g., Crucifix et al., 2017).

These scaling ideas enter the climate sciences from theoretical physics, applied mathematics, statistics, and theoretical climatology. They are rarely taught in standard meteorology, oceanography, or climate science courses. Here, we aim to bridge these disciplinary gaps by introducing the main ideas in a manner that is accessible and applicable for climate scientists.

1.1 Scales in the Climate System

One of the fascinating aspects of the climate system is the close relationship between the spatial and temporal scales of the relevant physical processes. This accounts for the success of scaling analyses of the equations of motion and the systematic derivation of simplified versions of the primitive equations, such as the quasi-geostrophic or the shallow-water equations (e.g., Franzke et al., 2019; Klein, 2010; Majda & Wang, 2006; Vallis, 2017). For instance, the quasi-geostrophic equations are valid in the limit of a small Rossby number (Vallis, 2017) and describe Rossby and synoptic-scale waves and, thus, provide an excellent conceptual model to understand many important aspects of the atmosphere and ocean.

The many physical processes in the Earth's climate system span a vast dynamic range, both in space (from 10^-3 to 10^7 m) and time (from seconds to millions of years) (Figure 2). Williams et al. (2017) provide a census of atmospheric processes whose variability ranges from seconds to decades. In the climate system, we typically deal with the following physical processes and associated scales: turbulent eddies on time scales of a few seconds and length scales of millimeters to centimeters; convective activity on temporal scales of hours and spatial scales of hundreds of meters to a few kilometers; synoptic weather systems varying on time scales of days and spatial scales of hundreds to thousands of kilometers; large-scale teleconnection patterns with intraseasonal to interannual temporal variability and spatial scales that can span an entire hemisphere; the coupled atmosphere-ocean system, which varies on decadal to centennial time scales and a global spatial scale; and the ice ages, which represent global variations on millennial time scales (Figure 2). The four main components of the climate system (atmosphere, ocean, land, and cryosphere) tend to operate on different time scales that interact nonlinearly with each other, creating a plethora of interesting effects and feedbacks (Peters et al., 2004; Rial et al., 2004; Williams et al., 2017).

Figure 2. Schematic diagram of important spatial and temporal scales in the climate system. The solid lines denote an estimate of the relative variance of climate variability. The dashed lines denote the variance contribution to the total variance from climatic processes with characteristic spatial scales smaller than those indicated on the x axis. The periodic climate components are denoted by spikes of arbitrary width. See Mitchell (1976) for more details. Figure source: Mitchell (1976).

An intriguing property of the climate system is that, despite the many different physical processes involved, the variability constitutes a continuum of fluctuations: while the variability spectrum may be interspersed with spikes belonging to particular and well-defined forcing processes (e.g., daily, annual, or Milankovich cycles), the vast part of the spectrum is continuous and scales over large ranges.

1.2 Power Law Scaling

By scaling, we mean the power law relationship between the amplitude of fluctuations and their probability of occurrence on a given temporal or spatial scale:

f(λy) = λ^γ f(y),   (1)

where f is an arbitrary function which can be either deterministic or stochastic, y is a climate variable or time, λ is a scale factor which allows us to zoom in and out, and γ denotes the scaling exponent. In case of f being a stochastic function, the equality has to be interpreted as equality in distribution. When considering a time series, f is a stochastic process and equation 1 implies that the variability on short time scales is statistically similar to the variability on longer time scales. This also implies that no preferred time scale exists. Furthermore, this equation describes a self-similar process (Lamperti, 1962); if y denotes time, then equation 1 implies that the variance grows without bound for increasing time scales. The fact that climate data exhibit scaling indicates that the statistical properties remain independent of the scale (Hurst, 1951; Feder, 1988; Franzke et al., 2012; Kolmogorov, 1940; Lamperti, 1962; Mandelbrot & Van Ness, 1968; Mandelbrot, 1982; Taqqu, 2013), as is the case for fractals (Feder, 1988). The scaling property might already be familiar from power spectrum analyses where, in addition to pronounced peaks, one also examines the existence of linear slopes in a double logarithmic representation (e.g., Huybers & Curry, 2006; Wunsch, 2003).

In Figure 3 we display time series sample paths in order to illustrate the scaling property; these were generated from an Autoregressive Fractional Integrated Moving Average (ARFIMA) scaling model (see section 2 and Appendix A). The displayed long-range dependence process has a slope of 0.75 in a log-log plot of fluctuation function versus scale, while a short-range dependence (SRD) process has a slope of 0.5 at long time scales. A slope of 0.5 corresponds to white noise, which means that the process is uncorrelated (Figure 3c). The power spectrum (Figure 3f) exhibits the corresponding behavior: the power of the long-range dependence process increases toward lower frequencies (with a singularity at zero), while the SRD spectrum is flat at low frequencies. Scaling in intensities is displayed in Figure 4 for the α-stable distribution.
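Such ARFIMA sample paths can be generated from the model's moving-average representation, whose coefficients obey the recursion ψ_k = ψ_{k-1}(k-1+d)/k. The sketch below is our illustration, not the paper's code; the truncation lag, series length, and the aggregated-variance diagnostic are arbitrary choices. It checks that the variance of block means of a d=0.25 series decays as roughly m^(2d-1) = m^(-0.5), that is, more slowly than the m^(-1) of white noise.

```python
import random
import math

def arfima_0d0(n, d, trunc=300, seed=2):
    """Truncated MA(infinity) representation of ARFIMA(0, d, 0):
    x_t = sum_k psi_k * eps_{t-k}, with psi_k = psi_{k-1}*(k-1+d)/k."""
    rng = random.Random(seed)
    psi = [1.0]
    for k in range(1, trunc):
        psi.append(psi[-1] * (k - 1 + d) / k)
    eps = [rng.gauss(0.0, 1.0) for _ in range(n + trunc)]
    return [sum(p * eps[t + trunc - k] for k, p in enumerate(psi))
            for t in range(n)]

def block_mean_variance(series, m):
    """Variance of non-overlapping block means of block size m."""
    means = [sum(series[i:i + m]) / m
             for i in range(0, len(series) - m + 1, m)]
    mu = sum(means) / len(means)
    return sum((x - mu) ** 2 for x in means) / len(means)

def variance_slope(series, sizes):
    """Log-log slope of block-mean variance versus block size m:
    about -1 for uncorrelated noise, about 2d - 1 for ARFIMA(0, d, 0)."""
    pts = [(math.log(m), math.log(block_mean_variance(series, m)))
           for m in sizes]
    xbar = sum(x for x, _ in pts) / len(pts)
    ybar = sum(y for _, y in pts) / len(pts)
    return (sum((x - xbar) * (y - ybar) for x, y in pts)
            / sum((x - xbar) ** 2 for x, _ in pts))

sizes = [4, 8, 16, 32, 64]
lrd = arfima_0d0(20000, d=0.25)            # long-range dependent series
rng = random.Random(3)
srd = [rng.gauss(0.0, 1.0) for _ in range(20000)]  # white noise
slope_lrd = variance_slope(lrd, sizes)     # expected near -0.5
slope_srd = variance_slope(srd, sizes)     # expected near -1.0
```

The slower decay of the block-mean variance for the long-range dependent series is the aggregated-variance counterpart of the steeper fluctuation-function slope in Figure 3e.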

Figure 3. Time series with scaling and nonscaling behavior. (a) A time series with scaling behavior (long-term persistence parameter d=0.495) and (b) a zoom into the time period between 400 and 600 time units of (a). After zooming in, the time series in (b) shows a similar pattern as the time series in (a). (c) A time series without scaling behavior (first-order autoregressive process x_{t+1}=0.5x_t+ζ_t) and (d) a zoom into the time period between 400 and 600 time units of (c). (e) Fluctuation functions for a short-term-dependent process (first-order autoregressive process) (black line) and a scaling model in the form of a long-term-dependent process (red line), with regression lines with slope 0.5, which corresponds to d=0.0 (blue line), and slope 0.75, which corresponds to d=0.25 (green line). (f) Power spectra of the short-term-dependent process (black) and the long-term-dependent process (red) plotted in (a) and (c). The blue line is the theoretical slope line of a long-term-dependent process with slope β=−0.5 (d=0.25), and the green line is the theoretical slope line of a short-term-dependent process with slope β=0.0 (d=0.0). The relationship between the slopes of the power spectrum β and the DFA is β=2(d+0.5)−1.
Figure 4. Probability distribution function of an α-stable distribution (a) with linear axis scaling and (b) with logarithmic axis scaling. The case α=2 corresponds to the Gaussian distribution, while α values less than 2 correspond to power laws.

1.3 Climate Variability Across Scales

The first attempt to conceptualize atmospheric variability over a wide range of scales has been made by Mitchell (1976). Mitchell's ambitious composite spectrum (Figure 5) ranged from hours to the age of the Earth and focused on the peaks in the power spectrum, thus emphasizing the quasiperiodic phenomena in the climate system and its forcings. Although Mitchell (1976) made a candid admission that his spectrum was mostly an “educated guess,” and despite subsequent improvements in climate and paleoclimate data, the original work has achieved almost iconic status.

Figure 5. Estimates of relative variance of climate over all periods of variation in the climate system. Source: Mitchell (1976).

Mitchell's scale-bound view led to a climate dynamics framework that emphasizes the importance of numerous processes occurring at well-defined time scales and the separation into quasiperiodic “foreground” processes (illustrated as sharp peaks in Figure 5) and the “unimportant background noise.” We argue that while this division is not wrong per se, it can only explain a small fraction of the overall variability and the underlying climate system dynamics. Wunsch (2003) showed that the quasiperiodic signals represent only a small fraction of the total variability which is more akin to a Lorentzian spectrum of an autoregressive process, while Pelletier (1997) and Huybers and Curry (2006) put an emphasis on the power law behavior of the background spectrum.

Lovejoy and Schertzer (2013) and Lovejoy (2015b) postulated the existence of five distinct power law scaling regimes. These regimes are based on different scaling exponents for the relationship E(ω)∼ω^(−β), where E denotes the spectral energy and ω the frequency (Huybers & Curry, 2006). The proposed regimes are as follows:
  1. the weather regime with time scales from 6 hr up to 20 days with an exponent of β≈1.8
  2. the macroweather regime with time scales between 20 days and 50 years and β=0.2
  3. the climate regime with time scales between 50 and 80,000 years (includes glacial-interglacial cycles) and β=1.8
  4. the macroclimate regime between 80,000 and 500,000 years and β=−0.6
  5. the megaclimate regime for time scales larger than 500,000 years which takes us to the limit of reliable proxies (Lovejoy & Schertzer, 2013) and β=1.8.

See Figure 2a of Lovejoy (2015b) for an illustration of the scaling regimes.

Some recent studies focused more on the continuum aspects of the spectra (Huybers & Curry, 2006; Pelletier, 1998; Paillard, 2001). For instance, Huybers and Curry (2006) reported qualitatively similar results for the macroweather and climate regimes, while Nilsen et al. (2016) provided quantitative evidence that supports the hypothesis of just one scaling regime at least for the Holocene. Nilsen et al. (2016) also question whether it is meaningful to classify climate variability into universal regimes on time scales where we observe forced global climate changes and in particular geological time scales. The reason is that the variability on the long time scales is fundamentally forced by time-dependent external processes, for example, the Milankovich cycle; hence, its statistics are time varying (Nilsen et al., 2016). On shorter temporal scales, on the other hand, scaling is better established in many climatic data sets for a wide range of spatial and intensity ranges. Furthermore, it has been recognized that quasiperiodic signals represent only a small fraction of the total climate variability, and while many studies have focused on understanding these quasiperiodic signals, we argue that the continuous variance spectrum is of equal significance and deserving of future research efforts.

1.4 Scope of the Review

Because scales and scaling properties in the climate system are hard to adequately cover in a single paper, we will restrict this review to topics relevant to the interpretation and reconstruction of time series and to the impacts of scaling on climate variability, trends, prediction, and climate sensitivity. While we cover the potential physical mechanisms behind scaling, we can only provide a broad and nonrigorous introduction to the mathematical framework of scaling processes. More rigorous treatments can be found elsewhere (Beran, 1994; Beran et al., 2013; Baillie, 1996; Doukhan et al., 2002; Embrechts & Maejima, 2007; Guegan, 2005; Lovejoy & Schertzer, 2013; Palma, 2007; Samorodnitsky, 2007, 2016).

Our review is structured as follows: Section 2 covers the basic ideas of scaling and estimation methods; section 3 provides empirical evidence of scaling in climatic time series; section 4 discusses applications of scaling such as trend detection, climate prediction, and climate sensitivity. We end with an outlook and open research questions in section 5.

2 Basic Concepts Related to Scaling Relationships

In this section we provide a brief review of the mathematical and physical background to scaling, with an emphasis on an intuitive understanding of the main ideas, leaving the details to the specialist literature.

2.1 Scaling and Power Laws

2.1.1 Scaling From Dimensional Analysis

In the physical sciences, scaling is a well-known and long established concept (Bolster et al., 2011; Longair, 2003; Watkins et al., 2016). For instance, scaling can be used to explain: (i) how a pendulum's angular frequency depends on its length or (ii) how the gravitational force between two bodies depends on their distance from one another.

In the first example, the angular frequency ω depends on the length l as

ω = (g/l)^(1/2),   (2)

where g is the gravitational acceleration. In the second example, Newton's law of universal gravitation states that the gravitational force, F, between two bodies with masses m1 and m2 is inversely proportional to the square of the distance between their centers, r:

F = G m1 m2 / r^2,   (3)

where G is the universal gravitational constant. The scaling property in both examples is so well established that it can be used to extrapolate and to test the behavior of systems outside their initial observable range. It can easily be seen that equations 2 and 3 are different forms of the power law from equation 1 with γ equal to −1/2 and −2, respectively.
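Equation 2 is a power law in l with exponent −1/2, which a log-log regression recovers exactly. The short check below is our illustrative example; the chosen pendulum lengths are arbitrary.

```python
import math

G_ACC = 9.81  # gravitational acceleration in m/s^2

def omega(length):
    """Pendulum angular frequency, equation 2: omega = sqrt(g / l)."""
    return math.sqrt(G_ACC / length)

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) versus log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    xbar = sum(lx) / len(lx)
    ybar = sum(ly) / len(ly)
    num = sum((x - xbar) * (y - ybar) for x, y in zip(lx, ly))
    den = sum((x - xbar) ** 2 for x in lx)
    return num / den

lengths = [0.1, 0.25, 0.5, 1.0, 2.0, 5.0]        # pendulum lengths in m
gamma = loglog_slope(lengths, [omega(l) for l in lengths])
print(round(gamma, 6))  # prints -0.5, the power law exponent of equation 2
```

Because equation 2 is an exact power law, the fitted slope equals γ = −1/2 up to floating-point error; with noisy observations the same regression would estimate γ.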

While originally a result of empirical observation, the above equations can also be derived from dimensional analysis. This embodies the physical principle of similarity, which requires that (natural) physical laws should be independent of the (human) physical units used to describe a system. According to Buckingham's Π theorem (Buckingham, 1914; Meinsma, 2019), any physical equation involving n variables can be rewritten using n−m dimensionless parameters, where m is the number of independent physical dimensions involved, thus revealing possible scaling relations which can then be empirically tested (Bolster et al., 2011).

Dimensional analysis remains a very powerful technique for systems which resist analytic or numerical treatment. The prime example is geophysical fluid turbulence. In 1941, Kolmogorov (Kolmogorov, 1991a, 1991b) derived a scaling relationship between the turbulent kinetic energy E and the horizontal scale, as measured by the wavenumber k, for isotropic turbulence. Thereby, he derived the Kolmogorov −5/3 spectrum (for details and underlying assumptions see, e.g., Vallis, 2017):

E(k) = C ε^(2/3) k^(−5/3),   (4)

where ε is the rate of energy dissipation and C a dimensionless constant.

While a power law distribution of the energy spectrum has been confirmed by observational evidence in the atmosphere (Nastrom & Gage, 1985; Straus & Ditlevsen, 1999), the exact exponent is still a matter of debate (Lovejoy et al., 2007; Lovejoy & Schertzer, 2013). For instance, Lovejoy et al. (2007) have shown that the atmosphere is anisotropic with different scaling exponents in the horizontal and vertical directions, which violates Kolmogorov's assumption of isotropy. Also, according to aircraft measurements, the exponent at large horizontal scales is −2.4 rather than the theoretical −5/3 (Lovejoy et al., 2009). This does not invalidate dimensional analysis but only shows that some of the underlying assumptions made by Kolmogorov in his first model (homogeneous and isotropic three-dimensional turbulence) describe an idealized system and are typically not valid in the real atmosphere or ocean, where vertical stratification, jet streams, and the presence of boundaries prevent full isotropy and homogeneity.

Another example of scaling is the addition of N random numbers, where the standard deviation of the sum scales as σ_N ∼ N^(1/2), a result familiar to all scientists from the undergraduate laboratory and the treatment of experimental errors (e.g., Wilks, 2011). Interestingly, this result can be connected to a physical situation by considering the root-mean-square displacement y_N from the origin after the first N steps of a random walk, one of the most basic stochastic models for a time series. In a typical one-dimensional discrete random walk, a particle starts at a given location, and each step moves it either to the left or to the right with equal probability. The resulting root-mean-square of the total displacement, y, after N steps scales with N^(1/2), which can also be expressed in terms of time, t, as

y_rms ∼ t^(1/2).   (5)

This describes the growth of the diffusing edge of a particle cloud executing Brownian motion (see Appendix B) (Bouchaud & Potters, 2003). The random walk model is statistically self-similar; that is, the time series generated by a random walk looks approximately the same as parts of it. In other words, the shapes and behaviors of the time series are independent of the time scale under consideration. Mathematically, statistical self-similarity can be written as

f(λt) =_d λ^(γ_SS) f(t),   (6)

which is equivalent to the scaling relationship in equation 1, where =_d denotes that both sides are equal in distribution. Here, γ_SS is the self-similarity parameter. In some processes, such as fractional Brownian motion, this is identical to the Hurst exponent H. The Hurst exponent is named after Harold Edwin Hurst, who first identified a scaling relationship when investigating the flow levels of the Nile river in connection with reservoir design (Doukhan et al., 2002; Hurst, 1951, 1957). He developed the R/S method (see Appendices C and D for details) to estimate the scaling exponent. A list of the exponents used here is given in Table 1.
Table 1. Table of Scaling Exponents

  Exponent | Name                                              | Relationship to other exponents
  γ        | general power law exponent                        |
  γ_SS     | self-similarity exponent                          |
  H        | Hurst exponent                                    | H = d + 1/2, where H measures long-range dependence
  α        | stability exponent                                |
  β        | power spectrum exponent from a stationary process | β := 2H − 1, where H measures long-range dependence
  d        | long-range dependence parameter                   | d = H − 1/2 for Gaussian processes
  τ(q)     | multifractal exponent/Rényi scaling exponent      |

  • Note. d is used in the statistics community in autoregressive fractional integrated moving average models. These models are asymptotically self-similar. H is used in the physical and climatological communities and can be a measure of long-range dependence or self-similarity in systems with Gaussian fluctuations. Here, we use H only as a measure of long-range dependence.
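The N^(1/2) growth of equation 5 is easy to verify numerically. The sketch below is our illustration (the ensemble size and step counts are arbitrary choices): it estimates the root-mean-square displacement of an ensemble of simple random walkers and fits the log-log slope, which should be close to 1/2.

```python
import random
import math

def rms_displacement(n_steps, n_walkers, rng):
    """Root-mean-square displacement of an ensemble of simple
    one-dimensional random walks after n_steps steps."""
    total = 0.0
    for _ in range(n_walkers):
        pos = 0
        for _ in range(n_steps):
            pos += rng.choice((-1, 1))   # left or right, equal probability
        total += pos * pos
    return math.sqrt(total / n_walkers)

rng = random.Random(42)
steps = [16, 64, 256, 1024]
rms = [rms_displacement(n, 2000, rng) for n in steps]

# Log-log slope of rms displacement vs number of steps; the random
# walk scaling of equation 5 predicts a slope close to 1/2.
lx = [math.log(n) for n in steps]
ly = [math.log(r) for r in rms]
xbar, ybar = sum(lx) / len(lx), sum(ly) / len(ly)
slope = (sum((x - xbar) * (y - ybar) for x, y in zip(lx, ly))
         / sum((x - xbar) ** 2 for x in lx))
```

Since E[y_N^2] = N exactly for this walk, the fitted slope converges to 0.5 as the ensemble grows.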

The range of problems we can handle with scaling analysis can be greatly broadened if we introduce the concept of fractals by considering scaling exponents γ which are nonrational. Just as in the integer or rational cases, there is physically instructive information in fractal exponents that can go beyond that from dimensional analysis. These nonrational exponents will play an important role from now on since they are necessary to describe the observed scaling in climate time series due to long-range dependence and heavy-tailed probability density functions (PDFs). They will be discussed in the following subsections.

2.2 Scaling in PDFs and Non-Gaussianity

2.2.1 Non-Gaussian but Stable PDFs

The central limit theorem states that the sum of N independent and identically distributed random variables with finite variance approaches a Gaussian distribution and scales as $N^{1/2}$ (von Storch & Zwiers, 2003; Wilks, 2011). However, many natural systems, for example, precipitation (Figure 1) (Peters et al., 2001, 2010; Yang, Franzke, & Fu, 2019) and the Greenland ice cores (Ditlevsen, 1999; Gairing et al., 2017; Peavoy & Franzke, 2010), show more erratic fluctuations; that is, the corresponding PDF decays much more slowly than the Gaussian distribution with the same mean and variance. Hence, such distributions have heavier tails than the corresponding Gaussian distribution, and very extreme events are much more likely than in the Gaussian world.

This behavior can be explained by the generalized central limit theorem (Sornette, 2006), a generalization of the standard central limit theorem (Wilks, 2011) which permits the random variables to have infinite variance; the sums of such random variables scale as $N^{1/\alpha}$ and follow α-stable distributions with 0<α ≤ 2 (Doukhan et al., 2002; Sornette, 2006; Samorodnitsky, 2016). For α=2 we recover the Gaussian case with finite variance. The central limit theorem expresses the fact that sums of random variables from short-tailed PDFs converge to a fixed point, that is, a Gaussian distribution which retains its shape and is therefore a stable distribution (Mantegna & Stanley, 1999). In the case of the generalized central limit theorem, there is a series of such fixed points which can be imagined as forming a line in the space of all possible distributions, with each point on the line corresponding to an exponent α in the range from 2 to 0. Hence, sums of random variables from heavy-tailed, power law PDFs converge to a power law distribution, the α-stable distribution, rather than to a Gaussian. In general, the α-stable PDFs have no analytic representation except via their characteristic functions, that is, the Fourier transform of the PDF p(x) (Gardiner, 2009). The symmetric α-stable distributions with α<2 have characteristic functions of the form $\varphi(k)=\exp(-\sigma^{\alpha}|k|^{\alpha})$, and p(s) decays asymptotically as a power law, $p(s)\sim s^{-(1+\alpha)}$ as $s\to\infty$ (Sornette, 2006). Furthermore, these power law distributions decay so slowly that for α<2 the variance does not exist, and for α ≤ 1 not even the mean exists. There is a corresponding random walk with α-stable increments, often called a "Lévy flight," whose root-mean-square displacement grows as $\sim t^{1/\alpha}$, which is referred to as superdiffusion (Gardiner, 2009).
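These properties are easy to demonstrate numerically. The following Python sketch (with illustrative parameter choices, not from the original text) compares the tails of Gaussian and Cauchy (α=1 stable) samples and shows that sample means of Cauchy variables do not concentrate, reflecting the $N^{1/\alpha}$ scaling of their sums:

```python
import numpy as np

rng = np.random.default_rng(42)
N = 100_000

gauss = rng.standard_normal(N)
cauchy = rng.standard_cauchy(N)  # alpha = 1 stable distribution

# Fraction of samples beyond 5 standard-normal "sigmas":
p_gauss = np.mean(np.abs(gauss) > 5.0)
p_cauchy = np.mean(np.abs(cauchy) > 5.0)
print(p_gauss, p_cauchy)  # heavy tail: p_cauchy is far larger

# Scaling of sums: for alpha-stable increments, sums of n terms grow
# like n**(1/alpha).  For Cauchy (alpha = 1) the *mean* of n samples is
# again standard Cauchy, so it does not concentrate as n grows:
means = rng.standard_cauchy((1000, 1000)).mean(axis=1)
print(np.median(np.abs(means)))  # stays O(1) instead of shrinking as n**-0.5
```

The persistence of O(1) fluctuations in the sample mean is the practical fingerprint of α ≤ 1: averaging more data does not improve the estimate.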

2.2.2 Other Non-Gaussian PDFs

The α-stable model is simple and, thus, economical but can have extremely wild fluctuations. The properties of observational data may motivate other models with fluctuations less extreme than in the α-stable model. In particular, the infinite variance of the α-stable model may yield tails that are heavier than observed. Thus, other non-Gaussian PDFs need to be considered, such as the stretched exponential, whose PDF is given by $p(x)\propto\exp(-x^{s})$ with $0<s<1$, or the log-normal distribution. Furthermore, heavy-tailed PDFs can also arise in extreme value statistics (Coles, 2001), which relies on the Fisher-Tippett-Gnedenko theorem (Coles, 2001), based on the maxima of independent and identically distributed sequences of random variables rather than on their sums as in the central limit theorem.

Unlike α-stable distributions, these distributions are not stable under addition, which means that sums of such variables converge toward the Gaussian distribution. For instance, a first-order autoregressive process $x_{t+1}=ax_{t}+\sigma\zeta_{t}$, where $\zeta_{t}$ is a standard Gaussian random variable so that the noise has variance $\sigma^{2}$, has a Gaussian stationary distribution for x. However, if $\zeta_{t}$ were log-normal, the stationary distribution of x would not be log-normal but asymptotically Gaussian. This suggests that nonlinear and multiplicative processes also need to be considered to explain the existence of power law PDFs. For instance, non-Gaussian distributions can be created by multiplicative processes, such as multiplying a state variable with Gaussian noise (Franzke, 2017; Majda et al., 2008, 2009; Sardeshmukh & Sura, 2009). Such multiplicative noise can create heavy-tailed distributions and occurs naturally in stochastic climate theory (Franzke et al., 2015; Franzke & O'Kane, 2017; Gottwald et al., 2017; Penland & Sardeshmukh, 2012; Sardeshmukh & Penland, 2015; Sura, 2011). The energy cascade in turbulence is a particularly important multiplicative physical model, as it describes the nonlinear interaction between different scales or waves (Vallis, 2017).
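The convergence toward Gaussianity under aggregation can be checked directly. The following Python sketch (the AR(1) coefficient and noise parameters are assumed, illustrative values) drives a first-order autoregressive process with strongly right-skewed log-normal noise and compares the skewness of the noise with that of the process, which is a weighted sum of many past noise values:

```python
import numpy as np

rng = np.random.default_rng(0)

def skewness(x):
    x = x - x.mean()
    return np.mean(x**3) / np.mean(x**2) ** 1.5

# Log-normal driving noise (strongly right-skewed):
n = 200_000
noise = rng.lognormal(mean=0.0, sigma=1.0, size=n)
print(skewness(noise))  # large and positive

# AR(1) process driven by that noise; its stationary distribution is a
# weighted sum of many past noise values and is therefore much closer
# to Gaussian than the noise itself:
a = 0.95
x = np.zeros(n)
for t in range(1, n):
    x[t] = a * x[t - 1] + noise[t]
print(skewness(x[1000:]))  # much smaller than the noise skewness
```

The residual skewness shrinks further as a approaches 1, because more past noise values are effectively summed.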

2.3 Long-Range Dependence

Long-range dependence is characterized by a slow, power law decay of the autocorrelation function. This implies that states from even the distant past still affect the current state; that is, states far apart in time remain dependent on each other.

The most basic long-range dependence model is fractional Brownian motion (see Appendix B). The main difference between fractional Brownian motion and regular Brownian motion is that in Brownian motion the increments are independent of each other, while in fractional Brownian motion the increments are dependent in time (Figure 6). This dependence extends over the whole past, which is why this model is sometimes said to exhibit long-term persistence or long memory (for H>0.5). There are different definitions of fractional Brownian motion, and we refer to the specialist literature for more details (e.g., Beran, 1994; Beran et al., 2013; Embrechts & Maejima, 2007; Lévy, 1953; Mandelbrot & Van Ness, 1968).

Figure 6. Example time series for fractional Brownian motion (fBm) and the corresponding fractional Gaussian noise (fGn; lower panel) for (a) H=0.7 (fGn is persistent), (b) H=0.5 (fGn is uncorrelated white noise), and (c) H=0.3 (fGn is antipersistent). The fractional Brownian motion has self-similarity exponent H and, if H is greater than 0.5, is long-range dependent, as in the H=0.7 case.
While fractional Brownian motion is a continuous-time process, the statistics literature prefers a more flexible, discrete-time model, the ARFIMA(p,d,q) process (e.g., Beran, 1994; Granger, 1978; Granger & Joyeux, 1980; Hosking, 1981):
$$\Phi(B)(1-B)^{d}X_{t}=\Psi(B)Z_{t},\quad(7)$$
where B denotes the backshift operator, $BX_{t}=X_{t-1}$, $B^{2}X_{t}=X_{t-2}$, and so on. The polynomials Φ and Ψ are defined as $\Phi(B)=1-\phi_{1}B-\cdots-\phi_{p}B^{p}$ and $\Psi(B)=1+\psi_{1}B+\cdots+\psi_{q}B^{q}$, where the integers p and q denote the order of the autoregressive (Φ) and moving average (Ψ) parts, respectively. The noise variables $Z_{t}$ are assumed to be independent Gaussian with zero mean and constant variance $\sigma_{Z}^{2}$. See Appendix A for more details.

However, the ARFIMA model can also be generalized to use α-stable distributed increments (Franzke et al., 2012; Graves et al., 2017; Kokoszka & Taqqu, 1994; Stoev & Taqqu, 2005). For these infinite variance models, no agreed-upon definition of long-range dependence exists (Samorodnitsky, 2016). Note that for d=0 the ARFIMA model reduces to the autoregressive moving average (ARMA) model, which is a short-range dependent (SRD) process. In general, ARFIMA models can also be driven by non-Gaussian (e.g., t-distributed) noise (Graves et al., 2017). ARFIMA models are more flexible than fractional Brownian motion since they combine a long-range dependence component with SRD behavior (Beran, 1994; Beran et al., 2013; Franzke et al., 2012; Graves et al., 2015). The R package arfima can be used to estimate ARFIMA models (Veenstra, 2012).
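A minimal way to simulate long-range dependent noise along these lines is truncated fractional integration of white noise, that is, an ARFIMA(0, d, 0) process. The following Python sketch (the truncation length and the value of d are assumed, illustrative choices) generates such a series and checks that its sample autocorrelations decay slowly:

```python
import numpy as np

def arfima0d0(n, d, rng, trunc=1000):
    """Generate an ARFIMA(0, d, 0) sample by truncated fractional
    integration: x_t = sum_k psi_k z_{t-k}, with psi_0 = 1 and
    psi_k = psi_{k-1} * (k - 1 + d) / k."""
    psi = np.ones(trunc)
    for k in range(1, trunc):
        psi[k] = psi[k - 1] * (k - 1 + d) / k
    z = rng.standard_normal(n + trunc)
    # 'valid' mode sums the psi window over the noise at each time step
    return np.convolve(z, psi, mode="valid")[:n]

def acf(x, lag):
    x = x - x.mean()
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)

rng = np.random.default_rng(1)
x = arfima0d0(50_000, d=0.3, rng=rng)

# Long-range dependence shows up as sample autocorrelations that are
# still clearly positive at large lags:
print(acf(x, 1), acf(x, 100))
```

For d=0.3 the theoretical lag-1 autocorrelation is d/(1−d) ≈ 0.43, and the autocorrelation decays like a power law $k^{2d-1}$ rather than exponentially.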

These are the two most important and widely used paradigmatic models of long-range dependence, but since they were not derived from basic physical laws, their use in climate research was originally, and continues to be, met with criticism (e.g., Klemes, 1974; Maraun et al., 2004; Mann, 2011). Long-range dependence also implies that even the most distant past still influences the current and future climate, which appears at odds with common intuition. Many geophysical equations of motion, such as the Navier-Stokes or the primitive equations, are Markovian; that is, their current state depends only on the immediately preceding state and not on states in the more distant past, and they contain no memory terms (Chorin & Hald, 2013; Gottwald et al., 2017; Mori, 1965; Zwanzig, 1973, 2001). This appears to be at odds with the observed (non-Markovian) long-range dependence of many climate time series and has led to much debate (Bunde et al., 2014; Cohn & Lins, 2005; Franzke, 2012; Maraun et al., 2004; Mann, 2011; Percival et al., 2001; Vyushin & Kushner, 2009). However, long-range dependence is frequently seen in time series from an aggregated system rather than in data from a single, unambiguous physical variable, and the apparent paradox may therefore be illusory: even Markovian systems can appear non-Markovian when the full system is not observed. We discuss possible physical mechanisms for this behavior in section 2.5.

2.4 Multifractals

In section 1, we discussed scaling in precipitation intensities and in temperature time series. For intensity fields as well as time series, there are notions of multifractality that generalize self-similar scaling.

Intensity fields in geophysics can have spatial characteristics that are consistent with random cascades (Kahane & Peyriere, 1976; Kantelhardt, 2009; Sornette, 2006). In such cascades, the intensity in a spatial region distributes nonuniformly between its smaller-scale subregions according to multiplicative processes. The simplest example is the binomial cascade introduced by Kahane (1985). This model originates in turbulence theory, as a rigorous analysis of the Kolmogorov-Obukhov model for spatial variability of the energy dissipation rate (Kolmogorov, 1962; Obukhov, 1962). The multiplicative chaos model (Riedi et al., 1999) is a modern version of the same idea.

The binomial cascade and the multiplicative chaos models define singular (nonsmooth) measures. By construction, the qth moments of the region-averaged intensities are power laws in spatial scale, with exponents that depend concavely on q. Consequently, the distributions of intensities between different spatial regions become increasingly leptokurtic with decreasing scale.
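The cascade construction can be sketched in a few lines of Python. The weight distribution below is an arbitrary illustrative choice; the construction only requires the sibling weights to be nonnegative and to sum to a constant so that mass is conserved exactly:

```python
import numpy as np

def binomial_cascade(levels, rng):
    """Multiplicative binomial cascade: at each level every cell splits
    in two, the children receiving random weights w and 2 - w, so the
    total mass is conserved exactly at every level."""
    mass = np.array([1.0])
    for _ in range(levels):
        w = rng.uniform(0.2, 1.8, size=mass.size)  # assumed weight law
        mass = np.column_stack((mass * w, mass * (2.0 - w))).ravel() / 2.0
    return mass

rng = np.random.default_rng(2)
mass = binomial_cascade(12, rng)
print(mass.size, mass.sum())     # 4096 cells, total mass 1
print(mass.max() / mass.mean())  # intermittency: a few cells dominate
```

Repeating the experiment with more levels makes the ratio of maximum to mean cell mass grow, which is the increasing leptokurticity with decreasing scale described above.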

A multifractal time series X(t) is one where the qth moment of an increment $|X(t+\Delta t)-X(t)|$ scales with the time lag Δt with an exponent ζ(q) that depends concavely on q. The scaling function ζ(q) is linear for self-similar processes.
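The scaling function can be estimated directly from these moments (structure functions). The following Python sketch (illustrative lag range) does so for ordinary Brownian motion, for which self-similarity implies the linear result ζ(q)=q/2:

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.cumsum(rng.standard_normal(200_000))  # Brownian motion, H = 0.5

def zeta(x, q, lags):
    """Fit the structure-function scaling exponent zeta(q) from
    mean(|X(t + dt) - X(t)|**q) ~ dt**zeta(q)."""
    logs = [np.log(np.mean(np.abs(x[lag:] - x[:-lag]) ** q)) for lag in lags]
    return np.polyfit(np.log(lags), logs, 1)[0]

lags = np.array([1, 2, 4, 8, 16, 32, 64])
for q in (1, 2, 3):
    print(q, zeta(x, q, lags))  # close to q/2: linear in q, monofractal
```

A genuinely multifractal series would instead show ζ(q) bending below the straight line for large q; as cautioned above, higher-order moments must be checked carefully before claiming such curvature.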

There are several ways to construct multifractal stochastic processes from multifractal measures. In most constructions, a multifractal intensity field on the time axis determines the amplitudes in the time series, analogous to how the energy dissipation rate determines the amplitude of velocity field fluctuations in turbulence theory.

For strictly concave scaling functions the distributions of increments are more leptokurtic on short time scales than on longer time scales. Consequently, all multifractal time series are non-Gaussian. The reverse implication does not hold. It is well known that unless one carefully verifies scaling of higher-order moments, standard techniques for estimation of multifractality can lead to spurious results for time series with non-Gaussian marginal distributions.

While multifractals are an abstract concept, they are useful for modeling time series with volatility clustering, where the serial correlations of large- and small-amplitude events differ.

Applications of multifractal models in climate science have been shown by Schmitt et al. (1995). More recently, Ashkenazy et al. (2003) analyzed climate data from the past 100 kyr and found evidence for nonlinearity and clustering of the magnitude of climatic changes, consistent with multifractality. Similar results have been found by Maslov (2014). Evidence of multifractal scaling in temperature, wind, and precipitation has been found by Baranowski et al. (2015), Gan et al. (2007), and Royer et al. (2008). See Appendix E for multifractal estimators.

2.5 Physical Scaling Mechanisms

Scaling, and particularly long-range dependence, is an actively discussed topic in climate research. There is no obvious physical mechanism in the climate system that would allow the distant past to directly affect the current state of the system. Since the equations of motion used in climate models are all usually Markovian and do not contain memory terms, how can we explain the presence of long-range dependence, and scaling, in the climate system?

2.5.1 Model Reduction

Long-range dependence can be explained using the Mori-Zwanzig formalism from statistical physics (Gottwald et al., 2017; Mori, 1965; Zwanzig, 1973, 2001) which rigorously demonstrates how model reduction leads to the emergence of memory terms in the reduced equations of motion. Let us consider the following example (Gottwald et al., 2017; Zwanzig, 2001):
$$\frac{dx}{dt}=L_{11}x+L_{12}y,\quad(8)$$
$$\frac{dy}{dt}=L_{21}x+L_{22}y,\quad(9)$$
where the $L_{ij}$ are constant parameters. If we are only interested in the dynamics of x, we can formally solve equation 9 for y,
$$y(t)=e^{L_{22}t}y(0)+\int_{0}^{t}e^{L_{22}(t-s)}L_{21}x(s)\,ds,\quad(10)$$
which we can now insert into equation 8:
$$\frac{dx}{dt}=L_{11}x(t)+L_{12}\int_{0}^{t}e^{L_{22}(t-s)}L_{21}x(s)\,ds+L_{12}e^{L_{22}t}y(0).\quad(11)$$

The first term is a Markovian term from the original equations, the second term is a memory term since it integrates over the past, and the last term is the initial condition which can be considered to be random. This example explicitly shows how one gets memory terms when looking only at parts of the full state vector. Equation 11 is still exactly equivalent to the original system.

Most of our measurements are point measurements or just measurements of a subset of the continuous fields. In either case, their dynamics stem from a low-dimensional system embedded in a climate system of infinite dimensions. The Mori-Zwanzig formalism shows that memory effects arise if only a small part of the full system is observed. Thus, long-range dependence could be a direct result of this observation. While the memory term in equation 11 is fairly general—which makes it impossible to know how exactly memory decays—a power law decay is a possibility, especially when making additional assumptions about the memory kernel. Kupferman (2004) approximated the memory kernel with a power law.
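The equivalence of the full and reduced formulations can be verified numerically. In the following Python sketch (the coefficients, initial conditions, and step size are arbitrary illustrative choices), the full Markovian system and the reduced memory equation are both integrated with a simple Euler scheme and the two solutions for x are compared:

```python
import numpy as np

# Assumed coefficients for the linear two-variable toy system:
L11, L12, L21, L22 = -0.5, 1.0, -1.0, -2.0
dt, n = 0.001, 5000
x0, y0 = 1.0, 0.5

# Full Markovian system for (x, y):
x = np.zeros(n); y = np.zeros(n)
x[0], y[0] = x0, y0
for t in range(n - 1):
    x[t + 1] = x[t] + dt * (L11 * x[t] + L12 * y[t])
    y[t + 1] = y[t] + dt * (L21 * x[t] + L22 * y[t])

# Reduced, non-Markovian equation for x alone:
# dx/dt = L11 x(t) + L12 int_0^t exp(L22 (t-s)) L21 x(s) ds
#         + L12 exp(L22 t) y(0)
xr = np.zeros(n)
xr[0] = x0
for t in range(n - 1):
    s = np.arange(t + 1) * dt
    memory = np.sum(np.exp(L22 * (t * dt - s)) * L21 * xr[:t + 1]) * dt
    xr[t + 1] = xr[t] + dt * (L11 * xr[t] + L12 * memory
                              + L12 * np.exp(L22 * t * dt) * y0)

print(np.max(np.abs(x - xr)))  # small: the two formulations agree
```

Note that integrating the reduced equation requires the entire history of x at every step, which is exactly the memory that the Markovian formulation hides in the unresolved variable y.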

2.5.2 Nonlinearity and Regimes

Lorenz put forward the idea that deterministic systems can be almost intransitive; that is, they can exhibit long-lasting climate changes and hence no unique climate state exists (Lorenz, 1968, 1976). Such long-term anomalies can be a form of scaling in that the variance increases with increasing time scale. Several studies have shown that nonlinearity can lead to scaling (Franzke et al., 2015; Lorenz, 1976; Mesa et al., 2012). Atmospheric circulation regime behavior, a main component of the climate system (Feldstein & Franzke, 2017; Franzke, 2013; Franzke et al., 2011; Hannachi et al., 2017; Ghil & Robertson, 2002; Nicolis, 1990), has been suggested as a prime candidate for scaling (Franzke et al., 2015). An example of atmospheric circulation regimes is given by the quasi-stationary circulation systems like blocking events, which are quasi-stationary high-pressure systems that can last for weeks and cause heat waves and cold spells (Feldstein & Franzke, 2017; Hannachi et al., 2017). It has been shown for very long but finite time series that regime behavior is a plausible mechanism for scaling because the residence times of the regimes are power law distributed (Diebold & Inoue, 2001; Franzke et al., 2015). The residence time is the time the system stays in one regime state. If these time intervals are power law distributed, then the system can exhibit long-range dependence. This implies that memory effects in the climate system may not be needed to explain the apparent scaling of variance with time scale. The origin of this scaling has been found to be associated with the coarse graining of the dynamics into a finite number of specific regimes, leading to non-Markovian dynamics (Nicolis, 1990; Nicolis & Nicolis, 1988, 1995; Nicolis et al., 1997; Vannitsem, 2001).
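The residence-time mechanism is easy to illustrate. The following Python sketch (the Pareto tail index is an assumed illustrative value) builds a two-state regime process whose residence times have infinite variance and shows that the variance of block means decays much more slowly than the 1/m behavior of an uncorrelated process:

```python
import numpy as np

def regime_series(n, tail_index, rng):
    """Two-state (+1/-1) process whose residence times are Pareto
    distributed; for tail_index < 2 the residence times have infinite
    variance, which induces long-range dependence."""
    x = np.empty(n)
    pos, state = 0, 1.0
    while pos < n:
        stay = int(rng.pareto(tail_index)) + 1
        x[pos:pos + stay] = state
        pos += stay
        state = -state
    return x

def block_var(x, m):
    k = x.size // m
    return x[:k * m].reshape(k, m).mean(axis=1).var()

rng = np.random.default_rng(4)
x = regime_series(200_000, tail_index=1.5, rng=rng)

# For an uncorrelated process, block-mean variance falls like 1/m, so
# m * block_var would stay flat; here it grows with m:
for m in (10, 100, 1000):
    print(m, m * block_var(x, m))
```

The growth of m·Var with block size m is precisely the scaling of variance with time scale discussed above, produced without any explicit memory term.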

Recent model experiments suggest also another possible nonlinear mechanism that could explain long-range dependence: the coupling of the atmosphere with other components of the climate system that have very different characteristic time scales. A case in point is ocean-atmosphere coupling, for which a reduced order nonlinear coupled model has been developed recently (De Cruz et al., 2016; Vannitsem et al., 2015). This model employs the quasi-geostrophic equations to describe the large-scale dynamics of the atmosphere and oceans in extratropical regions. The coupling is achieved via an energy balance scheme and momentum transfer through wind stress.

Multiple scaling regimes were found (Figure 7) using a Haar wavelet analysis (see Appendix C). Remarkably, for small friction coefficients no low-frequency variability was found in the coupled model; the moments peak at a scale of about 10 days and decrease for longer periods. By low-frequency variability, we mean a set of long-periodic, attracting orbits that couple the dynamical modes of the ocean and the atmosphere in this model. If low-frequency variability develops in the system, additional peaks emerge at 10,000 and 40,000 days. Similar to Lovejoy (2015b), this allows us to define different regimes based on the respective scaling exponents. The structure of the low-frequency variability and long-range dependence depends critically on the water depth (Figures 7b and 7c). This suggests that one plausible explanation of observed scaling regimes lies in the coupling of climate subcomponents. We will further discuss this coupling mechanism in a linear framework next.

Figure 7. (a) First and second moments, q=1,2, of the first mode of the stream function field as a function of time scale for a wind stress drag coefficient C=0.010 kg·m−2·s−1 and ocean layer depth h=164.8 m. (b) As in (a) but for C=0.015 kg·m−2·s−1 and h=164.8 m. (c) As in (b) but for C=0.015 kg·m−2·s−1 and h=41.2 m.

2.5.3 Superposition of Linear SRD Models and Linear Response

Another plausible scaling mechanism is the superposition of SRD models such as first-order autoregressive process models (Granger, 1980). This approach assumes that each climate subcomponent (atmosphere, ocean, land, cryosphere, etc.) evolves according to some SRD process. The superposition of those climate subcomponent processes can result in scaling and long-range dependence behavior (Granger, 1980). The plausibility of this hypothesis has been confirmed by the linear response in energy balance models (Fredriksen & Rypdal, 2017). Linear model types include the vertical diffusion model of Fraedrich et al. (2004) for the ocean temperature. With two layers the model produces a 1/f spectral range in the mixed layer temperature for a white noise surface forcing.
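The aggregation mechanism of Granger (1980) can be illustrated without any simulation. For a sum of independent, equal-variance AR(1) processes, the autocorrelation at lag k is the average of the individual $a_i^k$; with a broad (here, illustratively uniform) spread of coefficients, this average decays roughly like 1/k, a power law, even though every single component decays exponentially:

```python
import numpy as np

# Autocorrelation of a sum of independent, equal-variance AR(1)
# processes with coefficients a_i is rho(k) = mean_i(a_i ** k).
a = np.linspace(0.005, 0.995, 100)  # assumed spread of time scales
for k in (1, 10, 100, 1000):
    mixture = np.mean(a ** k)       # close to 1/(k + 1) for moderate k
    single = 0.9 ** k               # one representative AR(1)
    print(k, mixture, single)
```

The slowly decaying components dominate the large-lag behavior, which is why a continuum of fast processes can masquerade as long-range dependence.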

Another example is the Pacific Decadal Oscillation (Mantua & Hare, 2002), which also shows strong long-range dependence (Yuan et al., 2014) and variability on interannual to multidecadal time scales. The Pacific Decadal Oscillation is not thought of as a single physical mode of variability; rather, it is the aggregation of several different physical processes, such as El Niño–Southern Oscillation teleconnections, sea surface temperature reemergence, and stochastic atmospheric forcing (Newman et al., 2003, 2016; Qiu et al., 2007; Schneider & Cornuelle, 2005; Vimont, 2005). Hence, the Pacific Decadal Oscillation is an imprint of scaling in the climate system rather than its cause.

On the one hand, this superposition mechanism is physically plausible; on the other hand, from a statistical point of view it requires the estimation of many parameters. Hence, from a model selection point of view, which favors an economical model with as few parameters as necessary over more complex models (Occam's razor principle) (Burnham & Anderson, 2003), the scaling models are preferable. This does not mean that they are the best representation of the underlying dynamics. In practice, one therefore has to decide whether the aim is to better understand the physical processes behind a phenomenon or to obtain an efficient and skillful statistical model, for example, for prediction purposes.

2.5.4 Non-Gaussianity and Multiplicative Noise

As discussed above, scaling can also arise from the distribution of the increments or the driving noise in a stochastic process. So far, we only discussed scaling in additive noise processes which in addition may have heavy tails. Also, Gaussian noise can produce power law PDFs when it occurs in a multiplicative or state-dependent process (Bódai & Franzke, 2017; Franzke, 2017; Majda et al., 2009; Penland & Sardeshmukh, 2012; Sornette, 2006; Sardeshmukh & Sura, 2009; Sura & Hannachi, 2015). The simplest multiplicative noise process is the Kesten process (Sornette, 2006), a first-order autoregressive process model with random coefficients:
$$X_{n+1}=a_{n}X_{n}+b_{n},\quad(12)$$
where $a_{n}$ and $b_{n}$ are independent random variables. Under certain conditions, the stationary distribution of the Kesten process has power law tails, that is,
$$P(X>x)\sim x^{-\gamma},\quad(13)$$
where γ is the power law exponent.
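A short simulation illustrates how power law tails emerge from purely Gaussian and uniform ingredients. In the Python sketch below, the coefficient distribution is an assumed illustrative choice satisfying E[log a_n] < 0, so the process is stationary, while still allowing occasional amplification with a_n > 1:

```python
import numpy as np

rng = np.random.default_rng(5)

# Kesten process x_{n+1} = a_n x_n + b_n with random coefficients.
# E[log a_n] < 0 keeps the process stationary, but intermittent
# excursions with a_n > 1 amplify x and generate power law tails.
n = 300_000
a = rng.uniform(0.1, 1.5, size=n)  # assumed law with E[log a] < 0
b = rng.standard_normal(n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = a[t] * x[t - 1] + b[t]
x = x[1000:]  # discard transient

# Excess kurtosis far above the Gaussian value of 0 signals heavy tails:
z = (x - x.mean()) / x.std()
print(np.mean(z ** 4) - 3.0)  # large and positive
```

With this coefficient law the tail exponent γ (the solution of E[a^γ]=1) is small enough that the fourth moment is borderline, so the sample kurtosis is strikingly non-Gaussian despite the Gaussian additive noise.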

Stochastic climate theory predicts the presence of multiplicative noise in nonlinear systems (Franzke et al., 2015, 2019, 2005; Franzke & Majda, 2006; Franzke, 2017; Gottwald et al., 2017; Majda et al., 1999, 2001, 2008, 2009; Penland & Sardeshmukh, 2012; Sardeshmukh & Sura, 2009; Sura & Hannachi, 2015). It can also be shown that multiplicative noise leads to power laws over some ranges in stochastic climate models (Majda et al., 2009; Sardeshmukh & Sura, 2009; Sura & Hannachi, 2015). Unlike power law processes, stochastic climate theory also provides mechanisms to limit extremes. This power law roll-off is due to the same nonlinear interaction that causes the multiplicative noise in the first place: the nonlinear interaction between slow and fast components (Franzke et al., 2005; Franzke & Majda, 2006; Majda et al., 1999, 2001, 2008, 2009; Sardeshmukh & Sura, 2009; Sura & Sardeshmukh, 2008). This can be understood as follows: The fast components of the flow, for example, convection, synoptic-scale weather systems, are effectively serially uncorrelated on the time scale of the slow components, for example, Rossby waves or the ocean. This time scale separation allows us to treat the fast components effectively as a noise variable. While there are nonlinear interactions between the slow and the fast components in the climate system, this can be written now as a product of the slow flow variable and a noise variable, that is, multiplicative noise, also called state-dependent noise since the impact of the noise can be modulated by the state of the slow variable. This is consistent with the findings of Sardeshmukh and Sura (2009) where they found evidence in global circulation model simulations that multiplicative noise is due to turbulent adiabatic fluxes and not rapid diabatic forcing fluctuations. 
An example is wind gusts: If the large-scale wind speed is low, then there are only weak wind gusts; on the other hand, if the large-scale wind speed is high, also, the wind gusts are strong. This behavior can be easily represented by a multiplicative noise where the wind gusts are computed by the product of the large-scale wind speed and a noise. The relevance of multiplicative noise has been shown for sea surface temperature variability (Sura & Sardeshmukh, 2008), atmospheric vorticity variability (Sardeshmukh & Sura, 2009), teleconnection patterns such as the North Atlantic Oscillation (Majda et al., 2009; Önskog et al., 2019), and extreme events (Franzke, 2017; Penland & Sardeshmukh, 2012; Sura, 2013).

While theoretical considerations, for example, the generalized central limit theorem (Sornette, 2006), predict a power law, our climate system is of finite size, and infinitely large events cannot occur; the power laws must therefore be cut off or roll off at some intensity or spatial size. This is also consistent with the dynamical systems theory of extremes (Lucarini et al., 2016), which shows that pure power law dynamics cannot occur at arbitrarily large intensities or sizes.

2.5.5 Nonstationarities

While Hurst (1951) was the first to discover scaling in natural time series, Kolmogorov (1940), Lamperti (1962), Mandelbrot (1965), and Mandelbrot and Wallis (1968) developed the first mathematical long-range dependence models (see above) to explain such behavior (Graves et al., 2017). From the outset, the long-range dependence concept was controversial, especially in hydrology (Klemes, 1974). Klemes argued that long-range dependence can be caused by nonstationarities and by random walks with an absorbing boundary. The latter is mostly relevant for natural storage systems but less so for the climate system and will therefore not be discussed here. Klemes further argued that long-range dependence is only an apparent effect and that there is no real memory in the climate system. While it is easy to construct nonstationary models exhibiting long-range dependence (Klemes, 1974), they raise deep philosophical questions about how the climate system is modeled. In general, all models of natural systems are assumed to have fixed parameters stemming from the underlying physical laws, and all apparent nonstationarities would be the result of nonlinearities in the underlying equations of motion or of changes in external forcing (e.g., greenhouse gas emissions and Milankovitch cycles). One could design nonstationary climate models by introducing random jumps in model parameters, which would lead to shifts in the mean state, as proposed for hydrology by Klemes (1974). For instance, the inclusion of volcanic activity, which is very intermittent, improves the scaling behavior of climate simulations (Vyushin et al., 2004). While the success and skill of current numerical weather and climate predictions show the usefulness of the stationarity assumption, the question remains unresolved whether nonstationary models could provide a viable alternative.

2.5.6 Self-Organized Criticality

Self-organized criticality (SOC) may be another possible mechanism behind scaling (Bak et al., 1987; Bak, 1996; Watkins et al., 2016). SOC refers to a process driven by a slow and constant energy input that leads to sudden bursts without any typical scale. Hence, the statistics of an SOC process are described by power laws (Hergarten, 2003).

Peters et al. (2001), Peters and Christensen (2006), and Peters and Neelin (2006) used SOC to explain the observed scaling of precipitation. The atmosphere receives energy from evaporation due to solar radiation. The water vapor is stored in the atmosphere until a dynamical threshold (saturation) is reached, at which point energy bursts out; that is, it rains and latent heat is released. These burst events have no typical scale and are a possible explanation of the observed power law behavior of the tail of the PDF of precipitation event sizes and durations.
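The classic sandpile model of Bak et al. (1987) makes the SOC mechanism concrete. The following Python sketch (grid size and number of added grains are illustrative choices) drives the pile one grain at a time and records avalanche sizes, which span many orders of magnitude despite the fixed, minimal drive:

```python
import numpy as np

def sandpile(size, drops, rng):
    """Bak-Tang-Wiesenfeld sandpile: add grains at random sites; any
    site holding 4 or more grains topples, sending one grain to each
    neighbor (grains fall off the open boundary).  Returns the number
    of topplings per added grain (the avalanche size)."""
    grid = np.zeros((size, size), dtype=int)
    sizes = []
    for _ in range(drops):
        grid[rng.integers(size), rng.integers(size)] += 1
        topples = 0
        while True:
            unstable = np.argwhere(grid >= 4)
            if unstable.size == 0:
                break
            for i, j in unstable:
                grid[i, j] -= 4
                topples += 1
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < size and 0 <= nj < size:
                        grid[ni, nj] += 1
        sizes.append(topples)
    return np.array(sizes)

rng = np.random.default_rng(8)
sizes = sandpile(size=20, drops=10_000, rng=rng)
# Broad, scale-free avalanche statistics: many drops cause nothing,
# while rare drops trigger system-spanning avalanches.
print(sizes.max(), np.mean(sizes == 0))
```

The analogy with the precipitation argument above is direct: grains correspond to slowly accumulating water vapor, and avalanches to rain events released once a local threshold is crossed.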

Another potential mechanism for power laws is the highly optimized tolerance framework (Carlson & Doyle, 1999, 2000, 2002), which relates power laws to evolving structures. However, this framework was developed for biological and engineering systems, and how well it applies to the climate system remains to be examined. A recent application was to ecosystems and wildfires (Moritz et al., 2005).

2.5.7 Scaling via Turbulent Cascades

While the above approaches apply to the time domain and aim to explain the presence of long-range dependence in the climate system or intensity distributions, we now discuss a theory to explain the existence of scaling in the space domain. We focus on energy spectra, that is, on how energy is distributed with spatial scale.

At the largest scales, the atmosphere is forced in a quasi-steady manner by the gradient in solar heating between the equator and the poles, which leads to a meridional temperature gradient. The corresponding energy flux is represented by the nonlinear terms in the equations of motion used in coupled atmosphere-ocean models. The nonlinear interactions between different spatial scales cause large eddies to break up into smaller "daughter eddies," transferring their energy fluxes to ever smaller scales (Vallis, 2017) until viscosity dissipates the energy as heat.

This process can be modeled by cascade models. In the first cascade models, the parent eddies were typically large cubes that produced smaller daughter cubes of half the parent's diameter (Schertzer & Lovejoy, 1987). Then, for each daughter, one flips a coin to decide how the energy flux from the parent eddy is transferred to the daughter. This can be done so that some daughter eddies occasionally receive zero energy, while others have their fluxes multiplicatively boosted to conserve the total energy (Frisch et al., 1978; Mandelbrot, 1974; Novikov & Stewart, 1964). The outcome of these cascades is power laws for the distribution of energy with spatial scale (Nastrom & Gage, 1985; Straus & Ditlevsen, 1999; Vallis, 2017). These are qualitatively consistent with the theoretical power law spectra predicted by Kolmogorov (1991a, 1991b), as discussed above.

2.6 Estimation Methods for Scaling Exponents

A multitude of estimators have been developed over the years to provide accurate estimates of the scaling exponents, and different estimators infer different aspects of the scaling properties. For instance, most estimators infer the long-range dependence parameter d or the Hurst exponent H of a time series and are insensitive to non-Gaussianity of its amplitudes, which can cause them to differ from the self-similarity parameter γSS. When deciding whether or not to use a particular estimator, one should always be aware of the underlying assumptions that went into its construction.

2.6.1 Estimation of the Power Law Exponent

Recognizing the existence of power law tails and estimating the corresponding tail parameter, or scaling exponent, of power law PDFs are important topics. Clauset et al. (2009) provide a review of this topic and carefully explain the potential pitfalls. First, it is important to realize that true power laws can be hard to identify and that simple regression approaches can lead to false positive identifications (Clauset et al., 2009). Clauset et al. (2009) recommend the use of a maximum likelihood estimator; code for power law estimation in the statistical programming language R is available online (http://tuvalu.santafe.edu/~aaronc/powerlaws/plfit.r). They also show that the widely used least squares regression approach can lead to inaccurate estimates and cannot answer the question of whether the data obey a power law decay at all. Gerlach and Altmann (2019) propose a different way to identify power laws using shuffling and undersampling of the data; this approach leads to fewer rejections and larger confidence intervals than the Clauset et al. (2009) approach, and potentially to more false positive identifications. While those studies are mostly concerned with power law tails of PDFs, the maximum likelihood approach can also be used to estimate the long-range dependence parameter, as well as the parameters of other distributions such as the generalized extreme value, stretched exponential, or log-normal distributions. Most of these distributions can be fitted with standard functions or packages included in the statistical software R.
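The continuous-data maximum likelihood estimator of Clauset et al. (2009) is simple enough to state in full: for n samples with $x_i \geq x_{\min}$, $\hat{\alpha} = 1 + n\,[\sum_i \ln(x_i/x_{\min})]^{-1}$. The following Python sketch applies it to synthetic power law data (the exponent, threshold, and sample size are illustrative choices):

```python
import numpy as np

def powerlaw_mle(x, xmin):
    """Continuous maximum likelihood estimator of the power law
    exponent alpha for p(x) ~ x**(-alpha), x >= xmin
    (Clauset et al., 2009): alpha = 1 + n / sum(log(x / xmin))."""
    x = x[x >= xmin]
    return 1.0 + x.size / np.sum(np.log(x / xmin))

# Synthetic power law data with alpha = 2.5, xmin = 1, drawn by
# inverting the tail CDF P(X > x) = (x / xmin)**(-(alpha - 1)):
rng = np.random.default_rng(6)
alpha_true, xmin = 2.5, 1.0
u = 1.0 - rng.uniform(size=100_000)  # uniform in (0, 1]
x = xmin * u ** (-1.0 / (alpha_true - 1.0))

print(powerlaw_mle(x, xmin))  # close to 2.5
```

In practice, x_min must itself be selected from the data (e.g., by minimizing a Kolmogorov-Smirnov distance, as Clauset et al. recommend), which is the step the simple regression approaches neglect.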

Extreme value statistics also provides methods to estimate the tail exponent of distributions (e.g., Beirlant et al., 2006; Coles, 2001; Embrechts et al., 2013). These methods fit an extreme value distribution, either the generalized extreme value or the generalized Pareto distribution; both can have either a power law or an exponential decay of their tail. Gilleland and Katz (2016) provide an R package for the estimation of extreme value distributions.
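As an illustration of the generalized Pareto route, the sketch below fits the distribution to threshold excesses with a simple method-of-moments estimator (packages such as the one by Gilleland and Katz use maximum likelihood instead; the moment formulas below assume shape ξ < 0.5, and all parameter values are illustrative):

```python
import numpy as np

def gpd_moments(excesses):
    """Method-of-moments estimates for a generalized Pareto distribution
    fitted to threshold excesses (valid for shape xi < 0.5):
    xi > 0 gives a power law tail with tail index 1/xi; xi = 0 gives an
    exponential tail."""
    m, v = np.mean(excesses), np.var(excesses)
    xi = 0.5 * (1.0 - m * m / v)
    sigma = 0.5 * m * (m * m / v + 1.0)
    return xi, sigma

# synthetic GPD excesses via the inverse CDF: x = sigma/xi * ((1-u)^(-xi) - 1)
rng = np.random.default_rng(1)
xi_true, sigma_true = 0.1, 1.0
u = rng.uniform(size=200_000)
x = sigma_true / xi_true * ((1.0 - u) ** (-xi_true) - 1.0)
xi_hat, sigma_hat = gpd_moments(x)
print(xi_hat, sigma_hat)  # close to (0.1, 1.0)
```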

While direct confirmation of power laws can be difficult, it is advisable to perform a model selection exercise in which different models, for example, a power law and a stretched exponential, are fitted to the data and the best-fitting model is then selected using an objective criterion. Burnham and Anderson (2003) discuss systematic approaches to model selection.
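A minimal model selection exercise of this kind can be sketched by comparing Akaike's information criterion (AIC) for two one-parameter tail models fitted by maximum likelihood to the same tail sample; the candidate models chosen here (power law versus exponential) are purely illustrative:

```python
import numpy as np

def aic_powerlaw(x, xmin):
    """AIC of a continuous power law fitted by MLE to x >= xmin (one parameter)."""
    t = np.log(x[x >= xmin] / xmin)
    n = t.size
    alpha = 1.0 + n / t.sum()
    loglik = n * np.log((alpha - 1.0) / xmin) - alpha * t.sum()
    return 2 * 1 - 2 * loglik

def aic_exponential(x, xmin):
    """AIC of an exponential fitted by MLE to the excesses x - xmin (one parameter)."""
    y = x[x >= xmin] - xmin
    n = y.size
    rate = 1.0 / y.mean()
    loglik = n * np.log(rate) - rate * y.sum()
    return 2 * 1 - 2 * loglik

rng = np.random.default_rng(7)
pareto = (1.0 - rng.uniform(size=20_000)) ** (-1.0 / 1.5)  # power law, alpha = 2.5
expo = rng.exponential(size=20_000) + 1.0                  # exponential tail

# the lower AIC should pick the generating model in each case
print(aic_powerlaw(pareto, 1.0) < aic_exponential(pareto, 1.0))
print(aic_exponential(expo, 1.0) < aic_powerlaw(expo, 1.0))
```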

2.6.2 Estimation of the Long-Range Dependence Parameter

When analyzing and modeling the temporal dependence of long-range dependent time series, it is important to accurately estimate the strength of long-range dependence. This can be achieved by determining the Hurst exponent, H, or the fractional integration parameter, d, arising from the ARFIMA class of processes (Box et al., 2015; Franzke et al., 2012; Franzke, 2017; McNeil et al., 2015). H is more commonly used in physics, while d is preferred by the statistics community (Appendix D). In principle, one could also analyze the autocorrelation function and search for a power law decay at large time lags. However, the estimation of the autocorrelation function suffers from sampling errors. The estimators described below provide more robust estimates of the scaling exponent than the autocorrelation function.

As emphasized by Samorodnitsky (2016), most definitions of long-range dependence in the literature are based on the second-order (or variance) properties of the process which include the asymptotic behavior of covariances, spectral density, and variances of partial sums. Second-order properties are popular choices because they are conceptually simple and they can be easily estimated from data. However, as noted by Samorodnitsky (2016) this complicates the definition of long-range dependence when a process has infinite variance, because theoretical moments and sample moments will behave in strikingly different ways.

Many methods are used to estimate the value of the Hurst exponent. They can be broadly divided into time domain methods and frequency domain methods. Time domain methods include variance-type estimators (Giraitis et al., 1999; Taqqu et al., 1995), the rescaled range or R/S statistic (Bhattacharya et al., 1983; Mandelbrot & Taqqu, 1979), least squares regression using subsampling (Higuchi, 1990), and the variance of residuals estimators (Peng et al., 1994). Frequency domain estimators include Whittle estimators (Dahlhaus, 1989; Fox & Taqqu, 1986) and connections to Fourier spectrum decay (Geweke & Porter-Hudak, 1983; Lobato & Robinson, 1996). Wavelet-based regression approaches have also been considered (Abry et al., 2000, 1995; Percival & Guttorp, 1994). Extensions of wavelet estimators to other settings such as observational noise and irregularly spaced time series have been published (Gloter & Hoffmann, 2007; Knight et al., 2016; Stoev et al., 2006). Other works about long-range dependence estimation including multiscale approaches are Hsu (2006) and Coeurjolly et al. (2014).

For technical details of the most common estimators see Appendix C. A more detailed comparison of different estimators and a discussion of their strengths and weaknesses is given by, for example, Faÿ et al. (2009), Franzke et al. (2012), Schmittbuhl et al. (1995), and Witt and Malamud (2013).
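Since DFA recurs throughout this review, a bare-bones first-order version (DFA1) is sketched below; production analyses typically use higher detrending orders (e.g., DFA2) and a more careful choice of scales:

```python
import numpy as np

def dfa(x, scales):
    """First-order detrended fluctuation analysis (DFA1).
    Returns the fluctuation function F(s) and the fitted exponent H."""
    y = np.cumsum(x - np.mean(x))          # profile (integrated series)
    F = []
    for s in scales:
        n_seg = len(y) // s
        segs = y[:n_seg * s].reshape(n_seg, s)
        t = np.arange(s)
        res = []
        for seg in segs:                   # remove a linear trend per window
            a, b = np.polyfit(t, seg, 1)
            res.append(np.mean((seg - (a * t + b)) ** 2))
        F.append(np.sqrt(np.mean(res)))
    H = np.polyfit(np.log(scales), np.log(F), 1)[0]
    return np.array(F), H

rng = np.random.default_rng(0)
scales = np.array([16, 32, 64, 128, 256])
_, H_white = dfa(rng.standard_normal(2**14), scales)
print(H_white)  # close to 0.5 for white (memory-less) noise
```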

2.7 Spectral Analysis and Its Limitations

Analyzing data by Fourier decomposition is attractive because each sinusoid has an exact and unambiguous time scale: its period. Furthermore, one has to assume that the signal is statistically stationary in time or homogeneous in space for Fourier analysis to be valid. This implies that the probability laws that govern the behavior of the time series do not change over time. As a result, the phases of the Fourier modes are randomly distributed and can be neglected. However, this only applies if the time series is Gaussian because the phases are correlated otherwise; a fact that is often overlooked. If observational data really had to meet these criteria fully, Fourier analysis would not be as useful as it has proven to be, so in practice these conditions can be relaxed to some extent.

A potential problem with Fourier analysis is that the interpretation of the resulting spectra is nontrivial. To illustrate this problem, we consider surrogate time series which have the same multifractal characteristics as the EPICA dust series (Lambert et al., 2008; Lovejoy, 2018). Since the time series are generated by a multifractal stochastic process, even the strong peaks are randomly excited (Figure 8). However, the presence of such strong peaks can be misinterpreted as signatures of important climatic processes (Lovejoy, 2018). This example should also be taken as a cautionary tale against the simple interpretation of peaks in Fourier analysis as quasi-oscillations resulting from physical mechanisms. The important point we want to emphasize here is not that these methods do not work but rather that their resulting spectra need to be carefully interpreted and the model representing the underlying process carefully chosen.
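This caution can be illustrated even with pure white noise: periodogram ordinates are asymptotically independent exponential variables, so the largest of many ordinates towers above the mean spectral level purely by chance. A minimal sketch (sample size illustrative):

```python
import numpy as np

# Periodogram of pure white noise: the ordinates are (asymptotically)
# independent and exponentially distributed, so the largest of m ordinates
# is on the order of log(m) times the mean level -- a prominent "peak"
# with no oscillatory mechanism behind it.
rng = np.random.default_rng(3)
x = rng.standard_normal(4096)
spec = np.abs(np.fft.rfft(x))[1:] ** 2
ratio = spec.max() / spec.mean()
print(ratio)  # far above 1, even though nothing in the signal oscillates
```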

Time series (lower left) and spike plot (upper left) of a 2,048-point multifractal time series simulation. Its corresponding power spectrum is displayed on the right (black line). In the spike plots the horizontal dashed lines correspond to the 10−5 and 10−10 probability levels. The blue curve in the power spectrum plot is the spectrum averaged over 5,000 multifractal simulations. Above the blue curve are an orange 2-standard-deviation curve and red 3- and 4-standard-deviation curves (probabilities of 0.1% and 0.003%, respectively). The arrows indicate spikes with Gaussian probability p<0.05. See Lovejoy (2018) for more details of the multifractal model used. Figure from Lovejoy (2018).

3 Empirical Evidence of Scaling

The atmospheric near-surface temperature is a relevant indicator for climatic long-range dependence. Instrumental measurements reach back to the seventeenth century (Central England Temperature) (Parker et al., 1992) and, during the last 50 to 100 years, densely cover inhabited areas and ocean areas near ship routes. Furthermore, temperature is reconstructed using statistical relationships with proxy data for time horizons of thousands up to a million years. In all these data, a continuous background of variability shows up, which is parsimoniously described by power laws. In all these data sets, long-range dependence is considered to be present if power law scaling reaches the longest time scale observed (in contrast to the stringent mathematical definition, which requires the limit of infinite time scales).

All observations and especially climate reconstructions based on proxy data are subject to nonclimatic influences such as measurement errors or imperfect recording of the climate signal. As this can also affect the scaling (e.g., Franzke et al., 2012; Rust et al., 2008), it is important to consider these uncertainties in order to infer useful information about the scaling of climate variability.

3.1 Station Temperatures

Significant evidence for long-range dependence in station temperatures has been reported in many studies (Bunde et al., 2014; Capparelli et al., 2013; Fraedrich & Blender, 2003; Franzke, 2010, 2012; Gao & Franzke, 2017; Graves et al., 2015; Koscielny-Bunde et al., 1998; Ludescher et al., 2015; Løvsletten & Rypdal, 2016, 2015). The exponent H=0.65 of the fluctuation function F(τ)∼τH determined by DFA was suggested to be universal for local temperature variations. This exponent is related to the exponent β=0.3 in a power spectrum, S(f)∼f−β, by β=2H−1. Figure 9 shows the fluctuation functions obtained by DFA (see Appendix C3) for 12 meteorological stations distributed worldwide. In the interannual range (up to the total duration) the stations reveal linear slopes given by an exponent H≈0.65. It is tempting to conclude that atmospheric long-range dependence could be characterized by a single universal value. Later on, it was found that the value H=0.65 might be a transition phenomenon between memory-less inner continents with H=0.5 and some oceanic areas with H≈1 (Fraedrich & Blender, 2003; Ludescher et al., 2015; Løvsletten & Rypdal, 2016).

Detrended fluctuation analysis fluctuation functions F(τ) for daily temperature data at the indicated stations. The lines show the exponents (slopes) H=0.65. Dashed vertical lines at 1 year and 15 years indicate the time range denoted as decadal here.

3.2 Near-Surface Temperatures on Land and Ocean

Observed near-surface temperatures are available since 1900 with a sufficient density to estimate a global pattern of long-range dependence (Jones, 1994; Jones et al., 2001; Parker et al., 1995). Figure 10 shows the power law exponents H for the monthly data estimated by DFA with quadratic detrending (DFA2). The results are concentrated in coastal regions, Europe, North America, and along ship routes. In inner continental locations, long-range dependence is negligible with H≈0.5. Along the coasts the memory increases to H≈0.6–0.8, and in the central North Atlantic and in the equatorial Indian Ocean the highest values of H≈0.9 are found. This pattern shows that long-range dependence in the considered time range is a marine phenomenon; on land, far from the coasts, no memory is evident on these time scales. The finding of H≈0.65 in the station data can be explained by the locations of the stations considered in Figure 9, since inhabited areas with observational stations traditionally lie along coasts. Note that there can be a huge gradient in long-range dependence, as seen along the western North Pacific coast. Later on, Fraedrich and Blender (2003) found that this surface temperature long-range dependence can be reproduced in coupled atmosphere-ocean simulations with a dynamical ocean model but not with a slab ocean model.

Fluctuation exponent H in observed sea surface and near-surface air temperatures from HadCRUT2 data (Climatic Research Unit, University of East Anglia, Norwich) estimated by detrended fluctuation analysis with quadratic trend for the decadal scale (see the slopes in Figure 9).

A spectrum with β=1 is denoted as 1/f noise (see Appendix F). If this extends to vanishing frequency (or infinite time), stationarity is violated. Clearly, in observational data this limit cannot be attained, and whether the sea surface temperature data are stationary remains open. However, this type of variability indicates that finite-time averages, as typically used in climate science, do not converge to a well-defined mean and should be interpreted with caution.

The 1/f spectrum of the sea surface temperature in some oceanic regions can be reproduced by a diffusion model of a vertical water column with two compartments (Fraedrich et al., 2004). A decisive parameter is the diffusivity in the abyssal ocean, which determines turbulent mixing and is driven by tides and the orography. Although the atmospheric forcing is white, the sea surface temperature shows long-range dependence close to nonstationarity. Furthermore, wind also provides a significant part of the mechanical energy for diapycnal mixing in the ocean (Munk & Wunsch, 1998; Wunsch & Ferrari, 2004), which could also lead to this behavior as discussed in section 14.

In summary, the analysis of observational data reveals that long-range dependence in the lower atmosphere is predominantly found in oceanic regions, where the variability is close to nonstationarity (1/f spectrum). Far from coasts, long-range dependence on decadal time scales is not observed. In transition zones along coasts, a power law spectrum S(f)∼f−β is found with an exponent intermediate between the white noise (β=0) and 1/f (β=1) limits.

3.3 Scaling in Regional and Global Mean Temperatures

Local and regional surface temperature variations have a much greater magnitude and show weaker scaling than global average temperature variability (e.g., Laepple & Huybers, 2014a). This difference in variability is strongest on interannual and shorter time scales and decreases on longer time scales. The reduced variability of the global mean temperatures reflects cancelation of variability in the global mean, and the weaker cancelation toward lower frequencies is consistent with findings that temperature anomalies have greater spatial autocorrelation toward longer time scales (Jones et al., 1997). This behavior is also reproduced in diffusive energy balance models (Rypdal et al., 2015), where the predicted slope of the spectrum of global mean temperatures is around double the slope of regional temperatures over a large frequency range, a consequence of the horizontal diffusive coupling. Thus, it is important to consider the spatial scale analyzed when interpreting the variability and long-term memory of temperature time series.

3.4 Paleo Data and Simulations

The isotope fraction δ18O in ice cores can be used as a proxy for the surface temperature due to the different weights of the molecules (Barlow et al., 1997; Dansgaard, 1964). As the annual snowfall is preserved on the Greenland and Antarctic ice sheets, this in principle allows climate scaling to be analyzed from interannual to multimillennial time scales. However, especially on the shorter time scales, nontemperature effects such as snow redistribution (Fisher et al., 1985), diffusion (Johnsen et al., 2000), and aliasing effects from intermittent snowfall (Laepple et al., 2017) considerably affect the recorded variability and have to be corrected for in order to infer climate scaling (Münch & Laepple, 2018). The oxygen fractions measured in the Greenland ice cores GRIP and GISP2 during the last 10,000 years reveal estimates for H (Blender et al., 2006) which are clearly above 0.5 in the millennial time range (Figure 11). The corresponding exponents in the power spectrum, S(f)∼f−β, follow from β=2H−1 and are hence clearly above zero. Hence, scaling can be assumed, at least approximately. It is remarkable that this result can be reproduced in an extremely long coupled atmosphere-ocean simulation (Blender et al., 2006), which reveals intense long-range dependence south of Greenland (see Figure 11) with exponents of similar magnitude but much weaker long-range dependence in other oceanic regions; the Pacific Ocean shows no long-range dependence of comparable intensity. The simulation was performed under present-day conditions; hence, no external variability is necessary to explain this result. The long-range dependence is related to the variability of the zonally averaged stream function in the North Atlantic Ocean. Evidence for long-range dependence has also been found in other coupled climate models (Fredriksen & Rypdal, 2016; Østvand et al., 2014; Vyushin et al., 2004; Zhu et al., 2010). Vyushin et al. (2004) showed that the inclusion of volcanic eruptions improves the simulation of long-range dependence in climate models.

Detrended fluctuation analysis fluctuation functions for the Greenland ice cores GRIP, GISP2, and simulated sea surface temperature (model CSIRO) close to 30°W, 65°N. The slopes indicate the exponents H=0.5 (no memory), 0.7, and 0.84.

An important question is how well climate models reproduce the true internal variability on time scales of centuries and longer. The local model surface temperature spectra seem to indicate a lower (or even zero) spectral exponent β for frequencies below 1×10−2 year−1, but on long time scales the finite sample size errors are so large that this cannot be concluded with high statistical confidence (Fredriksen & Rypdal, 2016). This flattening of the spectra on time scales longer than a century cannot be detected in the instrumental temperatures, since the time series are too short, but neither is it detected in temperature reconstructions of Holocene climate (Laepple & Huybers, 2014b), even after correcting for nonclimate effects on the spectra. In contrast, some authors claim higher exponents for local temperatures (β≈2.2) on these longer time scales based on composite spectra established from different proxy records (Huybers & Curry, 2006; Lovejoy & Schertzer, 2012b). This conclusion, however, can only be drawn with confidence from proxy records spanning the last glacial period and including the last deglaciation, and hence may not be valid for the Holocene climate.

Analysis of global mean temperatures, which are dominated by external forcing, on the other hand suggests that the scaling and variability are comparable between reconstructions and model simulations (Crowley, 2000; Zhu et al., 2019), or even that climate model simulations overestimate global mean temperature variability.

The issue is illustrated in Figure 12a, where we have plotted time series of reconstructed annual temperatures for the Northern Hemisphere and the corresponding series derived from the NorESM model with historical forcing. In Figure 12b we have estimated their Haar fluctuation functions (see Appendix C4). The fluctuation level is more than twice as high for the model temperatures on time scales less than 100 years. It turns out that this is due to the stronger short-time responses to large volcanic eruptions in the models. This can be seen by eliminating these spikes from the model signal, as shown in the zoomed-in signals in Figure 12. The fluctuation function of this "chopped" signal is very close to that of the reconstructed temperature (Figure 12). The cloud of thin curves in Figure 12 consists of fluctuation functions estimated for 20 realizations of fractional Gaussian noise with Hurst exponent H=0.9 of the same length as the NorESM model run. The width of the cloud suggests that the reconstruction and the chopped model signal are consistent with such a fractional Gaussian noise, although the power at time scales longer than a few centuries is somewhat high. It is easy to verify that this increased power is due to the temperature difference between the Medieval Warming Anomaly and the Little Ice Age. Some authors interpret the power in this oscillation as a signature of a transition to scaling with an exponent β>1 on time scales longer than a few centuries (Huybers & Curry, 2006; Lovejoy & Schertzer, 2012b). On the other hand, Nilsen et al. (2016) argue that existing temperature reconstructions for the Holocene are generally consistent with a single scaling regime with H≈0.9 on all time scales shorter than the duration of the interglacial period.

(a) Red: the Moberg Northern hemisphere (NH) temperature reconstruction (Moberg et al., 2005). Blue: NH surface temperature in a NorESM simulation with historical forcing. (b) Haar fluctuation functions for various signals. Red bullets: the NH reconstruction shown in panel (a). Blue bullets: the NorESM simulation shown in panel (a). Green bullets: the same NorESM simulation with the spikes due to volcanic eruptions removed, as shown by the green curve in panel (d). The full curves are Haar fluctuation functions for 20 realizations of an fGn with H=0.9. (c) A close-up on panel (a) to illustrate that the fast responses due to volcanic eruptions are almost absent in the reconstruction. (d) The green curve illustrates how we have chopped off the volcanic responses from the NorESM signal.

Similar ideas are advocated by Rypdal and Rypdal (2016b), who demonstrate that temperatures derived from ice cores over the last 100 kyr can be described as sudden transitions between stadials and interstadials superposed on a 1/f noise (H≈1) background. According to this view, monofractal, near-Gaussian scaling with H≈1 is a useful description of the climate noise background in Quaternary climate. Whether a scaling description is appropriate for the succession of transitions between stadials and interstadials is another question and requires more research.

3.5 Scaling Behaviors in Other Meteorological and Climatological Variables

Besides the evidence of long-range dependence in temperatures, scaling behaviors have also been detected in many other variables, including precipitation, river runoff, total ozone, relative humidity, and sea level change. For in situ precipitation records, small long-range dependence parameters have been found by several studies (Jiang et al., 2017; Kantelhardt et al., 2006; Yang & Fu, 2019). On average, the DFA exponent H ranges mainly between 0.5 and 0.55, indicating weak long-range dependence. For river runoff much stronger long-range dependence has been detected: Kantelhardt et al. (2006) found a mean long-range dependence parameter of 0.72 based on 42 river runoff records from Europe, North and South America, Africa, Australia, and Asia. Wang et al. (2008) detected long-range dependence close to 1/f (H≈1) for the intra-annual Yangtze discharge. As for relative humidity, Chen et al. (2007) reported that the mean DFA exponent H for in situ relative humidity records over China is around 0.75. More recently, sea level changes have also been found to be characterized by long-range dependence, with DFA exponents H ranging widely from 0.60 to 0.95 depending on the region (Dangendorf et al., 2014). Other variables such as wind speed, atmospheric circulation indices, and ozone anomalies have also been shown to exhibit scaling behavior (Feng et al., 2009; Franzke et al., 2015; Vyushin et al., 2007; Vyushin & Kushner, 2009; Varotsos & Kirk-Davidoff, 2006).

3.6 Evidence of Multifractal Behavior

Besides long-range dependence, which needs only one exponent H to describe the monofractal behavior, there is also empirical evidence of multifractal behavior. For instance, in precipitation records, although the measured long-range dependence is weak, pronounced multifractality has been found (Kantelhardt et al., 2006), indicating that precipitation fluctuations of different amplitudes have different scaling behaviors. Similar properties also exist in river runoff data (Koscielny-Bunde et al., 2006), where the multifractality was found to be even stronger than in precipitation records (Kantelhardt et al., 2006). For temperature-related records such as the surface mean air temperature and the diurnal temperature range, different multifractal behaviors were found over different regions (Lin & Fu, 2008; Yuan, Fu, & Mao, 2013). For example, south of the Yangtze River pronounced multifractality was found in diurnal temperature range records, while in the north the multifractal behavior is very weak or even nonexistent (Yuan, Fu, & Mao, 2013). Other variables such as wind speed and relative humidity have also been shown to exhibit multifractal behavior (Baranowski et al., 2015; Kavasseri & Nagarajan, 2005).

4 Applications of Scaling in Climate Research

4.1 Scaling for Trend Detection

The identification of trends is one of the most frequent and prominent goals in the analysis of geophysical time series (Chandler & Scott, 2011; Wu et al., 2007). Although apparently an easily understandable objective, trend assessment is very challenging, starting with the lack of a precise definition of trend itself. Implicit in the intuitive notion of trend are concepts such as long term, smoothness, or monotonicity, but it is not unambiguously defined how long "long term" is or how smooth a pattern needs to be in order to qualify as a trend. Furthermore, time series characterized by scaling behavior often exhibit features that can broadly be classified as a trend, even in the absence of any genuine trend.

Unlike the notion of trend, stationarity is a well-defined statistical property. A time series (Xt) is weakly stationary if its first and second moments are time invariant (i.e., the mean and variance are constant and the covariance depends only on the time lag between the observations). The trend in a time series can be ascribed to a nonstationary generating process (at least the mean is not constant in time) and described by a trend model. Trend models can be broadly classified as either being (i) deterministic or (ii) stochastic. Deterministic trend models represent deterministic (nonrandom) nonstationary processes which are described by a function evolving in time; one example is trends which are forced by external factors such as anthropogenic greenhouse gas emissions. Stochastic trend models represent stochastic nonstationary processes described by models such as a random walk or an autoregressive integrated moving average model. Such models exhibit apparent trends without any external forcing; instead, these trends are caused by the internal dynamics of the process. Unfortunately, it is extremely difficult, for example, by visual judgment alone (Percival & Rothrock, 2005), to select a trend model or even to distinguish between deterministic and stochastic trend models, particularly in the case of short time series (Franzke, 2012).

Furthermore, weakly stationary processes can generate a time series with an apparent trend, particularly when a short segment of the process is observed. Such "spurious" trends can be misleadingly taken as evidence of nonstationary behavior when in fact the process is stationary. A classical example, long recognized as a potential culprit in the interpretation of climate variability (Wunsch, 1999), is the typical "red noise" structure of climate records. (Red noise sometimes means that the spectral power increases with the period. In climate science, however, red noise typically denotes the power spectrum of a first-order autoregressive process, whose power first increases with increasing period and then flattens to a white spectrum at the longest periods; this is also called a Lorentzian spectrum. See Figure 3 for an example.) A red noise, described by a simple first-order autoregressive process, can produce visually appealing trends despite being a stationary process, particularly in the case of short time series.
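The spurious trends produced by red noise are easy to demonstrate numerically. The sketch below compares the spread of fitted linear slopes in short segments of a stationary AR(1) process against white noise of the same variance (the parameter values are illustrative):

```python
import numpy as np

def ar1(n, phi, rng):
    """Stationary first-order autoregressive ('red') noise with unit variance."""
    x = np.empty(n)
    x[0] = rng.standard_normal()
    c = np.sqrt(1.0 - phi**2)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + c * rng.standard_normal()
    return x

# Spread of fitted linear slopes in short segments of stationary red noise
# versus white noise of the same variance: serial correlation alone
# inflates the apparent trends.
rng = np.random.default_rng(11)
n, reps = 100, 2000
t = np.arange(n)
slopes_red = [np.polyfit(t, ar1(n, 0.9, rng), 1)[0] for _ in range(reps)]
slopes_white = [np.polyfit(t, rng.standard_normal(n), 1)[0] for _ in range(reps)]
print(np.std(slopes_red) / np.std(slopes_white))  # substantially greater than 1
```

Both processes are stationary and trend-free by construction; the wider slope distribution of the red noise is exactly the kind of "visually appealing trend" described above.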

Long-range dependence processes, for example, described by ARFIMA models, are another type of stationary processes that can produce apparent nonstationary behavior and local “spurious” trends. Since long-range dependence is a feature common in many geophysical time series, a crucial challenge in the identification and estimation of trends is to discriminate between nonstationary processes and stationary long-range dependent processes. The problem is, however, quite difficult since genuine trends generated by a nonstationary process and spurious trends from a long-range dependence process can coexist in the same time series. Disentangling the different contributions to the observed temporal structure is not possible by visual inspection, and even specific methodologies addressing the issue have to rely on substantial assumptions and simplifications, for example, on the type of nonstationary behavior or the dominance of one specific type of process. For instance, the approach proposed by Beran and Feng (2002) of semiparametric fractional autoregressive (SEMIFAR) models considers a trend function modeled nonparametrically, with the remaining components of the model estimated by maximum likelihood. Despite the flexibility of SEMIFAR models, the performance is poor in the case of short time series, and the trend is estimated based on a subjective concept of smoothness. More importantly, discrimination between stochastic and deterministic trends remains difficult to achieve, given that a significant amount of spurious trends associated with long-range dependence behavior alone can be easily included in the nonparametric trend estimation. The statistical test of Berkes et al. (2006) aims to discriminate between stationary long-range dependence time series and nonstationary time series with change points in the mean but requires previous identification of a small number of change points in the time series.

Shifting from a general trend model to a linear trend significantly constrains the problem of trend identification in the presence of long-range dependence. Although such an assumption is hardly realistic and is theoretically limiting, it is nevertheless of practical relevance since the overwhelming majority of trends in geophysical records are reported as the slope from a linear regression model. Most studies are based on the up-front assumption that a time series can be described by a nonstationary linear trend with a stochastic long-range dependence component (e.g., Bunde et al., 2014; Capparelli et al., 2013; Franzke, 2010, 2012; Lennartz & Bunde, 2009; Ludescher et al., 2015; Myrvoll-Nilsen et al., 2019; Rybski & Bunde, 2009) and focus on the assessment of the corresponding uncertainty (e.g., Cohn & Lins, 2005; Koutsoyiannis, 2006; and Koutsoyiannis & Montanari, 2007).

A complementary approach is to test the assumption of a linear deterministic trend itself. The test devised by Phillips and Perron (1988), the PP test, is a classical unit root test for nonstationarity in the form of a random walk, which is a scaling process. The test devised by Kwiatkowski et al. (1992), the KPSS test, is a parametric statistical test which assumes as its null hypothesis a deterministic linear trend plus stationary stochastic noise. Although the two tests can be applied independently, their joint use is recommended for trend assessment purposes (Carrion-i-Silvestre et al., 2001; Kwiatkowski et al., 1992). For a time series with a random walk stochastic trend, the PP test should not reject its null hypothesis while the KPSS test should reject the linear trend null hypothesis. Conversely, for a time series with a linear deterministic trend, the PP test should reject its null hypothesis but the KPSS test should not. If both tests fail to reject their null hypotheses, the time series or the tests are not sufficiently informative to distinguish between a stochastic (random walk) trend and a deterministic trend. However, if both tests reject their respective null hypotheses, this is an indication that alternative parametrizations for long-term behavior need to be considered, such as long-range dependence. Since long-range dependence is a common feature of geophysical time series, this outcome of rejection by both the PP and KPSS tests is quite common, for example, in the case of air temperature (Fatichi et al., 2009) or global sea surface temperature (Barbosa, 2011). Unfortunately, these tests require long time series (in order to meet asymptotic assumptions) and are known to have low statistical power, particularly against long-range dependence alternatives (Lee & Schmidt, 1996; Leybourne & Newbold, 1999).

A widely used trend test is the Mann-Kendall test (Kendall, 1948; Mann, 1945), which in its original form is only valid for independent data. The Mann-Kendall test is a nonparametric test which tests for the presence of a monotonic trend without making any assumptions about the form of the trend. This is in contrast to most other trend tests which have to assume some parametric trend form, for example, a linear trend. The Mann-Kendall test has been extended to also account for serial correlation in time series (Hamed & Rao, 1998) and also for the presence of long-range dependence (Hamed, 2008).
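In its original form the Mann-Kendall test is short enough to sketch directly (without the tie correction, and without the serial-correlation and long-range dependence corrections of Hamed & Rao, 1998, and Hamed, 2008):

```python
import numpy as np
from math import erf, sqrt

def mann_kendall(x):
    """Original Mann-Kendall trend test (independent data, no tie correction).
    Returns the standardized statistic Z and a two-sided p value."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    i, j = np.triu_indices(n, k=1)
    S = np.sign(x[j] - x[i]).sum()           # Kendall's S statistic
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    z = 0.0 if S == 0 else (S - np.sign(S)) / sqrt(var_s)
    p = 1.0 - erf(abs(z) / sqrt(2.0))        # two-sided normal p value
    return z, p

# a strictly increasing series yields a strongly significant positive trend
z, p = mann_kendall(np.arange(30.0))
print(z, p)
```

Because only the signs of the pairwise differences enter, the test is invariant under monotonic transformations of the data, which is why no parametric trend form has to be assumed.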

A further trend significance test was developed by Lennartz and Bunde (2009, 2011) and Tamazian et al. (2015). It is designed for use with the DFA method in the presence of long-range dependence in the time series (Bunde et al., 2014; Ludescher et al., 2015). Based on Monte Carlo simulations, they studied how the trend uncertainties vary with the strength of long-range dependence and with the data length. This method has been applied to evaluate the significance of trends in surface air temperature and sea ice extent in Antarctica (Bunde et al., 2014; Ludescher et al., 2015; Yuan et al., 2017).

Recent developments in temperature trend significance testing with long-range dependent noise show how one can also incorporate information about forced global temperature changes in the trend estimate (Myrvoll-Nilsen et al., 2019). In that way, one avoids attributing forced changes that deviate from, for example, a linear trend to the stochastic variability (Gil-Alana, 2005; Fatichi et al., 2009; Franzke, 2012, 2014). Results show that the observed trends since 1900 are significant relative to the noise for most locations, and to a larger degree than when assuming a linear trend (Løvsletten & Rypdal, 2016).

4.2 Scaling for Climate Response and Sensitivity

Linear response models, which predict how the climate system will react to a change in forcing, for example, anthropogenic greenhouse gas emissions, have shown considerable success in describing the global temperature response in climate model data, instrumental data, and in multiproxy reconstructions (Caldeira & Myhrvold, 2013; Fredriksen & Rypdal, 2016; Held et al., 2010; Geoffroy et al., 2013; Lovejoy et al., 2015; Østvand et al., 2014; Rypdal & Rypdal, 2014; Rypdal et al., 2015). In particular, Rypdal and Rypdal (2014) demonstrated that a scaling linear response function provides a good description of the global temperature response to radiative forcing over both the historical period and to a multiproxy reconstruction of the temperature over the last millennium.

It has been known for several decades that Coupled Atmosphere-Ocean General Circulation Models exhibit climate responses on multiple time scales (Caldeira & Myhrvold, 2013; Fredriksen & Rypdal, 2017; Held et al., 2010; Geoffroy et al., 2013); that is, more than one time constant is involved in the response. The scaling response studied in Rypdal and Rypdal (2014) can be considered an approximation to this multiple time scale response, bridging the responses on time scales from years to centuries, and Fredriksen and Rypdal (2017) demonstrate how an energy balance box model can provide such an approximation.

In addition to describing the response to historical radiative forcing, the same scaling response to a white noise stochastic forcing is also consistent with the observed internal variability. One way of extracting the internal variability from observed global temperature is to compute the deterministic, historically forced variability from the response model and subtract this from the observed record. The power spectrum of this estimated internal variability compares well with a power law (Rypdal & Rypdal, 2014). This tendency for a multibox model to form a power law spectrum is studied systematically in Fredriksen and Rypdal (2017) and reflects a well-known result which states that a scaling spectrum can be obtained from the aggregation over an ensemble of first-order autoregressive processes (Granger, 1980); see also section 15. Thus, long-range dependence can be caused by the constructive superposition of SRD processes.
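Granger's aggregation mechanism can be illustrated numerically: summing the Lorentzian spectra of unit-variance SRD components whose correlation times are spread log-uniformly over many octaves yields an approximately power law aggregate spectrum, even though every individual component is short-range dependent. The choice of correlation times, equal weighting, and frequency band below is purely illustrative.

```python
import numpy as np

# Correlation times tau_k spread over 15 octaves; each unit-variance
# component has the one-sided Lorentzian spectrum
# S_k(f) = 4*tau_k / (1 + (2*pi*f*tau_k)**2).
taus = 2.0 ** np.arange(15)              # tau = 1, 2, ..., 16384
freqs = np.logspace(-2.5, -1.0, 40)      # frequencies inside the scaling range

spectrum = sum(4 * tau / (1 + (2 * np.pi * freqs * tau) ** 2) for tau in taus)

# The log-log slope of the aggregated spectrum is close to -1, i.e. an
# approximate power law, although no single component is scaling.
slope = np.polyfit(np.log(freqs), np.log(spectrum), 1)[0]
```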

The emergent scale invariance makes it possible to infer equilibrium climate sensitivity from a scaling frequency-dependent climate sensitivity R(f) ∼ f^(−β/2). This scaling response implies a response of infinite magnitude as f → 0, and therefore, there must exist a low-frequency limit to the range where the scaling response is valid, below which the response stabilizes. R(f) can be estimated for a given climate model by exploiting the relation between the historic radiative forcing applied to a model and the observed instrumental global temperature. Rypdal et al. (2018) applied this to an ensemble of Earth system models, where the inferred values of R(f) evaluated at f = 1/1,000 year^−1 correlate strongly with estimates of equilibrium climate sensitivity from idealized model runs. They could use the distribution of estimated R(f) over the model ensemble to constrain the distribution of equilibrium climate sensitivity obtained from the ensemble of idealized runs. Thus, scale-invariant linear response models are useful tools for the estimation of equilibrium climate sensitivity from observational data. The advantage over multibox energy balance models is that the scale-invariant models contain fewer free parameters and are less prone to statistical overfitting.

4.3 Scaling for Climate Prediction

The property of long-range dependence in the climate system raises the question of whether models which explicitly include long-range dependence can be used for skillful predictions. A first attempt at such climate predictions was made by Baillie and Chung (2002). Recently, a model was developed for seasonal to decadal predictions, the Stochastic Seasonal to Inter-Annual Prediction System (StocSIPS) (Lovejoy, 2015a; Lovejoy et al., 2015, 2018). StocSIPS is based on the low-frequency limit of a fractional differential equation, the fractional energy balance model. This model is valid for periods between 20 days and 50 years, where intermittency is relatively weak so that a quasi-Gaussian approximation can be used. The forecast skill of StocSIPS compares favorably with that of operational long-range forecasting systems based on traditional climate models. One advantage of StocSIPS is that data assimilation of observations is not necessary, since the model is fitted directly to observed data. This also implies that downscaling of forecasts is not needed.

Yuan, Fu, and Liu (2013) and Yuan et al. (2014) developed a method for the extraction of the long-range dependence using a fractionally integrated statistical model. They proposed a new variable memory kernel which clearly shows how states from the distant past maintain their impact up to the present. Accordingly, climate variables with long-range dependence can be decomposed into two parts: (i) the memory part, which represents the influences accumulated from the past, and (ii) the residual part, which is related to the current dynamical forcing conditions. With the memory part extracted, one can at least determine on what basis the considered time series will continue to change. By combining this with the estimated residual part, it is possible to make predictions. They thus proposed a new perspective on climate prediction for variables with long-range dependence: because the influence from the past can be extracted quantitatively, one only needs to focus on the prediction of the residual part.
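The flavor of such a memory/residual decomposition can be sketched for the simplest long-range dependent model, ARFIMA(0, d, 0), where the fractional-differencing weights supply an explicit memory kernel. This is a generic illustration in the spirit of the decomposition, not the specific variable memory kernel of Yuan et al.

```python
import numpy as np

def frac_diff_weights(d, n):
    """Coefficients pi_j of (1 - B)**d via the recursion
    pi_0 = 1, pi_j = pi_{j-1} * (j - 1 - d) / j."""
    w = np.empty(n)
    w[0] = 1.0
    for j in range(1, n):
        w[j] = w[j - 1] * (j - 1 - d) / j
    return w

rng = np.random.default_rng(0)
d, n = 0.3, 2000
pi = frac_diff_weights(d, n)
z = rng.standard_normal(n)                # "current dynamical forcing"

# Simulate ARFIMA(0, d, 0): x_t = -sum_{j>=1} pi_j x_{t-j} + z_t
x = np.empty(n)
x[0] = z[0]
for t in range(1, n):
    x[t] = z[t] - np.dot(pi[1:t + 1], x[t - 1::-1])

# Memory part: influence accumulated from the past through the kernel;
# subtracting it recovers the forcing (residual) part exactly here.
memory = np.zeros(n)
for t in range(1, n):
    memory[t] = -np.dot(pi[1:t + 1], x[t - 1::-1])
residual = x - memory
```

For a real climate series the memory part would be computed from an estimated d, and only the (much less autocorrelated) residual part would need to be forecast.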

Also, statistical models with non-Gaussian features have recently been developed. For instance, Önskog et al. (2018) show that the forecast skill of the North Atlantic Oscillation (Feldstein & Franzke, 2017) increases in an SRD statistical model when non-Gaussian noise is used. Graves et al. (2017) developed a Bayesian framework for ARFIMA models with various non-Gaussian noises and demonstrated its usefulness using the t-stable and the α-stable distributions.

5 Outlook and Open Questions

Here we have provided an overview of scaling methods and their relevance for understanding the climate system and its variability on time scales from days to millennia and ice ages. Scaling methods have improved our understanding of the climate system. The climate community mainly distinguishes between weather and climate, even though it is not well defined where weather ends and climate starts. Weather systems evolve over a few days, with the weather prediction limit at about 10–14 days (Zhang et al., 2019), while climate starts at time scales of about 30–40 years. This leaves a large gap in between. The area between weather and climate, the weather-climate interface, consists of the active research areas of subseasonal-to-seasonal (S2S) up to decadal predictions.

Through scaling analysis we now have a better understanding of the climate system and know that it consists of different scaling regimes distinguished by their scaling exponents. While in the weather and climate regimes the variability strongly increases with time scale, this is not the case for the regime in between, where the increase is rather weak. The exact ranges of these scaling regimes and their robustness and meaning are still a matter of debate (e.g., Huybers & Curry, 2006; Nilsen et al., 2016). On longer, for example decadal, time scales the effect of global warming might already affect the variability, in which case the observed scaling is likely not a product of internal climate processes but a response to external forcing and nonstationarity. More research is needed to clarify this point.

An important future research task is to understand these differences and to elucidate how predictable subseasonal-to-seasonal and decadal processes are. Forecasts on these time scales are of societal importance and currently an active research topic. In this context, an important open question is how the slope of the scaling relationship determines predictability. While the skill of weather forecasts has improved significantly over the last few decades, the skill of seasonal forecasts is rather limited, and for decadal forecasts only the climate change signal and perhaps the El Niño–Southern Oscillation phenomenon are currently predictable components.

Scaling of paleo-climate data has received a lot of attention (e.g., Bunde et al., 2014; Fredriksen & Rypdal, 2016; Huybers & Curry, 2006; Laepple & Huybers, 2013; Lovejoy & Schertzer, 2013; Lovejoy & Varotsos, 2016; Nilsen et al., 2016; Rypdal & Rypdal, 2016a; Schmitt et al., 1995; Zhu et al., 2019) and has been used in evaluating how well climate models reproduce observed long-term climate variability (Blender et al., 2006; Fraedrich & Blender, 2003; Østvand et al., 2014). While global mean temperature variability on interannual to millennial time scales seems to be consistent between climate models and climate reconstructions, the strong discrepancy of slow climate variability at regional scales calls for continued research on the temporal and spatial structures of climate variability and also on improving the interpretation and quality of paleo-climate records. How well scaling can contribute to the reconstruction of past climate needs to be assessed.

Over longer time scales, there are several unanswered questions in paleoclimate where scaling approaches may help our understanding of the climate system. One of the great puzzles of Quaternary science is the transition from the 41-kyr world before 1 × 10^6 years ago to the current 100-kyr glacial-interglacial regime, without any external forcing changes. Potential explanations for this change have involved ice sheet dynamics (Clark & Pollard, 1998), the progressive cooling of Earth's temperature throughout the Quaternary (Snyder, 2016), the amount of dust in the atmosphere (Chalk et al., 2017), or continental distribution (Kender et al., 2018). However, all studies acknowledge that the transition period between the 41-kyr cycle ∼1.2 Myr ago and the 100-kyr cycle since ∼600 kyr ago is poorly defined and not well characterized. The recovery of the 800-kyr-long EPICA ice core allowed a first look into the younger part of that transition section (Jouzel et al., 2007). The soon-to-be-started Oldest Ice project aims to recover an Antarctic ice core that will reach back to the 41-kyr world, 1.2 × 10^6 years ago, and provide a high-resolution record throughout the Mid-Pleistocene Transition (MPT), which denotes the fundamental change in the behavior of glacial cycles around 1 × 10^6 years ago. Before the MPT the glacial cycles were dominated by a 41,000-year period; after the MPT they followed less regular cycles with an approximate period of 100,000 years. The statistical techniques described in this review could provide a robust description of the data variability upon which physical and dynamical models can be chosen to explain the observed changes.

Another interesting paleo-climatic question is why human civilization has evolved during the Holocene and not during any of the previous interglacials (Robinson et al., 2006). Has the Holocene climate been exceptionally stable in time or in space (Kopp et al., 2017)? While Greenlandic ice core records provide evidence for a very stable Holocene (Ditlevsen et al., 1996), the dependence of climate variability on the climate state seems to be much smaller in the rest of the world (Rehfeld et al., 2018). A related question is whether conditions in the Fertile Crescent during previous interglacials were markedly different from those of the Holocene. Scaling analyses could help answer these questions by providing a description of both temporal and spatial variability at different times during the Pleistocene.

A still open question is the mechanism of long-range dependence in the climate system. While the evidence for long-range dependence is strong, it leads to the counterintuitive implication that the distant past still influences the present. There are also studies which show that inhomogeneities in station time series increase the estimated strength of long-range dependence (Mills, 2007; Rust et al., 2008). These inhomogeneities take the form of jumps or shifts due to changes in the station instruments or location. That jumps lead to an increased long-range dependence strength would be consistent with the finding that volcanic eruptions improve the reproduction of long-range dependence in climate models (Vyushin et al., 2004). Maraun et al. (2004) point out the difficulty of distinguishing between long-range dependence and the superposition of SRD processes in practice. However, long-range dependence arising from the superposition of SRD processes representing the climate system on different time scales would be physically meaningful. More work on the physical origin is needed; in particular, it has to be examined whether the climate system indeed has long memory, even on long time scales, or whether the observed long-range dependence is the result of the superposition of short memory effects or of nonlinearities. Which of the two is the case would affect not only climate sensitivity but also the climate evolution on long time scales.

As we have shown here, scaling is a ubiquitous feature of the climate system across a multitude of time scales. Hence, it should also be included in our climate models. The parameterization problem can be seen as a model reduction problem, and as discussed above, the Mori-Zwanzig formalism predicts the presence of memory terms (Franzke et al., 2015; Gottwald et al., 2017); however, most current weather and climate prediction models do not include memory terms (Berner et al., 2017). Recent research showed the benefit of including such memory terms (Frederiksen et al., 2017; Sakradzija et al., 2015; Vissio & Lucarini, 2018), although some difficulties in implementing the approach in simple climate models were reported (Demaeyer & Vannitsem, 2018). Hence, whether memory terms in parameterization schemes are useful needs more research.

As already discussed, the presence of long-range dependence hampers the detection of externally forced trends, especially if the form of the trend is not specified a priori and the test is thus nonparametric. Furthermore, there is also evidence of scaling breaks in temperature time series for the Holocene period (Lovejoy & Schertzer, 2012b) and the Central England Temperature time series (Graves et al., 2015). However, how robust these breaks are is still a matter of debate (Nilsen et al., 2016), and improved statistical methods are needed. The existence of scaling breaks would also create new questions about the origin of long-range dependence in climate. If long-range dependence is an intrinsic property of the equations of motion, then one would not expect scaling breaks, at least not without changes in the external forcing or a bifurcation (both of which are unlikely for the Holocene and Central England Temperature periods).

This review provided evidence for the relevance of scaling in the climate system and how it can affect the detection of trends, the estimation of climate sensitivity, and the skill of long-range predictions. We also discussed various physical mechanisms which can cause scaling in the climate system.

Glossary

  • Brownian motion
  • Brownian motion is a continuous-time stochastic process, also called Wiener process. The increments of Brownian motion are Gaussian distributed independent random variables.
  • Fractal
  • A fractal is a self-similar object, that is, when shrinking or enlarging a fractal pattern, its appearance remains statistically unchanged. A good introduction is given by Feder (1988).
  • Heavy-tailed distribution
  • Heavy-tailed distributions are distributions whose tails decay more slowly than exponentially. In particular, their tails are heavier than those of a corresponding Gaussian distribution. A good introduction is given by Sornette (2006).
  • Leptokurtic
  • A leptokurtic distribution has a kurtosis which is larger than 3. The Gaussian distribution has a kurtosis of 3. Hence, a leptokurtic distribution has fatter tails than the corresponding Gaussian distribution.
  • Long memory
  • Synonym for long-range dependence
  • Long-term persistence (LTP)
  • Synonym for long-range dependence
  • long-range dependence
  • Long-range dependence is the property of the autocorrelation function of a time series to decay according to a power law. Consequently, the power spectrum of such a time series has increasing power for lower frequencies and a singularity at zero frequency.
  • Monofractal
  • Monofractals are fractals described by a single scaling exponent.
  • Multifractal
  • Multifractals are fractals described by multiple scaling relationships, whose scaling exponents depend on the statistical moment considered.
  • Power law
  • A power law describes a functional relationship between two variables where a relative change in one variable results in a proportional relative change in the other variable. Mathematically, it is of the form f(x) = ax^k, where k is the power law exponent.
  • Random walk
  • Also known as the drunkard's walk; it has a scaling power spectrum with slope −2, and its variance increases linearly with time t. A good introduction is given by Sornette (2006).
  • Red noise
  • Red noise means that the spectral power increases toward longer time scales. In climate science, however, red noise typically denotes the spectrum of a first-order autoregressive process, whose power first increases with increasing period but then flattens to a constant (white noise) level at the longest periods; this is also called a Lorentzian spectrum.
  • Scaling
  • In mathematical form we can express scaling as follows:
    F(at) ≐ a^γ F(t), (14)
    where F is the fluctuation function, γ the scaling exponent, a is a rescaling factor of time t, and ≐ denotes equality in distribution. a can be seen as the factor with which one is zooming in or out, and the scaling property means that the statistical properties of the data stay the same (Feder, 1988; Franzke et al., 2012; Mandelbrot, 1982); this is the same property as fractals have.
  • Short-range dependence
  • Short-range dependence is the property of the autocorrelation function of a time series to decay exponentially. Consequently, the power spectrum of such a time series has almost constant power at lower frequencies.
  • Self-similarity
  • A self-similar object is exactly or approximately similar to a part of itself. When zooming in or out, one sees similar structures. Self-similarity is a property of fractals.
  • Unit root
  • A unit root is a characteristic of stochastic processes. In particular, it denotes that a stochastic process is nonstationary without necessarily having a trend. A good introduction is given by Box et al. (2015).
  • Volatility clustering
  • Volatility clustering refers to the observation that in many time series large changes are followed by large changes of either sign, while small changes are followed by small changes of either sign.
  Acknowledgments

    We thank three anonymous reviewers for helping to improve this manuscript and Shaun Lovejoy for many discussions during the preparation of this manuscript. C. F. was supported through the cluster of excellence CliSAP (EXC177) and the collaborative research center “Energy Transfers in Atmosphere and Ocean” (TRR181) funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) - Projektnummer 274762653 and DFG Grant FR3515/3-1. We also thank PAGES and CliSAP for providing generous funding for organizing two workshops in Jouvence, Canada, and Hamburg, Germany, at which the idea for this review was conceived. This is a contribution to the PAGES CVAS group. Past Global Changes (PAGES) is supported by the U.S. National Science Foundation and the Swiss Academy of Sciences. Data and code sources are provided in the text.

      Appendix A: Fractionally Integrated Processes

      Integration, or its inverse operation, differencing, is a standard procedure in time series analysis to deal with nonstationary time series (e.g., Box et al., 2015). For instance, a linear trend can be removed from a time series by examining the time series increments instead; higher-order trends can be removed by repeating the differencing multiple times and examining the resulting increment time series. Hence, repeated differencing can render a wide class of nonstationary time series stationary. The resulting increment time series can be modeled with an autoregressive moving average time series model. In order to represent the original time series, the modeled time series is subsequently cumulatively summed as many times as differences have been taken before. This results in an autoregressive integrated moving average model. A fractional Brownian motion (see Appendix B) is an example of a nonstationary time series; its variance goes to infinity with increasing time. Its increments, fractional Gaussian noise (see Appendix B), on the other hand, are stationary.

      In standard time series analysis only integer order integration or differencing is used. However, both operations can be generalized to noninteger orders, so-called fractional integration and differentiation (Samorodnitsky, 2016). This makes it possible to mathematically model long-range dependent time series.

      The fractional integration parameter d is introduced as follows: let (Xn) be a fractionally differenced process with
      (1 − B)^d Xn = Zn,
      where (Zn) is white noise with zero mean and unit variance, B is the backshift operator (B Xn = Xn−1), and
      (1 − B)^(−d) = Σ_{j=0}^∞ bj B^j, with bj = Γ(j + d)/(Γ(j + 1)Γ(d)),
      where Γ(·) is the Gamma function. Hence,
      Xn = Σ_{j=0}^∞ bj Zn−j.
      Observe that
      bj ∼ j^(d−1)/Γ(d) as j → ∞,
      and for d∈(0,0.5)
      acf(h) ∼ cd h^(2d−1) as h → ∞, (A1)
      where cd = Γ(1 − 2d)/(Γ(d)Γ(1 − d)). Note that the autocovariance equation A1 has the same (asymptotic) power law decay as the autocovariance equation B2, with d = H − 1/2.
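A quick numerical check of the standard ARFIMA(0, d, 0) weights bj = Γ(j + d)/(Γ(j + 1)Γ(d)) and their power law asymptote bj ∼ j^(d−1)/Γ(d) can be sketched as follows (log-gamma is used to avoid overflow for large j).

```python
import math

d = 0.3

def b(j, d):
    """MA(inf) weight b_j = Gamma(j + d) / (Gamma(j + 1) * Gamma(d)),
    computed via log-gamma for numerical stability."""
    return math.exp(math.lgamma(j + d) - math.lgamma(j + 1) - math.lgamma(d))

# The ratio of the exact weight to its asymptote j**(d-1)/Gamma(d)
# approaches 1 for large j.
j = 5000
ratio = b(j, d) / (j ** (d - 1) / math.gamma(d))
```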

      Appendix B: Fractional Brownian Motion

      Brownian motion is an important stochastic process (e.g., Embrechts & Maejima, 2007). Brownian motion (also called the Wiener process) is the limit of the symmetric random walk. While the random walk is a discrete-time process, Brownian motion is a continuous-time process with continuous sample paths, although it is nowhere differentiable.

      Brownian motion has independent increments. In contrast, fractional Brownian motion BH(t) has dependent increments. These increments Xn are called fractional Gaussian noise, and the strength of the dependence is measured by the parameter H:
      E[BH(t)BH(s)] = (1/2)(|t|^2H + |s|^2H − |t − s|^2H), (B1)
      where H is often called the Hurst exponent and can take values in (0,1]. Fractional Gaussian noise is the discrete-time increment process of fractional Brownian motion. For fractional Gaussian noise, even values that are far apart in time are still serially correlated; hence, even the distant past affects the current values. If H = 0.5, then the process is standard Brownian motion; if H > 0.5, then the increments are positively correlated, while for H < 0.5 they are negatively correlated and antipersistent, which is the opposite of long-range dependence because the process fluctuates wildly.
      Note that the stationarity of the increments of fractional Brownian motion implies that fractional Gaussian noise is a stationary zero-mean Gaussian process whose autocorrelation function, acf(h):=E(XnXn+h), satisfies (e.g., Beran et al., 2013)
      acf(h) = (1/2)(|h + 1|^2H − 2|h|^2H + |h − 1|^2H) ∼ H(2H − 1)h^(2H−2) as h → ∞, (B2)
      provided that H∈(0,1). If H∈(0.5,1], then the correlations are not summable (their sum diverges), and we say that Xn exhibits long-range dependence, with H measuring its intensity. If, on the other hand, H∈(0,0.5), we say that Xn is antipersistent.
      Fractional Brownian motion is self-similar. By considering probability distributions, it can be shown that
      BH(at) ≐ a^H BH(t), (B3)
      where ≐ denotes equality in distribution.

      The Hurst exponent describes the raggedness of the time series, with a higher H leading to smoother time series. Examples of persistent, white noise (serially uncorrelated), and antipersistent time series are displayed in Figure 6. Fractional Brownian motion was introduced by Kolmogorov (1940). More rigorous treatments of fractional Brownian motion can be found in the books by Embrechts and Maejima (2007) and Beran et al. (2013).
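The exact autocorrelation function of fractional Gaussian noise and its power law asymptote (equation B2) are easy to compare numerically; a minimal sketch:

```python
def fgn_acf(h, H):
    """Exact autocorrelation of fractional Gaussian noise at lag h
    (unit variance): 0.5*(|h+1|^(2H) - 2|h|^(2H) + |h-1|^(2H))."""
    return 0.5 * (abs(h + 1) ** (2 * H) - 2 * abs(h) ** (2 * H)
                  + abs(h - 1) ** (2 * H))

# At large lags the exact value approaches the power law tail
# H*(2H - 1)*h**(2H - 2).
H, h = 0.8, 200
exact = fgn_acf(h, H)
asymptotic = H * (2 * H - 1) * h ** (2 * H - 2)
```

The sign of the lag-1 autocorrelation also illustrates the persistent (H > 0.5) versus antipersistent (H < 0.5) regimes discussed above.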

      Appendix C: Details of Long-Range Dependence Parameter Estimators

      C.1 Time Domain Methods

      C.1.1 R/S Estimator

      The R/S method was the first long-range dependence estimator. For a time series X1,…,XN, the R/S statistic (Beran, 1994; Hurst, 1951) is given by
      R/S(n) = (1/Sn)[max_{0≤i≤n}(Yi − (i/n)Yn) − min_{0≤i≤n}(Yi − (i/n)Yn)],
      where Yi = X1 + … + Xi (with Y0 = 0) are the partial sums and Sn is the sample standard deviation. The term Yi − (i/n)Yn measures how far the partial sums exceed the straight line they would follow if all observations were equal (to the sample mean), and the bracketed range is the difference between the highest and lowest positions of the partial sums with respect to this straight line of uniform growth. For either fractional Gaussian noise or the ARFIMA model,
      E[R/S(n)] ∼ KH n^H as n → ∞;
      here KH is a positive, finite constant which depends on H, and H > 0.5 for data with long-range dependence. Following Taqqu et al. (1995), the methodology for estimating H comprises the following steps: subdivide the time series X1,…,XN into K blocks of size r := N/K. For each lag n, compute R/S(n) starting at the points ri = iN/K + 1, for i = 1, 2, …, such that ri ≤ N − n. Plot log[R/S(n)] versus log n and fit a straight line; the slope of the line gives an estimate of H. However, this R/S approach does not yield reliable estimates, and its use is no longer recommended (Franzke et al., 2012; Rea et al., 2009).
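A compact implementation of the R/S statistic and a block-averaged estimation of H might look as follows (block sizes and series length are illustrative). For white noise the fitted slope should be close to 0.5, although the well-known small-sample bias of R/S tends to inflate it somewhat.

```python
import numpy as np

def rescaled_range(x):
    """R/S statistic: range of the partial sums about the straight line of
    uniform growth (i.e., cumulative sums of the anomalies, with Y_0 = 0),
    divided by the sample standard deviation."""
    x = np.asarray(x, dtype=float)
    y = np.append(0.0, np.cumsum(x - x.mean()))   # Y_i - (i/n) Y_n
    return (y.max() - y.min()) / x.std()

rng = np.random.default_rng(1)
x = rng.standard_normal(4096)
sizes = [64, 128, 256, 512, 1024]
# Average R/S over non-overlapping blocks of each size, then fit the
# log-log slope, which estimates H.
rs = [np.mean([rescaled_range(x[i:i + n]) for i in range(0, len(x) - n + 1, n)])
      for n in sizes]
slope = np.polyfit(np.log(sizes), np.log(rs), 1)[0]
```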

      C.1.2 Variance-Type Estimator

      As a more robust alternative, Taqqu et al. (1995) proposed the aggregated variance method to estimate H. Variance-type estimators are a popular approach to estimating the long-range dependence parameter. The variance-type estimator of Taqqu et al. (1995) takes the form
      Ĥ = 1 + β̂/2, (C1)
      where β̂ is the least squares slope of the regression of log Var X̄^(m) on log m, with
      Var X̄^(m) = (1/[N/m]) Σ_{k=1}^{[N/m]} (X̄^(m)(k) − X̄)²,
      with [·] denoting the integer part and X̄^(m) the aggregated series of order m,
      X̄^(m)(k) = (1/m)(X_{(k−1)m+1} + … + X_{km}), k = 1, 2, …, [N/m].
      A major drawback of this variance-type estimator is its slowly decaying bias, so that only when dealing with very long time series can such an estimator provide reliable point estimates of H. Thus, Giraitis et al. (1999) introduced a refined version of the estimator in C1,
      in which the regression is restricted to aggregation levels m0 ≤ m ≤ m1, (C2)
      for m0 < m1 such that m0 → ∞ and N/m1 → ∞ as N → ∞. Giraitis et al. (1999) proved that the estimator in C2 is less biased than C1. In summary, this method plots the logarithm of the variance of an aggregated (averaged) process against the logarithm of the aggregation level; a least squares line is then fitted to the data, the slope of which (equal to 2H − 2) provides an estimate of H.
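The aggregated variance method can be sketched in a few lines: the variance of the block means of size m scales like m^(2H − 2), so H follows from the log-log slope. Block sizes and series length below are illustrative.

```python
import numpy as np

def aggregated_variance_H(x, block_sizes):
    """Aggregated variance estimate of H.

    The variance of block means of size m scales like m**(2H - 2);
    H is obtained from the least squares log-log slope."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    variances = []
    for m in block_sizes:
        k = n // m
        means = x[:k * m].reshape(k, m).mean(axis=1)   # aggregated series X^(m)
        variances.append(means.var())
    slope = np.polyfit(np.log(block_sizes), np.log(variances), 1)[0]
    return 1.0 + slope / 2.0                            # slope = 2H - 2

rng = np.random.default_rng(2)
# White noise has H = 0.5 (variance of block means decays like 1/m).
H_est = aggregated_variance_H(rng.standard_normal(100000),
                              [10, 20, 50, 100, 200])
```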

      C.1.3 DFA Estimator

      The DFA is a variant of the above method (Bunde et al., 2014; Koscielny-Bunde et al., 1998; Kantelhardt et al., 2001; Lennartz & Bunde, 2009, 2011; Ludescher et al., 2015; Peng et al., 1994; Rybski & Bunde, 2009) and estimates the variability of a time series, Xt, on different time scales. First, a profile is computed as Y(i) = Σ_{k=1}^{i} (Xk − X̄), where X̄ is the sample mean. The profile is then split into Ns nonoverlapping segments of equal length s, and the local trend is subtracted in each segment v by a polynomial least squares fit pv. Linear (DFA1), quadratic (DFA2), or higher-order polynomials can be used for detrending. In the nth-order DFA, trends of order n in the profile, and of order n−1 in the original time series, are eliminated. Next, the variance for each of the Ns segments is calculated by averaging over all data points i in the vth segment:
      F²(v, s) = (1/s) Σ_{i=1}^{s} {Y[(v − 1)s + i] − pv(i)}². (C3)
      By computing the average over all segments and taking the square root, we obtain the fluctuation function:
      F(s) = sqrt( (1/Ns) Σ_{v=1}^{Ns} F²(v, s) ). (C4)
      For time series with long-range dependence, F(s) will increase with s as a power law,
      F(s) ∝ s^H, (C5)
      with the exponent H > 0.5. Unlike most algorithms, the DFA algorithm developed by Løvsletten (2017) is capable of dealing with missing data. The R package nonlinearTseries provides code for DFA. The DFA method is biased for H < 0.5 (Franzke et al., 2012).
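A minimal DFA implementation following the steps above (profile, segmentation, polynomial detrending, fluctuation function, log-log fit) might look as follows; segment sizes are illustrative.

```python
import numpy as np

def dfa(x, scales, order=1):
    """Detrended fluctuation analysis.

    Returns the fluctuation function F(s) for each scale s and the
    log-log slope, an estimate of the exponent H (order=1 is DFA1)."""
    x = np.asarray(x, dtype=float)
    profile = np.cumsum(x - x.mean())                 # the profile Y(i)
    fluct = []
    for s in scales:
        n_seg = len(profile) // s
        t = np.arange(s)
        f2 = []
        for v in range(n_seg):
            seg = profile[v * s:(v + 1) * s]
            trend = np.polyval(np.polyfit(t, seg, order), t)
            f2.append(np.mean((seg - trend) ** 2))    # detrended variance
        fluct.append(np.sqrt(np.mean(f2)))            # fluctuation function
    H = np.polyfit(np.log(scales), np.log(fluct), 1)[0]
    return np.array(fluct), H

rng = np.random.default_rng(3)
# White noise should give H close to 0.5.
_, H_est = dfa(rng.standard_normal(20000), [16, 32, 64, 128, 256])
```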

      C.1.4 Wavelet-Based Estimators

      The long-range dependence parameter can also be estimated using wavelets. A wavelet (WL) ψ is a localized wave function with zero average and is normalized to one. A family of wavelets is generated by scaling the wavelet ψ by a factor s and translating it by u urn:x-wiley:rog:media:rog20226:rog20226-math-0059. The wavelet transform allows one to construct a time-frequency representation of a signal, the wavelet spectrum. One can then infer the self-similarity parameter from the wavelet spectrum via ordinary least squares at large wavelet scales (Abry & Veitch, 1998; Stoev & Taqqu, 2005). A widely used wavelet for scaling analysis is the Haar wavelet (Lovejoy & Schertzer, 2012a, 2012b; Lovejoy, 2014). The Haar wavelet mother function is given by
      urn:x-wiley:rog:media:rog20226:rog20226-math-0060(C6)
      In the Haar wavelet technique, one usually considers the original time series X1,…,XN and divides the time series into Ns segments of length s. For each segment v, one first determines the mean value urn:x-wiley:rog:media:rog20226:rog20226-math-0061 of the data, then considers the quantity urn:x-wiley:rog:media:rog20226:rog20226-math-0062 for a zeroth-order wavelet (WT0), urn:x-wiley:rog:media:rog20226:rog20226-math-0063 for first-order wavelet (WT1), and urn:x-wiley:rog:media:rog20226:rog20226-math-0064 for second-order wavelet (WT2). By averaging urn:x-wiley:rog:media:rog20226:rog20226-math-0065 over all segments and taking the square root, the wavelet fluctuation function can be obtained as (Bogachev et al., 2017)
      urn:x-wiley:rog:media:rog20226:rog20226-math-0066(C7)
      For time series with long-range dependence, the parameter can be estimated according to the relationship
F_{\mathrm{WT}}(s) \propto s^{H} \qquad \text{(C8)}

Similar to the orders of DFA, the different orders of the wavelet methods indicate the order of trend elimination. For example, WT2 eliminates the effect of linear external trends.
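The segment-mean construction above can be sketched in a few lines of Python; this is an illustrative implementation of the WT0-WT2 quantities, not code from the cited references:

```python
import numpy as np

def wavelet_fluctuation(x, scales, order=1):
    """Discrete Haar-type wavelet fluctuation function F_WT(s).

    order=0 (WT0): segment mean; order=1 (WT1): difference of adjacent
    segment means; order=2 (WT2): second difference, which removes
    linear trends."""
    out = []
    for s in scales:
        n_seg = len(x) // s
        means = np.array([x[v * s:(v + 1) * s].mean() for v in range(n_seg)])
        if order == 0:
            g = means
        elif order == 1:
            g = means[:-1] - means[1:]
        else:
            g = means[:-2] - 2 * means[1:-1] + means[2:]
        out.append(np.sqrt(np.mean(g ** 2)))
    return np.array(out)

rng = np.random.default_rng(0)
x = rng.standard_normal(8192)
# For white noise the segment means shrink with s, so F decays with scale
F = wavelet_fluctuation(x, [8, 16, 32, 64], order=1)
```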

      C.2 Frequency Domain Methods

      Spectral methods are also widely used for estimating the long-range dependence parameter.

      C.2.1 Geweke-Porter-Hudak Estimator

A widely used method is the Geweke-Porter-Hudak (GPH) estimator (Geweke & Porter-Hudak, 1983). Spectral methods find d by estimating the spectral slope. The periodogram, an estimate of the spectral density of a finite-length time series, is given by
I(\lambda_j) = \frac{1}{N} \left| \sum_{t=1}^{N} X_t\, e^{-2\pi i t \lambda_j} \right|^2, \qquad j = 1, \ldots, [N/2], \qquad \text{(C9)}
where \lambda_j = j/N is the frequency and the square brackets denote rounding toward 0. A series with long-range dependence has a spectral density proportional to |\lambda|^{-2d} close to the origin. Since the periodogram is an estimator of the spectral density, d can be estimated by regressing the logarithm of the periodogram on the logarithm of the frequency \lambda. Thus, having calculated the spectral density estimate f(\lambda), semiparametric estimators fit a power law of the form f(\lambda; b, d) = b|\lambda|^{-2d}, where b is a scale factor. The R package fracdiff provides code for GPH.
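The log-periodogram regression can be sketched as follows in Python; the regressor log(4 sin^2(λ/2)) and the bandwidth m = N^{1/2} are common choices in the GPH literature, but this sketch is illustrative rather than the fracdiff implementation:

```python
import numpy as np

def gph(x, cutoff=0.5):
    """Geweke-Porter-Hudak estimator of the LRD parameter d.

    Regresses the log periodogram on log(4 sin^2(lambda/2)) over the
    lowest N**cutoff Fourier frequencies; -slope estimates d."""
    n = len(x)
    m = int(n ** cutoff)                        # number of low frequencies used
    j = np.arange(1, m + 1)
    lam = 2 * np.pi * j / n                     # angular Fourier frequencies
    fft = np.fft.fft(x - np.mean(x))
    I = np.abs(fft[1:m + 1]) ** 2 / (2 * np.pi * n)   # periodogram
    reg = np.log(4 * np.sin(lam / 2) ** 2)
    slope, _ = np.polyfit(reg, np.log(I), 1)
    return -slope

# White noise has d = 0, so the estimate should be near zero
rng = np.random.default_rng(1)
d_hat = gph(rng.standard_normal(4096))
```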

      C.2.2 Whittle Estimator

      The Whittle estimator is based on the periodogram. Specifically, it involves the function
G(\theta) = \sum_{j=1}^{[N/2]} \frac{I(\lambda_j)}{f(\lambda_j; \theta)},
where I(·) represents the periodogram, f(·;θ) is the spectral density at frequency λ, and θ denotes the vector of unknown parameters. The Whittle estimator is the value of θ that minimizes the function G(·). For fractional Gaussian noise, θ = {H}; for the fractional autoregressive integrated moving average (ARFIMA) model, θ contains d (and any ARMA parameters). The R package longmemo provides code for the Whittle estimator.
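The minimization of G(θ) can be illustrated with the closely related local (semiparametric) Whittle estimator, which replaces the full parametric spectral density by its low-frequency power law form. The grid search below is a simple sketch under that substitution, not the longmemo implementation:

```python
import numpy as np

def local_whittle(x, m=None):
    """Local Whittle estimator of the LRD parameter d via grid search.

    Minimizes R(d) = log(mean(lam^{2d} * I)) - 2 d mean(log lam)
    over the m lowest Fourier frequencies."""
    n = len(x)
    if m is None:
        m = n // 8                               # illustrative bandwidth choice
    j = np.arange(1, m + 1)
    lam = 2 * np.pi * j / n
    I = np.abs(np.fft.fft(x - x.mean())[1:m + 1]) ** 2 / (2 * np.pi * n)
    grid = np.linspace(-0.49, 0.99, 300)
    obj = [np.log(np.mean(lam ** (2 * d) * I)) - 2 * d * np.mean(np.log(lam))
           for d in grid]
    return grid[int(np.argmin(obj))]

# White noise has d = 0, so the estimate should be near zero
rng = np.random.default_rng(2)
d_hat = local_whittle(rng.standard_normal(4096))
```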

      Appendix D: Hurst Exponent and Long-Range Dependence

Examining water levels of the Nile river, Harold E. Hurst discovered that computing the variance over windows of different sizes and plotting it against window size yields power law behavior (Hurst, 1951, 1957). This has been named the Hurst phenomenon, and the exponent of this power law is the Hurst exponent.

      The Hurst exponent, as we defined it here, is related to long-range dependence (Talkner & Weber, 2000). The tail exponent, which measures the power law decay of PDFs, does not affect long-range dependence but does affect the self-similarity exponent (Franzke et al., 2012).

The long-range dependence parameter, d, can be related to H in monofractal Gaussian systems as d = H - 1/2. However, d is typically used with ARFIMA models, which are only asymptotically self-similar.

      Appendix E: Estimation of Multifractality

For some climatic time series, it may not be sufficient to characterize the scaling behavior with only one constant exponent. This is so-called multifractality. A traditional method to quantify this property is the partition function,
Z_q(s) = \sum_{v=1}^{N_s} \left| Y_{vs} - Y_{(v-1)s} \right|^{q} \propto s^{\tau(q)}, \qquad \text{(E1)}
where \tau(q) is the Renyi scaling exponent and Y is the profile of the time series x_t, as for DFA. When \tau(q) is linear in q, the time series is considered monofractal; otherwise, it is multifractal. In recent years, the multifractal DFA (MF-DFA) has gained increasing popularity (Kantelhardt et al., 2002).
      MF-DFA is a generalized version of DFA, as shown below,
F_q(s) = \left\{ \frac{1}{N_s} \sum_{v=1}^{N_s} \left[ F^2(v,s) \right]^{q/2} \right\}^{1/q}, \qquad \text{(E2)}
where F^2(v,s) is the variance of the detrended profile in segment v, as in DFA.
      For q=2, the monofractal DFA is retrieved. Analogous to equation (25), for each q, the generalized fluctuation exponent h(q) can be defined as
F_q(s) \propto s^{h(q)} \qquad \text{(E3)}
      Since it is easy to verify that Zq(s) is related to Fq(s) by Fq(s)=[(1/Ns)Zq(s)]1/q, the Renyi scaling exponent τ(q) can be connected with h(q) as (Bogachev et al., 2017),
\tau(q) = q\,h(q) - 1 \qquad \text{(E4)}
Another way to characterize the multifractality is the singularity strength k (or Hölder exponent) and the singularity spectrum f(k) (Koscielny-Bunde et al., 2006). Based on a Legendre transform, the singularity spectrum f(k) can be derived as
f(k) = q\,k - \tau(q), \qquad \text{(E5)}
      where k is given by
k = \frac{d\tau(q)}{dq}. \qquad \text{(E6)}
      Using equation (30), k and f(k) can be related to h(q) as
k = h(q) + q\,\frac{dh(q)}{dq} \qquad \text{(E7)}
      and
f(k) = q\,[k - h(q)] + 1. \qquad \text{(E8)}

Accordingly, the strength of the multifractality can be estimated from MF-DFA by calculating the width of the singularity spectrum (the difference between the maximum and the minimum k).

      While MF-DFA is equivalent to the wavelet transform modulus maxima (WTMM) method, it is much easier to implement on a computer (Arneodo et al., 2002; Muzy et al., 1991).
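The ease of implementation can be seen in the following minimal Python sketch of MF-DFA (illustrative scales and q values; the q = 0 case, which requires a logarithmic average, is omitted):

```python
import numpy as np

def mfdfa(x, scales, qs, order=1):
    """Minimal multifractal DFA: returns h(q) for each q (q != 0).

    Computes the generalized fluctuation function F_q(s) from segment
    variances of the detrended profile, then the slope of
    log F_q(s) versus log s gives h(q)."""
    y = np.cumsum(x - np.mean(x))              # profile
    Fq = np.zeros((len(qs), len(scales)))
    for i, s in enumerate(scales):
        n_seg = len(y) // s
        t = np.arange(s)
        f2 = []
        for v in range(n_seg):
            seg = y[v * s:(v + 1) * s]
            coef = np.polyfit(t, seg, order)   # local detrending
            f2.append(np.mean((seg - np.polyval(coef, t)) ** 2))
        f2 = np.array(f2)
        for k, q in enumerate(qs):
            Fq[k, i] = np.mean(f2 ** (q / 2)) ** (1 / q)
    return np.array([np.polyfit(np.log(scales), np.log(Fq[k]), 1)[0]
                     for k in range(len(qs))])

# Monofractal white noise: h(q) should stay near 0.5 for all q
rng = np.random.default_rng(3)
x = rng.standard_normal(16384)
hq = mfdfa(x, [16, 32, 64, 128, 256], qs=[-2, 2])
```

A wide spread of h(q) across q values would indicate multifractality; for this monofractal test signal the spread is small.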

Appendix F: Power Spectrum S(f) and 1/f Noise

      Power spectra are important to understand temporal variability (Kay & Marple, 1981). Power spectra are especially useful for detecting (quasi)periodic signals like the diurnal and annual cycles which constitute an important aspect of climate variability. However, power spectra can also reveal the background variability of the climate system (Huybers & Curry, 2006).

A power spectrum displays the fraction of the variance contributed by different frequency ranges after Fourier transformation of a time series (von Storch & Zwiers, 2003; Wilks, 2011). The most common ways of computing a power spectrum are via the Fourier transform or the maximum entropy method (von Storch & Zwiers, 2003).

The 1/f noise has a power law spectrum of the form f^{-1}, in which the squared amplitudes increase with decreasing frequency; hence, longer time scales exhibit stronger variability. "1/f noise" is a generic term which also applies to 1/f^{\beta} spectra, where the power law has a different exponent \beta.
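As a simple illustration of estimating the spectral exponent \beta, the Python sketch below fits a power law to the periodogram of a white noise sample, for which the spectrum is flat and \beta should be near zero. The regression over all raw periodogram ordinates is illustrative; in practice one would use smoothed or logarithmically binned estimates:

```python
import numpy as np

# Estimate the spectral slope beta from a log-log regression of
# periodogram power versus frequency. White noise gives beta ~ 0,
# whereas 1/f noise would give beta ~ 1.
rng = np.random.default_rng(4)
x = rng.standard_normal(8192)
n = len(x)
freqs = np.fft.rfftfreq(n)[1:]                  # drop the zero frequency
power = np.abs(np.fft.rfft(x - x.mean())[1:]) ** 2 / n
beta = -np.polyfit(np.log(freqs), np.log(power), 1)[0]
```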