Model Testing
Econometrics provides many useful tools for evaluating models, including climate models. I plan to do a few projects on this topic in the years ahead.
TROPOSPHERIC TRENDS: MODELS vs OBSERVATIONS ROUND II: In fall 2010 I published a paper with Steve McIntyre and Chad Herman (MMH) comparing climate model-generated predictions to observations from satellites and weather balloons in the lower- and mid-troposphere over the tropics, a key region for assessing climate model validity. That paper applied two methods: the panel model, which is a fairly well-known econometric method, and the Vogelsang-Franses (VF) multivariate trend estimation method, a less well-known but superior alternative which adapts the general HAC (heteroskedasticity and autocorrelation consistent) method to the estimation of robust confidence intervals for linear trends. The data set used in MMH spanned 1979 to 2009. I have since extended the data set to include weather balloon data back to 1958 for the purpose of comparing observed lower- and mid-troposphere trends in the tropics to climate model predictions. A challenge in this case is that the 1977-78 Pacific Climate Shift introduces a step-like change in the mean of the data, which causes a spurious increase in the estimated trend if it is not controlled for. But controlling for the step-change affects the VF critical values. Tim Vogelsang has extended the theory behind the VF method to yield robust trend variances in the presence of autocorrelation of unknown form when a step-change occurs at a known point in the sample. In our new paper, just released as a Discussion Paper and en route to a journal, Tim and I present a detailed explanation of the HAC approach to trend comparisons, including the relevant asymptotics and a bootstrap method for generating empirical critical values. We then apply the method to the Hadley and RICH balloon data for the tropical troposphere. Controlling for the 1977 Pacific Climate Shift, we find the trends are insignificant from 1958-2010 and the discrepancy with climate models is highly significant.
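To see why an uncontrolled step-change inflates a trend estimate, here is a minimal sketch in Python on purely synthetic data. It is not the code or data from the paper: the function name `fit_trend`, the 0.3 step size, the noise level and the 1978 break year are all assumptions chosen for illustration. The series has no trend at all, only a one-time shift in the mean, yet a regression on a time trend alone reports a positive slope. Adding a step dummy at the (known) break date removes it.

```python
import numpy as np

def fit_trend(y, t, step=None):
    """OLS of y on an intercept, a linear trend and (optionally) a step dummy.

    Returns the coefficient vector; element 1 is the trend slope.
    """
    cols = [np.ones_like(t, dtype=float), t.astype(float)]
    if step is not None:
        cols.append(step.astype(float))
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Synthetic annual series, 1958-2010 (illustrative only, not observational data).
rng = np.random.default_rng(0)
years = np.arange(1958, 2011)
t = years - years[0]
step = (years >= 1978).astype(float)   # assumed break date for the illustration

# True trend is zero: the series is a 0.3-unit step plus white noise.
y = 0.3 * step + rng.normal(0.0, 0.05, size=years.size)

slope_no_step = fit_trend(y, t)[1]          # trend only: picks up the step
slope_with_step = fit_trend(y, t, step)[1]  # trend + step dummy

print(f"slope ignoring the step:    {slope_no_step:.4f} per year")
print(f"slope controlling for step: {slope_with_step:.4f} per year")
```

The sketch only addresses the point estimate. Getting valid confidence intervals once the step dummy is included is the harder problem, and it is the one the extended VF theory is meant to handle.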
NEW PAPER IN CLIMATE DYNAMICS TESTING CLIMATE MODEL VALIDITY: Lise Tole and I published a paper in Climate Dynamics testing the ability of climate models to reproduce the spatial pattern of temperature trends over land. This builds on previous work of mine looking at the correlation between indicators of industrial development over land and the spatial pattern of warming trends, a relationship that is not predicted by models and is supposed to have been filtered out of the surface climate record. The paper is:
- McKitrick, Ross R. and Lise Tole (2012) "Evaluating Explanatory Models of the Spatial Pattern of Surface Climate Trends using Model Selection and Bayesian Averaging Methods". Climate Dynamics, DOI: 10.1007/s00382-012-1418-9.
I have written a pair of op-eds to explain the work. The first part appeared in the Financial Post on June 21. A version with the citations provided is here. Part II is here online, and the version with citations is here.
MODEL-DATA TREND COMPARISONS: My first foray into this topic looks at how to compare model-generated trends to observations. There have been some rather simplistic methods used before now, based on t-stats with "effective degrees of freedom" adjustments and whatnot. The following paper explains more accurate testing methods using panel regression and multivariate trend estimation that have higher power and greater robustness to complex autocorrelation patterns. The application is to the tropical troposphere, an important region for testing models' ability to quantify the atmospheric response to greenhouse gases. A few recent studies differed on whether models significantly overstate the warming or not. We find that up to 1999 there was only weak evidence for this, but on updated data the models appear to significantly overpredict warming.
- McKitrick, Ross R., Stephen McIntyre and Chad Herman (2010) "Panel and Multivariate Methods for Tests of Trend Equivalence in Climate Data Series". Atmospheric Science Letters, DOI: 10.1002/asl.290. Data/code archive.
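To give a feel for why autocorrelation matters when testing a trend, here is a minimal sketch in Python on synthetic data. It is not the panel or VF method from the paper. It swaps in the standard Newey-West (Bartlett kernel) HAC estimator as the simplest member of the HAC family, and the function name `trend_with_hac_se`, the AR(1) coefficient of 0.8 and the bandwidth of 12 are assumptions made only for illustration. With positively autocorrelated errors the naive OLS standard error on the trend is too small, so a naive t-test overstates significance; the HAC standard error is noticeably wider.

```python
import numpy as np

def trend_with_hac_se(y, bandwidth):
    """OLS trend slope with naive and Newey-West (Bartlett) HAC standard errors.

    Returns (slope, naive_se, hac_se) for the regression y = a + b*t + e.
    """
    n = y.size
    t = np.arange(n, dtype=float)
    X = np.column_stack([np.ones(n), t])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    XtX_inv = np.linalg.inv(X.T @ X)

    # Naive covariance: valid only if the errors are i.i.d.
    sigma2 = resid @ resid / (n - 2)
    naive_cov = sigma2 * XtX_inv

    # HAC "meat": Bartlett-weighted autocovariances of the scores x_t * e_t.
    scores = X * resid[:, None]
    meat = scores.T @ scores
    for lag in range(1, bandwidth + 1):
        weight = 1.0 - lag / (bandwidth + 1.0)
        gamma = scores[lag:].T @ scores[:-lag]
        meat += weight * (gamma + gamma.T)
    hac_cov = XtX_inv @ meat @ XtX_inv

    return beta[1], np.sqrt(naive_cov[1, 1]), np.sqrt(hac_cov[1, 1])

# Synthetic monthly series: small linear trend plus AR(1) noise (illustrative only).
rng = np.random.default_rng(1)
n = 372                      # 31 years of monthly values
shocks = rng.normal(0.0, 0.1, size=n)
errors = np.zeros(n)
for i in range(1, n):
    errors[i] = 0.8 * errors[i - 1] + shocks[i]
y = 0.002 * np.arange(n) + errors

slope, se_naive, se_hac = trend_with_hac_se(y, bandwidth=12)
print(f"slope = {slope:.5f}, naive SE = {se_naive:.5f}, HAC SE = {se_hac:.5f}")
```

The fixed bandwidth here is an arbitrary choice, and that arbitrariness is one of the practical weaknesses of this simple estimator.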
CORRECTION to MMH10: In 2010 Steve, Chad and I published a paper that applied panel and multivariate (VF) methods to test the significance of trends and of model-obs differences in the tropical troposphere. There were a couple of typos, and also Chad discovered an error in the GISS data as archived at the PCMDI (not a huge one, just an error splicing pre- and post-2000 runs together). We re-did our analyses and used the updated versions of the observational data for the purpose. The correction has been published:
- McKitrick, Ross, Stephen McIntyre and Chad Herman (2011) Correction to "Panel and Multivariate Methods for Tests of Trend Equivalence in Climate Data Series". Atmospheric Science Letters, October 7 2011, DOI: 10.1002/asl.360.
The GISS correction and data revisions strengthen all our original findings, reducing the observational trends and raising (slightly) the model trends. (a) The combined MSU trends have a p-value just over 0.05; still significant but "marginal". (b) The HadAT 1979-2009 trend in the LT drops from significance to marginal. (c) The average 1979-2009 MT trend across all observational series drops to insignificance. (d) The RICH 1979-2009 MT trend drops to insignificance. (e) The RSS 1979-2009 MT series is now significantly different from models in the panel regression test. For the 1979-2009 interval, all observational series individually and jointly are significantly below models at both the LT and MT layers. (f) Over the 1979-1999 interval the model-obs differences are still only marginally significant, but in the MT layer the difference is now significant at about the 6% level, so it is nearly significant.