
Statistical Model-Based Composite Indicators for Tracking Coherent Policy Conclusions

  • Original Research
  • Published in: Social Indicators Research

Abstract

In recent years, with the data revolution and the spread of new technologies, phenomena are frequently described by a huge quantity of information that is useful for making strategic decisions. A priority for policymakers is to have simple statistical tools for synthesizing these data. Composite indicators (CIs) are such tools. According to the OECD glossary of statistical terms (OECD 2004; OECD-JRC 2008), a CI is formed when manifest (observed) indicators are compiled into a single index on the basis of an underlying model of the multi-dimensional concept being measured, with weights that commonly represent the relative importance of each indicator. CIs are increasingly used for benchmarking countries’ performances, and the methodological challenges involved raise a series of technical issues that, if not adequately addressed, can lead to CIs being misinterpreted or manipulated. Moreover, doubts are often raised about the robustness of the resulting country rankings and about the significance of the associated policy message. In this paper, we propose a model-based approach for the construction of CIs with a hierarchical structure, in which the first- and second-order CIs are estimated using Hierarchical Disjoint Non-Negative Factor Analysis (Cavicchia and Vichi 2020) in a least-squares (LS) framework. To assess the methodology used to construct a CI, a set of properties is proposed and applied. Some well-known CIs, such as the Human Development Index and the Multidimensional Poverty Index, are considered in order to show the importance of those properties.
We therefore include in our proposal the approaches most frequently used in the CI literature, and we evaluate the model to assess their performance.


Notes

  1. The relations between MIs and SCIs are known. The magnitude of the relations between MIs and SCIs and between SCIs and GCI has to be estimated.

  2. \(\alpha \ge 0.9\), Excellent; \(0.9 > \alpha \ge 0.8\), Good; \(0.8 > \alpha \ge 0.7\), Acceptable; \(0.7 > \alpha \ge 0.6\), Questionable; \(0.6 > \alpha \ge 0.5\), Poor; \(0.5 > \alpha\), Unacceptable.

References

  • Aaker, D., & Bagozzi, R. (1979). Unobservable variables in structural equation models with an application in industrial selling. Journal of Marketing Research, 16, 147.


  • Alkire, S. (2010). Human development: Definitions, critiques, and related concepts, human development research papers (2009 to present). Human Development Report Office (HDRO), United Nations Development Programme (UNDP).

  • Alkire, S., & Foster, J. (2011). Counting and multidimensional poverty measurement. Journal of Public Economics, 95(7), 476–487.


  • Alkire, S., Foster, J., Seth, S., Santos, M., Roche, J., & Ballon, P. (2015). Multidimensional poverty measurement and analysis. Oxford: Oxford University Press.


  • Anderson, J. C., & Gerbing, D. W. (1982). Some methods for respecifying measurement models to obtain unidimensional construct measurement. Journal of Marketing Research, 19(4), 453–460.


  • Anderson, T. W., & Rubin, H. (1956). Statistical inferences in factor analysis. Proceedings of the Third Symposium on Mathematical Statistics and Probability, 5, 111–150.


  • Blalock, H. M. (1964). Causal inferences in nonexperimental research. New York, NY: Norton.


  • Bollen, K. A. (1984). Multiple indicators: Internal consistency or no necessary relationship? Quality and Quantity, 18(4), 377–385.


  • Bollen, K. A. (1989). Structural equations with latent variables. New York: John Wiley and Sons Inc.


  • Bollen, K. A. (2001). Indicator: Methodology. International Encyclopedia of the Social and Behavioral Sciences, 7282–7287.

  • Bollen, K. A. (2011). Evaluating effect, composite, and causal indicators in structural equation models. MIS Quarterly, 35(2), 359–372.


  • Bollen, K. A., & Bauldry, S. (2011). Three cs in measurement models: Causal indicators, composite indicators, and covariates. Psychological Methods, 16(3), 265–284.


  • Bollen, K. A., & Lennox, R. (1991). Conventional wisdom on measurement: A structural equation perspective. Psychological Bulletin, 110, 305–314.


  • Bollen, K. A., & Ting, K. (2000). A tetrad test for causal indicators. Psychological Methods, 5, 3–22.


  • Bro, R. (1997). PARAFAC. Tutorial and applications. Chemometrics and Intelligent Laboratory Systems, 38(2), 149–171.


  • Budden, M., Hadavas, P., & Hoffman, L. (2008). On the generation of correlation matrices. Applied Mathematics E-Notes, 8, 279–282.


  • Burt, R. S. (1973). Confirmatory factor-analysis structures and the theory construction process. Sociological Methods and Research, 2, 131–187.


  • Burt, R. S. (1976). Interpretational confounding of unobserved variables in structural equation models. Sociological Methods and Research, 5, 3–52.


  • Cataldo, R., Crocetta, C., Grassia, M., Lauro, C., Marino, M., & Voytsekhovska, V. (2020). Methodological PLS-PM framework for SDGs system. Social Indicators Research. https://doi.org/10.1007/s11205-020-02271-5.


  • Cataldo, R., Grassia, M., Lauro, C., & Marino, M. (2016). Developments in higher-order PLS-PM for the building of a system of composite indicators. Quality & Quantity, 51, 657–674.


  • Cavicchia, C., & Vichi, M. (2020). Hierarchical disjoint non-negative factor analysis. Manuscript submitted for publication.

  • Chaouachi, S. G., & Rached, K. S. B. (2012). Perceived deception in advertising: Proposition of a measurement scale. IBIMA Publishing Journal of Marketing Research & Case Studies. https://doi.org/10.5171/2012.712622.


  • de Neubourg, C., de Milliano, M., & Plavgo, I. (2014). Lost (in) dimensions: Consolidating progress in multidimensional poverty research. Innocenti Working Papers.

  • Despotis, D. K. (2005). A reassessment of the human development index via data envelopment analysis. Journal of the Operational Research Society, 56(8), 969–980.


  • Diamantopoulos, A., & Siguaw, J. A. (2006). Formative versus reflective indicators in organizational measure development: A comparison and empirical illustration. British Journal of Management, 17(4), 263–282.


  • Ebert, U., & Welsch, H. (2004). Meaningful environmental indices: A social choice approach. Journal of Environmental Economics and Management, 47(2), 270–283.


  • Edwards, J. R. (2011). The fallacy of formative measurement. Organizational Research Methods, 14(2), 370–388.


  • Edwards, J. R., & Bagozzi, R. P. (2000). On the nature and direction of the relationship between constructs and measures. Psychological Methods, 5(2), 155–174.


  • Fornell, C., & Larcker, D. F. (1981). Evaluating structural equation models with unobservable variables and measurement error. Journal of Marketing Research, 18(1), 39–50.


  • George, D., & Mallery, P. (2003). SPSS for Windows step by step: A simple guide and reference. 11.0 update (4th ed.). Boston, MA: Allyn & Bacon.


  • Green, S. B., Lissitz, R. W., & Mulaik, S. A. (1977). Limitations of coefficient alpha as an index of test unidimensionality. Educational and Psychological Measurement, 37(4), 827–838.


  • Guttman, L. (1954). Some necessary conditions for common-factor analysis. Psychometrika, 19(2), 149–161.


  • Harshman, R. A. (1970). Foundations of the parafac procedure: Models and conditions for an “explanatory” multi-modal factor analysis. UCLA Working Papers in Phonetics 16.

  • Hauser, R. M., & Goldberger, A. S. (1971). The treatment of unobservable variables in path analysis. Sociological Methodology, 3, 81–117.


  • Hayduk, L. A. (1980). Causal models in marketing. New York, NY: Wiley.


  • Hayduk, L. A. (1987). Structural equation modeling with LISREL: Essentials and advances. Baltimore, MD: Johns Hopkins University.


  • Horst, P. (1965). Factor analysis of data matrices. pt. 3. New York: Holt, Rinehart and Winston.


  • Hotelling, H. (1933). Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology, 24(6), 417–441, 498–520.


  • Hubert, L., & Arabie, P. (1985). Comparing partitions. Journal of Classification, 2(1), 193–218.


  • Jarvis, C. B., MacKenzie, S. B., & Podsakoff, P. M. (2003). A critical review of construct indicators and measurement model misspecification in marketing and consumer research. Journal of Consumer Research, 30(2), 199–218.


  • Jöreskog, K. G. (1970). A general method for analysis of covariance structure. Biometrika, 57, 239–251.


  • Jöreskog, K. G. (1978). Structural analysis of covariance and correlation matrices. Psychometrika, 43(4), 443–477.


  • Jöreskog, K. G., & Goldberger, A. S. (1975). Estimation of a model with multiple indicators and multiple causes of a single latent variable. Journal of the American Statistical Association, 70(351a), 631–639.


  • Kaiser, H. F. (1960). The application of electronic computers to factor analysis. Educational and Psychological Measurement, 20(1), 141–151.


  • Land, K. (1970). On the estimation of path coefficients for unmeasured variables from correlations among observed variables. Social Forces, 48, 506–511.


  • Lauro, C., Grassia, M., & Cataldo, R. (2018). Model based composite indicators: New developments in partial least squares-path modeling for the building of different types of composite indicators. Social Indicators Research, 135, 421–455.


  • MacCallum, R. C., & Browne, M. W. (1993). The use of causal indicators in covariance structure models: Some practical issues. Psychological Bulletin, 114(3), 533–541.


  • Maggino, F. (Ed.). (2017). Complexity in society: From indicators construction to their synthesis (1st ed., Vol. 70). Social indicators research series. New York, NY: Springer.


  • Maggino, F., & Zumbo, B. D. (2012). Measuring the quality of life and the construction of social indicators. In K. C. Land, A. C. Michalos, & M. J. Sirgy (Eds.), Handbook of social indicators and quality of life research (pp. 201–238). Netherlands: Springer. https://doi.org/10.1007/978-94-007-2421-1_10.


  • Martínez, R. (2012). Inequality and the new human development index. Applied Economics Letters, 19(6), 533–535.


  • McGillivray, M. (1991). The human development index: Yet another redundant composite development indicator? World Development, 19(10), 1461–1468.


  • McNemar, Q. (1946). Opinion-attitude methodology. Psychological Bulletin, 43(4), 289–374.


  • Munda, G., & Nardo, M. (2009). Noncompensatory/nonlinear composite indicators for ranking countries: A defensible setting. Applied Economics, 41(12), 1513–1523.


  • Nardo, M., Saisana, M., Saltelli, A., & Tarantola, S. (2005). Tools for composite indicators building, Report EUR 21682, European Commission. Ispra: Joint Research Centre.


  • Noorbakhsh, F. (1998). The human development index: Some technical issues and alternative indices. Journal of International Development, 10(5), 589–605.


  • OECD. (2004). The OECD-JRC handbook on practices for developing composite indicators. Paper presented at the OECD Committee on Statistics.

  • OECD-JRC. (2008). Handbook on constructing composite indicators. Methodology and user guide. Paris: OECD.


  • Pearson, K. (1901). On lines and planes of closest fit to systems of points in space. Philosophical Magazine and Journal of Science, 2(11), 559–572.


  • Radloff, L. S. (1977). The CES-D scale: A self-report depression scale for research in the general population. Applied Psychological Measurement, 1(3), 385–401.


  • Revelle, W. (1979). Hierarchical cluster analysis and the internal structure of tests. Multivariate Behavioral Research, 14(1), 57–74.


  • Revelle, W., & Zinbarg, R. (2009). Coefficients alpha, beta, omega, and the glb: Comments on Sijtsma. Psychometrika, 74, 145–154.


  • Inglehart, R., & Welzel, C. (2005). Modernization, cultural change, and democracy: The human development sequence. Cambridge: Cambridge University Press.


  • Sagar, A. D., & Najam, A. (1998). The human development index: A critical review. Ecological Economics, 25(3), 249–264.


  • Saltelli, A., Nardo, M., Saisana, M., & Tarantola, S. (2004). Composite indicators: The controversy and the way forward. Palermo: OECD World Forum on Key Indicators.


  • Schmid, J., & Leiman, J. M. (1957). The development of hierarchical factor solutions. Psychometrika, 22(1), 53–61.


  • Schmitt, N. (1996). Uses and abuses of coefficient alpha. Psychological Assessment, 8(4), 350–353.


  • Sen, A. K. (1981). Poverty and famines: An essay on entitlement and deprivation. Oxford: Clarendon Press.


  • Sen, A. K. (1985). Commodities and capabilities. Amsterdam: North-Holland.


  • Sen, A. K. (1992). Inequality reexamined. Oxford: Clarendon Press.


  • Stiglitz, J. E., Sen, A. K., & Fitoussi, J. P. (2009). Report by the commission on the measurement of economic performance and social progress. Technical report, Commission on the Measurement of Economic Performance and Social Progress, Paris.

  • ten Berge, J. M. F. (2005). Least squares optimization in multivariate analysis. Leiden: DSWO Press.


  • Thomson, G. H. (1951). The factorial analysis of human ability (5th ed.). New York: Houghton Mifflin.


  • Trabold-Nübler, H. (1991). The human development index: A new development indicator? Intereconomics: Review of European Economic Policy, 26(5), 236–243.


  • Ullman, J. (2006). Structural equation modeling: Reviewing the basics and moving forward. Journal of Personality Assessment, 87, 35–50.


  • Vichi, M. (2017). Disjoint factor analysis with cross-loadings. Advances in Data Analysis and Classification, 11(3), 563–591.


  • Vinzi, V. E., Chin, W. W., Henseler, J., & Wang, H. (2010). Handbook of partial least squares. Berlin: Springer.


  • Welzel, C., Inglehart, R., & Klingemann, H. D. (2003). The theory of human development: A cross-cultural analysis. European Journal of Political Research, 42(3), 341–379.


  • Werts, C. E., Linn, R. L., & Jöreskog, K. G. (1974). Intraclass reliability estimates: Testing structural assumptions. Educational and Psychological Measurement, 34(1), 25–33.


  • Wherry, R. J. (1959). Hierarchical factor solutions without rotation. Psychometrika, 24(1), 45–51.


  • Wherry, R. J. (1975). Underprediction from overfitting: 45 years of shrinkage. Personnel Psychology, 28(1), 1–18.


  • Wherry, R. J. (1984). Contributions to correlational analysis. New York: Academic Press.


  • Yeomans, K., & Golder, P. (1982). The Guttman-Kaiser criterion as a predictor of the number of common factors. Journal of the Royal Statistical Society. Series D (The Statistician), 31(3), 221–229.


  • Zinbarg, R. E., Revelle, W., Yovel, I., & Li, W. (2005). Cronbach’s α, Revelle’s β, and McDonald’s ωH: Their relations with each other and two alternative conceptualizations of reliability. Psychometrika, 70(1), 123–133.



Author information

Corresponding author

Correspondence to Carlo Cavicchia.


Appendix: Examples


Example 1

Let us consider the vector \(\mathbf{g}\) as a normalized normal random vector of 15 statistical units, the vector \(\mathbf{c}\) as a vector of 3 weights equal to 1, and the matrix \(\mathbf{B}\) as an identity matrix of order 10, that is, with all weights equal to 1. We thus have 10 MIs explained by 3 SCIs, with the constraint that each MI is explained by only one SCI.

Let us consider the matrix \(\mathbf{V}'\) given by:

$$\begin{aligned} \begin{bmatrix} 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 1 & 1 & 1 & 1 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1\\ \end{bmatrix} \end{aligned}$$

which describes the relationships between MIs and SCIs.

We can generate the data from (5), hypothesizing 3 levels of the error \(\mathbf{E}\): small, medium and large. The corresponding matrices \(\mathbf{X}\) are compared in Fig. 5. In Fig. 5 (left), the small error does not alter the columns of \(\mathbf{X}\), which remain clearly distinguishable. The same does not apply to the matrix \(\mathbf{X}\) in Fig. 5 (right), generated with a large error.

Fig. 5 Heatmap of 3 data matrices \(\mathbf{X}\) with (left) small, (center) medium and (right) large errors

First of all, we can measure the goodness of fit of the model by \(R_{GCI}^2\):

Error     \(R_{GCI}^2\)
Small     0.974
Medium    0.622
Large     0.131

We can see that the smaller the error, the higher the value of \(R_{GCI}^2\). We can then compute \(R_{SCI_h}^2\) for each of the three SCIs at every level of error:

Error     \(R_{SCI_1}^2\)   \(R_{SCI_2}^2\)   \(R_{SCI_3}^2\)
Small     0.988             0.988             0.989
Medium    0.778             0.837             0.855
Large     0.624             0.539             0.672

Here, we can see that the values of \(R_{SCI_h}^2\) are greater than \(R_{GCI}^2\) for each error and for each SCI.
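The setup of Example 1 can be sketched in code. This is only an illustrative sketch under stated assumptions: equation (5) is not reproduced in this appendix, so the form \(\mathbf{X} = \mathbf{g}\mathbf{c}'\mathbf{V}'\mathbf{B} + \mathbf{E}\) is assumed; the error standard deviations (0.1, 1, 3) are hypothetical; and the GCI scores are re-estimated by ordinary least squares with the loadings held fixed, rather than by the full estimation procedure of the paper. The printed values are therefore not expected to match the table above.

```python
import numpy as np

rng = np.random.default_rng(0)

n_units, n_mis = 15, 10
# Membership matrix V' (3 SCIs x 10 MIs): each MI belongs to exactly one SCI.
V_t = np.array([
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 1, 1],
], dtype=float)
c = np.ones(3)        # SCI weights, all equal to 1
B = np.eye(n_mis)     # MI weights, identity matrix of order 10

# GCI scores: standard normal draws rescaled to zero mean and unit variance.
g = rng.standard_normal(n_units)
g = (g - g.mean()) / g.std()

# Overall loading of each MI on the GCI (assumed form: w = B V c).
w = B @ V_t.T @ c


def r2_gci(X, w):
    """SS_mod / SS_tot for the rank-one LS fit with fixed loadings w."""
    g_hat = X @ w / (w @ w)        # least-squares scores
    fitted = np.outer(g_hat, w)
    return np.sum(fitted**2) / np.sum(X**2)


# Hypothetical error standard deviations for the three error levels.
sigmas = {"small": 0.1, "medium": 1.0, "large": 3.0}
r2 = {}
for label, sigma in sigmas.items():
    E = sigma * rng.standard_normal((n_units, n_mis))
    X = np.outer(g, w) + E         # assumed model: X = g c'V'B + E
    r2[label] = r2_gci(X, w)
    print(f"{label:>6}: R2_GCI = {r2[label]:.3f}")
```

With unit weights, the least-squares score of each unit reduces to the row-wise arithmetic mean of its MIs, which is why the examples speak of the arithmetic mean when discussing the fit of the GCI.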

Example 2

Let us now consider a different situation, where the data matrix \(\mathbf{X}\) is divided into three blocks: the first three variables are normal with mean equal to 0 and variance equal to 1; the fourth to seventh variables are normal with mean equal to 3 and variance equal to 1; and the last three variables are normal with mean equal to 7 and variance equal to 1. This matrix \(\mathbf{X}\) is generated with a small error, and its heatmap is reported in Fig. 6.

Fig. 6 Heatmap of matrix \(\mathbf{X}\) formed by 3 blocks of variables

We can see that if we use the same weight system as before (\(\mathbf{c} = \mathbf{1}_3\) and \(\mathbf{B}=\mathbf{I}_{10}\)), the fit of the model is relatively poor, \(R_{GCI}^2=\frac{SS_{mod}}{SS_{tot}} =0.548\), because the arithmetic mean is not a good measure when variables are divided into blocks or are very different from each other (as in the example with large error). However, the values of \(R_{SCI_h}^2\) are very good:

\(R_{SCI_1}^2\)   \(R_{SCI_2}^2\)   \(R_{SCI_3}^2\)
0.999             0.999             1

Therefore, in this situation the researcher should avoid using the GCI and consider the SCIs instead. This is the case, for example, of the dimensions of well-being, where it is more appropriate to stop the analysis at the SCI level, with indicators such as Material Living Conditions and Quality of Life (Stiglitz et al. 2009).

Example 3

Let us consider the setting of Example 1, with the difference that the vector \(\mathbf{c}\) has three values not equal to 1. We thus have three clusters of MIs with different weights (i.e., importance) for the GCI, while within each cluster every MI has the same weight (i.e., equal to 1). The GCI is therefore a weighted mean of the MIs, and all MIs belonging to clusters with a high value of \(\mathbf{c}\) (i.e., a higher weight) are more important.

Let us consider \(\mathbf{c}=[0.8\;\, 0.5\;\, 0.6]'\) and measure the goodness of fit of the model by \(R_{GCI}^2\) and \(R_{SCI_h}^2\) for all levels of error:

Error             Small    Medium   Large
\(R_{GCI}^2\)     0.934    0.577    0.269
\(R_{SCI_1}^2\)   0.984    0.848    0.679
\(R_{SCI_2}^2\)   0.954    0.754    0.610
\(R_{SCI_3}^2\)   0.972    0.832    0.739

We can see that the values of \(R_{SCI_h}^2\) are greater than \(R_{GCI}^2\) for each level of error and for each SCI in this situation as well. This is because the generated data matrix is divided into three blocks and, as we have already seen, the arithmetic mean is not a good estimator when data are divided into different blocks of variables.

Let us now consider \(\mathbf{B}\) as a diagonal matrix (not equal to the identity matrix), so that each MI also has a different weight within every single cluster. For example, let \(\mathbf{B}= {\text{dg}}([0.9\;\, 0.7\;\, 0.8\;\, 0.8\;\, 0.6\;\, 0.9\;\, 0.7\;\, 0.7\;\, 0.6\;\, 0.8]')\) and \(\mathbf{c}=[0.7\;\, 0.6\;\, 0.5]'\), and let us measure the goodness of fit of the model by \(R_{GCI}^2\) and \(R_{SCI_h}^2\) for all levels of error:

Error             Small    Medium   Large
\(R_{GCI}^2\)     0.909    0.431    0.196
\(R_{SCI_1}^2\)   0.968    0.699    0.599
\(R_{SCI_2}^2\)   0.936    0.722    0.554
\(R_{SCI_3}^2\)   0.934    0.554    0.513

The results highlight the same situation studied previously: the arithmetic mean is a good GCI only when the MIs are similar.

Example 4

Let us generate a sample of 5000 vectors (of 1000 units) from a standard normal distribution and compute the empirical distribution of the variance for each type of normalization presented previously. The average of the sample variance for each type of normalization is reported:

Raw data   Standardized   Min–max   Norm dispersion
1.04       1              0.02      24,229.95

It is useful to recall that, for standardized data, the variance is equal to one for all MIs; this keeps the effect of the variability under control while maintaining the data centered, which is why standardization is the most widely used normalization method. The Min–max method keeps the range of the data constant; however, the variance is reduced toward zero. This might be considered a problem, because different MIs tend to be compressed, which reduces the differences among them. The variance of data normalized with the Normalized dispersion method depends on the mean of the raw data (i.e., they are inversely related): if the mean is close to zero the variance is very large (as in the table above); otherwise the variance decreases considerably.

Let us generate a sample of 5000 vectors (of 1000 units) from a normal distribution with mean equal to 10 and variance equal to 1 and let us compute the empirical distribution of variance for each type of normalization previously presented. The average of the sample variance for each type of normalization is reported:

Raw data   Standardized   Min–max   Norm dispersion
0.95       1              0.03      0.01

Fig. 7 Comparison among empirical distributions of variances

In Fig. 7, histograms show the differences among the empirical distributions of the variances under the different types of normalization.
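The simulation in Example 4 can be sketched as follows. The normalization formulas are defined earlier in the paper and are not reproduced in this appendix, so the sketch makes explicit assumptions: standardization subtracts the mean and divides by the standard deviation, Min–max maps each vector onto [0, 1], and the Normalized dispersion method is assumed to divide each value by the mean of its vector. The function name `average_variances` and the reduced number of replications are illustrative choices, and the printed values will differ from those reported above.

```python
import numpy as np


def average_variances(mean, n_vectors=5000, n_units=1000, seed=0):
    """Average sample variance of each normalization over simulated vectors."""
    rng = np.random.default_rng(seed)
    # One simulated vector per row, drawn from N(mean, 1).
    X = rng.normal(loc=mean, scale=1.0, size=(n_vectors, n_units))
    m = X.mean(axis=1, keepdims=True)
    s = X.std(axis=1, ddof=1, keepdims=True)
    lo = X.min(axis=1, keepdims=True)
    hi = X.max(axis=1, keepdims=True)
    transformed = {
        "raw": X,
        "standardized": (X - m) / s,
        "min-max": (X - lo) / (hi - lo),
        # Assumption: "normalized dispersion" divides each value by the mean.
        "norm dispersion": X / m,
    }
    return {
        name: float(values.var(axis=1, ddof=1).mean())
        for name, values in transformed.items()
    }


# Fewer replications than in the text (500 instead of 5000) to keep it light.
for mu in (0.0, 10.0):
    result = average_variances(mu, n_vectors=500)
    print(f"mean = {mu}:", {k: round(v, 3) for k, v in result.items()})
```

Under the assumed definition, the variance of the mean-scaled data equals the raw variance divided by the squared mean, which is consistent with the behaviour described in the text: it becomes very large when the mean is near zero and is close to 0.01 when the mean is 10.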

Example 5

Suppose \(a=0.8\) and \(b=0.3\). In order to guarantee a valid correlation matrix, c must lie in the range \([-0.332364, 0.812364]\). In fact, for different values of c the eigenvalues are:

\(c=0.9\)    \(c=0.8\)    \(c=-0.2\)   \(c=-0.4\)
−0.0651      0.0087       0.0658       −0.0369
0.7024       0.7000       1.1274       1.2293
2.3627       2.2913       1.8069       1.8076

Hence, values of c chosen outside the given range, e.g., \(c = -0.4\) or \(c = 0.9\), give a matrix \(\mathbf{A}\) that is not positive semi-definite, since it has at least one negative eigenvalue.
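The range and the eigenvalues of Example 5 can be checked numerically. The sketch below assumes that \(\mathbf{A}\) is the \(3 \times 3\) matrix with unit diagonal and off-diagonal entries a, b and c (its layout is not shown in this appendix; the eigenvalues do not depend on where the three values are placed). Requiring a non-negative determinant, \(1 + 2abc - a^2 - b^2 - c^2 \ge 0\), gives the interval \(ab \pm \sqrt{(1-a^2)(1-b^2)}\). The function names are illustrative.

```python
import numpy as np


def c_range(a, b):
    """Interval of c for which the 3 x 3 matrix is positive semi-definite."""
    half_width = np.sqrt((1 - a**2) * (1 - b**2))
    return a * b - half_width, a * b + half_width


def eigenvalues(a, b, c):
    """Eigenvalues (ascending) of the assumed matrix A."""
    A = np.array([[1, a, b],
                  [a, 1, c],
                  [b, c, 1]], dtype=float)
    return np.linalg.eigvalsh(A)


a, b = 0.8, 0.3
lo, hi = c_range(a, b)
print(f"admissible c: [{lo:.6f}, {hi:.6f}]")

for c in (0.9, 0.8, -0.2, -0.4):
    ev = eigenvalues(a, b, c)
    status = "valid" if ev[0] >= 0 else "not positive semi-definite"
    print(f"c = {c:+.1f}: eigenvalues = {np.round(ev, 4)} ({status})")
```

The smallest eigenvalue is negative exactly for the two values of c that fall outside the interval, in agreement with the table above.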

Example 6

Given a (\(100 \times 3\)) raw data matrix \(\mathbf{X}\) with correlation matrix \(\mathbf{R}\):

$$\begin{aligned} \mathbf{R} = \begin{bmatrix} 1 & 0.45 & -0.5\\ 0.45 & 1 & -0.8\\ -0.5 & -0.8 & 1\\ \end{bmatrix} \end{aligned}$$

After a normalization of \(\mathbf{X}\), we can reverse the polarity of the third MI with the transformation \(\tilde{x}_{ij}=1 - x_{ij}\). In this way, we obtain a new correlation matrix with all positive values:

$$\begin{aligned} \tilde{\mathbf{R}} = \begin{bmatrix} 1 & 0.45 & 0.5\\ 0.45 & 1 & 0.8\\ 0.5 & 0.8 & 1\\ \end{bmatrix} \end{aligned}$$

It is important to notice how this operation does not change the absolute value of correlations (i.e., intensity, level or magnitude).
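The effect of the polarity reversal in Example 6 can be illustrated with simulated data. This is a sketch under assumptions: the data are drawn from a multivariate normal distribution whose population correlation matrix is \(\mathbf{R}\), so the sample correlations only approximate the values above, and Min–max is assumed as the normalization, which the example does not specify.

```python
import numpy as np

rng = np.random.default_rng(0)

R = np.array([[1.0, 0.45, -0.5],
              [0.45, 1.0, -0.8],
              [-0.5, -0.8, 1.0]])

# Draw a (100 x 3) matrix whose population correlation matrix is R.
L = np.linalg.cholesky(R)
X = rng.standard_normal((100, 3)) @ L.T

# Min-max normalization (assumed here), so that every MI lies in [0, 1].
X_norm = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Reverse the polarity of the third MI: x_tilde = 1 - x.
X_rev = X_norm.copy()
X_rev[:, 2] = 1.0 - X_rev[:, 2]

R_before = np.corrcoef(X_norm, rowvar=False)
R_after = np.corrcoef(X_rev, rowvar=False)
print("before reversal:\n", np.round(R_before, 2))
print("after reversal:\n", np.round(R_after, 2))
```

Only the signs of the correlations involving the third MI change; the correlation between the first two MIs and all absolute values are left untouched.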

Example 7

Let us suppose that data have the hierarchical structure presented in Fig. 8.

Fig. 8 Path diagram of two-level GCI with three SCIs

If we try to fit these data with a model with only 2 factors, it becomes evident that two aspects are merged into a single factor:

 

                    Factor 1   Factor 2
Unidimensionality   2.737      0.556
Reliability         0.526      0.794

Factor 1 explains the first two clusters of MIs, and its measure of unidimensionality is higher than 1, so unidimensionality is not verified. We can also see that its Cronbach’s \(\alpha\) is not acceptable.

If we add one more factor, the measures change substantially:

 

                    Factor 1   Factor 2   Factor 3
Unidimensionality   0.400      0.618      0.556
Reliability         0.781      0.781      0.794

Here the values of Cronbach’s \(\alpha\) are all acceptable, and unidimensionality is verified for each factor (i.e., the variance of the second component of each cluster is lower than 1).
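The two diagnostics used in Example 7 can be sketched as follows: unidimensionality is taken, as stated above, as the variance of the second principal component of a cluster of MIs (the second-largest eigenvalue of its correlation matrix), and reliability as Cronbach's \(\alpha\). The simulated data are hypothetical: two independent latent scores with three MIs each, which is simpler than the hierarchical structure of Fig. 8 and is meant only to show what happens when two clusters are merged into one factor.

```python
import numpy as np


def cronbach_alpha(X):
    """Cronbach's alpha for the columns (items) of X."""
    k = X.shape[1]
    item_var = X.var(axis=0, ddof=1).sum()
    total_var = X.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var / total_var)


def second_eigenvalue(X):
    """Variance of the second principal component of the correlation matrix."""
    ev = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))  # ascending order
    return ev[-2]


rng = np.random.default_rng(0)
n = 500
f1, f2 = rng.standard_normal((2, n))   # hypothetical independent latent scores
block1 = f1[:, None] + 0.5 * rng.standard_normal((n, 3))
block2 = f2[:, None] + 0.5 * rng.standard_normal((n, 3))
merged = np.hstack([block1, block2])   # two clusters forced into one factor

for name, block in (("block 1", block1), ("block 2", block2), ("merged", merged)):
    print(f"{name}: unidimensionality = {second_eigenvalue(block):.3f}, "
          f"alpha = {cronbach_alpha(block):.3f}")
```

A single coherent cluster yields a second eigenvalue below 1, whereas the merged cluster yields a value above 1, signalling that more than one factor is needed.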


Cite this article

Cavicchia, C., Vichi, M. Statistical Model-Based Composite Indicators for Tracking Coherent Policy Conclusions. Soc Indic Res 156, 449–479 (2021). https://doi.org/10.1007/s11205-020-02318-7
