Using validity theory and psychometrics to evaluate and support expanded uses of existing scales

Houts, Carrie R.; Bush, Elizabeth Nicole; Edwards, Michael C.; Wirth, R. J.

doi:10.1007/s11136-022-03162-7

Special Section: Reducing Research Waste in (Health-Related) Quality of Life Research
Published: 03 June 2022

Volume 31, pages 2969–2975, (2022)
Quality of Life Research

Carrie R. Houts ORCID: orcid.org/0000-0003-1233-9389¹,
Elizabeth Nicole Bush²,
Michael C. Edwards¹ &
R. J. Wirth¹

Abstract

Background

Scale development is a complex activity requiring significant investments of time and money to generate evidence that a scale yields reliable scores and supports valid inferences. With increasing use of clinical outcome assessments (COAs) in medical product development, the evidentiary expectations of regulatory bodies to support inferences are a key consideration. The goal of this paper is to demonstrate how existing methods in measurement science can be used to identify and fill evidence gaps when considering re-purposing an existing scale for a new use case (e.g., a new patient population or an altered recall period), rather than creating a new COA tool.

Methods

We briefly review select validity theory and psychometric concepts, linking them to the nomenclature in the COA/regulated space. Four examples (two in-text and two in online supplemental materials) of modifications are presented to demonstrate these ideas in practice for quality of life (QOL)-related measures.

Results

Each example highlights the initial process of evaluating the desired validity claims, identifying gaps in evidence to support these claims, and determining how such gaps could be filled, often without having to develop a new measure.

Conclusions

If an existing scale can be shown to be fit for a new purpose with minimal modification or additional evidence, considerable effort can be saved and research waste avoided. In many cases, a new instrument is simply unnecessary. It is far better to recycle an “old” scale for a new use, provided there is sufficient evidence that it is fit for that purpose, than to “buy” a new one.

Funding

The authors did not receive support from any organization for the submitted work.

Author information

Authors and Affiliations

Vector Psychometric Group, LLC, Chapel Hill, USA
Carrie R. Houts, Michael C. Edwards & R. J. Wirth
Eli Lilly & Company, Indianapolis, USA
Elizabeth Nicole Bush

Corresponding author

Correspondence to Carrie R. Houts.

Ethics declarations

Conflict of interest

Elizabeth Nicole Bush is an employee and stockholder of Eli Lilly and Company. All other authors declare that they have no conflicts of interest.

Ethical approval

This article does not contain any studies with human participants performed by the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 26 KB)

About this article

Cite this article

Houts, C.R., Bush, E.N., Edwards, M.C. et al. Using validity theory and psychometrics to evaluate and support expanded uses of existing scales. Qual Life Res 31, 2969–2975 (2022). https://doi.org/10.1007/s11136-022-03162-7

Accepted: 13 May 2022
Published: 03 June 2022
Issue Date: October 2022
DOI: https://doi.org/10.1007/s11136-022-03162-7
