Abstract
Background
Scale development is a complex activity that requires significant investments of time and money to generate evidence that a scale yields reliable scores and supports valid inferences. With the increasing use of clinical outcome assessments (COAs) in medical product development, regulatory bodies’ evidentiary expectations for supporting inferences are a key consideration. The goal of this paper is to demonstrate how existing methods in measurement science can be used to identify and fill evidence gaps when re-purposing an existing scale for a new use case (e.g., a new patient population or an altered recall period), rather than creating a new COA tool.
Methods
We briefly review selected concepts from validity theory and psychometrics, linking them to the nomenclature used in the COA/regulated space. Four examples of modifications (two in-text and two in online supplemental materials) are presented to demonstrate these ideas in practice for quality of life (QOL)-related measures.
Results
Each example highlights the initial process of evaluating the desired validity claims, identifying gaps in evidence to support these claims, and determining how such gaps could be filled, often without having to develop a new measure.
Conclusions
If an existing scale, with minimal modification or additional evidence, can be shown to be fit for a new purpose, considerable effort can be saved and research waste avoided. In many cases, a new instrument is simply unnecessary. It is far better to recycle an “old” scale for a new use, with sufficient evidence that it is fit for that purpose, than to “buy” a new one.
Funding
The authors did not receive support from any organization for the submitted work.
Ethics declarations
Conflict of interest
Elizabeth Nicole Bush is an employee and stockholder of Eli Lilly and Company. All other authors declare that they have no conflicts of interest.
Ethical approval
This article does not contain any studies with human participants performed by the authors.
Cite this article
Houts, C.R., Bush, E.N., Edwards, M.C. et al. Using validity theory and psychometrics to evaluate and support expanded uses of existing scales. Qual Life Res 31, 2969–2975 (2022). https://doi.org/10.1007/s11136-022-03162-7
DOI: https://doi.org/10.1007/s11136-022-03162-7