ABSTRACT

We interviewed national security professionals to understand why the U.S. Intelligence Community has not systematically incorporated prediction markets or prediction polls into its intelligence reporting. This is surprising, since crowdsourcing platforms often generate more accurate predictions than traditional forms of intelligence analysis. Our interviews suggest three principal barriers to adopting these platforms: (i) bureaucratic politics, (ii) decision-makers’ lack of interest in probability estimates, and (iii) lack of knowledge about these platforms’ capabilities. Interviewees offered many actionable suggestions for addressing these challenges in future efforts to incorporate crowdsourcing platforms or other algorithmic tools into intelligence tradecraft.

Acknowledgments

Marco Allen and Sarah Turley provided excellent research assistance. Discussions with Perry World House’s Predictive Intelligence Assessment Methods (PRIAM) working group played a key role in shaping the project. We thank two anonymous reviewers for offering constructive feedback. We particularly appreciate the government officials who generously provided their time and insight to make this research possible. Funding from the French Agence Nationale de la Recherche (under the Investissement d’Avenir programme, ANR-17-EURE-0010) is gratefully acknowledged. The views expressed in this article are those of the authors and do not reflect the official policy of the U.S. Government. Author names have been ordered randomly.

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1. Betts, “Is Strategy an Illusion?”.

2. Jervis, System Effects.

3. Stastny and Lehner, “Comparative Evaluation of the Forecast Accuracy of Analysis Reports and a Prediction Market”; Tetlock and Gardner, Superforecasting, 1–24.

4. Horowitz, Kahn, and Samotin, “A Force for the Future”.

5. Sims, “Decision Advantage and the Nature of Intelligence Analysis”; Jensen, Whyte, and Cuomo, “Algorithms at War”; Zegart, Spies, Lies, and Algorithms.

6. For general background on prediction markets as a forecasting tool, see Arrow et al., “The Promise of Prediction Markets”; Wolfers and Zitzewitz, “Prediction Markets”.

7. Horowitz, Ciocca, Kahn, and Ruhl, Keeping Score.

8. McHenry, “Three IARPA Forecasting Efforts”.

9. Stastny and Lehner, “Comparative Evaluation of the Forecast Accuracy of Analysis Reports and a Prediction Market”.

10. Interviewee 7.

11. Interviewee 25.

12. Chang, Chen, Mellers, and Tetlock, “Developing Expert Political Judgment: The Impact of Training and Practice on Judgmental Accuracy in Geopolitical Forecasting Tournaments”.

13. Horowitz et al., “What Makes Foreign Policy Teams Tick”.

14. Satopää et al., “Combining Multiple Probability Predictions Using a Simple Logit Model”.

15. Tetlock and Gardner, Superforecasting, 1–24.

16. Davis, Bagozzi, and Warshaw, “User Acceptance of Computer Technology”; Adams, Nelson, and Todd, “Perceived Usefulness, Ease of Use, and Usage of Information Technology”; Holden and Karsh, “The Technology Acceptance Model”.

17. See, for example, Stoll, “Why the Web Won’t Be Nirvana”.

18. Matthews, “Why Are People Skeptical about Climate Change?”; Whitmarsh, “Skepticism and Uncertainty about Climate Change”.

19. For instance, see Damasio, Descartes’ Error. On the relevance of this research for intelligence studies, see Heuer, Psychology of Intelligence Analysis.

20. Camerer and Kunreuther, “Decision Processes for Low Probability Events”; Magat, Viscusi, and Huber, “Risk-Dollar Tradeoffs, Risk Perceptions, and Consumer Behavior”; Huber, Wider, and Huber, “Active Information Search and Complete Information Presentation in Naturalistic Risky Decision Tasks”.

21. Peters, Hibbard, Slovic, and Dieckmann, “Numeracy Skill and the Communication, Comprehension, and Use of Risk-Benefit Information”.

22. Kunreuther, Novemsky, and Kahneman, “Making Low Probabilities Useful”; Peters et al., “Numeracy Skill”.

23. Slovic, Finucane, Peters, and MacGregor, “Risk as Analysis and Risk as Feelings”.

24. Snyder, “Richness, Rigor, and Relevance in the Study of Soviet Foreign Policy”.

25. See, for example, Kent, “Words of Estimative Probability”; Lanir and Kahneman, “An Experiment in Decision Analysis in Israel in 1975”; Marchio, “The Intelligence Community’s Struggle to Express Analytic Uncertainty in the 1970s”.

26. Horowitz et al., Keeping Score, 17–18.

27. Indeed, the Good Judgment Project showed that crowdsourced prediction polls were generally more accurate than the judgments of any individual analyst, and that subject matter expertise had relatively little impact on the accuracy of analysts’ forecasts. See Tetlock and Gardner, Superforecasting, 250–270.

28. See, for example, Tetlock, Expert Political Judgment. For broader evidence that the collective wisdom of amateurs can be as reliable as expert judgment, see Allen, Pennycook, and Rand, “Scaling Up Fact-Checking Using the Wisdom of Crowds”.

29. Barnes, “Making Intelligence Analysis More Intelligent”; and Marrin, “Evaluating the Quality of Intelligence Analysis”.

30. Halperin and Clapp, Bureaucratic Politics and Foreign Policy, 25–61.

31. Interviews were conducted between June 2021 and May 2022.

32. We thus treated each interview as a unit of observation with five binary variables, each reflecting whether the interviewee’s views were consistent with a given hypothesis. In some cases, interviewees volunteered information about obstacles to implementing crowdsourced forecasts in response to other questions; when those responses were relevant to a hypothesis, they were also coded accordingly. Each member of the research team independently coded every interview, and discrepancies were identified and discussed until the team reached unanimous judgments on the binary measures.
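
For readers who want a concrete picture of this coding scheme, the sketch below shows one way the interview-level data could be represented and checked for intercoder agreement. It is a minimal illustration, not the authors’ actual procedure or data: the hypothesis labels, interview IDs, and per-coder judgments are hypothetical placeholders.

```python
# Minimal sketch of the coding setup described in note 32 (hypothetical data).
# Each interview is a unit of observation with five binary variables, one per hypothesis;
# each coder codes every interview independently, and disagreements are flagged for discussion.

HYPOTHESES = ["H1", "H2", "H3", "H4", "H5"]  # placeholder hypothesis labels

# coder -> interview ID -> {hypothesis: 0 or 1}
codings = {
    "coder_A": {1: {"H1": 1, "H2": 0, "H3": 1, "H4": 0, "H5": 0}},
    "coder_B": {1: {"H1": 1, "H2": 1, "H3": 1, "H4": 0, "H5": 0}},
}

def discrepancies(codings):
    """Return (interview, hypothesis) pairs on which coders disagree."""
    coders = list(codings)
    flagged = []
    for interview in codings[coders[0]]:
        for h in HYPOTHESES:
            values = {codings[c][interview][h] for c in coders}
            if len(values) > 1:  # coders disagree; discuss until unanimous
                flagged.append((interview, h))
    return flagged

print(discrepancies(codings))  # e.g. [(1, 'H2')]
```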

33. Goldstein, Getting in the Door, 670.

34. Interviews were conducted in person, over Zoom, and by telephone, depending on the availability, location, and preference of the interviewees. All interviews lasted thirty minutes, and interviewees were not compensated for their time. All interviews were conducted under the auspices of University of Pennsylvania IRB Protocol #848953.

35. Berry, “Validity and Reliability Issues in Elite Interviewing”.

36. See, for example, Braun and Clarke, “Using Thematic Analysis in Psychology”.

37. Greenstein and Mosley, “When Talk Isn’t Cheap”; and Leech et al., “Lessons from the ‘Lobbying and Policy Change’ Project”, 214.

38. The difference in response proportions among the three ‘top tier’ responses is not statistically significant. For instance, the gap between response rates pertaining to bureaucratic politics and perceived efficacy has a p-value of 0.48.
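As a rough illustration of the kind of comparison described here, the sketch below runs a two-sided, two-proportion z-test on hypothetical response counts. The counts are placeholders rather than the study’s actual figures, and the note does not specify which test the authors used, so this is only one plausible way to compute such a p-value.

```python
# Hypothetical illustration of comparing two response proportions (note 38).
# The counts below are placeholders; the article does not report the underlying figures.
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for the difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# e.g. 14 of 25 interviewees raising one obstacle vs. 12 of 25 raising another (hypothetical)
print(round(two_proportion_z_test(14, 25, 12, 25), 2))
```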

39. As noted in Section 1, organizations such as IARPA are tasked with developing analytic tools with broad applicability, but not with managing the implementation of mature operational systems.

40. Interviewee 17 noted that this idea would only work when using unclassified platforms; the next section will return to the question of whether forecasting tools would be more valuable in classified versus unclassified settings.

41. These questions were designed to address directions for future research suggested in Horowitz et al., Keeping Score.

43. Fingar, Reducing Uncertainty.

44. Interviewees 4, 5, 7, 13, 17, 20, 21, 22, 25.

45. Interviewees 4, 5, 7, and 10.

46. Interviewees 3, 4, 10, 12, 13, 20, 21, 23, 24, 25.

47. Interviewees 2, 4, 5, 9, 11, 17, 18, 23. Interviewee 9 specifically recommended granting program management to the National Intelligence Officer for Warning. Interviewee 2 recommended the NIC’s Strategic Futures Group.

48. Several other intelligence organizations, such as the Open Source Center, the National Geospatial-Intelligence Agency, and the MITRE Corporation, received one vote apiece (Interviewees 9, 12, 18). Interviewee 24 suggested locating the program in Congress, but Interviewee 23 opposed that idea. No interviewees suggested housing the program with the Department of Defense (DoD), and three expressly said that DoD would be a poor home for the project (Interviewees 5, 7, 10).

49. Werbach and Hunter, For the Win.

50. Another approach would be for the IC to analyze which of its personnel were most likely to participate in the ICPM. This would provide a behavioral measure of why some analysts or offices were more likely than others to support crowdsourced forecasts. However, given that very few analysts ultimately cited the ICPM in finished reporting, this method would primarily serve to explain why some analysts demonstrated interest in algorithmic forecasting, which is not the same as understanding the conditions under which analysts would be more or less likely to integrate that methodology into intelligence tradecraft.

51. Friedman, Baker, Mellers, Tetlock, and Zeckhauser, “The Value of Precision in Probability Assessment”.

52. Friedman, Lerner, and Zeckhauser, “Behavioral Consequences of Probabilistic Precision”.

53. Tetlock and Gardner, Superforecasting.

Additional information

Funding

Funding from the French Agence Nationale de la Recherche (under the Investissement d’Avenir programme, ANR-17-EURE-0010) is gratefully acknowledged.

Notes on contributors

Laura Resnick Samotin

Laura Resnick Samotin is an adjunct assistant professor of international relations and Director of Strategic Partnerships at the School of International and Public Affairs, Columbia University. Her research focuses on the non-material determinants of military effectiveness and has been published in Journal of Politics, Foreign Affairs, and edited volumes from Cambridge University Press and Georgetown University Press.

Jeffrey A. Friedman

Jeffrey A. Friedman is an Associate Professor of Government at Dartmouth College. His research focuses on how risk and uncertainty shape foreign policy decision-making. He is the author of War and Chance: Assessing Uncertainty in International Politics (Oxford, 2019).

Michael C. Horowitz

Michael C. Horowitz is Director of Perry World House and Richard Perry Professor at the University of Pennsylvania. He is the author of The Diffusion of Military Power: Causes and Consequences for International Politics, and the co-author of Why Leaders Fight. His research interests include the intersection of emerging technologies such as artificial intelligence and robotics with global politics, military innovation, the role of leaders in international politics, and geopolitical forecasting methodology.
