
Information Explosion

by Latanya Sweeney, Ph.D.

Abstract

In this chapter, I examine the tremendous growth in information being collected on individuals. The examples provided in this chapter make clear that many details in the lives of most people are being documented in databases somewhere. I provide examples that illustrate recent behavioral tendencies in the collection of person-specific data. These tendencies are: (1) given an existing person-specific data collection, expand the number of fields being collected; I term this the "collect more" trend; (2) replace an existing aggregate data collection with a person-specific one; I term this the "collect specifically" trend; and (3) given a question or problem to solve, or merely provided the opportunity, gather information by starting a new person-specific data collection related to the question, problem or opportunity; I term this the "collect it if you can" trend. All three tendencies result in more and more information being collected on individuals. Having so much sensitive information available makes it even more difficult for other organizations to release information that is effectively anonymous.

Keywords: data collection practices, privacy

Citation:
L. Sweeney. Information Explosion. Confidentiality, Disclosure, and Data Access: Theory and Practical Applications for Statistical Agencies, L. Zayatz, P. Doyle, J. Theeuwes and J. Lane (eds), Urban Institute, Washington, DC, 2001. Paper: 26 pages in PS or PDF.

Sample of data collections discussed

Characterizing the amount of information collected on individuals

The term Global Disk Storage per Person (GDSP) is defined in this paper as the amount of rigid disk drive space sold in a year divided by the adult world population. This measure provides a means for characterizing the amount of disk drive storage that could be used to store information on individuals. Below is a graph of GDSP over time. Also reported is how that figure translates into how much data storage is available to record a minute of a person's time.
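
The GDSP definition above can be sketched as a short calculation. This is only an illustration, not the computation used in the paper: the function names are hypothetical, the input figures are placeholders rather than data from the paper, and the per-minute figure assumes that one year's GDSP is spread evenly over the minutes of that year.

```python
# Minimal sketch of the GDSP arithmetic described in the text.
# All names and input values here are illustrative assumptions.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; leap years are ignored


def gdsp(disk_bytes_sold: float, adult_population: float) -> float:
    """Disk drive space sold in a year divided by the adult world population."""
    if adult_population <= 0:
        raise ValueError("adult_population must be positive")
    return disk_bytes_sold / adult_population


def bytes_per_person_minute(gdsp_bytes: float) -> float:
    """Storage available per minute of a person's time.

    Assumption: a year's GDSP is divided evenly across that year's minutes.
    """
    return gdsp_bytes / MINUTES_PER_YEAR


if __name__ == "__main__":
    # Placeholder inputs, NOT figures from the paper.
    sold = 2.0e18    # bytes of disk drive space sold in one year
    adults = 4.0e9   # adult world population

    per_person = gdsp(sold, adults)
    print(f"GDSP: {per_person / 1e6:.1f} MB per adult")
    print(f"Per minute: {bytes_per_person_minute(per_person):.1f} bytes")
```

Substituting actual yearly sales and population figures for the placeholders would give one point on the GDSP curve for each year.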

Publication



Last modified 1/2003 by latanya@privacy.cs.cmu.edu