By Carl Bialik
Wall Street Journal (February 5, 2010)
My print column this week examines a quirk in U.S. Census Bureau data that may have led to research errors. A National Bureau of Economic Research working paper demonstrated this week that the so-called microdata for several surveys contained flaws. Microdata are a subset of all Census responses, released to researchers who want to dig deeper into demographic trends.
"This whole issue arose from our attempts to preserve privacy," said Robert M. Groves, director of the Census Bureau. The agency takes several steps to scrub microdata of any information that might reveal the identity of a census respondent. These steps include changing responses slightly. "People who have a really rare combination of attributes have a higher likelihood of being disclosed," Groves said. "Part of disclosure-risk analysis looks at those rare combinations. Our pledge to that person is that no matter what we do, ever, there is no way that person can be reidentified."
The trick is making these changes without creating faulty or misleading data, and statisticians are still working out how to accomplish this.