Big Data

Big Data is a term WPF uses to describe very large datasets and the technologies and practices of handling those datasets. Typically, Big Data datasets are so large that traditional database systems are not able to handle or analyze them.

Sources for Big Data are many and varied. They include web data, sensors, cell towers, census data and other data from the government, social media, transactional data, and a variety of other data collection systems.

We have seen a tendency to use the term Big Data as a loosely defined stand-in for a number of privacy issues that sound similar but are not. For example, the terms Big Data and Data Brokers are sometimes used together as if they were interchangeable. The two ideas are distinct, and it is crucial for public policy and discussion that they not be conflated. It is possible to work with Big Data and never be a Data Broker.

Large datasets are intriguing to the World Privacy Forum, and our research on large datasets resulting from sensors and ID cards in Asia helped us understand and explore the issue in depth. Large datasets sometimes present privacy challenges, but sometimes they do not. Much depends on how the data is collected, managed, stored, and otherwise handled. Understanding these differences, and knowing when and where the challenges arise, will be important in this rapidly evolving space.

 

Public comments: WPF encourages NIST to refine report on de-identification of personally identifiable information

The World Privacy Forum submitted comments today to the National Institute of Standards and Technology in response to its publication, Draft Report on De-Identification of Personally Identifiable Information (NISTIR 8053). The WPF welcomes the draft NIST report, as the area of de-identification and re-identification of personal data swirls with controversy and confusion. We see considerable value

Medical identity theft and electronic health care records: risks and solutions

Executive Director Pam Dixon will be speaking this Friday at the National Association of Healthcare Journalists about electronic records, the risk of medical identity theft, and other risks that arise from data breaches of medical records. Dixon’s talk will cover new research and discuss potential solutions to the problems. Details: When: Friday,

Collections Scoring, Privacy, and Consumer Impacts

This coming Thursday, Pam Dixon will be presenting new research on collections scoring, privacy, and impacts on low- and middle-income consumers. The Dixon/Gellman report, The Scoring of America, sparked a national conversation about analytics and fairness in the realm of consumer scores. This talk focuses on one particular category of scoring, that of using

Video: Healthy Cities Project in China — 20 million health records in the cloud (CES 2015, interview)

The Healthy Cities Project in China uses mobile devices, mobile health mini-hubs, and sensors as the key way for patients, doctors, government, and enterprises to input, monitor, and access vital health statistics and other information in the cloud. Twenty million people already use this system. Healthy Cities is important to study because it is a fully established infrastructure in the Chinese cities where it has been deployed. US academics are studying the project to see how it could be replicated in the domestic marketplace.