Big Data

Big Data is a term WPF uses to describe very large datasets and the technologies and practices of handling those datasets. Typically, Big Data datasets are so large that traditional database systems are not able to handle or analyze them.

Sources for Big Data are many and varied. They include web data, sensors, cell towers, census data and other data from the government, social media, transactional data, and a variety of other data collection systems.

We have seen a tendency to use the term Big Data as a loosely defined stand-in for a number of privacy issues that sound similar but are not. For example, the terms Big Data and Data Brokers are sometimes used together as if they meant the same thing. The two ideas are distinct, and it is crucial for public policy and discussion that they are not conflated or even treated as similar. It is possible to work with Big Data and never be a Data Broker.

Large datasets are intriguing to the World Privacy Forum, and our research on large datasets resulting from sensors and ID cards in Asia helped us understand and explore the issue in depth. Large datasets sometimes present privacy challenges, but sometimes they do not. Much depends on how the data is collected, managed, stored, and otherwise handled. Understanding these differences, and knowing when and where the challenges arise, will be important in this rapidly evolving space.


Video: Healthy Cities Project in China — 20 million health records in the cloud (CES 2015, interview)

In the Healthy Cities Project in China, mobile devices, mobile health mini-hubs, and sensors are the primary means by which patients, doctors, government, and enterprises input, monitor, and access vital health statistics and other information in the cloud. Twenty million people already use this system. Healthy Cities is important to study because it is a fully established infrastructure in the Chinese cities where it has been deployed. In the US, academics are studying the project to see how it could be replicated in the American marketplace.

Panel talk: Big data, privacy, and vulnerable populations

Pam Dixon will be speaking at the IAPP-FTC Practical Privacy Conference in Washington DC this week. The conference is from Dec. 2-3. Her panel talk will focus on privacy issues relating to identifiable large datasets and vulnerable populations. She will also be discussing the role of data brokers in compiling datasets and categorizing people, as

Public comments: WPF urges FTC to focus on providing statistical parity for consumers (Big Data workshop)

WPF urges FTC to focus on consumers’ ability to control their digital exhaust and on statistical parity for the big data era. At the FTC workshop on Big Data on September 15, “Big Data: A Tool for Inclusion or Exclusion?”, panelists including the World Privacy Forum discussed legal and ethical frameworks that are applicable to large datasets and issues

Privacy Spotlight: FTC Big Data Event

Big Data and its potential for inclusion and exclusion took center stage this past September as the FTC held a day-long workshop with experts from industry, technology, privacy, civil liberties, and academia. World Privacy Forum’s Executive Director Pam Dixon, a panelist at the event, spoke about Big Data and privacy, emphasizing several key points, including the need for statistical parity, the need for fairness, and the need to keep existing consumer protection regulation.

FTC announces final agenda and panelist roster for Big Data workshop

The Federal Trade Commission has announced its panelist roster and final agenda for its upcoming workshop, “Big Data: A Tool for Inclusion or Exclusion?” The workshop, which takes place on Sept. 15 in Washington, D.C., promises an important and thoughtful discussion of big data and privacy issues. The World Privacy