Public comments: WPF urges FTC to focus on providing statistical parity for consumers (Big Data workshop)

WPF urges FTC to focus on consumers’ ability to control their digital exhaust and on statistical parity in the big data era

At the FTC’s September 15 workshop on big data, Big Data: Tool for Inclusion or Exclusion?, panelists, including the World Privacy Forum, discussed the legal and ethical frameworks that apply to large datasets and the issues of discrimination and privacy they raise. The view we articulated was that we should not throw out Fair Information Practices or existing regulation. However, we acknowledged that something is missing in the area of consumer protection, and we coined the term “statistical parity” to describe this gap.

Here is our definition of the term:

Statistical parity means ensuring that all parts of the consumer data analytics process are fair: data collection; which data factors are chosen and used for analytics; the accuracy of those factors and how well the algorithm works for its intended purpose; and how the final results are vetted and used, and for how long. Statistical parity means finding ways to ensure privacy and fairness in the analytics process from beginning to end, and to ensure that decisions about consumers are accurate and are used fairly and in a non-discriminatory way.

We did not mean that each algorithm needs to be seen by consumers or the FTC; that is neither feasible nor a desirable outcome. We did mean that the fairness and accuracy of the underlying factors, how well the algorithm actually works, how the resulting analysis is used for consumers, and consumer access to the most meaningful analysis are important, and together form a new area we need to pay attention to.
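To make one narrow slice of this concrete: part of asking "how the resulting analysis is used for consumers" is asking whether an automated decision is handed out at similar rates across groups of consumers. The sketch below is purely illustrative and is not the WPF's methodology or any regulator's test; the data, group labels, and helper functions are hypothetical, and it addresses only outcome rates, not the accuracy of factors or the vetting of results discussed above.

```python
# Illustrative sketch only (hypothetical data, not WPF's methodology):
# compare the rate of favorable outcomes across consumer groups.

from collections import defaultdict


def favorable_rates(records):
    """Return the share of favorable decisions per group.

    records: iterable of (group_label, decision) pairs, where decision is
    True for a favorable outcome (e.g., an offer extended) and False otherwise.
    """
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for group, decision in records:
        totals[group] += 1
        if decision:
            favorable[group] += 1
    return {group: favorable[group] / totals[group] for group in totals}


def max_rate_gap(rates):
    """Largest difference in favorable-outcome rates between any two groups."""
    values = list(rates.values())
    return max(values) - min(values)


if __name__ == "__main__":
    # Hypothetical scored decisions: (group, received_favorable_offer)
    sample = [("A", True), ("A", True), ("A", False),
              ("B", True), ("B", False), ("B", False)]
    rates = favorable_rates(sample)
    print(rates)                # {'A': 0.666..., 'B': 0.333...}
    print(max_rate_gap(rates))  # a large gap flags the outcomes for review
```

A gap like this does not by itself prove unfairness; it is only one signal that the factors, the algorithm, and the use of its results deserve a closer look.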

Nowhere is the need for additional work on statistical parity clearer than in the context of consumer scoring and predictive analytics. Consumer scores are the outgrowth of a robust and growing market for, and use of, predictive analytics. But as modern statistical shorthand, scores matter. We published a substantial report on this issue in 2014, The Scoring of America.

In the comments, we discuss at length how fairness relates to the scoring of consumers, as well as fairness in how consumers are categorized. On the important issue of digital exhaust, we wrote:

How this fundamental issue of rights to shape our digital exhaust is navigated will have long-term impacts on privacy, and on certain questions related to large datasets, when those datasets contain personally identifiable information, or can be linked to or impact an individual. While some large data sets are genuinely not able to be tied back to an individual, some are. It is the datasets that contain individually identifiable digital exhaust that we are concerned with here.

Read WPF Comments to the FTC Big Data: Tool for Inclusion or Exclusion? Workshop (PDF, 12 pages)