Public Comments: August 2006 – FTC Complaint About AOL Search Data Releases



Internet privacy — The World Privacy Forum announced today that it would be filing a complaint with the Federal Trade Commission about the posting by AOL of a portion of its users’ search data on the Internet. While the data was not expressly identified by name, the search queries themselves included in some cases personally identifiable information such as individuals’ names, Social Security Numbers, and myriad other personal information. The World Privacy Forum urges consumers to take precautions when using search engines.


Before the
Federal Trade Commission
Washington, DC 20580

In the Matter of AMERICA ONLINE LLC, a majority-owned subsidiary of TIME WARNER INC.

Complaint and Request for Investigation, Injunction, and Other Relief



1. This complaint concerns the disclosure of personal information by America Online LLC (AOL) in violation of its privacy policy and in violation of consumer expectations for the maintenance, use, and disclosure of personal information. As set forth in detail below, AOL engaged in unfair or deceptive acts or practices as defined by Section 5(a) of the FTC Act.

2. AOL is an Internet service company that operates a network of web brands as well as a large Internet access subscription service in the United States. The services AOL provides to Internet subscribers and users are at the heart of this complaint.

3. The World Privacy Forum alleges that AOL knowingly released at least two sets of user search query data to the public, that it announced the release of the data to the public, and that the released user data contains personally identifiable information as well as non-personally identifiable information that can be used to affirmatively identify some individuals. Specifically, AOL released an unedited data set of 21,011,340 new web search queries that had been conducted at <>. Each query included a unique ID number that represented one of 657,426 user accounts. Each account may have been used by one or more individuals or others. This data set covered searches over a three-month period of time from March to May of 2006. The material was released on AOL’s research web site <> on or about July 31, 2006. On or about the same date, AOL also released user search data from Fall 2004 containing approximately 20,000 user search queries at < edQueries >. The World Privacy Forum alleges that these actions comprise a pattern of personal data activities contravening the AOL privacy policy. The World Privacy Forum further alleges that the public release of these data constitutes unfair or deceptive acts or practices as defined by Section 5(a) of the FTC Act.



4. The World Privacy Forum is a nonprofit, non-partisan 501(c)(3) public interest research group that focuses on conducting in-depth research and consumer education in the area of privacy. The World Privacy Forum’s work covers a broad range of privacy topics, including consumer privacy topics, and it routinely comments on governmental and private sector activities affecting privacy. Recent World Privacy Forum activities include research on medical identity theft, consumer educational material regarding medical identity theft, and various agency comments. Recent reports include Medical Identity Theft: The Crime that Can Kill (May 2006), and Call Don’t Click 1 & 2 (2005), reports about, and related issues. The World Privacy Forum <> is based in San Diego, California.

5. AOL is a majority-owned subsidiary of Time Warner Inc. with principal offices at 22000 AOL Way, Dulles, VA 20166-9302. Time Warner Inc. is a Delaware corporation and maintains its principal offices at One Time Warner Center, New York, NY 10019.

6. AOL is a corporation as defined by Section 4 of the Federal Trade Commission Act, 15 U.S.C. § 44.

7. The acts and practices described in this complaint constitute commerce within the meaning of Section 4 of the Federal Trade Commission Act, 15 U.S.C. § 44.



AOL Policies

8. AOL states in its mission statement at <> that its mission is “To build a global medium as central to people’s lives as the telephone or televisions … and even more valuable.” [Emphasis AOL’s]. AOL also states that “We led the way in protecting our members from spam, viruses, hackers and other dangers” and that “We do all these things and much more because at AOL, we are dedicated to the simple premise that our members and consumers deserve the best possible – and most valuable – online experience available anywhere.” These statements would allow a consumer to conclude that AOL was truthfully describing its dedication to protect its members from “hackers and other dangers.”

This statement is attached hereto as Exhibit A.

9. AOL issued a press release on September 19, 2005 stating “AOL Named Most Trusted Web Portal/ISP for Privacy in National Consumer Survey” <>. The survey was conducted by TRUSTe and was based on more than 7,000 respondents. More than a third of respondents cited the following factors as key points to consider in judging the companies’ trustworthiness, as quoted from the AOL press release:

  • “Overall reputation of the company for product or service quality
  • The company’s limits over the collection, use and sharing of personal information
  • Quality of advertisements and solicitations that are respectful of my privacy requirements and rights
  • Sense of security protections when providing personal information
  • The privacy policy of the company”

Thus, as revealed in the survey, AOL users had quantifiable expectations of privacy when using the AOL service, expectations that AOL acknowledged and encouraged by issuing its press release.

This press release is attached hereto as Exhibit B.

10. AOL maintains a search engine at <>.
A screen shot of the AOL Search home page is attached hereto as Exhibit C.

11. The applicable privacy policy for the AOL search engine is the AOL Network Privacy Policy at <>. The privacy policy does not disclose at any point that users’ search query information will be released or disclosed to the public. The policy does not state that users’ information will be released or disclosed to the public in an unedited, unfiltered fashion. Releasing user information to the public is a significant enough data practice that it should have been disclosed explicitly in the privacy policy affecting users. Releasing unedited, unfiltered user information to the public is an even more significant data practice.

The AOL Network Privacy Policy is attached hereto as Exhibit D.

12. The AOL Network Frequently Asked Questions page <> also applies to the AOL search engine. At this page, AOL states:

“How does the AOL Network use information about searches?

Our goal is to better personalize your experience with the AOL Network. The AOL Network plans to use information about the searches you perform through the Network, and how you use the results of those searches, to help customize and improve an AOL Network user’s search results and, over time, to provide more relevant content and offers to you. To provide users with control and privacy, the AOL Network will allow a user to disable this personalization functionality for either specific searches or all searches, as well as to review and/or delete any or all of his/her past searches.”

This statement in the last sentence indicates to users that they will be allowed to review and if they choose, to delete past searches conducted at the AOL search engine. This statement can be inferred to apply to any user of AOL’s search engine. In the same document, AOL states that some of its products are openly available to registered and non-registered users of AOL:

“What is the difference between registered and non-registered users of the AOL Network?

Some of the content and features from properties in the AOL Network are available for free to any Web visitor. They include maps and directions from Mapquest, movie information from Moviefone, local events and reviews from Cityguide, shopping advice from inStore, and news and information from and other AOL Network properties.

Other personalized and enhanced features and content from the AOL Network require a user to register to gain access to them, like AIM Mail, Address Book, Calendar, Message Boards, and My AOL. Registration for the AOL Network is absolutely free, and users only have to register once to access personalized and enhanced content from across the Network.”

Although AOL did not expressly mention its search engine in this paragraph, its search engine is one of the free products available to non-registered web users. The document did not state that only registered users could delete searches, therefore, all web users of AOL’s search service appear to have the right to delete searches according to the published AOL policy.

This document is attached hereto as Exhibit E.

AOL Research Site

13. AOL maintained an online research site that was publicly available at <> from on or about July 28, 2006 to on or about August 7, 2006. Users did not need a special password to access the site pages or explore the files. The Google search engine, in fact, cached the site on August 7, 2006 at 7:00 GMT. The AOL Research site’s welcome message stated:

“In our modern information-rich society, we have chosen to take on the challenge of helping people find and consume the information they seek. To that end, we accept that such a challenge cannot be met by a single individual or even a single company. With AOL Research, we are excited to introduce an open-ended research community where data, APIs and research results are shared.”

The welcome message also stated:

“Take a look at the Test Collections section where you will find raw data sets and an open wiki-style community for each one.”

The “News” section of the site noted that on 7/31/2006 test collections had been added to the site.

The cached welcome page of the AOL Research Site is hereto attached as Exhibit F.

14. The AOL Research site included a “Research APIs Terms of Service.” The site was by all appearances an intentional site, and the data releases were intentional. The “Research APIs Terms of Service” outlined the terms under which AOL Research would allow individuals to use the data it was releasing. It said, for example:

“You may not use the search results provided by AOL Research APIs service with an existing product or service that competes with products or services offered by AOL.” [1]

The Research Terms of Service sought to protect AOL’s interests, but it did not, anywhere in the document, seek to protect or even mention the privacy interests of research subjects.

In the Research Terms of Service, AOL further stated:

“AOL disclaims any warranties regarding the security, reliability, timeliness, availability, and performance of AOL Research APIs.” [2]

The AOL Research Terms of Service is hereto attached as Exhibit G.

AOL Release of Spring 2006 User Data

15. AOL users who used the AOL search engine for the period of March 1 to May 31, 2006 had their information recorded, logged, and stored by AOL. [3]

16. AOL users who used the search engine in the Fall of 2004 also had their web queries recorded, logged and stored by AOL. [4] AOL’s tracking of web queries during 2004 and 2006 appeared to be a normal practice.

17. AOL randomly selected 657,426 of its “real world users” [5] with search queries logged at <> during the period of March 1 to May 31, 2006. AOL kept a record of the time of each search, and the time that its users clicked on a web site that appeared in the search results. The time stamps included at least the date, hour, minute, and second. According to the “Read me” document AOL released with the data, “The goal of this collection is to provide real query log data that is based on real users. It could be used for personalization, query reformulation or other types of search research.” [6]

The AOL Read Me document (U500k_README.txt) for the Spring 2006 data is hereto attached as Exhibit H.

18. This Read me document reveals that AOL knew that the data was from actual users and would be used for multiple kinds of research. Indeed, the statement suggests such a broad concept of research as to include almost any type of data use within its scope. Research that “could be used for personalization” implies some level of identification or at least tracking of searchers.

19. AOL put the selected data from Spring 2006 into a text format that sorted the queries by unique AOL user. Each user was given a number called AnonID under which the search terms were grouped. In so doing, AOL created a unified 3-month long data trail for each user, thus allowing readers of the data to associate the user’s queries over a long period of time. In so doing, associational information a user typed in over time allowed the user to become increasingly recognizable to third parties in some cases. [7]
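The effect of grouping queries under a single AnonID can be illustrated with a minimal sketch. The records below are synthetic and hypothetical, invented solely for illustration; they are not drawn from the AOL release, and the tuple layout and function name are assumptions rather than anything AOL published.

```python
from collections import defaultdict

# Hypothetical, synthetic records: (anon_id, query, query_time).
# None of these values come from the released data.
records = [
    (1001, "weather forecast", "2006-03-02 08:15:00"),
    (2002, "used cars", "2006-03-02 09:00:10"),
    (1001, "plumbers in exampletown", "2006-04-11 19:30:45"),
    (1001, "jane q sample", "2006-05-20 21:05:12"),
]


def build_trails(rows):
    """Group queries by AnonID and order each group by time."""
    trails = defaultdict(list)
    for anon_id, query, query_time in rows:
        trails[anon_id].append((query_time, query))
    # Timestamps in year-month-day order sort correctly as plain strings.
    return {anon_id: sorted(events) for anon_id, events in trails.items()}


trails = build_trails(records)
for query_time, query in trails[1001]:
    print(query_time, query)
```

Read individually, each synthetic query reveals little. Read together under one identifier, a location and a name appear side by side, which is the kind of accumulation over time that this paragraph describes.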

20. The Read me document that AOL included with the released data files contained a description of the Spring 2006 data set. A portion of the Read me document is included below:

“The data set includes {AnonID, Query, QueryTime, ItemRank, ClickURL}.
AnonID – an anonymous user ID number.
Query – the query issued by the user, case shifted with most punctuation removed.
QueryTime – the time at which the query was submitted for search.
ItemRank – if the user clicked on a search result, the rank of the item on which they clicked is listed.
ClickURL – if the user clicked on a search result, the domain portion of the URL in the clicked result is listed.

Each line in the data represents one of two types of events:

1. A query that was NOT followed by the user clicking on a result item.

2. A click through on an item in the result list returned from a query.

In the first case (query only) there is data in only the first three columns/fields — namely AnonID, Query, and QueryTime (see above).
In the second case (click through), there is data in all five columns. For click through events, the query that preceded the click through is included. Note that if a user clicked on more than one result in the list returned from a single query, there will be TWO lines in the data to represent the two events. Also note that if the user requested the next “page” or results for some query, this appears as a subsequent identical query with a later time stamp.” [8]

In this explanation of the data set, AOL discloses that user queries and the web results that users clicked on are accompanied by a time and date stamp.
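The record layout quoted above can be sketched as a small parser. This is a hedged illustration only: the tab delimiter, the timestamp pattern (modeled on the "2006-03-26 18:47:57" example given later in this complaint), the `Event` class, and the sample lines are all assumptions and synthetic values, not AOL's code or data.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed layout of QueryTime


@dataclass
class Event:
    anon_id: int
    query: str
    query_time: datetime
    item_rank: Optional[int] = None   # populated only for click-through events
    click_url: Optional[str] = None   # populated only for click-through events

    @property
    def is_click_through(self) -> bool:
        return self.click_url is not None


def parse_line(line: str) -> Event:
    """Parse one line, assuming tab-separated fields in the README's order."""
    fields = line.rstrip("\n").split("\t")
    # Query-only lines carry data in the first three fields; pad the rest.
    fields += [""] * (5 - len(fields))
    anon_id, query, query_time, item_rank, click_url = fields[:5]
    return Event(
        anon_id=int(anon_id),
        query=query,
        query_time=datetime.strptime(query_time, TIME_FORMAT),
        item_rank=int(item_rank) if item_rank else None,
        click_url=click_url or None,
    )


# Synthetic sample lines: one query-only event, one click-through event.
sample = [
    "1001\texample query\t2006-03-26 18:47:57",
    "1001\texample query\t2006-03-26 18:48:30\t2\thttp://www.example.com",
]
events = [parse_line(s) for s in sample]
for event in events:
    print(event.anon_id, event.query_time, event.is_click_through)
```

Under these assumptions, every parsed event carries a user identifier and a timestamp accurate to the second, whether or not the user clicked on a result.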

In the same document, under a section titled “CAVEAT EMPTOR,” AOL also disclosed that the data was sexually explicit and cautioned users of the data:

“Please understand that the data represents REAL WORLD USERS, un-edited and randomly sampled, and that AOL is not the author of this data.”

AOL further directed in its caveat emptor that

“Also be aware that in some states it may be illegal to expose a minor to this data.” [9]

This “caveat emptor” statement by AOL indicates that AOL knowingly released unedited user data, which means that personally identifiable information could potentially be included.

21. On or about July 31, 2006 AOL posted the data from the Spring 2006 user data files at its web site < r3Months>.

A screen shot of this page is attached hereto as Exhibit I.

22. In posting this data to its publicly available web site, AOL knowingly posted unedited, unfiltered queries from its users, based on its statements about the data in its own documents.

AOL Release of Fall 2004 User Data

23. In or about Fall 2004, AOL sampled one week of AOL search queries and allowed those queries to be used for research purposes. Approximately 20,000 search queries were included in this Fall 2004 data set, which AOL described as unfiltered search engine logs. The <> site cautioned, “Please be aware that these queries are not filtered to remove any content.” [10] In a research paper based on the Fall 2004 data release, one of the authors wrote “To build the test set we had a team of human editors perform a manual classification of 20,000 queries randomly sampled from the general query stream.” [11] A document accompanying the data described the data as an “Automatic Query Classification Test Collection” that was “Distributed by the Illinois Institute of Technology Information Retrieval Laboratory ( in collaboration with AOL Search.” The document goes on to describe the data:

“Brief description:

This collection consists of 20,000 web queries randomly sampled from AOL Search in the Fall of 2004. These queries were classified into 20 topical categories by a team of approximately ten human assessors. The categories used were: …” [12]

This description is attached in its entirety hereto as Exhibit J.

24. The data set of 20,000 AOL search queries was subsequently described and offered upon request via a web site at the Illinois Institute of Technology. As of August 15, 2006, the Illinois Institute of Technology still made the data available at its discretion (upon request) via its web site. [13] It is unknown how long the data set has been available in this manner.

A screen shot of the IIT web site is hereto attached as Exhibit K.

25. On or about July 31, 2006 AOL posted the data from the Fall 2004 user data files at its web site < edQueries>.

A screen shot of the Fall 2004 data page at AOL is hereto attached as Exhibit L.

AOL Release of Other User Data Collections

26. The AOL Research Data Collections page indicates that additional data collections were released, some of which appear to have included user search queries, including a collection of 2 million government queries and a collection of 3.5 million Question/Answer queries, among other data sets. According to the site, additional releases of user search data were contemplated. A section titled “Coming Soon” said the following:

“Coming Soon

  • Random samples of queries over time?
  • Concurrent Query sets from Web, News, and Audio & Video?
  • Query sets from two dma’s same time?
  • Ask for new data sets via the Collection Community” [14]

A screen shot of the AOL Research page is hereto attached as Exhibit M.

AOL Announcement of Its Data Releases and Subsequent Data Dispersal

27. On or about August 4, 2006, a message from Abdur Chowdhury, Chief Architect for Research at AOL titled “Announcing AOL Research” was posted by Einat Amitay to at least three online forums and message boards. The message was also posted at the home page of the Illinois Institute of Technology’s Information Retrieval Laboratory. The announcement included the following information:

“AOL is embarking on a new direction for its business – making its content and products freely available to all consumers. To support those goals, AOL is also embracing the vision of an open research community, which is creating opportunities for researchers in academia and industry alike.

We are introducing AOL Research to everyone, with the goal of facilitating closer collaboration between AOL and anyone with a desire to work on interesting problems. To get started, we invite you to visit us at, where you will find:

– 20,000 hand labeled, classified queries
– 3.5 million web question/answer queries (who, what, where, when, etc.)
– Query streams for 500,000 users over 3 months (20 million queries)
– Query arrival rates for queuing analysis
– 2 million queries against US Government domains

Also, please feel free to provide feedback on the site, datasets you’d like to see in the future, and any other comments about our vision.


Abdur Chowdhury” [15]

A screen shot of this email as posted online is attached hereto as Exhibit N.

28. The message from Abdur Chowdhury of AOL discloses that the data is to be used by academic and by industry researchers, and by “anyone with a desire to work on interesting problems.” By including “anyone with a desire to work on interesting problems,” AOL indicated that it knowingly released the data to the public at large.

29. The content of this message is consistent with the list of data maintained on the AOL Research site. Both the AOL Research site and the Abdur Chowdhury message indicate that a data set of 657,426 users was not the only data set. In fact, according to its own researcher and according to its own web site, which was archived before it was taken down by AOL, AOL also released 2 million queries about .gov domains, 20,000 queries from the Fall 2004 data set, and 3.5 million other queries.

30. The Chowdhury message was posted at:

  • <>
  • Gmane <>
  • WebIR & IE <>

The repeated posting of this message indicates that AOL’s announcement of its 2004 to 2006 data sets, and therefore the release itself, was intentional and knowing.

31. The extent to which the AOL research data was downloaded and circulated before the posting of the announcement messages is unknown.

32. The AOL announcement of its data availability was apparently successful. On or about August 5, 2006, other web sites were forwarding and posting the letter, which contained a link to the AOL data site. For example, Romip, Russian Information Retrieval Evaluation Seminar <> posted a link to the information. By August 6, 2006, multiple web sites and blogs had posted links to the data.

33. On or about the evening of August 7 or 8, 2006, AOL took the multiple data sets down from its website. Google and other search engine caches of the original AOL research site indicate that August 7 is the approximate last date the AOL research site was available in its original form online.

34. By on or about August 7, 2006, the 2006 data set had been copied and reposted to multiple sites on the web. Specifically, on or about August 8, the 2006 data was posted on BitTorrent, a large peer-to-peer (P2P) file distribution network used for the download of large data files such as the AOL files.

35. On or about August 9, 2006, web interfaces to the AOL Spring 2006 data were created and posted online at <>, <> and perhaps other sites not controlled by AOL. The web interface format offered by these web sites facilitated access to the AOL data by making it highly accessible to any Internet user. For example, the web interface allows users to search the data by keyword, user number, or web site result.

A screen shot of <> is attached hereto as Exhibit O.

36. It is unknown whether the data release was strictly of AOL subscriber data, or whether the data release was of any AOL user, regardless of subscriber or registration status. The World Privacy Forum has no way of knowing which conditions were operative for the sampling of the data sets. However, the applicable privacy policy may differ depending on whether the user was a subscriber or a non-subscriber.

AOL User Data Identifiability and Related Issues Regarding Identifiability

37. On August 9, 2006 the New York Times published an article on its front page identifying one of the AOL users by name and with a photo. [16] The reporters were able to identify the AOL user by analyzing the search queries AOL revealed in its posting of the Spring 2006 data. A Washington Post story published August 15, 2006 described a confidential annex submitted in an Electronic Frontier Foundation FTC complaint that identified 10 to 15 individuals from the data. [17] The World Privacy Forum, in its research, was able to identify a probable minor by name and geographic location from the data and tie that data to photos and videos of the probable minor. The World Privacy Forum was also able to identify an individual’s medical situation, home address, telephone number, and pediatrician from the data to a high degree of confidence.

38. Not all information made public by AOL in its Spring 2006 release was overtly identifiable. However, the lack of overt identifiers is not always material for the privacy of the individuals whose search requests were made public. The FTC should recognize that there is a privacy interest in search engine records notwithstanding the lack of overt identifiers in all instances.

39. The federal courts have found a privacy interest in data that is wholly non-identifiable. [18] In litigation over the constitutionality of a federal statute, a dispute arose over disclosure during discovery of patient records maintained by physicians testifying as expert witnesses. The health records were to be de-identified before disclosure so that a patient’s identity could not reasonably be ascertained. The case was decided in part on grounds that there is still a privacy interest even if there were no possibility that the patient’s identity could be determined. If there is a privacy interest in non-identifiable records, then there must be a privacy interest in search engine request records that have a greater degree of identifiability.

40. In Gonzalez v. Google, [19] a fight over government access to a sample of Internet search requests, the court raised the privacy issue sua sponte even though the information that the government requested only consisted of the text of the search string entered by the user and no additional information that would aid in identification of a user. The court saw that even this limited information had the potential to be linked with an identifiable user. The court specifically mentioned the possibility of identifying users who engaged in vanity searches. By contrast, the information made public by AOL was much more expansive in size and was much more identifiable because all search requests by the same user had the same user identification number. The ability to link different search requests to the same user greatly enhances the ability to identify the user.

41. Regardless of how the FTC defines the scope of privacy interests in non-identifiable data or in data that is not overtly identified, the public disclosure by AOL in this instance involves a real risk of identifiability. The individuals – some of whom are likely minors – whose search requests were made public are substantially at risk of being identified by other Internet users in one or more ways. The following situations have been tested by the World Privacy Forum and have been found to produce affirmative identifiability of the AOL users to varying degrees of confidence.

[Note: In order to protect the privacy of the AOL users at risk of being identified, the World Privacy Forum is not revealing specific identified subjects in this public complaint. The World Privacy Forum will make the results of these tests confidentially available to the FTC upon request separately from this complaint.]

Name information: In its investigation of the data, the World Privacy Forum found that AOL subscribers whose search requests were made public could be identified by name information contained within the request. A subscriber who conducted a vanity search – or search for his or her own name – can be identified by the search string in the URL made public. While it is possible that a search for someone by name was not made by the named individual, the other searches made public at the same time help in assessing whether the named individual was the searcher.

Non-name information: An AOL user whose search requests were made public could be identified by personal information contained in the request other than the name of the subscriber. Searches for a Social Security Number, address, driver’s license number, telephone number, or other identifying particular assigned to an individual, as well as user names, passwords, or snippets of emails entered as search queries, allowed identification. Again, it is possible that any of these searches could have been conducted by another individual, but the context will help to make it clear who did the searching.

Example of probable minor, non-name information: In at least one instance, a user repeatedly searched for a unique name, after which the user clicked on web pages for various social networking sites. A search at the social networking sites for the unique term brought up a profile that matched for the term. The profile contained the name of the individual, plus photos, videos, geographic and hobby information that directly tied that individual to multiple identical search queries. A web search for the name of the person which was revealed in the social networking profile yielded pages that contained the same photos as the social networking profile and further affirmatively identified the individual.

Example of medically sensitive identifiability: In another search, a user copied and pasted a snippet of a password reminder email into the search query box. The snippet contained the user’s name, as well as crucial password information. With this name, the other searches became highly identifiable, and the searcher’s phone number, home address, medical issues and a doctor were able to be tied to the individual with a high degree of confidence. Many other password searches were made public in this manner in the AOL search data, including passwords for financial institution accounts.

Time stamp data: The search data AOL released on its web site included the precise time stamp of when the searches were conducted. For example, in the data, user “A” searched for “term X” at 2006-03-26 18:47:57. The data also included a precise time stamp down to the second that recorded when users clicked through the search results pages to specific, identifiable web domains. With the public release of the time stamp information, external web sites that users visited can potentially link the AOL data to visits made to their sites. The capability for this correlation depends on the visited site’s policies and practices, and other factors. Of greatest importance is the detailed time stamp information. Some web sites could in some cases correlate their internal web logs to the AOL time stamp and query information. If the external website found the AOL-disclosed request in its web log and if it knew the user by name (from a registration or otherwise), then the website may potentially be able to link the named user to the user’s other AOL searches with varying degrees of confidence.
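The correlation described in this paragraph can be sketched as follows. Everything in the sketch is hypothetical: the released rows, the visited site's access log, the domain, the account names, and the five-second tolerance are synthetic assumptions chosen for illustration, and real web logs would differ in format and clock accuracy.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"

# Synthetic click-through rows: (anon_id, query, query_time, click_url).
released = [
    (1001, "garden tools", "2006-03-26 18:47:57", "http://www.example.com"),
    (2002, "garden tools", "2006-03-27 10:00:00", "http://www.example.com"),
    (1001, "unrelated topic", "2006-04-02 09:12:30", "http://www.example.org"),
]

# Synthetic access log kept by the hypothetical site www.example.com:
# (visit_time, registered_account).
site_log = [
    ("2006-03-26 18:47:59", "registered_user_a"),
    ("2006-03-28 12:00:00", "registered_user_b"),
]


def correlate(released_rows, log_rows, domain, tolerance_seconds=5):
    """Pair AnonIDs with site accounts whose visits fall within the tolerance."""
    window = timedelta(seconds=tolerance_seconds)
    matches = []
    for anon_id, _query, query_time, click_url in released_rows:
        if domain not in click_url:
            continue  # the click went to some other site
        clicked_at = datetime.strptime(query_time, TIME_FORMAT)
        for visit_time, account in log_rows:
            visited_at = datetime.strptime(visit_time, TIME_FORMAT)
            if abs(visited_at - clicked_at) <= window:
                matches.append((anon_id, account))
    return matches


matches = correlate(released, site_log, "www.example.com")
print(matches)
```

A match of this kind is only probabilistic. Clock differences between servers and several visitors arriving in the same few seconds would make a pairing ambiguous, which is consistent with the varying degrees of confidence noted above. Once a pairing is accepted, however, every other query recorded under the same AnonID becomes associated with the named account.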

Identifiability by familiarity with data subject: An AOL subscriber whose search requests were made public might be identifiable to another individual who knows the searcher and the type of searches that might be made by the individual. Consider, for example, a teacher who assigned a student to search for a particular subject. If the assigned search showed up in the public file released by AOL, that search together with the date of the search and other searches made by the student (e.g., for the student’s name, school, teacher, etc.) would allow the teacher to identify the searches as having been made by the student. In this case, the identification of the student’s request by the teacher might not, by itself, reveal new information to the teacher because the teacher knew that the search request would be made. However, the association of the assigned request with the other requests made by the student for matters wholly unrelated to the assignment creates objectionable privacy consequences. The teacher might learn other facts about the student or the student’s family that the teacher has no right to know and that the student had every reason to believe would not be made available to the teacher or to the public. The same inference leading to identification might be made by a friend, family member, co-worker, or acquaintance.

42. AOL’s public disclosure of private information not only affected the privacy interests of those who conducted the searches, but it potentially affected the privacy interests of the subjects of the searches in some instances. Consider a search for information about an individual that was undertaken by a health care provider, health insurer, attorney, domestic violence shelter, debt collector, law enforcement agency, psychologist, social worker, or other professional or institution that has a statutory, ethical, or other duty of confidentiality with respect to the data subject. If the searcher would have an obligation or responsibility not to make public the name of the search subject, then it would be unconscionable if AOL, as the search engine provider, could disclose the fact of the search and the search subject’s name and other information. The World Privacy Forum also asks the FTC to find that the public release of search requests by a subscriber to a generally available search engine is an unfair trade practice because of the potential for public identification of search subjects in violation of confidentiality duties and public policy.



43. The disclosure of search requests violated the AOL privacy policy and thereby constituted an unfair and deceptive trade practice. AOL’s privacy policy says:

“How Your AOL Network information is Used

Your AOL Network information is used

  • to operate and improve the Web sites, services and offerings available through the AOL Network;
  • to personalize the content and advertisements provided to you;
  • to fulfill your requests for products, programs, and services;
  • to communicate with you and respond to your inquiries;
  • to conduct research about your use of the AOL Network; and
  • to help offer you other products, programs, or services that may be of interest.” [20]

44. AOL violated its privacy policy by failing to adequately disclose the scope of its research disclosures. The term “research,” as used in the policy, implies that the information will be employed by AOL internally and not shared with the public. Elsewhere the privacy policy makes express reference to sharing with third parties [21] and describes the conditions under which that sharing may occur. Together, these statements in the privacy policy tell users that research is an activity that will only be conducted by AOL itself and not by third parties. Even if this language can be construed to permit disclosure to third party researchers, the public disclosure of search engine requests is not an activity that any reasonable person would conclude is consistent with the research use language in the privacy policy. If AOL planned to disclose search requests to outside researchers, it should have expressly stated that plan, along with the terms and limitations of the disclosure.

45. AOL’s privacy policy unfairly and deceptively failed to disclose all the terms and conditions under which research disclosures might be made. AOL’s privacy policy did not define research, establish standards for researchers, or describe the rules that would apply to research disclosures. The use of a vague and unqualified term such as research is, by itself, an unfair and deceptive practice if it is allowed to reserve to AOL the ability to define the scope of disclosures at will.

46. AOL’s privacy policy failed to disclose the possibility that researchers or others would make data available to the public. AOL’s privacy policy also failed to disclose the possibility that any disclosed research data might become available worldwide through wholly independent websites or that the availability of data might continue indefinitely.

47. Any ambiguity, uncertainty, or incompleteness in AOL’s privacy policy can and must be construed against AOL and in favor of the protection of the privacy of users. The policy was written by AOL alone, and it should bear the consequences of any lack of clarity or specificity. If a privacy policy does not make it expressly clear that a use or disclosure is contemplated, then the policy should be read as prohibiting that use or disclosure. It would be both unfair and deceptive to construe a privacy policy in any other way.

48. AOL may have violated its own policy described in its FAQ < > by not deleting search requests as promised. If AOL told users that they had the ability to erase search requests but did not actually erase those requests, that would be a particularly heinous violation of trust. [22]

49. AOL’s privacy policy also failed to disclose the possibility that the public disclosure of data for research could lead to the identification of individual users by third parties. Public disclosure of search requests, whether done with user identification numbers or otherwise, is an unfair and deceptive practice for a search engine regardless of the search engine’s stated privacy policy. The possibility that a search request can be linked to an identified individual is real. In at least some instances, starting from the search request alone, it is possible for a third party to determine with a reasonable degree of certainty which individual made the request. The greater the amount of information released along with the search request (e.g., date and time of request; identification number; multiple search requests), the greater the risk of identifiability.

50. AOL’s conduct represents a continuing unfair and deceptive trade practice and a continuing threat to the privacy of the searchers whose search requests were made public. AOL should have understood that the data, once put on the web, could not be controlled or recalled by AOL. AOL took no precautions to assure that the data did not find its way onto the Internet and become perpetually available via mass download sites. AOL now has no practical way of controlling the data that it disclosed or minimizing the possibility that individuals will be identified and harmed in the future as a result of AOL’s disclosure.

51. AOL has engaged in a pattern of public release of user data. AOL is known to have allowed the release of data acquired in 2004 and data acquired in 2006. On the original AOL research web site, there are indications that multiple data sets were made available to the public. The full scope of AOL disclosures of user search requests remains uncertain.



52. The AOL actions are deceptive as defined by the FTC for purposes of enforcement of the FTC Act.


53. The FTC’s 1983 Policy Statement on Deception states that the Commission will find deception in cases where (1) there is a representation, omission or practice that is (2) likely to mislead the consumer acting reasonably in the circumstances and is (3) “material,” i.e., likely to affect the consumer’s conduct or decision with regard to a product or service (Letter from Chairman James C. Miller to Representative John Dingell, October 14, 1983, hereafter, “Policy Statement on Deception”).

54. AOL’s conduct meets all three standards. It told users through a privacy policy how data might be used. However, that privacy policy was likely to mislead a reasonable consumer because it was incorrect, unclear, and incomplete. AOL’s actions were material because a consumer who knew of the possibility that his or her search requests might become public might have changed the manner and content of searches. Some consumers would have avoided using the AOL search engine for some or all searches in order to avoid the possibility of public disclosure, identification, embarrassment, and harm.


55. In its 1980 Policy Statement on Unfairness, the FTC states that a practice will be deemed unfair if it (1) causes substantial injury to consumers that (2) cannot be reasonably avoided by consumers and (3) is not outweighed by any countervailing benefits to consumers or competition that the practice produces (Letter from Chairman Michael Pertschuk to Senators Wendell Ford and John Danforth, December 17, 1980). AOL’s actions are unfair under all of these criteria.

56. AOL’s privacy policy and its actions in disclosing search engine requests meet the standard of unfairness. The disclosure can result and has resulted in substantial injury to consumers through public or other identification of consumers and their Internet activities. For example, searchers who are victims of domestic violence may have an increased risk of additional violence. Searchers who looked for drug or alcohol treatment facilities may be at increased risk of being publicly identified as substance abusers, with a possible loss of employment. Searchers who are minors run the risk that they might be identified by others who seek to harm or exploit them. All searchers run the risk of being embarrassed through public disclosure of their search activities. All searchers who are identifiable run the risk of receiving spam and fraudulent offers via email and of becoming the target of additional and unwanted marketing efforts. Because consumers had no ability to control the so-called research disclosures by AOL, they had no way to reasonably avoid the consequences of disclosure. Finally, no public benefit or policy justifies the public disclosure of search requests in the way that AOL disclosed the requests. Whether there might be some societal benefit to allowing bona fide researchers to use search requests under strictly controlled conditions is not at issue here. All that the FTC need conclude is that AOL’s actions were unfair and unjustified. The FTC is not required to make broader determinations of minimal standards for reasonable research disclosures.

57. In addition, AOL promises to allow users to delete search requests in its Frequently Asked Questions document. It is not clear if this policy was honored by AOL, or at what level this policy was honored. The policy states that users can review and delete their search histories. [23] But does this mean that the searches are deleted and are not given to researchers, or does this mean the searches are thoroughly deleted and not kept whatsoever by AOL? If any AOL user requested a deletion, and was subsequently exposed by any of AOL’s public releases of data, then this disclosure would constitute both an unfair and a deceptive trade practice.



Because of the foregoing, WPF requests that the Commission:

(a)  Investigate AOL’s 2006 data releases, its 2004 data release, and other data releases of its user data to researchers or to others, including “affiliated providers” and the public;

(b)  Investigate if there has been inappropriate marketing or other use of user data from search queries as a result of the AOL public disclosures;

(c)  Investigate whether or not AOL users who requested deletion of their search engine histories had their search histories disclosed publicly by AOL or otherwise disclosed in any manner by AOL after such a request was received;

(d)  Enjoin AOL from continuing to publish an unclear and incomplete privacy policy and from violating its privacy policy;

(e)  Order AOL to notify each user whose records were made public by contacting them by email and by certified mail;

(f)  Order AOL to offer to provide free credit monitoring for a three year period to any user whose records were made public, including those users whose searches contained SSNs, date of birth, address, name information, and any other combination of information that could potentially be used to identify them;

(g)  Order AOL to establish a process that will compensate users whose records were made public for any out-of-pocket damages that they incurred as well as to compensate users for any non-monetary consequences from the disclosure;

(h)  Order AOL to pay a substantial civil penalty sufficient to serve as a deterrent to similar conduct in the future;

(i)  Order AOL to institute an immediate audit of its privacy and security practices and procedures regarding its handling, use, storage, and dissemination of all user data from its networks and services, and to retain an appropriate, neutral third party auditor that both AOL and the FTC agree upon to monitor its implementation of these practices, and to report to the FTC the results of these efforts. After the initial audit, annual compliance audits and reports to the FTC should continue for a period of no less than 20 years, and should be made public;

(j)  Order AOL to revise its network privacy policy to make clear in plain language what deletion of search records means, to add detailed information about privacy and security practices regarding search queries, and to expand its description of how it handles the security and privacy of users’ search query and other information during its use in any research and/or marketing activities;

(k)  Order AOL to create robust data and privacy oversight practices, procedures, and policies for its research division, including appropriate oversight over the privacy and security practices of its internship and fellowship programs conducted jointly with external entities;

(l)  Order AOL to establish an institutional review board or a privacy board with a majority of independent members to review and approve all internal and external research requests for users’ data, whether that data is overtly identifiable or not;

(m)  Order AOL to facilitate expedited service cancellation and waive any cancellation or other fees upon service termination for all AOL subscribers who request cancellation as a result of AOL’s disclosure of search data, including but not limited to those subscribers whose data were disclosed;

(n)  Order AOL to refrain from explicitly or implicitly misrepresenting the extent to which it secures, protects or discloses any personal information maintained about consumers in the future;

(o)  Permanently enjoin AOL from violating the FTC Act, as alleged herein; and

(p)  Order such other equitable relief as the Commission finds appropriate.


Respectfully submitted,

Pam Dixon
Executive Director,
World Privacy Forum






[1] AOL “Research APIs Terms of Service,” originally available at <>. Last accessed August 15, 2006.

[2] Ibid.

[3] File: user-ct-test-collection.01.txt.gz through user-ct-test-collection-10.txt.gz. Files originally available at <>.

[4] Ibid.

[5] U500k_README.txt file from AOL data release of “user-ct-test-collection.” Originally available at <>.

[6] Ibid.

[7] Ibid supra note 1.

[8] Ibid supra note 3.

[9] Ibid.

[10] Cached copy of AOL Automatic Query Classification Test Collection site. <>. Last accessed August 15, 2006.

[11] Steven M. Beitzel, Eric C. Jensen, Ophir Frieder, David D. Lewis, Abdur Chowdhury, Aleksander Kolcz. “Improving Automatic Query Classification via Semi-supervised Learning”, in Proceedings of the 2005 ACM Conference on Research and Development in Information Retrieval (SIGIR-2005), Salvador, Brazil, pp. 581, August 2005.

[12] Automatic Query Classification Test Collection <>. Last accessed August 15, 2006.

[13] Illinois Institute of Technology IR Laboratory Collections Distribution Site <>. Last accessed August 15, 2006.

[14] AOL Research Data Collections, originally at <>. Last accessed August 12, 2006.

[15] Announcement text from <>. Last accessed August 13, 2006.

[16] Michael Barbaro and Tom Zeller Jr., A Face Is Exposed for AOL Searcher No. 4417749, New York Times, August 9, 2006.

[17] Nakashima, Ellen, Internet Privacy Group Files Complaint Against AOL, Washington Post, August 15, 2006. “The foundation submitted a confidential document to the FTC and AOL, listing 10 to 15 examples of searches that included information that could potentially identify a person, Hofmann said.”

[18] Northwestern Memorial Hospital v. Ashcroft, 362 F.3d 923 (7th Cir. 2004) <>.

[19] No. CV-068006MISC JW, March 17, 2006, N.D. Cal (order granting in part and denying in part motion to compel compliance with Subpoena Duces Tecum), <>.

[20] AOL Network Privacy Policy at <>. Last accessed August 15, 2006.

[21] “Your AOL Network information will not be shared with third parties unless it is necessary to fulfill a transaction you have requested, in other circumstances in which you have consented to the sharing of your AOL Network information, or except as described in this Privacy Policy.” AOL Network Privacy Policy at <>. Last accessed August 15, 2006.

[22] The AOL Network Frequently Asked Questions page <>. Last accessed August 15, 2006.

[23] The AOL Network Frequently Asked Questions page <>. “To provide users with control and privacy, the AOL Network will allow a user to disable this personalization functionality for either specific searches or all searches, as well as to review and/or delete any or all of his/her past searches.” Last accessed August 15, 2006.