Data-Mining, Section 215, and Regulating the Government’s Use of Stored Data: The Overlooked, but More Important, Question About NSA Surveillance

[Editor’s Note: Just Security is holding a “mini forum” on the Report by the President’s Review Group on Intelligence and Communications Technologies. Others in the series include a post by Marty Lederman analyzing the Report’s highlights, a post by Julian Sanchez examining the scope of the NSA’s section 702 program, a post by Jennifer Granick discussing the implications for non-US persons (with a follow-up post by Jennifer), and a post by Ryan Goodman discussing the effectiveness of the section 215 metadata program.]


Since Wednesday’s release of the Report of the President’s Review Group on Intelligence and Communications Technologies, public attention has been focused on its recommendation that the government should no longer be authorized to collect and store individuals’ phone “metadata” for data-mining purposes.  What fewer observers have noticed, however, is that the Report also questions the wisdom of authorizing the government to engage in relatively unbounded data-mining altogether, regardless of who holds the keys to the databases.  The question of how such databases are used, the Report suggests, is at least as important as how they are maintained.

This part of the Report, and the issue of how the government uses digital data more generally, deserves much greater attention as reforms to the NSA programs are considered in the coming weeks and months.  This is so, as the Report stresses, because the government’s use (not merely collection) of databases might have a profound effect not only on individuals’ privacy itself, but also on vital social practices:  “[L]aw-abiding citizens who come to believe that their behavior is watched too closely by government agencies . . . may be unduly inhibited from participating in the democratic process, may be inhibited from contributing fully to the social and cultural life of their communities, and may even alter their purely private and perfectly legal behavior for fear that discovery of intimate details of their lives will be revealed and used against them in some manner.”  (p.111, quoting the National Academy of Sciences).

In this post, we review the Report’s insights into the questions of collection and use, emphasizing the importance of the latter.  After discussing these two questions, at the end of the post we make a couple of brief observations about the Report and the current litigation challenging the legality of the existing Section 215 telephony metadata program.

(For details about other aspects of the Report, see Marty’s recent post.)

Who Should Collect and Store the Data?

Virtually all recent treatments of the Report point to Recommendation No. 4, a proposal to severely limit the government’s collection and storage of mass telephony metadata.  The Report recommends that:

[A]s a general rule, and without senior policy review, the government should not be permitted to collect and store all mass, undigested, non-public personal information about individuals to enable future queries and data-mining for foreign intelligence purposes.  Any program involving government collection or storage of such data must be narrowly tailored to serve an important government interest.  (Recommendation No. 4)

Applying this general rule to telephony metadata, in particular, the Report proposes legislation that would “terminate[] the storage of bulk telephony meta-data by the government under section 215, and transition[] as soon as reasonably possible to a system in which such meta-data is held instead either by private providers or by a private third party.”  (Recommendation No. 5)

At his press conference last Friday, the President implicitly invoked these recommendations when he suggested that perhaps there is a way to “hav[e] the private phone companies keep these records longer and to create some mechanism where they can be accessed in an effective fashion.”  That suggestion was quickly met “with fire from privacy advocates and technology experts, who say it would be as bad as or worse than having the NSA hold the records.  Phone companies also do not want to be the custodians of data sought by law enforcement or civil attorneys.”

This debate about who should collect and store personal data for government use is understandable, and important, in light of what the Report refers to as “the lurking danger of abuse” (p.113):

[W]e cannot discount the risk, in light of the lessons of our own history, that at some point in the future, high-level government officials will decide that this massive database of extraordinarily sensitive private information is there for the plucking.  Americans must never make the mistake of wholly “trusting” our public officials.  As the Church Committee observed more than 35 years ago, when the capacity of government to collect massive amounts of data about individual Americans was still in its infancy, the “massive centralization of . . . information creates a temptation to use it for improper purposes, threatens to ‘chill’ the exercise of First Amendment rights, and is inimical to the privacy of citizens.” (p.114)

For these reasons, it is not surprising that the political branches and the public will now likely devote a great deal of attention to the question of who the data custodians should be.  But, as noted below, the Report suggests that the question of use of the databases deserves at least as much attention, and may have even more substantial implications for privacy and democracy.

Standards for the Use of Stored Data?

The current focus on the “who stores?” question obscures a broader concern in the Report about the use of the data—about data-mining more broadly—regardless of who the custodian might be.  And this concern is hardly limited to telephony metadata—it extends to “every other type of record or other tangible thing that could be obtained through a traditional subpoena, including bank records, credit card records, medical records, travel records, Internet search records, e-mail records, educational records, library records, and so on” (p. 109).

The Report’s authors do not, of course, suggest that the government should never be allowed to examine any such data, no matter how closely connected they may be to a legitimate subject of investigation.  Their concern, instead, relates to how much the government can discover when it has access to “genuinely mass collections of all undigested, non-public personal information about individuals – those collections that involve not a selected or targeted subset (such as airline passenger lists), but far broader collections” (id.).

The “essence of the information age,” explains the Report (at p.110, here quoting the National Academy of Sciences), is that everyone leaves extensive “personal digital tracks” in countless computer systems, “whenever he or she makes a purchase, takes a trip, uses a bank account, makes a phone call, walks past a security camera, obtains a prescription, sends or receives a package, files income tax forms, applies for a loan, e-mails a friend, sends a fax, rents a video, or engages in just about any other activity.”

In one of the most provocative and important sections of the Report (roughly pages 108-115), the authors worry that, even when the objects of the government’s investigation are entirely legitimate, mass collections of information permit the government to learn far more about those it is investigating than it has ever been able to do in the past.  That very capability, the Report argues, can have detrimental consequences not only for individual privacy, but also for public trust and critical social practices.  In the Report’s view, those consequences often receive inadequate attention when the government deliberates about how to craft a system of data-mining.

Even if such a database can be queried only on the basis of individualized suspicion about a particular suspected person or number, substantial concerns would remain, argue the authors.  Indeed, citing the example of the Section 215 telephony metadata program, they insist that their concerns would be present even if the standards for searching the collected data were significantly tightened.

Under the FISA Court’s current rules governing the Section 215 program (see generally the most current “Primary Order,” issued by Judge McLaughlin in October), the NSA can access the collection of telephony metadata only when it has facts giving rise to a reasonable, articulable suspicion (RAS) that a selection term to be queried, such as a phone number, is “associated with” a specified foreign terrorist organization.  When the NSA develops RAS for such an “identifier,” or “seed” phone number, it can then use the database to ascertain every telephone number that either called or was called by the seed phone number in the preceding five years.  The NSA can then make two more “hops,” querying the database to obtain a list of every phone number that called or was called by each of the numbers identified in the first inquiry, and then by each of the numbers identified in the second.  This exponential three-hop process, the Report estimates, would in a typical case generate one million numbers from a single seed, which are compiled in a “corporate store.”  (The exact numbers are uncertain.  But as David discussed last week, Judge Leon, too, calculated that three hops could easily lead to a million distinct numbers over five years.  See page 18 of his opinion.  Whatever the exact number for a particular seed, the program almost certainly has the capability of reaching millions of individuals’ phone records over time.)
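To make the arithmetic of the three-hop process concrete, here is a minimal sketch in Python.  It is our own illustration, not a description of the NSA’s actual systems:  the function names, the toy call records, and the figure of 100 contacts per number are all assumptions chosen for the example.  The first two functions walk a call graph outward from a seed, one hop at a time; the last computes a rough upper bound on how many numbers three hops can reach.

```python
from collections import defaultdict


def build_contact_graph(call_records):
    """Map each number to the set of numbers it called or was called by.

    call_records is a hypothetical list of (caller, callee) pairs.
    """
    graph = defaultdict(set)
    for caller, callee in call_records:
        graph[caller].add(callee)
        graph[callee].add(caller)
    return graph


def numbers_within_hops(graph, seed, max_hops=3):
    """Return {number: hop_distance} for numbers reachable within max_hops."""
    distance = {seed: 0}
    frontier = [seed]
    for hop in range(1, max_hops + 1):
        next_frontier = []
        for number in frontier:
            for contact in graph.get(number, ()):
                if contact not in distance:  # count each number only once
                    distance[contact] = hop
                    next_frontier.append(contact)
        frontier = next_frontier
    return distance


def rough_upper_bound(contacts_per_number, hops=3):
    """Upper bound on numbers reached, ignoring overlap between contact lists."""
    return sum(contacts_per_number ** h for h in range(1, hops + 1))


# Toy chain of calls: A-B, B-C, C-D, D-E (hypothetical data).
records = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
reached = numbers_within_hops(build_contact_graph(records), "A")
print(reached)                 # E is four hops away, so it is not included
print(rough_upper_bound(100))  # assumed 100 contacts per number
```

Under the assumed figure of 100 distinct contacts per number, the bound is 100 + 10,000 + 1,000,000, which is consistent in order of magnitude with the Report’s estimate of one million numbers from a single seed.  Real contact lists overlap, so the actual count for any given seed would differ.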

Once these many phone numbers are in the corporate store, the NSA “may apply the full range of SIGINT analytic tradecraft to the results of the intelligence analysis queries of the collected . . . metadata” (Primary Order footnote 15).  That tradecraft presumably includes the ability to run the numbers against other databases, and to put names to the numbers.  And all this can be done whether or not the NSA makes any finding that these numbers are associated with a terrorist organization, so long as the further investigation is conducted “for valid foreign intelligence purposes.”  (“Foreign intelligence information” is, in turn, defined capaciously to include virtually any information that might be relevant to foreign affairs.)  If these further inquiries reveal information about a United States person, that information can then be shared with the FBI if a high-level official determines that it is “related to” counterterrorism information and is necessary to understand the counterterrorism information or assess its importance.

The PRGICT Report stresses that, even in cases where it would be reasonable to do follow-up investigations on numbers revealed by the “hops” from the “identifier” number, the NSA’s access to the phone records database, and to other mass databases, gives the government access to far more information about the persons associated with those numbers than the government would ever have had access to in a past, traditional investigation.  And this would be the case, the authors stress, even if the initial standard for searches were to be heightened (say, from reasonable articulable suspicion about the seed number to probable cause):  “[E]ven that would leave privacy at risk” (p.113).

“This is so,” the Report explains, because “in traditional searches, the government does not discover everything there is to know about an individual.  The enormity of the breach of privacy caused by queries of the hypothetical mass information database dwarfs the privacy invasion occasioned by more traditional forms of investigation.  For the innocent individual who is unlucky enough to be queried under even a probable cause standard, virtually everything about his life instantly falls into the hands of government officials.  The most intimate details of his life are laid bare.”  (Id.)

The authors acknowledge that this new capacity to assess mass collections of personal information would, of course, “make it easier for the government to protect the nation from terrorism” (p.114); and that is a distinct virtue.  But even so, they say, that does not mean that the use of such databases should be unlimited:  “[T]he question is not whether granting the government authority makes us incrementally safer, but whether the additional safety is worth the sacrifice in terms of individual privacy, personal liberty, and public trust.” (Id.)

The concern described in the Report is not so much that the government might discover particular private facts—that’s inevitable in any legitimate investigation.  Rather, the authors worry more that the unprecedented comprehensiveness of what the government can learn from such programs—the fact that the government has “had ready access to a massive storehouse of information about every detail of our lives”—could have a deep, negative impact on important social practices.  With respect to telephone records, in particular, the Report notes (p.117):

Knowing that the government has ready access to one’s phone call records can seriously chill “associational and expressive freedoms,” and knowing that the government is one flick of a switch away from such information can profoundly “alter the relationship between citizen and government in a way that is inimical to society.”  That knowledge can significantly undermine public trust, which is exceedingly important to the well-being of a free and open society.  (Quoting from Justice Sotomayor’s concurring opinion in U.S. v. Jones.)

This section of the Report suggests what David stressed in his post the other day—namely, that the more fundamental question with respect to data-mining is not who should hold the data, but what the government ought to be able to do with it, even in cases where there is justification for investigating, say, a particular phone number that is thought to be associated with al Qaeda.  For example:

— How many “hops” should the government be able to make from the “seed” number to other phone numbers?

— When such “hops” reveal a million or so numbers, should the government be entitled to analyze those numbers without judicial oversight or substantive constraints?

— At what point should the government be authorized to put a name to a number?

— What should be the standard for conducting further investigations of those numbers, or any identified phone customers, within the metadata collection itself or other mass databases?

— When can the names and/or numbers be used by the FBI in a criminal investigation?

Despite emphasizing its concerns about this set of questions, the Report offers few specific recommendations about how to deal with them.  That’s not terribly surprising:  They’re very hard questions, and the relevant considerations that should be brought to bear in answering them might vary considerably from one context to the next.

The Report does, however, suggest imposing one potentially important limitation, in the context of the Section 215 telephony metadata program.   Recommendation No. 5 proposes that “[a]ccess to such [telephony meta]data should be permitted only with a section 215 order from the Foreign Intelligence Surveillance Court that meets the requirements set forth in Recommendation 1.”  And Recommendation No. 1, in turn, proposes that Congress amend Section 215 to authorize the FISA Court to issue an order compelling a third party to disclose otherwise private information about particular individuals only if:

(1) it finds that the government has reasonable grounds to believe that the particular information sought is relevant to an authorized investigation intended to protect ‘against international terrorism or clandestine intelligence activities’ and

(2) like a subpoena, the order is reasonable in focus, scope, and breadth.

The addition of the adjective “particular,” and the requirement that a court assess the search for the reasonableness of its “focus, scope, and breadth,” appear designed to give courts an added responsibility at the front end to cabin the volume and nature of the information—the subset of the database—that the NSA could query.

What would such limitations look like in practice?  It may be impossible, given the range of circumstances that may arise, to specify a single ex ante rule; and thus the Report suggests that courts should make such calls on a case-by-case basis.  It does offer a hint at the general approach, however, when it states (p.115) that the government “would still be free under section 215 to obtain specific information relating to specific individuals or specific terrorist threats from banks, telephone companies, credit card companies, and the like—when it can demonstrate to the FISC that it has reasonable grounds to access such information” (emphasis in original).

As for what the government should be authorized to do with the subset of data to which it is afforded access—often referred to as the “minimization” rules—the Report is relatively silent.  The Report also does not discuss whether and to what extent such minimization rules should be transparent, so that the public knows how its information can be used once it appears in a database.  The authors appear to have left such difficult questions largely for consideration by Congress and the Executive branch in the first instance.

There is one hint, however (at pages 88-89), that the FISA court itself ought to set the parameters for whether and how the government can access a particular individual’s information within a database:

[A]s a matter of sound public policy, it is advisable for a neutral and detached judge, rather than a government investigator engaged in the “competitive enterprise” of ferreting out suspected terrorists, to make the critical determination whether the government has reasonable grounds for intruding upon the legitimate privacy interests of any particular individual or organization. The requirement of an explicit judicial finding that the order is “reasonable in focus, scope, and breadth” is designed to ensure this critical element of judicial oversight.  (Emphasis in original.)

This is consistent with the fundamental theme of this part of the Report, namely, that no matter who holds the massive collections of data, decision-makers should be far more attentive to the questions of how government agents may access and use those data.

Coda:  The Report and the Current Legal Controversies

Contrary to some accounts published before its release, the Report does not conclude that the current Section 215 program is legal; instead, it acknowledges that there are deeply contested statutory and constitutional questions, and explains that “[o]ur charge is not to resolve these questions, but to offer guidance from the perspective of sound public policy as we look to the future” (p. 108).

The Report does suggest two things about the legal questions that are worth flagging, however:  First, the Report implies that the Supreme Court ought not ultimately resolve the Fourth Amendment question based on an assumption, reflected in cases such as United States v. Miller and Smith v. Maryland, that individuals have little or no expectation of privacy in information they voluntarily make available to third parties such as banks, credit card companies, Internet service providers, telephone companies, health-care providers, etc.:  “In modern society, individuals, for practical reasons, have to use banks, credit cards, e-mail, telephones, the Internet, medical services, and the like.  Their decision to reveal otherwise private information to such third parties does not reflect a lack of concern for the privacy of the information, but a necessary accommodation to the realities of modern life.  What they want—and reasonably expect—is both the ability to use such services and the right to maintain their privacy when they do so.”  (pp. 110-11)

Second, invoking Justice Alito’s concurring opinion in United States v. Jones, the Report hints that whether the judiciary ultimately approves or invalidates any particular program for collecting and searching mass databases, including programs under Section 215, might well depend on how Congress first deals with the question, since legislatures are “well situated to gauge changing public attitudes, to draw detailed lines, and to balance privacy and public safety in a comprehensive way.”  The Report is therefore largely devoted to the question of how, exactly, Congress should strike that balance. 

About the Author(s)

David Cole

National Legal Director of the ACLU and Professor at Georgetown University Law Center. Follow him on Twitter (@DavidColeACLU).

Marty Lederman

Professor at the Georgetown University Law Center. Follow him on Twitter (@marty_lederman).