The FISC's Problematic Pen/Trap Opinion on Bulk Internet Metadata Collection

Published on November 22, 2013

The latest round of Foreign Intelligence Surveillance Court opinions released under the Freedom of Information Act includes what, for many surveillance wonks, has been the white whale of legal rationales for bulk data collection: The opinion by Judge Colleen Kollar-Kotelly blessing bulk acquisition of e-mail metadata from Internet Service Providers under FISA’s pen register/trap-and-trace authority, which (as Steve Vladeck notes) was subsequently relied on to support the more familiar §215 telephony metadata program. As Orin Kerr has compellingly argued, the opinion goes to great lengths to ignore “important statutory clues suggesting that the pen register authority does not extend to bulk programmatic uses.” But there are a number of other quite distinct problems to raise with the opinion.

In my very first post at Just Security, I discussed the multilayered nature of Internet communications, and the way in which the “content” of a communication at one layer of the Internet stack is also often “metadata” relative to another layer. In particular, I wondered whether the FISA Court, in approving bulk collection from Internet providers, had considered the legal implications of this technical fact about Internet communications. While much of the technical detail concerning the pen/trap program is redacted in the published opinion, it seems reasonably clear that the FISC did not initially do so, except in the most cursory and unsatisfactory way. The later Bates opinion appears, judging from contextual clues, to touch on this question at greater length, but the actual discussion is entirely redacted, so it’s hard to be sure. To save time, I’ll assume familiarity with the basic argument of that previous post in what follows and just recap the main point relatively briefly.

In the telephony context, we can normally make a binary distinction between metadata—the dialed numbers transmitted to the phone company in order to connect the call, as well as the timing information generated by the company itself—and the single “communication” that call consists of, whose “contents” are the words spoken to the human recipient. (An exception is so-called “Post Cut-Through Dialed Digits”—numbers dialed after the initial call is connected—whose status is a matter of some dispute in the courts.) On the Internet, however, each “communication” is really a series of nested communications. Relative to an Internet provider like Verizon or Comcast, there’s a stream of packets that could be an e-mail, an IM chat, a YouTube video, a Web page request, or countless other types of information. The metadata voluntarily conveyed to the ISP to route that communication is the TCP/IP addressing information that tells the ISP’s equipment where to send packets, whatever they happen to contain. (I’m oversimplifying slightly, but insofar as we’re concerned with people’s “reasonable expectations of privacy,” it should not be necessary to get into the weeds of more complex and varied network management practices.) We can call this totality of the traffic, encompassing many different instances and types of data flows, the ISP-level communication. (You could, alternatively, think of each individual “packet” as a distinct “communication,” and adjust the rest of the analysis accordingly, but this way seems more intuitive.) 
Within that stream, there will also typically be (at least one layer of) server-level communications, the contents of which will typically be requests to a computer not controlled by the ISP to perform some function: “Send me this Web page” or “open a session with this Gchat user” or “deliver this e-mail to a user.” Finally, there will be what we colloquially think of as the “contents” of the communication (call this the human-level communication), consisting of the text, video, audio, virtual-world actions, etc., sent and received by the human participants. The server-level communication is metadata relative to the human-level communication, but it is content relative to the ISP-level. (Indeed, one reason the program may have been deemed less useful in 2011 is that major e-mail providers, starting with Google, began using encryption by default, rendering this content opaque to the backbone provider.) The FISC opinion fails to grapple with this important distinction in any serious way, and the Court’s statutory and Fourth Amendment analyses are profoundly defective as a result.
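
For readers who find it easier to see the nesting than to read about it, here is a minimal sketch of the layering described above. It is purely illustrative: the `Envelope` class, the `role_of` function, the field names, and the example address and IP (drawn from ranges reserved for documentation) are all my own hypothetical inventions, not a description of how any provider or the NSA actually handles traffic, and it models only two layers of addressing around one human-readable message.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Envelope:
    """One layer of a nested communication (hypothetical model)."""
    layer: str                        # who needs the routing fields, e.g. "ISP"
    routing: dict                     # addressing information used at this layer
    payload: Union["Envelope", str]   # carried along, not needed for routing here


def role_of(field: str, outer: Envelope, relative_to: str) -> str:
    """Classify a routing field as 'metadata' or 'content' relative to a layer.

    A field counts as content for a given layer if it sits inside that
    layer's payload; otherwise it is treated as metadata.
    """
    env = outer
    inside_payload = False  # becomes True once we descend past `relative_to`
    while isinstance(env, Envelope):
        if field in env.routing:
            return "content" if inside_payload else "metadata"
        if env.layer == relative_to:
            inside_payload = True
        env = env.payload
    raise KeyError(field)


# Hypothetical example: an e-mail in transit, as seen from the outside in.
packet = Envelope(
    layer="ISP",
    routing={"dst_ip": "203.0.113.25", "dst_port": 25},
    payload=Envelope(
        layer="server",
        routing={"rcpt_to": "alice@example.org"},
        payload="Lunch on Friday?",   # the human-level communication
    ),
)

print(role_of("dst_ip", packet, "ISP"))      # addressing the ISP itself uses
print(role_of("rcpt_to", packet, "ISP"))     # inside what the ISP merely carries
print(role_of("rcpt_to", packet, "server"))  # addressing the mail server uses
```

The same `rcpt_to` field comes back as content when the question is asked relative to the ISP and as metadata when it is asked relative to the mail server, which is the ambiguity the rest of this post is concerned with.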

Kollar-Kotelly writes that the information acquired in bulk by the government here “like other forms of meta data [sic], is not protected by the Fourth Amendment because users of e-mail do not have a reasonable expectation of privacy in such information.” But Smith v. Maryland cannot support the categorical proposition that all of the myriad classes of information that might be characterized as “meta data” lack Fourth Amendment protection, however and wherever obtained.

Rather, as Kollar-Kotelly later and more accurately observes, the holding of Smith is that information knowingly and voluntarily conveyed to a third party lacks such protection. (In my view, the doctrine that users necessarily waive Fourth Amendment protection in all such voluntarily disclosed information is also, to use the technical legal term, totally crazypants—but leave that fight for another day.) Smith implicitly relies on the distinction between information conveyed to a particular entity, with the knowledge that it will be used and retained by that entity to provide a service (such as dialed numbers transmitted to the phone company) and information conveyed through that entity, but not—to the knowledge of the user, at least—routinely accessed and retained by that entity, though it may have the technical capability to do so (such as the words spoken in a phone call). Indeed, as the growing use of encryption for e-mail sessions underscores, that information can be entirely hidden from the ISP without materially affecting service, which makes it difficult to believe ordinary users expect the ISP to routinely access it.

Yet as the FISC opinion notes, at least some of the information sought here could not “be obtained under the business records provision, because it is not generally retained by communications service providers.” The subsequent Bates opinion asserts that “much, and perhaps all” of the metadata obtained was collected live on the wire, which would not be the case if NSA had obtained any substantial portion of it at the e-mail server. This strongly suggests, in line with previous characterizations of this program, that the “providers” in question are ISPs, as opposed to e-mail hosts, which generally would have some record of specific e-mails sent and received. It should also suggest, however, that Smith does not settle the Fourth Amendment question with respect to this type of collection. It might be argued that this is a trivial distinction, because there is often some third-party entity that accesses and retains the server-level communication that constitutes the metadata for each human-level e-mail. But this is not necessarily the case at all: Individuals can, and corporate entities routinely do, maintain their own e-mail servers, in which case there would be no third-party entity to whom the relevant metadata was disclosed. In such cases, the metadata would have to be obtained by means of either a physical search warrant or subpoena or similar process directed to that person or entity.

This distinction also complicates the court’s statutory analysis. The FISA pen/trap provision authorizes collection of signalling and routing information, but excludes the “content” of any communication. As Kollar-Kotelly observes, however, “content” is defined more narrowly for the purposes of the pen/trap statute, where it encompasses only “information concerning the substance, purport, or meaning” of a communication, than it is under FISA more generally, where it also includes “any information concerning the identities of the parties to… or the existence of” a communication. Kollar-Kotelly breezily concludes that this broader definition has “no bearing” on the FISA pen/trap authority, where it suffices to assert that e-mail addressing information does not concern the “meaning or purport” of the e-mail.

This assertion is itself highly dubious: One way in which e-mail is different from telephony is that dedicated addresses (like, say, “[email protected]”) may be used for specific functions without any other content, such that the “meaning” of the communication is wholly contained in the existence of the communication. (Google “send a blank e-mail to” for some examples.) We might also consider services like Craigslist, where users are assigned dedicated, disposable reply-to addresses for each classified ad they place, such that the address itself provides highly specific information about the message contents. One can think of loose analogues in the telephony context, but instances of this type are massively more common with e-mail. As Bates notes, the dual structure of the pen register statute (authorizing collection of signalling or routing information, but also prohibiting content collection) is coherent only if signalling or routing information can also be content; otherwise the prohibition would be redundant.

Even if we think a particular form of metadata does not concern the meaning or purport of a communication, Kollar-Kotelly concludes a bit hastily that the broader FISA definition of “content” is rendered entirely moot by the “notwithstanding any other law” clause of the FISA pen/trap provision. The “meaning or purport” definition of “contents” relevant to the ECPA pen/trap authority applies under “this chapter,” which is to say, the Wiretap Act and (by incorporation) the Title 18 Pen Register Statute. There, the government might argue that even where nested signalling information, such as post-cut-through dialed digits used for routing purposes (as when you place a calling card call or route through a collect calling service), is formally contained within a connected call, it is not “contents” in the sense used consistently throughout the Wiretap Act. Here, however, routing information (like an e-mail address) would be contents relative to the facility at which the pen/trap is directed under the FISA definition, though perhaps not the ECPA definition. I’m not entirely sure what I think about this yet, but it would be nice to see a more detailed discussion of whether this makes a potential difference.

Finally, I note a quirk this opinion’s reasoning shares with the telephony metadata rulings. In neither case does the Court appear to have authorized the use of “big data” analytic tools, such as pattern matching to detect a group of targets who had changed phones or e-mail accounts, that might plausibly be said to require a comprehensive database. Rather, it limited queries of that data to selectors for which a particularized determination of reasonable suspicion had been made. At that point, of course, a more traditional and circumscribed conception of relevance would permit the same records to be obtained via targeted orders. The full weight of the justificatory burden for untargeted collection, then, is effectively borne by the argument that it is necessary to enable historical access to records that might ultimately be determined to be relevant. In other words, everything is relevant now because anything might turn out to be relevant in the future. The government has gestured toward the need to articulate a limiting principle to its collection powers by stressing that communications records in particular can be fruitfully analyzed in bulk to reveal networks of association, but to the extent that the real weight of the argument is borne by the putative necessity of historical access, it would apply to any body of records not retained indefinitely that could conceivably be relevant to an investigation. This should be especially troubling given that the same “relevance” language appears in the statutes authorizing National Security Letters, which do not require advance judicial approval.