Recently, Orin Kerr and I had a brief conversation on Twitter regarding the Fourth Amendment and the content/non-content distinction. Specifically, Orin asked those of us who subscribe to the mosaic theory of intelligence: if some large amount of metadata can become content, can some small amount of content become metadata by the same logic? That is, if non-content in sufficient quantities can become content under the Fourth Amendment, shouldn’t the inverse of this function mean that sufficiently small amounts of content can become non-content? (Remember that content receives greater constitutional protection than non-content.) There is a fair amount of unpacking to do in this short question, so let’s start by exploring the mosaic theory as it applies to Fourth Amendment law.

Mosaic theory describes the idea that a sufficient quantity of small pieces of seemingly innocuous data, pieced together by a clever analyst, can develop into an intelligence picture whose importance can outweigh the sum of the individual parts — much like the intricate patterns found in ancient Mesopotamia, Greece, and Rome, for which the theory is named. It follows from this theory that seemingly unimportant scraps of information — e.g., metadata — that by themselves do not rise to meet the current definition of a Fourth Amendment search, could be compiled and analyzed in large enough quantities to yield an intelligence picture at least as (if not more) meaningful than what might have been possible by collecting content information, which would typically require a court-issued warrant. Mosaic theory adherents assert that this scenario depicts a loophole in Fourth Amendment doctrine, giving government agencies the ability to bypass warrant requirements altogether through advanced technologies performing big data analytics. Further, because these metadata are preformatted for computer consumption — requiring little, if any, massaging by human data technicians — the government is given a distinct advantage, severely tipping the balance of power between citizen and state. Sufficient amounts of under-the-radar non-content data would effectively become content, and should therefore receive appropriately greater protection under the Fourth Amendment.

Orin’s question pushes mosaic theorists to further amplify their Fourth Amendment framework by asking if it therefore follows that small amounts of content data could effectively become non-content, receiving fewer Fourth Amendment protections, under this theory. My response at the time was to ask why Orin’s proposition would necessarily follow from mosaic theory. Orin agreed that it might not, but noted that courts might look upon such a one-way theory with suspicion, treating it as an unprincipled, result-oriented approach to the problem. My goal, therefore, was to think through a principled response to Orin’s original query.

First, let’s define our terms of debate. A fair discussion of this issue requires considerably more real estate than a blog post allows, so what follows is my preliminary stab at what will turn into a longer and, hopefully, more meaningful article on the topic. Second, for the sake of this discussion, let’s assume that the basic propositions of mosaic theory hold true. That is, we will assume that, for some quantity of non-content data collected, the picture assembled from this data through collation and analysis gives the collecting agency as much information as, or more than, it would have obtained through content data, the collection of which would likely require a warrant under current Fourth Amendment law.

The current (and controversial) content/non-content distinction pertaining to electronic data is a relatively recent vein of Fourth Amendment thought, and finds its roots in metaphor, as so often occurs when the law meets technology. Since the 1967 Supreme Court decision in Katz v. United States, courts have generally held that the contents of telephone calls (the actual conversation) are protected under the Fourth Amendment. In 1979, however, the Supreme Court held in Smith v. Maryland that the non-content portion of telephone calls (the number dialed, the time of the call, and its duration) did not receive the same Fourth Amendment protections. The Smith Court reasoned that, unlike the call’s contents, the call metadata was sent to the telephone provider in order to facilitate the call, and was therefore effectively the same as the address information we write on the outside of the letters we give the postal service to deliver, which courts had long treated as public information outside of Fourth Amendment protections.

Courts struggled with this topic once again when electronic communication methods like email began to emerge in the 1980s. It was not until 2007, in the Sixth Circuit decision in Warshak v. United States, that courts finally began to reason seriously about electronic content under the Fourth Amendment. The Warshak court held that email receives the same Fourth Amendment protections as telephone calls. In 2008, two Ninth Circuit decisions (United States v. Forrester and Quon v. Arch Wireless Operating Co.) expanded this jurisprudence significantly. In Forrester, the court held that electronic communication metadata (IP addresses, to/from addressing, and data file size) was equivalent to the non-content telephone call data in Smith, and therefore was not protected under the Fourth Amendment. The court in Quon looked to Forrester’s content/non-content reasoning to decide that users have a reasonable expectation of privacy in the content of their text messages. (Note that, while the Supreme Court granted cert in this case, it disposed of the case on narrow grounds, and elected not to wade into the thornier issues of content and non-content.)

As our use of electronic communication technologies has expanded far beyond email and text messages, tricky questions regarding Fourth Amendment protections for the vast amounts of detailed metadata we regularly share with service providers are beginning to spring up in our courts. Issues around location data (the frighteningly accurate, real-time stream of information our cell phones regularly broadcast in order for our providers to send calls and data to us) have become especially controversial in this space. As Justice Sotomayor pointed out in United States v. Jones, information based on location data “generates a precise, comprehensive record of a person’s public movements that reflects a wealth of detail about her familial, political, professional, religious, and sexual associations.” Yet some courts have treated this highly sensitive location information as non-content metadata, and have applied Smith and its progeny to conclude that it does not deserve the same Fourth Amendment protections as content data. Under mosaic theory, should courts treat these data like content, even though they could be seen as being on the outside of the metaphorical envelope?

This is the core of what Orin’s question is getting at: If courts are to treat large amounts of location data (as is collected hourly by cell sites everywhere) as effectively content data, shouldn’t it follow that small amounts of content are so meaningless as to fall beneath the threshold of Fourth Amendment protections? My response to this question is that the latter should not follow from the former, even if we accept the rather specious reasoning of Smith. I rest my argument on two grounds. First, when it comes to technological and societal advances and the application of the Fourth Amendment, we should look to the scope of interests protected by the Bill of Rights. (This inquiry should be distinguished from literalist or originalist approaches, as it focuses on the purpose of the amendment rather than its strict textual meaning.) Second, the content/non-content distinction is based on what information a person knowingly exposes to public or third-party scrutiny. It is not at all clear that the average user of advanced technologies is truly knowledgeable about the information her devices or applications may be emitting. I shall (briefly) flesh out these arguments in turn.

A significant number of constitutional scholars, including Akhil Amar, Raymond Ku, Thomas Davies, Tracey Maclin, and William Stuntz, have written extensively on the Framers’ context, concluding that the Fourth Amendment was not written to protect citizens’ privacy, but to preserve and protect citizens’ power against a self-interested and overreaching government. As Ku puts it, the Fourth Amendment’s protections are best seen as “an outgrowth and complement to the limitations placed upon executive power through the Constitution’s separation of powers.” The balance of power between citizen and government was expressed as an agency problem by the Framers, where the Bill of Rights was written to limit or prevent government officials from acting against the interests of the people. The semantic game that gives us the content/non-content distinction, where the collection of content is a “search” under the Fourth Amendment, while collecting non-content is not, begins to lose its meaning when we consider the clear power advantage given to the executive when, for example, its agents can collect information on our exact whereabouts without first having to convince a judge that such a search is reasonable under the Fourth Amendment. It follows that this power equation would be further corrupted if we allow even small amounts of content — which we expressly assume to be private — to be slipped to government agencies in the name of some sort of quid pro quo arrangement.

Finally, can it truly be said that most of us knowingly share sensitive information such as location data with third parties (i.e., technology providers)? Even if many technology users have a vague notion of this possibility, I am not convinced that most are aware of the scope and breadth of the content our devices and applications emit as a sort of “digital exhaust.” For example, mobile devices such as smartphones usually include both Wi-Fi and cellular radios, enabling those devices to use Wi-Fi access points as a means of data transmission, which often gives the user a speed advantage and doesn’t typically count against that user’s data cap. Most users will leave these Wi-Fi radios on so they will automatically connect to known networks when they come within range. By doing so, however, that device will periodically broadcast a list of prior networks accessed, which can give a listener all kinds of information about the habits and lifestyle of that user. Most of us would not consider these constant Wi-Fi emissions to be messages that we mean to send (or are even aware we are sending), but we would also likely not label this sensitive information “public” under the content/non-content distinction. Increased sensitivity over large amounts of this metadata under the mosaic theory does not therefore imply an automatic decreased concern over small amounts of “content.”