How to Evaluate Whether the NSA's Telephony Metadata Program Makes Us Safer (and What Proponents and Opponents Get Wrong)

Published on December 27, 2013

[Editor’s Note: Just Security is holding a “mini forum” on the Report by the President’s Review Group on Intelligence and Communications Technologies. Others in the series include a post by Marty Lederman analyzing the Report’s highlights, a post by Julian Sanchez examining the scope of the NSA’s section 702 program, a post by David Cole and Marty Lederman analyzing how metadata is used under section 215, and a post by Jennifer Granick discussing the implications for non-US persons (with a follow-up post by Jennifer).]

The President’s Review Group dropped a bombshell when it suggested that the NSA’s bulk telephony metadata program (under section 215) has not prevented any terrorist attacks. As many now know, the Report states:

“Our review suggests that the information contributed to terrorist investigations by the use of section 215 telephony meta-data was not essential to preventing attacks and could readily have been obtained in a timely manner using conventional section 215 orders.” (p. 104)

That part of the Report may be a game changer. As one congressional intelligence official told NBC News, “‘That was stunning. That was the ballgame. … It flies in the face of everything that they have tossed at us.’”

The question I consider in this post is whether the Group’s assessment will, and should, signal the effective demise of the program. I examine the strongest claims that proponents of the program may still raise; and I propose some analytic tools for considering the issue of effectiveness, so that we might all (proponents, opponents, and others alike) candidly assess this particular program’s potential security benefits.

I. Objective: Stop terrorist attacks in US or elsewhere

The key objective of course is to stop terrorist attacks against the US homeland and vital US interests abroad. An important distinction, however, is whether the intelligence generated by the program is:

(a) “direct”: timely information to foil a specific attack; or

(b) “indirect”: information that enables the government to degrade a terrorist group or decrease the general likelihood of attacks

Examples of the latter might include information on individuals who have joined or are funding a terrorist organization. Intelligence could help to identify and successfully prosecute such individuals, and hence disable them and deter others. The important point is that both types of information aid the overall goal of stopping terrorist attacks. That point appears to have been lost on some critics of the program. When the government cites cases in which the program yielded the latter type of information, critics often treat those cases as irrelevant or as having little to do with stopping attacks.

It is not crystal clear, but the Review Group appears to recognize—and avoid—that mistake. The Report appears to conclude that the bulk telephony metadata program was not essential for providing either direct or indirect information to prevent terrorist attacks. For example, the Report refers to “the 54 situations in which signals intelligence has contributed to the prevention of terrorist attacks since 2007.” And we now know (from congressional hearings) that those 54 “situations” included not just specific plots but prosecution for funding a foreign terrorist organization and prosecution for collaboration long after a foiled attack.

Two additional notes are important.

1. Arguably, only if this particular program is able to yield the first type of information will it avoid substantial reform or retirement. A key difference between the two types is the time-sensitive nature of the information. The most significant potential drawback with alternatives to the section 215 program (e.g., Senator Wyden’s proposals) and with revisions to the 215 program (e.g., requiring the metadata to be held by private companies) is that they impose greater costs in terms of speed of information retrieval. That is not as great a concern if the 215 program operates most effectively only in situations in which time is not of the essence.

2. This framework for thinking about the effectiveness question bolsters one aspect of the government’s position – what General Alexander and Director Clapper call the “peace of mind metric” whereby the program provides information that helps to dispel concerns about a possible attack or possible connections between individual terrorists. The two cases cited by Clapper in testimony are the Boston Marathon – where the government was able to determine quickly that there was no larger plot involving New York City – and the threats to US embassies over summer 2013 – where the government was able to determine there was no connected threat to strike inside the US homeland. The Review Group does show sympathy for this value of the program (albeit with a caveat):

“More often, negative results from section 215 queries have helped to alleviate concern that particular terrorist suspects are in contact with co-conspirators in the United States. … [but] there is reason for caution about the view that the program is efficacious in alleviating concern about possible terrorist connections, given the fact that the meta-data captured by the program covers only a portion of the records of only a few telephone service providers.” (p. 104)

Some commentators have criticized (if not ridiculed) this argument on the ground that it is insufficient, on its own, for retaining the program. That would be the case if the “peace of mind metric” were an end in itself separate from foiling plots. And, indeed, Alexander’s and Clapper’s testimony has confusingly framed this element as distinct from the “other” metric of foiled plots. Theoretically, however, this intelligence product could provide timely information that enables the government to more effectively stop and respond to an ongoing or imminent attack by focusing its limited resources appropriately (on the real threat). In efforts to foil an imminent plot, one would presumably value a program that can quickly help to eliminate particular scenarios. That said, even if the 215 program were able to perform these tasks successfully, the question for the program’s survival still boils down to whether it is worth the many tradeoffs. Moreover, the peace of mind metric may be cited as another reason to allow the program to operate only in exigent and emergency situations.

II. Degree of contribution to stopping attacks

How much does a CT program contribute to stopping terrorist attacks? Consider three thresholds:

High bar: whether the program is “essential” such that successful outcomes would have been impossible without it.
Medium bar: whether the program significantly increases the likelihood of successful outcomes.
Low bar: whether the program provides any uniquely valuable information to aid successful outcomes.

The Review Group’s bombshell statement, quoted at the start of this post, might be read simply to say that the program does not satisfy the high bar: the Group states that the intelligence information yielded by the program “was not essential to preventing attacks.” If that were the end of the story, one should be concerned that the Review Group set too high a standard to assess the value of the program. However, the Report goes on to say (in a footnote) that the program does not even cross the low bar. The Report states: “[T]here has been no instance in which NSA could say with confidence that the outcome would have been different without the section 215 telephony meta-data program.” In that respect, the Report is consistent with a strong statement by Senators Wyden and Udall in June 2013, in which they said: “We remain unconvinced that the secret Patriot Act collection has actually provided any uniquely valuable intelligence.” And, of course, it bears reminding that the Report did not hold NSA programs to too high or unreachable a standard, as evidenced by its favorable conclusions with respect to the 702 foreign surveillance program. Indeed, the Group directly compared the effectiveness of the 215 program to the 702 program, and was clearly impressed by the success of the latter in preventing terrorist attacks. In an interview, one of the panelists said the comparison between the two programs’ effectiveness was “‘night and day.’”

Notably, a statement by the leadership of the House and Senate Intelligence Committees appeared to suggest that the 215 program satisfied the high bar. They stated: “a number of recommendations in the report should not be adopted by Congress, starting with those based on the misleading conclusion that the NSA’s metadata program is ‘not essential to preventing attacks.’ … We continue to believe that it is vital this lawful collection program continue.” However, Senator Feinstein appeared to back off that characterization in an interview, where she stated, “I’m not saying it’s indispensable, but I’m saying that it is important and it is a major tool in ferreting out a potential terrorist attack.”

Finally, this framework for considering CT effectiveness may help to think about the value of redundancy. Critics of the program appear to dismiss or devalue cases in which the program has yielded intelligence that corroborates information already provided by other means, or yielded data that could have been obtained by alternative techniques. But corroboration is often important in decision-making. And redundancy is often a smart institutional design across many types of systems—one would think counterterrorism included. Again, the issue for the program’s survival in whole or in part will boil down to whether it is worth all the tradeoffs – but it is important to accurately gauge its benefits in making that calculation.

III. The coming reforms

Where do we go from here? Although the Review Group was unanimous in its Report, distinctions in the positions of different panelists have begun to emerge. In a recent interview, the panelist Geof Stone elaborated that one of the reasons for the program’s ineffectiveness is its limited coverage of telephone records:

[Stone] said one reason the telephone records program is not effective is because, contrary to the claims of critics, it actually does not collect a record of every American’s phone call. Although the NSA does collect metadata from major telecommunications carriers such as Verizon and AT&T, there are many smaller carriers from which it collects nothing. Asked if the NSA was collecting the records of 75 percent of phone calls, an estimate that has been used in briefings to Congress, Stone said the real number was classified but “not anything close to that” and far lower.

For Stone, the lesson appears to be that the NSA does not cover many more carriers because the (low-yield) benefits of the program are not worth the costs of doing so. NBC (Michael Isikoff) reports:

When panel members asked NSA officials why they didn’t expand the program to include smaller carriers, the answer they gave was “money,” Stone said. “They were setting financial priorities,” said Stone, and that was “really revealing” about how useful the bulk collection of telephone calls really was.

Another member, Michael Morell, appears to draw a different lesson: that this reason for the program’s ineffectiveness should be cured by expanding its coverage, for example, to (re-)include email correspondence. It will be interesting and important to see how these debates play out in the congressional hearings in which the Review Group’s members will testify next month.