We’ve now heard in some detail about how the NSA made use of its vast database of call records to investigate links between terror suspects via “contact chaining” from specific suspicious numbers.  This has always seemed like a somewhat strange rationale for collecting such huge quantities of data, given that something similar could be accomplished, albeit less conveniently, using targeted requests on those numbers.  Sophisticated pattern analysis to detect unknown operatives might require the entire database, after all, but chaining from known suspicious numbers should not.  It’s therefore worth drawing attention to an analytic tool described in Gen. Keith Alexander’s supplemental declaration to a February 2009 notice of a compliance incident to the FISC—a tool which routinely accessed the call records database without the required determination of “reasonable articulable suspicion.”

NSA personnel determined on 18 February 2009 that an NSA analytical tool known as [REDACTED] was querying both E.O. 12333 and the Business Records data and that such queries would not have been limited to RAS approved telephone identifiers. As explained further below [REDACTED] was automatically invoked to support certain types of analytical research. Specifically, to help analysts identify a phone number of interest. If an analyst conducted research supported by [REDACTED] the analyst would receive a generic notification that NSA’s signals intelligence (“SIGINT”) databases contained one or more references to the telephone identifier in which the analyst was interested; a count of how many times the identifier was present in SIGINT databases; the dates of the first and last call events associated with the identifier; a count of how many other unique telephone identifiers had direct contact with the identifier that was the subject of the analyst’s research; the total number of calls made to or from the telephone identifier that was the subject of the analyst’s research; the ratio of the count of total calls to the count of unique contacts; and the amount of time it took to process the analyst’s query. [REDACTED] did not return to the analyst the actual telephone identifier(s) that were in contact with the telephone identifier that was the subject of the analyst’s research and the analyst did not receive a listing of the individual NSA databases that were queried by [REDACTED] .

Conspicuously, there is no mention of how many times this tool accessed the database, though such a count is given in compliance incidents involving improper manual queries.  Though Alexander’s description makes it sound as though these queries were specific and individualized, involving a “telephone identifier in which the analyst was interested,” he also notes that the tool typically runs not as a stand-alone application but as a “background process” invoked by other analytic tools.  That’s not in itself determinative, but you would expect a “background process” to be used when you need to pull in a lot of data automatically.  Moreover, Alexander describes the tool’s purpose as helping to “identify” numbers of interest, which suggests that its real value lies in sorting through records to pinpoint numbers that are not yet “of interest,” because there is no particular reason to suspect them until after the analysis is conducted.

Let’s consider, then, the very specific data this query tool was designed to return: the dates of the first and last call events, but apparently not the dates of calls between those endpoints.  In other words, this tool is supporting analytic software that cares only when a phone went online and when it stopped being used.  It also gets the number of unique contacts, the total number of calls, and the ratio of total calls to unique contacts, but not the specific numbers contacted.  Why, exactly, would this limited set of information be useful?  And why, in particular, might you want to compare that information across a large number of phones there’s not yet any particular reason to suspect?
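To make concrete how little such a summary reveals, here is a minimal Python sketch of the per-identifier statistics the declaration lists. Everything in it is my own illustration: the `CallEvent` record, the `summarize` function, and the field names are hypothetical, and nothing here reflects how the actual tool was built.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CallEvent:
    """One hypothetical call record: who called whom, and on what day."""
    caller: str
    callee: str
    day: date


def summarize(identifier, events):
    """Return the summary fields the declaration describes for one identifier.

    The result deliberately omits the contacted numbers themselves; it
    reports only dates, counts, and a ratio. Returns None if the
    identifier never appears in the records.
    """
    relevant = [e for e in events if identifier in (e.caller, e.callee)]
    if not relevant:
        return None
    # The "other end" of each call involving the identifier.
    contacts = {e.callee if e.caller == identifier else e.caller for e in relevant}
    days = [e.day for e in relevant]
    return {
        "first_seen": min(days),
        "last_seen": max(days),
        "unique_contacts": len(contacts),
        "total_calls": len(relevant),
        "calls_per_contact": len(relevant) / len(contacts),
    }


# Made-up sample data, for illustration only.
events = [
    CallEvent("A", "B", date(2009, 1, 1)),
    CallEvent("B", "A", date(2009, 1, 3)),
    CallEvent("A", "C", date(2009, 1, 5)),
    CallEvent("D", "E", date(2009, 1, 2)),
]
summary_a = summarize("A", events)
```

The point of the sketch is that the output says nothing about whom a phone talked to, only how long it was active and how its calls were distributed.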

One possibility that jumps out at me (and perhaps at anyone else who’s a fan of The Wire) is that this is the kind of information you would want if you were trying to identify disposable prepaid “burner” phones used by a target who routinely cycles through cell phones as a countersurveillance tactic.  The number of unique contacts and the call/contact ratio would act as a kind of rough fingerprint, since you’d expect phones used for a dedicated clandestine purpose to be fairly consistent on that score, while the first/last call dates help build a timeline: you’re looking for a series of phones that are each used for a standard amount of time and then go dead just as the next phone goes online.
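The heuristic I have in mind can be sketched in a few lines of Python. This is purely my own speculation rendered as code, not a description of any NSA system: the `PhoneSummary` record, the function names, and every threshold (gap in days, tolerance on contacts, lifetime, and ratio) are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PhoneSummary:
    """Hypothetical per-phone summary: activity window plus a rough fingerprint."""
    identifier: str
    first_seen: date
    last_seen: date
    unique_contacts: int
    calls_per_contact: float


def looks_like_successor(old, new, max_gap_days=3, lifetime_tol_days=5,
                         contact_tol=2, ratio_tol=0.25):
    """True if `new` plausibly replaced `old`. All thresholds are assumed."""
    # Timeline: the new phone goes online just as the old one goes dead.
    gap = (new.first_seen - old.last_seen).days
    if not 0 <= gap <= max_gap_days:
        return False
    # Both phones were in use for roughly the same length of time.
    old_life = (old.last_seen - old.first_seen).days
    new_life = (new.last_seen - new.first_seen).days
    if abs(new_life - old_life) > lifetime_tol_days:
        return False
    # Fingerprint: similar contact count and similar calls-per-contact ratio.
    if abs(new.unique_contacts - old.unique_contacts) > contact_tol:
        return False
    return (abs(new.calls_per_contact - old.calls_per_contact)
            <= ratio_tol * old.calls_per_contact)


def candidate_chains(phones, min_length=3, **thresholds):
    """Greedily link phones into chains of apparent successors."""
    ordered = sorted(phones, key=lambda p: p.first_seen)
    used = set()
    chains = []
    for i, start in enumerate(ordered):
        if start.identifier in used:
            continue
        chain = [start]
        for candidate in ordered[i + 1:]:
            if candidate.identifier in used:
                continue
            if looks_like_successor(chain[-1], candidate, **thresholds):
                chain.append(candidate)
        if len(chain) >= min_length:
            chains.append([p.identifier for p in chain])
            used.update(p.identifier for p in chain)
    return chains


# Made-up sample data: three short-lived phones in sequence, plus two others.
phones = [
    PhoneSummary("P1", date(2009, 1, 1), date(2009, 1, 14), 4, 5.0),
    PhoneSummary("P2", date(2009, 1, 15), date(2009, 1, 28), 5, 5.5),
    PhoneSummary("P3", date(2009, 1, 29), date(2009, 2, 11), 4, 5.2),
    PhoneSummary("ordinary", date(2008, 1, 1), date(2009, 12, 31), 60, 3.0),
    PhoneSummary("noise", date(2009, 1, 16), date(2009, 1, 20), 40, 1.2),
]
chains = candidate_chains(phones)
```

Note that a search like this only makes sense when run across a very large pool of phones, most of which nobody has any reason to suspect. That is what distinguishes it from chaining outward from a known number.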

This is precisely the kind of thing you’d expect a vast call records database to be useful for—and one that, as described, has fewer civil liberties implications than the contact chaining we already know about.  I highlight it largely because if we’re now having a public debate on the utility of this program, we should be assessing the full scope of its uses—rather than having Congress assured of the importance of secret uses the public has had no opportunity to evaluate.