Secret 'BADASS' Intelligence Program Spied on Smartphones

https://firstlook.org/theintercept/2015/01/26/secret-badass-spy-program/


British and Canadian spy agencies accumulated sensitive data on smartphone users, including location, app preferences, and unique device identifiers, by piggybacking on ubiquitous software from advertising and analytics companies, according to a document obtained by NSA whistleblower Edward Snowden.

The document, included in a trove of Snowden material released by Der Spiegel on January 17, outlines a secret program run by the intelligence agencies called BADASS. The German newsweekly did not write about the BADASS document, instead attaching it to a broader article on cyberwarfare. According to The Intercept’s analysis of the document, intelligence agents applied BADASS software filters to streams of intercepted internet traffic, plucking from that traffic unencrypted uploads from smartphones to servers run by advertising and analytics companies.

Programmers frequently embed code from a handful of such companies into their smartphone apps because it helps them answer a variety of questions: How often does a particular user open the app, and at what time of day? Where does the user live? Where does the user work? Where is the user right now? What’s the phone’s unique identifier? What version of Android or iOS is the device running? What’s the user’s IP address? Answers to those questions guide app upgrades and help target advertisements, benefits that help explain why tracking users is not only routine in the tech industry but also considered a best practice.

For users, however, the smartphone data routinely provided to ad and analytics companies represents a major privacy threat. When combined, the information fragments can be used to identify specific users, and when concentrated in the hands of a small number of companies, they have proven to be irresistibly convenient targets for those engaged in mass surveillance. Although the BADASS presentation appears to be roughly four years old, at least one player in the mobile advertising and analytics space, Google, acknowledges that its servers still routinely receive unencrypted uploads from Google code embedded in apps.

For spy agencies, this smartphone monitoring data represented a new, convenient way of learning more about surveillance targets, including information about their physical movements and digital activities. It also would have made it possible to design more focused cyberattacks against those people, for example by exploiting a weakness in a particular app known to be used by a particular person. Such scenarios are strongly hinted at in a 2010 NSA presentation, provided by agency whistleblower Edward Snowden and published last year in The New York Times, ProPublica, and The Guardian. That presentation stated that smartphone monitoring would be useful because it could lead to “additional exploitation” and the unearthing of “target knowledge/leads, location, [and] target technology.”

The 2010 presentation, along with additional documents from Britain’s intelligence service Government Communications Headquarters, or GCHQ, showed that the intelligence agencies were aggressively ramping up their efforts to see into the world of mobile apps. But the specifics of how they might distill useful information from the torrent of internet packets to and from smartphones remained unclear.

The BADASS slides fill in some of these blanks. They appear to have been presented in 2011 at the highly secretive SIGDEV intelligence community conference. The presentation states that “analytics firm Flurry estimates that 250,000 Motorola Droid phones were sold in the United States during the phone’s first week in stores,” and asks, “how do they know that?”

The answer is that during the week in question, Flurry uploaded to its own servers analytics from Droid phones on behalf of app developers, one phone at a time, and stored the analytics in their own databases. Analytics includes any information that is available to the app and that can conceivably help improve it, including, in certain instances with Flurry, the user’s age and gender, physical location, how long they left the app open, and a unique identifier for the phone, according to Flurry materials included in the BADASS document.

By searching these databases, the company was able to get a count of Droid phones running Flurry-enabled apps and, by extrapolating, estimate the total number of Droids in circulation. The company can find similar information about any smartphone that their analytics product supports.

Not only was Flurry vacuuming sensitive data up to its servers, it was doing so insecurely. When a smartphone app collects data about the device it’s running on and sends it back to a tracking company, it generally uses the HTTP protocol, and Flurry-enabled apps were no exception. But HTTP is inherently insecure—eavesdroppers can easily spy on the entire digital conversation.

If the tracking data was always phoned home using the HTTPS protocol—the same as the HTTP protocol, except that the stream of traffic between the phone and the server is encrypted—then the ability for spy agencies to collect tracking data with programs like BADASS would be severely impeded.

Yahoo, which acquired the analytics firm Flurry in late 2014, says that since acquiring the company they have “implemented default encryption between Flurry-enabled applications and Flurry servers. The 2010 report in question does not apply to current versions of Flurry’s analytics product.” Given that Yahoo acquired Flurry so recently, it’s unclear how many apps still use Flurry’s older tracking code that sends unencrypted data back to Yahoo’s servers. (Yahoo declined to elaborate specifically on that topic.)

The BADASS slides also use Google’s advertisement network AdMob as an example of intercepted, unencrypted data. Free smartphone apps are often supported by ads, and if the app uses AdMob then it sends some identifying information to AdMob’s servers while loading the ad. Google currently supports the ability for app developers to turn on HTTPS for ad requests; however, it’s clear that only some AdMob users actually do this.

When asked about HTTPS support for AdMob, a Google spokesperson said, “We continue our ongoing efforts to encrypt all Google products and services.”

In addition to Yahoo’s Flurry and Google’s AdMob, the BADASS presentation also shows that British and Canadian intelligence were targeting Mobclix, Mydas, Medialets, and MSN Mobile Advertising. But it’s clear that any mobile-related plaintext traffic from any company is a potential target. While the BADASS presentation focuses on traffic from analytics and ad companies, it also shows spying on Google Maps heartbeat traffic, and capturing “beacons” sent out when apps are first opened (listing Qriously, Com2Us, Fluentmobile, and Papayamobile as examples). The BADASS presentation also mentions capturing GPS coordinates that get leaked when opening BlackBerry’s app store.

In a boilerplate statement, GCHQ said, “It is longstanding policy that we do not comment on intelligence matters. Furthermore, all of GCHQ’s work is carried out in accordance with a strict legal and policy framework, which ensures that our activities are authorised, necessary and proportionate, and that there is rigorous oversight.” Its Canadian counterpart, Communications Security Establishment Canada, or CSEC, responded with a statement that read, in part, “For reasons of national security, CSE cannot comment on its methods, techniques or capabilities. CSE conducts foreign intelligence and cyber defence activities in compliance with Canadian law.”

Julia Angwin, who has doggedly investigated online privacy issues as a journalist and author, most recently of the book “Dragnet Nation,” explains that “every type of unique identifier that passes [over the internet] unencrypted is giving away information about users to anyone who wants it,” and that “the evidence is clear that it’s very risky to be throwing unique identifiers out there in the clear. Anyone can grab them. This is more evidence that no one should be doing that.”

The BADASS program was created not merely to track advertising and analytic data but to solve a much bigger problem: There is an overwhelming amount of smartphone tracking data being collected by intelligence agencies, and it’s difficult to make sense of.

First there are the major platforms: iOS, Android, Windows Phone, and BlackBerry. On each platform, a range of hardware and platform versions are in use. Additionally, app stores are overflowing; new apps that track people get released every day. Old apps constantly get updated to track people in different ways, and people use different versions of apps for different platforms all at once. Adding to the diversity, there are several different ad and analytics companies that app developers use, and when those companies send tracking data back to their servers, they use a wide variety of formats.

With such an unwieldy haystack of data, GCHQ and CSEC started the BADASS program, according to the presentation, to find the needles: information that can uniquely identify people and their devices, such as smartphone identifiers, tracking cookies, and other unique strings, as well as personally identifying information like GPS coordinates and email addresses.

BADASS is an acronym that stands for BEGAL Automated Deployment And Survey System. (It is not clear what “BEGAL” stands for, in turn.) The slideshow presentation is called “Mobile apps doubleheader: BADASS Angry Birds,” and promises “protocols exploitation in a rapidly changing world.”

Analysts are able to write BADASS “rules” that look for specific types of tracking information as it travels across the internet.

For example, when someone opens an app that loads an ad, their phone normally sends an unencrypted web request (called an HTTP request) to the ad network’s servers. If this request gets intercepted by spy agencies and fed into the BADASS program, it then gets filtered through each rule to see if one applies to the request. If it finds a match, BADASS can then automatically pull out the juicy information.

In the following slide, the information that is potentially available in a single HTTP request to load an ad includes which platform the ad is being loaded on (Android, iOS, etc.), the unique identifier of the device, the IMEI number which cell towers use to identify phones that try to connect to them, the name and version of the operating system that’s running, the model of the device, and latitude and longitude location data.

Similar information is sent across the internet in HTTP requests in several different formats depending on what company it’s being sent to, what device it’s running on, and what version of the ad or analytics software is being used. Because this is constantly changing, analysts can write their own BADASS rules to capture all of the permutations they can find.
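The rule-matching idea described above can be sketched in a few lines of Python. This is only an illustration of the general technique: the documents do not reveal the actual BADASS rule format, and every hostname, rule name, and parameter name below is invented for the example.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Hypothetical rules. Each one pairs a host pattern with the query
# parameters worth extracting. All hosts and parameter names are invented.
RULES = [
    {
        "name": "example-ad-network",
        "host": re.compile(r"(^|\.)ads\.example\.com$"),
        "fields": {"device_id": "udid", "platform": "os",
                   "model": "model", "lat": "lat", "lon": "lon"},
    },
    {
        "name": "example-analytics",
        "host": re.compile(r"(^|\.)stats\.example\.net$"),
        "fields": {"device_id": "did", "os_version": "osv"},
    },
]


def apply_rules(url, rules=RULES):
    """Return the fields extracted by the first matching rule, or None."""
    parts = urlsplit(url)
    # An eavesdropper cannot read the path or query of an HTTPS request,
    # which is modeled here by skipping anything that is not plain HTTP.
    if parts.scheme != "http":
        return None
    host = parts.hostname or ""
    query = parse_qs(parts.query)
    for rule in rules:
        if rule["host"].search(host):
            extracted = {
                label: query[param][0]
                for label, param in rule["fields"].items()
                if param in query
            }
            return {"rule": rule["name"], "fields": extracted}
    return None


request = ("http://ads.example.com/getad"
           "?udid=abc123&os=android&model=DroidX&lat=51.5&lon=-0.12")
print(apply_rules(request))
```

In this sketch, covering a new ad network or a changed request format means appending one more entry to the rule list, which is roughly the kind of flexibility the presentation attributes to analyst-written rules.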

The following slide shows part of the BADASS user interface, and a partial list of rules.

The slideshow includes a section called “Abusing BADASS for Fun and Profit” which goes into detail about the methodology analysts use to write new BADASS rules.

By looking at intercepted HTTP traffic and writing rules to parse it, analysts can quickly gather as much information as possible from leaky smartphone apps. One slide states: “Creativity, iterative testing, domain knowledge, and the right tools can help us target multiple platforms in a very short time period.”

The slides also appear to mock the privacy promises of ad and analytics companies.

Companies that collect usage statistics about software often insist that the data is anonymous because they don’t include identifying information such as names, phone numbers, and email addresses of the users that they’re tracking. But in reality, sending unique device identifiers, IP addresses, IMEI numbers, and GPS coordinates of devices is far from anonymous.
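To see why a persistent identifier undermines the anonymity claim, consider a minimal Python sketch that groups records by device identifier. The records, field names, and values are all invented for illustration, and nothing here is drawn from the BADASS slides.

```python
from collections import defaultdict

# Invented sample records. None contains a name, phone number, or email.
records = [
    {"device_id": "dev-A", "app": "game", "hour": 8, "lat": 51.50, "lon": -0.12},
    {"device_id": "dev-B", "app": "game", "hour": 9, "lat": 45.42, "lon": -75.70},
    {"device_id": "dev-A", "app": "game", "hour": 23, "lat": 51.50, "lon": -0.12},
    {"device_id": "dev-A", "app": "weather", "hour": 13, "lat": 51.52, "lon": -0.10},
]


def build_profiles(records):
    """Group records by device identifier into per-device profiles."""
    profiles = defaultdict(lambda: {"apps": set(), "sightings": []})
    for record in records:
        profile = profiles[record["device_id"]]
        profile["apps"].add(record["app"])
        profile["sightings"].append(
            (record["hour"], record["lat"], record["lon"])
        )
    for profile in profiles.values():
        profile["sightings"].sort()  # order each device's trail by hour
    return dict(profiles)


for device_id, profile in build_profiles(records).items():
    print(device_id, sorted(profile["apps"]), profile["sightings"])
```

Even in this toy data, the identifier alone ties together which apps one device runs and where it was seen in the morning, at midday, and late at night, which is the kind of pattern that can point to a specific person.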

In one slide, the phrase “anonymous usage statistics” appears in conspicuous quotation marks. The spies are well aware that despite not including specific types of information, the data they collect from leaky smartphone apps is enough for them to uniquely identify their targets.

The following slides show a chunk of Flurry’s privacy policy (at this point it has been replaced by Yahoo’s privacy policy), which states what information it collects from devices and how it believes this is anonymous.

The red box, which is present in the original slides, highlights this part: “None of this information can identify the individual. No names, phone numbers, email addresses, or anything else considered personally identifiable information is ever collected.”

Clearly the intelligence services disagree.

“Commercial surveillance often appears very benign,” Angwin says. “The reason Flurry exists is not to ‘spy on people’ but to help people learn who’s using their apps. But what we’ve also seen through Snowden revelations is that spy agencies seek to use that for their own purposes.”

While the BADASS program is specifically designed to target smartphone traffic, websites suffer from these exact same problems, and in many cases they’re even worse.

Websites routinely include bits of tracking code from several different companies for ads, analytics, and other behavioral tracking. This, combined with the lack of HTTPS, turns your web browser into a surveillance device that follows you around, even if you switch networks or use proxy servers.

In other words, while the BADASS presentation may be four years old, and while it’s been a year and a half since Snowden’s leaks began educating technology companies and users about the massive privacy threats they face, the big privacy holes exploited by BADASS remain a huge problem.

Photo, top: Christopher Furlong/Getty Images