Of the many new features in Apple’s iOS 11—which hit your iPhone a few weeks ago—a tool called Core ML stands out. It gives developers an easy way to implement pre-trained machine learning algorithms, so apps can instantly tailor their offerings to a specific person’s preferences. With this advance comes a lot of personal data crunching, though, and some security researchers worry that Core ML could cough up more information than you might expect to apps you’d rather not have it.
Core ML boosts tasks like image and facial recognition, natural language processing, and object detection, and supports a lot of buzzy machine learning tools like neural networks and decision trees. And as with all iOS apps, those using Core ML ask user permission to access data streams like your microphone or calendar. But researchers note that Core ML could introduce some new edge cases, where an app that offers a legitimate service could also quietly use Core ML to draw conclusions about a user for ulterior purposes.
“The key issue with using Core ML in an app from a privacy perspective is that it makes the App Store screening process even harder than for regular, non-ML apps,” says Suman Jana, a security and privacy researcher at Columbia University, who studies machine learning framework analysis and vetting. “Most of the machine learning models are not human-interpretable, and are hard to test for different corner cases. For example, it’s hard to tell during App Store screening whether a Core ML model can accidentally or willingly leak or steal sensitive data.”
The Core ML platform offers supervised learning algorithms, pre-trained to be able to identify, or “see,” certain features in new data. Core ML algorithms prep by working through a ton of examples (usually millions of data points) to build up a framework. They then use this context to go through, say, your Photo Stream and actually “look at” the photos to find those that include dogs or surfboards or pictures of your driver’s license you took three years ago for a job application. It can be almost anything.
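In practice, a developer would wrap a pre-trained model with Apple’s Vision framework to "look at" an image. The sketch below is purely illustrative: `PetClassifier` is a hypothetical model class generated by Xcode from a bundled `.mlmodel` file, standing in for any pre-trained image classifier an app might ship.

```swift
import CoreML
import Vision
import UIKit

// Illustrative sketch: classify a photo with a bundled Core ML model via the
// Vision framework. "PetClassifier" is a hypothetical Xcode-generated model
// class; any pre-trained image classifier would be used the same way.
func classify(_ image: UIImage) {
    guard let cgImage = image.cgImage,
          let model = try? VNCoreMLModel(for: PetClassifier().model) else { return }

    // Vision wraps the Core ML model and handles scaling and cropping the
    // input image to the size the model expects.
    let request = VNCoreMLRequest(model: model) { request, _ in
        guard let results = request.results as? [VNClassificationObservation],
              let top = results.first else { return }
        print("Saw: \(top.identifier) (confidence \(top.confidence))")
    }

    // Inference runs locally on the device -- no network round trip needed.
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    try? handler.perform([request])
}
```

Looping this over a user’s Photo Stream is all it takes to tag every image against whatever categories the bundled model was trained on, which is exactly the capability the scenarios below turn on.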
For an example of where that could go wrong, think of a photo filter or editing app that you might grant access to your albums. With that access secured, an app with bad intentions could provide its stated service, while also using Core ML to ascertain what products appear in your photos, or what activities you seem to enjoy, and then go on to use that information for targeted advertising. This type of deception would violate Apple’s App Store Review Guidelines. But it may take some evolution before Apple and other companies can fully vet the ways an app intends to utilize machine learning. And Apple’s App Store, though generally secure, does already occasionally approve malicious apps by mistake.
Attackers with permission to access a user’s photos could have found a way to sort through them before, but machine learning tools like Core ML—or Google’s similar TensorFlow Mobile—could make it quick and easy to surface sensitive data instead of requiring laborious human sorting. Depending on what users grant an app access to, this could make all sorts of gray behavior possible for marketers, spammers, and phishers. The more mobile machine learning tools exist for developers, the more screening challenges there could be for both the iOS App Store and Google Play.
Core ML does have a lot of privacy and security features built in. Crucially, its data processing occurs locally on a user’s device. This way, if an app does surface hidden trends in your activity and heart rate data from Apple’s Health tool, it doesn’t need to secure all that private information in transit to a cloud processor and then back to your device.
That approach also cuts down on the need for apps to store your sensitive data on their servers. You can use a facial recognition tool, for instance, that analyzes your photos, or a messaging tool that converts things you write into emojis, without that data ever leaving your iPhone. Local processing also benefits developers, because it means that their app will function normally even if a device loses internet access.
iOS apps are only just starting to incorporate Core ML, so the practical implications of the tool remain largely unknown. A new app called Nude, launched on Friday, uses Core ML to promote user privacy by scanning your albums for nude photos and automatically moving them from the general iOS Camera Roll to a more secure digital vault on your phone. Another app scanning for sexy photos might not be so respectful.
A more direct example of how Core ML could facilitate malicious snooping comes from a project targeting the iOS “Hidden Photos” album (the inconspicuous place photos go when iOS users “hide” them from the regular Camera Roll). Those images aren’t hidden from apps with photo access permissions. So the project converted an open-source neural network that finds and rates illicit photos to run on Core ML, then used it to comb through test versions of the Hidden Photos album and quickly score how salacious the images in it were. In a comparable real-world scenario, a malicious developer could use Core ML to find your nudes.
Researchers are quick to note that while Core ML introduces important nuances—particularly to the app-vetting process—it doesn’t necessarily represent a fundamentally new threat. “I suppose CoreML could be abused, but as it stands apps can already get full photo access,” says Will Strafach, an iOS security researcher and the president of Sudo Security Group. “So if they wanted to grab and upload your full photo library, that is already possible if permission is granted.”
The easier or more automated the trawling process becomes, though, the more enticing it may look. Every new technology presents potential gray sides; the question now with Core ML is what sneaky uses bad actors will find for it along with the good.