When you snap a digital photo with your camera or phone, it stores more than just the pixels and colors that make up the image. Each image file also contains metadata, which includes details ranging from creation date and copyright info to the location where the photo was taken.
The same goes for images modified with photo editing programs, which often add metadata including modification timestamps, system info, and tracked changes.
Metadata can pose a privacy threat to people who share and post photos online. Although some social networks and photo storage and sharing sites scrub metadata from uploaded photos, many fail to do so, Comparitech researchers say, which could allow attackers to gather personal information from images posted online. For example, if someone posts a vacation photo with GPS coordinates and a timestamp in the metadata, an attacker could easily find when and where they traveled.
Metadata falls into three broad categories:
- System metadata is generated when the image is stored (i.e., when a photo is taken or edits are saved). It includes specific labeled fields, such as the date and time the image was created and details about the camera or editing software used.
- Substantive metadata includes the contents of the actual file, such as tracked changes to an edited image.
- Embedded metadata includes data entered into a document that is not normally visible, such as formulas in an Excel spreadsheet.
Image metadata can be embedded internally in common image file formats like JPEG and PNG. Such embedded data is usually stored in Exif (Exchangeable Image File Format). Metadata can also live outside the image file, for example in a digital asset management (DAM) system. These external metadata files are sometimes referred to as “sidecar” files and are often stored in the XMP format.
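To make the idea of embedded metadata concrete, here is a minimal Python sketch that checks whether a JPEG file carries an Exif block. It relies on the general JPEG layout, in which the file is a sequence of marker segments and Exif data sits in an APP1 segment whose payload begins with the bytes `Exif` followed by two zero bytes. The function names are our own, and the parser is deliberately simplified: it only looks at segments that come before the compressed image data, and it does not cover PNG files or sidecar files.

```python
import struct

SOI = b"\xff\xd8"      # start-of-image marker that opens every JPEG
APP1 = 0xE1            # segment type that carries Exif (and often XMP)
SOS, EOI = 0xDA, 0xD9  # start of compressed image data, end of image


def iter_segments(data: bytes):
    """Yield (marker, payload) for each segment before the image data."""
    if data[:2] != SOI:
        raise ValueError("not a JPEG: missing start-of-image marker")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before the real marker
            pos += 1
            continue
        if marker in (SOS, EOI):
            return  # metadata segments normally come before this point
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        yield marker, data[pos + 4:pos + 2 + length]
        pos += 2 + length


def has_exif(data: bytes) -> bool:
    """Return True if any APP1 segment starts with the Exif signature."""
    return any(
        marker == APP1 and payload.startswith(b"Exif\x00\x00")
        for marker, payload in iter_segments(data)
    )
```

To try it on a real photo, read the file's bytes (for example with `pathlib.Path.read_bytes`) and pass them to `has_exif`.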
Metadata has three broad use cases:
- Describing the contents of the file, including keywords, names of persons pictured, and location coordinates
- Administrative data can include the creation date, modification date, location, and other system metadata mentioned above.
Which image sharing services scrub metadata and which ones don’t?
Comparitech researchers analyzed the metadata scrubbing practices of 12 popular image storage and sharing services online. They uploaded an image of the Mona Lisa loaded with metadata to each of the services. After the upload, they then downloaded the image from each respective service to see if the metadata remained intact or not.
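The before-and-after comparison described above can be sketched in a few lines of Python. This is only an illustration, not the researchers’ actual tooling. It assumes the metadata of the original and the re-downloaded image has already been extracted into dictionaries by some metadata reader, and the function names and result labels are our own.

```python
def compare_metadata(before: dict, after: dict) -> dict:
    """List the tags that were removed, added, or changed by a service."""
    removed = sorted(tag for tag in before if tag not in after)
    added = sorted(tag for tag in after if tag not in before)
    changed = sorted(
        tag for tag in before if tag in after and before[tag] != after[tag]
    )
    return {"removed": removed, "added": added, "changed": changed}


def summarize(before: dict, after: dict) -> str:
    """Reduce the comparison to one of three hypothetical labels."""
    diff = compare_metadata(before, after)
    if before and len(diff["removed"]) == len(before):
        return "scrubbed"  # none of the original tags survived
    if not any(diff.values()):
        return "intact"    # nothing removed, added, or changed
    return "modified"      # some tags were dropped, added, or altered
```

A service that strips everything would come out as “scrubbed”, one that returns the file untouched as “intact”, and one that adds a comment or rewrites a timestamp as “modified”.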
Let’s start with the most popular places to share images on the web. Imgur, Facebook and Instagram all scrub all metadata from photos upon upload. You don’t have to worry about leaking metadata when uploading images to these sites. Bear in mind, however, that even though users of those sites don’t have access to metadata, the sites themselves do.
Flickr keeps all of the original metadata and even displays a lot of it on each photo’s web page.
Photobox.co.uk tags photos in the metadata comments section to indicate that uploaded images are compressed. The rest of the metadata is intact. It was the only service that actually added or modified data.
The remaining image sharing and storage services we examined didn’t remove or modify any metadata except for “date modified” timestamps:
If you don’t want to expose EXIF metadata on those sites, you’ll have to scrub images beforehand. More on how to do that below.
How you can be tracked using EXIF metadata: research examples
Comparitech researchers demonstrated the sensitivity of image metadata by using publicly available images to track down image subjects and creators. (Note: we’ve scrubbed the original metadata from all of the following images.)
Let’s start with a simple example. Using the GPS metadata in the above photo, we determined it was taken near Sørstranda, Norway.
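Turning GPS metadata into a point on a map takes very little work. Exif stores latitude and longitude as degrees, minutes, and seconds, each written as a fraction, plus a reference letter (N or S, E or W). The Python sketch below converts such values to the decimal degrees that mapping sites accept. The function name and the sample coordinates are made up for illustration and are not taken from the photo discussed here.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref: str) -> float:
    """Convert Exif-style degrees/minutes/seconds to decimal degrees.

    dms is three (numerator, denominator) pairs; ref is "N", "S", "E" or "W".
    """
    degrees, minutes, seconds = (Fraction(num, den) for num, den in dms)
    value = degrees + minutes / 60 + seconds / 3600
    # Southern latitudes and western longitudes are negative.
    if ref in ("S", "W"):
        value = -value
    return float(value)


# Hypothetical values: 60 degrees 30 minutes north, 5 degrees 15 minutes east.
latitude = dms_to_decimal(((60, 1), (30, 1), (0, 1)), "N")
longitude = dms_to_decimal(((5, 1), (15, 1), (0, 1)), "E")
print(latitude, longitude)  # prints 60.5 5.25
```

Pasting the resulting pair of numbers into any online map is enough to locate where a photo was taken.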
The next subject was a photo of a man’s face. Using the image metadata, reverse image search, and a bit of open-source intelligence (OSINT), researchers were able to identify him as a previous game-show contestant. They found his country, date of birth, wedding date, spouse’s name, Facebook profile, Twitter account, LinkedIn page, Instagram account, work experience, skills, education, and interests. Researchers were also able to identify and find info about the subject’s game-show teammates as well.
Another subject was a passport-style headshot featuring a man in what appear to be military fatigues. Researchers were able to track down the image to a site with photos of the subject’s school graduation. Using the school name and graduation gallery, researchers retrieved the names of everyone in his graduating class. With the possibilities narrowed down, they found a man with a name similar to that of the image filename. Researchers went on to find the man’s Facebook and Instagram profiles. Using these images, they further discovered he was indeed a soldier. They learned his division and brigade, and info about his closest relatives.
Lastly, researchers identified a Philippine national using a photo of herself posted on an image-sharing site. The subject is holding up photo identification. Such photos are often used to verify the subject’s identity to a digital service, such as an online bank. Researchers were able to find out the subject’s country, birth date, weight, height, blood type, address, Facebook profile, job, education, and YouTube channel, as well as the fact that she recently had Covid-19.
Metadata used as court evidence
Metadata from images and other files has been used as evidence in courts of law and police investigations, demonstrating metadata’s value from a privacy perspective. Here are a few prominent examples:
- In 2016, two Harvard students used GPS coordinates stored in the metadata of photos posted on the dark web to identify 229 drug dealers. Dark web drug dealers often post images of their products online to help prove their credibility, but they often forget to scrub EXIF data beforehand.
- In 2017, an employee of Bio-Rad Laboratories filed a suit against his employer alleging he was fired for telling authorities about potential bribery in China. A performance review with a metadata timestamp dated after he was fired served as evidence in the case, resulting in a higher payout for violating laws against firing whistleblowers. This is the biggest metadata-linked payout to date at $10.8 million in damages.
- In 2015, a judge threw out a case in which a woman accused her spouse of physical abuse. The plaintiff provided several photos as evidence of abuse, but the metadata indicated that the photos were taken three months before the date on which she claimed the abuse occurred.
- Digital forensics company Legility published a case study (PDF) describing a lawsuit it investigated. In that case, a healthcare company acquired another business. Employees from the original company left to start their own company. The now-acquired company sued the new company, alleging it poached employees and stole trade secrets and proprietary documents including customer lists. Using the metadata of those documents as evidence of when documents were copied and transferred, the acquired company was awarded $7 million (£5.1 million).
How to remove metadata from images
Cameras and camera apps vary quite a bit, but many of them have an option to turn off or limit the generation of metadata. Check your camera or app settings.
Most cameras and image editing programs store image metadata in the EXIF format. You might be able to edit EXIF data on existing images through your camera or photo editing app.
Windows 10 comes with a built-in option to remove metadata. If you’re a PC user, just follow these steps:
- Right-click the image file and select Properties to open a new window
- Click the Details tab at the top
- Click the link that says Remove Properties and Personal Information at the bottom. Another new window will pop up.
- In the Remove Properties window, select Remove the following properties from this file:
- Click Select All, then OK
If you’re not on Windows, there’s no shortage of free EXIF editors and scrubbers online. Just be sure to read their privacy policies first. Here’s an open-source option that runs locally on your computer.
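For readers comfortable with a little scripting, the following Python sketch shows roughly what a local scrubber does to a JPEG. It copies the file while dropping the segments that usually hold metadata: APP1 (Exif and XMP), APP13 (IPTC and Photoshop data), and COM (free-text comments). The function name and the choice of segments are our own assumptions, and the sketch handles JPEG only. Treat it as an illustration, not a replacement for a maintained tool, and check the result with a metadata viewer before sharing.

```python
import struct

SOI = b"\xff\xd8"             # start-of-image marker
SOS, EOI = 0xDA, 0xD9         # start of compressed data, end of image
METADATA_MARKERS = {
    0xE1,  # APP1: Exif and XMP
    0xED,  # APP13: IPTC / Photoshop resources
    0xFE,  # COM: free-text comment
}


def strip_jpeg_metadata(data: bytes) -> bytes:
    """Return a copy of the JPEG bytes without common metadata segments."""
    if data[:2] != SOI:
        raise ValueError("not a JPEG: missing start-of-image marker")
    out = bytearray(SOI)
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before the real marker
            pos += 1
            continue
        if marker in (SOS, EOI):
            break  # the rest is image data; copy it untouched below
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        end = pos + 2 + length
        if marker not in METADATA_MARKERS:
            out += data[pos:end]  # keep JFIF header, color profile, tables
        pos = end
    out += data[pos:]
    return bytes(out)
```

To use it, read the photo’s bytes, pass them through `strip_jpeg_metadata`, and write the result to a new file so the original stays untouched. Note that removing Exif also removes the orientation tag, so some photos may display rotated afterward.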