Facial recognition systems are everywhere, from security cameras that try to spot criminals to the way Snapchat finds your face to put bunny ears on it. Computers need a lot of data to be able to learn how to recognize faces, and some of it comes from Flickr.
IBM released a “Diversity in Faces” data set earlier this year, which is arguably a good thing: a lot of early face-recognition algorithms were trained on thin white celebrities, because it’s easy to find a lot of photos of celebrities. Your data source affects what your algorithm is able to do and understand, so there are a lot of racist, sexist algorithms out there. This data set aims to help by providing images of faces alongside data about each face, such as skin color.
But most folks who uploaded their personal snapshots to Flickr probably didn’t realize that, years down the road, their faces and their friends’ and families’ faces could be used to train the next big mega-algorithm. If you applied a Creative Commons license to your photos, even a “noncommercial” one, you could be in this data set.
NBC reports that IBM says it will remove images from the data set at the photographer’s or the photographed person’s request, but the company hasn’t made the data set public, so there’s no way to see for sure whether you are actually in there. Getting a photo removed won’t be easy, but if you want to know whether any of yours have been used, you can enter your Flickr username into NBC’s tool here. This isn’t necessarily the only data set out there that might contain your photos, but at least for this one there’s a way to check.