Unidentified Halo

Unidentified Halo is a wearable hat that responds to widespread surveillance culture and the lack of biometric privacy in public spaces. The hat is intended to shield the wearer from facial detection by surveillance cameras by creating a halo of infrared light around the face.

As recently as last week, new reporting emerged suggesting that as many as half of all Americans are already included in a massive database of faces. Government reports have long confirmed that millions of images of citizens are collected and stored in federal face recognition databases. Police departments across the country use facial recognition technology for predictive policing. One major problem with these systems is that some facial recognition algorithms have been shown to misidentify black people at disproportionately high rates. There is also the risk of misidentifying an innocent person as a criminal, a mistake that can have disastrous consequences.

Shir David and I worked together on this project. We saw the piece not only as a fashion statement but also as an anti-surveillance tool that could be worn by anyone on the street who is concerned about protecting their privacy.

[Image: a surveillance camera's view, showing a halo of infrared light obscuring my face]

Since the human eye can’t see infrared light, the hat doesn’t draw any attention to the wearer. In the image above, the surveillance camera “sees” a halo of light around my face, which prevents Google’s Cloud Vision platform from detecting a face. When we ran the images through Google’s API, it not only detected Shir’s face but even offered guesses at her emotions based on facial cues. My face, on the other hand, went undetected.
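For anyone who wants to replicate the test: the check boils down to a single FACE_DETECTION request against the Cloud Vision REST API. Here’s a minimal Node sketch of that kind of call; the file name and the GOOGLE_API_KEY environment variable are placeholders, not part of our actual setup.

```javascript
// Minimal sketch: ask Cloud Vision's FACE_DETECTION feature whether it
// can find a face in a photo. Assumes an API key in GOOGLE_API_KEY;
// 'hat-test.jpg' is a placeholder file name.
const fs = require('fs');
const https = require('https');

const body = JSON.stringify({
  requests: [{
    image: { content: fs.readFileSync('hat-test.jpg').toString('base64') },
    features: [{ type: 'FACE_DETECTION', maxResults: 5 }]
  }]
});

const req = https.request({
  hostname: 'vision.googleapis.com',
  path: '/v1/images:annotate?key=' + process.env.GOOGLE_API_KEY,
  method: 'POST',
  headers: { 'Content-Type': 'application/json' }
}, (res) => {
  let data = '';
  res.on('data', (chunk) => { data += chunk; });
  res.on('end', () => {
    const faces = JSON.parse(data).responses[0].faceAnnotations || [];
    if (faces.length === 0) {
      console.log('No face detected: the halo did its job.');
    } else {
      // Each annotation includes emotion likelihoods, e.g. joyLikelihood.
      faces.forEach((f) => console.log('Face found. Joy:', f.joyLikelihood));
    }
  });
});

req.write(body);
req.end();
```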

The project began as a subversive “kit” of wearable items that would allow the wearer to prevent their biometric data from being collected. Shir and I were both frustrated by the ubiquity and the invisibility of the mechanisms of biopower, from surveillance cameras on streets to fingerprint scanners at the airport. When we discussed the idea with several engineers at NYU, they suggested that if we wanted to pursue it further, we should construct a hat that shines infrared light on the wearer’s face.

We all agreed that the hat shouldn’t require any technical know-how to operate and that the battery should be easy to recharge. To that end, we soldered together 22 IR LEDs powered by a rechargeable 500 mAh lithium battery and controlled by a potentiometer, then adhered the circuit to a baseball cap. The LEDs are wired along the bill of the hat and the battery is tucked into the rim.
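A rough back-of-the-envelope check on battery life, assuming a typical forward current of about 20 mA per IR LED (a datasheet-style estimate, not a measurement from our build): 22 LEDs × 20 mA ≈ 440 mA of total draw, so the 500 mAh battery should run the circuit at full brightness for roughly 500 / 440 ≈ 1.1 hours. Turning the potentiometer down extends that considerably.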

Humans can’t see the infrared light at all; it shows up only on a camera’s feed, so the wearer won’t draw attention to themselves on the street. In terms of users, we imagine this wearable being worn by someone who wants to protect their biometric identity from being tracked while in public, without causing a stir.


In future versions of the project, we would move the LEDs further down the bill of the hat so that they sit closer to the face. We would also make sure the lithium battery is safely wrapped in a plastic enclosure so that it can’t be accidentally punctured. And, of course, we would sew everything together to improve the appearance of the hat.

We also need to work out why the infrared light shows up on some IP surveillance cameras but not others (likely a question of which cameras sit behind infrared-cut filters), and what kinds of cameras are actually in use on subway platforms and street corners, for example. Of course, this project fails to address the ubiquity of iPhone cameras, which don’t pick up infrared light and have extremely advanced facial recognition algorithms. These questions will inform the next iteration of the wearable.


Last week I presented a handful of different design concepts for my project. The feedback from my classmates was very positive. While I feel the project still lacks focus at this stage, their comments reaffirmed that the different iterations of this project are all connected by a conceptual thread. My task in the coming weeks is to keep following that thread and to treat each iteration of the project as a creative intervention into the same set of questions.

Theory & conceptual framework.

We know that systems trained on biased data sets may exhibit those biases in use, digitizing cultural prejudices like institutional racism and classism. Researchers working in computer vision operate in a liminal space, one in which the consequences of their work remain undefined by public policy. Very little work has been done on “computer vision as a critical technical practice that entangles aesthetics with politics and big data with bodies,” argues Jentery Sayers.

I want to explore the ways in which algorithmic authority exercises disciplinary power on the bodies it “sees” through computer vision. Last week I wrote about Lacan’s concept of the gaze, in which the subject of a viewer’s gaze internalizes their own subjectivization. Michel Foucault wrote in Discipline and Punish about how the gaze is employed in systems of power. I’ve written extensively about biopower and surveillance in previous blog posts (here and here), but I want to continue exploring how people regulate their behavior when they know a computer is watching. Whether or not anyone is actually watching, the computer’s gaze has a self-regulating effect on the person who believes they are being observed.

It’s important to remember that the processes involved in training a model to recognize patterns in images are so tedious that we tend to automate them. In his paper “Computer Vision as a Public Act: On Digital Humanities and Algocracy”, Jentery Sayers, drawing on the sociologist A. Aneesh, suggests that computer vision algorithms represent a new kind of power: algocracy, rule of the algorithm. He argues that the “programmatic treatment of the physical world in digital form” is so deeply embedded in our modern infrastructure that these algorithms have begun shaping our behavior and asserting authority over us. An excerpt from the paper’s abstract:

Computer vision is generally associated with the programmatic description and reconstruction of the physical world in digital form (Szeliski 2010: 3-10). It helps people construct and express visual patterns in data, such as patterns in image, video, and text repositories. The processes involved in this recognition are incredibly tedious, hence tendencies to automate them with algorithms. They are also increasingly common in everyday life, expanding the role of algorithms in the reproduction of culture.

From the perspective of economic sociology, A. Aneesh links such expansion to “a new kind of power” and governance, which he refers to as “algocracy—rule of the algorithm, or rule of the code” (Aneesh 2006: 5). Here, the programmatic treatment of the physical world in digital form is so significantly embedded in infrastructures that algorithms tacitly shape behaviors and prosaically assert authority in tandem with existing bureaucracies.

Routine decisions are delegated (knowingly or not) to computational procedures that—echoing the work of Alexander Galloway (2001), Wendy Chun (2011), and many others in media studies—run in the background as protocols or default settings.

For the purposes of this MLA panel, I am specifically interested in how humanities researchers may not only interpret computer vision as a public act but also intervene in it through a sort of “critical technical practice” (Agre 1997: 155) advocated by digital humanities scholars such as Tara McPherson (2012) and Alan Liu (2012). 

I love these questions posed tacitly by pioneering CV researchers in the 1970s: How does computer vision differ from human vision? To what degree should computer vision be modeled on human phenomenology, and to what effects? Can computer or human vision even be modeled? That is, can either even be generalized? Where and when do issues of processing and memory matter most for recognition and description? And how should computer vision handle ambiguity? Now, the CV questions posed by Facebook and Apple are more along these lines: Is this you? Is this them?

The project.

So how will these new ideas help me shape my project? For one, I’ve become much more wary of using pre-trained models and data sets like the Clarifai API or Microsoft’s COCO for image recognition. This week I built a Twitter bot that uses the Clarifai API to generate pithy descriptions of images tweeted at it.

[Image: the bot's generic keyword reply to a photo of a teapot]

I was honestly disappointed by the lack of specificity the model offered. However, I’m excited that Clarifai announced today a new tool that lets users train their own models for image classification.

I want to probe the boundaries of these pre-trained models: where do these tools break, and why? How can I distort images so that objects are recognized as something other than themselves? What would happen if I trained my own model on a gallery of images that I have curated? Computer vision isn’t source code; it’s a system of power.


For my project, I want to have control over the content the model is trained on so that it outputs interesting or surprising results. In terms of aesthetics, I want to try out different visual ways of organizing these images: clusters, tile patterns, and so on. Since training one of these models can take a month or more, the goal for this week is to start building the data set and the model, along the lines of the sketch below.
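Here’s a sketch of what seeding and training a custom model could look like with Clarifai’s v2 JavaScript client. The concept id, model name, and gallery URLs are all placeholders, and the exact client calls are my reading of Clarifai’s v2 docs rather than code from this project.

```javascript
// Sketch: train a Clarifai model on a curated gallery (v2 JS client).
// Concept ids, model name, and URLs below are placeholders.
const Clarifai = require('clarifai');
const app = new Clarifai.App({ apiKey: process.env.CLARIFAI_API_KEY });

// 1. Upload curated images, each tagged with the concept it should teach.
app.inputs.create([
  { url: 'https://example.com/gallery/001.jpg', concepts: [{ id: 'watched', value: true }] },
  { url: 'https://example.com/gallery/002.jpg', concepts: [{ id: 'watched', value: false }] }
])
  // 2. Create a model over those concepts, then kick off training.
  .then(() => app.models.create('curated-gaze', [{ id: 'watched' }]))
  .then((model) => model.train())
  .then((model) => console.log('Training started for model', model.id))
  .catch((err) => console.error(err));
```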

I’ve been reading Wendy Chun’s Programmed Visions and Alexander Galloway’s Protocol: How Control Exists After Decentralization for months, but I’m recommitting to finishing these books in order to develop my project’s concept more fully.

This semester I’ve been interrogating the concept of “algorithmic gaze” vis-a-vis available computer vision and machine learning tools. Specifically, I’m interested in how such algorithms describe and categorize images of people.

For this week’s assignment, we were to build a Twitter bot using JavaScript (and other tools we found useful: Node, RiTA, Clarifai, etc.) that generates text on a regular schedule. I’ve already built a couple of Twitter bots in the past using Python, including BYU Honor Code, UT Cities, and Song of Trump, but I had never built one in JavaScript. For this project, I immediately knew I wanted to experiment with building a Twitter bot that uses image recognition to describe what it sees in an image.

To do so, I used the Clarifai API to access an already-trained neural net that generates a list of keywords from an image input. Any Twitter user can tweet an image at my bot and receive a reply that includes a description of what’s in the photo, along with a magick prediction for their future (hence the name, crystal gazing).
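The keyword step reduces to a single predict call against Clarifai’s pre-trained general model. A sketch, assuming the clarifai npm client and an API key in CLARIFAI_API_KEY; the image URL is a placeholder:

```javascript
// Sketch: extract descriptive keywords from an image URL using
// Clarifai's pre-trained general model (clarifai npm client).
const Clarifai = require('clarifai');
const app = new Clarifai.App({ apiKey: process.env.CLARIFAI_API_KEY });

function keywordsFor(imageUrl) {
  return app.models.predict(Clarifai.GENERAL_MODEL, imageUrl)
    // Concepts come back sorted by confidence; keep just the names.
    .then((res) => res.outputs[0].data.concepts.map((c) => c.name));
}

keywordsFor('https://example.com/teapot.jpg')
  .then((words) => console.log(words)); // e.g. ['teapot', 'tea', 'cup', ...]
```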

[Image: a sample bot reply generated from the teapot photo's keywords]

After pulling an array of keywords from the image using Clarifai, I used Tracery to construct a grammar that includes a waiting message, a collection of insights into the image built from those keywords, and a pithy life prediction.
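Roughly, the grammar works like the sketch below; the rule names and phrasings here are illustrative stand-ins rather than the bot’s actual grammar, and keywordsFor is the hypothetical helper from the previous sketch.

```javascript
// Sketch: a Tracery grammar that weaves Clarifai keywords into a reply.
// Rule names and phrasings are illustrative, not the bot's real grammar.
const tracery = require('tracery-grammar');

function buildReply(keywords) {
  const grammar = tracery.createGrammar({
    keyword: keywords, // e.g. ['teapot', 'tea', 'cup']
    wait: ['Gazing into the crystal...', 'The mists are parting...'],
    insight: ['I see #keyword# in your aura', 'a #keyword# shadows your path'],
    prediction: ['Great fortune awaits.', 'Beware the coming week.'],
    origin: ['#wait# #insight#. #prediction#']
  });
  grammar.addModifiers(tracery.baseEngModifiers);
  return grammar.flatten('#origin#');
}

console.log(buildReply(['teapot', 'tea', 'cup']));
```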


I actually haven’t deployed the bot to a server just yet because I’m still ironing out some issues in the code, namely asynchronous callbacks that fire functions out of order, but you can still see how the bot works by checking it out on Twitter or GitHub.
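The usual fix is to chain the reply off the Clarifai promise instead of firing it from a separate callback, so the tweet can’t go out before the keywords exist. A sketch using the twit client (an assumption, since the post doesn’t name the Twitter library), reusing the hypothetical keywordsFor and buildReply helpers from above; the bot handle and credential names are placeholders:

```javascript
// Sketch: reply only after the Clarifai promise resolves. Uses the
// twit client; keywordsFor/buildReply are the hypothetical helpers
// sketched above, and the handle/credentials are placeholders.
const Twit = require('twit');
const T = new Twit({
  consumer_key: process.env.TWITTER_KEY,
  consumer_secret: process.env.TWITTER_SECRET,
  access_token: process.env.TWITTER_TOKEN,
  access_token_secret: process.env.TWITTER_TOKEN_SECRET
});

const stream = T.stream('statuses/filter', { track: '@crystalgazing' });
stream.on('tweet', (tweet) => {
  const media = tweet.entities.media;
  if (!media) return; // ignore tweets without an image
  keywordsFor(media[0].media_url_https) // wait for Clarifai...
    .then((words) => {
      T.post('statuses/update', { // ...then send the reply
        status: '@' + tweet.user.screen_name + ' ' + buildReply(words),
        in_reply_to_status_id: tweet.id_str
      });
    })
    .catch((err) => console.error(err));
});
```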

You can see the Twitter bot here and find the full code here in my GitHub repo. I also built a version of the bot for the browser, which you can play around with here.