Becca Ricks – Page 3 – Creative explorations, experiments with code, internet art, design research, and other work.

Detourning Breitbart: An experiment in web scraping.

Posted on January 31, 2017January 31, 2017 by baricks

Like most of the people I’ve talked to this week, I’m overwhelmed by both the scale and the velocity with which the Trump/Bannon administration has undermined basic constitutional rights within the first week of office. Furthermore, the administration sends message that are just false (“The ban isn’t a Muslim ban”, “The ban doesn’t affect U.S. green card holders”, “Protesters are being organized and funded by CAIR”). Part of the issue is that the Trump administration provided no guidance to the Department of Homeland Security as to how the Executive Order was to be enforced, leaving such decisions to the vagaries of local law enforcement.

In recent days, it’s become more clear that such pronouncements have Steve Bannon’s fingerprints all over them. A white nationalist with isolationist impulses, Bannon has been disseminating his views as editor of Breitbart. As an organization, Breitbart generates false, unverified stories aimed at stoking fear among white nationalists.

I decided to scrape the headlines from Breitbart’s homepage and run them through a Markov chain to generate newer, even faker headlines. If the original headlines were dubious, the new ones are even more suspect. Here’s a sample:

You can find all my code here.

I also found a video online of Bannon lecturing in front of (no joke) a painting that includes the Bill of Rights, an American flag, and the Liberty Bell. So I threw in real Breitbart headlines I’d scraped and made this:

Thesis statement & research framework.

Posted on January 30, 2017January 30, 2017 by baricks

THESIS QUESTIONS

How are our online behaviors being interpreted and understood by machine learning algorithms? How do we adjust our behavior when we know it’s being surveilled and categorized? To what extent do we come to see and identify ourselves through the ‘eyes’ of the algorithm? How do users adjust their online behavior in response to algorithms?

Description

With my project, I want to take several different approaches to addressing the same set of questions.

First, I plan to do user research in the shape of individual anecdotes and broader surveys. I want to understand how prediction or recommendation engines, powered by increasingly accurate machine learning algorithms, are shaping our behaviors online. More importantly, I want to gain insights into how these mechanisms make us feel when we encounter them. I plan to send out an initial survey next week that gets to the heart of some of these questions.

Second, I intend to build a tool that gives users greater visibility into how algorithms are constructing a portrait of them based on their online behavior. What advertisements did they click? Who are their friends? What did they last purchase? I’m still not sure what form the tool itself will take but I plan to continue researching and referencing the work done by other researchers and activists.

Third, I want to gather up all my findings – both qualitative and quantitative – and present them in an engaging, exploratory way. I will likely write a research paper summarizing what I’ve discovered, but I also want to make that research accessible and educational to the average internet user.

Research Approach

First, the literature. I’ve started reading a number of books and academic articles that are relevant to this topic. Wendy Chun’s books Programmed Visions and Updating to Remain the Same have already been central to my research. I also plan to read Alexander Galloway’s Protocol, Patrick Hebron’s Learning Machines, and Weapons of Math Destruction. I’m making my way through Microsoft Research’s summary of academic articles related to critical algorithm studies. One article that’s been helpful in understanding user anecdotes has been Taina Bucher’s “The algorithmic imaginary: exploring the ordinary effects of Facebook algorithms,” which takes an ethnographic approach to understanding how users interact with algorithms online.

Second, the user research. I’m going to conduct my own user research in the form of surveys and the collection of individual anecdotes. I want to pinpoint specific interactions that users find particularly unnerving, creepy, benign, or invisible. I also want to understand how a knowledge that their news feed is filtered affects the way they interact with the platform.

Third, the experts. I want to get in touch with several researchers and artists who are already making strides in this field. I’m planning to reach out this week.

Personal Statement

Many people remember the day they first logged online or the day they got their first gmail account. I remember the exact day Facebook introduced its News Feed, a feature that allowed users to see what their friends were talking about on the platform. I remember going to high school that day and talking with my friends about the strangeness of it all, the experience of seeing what other people were commenting on and liking. And yet within days we had accepted and embraced the changes to the platform.

Since that day, Facebook has unrolled a number of changes to its platform, many of which we don’t notice or see because they are minor tweaks to the algorithm that dictates what information we see and what information is rendered invisible. Most recently, machine learning tools have thrown a whole new set of problems into the mix, as such algorithms become increasingly more nebulous and less transparent. I’m interested in understanding how algorithms – not just on Facebook, but on every platform – make us feel when we notice them. I also want to understand how users adjust their behavior in dialogue with such algorithms.

Much of my work at ITP has been focused on data privacy, surveillance culture, and the blurring of public and private spaces. I intend my thesis to be a continuation of past research and projects.

Slides from thesis presentation

Posted on January 30, 2017January 30, 2017 by baricks

My initial presentation to the class, outlining past projects and an introduction to my thesis topic.

Project progress: F(x) = x – c + b.

Posted on November 9, 2016November 9, 2016 by baricks

Since I wrote extensively about the user journey and narrative last week, I wanted to review some of the technical work I’ve been doing this week as I’ve attempted to get my deep learning framework (a tool for generating stories from images) up and running.

I started by following the installation & compilation steps outlined here for neural storyteller. The process makes use of skip-thought vectors, word embeddings, conditional neural language models, and style transfer. First, I installed dependencies, including NumPy, SciPy, Lasagne, Theano, and all their dependencies. Once I finish setting up the framework, I’ll be able to do the following:

Train a recurrent neural network (RNN) decoder on a genre of text (in this case, mystery novels). Each passage from the novel is mapped to a skip-thought vector. The RNN is conditioned on the skip-thought vector and aims to generate the story that it has encoded.
While that’s happening, train a visual-semantic embedding between Microsoft’s COCO images and captions. In this model, captions and images are mapped to a common vector space. After training, I can embed new images and retrieve captions.
After getting the models & the vectors, I’ll create a decoder that maps the image caption vectors to a text vector that I would then feed to the encoder to get the story.
The three vectors would be as follows: an image caption x, a “caption style” vector c and a “book style” vector b. The encoder F would therefore look like this: F(x) = x – c + b. In short, it’s like saying “Let’s keep the idea of the caption, but replace the image caption with a story.” This is essentially the style transfer formula that I will be using in my project. In this scenario, c is obtained from the skip-thought vectors for Microsoft COCO training captions and b obtained from the skip-thought vectors for mystery novel passages.

So far I’ve successfully set up the frameworks for skip-thought vectors (pre-trained on romance novels) & Microsoft’s COCO vectors. Now, I’m in the middle of installing and compiling Caffe, a deep learning framework for captioning images. I feel like I’ve hit a bit of a wall in the compilation process. I’ve run these commands specified in the Makefile, which have succeeded:

make clean
make all
make runtests
make pycaffe

When I try to import caffe, however, I get an error stating that the module caffe doesn’t exist, which means that there was some error in the build/compilation process. I’ve been troubleshooting the build for over a week now. I’ve met with several teachers, adjuncts, and students to troubleshoot. Today, I finally decided to use an Ubuntu container for caffe called Docker (which came highly recommended from a number of other students). I’m optimistic that Docker will help control some of the Python dependency/version issues I keep running into.

When I haven’t been working on the machine learning component of my project, I’ve started working on the website (running server-side, using node + express + gulp) where the game will live. I’ll be using this JQuery plugin that mimics the look and feel of a Terminal window.

Project update: Text-based game powered by machine learning.

Posted on November 2, 2016October 12, 2022 by baricks

For my project, I plan to build an interactive, text-based narrative where the text and the plot is generated through machine learning methods. At each stage in the narrative, the user will be prompted to choose the next step at various stages in the story.

The content of the game will be driven by a machine learning tool that takes image files and generates sequential stories from the images.

Here’s the storyboard / user flow for the game:

In terms of the technical details, I need to train my own data set on a specific genre of literature (horror? detective stories? thriller? choose your own adventure books) using the neural storyteller tool. Neural storyteller makes use of several different deep learning frameworks and tools, including skip thoughts, caffe, theanos, numpy, and scikit. Here’s an overflow of how the text in the game will be generated:

Here is the tentative schedule for the work:

Week 1: Nov. 2 – 8

Get the example encoder/trainer/models up and running (2-3 days).
Start training the same program on my own genre of literature (2-3 days).
Start building the website where the game will live (2 hrs).

Week 2: Nov. 9 – 15

After getting the machine learning framework working as I did with my solitaire app, start thinking about ways to structure the generative stories into the narrative arc (2-3 days).
Start building the front end of the game – upload buttons, submit forms. (1 day).

Week 3: Nov. 16 – 29

Start establishing the rules of game play & build the decision tree (2 days).
Continue building the website and tweaking the narrative (2 days).

Week 4: Nov. 30 – Dec. 7

User testing. Keep revising the game. Get feedback.

Statelessness and identity: A virtual reality experience.

Posted on November 1, 2016November 2, 2016 by baricks

Background research

When Leal’s father was born in Lebanon, her grandfather never registered his birth. Leal herself was born in Lebanon but is undocumented because her father wasn’t able to register the birth of any of his children. Lebanon is one of 27 countries that practices discriminatory nationality laws, according to a 2014 UNHCR report. The nationality law of Lebanon allows only Lebanese fathers to confer their nationality to their children in all circumstances; women cannot confer citizenship to their daughters.

Leal is considered “stateless” according to the UN definition. The term ‘stateless person’ refers to someone who is not considered a national by any country under the operation of its law. There are a number of reason individuals become stateless – perhaps they were displaced after a conflict or belong to an ethnic group that was never recognized by a nation as citizens of the country.

“To be stateless is like you don’t exist, you simply don’t exist,” Leal says. “You live in a parallel world with no proof of your identity.”

Today there are at least 10 million people worldwide who are stateless. Because such individuals are denied a legal identity, they aren’t afforded the same basic human rights we enjoy. Often they are denied access to housing, education, marriage certificates, health care, and job opportunities during their lives and confer the stateless status to their children. Most of these individuals lose their nationality by no fault of their own.

“Invisible is the word most commonly used to describe what it is like to be without a nationality,” says Mr. Grandi, Commissioner of the United Nations High Commissioner for Refugees. “For stateless children and youth, being ‘invisible’ can mean missing out on educational opportunities, being marginalized in the playground, being ignored by healthcare providers, being overlooked when it comes to employment opportunities, and being silenced if they question the status quo.”

Why it matters

Statelessness is a product of a world that is increasingly defined by political boundaries and identities. It’s also a product of loose migration laws in regions like Europe, where migrants leave home and then find that they cannot return. Individuals who are stateless often have a difficult time seeking asylum in other countries even though the loss of legal identity isn’t their fault.

Another theme that emerged from our research was the role that discrimination plays in producing stateless populations. As mentioned above, many countries discriminate against women in nationality laws. There are also a number of groups denied citizenship due to ethnic or religious discrimination. For instance, over 1 million of the Muslim Rohingya people in Myanmar do not have citizenship due to religious discrimination against Muslims. In the Dominican Republic, new laws have stripped up to 200,00 Haitian immigrants and their descendants of Dominican citizenship – even going so far as to deport thousands of people.

Ruta and I interested in understanding how such stateless individuals navigate a world in which they might possess overlapping ethnic/religious/national identities but lack a legal identity. The feeling most often described by these individuals is that of invisibility. With this project, we’re aiming to give voice to an individual (or group of individuals) who are unclaimed by a state government.

Concept

The project is a virtual reality film that explores narratives of statelessness and displacement. The idea is to find a compelling personal story and pair it with stunning visuals that match the content of the story. We want the narrative itself to drive the tenor and the mood of the film.

In terms of audience participation, Ruta and I felt more drawn to a heavily curated experience in which the audience hears the audio of a story and explores places that appear in the narrative. We think the best user experience will be one in which the narrative is somewhat more controlled rather than exploratory. We want to maintain an emotional, personal tone to the piece.

Audio will obviously play a huge role in the realization of this project, so we want to make sure the recording we use is itself a character in the film.

Next steps

Right now Ruta and I are reaching out to people we know who are experts in the field of human rights law, refugees, and migration. We’re hoping to make connections with people who are advocating for stateless individuals and find the right story or individual to drive this piece.

I’ve reached out to the following friends/experts:

Devon C. – Works for the UNHCR to resettle refugees in the Middle East and has been an advocate for refugees in the current Syrian refugee crisis (see her recent Foreign Policy piece).
Thelma Y. – Worked as an activist in Myanmar
Estee W. – A student at UPenn law studying discriminatory employment laws in Arab countries. Studied women’s employment laws in Jordan on a Fulbright scholarship.

Data Art: Text, Archives & Memory Stores

Posted on October 31, 2016October 31, 2016 by baricks

I’m interested in the methods we use to remember (or disremember) our past selves. In a digital era, we’re each constantly producing a steady stream of words and images that are recorded in our iCloud or on our Facebook timelines, for instance. A recent class speaker even suggested that our collective tweets may be permanently archived by the Library of Congress (but it’s still not a done deal). For this reason, I wanted to work with a body of data that serves as a digital artifact of my personal life. I wanted a way to frame the text against other events that were occurring in my life at the time.

For my first pass at this project, I chose to perform a simple text analysis on the texts I’d sent over the past year. I used a Python script to extract the information I wanted from my iMessage database (date + content of each message), analyze the sentiment of each message, and save the entire thing to a CSV.

In Processing, I plotted the sentiment of each text (from -1.0 to 1) against the “subjectivity” of each text (i.e. was this actually something I was expressing about myself vs. about someone else). In the scatterplot, sentiment is plotted on the x-axis and subjectivity on the y-axis. The result:

After plotting the results of the text analysis, I discovered that there’s a correlation between the subjectivity and the sentiment of the text. The more subjective the text is, the more likely it is that the text has an extreme polarity, negative or positive. This makes intuitive sense; after all, the more emotional my texts are, the more likely they’re personal/subjective.

If I were to do a second pass at this project, I think I would have chosen not a perform a sentiment analysis of the text. Sentiment analysis can be notoriously inaccurate and I think that ultimately this graph doesn’t reveal anything particularly interesting about my communication style. I’d be more interested in seeing which words I use most often, or how my communication style changed over a period of time.

Algorithmic gaze: Automating our decision-making capabilities.

Posted on October 20, 2016October 20, 2016 by baricks

01-jpgeefc4b6d-889a-48ce-8b70-9474711db893original

After spending many hours trying to articulate the perfect project concept that would appropriately communicate the research I’ve done thus far, I stumbled onto an idea that I think gets to the heart of what I’m trying to understand about computer vision. Namely, how might algorithms of the future use visual information to draw conclusions about you? And what are the consequences of ceding over our decision-making capabilities to a computer?

Here’s the quick and dirty elevator pitch for the game:

What happens when we let a computer make decisions on our behalf? ALGORITHMIC GAZE is an interactive web-based choose-your-own adventure game that makes personalized decisions for you based on a neural network trained on a collection of images. The project anticipates and satirizes a world in which we cede decision-making authority over to our computers.

I plan to build a low-fidelity game in three.js and WebGL. At the start of the game, the user will upload a handful of pictures and enter information about herself. Then, she will be guided through three different scenarios/scenes, in which there are objects with which she can interact. Each object will prompt a moment of decision: Let me decide or let the computer decide for me.

The program will use the images uploaded by the user to make decisions on behalf of the user. By tapping into a machine learning API, the program will use object recognition, sentiment analysis, facial recognition, and color analysis to make certain conclusions about the user’s preferences. The decisions made on behalf of the user may prompt illogical or surprising outcomes.

A storyboard of the experience:

Here’s what the basic decision tree will look like as you move through each scene.

Unidentified halo: A wearable that thwarts facial detection.

Posted on October 18, 2016October 18, 2016 by baricks

Unidentified halo is a wearable hat that responds to widespread surveillance culture and a lack of biometric privacy in public spaces. The hat is intended to shield the wearer from facial detection on surveillance cameras by creating a halo of infrared light around the face.

As recently as last week, new information has emerged suggesting that as many as half of all Americans are already included in a massive database of faces. Government reports have long confirmed that millions of images of citizens are collected and stored in federal face recognition databases. Police departments across the country use facial recognition technology for predictive policing. One major problem with these systems is that some facial recognition algorithms have been shown to misidentify black people at unusually high rates. There is also the problem of misidentifying a criminal – and how such mistakes can have disastrous consequences.

Shir David and I worked together on this project. We saw this piece as not only a fashion statement, but also an anti-surveillance tool that could be worn by anyone on the street who is concerned about protecting their privacy.

Since the human eye can’t see infrared light, the hat doesn’t draw any attention to the wearer. In the image above, the surveillance camera “sees” a halo of light around my face, preventing Google’s Cloud Vision platform from detecting a face. When we ran the images through Google’s API, it not only detected Shir’s face but even offered suggestions of her emotion based on facial indicators. My face, on the other hand, went undetected.

The project began as a subversive “kit” of wearable items that would allow the wearer to prevent their biometric data from being collected. Shir and I were both frustrated with both the ubiquity and the invisibility of the mechanisms of biopower, from surveillance cameras on streets to fingerprint scanners at the airport. We discussed the idea further with several engineers at NYU and they suggested that if we were interested in pursuing the idea further, we should construct a hat that shines infrared light on the user’s face.

We all agreed that the hat shouldn’t require technical know-how and the battery could be easily recharged. To do this, we soldered together 22 IR LEDs that are powered by a rechargeable 500mAH lithium battery and monitored by a potentiometer, and then adhered the circuit to a baseball cap. The LEDs are wired along the bill of the hat and the battery is tucked into the rim.

Humans can’t see the infrared light unless they are looking through the feed of a surveillance camera, so the wearer won’t draw attention to herself when she wears it on the street. In terms of users, we imagine that this wearable will be worn by someone who wants a way to protect his biometric identity from being tracked while he’s in public without causing a stir.

In future versions of the project, we think we would move the LEDs further down the bill of the hat so that it’s closer to the face. We also would ensure that the lithium battery is safely wrapped in a plastic enclosure so that there’s no way it could be accidentally punctured. And, of course, we would sew everything together to improve the appearance of the hat.

We also need to address why the infrared light appears on some IP surveillance cameras but not others – and what kinds of cameras are in use on subway platforms or street corners, for example. Of course, this project fails to address the ubiquity of iPhone cameras, which don’t pick up infrared light and have extremely advanced facial recognition algorithms. These questions will inform the next iteration of the wearable.

Is this you? Is this them? The algorithmic gaze, again.

Posted on October 12, 2016October 12, 2016 by baricks

Last week I presented a handful of different design concepts for my project. The feedback from my classmates was actually very positive – while I feel that the project still lacks focus at this stage, their comments reaffirmed that the different iterations of this projects are all connected by a conceptual thread. My task in the coming weeks is to continue following that thread and consider each iteration of the project a creative intervention into the same set of questions.

Theory & conceptual framework.

We know that systems that are trained on datasets that contain biases may exhibit those biases when they’re used, thus digitizing cultural prejudices like institutional racism and classism. Researchers working in the field of computer vision operate in a liminal space, one in which the consequences of their work remain undefined by public policy. Very little work has been done on “computer vision as a critical technical practice that entangles aesthetics with politics and big data with bodies,” argues Jentery Sayers.

I want to explore the ways in which algorithmic authority exercises disciplinary power on the bodies it “sees” vis-a-vis computer vision. Last week I wrote about Lacan’s concept of the gaze, a scenario in which the subject of a viewer’s gaze internalizes his or her own subjectivization. Michel Foucault wrote in Discipline and Punish about how the gaze is employed in systems of power. I’ve written extensively about biopower and surveillance in previous blog posts (here and here), but I want to continue exploring how people regulate their behavior when they know a computer is watching. Whether real or not, the computer’s gaze has a self-regulating effect on the person who knows they are being looked at.

It’s important to remember that the processes involved in training a data set to recognize patterns in images are so tedious that we tend to automate them. In his paper “Computer Vision as a Public Act: On Digital Humanities and Algocracy”, Jentery Sayers suggests that computer vision algorithms represent a new kind of power called algocracy – rule of the algorithm. He argues that the “programmatic treatment of the physical world in digital form” is so deeply embedded in our modern infrastructure that these algorithms have begun shaping our behavior and assert authority over us. An excerpt from the paper’s abstract:

Computer vision is generally associated with the programmatic description and reconstruction of the physical world in digital form (Szeliski 2010: 3-10). It helps people construct and express visual patterns in data, such as patterns in image, video, and text repositories. The processes involved in this recognition are incredibly tedious, hence tendencies to automate them with algorithms. They are also increasingly common in everyday life, expanding the role of algorithms in the reproduction of culture.

From the perspective of economic sociology, A. Aneesh links such expansion to “a new kind of power” and governance, which he refers to as “algocracy—rule of the algorithm, or rule of the code” (Aneesh 2006: 5). Here, the programmatic treatment of the physical world in digital form is so significantly embedded in infrastructures that algorithms tacitly shape behaviors and prosaically assert authority in tandem with existing bureaucracies.

Routine decisions are delegated (knowingly or not) to computational procedures that—echoing the work of Alexander Galloway (2001), Wendy Chun (2011), and many others in media studies—run in the background as protocols or default settings.

For the purposes of this MLA panel, I am specifically interested in how humanities researchers may not only interpret computer vision as a public act but also intervene in it through a sort of “critical technical practice” (Agre 1997: 155) advocated by digital humanities scholars such as Tara McPherson (2012) and Alan Liu (2012).

I love these questions posed tacitly by pioneering CV researchers in the 1970s: How does computer vision differ from human vision? To what degree should computer vision be modeled on human phenomenology, and to what effects? Can computer or human vision even be modeled? That is, can either even be generalized? Where and when do issues of processing and memory matter most for recognition and description? And how should computer vision handle ambiguity? Now, the CV questions posed by Facebook and Apple are more along these lines: Is this you? Is this them?

The project.

So how will these new ideas help me shape my project? For one, I’ve become much more wary of using pre-trained data sets like the Clarifai API or Microsoft’s COCO for image recognition. This week I built a Twitter bot that uses the Clarifai API to generate pithy descriptions of images tweeted at it.

@songoftrump 🔮 Okay let me look. 🔮 It’s a color – no wait a utensil. An opportunity to be involved in luxurious sexuality is coming. 👀 🙄 🔫

— crystal gazing (@gimmeafortune) October 11, 2016

I honestly was disappointed by the lack of specificity the data set offered. However, I’m excited that Clarifai announced today a new tool for users to train their own models for image classification.

I want to probe the boundaries of these pre-trained data sets – where do these tools break and why? How can I distort images in a way that objects are recognized as something other than themselves? What would happen if I trained my own data set on a gallery of images that I have curated? Computer vision isn’t source code; it’s a system of power.

For my project, I want to have control over the content that the model is being trained on so that it outputs interesting or surprising results. In terms of the aesthetic, I want to try out different visual ways of organizing these images – clusters, tile patterns, etc. Since training one of these models can take as little as a month, the goal for this week is to start creating the data set and the model.

I’ve been reading Wendy Chun’s Programmed Visions and Alexander Galloway’s Protocol: How Control Exists After Decentralization for months, but I’m recommitting to finishing these books in order to develop my project’s concept more fully.