Locating Facebook metadata in physical landscapes.

My Facebook metadata as landscape.

This semester, I’ve focused my attention on creative ways of interpreting and visualizing my personal Facebook data.

I’m interested in exploring the concept of “digital dualism” – the habit of viewing the online and offline as largely distinct (source). We are actively constructing our identities whether behind a screen or in person. As Nathan Jurgenson writes, “Any zero-sum ‘on’ and ‘offline’ digital dualism betrays the reality of devices and bodies working together, always intersecting and overlapping, to construct, maintain, and destroy intimacy, pleasure, and other social bonds.”

The exact location where I made a Facebook update.

With this project, I wanted to try re-inserting the digital world into the physical world. I decided to locate specific actions I took on Facebook within a physical geography and landscape.

It’s very easy to download your Facebook metadata from the website – all you have to do is follow these directions. In my data archive, I found information about every major administrative change I’ve made to my Facebook account since I created the account in 2006, including changes to my password, deactivating my account, changing my profile picture, etc. This information was interesting to me because from Facebook’s perspective, these activities were in all likelihood the most important decisions I had ever made as a Facebook user.

I rearranged that data into a simple JSON file:

I decided to explore the IP Address metadata associated with each action. I wanted to know more about the physical location where I had made these decisions concerning my Facebook account, since I obviously didn’t remember where I was or what I was doing when I had made these changes.

I wrote a Python script (see code here) that performs several different actions for each item in the JSON file:

(1) Takes the IP address and finds the corresponding geolocation, including latitude & longitude & city/state;

(2) Feeds the latitude/longitude into Google Maps’ Street View and downloads 10 images that each rotate 5 degrees;

(3) Adds a caption to each image specifying the Facebook activity, the exact date/time, and the city/state; and

(4) Merges the 10 images into a gif.
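Here’s a minimal sketch of what steps (2) and (3) could look like. It is not the actual script: it assumes the Google Street View Static API is the image source, and the function names, the image size, the placeholder API key, and the sample coordinates are all made up for illustration. It only builds the request URLs and the caption text; downloading the images and merging them into a gif are left out.

```python
from urllib.parse import urlencode

# Street View Static API endpoint (assumed to be the image source here).
STREET_VIEW_URL = "https://maps.googleapis.com/maps/api/streetview"


def street_view_urls(lat, lng, api_key, frames=10, step_degrees=5, size="640x480"):
    """Return one image URL per frame, rotating the camera heading each time."""
    urls = []
    for i in range(frames):
        params = {
            "size": size,
            "location": f"{lat},{lng}",
            "heading": (i * step_degrees) % 360,  # 0, 5, 10, ... degrees
            "key": api_key,
        }
        urls.append(f"{STREET_VIEW_URL}?{urlencode(params)}")
    return urls


def caption(activity, timestamp, city, state):
    """Build the caption text for one frame (hypothetical layout)."""
    return f"{activity} | {timestamp} | {city}, {state}"


# Hypothetical coordinates and key, for illustration only.
urls = street_view_urls(40.7295, -73.9965, api_key="YOUR_API_KEY")
for url in urls:
    print(url)
```

Each URL differs only in its heading parameter, which is what produces the slow pan once the frames are stitched together.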

The result was two dozen weird undulating gifs of Google Street View locations, which you can check out on the project website.

After doing all that work, however, I didn’t feel satisfied with the output. If the goal was to find a way to re-insert my digital data trail into a physical space, I felt that the goal hadn’t yet been realized in this form. I decided to take the project into a different, more spatially-minded direction.

I wrote another Python script that programmatically takes the IP address and searches for the latitude/longitude on Google Maps, clicks the 3D setting, records a short video of the three-dimensional landscape, and then exports the frames of that video into images.

Programmatically screen recording Google Maps’ 3D landscape.

Using the photogrammetry software Photoscan, I created a 3D mesh and texture from the video frames. Then, I made a quick design of the Facebook app on an iPhone with the specific Facebook activity associated with that location & IP address. Finally, I pulled the landscape .obj into Unity with the iPhone image and produced some strange, fantastical 3D landscapes:

Pulling the 3D mesh into Unity and inserting the Facebook metadata into the landscape.

Thesis progress, Week 7.

The past few weeks have allowed me to think deeply about what I want to get out of my thesis project and what form this project will take. I wrote last week of the idea of the “manufactured self” – a self that has been constructed socially by external sources of power.

I stumbled on Alexandru Dragulescu’s thesis paper Data Portraits: Aesthetics and Algorithms, which outlines his creative practice for data portraiture. He describes “the concept of data portraits as a means for evoking our data bodies” and showcases his “data portraiture techniques that are re-purposed in the context of one’s social network.”

With my project, I will attempt to create a portrait of each participant based only on his or her Facebook data. I want to use facial recognition models (C++ and Python), 3D modeling (Three.js, Blender), the Facebook Graph API, and IBM Watson’s Natural Language Processing and Personality Insights APIs.


Visit the website: http://rebecca-ricks.com/manufactured-self/becca.html

After much experimentation, I have an overall idea of what the user flow will look like. There will be an online web application + a physical component. Here’s the flow:

(1) User logs into web application (with Facebook OAuth)

(2) Real-time analysis of personality + generate 3D facial model

(3) The 3D object is manipulated/distorted based on the personality insights (?)

(4) At the show, users will be able to take home a physical artifact of their data portrait (thermal print of the 3D model? An .obj? A list of personality insights?)

This week, I used C++ and Python to get this library up and running, which allows you to create a 3D model of a face from a 2D image. I spent a significant amount of time trying to install the library, generate the landmark points, and run the analysis on my own images. Here’s what that process looked like:

Generating the landmark points based on a photo of my face.
Generating the isomorphic texture that will be applied to the 3D model.
The 3D mesh model that the texture is applied to.
The final output displayed in the browser using three.js.

I also got access to a few of IBM Watson’s APIs via the Python SDK. Specifically, I’m looking at the Personality Insights API, which analyses a body of text (your Facebook likes, your Facebook posts, etc.). I ran the analysis on my own Facebook data and used the JSON file it generated to add the results to the website I built.
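As a rough sketch of how the generated JSON could be turned into something displayable, the snippet below ranks personality traits by percentile. It assumes the profile has a top-level "personality" list whose items carry "name" and "percentile" fields; that shape, the helper name, and every number in the sample are assumptions for illustration, not my real results.

```python
import json


def top_traits(profile, n=3):
    """Return the n highest-scoring traits as (name, percentile) pairs."""
    traits = profile.get("personality", [])
    ranked = sorted(traits, key=lambda t: t["percentile"], reverse=True)
    return [(t["name"], round(t["percentile"], 2)) for t in ranked[:n]]


# Made-up sample profile; the field names are an assumed response shape.
sample_profile = json.loads("""
{"personality": [
  {"name": "Openness", "percentile": 0.91},
  {"name": "Conscientiousness", "percentile": 0.34},
  {"name": "Extraversion", "percentile": 0.12},
  {"name": "Agreeableness", "percentile": 0.58},
  {"name": "Emotional range", "percentile": 0.77}
]}
""")

for name, score in top_traits(sample_profile):
    print(f"{name}: {score}")
```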

You can see an example of what that analysis looked like on my own Facebook data:


I also decided to test my 2D to 3D model on an earlier image I had created of my composite face based on every Facebook photo I’ve been tagged in.


Thesis progress, Week 6.

Last week I gave my midterm presentation and received some great feedback and suggestions. What resonated most with me was what Sam said about the monetization, commodification, and production of the self that occurs on Facebook. How can I incorporate that more fully into my thesis project?

I’m still iterating on a few different ideas, but eager to find the final form that my project will take, whether it’s one fully-developed web application or several different experimental applications.

I found some visual inspiration that has fueled the project I’m working on this week.

Source: https://labs.rs/en/

Share Lab has been investigating ‘The Facebook Algorithmic Factory’ with the intention “to map and visualize a complex and invisible process hidden behind a black box.” The result is an exploration of three main segments of the process: Data Collection (“Immaterial Labour and Data harvesting”), Storage and Algorithmic processing (“Human Data Banks and Algorithmic Labour”), and Targeting (“Quantified lives on discount”).

I was struck by not only the depth of research into Facebook’s policies and practices but also the beautiful (static) data visualizations produced as a way to clarify the research.

Source: https://labs.rs/en/

These data visualizations are simple but powerful. They left me thinking: How do I make this complex web personal? How do I communicate the ways in which this process immediately affects every Facebook user? Can I make use of the Facebook API to build a graphic that takes the user’s personal information (likes, friends, advertisements) and displays it in an interactive web-based application?

I want to make use of a lot of the good research done by Share Lab as well as my own research to build an interactive web application that helps users see how their personal data is collected, stored, and used in order to manufacture a self, or a “consumer profile.” I was struck by what Nancy said about Facebook manufacturing a self and I think this would be a good conceptual starting point.

Right now I’m starting to build the web application using the Facebook Graph API, Facebook CLI, and a D3 clustering algorithm. I’m starting by building a web application that collects information about user_likes clustered according to category.
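To make the clustering idea concrete, here’s a small sketch of grouping likes by category and reshaping them into the nested name/children structure that D3’s hierarchy layouts usually expect. It assumes each like arrives as a dict with "name" and "category" keys; the function names and the sample likes are hypothetical, and no Graph API call is made.

```python
from collections import defaultdict


def group_likes_by_category(likes):
    """Group page names under their category."""
    groups = defaultdict(list)
    for like in likes:
        groups[like.get("category", "Uncategorized")].append(like["name"])
    return dict(groups)


def to_d3_hierarchy(groups, root_name="user_likes"):
    """Reshape {category: [names]} into nested name/children nodes."""
    return {
        "name": root_name,
        "children": [
            {"name": category, "children": [{"name": n} for n in names]}
            for category, names in sorted(groups.items())
        ],
    }


# Hypothetical sample data standing in for a user's likes.
sample_likes = [
    {"name": "Ran", "category": "Movie"},
    {"name": "Suspiria", "category": "Movie"},
    {"name": "Data & Society", "category": "Nonprofit Organization"},
]

groups = group_likes_by_category(sample_likes)
tree = to_d3_hierarchy(groups)
print(tree)
```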

Film trailers as told through ‘visually similar’ images

This week, we reviewed two useful tools, ffmpeg and imagemagick, for manipulating images and videos found online. I decided to start working with the trailer to Akira Kurosawa’s 1985 film Ran (Japanese for “chaos”). Ran is a period tragedy that combines the Shakespearian tragedy King Lear with legends of the daimyō Mōri Motonari.

The trailer is filled with beautiful, carefully framed shots. I wanted to see if there was a way to automatically detect and chop up the trailer into its individual shots/scenes. It turns out there is no simple solution to that problem, so I cobbled together my own bash script to do so.

Once I had chopped up the trailer, I decided to export one image from each scene for analysis. I did so by writing a script that saves the first frame from each video.
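The two steps above can be sketched in Python rather than bash. This is not my original script: it assumes you already have a list of cut timestamps (one possible source is ffmpeg’s scene-change score in its select filter), and the file names and helper names are made up. The functions only build the ffmpeg argument lists; pass them to subprocess.run to actually execute them.

```python
def shot_ranges(cut_times, duration):
    """Turn cut timestamps (in seconds) into (start, end) ranges, one per shot."""
    cuts = sorted(t for t in cut_times if 0.0 < t < duration)
    bounds = [0.0] + cuts + [duration]
    return list(zip(bounds[:-1], bounds[1:]))


def split_command(src, start, end, out_path):
    """ffmpeg arguments that re-encode one shot into its own file."""
    return ["ffmpeg", "-i", src, "-ss", str(start), "-to", str(end), out_path]


def first_frame_command(clip_path, image_path):
    """ffmpeg arguments that save only the first frame of a clip."""
    return ["ffmpeg", "-i", clip_path, "-frames:v", "1", image_path]


# Hypothetical cut points for a 10-second clip.
ranges = shot_ranges([2.5, 7.0], duration=10.0)
for i, (start, end) in enumerate(ranges):
    clip = f"shot_{i:03d}.mp4"
    print(" ".join(split_command("trailer.mp4", start, end, clip)))
    print(" ".join(first_frame_command(clip, f"shot_{i:03d}.jpg")))
```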

I then used selenium to programmatically upload those images into a reverse image search that was powered by an image classifier that had been trained on Wikipedia data. The image classifier had been trained by Douwe Osinga and the site can be accessed here. It’s described this way: “A set of images representing the categories in Wikidata is indexed using Google’s Inception model. For each image we extract the top layer’s vector. The uploaded image is treated the same – displayed are the images that are closest.” You can read more detailed notes about training the data in Douwe’s blog post.
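The “closest images” idea in that description can be illustrated with a toy nearest-neighbor search. This sketch assumes cosine similarity as the distance measure, which the site’s description does not specify, and it uses tiny made-up three-number vectors in place of real Inception features. The file names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def closest_images(query_vec, index, k=3):
    """Return the names of the k indexed images most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


# Toy index: image name -> feature vector (real vectors would be much longer).
index = {
    "castle.jpg": [0.9, 0.1, 0.0],
    "horse.jpg": [0.1, 0.8, 0.3],
    "banner.jpg": [0.7, 0.2, 0.1],
}

matches = closest_images([1.0, 0.0, 0.0], index, k=2)
print(matches)
```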

I ended up with hundreds of ‘visually similar’ images, organized according to the shots in the trailer. I combined them into a side-by-side comparison video, where you can see some of the images that were deemed ‘visually similar’ by the classifier. Check out the full video for Kurosawa’s Ran:

I then decided to repeat the entire process for the trailer to Dario Argento’s classic horror film Suspiria:

Thesis: Progress, Week 4

This week, I decided to experiment with a few different ideas and technologies in order to further develop my thesis project. Here are some of the projects/experiments I worked on:

Experiment #1: Chrome Extension (for Facebook).

A Chrome extension that swaps all the pictures in your Facebook feed with the logos of the advertisers that currently have your contact information.

See the code repo here.

I started by downloading my entire Facebook archive (do it yourself). I found a list of all the advertisers who have my contact information from Facebook – more than 200 entities in total. This information shocked me, especially because a number of them were data collection companies and politicians running for office.

I took that list of advertisers and decided to scrape Google Images to download all their logos.

Then I shifted gears and built a Google Chrome extension that swaps all the images on Facebook for any images of your choosing. I wrote code that picks a random image from the folder of advertisers every time the page reloads.

Personally, I found the advertisements for various Senators and politicians to be the most intrusive and unwanted.

Experiment #2: Facemash (for Facebook and LinkedIn).

A Python script that scrapes tagged images from Facebook and LinkedIn, and then identifies & overlays the faces using OpenCV.

See the code repo here.

I wrote several different Python scripts. One scrapes all the Facebook images you (or your friend) are tagged in. Once you have those images, you can run another script to identify the face and then overlay the faces on top of each other.

I used a few different Python packages and models, including OpenCV for facial recognition/warping and dlib for overlapping the images.  Read detailed instructions here (many thanks to Leon for his helpful workshop).
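The overlay step boils down to averaging pixels across faces. The sketch below shows only that idea, in plain Python on tiny grayscale grids. It assumes the faces have already been detected, aligned, and resized to the same dimensions; the real scripts work on OpenCV image arrays, and the function name here is made up.

```python
def average_images(images):
    """Pixel-wise mean of equally sized grayscale images (lists of rows of ints)."""
    if not images:
        raise ValueError("need at least one image")
    height, width = len(images[0]), len(images[0][0])
    averaged = []
    for y in range(height):
        row = []
        for x in range(width):
            total = sum(img[y][x] for img in images)
            row.append(round(total / len(images)))
        averaged.append(row)
    return averaged


# Two hypothetical 2x2 "faces" with pixel values from 0 to 255.
face_a = [[0, 100], [200, 40]]
face_b = [[100, 200], [100, 60]]

composite = average_images([face_a, face_b])
print(composite)
```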

Here are some examples for me and my sisters:

I wrote another Python script that scrapes all the profile pictures from your LinkedIn connections. I was able to scrape the first 40 connections and then ran those images through the facemash script. This is what my average LinkedIn connection looks like:

Experiment #3: Aristotle Search (for Twitter).

I’ve been thinking a lot lately about how we search for and filter information online, and the ways in which Twitter and Google, for instance, make decisions for you about what’s most relevant. What if you wanted the ability to filter Twitter results according to a different set of criteria?

Inspired by Ted Hunt’s Socratic Search, I built a sister search engine called Aristotle Search that filters Twitter results according to Aristotle’s criteria for persuasive argument: logos (appeal to logic), pathos (appeal to emotion), and ethos (appeal to ethics).

The search engine is meant to be an exercise in speculative design that allows us to think about how a redesign of social platforms would change how we approach and engage with them. What if you approached Facebook with the intention to strengthen your relationship with family or reconnect with high school friends? What if you approached Google with a desire to challenge your own assumptions or seek clarity? (see: Socratic Search)

Rendering the familiar unfamiliar.

Clement Valla, The Universal Texture, 2012

I spoke with Taina Bucher and Surya Mattu today, who gave me excellent advice and direction on my thesis project.

Conversation #1: Be clear about your audience. 

In our conversation, Taina drew a distinction between the work that she has done and the work done by other researchers seeking to gauge digital literacy and algorithmic awareness (“do people know that they’re not seeing everything?”). They want to make the algorithm visible to its users. Taina, on the other hand, is more interested in user beliefs or expectations of how the algorithm system should perform. Her research aims to understand not only how people believe the Facebook algorithm functions, but also their normative conceptions about what algorithms should do.

Of course, it’s difficult to measure feelings or beliefs, so I’ve found myself asking: How can I observe user beliefs about how a platform should or does perform? How can I create an interaction that creates or observes those experiences? Many researchers take a qualitative approach – talk to study participants, interview them – but what will be my creative rupture?

Sterling Crispin, Data Masks, 2014

Conversation #2: Make it real, not hypothetical.

The conversation with Surya was really productive – he immediately understood what I was trying to achieve with this project and gave me 3-4 references of projects that had tried to achieve a similar effect, including the Chrome Extension alter, which allows you to scroll through a screenshot of someone else’s News Feed. We talked about his work on the Black Box Chrome Extension, which he said was his attempt to try to poke at the few aspects of the Facebook algorithm that are public but not immediately visible.

One piece of advice Surya gave me was to focus on the real data rather than the hypothetical. Sometimes the simplest intervention yields the most effective response. For instance, I told him about the research Taina Bucher has done to collect anecdotes from people who had had strange run-ins with the Facebook algorithm. He felt that she took a very powerful approach by focusing on the stories and images of real people, rather than theorizing. He recommended that I start experimenting and see what resonates most with users.

Alter, Sena Partel, 2016


After talking to Taina and Surya today, I feel like I’m ready to move forward and experiment with Facebook, including downloading my own Facebook archive, building a Chrome Extension, building a 3D model of my face from tagged pictures, scraping Reddit discussions about how the Facebook algorithm works, and using personal data in an unexpected way.

I often think about Jenny Odell’s observation about her own work, that rather than take a new, unfamiliar technology and make something boring with it, she takes a familiar technology and renders it unfamiliar. I want to do the same with my thesis project.

Thesis: Progress, Week 3

This past week I experienced some setbacks. I completed my thesis statement, and the feedback I got was that the scope might be too ambitious/broad for a ten-week project. I decided that, moving forward, I will start building the tool and incorporate small-scale user research into the design process.

I reached out to expert developers and academics to talk about my thesis idea. Here is who I’ve made contact with:

  • danah boyd – head of Data & Society, principal researcher at Microsoft Research, adjunct professor at NYU
  • Tarleton L. Gillespie – principal researcher at Microsoft Research, adjunct professor at Cornell University
  • Surya Mattu – fellow at Data & Society, developer of Black Box tool
  • Hang Do Thi Duc – designer, developer of data selfie
  • Taina Bucher – professor at University of Copenhagen

So far I’ve spoken to Hang, who started developing her project data selfie while getting her MFA at Parsons. She had a lot of good advice, namely to think big but stay realistic about what needs to be completed for the MVP. She also raised some ethical considerations, including thinking about how personal data will be saved (her project saves everything locally and only uses a server to make API calls). She likes the idea of building a Chrome Extension that essentially gives you recommendations and reminders of how the platform is collecting or tracking your personal data.

I have plans to Skype with Surya and Taina on Wednesday. Hopefully they can help me narrow my focus.

While I feel less pressure now that I’ve narrowed the scope, I’m not sure that the kind of tool I’m envisioning will answer my thesis question: How does the way algorithms see us change the way we see ourselves?

Today I began digging into the code of existing Chrome Extensions, such as Disconnect and show-facebook-computer-vision-tags, tools that boost awareness about how the Facebook algorithm is operating. This week, I’m planning to make a simple extension that manipulates the Facebook experience.


Thesis exercise: Draw the Facebook news feed.

This week, we were asked to create a Joseph Cornell-style shadow box as an exercise in visualizing our thesis project. I decided to use the assignment as an opportunity to test out an idea I had about user experience design, memory, and Facebook.

I wanted to use the analogy of a city to understand the process by which users make sense of opaque processes like algorithms. Cities, like algorithms, are massive and hard to wrap our heads around.

According to one article I read, “City planner Kevin Lynch developed design principles for urban design by asking city dwellers to sketch maps of their environments from memory. In so doing, he learned what features of a city are more or less memorable in support of a ‘cognitive image.’ Based on an assumption that easily ‘imaged’ cities make for better cities, he then moved to develop design recommendations for urban planners.”

I decided to apply this exercise of drawing from memory to the Facebook algorithm. I asked 6 friends to draw their Facebook News Feed from memory (without looking at their Facebook page). Here were the results:

Some observations:

First, I noticed that 4 people drew the browser version of Facebook and 2 people drew the Facebook mobile app. I hadn’t specified a version in my directions, and it was interesting to see which one they jumped to first.

Second, I noticed that there were certain UI features people tended to remember more often. Here’s what was most visible/memorable:

  • Upper right bar (notifications/menu/home) (6/6)
  • Friend updates (6/6)
  • FB logo (5/6)
  • Advertisements (4/6)
  • Events (4/6)
  • Trending news (4/6)
  • FB chat (3/6)
  • Comments/likes (3/6)
  • Status prompt, “What’s on your mind?” (3/6)
  • Search bar (2/6)
  • Friend live updates (2/6)
  • Sponsored posts (1/6)
  • Birthdays (1/6)
  • Left sidebar options (1/6)

Here’s what wasn’t visible/memorable:

  • Lower right hand Search (0/6)
  • Upper right hand question mark/help (0/6)
  • Create a Post & Photo/Video Album (0/6)
  • Photo/Video & Feeling/Activity prompts (0/6)
  • Public vs Private sharing option (0/6)
  • Left sidebar Shortcuts & Explore & Create options (0/6)
  • Stats & info about pages for which they’re admin (0/6)

It makes sense that what we remember most is, first, the overall architecture of the site and, second, the things we engage with first. Most of my participants remembered the notifications, their friends’ updates, news, events, and the right-hand advertisements.

Thesis: Literature review

The following is a first draft of a literature review for my thesis project, which will look at how algorithms online shape user behavior and how user beliefs about the platform recursively shape the algorithm.

Algorithms as biopower

Foucault reminds us that power is not static, nor does it emanate from a center of origin; rather, power exists in an enmeshed network. In other words, power is not applied to individuals—it passes through them.

The digital era of online advertising has ushered in a new type of data collection aimed at maximizing profits by serving up advertisements based on modular, elastic categories. In the past, consumers were categorized based on demographic and geographic data available in the census. As marketers moved online over the past two decades, however, they were able to use data from search queries to build user profiles on top of these basic categories. The subsequent construction of “databases of intentions” helps marketers understand general trends in social wants and needs and consequently influence purchase decisions (Cheney-Lippold, 2011).

Through use-patterns online, an individual may be categorized based on her gender, her race, her age, her consumption patterns, her location, her peers, and any number of relevant groupings. Online users are categorized through “a process of continual interaction with, and modification of, the categories through which biopolitics works” (Cheney-Lippold, 2011). Medical services and health-related advertisements might be served to that individual based on that categorization process, meaning that those who are categorized as Hispanic, for instance, might not experience the same advertisements and opportunities as those categorized as Caucasian.

In order to govern populations according to Foucault’s prescription for social control, biopower requires dynamic, modular categories that have the ability to adapt to the dynamic nature of human populations. In this system, the personal identity of the individuals matters less than the categorical profile of the collective body. Cheney-Lippold argues that soft biopower works by “allowing for a modularity of meaning that is always productive—in that it constantly creates new information—and always following and surveilling its subjects to ensure its user data are effective” (2011).

Foucault argues that surveillance exerts a homogenizing, “normalizing” force on individuals who are being monitored. When algorithms are employed in systems of selective surveillance, the personal identity of an individual matters less than the categorical profile of the group as a whole. It is this “normalizing” effect that I am most interested in understanding on the individual level.

Algorithms as interface

In recent years, researchers in the social sciences have worked to understand how Facebook users engage with the News Feed algorithm, which dictates what content they see in their Home Feed. Many researchers have studied the degree to which people become aware of such algorithms, how people make sense of and construct beliefs about these algorithms, and how an awareness of algorithms affects people’s use of social platforms.

Much research has been done on the question of ‘algorithm awareness’ – the extent to which people are aware that “our daily digital life is full of algorithmically selected content.” Eslami et al. (2014) raise several questions, including: How aware do users need to be of the algorithms at work in their daily internet use? How visible should computational processes be to users of a final product?

To answer the first question, several studies have attempted to gauge how aware Facebook users are of the algorithm. In one study of Facebook users, Eslami et al. (2015) found that the majority were not aware their News Feed had been filtered and curated. The authors created a tool FeedViz that allowed users to see visually how their News Feed was being sorted. Many of the study participants disclosed that they had previously made inferences about their personal relationships based on the algorithm output and were shocked to learn that such output was not a reflection of such relationships. The authors suggest that designers think about ways they can give users more autonomy and control over their News Feed without revealing the proprietary data from the algorithm itself.

A different study by Rader and Gray (2015) concluded that the majority of Facebook users were, in fact, aware that they were not seeing every post from their friends. The authors were interested in understanding how user beliefs about the Facebook news feed – accurate or not – shape the way they interact with the platform. “Even if an algorithm’s behavior is an invisible part of the system’s infrastructure,” they write, “users can still form beliefs about how the system works based on their interactions with it, and these beliefs guide their behavior.” Furthermore, such user beliefs about how the system works “are an important component of a feedback loop that can cause systems to behave in unexpected or undesirable ways.” They argue that we need more use cases where user and algorithm goals are in conflict as part of the design process. They also suggest that designers rethink their approach to making the mechanisms of the algorithm seamless or invisible—for instance, leaving clues within the interface that indicate how the system is working.

Martin Berg’s research attempts to track the ways in which personalized social feeds are shaped by the experienced relationship between the self and others (2014). He conducted a study in which participants wrote daily self-reflexive diaries about their own Facebook use. The study found that participants expressed a certain insecurity or strangeness in seeing their social boundaries collapse on Facebook. Berg argues that the algorithm acts as an architecture, a social space, and a social intermediary. Facebook posts function as a social meeting point for friends. Furthermore, the “harvesting personal and interactional data on Facebook” forms the basis of a “virtual data-double” in which the self is “broken into distinct data flows.” His research supports the idea that the user is both shaped by and shapes the Facebook algorithm.

Building on the concept of algorithmic awareness, social scientist Taina Bucher seeks to map out the emotions and moods of the spaces in which people and algorithms meet. She develops the notion of “the algorithmic imaginary,” ways of thinking about what algorithms are, what they should be, and how they function (2017). Since such ways of thinking ultimately mold the algorithm itself, she argues that it is crucial that we understand how algorithms make people feel if we want to understand their social power. In a recent study, she examines personal stories about the Facebook algorithm through tweets and interviews with regular users of the platform. In her own words, she looks at “people’s personal algorithm stories – stories about situations and disparate scenes that draw algorithms and people together” (2017). By taking an episodic, qualitative approach, Bucher constructs a picture of the disparate emotions generated by interactions with algorithms.


Agamben, G. (1998) Homo Sacer: Sovereign Power and Bare Life. Stanford: Stanford University Press.

Agamben, G. (2005) State of Exception. Chicago: The University of Chicago Press.

Berg, M. (2014) ‘Participatory trouble: Towards an understanding of algorithmic structures on Facebook’, Cyberpsychology: Journal of Psychosocial Research on Cyberspace, 8(3), article 2.

Bucher, T. (2017), ‘The algorithmic imaginary: exploring the ordinary affects of Facebook algorithms’, Information, Communication & Society, 20:1, 30-44.

Bucher, T. (2012), ‘Want to be on the top? Algorithmic power and the threat of invisibility on Facebook’, new media & society 14(7): 1164-1180.

Cheney-Lippold, J. (2011) ‘A New Algorithmic Identity: Soft Biopolitics and the Modulation of Control’, Theory Culture & Society (28-164).

Eslami, M., Rickman, A., Vaccaro, K., Aleyasen, A., Vuong, A., Karahalios, K., Hamilton, K., Sandvig, C. (2015) ‘“I always assumed that I wasn’t really that close to [her]”: Reasoning about invisible algorithms in the news feed’, CHI 2015, ACM Press.

Eslami, M., Hamilton, K., Sandvig, C., Karahalios, K. (2014) ‘A Path to Understanding the Effects of Algorithmic Awareness’, CHI 2014, ACM Press.

Foucault, M. (1977) Discipline and Punish: The Birth of the Prison. London: Penguin.

Foucault, M. (1990) The History of Sexuality: The Will to Knowledge. London: Penguin.

Foucault, M. (2003) Society Must Be Defended: Lectures at the Collège de France, 1975-1976. New York: Picador.

Hier, S. (2003) ‘Probing the Surveillant Assemblage: On the Dialectics of Surveillance Practices as Processes of Social Control’, Surveillance & Society 1(3): 399-411.

Monahan, T. (2010) Surveillance in the Time of Insecurity. New Jersey: Rutgers University Press.

Rader, E. & Gray, R. (2015) ‘Understanding User Beliefs About Algorithmic Curation in the Facebook News Feed’, CHI 2015, Crossings: 173-182.

Rader, E. (2016) ‘Examining User Surprise as a Symptom of Algorithmic Filtering’, Journal of Human Computer Studies.

Schmitt, C. (1922) Political Theology: Four Chapters on the Concept of Sovereignty. Chicago: University of Chicago Press.

YouTube Tutorials: How YouTubers tried to answer my Google searches.

I use Google to search for answers to questions I don’t want to ask a human being. While most of my searches are done out of necessity (“how to use git no deep shit”) or urgency (“ruby’s nyc address”), I also turn to Google to answer questions I’m too embarrassed to ask my friends. Our Google searches therefore reveal a side of us that we may not want shared with the public.

I decided to make a website exploring how YouTubers attempted to answer some of the questions I asked Google in 2014. See the site here.

I started by downloading my entire Google search history, spanning the years 2013-2017. The zip file contains some ugly JSONs, so using Python I generated lists of searches organized by year. Then I programmatically cleaned up the lists to weed out Google Map & Flights searches. This was the result for 2013, for instance:

Next, I filtered the Google searches down to all the instances that included the words “how to.” I wanted to get a snapshot of what I was trying to learn from the internet in that particular year. Some examples from 2014:
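The year-by-year grouping and the “how to” filter can be sketched like this. The record shape is an assumption: each search is a dict with a "query_text" string and a "timestamp_usec" string (microseconds since the epoch), which is a simplified stand-in for the nested JSON in the archive. The function names are made up, the sample records are illustrative, and the Maps/Flights cleanup is not shown.

```python
from collections import defaultdict
from datetime import datetime, timezone


def searches_by_year(records):
    """Group query strings by the calendar year (UTC) they were searched in."""
    by_year = defaultdict(list)
    for record in records:
        seconds = int(record["timestamp_usec"]) / 1_000_000
        year = datetime.fromtimestamp(seconds, tz=timezone.utc).year
        by_year[year].append(record["query_text"])
    return dict(by_year)


def how_to_searches(queries):
    """Keep only the queries that contain the phrase 'how to'."""
    return [q for q in queries if "how to" in q.lower()]


# Illustrative records in the assumed shape.
sample_records = [
    {"query_text": "ruby's nyc address", "timestamp_usec": "1362096000000000"},
    {"query_text": "how to bounce back", "timestamp_usec": "1401580800000000"},
    {"query_text": "how to get over a breakup", "timestamp_usec": "1401580800000000"},
]

by_year = searches_by_year(sample_records)
print(how_to_searches(by_year[2014]))
```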

Then I wrote a Python script that takes that array of google searches and programmatically searches for them on YouTube, downloading whatever video tutorial is the first result. I used selenium + webdriver + PhantomJS to browse and scrape the videos for me. You can see my full code here.
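As a small illustration of the first part of that step, here’s how each search could be turned into a YouTube results URL for the browser to open. This is a sketch with a made-up helper name. It only builds the URL; driving the browser with selenium and downloading the first result are not shown.

```python
from urllib.parse import urlencode


def youtube_search_url(query):
    """Build the YouTube results-page URL for a search query."""
    return "https://www.youtube.com/results?" + urlencode({"search_query": query})


for search in ["how to bounce back", "how to get over a breakup"]:
    print(youtube_search_url(search))
```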

When I started this project, I knew I wanted to explore the culture of YouTube tutorials using my own searches as a starting point. I wanted to know how different online communities were attempting to answer and work through my questions.

What I found interesting was the ways in which my questions were interpreted. A simple question, “how to bounce back,” resulted in a trampoline how-to video. A question about “how to get over a breakup” resulted in a makeup tutorial for post-breakups (side note: I had no idea that there is a huge subculture of makeup tutorials on YouTube, complete with its own norms and signifiers). If I had searched on Reddit or WebMD, for instance, the results would similarly have been a product of the language of the online community.