Crystal gazing: A Twitter bot that uses computer vision to describe images.

This semester I’ve been interrogating the concept of “algorithmic gaze” vis-a-vis available computer vision and machine learning tools. Specifically, I’m interested in how such algorithms describe and categorize images of people.

For this week’s assignment, we were to build a Twitter bot using JavaScript (and other tools we found useful – Node, RiTA, Clarifai, etc) that generates text on a regular schedule. I’ve already build a couple Twitter bots in the past using Python, including BYU Honor Code, UT Cities, and Song of Trump, but I had never built one in Javascript. For this project, I immediately knew I wanted to experiment with building a Twitter bot that uses image recognition to describe what it sees in an image.

To do so, I used the Clarifai API’s robust machine learning library to access an already-trained neural net that generates a list of keywords based on an image input. Any Twitter user can tweet an image at my Twitter bot and receive a reply that includes a description of what’s in the photo, along with a magick prediction for their future (hence the name, crystal gazing).


After pulling an array of keywords from the image using Clarifai, I then used Tracery to construct a grammar that included a waiting message, a collection of insights into the image using those keywords, and a pithy life prediction.


I actually haven’t deployed the bot to a server just yet because I’m still ironing out some issues in the code – namely, asynchronous callbacks that are screwing with the order of how functions need to be fired – but you can still see how the bot works by checking it out on Twitter or Github. It’s still a work in progress, however.

You can see the Twitter bot here and find the full code here in my github repo. I also built a version of the bot for the browser, which you can play around with here.