This week, we reviewed useful tools ffmpeg and imagemagick to manipulate images and videos found online. I decided to start working with the trailer to Akira Kurosawa’s 1985 film Ran (Japanese for “chaos”). Ran is a period tragedy that combines the Shakespearian tragedy King Lear with legends of the daimyō Mōri Motonari.
The trailer is filled with beautiful, carefully framed shots. I wanted to see if there was a way to automatically detect and chop up the trailer into its individual shots/scenes. It turns out there is no simple solution to that problem so I hobbled together my own bash script to do so.
Once I had chopped up the trailer, I decided to export one image from each scene for analysis. I did so by writing a script that saves the first frame from each video.
I then used selenium to programmatically upload those images into a reverse image search that was powered by an image classifier that had been trained on Wikipedia data. The image classifier had been trained by Douweo Singa and the site can be accessed here. It’s described this way: “A set of images representing the categories in Wikidata is indexed using Google’s Inception model. For each image we extract the top layer’s vector. The uploaded image is treated the same – displayed are the images that are closest.” You can read more detailed notes about training the data in Douweo’s blog post.
I ended up with hundreds of ‘visually similar’ images, organized according to the shots in the trailer. I combined them into a side-by-side comparison video, where you can see some of the images that were deemed ‘visually similar’ by the training set. Check out the full video for Kurosawa’s Ran:
I then decided to repeat the entire process for the trailer to Dario Argento’s classic horror film Suspiria: