Wednesday, September 09, 2009

Unix for fun and profit

So today, someone asked me for a favor. A simple favor.

Or so I thought.

Some background:

See, about a year ago, we went to the Raleigh Drum Circle birthday bash, which is basically a get-together for the drum circle that meets in Raleigh. It happens about this time every year. We brought our Nikon D40 and took a bunch of pictures, then posted them on Flickr. Some of them turned out pretty well, I think.

So today, someone from the drum circle asked if he could use some of them for the drum circle newsletter, flyers, and things of that nature. He wanted higher-resolution versions for those purposes. Well, color me flattered! I was happy to comply. However, I had a problem. Between the camera and Flickr, I'd lost the original filenames. Flickr had the descriptions I put in, but not the original filenames. I originally took about 350 pictures, and only posted 40 or so. How would I find the originals for the 6 or so the drum circle guy wanted without inspecting each of the 350 originals?

I didn't have the original filenames, but I did have the EXIF data. So I conceived a fiendish plan! The EXIF data includes, among many other juicy details, the date and time the picture was taken. Flickr shows that data, so I could cross-reference it against the data in the original files to find the file I wanted. So, I wrote a quick Python script:


#!/usr/bin/python

import os
from PIL import Image

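# Print each file in the current directory with its capture time.
for filename in os.listdir(os.curdir):
    img = Image.open(filename)
    # 36867 is the EXIF tag ID for the date and time the picture was taken.
    print filename, img._getexif()[36867]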


The PIL interface to EXIF data is unfortunately pretty raw at this point. _getexif() returns a dictionary keyed by numeric tag ID; to interpret it, look in the ExifTags module in PIL. That's where I found the magic number 36867, which is the ID of the DateTimeOriginal tag.
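For anyone who would rather not hard-code the magic number, here is a minimal sketch of looking it up by name. It assumes a PIL or Pillow install where ExifTags.TAGS is a dictionary mapping numeric tag IDs to names; TAG_IDS and DATETIME_ORIGINAL are names made up for this example.

```python
from PIL import ExifTags

# ExifTags.TAGS maps numeric EXIF tag IDs to human-readable names.
print(ExifTags.TAGS[36867])  # prints: DateTimeOriginal

# Invert the mapping so a tag can be looked up by name instead of by number.
# TAG_IDS and DATETIME_ORIGINAL are illustrative names, not part of PIL.
TAG_IDS = {name: tag_id for tag_id, name in ExifTags.TAGS.items()}
DATETIME_ORIGINAL = TAG_IDS["DateTimeOriginal"]
print(DATETIME_ORIGINAL)  # prints: 36867
```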

The script doesn't have any error checking or other features, but it took less than 5 minutes to write, and it gave me the results I needed.

# python ~/python/imagedates.py > /tmp/imgdates.txt
# grep "20:01:24" /tmp/imgdates.txt
dsc_3626.jpg 2008:09:15 20:01:24

Voilà! The file I wanted. A few more greps, and I had everything I needed.
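The grep step could also be done in Python by indexing the listing on its timestamp. This is only a sketch: it assumes every line looks like the output above (filename, date, time, separated by single spaces), and index_by_timestamp is a name invented for the example.

```python
def index_by_timestamp(lines):
    """Map 'YYYY:MM:DD HH:MM:SS' to the list of filenames taken at that moment."""
    index = {}
    for line in lines:
        # Split from the right so a filename containing spaces stays intact.
        parts = line.rstrip("\n").rsplit(" ", 2)
        if len(parts) != 3:
            continue  # skip lines that do not match the expected layout
        filename, date, time = parts
        index.setdefault(date + " " + time, []).append(filename)
    return index


# The one line shown in the grep output above.
listing = ["dsc_3626.jpg 2008:09:15 20:01:24"]
index = index_by_timestamp(listing)
print(index.get("2008:09:15 20:01:24", []))  # prints: ['dsc_3626.jpg']
```

In practice the lines would come from /tmp/imgdates.txt, and the timestamps to look up would be the "taken on" values Flickr shows for each photo.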

I could probably turn this into a more generic tool, with error checking, command-line arguments, an API to Flickr, and whatnot, but quick and dirty got the job done.
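For the curious, a first step toward that more generic tool might look something like the sketch below. It is not the script I actually ran: it is written for Python 3, it assumes a PIL or Pillow version that still provides the private _getexif() method on JPEG images, and capture_time, main, and the optional directory argument are all made up for illustration. The Flickr API part is left out entirely.

```python
#!/usr/bin/env python3
"""Print each image's EXIF capture time. A sketch, not the original script."""
import argparse
import os

from PIL import Image

DATETIME_ORIGINAL = 36867  # EXIF tag ID, listed as 'DateTimeOriginal' in PIL.ExifTags


def capture_time(path):
    """Return the capture timestamp string for path, or None if unavailable."""
    try:
        with Image.open(path) as img:
            # _getexif() is the same private, JPEG-only call the original script uses;
            # other image formats may not have it, hence the getattr.
            getexif = getattr(img, "_getexif", None)
            exif = getexif() if getexif else None
    except OSError:
        # Not an image, unreadable, or a directory.
        return None
    return exif.get(DATETIME_ORIGINAL) if exif else None


def main(argv=None):
    parser = argparse.ArgumentParser(description="List EXIF capture times.")
    parser.add_argument("directory", nargs="?", default=os.curdir,
                        help="directory to scan (default: current directory)")
    args = parser.parse_args(argv)
    for filename in sorted(os.listdir(args.directory)):
        taken = capture_time(os.path.join(args.directory, filename))
        if taken is not None:
            print(filename, taken)


if __name__ == "__main__":
    main()
```

Files that are not images, or that carry no capture time, are silently skipped instead of crashing the run, which is about the minimum error checking the quick version was missing.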