“find … | wc” tells me I've got 21365 photos on my computer. I am guessing that I must have taken at least 50,000, though, since I delete at least half of the photos I take. I've gone through several different tools for managing all of those photos and have been meaning to document what I do and share the scripts that I've written for this, in case someone else finds my approach useful. Well, I finally motivated myself to do that, in part because I've been in love with git and Gitorious ever since migrating Sputnik to them. With Gitorious the task of releasing code somehow doesn’t seem as daunting. So, I started a project on Gitorious to publish my script collection: Yuri’s Photo Lab.
The rest of the post draws on the README.txt that I included in the repository.
The repository contains a collection of scripts that I use to manage digital photos. I use those scripts in combination with F-Spot and my server-side application written in Django (running on this site), which I am not releasing since cleaning it up and documenting it would be too much work and I am about to abandon it anyway. (I am hoping to move to Sputnik in the next few months.) I use all of those tools on Linux (currently running Ubuntu Edgy). All of this might work on OS X, and parts of it might work on Windows.
After taking new photos, I go through the following steps:
Step 0: Get the photos off the camera into some directory.
I do this either by putting the card into a reader and mounting it, or by connecting my camera and using Digikam. I never let F-Spot import the photos from the camera, however, since I want to rename and triage them before importing them.
Step 1: Triage.
I use GQView to delete the unwanted photos. Of all the applications I've tried, GQView does the best job of allowing me to quickly go through hundreds of photos, deleting most of them.
Step 2: Rename
Once the photos are triaged, I organize them into directories by date.
I name each directory with the date and a short description. For example: “2007.09.08-varanasi”, “2007.09.09-varanasi-2”, “2007.09.10-delhi”, “2007.09.11-agra”, “2007.09.12-delhi-again”.
Where do I put those directories? Let’s start from the top, actually. I keep all of my photo-related stuff under “~/photos/”. The original photos go under “~/photos/main”. Under “main”, I have directories for each “quarter”:
yuri@chai:~/photos/main$ ls
2000.07-Summer2000 2003.06-Summer2003 2006.06-Summer-2006
2000.09-Fall2000 2003.09-Fall-2003 2006.09-Fall-2006
2000.12-Winter2000-2001 2003.12-Winter-2003-2004 2006.12-Winter-2006-2007
2001.03-Spring2001 2004.03-Spring-2004 2007.03-Spring-2007
2001.06-Summer2001 2004.06-Summer-2004 2007.06-Summer-2007
2001.09-Fall2001 2004.09-Fall-2004 2007.09-Fall-2007
2001.12-Winter2001-2002 2004.12-Winter-2004-2005 2007.12-Winter-2007-2008
2002.03-Spring2002 2005.03-Spring-2005 2008.03-Spring-2008
2002.06-Summer-2002 2005.06-Summer-2005 rn.py
2002.09-Fall2002 2005.09-Fall-2005 todo.txt
2002.12-Winter2002-2003 2005.12-Winter-2005-2006
2003.03-Spring2003 2006.03-Spring-2006
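The quarter a given day falls into can be worked out mechanically. Here is a minimal sketch, assuming the naming used for the more recent directories in the listing (“YYYY.MM-Season-YYYY”, with winter spanning two years). The function name “quarter_dir” is made up for this post and is not part of the repository.

```python
from datetime import date

# Season label keyed by the month in which each quarter starts.
SEASONS = {3: "Spring", 6: "Summer", 9: "Fall", 12: "Winter"}


def quarter_dir(day: date) -> str:
    """Return a quarter directory name such as '2007.09-Fall-2007'.

    Hypothetical helper: the naming pattern is inferred from the
    directory listing, not taken from any script in the repository.
    """
    if day.month in (1, 2):
        # January and February belong to the winter that began last December.
        start_year, start_month = day.year - 1, 12
    else:
        # Round the month down to the nearest quarter start (3, 6, 9 or 12).
        start_year, start_month = day.year, (day.month // 3) * 3
    season = SEASONS[start_month]
    if season == "Winter":
        label = f"Winter-{start_year}-{start_year + 1}"
    else:
        label = f"{season}-{start_year}"
    return f"{start_year}.{start_month:02d}-{label}"


print(quarter_dir(date(2007, 9, 11)))  # 2007.09-Fall-2007
print(quarter_dir(date(2008, 1, 15)))  # 2007.12-Winter-2007-2008
```

The older directories (2000 through 2002) use slightly different spellings, so this only reproduces the later convention.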
So, I put my “2007.09.08-varanasi”, etc. into “2007.09-Fall-2007”. Then from this directory I run a script “rn.py” (which sits in “main”) to rename the photos from whatever the cameras named them into a uniform pattern:
yuri@chai:~/photos/main$ cd 2007.09-Fall-2007
yuri@chai:~/photos/main/2007.09-Fall-2007$ python ../rn.py 2007.09.11-agra
The “rn.py” script is included in the repository.
This gives all photos in “2007.09.11-agra” names like: “20070911_392_8069.JPG”. This includes: a date (“20070911”), a number within that date (392nd photo for 2007-09-11) and a random number. At this point I have over 20,000 photos all named this way. Having all the photos named the same way has helped on many occasions.
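To make the naming pattern concrete, here is a rough sketch of building and parsing such a name. This is not the code from “rn.py”. The function names, the zero-padding widths and the range of the random number are my assumptions for illustration, based only on the one example name above.

```python
import random
import re
from datetime import date, datetime
from typing import Optional, Tuple

# Matches names like 20070911_392_8069.JPG (date, sequence, random number).
NAME_RE = re.compile(r"^(\d{8})_(\d+)_(\d+)\.JPG$")


def photo_name(day: date, seq: int, rand: Optional[int] = None) -> str:
    """Build a uniform photo name (hypothetical helper, not rn.py itself)."""
    if rand is None:
        # Assumption: a four-digit random suffix, as in the example name.
        rand = random.randint(0, 9999)
    return f"{day:%Y%m%d}_{seq:03d}_{rand:04d}.JPG"


def parse_photo_name(name: str) -> Tuple[date, int, int]:
    """Split a uniform photo name back into (date, sequence, random)."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"not a uniform photo name: {name!r}")
    day = datetime.strptime(match.group(1), "%Y%m%d").date()
    return day, int(match.group(2)), int(match.group(3))


print(photo_name(date(2007, 9, 11), 392, 8069))  # 20070911_392_8069.JPG
print(parse_photo_name("20070911_392_8069.JPG"))
```

Being able to parse the date straight out of the file name is one of the ways the uniform naming pays off.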
Sometimes photos from the same day fall into several different groups (e.g., two unrelated events) and sometimes there are just too many of them to display them all together. In those cases, I split them into several different directories after renaming them. E.g., I might split the photos of Agra (originally “2007.09.11-agra”) into “2007.09.11-a-agra-red-fort”, “2007.09.11-b-fatehpur”, “2007.09.11-c-agra-streets”, and “2007.09.11-d-taj”.
Step 3: Load into F-Spot
F-Spot is a great application, once you tame it. In its wild form, it’s a bit invasive for my taste. I've configured my F-Spot to use “~/photos/main” as its photo directory, but I don’t let it mess with the structure of that directory. Occasionally I import photos using F-Spot’s GUI, always telling F-Spot to not copy the files but to just use the existing copies. But this gets very tedious. So, instead, I usually use a Python script (“fspot.py”, included in the repository) to import the photos:
yuri@chai:~/gk/git/photolab/mainline$ cp /home/yuri/.gnome2/f-spot/photos.db ~/tmp/
yuri@chai:~/git/photolab/mainline$ python fspot.py import /home/yuri/photos/main/2007.09-Fall-2007/2007.09.11
Yes, I always make a backup of F-Spot’s database file before doing this. I recommend that everyone does the same. Oh, and you need to make sure that F-Spot is not running while you do this.
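If you would rather not overwrite the previous backup each time, the copy can be timestamped. A minimal sketch, assuming you are happy with a plain file copy. The function name “backup_db” and the backup file naming are made up for this post and are not part of “fspot.py”.

```python
import shutil
import time
from pathlib import Path


def backup_db(db_path, backup_dir):
    """Copy the database file into backup_dir under a timestamped name.

    Hypothetical helper. Returns the path of the backup that was written.
    """
    db_path = Path(db_path).expanduser()
    backup_dir = Path(backup_dir).expanduser()
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"{db_path.name}.{stamp}.bak"
    # copy2 also preserves the modification time of the original file.
    shutil.copy2(db_path, target)
    return target


# Example call, using the paths from the commands above:
# backup_db("~/.gnome2/f-spot/photos.db", "~/tmp")
```

This does not check whether F-Spot is running, so that part is still up to you.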
Step 4: Add Tags using F-Spot
I then open F-Spot and use its GUI to add tags to my photos.
Step 5: Generate Thumbnails and Smaller Images
I use ImageMagick to generate the thumbnails and mid-size versions of the images to display on the website. I use version “6.3.6”, which supports “-auto-orient”. With this flag, ImageMagick will check each image’s EXIF record for the camera orientation and rotate the image if necessary. As of January 2008, using version 6.3.6 on Ubuntu meant compiling ImageMagick by hand, which is a bit of a pain in the a#&: you need to make sure that you have JPEG libraries installed before you build ImageMagick. Oh, and if you build it with PNG support, then some things stop working in Gnome…
I keep my resized images in “~/photos/sized”. When I need to resize images, I run “makethumbdir.py” (also included in the repository).
yuri@chai:~/photos/thumbscripts$ python makethumbdir.py /home/yuri/photos/main/2007.09-Fall-2007/2007.09.11-d-taj > make_thumbs.sh
This creates directories under “~/photos/sized” for each directory from “2007.09.11-d-taj” onward and generates a shell script (“make_thumbs.sh”) that copies each file into the corresponding directory twice (once for a thumbnail and once for a mid-sized version for showing on the web) and then calls ImageMagick’s “convert” on it:
/usr/local/bin/convert "/home/yuri/photos/main/2007.09-Fall-2007/2007.09.11-d-taj/20070911_402_4296.JPG" -size 300x300 -auto-orient -thumbnail 150x150 /home/yuri/photos/sized/2007-09-11-d-taj/20070911_402_4296.thumb.jpg
I then run the shell script:
yuri@chai:~/photos/thumbscripts$ sh make_thumbs.sh
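For the curious, here is roughly how a line of that shell script could be generated. This is a sketch and not the code from “makethumbdir.py”. The function name “thumb_command” is made up, and the rule for naming the output directory (dots replaced by dashes) is my guess from the single example above. It only covers the thumbnail, since the mid-size command would need its own dimensions.

```python
import shlex
from pathlib import PurePosixPath

CONVERT = "/usr/local/bin/convert"


def thumb_command(src, sized_root):
    """Return one shell line that turns src into a 150x150 thumbnail.

    Hypothetical helper: the options mirror the example command above.
    """
    src = PurePosixPath(src)
    # Assumption: "2007.09.11-d-taj" becomes "2007-09-11-d-taj" under sized/.
    out_dir = PurePosixPath(sized_root) / src.parent.name.replace(".", "-")
    out = out_dir / f"{src.stem}.thumb.jpg"
    args = [
        CONVERT, str(src),
        "-size", "300x300",      # size hint for reading the JPEG
        "-auto-orient",          # rotate according to the EXIF orientation
        "-thumbnail", "150x150",
        str(out),
    ]
    # Quote every argument so paths with spaces survive the shell.
    return " ".join(shlex.quote(arg) for arg in args)


print(thumb_command(
    "/home/yuri/photos/main/2007.09-Fall-2007/2007.09.11-d-taj/20070911_402_4296.JPG",
    "/home/yuri/photos/sized",
))
```

Printing one such line per photo and redirecting the output to a file gives a script that can be run with “sh”, as above.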
Step 6: Copy the Images to the Server.
I use “scp” to copy the images to my server. Nothing fancy about this. Perhaps the only trick is finding a host that is generous with disk space. I use DreamHost. (If you decide to open an account, use “YURI50BUCKS” promotion code to get $50 off.)
Step 7: Load the Images into My Web Application.
Again, the custom web application that I am using is too messy to release, but the basic idea is simple: I have F-Spot’s database file, which has tags among other things, and I want to load this information into my web app. I again use “fspot.py” for this, but this time as a module called from another script (“updatePhotos.py”). I am including this script in the repository just as an example.
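The basic idea of pulling tags out of an SQLite file can be sketched as follows. Please treat the schema here as an assumption: the three tables (“photos”, “tags”, “photo_tags”) and their columns are a simplified stand-in that I set up in memory for the example, and the real F-Spot database differs between versions. This is not the code from “fspot.py” or “updatePhotos.py”.

```python
import sqlite3
from collections import defaultdict


def tags_by_photo(conn):
    """Return {photo name: [tag names]} from a simplified, assumed schema."""
    query = """
        SELECT photos.name, tags.name
        FROM photo_tags
        JOIN photos ON photos.id = photo_tags.photo_id
        JOIN tags ON tags.id = photo_tags.tag_id
        ORDER BY photos.name, tags.name
    """
    result = defaultdict(list)
    for photo, tag in conn.execute(query):
        result[photo].append(tag)
    return dict(result)


# Build a tiny in-memory database to stand in for the real file.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE photos (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE photo_tags (photo_id INTEGER, tag_id INTEGER);
    INSERT INTO photos VALUES (1, '20070911_402_4296.JPG');
    INSERT INTO tags VALUES (1, 'Taj Mahal'), (2, 'India');
    INSERT INTO photo_tags VALUES (1, 1), (1, 2);
""")

print(tags_by_photo(conn))
# {'20070911_402_4296.JPG': ['India', 'Taj Mahal']}
```

A web application would then take this dictionary and create or update its own photo and tag records from it.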