How I Manage my Digital Images
A number of people have asked me for details about how I manage my collection of digital imagery. I assume that they’re asking because they know that the collection is large (~30,000 images) and that I’m rather persnickety about the details. Anyway, for anyone who cares, here are the details. As you will see, it is a bit over the top and OCD, but there are probably some useful hints hiding in here.
My imagery comes from a number of different sources.
Everyone in the family has a digital camera, and phones which take pictures too. These create a stream of images which are often useful for things like blog posts and twitter updates, but I really like to archive everything. It seems like you’re always looking for some pictures of a particular person or place.
For all of these sources, it’s really useful if you can put them in a mode where they don’t reset the file numbers. For example, on my Nikon, that’s called “File No. Sequence” on the Set Up menu. If you do this, you won’t have to deal with two images with the same filename as often. Since that’s the source of a lot of mistakes when managing your archive, it’s a good thing to minimize the name conflicts.
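If some cameras have already reset their counters, a quick scan can show where the collisions are before anything gets merged. Here is a minimal sketch in Python; the function name `find_name_conflicts` and the choice to compare names case-insensitively are assumptions made for illustration, not part of the workflow described above.

```python
from collections import defaultdict
from pathlib import Path


def find_name_conflicts(root):
    """Map each repeated filename (compared case-insensitively) to all paths using it."""
    by_name = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_name[path.name.lower()].append(path)
    # Keep only the names that appear more than once under the root directory.
    return {name: paths for name, paths in by_name.items() if len(paths) > 1}
```

Pointing it at the directory which holds all of the offload directories returns a dictionary with one entry per clashing filename, so an empty result means there is nothing to untangle.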
I’ve also got lots of old slides (mostly Kodachrome) from my days with film. In addition, we have a lot of old snapshots. A lot of these are from family members who send me copies because I’m the “camera guy”.
Most of the slides were scanned with a Minolta DImage Dual III film scanner. I was really happy with this. It gave me a lot of control and I’d gotten very used to its quirks. Unfortunately, Minolta dropped the line and support has gotten pretty awful. I tried using it with VueScan for a while, but had a string of problems. At this point I’ve pretty much stopped fighting with it and I’m using an Epson V500 instead. I’m reasonably happy with this, but it has different quirks from the ones I’ve gotten used to over the years.
I haven’t had much luck scanning negatives with either of these scanners. They tend to come out contrasty and have a lot of noise. Luckily I don’t have a lot of stuff I care about on negatives because I liked slide film so much.
I’ve also got lots of old snapshots. A lot of these are old ones which have been passed down through the family. With these, the hardest part is usually figuring out who the people are and where/when the picture was taken. One important step is to look at the back of the print when you scan it. If anyone has written a note there, copy it into the scanned file’s metadata so that it stays with the digital version. Even the vaguest note will be useful later.
All of these are scanned with the Epson. You generally don’t want to go too crazy with the resolution when you do this. Something in the 6 to 8 megapixel range is more than enough for most snapshots. The one exception is that you can occasionally fix a grainy print by scanning at much higher res than the grain and then using Photoshop to blur it back down to a reasonable resolution.
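To turn that megapixel target into a scanner setting, divide the pixel count by the area of the print and take the square root. The sketch below does the arithmetic; the function name `scan_dpi` and the print sizes in the loop are just illustrative assumptions.

```python
import math


def scan_dpi(width_in, height_in, target_megapixels):
    """Return the scan resolution (dpi) that gives roughly the target pixel count."""
    area_sq_in = width_in * height_in
    return math.sqrt(target_megapixels * 1_000_000 / area_sq_in)


# Example print sizes in inches (assumed for illustration).
for width, height in [(3.5, 5), (4, 6), (5, 7)]:
    low = scan_dpi(width, height, 6)
    high = scan_dpi(width, height, 8)
    print(f"{width} x {height} in: {low:.0f} to {high:.0f} dpi")
```

For a 4×6 inch print, 6 megapixels works out to 500 dpi (2000×3000 pixels), so the scanner’s highest settings are rarely needed for snapshots.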
One issue which I haven’t solved yet is those pebble textured prints from the 70s. The scanner’s light source is narrow enough that it basically acts like a gradient filter and makes every detail of the surface’s texture stand out. I’ve tried a number of different fixes for this, but none of them are really good enough to share.
Another option here is a service like ScanCafe. I haven’t used any of these, but I have heard good things about them. If your collection is small enough that it doesn’t justify buying a good scanner, then this might be a good option.
The first step is to dump everything onto a hard disk. This really has to be a RAID 1. The next couple of steps are going to take some time, and disks fail. I have had drive failures, and RAID 1 is just the greatest thing ever. When a drive dies, you just stick a new drive in and wait about 20 minutes while it gets rebuilt from the other one. No running around looking for your backups and finding out you’ve lost something.
Basically I have a directory for each source. They’ve often got names like “Old Slides from Yellowstone” or “Chris’ camera”. That way I know where to dump things as they come in. You want to make it a habit to unload everyone’s camera every time they bring some pictures in. Leaving images on the SD card is asking for trouble.
I generally use Adobe Bridge for managing these offload directories. They tend to have large numbers of pictures in them, and you often want to pull a few out for things like blog posts. Bridge is great for zooming through them and finding the picture you want.
Next you need to do some clean up and processing. This is also where you’d throw away the real junk, but I never seem to do that. Basically you want to go marching through a directory in order, and keep track of which pictures you’ve already processed. I’ve often got little notebooks around the computer for tracking this.
For the scanned sources, I use Photoshop’s Spot healing brush to remove dust specks and scratches. It’s slow, but it does a good job. I find it kind of relaxing to spot photos after work while everyone else is watching TV. There are some other approaches which are faster when you’ve got a really messy image. I strongly recommend Ctein’s book on photo restoration, if you’re doing this a lot. I’ve learned a lot of great stuff from his book. It’s also really useful if you’ve got old color photos which have faded and color shifted.
I also convert all of my raw files to TIFFs at this point. I archive both copies. The raw file contains more information, but I’m really afraid that the day will come when I won’t be able to read some old raw file format. I’ve been around the HPC and sciviz world too long to think that any data format lives forever. In fact I did almost get caught when Kodak’s PhotoCD format died and everyone stopped supporting it.
As I said above, anything that’s in an obscure format gets converted to something more archival. I generally use uncompressed TIFF as my archive format in this case because it’s so simple and well documented. I know that I could write a reader for uncompressed TIFF using a couple of rocks and a sharp stick if I had to. The images are starting to get awfully large, so I’ve been considering switching to a compressed format. LZW TIFF seems like it’d be a good bet, but I’ve written LZW decompression before, and I think it requires more than a sharp stick, so maybe it shouldn’t be part of my tool chain.
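To give a feel for how simple the format is, here is a rough sketch of the first step of such a reader: parsing the 8-byte header and pulling a few tags out of the first image file directory (IFD). It follows the layout in the TIFF 6.0 specification, but the function name `read_tiff_basics` is made up for this example, and it deliberately ignores everything a real reader would also need (strip offsets, multi-value tags, the pixel data itself).

```python
import struct

# Baseline tag numbers from the TIFF 6.0 specification.
TAG_NAMES = {256: "width", 257: "height", 259: "compression"}
SHORT, LONG = 3, 4  # TIFF field types: 16-bit and 32-bit unsigned integers


def read_tiff_basics(data):
    """Parse the header and first IFD of a TIFF held in `data` (a bytes object)."""
    if data[:2] == b"II":
        endian = "<"  # little-endian byte order
    elif data[:2] == b"MM":
        endian = ">"  # big-endian byte order
    else:
        raise ValueError("not a TIFF: bad byte-order mark")
    magic, ifd_offset = struct.unpack_from(endian + "HI", data, 2)
    if magic != 42:
        raise ValueError("not a TIFF: bad magic number")

    (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    info = {}
    for i in range(entry_count):
        entry = ifd_offset + 2 + 12 * i  # each IFD entry is 12 bytes long
        tag, field_type, count = struct.unpack_from(endian + "HHI", data, entry)
        if tag not in TAG_NAMES or count != 1:
            continue
        # A single SHORT or LONG is stored inline at the start of the value field.
        if field_type == SHORT:
            (value,) = struct.unpack_from(endian + "H", data, entry + 8)
        elif field_type == LONG:
            (value,) = struct.unpack_from(endian + "I", data, entry + 8)
        else:
            continue
        info[TAG_NAMES[tag]] = value
    return info
```

A compression value of 1 in the result means the image data is stored uncompressed, which is the case this paragraph is about; a value of 5 means LZW.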
I generally save them as 8 bit per channel with the sRGB color profile. I sometimes use the ProPhoto profile when it looks like a picture needs the wider gamut. This actually seems to be fairly rare, and it’s awfully easy to forget to convert back to something more standard when you want to post one of these on the web, so I haven’t made it my default.
Basically I walk through one of my offloading directories and group the pictures into sets which fit on a CD. I use MAM-A archival gold CDs with phthalocyanine dyes. The lifespan of most writable CDs is really pretty awful. You’re probably not going to go back to get something off these copies for years, and if you don’t use good CDs, then the odds are that you’ll find they’re unreadable when you need them. I’ve been thinking about switching to archival DVDs for years now, but I’m still not convinced that they’re as safe because of the higher density.
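The grouping step is easy to automate. The sketch below walks a list of files in order and starts a new set whenever the next file would overflow the disc. The 650 MB capacity and the name `group_for_discs` are assumptions for illustration; check the usable capacity of your own discs and leave some headroom for the filesystem.

```python
CD_CAPACITY_BYTES = 650 * 1024 * 1024  # assumed usable capacity; adjust for your discs


def group_for_discs(files, capacity=CD_CAPACITY_BYTES):
    """Split an ordered list of (name, size_in_bytes) pairs into disc-sized sets."""
    discs, current, used = [], [], 0
    for name, size in files:
        if size > capacity:
            raise ValueError(f"{name} is larger than one disc")
        if used + size > capacity:
            # The next file does not fit, so close this set and start a new one.
            discs.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        discs.append(current)
    return discs
```

Keeping the files in their original order wastes a little space compared to a cleverer packing, but it means consecutive shots stay together on the same disc.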
Each source ends up on a series of CDs. These are numbered sequentially with a code which starts with two letters which identify the source. For example, the images from my D-70 are on CDs with numbers like ND-000, while the ones from the Minolta scanner are on CDs with numbers like DS-000. I have an external hard drive which has a pictures directory which contains a read-only copy of each CD. These are the copies I work from. The CDs themselves go into an archival filing system. I generally get my filing materials from Light Impressions. They’re not cheap, but they’re pretty trustworthy.
The directories with the copies of the CDs (and the offloading directory too) are backed up using an online service. I use Mozy, but Carbonite would be fine too. You absolutely need to do this. None of the early steps are going to matter if your house burns down unless you have some sort of offsite storage. I used to store the CDs at work, but I think that this system is better. Make sure that you occasionally go through the steps of restoring things from their web site so that you are really sure that everything is working correctly. It’s safe to do this by deleting a directory and restoring it, because you’ve got the other copy on CD.
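A restore test is more convincing if you compare the restored files against a known good copy instead of just checking that they open. Here is a minimal sketch which compares two directory trees by SHA-256 checksum; the names `file_digest` and `compare_trees` are hypothetical, and it assumes you still have the reference copy (for example, read back from the CD) in one directory and the restored files in another.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(reference_dir, restored_dir):
    """List (relative_path, problem) pairs for files missing or altered in the restore."""
    reference_dir, restored_dir = Path(reference_dir), Path(restored_dir)
    problems = []
    for ref in sorted(reference_dir.rglob("*")):
        if not ref.is_file():
            continue
        relative = ref.relative_to(reference_dir)
        restored = restored_dir / relative
        if not restored.is_file():
            problems.append((str(relative), "missing"))
        elif file_digest(ref) != file_digest(restored):
            problems.append((str(relative), "different"))
    return problems
```

An empty list means every file in the reference copy came back bit for bit. Note that this only checks in one direction: extra files in the restored directory are not reported.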
Now that you have tens of thousands of images archived, how do you manage them all? You want to tag them all so that you can find them when someone asks you for a picture of a particular person or event. For many years, I’ve used ACDSee. I first started using it because I didn’t have enough disk space to have all of the CDs online at once. ACDSee is really good at managing offline imagery.
At this point, since drives have gotten so cheap, you can just index the online read-only copies using any indexing software. I use Picasa a lot at this point. It has some pretty good tools for managing tags, and it also has good face recognition. That means that I can tell you that I’ve got 4,307 pictures of Peter, but only 3,237 pictures of Tom. This is almost totally automatic, and it speeds up the tagging process immensely. On the other hand, Picasa’s manual tagging tools aren’t the most efficient when you’re going through a large number of pictures.
I also use Bridge’s tagging tools. It’s really the best for tagging lots of files and for some types of complex queries. I’ll probably do another writeup at some point which just covers indexing tools in more detail.
One important part of the process now is sharing pictures online. I’ve used a number of different tools for this. One thing that you’re looking for with any of these tools is the ability to control the sharing. You often want to create an album to share with a particular group of people. Friendly and quick upload tools are another important consideration.
Most of our online pictures go on Flickr. We’ve used it for years, and its tooling is pretty good. However, I have concerns about their latest changes to their terms of service. I think that I’m actually violating them when I embed Flickr images in a blog post.
I also use Picasa quite a bit. It has been improving very quickly, and of course it integrates well with G+.
I used to use the Kodak one quite a bit too, but I haven’t really used that one in quite a while now.
Adobe Lightroom has some really nice tools for creating galleries and slideshows. I haven’t used them, but I’ve been meaning to spend more time with Lightroom. It really isn’t part of my workflow at this point.
Printing and Displaying
Even though everything is digital, you still want to print and display. I have an Epson photo printer with the Ultrachrome inks. When you print that onto a Baryta paper like the Hahnemuhle one, the results can be quite spectacular. Better than almost anything I ever managed to do in the darkroom. That said, it does take a bit of fiddling, and I’ve gotten pretty lazy at this point. Instead of fighting with the printer, I often batch up about 10 prints and send them to Adorama. They really do a great job, and their prices are pretty reasonable. Be sure to download their ICC profiles, and convert your pictures to them before uploading. It really makes a big difference.
We also turn a lot of our pictures into things like photo books, calendars, and Christmas cards. In addition to Adorama, we also use Apple and VistaPrint for this sort of job. Of course, you’ll also want to use some to create Moo Cards.
For many years, I would get a bunch of prints made, and then they would sit in a box for years, and I would never look at them. Getting them framed and hung was the bottleneck. Finally I had a really bright idea. I installed a bunch of picture rails (which I got from AS Hanging Systems) and bought a bunch of cheap standard sized (8×10 through 16×20) frames. Now it only takes a moment to put up new prints and to rearrange the ones which are there. The rail system was a little expensive, but it has made a big difference in how much I actually get to enjoy my pictures.
In addition to printing, I also crop most of my favorites to the resolution of my standard monitor (1920×1200) and dump them in a directory which my screen saver is pointing at. I’m still looking for a similar setup for the Android devices. I’m currently using JustPictures, but I’m not completely convinced that’s the best one for the job.
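For the monitor crops, the only real arithmetic is finding the largest region with the screen’s aspect ratio. The sketch below computes a centered crop box for a 1920×1200 target; the function name `center_crop_box` is made up for this example, and centering is an assumption, since in practice you will often want to shift the crop by hand to suit the picture.

```python
def center_crop_box(width, height, target_w=1920, target_h=1200):
    """Return (left, top, right, bottom) of the largest centered crop with the target aspect ratio."""
    # Compare aspect ratios by cross-multiplying, which avoids floating point.
    if width * target_h > height * target_w:
        # The image is wider than the target shape: keep full height, trim the sides.
        crop_h = height
        crop_w = height * target_w // target_h
    else:
        # The image is taller than (or matches) the target shape: keep full width.
        crop_w = width
        crop_h = width * target_h // target_w
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return (left, top, left + crop_w, top + crop_h)
```

The result uses the (left, upper, right, lower) convention that imaging libraries such as Pillow accept for cropping. After cropping, the image still needs to be resized down to 1920×1200.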