De-Duping my photos with Anti-Twin



Post by Dusty » Sun Feb 24, 2019 6:49 am

Having some down time lately because of the weather, I've taken to de-duping my photos, and saving many gigs of space in so doing.

Part of this is the fact that I always used to have a quiet late morning at my old job, and often worked on my photos there as well as on my home PC, so combining the files often meant that photos were transferred into a folder with a similar name between the 2 'puters.

When I left the old job last year, all of my backups were also transferred onto a backup drive and I had redundancies of redundancies.

These past few weeks I've been going thru them with Anti-Twin - a free deduping program that works wonderfully, if a bit slowly. Disclaimer - I only consider a photo a dupe if it is exactly the same in every way - if I adjusted it or cropped it, it's not the same. A/T can find those similar dupes, but it's a slower process in every way, and that's not what I'm looking for.

A few pointers if you want to go thru this: Start with smaller bites. I put both jpgs and raws into folders by year when importing them, and let Sony's PMB put them into dated sub-folders.

I found out the hard way that the program doesn't seem to sort things properly before starting its comparison. So, if you point it at /pictures, your first batch of 1000 may list 327 dupes in your /2014 folder, 156 in /2013 and 427 in /2017, but your next batch may give you similar results, with some of the same subfolders of a yearly folder showing dupes again.

A/T defaults to batches of 1000 dupes, but you can set it higher or lower. I found it best to leave it alone. It reads all of the folders' and subfolders' contents into memory, then starts a comparison. For that reason, it's quite a bit faster if you select one subfolder at a time.

For me, when I adjust a photo, I normally rename it, so I have it check for dupes by identical name, extension, and file size, as well as byte-by-byte content. If you don't rename, you may want to compare pixel by pixel instead, which is slower.
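For anyone who would rather script the same kind of check, here is a rough Python sketch of it: same file name, same size, then a byte-by-byte comparison. This is only my own illustration of those criteria, not how Anti-Twin works internally, and the function name find_exact_dupes and the folder path in the usage line are made up for the example.

```python
import filecmp
import os
from collections import defaultdict


def find_exact_dupes(root):
    """Return groups of files under root that share name, size, and bytes.

    Hypothetical helper for illustration; it only reports, it never deletes.
    """
    # Cheap first pass: bucket files by (lowercased name, size in bytes).
    candidates = defaultdict(list)
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            candidates[(name.lower(), os.path.getsize(path))].append(path)

    groups = []
    for paths in candidates.values():
        paths.sort()
        while len(paths) > 1:
            first, rest = paths[0], paths[1:]
            # shallow=False forces a real byte-by-byte comparison
            # instead of trusting size and modification time.
            same = [p for p in rest if filecmp.cmp(first, p, shallow=False)]
            if same:
                groups.append([first] + same)
            # Keep checking whatever did not match the first file.
            paths = [p for p in rest if p not in same]
    return groups
```

Running something like `for group in find_exact_dupes("pictures/2014"): print(group)` on one yearly folder at a time keeps each run small, and since the sketch only lists the matches you can look them over before deleting anything.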

So far I've saved over 100 gigs of space that were wasted on duplicate files - 2011 was a really messed up year for some reason.

If anyone else has an easier way to de-dupe, let me know. If you're trying out Anti-Twin and need some help, I'm getting pretty good with it.

Dusty
An a700 and couple of a580s, plus even more lenses.
