
brewser

macrumors regular
Original poster
Oct 8, 2006
130
3
Does anyone know a way to remove duplicates in A3? I just installed it and I have a bunch of duplicate images with the same name. I can't find a good way to remove them.
Any help?
 

compuwar

macrumors 601
Oct 5, 2006
4,717
2
Northern/Central VA
I don't think it's a feature. The best file de-duplicator I've found so far is "Decloner." However, this won't help if you store your files in the Aperture library rather than simply reference them.
 

GoCubsGo

macrumors Nehalem
Feb 19, 2005
35,741
153
What about that app you've used, Chipmunk?
 

brewser

macrumors regular
Original poster
Oct 8, 2006
130
3
I use A3 as my library, so Decloner won't work. I have used Tidy Up! in the past. Tidy Up wouldn't actually delete the dupes; it would just mark them so you could then do a keyword search and manually delete those images. Tidy Up is not compatible with A3, so I was hoping there was an interim solution.
 

brewser

macrumors regular
Original poster
Oct 8, 2006
130
3
I am really new to A3, but I wonder if you could copy the A3 originals folder out of the package contents to another folder, have Decloner remove the duplicates, and then copy the originals back to A3.
What will A3 do when there are missing photos? Does it have some type of rescrub, or will it mark those as missing?
This might work if A3 doesn't get messed up.
Thoughts?
 

compuwar

macrumors 601
Oct 5, 2006
4,717
2
Northern/Central VA
If you've added metadata like keywords, then you'll lose it. The thing I really like about Decloner is that it takes the approach I was going to take the last time I got frustrated with Chipmunk crashing: sort by size, then checksum with a good hash (SHA-1 in this case). The only couple of things I'd do differently are to CRC32 the first 50 or so bytes, which makes the checksum step faster for mismatched files, and to save the whole list to a hashed disk file so that doing several volumes at once wouldn't be a memory issue. But those are minor nits. This is the first de-dupe program I've found that I completely like, and I've been searching for a while, since my prior backup strategy was "Copy two old volumes to a newer bigger one."

Paul
 

brewser

macrumors regular
Original poster
Oct 8, 2006
130
3
Do you know if A3 will then rebuild the library or will it be a mess?
Thanks for your help!
 

brewser

macrumors regular
Original poster
Oct 8, 2006
130
3
I suspect you'd have to bring the files in to a new Library- which would be the right way to do it if you're going to export anyway, I think.

Paul

That makes sense. Then delete the old library. Thanks again for your help.
 

pdxflint

macrumors 68020
Aug 25, 2006
2,407
14
Oregon coast
I think one of the gears in my brain just jammed... :D ;)
 

compuwar

macrumors 601
Oct 5, 2006
4,717
2
Northern/Central VA
I tend to be dealing with tens to hundreds of thousands of files at once. A one-way hash function[1] like SHA-1 or MD5 takes a lot of CPU (relatively speaking), and reading 12 or 25M raw files takes a comparatively long time. My "solution" would be to sort by size first (if they aren't the same size, they can't be the same file), then to checksum a relatively "cheap" amount of data with a "cheaper" checksum algorithm (CRC32). If they don't match at that point, they're not the same file. By winnowing away at the problem like that, I can do literally thousands more files in the same amount of time as it takes to read, checksum and compare a handful of full-sized raw files with an "expensive" checksum or hash algorithm.

Unfortunately, I just don't have the time to write a lot of code anymore.
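
For anyone who does have the time, the winnowing idea above could be sketched roughly like this in Python. This is only an illustration of the size, then cheap CRC32, then full SHA-1 sequence, not how Decloner or any other app actually does it. The function names and the `HEAD_BYTES` value are made up for the example.

```python
import hashlib
import os
import zlib
from collections import defaultdict

HEAD_BYTES = 100  # assumed size of the "cheap" read; tune to taste


def head_crc32(path, n=HEAD_BYTES):
    """Cheap pass: CRC32 of only the first n bytes of the file."""
    with open(path, "rb") as f:
        return zlib.crc32(f.read(n))


def full_sha1(path, chunk_size=1 << 20):
    """Expensive pass: SHA-1 of the whole file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def split_by(paths, key):
    """Group paths by key(path); drop groups with only one member."""
    groups = defaultdict(list)
    for p in paths:
        groups[key(p)].append(p)
    return [g for g in groups.values() if len(g) > 1]


def find_duplicates(paths):
    """Return lists of paths whose contents are almost certainly identical."""
    # Pass 1: files of different sizes cannot be the same file.
    candidates = split_by(paths, os.path.getsize)
    # Pass 2 (cheap) and pass 3 (expensive) only run on the survivors.
    for key in (head_crc32, full_sha1):
        candidates = [sub for group in candidates for sub in split_by(group, key)]
    return candidates
```

Calling `find_duplicates(list_of_paths)` returns one list per set of matching files. Nothing is deleted; what to do with each set is left to the caller.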

Paul
[1] A one-way hash algorithm produces a "relatively" unique output from any given input. "Collisions" happen when two different inputs give the same output, but that's very, very unlikely for two files of the same size but differing content. So if a file containing "ABCDE" gives a hash output of f2342965 and a file containing "ABCDF" gives an output of 31c2485, you can be sure they're not the same file. Let's say it takes .1 second to do a CRC32 and 1 second to do a SHA1 hash: you're already looking at a 10:1 ratio, but if you add in the fact that, say, 95% of the time you're reading 100 bytes instead of 12000000 bytes, things get much more efficient.

One-way functions are often used to hash passwords, so you don't have to store the actual password anywhere. You simply store the hash, then you run the user's typed-in password through the algorithm to produce a hash, and if the two match, you "know" with a fair degree of certainty that the password is correct.
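
As a rough illustration of that store-the-hash, compare-the-hash idea, here is a small Python sketch. The random salt, PBKDF2 and the `ITERATIONS` value are assumptions added for the example (a bare fast hash like MD5 or SHA-1 is generally considered too weak for passwords on its own), and the function names are made up.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor, not a recommendation


def hash_password(password, salt=None):
    """Return (salt, digest); only these are stored, never the password."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, stored_digest):
    """Re-hash the typed-in password and compare it with the stored digest."""
    _, candidate = hash_password(password, salt)
    # compare_digest avoids leaking how many leading bytes matched.
    return hmac.compare_digest(candidate, stored_digest)
```

At sign-up you would keep the `salt` and `digest` from `hash_password(...)`; at login you pass the typed-in password and the stored pair to `verify_password(...)`.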
 

pdxflint

macrumors 68020
Aug 25, 2006
2,407
14
Oregon coast
I was just pulling your leg, Paul... but thanks for the explanation. I'd agree with your sorting/comparative logic completely - it's way more efficient.
 

bpwhistler

macrumors newbie
Feb 18, 2010
2
0
The developer of Tidy Up informed me yesterday that his app would be A3 compatible "very soon."

I guess "very soon" could mean different things to different people...
 

fltman

macrumors member
Jan 9, 2004
62
4
sweden
Be very careful with this one! It does not allow you to compare found duplicates like their iPhoto version does. Furthermore, it did an unacceptable job in my Aperture library. Almost all of the "duplicates" it found were unique images. :eek:

The Aperture version can also mark the originals with a keyword, thereby making them comparable. It's hard to say why you had unique photos marked as duplicates. Perhaps you ran in Magic Mode and images with the same Exif creation date were matched? Try Classic mode and the MD5 checksum. Also, remember that it compares the master images, so versions belonging to a duplicate master will also be marked as duplicates. This can be confusing if you think of your versions as unique image files when they are not.
 

tethead

macrumors 6502
Apr 13, 2005
405
371
NJ
good tip on Duplicate Annihilator - pretty nice app and seems to work quick. i've got nearly 37k images in Aperture so this is definitely better than trying to do it manually!!

i will say that it has found MANY false positives which were all shot in "Burst Mode" - but the program DOES warn you that it may do that if you run it in Magic Mode, so I was sure to look over the set of duplicates before just deleting all of them with the matching keyword.

thanks for the tips!!!
 

ckeilah

macrumors newbie
May 13, 2009
25
4
compuwar has the perfect solution (as long as it doesn't destroy Aperture's stored metadata). Can't we develop this into a useful app? Could this be implemented in an AppleScript script?! Maybe an AppleScript could use compuwar's ideas and send a "delete this photo" command to Aperture. That would at least make the manual checking of the slightly different photos a lot easier (i.e. it would kill all the definite duplicates, leaving just two or three, one with good metadata and two dupes, to manually work on).

Even better would be if the AppleScript could find the duplicates and near-dupes by efficient multi-pass: filename, file date, CRC head, MD5, SHA-1, etc. the data portion; then compare the metadata and pick the one with the most fields filled out, then mark those in Aperture as #1 Best Dupe #2 Less Quality Dupe (#3-18 deleted because they are exact dupes of #2)
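
The "pick the one with the most fields filled out" step might look something like the following Python sketch. It is purely illustrative: the `Photo` record and its fields are hypothetical, it does not talk to Aperture at all, and it assumes the group of exact duplicates has already been found by the checksum passes.

```python
from dataclasses import dataclass, field


@dataclass
class Photo:
    """Hypothetical record; a real script would fill this in from Aperture."""
    name: str
    metadata: dict = field(default_factory=dict)


def filled_fields(photo):
    """Count metadata fields that actually hold a value."""
    return sum(1 for value in photo.metadata.values() if value not in (None, "", [], ()))


def rank_duplicates(group):
    """Return (best, rest): the most completely tagged copy, then the others."""
    # sorted() is stable, so copies with equal counts keep their original order.
    ranked = sorted(group, key=filled_fields, reverse=True)
    return ranked[0], ranked[1:]
```

The `best` copy would be the one to keep (#1 Best Dupe); the `rest` are the candidates to mark or delete after a manual look.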

Duplicate Annihilator is on the right track, and the Aperture version actually claims to use CRC and MD5, but the fact that it marks/deletes UNIQUE photos (it even admits so) is totally unacceptable. (has this been fixed recently?)

I have had to import about five versions of broken iPhoto libraries, so I now have up to 10 duplicates of many photos, but probably only one copy of many. My Aperture library now contains about 200,000 photos, probably 70% of which are dupes. I have put a lot of work into the metadata that (I hope) got brought over from the iPhoto imports, so I don't relish losing that. At this point it seems that the only way I can clean this up is to devote about 150 man-hours to going through the entire library, clicking on each visually duplicate photo, finding the one with the good metadata, and deleting the other 7. I am utterly bemused that these "computers" that were supposed to save us this kind of horribly repetitive manual labor are unable to do this, and that Apple doesn't seem to think its flagship photo app warrants having such functionality. :( :apple: :(

If I have somehow overlooked a workable solution, please poke me before I begin my arduous trek down the road of mind-numbing photo library editing. ;-)


more edit: I just found this, which also promises to assist, and should be included in our script, IMHO: http://hints.macworld.com/article.php?story=20060624112253828 Separating out the Original photos from the Edited photos seems to be a good idea.

or... is Tidy Up now Aperture3 compatible, and the best solution we have?


FWIW, Auto-Stack appears to be broken. Aperture has been showing the spinning beach ball all day now and nothing seems to be happening.
 

egis

macrumors member
Sep 3, 2008
76
0
Bethesda, Maryland
Remove Duplicate Images in Aperture 3

Since this thread was started (2010), the comments have focused mostly on file de-duplicator software, as opposed to image de-duplicators. Has anyone had any experience with PhotoSweeper? It seems to be a rather complete application for this use, but I would like to hear about anyone's experiences. Thanks
 

mtngoatjoe

macrumors 6502
Jun 10, 2008
270
56
App for wife...

Does anyone know of an app that will force my wife to delete 80% of the photos she imports? It would also be really helpful if it forced her to apply faces, places, keywords, and ratings. My existence would be significantly simpler if I could find an app like that!
 

timnorman

macrumors newbie
Feb 25, 2014
1
0
Aperture 3 & iPhoto Library Manager

I have just started using Aperture 3, and am taking advantage of the shared use of my existing iPhoto library. With this setup, could I still use iPhoto Library Manager for removing duplicates?
Can anyone see a problem with this?
Thanks
Tim
 