Remove Duplicate Images in Aperture 3?

Discussion in 'Digital Photography' started by brewser, Feb 10, 2010.

  1. macrumors regular

    Joined:
    Oct 8, 2006
    #1
    Does anyone know a way to remove duplicates in A3? I just installed it and I have a bunch of duplicate images with the same name. I can't find a good way to remove them.
    Any help?
     
  3. macrumors 601

    compuwar

    Joined:
    Oct 5, 2006
    Location:
    Northern/Central VA
    #3
    I don't think it's a feature. The best file de-duplicator I've found so far is "Decloner." However, this won't help if you store your files in the Aperture library rather than simply referencing them.
     
  4. macrumors Nehalem

    GoCubsGo

    Joined:
    Feb 19, 2005
    #4
    What about that app you've used, Chipmunk?
     
  6. thread starter macrumors regular

    Joined:
    Oct 8, 2006
    #6
    I use A3 as my library, so Decloner won't work. I have used Tidy Up! in the past. Tidy wouldn't actually delete the dupes; it would just mark them so you could then do a keyword search and manually delete those images. Tidy Up! is not compatible with A3, so I was hoping there was an interim solution.
     
  7. thread starter macrumors regular

    Joined:
    Oct 8, 2006
    #7
    I am really new to A3, but I wonder: could you copy the A3 originals folder out of the package contents to another folder, have Decloner remove the duplicates, and then copy the originals back into A3?
    What will A3 do when there are missing photos? Does it have some type of rescan, or will it mark those as missing?
    This might work if A3 doesn't get messed up.
    Thoughts?
     
  8. macrumors 601

    compuwar

    Joined:
    Oct 5, 2006
    Location:
    Northern/Central VA
    #8
    If you've added metadata like keywords, then you'll lose it. The thing I really like about Decloner is that it takes the approach I was going to take the last time I got frustrated with Chipmunk crashing: sort by size, then checksum with a good hash (SHA-1 in this case). The only couple of things I'd do differently are to CRC32 the first 50 or so bytes to make the checksum step faster for mismatched files, and to save the whole list to a hashed disk file so that doing several volumes at once wouldn't be a memory issue. But those are minor nits; this is the first de-dupe program I've found that I completely like, and I've been searching for a while, since my prior backup strategy was "copy two old volumes to a newer, bigger one."

    Paul
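    The "checksum with a good hash" step described above can be sketched in a few lines of Python (this is just an illustration of the idea, not anything Decloner actually ships; the function name and chunk size are my own choices). hashlib's streaming interface keeps even large raw files out of memory:

```python
import hashlib

def sha1_of_file(path, chunk_size=65536):
    # Stream the file through SHA-1 so a 25 MB raw file
    # is never held in memory all at once.
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

    Two files with the same size and the same SHA-1 digest can, for all practical purposes, be treated as byte-for-byte duplicates.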
     
  9. thread starter macrumors regular

    Joined:
    Oct 8, 2006
    #9
    Do you know if A3 will then rebuild the library or will it be a mess?
    Thanks for your help!
     
  10. macrumors 601

    compuwar

    Joined:
    Oct 5, 2006
    Location:
    Northern/Central VA
    #10
    I suspect you'd have to bring the files into a new Library - which would be the right way to do it if you're going to export anyway, I think.

    Paul
     
  11. thread starter macrumors regular

    Joined:
    Oct 8, 2006
    #11
    That makes sense. Then delete the old library. Thanks again for your help.
     
  12. macrumors 68020

    pdxflint

    Joined:
    Aug 25, 2006
    Location:
    Oregon coast
    #12
    I think one of the gears in my brain just jammed... :D ;)
     
  13. macrumors 601

    compuwar

    Joined:
    Oct 5, 2006
    Location:
    Northern/Central VA
    #13
    I tend to be dealing with tens to hundreds of thousands of files at once. A one-way hash function[1] like SHA-1 or MD5 takes a lot of CPU (relatively speaking), and reading 12 or 25 MB raw files takes a comparatively long time. My "solution" would be to sort by size first (if they aren't the same size, they can't be the same file), then to checksum a relatively "cheap" amount of data with a cheaper checksum algorithm (CRC32); if they don't match at that point, they're not the same file. By winnowing away at the problem like that, I can do literally thousands more files in the same amount of time as it takes to read, checksum, and compare a handful of full-sized raw files with an "expensive" checksum or hash algorithm.

    Unfortunately, I just don't have the time to write a lot of code anymore.

    Paul
    [1] A one-way hash algorithm produces a "relatively" unique output from any given input. "Collisions" happen when two different inputs give the same output, but that's very, very unlikely for two files of the same size but differing content. So if a file containing "ABCDE" gives a hash output of f2342965 and a file containing "ABCDF" gives an output of 31c2485, you can be sure they're not the same file. Let's say it takes 0.1 seconds to do a CRC32 and 1 second to do a SHA-1 hash: you're already looking at a 10:1 rate, but if you add in the fact that, say, 95% of the time you're reading 100 bytes instead of 12,000,000 bytes, things get much more efficient.

    One-way functions are often used to hash passwords, so you don't have to store the actual password anywhere: you simply store the hash, then you run the user's typed-in password through the algorithm to produce a hash; if the two match, you "know" with a fair degree of certainty that the password is correct.
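    The three-stage winnowing described in this post (sort by size, then a cheap CRC32 of the file's head, then a full hash only for whatever survives) can be sketched in Python. To be clear, this is my own illustrative sketch of the approach, not code from any of the apps mentioned; the 50-byte head size and the function names are assumptions:

```python
import hashlib
import os
import zlib
from collections import defaultdict

HEAD_BYTES = 50  # cheap first-pass checksum covers only the file's head

def crc32_head(path, n=HEAD_BYTES):
    # Cheap pass: CRC32 of the first n bytes only.
    with open(path, "rb") as f:
        return zlib.crc32(f.read(n))

def sha1_full(path, chunk=65536):
    # Expensive pass: stream the whole file through SHA-1.
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()

def find_duplicates(paths):
    """Winnow candidates: size -> CRC32 of head -> full SHA-1."""
    by_size = defaultdict(list)
    for p in paths:
        by_size[os.path.getsize(p)].append(p)

    by_crc = defaultdict(list)
    for group in by_size.values():
        if len(group) < 2:
            continue  # unique size: cannot be a duplicate
        for p in group:
            by_crc[(os.path.getsize(p), crc32_head(p))].append(p)

    by_sha = defaultdict(list)
    for group in by_crc.values():
        if len(group) < 2:
            continue  # unique head checksum: cannot be a duplicate
        for p in group:
            by_sha[sha1_full(p)].append(p)

    return [g for g in by_sha.values() if len(g) > 1]
```

    Only files that survive both cheap passes are ever read in full, which is where the speedup for mismatched files comes from.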
     
  14. macrumors 68020

    pdxflint

    Joined:
    Aug 25, 2006
    Location:
    Oregon coast
    #14
    I was just pulling your leg, Paul... but thanks for the explanation. I'd agree with your sorting/comparative logic completely - it's way more efficient.
     
  15. macrumors newbie

    Joined:
    Feb 18, 2010
    #15
    The developer of Tidy Up informed me yesterday that his app would be A3 compatible "very soon."

    I guess "very soon" could mean different things to different people...
     
  17. macrumors newbie

    Joined:
    Jul 10, 2010
    #17
    Be very careful with this application...


    Be very careful with this one! It does not allow you to compare found duplicates like their iPhoto version does. Furthermore, it did an unacceptable job in my Aperture library. Almost all of the "duplicates" it found were unique images. :eek:
     
  18. macrumors member

    Joined:
    Jan 9, 2004
    Location:
    sweden
    #18
    The Aperture version can also mark the originals with a keyword, thereby making them comparable. It's hard to say why you had unique photos marked as duplicates. Perhaps you ran it in Magic Mode, and images with the same EXIF creation date were matched? Try Classic mode and the MD5 checksum. Also, remember that it compares the master images, so versions belonging to a duplicate master will also be marked as duplicates; this can be confusing if you think of your versions as unique image files when they are not.
     
  19. macrumors regular

    tethead

    Joined:
    Apr 13, 2005
    Location:
    NYC
    #19
    Good tip on Duplicate Annihilator - pretty nice app, and it seems to work quickly. I've got nearly 37k images in Aperture, so this is definitely better than trying to do it manually!!

    I will say that it has found MANY false positives, which were all shot in "Burst Mode" - but the program DOES warn you that it may do that if you run it in Magic Mode, so I was sure to look over the set of duplicates before just deleting all of them with the matching keyword.

    thanks for the tips!!!
     
  20. ckeilah, Jul 12, 2011
    Last edited: Jul 12, 2011

    macrumors newbie

    Joined:
    May 13, 2009
    #20
    compuwar has the perfect solution (as long as it doesn't destroy Aperture's stored metadata). Can't we develop this into a useful app? Could it be implemented as an AppleScript?! Maybe an AppleScript could use compuwar's ideas and send a "delete this photo" command to Aperture. That would at least make the manual checking of the slightly different photos a lot easier (i.e., it would kill all the definite duplicates, leaving just two or three to work on manually: one with good metadata, two dupes).

    Even better would be if the AppleScript could find the duplicates and near-dupes with an efficient multi-pass over the data: filename, file date, CRC of the head, MD5, SHA-1, etc.; then compare the metadata, pick the one with the most fields filled out, and mark those in Aperture as "#1 Best Dupe" and "#2 Lesser-Quality Dupe" (#3-18 deleted because they are exact dupes of #2).

    Duplicate Annihilator is on the right track, and the Aperture version actually claims to use CRC and MD5, but the fact that it marks/deletes UNIQUE photos (it even admits as much) is totally unacceptable. (Has this been fixed recently?)

    I have had to import about five versions of broken iPhoto libraries, so I now have up to 10 duplicates of many photos, but probably only one copy of many others. My Aperture library now contains about 200,000 photos, probably 70% of which are dupes. I have put a lot of work into the metadata that (I hope) got brought over from the iPhoto imports, so I don't relish losing it. At this point it seems the only way I can clean this up is to devote about 150 man-hours to going through the entire library, clicking on each visually duplicate photo, finding the one with the good metadata, and deleting the other 7. I am utterly bemused that these "computers" that were supposed to save us this kind of horribly repetitive manual labor are unable to do this, and that Apple doesn't seem to think its flagship photo app warrants such functionality. :( :apple: :(

    If I have somehow overlooked a workable solution, please poke me before I begin my arduous trek down the road of mind-numbing photo library editing. ;-)


    more edit: I just found this, which also promises to assist and should be included in our script, IMHO: http://hints.macworld.com/article.php?story=20060624112253828 Separating the Original photos from the Edited photos seems like a good idea.

    or... is Tidy Up now Aperture3 compatible, and the best solution we have?


    FWIW, enabling Auto-Stack appears to be broken. Aperture has been spinning away all day now and nothing seems to be happening.
     
  22. macrumors member

    Joined:
    Sep 3, 2008
    Location:
    Bethesda, Maryland
    #22
    Remove Duplicate Images in Aperture 3

    Since this thread was started (2010), the comments have mostly focused on file de-duplication software, as opposed to image de-duplicators. Has anyone had experience with Photosweeper? It seems to be a rather complete application for this use, but I would like to hear about anyone's experiences. Thanks
     
  23. macrumors regular

    mtngoatjoe

    Joined:
    Jun 10, 2008
    #23
    App for wife...

    Does anyone know of an app that will force my wife to delete 80% of the photos she imports? It would also be really helpful if it forced her to apply faces, places, keywords, and ratings. My existence would be significantly simpler if I could find an app like that!
     
  24. macrumors 68040

    Badrottie

    Joined:
    May 8, 2011
    Location:
    Los Angeles
    #24
    Yeah me too…. :apple:
     
  25. macrumors newbie

    Joined:
    Feb 25, 2014
    #25
    Aperture 3 & iPhoto Library Manager

    I have just started using Aperture 3, and am taking advantage of the shared use of my existing iPhoto library. With this setup, could I still use iPhoto Library Manager for removing duplicates?
    Can anyone see a problem with this?
    Thanks
    Tim
     
