
Anarchy99

macrumors 65816
Original poster
Dec 13, 2003
1,041
1,034
CA
I have several external hard drives that I normally keep as data drives where I throw anything, and I don't really have much of an organizational system.
Over the years I've probably piled up a lot of duplicate files, e.g. same/similar MP3 files, duplicate pictures, movies, etc. from old backups.
What is the best software for automatically getting rid of some of the duplicates, and perhaps for helping me organize so this doesn't happen again?

Most of the duplicate file finders I've seen look for similar names or for every file of a certain size, but apps like Shazam and midomi show that audio can be recognized by its content, and I know there's such a thing as image recognition. So is there a smarter way of finding these similar files rather than just relying on size or name?

As always I prefer free, but I'm willing to spend money if the quality of the software is worth it.

thanks
 

flynz4

macrumors 68040
Aug 9, 2009
3,242
126
Portland, OR
As long as you do not have a common "home" for your data... you will continue to have a disorganized mess. No software will rid you of that.

You need a single place to keep your data (or at least categories of data). Then make sure that any data (of that category) is always put there. That way, everything else is just a copy, and can be deleted.

My best advice is to use specific tools for specific types of data. Examples:

Aperture/iPhoto for pictures. Insert everything... and then organize there... deleting dupes as you organize. Import using the "do not import duplicates" option... that might remove a lot during import.

iTunes or similar as a music database.

DevonThink or similar as a document database.

As soon as you start this... make sure that the single "home" for your data is well backed up. I would recommend at least two different backup mechanisms... one local, one remote. My recommendation is Time Machine and Crashplan. Also... making a clone can come in handy as well. I use CCC.

/Jim
 

balamw

Moderator emeritus
Aug 16, 2005
19,366
979
New England
Most of the duplicate file finders I've seen look for similar names or for every file of a certain size, but apps like Shazam and midomi show that audio can be recognized by its content, and I know there's such a thing as image recognition. So is there a smarter way of finding these similar files rather than just relying on size or name?

I have found dupeGuru to be quite good, and it has modes/versions specific to music and pictures. Unfortunately, the developer has decided not to continue development.

B
 

ozaz

macrumors 68000
Feb 27, 2011
1,574
512
Most of the duplicate file finders I've seen look for similar names or for every file of a certain size

Which ones have you ruled out based on this behaviour?

I use Gemini which I'm happy with. It certainly looks beyond file name, although I don't know if it goes any deeper than file properties and metadata.
 

Dark Dragoon

macrumors 6502a
Jul 28, 2006
844
3
UK
If you want to find similar but not necessarily exactly the same images then PhotoSweeper has worked well for me.
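To give a rough idea of how "similar but not identical" matching can work, here is a minimal sketch of an average hash, one common perceptual-hashing technique. I don't know what PhotoSweeper actually does internally, so treat this purely as an illustration. The function names are made up, and the input is assumed to be a grayscale image already decoded into a 2D list of 0-255 values, at least 8 pixels in each dimension.

```python
def average_hash(pixels, hash_size=8):
    """Return a hash_size*hash_size-bit integer fingerprint of a grayscale image.

    `pixels` is assumed to be a rectangular 2D list of values in 0-255,
    with at least `hash_size` rows and columns.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    # Shrink the image to hash_size x hash_size by averaging blocks of pixels.
    for row in range(hash_size):
        y0 = row * height // hash_size
        y1 = (row + 1) * height // hash_size
        for col in range(hash_size):
            x0 = col * width // hash_size
            x1 = (col + 1) * width // hash_size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    # Each bit records whether a cell is brighter than the overall mean.
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")
```

Two images whose fingerprints differ in only a few bits are probably the same picture resized or re-saved. What counts as "a few" is a threshold you would have to tune yourself.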

For finding files which are identical I use Gemini.

I use Gemini which I'm happy with. It certainly looks beyond file name, although I don't know if it goes any deeper than file properties and metadata.

As far as I'm aware, Gemini calculates a checksum of each file's contents (might be SHA-1) and then compares the checksums to find duplicates.
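For anyone curious, the general checksum approach looks roughly like the sketch below. This is not Gemini's actual code, just an illustration using Python's standard library. The function names and the size pre-filter are my own assumptions, and SHA-1 is used only because it was the guess above.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-1 hex digest of a file, read in chunks to limit memory use."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return lists of paths under `root` whose contents have the same checksum."""
    # Pass 1: group by size, since files of different sizes cannot be identical.
    by_size = defaultdict(list)
    for dirpath, _dirs, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Pass 2: hash only the files that share a size with at least one other file.
    by_digest = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue
        for path in paths:
            by_digest[(size, file_digest(path))].append(path)

    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

Calling `find_duplicates("/Volumes/MyDrive")` (a hypothetical path) would return groups of matching files. It only reports them, so review each group before deleting anything.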
 

Basilfawltyone

macrumors regular
Sep 2, 2013
106
5
Chicago, IL
As long as you do not have a common "home" for your data... you will continue to have a disorganized mess. No software will rid you of that.

You need a single place to keep your data (or at least categories of data). Then make sure that any data (of that category) is always put there. That way, everything else is just a copy, and can be deleted.

My best advice is to use specific tools for specific types of data. Examples:

Aperture/iPhoto for pictures. Insert everything... and then organize there... deleting dupes as you organize. Import using the "do not import duplicates" option... that might remove a lot during import.

iTunes or similar as a music database.

DevonThink or similar as a document database.

As soon as you start this... make sure that the single "home" for your data is well backed up. I would recommend at least two different backup mechanisms... one local, one remote. My recommendation is Time Machine and Crashplan. Also... making a clone can come in handy as well. I use CCC.

/Jim

I just bought DevonThink to go paperless. One of the neat functions is the duplicate finder! I didn't know about it when I got DevonThink, but it works great.

They even have something called replicants, a kind of duplicate that really is the original: when you change any one of the files, the others change accordingly.

I imported all my PDFs, docs, Pages files, emails, etc. into the DevonThink app and identified 17,000 duplicates!
 