Software to eliminate dupes of songs in Windows iTunes?

Discussion in 'Mac Apps and Mac App Store' started by papa deuce, Dec 3, 2009.

  1. papa deuce macrumors member

    Mar 7, 2004
    Does such a piece of software exist?

    Honestly, I think this needs to be built into iTunes itself...
  2. GGJstudios macrumors Westmere


    May 16, 2008
    No, you can identify dupes, but iTunes has no way of knowing if you have them there intentionally. I have many dupes, as a song appears on multiple albums, some of them live, some acoustic, etc.
  3. papa deuce thread starter macrumors member

    Mar 7, 2004
    Right, but I am wondering if software exists that can analyze the tracks, find exact dupes, and then delete them. Not just go by tags.
  4. GGJstudios macrumors Westmere


    May 16, 2008
    Tags are the only thing that they can go by. You could have two completely different songs that are identical in size, length and bitrate. Only with tags can you identify duplicates.
  5. Lloyd Christmas macrumors regular

    May 12, 2009
    Sadly, this is one major drawback of iTunes. I have close to 6 gigs' worth of duplicates. I don't want to spend the time going through them because, as GGJstudios stated, many songs are from different CDs and acoustic versions. I hope they change this feature in the next version. It would really be nice. Lloyd
  6. applesupergeek macrumors 6502a

    Nov 20, 2009
    Have a look at Doug's scripts; there might be something there.

    I don't understand, though, what the problem actually is. If you have a dup track that's actually an acoustic version of the other track, won't it be tagged differently?

    I think Doug's scripts have some dup remover that compares tags and, if it finds everything identical, removes one of them. I don't know whether it actually checks the size of the file.

    By the way, if you do download anything from Doug's site, get the excellent script manager too: a very nice little floating black window (styled like some Snow Leopard/iPhone elements) with all your scripts in it. I can't do without it. If you want any suggestions for good scripts, give us a shout.
  7. eatbacon macrumors regular

    Feb 4, 2003
    That is the goofiest thing I have ever heard. Of course you can compare two files to see if they are identical. Two totally different songs would obviously be different.
  8. steve-p macrumors 68000


    Oct 14, 2008
    Newbury, UK
    Two identical songs with different tags would also be different though at byte level, so a comparison would have to look inside each file and distinguish between metadata and music data.

    Consider the scenarios:

    1. File length and byte-level content are the same - must be a duplicate (but this could be spotted by looking at the tags which must also be the same)

    2. File length is the same but byte-level content differs - could still be a duplicate if the tags differ (but have the same total length), or are identical but in a different order, and the music data is the same

    3. File length differs but main tags are the same - could be the same song encoded by two different encoders - technically a duplicate (if bitrate is the same), but only identifiable by the main tags being identical

    4. File length differs and tags differ - could still be a duplicate - only extracting the music data from within the file and comparing it byte for byte would tell

    5. The same song from the same CD encoded by different encoders where the tags don't match - impossible to detect duplicates since nothing matches

    6. The same song from more than one CD - regardless of tags or encoder, the running time is usually slightly different so this could not be detected as a duplicate even if a human would say it was a duplicate by looking at the tags

    Although it is possible to identify some duplicates by looking inside the file, the tags are still the most effective way of doing it - which is why it's so important to get the tags right. And also probably why iTunes duplicate finder only works on tags and expects a human to make the final decision.
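    To illustrate the "look inside the file" idea from scenarios 2 and 4, here is a minimal sketch that strips the tags from an MP3 and hashes only what is left. It assumes plain MP3 files with an ID3v2 tag at the start and/or an ID3v1 tag at the end; the function names are made up for illustration, and it does not handle AAC/M4A files, which keep their metadata in a different container structure.

```python
import hashlib


def audio_payload(data: bytes) -> bytes:
    """Return MP3 file bytes with a leading ID3v2 tag and trailing ID3v1 tag removed."""
    start, end = 0, len(data)

    # ID3v2: 10-byte header "ID3" + version (2) + flags (1) + size (4).
    # The size is "synchsafe": only the low 7 bits of each byte are used.
    if len(data) >= 10 and data[:3] == b"ID3":
        size = (
            ((data[6] & 0x7F) << 21)
            | ((data[7] & 0x7F) << 14)
            | ((data[8] & 0x7F) << 7)
            | (data[9] & 0x7F)
        )
        start = 10 + size
        if data[5] & 0x10:  # footer flag: a 10-byte footer follows the tag
            start += 10

    # ID3v1: fixed 128-byte block at the very end, starting with "TAG".
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128

    return data[start:end]


def audio_digest(data: bytes) -> str:
    """SHA-256 of the audio portion only, so tag edits do not change the result."""
    return hashlib.sha256(audio_payload(data)).hexdigest()
```

    Two files whose `audio_digest` values match carry the same encoded music data even if their tags (and therefore their total lengths) differ. It still cannot catch scenarios 5 and 6, where the encoded audio itself is different.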
  9. GGJstudios macrumors Westmere


    May 16, 2008
    If you're not using the tags, on what criteria would you compare them, to be sure they're identical?
  10. applesupergeek macrumors 6502a

    Nov 20, 2009
  11. GermanSuplex macrumors 6502a


    Aug 26, 2009
    Not sure I would want anything doing this automatically for me. I have some tracks that appear dozens of times in iTunes because they're from different albums. The original album, a reissue, a compilation, a live album, etc. I wouldn't want a program deleting the wrong thing.

    What I do is just toss all duplicates into a playlist called "Duplicates" and make sure all my playlists exclude tracks from that playlist. That way, anytime I listen to a track, only it gets the playcount, not its duplicates.

    What I'd rather see is a feature that allows you to tie multiple songs together so that only one track gets the playcount updated. Take "My Girl" by The Temptations: a popular track that is on many, many compilation albums I own. It would be nice if I could play any one of the ten or so copies in my library and have only one of them get its playcount updated.
  12. GGJstudios macrumors Westmere


    May 16, 2008
    That won't work, as you can have two different songs with the same file size. Likewise, you can have two files that are duplicates of the same song, but have different file sizes... such as an extra second of silence before or after the song... or ripped with two different encoders. File size alone cannot determine duplicate songs.
  13. GermanSuplex macrumors 6502a


    Aug 26, 2009
    Maybe a checksum tag field that ignores the tag information? Say in the background it converts the song to uncompressed audio, creates the checksum and writes it into the tag of the file? Then it can compare checksums of audio types and remove duplicates.

    Of course, the obvious problem with that is time. It would take an insanely long time for larger libraries.

    Really, I don't think there's any good, reliable automated way of doing it without risk of losing tracks you want. If they ever did do a feature like this, Apple's support forums would be filled with people upset that iTunes or a third party software deleted their tracks.
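    For what it's worth, the checksum part could look roughly like the sketch below. It assumes the track has already been converted to an uncompressed WAV file by some other tool (that conversion is the slow part and is not shown), and `pcm_checksum` is just a name picked for this example. It uses only Python's standard `wave` and `hashlib` modules.

```python
import hashlib
import wave


def pcm_checksum(wav_path: str) -> str:
    """Hash the sample format and PCM frames of a WAV file, ignoring other chunks."""
    digest = hashlib.sha256()
    with wave.open(wav_path, "rb") as wav:
        # Include the format so identical bytes at a different sample rate
        # or channel count do not count as the same audio.
        fmt = (wav.getnchannels(), wav.getsampwidth(), wav.getframerate())
        digest.update(repr(fmt).encode("ascii"))
        # Read in blocks so large tracks are not loaded into memory at once.
        while True:
            frames = wav.readframes(65536)
            if not frames:
                break
            digest.update(frames)
    return digest.hexdigest()
```

    Keep in mind this only matches copies whose decoded audio is bit-identical. Two rips made with different encoders or settings will decode to slightly different samples and get different checksums.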
  14. applesupergeek macrumors 6502a

    Nov 20, 2009
    Wait, let's clear a few things up here.

    If the two files are ripped from the same CD track, then they won't have any extra silence. If they do have extra silence, they are not the same track, hence they are not duplicates; the "same song essentially", as you seem to be defining a duplicate, is not really a duplicate. Moreover, if they were ripped with different encoders, then again they are not duplicates; they are different items, better or worse quality rips of a particular CD track.

    But in order for all this to be meaningful, we have to define a duplicate as the same track from one particular CD ripped with the same encoder, perhaps differing only in filename and tags. If different CD tracks and different rippers are included in the mix, that just broadens this operational definition without any evident benefit and makes any actual comparison far less effective.

    Remember, I didn't say file size, I said BYTEWISE, which means that the duplicate remover actually compares same-size files byte by byte. There's no way you can get a false alarm this way.
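    A rough sketch of what I mean, using only Python's standard library: group files by size first (cheap), then do a full byte-by-byte comparison within each group. The function name is made up for this example, and it only reports groups; deciding what to delete is left to you.

```python
import filecmp
import os
from collections import defaultdict


def find_bytewise_duplicates(paths):
    """Return lists of paths whose contents are byte-for-byte identical."""
    # Files of different sizes can never be identical, so bucket by size first.
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    groups = []
    for same_size in by_size.values():
        remaining = same_size
        while len(remaining) > 1:
            first, rest = remaining[0], remaining[1:]
            # shallow=False forces a real content comparison, not just a stat check.
            matches = [p for p in rest if filecmp.cmp(first, p, shallow=False)]
            if matches:
                groups.append([first] + matches)
            remaining = [p for p in rest if p not in matches]
    return groups
```

    Two different songs that happen to have the same file size end up in the same bucket but fail the content comparison, so they are never reported together.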

    As with duplicate removers for general files, you have to decide on a speed/accuracy/error-rate trade-off and choose a combination of filename, byte comparison and, in the case of music tracks, tag information. With increased complexity, no automatic procedure will be 100% accurate, but that's the sacrifice if you want to have something do it for you.

    I did that to clean up my abstracts and PDF books archive, which was tens of GB in size. I used commercial comparison/removal software and accepted that I might have, say, a 0.5% error rate, which I would have to live with if I wanted this done automatically.

    With audio you can set it to remove tracks that are 95% and upwards bytewise similar which can account for same track, same encoder, but different filename and slightly different tags.
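    I don't know how the commercial tools compute that percentage internally, so treat the following only as a simple illustration of a similarity threshold. It counts how many byte positions match between two files and divides by the length of the longer one. The function names and the 0.95 default are assumptions for this example.

```python
def bytewise_similarity(a: bytes, b: bytes) -> float:
    """Fraction of positions holding the same byte, relative to the longer input."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty inputs are trivially identical
    same = sum(x == y for x, y in zip(a, b))
    return same / longest


def is_probable_duplicate(a: bytes, b: bytes, threshold: float = 0.95) -> bool:
    """True when the positional similarity reaches the chosen threshold."""
    return bytewise_similarity(a, b) >= threshold
```

    A position-by-position comparison like this only works when the tags take up the same amount of space in both files. If one tag is longer, all the audio data after it is shifted and the score drops sharply, so a real tool would need to line up the audio data first.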
