I'd start by looking at the inode numbers of the files in the backups. Among files that share the same date-agnostic trailing pathname, each unique inode number should be a different document version. That is, for two dates D1 and D2, if D1/folder/file and D2/folder/file both exist and have the same inode number, they are the same file on disk, so they are exact duplicates (not unique).
When TM makes a backup, it hard-links any file that hasn't changed since the prior backup into the new one, which avoids duplicating data. As a result, if backup N and backup N+1 contain a file at the same path with the same inode number, we can conclude that the file did not change between those two backups.
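As a rough sketch of the inode pass, assuming the backup root contains one directory per backup date and that unchanged files are hard-linked as described above, something like the following would collect one representative file per inode for each date-agnostic path. The function name `unique_versions` and the directory layout are my own assumptions, not anything TM provides.

```python
import os
import stat
from collections import defaultdict

def unique_versions(backup_root):
    """Map each date-agnostic relative path to one file per distinct inode.

    Assumes backup_root contains one subdirectory per backup date
    (hypothetical layout; adjust to match your actual backup tree).
    """
    # rel_path -> {(device, inode): first full path seen with that inode}
    versions = defaultdict(dict)
    for snapshot in sorted(os.listdir(backup_root)):
        snap_dir = os.path.join(backup_root, snapshot)
        if not os.path.isdir(snap_dir):
            continue
        for dirpath, _dirnames, filenames in os.walk(snap_dir):
            for name in filenames:
                full = os.path.join(dirpath, name)
                try:
                    st = os.lstat(full)
                except OSError:
                    continue  # unreadable entry; skip it
                if not stat.S_ISREG(st.st_mode):
                    continue  # ignore symlinks, sockets, etc.
                rel = os.path.relpath(full, snap_dir)
                # Same inode means hard link to a file already recorded.
                versions[rel].setdefault((st.st_dev, st.st_ino), full)
    return versions
```

Each entry in the result with more than one inode is a path that has more than one candidate version; those candidates are what still need a content comparison.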
Once the list has been whittled down using inode numbers, you should still compare file data, because some exact duplicates might remain. For example, if a file is moved or duplicated in the folder tree, it might get a different inode number even though its contents are identical to another file's (not unique). I wouldn't compare the files directly to find these; I'd compare hashes. Generate an MD5 for every file and store it in a list along with the file path, then sort by hash. Any files with the same hash are very likely to be exact duplicates. Any files with different hashes are guaranteed to be different.
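The hashing pass could look roughly like this. It's only a sketch: `md5_of` and `group_by_hash` are names I made up, and grouping in a dictionary stands in for the sort-by-hash step, since it puts equal hashes together in the same way.

```python
import hashlib
from collections import defaultdict

def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 of a file, reading it in chunks to limit memory use."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def group_by_hash(paths):
    """Return {md5: [paths]} for every hash shared by two or more files."""
    groups = defaultdict(list)
    for p in paths:
        groups[md5_of(p)].append(p)
    # Keep only hashes seen more than once: these are the likely duplicates.
    return {digest: ps for digest, ps in groups.items() if len(ps) > 1}
```

If "very likely" isn't good enough, a byte-for-byte comparison of the files inside each group would confirm them, and that's far fewer comparisons than checking every pair.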
At that point you may still have some duplication, but it would probably involve things like a Word file that was saved in compressed form in one place and uncompressed form in another, where the Word-related content is the same even though the raw data differs. I don't know a way to find duplicates like that (Word, PDF, Excel, etc.) except by somehow using the app the file belongs to.