06-28-2010, 12:31 PM   #1
dpayment
Connoisseur
Posts: 90
Karma: 618
Join Date: Oct 2007
Location: Ottawa
Device: PocketBook Pro 902, EB-1150, PRS505, PRS700, Jetbook, Hanlin V3, Kobo
Finding and Deleting Duplicate Files of Different Formats

I posted this question several months ago, and got three or four answers, but they were all things I had already tried. Anyway, here goes again, but I'll try to clarify it a little more:

I've now got a couple of thousand ebooks on my computer, many of which have the same content but in different formats (lrf, pdf, lit, epub, etc.). Also, many of these files either don't have the correct name or have variations of the name, because the sources of the files edited the titles before posting them. I've tried every search combination I can think of, in both basic and advanced search, and I've tried several duplicate finder programs. I've also looked in every FAQ I thought might have the answer, but I still can't find it. Does anyone know of a way or a piece of software to locate files with similar content but in different file formats, PLEASE???

Surely I'm not the first person to realize this is a problem. There must be some way to compare documents by their text and flag any file whose text matches 80%, 90% or 95% of another document's.
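
To make the idea concrete, here is a rough Python sketch of the kind of comparison I have in mind. It is only an illustration, not an existing program. It assumes every book has already been converted to plain text by some other tool (that step is not shown), and the function names, the 5-word window and the 0.8 threshold are all made-up choices for the example. It scores two texts by how many overlapping 5-word sequences they share (Jaccard similarity), which is only a stand-in for "80% of the text matches".

```python
import itertools
import re


def normalize(text):
    """Lowercase the text and reduce it to a list of plain words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def shingles(text, k=5):
    """Return the set of overlapping k-word sequences in the text."""
    words = normalize(text)
    if len(words) < k:
        # Very short text: treat the whole thing as a single sequence.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity(a, b):
    """Jaccard similarity of two shingle sets, from 0.0 to 1.0."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_likely_duplicates(texts, threshold=0.8):
    """texts maps a file name to its already-extracted plain text.

    Returns (name_a, name_b, score) for every pair of files whose
    score is at or above the threshold (an assumed, tunable value).
    """
    sets = {name: shingles(body) for name, body in texts.items()}
    matches = []
    for (name_a, set_a), (name_b, set_b) in itertools.combinations(sets.items(), 2):
        score = similarity(set_a, set_b)
        if score >= threshold:
            matches.append((name_a, name_b, score))
    return matches
```

Comparing every pair like this would be slow for a couple of thousand full-length books, and different conversions of the same book may need a lower threshold, so treat it as a description of what I'm after rather than a finished tool.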

I know I can do this handraulically, but that's going to be a real pain! Any help would be greatly appreciated.

Thanks,
Dan