06-12-2009, 10:46 AM   #1
earthq
Enthusiast
earthq began at the beginning.
 
Posts: 30
Karma: 10
Join Date: Jun 2009
Device: none
Duplicates Management DCR

Hi,

I would like to submit a design change request (DCR) for duplicates management, if it is feasible and other users like the idea.

Today (as of 0.5), when you have two or more ebooks with the same "parsed" title (as extracted by the RegEx in the configuration), a dialog pops up asking whether to allow those duplicates.

However, if the same book comes in with a different filename or "parsed" title, or if a metadata fetch later updates the existing record, dups are not detected, even when metadata is fetched again and both books end up with exactly the same information.

So some ideas ...

- Enable "duplicates" reporting that lets the user select which fields should match. A simple free-text field for entering an SQL-ish query (plus some samples) would be enough. Yes, we can use an SQLite administration tool to run those queries, but it would be great if this were integrated, so you could hit Del and delete the dupe straight away.

- Enable metadata lookup as an option on import, so that a second copy of an already existing book would be reported as a dupe.

- Enable optional ebook hashing on import, and as a batch job on existing books. That way, even if the same book file happens to get a different author and title (e.g. through a badly parsed or mis-entered ISBN), a "same file" check can still detect dups.
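
To make the first idea a bit more concrete, here is a rough sketch of the kind of "SQL-ish" duplicates report I have in mind. Everything in it is an assumption for illustration only: the `books` table, its `title`/`author`/`isbn` columns, and the helper names `build_demo_db` and `find_duplicates` are made up and are not Calibre's actual schema or code.

```python
import sqlite3

# Hypothetical set of columns the user may choose to match on.
ALLOWED_FIELDS = {"title", "author", "isbn"}


def build_demo_db():
    """Create an in-memory database with a made-up `books` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, author TEXT, isbn TEXT)"
    )
    conn.executemany(
        "INSERT INTO books (id, title, author, isbn) VALUES (?, ?, ?, ?)",
        [
            (1, "Sample Title", "A. Author", "X-0001"),
            (2, "sample title ", "A. Author", "X-0001"),    # same book, sloppy title
            (3, "Sample Title", "Another Name", "X-0001"),  # same ISBN, different author
            (4, "Other Book", "B. Writer", "X-0002"),
        ],
    )
    return conn


def find_duplicates(conn, fields):
    """Return lists of book ids that match on every column in `fields`.

    Values are compared case-insensitively and with surrounding
    whitespace ignored. Note that rows with NULL in a chosen column
    would be grouped together by SQLite.
    """
    if not fields or set(fields) - ALLOWED_FIELDS:
        raise ValueError("fields must be a non-empty subset of %s" % sorted(ALLOWED_FIELDS))
    # Column names come from the whitelist above, so building the SQL
    # text from them is safe here.
    keys = ", ".join("lower(trim(%s))" % f for f in fields)
    sql = "SELECT GROUP_CONCAT(id) FROM books GROUP BY %s HAVING COUNT(*) > 1" % keys
    groups = [sorted(int(i) for i in row[0].split(",")) for row in conn.execute(sql)]
    return sorted(groups)


if __name__ == "__main__":
    conn = build_demo_db()
    print(find_duplicates(conn, ["title", "author"]))
    print(find_duplicates(conn, ["isbn"]))
```

With the sample rows above, matching on title and author catches books 1 and 2 only, while matching on ISBN also catches book 3. That is the reason I would like the fields to be selectable.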

Let me know what you think!

Regards
06-12-2009, 04:18 PM   #2
Sabardeyn
Guru
Sabardeyn ought to be getting tired of karma fortunes by now.
Posts: 644
Karma: 1242364
Join Date: May 2009
Location: The Right Coast
Device: PC (Calibre), Nexus 7 2013 (Moon+ Pro), HTC HD2/Leo (Freda)
Theoretically it sounds good, but...
  • Assumes no typographical errors occur within the database or files.
  • Assumes metadata is available, included and accurate.
  • Comparing (imported) filenames presupposes that everyone is using the same book-naming system. Post-import filenames will work.
  • A hash cannot work if it is based on the file contents. There are way too many variations between ebook formats, let alone someone editing a personal copy.
Despite that, it might work, at least well enough to bring things to the user's attention for further review. The only remaining question is whether it is feasible to implement and use. It would seriously slow down the import of large libraries.

There would need to be some way to turn this off. Someone may have, and want, multiple copies of the same book for various reasons.
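
For what it's worth, a content hash check is easy to sketch, and the sketch shows its limits as well. The code below is only an illustration under my own assumptions (the names `file_digest` and `group_identical_files` and the `enabled` switch are hypothetical, not anything in Calibre): it groups files that are byte-for-byte identical and nothing more, so an edited copy or a different format of the same book ends up with a different hash.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def group_identical_files(paths, enabled=True):
    """Group files whose contents are byte-identical.

    `enabled` stands in for a user preference: when it is False the
    check is skipped entirely and no groups are reported.
    """
    if not enabled:
        return []
    by_digest = defaultdict(list)
    for p in paths:
        by_digest[file_digest(p)].append(Path(p))
    # Only digests shared by two or more files count as duplicates.
    return [sorted(group) for group in by_digest.values() if len(group) > 1]
```

Reading every file in full is also where the import cost for large libraries comes from, which is one more reason to keep the check optional.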

Last edited by Sabardeyn; 06-12-2009 at 04:20 PM.
All times are GMT -4.