Old 02-18-2010, 04:13 PM   #1
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Changing default Add Book behavior - Comments?

The default Add Book options seem to work like this:

Option 1 is to add books from a single directory. Each book is added as a separate book. You point and choose the books you want to add. Different formats of the same book are added as separate records.

Option 2 adds books from multiple directories. Each directory is assumed to be a single book, so different formats are added as the same book, regardless of file name.

Option 3 also adds books from multiple directories. Its label says it "assumes every ebook file is a different book," but that doesn't actually seem to be true. If the filenames are the same but the extensions differ, this option behaves like the second option. If a filename differs, however, it adds that file as a different book, and adds all other files with the same name but a different format extension into that record.
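A minimal sketch of the grouping described for Option 3, assuming files are matched on directory plus base name and that the extension stands for the format. The function name `group_by_stem` and the sample paths are hypothetical; this is not calibre's actual code.

```python
from collections import defaultdict
from pathlib import PurePath


def group_by_stem(paths):
    """Group paths that share a directory and base name into one candidate book.

    Hypothetical helper: the key is (directory, base name) and the value is
    the list of formats (upper-cased extensions) found for that key.
    """
    groups = defaultdict(list)
    for p in map(PurePath, paths):
        groups[(str(p.parent), p.stem)].append(p.suffix.lstrip(".").upper())
    return dict(groups)


files = ["lib/Dune.epub", "lib/Dune.txt", "lib/Emma.lit"]
books = group_by_stem(files)
# Two candidate books: Dune with two formats, Emma with one.
```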

That works well when adding a new book.

I've noticed, however, that it's less effective in the long term, when you try to add a new format for a book that's already in the database. For example, if you run option 3 a second time on the same directory, you get duplicate books. It nicely sorts the formats into the new records it creates, but it's not aware of the older records for the same books.

I've looked at the code, and basically, it looks at the title of the new books it is trying to add. If that title exists in the database, it's a duplicate, and you are asked if you want to add it as a duplicate record. It does not look at the author to see if this is an identical book in a different format that might logically be added as a new format for an existing record.

I've decided I need that feature as I try to bring my ebook library of the last 20+ years into Calibre. I'm past the halfway point and most of my books seem to already be in the library in text format (the first format I added). To address my personal needs, I've modified the code to function as follows:

When an attempt to add a book is made, it checks the database and finds all books by the same author. It then compares the title of the new book to the titles of those books. If the new title is sufficiently similar to an existing book title by the same author, it adds the book as another format of the existing book. "Sufficiently similar" means that it ignores case and any leading articles ("the", "a", "an", etc.).

This meets my needs and has greatly improved the speed of adding my existing books into calibre. I can drop a few hundred books onto the main screen and it sorts them into the existing records, creating new records when the title/author is new, and warning me that it will create duplicates only when the title matches, but not the author (this last is the remnant of the current behavior - I catch all the other true identical "duplicates" where both author and title match.)
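The three outcomes described above (add as a format, warn about a duplicate, create a new record) can be sketched roughly as follows. Everything here is an assumption for illustration: `library` is a hypothetical dict of book id to (title, author), and `simple_key` is a deliberately small stand-in for the "sufficiently similar" test, not the code from the post.

```python
import re


def simple_key(title):
    """Lowercase the title and drop one leading article (illustrative only)."""
    return re.sub(r"^(the|a|an)\s+", "", title.strip().lower())


def classify(new_title, new_author, library):
    """Decide what to do with an incoming book.

    Returns ('add_format', id) when author and title both match,
    ('warn_duplicate', id) when only the title matches,
    and ('new_record', None) otherwise.
    """
    key = simple_key(new_title)
    title_only_hit = None
    for book_id, (title, author) in library.items():
        if simple_key(title) != key:
            continue
        if author == new_author:  # exact author match required
            return ("add_format", book_id)
        if title_only_hit is None:
            title_only_hit = book_id
    if title_only_hit is not None:
        return ("warn_duplicate", title_only_hit)
    return ("new_record", None)


library = {1: ("The Stand", "Stephen King"), 2: ("Emma", "Jane Austen")}
```

Whether the title-only warning should use the same relaxed comparison as the author-plus-title match is a design choice the post doesn't spell out; the sketch uses one comparison for both.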

My question is whether this would be useful for others, and if so, how should it be integrated into calibre? I overwrite an existing format whenever I add a new copy of the book in the same format. Some might hate that. Some may need to enter duplicate records of the same format.

I could certainly display a list of "identical books" the same way it currently displays a list of "duplicate books" and ask for permission first. I could avoid overwriting existing formats and only add new ones. I could put in a variety of options that allow reversion to the old behavior, or that control format overwrite, but I know Kovid hates option clutter (for good reason).

I'm just not sure what optimal behavior is for other needs.

Currently the code has some minor oddities: it produces multiple records when there are multiple formats of a new book that has the same title as a book by another author. That should be easy to fix when I've got time.

I'm going to be out of action for a while, but if anyone has any thoughts on this, I'd love to hear them. If no one has any interest, or this would break current expectations, I won't spend the time to clean up the code. (Which I would definitely need to do so I don't look too bad if I send it to Kovid - he's seen enough of my hack jobs already!)

Thanks for any suggestions or comments.

Last edited by Starson17; 02-18-2010 at 04:15 PM.
Old 02-18-2010, 06:17 PM   #2
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.
 
Posts: 25,454
Karma: 4961459
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
I would suggest that rather than modifying existing behavior, we can add a new menu entry under the add books button that has your modified algorithm.
Old 02-22-2010, 10:03 PM   #3
ebartley
Junior Member
ebartley began at the beginning.
 
Posts: 5
Karma: 10
Join Date: Jun 2008
Location: Brooklyn NY
Device: Kindle
I found this thread searching on "overwrite book" because I wanted to replace existing copies while preserving my metadata, so I would greatly appreciate this option.
Old 02-23-2010, 03:51 AM   #4
HarryT
eBook Enthusiast
HarryT ought to be getting tired of karma fortunes by now.
 
Posts: 62,782
Karma: 40397151
Join Date: Nov 2006
Location: UK
Device: PW2, iPad Retina Mini, iPhone 4, MS Surface Pro, Onyx T68, N7,
Kovid,

I opened a ticket sometime last summer (I'm afraid I don't recall the number) at your request, suggesting that it would be useful to have the option of "adding a new format to an existing book" when dragging a new book to Calibre. I suggest that Calibre could offer three choices when it detected a duplicate:

1. Overwrite the existing book.
2. Add as a new format for an existing book.
3. Add as a new catalog entry.

I think that's essentially the same as the previous poster is asking for here.
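One way to picture the three choices, as a sketch only: `OnDuplicate`, `apply_choice`, and the dict-based `library` (book id to a dict of format to data) are hypothetical names, and "overwrite the existing book" is read here as replacing the stored files with the new one.

```python
from enum import Enum


class OnDuplicate(Enum):
    OVERWRITE = 1    # replace the existing book's files
    ADD_FORMAT = 2   # attach the file as another format of the existing book
    NEW_ENTRY = 3    # create a separate catalog entry


def apply_choice(library, book_id, fmt, data, choice):
    """Apply the user's choice and return the id of the affected record."""
    if choice is OnDuplicate.OVERWRITE:
        library[book_id] = {fmt: data}
        return book_id
    if choice is OnDuplicate.ADD_FORMAT:
        library[book_id][fmt] = data
        return book_id
    new_id = max(library) + 1  # next free id in this toy library
    library[new_id] = {fmt: data}
    return new_id
```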
Old 02-23-2010, 09:43 AM   #5
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by kovidgoyal
I would suggest that rather than modifying existing behavior, we can add a new menu entry under the add books button that has your modified algorithm.
I'm back, so you're going to have to put up with my posts again. Adding an additional option is the perfect solution. I don't know why I didn't think of it. Once I'm back up to full speed again, I'll revise the proposed code and file a ticket.
Old 02-23-2010, 10:33 AM   #6
theducks
Grand Sorcerer
theducks ought to be getting tired of karma fortunes by now.
 
Posts: 14,293
Karma: 5495472
Join Date: Aug 2009
Location: The (original) Silicon Valley, USA
Device: Galaxy Tab 2, Astak Pocket Pro, K4NT
My 2 cents: make a dummy "Add Format" entry on the existing main menu that simply reminds the user to use the metadata edit form of the title.

The current underlying logic is correct. Calibre needs to know exactly which title to add the new format to.
Old 02-23-2010, 03:43 PM   #7
ebartley
Junior Member
ebartley began at the beginning.
 
Posts: 5
Karma: 10
Join Date: Jun 2008
Location: Brooklyn NY
Device: Kindle
What about bulk additions? The existing edit metadata tools only let you replace files in one item at a time. Matches might not be perfect, but in the cases where there's one and only one exact match to a title (or title/author combination) we can be pretty certain which item to replace/add to.

I imported hundreds of ebook files (all legally acquired!) from Baen before I realized that the format I'd chosen didn't have a table of contents. Many of these now have series information entered into the metadata.

On one hand, serves me right for not testing more carefully, and frankly I'll have to do most of the work a second time just by *downloading* all the files all over again. (I'd have to do that even if I'd saved the files instead of deleting them after seeing that they opened within calibre: the .lit files don't have a TOC, or at least not one that imported successfully, whereas the .prc/.mobi files do.)

On the other hand, I'd rather not start reimporting them from scratch if I don't have to, and even a thoughtful and experienced user might decide to reacquire multiple ebooks in a better source format if they became available at a later date. (Possibly, in the future, because a better source format starts to exist!)

On the third tentacle, checking out the source myself has to be a good thing, even if my experience with python was in much smaller projects.
Old 02-27-2010, 01:47 PM   #8
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by ebartley
What about bulk additions?
Bulk addition is what I desperately needed this option for. With the current design, once an ebook entry is made, there's no way to update it or to add new formats to it.

I've finished the code changes that make it do what I need. All that's left is to submit them, but I have serious doubts they will be accepted.

When I wrote this post, I was thinking that I needed to change the default behavior. The reason I thought that was that I needed all the pulldown options for Add Books that Kovid already provided, but I needed them to operate differently and more automatically.

When Kovid suggested adding another pulldown choice to the Add Books list, it made sense to me (of course, I was on post-surgical pain meds and half out of my noodle at the time). I later realized that I desperately needed drag and drop for bulk additions. So, instead of adding three more menu options to duplicate Kovid's three, I opted for a checkbox option on the Preferences page. Option creep strikes again. That's reason 1 why this may not be accepted.

Reason 2 is that for bulk additions that allow drag/drop bulk updates of existing formats, I overwrite identical books with the same format. I want this function to allow easy update with better copies, but I can see others might choose this option, then scream when they lose their book by overwriting a good copy with a bad one. (I keep all original formats.)
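To make the overwrite risk concrete, here is a small sketch of merging incoming formats into an existing record. The function `merge_formats` and its `overwrite` flag are hypothetical; the behavior described in the post corresponds to `overwrite=True`, while `overwrite=False` illustrates the "only add new formats" alternative raised earlier in the thread.

```python
def merge_formats(existing, incoming, overwrite=True):
    """Merge incoming {format: path} into existing {format: path} in place.

    Returns the list of formats whose previous copy was replaced, so a
    caller could warn the user about them.
    """
    replaced = []
    for fmt, path in incoming.items():
        if fmt in existing:
            if not overwrite:
                continue  # keep the copy already in the record
            replaced.append(fmt)
        existing[fmt] = path
    return replaced


record = {"TXT": "old.txt"}
lost = merge_formats(record, {"TXT": "new.txt", "EPUB": "new.epub"})
# lost names the formats that were overwritten; record now holds both formats.
```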

The third concern is that I identify the "same" book by requiring an exact author match, but only a nearly exact title match (sort of a fuzzy match). I compare titles by ignoring case and any leading articles ("the", "a", "an"), changing underscores and periods to spaces, and stripping punctuation. So for example:

"The Diary of a Madwoman" matches all these (and vice versa):

"The Diary Of A Madwoman"
"Diary of a Madwoman"
"The_Diary_of_a_Madwoman"
"The Diary; of a Madwoman"
"the.diary.of.a.madwoman"

but not:
"The Diary Of Madwoman"
"The Diary of a Madwomen"
"Diaries of a Madwoman"
"Diary of a Madwoman, The"
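The title comparison described above can be approximated with a few regular expressions. This is a sketch under the stated rules (ignore case, drop one leading article, turn underscores and periods into spaces, strip other punctuation); the name `title_key` is hypothetical and the actual submitted code may differ.

```python
import re


def title_key(title):
    """Reduce a title to a comparison key; two titles match if their keys are equal."""
    t = title.lower()
    t = re.sub(r"[_.]", " ", t)           # underscores and periods become spaces
    t = re.sub(r"[^\w\s]", "", t)         # strip the remaining punctuation
    t = re.sub(r"\s+", " ", t).strip()    # collapse runs of whitespace
    t = re.sub(r"^(the|a|an) ", "", t)    # drop one leading article
    return t


def same_title(a, b):
    return title_key(a) == title_key(b)
```

Run against the lists above, the first five variants compare equal to "The Diary of a Madwoman" and the last four do not; "Diary of a Madwoman, The" fails because only a leading article is dropped.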

This really helps with bulk updates, but it does make it slightly easier to overwrite a book by the same author. I think the risk is really low that a wrong book will be overwritten, but it's highly likely that the same format will be overwritten, so a bad format may displace a better one. I've added the best cautions I can think of in the option description and tooltip, but...

Default behavior is unchanged (checkbox is default off).

OK, that's enough. I've got it set the way I would want it written, and that's my first guide. I can comment that it's sped up my bulk entry process by a factor of 10. Like the post I'm responding to, I screwed up my initial entry. I should have simultaneously entered all formats for each book, but I didn't. I entered all my txt formats first, since that was organized best and included a copy of almost all my books. The result, however, was that it made bulk additions of all my other formats nearly impossible. I've been trying to fix it manually for months. Now I can just drag and drop, and everything sorts correctly, with minor title differences being ignored.

If anyone has comments before I submit it, feel free to post them.
Old 02-27-2010, 02:11 PM   #9
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.
 
Posts: 25,454
Karma: 4961459
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Have you seen

http://bugs.calibre-ebook.com/ticket/4797
Old 02-27-2010, 02:39 PM   #10
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by kovidgoyal
Have you seen http://bugs.calibre-ebook.com/ticket/4797
Yes, that was my proposal to simplify drag/drop adding of multiple ebooks of different formats. (I suggested using the shift key to signal when drag/dropping multiple ebooks that they should all be considered the same book.) I was tearing my hair out trying to speed up my entry process. It had stretched for months, and one particular annoyance was that drag/dropping multiple formats of a single ebook always created multiple entries. I was adding a single book, opening it to edit, then dragging in the other 2 formats, rinse, repeat. It was taking forever.

I still think it's a good idea, but it is nowhere near as helpful as the code I've got now. With the new code I can drag/drop 150 files comprising 3 formats of each of 50 books and they all end up where I want: in existing records where possible, and otherwise in new records with the multiple formats all in the same record. The shift signal would require 50 drag/drops and I'd still need to deal with the issue that any existing records for the same book would block the new formats from being added.

Last edited by Starson17; 02-27-2010 at 02:50 PM.
Old 02-27-2010, 02:42 PM   #11
kovidgoyal
creator of calibre
kovidgoyal ought to be getting tired of karma fortunes by now.
 
Posts: 25,454
Karma: 4961459
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Oops, I should have looked at who opened the ticket.
Old 02-27-2010, 02:58 PM   #12
guyanonymous
Guru
guyanonymous has much to be proud of
 
Posts: 692
Karma: 27532
Join Date: Dec 2007
Device: Ebookwise 1150 / 1200
I like the idea, very much, of being able to add multiple books, via drag and drop, to the same book already present in Calibre in an easy manner.

That said, I'd also enjoy being able to select 2+ books in Calibre that are the same book, just different formats (it happens) and merge them all under one title heading.
Old 02-27-2010, 03:42 PM   #13
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by guyanonymous
That said, I'd also enjoy being able to select 2+ books in Calibre that are the same book, just different formats (it happens) and merge them all under one title heading.
Yes, merger of existing records is something I'd very much like to have, too. It's on my personal list to do if no one does it first. I've been slowly ramping up the difficulty of the projects I attempt. Let's see if the current one makes the grade, and if so, I may attempt merger next. This project has made me familiar with most of the code I'd need to change and I think I'm getting close to the required skill level.

I like the way the code described above works, but it does rely on exact author matches. So Arthur Clarke is a different author from Arthur C Clarke, who is different from Arthur C. Clarke. It was too risky to try to automatically do fuzzy matches on both author and title. That means you still need some manual cleanup of bulk adds, and record merger would make that cleanup much easier.

I'm thinking you'd select the surviving record first, then the other records you want to merge into it, right click, select merge and they'd all get merged into the first selected record. Sound about right to you?
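A rough sketch of that merge, assuming the first selected record survives. `merge_records` and the dict-based `library` (book id to a dict of format to path) are hypothetical, and keeping the survivor's copy when both records hold the same format is an assumption; the thread doesn't settle that point.

```python
def merge_records(library, selected_ids):
    """Merge every selected record into the first one and return its id."""
    if len(selected_ids) < 2:
        raise ValueError("select at least two records to merge")
    survivor, *others = selected_ids
    for book_id in others:
        # Remove the merged record and move its formats to the survivor.
        for fmt, path in library.pop(book_id).items():
            # On a format clash, the survivor's existing copy is kept.
            library[survivor].setdefault(fmt, path)
    return survivor


library = {
    1: {"TXT": "a.txt"},
    2: {"EPUB": "a.epub", "TXT": "b.txt"},
    3: {"LIT": "a.lit"},
}
kept = merge_records(library, [1, 2, 3])
```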
Old 02-27-2010, 04:29 PM   #14
Starson17
Wizard
Starson17 can program the VCR without an owner's manual.
 
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Proposed code added as Ticket #5016.
Old 02-27-2010, 05:25 PM   #15
guyanonymous
Guru
guyanonymous has much to be proud of
 
Posts: 692
Karma: 27532
Join Date: Dec 2007
Device: Ebookwise 1150 / 1200
RE: merge -> that's how I considered it.

Perhaps whichever title you right click upon is the one they get merged into?

Kudos for expanding the calibre world! I wish I had the programming gene.
All times are GMT -4.