Old 03-14-2010, 11:16 AM   #1
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
How the new 0.6.45 Add new formats to existing books option works

A how-to for a new feature:

Version 0.6.45 added some code that simplifies adding new formats to existing book records. Just drag the new ebook format (or formats) into the main screen and you're done. For each book, Calibre checks the database for an existing book with an identical author and a nearly identical title; if the match is close enough, it adds the new format to the existing book entry. You can also use any of the add book options in the pull-down menu next to the Add Books icon.

This makes it much easier to bring big collections of books into Calibre. I've dragged in hundreds of books at a time, and they all get sorted into the correct existing ebook record if they are just new formats, or into new records if there isn't already one in the database.

Kovid added a nice notification feature to my code so that after all books are entered, Calibre will list any author/title matches it found. If you drag 5 different formats of the same new book in, it will add the first one, then notify you of the last 4 matches found that were merged into the first record.

For this to work, you need the following:

The option in Preferences|Add/Save needs to be turned on. It's the one that says "If books with similar titles and authors found .."

Next, your book has to actually have a sufficiently similar author and title to the existing book, and you have to have Calibre configured so it can correctly figure out the title and author of the new book.

I prefer to leave the option to "read metadata only from filename" on, then make sure the regex will correctly read the author and title from the filename. If the internal metadata of the book being added correctly specifies the author/title, then you can leave the "read metadata only from filename" off and ignore the regex.
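To illustrate the filename route, here is a minimal Python sketch of pulling the title and author out of a filename with a regex. The pattern assumes files named "Title - Author.ext", and the function name parse_filename is made up for this example; it is not Calibre's own code, and your regex in Preferences|Add/Save may look quite different.

```python
import os
import re

# Hypothetical pattern for names like "Title - Author.epub".
# The named groups "title" and "author" are an assumption for this sketch.
FILENAME_PATTERN = re.compile(r"(?P<title>.+) - (?P<author>[^_]+)")


def parse_filename(path):
    """Return (title, author) guessed from a filename, or (stem, None)."""
    # Drop the directory and the extension, keeping only the bare name.
    stem = os.path.splitext(os.path.basename(path))[0]
    match = FILENAME_PATTERN.fullmatch(stem)
    if match is None:
        # No " - " separator: treat the whole name as the title.
        return stem, None
    return match.group("title").strip(), match.group("author").strip()
```

If the regex cannot find both parts, the author comes back empty here, which is the situation where the merge check has nothing reliable to compare against.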

All 3 of these (2 options and regex) are on the same page: Preferences|Add/Save.

In addition to adding the notification feature, Kovid changed my code slightly to handle multiple authors differently and to prevent overwrites of formats you already have. (I preferred to overwrite older copies with the newer copy of the same format, but I understand why he preferred the opposite.)

Multiple authors are now handled as follows: If a new book has multiple authors and the existing ebook record does not, a new ebook record is created. If a new book has a single author and the existing ebook record has multiple authors, the new book is merged (assuming one of the authors matches and the title matches closely enough). (I was lazy with the original code and just matched the first author of the new book against any author of the existing record, so multiple author new books were merged into single author records when author/title otherwise matched.)
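The author rule above could be sketched roughly like this in Python. The function name authors_allow_merge is invented for illustration, names are compared ignoring case as the matching rules require, and the branch where both books have multiple authors is my assumption, since that case isn't spelled out here.

```python
def authors_allow_merge(new_authors, existing_authors):
    """Sketch of the author check; the title check is a separate step."""
    new = {name.casefold() for name in new_authors}
    existing = {name.casefold() for name in existing_authors}

    # Multi-author new book vs. single-author record: create a new record.
    if len(new) > 1 and len(existing) <= 1:
        return False

    # Single-author new book: merge if that author is on the existing record.
    if len(new) == 1:
        return new <= existing

    # Both multi-author: not described in the post. ASSUMPTION: require
    # the same set of authors.
    return new == existing
```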

How closely do the author/titles have to match:

Author names must match exactly, except for case.

Title matches also ignore case. Leading articles ("the", "a" and "an") are ignored, as are most non-alphanumerics. Hyphens, dots (periods) and underscores are replaced with a space before matching. Brackets of various types (parentheses, angle, curly, square) and punctuation such as colons and semicolons are removed entirely. Multiple whitespace characters are condensed to a single space. (These changes are only for the comparison, to determine if the new book should be merged into an existing record with a slightly different name.)

Examples:
"The Diary of a Madwoman" matches all these (and vice versa):

"The Diary Of A Madwoman"
"Diary of a Madwoman"
"The_Diary_of_a_Madwoman"
"The Diary; of a Madwoman"
"the.diary.of.a.madwoman"
"The.diary of.a.madwoman"

but not:
"The Diary Of Madwoman"
"The Diary of a Madwomen"
"Diaries of a Madwoman"
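
For anyone who wants to see the title rules as code, here is a rough Python sketch that reproduces the examples above. It is not the code that went into Calibre; the function name normalize_title, the order of the steps and the exact set of characters removed are my assumptions based on the description.

```python
import re


def normalize_title(title):
    """Reduce a title to the form used only for comparison."""
    text = title.lower()                        # ignore case
    text = re.sub(r"[-._]", " ", text)          # hyphen, dot, underscore -> space
    text = re.sub(r"[()\[\]{}<>:;]", "", text)  # drop brackets, colons, semicolons
    text = re.sub(r"\s+", " ", text).strip()    # condense whitespace
    text = re.sub(r"^(the|a|an)\s+", "", text)  # ignore a leading article
    return text


def titles_match(first, second):
    """Two titles match when their normalized forms are identical."""
    return normalize_title(first) == normalize_title(second)
```

Note that nothing here tries to handle plurals or typos, so "Madwoman" and "Madwomen" still come out different.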

Existing formats are not overwritten. (However, my testing shows that you will still get a message saying a merger occurred.)

How to replace a format:

The easiest way is to right-click the existing book, select "Remove books", then "Remove files of a specific format", remove the format from the old record, then drag in the new format ebook.
Old 03-14-2010, 01:00 PM   #2
guyanonymous
Guru
Posts: 692
Karma: 27532
Join Date: Dec 2007
Device: Ebookwise 1150 / 1200
I look forward to trying it....thanks!

Oh....any idea what it'll do with file names of the format:

Comic Book #1.cbz
Comic Book #2.cbz
etc...

Will it put them into one, or all in separate entries? How big does the difference have to be?
Old 03-14-2010, 01:19 PM   #3
HarryT
eBook Enthusiast
Posts: 65,474
Karma: 43935573
Join Date: Nov 2006
Location: UK
Device: Kindle Voyage, iPad Mini, iPhone 4, MS Surface Pro, N7
Excellent - that's a feature I've been after for a long time.
Old 03-14-2010, 01:27 PM   #4
pilotbob
Grand Sorcerer
Posts: 19,634
Karma: 11390499
Join Date: Jan 2007
Location: Tampa, FL USA
Device: Kindle Touch
Nice feature... great work!

BOb
Old 03-14-2010, 01:52 PM   #5
Starson17
Wizard
Quote:
Originally Posted by guyanonymous View Post
Oh....any idea what it'll do with file names of the format:

Comic Book #1.cbz
Comic Book #2.cbz
etc...

Will it put them into one, or all in separate entries? How big does the difference have to be?
What is the author name for those two books? What is the book title? Everything depends on what Calibre determines the author/title to be. Filename is only relevant if you have told it to get the title from the filename, and if you have, then you have to look at your regex to see how you have told Calibre to parse the filename. For purposes of comparison (to decide if two book titles match or not), numbers are not ignored, provided they are part of the title.

I say "provided they are part of the title" because you could have a regex that stripped numbers from the filename, so that even if the filename has numbers, the book title derived from that filename by way of your regex, no longer has any numbers in it.

Although I don't have many .cbz files, they should be handled the same as any other format.

If the title of those books is "Comic Book #1" and "Comic Book #2" Then they are different books and it will add them as two entries. If the title is "Comic Book" then they are the same title. If the authors are the same, then they will look like the same book in the same format, but since it won't overwrite formats, and won't add duplicates, it won't do anything other than give you a message.
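
The three outcomes in that paragraph (new entry, format merged, nothing added) can be sketched in Python like this. The library structure and the function name add_format are made up for the example, and the comparison is simplified to a case-insensitive exact match rather than the full fuzzy title rules.

```python
def add_format(library, author, title, fmt):
    """Add one file to a toy library and report what happened.

    `library` is a hypothetical dict mapping (author, title) keys to the
    set of formats already stored for that book.
    """
    # Simplified key: the real comparison also normalizes the title.
    key = (author.casefold(), title.casefold())
    formats = library.get(key)

    if formats is None:
        library[key] = {fmt}
        return "new record"

    if fmt in formats:
        # Existing formats are never overwritten; only a message results.
        return "match reported, nothing added"

    formats.add(fmt)
    return "format merged"
```

With titles "Comic Book #1" and "Comic Book #2" this gives two new records. With both titles reduced to "Comic Book" by a number-stripping regex, the second .cbz is neither added nor used to overwrite the first.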

Edit: If you have turned on the option described here, but not the "get metadata only from filename" option, and tried to drag those two files into Calibre, it would try to get the metadata from inside the file. I don't know if .cbz files have internal metadata, but for formats that do not have internal metadata, Calibre will use the filename to get the minimum info it needs. I don't recall exactly what author/title it will choose from those filenames.

Last edited by Starson17; 03-14-2010 at 02:11 PM.
Old 03-14-2010, 02:06 PM   #6
pilotbob
Grand Sorcerer
Quote:
Originally Posted by Starson17 View Post
Examples:
"The Diary of a Madwoman" matches all these (and vice versa):

but not:
"The Diary of a Madwomen"
Why? Those look exactly the same to me?

BOb
Old 03-14-2010, 02:15 PM   #7
HarryT
eBook Enthusiast
One is "woman"; the other is "women".
Old 03-14-2010, 02:18 PM   #8
pilotbob
Grand Sorcerer
Quote:
Originally Posted by HarryT View Post
One is "woman"; the other is "women".
DOH! I looked at it 20 times and just couldn't see that.

BOb
Old 03-14-2010, 02:27 PM   #9
Starson17
Wizard
Quote:
Originally Posted by HarryT View Post
One is "woman"; the other is "women".
Yeah, it's easy to miss. I was just trying to show that the fuzzy matching doesn't try to match plurals, or typos. I tried to deal with all the cases I personally saw while adding 14,000 books. I was more aggressive than Kovid wanted to be about overwriting the same format, so I was conservative about matching titles. I didn't want to overwrite anything accidentally.

The post above about Comic Book #1 vs. Comic Book #2 convinces me that Kovid was right. I'd hate to have dozens of issues get lumped into a single record/format because the user's regex was set to ignore numbers, then have him throw away the originals, thinking they were safely stored in Calibre.