10-20-2010, 12:06 PM   #11
Starson17
Wizard
Quote:
Originally Posted by chaley
Authors, and indeed most items, are normalized in the DB sense. Each exists once in some table.

However, spelling variations are not automatically corrected. For example, outside of merge processing, calibre does not consider any of "Lawrence, D H", "Lawrence, DH", "Lawrence, D.H.", "D H Lawrence", or "Lawrence, D" to be the same author. The merge code may detect some of these because it strips punctuation before doing the compare.
I'm not sure whether the "DB" in question is the Calibre DB or one of the online metadata source DBs (the last time I tried to count them, there were 6).

With respect to Calibre, the autosort/automerge code used when Adding Books does strip punctuation from titles before comparing them against the books already in the library. However, it uses that "normalized" title only for the comparison with existing book titles and for nothing else. It also strips punctuation from titles only, not from authors.
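To make the title-only comparison concrete, here is a rough sketch of the idea in Python. It is not Calibre's actual code: title_key and same_book are hypothetical names, stripping only ASCII punctuation is my assumption, and so is collapsing the leftover whitespace.

```python
import string

# Translation table that deletes ASCII punctuation (assumption: the real
# code may use a different definition of "punctuation").
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def title_key(title):
    """Hypothetical helper: title with punctuation removed, used only for comparing."""
    # Collapsing runs of whitespace is an assumption, so that
    # "Sons & Lovers" and "Sons Lovers" end up with the same key.
    return " ".join(title.translate(_PUNCT_TABLE).split())


def same_book(incoming, existing):
    """Compare two (author, title) pairs.

    Titles are compared through title_key; authors are compared verbatim,
    since the author is not stripped of punctuation.
    """
    in_author, in_title = incoming
    ex_author, ex_title = existing
    return in_author == ex_author and title_key(in_title) == title_key(ex_title)
```

With this sketch, ("Lawrence, D.H.", "Women in Love.") matches ("Lawrence, D.H.", "Women in Love"), but it does not match ("Lawrence, DH", "Women in Love") because the author strings differ.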

The Merge code that merges existing records (as opposed to autosort/automerge) doesn't do anything to the author or title. The author and title of the first selected book are always kept and those of the other books are dropped. The one exception is a value of "Unknown": in that case, the first not-Unknown author or title encountered in the merge selection list is used, if there is one.
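The keep-first rule can be sketched as follows. Again, this is only an illustration and not the Merge code itself: the function names and the dict layout for a book record are made up for the example.

```python
UNKNOWN = "Unknown"


def merged_field(values):
    """Pick the value that survives a merge, given values in selection order.

    The first value wins unless it is "Unknown"; then the first
    not-Unknown value wins. If every value is "Unknown", it stays "Unknown".
    """
    for value in values:
        if value != UNKNOWN:
            return value
    return UNKNOWN


def merge_author_title(books):
    """books: list of dicts with "author" and "title" keys (hypothetical layout)."""
    author = merged_field([b["author"] for b in books])
    title = merged_field([b["title"] for b in books])
    return author, title
```

Note that author and title are resolved independently, so the surviving author can come from the second book while the surviving title still comes from the first.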

BTW, the reason I think of the automerge option as also being "autosort" is that when a large block of files is added, the code separates the incoming files into three distinct groups:
1. Files that are merged into an existing book record (author/title matched, but the record didn't have that format).
2. Files that are added as a new book record (no match on author/title).
3. Files that are not added (author/title/format all matched, so the file could not be merged into the matching book record).
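A minimal sketch of that three-way split, under some assumptions of mine: the library is modeled as a dict from an (author, title) key to a set of format strings, book_key and sort_incoming are hypothetical names, and a file added earlier in the same batch counts as "existing" for later files in that batch.

```python
import string

_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def book_key(author, title):
    """Hypothetical match key: author verbatim, title with punctuation stripped."""
    return (author, " ".join(title.translate(_PUNCT_TABLE).split()))


def sort_incoming(library, incoming):
    """Split incoming (author, title, fmt) files into merged, added, skipped.

    library maps book_key -> set of formats already present. It is copied,
    so the caller's dict is left unchanged.
    """
    library = {key: set(formats) for key, formats in library.items()}
    merged, added, skipped = [], [], []
    for author, title, fmt in incoming:
        key = book_key(author, title)
        if key not in library:
            # No author/title match: becomes a new book record.
            library[key] = {fmt}
            added.append((author, title, fmt))
        elif fmt in library[key]:
            # Author, title and format all match: nothing to merge.
            skipped.append((author, title, fmt))
        else:
            # Author/title match, format is new: merged into the record.
            library[key].add(fmt)
            merged.append((author, title, fmt))
    return merged, added, skipped
```

For a library holding only an EPUB of "Sons and Lovers" by "D H Lawrence", an incoming MOBI of the same book lands in merged, a second EPUB lands in skipped, and an EPUB credited to "Lawrence, D H" lands in added because the author string doesn't match.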