09-19-2011, 05:56 AM   #4
ldolse
Not to discourage you from coming up with examples (if they're good ones they can be acted upon), but Calibre also leans towards false negatives rather than false positives in questionable situations, i.e. if it's debatable whether a line should be unwrapped or not, it will leave the hard break.

A common example is:
"Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor."
Proper Name said.

It annoys some users that Calibre doesn't unwrap this, but from an algorithmic standpoint it's extremely difficult to tell whether the above is one sentence or two.

Leaving the hard break in place when it's really one sentence is annoying, but as a human you always recognize it when it happens and can fix it manually. However, if you remove the hard break and it should have been two sentences, the dialogue can be fundamentally changed, and it's not so easy for a human to detect whether the author really meant both of those items to be in a single paragraph. If you even notice the oddity, you'll need to dig out the original file/book to check.
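
To make the trade-off concrete, here is a rough sketch of what a "prefer false negatives" unwrap rule could look like. This is not Calibre's actual code: the function names, the regex, and the specific rules (keep the break after sentence-ending punctuation, only join when the next line starts with a lowercase letter) are assumptions made up for illustration.

```python
import re

# Hypothetical rule, for illustration only (not Calibre's real heuristics):
# a line "ends a sentence" if it finishes with . ! ? or : optionally
# followed by closing quotes or a closing parenthesis.
ENDS_SENTENCE = re.compile(r'[.!?:]["\'\u201d\u2019)]*$')


def is_safe_to_unwrap(line, next_line):
    """Return True only when the break is very likely a soft wrap."""
    line = line.rstrip()
    next_line = next_line.lstrip()
    if not line or not next_line:
        return False  # blank line: treat as a paragraph boundary
    if ENDS_SENTENCE.search(line):
        return False  # debatable case: leave the hard break alone
    # Only join when the next line clearly continues the sentence.
    return next_line[0].islower()


def unwrap(lines):
    """Join lines whose break is safe to remove; keep all other breaks."""
    out = []
    for line in lines:
        if out and is_safe_to_unwrap(out[-1], line):
            out[-1] = out[-1].rstrip() + " " + line.lstrip()
        else:
            out.append(line)
    return out
```

With these assumed rules, the dialogue example above is left as two lines, because the first line ends in a period plus a closing quote and the next starts with a capital letter. A line that breaks mid-sentence, such as "Lorem ipsum dolor sit" followed by "amet, consectetur.", gets joined. The cost is exactly the annoyance described: some breaks that should have been removed stay in, but nothing gets merged that a human would then have to check against the original.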