Old 09-01-2014, 09:34 PM   #19
Tex2002ans
Wizard
Quote:
Originally Posted by DiapDealer
I seem to see a lot of blame being laid at smartening algorithms' doorsteps instead of the users who apply them without discrimination.

Word processors such as Microsoft Word will most likely auto-smartify while you are typing, but they will not re-fix the smart quotes if you delete/add text later (or copy/paste text from another source). Yes, this can be disabled, but most people won't bother.

I also want to pull my hair out over the many Content Management Systems (CMS) and sites that automatically apply their smartening algorithms to text. People will just use the built-in tools to type, or copy/paste their document in, and it gets auto-smartified once it is published to their WordPress or whatever, whether the input text used smart or dumb quotes.

It is a dilemma, though: would you want the algorithm to start completely from scratch to fix mistakes, or would you want it to leave things alone where you deliberately put the correct quotes in certain positions?

Now that I have gotten a hawk eye for mismatched punctuation, I see the minor errors caused by auto-smartening left and right!!! I would rather just have dumb quotes (like the MobileRead forums) than have auto-smartified stuff.

You also have measurements like 4'6"20°. To actually be PROPER, you would use a PRIME character (′) + DOUBLE PRIME character (″), but the Smarten Punctuation algorithms insert a RIGHT SINGLE QUOTE (’) + RIGHT DOUBLE QUOTE (”) instead.

Proper: 4′6″20°
Stay Dumb (still ok): 4'6"20°
Smartened (wrong): 4’6”20°

So if you have these measurements in your paragraph, the smarten algorithms can also get confused and mangle the quotation marks further along in the paragraph.
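
One way around this is to turn the measurement marks into primes BEFORE the smartening step ever sees them. Here is a rough Python sketch of that idea. It is not what Calibre or any other tool mentioned here does, the function name is made up for illustration, and the pattern is deliberately conservative: it only touches the digits-apostrophe-digits-quote shape (like 4'6"), so other forms of measurement would still need to be handled by hand.

```python
import re

# Assumption: a "measurement" is digits, a straight apostrophe, digits,
# then a straight double quote (e.g. 4'6"). Anything else is left alone.
_FEET_INCHES = re.compile(r"(\d+)'(\d+)\"")

def primes_for_measurements(text: str) -> str:
    """Swap the straight quotes in 4'6"-style measurements for PRIME and DOUBLE PRIME."""
    return _FEET_INCHES.sub(
        lambda m: f"{m.group(1)}\u2032{m.group(2)}\u2033",
        text,
    )

print(primes_for_measurements("4'6\"20\u00b0"))
```

Once the primes are in place, there are no straight quotes left in the measurement for a smartening pass to misread as closing quotes.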

Also, I have seen many of these algorithms take into account ONLY the "dumb quotes" instead of starting completely from scratch. So any text which accidentally holds some smart quotes will throw them off (think back again to copying/pasting material from another source).
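
"Starting completely from scratch" could be as simple as flattening every curly quote back to its dumb equivalent before smartening, so the algorithm only ever sees one kind of input. A minimal Python sketch, assuming you are happy to throw away any smart quotes that were placed by hand (the helper name is hypothetical):

```python
# Map the four common curly quotes back to their straight equivalents.
# Primes (U+2032, U+2033) are intentionally NOT in the table, so proper
# measurement marks survive untouched.
_TO_DUMB = str.maketrans({
    "\u2018": "'",   # LEFT SINGLE QUOTATION MARK
    "\u2019": "'",   # RIGHT SINGLE QUOTATION MARK
    "\u201c": '"',   # LEFT DOUBLE QUOTATION MARK
    "\u201d": '"',   # RIGHT DOUBLE QUOTATION MARK
})

def dumb_down(text: str) -> str:
    """Return text with curly quotes replaced by straight quotes."""
    return text.translate(_TO_DUMB)

print(dumb_down("\u201cDon\u2019t\u201d"))
```

The trade-off is exactly the dilemma above: this wipes out deliberately placed quotes along with the accidental ones.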

OR, I have seen certain algorithms mangle quotes that are right next to an opening/closing HTML tag. Calibre's Smarten Punctuation algorithm causes a handful of these, if my memory serves me right. The next book I stumble across with it, I will definitely have to gather real examples.

Quote:
Originally Posted by DiapDealer
Most have options to granularly choose which elements you want to "smarten." For example, I tend to leave ellipses out of my smartening attempts--focusing only on quotation marks and dashes.

Hmmm... what are the Smarten Punctuation tools you use that allow such granularity? I would LOVE to upgrade!

I save Smartening Punctuation as one of the final steps, and I always keep a Before/After EPUB. I then do a very thorough code compare to see EXACTLY what punctuation was smartened, and fix up any mistakes caused. Luckily, FineReader is able to OCR a lot of the Smart Quotes to match the quotes in the source document, so I only have to double-check a handful of comparisons where the Smarten Algorithm ≠ the OCR text.
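
Any diff tool will do for the Before/After compare. Just to illustrate the idea, here is a small Python sketch using the standard library's difflib that lists every spot where the smartened text differs from the original. The function name and the sample strings are made up, and it assumes you have already pulled the text of one HTML file out of each EPUB.

```python
import difflib

def punctuation_changes(before: str, after: str):
    """Yield (position, old, new) for every span that differs between the two texts."""
    matcher = difflib.SequenceMatcher(None, before, after, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            yield i1, before[i1:i2], after[j1:j2]

before = 'He said "hi" at 4\'6".'
after = "He said \u201chi\u201d at 4\u20196\u201d."

for pos, old, new in punctuation_changes(before, after):
    print(f"{pos}: {old!r} -> {new!r}")
```

Reading down a list like this makes it easy to spot a measurement that got a closing quote instead of a prime.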

I used to use Modify ePub's Smarten Punctuation, but I have shifted over to Calibre's Smarten Punctuation because it handles a few of those edge cases better.

Most people would just push the Smarten button and move on, never seeing exactly what it changed.