Old 02-12-2020, 01:23 PM   #31
Jellby
frumious Bandersnatch
Posts: 7,544
Karma: 19001583
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
Quote:
Originally Posted by DNSB View Post
I do check first to make sure that someone has not used one of those strings but so far, that has not happened.
I use ¬, |, ¦, @ as temporary replacement characters. Easy to input from my keyboard, unlikely to be present in any novel-like text (but always check first).

You may get surprises with ’em, days’, ’tis, evermo’, etc. Not to mention ‘em and ‘tis (I have seen them too often).
Old 02-12-2020, 03:00 PM   #32
DNSB
Bibliophagist
Posts: 45,210
Karma: 168808723
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
Quote:
Originally Posted by Jellby View Post
I use ¬, |, ¦, @ as temporary replacement characters. Easy to input from my keyboard, unlikely to be present in any novel-like text (but always check first).

You may get surprises with ’em, days’, ’tis, evermo’, etc. Not to mention ‘em and ‘tis (I have seen them too often).
Hence the post-checking for those cases, especially on some of my wife's books where they are attempting a Cockney or similar accent.
Old 02-12-2020, 04:01 PM   #33
JSWolf
Resident Curmudgeon
Posts: 79,356
Karma: 145488914
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
Originally Posted by DNSB View Post
It's a pretty simplistic solution. First I convert the 4 flavours of curly quotes into strings of ~ and % and then convert those back into curly quotes.

“ => ~%~% => ‘
” => %~%~ => ’
‘ => ~%%~ => “
’ => %~~% => ”

So the sentence I used in my sample comes out looking like this after the first pass:

Code:
<p>John said ~%%~I asked Bill and he said ~%~%You should be ashamed of yourself.%~%~ %~~%</p>
I do check first to make sure that someone has not used one of those strings but so far, that has not happened. I also use a cleanup search for a double curly quote between two letters which gets replaced by a right single curly quote if needed. Given that you should never see a curly quote in the code segment, this pretty much covers my needs. These are part of my saved searches.
That should work rather well for what I want. It will make reading some books easier.
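
For anyone who would rather script it, here is a minimal Python sketch of the two-pass swap quoted above. The function names (to_placeholders, from_placeholders, swap_quotes) are made up for illustration. The safety check is also stricter than the one described: it refuses any text that already contains a ~ or %, which is an assumption of this sketch so that the second pass cannot mistake a stray character for part of a placeholder.

```python
import re

# Quote -> placeholder, and placeholder -> swapped quote, as in the quoted table.
FIRST_PASS = {"“": "~%~%", "”": "%~%~", "‘": "~%%~", "’": "%~~%"}
SECOND_PASS = {"~%~%": "‘", "%~%~": "’", "~%%~": "“", "%~~%": "”"}

# One left-to-right scan over all four placeholders, so adjacent
# placeholders are read in 4-character steps and never overlap.
PLACEHOLDER = re.compile("|".join(re.escape(p) for p in SECOND_PASS))


def to_placeholders(text):
    """First pass: replace each curly quote with its ~/% string."""
    if "~" in text or "%" in text:
        # Assumption of this sketch: stricter than checking for the
        # four full strings only.
        raise ValueError("text already contains ~ or %; pick other placeholders")
    for quote, placeholder in FIRST_PASS.items():
        text = text.replace(quote, placeholder)
    return text


def from_placeholders(text):
    """Second pass: turn each placeholder into the swapped quote."""
    return PLACEHOLDER.sub(lambda m: SECOND_PASS[m.group(0)], text)


def swap_quotes(text):
    return from_placeholders(to_placeholders(text))
```

In Python a single str.translate call could do the swap without any placeholders; the two passes only mirror what has to happen in an editor's search and replace.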
Old 02-12-2020, 05:04 PM   #34
Quoth
Still reading
Posts: 13,798
Karma: 103895653
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper
I'd use ¬ and ^ or ~

It depends on context. I've a UK keyboard, Linux (more AltGr characters) and the CapsLock mapped as Compose, so plenty of choice. I first search to make sure the replacement character isn't used.
þ ß ð ĸ ł § « » · Ω ® Ŧ đ ŋ etc. are on AltGr
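
The "search first" step is easy to script. A minimal Python sketch, assuming the whole book text is already in one string; the function name and the default candidate list are only illustrative:

```python
def unused_placeholders(text, candidates="¬|¦@¤"):
    """Return the candidate characters that do not occur anywhere in text."""
    return [c for c in candidates if c not in text]
```

Anything the function returns is safe to use as a temporary replacement character for that particular text.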
Old 02-12-2020, 05:20 PM   #35
Tex2002ans
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by JSWolf View Post
Mind posting your steps please? Thanks.
Nothing has really changed in the American “” (Outer) ‘’ (Inner) <-> British ‘’ (Outer) “” (Inner)...

See the 2016 topic, "How to convert straight quotes to smart 'curly' typographer's quotes" (especially my post #26, but read the entire topic). Same exact people have the same exact comments 4 years later.

But I'll go into a bit more detail here.

Flipping Inner/Outer Quotes

The vast majority could be flipped with no issue (using Jellby's "intermediate symbol" method). Each placeholder stands for the quote it will end up as:
  • ¬ -> “
  • | -> ”
  • ¦ -> ‘
  • @ -> ’

(Personally, I wouldn't use the @, too common. But you can pick any obscure Unicode characters, like ¤ CURRENCY SIGN (U+00A4))

Then you just have to search/replace the INNER/OUTER symbols to their opposite.

Outer -> Inner:
  • “ -> ‘
    • “ -> ¦
  • ” -> ’
    • ” -> @

Inner -> Outer:
  • ‘ -> “
    • ‘ -> ¬
  • ’ -> ”
    • ’ -> |

* * *

"Actual Apostrophe" Note: You could do this pass before or after substitution. It's up to you.

Look for very common apostrophe word-endings:
  • ’s
  • ’ll
  • ’d

Mark those with a different symbol for "actual apostrophe" (let's pick the ¤ CURRENCY SIGN):
  • It¤s
  • I¤ll
  • You¤d

You could even mark words that begin with an apostrophe (an actual RIGHT SINGLE QUOTE):
  • ¤tis
  • ¤til
  • ¤em

You'll also have to check possessives that end with s + apostrophe:
  • Zeus’
  • Chris’

(These have to be decided on a case-by-case basis, because you don't know if it's the end of a quotation or not.)

* * *

Now you replace all those intermediate symbols with the new INNER/OUTER quotations.

Outer -> Inner:
  • ¦ -> ‘
  • @ -> ’

Inner -> Outer:
  • ¬ -> “
  • | -> ”

"Actual Apostrophe":
  • ¤ -> ’
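
The steps above can be sketched in Python roughly as follows. This is only an illustration under some assumptions: the apostrophe pass runs before the flip, only the endings and leading-apostrophe words listed above are marked, and the names (flip_quotes, FLIP, RESTORE) are invented here. Possessives like Zeus’ are deliberately left alone, since those need a case-by-case decision.

```python
import re

# Original quote -> placeholder for its flipped counterpart.
FLIP = {"“": "¦", "”": "@", "‘": "¬", "’": "|"}
# Placeholder -> final character, matching the restore lists above.
RESTORE = {"¦": "‘", "@": "’", "¬": "“", "|": "”", "¤": "’"}

# Common apostrophe endings: ’s, ’ll, ’d after a letter.
ENDINGS = re.compile(r"(?<=\w)’(?=(?:s|ll|d)\b)")
# Words that start with an apostrophe: ’tis, ’til, ’em.
LEADING = re.compile(r"(?<!\w)’(?=(?:tis|til|em)\b)", re.IGNORECASE)


def flip_quotes(text):
    """Swap outer/inner curly quotes while keeping marked apostrophes."""
    clash = [c for c in RESTORE if c in text]
    if clash:
        raise ValueError(f"placeholders already in text: {clash}")
    # 1. Mark "actual apostrophes" so the flip leaves them alone.
    text = ENDINGS.sub("¤", text)
    text = LEADING.sub("¤", text)
    # 2. Flip every remaining quote to its placeholder.
    text = text.translate(str.maketrans(FLIP))
    # 3. Restore placeholders to the new quotes and apostrophes.
    return text.translate(str.maketrans(RESTORE))
```

Since str.translate works in a single pass, Python would not strictly need the placeholders. They are kept so that the intermediate text matches what you would see in an editor, and so a mismatch check can run between steps 2 and 3.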

Look For Mismatches

Afterwards, like the above 2016 topic discusses, you have to do a more thorough check (like Toxaris's Dialogue Check) which looks for mismatching sets of outer/inner quotes.

You could roughly get it with some Regex that looks for these in the same paragraph:
  • opening quote + opening quote
  • closing quote + closing quote

(If you're doing this, you'll probably want to do this while you still have "Actual Apostrophes" marked as ¤... so they don't interfere.)

Note: Pure Regex will definitely miss a few though. You need something slightly more intelligent that checks both forwards/backwards for mismatches (Toxaris's Dialogue Check is the only thing I know of that does this).
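
As a rough illustration of the regex idea (not a substitute for a proper dialogue checker), here is a Python sketch that flags paragraphs containing two opening double quotes with no closing one between them, or two closing ones with no opening one between them. It assumes one paragraph per line; the function name is hypothetical.

```python
import re

# Opening quote followed by another opening quote before any closing one.
OPEN_OPEN = re.compile(r"“[^”\n]*“")
# Closing quote followed by another closing quote before any opening one.
CLOSE_CLOSE = re.compile(r"”[^“\n]*”")


def suspicious_paragraphs(text):
    """Return 1-based numbers of lines that look mismatched."""
    hits = []
    for number, paragraph in enumerate(text.splitlines(), start=1):
        if OPEN_OPEN.search(paragraph) or CLOSE_CLOSE.search(paragraph):
            hits.append(number)
    return hits
```

The same two patterns with ‘ and ’ would cover single quotes, but only while apostrophes are still marked with a different symbol.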

Note 2: Like Jellby says, there are typically a lot of wrong LEFT/RIGHT errors in lots of ebooks (especially around em dashes), so a program may want to squash a lot of those BEFORE doing the flipping steps.

Note 3: Depending on the language, you may have even more "actual apostrophes" in the middle of words... like French uses a lot of:
  • l’influence
  • d’économie

so another "actual apostrophe" check may just be assume anything in the middle of a word is one:

Search: (\w)’(\w)
Replace: \1¤\2
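
For reference, the same search/replace in Python, as a small sketch with made-up function names. One thing to watch, at least in Python: because the capturing version consumes the letter after the apostrophe, it skips the second apostrophe in something like rock’n’roll. A lookaround variant (my own addition, not part of the search above) catches both.

```python
import re


def mark_capturing(text):
    """The search/replace as written: (\\w)’(\\w) -> \\1¤\\2."""
    return re.sub(r"(\w)’(\w)", r"\1¤\2", text)


def mark_lookaround(text):
    """Variant that only looks at the neighbours without consuming them."""
    return re.sub(r"(?<=\w)’(?=\w)", "¤", text)
```

Running the capturing version twice would also pick up the stragglers.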

Last edited by Tex2002ans; 02-12-2020 at 05:39 PM.
Old 02-12-2020, 05:28 PM   #36
hobnail
Running with scissors
Posts: 1,584
Karma: 14328510
Join Date: Nov 2019
Device: none
Quote:
Originally Posted by JSWolf View Post
That should work rather well for what I want. It will make reading some books easier.
And while you're at it you can fix all those misspelled words that the British insist on using; catalogue, criticise, programme, colour, etc.

Last edited by hobnail; 02-12-2020 at 05:56 PM.
Old 02-12-2020, 05:50 PM   #37
Quoth
Still reading
Posts: 13,798
Karma: 103895653
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper
Quote:
Originally Posted by hobnail View Post
And while you're at it you can fix all those misspelled words that the Americans insist on using; catalog, criticize, program, color, etc.
Invented by an American guy compiling a dictionary. UK dictionaries are usage based, not prescriptive.
Actually I believe the Author's regional spelling should be used. You did mean while, not why? Typos can be fixed. The style of Quotes and Layout are publisher things, and UK / Irish publishers use all the different conventions, both single and double quoted text. Consistent house styles for decades.
Old 02-12-2020, 05:57 PM   #38
hobnail
Running with scissors
Posts: 1,584
Karma: 14328510
Join Date: Nov 2019
Device: none
Quote:
Originally Posted by FrustratedReader View Post
You did mean while, not why?
Thanks, fixed.
Old 02-12-2020, 06:15 PM   #39
BetterRed
null operator (he/him)
Posts: 21,656
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
@Jellby - curious: in Spanish would you use an em dash (U+2014) '—' or a quotation dash (horizontal bar) (U+2015) '―'? In some fonts a quote dash is a tad shorter and heavier than an em dash, e.g. TNR.

Em dash : —

Quote dash : ―


BR
Old 02-12-2020, 06:38 PM   #40
BetterRed
null operator (he/him)
Posts: 21,656
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by Tex2002ans View Post
Note 2: Like Jellby says, there are typically a lot of wrong LEFT/RIGHT errors in lots of ebooks (especially around em dashes), so a program may want to squash a lot of those BEFORE doing the flipping steps.
This recently got fixed in Word, but I'm not sure if it was by me with a bit of blackart autocorrect trickery or if, after a 30+ year wait, a fix was issued by MS. I first came across the issue on IBM DISOSS/PROFS in the '70s.

If I had my druthers we'd all use corners to mark dialogue, they work for RTL, LTR and UP and DOWN languages - clever people them Chinese/Japanese/Koreans.

I've been experimenting with using Unicode emoji for temporary replacement markers - they're hard to miss if you fail to remove them. Forgot to say, via AHK (AutoHotkey).

BR

Last edited by BetterRed; 02-12-2020 at 06:42 PM.
Old 02-12-2020, 07:09 PM   #41
Tex2002ans
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by BetterRed View Post
@Jellby - curious: in Spanish would you use an em dash (U+2014) '—' or a quotation dash (horizontal bar) (U+2015) '―'? In some fonts a quote dash is a tad shorter and heavier than an em dash
See "No break space and alignment" from a few months ago.

That HORIZONTAL BAR/"quotation dash" character is missing on a lot of fonts... probably not safe to use in ebooks.

Quote:
Originally Posted by BetterRed View Post
This recently got fixed in Word, but I'm not sure if it was by me with a bit of blackart autocorrect trickery or if, after a 30+ year wait, a fix was issued by MS.
In what, Office 365? (What month?)

LibreOffice is still plagued with it, but they've been working on some autocorrect bugs lately. Maybe it might get squashed.

After any Smarten Punctuation, I always just do a search for things like:

”— (RIGHT DOUBLE QUOTE + EM DASH)
—“ (EM DASH + LEFT DOUBLE QUOTE)
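
If you want to run that search outside the editor, a minimal Python sketch could look like this. The function name and the context width are arbitrary choices; every hit still needs a human to decide whether the quote really points the wrong way.

```python
import re

# RIGHT DOUBLE QUOTE + EM DASH, or EM DASH + LEFT DOUBLE QUOTE.
SUSPECT = re.compile(r"”—|—“")


def dash_quote_suspects(text, context=15):
    """Return (matched pair, surrounding snippet) for each suspect spot."""
    hits = []
    for match in SUSPECT.finditer(text):
        start = max(0, match.start() - context)
        end = match.end() + context
        hits.append((match.group(0), text[start:end]))
    return hits
```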

Last edited by Tex2002ans; 02-12-2020 at 07:15 PM.
Old 02-12-2020, 07:15 PM   #42
BetterRed
null operator (he/him)
Posts: 21,656
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by Tex2002ans View Post
That HORIZONTAL BAR/"quotation dash" character is missing on a lot of fonts...
I know that, but my 'some fonts' contained an implied 'where horizontal-bar exists'

BR
Old 02-12-2020, 07:18 PM   #43
Tex2002ans
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by BetterRed View Post
I know that, but my 'some fonts' contained an implied 'where horizontal-bar exists'
lol, well next you're going to say NARROW NO-BREAK SPACE... where it exists!
Old 02-12-2020, 08:19 PM   #44
BetterRed
null operator (he/him)
Posts: 21,656
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by BetterRed View Post
This recently got fixed in Word, but I'm not sure if it was by me with a bit of blackart autocorrect trickery or if, after a 30+ year wait, a fix was issued by MS.
Quote:
Originally Posted by Tex2002ans View Post
In what, Office 365? (What month?)
Yes - which would be Word 2019 - no idea which month, MS installs new versions of Office 365 without so much as a '… and thank your mother for the rabbits', let alone a list of improvements and fixes. But on reflection, it may have come via the new version of the Transtools addin I installed recently.

IIRC the quote is initially shown as a straight quote and gets changed to a curly one when you press the space bar or Enter (which is off-putting); that makes me think it's in auto-correct or it's a macro. I've been typing quote, backspace, ctrl+alt+num-, and End for almost 30 years; no doubt I'll continue to do so.

BR
Old 02-12-2020, 08:25 PM   #45
BetterRed
null operator (he/him)
Posts: 21,656
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by Tex2002ans View Post
lol, well next you're going to say NARROW NO-BREAK SPACE... where it exists!
When I wrote that post - to Jellby - I was wondering what a Spanish printer might do.

BR