Old 03-23-2016, 05:03 AM   #16
Toxaris
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
Quote:
Originally Posted by JSWolf View Post
What I would like is a way to convert UK style quotes to US style quotes because UK style quotes just look unnatural.
It is just what you are used to. I find double quotes too big and obtrusive. But in the end it is about the story, not the quotes actually used. If the story is good, I don't even notice which style is used.
Old 03-24-2016, 06:24 AM   #17
fbrzvnrnd
Fanatic
Posts: 557
Karma: 400004
Join Date: Feb 2009
Device: ONYX M96
Quote:
Originally Posted by HarryT View Post
I don't see the relevance of this to converting straight quotes to curly quotes. Can you elaborate?
You can search for text in straight quotes using a regex and replace the quotes with curly ones. A regex gives you more tools to match the text you really need and skip the rest. The regex I wrote for book/article titles is an example.
Old 03-24-2016, 07:44 AM   #18
HarryT
eBook Enthusiast
Posts: 85,544
Karma: 93383099
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
Quote:
Originally Posted by fbrzvnrnd View Post
You can search for text in straight quotes using a regex and replace the quotes with curly ones. A regex gives you more tools to match the text you really need and skip the rest. The regex I wrote for book/article titles is an example.
Thanks.

I strongly suspect that we're all familiar with the use of regular expressions, but unfortunately they really don't help with the fundamental problem we're addressing here. You can certainly use a regex to find a single quote at the start of a word, but the problem is knowing whether to replace it with a left or a right curly quote. How would you address that issue?

Eg, if you find the string 'ello you need to replace the straight quote with a right single quote (the apostrophe of an abbreviated word), whereas if you find the string 'hello you need to replace it with a left single quote (opening speech or quotation mark).

Have you any suggestions to assist with this?
Old 03-24-2016, 08:17 AM   #19
fbrzvnrnd
Fanatic
Posts: 557
Karma: 400004
Join Date: Feb 2009
Device: ONYX M96
Using the example:

Quote:
What the 'ell are you doin'?

and:

'Today is Thursday', he said.
It is easy to use the start of a paragraph or an uppercase letter to find all the dialogue while skipping the 'ell.
I can try to catch all the text in straight quotes that begins with a lowercase letter. If that gives a good percentage of results, I can build a better regex by looking around: a space before, punctuation or the start of a paragraph before, uppercase, et cetera.

I am not saying I can make a 100% find/replace work with regex, but I can reach 99%, leaving some manual work here and there.
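
As a rough illustration of the heuristic described above, here is a minimal Python sketch. The function name and the exact rule (a straight quote at the start of the paragraph or after whitespace, followed by an uppercase letter, is treated as opening dialogue; every other straight single quote becomes a right quote) are assumptions made for this example, not a tested recipe.

```python
import re

LEFT, RIGHT = "\u2018", "\u2019"  # left and right single curly quotes


def smarten_single_quotes(paragraph: str) -> str:
    """Heuristic only: uppercase after the quote is taken to mean dialogue."""
    # Opening quote: at paragraph start or after whitespace, before an uppercase letter.
    text = re.sub(r"(^|\s)'(?=[A-Z])", lambda m: m.group(1) + LEFT, paragraph)
    # Everything left over (elisions such as 'ell and doin', closing quotes)
    # is turned into a right single quote.
    return text.replace("'", RIGHT)


print(smarten_single_quotes("What the 'ell are you doin'?"))
print(smarten_single_quotes("'Today is Thursday', he said."))
print(smarten_single_quotes("'Ello there."))
```

The first two calls handle the examples from this post as intended. The third shows the weak spot: 'Ello starts with an uppercase letter, so the sketch gives it a left quote although it is an elision, and dialogue that opens with a lowercase word would fail the other way round. Any real use would still need a manual check afterwards.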
Old 03-24-2016, 12:13 PM   #20
Toxaris
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
Quote:
Originally Posted by fbrzvnrnd View Post
Using the example:



It is easy to use the start of a paragraph or an uppercase letter to find all the dialogue while skipping the 'ell.
I can try to catch all the text in straight quotes that begins with a lowercase letter. If that gives a good percentage of results, I can build a better regex by looking around: a space before, punctuation or the start of a paragraph before, uppercase, et cetera.

I am not saying I can make a 100% find/replace work with regex, but I can reach 99%, leaving some manual work here and there.
I hate to burst your bubble, but you can't. Not 99% by a long shot. For regular documents it can be automated, but the examples here will mess things up. They don't follow the rules; after all, they could be the start of a normal sentence. And even if you could catch some cases, the rules will differ when another language is used.
Old 03-24-2016, 01:36 PM   #21
dwig
Wizard
Posts: 1,613
Karma: 6718541
Join Date: Dec 2004
Location: Paradise (Key West, FL)
Device: Current:Surface Go & Kindle 3 - Retired: DellV8p, Clie UX50, ...
Quote:
Originally Posted by Toxaris View Post
...And, even if you could catch some cases, when another language is used the rules will differ.
and even within one language there can be significant differences between dialects, as large as or larger than those between modern British English and American English.
Old 03-24-2016, 01:54 PM   #22
JSWolf
Resident Curmudgeon
Posts: 79,130
Karma: 144284184
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
When you do a smarten punctuation, you then have to read the book and check each punctuation mark.
Old 03-24-2016, 01:56 PM   #23
fbrzvnrnd
Fanatic
Posts: 557
Karma: 400004
Join Date: Feb 2009
Device: ONYX M96
I'm not talking about one regex for every book. I suggest *adapting* the regex to the context you are working with.
Old 03-24-2016, 02:20 PM   #24
HarryT
eBook Enthusiast
Posts: 85,544
Karma: 93383099
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
Quote:
Originally Posted by JSWolf View Post
When you do a smarten punctuation, you then have to read the book and check each punctuation mark.
You don't have to read the book, but you certainly have to search for apostrophes and quotation marks and check that they're all correct.
Old 03-24-2016, 05:39 PM   #25
JSWolf
Resident Curmudgeon
Posts: 79,130
Karma: 144284184
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
Originally Posted by HarryT View Post
You don't have to read the book, but you certainly have to search for apostrophes and quotation marks and check that they're all correct.
True that.
Old 03-31-2016, 10:57 PM   #26
Tex2002ans
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by eschwartz View Post
EDIT: Yes, DiapDealer's PunctuationSmarten plugin is your friend.
I second this. Or here is "Diap's Editing Toolbag" (Calibre version of the Plugin) which I use all the damn time:

https://www.mobileread.com/forums/sho....php?p=2980740

I find it to be much more helpful than Calibre's built in Smarten Punctuation because this one has options (such as don't touch ellipses, or don't touch dashes).

Side Note: My personal method is three rounds:

Round #1: Diapdealer's Toolbag, Smarten Punctuation.

Round #2: I run the book through a lot of the regex fixes (of common errors I have come across). I do my usual code cleanup + OCR fixing + everything else.

Round #3: As the final step, I run the final text through Toxaris's Dialogue Check.

Quote:
Originally Posted by HarryT View Post
Tools tend to have problems with words with initial apostrophes, particularly in books where the single apostrophe is also used for speech marks.
Yep, the automated tools do make quite a few mistakes (typically around em dashes, italics/other HTML tags, [...]).

The books I work on don't really have too many of the "'tis a jolly good day" + "'twas the night before Christmas" + "go get 'em", but I have this in my Saved Regexes:

Search: ‘(Em|em|Til|til|Tis|tis|Twas|twas)
Replace: ’\1

You can easily append whatever other words you need, separated by a pipe, which makes it easier to find the (wrong) Left Single Quotes and change them into the (correct) Right Single Quotes.

(I believe Diap's Toolbag also has an "exception" list if you wanted to take that route.)

I also have this Regex to handle years, such as "’90s":

Search: ‘([0-9])
Replace: ’\1
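
For anyone working outside the editor's search box, the two saved searches above can be sketched in Python roughly like this. The function name is made up for the example, and the trailing \b (so that words like ‘empty are left alone) is an addition that is not part of the saved regex above.

```python
import re

# Elided words from the saved regex above; append more with a pipe as needed.
ELISIONS = r"Em|em|Til|til|Tis|tis|Twas|twas"


def fix_leading_apostrophes(text: str) -> str:
    # Left single quote before a listed elision -> right single quote.
    # The \b is an assumption added here so that longer words are not touched.
    text = re.sub("\u2018(" + ELISIONS + r")\b", "\u2019\\1", text)
    # Left single quote before a digit (years such as '90s) -> right single quote.
    text = re.sub("\u2018([0-9])", "\u2019\\1", text)
    return text


print(fix_leading_apostrophes("\u2018Twas the \u201890s, so go get \u2018em."))
```

Run on text that has already been through Smarten Punctuation, this only touches left single quotes in front of the listed words or a digit. A quoted passage that genuinely opens with a year would still be changed, so the results are worth a quick look.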

Quote:
Originally Posted by JSWolf View Post
What I would like is a way to convert UK style quotes to US style quotes because UK style quotes just look unnatural.
I mean come on JSWolf, you can't be serious. That is just because you primarily read US material.

I suspect you already came across this the multitude of times Toxaris and I have posted it:

https://en.wikipedia.org/wiki/Quotat...ious_languages

All different languages use all different types of quotation marks (High/Low, Left/Right, Quotes/Guillemets, [...]). If you read Finnish books you might be used to ”…” instead.

UK to US has no easily automated way to do it... you would have to manually replace all Left/Right Single Quotes with their Double Quote equivalents (and change all Double -> Single).

Then you try to catch a lot of the accidentally converted apostrophes like:

Search: ([a-zA-Z])”([a-z])
Replace: \1’\2

And step through and try to catch apostrophes at the end of words:

Search: ([s])”(\s)
Replace: \1’\2

And a ton more elbow grease.

I have done UK -> US quotes a handful of times, and it is brutal/boring work.
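
To give an idea of the first pass, here is a hedged Python sketch of the swap plus the mid-word apostrophe fix. The function name and the returned review list are assumptions for the example. The word-final s” case is only reported, not replaced, because it can be either a possessive or a real closing quote.

```python
import re

# Swap single and double curly quotes in one pass (UK style -> US style).
SWAP = str.maketrans({
    "\u2018": "\u201c", "\u2019": "\u201d",
    "\u201c": "\u2018", "\u201d": "\u2019",
})


def uk_to_us(text: str):
    swapped = text.translate(SWAP)
    # Mid-word apostrophes that the swap turned into double quotes: don”t -> don’t.
    fixed = re.sub("([a-zA-Z])\u201d([a-z])", "\\1\u2019\\2", swapped)
    # Positions of s” followed by whitespace: possessive or closing quote?
    # These are left for a manual check instead of being replaced blindly.
    review = [m.start() for m in re.finditer("s\u201d(?=\\s)", fixed)]
    return fixed, review


text, review = uk_to_us("\u2018I don\u2019t know,\u2019 he said.")
print(text, review)
```

Elisions at the start or end of a word (’em, doin’) also end up as double quotes after the swap and are not handled here, which is part of the elbow grease mentioned above.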

Quote:
Originally Posted by Jellby View Post
It all boils down to distinguishing between right single quote and apostrophe. Unfortunately, in Unicode they are the same character (a design mistake, I'd say).
Hmmmm... yeah this does seem to be the crux of a lot of the Smarten Punctuation issues. It would simplify a lot. :P

Side Note: On a related note, does everyone here remember the glorious Smarten Punctuation (plus other typography) discussion we had back in 2014? (My gods, how time flies):

https://www.mobileread.com/forums/sho...58#post2912458

Last edited by Tex2002ans; 03-31-2016 at 11:08 PM.
Old 04-01-2016, 07:17 AM   #27
Toxaris
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
Quote:
Originally Posted by Tex2002ans View Post

Side Note: My personal method is three rounds:

Round #1: Diapdealer's Toolbag, Smarten Punctuation.

Round #2: I run the book through a lot of the regex fixes (of common errors I have come across). I do my usual code cleanup + OCR fixing + everything else.

Round #3: As the final step, I run the final text through Toxaris's Dialogue Check.
Actually, my tooling also has a procedure to convert straight quotation marks to smart ones, which also fixes common mistakes, for example with numbers. It is also possible to quickly run a series of S/R commands in one go (especially easy to do in the upcoming release). It seems a bit of a waste to go from HTML/ePUB to Word and back again.
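
The general idea of running a series of S/R commands in one go can be sketched in a few lines of Python. This is not how the tooling mentioned here works internally; the rule list and function name are hypothetical and only illustrate the pattern.

```python
import re

# Hypothetical rule list: (pattern, replacement) pairs, applied in order.
RULES = [
    ("\u2018([0-9])", "\u2019\\1"),  # left quote before a digit -> right quote
    (r" {2,}", " "),                 # collapse runs of spaces
]


def run_rules(text, rules=RULES):
    counts = []
    for pattern, replacement in rules:
        # subn returns the new text and the number of replacements made.
        text, n = re.subn(pattern, replacement, text)
        counts.append(n)
    return text, counts


print(run_rules("the  \u201890s"))
```

Keeping the per-rule counts makes it easier to spot a rule that fires far more often than expected.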
Old 04-01-2016, 07:53 AM   #28
Tex2002ans
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by Toxaris View Post
Actually, my tooling also has a procedure to convert straight quotation marks to smart ones, which also fixes common mistakes, for example with numbers.
Well by the time it gets to running through your Dialogue Check, there aren't many fixes...

I always save your tools for the very last pass. I take the nearly completed EPUB, use your Import EPUB and do a Dialogue Check + spellchecking pass (and do manual corrections directly in the EPUB). I don't need Word bungling up my perfect EPUB.

I must admit, I haven't tested your Smarten Quotes though (all I know is that the default Word one is DREADFUL).
Old 04-01-2016, 10:14 AM   #29
Toxaris
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
Quote:
Originally Posted by Tex2002ans View Post
I must admit, I haven't tested your Smarten Quotes though (all I know is that the default Word one is DREADFUL).
Well, I wouldn't call it dreadful. It is not that bad actually, you just need several fixes afterwards. I just can't understand why Microsoft does not improve it themselves.
Old 04-01-2016, 07:14 PM   #30
BetterRed
null operator (he/him)
Posts: 21,629
Karma: 29710510
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by Toxaris View Post
Well, I wouldn't call it dreadful. It is not that bad actually, you just need several fixes afterwards. I just can't understand why Microsoft does not improve it themselves.
Maybe they lost the code

But if they did, the EU would probably 'order' them to offer OOo, AbiWord, Wordstar and Atlantis as alternatives to Word, like Mario Monti 'ordered' them to offer alternate browsers - which they never obeyed. Mario's reward was to be parachuted into the Palazzo Chigi (Italy's 10 Downing St). He's now a Senator for Life.

There's some VBA code here that could be made smarter.

BR