Old 05-12-2020, 09:56 AM   #106
DiapDealer
Grand Sorcerer
Posts: 22,309
Karma: 124547494
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Your suggestion of '\d\d[^0-9'] is not a bad one. There might be some things that fall through the cracks when the British style of single-quotes is employed, but they were going to be wrongly interpreted under the old rule too.

Give this beta version a workout. Only three lines changed from version 0.3.3 of the plugin (lines 631-633 of newsmartypants.py), and two of those are comments.
Attached Files
File Type: zip PunctuationSmarten_v0.3.4b.zip (18.4 KB, 15 views)

Last edited by DiapDealer; 05-12-2020 at 10:27 AM.
Old 05-12-2020, 10:47 AM   #107
Thomas_AR
Zealot
Posts: 126
Karma: 10
Join Date: Jan 2015
Location: Buenos Aires
Device: Android
Folks, just one more question.
I know this plugin works best with English books. As I mostly work with German books, where some punctuation marks are very different, is there an adapted plugin? Or is it language-independent?

Thanks, and sorry for the noob question.

Thomas
Old 05-12-2020, 11:51 AM   #108
DiapDealer
Grand Sorcerer
Posts: 22,309
Karma: 124547494
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
To be honest, this plugin is basically just a wrapper around someone else's work: the SmartyPants Python smartening algorithm (which was nowhere near perfect--or even complete--in the first place). I've just tweaked it where I can. I would say that SmartyPants is definitely English-centric.

There may be some language-specific (or language-agnostic) smartening algorithms out there, but I've not run into them.
Old 05-13-2020, 09:13 AM   #109
AlanHK
Fanatic
Posts: 520
Karma: 73392
Join Date: Apr 2014
Device: PW-3, Android phone
Quote:
Originally Posted by DiapDealer
Give this beta version a workout.
Yep, that works.


<p>Month '1' in '84 was a good year. He had a '317' average and solved the '3 body problem'.</p>

to

<p>Month ‘1’ in ’84 was a good year. He had a ‘317’ average and solved the ‘3 body problem’. </p>


I see the previous version checked for '80s and '90s, etc. But not particular years.

There are some cases that fail: A '12-gauge shotgun'. But 'NN years are much more common.
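
For reference, the two-digit-year rule being tested here can be sketched in Python roughly as below. This is only an illustration built from the '\d\d[^0-9'] pattern quoted earlier in the thread; the function name and the lookahead form are assumptions, not the actual lines in newsmartypants.py.

```python
import re

# Assumption: an apostrophe followed by exactly two digits and then a
# character that is neither a digit nor a single quote marks an
# abbreviated year ('84). Illustrative only, not the plugin's own code.
ABBREVIATED_YEAR = re.compile(r"'(?=\d\d[^0-9'])")


def smarten_year_apostrophes(text):
    """Replace the straight quote before a two-digit year with U+2019."""
    return ABBREVIATED_YEAR.sub("\u2019", text)


sample = "Month '1' in '84 was a good year. He had a '317' average."
print(smarten_year_apostrophes(sample))

# The known false positive from this thread: the opening quote is
# mistaken for an apostrophe because "12" is followed by a hyphen.
print(smarten_year_apostrophes("A '12-gauge shotgun'."))
```

In this sketch the quotes around '1' and '317' are deliberately left alone, on the assumption that the ordinary opening/closing quote rules deal with them in a separate pass.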


When you update it, you could include the plugin.png icon.
If you don't have one, here's the one I use, found in another thread a while ago.
Attached Images
 

Last edited by AlanHK; 05-14-2020 at 01:31 PM.
Old 05-26-2020, 12:18 PM   #110
thosp
Junior Member
Posts: 4
Karma: 10
Join Date: Dec 2014
Device: laptop & tablet
Odd change

Greetings,

I use this in Sigil to 'correct' files downloaded from ... everywhere. Recently, I noticed what looks like a problem.

The file in question had " -- " or {[space][hyphen][hyphen][space]}, and the plugin changed it to "—" or {EM DASH}.

The writer meant to have a space before and after the punctuation mark, so why was it changed this way? The proper change, in my view, would be to " – " or {[space][EN DASH][space]}, as this makes the epub a better read.

Thank You
Old 05-26-2020, 01:59 PM   #111
DiapDealer
Grand Sorcerer
Posts: 22,309
Karma: 124547494
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Every style guide I know of (with the exception of the Associated Press style guide) states that there should be no spaces between an em dash and the adjacent words.

You can easily turn off dash smartening altogether in the plugin's config section if you want to handle your own non-standard manual conversion of the double-dashes. But I'm not changing the default behavior of the plugin's dash smartening to something non-standard.

Last edited by DiapDealer; 05-26-2020 at 02:02 PM.
Old 05-27-2020, 03:07 AM   #112
AlanHK
Fanatic
Posts: 520
Karma: 73392
Join Date: Apr 2014
Device: PW-3, Android phone
Quote:
Originally Posted by DiapDealer
Every style guide I know of (with the exception of the Associated Press style guide) states that there should be no spaces between an em dash and the adjacent words.
US style is em-dash, no spaces—like this.

UK style is a spaced en-dash – like this (except in number ranges: pages 80–81).

Some publishers do use spaced em dashes, though that's not common.

I do dashes manually by S&R, as there is often a mix of hyphens and various dashes, spaced or unspaced. And then spend a while fixing ellipses, since I prefer spaced periods.

Last edited by AlanHK; 05-27-2020 at 03:12 AM.
Old 05-31-2020, 04:14 AM   #113
Tex2002ans
Wizard
Posts: 1,551
Karma: 7400529
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by thosp
The file in question had this, " -- " or {[space][hyphen][hyphen][space]} and changed it to "—" or {EM DASH}.

The writer meant to have a space before and after the punctuation mark, why was it changed the way it was? The proper change, in my view, would be to " – " or {[space][EN DASH][space]} as this makes the epub a better read.
If you change your "(EM/EN)-Dash Setting" to:

'--' = em dash (no en dash support)

[Attached image: SmartenPunctuation.-.EM.EN.Dash.Setting.png]

I tested on this:

Code:
<p>This is a -- sample.</p>
and it changed to this:

Code:
<p>This is a — sample.</p>
Then you could search for (SPACE + EM DASH + SPACE):

Search: " — "

then replace with (SPACE + EN DASH + SPACE):

Replace: " – "

You could set up a Sigil Saved Search if you find you're doing this correction often. Then you could just run SmartenPunctuation, then the Saved Search right after.
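
If you'd rather script that second pass than use a Saved Search, a minimal Python sketch of the same replacement might look like this. The function name is made up for illustration, and it assumes the dashes are surrounded by plain ASCII spaces.

```python
EM_DASH = "\u2014"
EN_DASH = "\u2013"


def spaced_em_to_spaced_en(text):
    """Turn SPACE + EM DASH + SPACE into SPACE + EN DASH + SPACE."""
    return text.replace(f" {EM_DASH} ", f" {EN_DASH} ")


print(spaced_em_to_spaced_en("<p>This is a \u2014 sample.</p>"))
```

Close-set em dashes (no space on either side) are left untouched, so US-style passages in the same file would survive the pass.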

Quote:
Originally Posted by DiapDealer
You can easily turn off dash smartening altogether in the plugin's config section if you want to handle your own non-standard manual conversion of the double-dashes. But I'm not changing the default behavior of the plugin's dash smartening to something non-standard.
Agreed. I always keep the options for dashes + ellipses off, then correct that stuff on my own.

But maybe, just as you have that "(no en dash support)" option, you could offer the opposite:

' -- ' = spaced en dash (no em dash support)
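
As a rough idea of what such an option would do, here is a Python sketch. It is purely hypothetical: the function name, the pattern, and the choice to leave unspaced double hyphens untouched are all assumptions, and nothing here is taken from the plugin.

```python
import re

EN_DASH = "\u2013"

# Assumption: only a double hyphen with a plain space on each side is
# converted; "word--word" is left alone for a later manual pass.
SPACED_DOUBLE_HYPHEN = re.compile(r"(?<=\S) -- (?=\S)")


def spaced_hyphens_to_en_dash(text):
    """Replace SPACE + two hyphens + SPACE with a spaced en dash."""
    return SPACED_DOUBLE_HYPHEN.sub(f" {EN_DASH} ", text)


print(spaced_hyphens_to_en_dash("<p>This is a -- sample.</p>"))
```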

Note: Just tested in LibreOffice / Word, their AutoCorrect by default changes:

"--" -> en dash
" -- " -> spaced en dash
"sample-- " (two hyphens + space) -> unspaced em dash

Quote:
Originally Posted by DiapDealer
Every style guide I know of (with the exception of the Associated Press style guide) states that there should be no spaces between an em dash and the adjacent words.
On parenthetical expressions, also see:

https://en.wikipedia.org/wiki/Dash#P...sentence_level

And I agree with AlanHK. From what I've read over the years:

The spaced EN DASH seems to be used more in British publications (and newspapers).

The EM DASH tends to be used more in American.

Spacing Side Note: On spacing around em dashes... there's "open-set" and "close-set":

Open-set = Both sides use hair or thin space (or, disgustingly, normal spaces)
Close-set = No space on either side

See Robert Bringhurst, "The Elements of Typographic Style", Chapter 5:

Quote:
5.2 DASHES, SLASHES & DOTS

5.2.1 Use spaced en dashes – rather than close-set em dashes or spaced hyphens – to set off phrases.


Standard computer keyboards and typewriters include only one dash: the hyphen. Any normal font of text type, either roman or italic, includes at least three. These are the hyphen and two sizes of long dash: the en dash – which is one en (half an em, M/2) in width – and the em dash — which is one em (two ens) wide. Many fonts also include a subtraction sign, which may or may not be the same length and weight as the en dash, and some include a figure dash (equal to the width of a standard numeral). The three-quarter em dash, and the three-to-em dash, which is one third of an em (M/3) in length, are often missing but perfectly easy to make.

In typescript, a double hyphen (--) is often used for a long dash. Double hyphens in a typeset document are a sure sign that the type was set by a typist, not a typographer. A typographer will use an em dash, three-quarter em, or en dash, depending on context or personal style. The em dash is the nineteenth-century standard, still prescribed in many editorial style books, but the em dash is too long for use with the best text faces. Like the oversized space between sentences, it belongs to the padded and corseted aesthetic of Victorian typography.

Used as a phrase marker – thus – the en dash is set with a normal word space either side.

Last edited by Tex2002ans; 05-31-2020 at 04:20 AM.