Old 08-20-2013, 02:34 PM   #1
Perkin
Guru
Posts: 644
Karma: 64171
Join Date: Sep 2010
Location: Kent, England, Sol 3, ZZ9 plural Z Alpha
Device: Sony PRS-300, Kobo Aura HD
Smart Quotes help (for Sigil plugin)

I'm working on a Python script which I hope will be an improvement on smartypants (used by the plugin for Sigil), and I'd like some help finding cases that don't convert correctly.

The ones I don't think will ever be satisfactorily solved are words that start with an apostrophe, such as
'Twas brillig, and the slithy toves....
(Having said that, I could have a list of known apostrophe words: 'tis, 'twas, 'cause, etc.)
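
The list-of-known-words idea could be sketched roughly as below. This is only an illustration, not the plugin's code: the word list, the regular expression and the function name `fix_leading_apostrophes` are all assumptions made for the example.

```python
import re

# Hypothetical starter list; a real one would presumably be user-editable.
APOSTROPHE_WORDS = {"tis", "twas", "cause", "em", "til"}

# A straight apostrophe that is not preceded by a letter or digit
# (so "don't" is skipped) and is followed by a run of letters.
_LEADING_APOS = re.compile(r"(?<![A-Za-z0-9])'([A-Za-z]+)")

def fix_leading_apostrophes(text, words=APOSTROPHE_WORDS):
    """Replace the leading ' of known words with a right single quote (U+2019)."""
    def repl(match):
        word = match.group(1)
        if word.lower() in words:
            return "\u2019" + word
        return match.group(0)  # unknown word: leave it for the general quote pass
    return _LEADING_APOS.sub(repl, text)
```

Running this before the general quote conversion means the known words are already settled by the time opening and closing quotes are decided. It will still guess wrong when a listed word really does start a quoted phrase, such as 'cause and effect'.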

If you've got any known flubs, please can you let me know, with a small example as well if possible, and what it should look like when done correctly.

Ones like
Code:
John said, "The man said 'aaaaa'
" 'bbbb'
" 'cccc'
" dddd
" and then ended the story."
WRONG:
Code:
John said, “The man said ‘aaaaa’
 ” ‘bbbb’
 “ ‘cccc’
 ” dddd
 ” and then ended the story.”

RIGHT:
Code:
John said, “The man said ‘aaaaa’
 “ ‘bbbb’
 “ ‘cccc’
 “ dddd
 “ and then ended the story.”
The code I've got at the moment handles all the cases I have correctly (including the one above) - I just need further test cases.
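
One way to get the RIGHT output for the continuation lines above is to treat any double quote that starts a line as an opening quote, whatever follows it. The sketch below shows only that single rule. It is an assumption about how this could be done, not the script's actual logic, and `open_line_start_quotes` is a made-up name.

```python
import re

# A straight double quote that is the first non-blank character on a line.
_LINE_START_QUOTE = re.compile(r'^([ \t]*)"', re.MULTILINE)

def open_line_start_quotes(text):
    """Turn a line-initial straight double quote into an opening quote (U+201C)."""
    return _LINE_START_QUOTE.sub("\\1\u201c", text)
```

Quotes in the middle of a line are left untouched, so this would run as a first pass before whatever handles the rest.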

Thanks for any cases given.
Old 08-20-2013, 03:06 PM   #2
DiapDealer
Grand Sorcerer
Posts: 8,924
Karma: 40404050
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
There's a weird situation I can't nail down 100% where smartypants reverses a closing quote (makes it an opening one). When it does happen, it seems to be near an emdash entity (or character). But that seems to be a bug, rather than a special typographic situation it doesn't handle.

I also thought of creating a user-editable list/dictionary of 'tis-type words that could be integrated into smarty (or another) script.

I think you've identified the two big "deal-breaker" scenarios where SmartyPants is concerned, though.
Old 08-20-2013, 03:14 PM   #3
Perkin
I think I know what you mean, but your smartypants plugin does them correctly, although it removes the two spaces where they appear between the dashes and the quotes. I think when the space stays there, smartypants produces an opening quote rather than a closing one.

Code:
<p>He said, "Go away -- "</p>

<p>He said, "Go away --"</p>

<p>He said, 'Go away -- '</p>

<p>He said, 'Go away --'</p>
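
For the four paragraphs above, one possible rule is: a quote that follows a double hyphen (with or without a space) and is itself followed only by a closing tag or the end of the text must be a closing quote. The sketch below is a guess at such a rule, with a made-up function name, and it keeps any space rather than removing it.

```python
import re

# Dash, optional spaces, a straight quote, then a closing tag or end of text.
_DASH_QUOTE_END = re.compile(r"""(--|\u2014)([ \t]*)(["'])(?=\s*(?:</|$))""")

def close_quote_after_dash(text):
    """Make a paragraph-final quote after a dash a closing quote."""
    def repl(m):
        closing = "\u201d" if m.group(3) == '"' else "\u2019"
        return m.group(1) + m.group(2) + closing
    return _DASH_QUOTE_END.sub(repl, text)
```

The lookahead is what stops it touching a quote that opens speech after a dash, where more text follows the quote.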
Old 08-20-2013, 03:57 PM   #4
Jellby
frumious Bandersnatch
Posts: 6,102
Karma: 4791309
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
A dash can be inside or outside quotes. Is this handled correctly?

Code:
<p>"Blah blah"--he said, and continued--"blah, blah blah."</p>
<p>"Blah blah--" he said, and continued, "--blah, blah, blah."</p>
and you already know that the dashes could be spaced or not...
Old 08-20-2013, 05:46 PM   #5
DiapDealer
Quote:
Originally Posted by Jellby View Post
A dash can be inside or outside quotes. Is this handled correctly?

Code:
<p>"Blah blah"--he said, and continued--"blah, blah blah."</p>
<p>"Blah blah--" he said, and continued, "--blah, blah, blah."</p>
and you already know that the dashes could be spaced or not...
Smartypants seems to handle that situation quite well on its own--so far as I can tell.

In my wrapper script however, I do a little pre/post processing to achieve some personal goals that wouldn't be possible with smartypants alone (borrowing heavily from calibre). Those changes may not suit others, but they're pretty easily tweaked. For instance:

1) I preserve any html comments present. Smarty would butcher those double-dashes (calibre does the same thing).
2) I remove spaces that may occur on either side of double-dashes; simply because I find spaces before or after emdashes aesthetically unappealing when reading.
3) Smarty uses numeric entities for the quotation marks, emdashes and ellipses it creates. I've made arrangements to selectively convert those entities that Smarty creates to characters where it suits me.

I think Perkin's script is only going to be dealing with quotation marks, though. That makes sense, since "fixing" the double-dash and "three consecutive periods" stuff is pretty trivial, really.
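
The three pre/post-processing steps listed above might look something like the sketch below. It is not the wrapper script itself: the function names, the NUL-character placeholder trick and the default entity list are assumptions, and the smartypants call that would sit between the two functions is left out.

```python
import html
import re

_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)

def preprocess(text):
    """Stash HTML comments, then remove spaces around double dashes."""
    comments = []
    def stash(m):
        comments.append(m.group(0))
        return "\x00%d\x00" % (len(comments) - 1)  # placeholder with no dashes in it
    text = _COMMENT.sub(stash, text)
    text = re.sub(r"[ \t]*--[ \t]*", "--", text)
    return text, comments

def postprocess(text, comments,
                entities=("&#8220;", "&#8221;", "&#8216;", "&#8217;",
                          "&#8212;", "&#8230;")):
    """Convert the chosen numeric entities to characters and restore comments."""
    for ent in entities:
        text = text.replace(ent, html.unescape(ent))
    return re.sub(r"\x00(\d+)\x00", lambda m: comments[int(m.group(1))], text)
```

Passing a shorter `entities` tuple is one way to convert only some of the entities to characters. Comments go back in last, so nothing in between can alter their double dashes.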

Last edited by DiapDealer; 08-20-2013 at 05:55 PM.
Old 08-20-2013, 05:53 PM   #6
Perkin
Quote:
Originally Posted by Jellby View Post
A dash can be inside or outside quotes. Is this handled correctly?

Code:
<p>"Blah blah"--he said, and continued--"blah, blah blah."</p>
<p>"Blah blah--" he said, and continued, "--blah, blah, blah."</p>
and you already know that the dashes could be spaced or not...
My code converts them to:
Code:
“Blah blah”–he said, and continued–“blah, blah blah.”
“Blah blah–” he said, and continued, “–blah, blah, blah.”
The ones I can't handle correctly are those with a space both before and after the quote - one would always be wrong.
Code:
<p>"Blah blah-- " he said, and continued, " --blah, blah, blah."</p>
As it stands both the solitary quotes are converted to opening quotes, so the first would be wrong.

I'm working on correcting that.
Edit: Just solved that particular problem as well.
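
The post doesn't say how this was solved. One plausible approach, sketched here purely as an assumption, is to decide by the neighbouring characters when they are unambiguous and otherwise fall back on whether a quote is currently open in the paragraph. The function name is invented, and it expects the paragraph text without its surrounding tags.

```python
def smarten_double_quotes(paragraph):
    """Convert straight double quotes, using open/closed state for ambiguous ones."""
    out = []
    is_open = False
    last = len(paragraph) - 1
    for i, ch in enumerate(paragraph):
        if ch != '"':
            out.append(ch)
            continue
        space_before = i == 0 or paragraph[i - 1].isspace()
        space_after = i == last or paragraph[i + 1].isspace()
        if space_before and not space_after:
            opening = True            # e.g. ' "Blah'
        elif space_after and not space_before:
            opening = False           # e.g. 'blah." '
        else:
            opening = not is_open     # spaced on both sides (or neither): use state
        out.append("\u201c" if opening else "\u201d")
        is_open = opening
    return "".join(out)
```

Because the state starts fresh for each paragraph, a continuation paragraph that opens a quote and never closes it doesn't upset the next one.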

Last edited by Perkin; 08-20-2013 at 06:05 PM.
Old 08-20-2013, 06:58 PM   #7
mrmikel
Book Twiddler
Posts: 2,000
Karma: 1405001
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
Glad you solved the problems of quotes with spaces. This is very very common in older works.
Old 08-22-2013, 08:55 AM   #8
Perkin
Uploaded the plugin to DiapDealer's PlugIn thread, here.
It's in post #19

Edit:
It converts quotes/apostrophes and the mdash, ndash and ellipsis, and preserves html comments.
It also handles the words that begin with an apostrophe, taken from an apos_exceptions.txt file.

If you don't want it to do any of the (m/n)dash or ellipsis entities, you can comment out the following lines (add a # to the beginning of each line) in the smarten.py file:
30, 31, 32 (calculate extras for the entities)
42, (add pre tags to comments)
56, 57, 58 (convert the entities)
119 (remove the pre tags from comments)

Last edited by Perkin; 08-22-2013 at 09:08 AM.
Old 08-22-2013, 03:21 PM   #9
Steadyhands
Enthusiast
Posts: 30
Karma: 10
Join Date: Dec 2011
Location: Brisbane, Oz
Device: iPad2
Here's a list of my saved searches for quote fixes. Some also change hyphens to mdashes. No text examples to go with them, sorry.

Quote:
11\Name=Quote Fix/Quote fix1
11\Find=\\-([\\\x2018|\\\x201c])</p>
11\Replace=\x2014\x201d</p>
12\Name=Quote Fix/Quote fix2
12\Find=\x201c</p>
12\Replace=\x201d</p>
13\Name=Quote Fix/Quote fix3
13\Find=\\. \\\x201d \\\x2018
13\Replace=. \x201c \x2018
14\Name=Quote Fix/Quote fix4
14\Find="<p class=\\\"calibre(\\d+)\\\">\\\x201d \\\x2018"
14\Replace="<p class=\"calibre\\1\">\x201c \x2018"
15\Name=Quote Fix/Quote fix5
15\Find=</i> (\\p{P})\\\x201d
15\Replace=</i>\\1\x201d
16\Name=Quote Fix/Quote fix6
16\Find="<p class=\\\"calibre(\\d+)\\\">\\\x201d"
16\Replace="<p class=\"calibre\\1\">\x201c"
17\Name=Quote Fix/Quote fix7
17\Find="([\\!|\\.|\\?\\\x2026|\\,]) \\\x201d</p>"
17\Replace=\\1\x201d</p>
18\Name=Quote Fix/Quote fix8
18\Find="<p class=\\\"calibre(\\d+)\\\">\\\x201c\\s(?!\x2018)"
18\Replace="<p class=\"calibre\\1\">\x201c"
19\Name=Quote Fix/Quote fix9
19\Find="\\, \\\x201d\\s(?=[a-z])"
19\Replace=",\x201d "
20\Name=Quote Fix/Quote fix10
20\Find="<p class=\\\"calibre(\\d+)\\\">\\\x201c\\- "
20\Replace="<p class=\"calibre\\1\">\x201c\x2014"
21\Name=Quote Fix/Quote fix11
21\Find=(-|\x2013)\x201d
21\Replace=\x2014\x201d
22\Name=Quote Fix/Quote fix12
22\Find="\x2026 \x201d(?=[A-Z])"
22\Replace=\x2026 \x201c
PS: I took the easy way out and cut and pasted from the sigil_searches.ini.
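
For anyone wanting to try these outside Sigil, three of the searches above (fix2, fix11 and fix17) could be hand-translated into Python roughly as follows. The translation assumes that in the ini file `\\` stands for a single backslash and `\x201d` for the character itself. The `RULES` list and `apply_rules` are illustrative names, and fix17's character class is written here without the stray `|` characters, so it is not an exact copy.

```python
import re

# (pattern, replacement) pairs, applied in order.
RULES = [
    # fix2: an opening quote right before </p> should be a closing quote
    (re.compile("\u201c</p>"), "\u201d</p>"),
    # fix11: hyphen or en dash before a closing quote becomes an em dash
    (re.compile("(-|\u2013)\u201d"), "\u2014\u201d"),
    # fix17: drop the space between final punctuation and the closing quote
    (re.compile("([!.?\u2026,]) \u201d</p>"), "\\1\u201d</p>"),
]

def apply_rules(text, rules=RULES):
    """Run each find/replace pair over the text in turn."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text
```

The remaining searches depend on calibre's `class="calibre..."` paragraph markup and would need the same kind of translation.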