Old 08-20-2013, 07:10 AM   #16
varlog
clueless
Posts: 63
Karma: 86836
Join Date: Sep 2012
Location: Europa
Device: prs t1
Also check whether your Python installation has the necessary modules: smartypants, uuid, htmlentitydefs, etc.
Old 08-20-2013, 08:28 AM   #17
DiapDealer
Grand Sorcerer
Posts: 8,766
Karma: 39536849
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by varlog
Also check whether your Python installation has the necessary modules: smartypants, uuid, htmlentitydefs, etc.
Actually, I've included the SmartyPants script with the zipfile. So there's no real need to have it "installed" in Python. As long as it's in the same directory as the script that imports it, all should be well. I try to keep everything as "stock" as possible (meaning not a lot of requirements for special modules needing to be installed).
Old 08-20-2013, 08:43 AM   #18
varlog
hm... strange. I seem to remember that I had to install smartypants to make it work.
Old 08-22-2013, 08:51 AM   #19
Perkin
Guru
Posts: 642
Karma: 64171
Join Date: Sep 2010
Location: Kent, England, Sol 3, ZZ9 plural Z Alpha
Device: Sony PRS-300, Kobo Aura HD
EDIT: see post #25 - new version

Here's a new 'smarten' (quotes/mdash/ndash/hellip) plugin, which should be an improvement on the SmartyPants version.

It also has an apos_exceptions.txt file, which you should edit to add or remove words that begin with an apostrophe. Put one word on each line, WITHOUT the apostrophe. The list is case SENSITIVE, so if a word could also start a sentence, add another line with the capitalised form, such as tis and Tis.

There are a few of the usual ones in there already, but edit as you see fit.
One I deliberately left out was Cause, as it can also appear without an apostrophe, as in these examples:
"'Cause I said so!"
'Cause of death?'
whereas its lower-case version would usually take the apostrophe.

The mdash and ndash entities are converted from dashes: -- becomes an ndash and --- becomes an mdash. If you want them the other way round, change lines 56/57 in the smarten.py file: remove one of the three dashes from line 56 and add one to the two in line 57.
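
As a rough illustration of the dash handling described above (a sketch only, not the actual smarten.py code; the function name `convert_dashes` and its parameters are made up for this example), the three-dash case has to be replaced before the two-dash case, otherwise every --- would be partly eaten by the -- rule:

```python
def convert_dashes(text, three='&#8212;', two='&#8211;'):
    """Replace '---' and '--' with dash entities (hypothetical helper)."""
    # Longest run first: handling '--' first would split every '---'.
    text = text.replace('---', three)
    text = text.replace('--', two)
    return text


print(convert_dashes('pages 10--12 --- or so'))
```

With a helper like this, swapping the mapping would just be `convert_dashes(text, three='&#8211;', two='&#8212;')`.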

Usual disclaimers apply.
Any problems or quote-cases to add, please let me know.

Edit:
If you don't want it to do any of the (m/n)dash or ellipsis entities, you can comment out the lines (add a # to beginning of the line) in the smarten.py file
30, 31, 32 (calculate extras for the entities)
42 (add pre tags to comments)
56, 57, 58 (convert the entities)
119 (remove the pre tags from comments)
Edit 2:
If you do comment those lines out, you will need to add
Code:
    text = re.sub(r"""(?<=-\s)"(?=\s)""",  r"""”""", text) # rd
    text = re.sub(r"""(?<=-\s)'(?=\s)""",  r"""’""", text) # rs
straight after lines 87/88. The n/m-dash entities may no longer be there, so we need to check for a normal dash instead.

Last edited by Perkin; 08-24-2013 at 10:03 AM.
Old 08-22-2013, 09:09 AM   #20
DiapDealer
Thanks! Can't wait to take it for a spin.
Old 08-22-2013, 04:23 PM   #21
DiapDealer
Had a chance to check out your script and have had good luck with it. I've found certain scenarios, though, where the quotes in the DOCTYPE are changed to entities. It's a Pretty-Print thing, I think.

If the DOCTYPE is all on one line, your script leaves it alone.
Code:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
But if it's split over two lines, your script seems to think it's fair game.
Code:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
And of course, I'll need to change most of those entities into their character equivalents to suit my preferences.
Old 08-22-2013, 04:43 PM   #22
Perkin
@DiapDealer, can you change line ~45 in smarten.py to the one below (it adds flags) and try it again on the multiline DOCTYPE?

The line follows this comment:
# Split the html into tags and text
Code:
    entities = re.split(r'(<.+?>)', text, flags=re.M|re.S)
That should allow multi-line tags rather than only single-line ones, and should also help catch a few other odd cases.
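
To see what the flag changes, here is a small standalone sketch (the sample `html` string is made up for the example, it is not taken from the plugin). Without re.S the dot stops at a newline, so a DOCTYPE split over two lines is never recognised as a tag and ends up in a text chunk; with re.S it comes back as a single tag chunk. As far as I can tell re.M makes no difference for this particular pattern, since it only affects ^ and $.

```python
import re

doctype = ('<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"\n'
           '  "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">')
html = doctype + '\n<p>"Hello"</p>'

# Without re.S, '.' does not match '\n', so the two-line DOCTYPE is not
# matched and stays inside the first text chunk.
without_flag = re.split(r'(<.+?>)', html)

# With re.S, '.' also matches newlines and the DOCTYPE is one tag chunk.
with_flag = re.split(r'(<.+?>)', html, flags=re.S)

# Because the pattern has a capturing group, odd indices hold the tags
# and even indices hold the text between them.
print(without_flag[1::2])
print(with_flag[1::2])
```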

Thanks.
Old 08-22-2013, 05:07 PM   #23
DiapDealer
Cool, thanks! I hadn't waded in all that deep yet.
Old 08-23-2013, 08:29 AM   #24
Perkin
EDIT: see post #25 - new version

Here's a new version of the smarten plugin. It contains a couple of fixes, and the apos_exceptions.txt file is now handled differently: you can have more than one entry on each line, and you can include the apostrophe or leave it out.
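
A minimal sketch of how a file in that format could be read (the function name `parse_exceptions` is hypothetical, and splitting entries on whitespace is an assumption; the real plugin may separate them differently):

```python
def parse_exceptions(lines):
    """Return exception words with any surrounding apostrophes removed."""
    words = []
    for line in lines:
        # Assumption: entries on one line are separated by whitespace.
        for entry in line.split():
            word = entry.strip("'")  # the apostrophe is optional in the file
            if word:
                words.append(word)
    return words


sample = ["'tis Tis", "em", "'im  'er"]
print(parse_exceptions(sample))
```

Case is left untouched here, so tis and Tis stay separate entries.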

Any other problems or case-fixes please let me know.

Edit:
If you don't want it to do any of the (m/n)dash or ellipsis entities, you can comment out the following lines (add a # to beginning of the line) in the smarten.py file
31, 32, 33 and 53, 54, 55

Edit 2:
One fix already: change line 63 (of smarten.py) to the following, which adds \b to the regex. If a single quote preceded a word that began with an entry from apos_exceptions.txt, it got changed to an apostrophe when it shouldn't have been; it should be an open quote.
e.g. ('im is in apos_exceptions.txt) 'important'
Code:
            text = re.sub(r"'("+entry.strip("'")+"\b)", r"’\1", text)
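
The effect the word boundary is meant to have can be shown with a small standalone sketch. The helper `apply_exception` and its `boundary` switch are invented for the illustration, and `re.escape` is an extra precaution that the plugin line does not use:

```python
import re


def apply_exception(text, word, boundary=True):
    """Turn a straight quote before `word` into an apostrophe entity."""
    # With boundary=True the word must end right after the entry, so the
    # entry 'im' no longer matches the start of 'important'.
    pattern = r"'(%s%s)" % (re.escape(word), r"\b" if boundary else "")
    return re.sub(pattern, r"&#8217;\1", text)


sample = "'im and 'important'"
print(apply_exception(sample, "im", boundary=False))
print(apply_exception(sample, "im"))
```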

Last edited by Perkin; 08-24-2013 at 10:03 AM.
Old 08-24-2013, 09:58 AM   #25
Perkin
Just found another flub: when an n/mdash entity was at the beginning or end of a line, the entity was being changed but the count wasn't being updated (the extras/offset check wasn't getting a match), so the tags went out of alignment with the text.

Hopefully this should be it. (famous last words...)

Edit:
If you don't want it to do any of the (m/n)dash or ellipsis entities, you can comment out the following lines (add a # to beginning of the line) in the smarten.py file
32, 33, 34, 35, 36 and 56, 57, 58

Edit 2:
Yep, spoke too soon.
Just updated it; I had missed removing a couple of characters in the ndash replace line.
If you've already downloaded this new version, can you change line 57 in smarten.py to:
Code:
    text = re.sub(r'(?<=[^-])--(?=[^-])', r'–', text, flags=re.M) # ndash
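
One thing that may be worth knowing about this style of pattern (an observation about Python's re module, checked only on a made-up sample string, not on real book files): `(?<=[^-])` and `(?=[^-])` each need an actual character to be present, so a double dash at the very start or very end of the string is skipped. Negative lookarounds are a possible alternative that only say "no dash here". The names `positive` and `negative` are just labels for this sketch.

```python
import re

positive = r'(?<=[^-])--(?=[^-])'  # needs a non-dash character on both sides
negative = r'(?<!-)--(?!-)'        # only forbids a dash on either side

sample = '--start, middle--bit, end--'

# Only the middle pair is converted: the edges have no neighbouring character.
print(re.sub(positive, '&#8211;', sample))

# All three pairs are converted, while a '---' run is still left alone.
print(re.sub(negative, '&#8211;', sample))
print(re.sub(negative, '&#8211;', 'a---b'))
```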
Attached Files
File Type: zip smarten_sigil_plugin.zip (2.4 KB, 42 views)

Last edited by Perkin; 08-24-2013 at 10:58 AM.
Old 09-02-2013, 10:14 AM   #26
Perkin
@Smarten plugin...
Just found another problem, to do with the apos_exceptions part. Although it seemed to be working, I've found that it now doesn't, and I can't see why it ever worked.

Could you change line ~66 of smarten.py
from this
Code:
            text = re.sub(r"'(" + entry.strip("'") + "\b)", r"&# 8217;\1", text)
TO this
Code:
            text = re.sub(r"'(%s\b)" % entry.strip("'"), r"&# 8217;\1", text)
Also remove the space in '&# 8217;' (I added it so the entity displays here).
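
A likely explanation for the difference between the two lines, shown as a standalone sketch (the variable names and sample text are made up): in the first version the `"\b)"` piece is an ordinary string, so Python turns `\b` into a backspace character before the regex engine ever sees it. In the second version the `\b` sits inside the raw string and stays a word boundary.

```python
import re

word = "tis"
broken = r"'(" + word + "\b)"  # "\b" in a normal string is a backspace (\x08)
fixed = r"'(%s\b)" % word      # raw string keeps \b as a regex word boundary

text = "'tis the season"

# The broken pattern looks for a literal backspace after the word,
# so nothing in ordinary text matches and the text comes back unchanged.
print(re.sub(broken, r"&#8217;\1", text))

# The fixed pattern matches and inserts the apostrophe entity.
print(re.sub(fixed, r"&#8217;\1", text))
```

If entries could ever contain regex special characters, wrapping the word in `re.escape()` before substituting it into the pattern would be a sensible precaution.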
All times are GMT -4.

