MobileRead Forums > E-Book Software > Sigil

Old 08-18-2018, 11:21 PM   #1
pittendrigh
Connoisseur
pittendrigh began at the beginning.
 
Posts: 53
Karma: 10
Join Date: Mar 2011
Location: montana
Device: none
Sigil output with no html or head elements

Is it possible to have Sigil output NOT include html or head elements?

Why? I have software that displays an unzipped (on the server) epub3 (not epub2, not yet anyway). I use it behind a password barrier for teaching users how to build a boat.

To make this work, I use PHP XPath functions to parse the various parts of an unzipped epub3. It works. I've been using it for half a year now. Long how-to-do-it instructions in epub format are good stuff for me. But I only want this output for use on my website.

However, on a per-XML-page basis I use PHP's preg_replace function to strip off the HTML and HEAD elements so the output can be embedded inside my surrounding Content Management System HTML output.

I could continue doing it that way. I could also use a bash/sed or Python script to remove those elements from the files AFTER unzipping the epub3.

Or better yet, if it was possible, I'd like to click Save on my Sigil editor, in a way that saves the various individual pages without HTML and Head. Is that possible? With Sigil? I've read various faqs and browsed the Sigil user guide and can't find it.
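
Editor's sketch: the XPath-style parsing of an unzipped epub3 described above can be approximated in Python with the standard library. This is not the poster's PHP code. The function names and the sample XML are hypothetical and trimmed for illustration; only the two namespaces and the container.xml/OPF layout come from the EPUB specifications.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the EPUB container (OCF) and package (OPF) formats
NS = {
    "ocf": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
}

def opf_path(container_xml):
    """Return the package (OPF) path named in META-INF/container.xml."""
    root = ET.fromstring(container_xml)
    rootfile = root.find("ocf:rootfiles/ocf:rootfile", NS)
    return rootfile.get("full-path")

def spine_hrefs(opf_xml):
    """Return content document hrefs in reading (spine) order."""
    root = ET.fromstring(opf_xml)
    # The manifest maps item ids to hrefs; the spine lists ids in reading order.
    items = {item.get("id"): item.get("href")
             for item in root.findall("opf:manifest/opf:item", NS)}
    return [items[ref.get("idref")]
            for ref in root.findall("opf:spine/opf:itemref", NS)]

# Hypothetical, trimmed sample data standing in for files read from disk
CONTAINER = """<container version="1.0"
    xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""

OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <manifest>
    <item id="ch2" href="Text/ch2.xhtml" media-type="application/xhtml+xml"/>
    <item id="ch1" href="Text/ch1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="ch1"/>
    <itemref idref="ch2"/>
  </spine>
</package>"""

print(opf_path(CONTAINER))
print(spine_hrefs(OPF))
```

The hrefs in the manifest are relative to the folder that holds the OPF file, so a real script would join them with that folder before opening each page.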
Old 08-19-2018, 12:16 AM   #2
DNSB
Bibliophagist
DNSB ought to be getting tired of karma fortunes by now.
Posts: 5,209
Karma: 23054146
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Aura One, Aura H2O, Aura HD, Nexus 7 HD, iPad Air, Tolino epos
My quick reply would be no. Sigil is designed to output valid epub2/epub3 and what you are asking for would not be considered valid.

You could try asking in the Sigil forum but I suspect the answer there would still be no.
Old 08-19-2018, 04:04 AM   #3
Doitsu
Wizard
Doitsu ought to be getting tired of karma fortunes by now.
Posts: 4,341
Karma: 14300083
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by pittendrigh View Post
Or better yet, if it was possible, I'd like to click Save on my Sigil editor, in a way that saves the various individual pages without HTML and Head. Is that possible? With Sigil? I've read various faqs and browsed the Sigil user guide and can't find it.
Since Sigil comes with an embedded Python interpreter, it's relatively easy to create a plugin that'll remove all <html> and <head> tags. Note that you'll also need to disable the Mend XHTML Source Code on Save option.

For a simple proof-of-concept plugin that'll bold all occurrences of "the," see this post.

For more information on the Sigil plugin API see the Sigil Plugin Framework guide. Also check out the Sigil Plugin Development forum.

EDIT: Depending on the structure of your books, you also might be able to remove the <html> and <head> tags with a Saved Search group.
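
Editor's sketch of what such a plugin might look like. The assumptions here: `run(bk)`, `bk.text_iter()` and `bk.readfile()` behave as described in the Sigil Plugin Framework guide (please verify against it), `readfile()` returns text for XHTML files, and `OUT_DIR`, `strip_html_head()` and `export_fragments()` are hypothetical names invented for this example. It writes stripped copies to a separate folder instead of changing the book, and it uses regular expressions where a real parser would be more robust.

```python
import os
import re

# Hypothetical output location; a real plugin would probably ask the user.
OUT_DIR = os.path.join(os.path.expanduser("~"), "epub_fragments")

# <head ...> through </head>; \b keeps <header> elements from matching
HEAD_RE = re.compile(r"<head\b.*?</head\s*>", re.DOTALL | re.IGNORECASE)
# opening and closing <html> tags only, not their contents
HTML_TAG_RE = re.compile(r"</?html\b[^>]*>", re.IGNORECASE)

def strip_html_head(xhtml):
    """Drop <head>...</head> entirely and remove the <html> open/close tags.

    The XML declaration and DOCTYPE, if present, are left untouched.
    """
    without_head = HEAD_RE.sub("", xhtml)
    return HTML_TAG_RE.sub("", without_head)

def export_fragments(bk, out_dir):
    """Write a stripped copy of every text file in the book to out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    count = 0
    for manifest_id, href in bk.text_iter():
        fragment = strip_html_head(bk.readfile(manifest_id))
        target = os.path.join(out_dir, os.path.basename(href))
        with open(target, "w", encoding="utf-8") as fp:
            fp.write(fragment)
        count += 1
    return count

def run(bk):
    # Entry point Sigil is assumed to call; returning 0 signals success.
    export_fragments(bk, OUT_DIR)
    return 0
```

Because the book itself is never modified, the Mend setting should not matter for this variant.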

Last edited by Doitsu; 08-19-2018 at 08:07 AM.
Old 08-19-2018, 08:37 AM   #4
pittendrigh
Connoisseur
pittendrigh began at the beginning.
 
Posts: 53
Karma: 10
Join Date: Mar 2011
Location: montana
Device: none
Thank you. I didn't know about the Python plugin mechanism. I also didn't think about the "Mend Source Code" issue. That does get ugly. I'll think about it. Perhaps I'll stick with what I have.

Displaying epub on a website is powerful. I created and ran online courses before I retired, mostly using Moodle and reams of software I wrote myself.

Courseware like Moodle (and all the commercial versions like Blackboard and Desire to Learn) is good at making automatically graded multiple choice tests and in some cases good at connecting to backend grading databases, but courseware is abysmally bad at making online books as supplementary reading material for online courses. And E-education is where the education industry is heading.

Sigil and online epub, similar to what I've been doing, solves that problem. There is an absolutely enormous untapped market in online education for epub on the web, as the contents of a DIV, so it can be surrounded by requisite courseware navigation.

Mark my words. This market will cascade like a broken dam once it gets going.

Last edited by pittendrigh; 08-19-2018 at 08:41 AM.
Old 08-19-2018, 09:54 AM   #5
DiapDealer
Grand Sorcerer
DiapDealer ought to be getting tired of karma fortunes by now.
Posts: 19,326
Karma: 99454782
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Using an Output Plugin to save the modified content should get you around any potential issues with Mend.

But I'm curious. If you strip out the head portion of an epub's html, don't you lose the ability to style the book's content separately from the website? Or do you resort to exclusively using inline styling for the epub content?

EDIT: never mind. I see now that you're primarily using this for displaying one work on a personal website. Being able to separate the style of the book from the site is probably not an issue in that scenario.

Last edited by DiapDealer; 08-19-2018 at 12:40 PM.
Old 08-19-2018, 11:51 AM   #6
elibrarian
Imperfect Perfectionist
elibrarian ought to be getting tired of karma fortunes by now.
Posts: 164
Karma: 345678
Join Date: Dec 2011
Location: Ølstykke, Denmark
Device: none
Quote:
Originally Posted by pittendrigh View Post
Displaying epub on a website is powerful.
I don't know if it'll suit your purposes, but you might use something like Readium Cloudviewer (the lite version should suffice in most cases, I think): https://github.com/readium/readium-js-viewer/releases to open your epub directly on the web.

Almost without any customisation it'll look something like this: Sample (right-click and choose to open in a new window, or you'll be taken off MobileRead)

regards,

Kim
Old 08-19-2018, 12:46 PM   #7
pittendrigh
Connoisseur
pittendrigh began at the beginning.
 
Posts: 53
Karma: 10
Join Date: Mar 2011
Location: montana
Device: none
Doesn't Readium show ebook pages as a separate entity? What I'm doing is placing the ebook output into a DIV element inside a content management system, so the ebook output is surrounded by a larger navigation context--so the ebook is part of a larger, more complex website. And not separate from it.

I use it on a hobby boat building website now, but it would be even more useful embedded inside courseware like Moodle, so the ebook would be surrounded by links to "Syllabus", "Class Forum", "Class Test Schedule", etc.

Ebook styling is simplistic. It's easy to incorporate fonts, element widths and colors into the surrounding css:

#ebook toc { float: left; max-width: 20%; background: gray; color: black; margin: 0.25em;}

Last edited by pittendrigh; 08-19-2018 at 01:46 PM.
Old 08-19-2018, 02:07 PM   #8
elibrarian
Imperfect Perfectionist
elibrarian ought to be getting tired of karma fortunes by now.
Posts: 164
Karma: 345678
Join Date: Dec 2011
Location: Ølstykke, Denmark
Device: none
Quote:
Originally Posted by pittendrigh View Post
Doesn't Readium show ebook pages as a separate entity? What I'm doing is placing the ebook output into a DIV element inside a content management system, so the ebook output is surrounded by a larger navigation context--so the ebook is part of a larger, more complex website. And not separate from it.
Well, as I said, I wasn't really sure it would fit your concept.

That said, I did experiment with embedding the CloudReader into the pages of my site when I first set it up. It's a couple of years back now, but as far as I recall it wasn't very difficult to set up. I gave up the idea because of the impact a fixed-size window autoloading external content (which the visitor might or might not be interested in) would have on both the loading times and the mobile-friendliness of the pages.

regards,

Kim
Old 08-19-2018, 11:28 PM   #9
slowsmile
Witchman
slowsmile ought to be getting tired of karma fortunes by now.
 
Posts: 435
Karma: 768286
Join Date: May 2013
Location: Philippines
Device: Android S5
@pittendrigh...It wouldn't be difficult to write a Python command line script to strip the head and html tags from each epub file. Just export all the epub xhtml files from Sigil to a directory and then run a Python script from that directory that does something like this (using Python 3.4):

Code:

from bs4 import BeautifulSoup

# get the file list (separated by spaces) as a string from the command prompt
print("\nInput html file names:\n")
files = input()

# convert input string to file list
file_list = files.split()

# remove the html and head tags
for file in file_list:
    outfile = 'new_' + file
    with open(file, 'rt', encoding='utf-8') as infp:
        html = infp.read()
    soup = BeautifulSoup(html, 'html5lib')   # parses like a web browser

    # drop <head> together with everything inside it
    head_tag = soup.find('head')
    if head_tag:
        head_tag.decompose()

    # remove the <html> wrapper itself but keep its children (the <body>)
    html_tag = soup.find('html')
    if html_tag:
        html_tag.unwrap()

    with open(outfile, 'wt', encoding='utf-8') as outfp:
        outfp.write(str(soup))
....Or you could perhaps even do it without using BeautifulSoup like this:

Code:
# get the file list as a string from the command prompt
print("\nInput html file names:\n")
files = input()

# convert input string to file list
file_list = files.split()

# line prefixes to drop; '<head>' and '<head ' are listed separately
# so that <header> elements are not removed by accident
skip_prefixes = ('<html', '</html>', '<head>', '<head ', '</head>')

# remove the html and head tags (only the tag lines, not the head contents)
for file in file_list:
    outfile = 'new_' + file
    with open(file, 'rt', encoding='utf-8') as infp, \
         open(outfile, 'wt', encoding='utf-8') as outfp:
        for line in infp:
            if line.lstrip().startswith(skip_prefixes):
                continue
            outfp.write(line)
You could probably also use Tidy in a script to do exactly the same as the above.

Last edited by slowsmile; 08-19-2018 at 11:38 PM.
Old 08-20-2018, 12:07 AM   #10
KevinH
Wizard
KevinH ought to be getting tired of karma fortunes by now.
Posts: 3,108
Karma: 1931746
Join Date: Nov 2009
Device: many
or simply serialize the contents of the body tag
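
Editor's sketch of one way to read this suggestion: keep only what sits between <body> and </body>. This version uses a regular expression rather than a real parse-and-serialize step, so it assumes each page is well formed and has exactly one body element. The name `body_inner` is hypothetical.

```python
import re

# Capture everything between the opening and closing body tags.
BODY_RE = re.compile(r"<body\b[^>]*>(.*)</body\s*>", re.DOTALL | re.IGNORECASE)

def body_inner(xhtml):
    """Return the markup inside <body>, or the input unchanged if no body is found."""
    match = BODY_RE.search(xhtml)
    return match.group(1).strip() if match else xhtml

page = ('<html><head><title>t</title></head>'
        '<body class="x">\n<p>Hi</p>\n</body></html>')
print(body_inner(page))
```

With BeautifulSoup already in use, `soup.body.decode_contents()` would be the parser-based equivalent.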
Old 08-25-2018, 09:58 AM   #11
pittendrigh
Connoisseur
pittendrigh began at the beginning.
 
Posts: 53
Karma: 10
Join Date: Mar 2011
Location: montana
Device: none
Thank you for that little Python script. I'll play with that. Epub on the server (rather than inside a hand-held device) is a niche market, but it is an important niche market: for creating powerful how-to-do-it tutorials of any kind.

The combination of video AND the written word, packaged together as a planned, organized unit, is the most powerful instructional tool there is--short of a live human being at your side anyway. Extensive embedded video doesn't work well for hand-held readers. Epub on the server is the way to go for how to build a house, how to rebuild an engine, how to learn photography (from the camera through to image processing), how to build a boat, how to organize a political campaign, etc.

Sigil and Epub on the server represent the most powerful instructional tool on the planet.

Actually the best instructional technology involves Sigil, Epub, web servers and Moodle. I spent the last 5 years of my career (before retiring) doing distance learning (developing elementary computer programming classes for Tribal Colleges). Moodle is great, but it (like all other distance learning software, such as Blackboard, Desire to Learn etc) is abysmally bad at creating online resource materials.

Sigil and Epub on the server (because of the text/video combination) are like a super-charger put on top of an old V8 engine.

Last edited by pittendrigh; 08-25-2018 at 10:24 AM.

Tags
embedded, output, sigil

All times are GMT -4.