MobileRead Forums - Can't stop the page breaks

hymie · 08-08-2014, 03:20 PM

Greetings.

So I have an HTML file which, I suspect, was generated from a GNU Texinfo (.texi) file. I don't have the .texi source, just the HTML file.

This HTML file has some unfortunate features. First, all of the h2 tags have the "chapter" class attached to them. Next, all of the chapter / subchapter headings look like this:

Code:

<div class="node">
<a name="Introduction"></a>
<p><hr>
Next:&nbsp;<a rel="next" accesskey="n" href="#Document-Structure">Document Structure</a>,
Previous:&nbsp;<a rel="previous" accesskey="p" href="#Top">Top</a>,
Up:&nbsp;<a rel="up" accesskey="u" href="#Top">Top</a>
</div>
<h2 class="chapter">1 Introduction</h2>

You can see a sample here:
http://orgmode.org/manual/Document-Structure.html
Note that the "a" anchor is above the <hr>.

So I'm trying to use the command-line ebook-convert program to convert this file to an epub. But try as I might, I cannot prevent the h2/chapter tag from creating a page break in my epub. I've tried:

Code:

--chapter=/
--chapter-mark=none
--dont-split-on-page-breaks
--sr1-search='<h2 class="chapter">' --sr1-replace='<h2>'
--chapter=//h:div\[@class=\"node\"\]

but nothing seems to prevent the page breaks from appearing.

I'm running out of ideas. My last option is to edit the file and try to adjust it, but I'm hoping to automate the process.
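
For reference, the kind of automated edit I have in mind would look roughly like the Python sketch below. It only strips the "chapter" class from the h2 tags and drops the <p><hr> inside each "node" div, based on the heading markup shown above. The function name strip_chapter_markup and the input/output file arguments are made up for illustration, it assumes the node divs never contain nested divs, and I can't say whether this would actually stop the page breaks.

```python
import re
import sys

# Patterns are based only on the heading markup quoted in this post.
H2_CHAPTER = re.compile(r'<h2\s+class="chapter"\s*>', re.IGNORECASE)
NODE_DIV = re.compile(r'<div class="node">.*?</div>', re.IGNORECASE | re.DOTALL)
HR_IN_NODE = re.compile(r'<p>\s*<hr\s*/?>', re.IGNORECASE)


def strip_chapter_markup(html: str) -> str:
    """Return html with the h2 "chapter" class and the node-div rules removed.

    Hypothetical helper; not part of calibre or ebook-convert.
    """
    def clean_node(match: re.Match) -> str:
        # Remove only the <p><hr> pair; the anchor and nav links stay.
        return HR_IN_NODE.sub("", match.group(0))

    html = NODE_DIV.sub(clean_node, html)
    return H2_CHAPTER.sub("<h2>", html)


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (file names are placeholders): python clean.py in.html out.html
    with open(sys.argv[1], encoding="utf-8") as src:
        cleaned = strip_chapter_markup(src.read())
    with open(sys.argv[2], "w", encoding="utf-8") as dst:
        dst.write(cleaned)
```

The cleaned file would then be fed to ebook-convert in place of the original.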

Is there something I'm missing?
