Old 01-20-2009, 09:55 PM   #1
Bierkonig
Member
Posts: 22
Karma: 10
Join Date: Dec 2008
Device: Sony PRS-700
Any way to force page breaks when converting HTML to EPUB

I am new to this and thank you in advance for any patient explanations.

Reading the forums, I know that there's a raging debate about whether we need the page anymore with ebooks. Some celebrate that we can liberate text from the page and need maintain only those formatting elements necessary to understand how words, sections, and headers relate to each other. In essence, the book becomes an electronic scroll. However, a few of us believe that the innovation of the page-based codex, which began replacing the scroll, makes finding information within the text more efficient. Specifically, the codex makes communication about specific content with other readers easier, and I've seen several posts by academics here saying they need to refer back to page numbers when communicating with non-ebook readers. I'm in this second camp.

I'm scanning pages primarily of text (and a few tables and pictures) to Abbyy FineReader and saving its OCR output as HTML. The HTML output looks great on my Sony PRS-700 when I use Calibre to convert it to ePUB. However, it would make me so happy if there were a way to force the reader to paginate according to breaks in the HTML rather than...arbitrarily. I have no idea how the reader manages pagination of the text. I know that it's possible to insert a page break in an RTF and the Reader will break the page accordingly for a Calibre conversion to ePub.

Is there any way to use Calibre to tell the Reader to break pages at <hr>, and nowhere else? As it is, the Reader on average turns 10 pages of epub -- each ending with an <hr>, marking an intended page break -- into 11 or 12 pages. If anyone has other ideas for editing my HTML so that the Reader respects my pagination, I'm all ears.

The only alternative I know to maintain pagination is to use pdf reflow, but the results are much less attractive than html/epub.

thanks.
Old 01-20-2009, 10:10 PM   #2
kovidgoyal
creator of calibre
Posts: 43,826
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
You can force page breaks at (almost) any point in the HTML, but you cannot prevent page breaks from happening if the content between two of your forced page breaks is longer than the screen length. If you think about it for a minute, you'll understand why.

To force page breaks at <hr> tags use

Code:
<style type="text/css">
hr {page-break-after:always;}
</style>
in the <head> of your HTML.
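For a batch of OCR'd files, adding that block by hand gets tedious. Here is a minimal Python sketch, assuming each file already has a closing </head> tag; the function name add_hr_page_breaks is made up for illustration and is not part of calibre.

```python
import re

# The style block from the post above, unchanged.
PAGE_BREAK_STYLE = (
    '<style type="text/css">\n'
    'hr {page-break-after:always;}\n'
    '</style>\n'
)


def add_hr_page_breaks(html: str) -> str:
    """Insert the style block just before the closing </head> tag.

    Assumption: the document has a </head> tag. If it does not,
    the function raises instead of guessing where the head ends.
    """
    match = re.search(r"</head\s*>", html, flags=re.IGNORECASE)
    if match is None:
        raise ValueError("no </head> tag found")
    return html[:match.start()] + PAGE_BREAK_STYLE + html[match.start():]


if __name__ == "__main__":
    sample = (
        "<html><head><title>t</title></head>"
        "<body><p>a</p><hr><p>b</p></body></html>"
    )
    print(add_hr_page_breaks(sample))
```

Run it over the FineReader output before handing the files to calibre. It only adds the forced breaks; it cannot stop the extra breaks described above.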
Old 01-20-2009, 10:16 PM   #3
Bierkonig
Member
Thanks for the quick reply. And yes, that does make some sense. Can you give me a basic understanding of what the "screen length" is?

Also, are there any ways -- pre-calibre conversion, or within calibre -- to force a longer screen length or reformat text to fit within this prescribed screen length?

All of my ePubs end up about 15% longer in page count than the originals I feed in (roughly 115% of the original length). It's driving me nuts.
Old 01-20-2009, 10:46 PM   #4
kovidgoyal
creator of calibre
In a reflowable format, there is a logical page, i.e. the contents of the document between two hard (forced) page breaks. This logical page will be split into any number of physical pages, depending on the size of the screen of the device used to view the file as well as the font size being used.

lines per screen = number of pixels in the physical screen in the vertical direction/number of pixels per line in the vertical direction

You can't change the numerator. You can change the denominator by changing the font size, but since the reader can also change font sizes, your setting will only make sense at one size.
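To make the arithmetic concrete, here is a small Python sketch of the formula above. The numbers are assumptions chosen for illustration (an 800-pixel-tall screen, line heights of 20 and 16 pixels, a 44-line logical page); none of them come from a measurement of a real device.

```python
import math


def lines_per_screen(screen_height_px: int, line_height_px: int) -> int:
    """Whole lines that fit on one physical screen."""
    if line_height_px <= 0:
        raise ValueError("line height must be positive")
    return screen_height_px // line_height_px


def physical_pages(logical_page_lines: int, screen_height_px: int,
                   line_height_px: int) -> int:
    """Screens needed to show one logical page (text between two forced breaks)."""
    per_screen = lines_per_screen(screen_height_px, line_height_px)
    if per_screen == 0:
        raise ValueError("a single line is taller than the screen")
    return math.ceil(logical_page_lines / per_screen)


if __name__ == "__main__":
    # Hypothetical values: 800 px screen, 44-line logical page.
    print(lines_per_screen(800, 20))     # 40
    print(physical_pages(44, 800, 20))   # 2: the logical page spills onto a second screen
    print(physical_pages(44, 800, 16))   # 1: a smaller font fits it on one screen
```

This is why a scanned page that is only slightly longer than one screen comes out as two pages on the device, and why the count changes as soon as the font size does.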
Old 01-20-2009, 11:01 PM   #5
llasram
Reticulator of Tharn
Posts: 618
Karma: 400000
Join Date: Jan 2007
Location: EST
Device: Sony PRS-505
Quote:
Originally Posted by kovidgoyal View Post
lines per screen = number of pixels in the physical screen in the vertical direction/number of pixels per line in the vertical direction
I think the OP is talking about the page numbers AdobeDE uses to delimit the text rather than pages as "screens full of text".

If so, check out Adobe's "EPUB Best Practices Guide," most recent version available in EPUB format from their Digital Publishing Technology website. It's the one place I've seen discussed Adobe's EPUB-extension "page map" facility, which lets you provide an explicit mapping of where AdobeDE determines numbered page boundaries to be.
Old 01-20-2009, 11:22 PM   #6
llasram
Reticulator of Tharn
Quote:
Originally Posted by llasram View Post
It's the one place I've seen discussed Adobe's EPUB-extension "page map" facility, which lets you provide an explicit mapping of where AdobeDE determines numbered page boundaries to be.
Holy hell. I just realized -- and tested to confirm -- that this facility can actually be used to completely remove the marginal page numbers. The specifics are a bit trickier than one might desire, but it's doable.
Old 01-20-2009, 11:26 PM   #7
kovidgoyal
creator of calibre
Quote:
Originally Posted by llasram View Post
Holy hell. I just realized -- and tested to confirm -- that this facility can actually be used to completely remove the marginal page numbers. The specifics are a bit trickier than one might desire, but it's doable.
Wow, break out the champagne. I'm guessing you just map the entire book to a single page?
Old 01-20-2009, 11:39 PM   #8
llasram
Reticulator of Tharn
Quote:
Originally Posted by kovidgoyal View Post
Wow, break out the champagne. I'm guessing you just map the entire book to a single page?
Well, that's the tricky part... It seems that if a page-map is present, ADE won't display any flows which don't have any pages associated with them. So there has to be at least one page per file, but they can all have name="" with no problems. But then doing that means that you only have as many pages as you have flows, both for the purposes of the "x of y" status bar and using the number buttons for page-wise navigation. Kind of weird. So maybe the solution is to duplicate the default "1024 bytes == 1 page" manually, but with all the page names set to blank?
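A rough Python sketch of the "one blank-named page per flow" variant described above. Treat the XML details as assumptions: the thread only mentions name="", and the page-map/page element names, the href attribute, and the OPF namespace are how I recall Adobe's extension, so check them against the Best Practices Guide. The file names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for Adobe's page-map extension (not confirmed in this thread).
OPF_NS = "http://www.idpf.org/2007/opf"


def build_blank_page_map(flow_hrefs):
    """Return page-map XML with one blank-named page per content file (flow)."""
    root = ET.Element("page-map", {"xmlns": OPF_NS})
    for href in flow_hrefs:
        # name="" is the trick from the post: the page exists, so the flow
        # is displayed, but there is no label to show in the margin.
        ET.SubElement(root, "page", {"name": "", "href": href})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical file names for a two-file book.
    print(build_blank_page_map(["chapter1.xhtml", "chapter2.xhtml"]))
```

As noted above, this leaves the book with only as many pages as it has flows. Emulating the default 1024-byte pages would mean adding an anchor roughly every kilobyte in each file and one blank-named entry per anchor.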
Old 01-20-2009, 11:49 PM   #9
kovidgoyal
creator of calibre
Quote:
Originally Posted by llasram View Post
Well, that's the tricky part... It seems that if a page-map is present, ADE won't display any flows which don't have any pages associated with them. So there has to be at least one page per file, but they can all have name="" with no problems. But then doing that means that you only have as many pages as you have flows, both for the purposes of the "x of y" status bar and using the number buttons for page-wise navigation. Kind of weird. So maybe the solution is to duplicate the default "1024 bytes == 1 page" manually, but with all the page names set to blank?
Yeah, I guess that's the best solution, but is it really worth the effort?
Old 01-21-2009, 03:25 AM   #10
mtravellerh
book creator
Posts: 9,635
Karma: 3856660
Join Date: Oct 2008
Location: Luxembourg
Device: PB360°
Quote:
Originally Posted by kovidgoyal View Post
Yeah, I guess that's the best solution, but is it really worth the effort?
Well, I HATE those ADE page numbers forcing you to make at least a 15 px margin on the right side that looks WAY too big in WebKit-based readers. I would personally kiss llasram's feet if he had found the solution to get rid of those, and I do not say such things lightly.
Old 01-21-2009, 05:01 AM   #11
kovidgoyal
creator of calibre
Quote:
Originally Posted by mtravellerh View Post
Well, I HATE those ADE page numbers forcing you to make at least a 15 px margin on the right side that looks WAY too big in Webkit- based readers. I would personally kiss llasram's feet if he had found the solution to get rid of those and I do not say such things lightly.
You realize that WebKit-based renderers support JavaScript while ADE does not, so you can just have a little JavaScript in your files to reset the margin.
Old 01-21-2009, 07:26 AM   #12
llasram
Reticulator of Tharn
Quote:
Originally Posted by kovidgoyal View Post
Yeah, I guess that's the best solution, but is it really worth the effort?
I kind of like the page numbers myself, but it sounds like it might be worth it to mtravellerh.

Quote:
Originally Posted by mtravellerh View Post
Well, I HATE those ADE page numbers forcing you to make at least a 15 px margin on the right side that looks WAY too big in Webkit- based readers.
Is it any better to use points? LCD displays have a lower DPI than e-ink, so the same size in points should come out to a smaller number of pixels.

Quote:
Originally Posted by kovidgoyal View Post
You realize that webkit based renderers support javascript while ADE does not, so you can just have a little javascript in your files to reset the margin
I'd be kind of leery of that myself... The OPS spec says that reader systems "should not" execute <script/>s, so using scripting to get essentially default behavior seems like a bad idea in the long run.
Old 01-21-2009, 12:00 PM   #13
akash
Junior Member
Posts: 3
Karma: 10
Join Date: Jan 2009
Device: none
scanning pages primarily

I'm scanning pages primarily of text (and a few tables and pictures) to Abbyy FineReader and saving its OCR output as HTML. The HTML output looks great on my Sony PRS-700 when I use Calibre to convert it to ePUB. However, it would make me so happy if there were a way to force the reader to paginate according to breaks in the HTML rather than...arbitrarily. I have no idea how the reader manages pagination of the text. I know that it's possible to insert a page break in an RTF and the Reader will break the page accordingly for a Calibre conversion to ePub.
Old 01-21-2009, 12:36 PM   #14
kovidgoyal
creator of calibre
Quote:
Originally Posted by llasram View Post
I'd be kind of leery of that myself... The OPS spec says that reader systems "should not" execute <script/>s, so using scripting to get essentially default behavior seems like a bad idea in the long run.
I suspect that's one part of the OPS spec that's going to change. It's rather ridiculous not to support JavaScript, and in a few years, when portable devices are powerful enough to handle JavaScript, it will make absolutely no sense.
Old 01-21-2009, 12:59 PM   #15
mtravellerh
book creator
Quote:
Originally Posted by akash View Post
I'm scanning pages primarily of text (and a few tables and pictures) to Abbyy FineReader and saving its OCR output as HTML. The HTML output looks great on my Sony PRS-700 when I use Calibre to convert it to ePUB. However, it would make me so happy if there was a way to force the reader to paginate according to breaks in the HTML rather than...arbitrarily. I have no idea how the reader manages pagination of the text. I know that its possible to insert a page break in an RTF and the Reader will break the page accordingly for a Calibre conversion to ePub.
The simplest way is to add a page break manually in source view by inserting <div style="page-break-before:always;"></div> at the point where the page should break. You can apply the same style to horizontal rules, or even to header and paragraph tags.

You can also assign a page break to headers or paragraphs by creating an embedded or external CSS stylesheet that uses the same property, which spares you from typing it every time. Is that what you wanted to know?
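If the OCR output already marks each scanned page with an <hr>, the manual step can be scripted. This is a minimal Python sketch of one way to do it, not something calibre does for you; swapping each <hr> for the div is my own variation on the advice above, and the function name is made up for illustration.

```python
import re

# The div from the post above, used as the replacement for each rule.
PAGE_BREAK_DIV = '<div style="page-break-before:always;"></div>'

# Matches <hr>, <hr/>, <hr /> and <HR size="1">, but not tags such as <hrx>.
HR_TAG = re.compile(r"<hr\b[^>]*>", re.IGNORECASE)


def replace_hr_with_breaks(html: str) -> str:
    """Swap every horizontal rule for an empty div that forces a page break."""
    return HR_TAG.sub(PAGE_BREAK_DIV, html)


if __name__ == "__main__":
    sample = "<p>end of page 1</p><hr><p>start of page 2</p>"
    print(replace_hr_with_breaks(sample))
```

The visible rule disappears and only the break remains. To keep the rule, leave the <hr> tags alone and give them the page-break style through a stylesheet instead.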