Old 07-14-2020, 12:33 PM   #1
andi1235
Member
andi1235 began at the beginning.
 
Posts: 15
Karma: 10
Join Date: Nov 2012
Device: none
Question about converting epub to .docx or PDF

Hi all,

I have an epub file that contains references to original print page numbers in the format <span epub:type="pagebreak" role="doc-pagebreak" id="page591" aria-label="591" title="591" class="calibre"></span> (591 being the example page number from this book). I would like a page break added before each original page, when converting to either .docx or PDF.

I tried to set up an XPath expression to add a page break before each of these span tags. However, the resulting file doesn't end up with any page breaks inserted. The XPath expression I used is //h:span[@epub:type="pagebreak"]

I'm not sure if this is a result of an error I'm making in my XPath code, or if this is not how the pagebreak insertion feature is supposed to work, or if there's another problem I'm not aware of. Please help!

I'll attach a couple pages from the epub I've been working with. Thanks in advance!
Attached Files
File Type: epub SAMPLE.epub (8.49 MB, 162 views)
Old 07-14-2020, 10:15 PM   #2
kovidgoyal
creator of calibre
Posts: 45,356
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
XPath doesn't know about the epub namespace. Use the role attribute instead.
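
To illustrate the point about the namespace prefix, here is a minimal sketch. It uses Python's standard-library ElementTree rather than calibre's own XPath engine, so it only shows the general mechanism and is not a reproduction of what calibre does internally. The sample markup is a made-up one-paragraph document built around the span from the first post, and the helper name `find_ids` is hypothetical. With only the `h` prefix bound, the `epub:type` expression cannot be resolved, while the role-based expression matches.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
EPUB_NS = "http://www.idpf.org/2007/ops"

# Hypothetical sample: one paragraph containing the page-break span from the question.
SAMPLE = f"""<html xmlns="{XHTML_NS}" xmlns:epub="{EPUB_NS}">
  <body>
    <p>End of page 590.<span epub:type="pagebreak" role="doc-pagebreak"
       id="page591" aria-label="591" title="591" class="calibre"></span>Start of page 591.</p>
  </body>
</html>"""

root = ET.fromstring(SAMPLE)

# Only the "h" prefix is bound here; "epub" is deliberately left out.
only_h = {"h": XHTML_NS}


def find_ids(expr, namespaces):
    """Return the id attributes of all elements matching expr."""
    return [el.get("id") for el in root.findall(expr, namespaces)]


# ElementTree needs a relative path, hence ".//" instead of "//".
try:
    find_ids('.//h:span[@epub:type="pagebreak"]', only_h)
    epub_prefix_resolved = True
except SyntaxError:
    # ElementTree reports an unbound prefix as a SyntaxError.
    epub_prefix_resolved = False

# The role attribute has no namespace prefix, so this expression resolves.
role_matches = find_ids('.//h:span[@role="doc-pagebreak"]', only_h)

print("epub prefix resolved:", epub_prefix_resolved)
print("matches via role:", role_matches)
```

Other XPath engines may react differently to an unknown prefix (for example by raising a different error), so treat the exception type here as specific to ElementTree.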
Old 07-15-2020, 11:43 AM   #3
andi1235
Thank you! I'll try that.
Old 07-15-2020, 11:54 AM   #4
andi1235
Quote:
Originally Posted by kovidgoyal
XPath doesn't know about epub namespace. Use the role instead

So, I tried using the role, but that didn't work either. There were no errors; it just didn't add page breaks. I used the XPath "generator" wizard to create the expression. Does //h:span[@role="doc-pagebreak"] look like it should work, or am I misunderstanding something?

Thanks again for your help!
Old 07-15-2020, 12:01 PM   #5
andi1235
As a baseline test, I ran a conversion telling it to add a page break before EVERY span tag (my XPath expression was //h:span). No page breaks were added in my .docx or PDF output files.
Old 07-15-2020, 12:03 PM   #6
kovidgoyal
Where are you adding this option? See https://manual.calibre-ebook.com/con...for-conversion
Old 07-15-2020, 12:20 PM   #7
andi1235
In the "Structure Detection" section, "Insert page breaks Before (XPath Expression)". Is that the wrong place?
Old 07-15-2020, 12:21 PM   #8
andi1235
Oh, and I'm adding it every time I try to convert an epub -- I haven't tried with bulk conversion or setting it beforehand yet.
Old 07-15-2020, 12:26 PM   #9
andi1235
Here's my conversion log from the last time I tried this, just trying to add a pagebreak before each span tag. I was converting epub to .docx that time.
Attached Files
File Type: txt log.txt (13.8 KB, 140 views)
Old 07-16-2020, 03:05 AM   #10
kovidgoyal
Do an EPUB to EPUB conversion; that will split up the HTML files at your page breaks, and then they should work when converting to PDF and DOCX as well. CSS-based page breaks *should* work with direct conversions too, though; I'll have to look into that.
Old 07-16-2020, 02:15 PM   #11
andi1235
I'll give that a try, thank you!
Old 07-16-2020, 02:40 PM   #12
andi1235
I seem to be having mixed results. It appeared to work the first time I tried it, but on subsequent attempts, although I am successfully getting the split xhtml files in the epub, using the same detection method to create page breaks in the .docx file (or PDF) doesn't seem to work. Is there a different way to tell the program to break up the document at every new xhtml file?
Old 07-16-2020, 02:41 PM   #13
andi1235
I think the first time I tried it I accidentally used a different XPath expression that searched for the word "pagebreak" instead of looking at the role. I'm not sure, though. Would that make a difference?
Old 07-16-2020, 09:58 PM   #14
kovidgoyal
You convert to EPUB ONCE, then you delete the original_epub file and use the new EPUB to convert to other formats.
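
For anyone who prefers the command line, the two-pass workflow described above could be sketched roughly as follows. This is a minimal sketch under some assumptions: it uses calibre's `ebook-convert` tool with its `--page-breaks-before` option and writes the intermediate EPUB to a separate file, so the step of deleting the original_epub format in the GUI does not apply here. The file names and the helper names `build_commands` and `run_workflow` are hypothetical. Leaving the XPath option off the second pass is also an assumption, based on the advice that the already-split EPUB should then convert correctly.

```python
import shutil
import subprocess

# XPath from the thread: match the page-break spans by their role attribute.
PAGEBREAK_XPATH = '//h:span[@role="doc-pagebreak"]'


def build_commands(source_epub, split_epub, final_output):
    """Return the two ebook-convert invocations as argument lists."""
    return [
        # Pass 1: EPUB to EPUB, inserting page breaks before the matched spans
        # so the HTML files get split at those points.
        ["ebook-convert", source_epub, split_epub,
         "--page-breaks-before", PAGEBREAK_XPATH],
        # Pass 2: convert the already-split EPUB to the target format
        # (assumption: no extra XPath option is needed here).
        ["ebook-convert", split_epub, final_output],
    ]


def run_workflow(source_epub, split_epub, final_output):
    """Run both passes in order, stopping at the first failure."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert was not found on PATH")
    for cmd in build_commands(source_epub, split_epub, final_output):
        subprocess.run(cmd, check=True)
```

Calling `run_workflow("book.epub", "book_split.epub", "book.docx")` would run both passes. Passing the arguments as a list avoids shell quoting problems with the quotes and brackets in the XPath expression.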
Old 07-21-2020, 01:16 PM   #15
andi1235
Thank you! The conversion is working now. I appreciate all your help!