Old 09-08-2020, 06:54 AM   #1
paperbackebook
Member
Posts: 11
Karma: 10
Join Date: Sep 2020
Device: none
Converting from epub to pdf results in dollar signs

Hi,

I've been trying to convert an ebook in epub format to PDF so I can print some parts of it.

I tried the Adobe ebook software, but although it lets me print, it either crashes, starts printing empty pages, or prints only the first page.

The file is not print-protected, and I can open it properly in Calibre.
But the conversion takes ages, and every page of the resulting PDF has a whole bunch of dollar signs on it. They look like links, but nothing happens when I click them.
There are e.g. 20 lines with 15 dollar signs each.
This seems to repeat on all pages.

Any idea where these are coming from, or how I can get rid of them? The content still seems a bit borked, but I hope that will be resolved once the dollar signs are removed.

If there are other methods please let me know.

Thanks a lot in advance (this has been quite frustrating)!
Old 09-08-2020, 12:36 PM   #2
Quoth
Still reading
Posts: 13,746
Karma: 103847703
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper
Use Calibre to convert to RTF.
Load RTF into LO Writer or MS Word and set a sensible page size, margins and fix styles.

Then print.

Only make a PDF if wanting to do POD or fixed layout on a Tablet. Word & Writer can both make PDFs. Embed fonts if printing on a different computer.

Basically there is no need for epub to PDF ever, unless you own a Sony Digital Paper. Even then Calibre -> RTF -> Wordprocessor & Fix -> Print to Digital paper
Old 09-08-2020, 12:45 PM   #3
paperbackebook
Quote:
Originally Posted by Quoth
Use Calibre to convert to RTF.
Load RTF into LO Writer or MS Word and set a sensible page size, margins and fix styles.

Then print.

Only make a PDF if wanting to do POD or fixed layout on a Tablet. Word & Writer can both make PDFs. Embed fonts if printing on a different computer.

Basically there is no need for epub to PDF ever, unless you own a Sony Digital Paper. Even then Calibre -> RTF -> Wordprocessor & Fix -> Print to Digital paper

Thanks for the suggestion.
I'll try that.

What's odd is that when I open the xhtml files, everything renders correctly. Then, when I print to PDF, the content is borked.
I have now also tried opening those files in the browser and printing from there. Again the content is borked, but in a different way.
So why the *** is it impossible to just print what is rendered...
Old 09-08-2020, 01:05 PM   #4
paperbackebook
And when I print from Firefox, the layout is good, but it prints 13 pages: 1 page with the contents and then 12 with only headers and footers. After disabling headers and footers, those become 12 empty pages :s What a sh**show.
Incredible that we could put a man on the moon over 50 years ago but still can't get a normal print :s
Old 09-08-2020, 01:49 PM   #5
paperbackebook
In RTF it's actually worse. The text is parsed correctly, but the layout is completely gone.
Old 09-08-2020, 01:52 PM   #6
Deskisamess
Wizard
Posts: 2,742
Karma: 45300001
Join Date: Sep 2012
Location: Ohio
Device: iPhone 13 Pro, iPad mini, iPad Pro 12.9",Paperwhite 6.8", Scribe 2022
Quote:
Originally Posted by paperbackebook
In RTF it's actually worse. The text is parsed correctly, but the layout is completely gone.

Can't you fix the layout once the file is in Word? Or at least fix the parts you want to print?

How about doing simple screen grabs of the bits you want to print?
Old 09-08-2020, 03:41 PM   #7
JSWolf
Resident Curmudgeon
Posts: 79,247
Karma: 145488788
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Convert to HTMLZ, then unzip the contents and load the index.html file into Word. Then print the bits you want.

Another solution is to unzip the ePub and load the HTML files you want to print from into Word.
Old 09-08-2020, 05:26 PM   #8
BetterRed
null operator (he/him)
Posts: 21,640
Karma: 29710510
Join Date: Mar 2012
Location: Sydney Australia
Device: none
@paperbackebook - Why convert the PDF to anything? PDF is the worst format to convert to EPUB or almost anything else.

Have you tried a PDF viewer other than Acrobat? Not a browser-based one like Firefox or Edge; they're half-baked. I switched to PDF-XChange because of Acrobat frustrations; this is its print dialogue.

Attachment 181835 (Annotation 2020-09-09 065631.jpg)

Or, if you have a recent edition of Word (2013 or later I think) try opening the PDF directly in it and print from there. It can produce surprisingly good results for some PDFs. On large PDFs it may run for a while (as in 10-20 minutes) and then fail, gracefully though.

IMO, conversion of PDF to HTML, or its derivatives, should only be done when all else fails.

BR

Last edited by BetterRed; 09-08-2020 at 07:30 PM. Reason: clarity
Old 09-09-2020, 03:23 AM   #9
paperbackebook
Quote:
Originally Posted by Deskisamess
Can't you fix the layout once the file is in Word? Or at least fix the parts you want to print?

How about doing simple screen grabs of the bits you want to print?

Some of the issues are "minor", like the text suddenly switching to a smaller font.
Sometimes a sentence goes from normal text to what looks like words in subscript and superscript.
In other places the "margin" between two consecutive sentences is suddenly too small, so they partly overlap.

There are 230 pages of these. So even if I only fix the easy ones, it'll still cost me a big chunk of time.

In other cases, 2 sentences are squashed together: the characters of one sentence are merged with those of the other, and the result is gibberish. In that case I would have to go back to the ebook and retype those sentences. From a software point of view I don't get how this happens. The sentences are read from the xhtml, where they are clearly separate. So how Word manages to completely mangle those sentences is beyond me.

I appreciate the suggestions but this is not workable.
Old 09-09-2020, 03:32 AM   #10
paperbackebook
Quote:
Originally Posted by JSWolf
Convert to HTMLZ, then unzip the contents and load the index.html file into Word. Then print the bits you want.

Another solution is to unzip the ePub and load the HTML files you want to print from into Word.

I did indeed do something like this yesterday, but not using Word. xhtml is html, so the browser should be able to handle it properly.

So one approach that works, although it is time-consuming, is:

- extract the epub file
- change the printer settings in Firefox so it does not print headers and footers
- open the xhtml file in Firefox (Firefox only: Edge, IE and Chrome mess up printing a simple file that they render correctly)
- select "pages" so it only prints the first page instead of an extra 12 empty pages
- print
- repeat a gazillion times
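
The "extract the epub file" step can be scripted, since an EPUB is a ZIP archive. Below is a minimal sketch in Python using only the standard library; the function name `extract_epub` is made up for illustration, and sorting the pages by path is an assumption that only roughly approximates the real reading order.

```python
import zipfile
from pathlib import Path


def extract_epub(epub_path, out_dir):
    """Unpack an EPUB (a ZIP archive) and return its XHTML/HTML content files."""
    out = Path(out_dir)
    with zipfile.ZipFile(epub_path) as archive:
        # Extract everything so stylesheets, fonts and images stay next to the pages.
        archive.extractall(out)
        names = archive.namelist()
    pages = [out / name for name in names
             if name.lower().endswith((".xhtml", ".html", ".htm"))]
    # Assumption: sorting by path is only a rough stand-in for the reading order,
    # which is really defined by the spine in the book's .opf file.
    return sorted(pages)
```

Calling `extract_epub("book.epub", "book_unpacked")` (both names are placeholders) returns the list of page files, which can then be opened in Firefox one by one as described above.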
Old 09-09-2020, 03:38 AM   #11
paperbackebook
Quote:
Originally Posted by BetterRed
@paperbackebook - Why convert the PDF to anything? PDF is the worst format to convert to EPUB or almost anything else.

Have you tried a PDF viewer other than Acrobat? Not a browser-based one like Firefox or Edge; they're half-baked. I switched to PDF-XChange because of Acrobat frustrations; this is its print dialogue.

Attachment 181835

Or, if you have a recent edition of Word (2013 or later I think) try opening the PDF directly in it and print from there. It can produce surprisingly good results for some PDFs. On large PDFs it may run for a while (as in 10-20 minutes) and then fail, gracefully though.

IMO, conversion of PDF to HTML, or its derivatives, should only be done when all else fails.

BR

You've got it the other way around: I want to print an epub document.
Apparently Adobe Digital Editions, which opens the epub, can't print it properly. So the idea was to convert the epub to PDF and then print that.

It's just bizarre that any browser or epub viewer can render those pages properly, yet as soon as I print, they suddenly forget how to render the page and completely screw it up.

Also, the conversion takes ages. The only thing that worked somewhat quickly, with mediocre results, was the PDF Candy software.
I also had to switch to the 64-bit version of Calibre, since it ran out of memory.

So if anyone knows of a decent xhtml/epub viewer that can simply print what is rendered on screen, that would be great.
Old 09-09-2020, 04:48 AM   #12
JSWolf
Quote:
Originally Posted by paperbackebook
I did indeed do something like this yesterday, but not using Word. xhtml is html, so the browser should be able to handle it properly.

So one approach that works, although it is time-consuming, is:

- extract the epub file
- change the printer settings in Firefox so it does not print headers and footers
- open the xhtml file in Firefox (Firefox only: Edge, IE and Chrome mess up printing a simple file that they render correctly)
- select "pages" so it only prints the first page instead of an extra 12 empty pages
- print
- repeat a gazillion times

That is also a good way to do it so it prints properly.
Old 09-09-2020, 05:20 AM   #13
paperbackebook
Quote:
Originally Posted by JSWolf
That is also a good way to do it so it prints properly.

I wouldn't call that a good way, more like desperation kicking in.

I had hoped there would be a simpler way, since every reader and browser can render it correctly.
Just a simple print button or Ctrl-P that prints what is rendered in Calibre (without PDF conversion, ...) would probably already do the trick (although I know it's probably a lot more complex than that).

Instead, what it does is create a PDF, which takes an hour or more, and then the content is mangled.

Sorry for the rants; I'm just trying to wrap my head around this decades-long printing battle and how it's still not won.
Old 09-09-2020, 09:05 AM   #14
paperbackebook
Continuing this saga...

I have looked into the xhtml and made a small reproducer.

What I notice is that every word in the book has a separate span tag with absolute positioning. Is this accepted as normal in the ebook world? It looks ridiculous to me.

When I look at the positions in those span tags, I see large pixel values. Example:
style="position:absolute;top:6113.53px;left:4640px;letter-spacing:-1.29px;"

The decimal pixel values feel ridiculous tbh. Consecutive lines are 340px apart (I removed the other lines). Yet some lines render properly and others seem to be displaced.
When looking at the page, the words and lines are normally spaced, so nowhere near a 340px difference between lines.
However, the catch is this general div style:
transform: scale(0.05)
And the font size is 300px. Yes, 300px.

So it seems that something goes wrong with this scaling when printing. It's still weird that it would render properly on screen and then get completely messed up in print, though.
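
The large numbers look less strange once the scale factor is applied to them. A quick arithmetic sketch in Python, using only the values quoted in this post (the helper name `rendered_px` is made up for illustration):

```python
SCALE = 0.05  # from the post: transform: scale(0.05)


def rendered_px(authored_px, scale=SCALE):
    """Length as painted on screen once the CSS scale factor is applied."""
    return round(authored_px * scale, 4)


line_gap = rendered_px(340)   # distance between lines in the markup -> 17.0
font_size = rendered_px(300)  # font size in the markup -> 15.0
print(line_gap, font_size)    # 17.0 15.0
```

So the markup describes an ordinary page with roughly 15px text on 17px lines, just authored at 20 times the size. A CSS transform changes how a box is painted without changing the space it occupies in layout, which might be related to the extra blank pages when printing, but that is only a guess.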
Old 09-09-2020, 09:12 AM   #15
paperbackebook
I was wondering how it would react to the scale being removed, so I did the following test:

- Divided the font size by 20 (since the scale is 0.05)
- Removed the scale part
- Divided the px values in each span tag by 20 (2 span tags in my test case)

Guess what... the text is shown properly in the print dialog.
Why they use this transform, I have no idea.
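
For more than a couple of span tags, the same test could be automated. The following is a rough sketch in Python, not a tested fix for the whole book: the function names are made up, it only handles plain `px` lengths in a CSS declaration string, and it assumes the scale is exactly 0.05 (hence the division by 20). Whether the font size and transform sit inline or in a stylesheet in the real book is not stated, so the strings here are illustrative.

```python
import re

PX_LENGTH = re.compile(r"(-?\d+(?:\.\d+)?)px")
SCALE_TRANSFORM = re.compile(r"transform\s*:\s*scale\([^)]*\)\s*;?\s*")


def shrink_px(style, factor=20):
    """Divide every px length in a CSS declaration string by `factor`."""
    def repl(match):
        value = float(match.group(1)) / factor
        # Keep up to 4 decimals and drop trailing zeros.
        text = f"{value:.4f}".rstrip("0").rstrip(".")
        return text + "px"
    return PX_LENGTH.sub(repl, style)


def drop_scale(style):
    """Remove a `transform: scale(...)` declaration from a CSS declaration string."""
    return SCALE_TRANSFORM.sub("", style).strip()


span_style = "position:absolute;top:6113.53px;left:4640px;letter-spacing:-1.29px;"
print(shrink_px(span_style))
# position:absolute;top:305.6765px;left:232px;letter-spacing:-0.0645px;
```

The div style would go through both helpers, e.g. `shrink_px(drop_scale("transform: scale(0.05); font-size:300px;"))`, so that the scale is removed and the font size becomes 15px.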
All times are GMT -4.