Old 07-01-2013, 07:59 AM   #1
Ctipi
Junior Member
Posts: 3
Karma: 10
Join Date: Jul 2013
Device: Calibre
Problems with images pdf/epub

Hello,
I have a .pdf file with text and images (JPEG). When I convert it to .epub with Calibre, the images in the .epub are cut into many strips, separated by empty (white) lines. If I open the .epub in Sigil, the software sees each strip as an independent image. Each of these "independent images" (which is only a small part of the whole one) has a different size (566x73 px, 566x133 px).

What happened during the conversion?
What can I do?
I know that it's recommended not to work with PDF files. But which file format can I use to get from a .pub file (with images!) to an .epub?

Thank you so much,

Camilla
Old 07-01-2013, 08:47 AM   #2
dwig
Wizard
Posts: 1,613
Karma: 6718541
Join Date: Dec 2004
Location: Paradise (Key West, FL)
Device: Current:Surface Go & Kindle 3 - Retired: DellV8p, Clie UX50, ...
The original image (the one that existed BEFORE the document was converted to a PDF) was probably sliced during the conversion to PDF and, as a result, existed in the PDF as an array of separate images. There is nothing that can be done during the PDF > ePub conversion to reassemble the pieces.

Your only hope is to manually rebuild a replacement image from the pieces using an image editor (Photoshop, ...). You can extract the pieces from the ePub by expanding it and copying out the images. Once you've reassembled the images in an image editor, you can edit the ePub in Sigil and replace the multiple pieces with the newly combined image.
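
For anyone who would rather script the "expand it and copy out the images" step: an ePub is an ordinary ZIP archive, so something like the following rough Python sketch should do it. The function name `extract_images` and the list of file extensions are my own choices for illustration, not anything Calibre or Sigil provides, and it flattens the internal folders, so two images with the same file name in different folders would overwrite each other.

```python
import zipfile
from pathlib import Path

# Extensions treated as images; an assumption, extend as needed.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif"}


def extract_images(epub_path, out_dir):
    """Copy every image stored inside an ePub (a ZIP archive) into out_dir.

    Returns the extracted file names in archive-name order.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(epub_path) as book:
        for name in sorted(book.namelist()):
            if Path(name).suffix.lower() in IMAGE_SUFFIXES:
                # Drop the internal folder structure, keep only the file name.
                target = out / Path(name).name
                target.write_bytes(book.read(name))
                extracted.append(target.name)
    return extracted
```

Calling `extract_images("book.epub", "pieces")` (both names are placeholders) would leave the slices in a `pieces` folder, ready to open in an image editor.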

Humpty Dumpty sat on a wall (read: Humpty Dumpty = original document)
Humpty Dumpty had a great fall (read: great fall = conversion to PDF)
All the King's horses and all the King's men (read: tools to convert PDF back into a real document)
Couldn't put Humpty together again.
Old 07-01-2013, 09:36 AM   #3
theducks
Well trained by Cats
Posts: 31,241
Karma: 61360164
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
re: sliced images

Hand editing the html and CSS sometimes will let you place the pieces close together (0 or negative top/bottom margins) with a very thin gap left. Unfortunately, each device renders that gap differently.
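
If there are too many pages to hand edit, the same idea can be scripted. This is only a sketch: it assumes each slice sits alone in its own paragraph, and the class name `slice` and the CSS rules are made up for illustration (they use zero margins, not negative ones). Note that it re-classes every image-only paragraph it finds, not just the sliced ones, so check the result.

```python
import re

# Matches one paragraph that contains nothing but a single self-closed <img .../>.
SLICE_P = re.compile(r'<p\b[^>]*>\s*(<img\b[^>]*?)\s*/>\s*</p>')

# Hypothetical rules to paste into the book's stylesheet.
TIGHT_CSS = """
p.slice { margin: 0; padding: 0; line-height: 0; }
p.slice img { display: block; margin: 0; }
"""


def tighten_slices(xhtml):
    """Give image-only paragraphs the 'slice' class so the CSS above applies."""
    return SLICE_P.sub(r'<p class="slice">\1 /></p>', xhtml)
```

Run `tighten_slices` over the text of each page file and add `TIGHT_CSS` to the stylesheet. How thin the remaining gap looks will still vary by device.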

Photo stitching is the best solution
Old 07-01-2013, 02:13 PM   #4
jackie_w
Grand Sorcerer
Posts: 6,266
Karma: 16544702
Join Date: Sep 2009
Location: UK
Device: ClaraHD, Forma, Libra2, Clara2E, LibraCol, PBTouchHD3
When this has happened to me, 'stitching' has been out of the question because the sliced image has ended up in 100+ pieces. I just create a screencap of the full image from the original pdf then surgically replace the slices with the one good image. Though whether this is practical for any given pdf is going to depend on how many images have been sliced/wrecked.
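
The "surgical" replacement can be sketched in Python too. The assumptions here: the slices are named like `index-25_1.jpg`, `index-25_2.jpg` (page number, then slice number), each slice sits alone in its own paragraph, and the screencap has already been added to the book. The function name `replace_slices` and the `calibre1` class on the new paragraph are just illustrative.

```python
import re


def replace_slices(xhtml, page, new_src):
    """Replace all image-only paragraphs for one page's slices with one image.

    Returns the edited text and the number of slice paragraphs removed.
    """
    slice_p = re.compile(
        r'<p\b[^>]*>\s*<img\b[^>]*\bsrc="[^"]*index-%d_\d+\.jpg"[^>]*/>\s*</p>\s*'
        % page
    )
    single = '<p class="calibre1"><img alt="" src="%s" /></p>\n\n' % new_src
    count = 0

    def swap(match):
        # The first slice becomes the full image; the rest are dropped.
        nonlocal count
        count += 1
        return single if count == 1 else ""

    return slice_p.sub(swap, xhtml), count
```

For example, `replace_slices(text, 25, "../Images/page-25-full.jpg")` (the file name is a placeholder) would collapse all the page 25 slices into one `<img>`. The leftover slice files still need deleting from the book afterwards.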
Old 07-01-2013, 06:37 PM   #5
BetterRed
null operator (he/him)
Posts: 22,006
Karma: 30277294
Join Date: Mar 2012
Location: Sydney Australia
Device: none
@Ctipi

Have you tried any other PDF to EPUB conversion methods?

My first choice for PDF conversions is to convert to PRC with MobiCreator, and then convert the PRC to EPUB with Calibre.

I sometimes use Acrobat, Nitro and Omnipage, but they cost money, and if you're not familiar with them they're not easy to drive.

A method I have yet to try is to use Doxillion (Free) to convert the PDF to DOCX and then use Calibre to convert the DOCX to EPUB. That route solved another conversion issue the other day.

I convert a lot of PDFs from .edu, .gov and .ngo sites, and I've not found a universal solution to get a good epub from a PDF. But in my experience two steps are more often better than one step.

I've never had any sliced images, but most of the PDFs I convert don't have many images. I also give up if I can't get what I consider to be a reasonable conversion.

BR
Old 07-01-2013, 10:59 PM   #6
BetterRed
Quote:
Originally Posted by jackie_w View Post
When this has happened to me, 'stitching' has been out of the question because the sliced image has ended up in 100+ pieces. I just create a screencap of the full image from the original pdf then surgically replace the slices with the one good image. Though whether this is practical for any given pdf is going to depend on how many images have been sliced/wrecked.
jackie_w : I've been trying to think of a reason this might happen - the only thing that came to mind is Progressive versus Baseline jpegs.

If you have a PDF with the problem, are you able to extract the recalcitrant image and have a look at its properties? Or send me the PDF via a PM and I'll have a looksee.

@Ctipi - another trick that sometimes overcomes PDF issues is to print the PDF to a PDF via a PDF print driver and convert the 'printed PDF' - I use the PDF print driver from Bullzip because it doesn't nag me to upgrade to the Pay4Me version.

BR
Old 07-01-2013, 11:33 PM   #7
DoctorOhh
US Navy, Retired
Posts: 9,897
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Kindle PaperWhite SE 11th Gen
Quote:
Originally Posted by BetterRed View Post
jackie_w : I've been trying to think of a reason this might happen - the only thing that came to mind is Progressive versus Baseline jpegs.
I may be wrong, but doesn't this happen because the book was published with the images sliced up? I don't think that this was due to anything in the conversion process itself.

Quote:
Originally Posted by BetterRed View Post
@Ctipi - another trick that sometimes overcomes PDF issues is to print the PDF to a PDF via a PDF print driver and convert the 'printed PDF'
I wonder if this could work; it would be worth a try.
Old 07-02-2013, 12:10 AM   #8
BetterRed
Quote:
Originally Posted by DoctorOhh View Post
I may be wrong, but doesn't this happen because the book was published with the images sliced up? I don't think that this was due to anything in the conversion process itself.
No, you could well be right - the only way to prove that is to extract the image from the PDF - but if it was the conversion then why would that be so - Progressive v Baseline was one thing that came to mind.

But further research (sliced images in PDFs) indicates that this is/was an issue with InDesign CS3 and PS. And better than that, it even looks like some guy who looks a lot like yourself may have solved the problem way back in April 2010: https://www.mobileread.com/forums/showthread.php?t=81580

@Ctipi - have a look at the above link post #6

BR

Last edited by BetterRed; 07-02-2013 at 12:19 AM.
Old 07-02-2013, 12:37 AM   #9
DoctorOhh
Quote:
Originally Posted by BetterRed View Post
And better than that, it even looks like some guy who looks a lot like yourself may have solved the problem way back in April 2010: https://www.mobileread.com/forums/showthread.php?t=81580
That post is exactly what theducks was talking about above.

I may have helped someone back then, but I solved it for myself too as stated at the end of the post.

Quote:
Originally Posted by DoctorOhh View Post
My opinion is until I have a reader large enough to view pdf files full screen as Adobe intended I'm just not going to mess with them.
PDF files as a source for conversions personally aren't worth my time.
Old 07-02-2013, 01:11 AM   #10
BetterRed
Quote:
Originally Posted by DoctorOhh View Post
...but I solved it for myself too as stated at the end of the post.
That was with MobiCreator, yes? Which is my 1st choice for PDF conversions

Quote:
Originally Posted by DoctorOhh View Post
PDF files as a source for conversions personally aren't worth my time.
More often than not, I don't have a choice

BR
Old 07-02-2013, 01:49 AM   #11
DoctorOhh
Quote:
Originally Posted by BetterRed View Post
More often than not, I don't have a choice
If I didn't have a choice, I probably would have budgeted for a Nexus 10 by now.
Old 07-02-2013, 05:28 AM   #12
Ctipi
Thank you !!
So, it's quicker if I reload all the images directly in Sigil, isn't it?
I've tried to download MobiCreator, but it's only for Windows and I have a Mac. Is there similar software that I can download?
Thanks!
Old 07-02-2013, 06:32 AM   #13
BetterRed
Quote:
Originally Posted by Ctipi View Post
Thank you !!
So, it's quicker if I reload all the images directly in Sigil, isn't it?
I've tried to download MobiCreator, but it's only for Windows and I have a Mac. Is there similar software that I can download?
Thanks!
Try the CSS edit that DrOhh posted in https://www.mobileread.com/forums/showthread.php?t=81580

If you google for "convert pdf to prc/mobi/epub mac" you will get some products, but I suspect the programs for the Mac will be like the programs for Windows - a mixed bunch. If you need to do many PDF conversions you'll need more than one tool.

I think the Doxillion Document converter is available for Mac.

Also try the Print PDF to a PDF - it's a long shot, but not much effort. I think Print to PDF is a standard feature on OS X.

If it's a one-off situation, maybe you can get a friend who uses Windows to run the PDF through MobiCreator to create the PRC - it's a very simple program and I think there's a How To video on YouTube.

BR
Old 07-02-2013, 09:50 AM   #14
jackie_w
Quote:
Originally Posted by BetterRed View Post
jackie_w : I've been trying to think of a reason this might happen - the only thing that came to mind is Progressive versus Baseline jpegs.

If you have a PDF with the problem, are you able to extract the recalcitrant image and have a look at its properties? Or send me the PDF via a PM and I'll have a looksee.
I'm afraid I didn't keep the PDFs after I'd created a nice clean epub version. If I come across any in the future I'll PM you.

In case it sheds any more light, my PDF conversion method is:
  1. Use the pdf2xml.exe utility, which is the basis of the MobiPocket converter, to extract an XML file of the text plus the images. Only some of the images are extracted for some reason.
  2. Use homegrown software to convert XML to simple clean HTML, preserving styling/structure (headings, scenebreaks, dropcaps, dehyphenation etc) normally lost during typical PDF-epub conversions.
  3. Use calibre to convert HTML to epub.
The shredded images were produced in step 1 by the pdf2xml.exe extract process. As this program doesn't seem to have any input parameters, I couldn't experiment further.
Old 07-04-2013, 01:34 PM   #15
Ctipi
Unhappy

Quote:
Originally Posted by BetterRed View Post
Try the CSS edit that DrOhh posted in https://www.mobileread.com/forums/showthread.php?t=81580
I've tried it, but my CSS doesn't contain the CSS that DrOhh's edit applies to.
Each image has been split up like this (so instead of a single image number 25 there are 25_1, 25_2, and so on). This is the code on the image pages, as follows:

<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Tutto sulle Rose</title>
<meta content="pdftohtml 0.36" name="generator" />
<meta content="user" name="author" />
<meta content="2013-06-25T15:06:16+00:00" name="date" />
<link href="../Styles/stylesheet.css" rel="stylesheet" type="text/css" />
<link href="../Styles/page_styles.css" rel="stylesheet" type="text/css" />
</head>

<body class="calibre">
<p class="calibre1">24</p>

<p class="calibre1"><img alt="" class="calibre2" src="../Images/index-25_1.jpg" /></p>

<p class="calibre1"><img alt="" class="calibre2" src="../Images/index-25_2.jpg" /></p>

<p class="calibre1"><img alt="" class="calibre2" src="../Images/index-25_3.jpg" /></p>

<p class="calibre1"><img alt="" class="calibre2" src="../Images/index-25_4.jpg" /></p>

<p class="calibre1"><img alt="" class="calibre2" src="../Images/index-25_5.jpg" /></p>

<p class="calibre1"><img alt="" class="calibre2" src="../Images/index-25_6.jpg" /></p>

<p class="calibre1"></p>

Why does the conversion split each single image into many parts?


Thanks!!!

Last edited by DoctorOhh; 07-05-2013 at 12:36 AM.