Problem converting pdf to epub (size) using calibre


abadguy
06-09-2010, 08:36 AM
Hi everyone, I had a PDF ebook made up only of images. I converted it into a searchable-text PDF using an OCR program called PDF Converter Professional, a very good program. The original PDF was 8 MB; the new PDF is about 18 MB. When I try to convert it to EPUB using calibre I get an output file of 80+ MB, and unfortunately there's no way I can upload such a big file to my iPhone (I also tried converting the original 8 MB PDF to EPUB, but I still get a big 70+ MB file). I haven't figured out yet how to make the output file smaller. Am I making some mistake in calibre's settings? Please help me :( I don't get why the EPUB is so big after conversion; after all, this ebook is just 140 pages.

P.S. In case it helps, I also converted the file to .doc format with the OCR program, but since calibre doesn't support .doc I don't know how this could be useful.

Pranananda
06-09-2010, 10:25 AM
You might try making a copy of the epub (an epub is just a zip archive), renaming the extension from .epub to .zip, unzipping the book, and seeing which files in the epub are large. It could be that there are some large images. If you then open the images in Preview, Photoshop, or GIMP, you can probably save those images at a reduced resolution, zip up the files, and make it an epub again.
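
If you'd rather not unzip by hand, something like the following Python sketch could do the inspection and the re-zipping. It is only an illustration, not part of calibre: the function names largest_entries and repack are made up for this example, and it assumes the extracted book folder still contains the usual "mimetype" file. One detail worth knowing when you zip the book back up: the EPUB container format expects "mimetype" to be the first entry in the archive and to be stored uncompressed, which a plain "zip everything" step may not do.

```python
import os
import sys
import zipfile


def largest_entries(epub_path, top=10):
    """Return (size, packed_size, name) tuples for the biggest files in the epub."""
    with zipfile.ZipFile(epub_path) as zf:
        infos = [i for i in zf.infolist() if not i.is_dir()]
    # Sort by uncompressed size, biggest first.
    infos.sort(key=lambda i: i.file_size, reverse=True)
    return [(i.file_size, i.compress_size, i.filename) for i in infos[:top]]


def repack(src_dir, epub_path):
    """Zip an extracted book folder back into an epub.

    epub_path should live outside src_dir, otherwise the archive
    would try to include itself.
    """
    with zipfile.ZipFile(epub_path, "w") as zf:
        mimetype = os.path.join(src_dir, "mimetype")
        if os.path.exists(mimetype):
            # "mimetype" goes first and is stored without compression.
            zf.write(mimetype, "mimetype", compress_type=zipfile.ZIP_STORED)
        for root, _dirs, files in os.walk(src_dir):
            for name in sorted(files):
                full = os.path.join(root, name)
                arc = os.path.relpath(full, src_dir).replace(os.sep, "/")
                if arc == "mimetype":
                    continue  # already written above
                zf.write(full, arc, compress_type=zipfile.ZIP_DEFLATED)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python script.py book.epub
    for size, packed, name in largest_entries(sys.argv[1]):
        print(f"{size / 1e6:8.2f} MB  ({packed / 1e6:.2f} MB packed)  {name}")
```

Run it against a copy of the book, not the original. If the top of the list is all page-sized images, that would fit the idea that the scanned pages were carried over into the epub.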

Jellby
06-09-2010, 10:33 AM
Maybe the "searchable text" pdf has both the images and the text, and when you convert it to epub, calibre keeps the images. Try getting a text-only file first with your OCR program, and make it HTML if possible.

frabjous
06-09-2010, 01:40 PM
abadguy wrote: "p.s. if this could help I also converted the file with the ocr program in a .doc format, but since calibre doesn't support .doc I don't know how this could be useful"

If your OCR program will only output the text in .pdf or .doc format, it should be pretty easy to convert the .doc to a format calibre can handle, like .html, .rtf or .odt. Just open the .doc in a word processor, and then save or export it as another format. If using MS Word, save as filtered HTML. (If you don't have MS Word, you could use a free, open-source word processor like OpenOffice or AbiWord (http://abisource.org/). I've mostly had success using AbiWord's .doc-to-HTML output as the input to calibre when generating ebooks.)
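
Once you have the HTML file, the conversion itself can also be run outside the calibre GUI, using the ebook-convert command-line tool that is installed along with calibre. A minimal Python sketch, assuming ebook-convert is on your PATH; the file names book.html and book.epub are placeholders, and build_command and convert are just helper names made up for this example:

```python
import shutil
import subprocess


def build_command(src, dst):
    """Build the ebook-convert call; the output format is taken from dst's extension."""
    return ["ebook-convert", src, dst]


def convert(src, dst):
    """Run the conversion, failing early if calibre's CLI tool cannot be found."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found on PATH; is calibre installed?")
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(build_command(src, dst), check=True)


# Example call (placeholder file names):
# convert("book.html", "book.epub")
```

Since the HTML saved from the word processor should contain only the OCR text, the resulting epub should come out far smaller than one built from the image-based PDF.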

Hedaya
03-22-2012, 08:25 AM
I'm having the reverse problem: getting epubs into PDFs. It actually just fails completely. Any thoughts?

frabjous
03-22-2012, 05:41 PM
Hedaya wrote: "I'm having the reverse problem: getting epubs into PDFs. It actually just fails completely. any thoughts?"

What is "it"? What method are you trying?

For the other direction, I'd use Jellby's script instead: http://www.mobileread.com/forums/showthread.php?t=89689

Hedaya
03-23-2012, 05:33 AM
I'll do that, thanx! =)