Old 02-24-2012, 01:18 AM   #1
hszforu
Junior Member
Posts: 2
Karma: 10
Join Date: Feb 2012
Device: Kindle 4 NT
Problem while converting pdf to epub using calibre

Whenever I convert a PDF to EPUB using calibre, extra information is added in between the text of the EPUB.

For example, the following text is added between the pages in the EPUB:

P:\010Comp\Begin8\189-0\ch02.vp
Friday, February 11, 2005 7:28:17 AM

Color profile: Generic CMYK printer profile
Begin8 Java: A Beginner’s Guide, 3rd Ed Schildt 3189-0 2
Composite Default screen
Blind Folio 2:38
38

This text differs every time, so I cannot use the search and replace function.
Is there any way to get rid of this text while converting?


Also, when converting from PDF to EPUB, the tables are not displayed as tables in the EPUB; instead, their contents appear sequentially as plain text. Is there any solution for this?
Old 02-24-2012, 06:17 AM   #2
mrmikel
Color me gone
Posts: 2,089
Karma: 1445295
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
PDF conversion is always going to be unpredictable. You can try using Adobe Acrobat to extract text, or Mobipocket Creator to get HTML, and then run that through calibre to get to EPUB. You might want to download Sigil as well. You can use it to clean up the various problems that will remain even with the best conversion.

You can use Sigil's search and replace function with regular expressions (regex) to try to get rid of the extra text in what you have now. It takes some study. Be sure to use more recent postings if you try to learn about it, because the particular flavor of regex in Sigil has changed recently.

You might be able to look at the current epub you have in code view and see that although the text always changes, the lead into it and out of it is always the same. That could give you a handle to locate it while searching and replacing.
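
To make the "same lead-in, same lead-out" idea concrete, here is a minimal Python sketch. It is only an illustration: it assumes every block of residue starts with a `P:\...\.vp` path line and ends with a `Blind Folio` line followed by a bare page number, as in the sample from the first post. The function name `strip_residue` and the surrounding body sentences are made up, and in a real EPUB the text will be wrapped in HTML tags, so the pattern would need adjusting to whatever Code View shows.

```python
import re

# Sample modeled on the residue quoted in the first post. The first and last
# lines are invented stand-ins for real book text.
sample = r"""End of a real paragraph.
P:\010Comp\Begin8\189-0\ch02.vp
Friday, February 11, 2005 7:28:17 AM

Color profile: Generic CMYK printer profile
Begin8 Java: A Beginner’s Guide, 3rd Ed Schildt 3189-0 2
Composite Default screen
Blind Folio 2:38
38
Start of the next real paragraph."""

# Assumption: the lead-in and lead-out below are the parts that stay the same;
# everything between them (date, chapter, page) is allowed to vary.
RESIDUE = re.compile(
    r"^P:\\[^\n]*\.vp\n"       # lead-in: the prepress file path line
    r".*?"                      # the changing middle part (non-greedy)
    r"^Blind Folio \d+:\d+\n"   # lead-out: the folio marker line
    r"\d+\n",                   # the bare page number after it
    re.MULTILINE | re.DOTALL,
)


def strip_residue(text: str) -> str:
    """Remove every block that runs from the lead-in to the lead-out."""
    return RESIDUE.sub("", text)


if __name__ == "__main__":
    print(strip_residue(sample))
```

The non-greedy `.*?` matters here: a greedy `.*` would run from the first lead-in to the last lead-out in the file and delete all the real text in between. It is worth testing any such pattern on a copy of the book first.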
Old 02-24-2012, 06:25 AM   #3
itimpi
Wizard
Posts: 4,552
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
Just in case it is not obvious from the previous reply: calibre is not adding the extra information. It is already present in the PDF, typically as headers and footers.

As was mentioned, it is possible to use the calibre search and replace feature in regex mode to get rid of this during conversion. However, that involves working out a regex to identify the offending text that is specific to this book. As all PDF files differ slightly, it has not proved possible to get calibre to work out the required regex and completely automate such removal.
Old 02-24-2012, 08:43 AM   #4
theducks
Well trained by Cats
Posts: 29,800
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Adding to itimpi

Sigil is great for this type of cleanup.
In Code View (CV) you can see what code your regex needs to match.
Old 02-24-2012, 08:48 AM   #5
hidari
MR Drone
Posts: 1,613
Karma: 15612282
Join Date: Oct 2007
Location: DRONEZONE
Device: PB360+, Huawei MP5, Libra H20
Quote:
Originally Posted by theducks
Adding to itimpi

Sigil is great for this type of cleanup.
You can see what code (in CV) is needed for your REGEX.
As the venerable JSWolf would say (not sure what happened to him: self-exile or ostracized?), PDF to EPUB in calibre is like a roulette wheel. Sigil might be the way to go, or ABBYY PDF Transformer, a great program for converting PDF to RTF, PDF to Word, or PDF to searchable PDF. It used to cost about 100 USD but is well worth the cost. It also converts two-column PDFs to one column.

Last edited by hidari; 02-28-2012 at 05:38 PM.