Old 12-12-2011, 06:08 PM   #1
sinan
Enthusiast
Posts: 23
Karma: 66956
Join Date: Feb 2010
Location: Conn. USA
Device: Kindle 3, Kindle PW
Ultimate PDF to Epub/Mobi conversion tips

PDF is one of the most common formats around, but it is not very user-friendly on ebook readers. For a better reading experience on our devices, at some point we need to convert PDF to a more suitable format such as epub or mobi.

The main issues with pdf to epub/mobi conversions:
1. Page numbers
2. Ruptured or half-cut paragraphs
3. Stylistic issues (bold, italic)
4. Font declarations


I have been dealing with pdf conversions for quite some time and have tried several commercial and free converters and other workarounds. I will try to share my experience with you, going from the easiest method to the more complicated ones.

It will be a long list of methods, and I will add more in the future. But remember, every pdf creator has its own way of encoding the text, so every pdf file will be slightly different from the others.

If you have better methods or more convenient ways, please share them with us, so we can collect everything in one thread.

1. Using Mobipocket Creator and Calibre
This method removes page numbers and joins ruptured paragraphs. We will use Mobipocket Creator to remove page numbers and join half-cut paragraphs, and Calibre to convert the resulting html file to epub, mobi or other formats.

1. Install Mobipocket Creator and run it.

2. On the home screen, under "Import From Existing File", click "Import Adobe PDF".

3. Choose the file, then define the publication folder, language and encoding.

4. Click Import.
If your pdf has a legitimate repeating pattern, such as page numbers, it will be removed, and paragraphs cut in half across pages will be joined. By legitimate we mean that the numbers appear in the same spot throughout the book, as you would expect in a printed book.

5. You could go ahead and use Mobipocket Creator to create a prc file and convert that file to other formats via Calibre, but I don't recommend it. If you want to do so, the rest is pretty straightforward. One thing you need to know: you can study the html file and define a tag name, attribute and value to make a multilevel TOC, just as you do in Calibre.

6. Go to your publication folder; the default is My Documents/My Publications. You will find your book in a separate folder there, containing the different formats of your book.

7. Open the html file with an html editor. You can use a free editor, Notepad or WordPad to correct minor mistakes or remove unnecessary sections. Study the title and subtitle patterns. For example, titles can be wrapped in h1 tags and subtitles in h2 or h3 tags, with proper attributes and values, which can then be used to make a multilevel table of contents (TOC).

8. Once you have fixed the html file, drag and drop it into Calibre, where you can define styles, TOC, etc.
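
Studying the title and subtitle patterns in step 7 can be partly automated. Here is a minimal sketch using only Python's standard library; it assumes the headings really are h1 to h3 elements, and the names HeadingOutline and outline_of and the sample html are made up for illustration, not anything Mobipocket Creator or Calibre provides.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Collect (level, attributes, text) for every h1-h3 element."""

    LEVELS = {"h1": 1, "h2": 2, "h3": 3}

    def __init__(self):
        super().__init__()
        self.outline = []      # finished headings, in document order
        self._current = None   # heading currently being read, or None

    def handle_starttag(self, tag, attrs):
        if tag in self.LEVELS:
            self._current = [self.LEVELS[tag], dict(attrs), ""]

    def handle_data(self, data):
        # Text inside nested tags such as <i> is collected as well.
        if self._current is not None:
            self._current[2] += data

    def handle_endtag(self, tag):
        if tag in self.LEVELS and self._current is not None:
            level, attrs, text = self._current
            self.outline.append((level, attrs, " ".join(text.split())))
            self._current = None


def outline_of(html_text):
    """Return the heading outline of an html string."""
    parser = HeadingOutline()
    parser.feed(html_text)
    parser.close()
    return parser.outline


# Hypothetical sample standing in for the html file produced in step 6.
sample = """
<h1 class="chapter">Chapter 1</h1>
<p>Some text.</p>
<h2 class="section">A first <i>section</i></h2>
<p>More text.</p>
"""

for level, attrs, text in outline_of(sample):
    print("  " * (level - 1) + text, attrs)
```

The printed tag levels and attributes are the tag name, attribute and value combinations you would then enter when defining the TOC.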

Edit:
In some cases, cropping the pdf before importing it into Mobipocket Creator yields better results. There is a simple, free program called Briss for cropping pdf files. A portable version is also available.
1. Run Briss.
2. Load the file.
3. Cancel the warning if you don't know what to do.
4. On the cropping screen, all the odd pages are superimposed into a single view, and so are the even pages. A blue rectangle with small dark squares at its corners appears on each view. Drag these squares to set the crop margins. You can preview the result by clicking Action >> Preview on the menu bar.
5. Once you have set the margins, click Action >> Crop PDF.
Briss is not very stable: it sometimes gives an error and may not let you crop, but you can always preview. So saving the previewed pdf also works.

2. Using Adobe PDF Pro and Calibre
We use Adobe PDF Pro to crop page numbers, join ruptured paragraphs and produce an html file, and Calibre to create the different formats.

1. Open your pdf with Adobe PDF Pro and choose Document > Crop Pages. Using the options available, crop the page numbers out.

2. Save your document as an html file. This process will join half-cut paragraphs. Careful: when you crop a pdf page, the data is still there, but the pdf viewer does not display it. If you drop your cropped pdf directly into Calibre, Calibre will still read the cropped data, so the page numbers will still be there after conversion.

3. Open the html file with an html editor, Notepad or WordPad, fix anything you want and remove unnecessary sections. Study the html for a better conversion via Calibre.

4. Drop the html file into Calibre. Fill in the metadata and convert it. You can always create a multilevel TOC in Calibre by using tags, attributes and values.
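
If you prefer the command line to drag and drop, the same conversion can be sketched with Calibre's ebook-convert tool, whose --level1-toc, --level2-toc and --level3-toc options take XPath expressions. The snippet below only builds and prints the command; the file names, the "section" class and the helper build_convert_command are assumptions for illustration, and the h: prefix follows the convention Calibre uses for html tags in XPath as far as I know.

```python
import shlex


def build_convert_command(source, target, toc_xpaths):
    """Return an ebook-convert argument list with up to three TOC levels."""
    if not 1 <= len(toc_xpaths) <= 3:
        raise ValueError("expected one to three TOC level expressions")
    command = ["ebook-convert", source, target]
    for level, xpath in enumerate(toc_xpaths, start=1):
        command += [f"--level{level}-toc", xpath]
    return command


# Hypothetical file names and class; adjust to what your html really uses.
command = build_convert_command(
    "book.html",
    "book.epub",
    ["//h:h1", '//h:h2[@class="section"]'],
)
print(shlex.join(command))
```

To actually run the conversion, pass the list to subprocess.run(command, check=True) on a machine where Calibre is installed.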

Edit: With the new Adobe Acrobat Pro, things are a little different. Here I am going to explain how to crop out headers and footers permanently.

1. Open your pdf with Adobe Acrobat Pro.

2. Click Tools >> Pages >> Crop.
Set the margins and crop the document. You can use different page ranges and odd and even page settings.

3. Once you have cropped your file, click Tools >> Protection >> Remove Hidden Information.

4. You will see Status: Finding hidden information, then Results. Once all the hidden information has been found, you can check or uncheck each group of information.

5. Click Remove.

6. Save your document. You have now permanently removed the page numbers, headers and other information that you cropped out.
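
Why the extra removal step matters can be shown with a toy model. This is only a sketch of the idea that cropping hides data without deleting it; it is not how Acrobat or Calibre work internally, and TextItem, the coordinates and the function names are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class TextItem:
    text: str
    y: float  # vertical position in points, measured from the page bottom


def visible_text(items, crop_bottom, crop_top):
    """What a viewer shows: only the items inside the crop box."""
    return [item.text for item in items if crop_bottom <= item.y <= crop_top]


def extracted_text(items):
    """What a converter that ignores the crop box reads: everything."""
    return [item.text for item in items]


def remove_hidden(items, crop_bottom, crop_top):
    """Permanently drop the items that lie outside the crop box."""
    return [item for item in items if crop_bottom <= item.y <= crop_top]


# A made-up page with a running header, one line of body text and a footer.
page = [
    TextItem("Chapter One (running header)", 760.0),
    TextItem("It was a dark and stormy night.", 700.0),
    TextItem("17", 30.0),
]

print(visible_text(page, 60.0, 740.0))                    # cropped view
print(extracted_text(page))                               # header and footer still present
print(extracted_text(remove_hidden(page, 60.0, 740.0)))   # after permanent removal
```

In this model, cropping alone changes only the first printout; the converter's input changes only after the hidden items are removed.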

Note: Dropping the cropped pdf directly into Calibre may not yield good results, especially if your pdf contains unicode characters. A better option is to convert it to html first.

Once you have a cropped pdf without hidden information, you can use either Adobe Acrobat Pro or Calibre for the pdf to html conversion. I will explain the remaining steps for both.

If you are going to use Adobe Acrobat Pro:
7. Now let's save our pdf as html before converting it into a mobi or epub document. Click File >> Save As >> More Options >> HTML Web Page.

If you can't save your file as html, make sure you have unchecked "Run OCR if needed". To do that, click "Settings" on the "Save As" screen.

Keep in mind that Adobe PDF Pro sometimes fails to apply bold or italic styling, leaving you with plain text. In that case, use either Mobipocket Creator or Calibre's built-in pdf to html conversion.

You can do some manual fixes before conversion if you like.

8. Drag and drop the html file into Calibre, set the TOC and other options, and convert.

If you are going to use Calibre:
7. Drop your pdf file into Calibre.

8. Convert your pdf into htmlz.

Strictly speaking, htmlz is not a plain html file. It is a special zip file, just like epub, which holds all the files needed to produce an ebook.

You can use Calibre's conversion settings to remove fonts, font sizes, margins, etc.

9. Once the conversion is complete, open the htmlz file with a zip program. You will see a file called index.html along with css and opf files. You can edit the index file and repackage the archive.

10. For the final conversion, use the htmlz as your source.
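
Since htmlz is an ordinary zip archive, the edit and repackage part of step 9 can also be scripted. A minimal sketch with Python's zipfile module follows; it assumes index.html is UTF-8 encoded, and rewrite_index plus the tiny demo archive (file names and contents) are stand-ins made up for illustration.

```python
import os
import tempfile
import zipfile


def rewrite_index(src_path, dst_path, edit):
    """Copy an htmlz archive, passing index.html through edit(text) -> text."""
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == "index.html":
                # Assumption: the html is UTF-8 encoded.
                data = edit(data.decode("utf-8")).encode("utf-8")
            dst.writestr(info.filename, data)


# Demo on a throwaway archive that only imitates the layout of an htmlz file.
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "book.htmlz")
    fixed = os.path.join(tmp, "book-fixed.htmlz")
    with zipfile.ZipFile(source, "w") as archive:
        archive.writestr("index.html", "<p>teh text</p>")
        archive.writestr("style.css", "p { margin: 0 }")
    rewrite_index(source, fixed, lambda html: html.replace("teh", "the"))
    with zipfile.ZipFile(fixed) as archive:
        print(archive.read("index.html").decode("utf-8"))
```

Writing to a new file instead of overwriting the source keeps the original htmlz around in case an edit goes wrong.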

3. Using any pdf to html converter and Textpipe Pro
Textpipe Pro is a pattern-based text processing tool. No matter how poor the conversion is, you can bring your text to the desired look, style and format. For pdf to html, sticking to what you know is the easiest and best way; I usually use Mobipocket Creator for conversion, Textpipe for reformatting and styling, and Mobipocket Creator or Calibre for producing the ebooks.
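
To give an idea of what pattern-based processing means, here is a rough equivalent in Python regular expressions rather than Textpipe itself. It is a sketch under simple assumptions: a paragraph containing only digits is a page number, and a paragraph that ends without closing punctuation and is followed by one starting in lowercase is a single paragraph cut in two. The function names and the sample text are made up.

```python
import re

# A paragraph holding nothing but digits is treated as a page number.
PAGE_NUMBER = re.compile(r"<p\b[^>]*>\s*\d+\s*</p>\s*")

# A paragraph end not preceded by closing punctuation, followed by a
# paragraph that starts with a lowercase letter, is treated as a break
# in the middle of a sentence. Curly quotes could be added to the set.
BROKEN_PARAGRAPH = re.compile(
    r'(?<![.!?:"\s])\s*</p>\s*<p\b[^>]*>\s*(?=[a-z])'
)


def remove_page_numbers(html_text):
    """Delete paragraphs that contain only a number."""
    return PAGE_NUMBER.sub("", html_text)


def join_broken_paragraphs(html_text):
    """Merge paragraphs that were split in mid-sentence."""
    return BROKEN_PARAGRAPH.sub(" ", html_text)


sample = (
    "<p>It was a dark and stormy night, and the rain</p>\n"
    "<p>17</p>\n"
    "<p>fell in torrents.</p>\n"
    "<p>A new paragraph.</p>\n"
)

# Page numbers go first, so the two halves become neighbours.
cleaned = join_broken_paragraphs(remove_page_numbers(sample))
print(cleaned)
```

Heuristics like these will misfire on some books (a chapter consisting of a bare number, a paragraph that really starts in lowercase), so check the result before converting.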

If you want complete control over your ebook's style, are picky about quality, hate reading poorly formatted text, always enjoy ebooks with a TOC, want to clean up the messy html produced by word processors like MS Word, or always insist on clean html before conversion, Textpipe is the right tool for you.


Textpipe Pro can do pattern-based search and replace along with other jobs, and the options are endless, but here is a brief list of things you can do with Textpipe in terms of ebook reformatting and styling.

1. You can add or remove html tags, classes and attributes all at once, with or without their text. For example, if your converted html contains tags like <p style="..."> or <p class="..." style="...">, a plain find and replace will not work, and you would have to clean up manually. With Textpipe, it takes only seconds. You can also remove a given class together with its text, such as <p class="myclass"> myText </p>.
2. You can remove specific html tags, classes and attributes while keeping others. For example, you may want to remove all markup except italic and bold.
3. You can remove page numbers or titles all at once.
4. You can convert certain tags into other tags, e.g. h2 >> p.
5. You can change case within a restricted part of the text, such as the text inside a certain tag or class.
6. Since some ebook readers do not support small caps, you can mimic small caps as S<small>MALL</small>. First, restrict your text based on a pattern, such as lying between certain tags or in a certain class. Then add <small> around all but the first letter of each word. It sounds complicated, but it is really very easy with subfilters and takes seconds.
7. Joining ruptured paragraphs and sentences.
8. Removing extra spaces and tabs.
9. Shifting and swapping text.
10. Splitting and joining multiple html files.
11. Changing the text encoding (ANSI, Unicode, UTF-8).
12. Adding or removing italics or bold.
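
For readers without Textpipe, items 1, 2, 4 and 6 above can be approximated with Python regular expressions. This is a sketch, not a Textpipe filter: the function names and the list of tags to keep are my own choices, and the patterns assume reasonably tidy html (no ">" inside attribute values).

```python
import re

# Tags that survive the cleanup; everything else is dropped, text kept.
KEEP_TAGS = {"p", "i", "b", "em", "strong", "h1", "h2", "h3"}

TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)\b[^>]*>")


def strip_markup(html_text):
    """Drop all attributes; drop tags not in KEEP_TAGS but keep their text."""
    def replace(match):
        closing, name = match.group(1), match.group(2).lower()
        return f"<{closing}{name}>" if name in KEEP_TAGS else ""
    return TAG.sub(replace, html_text)


def retag(html_text, old, new):
    """Rename every <old> element to <new>, discarding its attributes."""
    pattern = rf"<(/?){re.escape(old)}\b[^>]*>"
    return re.sub(pattern, rf"<\g<1>{new}>", html_text, flags=re.IGNORECASE)


def fake_small_caps(text):
    """Render each word as S<small>MALL</small>."""
    def replace(match):
        word = match.group(0).upper()
        rest = f"<small>{word[1:]}</small>" if len(word) > 1 else ""
        return word[0] + rest
    return re.sub(r"[A-Za-z]+", replace, text)


messy = '<p class="x" style="margin:0"><span class="y">Some <i class="z">text</i></span></p>'
print(strip_markup(messy))
print(retag('<h2 class="t">Title</h2>', "h2", "p"))
print(fake_small_caps("Small Caps"))
```

Note that fake_small_caps expects plain text, not markup, so restrict it to the contents of the element you want to restyle, which mirrors the "restrict first, then modify" approach described in item 6.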

The learning curve may look a bit steep, but it is not. Just take a look and play around. I am planning on doing an extended tutorial on it later.

Last edited by sinan; 04-26-2012 at 10:02 AM. Reason: Cropping beforehand yields better results. Crop pdf permanently
Old 03-05-2012, 11:27 AM   #2
Goldotor
Junior Member
Posts: 3
Karma: 10
Join Date: Mar 2012
Location: Jakarta Indonnesia
Device: Galaxy tab and ipad
what about indesign?

Thanks a lot for this really interesting info. I have been told InDesign is also great software for converting pdf into epub. Have you tested it already? I am going to work on it soon.

Quote:
Originally Posted by sinan
[Full text of post #1 quoted.]
Old 03-05-2012, 12:41 PM   #3
Pablo
Guru
Posts: 970
Karma: 4999999
Join Date: Mar 2009
Location: Rosario, Argentina
Device: SONY PRS-505, PRS-T2
This is a great contribution!!! I always start with Mobipocket Creator for pdf files and sometimes also use Briss; both are excellent tools for dealing with pdf files. Karma for you....
Old 03-05-2012, 02:29 PM   #4
DSpider
Evangelist
Posts: 450
Karma: 343115
Join Date: Nov 2009
Location: Romania
Device: PW2 2014
Goldotor, why did you quote the entire post? It more than doubles the vertical size of the thread and being on a wide screen it's killing my scroll finger. Come on. Please edit it out.


Thanks.

Filtering out the (already supposedly) "Filtered" HTML output from Word is always a PITA.

By "Adobe PDF Pro" you probably mean Adobe Acrobat X Pro.
Old 03-06-2012, 09:15 AM   #5
Goldotor
Junior Member
Posts: 3
Karma: 10
Join Date: Mar 2012
Location: Jakarta Indonnesia
Device: Galaxy tab and ipad
oups..

Sorry, it was my first post here. I wish I could edit it out... but how?
Thank you.
Old 03-06-2012, 09:25 AM   #6
JSWolf
Resident Curmudgeon
Posts: 73,510
Karma: 126422064
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
The only real way to convert PDF to another format is after the conversion, you HAVE to take the PDF and the converted file and A/B compare every paragraph, every sentence, every word, every letter, every punctuation mark and every style. There's no way to convert a novel length PDF to anything else without errors and the A/B compare is the only way to be sure of no conversion errors.
Old 03-06-2012, 09:37 AM   #7
DSpider
Evangelist
Posts: 450
Karma: 343115
Join Date: Nov 2009
Location: Romania
Device: PW2 2014
Goldotor, don't sweat it. You probably clicked the quote button by mistake. Your post has an edit button at the end (bottom right corner).

Welcome to the forums.

Last edited by DSpider; 03-06-2012 at 09:39 AM.
Old 04-26-2012, 10:04 AM   #8
sinan
Enthusiast
Posts: 23
Karma: 66956
Join Date: Feb 2010
Location: Conn. USA
Device: Kindle 3, Kindle PW
Yes there is.

There is always some manual work, but you can reduce the amount enormously with the right tools.
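
As one example of such a tool, a word-level diff can point you at the places that need a human eye instead of rereading everything. The sketch below uses Python's difflib and assumes you already have both the original and the converted book as plain text; the helper names and the two sample strings are made up.

```python
import difflib
import re


def words(text):
    """Split into words, ignoring line breaks and repeated spaces."""
    return re.findall(r"\S+", text)


def word_differences(original, converted):
    """Return (operation, original_words, converted_words) per mismatch."""
    a, b = words(original), words(converted)
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (op, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for op, i1, i2, j1, j2 in matcher.get_opcodes()
        if op != "equal"
    ]


original = "The rain fell in torrents,\nexcept at occasional intervals."
converted = "The rain fell in tor rents, except at intervals."

for op, before, after in word_differences(original, converted):
    print(f"{op}: {before!r} -> {after!r}")
```

This only compares words, so lost italics or other styling still have to be checked by eye.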

Last edited by sinan; 04-26-2012 at 10:07 AM.
Old 05-07-2012, 02:38 AM   #9
bladex01
Member
Posts: 15
Karma: 48
Join Date: Dec 2011
Device: Kindle4 Touch
Excellent tutorial, congratulations! I am waiting impatiently for the next extended tutorial.
Old 06-07-2012, 06:35 PM   #10
saoir
Groupie
Posts: 188
Karma: 2088290
Join Date: Jan 2009
Location: Ireland
Device: Kindle Paperwhite
What about Mac users? I have been trying to convert a few major documents I have in pdf to Mobi or AZW3 using Calibre. But the page numbers keep screwing it up, and even online converters won't work.
Old 06-22-2012, 03:34 PM   #11
johnhalbert
Junior Member
Posts: 1
Karma: 10
Join Date: Jun 2012
Device: Kindle Fire, Kindle 2
Awesome guide! Thanks to the OP on this one.

How are you guys handling footnotes? They always end up in the middle of text & I'm always less than sure how best to deal with those.
Old 07-02-2012, 04:15 PM   #12
Jerkso
Junior Member
Posts: 1
Karma: 10
Join Date: Jul 2012
Device: none
<< NECRO POST >> ... sorry

Quote:
Originally Posted by JSWolf View Post
The only real way to convert PDF to another format is after the conversion, you HAVE to take the PDF and the converted file and A/B compare every paragraph, every sentence, every word, every letter, every punctuation mark and every style. There's no way to convert a novel length PDF to anything else without errors and the A/B compare is the only way to be sure of no conversion errors.
Really, the typesetting and layout of a document is more or less a one-off piece of art from the typesetter. It is futile and a waste of time to try to retain anything other than the data in the document.

And in my experience you can generally grab the text of the book without any corruption. I am happy to just extract the text and the images from a pdf, and to accept just text if I have to, since that's what is important, to me at least.

Anyways... good post, but damn, PDF is a disgusting format, both technically and usability-wise.
Old 07-03-2012, 07:12 AM   #13
Jellby
frumious Bandersnatch
Posts: 7,514
Karma: 18512745
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
Quote:
Originally Posted by Jerkso View Post
And generally you can grab the text of the book without any corruption in my experience. I am happy to just extract the text and the images from a pdf and accept just text if I have to. Since thats what is important, well to me at least.
Spaces around punctuation, hyphenation, scene breaks, italics, blockquotes, etc. are often lost or ruined, even if you can extract the raw text from a PDF. And these are an integral part of the text, not just part of the typesetting art.
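
To illustrate just the hyphenation part: words split across line ends in the extracted text can be rejoined with a simple pattern, sketched below in Python. The function name is made up, and the heuristic cannot tell a line-break hyphen from a real one that happened to fall at the end of a line, so it is a starting point and not a fix.

```python
import re


def rejoin_hyphenated(text):
    """Join words that were hyphenated across a line break."""
    return re.sub(r"(?<=[a-z])-\n(?=[a-z])", "", text)


print(rejoin_hyphenated("a conver-\nsion of well-known files"))
```

Hyphens inside a line, as in "well-known", are left alone because the pattern only matches a hyphen directly followed by a line break.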
Old 09-11-2012, 06:36 AM   #14
BlackVoid
Evangelist
Posts: 415
Karma: 510423
Join Date: Nov 2006
Device: Sony PRS-505
Quote:
Originally Posted by JSWolf View Post
The only real way to convert PDF to another format is after the conversion, you HAVE to take the PDF and the converted file and A/B compare every paragraph, every sentence, every word, every letter, every punctuation mark and every style. There's no way to convert a novel length PDF to anything else without errors and the A/B compare is the only way to be sure of no conversion errors.
Yes, that is true if you want a perfect conversion. But a readable conversion can be had with much less work. I used to use BookDesigner 5 to convert from PDF to LRF with acceptable results and minimal editing. Unfortunately BookDesigner does not support ePub, so an extra step would be needed.

Calibre is also one of the worst converters; it totally ruins paragraph and page layout, with one-line pages, empty pages, etc.

I will try the methods described by the thread starter.
Old 01-10-2013, 10:22 AM   #15
Reenokazar
Junior Member
Posts: 3
Karma: 10
Join Date: Jan 2013
Device: Kindle 4
Thanks so much, this really helped me bro!