01-25-2016, 11:23 AM | #1 |
Groupie
Posts: 171
Karma: 40000
Join Date: Oct 2013
Device: kindle
|
pdf to doc: best way?
Hello guys.
What is the best and (possibly) easiest way to convert from pdf (text) to doc/odt/rtf? And I mean with the line breaks in the right places too, not just saving to doc from Acrobat. The final goal is an epub.
I did it this way (Acrobat Pro -> save as doc) once, and then I bulk-corrected all the line breaks with perfect epub. Not sure if I missed anything.
EDIT: apparently I did. It doesn't undo line breaks where the line ends with punctuation and the next one starts with a guillemet («), and it erases the dash where the new line starts with one (this is probably due to perfect epub's hyphenation regex). Also, it doesn't undo line breaks if a page ends with a period and the next one starts with a capital letter.
EDIT2: Acrobat Pro -> save as HTML does a pretty good job.
One other time I passed the pdf through FineReader, but it was more complicated. Thanks.
Last edited by 1v4n0; 02-14-2017 at 08:38 AM. |
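The line-break problems described in this post (hyphenated words split across lines, dialogue dashes being eaten, paragraphs broken at page ends) can be approached with a small script. The following is only a minimal sketch under stated assumptions, not what perfect epub or Acrobat actually do: the function name `unwrap_paragraphs`, the `width_ratio` parameter, and the `TERMINAL` character set are all hypothetical. It treats a line as the end of a paragraph only when it ends with closing punctuation and is clearly shorter than the longest line, and it removes a hyphen only at the end of a line, never a dash at the start of one.

```python
import re

# Characters assumed to be able to close a paragraph: . ! ? : » ” … "
# (an illustrative set, extend it for the language of the book).
TERMINAL = ".!?:\u00bb\u201d\u2026\""


def unwrap_paragraphs(text, width_ratio=0.85):
    """Rejoin hard-wrapped lines into a list of paragraphs (heuristic sketch)."""
    lines = [ln.strip() for ln in text.splitlines()]
    longest = max((len(ln) for ln in lines), default=0)
    paragraphs, current = [], ""
    for ln in lines:
        if not ln:
            # A blank line always closes the paragraph being built.
            if current:
                paragraphs.append(current)
                current = ""
            continue
        if re.search(r"\w-$", current) and ln[0].islower():
            # Word hyphenated across the break: drop the trailing hyphen.
            current = current[:-1] + ln
        elif current:
            # Ordinary soft break; a leading dash on `ln` is left untouched.
            current += " " + ln
        else:
            current = ln
        # Close the paragraph only on a short line with closing punctuation.
        if ln[-1] in TERMINAL and len(ln) < width_ratio * longest:
            paragraphs.append(current)
            current = ""
    if current:
        paragraphs.append(current)
    return paragraphs
```

Calling `unwrap_paragraphs(raw_text)` on text copied out of the pdf returns one string per paragraph. The length test is what lets a full-width line ending in a period be joined to the next page; the cost is that a genuine paragraph whose last line happens to fill the whole width will be merged with the following one, so the result still needs proofreading.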
07-24-2016, 01:45 AM | #2 |
Addict
Posts: 229
Karma: 13495
Join Date: Feb 2009
Location: SoCal
Device: Kindle 3, Kindle PW, Pocketbook 301+, Pocketbook Touch, Sony 950, 350
|
|
08-07-2016, 10:44 AM | #3 |
Junior Member
Posts: 4
Karma: 23332
Join Date: Aug 2016
Location: Columbia
Device: Kobo Aura
|
It depends on which version of Office you are using. MS Office 2007 has a full utility for converting and editing PDF files.
|
08-07-2016, 03:41 PM | #4 |
Groupie
Posts: 171
Karma: 40000
Join Date: Oct 2013
Device: kindle
|
I'm on OpenOffice.
|
08-07-2016, 05:31 PM | #5 |
Guru
Posts: 970
Karma: 4999999
Join Date: Mar 2009
Location: Rosario, Argentina
Device: SONY PRS-505, PRS-T2
|
Conversion of pdf files is not at all easy, and every pdf is different. In all cases, the resulting file has to be parsed and corrected manually.
My first step with text pdf files is always BRISS to remove headers and footers and then Mobipocket Creator Professional Edition (free software). Of the several output formats, I always use the html file. If the html is in good shape, I open it with Sigil. If a lot of corrections are needed, I sometimes use Notepad++ to make a first cleanup and then I open it with Sigil.
I've tried other solutions, including saving to html with Adobe Standard, but I always come back to Mobipocket Creator, which has always produced the best results for me. Hope this helps.
Last edited by Pablo; 08-07-2016 at 05:34 PM. |
08-07-2016, 07:34 PM | #6 |
Addict
Posts: 229
Karma: 13495
Join Date: Feb 2009
Location: SoCal
Device: Kindle 3, Kindle PW, Pocketbook 301+, Pocketbook Touch, Sony 950, 350
|
Aiseesoft PDF Converter Ultimate->docx->proofreading->applying Heading style to chapter titles, generate TOC->Calibre->ePub
Last edited by EbokJunkie; 08-07-2016 at 07:37 PM. |
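The last step of that chain (docx -> Calibre -> ePub) can also be run from a script through Calibre's `ebook-convert` command-line tool. This is a rough sketch only: the helper names `build_convert_command` and `convert` are made up for illustration, and the `--level1-toc "//h:h1"` option assumes the chapter titles were given the Heading 1 style so that they come out as `h1` elements.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(source, target_ext=".epub"):
    """Build the ebook-convert argument list for one input file."""
    src = Path(source)
    dst = src.with_suffix(target_ext)
    # Assumption: chapter titles are <h1> after import, so they can be
    # used as first-level TOC entries.
    return ["ebook-convert", str(src), str(dst), "--level1-toc", "//h:h1"]


def convert(source, target_ext=".epub"):
    """Run the conversion; requires Calibre's ebook-convert on the PATH."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found; is Calibre installed?")
    cmd = build_convert_command(source, target_ext)
    subprocess.run(cmd, check=True)
    return Path(cmd[2])
```

For example, `convert("book.docx")` would write `book.epub` next to the input file, and passing `target_ext=".azw3"` targets a Kindle instead. The proofreading and heading-style steps still have to be done in the word processor first.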
08-11-2016, 11:11 PM | #7 | |
Fuzzball, the purple cat
Posts: 1,273
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
Edit: I've added some examples. The scanned conversion isn't as good as the conversion of the computer-generated PDF (native.pdf), but it's still very good considering the intelligence that had to go into something like that. Sorry I used a zip file for the Word files; the forum attachment system would not let me attach a .docx file. The open_office_format.zip file contains the Word files saved in the .odt format.
Last edited by willus; 08-12-2016 at 09:56 AM. Reason: Added examples (now with .odt files) |
|
08-12-2016, 04:18 AM | #8 |
Groupie
Posts: 171
Karma: 40000
Join Date: Oct 2013
Device: kindle
|
Cool. OpenOffice doesn't have anything like that, right?
|
08-12-2016, 09:54 AM | #9 |
Fuzzball, the purple cat
Posts: 1,273
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
|
08-12-2016, 09:08 PM | #10 |
Junior Member
Posts: 6
Karma: 10
Join Date: Apr 2011
Device: Kindle Keyboard 3 wifi, Kindle Voyage
|
Where do you go from there? I got a nice docx, but when I converted it to AZW3 with Calibre to use on my Kindle, the result was terrible.
|
08-12-2016, 11:07 PM | #11 | |
Fuzzball, the purple cat
Posts: 1,273
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Quote:
|
|
02-14-2017, 08:41 AM | #12 |
Groupie
Posts: 171
Karma: 40000
Join Date: Oct 2013
Device: kindle
|
Sorry for the gravedigging, and feel free to delete this post. I just found out that exporting the pdf to html instead of doc does a pretty good job, at least with Acrobat Pro. Then you can just edit the html file with whatever editor you have (OpenOffice, in my case).
|
|