Wizard
Posts: 1,090
Karma: 6058305
Join Date: Sep 2010
Location: UK
Device: Kindle Paperwhite
Converting multi-column PDFs on Linux
I have some RPG PDFs that I'd like to be able to read on my Kindle. Converting them is a real pain, because the text is in two columns. After some experimentation, I've found the following set of commands, which appear to produce a plain text file with the text in the correct order:
Code:
pdftohtml -c -s -i -xml INPUT_FILE.pdf
sed -e 's/<[^>]*>//g' INPUT_FILE.xml > OUTPUT_FILE.txt

The text will probably need some cleaning up, and of course will contain no formatting, but I found that Calibre was quite intelligent at working out where headings were when converting the text file to a .mobi.
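For anyone who wants to run this over several files, the two steps might be wrapped up roughly as below. This is only a sketch: the function names strip_tags and pdf_to_text are made up for illustration, and it assumes pdftohtml writes its XML next to the input with the .pdf extension replaced by .xml, as in the commands above. It also assumes the XML escapes characters such as & and <, so it decodes the common entities after removing the tags, and it drops the lines that end up empty.

```shell
#!/bin/sh

# Hypothetical helper: read pdftohtml's XML on stdin, write plain text to stdout.
strip_tags() {
    # 1. remove anything that looks like a tag
    # 2. decode the common XML entities (assumed to be present in the XML);
    #    &amp; is decoded last so that "&amp;lt;" ends up as "&lt;", not "<"
    # 3. delete lines that contain only whitespace once the tags are gone
    sed -e 's/<[^>]*>//g' \
        -e 's/&lt;/</g' \
        -e 's/&gt;/>/g' \
        -e 's/&quot;/"/g' \
        -e 's/&amp;/\&/g' \
        -e '/^[[:space:]]*$/d'
}

# Hypothetical wrapper: convert NAME.pdf to NAME.txt in the same directory.
pdf_to_text() {
    in=$1
    base=${in%.pdf}
    # Same options as the original commands; assumes the output is "$base.xml".
    pdftohtml -c -s -i -xml "$in" && strip_tags < "$base.xml" > "$base.txt"
}
```

After sourcing the file, something like pdf_to_text INPUT_FILE.pdf should leave INPUT_FILE.txt alongside the PDF, ready to be cleaned up and given to Calibre.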