Wizard
Posts: 1,090
Karma: 6058305
Join Date: Sep 2010
Location: UK
Device: Kindle Paperwhite
Converting multi-column PDFs on Linux
I have some RPG PDFs that I'd like to be able to read on my Kindle. Converting them is a real pain, because the text is in two columns. After some experimentation, I've found the following set of commands, which appear to produce a plain text file with the text in the correct order:
Code:
pdftohtml -c -s -i -xml INPUT_FILE.pdf
sed -e 's/<[^>]*>//g' INPUT_FILE.xml > OUTPUT_FILE.txt

The first command writes INPUT_FILE.xml, and the second strips the XML tags from it. The text will probably need some cleaning up, and of course will contain no formatting, but I found that Calibre was quite good at working out where the headings were when converting the text file to a .mobi.
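Part of the cleaning up can probably be folded into the sed step. The sketch below is only an illustration: xml_to_text is a made-up helper name, and it assumes the XML contains nothing beyond the five standard predefined XML entities (&amp;amp;, &amp;lt;, &amp;gt;, &amp;quot;, &amp;apos;). It strips the tags as above, turns those entities back into plain characters, and drops the blank lines left behind.

```shell
#!/bin/sh
# Hypothetical helper (not part of pdftohtml or sed): reads XML on stdin
# and writes plain text on stdout.
xml_to_text() {
    # 1. Strip anything that looks like an XML tag.
    # 2. Decode the five predefined XML entities. &amp; is decoded last so
    #    that a literal "&amp;lt;" ends up as "&lt;" rather than "<".
    # 3. Delete lines that are empty or contain only whitespace.
    sed -e 's/<[^>]*>//g' \
        -e 's/&lt;/</g' \
        -e 's/&gt;/>/g' \
        -e 's/&quot;/"/g' \
        -e "s/&apos;/'/g" \
        -e 's/&amp;/\&/g' \
    | sed -e '/^[[:space:]]*$/d'
}
```

After running pdftohtml as before, the function would replace the plain sed call, i.e. xml_to_text < INPUT_FILE.xml > OUTPUT_FILE.txt. Numeric character references and anything else the PDF throws up would still need fixing by hand.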