Quote:
Originally Posted by Liviu_5
The text pdf's are also of two types - the ones that reflow (well) and the ones that do not. Yours seem to be of the second type so if the text in the original is too small, you have to use horizontal half-page to read them nicely, preserving the layout.
I have no idea if/how you can convert a text pdf that does not reflow into one that does, but I will look into it since I started liking using pdf's on the 700 since they have a nicer layout than even the lrf files.
Text pdf's can and do have embedded images. An image pdf is all made of images - scans are the best example - and you cannot extract the text unless you OCR it.
I made my first pdf out of a Ms_doc using open_office_org and it reflows great as well as being nicer than the lrf I made with Calibre, and faster than the epub I made with Calibre, though the epub is smaller.
|
Well, it can be hard and finicky, but yes, it's possible to convert a PDF with manual line breaks. I saw a thread yesterday that used Microsoft Word to AutoFormat the TXT (I don't remember what the shortcut was, but it did have one). Book Designer is also good at rescuing 99% of line breaks when I import into it (depending on the format it was imported from; I usually run a PDF converter on it first).
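If you'd rather script it, here is a rough sketch in Python of the same idea of removing manual line breaks. It is not what Word's AutoFormat or Book Designer actually does; it's just a simple heuristic I'm assuming for illustration: blank lines separate paragraphs, every other line break is a hard wrap, and a trailing hyphen means a word was split across lines. The function name `unwrap_hard_breaks` is made up.

```python
import re


def unwrap_hard_breaks(text: str) -> str:
    """Join hard-wrapped lines inside each paragraph.

    Assumptions (heuristics, not a general solution):
    - paragraphs are separated by one or more blank lines
    - a line ending in '-' is a word split across two lines
      (this will wrongly merge genuine hyphens such as 'well-' / 'known')
    """
    paragraphs = re.split(r"\n\s*\n", text.strip())
    result = []
    for para in paragraphs:
        joined = ""
        for line in para.splitlines():
            line = line.strip()
            if not line:
                continue
            if joined.endswith("-"):
                # Drop the hyphen and glue the two halves of the word together.
                joined = joined[:-1] + line
            elif joined:
                joined += " " + line
            else:
                joined = line
        result.append(joined)
    return "\n\n".join(result)


if __name__ == "__main__":
    sample = "The quick brown\nfox jumps over\nthe lazy dog.\n\nSecond para-\ngraph here."
    print(unwrap_hard_breaks(sample))
```

You would still want to eyeball the result, since PDFs that don't leave blank lines between paragraphs will come out as one giant paragraph.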
Also, for the data in headers and footers that is hard-coded into some PDFs (such as some converters that actually put the path of the input file into the footer... god, I hate those):
I convert the PDF to TXT, then I use Notepad++.
Use Find to search for a regular expression that matches something in every line of the header/footer, and check the box that says Mark Line. Now all the lines where the term appears will be marked. Then you can go to Edit, where there is an option to delete all marked lines.
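The same mark-and-delete trick can be sketched outside Notepad++ with a few lines of Python, in case you have a lot of files to clean. This is only a minimal sketch: the function name `drop_matching_lines`, the sample text, and the footer pattern are all made up for illustration, and you'd need to write your own regular expression for whatever your converter stamps into the header/footer.

```python
import re


def drop_matching_lines(text: str, pattern: str) -> str:
    """Return text with every line that matches the regex removed."""
    rx = re.compile(pattern)
    kept = [line for line in text.splitlines() if not rx.search(line)]
    return "\n".join(kept)


if __name__ == "__main__":
    # Hypothetical converter output with the input file path in the footer.
    sample = (
        "Chapter 1\n"
        "C:\\docs\\book.doc Page 1\n"
        "It was a dark night.\n"
        "C:\\docs\\book.doc Page 2\n"
        "The end."
    )
    # Hypothetical pattern: a line starting with the path and ending in a page number.
    print(drop_matching_lines(sample, r"^C:\\.*Page \d+$"))
```

Keep the pattern as specific as you can (anchoring it with ^ and $ helps), otherwise you risk deleting body text that happens to contain the same words.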