05-17-2012, 04:13 AM   #13
Junior Member
joesh began at the beginning.
Posts: 5
Karma: 10
Join Date: Oct 2011
Location: Seattle WA
Device: Nook
ldolse - thanks for the considered response, and for the education on how much of the formatting I was hoping to use gets removed by Calibre during preprocessing.

As far as blank lines are concerned, PDF itself certainly doesn't have them, but converters like pdftotext do create them in their text output, as does pdftohtml, I believe.

kiwidude - I really do understand that PDF is, in general, a programming language and that a PostScript interpreter is a fairly large beast. That said, most screenplay PDFs are created by a small handful of programs, which generally produce PDFs that tools like pdftotext can render with pretty high fidelity.

[edit: I stand corrected - I've just found a script produced by one of the big screenwriting programs that's not well rendered by pdftotext et al.]

I'm certainly not looking for perfection here. What pdftotext generates is very satisfactory. Which brings me to a different thought - most eReaders understand plain text, right? Perhaps an easier way to go would be to make a separate tool that rewraps paragraphs to a width appropriate for a given reader, and then just send the resulting text file to the eReader. Comments?
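
To make the rewrap idea concrete, here is a rough sketch in Python of what such a tool might look like. It is only an illustration under some assumptions: the input is plain text (for example, what pdftotext writes), blank lines mark paragraph breaks, and the function name `rewrap` and the default width of 40 columns are made up for this example rather than taken from any real reader's specs.

```python
import re
import textwrap


def rewrap(text, width=40):
    """Rewrap blank-line-separated paragraphs to at most `width` columns.

    `rewrap` and the default width are hypothetical; pick a width that
    suits the target reader.
    """
    # Treat one or more blank (or whitespace-only) lines as a paragraph break.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    wrapped = []
    for para in paragraphs:
        # Collapse the original hard line breaks and indentation into single spaces.
        flat = " ".join(para.split())
        if flat:
            # textwrap.fill re-breaks the paragraph so no line exceeds `width`.
            wrapped.append(textwrap.fill(flat, width=width))
    # Put a single blank line back between paragraphs.
    return "\n\n".join(wrapped)


if __name__ == "__main__":
    import sys

    # Example: pdftotext script.pdf - | python rewrap.py > script.txt
    sys.stdout.write(rewrap(sys.stdin.read()) + "\n")
```

One obvious limitation: collapsing whitespace this way throws away the indentation that screenplays use to set off character names and dialogue, so a real tool would probably need to detect those blocks and treat them separately.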

Last edited by joesh; 05-17-2012 at 06:52 AM. Reason: found that - as kiwidude said - even pdf for screenplays can be knotty to render