Thread: Typos in ebooks
04-10-2010, 04:15 AM   #68
Jellby
frumious Bandersnatch
Quote:
Originally Posted by WarnerYoung
I'm not sure that's strictly true. It depends on how the PDF file was generated. Otherwise, a standard PDF reader wouldn't be able to let you select and copy its text, or search through the text in its files. Or am I missing something here?
Even with text-based PDFs, the PDF does not (necessarily) contain information about words, paragraphs, etc. The characters are easy to extract (unless there are funny fonts involved), but joining words hyphenated at the end of a line, putting spaces where they belong, removing page numbers and headers, dealing with footnotes, putting columns in the right order, detecting paragraphs, etc. is a different matter.
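To give an idea of the guesswork involved, here is a rough Python sketch of three of those steps: dropping page numbers, joining hyphenated line ends, and detecting paragraphs. It assumes the text has already been extracted as a list of lines, one per printed line. The function name `reflow` and all three heuristics are made up for illustration; they are not what any particular converter does.

```python
import re


def reflow(lines):
    """Rebuild paragraphs from raw extracted lines (heuristic sketch)."""
    paragraphs = []
    current = ""
    for raw in lines:
        line = raw.strip()
        # Assumption: a line holding only digits is a page number.
        if re.fullmatch(r"\d+", line):
            continue
        # Assumption: a blank line marks the end of a paragraph.
        if not line:
            if current:
                paragraphs.append(current)
                current = ""
            continue
        if current.endswith("-"):
            # Assumption: a trailing hyphen is an end-of-line word break,
            # so drop it and join the two halves without a space.
            current = current[:-1] + line
        elif current:
            current += " " + line
        else:
            current = line
    if current:
        paragraphs.append(current)
    return paragraphs


page = [
    "The characters are easy to ex-",
    "tract, but rebuilding",
    "paragraphs is harder.",
    "",
    "42",
    "",
    "A second para-",
    "graph starts here.",
]
for paragraph in reflow(page):
    print(paragraph)
```

Every one of these guesses can be wrong. A genuine compound split at the hyphen ("well-" / "known") loses its hyphen, a line that is just a year gets thrown away as a page number, and many PDFs have no blank lines between paragraphs at all. Headers, footnotes and columns are not handled here.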