08-09-2009, 11:08 AM   #1
acj412
Device: Sony Reader 505
Create proper paragraph breaks in ereader2html

When using ereader2html.py to read books on my Sony, I noticed that there is no space between paragraphs and no indentation. Looking closer at the output HTML, I saw that the original paragraph indents are present (as leading spaces) and that paragraphs are separated only by a line break (<br>). This works for ereader files, but HTML ignores the spaces at the beginning of a line. The resulting file is much harder to read, since there is neither an indent nor extra spacing to separate paragraphs.

As a workaround, I replaced this line in the ereader2html.py source code

s = s.replace('\n', '<br>\n')

with

s = s.replace('\n', '<p>\n')

Now individual paragraphs are separated by a blank line. You can also use calibre's "Remove spacing between paragraphs" option if you would rather drop the spacing and use indents instead.
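
To make the difference concrete, here is a small sketch that applies both replacements to a made-up two-paragraph string. The sample text and the two function names are only for illustration and are not part of ereader2html.py; the real script runs the replace on the decoded book text.

```python
# Hypothetical sample: two paragraphs, each indented with spaces and
# terminated by a newline, as described above.
sample = "    First paragraph.\n    Second paragraph.\n"


def with_line_breaks(s):
    # Original behaviour: every newline becomes a <br> line break.
    return s.replace('\n', '<br>\n')


def with_paragraph_tags(s):
    # Workaround: every newline becomes a <p> tag instead.
    return s.replace('\n', '<p>\n')


print(with_line_breaks(sample))
print(with_paragraph_tags(sample))
```

Note that the leading spaces are still in the output either way; it is the <p> tags that give the visible separation.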

Depending on how the publisher uses the newline character (\n) in the original ereader text, this change might introduce other formatting quirks. But in most cases, I think it produces more readable text.
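
If the stray blank lines or unclosed <p> tags turn out to be a problem, one possible variation is to wrap each non-empty line in its own <p>...</p> pair. This is just a sketch and not how ereader2html.py does it: wrap_paragraphs is a hypothetical helper, and it assumes the text is plain at this point, with one paragraph per line.

```python
def wrap_paragraphs(text):
    """Wrap every non-empty line in <p>...</p> (hypothetical helper)."""
    paragraphs = []
    for line in text.split('\n'):
        # Drop the leading indent spaces, which HTML would ignore anyway.
        stripped = line.strip()
        if not stripped:
            # Skip blank lines so they do not become empty paragraphs.
            continue
        paragraphs.append('<p>' + stripped + '</p>')
    return '\n'.join(paragraphs)


print(wrap_paragraphs("    First paragraph.\n\n    Second paragraph.\n"))
```

This still treats every newline as a paragraph boundary, so a book that uses \n for line breaks inside a paragraph would need different handling.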