12-01-2010, 08:48 AM   #1
vbdasc
HTML to TXT and line breaks

The HTML to text conversion in calibre apparently ignores line break tags ( <br /> ), i.e. it converts them to empty strings. As far as I understand, the reason for this behaviour is that far too many HTML documents use line break tags incorrectly, and this hack works around the problem so that the text can reflow properly on any output device. However, I believe the <br /> tag should be converted to a space instead. Consider the following snippet of HTML:

cater<br />pillar

Every web browser will render this as two distinct words (on separate lines), but calibre produces text containing only one word - "caterpillar". That is clearly incorrect. Any opinions on this? Thanks in advance.
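
To make the difference concrete, here is a minimal Python sketch of the two behaviours. It is not calibre's actual conversion code: the function name `html_to_text` and its `br_replacement` parameter are made up for illustration, and the regex-based tag stripping is only a stand-in for a real HTML parser.

```python
import re
from html import unescape

# Hypothetical helper for illustration only; not calibre's implementation.
# Matches <br>, <br/> and <br /> in any letter case (no attributes).
BR_TAG = re.compile(r"<br\s*/?>", re.IGNORECASE)
# Crude match for any remaining tag; a real converter would use an HTML parser.
ANY_TAG = re.compile(r"<[^>]+>")


def html_to_text(html, br_replacement=" "):
    """Strip tags from html, replacing line break tags with br_replacement."""
    text = BR_TAG.sub(br_replacement, html)  # handle <br /> before other tags
    text = ANY_TAG.sub("", text)             # drop every other tag
    return unescape(text)                    # decode entities such as &amp;


# Behaviour described above: the break becomes an empty string.
print(html_to_text("cater<br />pillar", br_replacement=""))  # caterpillar

# Proposed behaviour: the break becomes a space.
print(html_to_text("cater<br />pillar"))  # cater pillar
```

With a space as the replacement the two words stay separate, and the text can still reflow freely because no hard newline is introduced.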