08-11-2008, 04:43 PM   #21
llasram
Reticulator of Tharn
Posts: 618
Karma: 400000
Join Date: Jan 2007
Location: EST
Device: Sony PRS-505
Quote:
Originally Posted by kovidgoyal
@llasram it may be worth implementing a simple custom pretty printer that just inserts newlines at the end of a bunch of predefined tags (like <p>, <br>, <div>, <tr>, the various OPF tags, etc.). If we don't care about indentation, this should be easy enough to do.
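For reference, the suggestion above could look roughly like the sketch below. This is only an illustration, not code from either branch: the tag list, the names BREAK_AFTER and add_newlines, and the choice to break after closing tags (plus standalone <br>) are all assumptions, and the OPF tags are left out.

```python
import re

# Hypothetical tag list for illustration; extend it with OPF tags as needed.
BREAK_AFTER = ("p", "br", "div", "tr")

# Match a closing tag from the list, or a standalone <br> / <br/>,
# but only when it is not already followed by a newline.
_PATTERN = re.compile(
    r"(</(?:%s)>|<br\s*/?>)(?!\n)" % "|".join(BREAK_AFTER),
    re.IGNORECASE,
)


def add_newlines(markup):
    """Insert a newline after each matched tag; no indentation is added."""
    return _PATTERN.sub(r"\1\n", markup)
```

Because it works on the raw text, a sketch like this would also add newlines inside whitespace-sensitive elements such as <pre>, so it is probably best kept optional.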
I tried running the markup through lxml again, this time using a parser with the 'remove_blank_text' option enabled. This fixed the issue with pure-whitespace "tails" causing pretty-printing to fail entirely, at the cost of possibly removing relevant whitespace. I checked several books and didn't see any rendering differences with pretty-printing enabled versus disabled. Still, it seems safest to leave this as an option, so I've implemented the lxml-based '--pretty-print' option and pushed it to my branch.
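A minimal sketch of the lxml approach described above, assuming the markup is well-formed XML passed in as a string. The function name pretty_print_markup is made up for this example and is not the code in the branch.

```python
from lxml import etree


def pretty_print_markup(markup):
    """Re-serialize XML markup with lxml's pretty-printing (illustrative only)."""
    # remove_blank_text drops whitespace-only text nodes, which would
    # otherwise stop lxml from re-indenting the tree. The trade-off is
    # that whitespace that mattered for rendering may be lost.
    parser = etree.XMLParser(remove_blank_text=True)
    root = etree.fromstring(markup, parser)
    return etree.tostring(root, pretty_print=True, encoding="unicode")
```

Calling pretty_print_markup("<div>\n  <p>one</p>   <p>two</p>\n</div>") should return the same elements with each one on its own line.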