Quote:
Originally Posted by leonardjensan
Can someone tell me the best way to do this?
|
That depends on the original format. The route I usually take is to use OpenOffice.
For example, if I have an RTF or text file, I open it in OpenOffice Writer and save it as HTML.
Then I manually edit the HTML until it is completely well-formed. You can check HTML validity with tidy.
http://tidy.sourceforge.net/
This is how I usually clean generated HTML (execute from command line):
Quote:
tidy -raw -c -i -ashtml -o output.html original.html
|
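If you have more than a couple of files, the same call can be wrapped in a small script. This is only a rough sketch in Python, not part of my usual routine: the function names build_tidy_command and clean_html are made up for illustration, and it assumes the tidy binary is on your PATH.

```python
import shutil
import subprocess


def build_tidy_command(src, dst):
    """Return the argument list for the tidy call quoted above (hypothetical helper)."""
    # -raw    : output characters above 127 without converting them to entities
    # -c      : replace presentational tags such as <font> with CSS
    # -i      : indent element content
    # -ashtml : force the output to be HTML rather than XHTML
    # -o dst  : write the cleaned markup to dst instead of stdout
    return ["tidy", "-raw", "-c", "-i", "-ashtml", "-o", dst, src]


def clean_html(src, dst):
    """Run tidy on src; return its exit status, or None if tidy is not installed."""
    if shutil.which("tidy") is None:
        return None
    # tidy exits with 0 when the input was clean, 1 if there were warnings,
    # and 2 if there were errors
    return subprocess.run(build_tidy_command(src, dst)).returncode
```

Calling clean_html("original.html", "output.html") does the same thing as the command line above. A return value of 2 means tidy hit errors, so that file still needs manual fixing.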