I'd still vote for working your way up from plain text in Word (without losing the bolds and italics):
1. In FineReader, export to DOCX as "Formatted Text". For illustrative purposes, here's a screenshot of FineReader 11 (they've added icons now, so there's no more confusion resulting in thousands of text boxes):
2. Use the method I mentioned earlier to save as Plain Text, which will give you a squeaky-clean document with the bolds and italics intact, then do the layout. I always use my own customized 'Quick Styles' here. Much faster.
3. Export as HTML (maybe try Filtered HTML). Then go from there.
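
If "go from there" means tidying the exported HTML by script, here's a rough sketch of one way to do it in Python with only the standard library. This is my own illustration, not something FineReader or Word provides: the `KEEP` and `SKIP` tag lists and the `clean_word_html` name are assumptions, and it simply drops every attribute (classes, inline styles) and unwraps any tag not on the keep list, so adjust to taste.

```python
import html
from html.parser import HTMLParser

# Assumed whitelist: tags that survive (without attributes). Anything else is unwrapped.
KEEP = {"p", "b", "i", "strong", "em", "h1", "h2", "h3", "br"}
# Assumed blacklist: tags whose whole content is thrown away.
SKIP = {"head", "style", "script"}


class WordHtmlCleaner(HTMLParser):
    """Collects a stripped-down copy of the HTML fed to it."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a SKIP element

    def handle_starttag(self, tag, attrs):
        if tag in SKIP:
            self.skip_depth += 1
        elif tag in KEEP and self.skip_depth == 0:
            self.out.append(f"<{tag}>")  # attributes are dropped on purpose

    def handle_endtag(self, tag):
        if tag in SKIP:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in KEEP and tag != "br" and self.skip_depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            # Re-escape text, since the parser hands it over unescaped.
            self.out.append(html.escape(data, quote=False))


def clean_word_html(source):
    """Return source with only the KEEP tags left and no attributes."""
    cleaner = WordHtmlCleaner()
    cleaner.feed(source)
    cleaner.close()
    return "".join(cleaner.out)


sample = (
    '<html><head><style>p.MsoNormal{margin:0}</style></head><body>'
    '<p class=MsoNormal style="margin:0"><span lang=EN-US>'
    'Some <b>bold</b> and <i>italic</i> text</span></p></body></html>'
)
print(clean_word_html(sample))
```

Since the bolds and italics come through as plain `<b>` and `<i>` tags, they survive the cleanup while the Word-specific classes and spans don't.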