As someone who comes from an analog electronics background, I can tell you that every conversion loses detail and inserts artifacts.
Conversions should always start from the copy closest to the original, the one with the most detail (the music industry calls these master recordings; your digital camera calls it RAW format).
A smaller file size could be the result of compression (detail loss possible) or of stripping out inserted noise (MS Word cr*p), either of which may make the file a less desirable conversion source. Only trained eyes can tell whether removing the inserted noise will have no further impact.
TXT has little noise and little detail (no fonts, simplistic spacing/alignment).
PDF has lots of detail, but prevents clean translation (its purpose is to render in a single way, not to be converted).
Then there is the source that has been leaned down to fit a tiny display (phone). The original W3C CSS media types had no provision for display size, aspect ratio, or color depth (CSS3 media queries added these later), which forced the use of a single CSS for every 'screen' type device.
You can never truly recover lost data (resolution). Archive the original source if you need to modify the file to compact the output.
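
To illustrate the point about lost resolution, here is a minimal Python sketch. It is not tied to any real file format: the sample values and the helper names `downsample` and `upsample` are made up for the example. It throws away every other sample (the "conversion") and then tries to stretch the result back to the original length.

```python
def downsample(samples, factor):
    """Keep every `factor`-th sample (a stand-in for a lossy conversion)."""
    return samples[::factor]


def upsample(samples, factor):
    """Stretch back to the original length by repeating each sample."""
    return [s for s in samples for _ in range(factor)]


# Hypothetical "master" data; any sequence of values would do.
original = [0, 3, 7, 2, 9, 4, 1, 8]

converted = downsample(original, 2)   # half the size, half the detail
restored = upsample(converted, 2)     # same length as the original again

# Count the positions where the restored copy no longer matches the master.
lost = sum(1 for a, b in zip(original, restored) if a != b)
print(f"{lost} of {len(original)} samples differ after the round trip")
```

The restored list is the right length, but the discarded samples can only be guessed at, which is why the untouched original is the copy worth archiving.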