Don't start from the PDFs; use the Quark source instead.
Export to XPress Tags, HTML, or some other tagged format, then massage that, adding back in anything that wasn't in the main text flow (or get a specialized XTension or utility such as textractor).
PDFs convert the formatting into localized text changes and positional information, which is difficult to extract. If you must use a PDF as a source, use a utility such as Marcel Weiher's TextLightning.app, which will analyze that positional information and then allow you to use global search-and-replace techniques to convert the local formatting into proper styles.
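As a rough illustration of the global search-and-replace step, here is a minimal Python sketch. It assumes the extracted text marks local formatting with simple paired tags such as <b>...</b> and <i>...</i>; those tag names, the style names, and the function local_to_styles are all hypothetical and are not the actual output of TextLightning or XPress Tags, so adjust the patterns to whatever your export really contains.

```python
import re

# Hypothetical mapping from locally applied formatting tags to named styles.
# Both the tag names and the style names are assumptions for illustration.
LOCAL_TO_STYLE = {
    "b": "strong",
    "i": "emphasis",
}


def local_to_styles(text: str) -> str:
    """Rewrite each locally formatted run as a span carrying a named style."""
    for tag, style in LOCAL_TO_STYLE.items():
        # Non-greedy match so each tagged run is converted on its own;
        # DOTALL lets a run span line breaks.
        pattern = re.compile(rf"<{tag}>(.*?)</{tag}>", re.DOTALL)
        text = pattern.sub(rf'<span class="{style}">\1</span>', text)
    return text


if __name__ == "__main__":
    print(local_to_styles("A <b>bold</b> word and an <i>italic</i> one."))
```

The point is that one pass per formatting type converts every occurrence in the document at once, which is much less work than restyling each run by hand.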
William