10-29-2009, 07:11 AM   #6
WillAdams
Wizard
Don't start from the PDFs. Use the Quark source instead.

Export to XPress Tags, HTML, or some other tagged format, then clean that up, adding back anything that wasn't in the main text flow (or get a specialized XTension or utility such as textractor).
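
As a rough idea of what the clean-up step might look like, here is a minimal Python sketch. It assumes a simplified tagged export in which a paragraph starts with `@StyleName:` and the style carries over to following paragraphs until another one is named. The style names and the `STYLE_TO_TAG` mapping are made up for illustration, and inline character tags are not handled at all, so treat it as a starting point rather than a real XPress Tags parser.

```python
import html
import re

# Hypothetical mapping from style sheet names to HTML elements.
# Replace these with the style names used in your own document.
STYLE_TO_TAG = {"Heading": "h2", "Body": "p"}

# A paragraph that begins with "@StyleName:" (simplified assumption).
PARA_RE = re.compile(r"^@(?P<style>[^:]+):(?P<text>.*)$")


def tagged_to_html(source: str) -> str:
    """Convert simplified tagged text to HTML, one element per paragraph."""
    out = []
    tag = "p"  # default element until a style is named
    for line in source.splitlines():
        if not line.strip():
            continue  # skip empty lines
        match = PARA_RE.match(line)
        if match:
            # Unknown style names fall back to a plain paragraph.
            tag = STYLE_TO_TAG.get(match.group("style"), "p")
            text = match.group("text")
        else:
            text = line  # no style prefix: keep the current style
        out.append(f"<{tag}>{html.escape(text)}</{tag}>")
    return "\n".join(out)


sample = "@Heading:Chapter One\n@Body:It was a dark night.\nSecond paragraph."
result = tagged_to_html(sample)
```

Anything that lived outside the main text flow (captions, sidebars, and so on) would still have to be added back by hand afterwards.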

PDFs reduce the formatting to localized text changes and positional information, which are difficult to extract. If you must use a PDF as the source, use a utility such as Marcel Weiher's TextLightning.app, which analyzes that positional information and then lets you use global search-and-replace to convert the local formatting into proper styles.
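
To illustrate the global search-and-replace idea, here is a small Python sketch. It assumes the extracted text arrives as HTML-like markup with inline `style` attributes; that input format and the rules in `RULES` are hypothetical and are not meant to represent what TextLightning actually produces. Each rule maps one local-formatting signature to a structural element.

```python
import re

# Hypothetical rules: (pattern for a local-formatting run, replacement style).
# Adjust the patterns to whatever signatures your extracted text really has.
RULES = [
    (
        re.compile(r'<span style="font-size:18pt; ?font-weight:bold">(.*?)</span>'),
        r"<h2>\1</h2>",
    ),
    (
        re.compile(r'<span style="font-style:italic">(.*?)</span>'),
        r"<em>\1</em>",
    ),
]


def apply_rules(text: str) -> str:
    """Apply every rule across the whole text, in order."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


sample = (
    '<span style="font-size:18pt;font-weight:bold">Intro</span> see '
    '<span style="font-style:italic">Moby-Dick</span>'
)
converted = apply_rules(sample)
```

The same approach works in any editor with regular-expression search and replace. The point is that one rule fixes every occurrence of a given formatting run at once.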

William