If it's an ebook it's easiest to fix in Calibre.
I'm no expert on regex, so I decline to look like an idiot and post the idiotic regexes I use in LO Writer.
I replace all tabs with a space.
I replace multiple spaces with a space.
I replace a space at the start of a paragraph with nothing.
I replace a space at the end of a paragraph with nothing.
I replace empty paragraphs with nothing.
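For anyone who wants the same clean-up outside Writer, here is a rough Python sketch of those five steps. These are not the regexes used in LO Writer; the function name `clean_paragraphs` is made up, and it assumes plain text with one paragraph per line.

```python
import re

def clean_paragraphs(text: str) -> str:
    """Whitespace clean-up, assuming each line of text is one paragraph."""
    text = text.replace("\t", " ")                      # tabs -> a space
    text = re.sub(r" {2,}", " ", text)                  # multiple spaces -> one space
    text = re.sub(r"^ ", "", text, flags=re.MULTILINE)  # space at start of paragraph
    text = re.sub(r" $", "", text, flags=re.MULTILINE)  # space at end of paragraph
    # Drop empty paragraphs (assumption: an empty line is an empty paragraph).
    return "\n".join(p for p in text.split("\n") if p != "")
```

The order matters: tabs go first so that a tab next to a space collapses with it, and empty paragraphs go last so that a paragraph holding only whitespace is caught too.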
I have regexes to find illegal spaces before punctuation (illegal in English, that is; French is different).
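As an illustration only, a check for the English case might look like the Python below. The pattern and the name `find_space_before_punctuation` are assumptions, not the regexes referred to above, and it will also flag things like a spaced ellipsis or " .5", so treat the hits as candidates to review rather than errors to fix blindly.

```python
import re

# Assumption: in English, none of these marks should have a space before them.
SPACE_BEFORE_PUNCT = re.compile(r" +[,.;:!?]")

def find_space_before_punctuation(text: str) -> list[tuple[int, str]]:
    """Return (offset, matched text) for each space-before-punctuation hit."""
    return [(m.start(), m.group()) for m in SPACE_BEFORE_PUNCT.finditer(text)]
```

For French text the rule would have to be different, since a space before `;`, `:`, `!` and `?` is expected there.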
I have a spreadsheet with the list of docs in the first column; the headings of the other columns are the regexes, ready to copy/paste. Once a regex has been run on a doc, I put a checkmark in that column. Other sheets have revision level, status, etc.
Only paragraphs that are headings or certain lists (e.g. contents) should end with no punctuation. Those can be found with a regex.
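A minimal sketch of that kind of search, in Python rather than Writer's find and replace. The set of acceptable closing characters and the name `find_unpunctuated` are assumptions; it takes plain text with one paragraph per line and lists the paragraphs that do not end in punctuation, so you can check whether each one really is a heading or list entry.

```python
import re

# Assumption: a paragraph may properly end with one of these characters
# (sentence punctuation, or a closing quote/bracket that follows it).
CLOSERS = ".!?:;\"')\u201d\u2019"
ENDS_WITHOUT_PUNCT = re.compile("[^\\s" + re.escape(CLOSERS) + "]$")

def find_unpunctuated(text: str) -> list[str]:
    """Return the paragraphs (lines) whose last character is not a closer."""
    return [p for p in text.split("\n") if ENDS_WITHOUT_PUNCT.search(p)]
```

It works best after the whitespace clean-up, because a paragraph with a trailing space is not reported.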
However, if it's a downloaded ebook rather than a source docx/odt, then regex and Diap's Editing Toolbag (global changes to HTML tags etc.) in Calibre are best.