Old 12-10-2011, 01:32 PM   #8
SBT
Fanatic
Posts: 580
Karma: 810184
Join Date: Sep 2010
Location: Norway
Device: prs-t1, tablet, Nook Simple, assorted kindles, iPad
convert from proof-reading html back to text

@opitz: Thanks for your kind words; any kind of feedback is welcome.

When you've finished proofreading in LibreOffice, or just want to return to editing in a pure text editor, you can use the following function, which is the reverse of makeproofread (which I think I'll rename txt2proof, so there'll be some consistency).

I thought it would be a good idea to read input from STDIN if no filename is given; I'll probably add that functionality to all functions where appropriate.
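For anyone who wants to reuse the STDIN fallback in their own functions, here is a minimal sketch of the idea. The name count_lines is purely hypothetical and not part of my scripts, and it assumes a system that provides /dev/stdin (Linux and most other Unix-likes do).

```shell
# Hypothetical helper, only to illustrate the fallback; not part of the scripts.
count_lines() {
    # Use the first argument if it is set and non-empty, otherwise read STDIN.
    inputfile="${1:-/dev/stdin}"
    wc -l < "$inputfile"
}
```

Both `count_lines book.txt` and `cat book.txt | count_lines` should then work the same way.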

Code:
function proof2txt {
# Usage: proof2txt [inputfile.html].
# If no inputfile, input is read from STDIN.
# Output written to STDOUT
# Quote the argument so that filenames containing spaces work.
inputfile="${1:-/dev/stdin}"
# Handle text marked as italic/bold.
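# LibreOffice inserts </I> and <I> (ditto for bold) at the end and beginning
# of each line when an italic section spans several lines; drop those pairs.
# The enclosing < and > of the remaining italic/bold tags are then replaced
# by the HTML entities &lt; and &gt;, so the tags appear as literal text.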
# -n suppresses sed's automatic printing, so only the joined buffer is output.
sed -n '1h;1!H;${g;s/<\/I>\n<I>/\n/g;s/<\/B>\n<B>/\n/g;p;}' "$inputfile" |\
sed 's/<\(\/\?[BI]\)>/\&lt;\1\&gt;/g' |\
lynx -dump -stdin |\
# Drop lines that hold only a three-digit image placeholder, e.g. [001.jpg].
grep -v "^   \[[0-9]\{3\}\.jpg\]"
}
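To see what the two sed stages do without needing lynx, something like the following can be tried. It is only a sketch: escape_markup is a hypothetical name I'm using here for illustration, the sample HTML is made up, and GNU sed is assumed (for \n in the pattern and for \?).

```shell
# Hypothetical helper that isolates the two sed stages (GNU sed assumed).
# Stage 1 collects all lines in the hold space and removes the </I><I> and
# </B><B> pairs that only exist because of a line break.
# Stage 2 turns the remaining <I>, </I>, <B> and </B> into HTML entities.
escape_markup() {
    sed -n '1h;1!H;${g;s/<\/I>\n<I>/\n/g;s/<\/B>\n<B>/\n/g;p;}' |
    sed 's/<\(\/\?[BI]\)>/\&lt;\1\&gt;/g'
}

# Made-up sample: one italic passage that LibreOffice split over two lines.
printf '<P><I>first line</I>\n<I>second line</I></P>\n' | escape_markup
# Prints:
# <P>&lt;I&gt;first line
# second line&lt;/I&gt;</P>
```

The paragraph tag is left alone for lynx to interpret, while the italic markers are escaped and so come through as plain <I> and </I> in the final text.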

Last edited by SBT; 12-10-2011 at 01:36 PM.