04-25-2012, 04:06 AM   #3
SBT
Fanatic
Posts: 580
Karma: 810184
Join Date: Sep 2010
Location: Norway
Device: prs-t1, tablet, Nook Simple, assorted kindles, iPad
I had a look at diff3, but it doesn't seem to be able to pick the 'best out of three' automagically, so I ended up writing the following bash script. Given versions a.txt, b.txt, c.txt, ..., it basically does this:
  1. find the lines that differ between a and b
  2. for each such line, poll all file versions for the two variants
  3. select the variant with the most hits.
This can be iterated: repeat the process with c and d, and so on, then diff the refined versions against each other, and so on.
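The poll in steps 2 and 3 can be sketched on its own, roughly like this. It is a minimal sketch, not the script that follows: `count_line` and `pick_best` are hypothetical helper names, and it assumes GNU grep and that ties should keep the first candidate.

```shell
#!/bin/bash
# Hypothetical helper: count exact whole-line occurrences of $1 across the given files.
count_line() {
  local line=$1; shift
  # -x: whole line, -F: fixed string (no regex), -c: print the count
  cat -- "$@" | grep -cxF -e "$line" || true
}

# Hypothetical helper: print whichever of two candidate lines gets more hits.
# Ties keep the first candidate (an assumption, not something the post specifies).
pick_best() {
  local a=$1 b=$2; shift 2
  local na nb
  na=$(count_line "$a" "$@")
  nb=$(count_line "$b" "$@")
  if (( nb > na )); then
    printf '%s\n' "$b"
  else
    printf '%s\n' "$a"
  fi
}
```

Called as `pick_best "$a" "$b" ?.txt`, it would print the variant that more of the one-letter versions agree on.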

Code:
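tab=$'\t'
cp a.txt ab.txt   # ab.txt starts as a copy of a.txt and is corrected in place
# Side-by-side diff (GNU diff); a '|' followed by a tab marks a changed line.
# -W widens the output so long lines are not truncated; raise it if needed.
diff -y -W 1000 a.txt b.txt | grep "|$tab" |
while IFS= read -r l
do
  a=${l%%"$tab"*}   # left column: the line as it appears in a.txt
  b=${l##*"$tab"}   # right column: the line as it appears in b.txt
  # Poll: count exact whole-line matches in all one-letter versions (a.txt, b.txt, ...)
  na=$(cat ?.txt | grep -cxF -e "$a")
  nb=$(cat ?.txt | grep -cxF -e "$b")
  sd=$(( ${#b} - ${#a} ))
  sd=${sd#-}   # absolute value of the length difference
  # assume the lines are not similar if the lengths differ by 3 or more
  if (( na < nb && sd < 3 )); then
    # Escape regex and replacement metacharacters before handing the text to sed
    ea=$(printf '%s\n' "$a" | sed 's|[][\\/.*^$]|\\&|g')
    eb=$(printf '%s\n' "$b" | sed 's|[\\/&]|\\&|g')
    sed -i "/^$ea\$/s/.*/# $eb/" ab.txt   # the leading "# " marks a replaced line
  fi
done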
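To see how the loop splits one line of `diff -y` output into its two columns, here is a small sketch on a hand-built sample line. The sample text is made up, and the padding (tabs, then spaces, then `|` and a tab) is an assumption about what GNU diff emits, so treat it as an illustration only.

```shell
#!/bin/bash
tab=$'\t'
# Made-up sample of one changed line as `diff -y` might print it
l="the quiick brown fox${tab}${tab}      |${tab}the quick brown fox"

a=${l%%"$tab"*}   # strip from the first tab onwards: left column
b=${l##*"$tab"}   # strip up to the last tab: right column

sd=$(( ${#b} - ${#a} ))   # 19 - 20 = -1
sd=${sd#-}                # drop a leading minus sign: absolute value

printf 'left:  %s\nright: %s\nlength difference: %s\n' "$a" "$b" "$sd"
```

Note that this splitting only works when the text lines themselves contain no tabs.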