Old 03-14-2012, 04:05 PM   #6
SBT
Here's how to list the characters used in an XHTML file (tags excluded) from a Unix bash shell:
Code:
sed -e 's/<[^>]\+>//g' -e 's/./&\n/g' file.xhtml | sort -u | tr "\n" " "
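Note that `\+` and the `\n` in the replacement are GNU sed extensions, so the command above may not behave the same with BSD or macOS sed. Here is a rough, more portable sketch. It swaps the second sed expression for `grep -o .`, which is supported by GNU and BSD grep but is not required by POSIX. The function name `chars_used` and the sample file are made up for illustration.

```shell
# Hypothetical helper: print the unique characters in a file, tags excluded.
# '<[^>][^>]*>' is the portable spelling of GNU sed's '<[^>]\+>'.
# 'grep -o .' prints every character on its own line (swapped in for the
# GNU-only 's/./&\n/g'); it is a common extension, not a POSIX feature.
chars_used() {
    sed -e 's/<[^>][^>]*>//g' "$1" | grep -o . | sort -u | tr '\n' ' '
}

# Throwaway sample file, invented for this example.
sample=$(mktemp)
printf '<p>abba <b>cab</b></p>\n' > "$sample"

# Prints the space character plus a, b and c, separated by spaces.
chars_used "$sample"
```

As with the original, a tag that is split across two lines is not stripped, since sed works line by line.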
If you want to just find the characters in headers, you can try:
Code:
grep "<h[1-4]" OEBPS/vol1/12.xhtml | sed -e 's/<[^>]\+>//g' -e 's/./&\n/g' | sort -u | tr "\n" " "
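To check the headers of several files at once, something like the sketch below might do. It assumes GNU sed, as in the commands above. The function name `header_chars` and the sample files are made up for illustration. The one real change is `grep -h`: without it, grep prefixes each match with the file name when given more than one file, and those file-name characters would end up in the result.

```shell
# Hypothetical helper: unique characters found on h1-h4 header lines
# of all files passed as arguments. Assumes GNU sed (\+ and \n escapes).
header_chars() {
    # -h suppresses the "filename:" prefix grep adds for multiple files.
    grep -h '<h[1-4]' "$@" \
        | sed -e 's/<[^>]\+>//g' -e 's/./&\n/g' \
        | sort -u \
        | tr '\n' ' '
}

# Throwaway sample files, invented for this example.
dir=$(mktemp -d)
printf '<h1>Ab</h1>\n<p>xyz</p>\n' > "$dir/1.xhtml"
printf '<h2>bc</h2>\n' > "$dir/2.xhtml"

# Only the header lines are considered, so x, y and z are not reported.
header_chars "$dir"/*.xhtml
```

Keep in mind that grep only selects lines containing the opening `<h1>` to `<h4>` tag, so header text that continues on a following line is missed.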