I have a script that extracts the TOC from an EPUB file. Is there a way to retrieve the first line, or the first x characters, from the HTML file that each TOC entry points to? My understanding is that each TOC entry points to a specific HTML file, so I'd like to open that file and retrieve the first line or the first x characters of its first paragraph. I can already get the TOC with the following script (thanks to cas):
Code:
#! /bin/bash
# This script needs InfoZIP's unzip program
# and the xml2 tool from http://ofb.net/~egnor/xml2/
# and sed, of course.
EPUB_LIST=(my*.epub)
for f in "${EPUB_LIST[@]}"; do
    echo "$f:"
    unzip -p "$f" OEBPS/toc.ncx |
        xml2 |
        sed -n -e 's:^/ncx/navMap/navPoint/navLabel/text=: :p'
    echo
done
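
For reference, here is a rough sketch of one direction this could take, not a tested solution. It assumes xml2 flattens each entry's target as a line of the form /ncx/navMap/navPoint/content/@src=..., that those paths are relative to OEBPS/ (where toc.ncx lives), and that stripping tags with sed is good enough for a preview. The function names toc_sources and first_chars and the 80-character limit are made up for illustration.

```shell
#!/bin/bash
# Sketch only: toc_sources and first_chars are hypothetical helper names.

# Read xml2 output of toc.ncx on stdin and print the file each top-level
# TOC entry points to, with any "#fragment" suffix dropped.
toc_sources() {
    sed -n -e 's:^/ncx/navMap/navPoint/content/@src=\([^#]*\).*:\1:p'
}

# Read HTML on stdin and print the first $1 characters of the first <p>.
# Heuristic: entities such as &nbsp; are not decoded, and awk may count
# bytes rather than characters depending on the implementation and locale.
first_chars() {
    # Join lines so a paragraph may span several, cut at the first </p>,
    # drop everything up to its opening <p ...>, strip the remaining tags,
    # squeeze whitespace, then truncate.
    tr '\n' ' ' |
    sed -e 's/<\/[pP]>.*//' \
        -e 's/^.*<[pP]\([[:space:]][^>]*\)\{0,1\}>//' \
        -e 's/<[^>]*>//g' \
        -e 's/[[:space:]][[:space:]]*/ /g' \
        -e 's/^ //' |
    awk -v n="$1" '{ print substr($0, 1, n) }'
}

for f in my*.epub; do
    [ -e "$f" ] || continue    # skip when the glob matched nothing
    echo "$f:"
    unzip -p "$f" OEBPS/toc.ncx | xml2 | toc_sources |
    while IFS= read -r src; do
        printf '  %s: ' "$src"
        # Assumption: src is relative to OEBPS/ and is not URL-encoded.
        unzip -p "$f" "OEBPS/$src" </dev/null | first_chars 80
    done
    echo
done
```

If several TOC entries point into the same file with different fragments, that file's opening text is printed once per entry. Honouring the fragment itself would need a real HTML parser rather than sed.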