MobileRead Forums - View Single Post - extract first line of html/text file pointed by TOC from epub

michaelbr · 09-03-2022, 04:44 AM

I have a script that extracts the TOC from an EPUB file. Is there a way to retrieve the first line, or the first x characters, from the HTML file that each TOC entry points to? My understanding is that each TOC entry points to a specific HTML file, so I'd like to open that file and retrieve the first line or the first x characters of its first paragraph. I can already get the TOC with the following script (thanks to cas):

Code:

#! /bin/bash

# This script needs InfoZIP's unzip program
# and the xml2 tool from http://ofb.net/~egnor/xml2/
# and sed, of course.

EPUB_LIST=(my*.epub)

for f in "${EPUB_LIST[@]}"; do
    echo "$f:"
    unzip -p "$f" OEBPS/toc.ncx | 
        xml2 | 
        sed -n -e 's:^/ncx/navMap/navPoint/navLabel/text=:  :p'
    echo
done
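
One possible way to go from the TOC entries to the text they point at, in the same unzip/xml2/sed style, is sketched below. It is only a rough sketch under several assumptions: that xml2 prints the target of each entry as /ncx/navMap/navPoint/content/@src=..., that the content files sit next to toc.ncx under OEBPS/, that GNU sed is available (for \n in the replacement), and that the markup is simple enough for regex-based tag stripping. The function names first_para_chars and toc_first_lines are made up for illustration. HTML entities are not decoded and nested navPoint entries are not followed.

```shell
#!/bin/bash

# Sketch: read HTML on stdin and print the first N characters (default 80)
# of the text inside the first <p>...</p>. Regex-based, not a real parser.
first_para_chars() {
    local n="${1:-80}" text
    text=$(
        tr '\r\n' '  ' |                    # join lines so a paragraph may span lines
        sed -e 's/<\/p>/\n/g' |             # one line per closed paragraph (GNU sed)
        sed -n -e 's/^.*<p\([[:space:]][^>]*\)\{0,1\}>//p' |  # keep text after the <p ...> tag
        head -n 1 |                         # first paragraph only
        sed -e 's/<[^>]*>//g' \
            -e 's/[[:space:]]\{1,\}/ /g' \
            -e 's/^ //' -e 's/ $//'         # drop inline tags, tidy whitespace
    )
    printf '%s\n' "${text:0:n}"
}

# Sketch: for each top-level TOC entry of one epub, print the target file
# and the start of its first paragraph. Assumes everything lives in OEBPS/.
toc_first_lines() {
    local f="$1" n="${2:-80}" src
    unzip -p "$f" OEBPS/toc.ncx | xml2 |
        sed -n -e 's:^/ncx/navMap/navPoint/content/@src=::p' |
        while IFS= read -r src; do
            src="${src%%#*}"                # drop any #fragment after the file name
            printf '  %s: ' "$src"
            unzip -p "$f" "OEBPS/$src" </dev/null | first_para_chars "$n"
        done
}
```

Inside the existing loop this could be called as toc_first_lines "$f" 60 in place of the unzip pipeline. If the book keeps its files somewhere other than OEBPS/, or uses URL-encoded names in toc.ncx, the paths would need adjusting.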
