08-27-2009, 12:37 AM   #1
Waltarro
Junior Member
 
Posts: 6
Karma: 32
Join Date: Sep 2008
Device: Sony PRS505
Extract html from epub

I got a little tired of manually extracting the HTML from epub
files when I wanted to just read the book in a browser. Just messing
around with bash, I came up with a simple script to do the job.

It's pretty crude, and I know I should have read the metadata.opf
(and probably would have if I had done this in Java or Python). Anyway,
I thought I would share it nonetheless. It works on Linux and might work
on a Mac with a few tweaks. Just pass in the epub file as the first
parameter.

Code:
#!/bin/bash

# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.

# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.

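if [ -z "$1" ]; then echo "usage: $0 book.epub" >&2; exit 1; fi

bookname=$1
workdir=/tmp/epub2html

# Unpack the epub (a zip archive) into a scratch directory.
unzip -q "$bookname" -d "$workdir"

# Find the first part, e.g. content/<prefix>1.html. Stripping the directory
# and the trailing "1.html" leaves the prefix shared by all parts.
first=$(find "$workdir/content" -regex '.*_1\.html' | head -n 1)
prefix=${first#"$workdir/content/"}
prefix=${prefix%1.html}

# Count the parts so we know how far to iterate.
files=$(ls "$workdir/content/$prefix"*.html | wc -l)

# Append the parts in numeric order to <bookname>.html.
for x in $(seq 0 "$files"); do
  filepart="$workdir/content/$prefix$x.html"
  if [ -e "$filepart" ]; then
    cat "$filepart" >> "${bookname%.epub}.html"
  fi
done

# Copy over the images if you want them (cp -t is a GNU option).
mkdir -p resources
cp "$workdir"/content/resources/*.jpg "$workdir"/content/resources/*.png -t ./resources 2> /dev/null

rm -R "$workdir"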
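
On the metadata.opf point: the following is only a rough sketch of how the reading order could be taken from the OPF file instead of relying on the `_1.html` naming pattern. The function name `spine_order` is made up for illustration. It assumes every `<item>` and `<itemref>` element sits on a single line and uses double-quoted attributes, which is common but not guaranteed, so a real XML parser would be more robust.

```shell
#!/bin/bash
# Sketch (assumption, not part of the script above): print the hrefs of the
# content files in spine order, given the path to an OPF file.
spine_order() {
  local opf=$1 idref
  # Each <itemref idref="..."> in the spine names a manifest item id.
  grep -o '<itemref[^>]*>' "$opf" |
    sed -n 's/.*idref="\([^"]*\)".*/\1/p' |
    while IFS= read -r idref; do
      # Look up the manifest <item> with that id and print its href.
      grep -o '<item [^>]*>' "$opf" |
        grep -F " id=\"$idref\"" |
        sed -n 's/.*href="\([^"]*\)".*/\1/p'
    done
}

# Hypothetical usage, once the epub has been unpacked:
#   spine_order /tmp/epub2html/content/metadata.opf
```

The printed hrefs are relative to the directory holding the OPF file, so they could be fed to `cat` in that order to build the single HTML file.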
08-27-2009, 12:44 AM   #2
frabjous
Wizard
Posts: 1,213
Karma: 12890
Join Date: Feb 2009
Location: Amherst, Massachusetts, USA
Device: Sony PRS-505
Thanks!

You might also want to check out jellby's script here:
http://www.mobileread.com/forums/showthread.php?t=51267
08-27-2009, 12:55 AM   #3
Waltarro
You're right, I remember reading the post but saw the reference
to javascript so didn't think much of it... On the plus side I did
learn some new things about bash scripting so it wasn't a waste.

Thanks
08-27-2009, 05:21 AM   #4
Jellby
frumious Bandersnatch
Posts: 5,984
Karma: 4346919
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
Quote:
Originally Posted by Waltarro
You're right, I remember reading the post but saw the reference
to javascript so didn't think much of it... On the plus side I did
learn some new things about bash scripting so it wasn't a waste.
The javascript is only used to have some kind of "reader" in the browser and to override the epub's CSS. If you only want to extract the XHTML files, you don't need it.
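
For the "only extract the XHTML files" case, something along these lines might be enough. It is a sketch, not taken from either script: the function names `html_members` and `extract_html` are invented here, and it assumes the content files end in .html, .xhtml or .htm.

```shell
#!/bin/bash
# Keep only the (X)HTML entries from a list of archive member names on stdin.
html_members() {
  grep -Ei '\.(xhtml|html|htm)$'
}

# Extract just the (X)HTML entries of an epub into a directory, keeping paths.
# unzip warns about any pattern that matches nothing in the archive.
extract_html() {
  local epub=$1 dest=$2
  unzip -q -o "$epub" '*.html' '*.xhtml' '*.htm' -d "$dest"
}

# Hypothetical usage:
#   unzip -Z1 book.epub | html_members    # list the content files
#   extract_html book.epub ./book_html    # pull them out
```

Note that this says nothing about reading order; the files come out under whatever names and paths the epub uses.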
All times are GMT -4.