#1
Wizard
Posts: 2,623
Karma: 3000161
Join Date: Jan 2009
Device: Kindle PW3 (wifi)
Work on an unzipped EPUB xhtml file
Hi,
I use Linux. If I unzip an EPUB, I can run a Python script from a terminal on the .xhtml files, which lets me perform some tasks that I cannot do directly on the EPUB. However, it is not as easy as it sounds, especially when it comes to saving the output. Are there any recommendations for modifying these .xhtml files safely? The goal is to modify one of these files and import it back into the EPUB. This is how the script currently looks.
Spoiler:
Am I missing something obvious? Any practical recommendation is appreciated.
Last edited by roger64; 01-15-2016 at 07:28 PM.
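One common way to make the saving step safer is to write the modified text to a temporary file next to the original and only then swap it in, so that a crash mid-write cannot leave a half-written .xhtml file. The following is a minimal sketch of that idea, not the script from the spoiler; `rewrite_file_safely` and `transform` are hypothetical names, and UTF-8 encoding is assumed.

```python
import os
import shutil
import tempfile


def rewrite_file_safely(path, transform, encoding="utf-8"):
    """Apply transform(text) -> text to a file and replace it atomically.

    Returns True if the file was rewritten, False if nothing changed.
    """
    # newline="" keeps the original line endings untouched.
    with open(path, "r", encoding=encoding, newline="") as src:
        original = src.read()
    updated = transform(original)
    if updated == original:
        return False

    # Write to a temporary file in the same directory, then swap it in.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding=encoding, newline="") as dst:
            dst.write(updated)
        shutil.copymode(path, tmp_path)  # keep the original permissions
        os.replace(tmp_path, path)       # atomic on the same filesystem
    except BaseException:
        os.unlink(tmp_path)              # clean up the temporary file
        raise
    return True
```

It could be called as `rewrite_file_safely("chapter1.xhtml", my_fix)`, where `my_fix` is whatever function performs the actual text changes.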
#2
Grand Sorcerer
Posts: 5,680
Karma: 23983815
Join Date: Dec 2010
Device: Kindle PW2
AFAIK, an EPUB archive has to be zipped in a particular order: the mimetype file must be added first and stored uncompressed.
The Sigil plugin runner routines contain this Python 3 code, which worked fine for me:
Code:

    import os
    import zipfile
    from zipfile import ZipFile

    # unipath (with its pathof and walk helpers) ships with Sigil's plugin
    # runner; it is not part of the Python standard library.
    import unipath
    from unipath import pathof

    epub_mimetype = b'application/epub+zip'

    def unzip_epub_to_dir(path_to_epub, destdir):
        # Extract every archive member below destdir, creating folders as needed.
        f = open(pathof(path_to_epub), 'rb')
        sz = ZipFile(f)
        for name in sz.namelist():
            data = sz.read(name)
            name = name.replace("/", os.sep)
            filepath = os.path.join(destdir, name)
            basedir = os.path.dirname(filepath)
            if not os.path.isdir(basedir):
                os.makedirs(basedir)
            with open(filepath, 'wb') as fp:
                fp.write(data)
        f.close()

    def epub_zip_up_book_contents(ebook_path, epub_filepath):
        outzip = zipfile.ZipFile(pathof(epub_filepath), 'w')
        files = unipath.walk(ebook_path)
        # The mimetype file goes in first and is stored without compression.
        if 'mimetype' in files:
            outzip.write(pathof(os.path.join(ebook_path, 'mimetype')), pathof('mimetype'), zipfile.ZIP_STORED)
        else:
            raise Exception('mimetype file is missing')
        files.remove('mimetype')
        # Everything else is added afterwards, compressed.
        for file in files:
            filepath = os.path.join(ebook_path, file)
            outzip.write(pathof(filepath), pathof(file), zipfile.ZIP_DEFLATED)
        outzip.close()

Since you're a Linux user, you could also use a shell script. Alternatively, you could run your Python code in the Calibre editor as a function or write a Sigil plugin. That way, all the packing and unpacking is handled by the hosting app.
Last edited by Doitsu; 01-16-2016 at 02:57 AM.
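Outside Sigil, the same packing rule (mimetype first, stored uncompressed) can be sketched with the standard library alone. This is a rough equivalent under stated assumptions, not Sigil's code: `zip_epub` and `mimetype_is_valid` are hypothetical names, and the output file is assumed to live outside the unpacked folder so that it does not end up inside itself.

```python
import os
import zipfile

MIMETYPE = b"application/epub+zip"


def zip_epub(src_dir, epub_path):
    """Zip an unpacked EPUB folder, writing 'mimetype' first and uncompressed.

    epub_path is assumed to be outside src_dir.
    """
    mimetype_path = os.path.join(src_dir, "mimetype")
    if not os.path.isfile(mimetype_path):
        raise FileNotFoundError("mimetype file is missing")
    with zipfile.ZipFile(epub_path, "w") as outzip:
        outzip.write(mimetype_path, "mimetype", zipfile.ZIP_STORED)
        for root, dirs, files in os.walk(src_dir):
            dirs.sort()  # deterministic traversal order
            for name in sorted(files):
                full = os.path.join(root, name)
                # Archive names are relative to src_dir and use forward slashes.
                arcname = os.path.relpath(full, src_dir).replace(os.sep, "/")
                if arcname == "mimetype":
                    continue  # already written as the first entry
                outzip.write(full, arcname, zipfile.ZIP_DEFLATED)


def mimetype_is_valid(epub_path):
    """Return True if 'mimetype' is the first entry, stored, with the right content."""
    with zipfile.ZipFile(epub_path) as z:
        infos = z.infolist()
        if not infos:
            return False
        first = infos[0]
        return (first.filename == "mimetype"
                and first.compress_type == zipfile.ZIP_STORED
                and z.read(first) == MIMETYPE)
```

Running `mimetype_is_valid` on the result is a cheap sanity check before opening the file in a reader; it does not replace a full validation with a dedicated EPUB checker.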
#3
Wizard
@Doitsu
Thanks for sharing this code. I had assumed that I could import any .xhtml file directly from the Calibre editor...
#4
Grand Sorcerer
@roger64: Since your Python code seems to have something to do with footnotes, also check out my AddIDs plugin.
If you have the same number of footnote references and footnotes (and both are in the same order), you might be able to use it to assign the proper ids to footnote references and footnotes. (You'd run it twice: once for the footnote references and once for the footnote definitions.)
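To illustrate the "same number, same order" idea, here is a rough sketch that numbers matching elements in document order. It is not the AddIDs plugin's code: `add_sequential_ids` is a hypothetical helper, the class names `noteref` and `footnote` are assumptions about the markup, and the regular expression only handles a tag whose `class` attribute is exactly the given name.

```python
import re


def add_sequential_ids(xhtml, tag, class_name, prefix):
    """Give the n-th <tag class="class_name"> element the id prefix + n.

    Existing id attributes on matching tags are replaced.
    Returns (new_text, number_of_elements_numbered).
    """
    pattern = re.compile(
        r'<%s\b(?=[^>]*\bclass="%s")[^>]*>' % (re.escape(tag), re.escape(class_name))
    )
    counter = 0

    def repl(match):
        nonlocal counter
        counter += 1
        # Drop any existing id, then insert the new one right after the tag name.
        opening = re.sub(r'\s+id="[^"]*"', "", match.group(0))
        head, rest = opening[: len(tag) + 1], opening[len(tag) + 1:]
        return '%s id="%s%d"%s' % (head, prefix, counter, rest)

    return pattern.sub(repl, xhtml), counter
```

Following the two-pass approach described above, it would be run once on the body text, for example `add_sequential_ids(text, "a", "noteref", "ref")`, and once on the notes file, for example `add_sequential_ids(notes, "p", "footnote", "fn")`. If the two returned counts differ, the references and footnotes do not pair up and the ids should not be trusted.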
#5
Wizard
Quote:
If you are interested, I can PM you a test file that uses this script. It is quite efficient and quick, apart from this defect... Maybe it could be integrated into your plugin. I had first thought I could turn it into a Calibre function, but no support seems to be available and I don't know how to proceed...
https://www.mobileread.com/forums/sho...41&postcount=1
Last edited by roger64; 01-16-2016 at 08:08 AM. Reason: function
#6
Grand Sorcerer
Quote:
Quote:
BTW, also check out the Sigil footnote plugin.
#7
Wizard
I think we are not talking about the same thing. Your plugin checks the ids on both sides. This script checks the chapter numbers on the return side (on the body side, the links usually all point to the same chapter, the one containing the notes, so that side is easy to check).
Even if the ids are correct, a wrong chapter number is enough to break the return link. So a safety check of the chapter numbers can confirm that the links work in both directions. Ids and chapter numbers are the two variable elements of any link. I asked a friend to write this script because I had to deal with some books with broken links. To put back the missing (or wrong) chapter numbers, I had to do it manually, jumping from one to another or... I'll show you.
Last edited by roger64; 01-16-2016 at 09:19 AM.
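The check described above (does the file named in a link really contain the id the link points to?) can be sketched roughly as follows. This is not the script discussed in the thread: `find_broken_links` is a hypothetical helper, links are assumed to use plain file names in the same folder, and the regular expressions are a simplification of real XHTML parsing.

```python
import re

# Assumption: ids and hrefs are written with double quotes.
ID_RE = re.compile(r'\sid="([^"]+)"')
HREF_RE = re.compile(r'href="([^"#]+)#([^"]+)"')


def find_broken_links(files):
    """files maps file name -> XHTML text.

    Returns a list of (source_file, linked_file, fragment, owner), where owner
    is the file that actually contains the id, or None if no file does.
    """
    ids_by_file = {name: set(ID_RE.findall(text)) for name, text in files.items()}
    broken = []
    for source, text in files.items():
        for target, fragment in HREF_RE.findall(text):
            if "://" in target:
                continue  # external link, not checked here
            if fragment in ids_by_file.get(target, set()):
                continue  # file name and id both match
            # The id may live in another file: the "chapter number" is wrong.
            owner = next((n for n, ids in ids_by_file.items() if fragment in ids), None)
            broken.append((source, target, fragment, owner))
    return broken
```

When `owner` is not None, the id exists but the link names the wrong file, which is the case where the ids are correct and only the chapter number breaks the return link.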
#8
Wizard
Interested people can now follow this thread here:
https://www.mobileread.com/forums/sho...66&postcount=1
@Doitsu Thanks for your expert help in debugging the script.
Last edited by roger64; 01-17-2016 at 03:33 AM.
| Thread | Thread Starter | Forum | Replies | Last Post |
| --- | --- | --- | --- | --- |
| Working with unzipped EPUB folders | schrijver | Sigil | 5 | 10-14-2015 02:02 PM |
| XHTML file limit? | BobK99 | Sigil | 4 | 03-08-2013 05:38 AM |
| ncx file to html/xhtml file | javochase | Conversion | 1 | 06-23-2011 06:57 PM |
| xhtml file name change | bobcdy | Sigil | 11 | 10-23-2010 12:05 AM |
| Several xhtml/html to a single epub file help. | clowe1028 | ePub | 3 | 03-21-2010 03:47 AM |