Old 12-31-2009, 05:27 PM   #1
acts_as_david
Reader/Developer
Posts: 6
Karma: 10
Join Date: Dec 2009
Device: Sony PRS-600
Zip spec for ePub

According to the ePub OCF spec, an ePub file must be a ZIP archive in the PKZIP format, as described in the PKWARE APPNOTE:

http://www.pkware.com/documents/casestudies/APPNOTE.TXT

The short version of my question: I'm looking for an easy, cross-platform way to accomplish this in a conversion tool I'm writing.

More specifically: I'm designing an ePub library (and tool) for Ruby to handle, manipulate, and create ePub files. Creating the directories and files is easy enough, but zipping them up into an ePub file gets tricky.

Right now, I'm using the following (on OS X, should work in Linux, too):

Code:
`zip -X0 Sample.epub mimetype`
`zip -rX9 Sample.epub * -x mimetype`
This is fine and produces a Zip file that looks something like this in my Hex editor:

PK...........;oa
.,............mi
metypeapplicatio
n/epub+zipPK...
......l.;......
..........META-I
NF/PK........%..
;.f......4......
.META-INF/contai
ner.xml

That is, mimetype and 'application/epub+zip' occur at the right offsets, according to the spec.
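The offsets mentioned above can be checked mechanically. What follows is only a minimal Ruby sketch, assuming the standard 30-byte local file header layout from the APPNOTE; `epub_mimetype_ok?` is a hypothetical helper name, not part of rubyzip, zipruby, or epubcheck:

```ruby
# Hypothetical helper (not from any library): checks that the archive starts
# with a stored "mimetype" entry whose content is application/epub+zip.
def epub_mimetype_ok?(bytes)
  bytes = bytes.b                       # treat the input as raw bytes
  return false if bytes.bytesize < 58   # 30-byte header + 8 + 20

  # Local file header fields, in order: signature, version, flags, method,
  # time, date, crc32, compressed size, uncompressed size, name length,
  # extra field length.
  fields = bytes[0, 30].unpack('Vv5V3v2')
  signature   = fields[0]
  compression = fields[3]
  name_len    = fields[9]
  extra_len   = fields[10]

  signature == 0x04034b50 &&        # the bytes "PK\x03\x04"
    compression.zero? &&            # 0 = stored, i.e. not compressed
    name_len == 8 &&
    extra_len.zero? &&              # an extra field would shift the content
    bytes[30, 8] == 'mimetype' &&
    bytes[38, 20] == 'application/epub+zip'
end
```

It could be called as `epub_mimetype_ok?(File.binread('Sample.epub'))`. It only looks at the first entry, so it is no substitute for epubcheck.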

But no other Zip utility I've found will output this type of Zip file! Not in OS X ('compress') or Linux or Windows (Send to -> Archive). Trying rubyzip and zipruby didn't help. My files never turn out correct using these tools/libraries, and they don't pass epubcheck.

Any suggestions on what library I should use to Zip up my archives?

Old 12-31-2009, 05:59 PM   #2
pepak
Fanatic
Posts: 594
Karma: 4150
Join Date: Mar 2008
Device: Sony Reader PRS-505
The most common ZIP library is ZLib.
InfoZip is a command line utility that works on both Windows and Linux and produces the expected output.
Old 12-31-2009, 08:41 PM   #3
acts_as_david
Reader/Developer
Posts: 6
Karma: 10
Join Date: Dec 2009
Device: Sony PRS-600
Okay, thanks for the suggestions. I think I'll try Zlib first (it's part of the standard Ruby library) and maybe move onto InfoZip if that doesn't work.

One thing I'm not clear on is how many Zip specifications there are. I had thought that any Zip implementation would produce identical Zip files, but as I've said, rubyzip and zipruby didn't.

How can you tell whether an implementation will churn out the right type of file?
Old 12-31-2009, 09:00 PM   #4
Kolenka
<Insert Wit Here>
Posts: 973
Karma: 1254645
Join Date: Jan 2008
Location: Puget Sound
Device: Sony T2, Kindle Paperwhite
There is only one ZIP spec, but there are a lot of similar-sounding compression schemes that aren't ZIP. bzip and gzip, for example, don't have an internal file structure, so you tend to need the tar format underneath to archive and compress a folder.

One of the other problems you are looking at is that the mimetype needs to be stored rather than compressed in the ZIP archive as the first file. Not all zip libraries even support setting compression for individual files (versus the whole archive).

compress doesn't create ZIP archives, which is a key distinction as well.
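The two constraints described here (mimetype first, mimetype stored) can be written down as a small planning step. This is just a sketch: `epub_entry_plan` and the `:stored` / `:deflated` symbols are made-up names for illustration, and how they map onto a given ZIP library's per-entry options is left open.

```ruby
# Hypothetical helper: returns the entries in the order they should be
# written, paired with the compression method each one should use.
def epub_entry_plan(names)
  unless names.include?('mimetype')
    raise ArgumentError, 'an ePub needs a mimetype entry'
  end

  others = names.reject { |name| name == 'mimetype' }

  # mimetype goes first and is stored; the remaining entries keep their
  # original order and may be compressed.
  [['mimetype', :stored]] + others.map { |name| [name, :deflated] }
end
```

For example, `epub_entry_plan(['OEBPS/ch1.xhtml', 'mimetype'])` puts `mimetype` ahead of the chapter file regardless of where it appeared in the input.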
Old 01-01-2010, 05:32 AM   #5
Jellby
frumious Bandersnatch
Posts: 6,141
Karma: 4792399
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
Quote:
Originally Posted by acts_as_david
Right now, I'm using the following (on OS X, should work in Linux, too):

Code:
`zip -X0 Sample.epub mimetype`
`zip -rX9 Sample.epub * -x mimetype`
You may want to add -D to the second line's arguments.
Old 01-01-2010, 10:32 AM   #6
Valloric
Created Sigil, FlightCrew
Posts: 1,978
Karma: 350515
Join Date: Feb 2008
Device: Sony Reader PRS 505
Quote:
Originally Posted by acts_as_david
The short version of my question: I'm looking for an easy, cross-platform way to accomplish this in a conversion tool I'm writing.
Take a look at ZipArchive. It has both a commercial license and an open source one (GPL).

It's used in Sigil, and I recommend it highly. Works perfectly across Windows, Linux, Mac OS X and the various BSDs.

If your app will be GPL-compatible, feel free to take a look at the patched ZipArchive used in Sigil: it fixes a few bugs related to Python-generated ZIP files and improves BSD support.

Quote:
Originally Posted by pepak
The most common ZIP library is ZLib.
InfoZip is a command line utility that works on both Windows and Linux and produces the expected output.
Zlib is a compression library. It only offers the DEFLATE algorithm used in ZIP; it does *not* understand or provide facilities for the PKZIP format.

You cannot use it (alone) to create a ZIP archive.
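To make that division of labour concrete, here is a rough Ruby sketch in which Zlib supplies only the raw DEFLATE stream and the CRC-32, while the PKZIP container (local headers, central directory, end record) is packed by hand following the APPNOTE layout. `build_epub_zip` is a hypothetical name, the fixed 1980-01-01 timestamp is an arbitrary choice, and features such as ZIP64, encryption and data descriptors are ignored; a real tool would more sensibly rely on a ZIP library.

```ruby
require 'zlib'

# Hypothetical helper: builds a ZIP archive in memory from [name, data]
# pairs. "mimetype" is written first and stored; the rest is deflated.
def build_epub_zip(entries)
  first, rest = entries.partition { |name, _| name == 'mimetype' }
  ordered = first + rest
  body    = ''.b   # local file headers followed by file data
  central = ''.b   # central directory records

  ordered.each do |name, data|
    stored = (name == 'mimetype')
    name = name.b
    data = data.b
    if stored
      compression = 0
      payload = data
    else
      # Negative window bits ask zlib for a raw DEFLATE stream,
      # without the zlib header and trailer.
      deflater = Zlib::Deflate.new(Zlib::BEST_COMPRESSION, -Zlib::MAX_WBITS)
      payload = deflater.deflate(data, Zlib::FINISH)
      deflater.close
      compression = 8
    end
    offset = body.bytesize

    # Fields common to both headers: version needed, flags, method, time,
    # date (1980-01-01), crc32, compressed size, uncompressed size,
    # name length, extra field length (0).
    common = [20, 0, compression, 0, 0x21, Zlib.crc32(data),
              payload.bytesize, data.bytesize, name.bytesize, 0].pack('v5V3v2')

    body << [0x04034b50].pack('V') << common << name << payload
    # Central record: signature, version made by, common fields, comment
    # length, disk number, internal attrs, external attrs, header offset.
    central << [0x02014b50, 20].pack('Vv') << common <<
      [0, 0, 0, 0, offset].pack('v3VV') << name
  end

  count = ordered.size
  end_record = [0x06054b50, 0, 0, count, count,
                central.bytesize, body.bytesize, 0].pack('Vv4VVv')
  body + central + end_record
end
```

The result could be written out with `File.binwrite('Sample.epub', build_epub_zip(pairs))`. Everything outside the two `Zlib` calls is container bookkeeping, which is the part zlib does not provide.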

Old 01-01-2010, 04:16 PM   #7
rogue_ronin
Banned
Posts: 475
Karma: 796
Join Date: Sep 2008
Location: Honolulu
Device: Nokia 770 (fbreader)
Quote:
Originally Posted by Jellby
You may want to add -D to the second line's arguments.
Doesn't that eliminate directory structure?

m a r
Old 01-02-2010, 12:29 AM   #8
acts_as_david
Reader/Developer
Posts: 6
Karma: 10
Join Date: Dec 2009
Device: Sony PRS-600
Quote:
Originally Posted by Kolenka
There is only one ZIP spec, but there are a lot of similar-sounding compression schemes that aren't ZIP. bzip and gzip, for example, don't have an internal file structure, so you tend to need the tar format underneath to archive and compress a folder.

One of the other problems you are looking at is that the mimetype needs to be stored rather than compressed in the ZIP archive as the first file. Not all zip libraries even support setting compression for individual files (versus the whole archive).

compress doesn't create ZIP archives, which is a key distinction as well.
Okay, thanks for the details, Kolenka. You're right that the mimetype file was being compressed when it shouldn't have been, giving me a bad epub file. Thomas Sondergaard, the developer of rubyzip, was kind enough to add compression options to his library, which now does what I want.

Quote:
Take a look at ZipArchive. It has both a commercial license and an open source one (GPL).

It's used in Sigil, and I recommend it highly. Works perfectly across Windows, Linux, Mac OS X and the various BSDs.

If your app will be GPL-compatible, feel free to take a look at the patched ZipArchive used in Sigil: it fixes a few bugs related to Python-generated ZIP files and improves BSD support.
Thanks for the suggestion, Valloric, I'll look into ZipArchive and how it's used in Sigil. I'm sure I can get some good ideas from looking at the code.
Old 01-02-2010, 04:50 AM   #9
Jellby
frumious Bandersnatch
Posts: 6,141
Karma: 4792399
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
Quote:
Originally Posted by rogue_ronin
Doesn't that eliminate directory structure?
No, it just eliminates the entries for the directories themselves. The files are still stored with their paths. As the man page says, the directory entries are useful when you want to store their permissions, but you don't need the permissions in an ePUB, so you don't need the directory entries.
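A small Ruby sketch of why the directory entries are not needed: each file's stored path is enough to recreate its parent directories when unpacking. `write_entries` is a hypothetical helper working on plain [name, data] pairs, not any ZIP library's API, and it does no validation of the names it is given.

```ruby
require 'fileutils'

# Hypothetical helper: writes [name, data] pairs below dest, recreating
# directories purely from the paths stored with the files.
def write_entries(entries, dest)
  entries.each do |name, data|
    next if name.end_with?('/')            # a directory entry holds no data

    path = File.join(dest, name)
    FileUtils.mkdir_p(File.dirname(path))  # parent dirs come from the path
    File.binwrite(path, data)
  end
end
```

Whether or not an entry such as `META-INF/` is present in the list, `META-INF/container.xml` ends up in the same place.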
Old 01-02-2010, 06:28 AM   #10
rogue_ronin
Banned
Posts: 475
Karma: 796
Join Date: Sep 2008
Location: Honolulu
Device: Nokia 770 (fbreader)
I spent an hour or two yesterday trying to figure out how to store a set of files using zip so that the relative path would be correct, and so that it wouldn't store the full path.

-j definitely killed all the directory info; never could find a way to tell zip to store the path info starting from a given directory. Is there such a trick?

I finally had to write a macro that would generate a batch file that would cd to the proper directory and run the zip commands from there.

I'm doing this on a virtual machine of Win2k, although I generally use Linux; the world's greatest text editor requires Windows, sadly...

m a r
Old 01-02-2010, 07:33 AM   #11
Jellby
frumious Bandersnatch
Posts: 6,141
Karma: 4792399
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
Quote:
Originally Posted by rogue_ronin
I spent an hour or two yesterday trying to figure out how to store a set of files using zip so that the relative path would be correct, and so that it wouldn't store the full path.
With the zip command, from what I gather from the man page, you can only store relative paths with respect to the current directory. You can also drop any path info (-j) or store absolute paths (-jj).
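When the archive is built from a script rather than from the zip command, the "paths relative to a given directory" part can be computed without changing directory at all. A minimal Ruby sketch, assuming Ruby 2.5 or later for the `base:` option of `Dir.glob`; `relative_entry_names` is a hypothetical helper name:

```ruby
# Hypothetical helper: lists every regular file below root as a path
# relative to root, which is the form an ePub entry name should take.
def relative_entry_names(root)
  Dir.glob('**/*', base: root)                         # paths relative to root
     .select { |rel| File.file?(File.join(root, rel)) } # skip directories
     .sort
end
```

Note that this glob pattern skips dotfiles, much like `*` does in a shell.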


Quote:
I'm doing this on a virtual machine of Win2k, although I generally use Linux; the world's greatest text editor requires Windows, sadly...
Hmm... no, vim works fine in linux
Old 01-02-2010, 09:45 AM   #12
Slash5
Member
Posts: 13
Karma: 64
Join Date: Nov 2009
Location: S. Ontario, Canada
Device: Jetbook, Sony PRS-505
I was playing around in VB trying to create epub files.
The only way I could get it to work was by having a base zip file with mimetype at the right offset and the uncompressed data. I then copied that file to the filename I wanted. I then had to change directory to the files I wanted to add so that pathnames were not included. Then add my files to the canned zipfile.

This was the only reliable way I could get properly formatted epubs.

Since I was modifying already created epubs, I eventually decided it was easier to extract the html, modify it and then zip back to the original epub. Solves all the problems.
Old 01-02-2010, 02:51 PM   #13
rogue_ronin
Banned
Posts: 475
Karma: 796
Join Date: Sep 2008
Location: Honolulu
Device: Nokia 770 (fbreader)
@Slash5: me, too, basically!

Quote:
Originally Posted by Jellby
With the zip command, from what I gather from the man page, you can only store relative paths with respect to the current directory. You can also drop any path info (-j) or store absolute paths (-jj).
That's what I concluded, too. All or nothing. Still, I worked around it. But auto-generating, self-executing batch files seem like overkill, even when they're exactly what's necessary.

Quote:
Hmm... no, vim works fine in linux
I said editor, not shreditor.

m a r

ps: Brain shreditor.
Old 01-02-2010, 06:16 PM   #14
acts_as_david
Reader/Developer
Posts: 6
Karma: 10
Join Date: Dec 2009
Device: Sony PRS-600
Quote:
Hmm... no, vim works fine in linux
Er... TextMate works just fine in OS X ;-)

Also, running the zip command is working fine for me, re: relative file paths. In other words, running zip (etc., etc.) from within the directory I manually created gives me a proper ePub file. How are you guys running the zip command?
Old 01-02-2010, 08:20 PM   #15
rogue_ronin
Banned
Posts: 475
Karma: 796
Join Date: Sep 2008
Location: Honolulu
Device: Nokia 770 (fbreader)
I was running it as a macro, so it was being called from the current directory, or the root of the drive, which gave me the full path stored in the zip. Not good.

So I had to rewrite the macro as described above to start a new process in the appropriate directory.
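For comparison, the same "start the process in the right directory" idea might look like this in Ruby. It is only a sketch: `build_epub` is a hypothetical wrapper, it assumes an Info-ZIP `zip` on the PATH and an output path outside the source directory, and it combines the -X0 / -rX9 flags from the first post with the -D suggested earlier in the thread. The `runner:` parameter exists only so the command execution can be swapped out.

```ruby
# Hypothetical wrapper around the two zip invocations from the first post.
def build_epub(source_dir, epub_path, runner: method(:system))
  # Resolve the output path first; it would otherwise be interpreted
  # relative to source_dir once the working directory changes.
  target = File.expand_path(epub_path)

  Dir.chdir(source_dir) do
    # Expand "*" here, since no shell is involved when system receives
    # separate arguments; mimetype is handled by the first command.
    others = Dir.glob('*').sort - ['mimetype']
    commands = [
      ['zip', '-X0', target, 'mimetype'],   # stored, no extra fields
      ['zip', '-rDX9', target, *others]     # everything else, compressed
    ]
    commands.each do |cmd|
      runner.call(*cmd) or raise "command failed: #{cmd.join(' ')}"
    end
  end

  target   # the previous working directory is restored by the block
end
```

Because `Dir.chdir` is given a block, the caller's working directory is unchanged afterwards, so no generated batch file is needed.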

m a r