Old 05-08-2014, 10:09 PM   #16
theducks
Well trained by Cats
Posts: 29,754
Karma: 54401244
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by rpspringuel View Post
So, it seems my understanding of copyright law was flawed and my test book was still in copyright. Sorry about that. I'll go looking for something that's out of copyright and create a new test book for those interested in testing.
I don't know if your understanding of the law was flawed or not.
MR does not even allow posting of 'Free' books that are under copyright (e.g., the Baen Free Library).
We just try to abide by the owners' rules.
Old 05-13-2014, 01:33 PM   #17
rpspringuel
Enthusiast
Posts: 40
Karma: 10
Join Date: Feb 2014
Device: Kindle 4
If anyone is interested in testing, I've now uploaded a new test book which I believe to be out of copyright. I've attached it to the same post where I attached the code so that you can get both in one place.
Old 06-23-2014, 11:14 AM   #18
KevinH
Sigil Developer
Posts: 7,602
Karma: 5433388
Join Date: Nov 2009
Device: many
some things you might be interested in

Hi,

A recent version of KindleUnpack has added support for decoding the "PAGE" section that kindlegen writes into each mobi when the input epub has either a pagelist in the ncx or a page-map.xml.

This PAGE section is nearly identical to the APNX format. In fact, if you add 8 bytes of padding to the front of an APNX file's contents, you can use the same code to decode both.
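
A minimal sketch of that padding trick, assuming the decoder simply skips the first 8 bytes and never inspects them. The function name and the zero fill are illustrative and are not taken from the attached script:

```python
PAGE_PREFIX_LEN = 8  # per the post: 8 extra bytes sit in front of the APNX-style data


def apnx_as_page_section(apnx_bytes: bytes) -> bytes:
    """Prepend 8 filler bytes so APNX contents line up like a PAGE section."""
    # Assumption: the filler is never read by the decoder, so zeros suffice.
    return b"\x00" * PAGE_PREFIX_LEN + apnx_bytes


sample = b"example-apnx-bytes"  # placeholder, not real APNX data
padded = apnx_as_page_section(sample)
print(len(sample), len(padded))
```

With this in place, a single decoder can be pointed at either a real PAGE section or a padded APNX file.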

This code has support for decoding multiple page naming schemes (arabic, roman, characters, etc.).
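
As a rough illustration of what multiple page naming schemes mean in practice, here is a minimal sketch that renders page labels for the arabic and roman schemes. The function names and scheme strings are hypothetical, not taken from the attached script, and the "characters" scheme is left out:

```python
ROMAN_PAIRS = [
    (1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
    (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
    (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i"),
]


def to_roman(n: int) -> str:
    """Render a positive integer as a lowercase roman numeral."""
    if n <= 0:
        raise ValueError("roman numerals need a positive integer")
    parts = []
    for value, glyph in ROMAN_PAIRS:
        count, n = divmod(n, value)
        parts.append(glyph * count)
    return "".join(parts)


def page_label(scheme: str, number: int) -> str:
    """Return the display label for one page under the given scheme."""
    if scheme == "arabic":
        return str(number)
    if scheme == "roman":
        return to_roman(number)
    raise ValueError(f"unsupported scheme: {scheme}")


# Front matter numbered i..iv, followed by body pages 1..3.
labels = [page_label("roman", n) for n in range(1, 5)]
labels += [page_label("arabic", n) for n in range(1, 4)]
print(labels)
```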

KindleUnpack uses this code to recreate a page-map.xml file from kindlegen generated mobis (for just the mobi 8 part). The latest version of KindleUnpack will also allow the user to pass in an apnx file associated with a mobi 8 (azw3) file and have it decoded.

FWIW, after removing all of the cruft just used to properly parse the command line into working utf-8, and removing the addition of 8 bytes of padding, this code could be easily modified and added to calibre to process the PAGE sections from kindlegen generated mobis.

I have attached it just in case it is of interest to you or Kovid.

Take care,

KevinH
Attached Files
File Type: zip decode_apnx.py.zip (2.7 KB, 257 views)

Last edited by KevinH; 06-23-2014 at 02:18 PM.
Old 06-23-2014, 12:25 PM   #19
kovidgoyal
creator of calibre
Posts: 43,826
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
@KevinH: Do any actual Kindles make use of the PAGE section in AZW3 files?
Old 06-23-2014, 01:41 PM   #20
KevinH
Sigil Developer
Posts: 7,602
Karma: 5433388
Join Date: Nov 2009
Device: many
Hi Kovid,

Only indirectly. Since real Kindles get only one version (Mobi 8 or Old Mobi) and not both upon download, the PAGE section is stripped out to become the actual APNX file after very slight modifications to add the palm database name and the asin.

To prove this theory, I used the actual APNX file from a recent purchase of the book Blood Rites by Jim Butcher, which I had downloaded to my Kindle for Mac (i.e., an azw3 file and its associated apnx). When I unpacked it, the apnx offsets pointed EXACTLY to the <a id="page-X"></a> tags in the assembled text (the raw text after inserting the fragments into their skeletons in the right place, but before changing any links or anything else like that).
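
A check like this can be sketched in a few lines. The sketch assumes the offsets are byte positions into the assembled raw text and that page anchors use the `<a id="page-X"></a>` form described above; the function name and the sample data are hypothetical:

```python
import re

# Matches an empty page anchor such as <a id="page-12"></a>.
PAGE_ANCHOR = re.compile(rb'<a id="page-[^"]+"></a>')


def offsets_hit_anchors(raw_text: bytes, offsets: list[int]) -> list[bool]:
    """For each offset, report whether a page anchor starts exactly there."""
    return [PAGE_ANCHOR.match(raw_text, off) is not None for off in offsets]


raw = (
    b'<p>Intro</p><a id="page-1"></a><p>One</p>'
    b'<a id="page-2"></a><p>Two</p>'
)
good = [raw.index(b'<a id="page-1"'), raw.index(b'<a id="page-2"')]
arbitrary = [3]  # lands in the middle of the first paragraph

print(offsets_hit_anchors(raw, good + arbitrary))
```

If every offset lands on an anchor, the page data most likely came from real page markers in the source epub; offsets that fall in the middle of running text suggest page-sized guesses instead.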

I then used kindlegen on the unpacked mobi (epub) and it recreated the exact same offsets in its PAGE sections in the mobi 8 part of the file.

So the original epub had either a pagelist in the ncx or a page-map.xml in it that referenced those id tags to mark the start of that page.

I also compared it to my hard copy of that exact same page and again they matched exactly.

So the PAGE sections created by kindlegen are used to make the associated apnx file (with only the addition of the proper metadata), and they will point exactly to the id tags used in the epub in either the pagelist of the ncx or the page-map.xml, if either of those is used.

Obviously, if they are just made-up offsets, none of this would have worked.

So you can think of the Kindlegen generated PAGE sections as keeping all of the relevant page information if and only if the input epub had either a page-map.xml (the Adobe standard) or pagelist info in the ncx. Otherwise kindlegen won't bother to create the PAGE section.

An apnx file, on the other hand, could be either just a set of page-sized offsets or exact page information taken from a PAGE section. There is no way to tell unless you look for the id= tags that marked the original pages in the epub it was all created from.


KevinH

Last edited by KevinH; 06-23-2014 at 02:16 PM.
Old 06-23-2014, 02:12 PM   #21
KevinH
Sigil Developer
Posts: 7,602
Karma: 5433388
Join Date: Nov 2009
Device: many
Hi Kovid,

One other thing. KindleUnpack can split Kindlegen generated dual/joint mobis into old mobi and azw3 pieces. Right now, it will strip out the PAGE sections (they are one of the sections pointed to by the SRCS offset) when it strips out the SRCS.

If you do decide to add support for PAGE sections in calibre, I can modify that code to leave the PAGE section in place for calibre to see and use. So calibre would have true page information from both kindlegen generated joint mobis, and azw3s split by KindleUnpack.

If you decide not to incorporate it, I can instead have KindleUnpack write out an apnx file directly (after adding in the required palm database name and asin), so that the true page information is not lost on the mobi side.

Just let me know which you prefer.

Take care,

KevinH

Last edited by KevinH; 06-23-2014 at 02:15 PM.
Old 06-23-2014, 10:40 PM   #22
JSWolf
Resident Curmudgeon
Posts: 73,840
Karma: 128597114
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
Originally Posted by KevinH View Post
Hi Kovid,

One other thing. KindleUnpack can split Kindlegen generated dual/joint mobis into old mobi and azw3 pieces. Right now, it will strip out the PAGE sections (they are one of the sections pointed to by the SRCS offset) when it strips out the SRCS.

If you do decide to add support for PAGE sections in calibre, I can modify that code to leave the PAGE section in place for calibre to see and use. So calibre would have true page information from both kindlegen generated joint mobis, and azw3s split by KindleUnpack.

If you decide not to incorporate it, I can instead have KindleUnpack write out an apnx file directly (after adding in the required palm database name and asin), so that the true page information is not lost on the mobi side.

Just let me know which you prefer.

Take care,

KevinH
Please leave the PAGE section so it can be converted to ePub and work with ADE. I don't know if the ePub code will need modifying or not, but it would be good to keep it in so it can be fixed to work with ADE.
Old 06-24-2014, 12:18 AM   #23
kovidgoyal
creator of calibre
Posts: 43,826
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
I'm somewhat unconvinced of the need for PAGE sections so far. It seems to me:

1) The only thing that uses it is Amazon's servers. And since you cannot send AZW3 by email anyway, generating AZW3s with PAGE sections appears to be useless, unless you intend to send your book to KDP, and KDP does not accept calibre generated AZW3 files anyway.

2) The other possible use case is converting kindlegen produced AZW3s to EPUB. However, since the only way to (typically) get hold of kindlegen produced AZW3s is generating them yourself, you should already have an epub with a page map, so I don't see why you would not want to use the source epub directly.

That said, the code is pretty trivial, so I might just add it anyway. But it is not a priority.
Old 06-24-2014, 02:19 AM   #24
KevinH
Sigil Developer
Posts: 7,602
Karma: 5433388
Join Date: Nov 2009
Device: many
Hi,
The other use case is distributing free of charge kindlegen generated mobis without the source zip archive. Kindlegen can in fact generate these now with a simple command line option and their terms of use allow you to give them away for free to others without going through Amazon.

Someone who had such a kindlegen generated mobi could decide they would like to convert it to epub for reading on another device. If they use calibre to do this now, they would lose the real page information.

Similarly, with KindleUnpack you can split out the source and create azw3s with page info in PAGE sections, and post them for others to convert.

The nice thing about the PAGE section is that it keeps the real page info right inside the mobi where it belongs, and not in some external APNX file.

Either way, I plan to make KindleUnpack keep the PAGE section when splitting while adding the capability to create an APNX file from it too. Perhaps a standalone APNX file generator if that turns out to be useful.

Take care,

KevinH

Old 06-24-2014, 02:23 AM   #25
kovidgoyal
creator of calibre
Posts: 43,826
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
But anyone that's distributing kindlegen produced files would most likely also be distributing the source epubs.

In any case, as I said, I have no objections to including PAGE support into calibre, it's just not much of a priority, since the use case for it is rather minimal.
Old 06-24-2014, 04:06 PM   #26
KevinH
Sigil Developer
Posts: 7,602
Karma: 5433388
Join Date: Nov 2009
Device: many
Hi Kovid,

Understood. I have now modified KindleUnpack to read the PAGE sections and create the proper APNX on the fly (for both old mobi and mobi8 parts) which should hold most people until you get around to adding that feature to calibre.

Thanks!

KevinH
Old 06-24-2014, 10:34 PM   #27
JSWolf
Resident Curmudgeon
Posts: 73,840
Karma: 128597114
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
Originally Posted by KevinH View Post
Hi Kovid,

Understood. I have now modified KindleUnpack to read the PAGE sections and create the proper APNX on the fly (for both old mobi and mobi8 parts) which should hold most people until you get around to adding that feature to calibre.

Thanks!

KevinH
But what about when the AZW3 is converted to ePub? Do the page numbers get kept?
Old 06-25-2014, 10:41 AM   #28
KevinH
Sigil Developer
Posts: 7,602
Karma: 5433388
Join Date: Nov 2009
Device: many
Quote:
Originally Posted by JSWolf View Post
But what about when the AZW3 is converted to ePub? Do the page numbers get kept?
That was the whole point of the change. If you have a joint mobi or azw3 with a PAGE section, then using experimental KindleUnpack_v072d will create an epub with a page-map.xml and the appropriate entries in the spine and manifest so that the epub will have the page numbers.

It will also allow you to pass in an existing APNX and **IF** it was generated from an epub with real page id= anchors to mark page start points, then you should also get an epub with a page-map.xml as well.
Old 06-27-2014, 07:52 PM   #29
Sabardeyn
Guru
Posts: 644
Karma: 1242364
Join Date: May 2009
Location: The Right Coast
Device: PC (Calibre), Nexus 7 2013 (Moon+ Pro), HTC HD2/Leo (Freda)
Kevin H.,
Not sure I understood the gist of this conversational thread.

Are you saying Amazon is now including the equivalent of a "page break" within the ebook text so that both digital and paper versions of a book have the exact same page content and numbering, thus simplifying syncing citations between the two different media forms?

If the answer is yes, I imagine it's a fairly important step. It immediately opens ebooks up to more technical usage, like magazine articles, technical journals and textbooks, where exactly locating and citing a source is mandatory.
Old 06-27-2014, 08:14 PM   #30
KevinH
Sigil Developer
Posts: 7,602
Karma: 5433388
Join Date: Nov 2009
Device: many
Hi,

Not quite. Page breaks are typically just used to force text to start on a new page, which is unreliable at best in a reflowable document.

Both epub and mobi/azw3 with APNX have the capability to encode information as to the start of each page in a specific hardcopy edition.

In epub, you use anchors/id= attributes to indicate the start of that page (not a pagebreak). Then in either a pagelist in the ncx or a separate page-map.xml file included in the epub, each of these anchor points is associated with its true page name (i.e., a number, a Roman numeral, or ...). These are real page numbers with multiple schemes just as in the hardcopy version.
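
To make the anchor-to-page-name association concrete, here is a minimal sketch that writes a page-map.xml from a list of (page name, anchor href) pairs. The element names, attribute names, and namespace follow the Adobe page-map layout as commonly documented and should be treated as assumptions; the file paths and the function name are made up for illustration:

```python
import xml.etree.ElementTree as ET

# Assumption: page-map.xml reuses the OPF namespace.
PAGE_MAP_NS = "http://www.idpf.org/2007/opf"


def build_page_map(pages):
    """Build page-map XML from (page_name, href) pairs.

    Each href points at the anchor marking where that printed page starts.
    """
    root = ET.Element("page-map", {"xmlns": PAGE_MAP_NS})
    for name, href in pages:
        ET.SubElement(root, "page", {"name": name, "href": href})
    return ET.tostring(root, encoding="unicode")


# Hypothetical book: one roman-numbered front-matter page, two body pages.
xml_text = build_page_map([
    ("i", "Text/front.xhtml#page-i"),
    ("1", "Text/ch1.xhtml#page-1"),
    ("2", "Text/ch1.xhtml#page-2"),
])
print(xml_text)
```

Note that the page names are plain strings, which is what lets roman and arabic labels coexist in one map.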

If an epub like that is passed into kindlegen, kindlegen will add a PAGE section to the mobi. This PAGE section becomes the APNX file when that book is delivered from Amazon. With the proper APNX file, a Mobi can show you on which page of the hardcopy book you are currently reading.

Calibre does not yet understand these PAGE sections, and the APNX file created by Calibre does not know or use multiple naming schemes such as Roman numerals. Many times, the page start points are just chosen arbitrarily by Calibre.

So the most recent version of KindleUnpack (v072f) now understands these PAGE sections and will create a real APNX file for that mobi, and upon unpacking, will generate the proper page-map.xml entries.

I hope this is clearer. If not, just ask.

KevinH


Last edited by KevinH; 06-27-2014 at 09:15 PM.