07-10-2014, 10:46 AM | #886 |
Grand Sorcerer
Posts: 27,551
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
When it comes to opf tag attributes, is there any logic built in to exclude the parts of a version 3 opf that aren't valid in a version 2 opf?
I know an un-influenced re-creation of the original source has always been (and should be) a higher priority than epub spec adherence, so perhaps it would make more sense if some of these epub3-only properties/attributes could be used to enhance the new auto-detection feature? Last time I checked, it seemed auto-detect only checked for the presence of a couple of fixed-layout properties to make its decision. I have no idea what it might entail, but I'd like to see a more robust (possibly even heuristic) approach to ensure that the same sort of source that went in is coming back out (if auto-detect is selected). |
07-10-2014, 11:11 AM | #887 | ||||
Sigil Developer
Posts: 7,645
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Hi tkeo,
Quote:
That is why I store ALL of the extra metadata from the RESC inside a comment. So no need to strip out the original coverpage info as well. Quote:
Quote:
Quote:
Thanks, KevinH Last edited by KevinH; 07-10-2014 at 02:26 PM. |
||||
07-10-2014, 11:33 AM | #888 | |
Sigil Developer
Posts: 7,645
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Hi DiapDealer,
Good point! I have not looked at the epub3 and auto-detection code yet, and it will need to be updated as well. The package tag and its version are included in some RESC sections, and the current k8resc can easily parse it if present and use its value to help auto-detection. I also think that the epub version (or A for auto) should be passed into k8resc as well, to help it clean up and remove any epub 3 pieces if the user requests epub2, since most of them come in via the RESC info.
One main problem is the damn refines in epub3 metadata. They reference the original "id=" attributes on the title, the creator, and other elements, but these are all stripped away when that info becomes the EXTH equivalent. The remnants do seem to make it into the RESC, but the ids being referenced by the refines are long gone, and we can only guess which creator or title or whatever they actually refer to. So that is something that can only be fixed by hand editing after the reconstruction.
I will try to take a look at passing epubver into mobi_k8resc.py and see if I can add some auto-detect code and things to clean up when down-versioning.
Thanks, KevinH
ps, the new mobi_k8resc parse code should look familiar, as it is a tweaked version of the old mobiml2html parser we used! Quote:
|
|
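The down-versioning clean-up described above could be sketched roughly as follows. This is only an illustration, not the code in mobi_k8resc.py: the function name `strip_epub3_meta`, the idea that the RESC metadata is available as a list of raw tag strings, and the `'2'` version string are all assumptions made for the example.

```python
import re

# Matches a <meta ...> start tag carrying an epub3-only "refines" or "property" attribute.
_EPUB3_META = re.compile(r'<meta\b[^>]*\s(?:refines|property)\s*=', re.IGNORECASE)

def strip_epub3_meta(tags, epubver):
    """Hypothetical helper: drop epub3-only meta tags when epub2 output is requested."""
    if epubver != '2':
        # epub3 or auto: keep everything as it came from the RESC.
        return list(tags)
    return [t for t in tags if not _EPUB3_META.search(t)]

# Made-up sample tags, for illustration only.
resc_tags = [
    '<meta name="cover" content="cover_img"/>',
    '<meta refines="#creator01" property="file-as">Doe, Jane</meta>',
    '<meta property="rendition:layout">pre-paginated</meta>',
]
cleaned = strip_epub3_meta(resc_tags, '2')
```

Note that this only removes the tags; it does nothing about the lost ids that the refines point at, which is the part that still needs hand editing.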
07-10-2014, 01:06 PM | #889 |
Grand Sorcerer
Posts: 27,551
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
|
07-10-2014, 01:55 PM | #890 |
Sigil Developer
Posts: 7,645
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Help Needed Detecting what is epub 3
Hi DiapDealer and tkeo, and all:
I need some help on what should be considered an epub 3 feature when auto-detect is used. Here is what I think so far, based on what tkeo had previously:
1. "fixed-layout" (EXTH item 122) in the metadata
2. "page-progression-direction" (EXTH item 527) in the metadata
3. "primary-writing-mode" (EXTH item 525) in the metadata and it ends with "rl"
4. RESC itemrefs have "properties"
I would like to add:
5. RESC package version exists and starts with "3"
6. RESC "spine" has "page-progression-direction" (think tkeo used that as well?)
7. RESC metadata uses "refines"
8. RESC metadata uses meta property= attributes
Are there any others we should add? Are there particular version tags or strings in the metadata that only exist/work for epub3 that we could look for when parsing the RESC?
Thanks, KevinH
Last edited by KevinH; 07-10-2014 at 03:31 PM. |
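The eight checks listed above could be combined along these lines. This is a minimal sketch, not the KindleUnpack implementation: the function name, the shape of `metadata` (a dict mapping names to lists of strings), and every key in the `resc` dict are assumptions made for the example.

```python
def looks_like_epub3(metadata, resc):
    """Return True if any of the listed epub3 indicators is present.

    metadata: dict of EXTH-derived names to non-empty lists of strings (assumed shape).
    resc: dict summarising the parsed RESC section (hypothetical keys).
    """
    # Items 1 and 2: presence of the EXTH-derived entries.
    if 'fixed-layout' in metadata:
        return True
    if 'page-progression-direction' in metadata:
        return True
    # Item 3: primary-writing-mode ending in "rl".
    pwm = metadata.get('primary-writing-mode', [''])[0]
    if pwm.endswith('rl'):
        return True
    # Item 5: package version starting with "3".
    if resc.get('package_version', '').startswith('3'):
        return True
    # Items 4 and 6: itemref properties, spine page-progression-direction.
    if resc.get('itemref_has_properties') or resc.get('spine_ppd'):
        return True
    # Items 7 and 8: refines or property attributes on meta tags.
    if resc.get('meta_has_refines') or resc.get('meta_has_property'):
        return True
    return False
```

Any single indicator is treated as sufficient here; whether some of them should be weighted or combined is exactly the open question in the post.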
07-10-2014, 03:38 PM | #891 |
Sigil Developer
Posts: 7,645
Karma: 5433388
Join Date: Nov 2009
Device: many
|
KindleUnpack_v072y_test
Hi tkeo,
I would like to remove your conversion of Amazon metadata to epub3, as it seems wasteful to re-parse the metadata string we just created (and removing it eliminates taglist). Instead, I would like to look in the metadata and k8resc first to determine the target epub version, and then build the correct metadata the very first time for that specific version. That should make everything easier, I think.
I have taken a shot at pre-determining the epub version when the opf object is created (if not already specified), building the metadata only once to meet the target version, removing the taglist and mobi_taglist.py completely, and polishing up a few things so that epub3 should now at least work.
I think we are getting close to having a finished product once we remove the remaining redundancy in mobi_opf.py. Hopefully within another day or two we will have something to release.
Please see the attached KindleUnpack_v072y_test.zip and let me know what you think.
Take care, KevinH
Last edited by KevinH; 07-10-2014 at 06:08 PM. |
07-11-2014, 09:11 AM | #892 | |
Connoisseur
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
|
Hi,
Quote:
Additionally, we can probably use:
9. "orientation-lock" (EXTH item 124) in the metadata has "portrait" or "landscape"
Note: "original-resolution" (EXTH item 126) requires that "fixed-layout" is "true"
10. "Title file-as" (EXTH item 508) in the metadata
11. "Creator file-as" (EXTH item 517) in the metadata
12. "Publisher file-as" (EXTH item 522) in the metadata
13. RESC metadata uses "rendition:" prefix
14. RESC metadata tag is <metadata xmlns:dc="http://purl.org/dc/elements/1.1/"> instead of <metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf" xmlns="http://www.idpf.org/2007/opf">
Thanks, |
|
07-11-2014, 10:47 AM | #893 | |
Sigil Developer
Posts: 7,645
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Hi tkeo,
Since the only possible values of EXTH orientation-lock are portrait or landscape, I will simply look to see if it is in metadata.keys(). Items 10, 11 and 12 are really just extensions of epub2 metadata, so I will not force things to epub3 for using them. As for 14, I already check whether any of the new meta tags with "property" are present, so I guess this would catch all of these as well. I do look for the rendition namespace in the package attributes though.
Thanks, Kevin
ps, I will be working on removing the redundancy from mobi_opf.py and then focusing on metadata more fully.
Take care, KevinH Quote:
|
|
07-11-2014, 11:10 AM | #894 |
Connoisseur
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
|
Hi Kevin,
I have also been working to reduce redundancy. I have not finished yet and there are bugs. I have attached my version just for reference. Take care, tkeo |
07-11-2014, 02:28 PM | #895 |
Sigil Developer
Posts: 7,645
Karma: 5433388
Join Date: Nov 2009
Device: many
|
preview only testing version v072z
Hi tkeo, (FYI: other co-developers and testers)
Thanks for that bug fix in decoding package tags in mobi_k8resc.py. I have tried to incorporate your mobi_opf.py changes into mine. We were pretty close on many things. I see you want to break out epub3 metadata from the general case, and we can do that later, but for now I went with the integrated one we already had.
So attached is KindleUnpack_v072z_test.zip, which we can start heavy testing on to make sure nothing is broken for PrintReplica and older Mobis as well as epub2 and epub3. For epub3 I have added automatic generation of dcterms:modified to meet the minimum epub3 metadata spec.
So hopefully all that remains is some bug hunting and corner cases to resolve, and we can make this a public release! After that, if you want, you can start on your epub3-specific metadata changes and hopefully try to figure out a way to fix the refines info and integrate the RESC extra metadata into the final product without needing to comment it out. I have run out of free time recently, so hopefully you can take the lead on all of that after we get any bugs ironed out and a stable v073 release made and available to all.
Thanks for all of your hard work on this!
Edit: I just finished studying the epub 3 metadata, and it disallows all opf: prefixes like file-as, role and schemes. Therefore the dc:identifier is different under epub3 for urn:uuid, isbn, etc. So you were right: we do need an epub 3 specific metadata routine just to handle the refines of file-as and role, and the identifiers, properly, even for the most basic EXTH values and not just for fixed-layout and related things. I will play around with this a bit too.
Take care, KevinH
Last edited by KevinH; 07-11-2014 at 06:05 PM. |
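For reference, the dcterms:modified value that epub3 requires is a UTC timestamp of the form CCYY-MM-DDThh:mm:ssZ. A minimal sketch of generating such a tag is below; the function name is hypothetical, it is written for Python 3 (it uses `datetime.timezone`), and it is not the code that ships in v072z.

```python
from datetime import datetime, timezone

def dcterms_modified_tag(when=None):
    """Hypothetical helper: build the dcterms:modified meta tag for an epub3 opf."""
    if when is None:
        when = datetime.now(timezone.utc)
    # Normalise to UTC and format as CCYY-MM-DDThh:mm:ssZ (no fractional seconds).
    # A naive datetime would be interpreted as local time by astimezone().
    stamp = when.astimezone(timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ')
    return '<meta property="dcterms:modified">%s</meta>' % stamp

# Example with a fixed, timezone-aware timestamp.
tag = dcterms_modified_tag(datetime(2014, 7, 11, 18, 5, 0, tzinfo=timezone.utc))
```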
07-12-2014, 01:24 AM | #896 | ||
Connoisseur
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
|
Hi,
In the Calibre KindleUnpack Plugin thread, an error is reported. https://www.mobileread.com/forums/sho...&postcount=215 https://www.mobileread.com/forums/sho...&postcount=225 Quote:
Quote:
Code:
i = int(metadata['CoverOffset'][0])
if imgnames[i] is not None:
Code:
i = int(metadata['CoverOffset'][0])
if i >= 0 and i < len(imgnames) and imgnames[i] is not None:
Code:
imageNumber = int(metadata['CoverOffset'][0])
cover_image = self.imgnames[imageNumber]
Code:
imageNumber = int(metadata['CoverOffset'][0])
if imageNumber >= 0 and imageNumber < len(self.imgnames):
    cover_image = self.imgnames[imageNumber]
I will post a fixed test version to the Calibre Plugin thread to see whether it works. I have no idea why the CoverOffset EXTH has such a value. Does it need to be fixed in the latest version? Thanks, |
||
07-12-2014, 05:36 AM | #897 | |
Connoisseur
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
|
Hi,
Quote:
EDIT: We can set any value for EXTH orientation-lock through
Code:
<meta name="orientation-lock" content="XXXX"/>
And do you know how to set values for EXTH item 508 (Title file-as), EXTH item 517 (Creator file-as) and EXTH item 522 (Publisher file-as)? They do not seem to be converted from refines meta tags.
Thanks,
Last edited by tkeo; 07-12-2014 at 06:01 AM. |
|
07-12-2014, 08:41 AM | #898 | ||
Sigil Developer
Posts: 7,645
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Hi tkeo,
Quote:
Quote:
Take care, Kevin Last edited by KevinH; 07-12-2014 at 08:48 AM. |
||
07-12-2014, 08:45 AM | #899 |
KCC Co-Author
Posts: 845
Karma: 765434
Join Date: Mar 2013
Location: Poland
Device: Kindle Oasis 2
|
orientation-lock has three valid values: portrait, landscape and none.
primary-writing-mode has four: horizontal-lr, horizontal-rl, vertical-lr, vertical-rl.
There is also a boolean RegionMagnification that tells the reader whether pages in the book have Panel View code embedded. |
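Since any string can apparently end up in these fields, a check against the values listed above might look like the sketch below. The table and the helper name are hypothetical and only encode the values named in this post.

```python
# Valid values as listed in the post above; the helper itself is hypothetical.
KNOWN_VALUES = {
    'orientation-lock': {'portrait', 'landscape', 'none'},
    'primary-writing-mode': {'horizontal-lr', 'horizontal-rl',
                             'vertical-lr', 'vertical-rl'},
}

def is_known_value(name, value):
    """Return True if value is one of the listed values for name.

    Names without a list of known values are accepted unchanged.
    """
    allowed = KNOWN_VALUES.get(name)
    return allowed is None or value in allowed
```

For example, `is_known_value('orientation-lock', 'XXXX')` is false, while a name with no entry in the table is always accepted.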
07-12-2014, 08:46 AM | #900 | |
Sigil Developer
Posts: 7,645
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Hi,
My bet is that that EXTH was set to 0xffffffff, which is often used as a placeholder for missing values in MobiHeaders. Otherwise the size field of the EXTH value must be corrupt or broken. I would rather detect that during EXTH parsing and leave the code as is. Please try running a recent version of DumpMobiHeader (v016 or later) on the problem ebook to check the field size and the unsigned hex value. Thanks, KevinH Quote:
|
|
|
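Detecting the placeholder at EXTH parsing time, as suggested above, could be sketched like this. The helper name and the choice to return None are assumptions for the example; the only facts relied on are that the value is a 4-byte big-endian unsigned integer and that 0xffffffff is the usual "missing" marker.

```python
import struct

MISSING = 0xffffffff  # placeholder often used for absent values in MOBI headers

def exth_uint(content):
    """Hypothetical helper: decode a 4-byte big-endian EXTH value.

    Returns None when the field has the wrong size or holds the
    missing-value placeholder, so callers never index with it.
    """
    if len(content) != 4:
        return None
    (value,) = struct.unpack('>L', content)
    return None if value == MISSING else value
```

A caller would then skip cover handling whenever the CoverOffset decodes to None, instead of bounds-checking the index at every use.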