12-19-2013, 04:42 PM | #646 | |
The Grand Mouse 高貴的老鼠
Posts: 71,511
Karma: 306214458
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
|
Quote:
|
|
12-19-2013, 07:59 PM | #647 | |
Sigil Developer
Posts: 7,651
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Quote:
Yes but we could unzip and rebuild it similar to how we handle KF8 and make an epub-like file from it. If it really is similar to what the KF8 mobi pieces are (ie css, svg, skeletons and fragments) we should be able to do that with hopefully slight modifications to how we build an epub from an azw3 file. "He said keeping his fingers crossed!" Either way, definitely a project for the new year. BTW .... Happy Holidays / Merry Christmas! |
|
01-10-2014, 06:00 AM | #648 |
KCC Co-Author
Posts: 845
Karma: 765434
Join Date: Mar 2013
Location: Poland
Device: Kindle Oasis 2
|
Performance issue with creating KF8 from hybrid MOBI filled with images is still on the table.
Any help will be appreciated. |
01-10-2014, 05:57 PM | #649 |
Bookmaker & Cat Slave
Posts: 11,462
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
|
|
01-11-2014, 02:50 AM | #650 |
KCC Co-Author
Posts: 845
Karma: 765434
Join Date: Mar 2013
Location: Poland
Device: Kindle Oasis 2
|
No. A completely different issue. The metadata tweaker was a dead end. KevinH made working code, but after additional research we found that this approach is totally impractical. Using hybrid MOBI is a no-go in this case.
The current KindleUnpack method of stitching KF8 image records tremendously increases processing time when a book contains only images. Last edited by AcidWeb; 01-11-2014 at 02:52 AM. |
01-11-2014, 05:01 AM | #651 | |
Bookmaker & Cat Slave
Posts: 11,462
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
|
Quote:
Hitch |
|
01-11-2014, 05:03 PM | #653 | |
Bookmaker & Cat Slave
Posts: 11,462
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
|
Quote:
h |
|
02-07-2014, 08:33 AM | #654 |
Connoisseur
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
|
KindleUnpack v63
Hi,
I have modified the KindleUnpack package. The main aim of this modification is to be able to process right-to-left page progression books properly. I have added: the page progression direction attribute in the spine tag, some id_map_strings, and K8 RESC section processing. I attach the modified version to this post. I hope it works correctly in any environment. I removed the attached file due to a bug and posted a fixed one. Thanks, Last edited by tkeo; 02-09-2014 at 01:21 AM. |
02-07-2014, 10:08 PM | #655 | |
Sigil Developer
Posts: 7,651
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Hi,
Thanks for your modifications. It seems a significant percent of your changes have to do with parsing the RESC section. The text direction itself is stored in the exth metadata. The cover image info is also available from exth values. So what types of useful information are you capturing from the RESC section? For most of the examples I have seen, it is basically a small shell and not that useful. Will you please post a small mobi azw3 style test case for rtl ebooks and a second test case that shows significant information in the RESC section that can't be found in other places in the EXTH metadata? Also, perhaps we should pull the RESC parsing code into its own file to make the changes more self-contained and easier to follow. Thanks, KevinH Quote:
|
|
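KevinH's point above, that the text direction is stored in the EXTH metadata, can be illustrated with a minimal sketch of walking EXTH records. This is not KindleUnpack's own code. It assumes the commonly documented layout (an 'EXTH' marker, a 4-byte header length, a 4-byte record count, then records made of a 4-byte type, a 4-byte length that includes those 8 bytes, and the payload). The record id 527 used for the page progression direction is an assumption made for illustration, and make_record and the sample blob are hypothetical test data.

```python
import struct

def parse_exth(data):
    """Return a dict mapping EXTH record type -> list of raw payloads (bytes)."""
    start = data.find(b'EXTH')
    if start < 0:
        return {}
    # Header: 4-byte marker, 4-byte header length, 4-byte record count (big-endian).
    _header_len, count = struct.unpack_from('>II', data, start + 4)
    records = {}
    pos = start + 12
    for _ in range(count):
        # Record: 4-byte type, 4-byte total length (includes these 8 bytes), payload.
        rtype, rlen = struct.unpack_from('>II', data, pos)
        if rlen < 8:
            break  # malformed record, stop rather than loop forever
        records.setdefault(rtype, []).append(data[pos + 8:pos + rlen])
        pos += rlen
    return records

def make_record(rtype, payload):
    """Hypothetical helper: build one EXTH record for the synthetic sample below."""
    return struct.pack('>II', rtype, len(payload) + 8) + payload

# Assumed record id for the page progression direction (illustration only).
PAGE_PROGRESSION = 527

# Synthetic EXTH block: an author record (type 100) and a direction record.
body = make_record(100, b'Some Author') + make_record(PAGE_PROGRESSION, b'rtl')
blob = b'EXTH' + struct.pack('>II', 12 + len(body), 2) + body

exth = parse_exth(blob)
print(exth[PAGE_PROGRESSION][0].decode('utf-8'))
```

Keeping the values as lists of raw bytes leaves decoding to the caller, which matters because some record types can appear more than once.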
02-08-2014, 01:07 AM | #656 | |
Connoisseur
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
|
Quote:
Thanks for your comments. Yes, you are right. The text direction and the cover image info are stored in the EXTH. The most important information in the RESC I need to retrieve is the "page-spread property" in each spine itemref tag, which is necessary to show images spanning two pages correctly in landscape view. I will prepare and post an example later. I think the cover image info and the spine itemref ids in the RESC help to bring the output nearer to the source ebook processed by kindlegen, but I'm not sure whether anyone wants that. I've also found "creator role" and "creator display-seq", which might allow more detailed retrieval; however, I could not find how to match the creators in the metadata with those in the RESC when there are several creators. I will consider separating the code. Currently, the modified parts of the code are integrated into mobi_k8proc.py and mobi_k8opf.py, in order to find correspondences from the spine itemrefs in the RESC to the original K8Processor class, based on skeleton ids and xhtml filenames. Thanks, tkeo |
|
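The spine attributes tkeo describes above (a page progression direction on the spine tag and a page-spread property on each itemref) might end up in the generated OPF roughly as in the sketch below. It is only an illustration, not the code from the attached package: build_spine, its parameters, and the sample ids are hypothetical names, and the output follows the epub 3 style attributes discussed in this thread.

```python
def build_spine(itemrefs, direction=None, toc_id='ncx'):
    """Build an OPF <spine> element as text.

    itemrefs  -- list of (idref, spread) pairs; spread is e.g.
                 'page-spread-left', 'page-spread-right', or None
    direction -- 'ltr', 'rtl', or None to omit the attribute
    """
    attrs = ' toc="%s"' % toc_id
    if direction in ('ltr', 'rtl'):
        attrs += ' page-progression-direction="%s"' % direction
    lines = ['<spine%s>' % attrs]
    for idref, spread in itemrefs:
        # Only emit the properties attribute when a spread value is known.
        props = ' properties="%s"' % spread if spread else ''
        lines.append('<itemref idref="%s"%s/>' % (idref, props))
    lines.append('</spine>')
    return '\n'.join(lines)

# Hypothetical right-to-left comic: a two-page spread followed by a plain page.
spine_xml = build_spine(
    [('page1', 'page-spread-right'),
     ('page2', 'page-spread-left'),
     ('colophon', None)],
    direction='rtl')
print(spine_xml)
```

Leaving both attributes optional means a book with no spread information still produces a plain epub 2 style spine.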
02-08-2014, 08:06 AM | #657 | |
Connoisseur
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
|
Huge bug in this version
Quote:
I found a huge bug in this modification. Some K8 ebooks can be processed but others cannot. I am fixing it now. I am sorry, tkeo |
|
02-09-2014, 01:11 AM | #658 | |
Connoisseur
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
|
Bug fixed. KindleUnpack v63
Quote:
I've fixed the bugs in the KindleUnpack v63 previously posted. I also made examples of rtl books. The source (rtl_example1_src.zip) of the first one (rtl_example1.mobi) is HTML written manually. The second one (rtl_example2.mobi) was generated by Kindle Comic Creator. Both have "page-spread properties" in spine itemrefs. BTW, I encountered a curious phenomenon. Ebooks generated by Kindle Comic Creator (e.g. rtl_example2.mobi) can be unpacked; however, the created epub files are not accepted by kindlegen, whereas the unzipped kindlegensrc.zip files are accepted. This occurs with v62 too. Thanks, Last edited by tkeo; 02-09-2014 at 01:23 AM. |
|
02-09-2014, 12:28 PM | #659 | |
Sigil Developer
Posts: 7,651
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Hi tkeo,
Thank you for the examples, I will play around with them. I can't believe that a Kindle device supports the spine/page spread properties by keeping and parsing the RESC section on the fly during reading. My guess is they must include or encode that information in some other way, but that is just a guess and I could be wrong. I had never heard of the page spread properties and so read up on them. They seem to be specific to fixed layout and comics. Many of the spine properties you are parsing for in the RESC are not part of the official epub 2 spec at all and are epub 3 or non-universal epub 2 extensions. KindleUnpack tries to generate a working epub that meets epub 2 specs since as far as I know there are no true shipping epub 3 devices. Your features technically would require an epub 3 spec book, or us just adding them to epub 2 and hoping that the mix works just fine. I am not sure that is the right approach. Perhaps it would be better to create a separate version of KindleUnpack that tries its best to create an epub 3 like output since current Kindle AZW3 is someplace between epub 2 and epub 3. Quote:
The epub 2 like structure we generate is not from kindlegensrc but instead from reverse compiling the AZW3. If you have access to kindlegensrc then you should not need KindleUnpack unless you want to explore just how the raw AZW3 text is generated or interpreted by kindlegen. Many times the user only has access to a shipping AZW3 or a stripped AZW3 (these will not have the SRCS section) and KindleUnpack will do its best to decompile the AZW3 back to something usable. KindleUnpack will not generate an exact replica of the input sources nor is it even guaranteed to generate a working epub! But in most cases, if a valid verified epub2 is input into kindlegen, then KindleUnpack will generate a valid working epub2. If the user inputs old/broken html or even old mobi 6s into kindlegen, it will create the mobi/azw3 but when it is unpacked using KindleUnpack this software will do its best but will most likely not generate a valid epub2. So ignoring fixed layout books and comics for the moment, do you have any test cases that show a valid epub 2 being given to kindlegen that KindleUnpack unpacks to a non-valid epub 2? If so, I would consider them bugs and so would love to have a bug report with a testcase that shows this behaviour. I will look closer at what you have done but as kindlegen supports more and more epub 3 as valid input, we will need to create a new version of KindleUnpack that unpacks to an epub 3-like container and not an epub 2. Frankly after studying the epub 3 spec, it seems the people who created the spec don't really understand what ebooks are all about and are completely missing the fundamental concept that simpler is better for all things. What a mess!
Unfortunately, right now with the various private extensions supported for fixed layout, comics, and multi-media ebooks all differing by epub vendor (Apple vs ADE/Kobo) and Amazon and the huge overhead and unnecessary complexity of epub 3, we are in some no-man's-land between official epub 2 and some fantasy epub 3. KevinH |
|
02-09-2014, 01:18 PM | #660 |
Sigil Developer
Posts: 7,651
Karma: 5433388
Join Date: Nov 2009
Device: many
|
Hi tkeo,
One other thing, I am not a big fan of minidom at all. It seems generally bloated and barfs if any true unicode is used (at least on 2.X). I see you wrote both an xml.dom.minidom version and a regular expression version of things. Every time I have used an xml ElementTree or some other XML parser (either standard package or add-ons) in python 2.X I have run into problem cases that simply do not parse well or get confused with encodings, resulting in non-robust operation on some platforms (Mac, Win, or Linux). So unless you feel strongly about it (and given the re vs dom code sizes are about the same), I would rather stick with the regular expression version, as it is easier for people to modify and fix and is robust to most encoding issues. I see you have also written a metadata parsing routine that supports epub 3 like "refines" on named items. This is quite nice but using it in epub 2 spec devices might cause problems. I really think we should incorporate your code and try to create an epub 3 generator version of KindleUnpack to stay in epub 3 space and not try to mix private extensions into what is primarily epub 2 code. What do you think? KevinH |
|
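As a rough illustration of the regular expression approach KevinH prefers above, the sketch below pulls the attributes of each spine itemref out of a RESC-like XML fragment without building a DOM. It is not the code from tkeo's package: parse_itemrefs, the two patterns, and the sample fragment (including its skelid attribute) are assumptions made for the example.

```python
import re

# Matches <itemref ...> or <itemref .../> and captures the attribute text.
ITEMREF_RE = re.compile(r'<itemref\b([^>]*?)/?>', re.IGNORECASE)
# Matches name="value" or name='value' pairs inside a tag.
ATTR_RE = re.compile(r'([\w:.-]+)\s*=\s*(?:"([^"]*)"|\'([^\']*)\')')

def parse_itemrefs(xml):
    """Return one dict of attributes per <itemref> found in xml."""
    result = []
    for match in ITEMREF_RE.finditer(xml):
        attrs = {}
        for name, dq_value, sq_value in ATTR_RE.findall(match.group(1)):
            attrs[name] = dq_value or sq_value
        result.append(attrs)
    return result

# Hypothetical RESC-like fragment, for illustration only.
resc = ('<spine page-progression-direction="rtl">'
        '<itemref idref="x_page1" skelid="0" properties="page-spread-right"/>'
        '<itemref idref="x_page2" skelid="1" properties="page-spread-left"/>'
        '</spine>')

refs = parse_itemrefs(resc)
for ref in refs:
    print(ref['idref'], ref.get('properties'))
```

The trade-off is the usual one: the patterns tolerate fragments that are not well-formed XML, but they would also need extending by hand for cases such as a ">" inside an attribute value.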