Old 07-24-2014, 03:48 PM   #931
DiapDealer
Grand Sorcerer
Posts: 27,546
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by davidnwelton View Post
I am sorry to butt in, but I was just curious if you guys had considered putting the source code for mobiunpack up on github or something like that that makes it easier to collaborate on.

Thanks
It's been talked about several times before in the thread. I don't think anyone really wants to be saddled with being the maintainer, or with just making sure there ARE current maintainers who can guard the kingdom. Maybe some day, but unless Amazon introduces new features into the format, development often comes to a standstill for long periods of time. We could come back from a long hiatus and find that no one with a key to the front door is around anymore.
Old 07-24-2014, 03:56 PM   #932
KevinH
Sigil Developer
Posts: 7,630
Karma: 5433388
Join Date: Nov 2009
Device: many
Hi,

Already tried that and it did not work. We had it on google code for years and nothing was ever done or even used. So we took it down.

The universe of potential users of this project is actually quite small, and they often don't visit any sourcecode/google code/github sites. Actual editors and authors find us here. There are only ever 2 or 3 active developers at one time.

So we get better input on KindleUnpack from this forum and exchanging patches than we ever did from having a repository. Except for a recent flourish of new features this past month, KindleUnpack is considered reasonably stable.

If you have something to contribute simply ask about it here and then post a patch.

If your patch does not impede the intent of the package it will most likely be accepted. But please note: KindleUnpack is a Kindle mobi/azw3 diagnostic tool that allows ebook editors/authors to see how their code (and the code of others) has been changed by kindlegen and to fix minor bugs, and that provides Python-based documentation of the compiled mobi/azw4/azw3 format.

It is not a standalone Mobi to epub generator. If you want that, please look towards Calibre instead.

Hope this helps,

KevinH
Old 07-25-2014, 10:03 PM   #933
tkeo
Connoisseur
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
Hi,
Quote:
Originally Posted by davidnwelton View Post
I am sorry to butt in, but I was just curious if you guys had considered putting the source code for mobiunpack up on github or something like that that makes it easier to collaborate on.
Others have already explained the reasons for not using something like github. There is a repository of KindleUnpack on github created by quiris; however, it has not been updated since v0.71 (the latest version is v0.73).
https://github.com/quiris11/KindleUnpack

I use git locally a little but do not have a github account.
Thanks,
Old 07-26-2014, 09:04 AM   #934
tkeo
Connoisseur
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
Hi,

I have modified KindleUnpack v0.73. Modifications are as follows:
  1. added refines metadata processing
  2. fixed language code in the ncx and title in the navigation document
  3. added an F (force epub2) option to epubver that removes epub3 attributes to fit the epub2 definition
I feel that whether removing the epub3 attributes is necessary, and how to switch it on, still needs discussion.

In addition, I am considering moving the code that adds metaguidetext to guidetext from mobi_opf.py to processMobi7() in kindleunpack.py.

Please share your opinions if you have any.
Thanks,

CAUTION: This update is under development and is not intended for end users, because the specification is not fixed.
Attached Files
File Type: zip patch_v073_to_v073a.zip (9.9 KB, 198 views)
File Type: zip KindleUnpack_v073a_diff.zip (30.5 KB, 235 views)

Last edited by tkeo; 07-26-2014 at 09:25 AM.
Old 07-26-2014, 11:24 AM   #935
KevinH
Sigil Developer
Posts: 7,630
Karma: 5433388
Join Date: Nov 2009
Device: many
Hi tkeo,

Quote:
Originally Posted by tkeo View Post
Hi,
I have modified KindleUnpack v0.73. Modifications are as follows:

1. added refines metadata processing
Is this only for a single creator? How did you deal with multiple creators?

Quote:
2. fixed language code in the ncx and title in the navigation document
3. added an F (force epub2) option to epubver that removes epub3 attributes to fit the epub2 definition
Great! These are useful additions.

Quote:
I feel that whether removing the epub3 attributes is necessary, and how to switch it on, still needs discussion.
I agree, here are a few things to consider during this forced conversion:

1. replace the section tag with <div data-tag="section">, and similarly for the closing tag
2. replace epub:type="blah" attributes with data-epub-type="blah" to keep the semantic meaning
3. allow video and audio tags to go through as is
4. deal with <aside> in some sane manner
5. add epub_type vocabulary to guide elements where crossover exists and to nav if possible
6. convert nav to toc if toc does not exist
7. convert metadata from the new format back to the older format (with opf:scheme, opf:file-as, opf:role), replacing refines with something more sane
8. remove cover manifest property and add in required meta name="cover"
9. there are probably a few other new tags we should convert as well
10. ...

Please add to the list above. Once we agree on the best way to force epub 2, I would also like a similar way to reverse all of this (including reversing the section-to-div change, the epub:type-to-data-epub-type change, etc.) to force generation of a valid epub 3, starting from an epub 2 with the extras added.
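
A minimal sketch of how items 1 and 2 could be made reversible, assuming simple regex substitution is acceptable for the markup involved. The function names, the regexes, and the marker comment placed after each converted closing tag are all hypothetical; none of this is taken from KindleUnpack itself.

```python
import re

# Hypothetical helpers for illustration only; not KindleUnpack code.
_SECTION_OPEN = re.compile(r'<section\b([^>]*)>')
_SECTION_CLOSE = re.compile(r'</section\s*>')
_EPUB_TYPE = re.compile(r'\bepub:type=(["\'])(.*?)\1')

# Assumed marker so a converted closing tag can be told apart from a real </div>.
_CLOSE_MARK = '</div><!--data-tag="section"-->'


def force_epub2(xhtml):
    """Down-convert section tags and epub:type, leaving markers for reversal."""
    # <section ...> becomes <div data-tag="section" ...>
    xhtml = _SECTION_OPEN.sub(r'<div data-tag="section"\1>', xhtml)
    # </section> becomes </div> followed by the marker comment
    xhtml = _SECTION_CLOSE.sub(_CLOSE_MARK, xhtml)
    # epub:type="x" becomes data-epub-type="x"
    return _EPUB_TYPE.sub(r'data-epub-type=\1\2\1', xhtml)


def restore_epub3(xhtml):
    """Reverse force_epub2() using the markers it left behind."""
    xhtml = re.sub(r'<div data-tag="section"([^>]*)>', r'<section\1>', xhtml)
    xhtml = xhtml.replace(_CLOSE_MARK, '</section>')
    return re.sub(r'\bdata-epub-type=(["\'])(.*?)\1', r'epub:type=\1\2\1', xhtml)
```

Marking the closing tag is a design choice of this sketch: without some marker, a converted `</div>` cannot be distinguished from an ordinary one when converting back.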

Quote:
In addition, I am considering moving the code that adds metaguidetext to guidetext from mobi_opf.py to processMobi7() in kindleunpack.py.
That matches what we do for kf8 so that is a good idea.

I am also considering adding in my mobiml2html.py code to convert mobi 7 to something importable into calibre and Sigil, that can be further edited.

I would also like to add a feature that provides the best single output format using the following scheme:

1. if the mobi includes a SRCS record (that is, kindlegensrc.zip is provided), return it
2. if the mobi has no source but has a KF8 (azw3) part, return our unpacked epub from mobi8
3. if the mobi has no source and no KF8 part, use the mobiml2html.py code to return at least a parseable, proper xhtml version of the mobi 7 output
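
The three-step preference above could be sketched as a simple cascade. The function name, its boolean parameters, and the returned labels are hypothetical; how KindleUnpack would actually detect a SRCS record or a KF8 part is not shown here.

```python
def best_single_output(has_srcs, has_kf8):
    """Pick the preferred single output, following the scheme above.

    has_srcs -- True if the mobi contains a SRCS record (kindlegensrc.zip)
    has_kf8  -- True if the mobi contains a KF8 (azw3) part
    """
    if has_srcs:
        return "kindlegensrc.zip"          # original source wins
    if has_kf8:
        return "epub from mobi8"           # unpacked KF8 as epub
    return "xhtml from mobiml2html.py"     # last resort: converted mobi 7
```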

Please let me know what you think.

Quote:
CAUTION: This update is under development and is not intended for end users, because the specification is not fixed.
So don't ask for versions of it for a plugin or for your bug reporting JSWolf! ;-)

KevinH
Old 07-28-2014, 09:18 AM   #936
tkeo
Connoisseur
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
Hi,
Quote:
Originally Posted by KevinH View Post
Is this only for single creator? How did you deal with multiple creators?
For each of title, publisher and creator, if there is exactly one element paired with exactly one corresponding EXTH record for the furigana, the refines tags are not commented out. Otherwise they are commented out. The following is an example.
Code:
<?xml version="1.0" encoding="utf-8"?>
<package version="3.0" xmlns="http://www.idpf.org/2007/opf" prefix="rendition: http://www.idpf.org/vocab/rendition/#" unique-identifier="uid">
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
<dc:title id="title01">AAAA</dc:title>
<dc:language>ja</dc:language>
<dc:identifier id="uid">3232262294</dc:identifier>
<meta property="dcterms:modified">2014-07-26T08:09:39Z</meta>
<dc:creator id="creator01">XXX</dc:creator>
<dc:creator id="creator02">YYY</dc:creator>
<dc:publisher id="publisher01">BBBB</dc:publisher>
<dc:date opf:event="publication">2011-05-30</dc:date>
<!-- Refines MetaData from EXTH -->
<meta property="file-as" refines="#title01">aaaa</meta>
<meta property="file-as" refines="#publisher01">bbbb</meta>
<!-- THE FOLLOWING REQUIRE THE IDS TO BE EDITED MANUALLY
<meta property="file-as" refines="#creator01">yyy</meta>
<meta property="file-as" refines="#creator02">xxx</meta>
<meta scheme="marc:relators" property="role" refines="#creator01">aut</meta>
<meta property="display-seq" refines="#creator01">1</meta>
-->
Quote:
I agree, here are a few things to consider during this forced conversion:

1. replace the section tag with <div data-tag="section">, and similarly for the closing tag
2. replace epub:type="blah" attributes with data-epub-type="blah" to keep the semantic meaning
3. allow video and audio tags to go through as is
4. deal with <aside> in some sane manner
5. add epub_type vocabulary to guide elements where crossover exists and to nav if possible
6. convert nav to toc if toc does not exist
7. convert metadata from the new format back to the older format (with opf:scheme, opf:file-as, opf:role), replacing refines with something more sane
8. remove cover manifest property and add in required meta name="cover"
9. there are probably a few other new tags we should convert as well
10. ...
6 and 8 are already done. In our KindleUnpack, the nav section in nav.xhtml is generated from the ncx (i.e. NCXProcessor), so the ncx always exists.
For 7, the values need to be converted from the refines metadata, so the problem is resolving the id correspondence, the same as for the refines metadata itself.
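
As a rough illustration of item 7, the sketch below folds epub3 refines metas back onto the element they refine as epub2-style opf:file-as and opf:role attributes. It assumes the ids already match the right elements, which is exactly the open problem described above, and it uses the standard library's ElementTree rather than whatever mobi_opf.py does internally. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

OPF = "http://www.idpf.org/2007/opf"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("opf", OPF)
ET.register_namespace("dc", DC)

# epub3 refines property -> epub2 attribute (only the two handled here)
_ATTR_FOR = {"file-as": "{%s}file-as" % OPF, "role": "{%s}role" % OPF}


def fold_refines(metadata):
    """Move <meta refines="#id"> values onto the refined element (hypothetical helper)."""
    by_id = {el.get("id"): el for el in metadata if el.get("id")}
    for meta in list(metadata):  # iterate over a copy, since we remove items
        target = by_id.get((meta.get("refines") or "").lstrip("#"))
        attr = _ATTR_FOR.get(meta.get("property"))
        if target is not None and attr is not None:
            target.set(attr, meta.text or "")
            metadata.remove(meta)
    return metadata
```

Properties with no obvious epub2 counterpart, such as display-seq, are left in place by this sketch; what to do with them would still need to be decided.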
Old 07-28-2014, 10:33 AM   #937
JSWolf
Resident Curmudgeon
Posts: 73,887
Karma: 128597114
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
In forcing to ePub2, the video and audio tags might cause a problem. Consider dropping them.
Old 07-29-2014, 09:28 AM   #938
tkeo
Connoisseur
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
Hi,

I would like to confirm: for the F option, we are going to convert the K8 epub-like structure into one with epub2 tags that is accepted as a source by kindlegen version 2.?, is that right?

Thanks,
Old 07-29-2014, 05:45 PM   #939
KevinH
Sigil Developer
Posts: 7,630
Karma: 5433388
Join Date: Nov 2009
Device: many
Hi tkeo,

If the user specifies F to force epub2, my guess is they want the epub version 2 for their own use and most probably won't be passing it back through kindlegen, which can deal with the epub 3 features just fine. My guess is they probably want to load it into calibre or Sigil for further editing, but neither really supports epub 3.

So if we can take the epub 3 features and convert them as little as possible, making liberal use of the data-* attribute and comments specially marked to be reversible, down-convert the epub3 metadata and the like, the user may be able to edit it in calibre or Sigil and get it to validate, and yet make it easy to auto convert back to epub 3 if possible.

That is the plan anyway.

Take care,

KevinH

Quote:
Originally Posted by tkeo View Post
Hi,

I would like to confirm: for the F option, we are going to convert the K8 epub-like structure into one with epub2 tags that is accepted as a source by kindlegen version 2.?, is that right?

Thanks,
Old 07-31-2014, 09:03 AM   #940
tkeo
Connoisseur
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
Hi Kevin,

Quote:
Originally Posted by KevinH View Post
1. replace the section tag with <div data-tag="section">, and similarly for the closing tag
2. replace epub:type="blah" attributes with data-epub-type="blah" to keep the semantic meaning
3. allow video and audio tags to go through as is
4. deal with <aside> in some sane manner
5. add epub_type vocabulary to guide elements where crossover exists and to nav if possible
6. convert nav to toc if toc does not exist
7. convert metadata from the new format back to the older format (with opf:scheme, opf:file-as, opf:role), replacing refines with something more sane
8. remove cover manifest property and add in required meta name="cover"
9. there are probably a few other new tags we should convert as well
10. ...
Although we are not yet sure about the best way to force-convert to epub2 tags, or whether the list is complete, I have made changes to fulfill item 7 in the list, in order to confirm whether this conversion matches the purpose.


Compared with v0.73, v0.73b (and maybe v0.74) is a rather minor functional improvement, so I would like to proceed at a slower pace.

Thanks,
Attached Files
File Type: zip patch_v073a_to_v073b.zip (3.9 KB, 216 views)
File Type: zip KindleUnpack_v073b_diff.zip (21.8 KB, 200 views)
Old 08-01-2014, 08:03 AM   #941
lglgaigogo
Junior Member
Posts: 2
Karma: 10
Join Date: Aug 2014
Device: kindle paper white
The program currently does not handle dictionaries encoded in utf-8 well.
When I use the kindlegen option -western, this problem does not occur.

1. Tag <idx:orth>: the value attribute seems to be garbled
2. Tag <idx:iform>: the value attribute seems to be garbled


It seems to use a weird table to index the characters.
The table :



the dictionary can be found at:
https://github.com/lglgaigogo/AI2KD/...ish%205th.mobi

Last edited by lglgaigogo; 08-02-2014 at 02:44 AM.
Old 08-01-2014, 08:53 AM   #942
DiapDealer
Grand Sorcerer
Posts: 27,546
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Dictionary support has always been a bit touch and go.
Old 08-01-2014, 11:46 AM   #943
KevinH
Sigil Developer
Posts: 7,630
Karma: 5433388
Join Date: Nov 2009
Device: many
I'm out of town and out of touch for the next 10 days or so. When I get back, I will download the dictionary and try to reproduce the issue. ORDT sections like that represent a byte mapping of one character encoding into another, typically multi-byte. I have seen this issue in some sample ebooks. It is caused by the generating machine using a strange charset like 65002 versus the more typical 65001 (utf-8).

If you want to play around, look in the mobi_index.py file for the strings horde and ORDT. As the code comments say, there are two ORDT provided but we could only figure out what the second one was for. We may now figure it out from your testcase.

KevinH
Old 08-02-2014, 02:06 AM   #944
lglgaigogo
Junior Member
Posts: 2
Karma: 10
Join Date: Aug 2014
Device: kindle paper white
Quote:
Originally Posted by KevinH View Post
I'm out of town and out of touch for the next 10 days or so. When I get back, I will download the dictionary and try to reproduce the issue. ORDT sections like that represent a byte mapping of one character encoding into another, typically multi-byte. I have seen this issue in some sample ebooks. It is caused by the generating machine using a strange charset like 65002 versus the more typical 65001 (utf-8).

If you want to play around, look in the mobi_index.py file for the strings horde and ORDT. As the code comments say, there are two ORDT provided but we could only figure out what the second one was for. We may now figure it out from your testcase.

KevinH
Thank you for paying attention to my issue. I am now trying to understand the non-western character encoding pattern.
Thank you.

For now, I have figured out:

1. Every character has a 2-byte index
2. For western letters it looks like 00 XX; for example, 'a' is 00 03 and 'b' is 00 64, and looking them up in the ORDT table:
ORDT[3*2+1] is 'a'
ORDT[64*2+1] is 'b'
3. For non-western letters it looks like XX XX; for example, '潘' is 6F 58, and in python:
Code:
 print(u"\u6F58")  # prints exactly the character '潘'
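
Taken together, the three observations above suggest a decoder along the following lines. This is only a sketch in Python 3 under the pattern as described: the function name is hypothetical, the ORDT table is assumed to be a byte string with two bytes per entry, and the low byte of a "00 XX" index is used directly as the entry number (the post does not say whether its example values are hex or decimal).

```python
def decode_index_text(raw, ordt):
    """Decode a run of 2-byte indexes using the observed pattern.

    raw  -- bytes holding big-endian 2-byte indexes
    ordt -- bytes of the ORDT table, assumed to hold 2 bytes per entry
    """
    chars = []
    for pos in range(0, len(raw) - 1, 2):
        hi, lo = raw[pos], raw[pos + 1]
        if hi == 0:
            # "00 XX": XX selects an entry; the character is at ORDT[XX*2+1]
            chars.append(chr(ordt[lo * 2 + 1]))
        else:
            # "XX XX": the two bytes are taken directly as a code point
            chars.append(chr((hi << 8) | lo))
    return "".join(chars)
```

For instance, with a table whose entry 3 holds 'a', the bytes 00 03 6F 58 would decode to 'a' followed by '潘'. Whether this holds for every dictionary, or only for this test case, is untested.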

Last edited by lglgaigogo; 08-04-2014 at 10:38 AM.
Old 08-12-2014, 08:09 AM   #945
tkeo
Connoisseur
Posts: 94
Karma: 10
Join Date: Feb 2014
Location: Japan
Device: Kindle PaperWhite, Kobo Aura HD
This is an experimental version, just to show that audio and video can be extracted from a mobi.

I have modified kindleunpack.py to extract AUDI and VIDE sections, which contain audio and video respectively.

The files are extracted to the HDImage folder;
however, the extracted files are not linked in the xhtml files, and
the suffixes of the files are hard-coded to '.mp3' and '.mp4'.
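
A minimal sketch of the section detection described above, assuming each such section starts with the four-byte tag AUDI or VIDE. The function name and the output file naming are hypothetical, and stripping the section header before writing the payload is not shown because its layout is not given here.

```python
# Suffixes are hard-coded, mirroring the experimental patch described above.
MEDIA_SUFFIX = {b"AUDI": ".mp3", b"VIDE": ".mp4"}


def media_filename(section_data, index):
    """Return an output file name for an AUDI/VIDE section, or None otherwise."""
    suffix = MEDIA_SUFFIX.get(bytes(section_data[:4]))
    if suffix is None:
        return None  # not a multimedia section
    # Hypothetical naming scheme: zero-padded section number plus suffix
    return "media%05d%s" % (index, suffix)
```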

Thanks,
Attached Files
File Type: zip kindleunpack_v073x.zip (12.4 KB, 207 views)
File Type: zip multimedia_test.zip (63.3 KB, 208 views)