Old 05-14-2014, 07:09 AM   #1
baf
Evangelist
Posts: 404
Karma: 2200000
Join Date: May 2012
Device: kt
Libmobi – C library

Hi,

I just want to say that I started working on a C library for handling MOBI files.
It is at a very early stage. I have just put my libmobi project on github. I am currently working on reading and parsing mobi documents; the next step will be writing mobi format files.
In the project there is a mobitool program which is meant to be an example of usage and a tool for testing the library. It is able to load and parse basic mobi files now. It partially recreates and dumps original markup.

All credit goes to users of this forum and authors of KindleUnpack and Calibre – the only source of information about MOBI format.

Anybody who would like to test or contribute to the project is welcome
Old 06-26-2014, 03:21 PM   #2
baf
Evangelist
Posts: 404
Karma: 2200000
Join Date: May 2012
Device: kt
I want to report that I have moved my project forward.
As I want to understand how the mobi format works, I focused on recreating html-like markup from a mobi file.
I implemented reconstruction of internal references to html targets.
Apart from reading and parsing mobi documents, libmobi should now be able to produce a set of files that may be used as input to kindlegen.

I am sure there are still bugs and need for improvements. I am short of time and mobi format samples, but I will be working on it.

For those who would like to test it but don't know how to compile programs, I attach statically built binaries for several architectures. The program, mobitool, is very simple. Its main function is turning a mobi document into markup files. Precompiled binaries may or may not work on your particular system. The best and recommended way to test the library is to compile it from source (from github). Mobitool is built together with the library. The process is quite straightforward.

UPDATE: I removed outdated binaries. Please download current releases from project's github page.


Code:
usage: mobitool [-diemrsuvx7] [-o dir] [-p pid] filename
    without arguments prints document metadata and exits
    -d      dump rawml text record
    -e      create EPUB file (with -s will dump EPUB source)
    -i      print detailed metadata
    -m      print records metadata
    -o dir  save output to dir folder
    -p pid  set pid for decryption
    -r      dump raw records
    -s      dump recreated source files
    -u      show rusage
    -v      show version and exit
    -x      extract conversion source and log (if present)
    -7      parse KF7 part of hybrid file (by default KF8 part is parsed)

Last edited by baf; 03-22-2016 at 05:20 PM. Reason: update binaries
Old 06-26-2014, 05:02 PM   #3
KevinH
Sigil Developer
Posts: 7,506
Karma: 5433350
Join Date: Nov 2009
Device: many
Hi,

I grabbed the source from github.

Nicely done!

I do not have libxml2.dylib on my machine so I configured without it and it built just fine on my Mac OS X Mavericks machine.

Interesting. I am glad to see that at least someone is making use of the KindleUnpack code and the code we contributed to calibre to support KF8 and joint mobis! ;-)

A few things:

1. when using the -s option you probably should remove all the aid="blah" attributes on tags, as they are all kindlegen generated and are not legal html5, and they will be added yet again if you pass that back through kindlegen.

(perhaps this would have happened if I had libxml2.dylib in /usr/local/lib/)?

2. you might want to grab the latest KindleUnpack testing version (v072f at the moment) and examine the code added to parse PAGE sections, HD CONT sections and CRES sections. The first is used to create apnx files; the latter two are related to HD images, which are stored in CRES sections that come after the new CONT container boundary, which in turn comes after all of the kf8 sections.

It also has code to unpack to epub version 2 or epub version 3.
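
As a rough illustration of point 2, here is a minimal Python sketch of how a parser might spot those extra sections. It assumes that each such record begins with its four-letter ASCII name (PAGE, CONT, CRES); the helper names are hypothetical and this is not the KindleUnpack code itself.

```python
# Assumption: a record of each kind starts with its 4-byte ASCII name.
KNOWN_MAGICS = {
    b"PAGE": "page map (source for apnx)",
    b"CONT": "container header / boundary",
    b"CRES": "HD image resource",
}


def classify_record(data: bytes) -> str:
    """Return a short label for a raw record based on its first four bytes."""
    return KNOWN_MAGICS.get(data[:4], "other")


def find_hd_images(records):
    """Return indices of CRES records that appear after the first CONT record."""
    found = []
    seen_cont = False
    for index, record in enumerate(records):
        magic = record[:4]
        if magic == b"CONT":
            seen_cont = True
        elif magic == b"CRES" and seen_cont:
            found.append(index)
    return found
```

Only records after the container boundary are reported, which mirrors the ordering described above.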

If you ever have questions about what the KindleUnpack code does, just let me know and I would be happy to help. Currently, I am working on unpacking .azk ebooks, which are basically just a zip archive of json fragments and skeletons (so it is much like a KF8) but with its own form of dictionary based compression whose keys are mapped into the Braille region (x2800) of unicode code points so that the jsonp objects are pure unicode strings.

Lots of fun with that one.

Take care,

KevinH

Last edited by KevinH; 06-26-2014 at 05:05 PM.
Old 06-26-2014, 06:01 PM   #4
DaleDe
Grand Sorcerer
Posts: 11,470
Karma: 13095790
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2 & Air 2, iPhone 7
Quote:
Originally Posted by baf View Post
Hi,



All credit goes to users of this forum and authors of KindleUnpack and Calibre – the only source of information about MOBI format.

Anybody who would like to test or contribute to the project is welcome
The main source for information about MOBI is the wiki. The KindleUnpack folks generally keep it up to date so you don't have to read source files (unless you like that sort of thing).

Dale
Old 06-26-2014, 07:59 PM   #5
KevinH
Sigil Developer
Posts: 7,506
Karma: 5433350
Join Date: Nov 2009
Device: many
Hi,

Quote:
Originally Posted by DaleDe View Post
The main source for information about MOBI is the wiki. The KindleUnpack folks generally keep it up to date so you don't have to read source files (unless you like that sort of thing).

Dale
Actually, although that is very true for the older mobi format, the newer KF8 is a lot more complicated, with lots of indices and tables, and documenting it all would be quite a chore. The best documentation is probably by code example from calibre and KindleUnpack. And I must admit that when I first reverse engineered the KF8 (mobi 8) format I did not take the time to update the Wiki with what I found, although I probably should have.

When I get more time I will try to add to the Wiki a lot more on the internal Mobi 8 format: the new skeleton and fragment indices, the FDST table, the guide index, the new fields in the ncx, how the rawml must be split and recombined to create the source, and how internal links with base 32 numbers must be processed. I can also expand on the PAGE sections and how to convert them to APNX files, and add lots on the HD CONT section and high res images.

Most of that is only documented in the KindleUnpack and calibre code at the moment.

In fact, having a wiki just dedicated to the Mobi 8 format with actual C and python code samples that shows exactly how to parse things would be quite useful.
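
As a small example of the kind of sample such a wiki could hold, here is a hedged Python sketch for the base 32 internal links mentioned above. It assumes the links look like kindle:pos:fid:XXXX:off:YYYYYYYYYY with digits 0-9 and A-V; the function name is made up for illustration.

```python
import re

# Assumed link shape: kindle:pos:fid:<base32>:off:<base32>, digits 0-9 and A-V.
LINK_RE = re.compile(r"kindle:pos:fid:([0-9A-V]+):off:([0-9A-V]+)")


def decode_pos_link(link: str):
    """Return (fragment_id, offset) decoded from a kindle:pos link."""
    match = LINK_RE.fullmatch(link)
    if match is None:
        raise ValueError("not a kindle:pos link: %r" % link)
    # int() accepts base 32 directly, with A-V standing for 10-31.
    fid, off = (int(group, 32) for group in match.groups())
    return fid, off
```

For instance, decode_pos_link("kindle:pos:fid:000A:off:0000000010") gives fragment 10 and offset 32. How those two numbers are then mapped onto a file and an anchor is a separate step that this sketch does not cover.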
Old 06-27-2014, 08:05 AM   #6
baf
Evangelist
Posts: 404
Karma: 2200000
Join Date: May 2012
Device: kt
Quote:
Originally Posted by KevinH View Post
Hi,

I grabbed the source from github.

Nicely done!

I do not have libxml2.dylib on my machine so I configured without it and it built just fine on my Mac OS X Mavericks machine.
Thanks for the kind words!
I thought libxml2 is part of xcode installation, but it is also possible that I built it myself. My Mavericks says:
Code:
$ xml2-config --prefix
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.9.sdk/usr
Quote:
Originally Posted by KevinH View Post
Interesting. I am glad to see that at least someone is making use of the KindleUnpack code and the code we contributed to calibre to support KF8 and joint mobis! ;-)
As I said earlier, it is the only source of information on the KF8 format. I followed the MOBI wiki up to a point, but most of the knowledge is only available between the lines of python code (which btw I hardly understand). Hats off to all the people who contributed to it.

Quote:
Originally Posted by KevinH View Post
A few things:

1. when using the -s option you probably should remove all the aid="blah" attributes on tags, as they are all kindlegen generated and are not legal html5, and they will be added yet again if you pass that back through kindlegen.

(perhaps this would have happened if I had libxml2.dylib in /usr/local/lib/)?
I realise that I should strip all Amazon formatting, but as it is the easiest step I put it off for later
If you had had libxml2 installed, mobitool would also recreate opf and ncx files.
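
For reference, stripping the aid attributes discussed above could look roughly like the Python sketch below. It is a regular expression pass rather than a real HTML parser, so it would also alter matching text outside of tags; the function name is hypothetical and this is not what mobitool does.

```python
import re

# Match an aid attribute with either quote style, plus the whitespace before it.
AID_RE = re.compile(r"""\s+aid=(?:"[^"]*"|'[^']*')""")


def strip_aid(markup: str) -> str:
    """Remove kindlegen-generated aid="..." attributes from markup."""
    return AID_RE.sub("", markup)
```

Because the pattern requires whitespace directly before aid, attributes such as data-aid are left alone.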

Quote:
Originally Posted by KevinH View Post
2. you might want to grab the latest KindleUnpack testing version (v072f at the moment) and examine the code added to parse PAGE sections, HD CONT sections and CRES sections. The first is used to create apnx files; the latter two are related to HD images, which are stored in CRES sections that come after the new CONT container boundary, which in turn comes after all of the kf8 sections.

It also has code to unpack to epub version 2 or epub version 3.

If you ever have questions about what the KindleUnpack code does, just let me know and I would be happy to help. Currently, I am working on unpacking .azk ebooks, which are basically just a zip archive of json fragments and skeletons (so it is much like a KF8) but with its own form of dictionary based compression whose keys are mapped into the Braille region (x2800) of unicode code points so that the jsonp objects are pure unicode strings.
Great! Where do I find the latest testing versions of KindleUnpack?
I found it. Is there any public repo for development, like on github?
Thanks!

Last edited by baf; 06-27-2014 at 08:12 AM.
Old 06-27-2014, 10:52 AM   #7
DaleDe
Grand Sorcerer
Posts: 11,470
Karma: 13095790
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2 & Air 2, iPhone 7
Quote:
Originally Posted by KevinH View Post
Hi,

In fact, having a wiki just dedicated to the Mobi 8 format with actual C and python code samples that shows exactly how to parse things would be quite useful.
Thanks for the correction. There is no problem creating more pages in the wiki although there is already a KF8 page there that could also be used.

Dale
Old 06-27-2014, 10:11 PM   #8
eureka
but forgot what it's like
Posts: 741
Karma: 2345678
Join Date: Dec 2011
Location: north (by northwest)
Device: Kindle Touch
Quote:
Originally Posted by KevinH View Post
In fact, having a wiki just dedicated to the Mobi 8 format with actual C and python code samples that shows exactly how to parse things would be quite useful.
I want this.

@baf, can you lead or participate in updating the technical documentation of the MOBI format based on your fresh knowledge?

I can help with that (though at a slow pace) or can do it on my own. I can read Python and C, so knowledge hidden in KindleUnpack/libmobi code would be accessible to me. But undefined license of KindleUnpack and LGPL of libmobi are pretty restrictive for me, because I also have a desire to write another mobi library (someday), but with MIT license, so I'm trying to hold back from reading KindleUnpack and libmobi code.

@baf, can you relicense your library to MIT? @KevinH, can you (and all your contributors) explicitly set KindleUnpack license to MIT (or, better, CC0, given that you've expressed your neutrality to license choice some time ago)?

I know, I'm asking too much, but I couldn't restrain myself from taking the opportunity.
EDIT: I see now that KindleUnpack is GPL3-licensed and libmobi is derived from it, so, I guess, no luck for me here.

Last edited by eureka; 06-27-2014 at 10:19 PM.
Old 06-28-2014, 08:50 AM   #9
baf
Evangelist
Posts: 404
Karma: 2200000
Join Date: May 2012
Device: kt
Quote:
Originally Posted by eureka View Post
@baf, can you lead or participate in updating the technical documentation of the MOBI format based on your fresh knowledge?
I don't think any leadership is needed here. We could just start with updating the MOBI wiki.
It is my duty to contribute to the wiki and I want to do it, but at the moment libmobi development itself consumes too much of my time. I hope I will be able to document some KF8 related algorithms when my project reaches a more stable state.
Quote:
I can read Python and C, so knowledge hidden in KindleUnpack/libmobi code would be accessible to me. But undefined license of KindleUnpack and LGPL of libmobi are pretty restrictive for me, because I also have a desire to write another mobi library (someday), but with MIT license, so I'm trying to hold back from reading KindleUnpack and libmobi code.
Choosing a license is a personal choice. I support the GPL idea. Just note that the LGPL license is pretty permissive for a shared library. You can use it in commercial, closed-source applications. It still contains the dreadful virus though. So I well understand that somebody else may want to choose another approach.

You shouldn't be afraid of reading KindleUnpack or libmobi code. Is there much difference between learning an algorithm from source code and learning it from documentation derived from that source code? I still believe you know how to make fair use of it.
Old 06-28-2014, 09:41 AM   #10
DiapDealer
Grand Sorcerer
Posts: 27,465
Karma: 192992430
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
I find the act of asking someone to change their personal decision (RE software licensing), to match your personal needs, to be more vulgar than door-to-door religious proselytizing.

If someone ASKS for help choosing a license ... sure, expound away. But if the decision is already made ... assume some thought went into it, and that the license that most closely matched the creator's personal convictions was selected.
Old 06-28-2014, 11:23 AM   #11
KevinH
Sigil Developer
Posts: 7,506
Karma: 5433350
Join Date: Nov 2009
Device: many
Hi,
In general, I am a bit of a license agnostic, especially when it comes to creating tools for authors and other non-developer users who just want to read and develop their own books.

That said ... I wouldn't be happy if I gave away literally hundreds of hours of reverse engineering the mobi 8 code just so that someone else could make money on it without contributing back their code and improvements and without attribution. That was the whole point of the original GPL, and although I think the GPL version 3 is doing more harm than good by driving commercial companies away from open source projects like gcc, the original GPL 1 and 2 licenses were quite good licenses and served a useful purpose.

And none of my reverse engineering would have been worth a hill of beans without knowledge of the older mobi format and the original authors of mobiunpack, the huff dic compression code, everyone who contributed to the wiki, and especially the insanely messy code of reading and decoding the indexes with their arcane variable length bytes and rules. So if they want GPL 3 vs 2, who am I to say otherwise. Their choice of license has worked splendidly, as far as I can tell. So I think there is 0 chance of the license of KindleUnpack changing.

My 2 cents,

KevinH

Last edited by KevinH; 06-28-2014 at 11:28 AM.
Old 06-28-2014, 07:25 PM   #12
eureka
but forgot what it's like
Posts: 741
Karma: 2345678
Join Date: Dec 2011
Location: north (by northwest)
Device: Kindle Touch
Thanks for detailed answers.

@baf, @KevinH, I fully appreciate your work and understand your reasons and exclusive right for choosing a license for your code. I endorse (L)GPL in general, but in case of unique closed-source libraries (I mean Amazon's mobi library used in kindlegen and on Kindle, not baf's libmobi or KindleUnpack) it would be nice to have [also] MIT/BSD-licensed library code as an implementation example compatible with most of possible licenses for further (re)implementations (as opposed to end-user programs, where differences between GPL and MIT/BSD aren't important in this sense). For me, this case is about disseminating knowledge in source code form and not necessarily about freeing software.

Anyway, I didn't want to start flamewar about licensing and didn't mean to thrust my opinion on it. Sorry for raising this theme.

As a sidenote: I don't want to read and reimplement (L)GPL-licensed mobi-related code mostly because of moral considerations, not legal ones. I don't think somebody will really sue me if I read your code, get knowledge about algorithms and data structures, implement them and share the result under MIT license. It's just that such a reimplementation is not in the spirit of the principles behind GPL and "free software". Also, I haven't even started to write my code, so following this reasoning is easy and rational ATM.

UPD: to be fully clear, (L)GPL of existing source in this concrete case is acceptable for me, and I don't want to persuade you to change it. I respect your personal choices.

Last edited by eureka; 06-28-2014 at 09:32 PM. Reason: clarifications in sidenote
Old 11-17-2014, 07:12 AM   #13
baf
Evangelist
Posts: 404
Karma: 2200000
Join Date: May 2012
Device: kt
Hi again,
I found some time lately to move my project forward a bit.
I updated my post at the beginning of this thread with newer binaries for testing.

What has been added?
I mainly focused on handling dictionaries – added support for inflections (also old inflections scheme found in older dictionaries built with mobigen).
I also fixed many bugs related to parsing non-dictionary files.

While the project lacks extensive testing, mobitool successfully unpacked all the files I managed to grab.
For dictionaries I made tests on files found on mobileread forums, as well as dictionaries from my own kindle (dedrmed).

For developers:
I want to share some of my findings here. I hope I am not reinventing the wheel, but I couldn't find this information anywhere.

First thing is the inflection scheme found in some older dictionaries. It uses tag 7 in the inflections index. The tag contains pairs of values: an offset of the inflection rule in the ctoc record and the length of the string. It seems that each rule must be applied to all the entries in the orthographic index which end with a string matching the entry label in the inflection index. This old scheme was probably lossy: it is impossible to recreate the exact source of the inflection rules. However, the decompiled source should still produce the same compiled dictionary file. This lossiness may be observed by searching an old-scheme dictionary on Kindle (I did it on Kindle.app). For example, a search for the nonexistent word "guidarka" in The New Oxford American Dictionary will bring up the entry "guitar" :) Why? I leave it as a riddle.
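
A minimal Python sketch of the old inflection scheme as described above, not the libmobi implementation. It assumes the tag 7 values arrive as a flat list, that offset and length can be used to slice the ctoc record directly, and that the text is UTF-8; all function names are hypothetical.

```python
def pair_up(values):
    """Group flat tag 7 values into (offset, length) pairs."""
    if len(values) % 2:
        raise ValueError("tag 7 should hold an even number of values")
    return list(zip(values[0::2], values[1::2]))


def rule_text(ctoc: bytes, offset: int, length: int, encoding: str = "utf-8") -> str:
    """Read one rule string from the ctoc record (assumes a plain slice)."""
    return ctoc[offset:offset + length].decode(encoding)


def entries_for_label(label: str, orth_entries):
    """Return orthographic entries ending with the inflection entry label."""
    return [entry for entry in orth_entries if entry.endswith(label)]
```

The suffix match is where the lossiness comes from: a label such as "tar" selects both "guitar" and "star", so a rule may end up applied to more entries than the original source intended.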

I also found a number of tags used in orthographic indices. Tags 22 and 25 contain offsets of main entry in the same index (link to it). Tags 5, 40, 53, 69, 70, 71 all point to ctoc entries in various indices (orth, names, keys). Generally they match source tags <idx:string\>, <idx:keys\>, <idx:orth format\> and others. I didn't implement their reconstruction, as this information is probably not so important.

Another thing I discovered is that some orthographic indices substitute some latin ligatures with custom replacements. It probably facilitates search. In older dictionaries these replacements are listed in the LIGT section of the header. These are four-byte entries: two bytes for the ligature and two bytes for the replacement. As far as I know, the section always lists the same five ligatures, irrespective of the ligatures used in the document. In newer documents the section is missing, but the replacement is still being done. I didn't find the list of replacements anywhere in the document in such a case. I assume we still have to deal with the standard set of five ligatures. Reconstructed html documents with ligatures that haven't been repaired contain characters with codes in the range 0x1–0x5.
The ligatures are decomposed and the first character is replaced by a control character. The five cases are:
OE => 0x1,
oe => 0x2,
AE => 0x3,
ae => 0x4,
ss => 0x5.
So instead of ligature "Œ" there are 2 bytes: 0x1, 0x45.
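
A minimal Python sketch of how such a repair might look, not the libmobi code. Only the first pair (0x1 followed by "E") is confirmed by the example above; the second character of the other four pairs is inferred from the rule that the first character is replaced, and the function name is hypothetical.

```python
# (control character, following character) -> restored ligature.
# Only the first entry is confirmed above; the rest are inferred.
LIGATURES = {
    ("\x01", "E"): "Œ",
    ("\x02", "e"): "œ",
    ("\x03", "E"): "Æ",
    ("\x04", "e"): "æ",
    ("\x05", "s"): "ß",
}


def repair_ligatures(text: str) -> str:
    """Replace control-character pairs with the ligatures they stand for."""
    out = []
    i = 0
    while i < len(text):
        pair = (text[i], text[i + 1]) if i + 1 < len(text) else None
        if pair in LIGATURES:
            out.append(LIGATURES[pair])
            i += 2  # consume both the control character and the letter
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

A control character that is not followed by the expected letter is left untouched, so unexpected input is not silently rewritten.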

That's all for this quick summary. I will be glad to answer any questions if anything is not clear.
Old 11-17-2014, 03:05 PM   #14
eschwartz
Ex-Helpdesk Junkie
Posts: 19,422
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Quote:
Originally Posted by baf View Post
Great! Where do I find the latest testing versions of KindleUnpack?
I found it. Is there any public repo for development, like on github?
Thanks!
There is, now: https://github.com/kevinhendricks/KindleUnpack/
Old 11-19-2014, 02:15 AM   #15
baf
Evangelist
Posts: 404
Karma: 2200000
Join Date: May 2012
Device: kt
Quote:
Originally Posted by eschwartz View Post
Thanks!