01-07-2009, 05:58 PM | #46 | |
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköpng, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
|
Quote:
Are these characters and the trailing data part of the record size or are they outside the specified record size? |
|
01-07-2009, 06:00 PM | #47 | |
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköpng, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
|
Quote:
|
|
|
01-07-2009, 07:03 PM | #48 |
Feedbooks.com Co-Founder
Posts: 2,263
Karma: 145123
Join Date: Nov 2006
Location: Paris, France
Device: Sony PRS-t-1/350/300/500/505/600/700, Nexus S, iPad
|
|
01-07-2009, 07:03 PM | #49 | ||
Reticulator of Tharn
Posts: 618
Karma: 400000
Join Date: Jan 2007
Location: EST
Device: Sony PRS-505
|
Quote:
Quote:
As I understood it, the "record size" was just the distance to the next record. In which case yes, they are part of the record they follow. |
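A minimal sketch of that reading of "record size", assuming the record start offsets have already been read from the PalmDB record list and that the last record runs to the end of the file. The function name and the sample offsets are made up for illustration.

```python
def record_sizes(offsets, file_length):
    """Size of each record, taken as the distance to the next record's start.

    `offsets` are record start offsets in ascending order; the last record
    is assumed to extend to `file_length`.
    """
    ends = list(offsets[1:]) + [file_length]
    return [end - start for start, end in zip(offsets, ends)]


# Hypothetical offsets, not taken from a real file.
sizes = record_sizes([78, 100, 4196], 5000)
print(sizes)
```

Under this reading, any trailing bytes sit inside the computed size and have to be stripped before the record is decompressed.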
||
01-07-2009, 07:10 PM | #50 |
Reticulator of Tharn
Posts: 618
Karma: 400000
Join Date: Jan 2007
Location: EST
Device: Sony PRS-505
|
|
|
01-07-2009, 07:30 PM | #51 |
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköpng, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
|
|
01-07-2009, 09:08 PM | #52 | |
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköpng, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
|
Quote:
My code now is the following and I wondered if this is a correct understanding of it:
Code:
eval {
    sub min { return ($_[0]<$_[1]) ? $_[0] : $_[1] }
    my $maxi = min($#$recs, $header->{'records'});
    for( my $i = 1; $i <= $maxi; $i ++ ) {
        my $data = $recs->[$i]->{'data'};
        my $len = length($data);
        my $overlap = "";
        if ($self->{multibyteoverlap}) {
            my $c = chop $data;
            print STDERR "I:$i - $len - ", int($c), "\n";
            my $n = $c & 7;
            foreach (0..$n-1) {
                $overlap .= chop $data;
            }
        }
        $body .= _decompress_record( $header->{'version'}, $data );
        $body .= $overlap;
    }
};
|
|
01-07-2009, 09:42 PM | #53 | ||
Reticulator of Tharn
Posts: 618
Karma: 400000
Join Date: Jan 2007
Location: EST
Device: Sony PRS-505
|
Quote:
Code:
<trailing multibyte bytes><multibyte size & flags><trailing data><size>
Quote:
My error. I did byte & 3 to get the size, and for some reason when I was translating the info into the wiki I turned that into 3 bits. It is only 2 bits (which I have updated the wiki to reflect). |
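To illustrate why the correction matters: a 3-bit mask (`& 7`) and a 2-bit mask (`& 3`) give different sizes whenever bit 2 of the size/flags byte is set. The byte value below is made up for illustration and is not taken from a real file.

```python
# Hypothetical size/flags byte with bit 2 set (binary 0000 0101).
size_and_flags = 0b0000_0101

three_bit_size = size_and_flags & 0b111  # what "3 bits" would read
two_bit_size = size_and_flags & 0b11     # what "byte & 3" reads

print(three_bit_size, two_bit_size)
```

With the 3-bit mask a reader would strip five bytes instead of one, eating into the compressed record data.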
||
01-07-2009, 11:05 PM | #54 | |
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
|
Quote:
|
|
01-08-2009, 06:39 AM | #55 |
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköpng, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
|
But how do I then detect how many bytes there are in the trailing multibyte bytes? How can I know for sure which byte is the one giving the number of bytes? Or can you parse it in reverse order so that it is not ambiguous?
|
01-08-2009, 08:16 AM | #56 | |
Reticulator of Tharn
Posts: 618
Karma: 400000
Join Date: Jan 2007
Location: EST
Device: Sony PRS-505
|
Quote:
This is Calibre's current code for finding the total size of the trailing entries:
Code:
def sizeof_trailing_entries(self, data):
    def sizeof_trailing_entry(ptr, psize):
        bitpos, result = 0, 0
        while True:
            v = ord(ptr[psize-1])
            result |= (v & 0x7F) << bitpos
            bitpos += 7
            psize -= 1
            if (v & 0x80) != 0 or (bitpos >= 28) or (psize == 0):
                return result

    num = 0
    size = len(data)
    flags = self.book_header.extra_flags >> 1
    while flags:
        if flags & 1:
            num += sizeof_trailing_entry(data, size - num)
        flags >>= 1
    if self.book_header.extra_flags & 1:
        num += (ord(data[size - num - 1]) & 0x3) + 1
    return num
|
|
01-08-2009, 11:16 AM | #57 |
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköpng, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
|
Thanks, it was as complicated as I suspected, then... These kinds of complications seem very odd, and I suspect that a specification of the MobiPocket format has not been released either because it does not exist or because they do not want to show the world how bad the format really is.
Is there a test file available somewhere where the extra flags are something other than 0x1? |
01-08-2009, 12:11 PM | #58 | |
Reticulator of Tharn
Posts: 618
Karma: 400000
Join Date: Jan 2007
Location: EST
Device: Sony PRS-505
|
Quote:
Attached is one I've generated with mobigen. |
|
01-08-2009, 12:22 PM | #59 | |
The Grand Mouse 高貴的老鼠
Posts: 71,510
Karma: 306214458
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
|
Oh - very useful stuff. And it turns out that the Mobipocket decoder will need some fixes for cases where bit position 1 is set. I can only suppose that very few commercial DRMed eBooks are out there with that bit set.
Happily, that is easy to fix given this code, once such a book turns up.
Quote:
|
|
01-08-2009, 01:05 PM | #60 |
frumious Bandersnatch
Posts: 7,516
Karma: 18512745
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
How do you deal with "font-variant: small-caps"? Do you convert <span class="small-caps">Foo Bar</span> into F<font size="-1">OO</font> B<font size="-1">AR</font> ?
I guess "text-transform: uppercase" is easier... (I once found an HTML book where many capital letters were "created" with this property, which meant that copy-pasting gave lowercase letters, it was a pain...) |
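A rough sketch of the conversion asked about above, assuming plain ASCII text with no nested markup inside the span. The function name is hypothetical and this is not how any particular converter does it.

```python
import re


def fake_small_caps(text: str) -> str:
    """Uppercase each run of lowercase ASCII letters and wrap it in a smaller font."""
    return re.sub(
        r"[a-z]+",
        lambda m: '<font size="-1">' + m.group().upper() + "</font>",
        text,
    )


print(fake_small_caps("Foo Bar"))
```

Like `text-transform: uppercase`, this changes what is actually stored versus what is displayed, only in the opposite direction: copy-pasting the result gives all capitals.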
|