Old 01-07-2009, 09:05 AM   #31
tompe
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköpng, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
Quote:
Originally Posted by nrapallo View Post
*Thank you* for finally confirming my suspicion that the byte count to the filepos/link is "off" in mobi2html (and consequently in Mobi2IMP). I've sometimes had to add up to 200 extra bytes to find the "anchor" tag the filepos was referring to in my conversions from .prc to .imp. I had no idea why I had to do this and never would have thought the UTF-8 decoding could have precipitated this, but it makes awfully good sense to me now that you've mentioned it!

My Mobi2IMP solution (a naive brute-force approach) was to scan forward in the uncompressed text (html) from the stated filepos position and look for the first '<' to plop the anchor (for that filepos)! 99% of the time it worked, but it was neither elegant nor foolproof!
If you give me a pointer to or a file with this problem I can see if it is easy to fix.
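To illustrate the effect described in the quote, here is a minimal sketch of how a byte offset and a character offset drift apart once UTF-8 text has been decoded. The sample markup and the helper name `byte_to_char_offset` are made up for illustration; nothing here is taken from mobi2html or Mobi2IMP.

```python
def byte_to_char_offset(data: bytes, byte_offset: int, encoding: str = "utf-8") -> int:
    """Convert a byte offset into `data` to an index into the decoded string."""
    # Decode only the prefix; errors="ignore" drops a partial character
    # if the offset happens to land inside a multi-byte sequence.
    return len(data[:byte_offset].decode(encoding, errors="ignore"))


# Hypothetical sample: two accented letters, each 2 bytes in UTF-8.
html_bytes = '<p>Caf\u00e9 na\u00efve</p><a name="x">'.encode("utf-8")
filepos = html_bytes.index(b"<a")   # a filepos-style byte offset
text = html_bytes.decode("utf-8")

# Slicing the raw bytes at filepos finds the tag...
print(html_bytes[filepos:filepos + 2])
# ...but the same number used as a string index overshoots it.
print(repr(text[filepos:filepos + 2]))
# Converting the offset first lands on the tag again.
char_pos = byte_to_char_offset(html_bytes, filepos)
print(text[char_pos:char_pos + 2])
```

The drift grows by one for every extra byte in each multi-byte character before the target, so resolving filepos values against the raw bytes before decoding avoids the problem entirely.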
Old 01-07-2009, 09:50 AM   #32
nrapallo
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
Quote:
Originally Posted by tompe View Post
If you give me a pointer to or a file with this problem I can see if it is easy to fix.
Sure can!

Most Feedbooks.com Mobipocket/Kindle offerings have this problem (as they usually are UTF-8 encoded).

For example, see The Ant King and Other Stories ebook (.mobi).

The extracted files/results of Mobi2IMP are included in the .zip below. The .txt file below shows the DOS window output of Mobi2IMP. In particular, look at the section following:
Code:
Adding name attributes
FIXED 3: 0000026250 (6) - Wasn't an anchor: reak/><a
Note that the number in parentheses (6 here) shows how many characters past the stated filepos I had to go to find the first "<". If I could find one within the first 200-300 bytes, I would print FIXED; otherwise I would just issue a WARNING.

I've always seen this behaviour with Feedbooks.com .prc/.mobi ebooks.
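For anyone who wants to reproduce the workaround, here is a minimal sketch of the scan-forward heuristic as described above. It is not the actual Mobi2IMP code; the function name `find_anchor`, the return shape, and the default window of 300 bytes are assumptions chosen to match the description.

```python
def find_anchor(html: bytes, filepos: int, window: int = 300):
    """Look for the first '<' at or after filepos, within `window` bytes.

    Returns (status, position, skipped):
      "OK"      - filepos already points at a '<'
      "FIXED"   - a '<' was found `skipped` bytes further on
      "WARNING" - no '<' within the window; position is left at filepos
    """
    end = min(len(html), filepos + window)
    pos = html.find(b"<", filepos, end)
    if pos == -1:
        return ("WARNING", filepos, None)
    if pos == filepos:
        return ("OK", pos, 0)
    return ("FIXED", pos, pos - filepos)


# The log line above shows "reak/><a" at the stated filepos: the tail of a
# tag, then the real anchor six bytes later.
sample = b'reak/><a name="x">'
print(find_anchor(sample, 0))
```

As noted, this only papers over the offset error. It can also pick the wrong tag whenever another '<' sits between the stated filepos and the intended anchor.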
Old 01-07-2009, 10:36 AM   #33
llasram
Reticulator of Tharn
Posts: 618
Karma: 400000
Join Date: Jan 2007
Location: EST
Device: Sony PRS-505
Quote:
Originally Posted by tompe View Post
If you give me a pointer to or a file with this problem I can see if it is easy to fix.
It looks like Mobiperl also isn't handling the "standard" variable-width integer encoded trailing data indicated by the other bits of the extra data flags field.
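A rough sketch of what handling that trailing data can look like, based on my understanding of how existing decoders treat it rather than on anything spelled out in this thread. The assumptions: each set bit above the lowest one in the extra data flags means one trailing entry; each entry ends with its own size as a variable-width integer read backwards from the end of the record (7 bits per byte, high bit marking the last byte read); and the lowest bit means the low two bits of the last remaining byte, plus one, give a further byte count. The names `read_backward_varint` and `trailing_size` are hypothetical.

```python
def read_backward_varint(record: bytes, end: int) -> int:
    """Decode a variable-width integer whose last byte is record[end - 1]."""
    value = 0
    shift = 0
    pos = end
    while pos > 0:
        pos -= 1
        byte = record[pos]
        value |= (byte & 0x7F) << shift   # 7 payload bits per byte
        shift += 7
        # A set high bit marks the final byte when reading backwards;
        # also stop after four bytes (assumed maximum width).
        if byte & 0x80 or shift >= 28:
            break
    return value


def trailing_size(record: bytes, flags: int) -> int:
    """Number of bytes to strip from the end of a text record (assumed layout)."""
    size = 0
    higher = flags >> 1
    while higher:
        if higher & 1:
            # Assumed: the stored size covers the whole entry, size bytes included.
            size += read_backward_varint(record, len(record) - size)
        higher >>= 1
    if flags & 1:
        last = len(record) - size - 1
        if last >= 0:
            size += (record[last] & 0x3) + 1
    return size


record = b"hello" + b"\xaa\xbb\x83"       # text plus one 3-byte trailing entry
clean = record[:len(record) - trailing_size(record, 0b10)]
print(clean)
```

Stripping these bytes before decompression matters because the decompressor would otherwise treat them as text.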
Old 01-07-2009, 11:21 AM   #34
pdurrant
The Grand Mouse 高貴的老鼠
Posts: 71,491
Karma: 306214458
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
I think these must be the bytes that were messing up the first version of the Mobipocket decoder, and that the second version tried to fix without really understanding them. (The fifth version seems to handle them correctly, although I still haven't followed exactly what's going on here.)

It would be very nice to get the controlling bits and format of the trailing bytes set out clearly in the wiki...


Quote:
Originally Posted by llasram View Post
It looks like Mobiperl also isn't handling the "standard" variable-width integer encoded trailing data indicated by the other bits of the extra data flags field.
Old 01-07-2009, 11:55 AM   #35
llasram
Reticulator of Tharn
Posts: 618
Karma: 400000
Join Date: Jan 2007
Location: EST
Device: Sony PRS-505
Quote:
Originally Posted by pdurrant View Post
I think these must be the bytes that were messing up the first version of the Mobipocket decoder, and that the second version tried to fix without really understanding them.
Indeed -- I'm not sure the community as a whole really understands them. Calibre's rules for parsing them look to be the same as mobidedrm 0.5's, except that Calibre will ignore the extra data flags field if the MOBI header is shorter than 0xe4 bytes or *longer* than 0xe8 bytes. I vaguely recall being responsible for the test case which led to that one, but I now suspect it may have been an interaction with an earlier version of mobidedrm. And neither handles bit 1 of the extra data flags, although Calibre will when Kovid gets around to pulling from lp:~llasram/calibre/staging.
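The header-length rule above can be written down as a small guard. This is only a sketch of the rule as stated in this post, not Calibre's actual code; the function name `effective_extra_flags` and its arguments are hypothetical.

```python
MIN_HEADER_LEN = 0xE4   # shorter headers: flags field is ignored
MAX_HEADER_LEN = 0xE8   # longer headers: flags field is also ignored


def effective_extra_flags(header_length: int, raw_flags: int) -> int:
    """Return the flags to honour, or 0 when the rule says to ignore them."""
    if header_length < MIN_HEADER_LEN or header_length > MAX_HEADER_LEN:
        return 0
    return raw_flags


print(effective_extra_flags(0xE8, 0b11))   # within range: flags are kept
print(effective_extra_flags(0xF8, 0b11))   # too long: flags are ignored
```

Whether the upper bound is really needed is exactly what is questioned above, so treat it as a description of current behaviour rather than a recommendation.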

Quote:
Originally Posted by pdurrant View Post
It would be very nice to get the controlling bits and format of the trailing bytes set out clearly in the wiki...
I've got a whole ream of stuff I figured out writing oeb2mobi that I need to add to the wiki... The effect of bit 1 of extra data flags, the format of "uncrossable" boundary records, the format of the FCIS record, and the format of the index records (although I'm still working on that...).
Old 01-07-2009, 12:06 PM   #36
kovidgoyal
creator of calibre
Posts: 43,835
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Quote:
Originally Posted by llasram View Post
And neither handles bit 1 of the extra data flags, although Calibre will when Kovid gets around to pulling from lp:~llasram/calibre/staging .
Done. I've also refactored mobi2oeb to use lxml instead of BeautifulSoup for a significant speedup
Old 01-07-2009, 01:59 PM   #37
tompe
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköpng, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
Quote:
Originally Posted by llasram View Post
It looks like Mobiperl also isn't handling the "standard" variable-width integer encoded trailing data indicated by the other bits of the extra data flags field.
I will wait for the wiki description of this...
Old 01-07-2009, 02:12 PM   #38
llasram
Reticulator of Tharn
Posts: 618
Karma: 400000
Join Date: Jan 2007
Location: EST
Device: Sony PRS-505
Quote:
Originally Posted by tompe View Post
I will wait for the wiki description of this...
Done.
Old 01-07-2009, 02:30 PM   #39
Elsi
Wizard
Posts: 2,366
Karma: 12000
Join Date: Jan 2008
Location: Texas, USA
Device: Kindle; Sony PRS 505; Blackberry 8700C
in the Kindle

You may already know how this displays in the Kindle, but here are some scans of the screen. (Kindle's screenshot function wasn't working; not sure why.) I scanned @ 300dpi, then reduced the image, exported to JPG with 15% optimization, so the fuzziness is due to the scanning, not the screen itself.

On two of the images, I circled a portion of the author field that doesn't display properly. Also, I really like the way the chapters begin 1/2 way down the page & hope to get your CSS so I can apply it to the books I'm making.
Attached Thumbnails:
Menu-screenshot.jpg (43.2 KB)
Cover-screenshot.jpg (40.6 KB)
TOC-screenshot.jpg (34.7 KB)
Chapter-screenshot.jpg (30.2 KB)
Poetry-screenshot.jpg (41.3 KB)
Old 01-07-2009, 02:59 PM   #40
llasram
Reticulator of Tharn
Posts: 618
Karma: 400000
Join Date: Jan 2007
Location: EST
Device: Sony PRS-505
Quote:
Originally Posted by Elsi View Post
You may already know how this displays in the Kindle, but here are some scans of the screen.
Awesome! Thank you. Looks pretty good... I may leave the cover generation at 600x800. Hmm...

Quote:
Originally Posted by Elsi View Post
On two of the images, I circled a portion of the author field that doesn't display properly.
Yarh... Already fixed. Although it looks like it's been re-arranged -- does the Kindle treat the ',' or '&' character specially, like trying to rearrange "Last, First" to "First Last"?

Quote:
Originally Posted by Elsi View Post
Also, I really like the way the chapters begin 1/2 way down the page & hope to get your CSS so I can apply it to the books I'm making.
Heh. I think that's actually a bug. It's supposed to be only 5 or so lines down, and shows up that way in Mobipocket Desktop. I'm actually hoping I'll have fixed it when I post another build of the file. But if that's what you want, you should be able to achieve the effect with a 'margin-top: 50%' property on your chapter headers.
Old 01-07-2009, 03:04 PM   #41
Jellby
frumious Bandersnatch
Posts: 7,515
Karma: 18512745
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
Quote:
Originally Posted by Elsi View Post
On two of the images, I circled a portion of the author field that doesn't display properly.
Funny, it seems it takes the semicolon as a separator for different authors, and the comma as the separator between first and last name, so:

Alexandre Dumas, p&egrave;re

(since the &egrave; entity is not decoded, it still contains a semicolon) is parsed as:

First author: p&egrave (first name) Alexandre Dumas (last name)
Second author: re (last name)
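The parsing guessed at above can be mimicked in a few lines. This is only a model of the observed behaviour (semicolon separates authors, comma separates "Last, First"), not the Kindle's actual logic; `parse_author_field` is a made-up name. `html.unescape` is the standard-library call for decoding entities before the field is written.

```python
import html


def parse_author_field(field: str):
    """Split an author field the way the device appears to: ';' then ','."""
    authors = []
    for chunk in field.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        if "," in chunk:
            # Treated as "Last, First".
            last, first = (part.strip() for part in chunk.split(",", 1))
        else:
            last, first = chunk, ""
        authors.append({"first": first, "last": last})
    return authors


raw = "Alexandre Dumas, p&egrave;re"
undecoded = parse_author_field(raw)               # entity's ';' splits the name
decoded = parse_author_field(html.unescape(raw))  # entity decoded first
print(undecoded)
print(decoded)
```

With the entity decoded there is no stray semicolon, although the comma still makes the name come out as "Last, First".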

Quote:
Also, I really like the way the chapters begin 1/2 way down the page & hope to get your CSS so I can apply it to the books I'm making.
I usually try to get something a bit more space-efficient...
Old 01-07-2009, 03:18 PM   #42
wallcraft
reader
Posts: 6,975
Karma: 5183568
Join Date: Mar 2006
Location: Mississippi, USA
Device: Kindle 3, Kobo Glo HD
Quote:
Originally Posted by llasram View Post
I may leave the cover generation at 600x800.
I suggest making this tunable. A default of 600x800 would be ok, although 525x640 works well on a wider range of devices.
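One way a tunable limit could look, sketched with plain arithmetic so it does not depend on any particular imaging library. The function name `fit_cover` and its parameters are hypothetical; the 600x800 default and the 525x640 alternative are the sizes mentioned above.

```python
def fit_cover(width: int, height: int, max_width: int = 600, max_height: int = 800):
    """Scale (width, height) down to fit the limits, keeping the aspect ratio.

    Images that already fit are returned unchanged (never scaled up).
    """
    scale = min(max_width / width, max_height / height, 1.0)
    return (max(1, round(width * scale)), max(1, round(height * scale)))


print(fit_cover(1200, 1600))              # default 600x800 limit
print(fit_cover(1200, 1600, 525, 640))    # smaller limit for other devices
print(fit_cover(300, 400))                # already small enough
```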
Old 01-07-2009, 04:56 PM   #43
Elsi
Wizard
Posts: 2,366
Karma: 12000
Join Date: Jan 2008
Location: Texas, USA
Device: Kindle; Sony PRS 505; Blackberry 8700C
Quote:
Originally Posted by Elsi View Post
Also, I really like the way the chapters begin 1/2 way down the page & hope to get your CSS so I can apply it to the books I'm making.
Quote:
Originally Posted by llasram View Post
Heh. I think that's actually a bug. It's supposed to be only 5 or so lines down, and shows up that way in Mobipocket Desktop. I'm actually hoping I'll have fixed it when I post another build of the file. But if that's what you want, you should be able to achieve the effect with a 'margin-top: 50%' property on your chapter headers.
Quote:
Originally Posted by Jellby View Post
I usually try to get something a bit more space-efficient...
I'll agree that 1/2 the page is too far down, but I've not been happy with the default placement as shown in this first image (from The Moving Picture Girls). I also would like to try something like the pseudo-watermark used in the commercial book Fortune and Fate by Sharon Shinn as shown in the second image.
Attached Thumbnails:
MovingPictureGirls-screenshot.jpg (48.5 KB)
FortuneAndFate-screenshot.jpg (42.0 KB)
Old 01-07-2009, 05:19 PM   #44
pdurrant
The Grand Mouse 高貴的老鼠
Posts: 71,491
Karma: 306214458
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
Oh - thank you. Do you have a sample file with bit one set? It looks to me like the mobipocket decoder will need adjustment to cope with such files.

Quote:
Originally Posted by llasram View Post
Old 01-07-2009, 05:39 PM   #45
tompe
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköpng, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
Quote:
Originally Posted by nrapallo View Post
Sure can!

Most Feedbooks.com Mobipocket/Kindle offerings have this problem (as they usually are UTF-8 encoded).

For example, see The Ant King and Other Stories ebook (.mobi).
I looked at the output from --rawhtml but could not find any UTF-8 characters... But there are null characters in the file. That is the data directly from the Perl module unpacking the compressed data, so this is probably related to something else. UTF-8 ought not to produce null characters.

I thought that the unpacking of the record was totally independent of the character set used. Right or wrong?
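On the last question: as far as I know, the PalmDoc compression used for these records works purely on bytes, so a minimal decompressor never needs to know the character set. The sketch below follows the commonly documented scheme; it is not the Perl module's code, and `palmdoc_decompress` is just an illustrative name.

```python
def palmdoc_decompress(data: bytes) -> bytes:
    """Byte-level PalmDoc (LZ77-style) decompression, no charset involved."""
    out = bytearray()
    i = 0
    while i < len(data):
        c = data[i]
        i += 1
        if 1 <= c <= 8:
            # Next c bytes are copied verbatim (used for non-ASCII bytes).
            out += data[i:i + c]
            i += c
        elif c < 0x80:
            # 0x00 and 0x09-0x7F are literals, including a null byte.
            out.append(c)
        elif c >= 0xC0:
            # A space followed by the character c XOR 0x80.
            out.append(0x20)
            out.append(c ^ 0x80)
        else:
            # 0x80-0xBF: two-byte back-reference (11-bit distance, 3-bit length).
            if i >= len(data):
                break
            pair = (c << 8) | data[i]
            i += 1
            distance = (pair >> 3) & 0x07FF
            length = (pair & 0x07) + 3
            if distance == 0 or distance > len(out):
                raise ValueError("invalid back-reference")
            for _ in range(length):
                out.append(out[-distance])
    return bytes(out)


print(palmdoc_decompress(b"abc\x80\x18"))     # back-reference repeats "abc"
print(palmdoc_decompress(b"\x02\xc3\xa9"))    # UTF-8 bytes pass through as-is
print(palmdoc_decompress(b"abc\x00"))         # a 0x00 byte comes out as a null
```

Note that a 0x00 byte is passed through as a literal, so any zero-valued bytes left on the end of a record before decompression would surface as nulls in the output. Whether that is what happened with this file is not established here.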