#1 |
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköping, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
|
html2mobi (a mobigen replacement written in Perl)
When I realized that Perl has support for reading and writing mobi files, I was inspired to start writing a mobigen replacement today, since Perl is my favourite language.
Now, if a set of HTML files is given to the script, a table of contents is generated automatically. The script also takes an OPF file as input, and it now manages to generate a working mobi file for the Alice in Wonderland test example with working images (at least they work in FBReader).
The table of contents is not working properly yet, but I will look at that. Does anybody know the data structure for it? I can always just add it at the beginning, but if it is possible to do it correctly I will.
For now I just save the images in new records. Will this work? I seem to remember some limitations being mentioned about the size of a record.
I can make the first alpha version available in a couple of days, but first I want to test the script on some more examples. Does anybody have recommendations for files to test with, or know of any well-known issues I should check for?
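As an illustration of the automatic table of contents mentioned above, here is a minimal sketch in Python (not the Perl script described in this post, whose internals are not shown in the thread). It assumes that headings marked up as `<h1>` and `<h2>` are what should appear in the TOC, and that linking to the containing file is enough; `HeadingCollector` and `build_toc` are hypothetical names.

```python
from html import escape
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collects the text of every <h1> and <h2> element in one HTML file."""

    def __init__(self):
        super().__init__()
        self.headings = []   # list of (level, text) in document order
        self._level = None   # level of the heading currently open, if any
        self._parts = []     # text fragments seen inside the open heading

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2"):
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            # Collapse runs of whitespace inside the heading text.
            text = " ".join("".join(self._parts).split())
            if text:
                self.headings.append((self._level, text))
            self._level = None


def build_toc(files):
    """files: list of (filename, html_text). Returns the TOC as an HTML string.

    Assumption: each entry links to the file only, not to an anchor inside it.
    """
    lines = ["<h1>Table of Contents</h1>"]
    for name, html_text in files:
        collector = HeadingCollector()
        collector.feed(html_text)
        collector.close()
        for level, text in collector.headings:
            indent = "&nbsp;" * (4 * (level - 1))
            lines.append(
                f'<p>{indent}<a href="{escape(name)}">{escape(text)}</a></p>'
            )
    return "\n".join(lines)
```

The resulting string could be written out as a `toc.html` and listed in the OPF manifest and spine like any other content file.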
#2 |
Resident Curmudgeon
Posts: 78,968
Karma: 144284074
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
How will your script handle images? Will they be the same size in the script-generated mobi book?
#3 |
creator of calibre
Posts: 45,156
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Are you actually parsing the HTML and recreating it or just packaging it into a mobi?
#4 | |
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköping, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
|
Quote:
But I have not actually found a specification of the allowed HTML code. I was going to take the approach that whatever works on my Gen3 is allowed...
#5 | |
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköping, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
|
Quote:
Do you always want to maximize the image size according to the reading device? Or should you add some size specification in the img tag? |
#6 |
reader
Posts: 6,977
Karma: 5183568
Join Date: Mar 2006
Location: Mississippi, USA
Device: Kindle 3, Kobo Glo HD
|
MobiPocket does have PRCGEN Documentation, which provides some information about the supported HTML.
You have probably already seen MobiPocket TOC using mobigen and Images in MobiPocket. In particular, a toc.html appears to be required for mobigen to create a TOC and it is inserted at the end of the .mobi file. An automatic TOC would be a useful addition, and yet another reason to prefer html2mobi over mobigen. I have never seen the hisrc attribute (Image support and display) used for an image in an actual MOBI file, but it might be one way to add a larger image to a MOBI file while maintaining backward compatibility. It might be enough, though, to have a default image size, or have html2mobi honor width & height larger than the image by rescaling the image (note that the reader ignores width & height larger than the image). |
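To make the rescaling idea concrete, here is a minimal Python sketch of the size arithmetic only: given the pixel size of an image and the width and height requested in the img tag (or a default box), it computes the dimensions to rescale to while preserving the aspect ratio. The function name `fit_within` and the `allow_upscale` switch are hypothetical, and the actual resampling would need an imaging library that is left out here.

```python
def fit_within(width, height, max_width, max_height, allow_upscale=True):
    """Return (w, h) scaled to fit inside max_width x max_height.

    The aspect ratio is preserved. With allow_upscale=False the image is
    never enlarged, which mirrors a reader ignoring width and height values
    larger than the image itself.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    scale = min(max_width / width, max_height / height)
    if not allow_upscale:
        scale = min(scale, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, `fit_within(300, 400, 600, 800)` gives `(600, 800)`, while the same call with `allow_upscale=False` leaves the image at `(300, 400)`.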
#7 |
reader
Posts: 6,977
Karma: 5183568
Join Date: Mar 2006
Location: Mississippi, USA
Device: Kindle 3, Kobo Glo HD
|
A mobi2html that explodes MOBI to HTML would also be useful. It would obviously only work on DRM-free PRC and MOBI files. The easiest option would just be to extract the single HTML file and the images, with the images correctly referenced in the HTML. Better would be to extract the .opf file from the HTML preamble. Note that mobi2epub would then be a simple addition, or just use the existing oeb2epub.py in combination with mobi2html.
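The "images correctly referenced" step could look roughly like the Python sketch below. It assumes the extracted HTML refers to images through a `recindex` attribute on `<img>` tags and that the caller knows which extracted file each index corresponds to; how `recindex` values map onto records in the file is not established here, so the mapping is passed in as a function. `link_images` and `name_for_index` are hypothetical names.

```python
import re

IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
RECINDEX = re.compile(r'\brecindex\s*=\s*"?(\d+)"?', re.IGNORECASE)
SRC = re.compile(r'\bsrc\s*=\s*("[^"]*"|[^\s>]+)\s*', re.IGNORECASE)


def link_images(html_text, name_for_index):
    """Replace recindex attributes on <img> tags with src file references.

    name_for_index: callable taking the integer record index and returning
    the filename the image was extracted to (assumed to be known already).
    """
    def fix_tag(match):
        tag = match.group(0)
        rec = RECINDEX.search(tag)
        if rec is None:
            return tag  # ordinary image tag, leave it untouched
        filename = name_for_index(int(rec.group(1)))
        tag = SRC.sub("", tag)  # drop any placeholder src attribute
        return RECINDEX.sub(lambda m: f'src="{filename}"', tag)

    return IMG_TAG.sub(fix_tag, html_text)
```

A caller that saved the images as `image00001.jpg`, `image00002.jpg`, and so on could pass `lambda i: f"image{i:05d}.jpg"` as the mapping.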
#8 | |
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköping, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
|
Quote:
#9 | |
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköping, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
|
Quote:
For the Alice in Wonderland OPF, the TOC is inserted at the end because it is in the spine specification. And if it were not there, it would still have to be inserted because it is in the manifest specification. What I do not get is how to code things so that you get a button in FBReader for the TOC. I assume the guide tag has something to do with this. The GIF cover, which was 600x800, caused my Gen3 to hang, so I had to reboot it. I rescaled it a bit and saved it as a JPG instead, and that worked better.
#10 |
reader
Posts: 6,977
Karma: 5183568
Join Date: Mar 2006
Location: Mississippi, USA
Device: Kindle 3, Kobo Glo HD
|
An OPF preamble does seem to be optional. This is from the MobiPocket version of Ring of Fire from the Baen Free Library.
Code:
<HTML><HEAD><metadata>
<dc-metadata xmlns:dc="http://purl.org/metadata/dublin_core" xmlns:oebpackage="http://openebook.org/namespaces/oeb-package/1.0/">
<dc:Title>Ring of Fire</dc:Title>
<dc:Type>Novel</dc:Type>
<dc:Identifier id="ISBN-074347175X" scheme="ISBN-Hardcover">0-7434-7175-X</dc:Identifier>
<dc:Identifier id="ISBN13-9780743471756" scheme="ISBN13-Hardcover">978-0-7434-7175-6</dc:Identifier>
<dc:Identifier id="ISBN-1416509089" scheme="ISBN-Paperback">1-4165-0908-9</dc:Identifier>
<dc:Identifier id="ISBN13-9781416509080" scheme="ISBN13-Paperback">978-1-4165-0908-0</dc:Identifier>
<dc:Identifier id="DOI-074347175X" scheme="DOI">10.1125/Baen.074347175X</dc:Identifier>
<dc:Publisher>Baen Books</dc:Publisher>
<dc:Creator role="aut" file-as="Flint, Eric">Eric Flint</dc:Creator>
<dc:Contributor role="art" file-as="Blair, Dru">Dru Blair</dc:Contributor>
<dc:Subject>Science Fiction</dc:Subject>
<dc:Rights>2004 by Eric Flint</dc:Rights>
<dc:Date>2004-01-01</dc:Date>
<dc:Language>US English (en-us)</dc:Language>
</dc-metadata>
</metadata>
<GUIDE>
<REFERENCE TYPE="toc" TITLE="Table of Contents" HREF="074347175X_top.htm" filepos="0001692887">
<REFERENCE TYPE="cover" TITLE="Cover" HREF="074347175X__i_.htm" filepos="0000001553">
<REFERENCE TYPE="copyright-page" TITLE="Copyright" HREF="074347175X__p_.htm" filepos="0000001785">
<REFERENCE TYPE="firstpage" TITLE="First Page" HREF="074347175X__p_.htm#Chap_0" filepos="0000004946">
</GUIDE>
<METADATA HREF="xyz_metadata.htm" filepos="0001694500"><hr></HEAD><BODY>
<h1 align="center"><img src="BMP" recindex="00001"><br /> Ring of Fire<br /> by<br />Eric Flint</H1>
<p align="center"><A HREF="074347175X_top.htm" filepos="0001692887">Table of Contents</A></P>

Another way to generate "typical" MOBI books would be to run mobigen.exe on an exploded LIT file, and compare the result to using html2mobi. In the case of Baen books, you can use the LIT version and compare the result to their MOBI version.
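For anyone who wants to inspect the GUIDE block of a book like the one quoted above, here is a small Python sketch that pulls the REFERENCE entries out of the extracted HTML. It only reads the attributes that appear in the sample; treating `filepos` as a numeric offset into the book text is an assumption on my part, and `parse_guide` is a hypothetical helper name.

```python
import re

REFERENCE = re.compile(r"<reference\b([^>]*)>", re.IGNORECASE)
ATTRIBUTE = re.compile(r'([\w-]+)\s*=\s*"([^"]*)"')


def parse_guide(html_text):
    """Return one dict per <REFERENCE ...> tag, with lower-cased keys.

    filepos is converted to int (assumed to be an offset into the text);
    all other attribute values are kept as strings.
    """
    entries = []
    for ref in REFERENCE.finditer(html_text):
        attrs = {key.lower(): value for key, value in ATTRIBUTE.findall(ref.group(1))}
        if "filepos" in attrs:
            attrs["filepos"] = int(attrs["filepos"])
        entries.append(attrs)
    return entries
```

Looking up the entry whose `type` is `toc` then gives the position that a reader's table-of-contents button would presumably jump to.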
Last edited by wallcraft; 11-25-2007 at 08:39 PM. |
#11 | |
Wizard
Posts: 3,442
Karma: 300001
Join Date: Sep 2006
Location: Belgium
Device: PRS-500/505/700, Kindle, Cybook Gen3, Words Gear
|
Quote:
I'm going to do a post on internals of mobi format "soon"... |
#12 | |
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköping, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
|
Quote:
I actually got mobi2html to work. Use it as:
Code:
perl mobi2html Alice_In_Wonderland.mobi > Alice.html
The images should work, but there are some problems with the rendering of the "wave text". I can attach the script if anybody is interested in playing around with it. How do I attach a file called mobi2html?
#13 |
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköping, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
|
Do you know how the "library" image is specified? I noticed that when I had 7 images in the document, the library image was the record directly after the 7th image record.
#14 |
Groupie
Posts: 189
Karma: 793
Join Date: Oct 2006
|
Last edited by andym; 11-26-2007 at 03:26 AM.
#15 | |
Wizard
Posts: 3,442
Karma: 300001
Join Date: Sep 2006
Location: Belgium
Device: PRS-500/505/700, Kindle, Cybook Gen3, Words Gear
|
Quote:
I did notice that one of my books had a cover image that does not actually appear in the .mobi file... so it seems it's downloaded from the server and is stored separately in the Covers folder. |