Can Mobi books be 'exploded'


AnemicOak
03-06-2009, 02:44 PM
I'm just learning the ins and outs of the Mobi format now that I have a K2. I'm wondering if a Mobi book can be 'exploded' in any way, like a lit can be. The book I'm reading right now has TM's (trademark symbol) in place of asterisks where asterisks are used to separate part of a chapter (sorry, don't know the proper term). I'd like to fix these, but don't really want to have to change the Mobi to HTML & completely recreate the book.

With a Lit I'd just explode it, edit the HTML and then re-build it with Readerworks.

Any ideas?

pilotbob
03-06-2009, 02:46 PM
The mobiperl toolset has a mobi2html.

https://dev.mobileread.com/trac/mobiperl

I also think that calibre has an any2oeb that will do this.

BOb

JSWolf
03-06-2009, 02:50 PM
Use mobi2oeb that comes with Calibre.

AnemicOak
03-06-2009, 02:50 PM
Thanks. I guess I'm hoping to get a whole package (HTML, jpeg, oeb) and not just the straight HTML that mobi2html gives you. I'll check out any2oeb.

pilotbob
03-06-2009, 02:52 PM
Thanks. I guess I'm hoping to get a whole package (HTML, jpeg, oeb) and not just the straight HTML that mobi2html gives you. I'll check out any2oeb.

The doc says:

A script to explode a DRM-free MobiPocket file to html. If no unpack directory is specified the directory unpacked in the current directory will be used.

BOb

tompe
03-06-2009, 03:01 PM
Thanks. I guess I'm hoping to get a whole package (HTML, jpeg, oeb) and not just the straight HTML that mobi2html gives you. I'll check out any2oeb.

mobi2html gives you everything, except that the oeb information is not in the file, so the script does not try to guess what it should look like. You get the html code and all the images.

bwaldron
03-06-2009, 03:12 PM
Use mobi2oeb that comes with Calibre.

Yes, that does a better job in getting the complete "package."

AnemicOak
03-06-2009, 03:19 PM
Thanks gang. I used mobi2oeb and got what I was hoping for, except the resulting opf wouldn't open in Mobi Creator to rebuild the book. I used oeb2mobi instead and it worked, kinda. The formatting (font & spacing) isn't exactly what the original had, but it's good otherwise.

bwaldron
03-06-2009, 03:22 PM
Thanks gang. I used mobi2oeb and got what I was hoping for except the resulting opf wouldn't open in Mobi Creator to rebuild the book.

That's odd -- I've never had that happen with output from mobi2oeb. Guess you found an "edge case."

pilotbob
03-06-2009, 03:22 PM
Wait, what?

You mean you can't open an existing Mobipocket book in Mobipocket Creator in order to edit it?

BOb

bwaldron
03-06-2009, 03:23 PM
You mean you can't open an existing Mobipocket book in Mobipocket Creator in order to edit it?

No, I don't think you can.

AnemicOak
03-06-2009, 03:23 PM
Wait, what?

You mean you can't open an existing Mobipocket book in Mobipocket Creator in order to edit it?

BOb

Not that I could figure out.

pilotbob
03-06-2009, 03:24 PM
No, I don't think you can.

Ok, that is lame! Or is mobi creator not an editor and just a transformation tool? I have never used it.

BOb

AnemicOak
03-06-2009, 03:25 PM
That's odd -- I've never had that happen with output from mobi2oeb. Guess you found an "edge case."

Yeah, I got about 100 css errors when it was generating. The end file should work for my purposes though.

bwaldron
03-06-2009, 03:27 PM
Ok, that is lame! Or is mobi creator not an editor and just a transformation tool? I have never used it.

Nope, not an editor -- it "creates" mobi books from other formats.

tompe
03-06-2009, 04:26 PM
Yes, that does a better job in getting the complete "package."

Why?

Exactly what is the difference between using mobi2html and then mobigen or MobiPocket creator on the resulting html file and images compared to using calibre?

Since calibre processes the files more than mobi2html does, you increase the risk of not getting the same result back when using calibre-based tools.

bwaldron
03-06-2009, 05:49 PM
Why?

Exactly what is the difference between using mobi2html and then mobigen or MobiPocket creator on the resulting html file and images compared to using calibre?

Since calibre processes the files more than mobi2html does, you increase the risk of not getting the same result back when using calibre-based tools.

As I said, getting the complete package (opf & jpg). I use and like both tools.

tompe
03-06-2009, 06:06 PM
As I said, getting the complete package (opf & jpg). I use and like both tools.

What you get is a reconstructed opf file which is not the one used to create the file. Why do you need that file?

mobi2html is complete in the sense that you get all the information that is in the file.

Having said that, there is actually a reason to prefer mobi2oeb: if you have utf-8 characters in the MobiPocket book (and the book is tagged as utf-8), then I think there are still some bugs or limitations in mobi2html. But these books are not so common.
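
For anyone who wants to check whether a book is tagged as utf-8 before picking a tool, here is a minimal sketch. It assumes the commonly documented file layout (Palm database header with the record count at byte 76 and the first record offset at byte 78, then a "MOBI" identifier 16 bytes into record 0 and the text-encoding field 28 bytes in). The function name `mobi_text_encoding` is made up for illustration and is not part of mobiperl or calibre.

```python
import struct

# Encoding codes as commonly documented for the MOBI header (assumption).
ENCODINGS = {1252: "cp1252", 65001: "utf-8"}


def mobi_text_encoding(data):
    """Return the encoding the file is tagged with, or None if there is no MOBI header."""
    # Number of records in the Palm database header.
    (num_records,) = struct.unpack_from(">H", data, 76)
    if num_records == 0:
        return None
    # Offset of record 0, taken from the first record-info entry.
    (rec0,) = struct.unpack_from(">I", data, 78)
    # A plain PalmDOC prc has no "MOBI" identifier after the 16-byte PalmDOC header.
    if data[rec0 + 16:rec0 + 20] != b"MOBI":
        return None
    (code,) = struct.unpack_from(">I", data, rec0 + 28)
    return ENCODINGS.get(code, "unknown (%d)" % code)
```

To use it, read the whole file in binary mode and pass the bytes in, e.g. `mobi_text_encoding(open("book.mobi", "rb").read())`, where `book.mobi` is a placeholder name.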

bwaldron
03-06-2009, 07:12 PM
What you get is a reconstructed opf file which is not the one used to create the file. Why do you need that file?


I've got my entire library of over 1000 ebooks in calibre. Not all of them have full metadata in the original files, but it is all complete in Calibre. I can save to disk from Calibre and then run mobi2oeb with full metadata preserved for Mobipocket Creator by just opening the opf file.

So I don't need it, but it is a convenience factor. I haven't found any significant differences in the regenerated mobi files after editing regardless of whether I use mobi2oeb or mobi2html (which, understand, I am not denigrating).

tompe
03-06-2009, 07:18 PM
I've got my entire library of over 1000 ebooks in calibre. Not all of them have full metadata in the original files, but it is all complete in Calibre. I can save to disk from Calibre and then run mobi2oeb with full metadata preserved for Mobipocket Creator by just opening the opf file.


Ah, I forgot that. I was wrong. I thought I had implemented this functionality, but now I remember that I had only planned to do it. For a prc book (not MobiPocket format) without metadata, all information is saved by mobi2html (and since the example that started this thread seemed to be such a book, I was only thinking about that case). It is not saved for a MobiPocket book that has metadata.

EowynCarter
03-14-2009, 03:45 PM
Well, that one didn't help me.

I mean this one:
http://www.mobipocket.com/forum/viewtopic.php?t=14952

I converted it to html. But, bad luck, there is nothing to tell the wrong line feeds apart from the right ones.
Any suggestions?

tompe
03-14-2009, 03:51 PM
Well, that one didn't help me.

I mean this one:
http://www.mobipocket.com/forum/viewtopic.php?t=14952

I converted it to html. But, bad luck, there is nothing to tell the wrong line feeds apart from the right ones.
Any suggestions?

Well, you can probably get it close to correct by removing every line feed that does not have a "." before it, or by removing every line feed that is not followed by a word that starts with a capital letter.
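
A rough sketch of those two heuristics in Python, assuming the text has already been pulled out of the html as plain lines. The function names are made up for illustration, and both rules will misfire on abbreviations, proper names at the start of a line, dialogue and blank lines, so treat the result as a first pass only.

```python
import re


def join_unless_after_period(text):
    """Replace every line feed that is not preceded by '.' with a space."""
    return re.sub(r"(?<!\.)\n", " ", text)


def join_unless_before_capital(text):
    """Replace every line feed that is not followed by an ASCII capital letter with a space."""
    return re.sub(r"\n(?![A-Z])", " ", text)


sample = "It was a dark\nand stormy night.\nThe rain fell."
print(join_unless_after_period(sample))
print(join_unless_before_capital(sample))
```

The two rules disagree when a broken line happens to continue with a capitalised word (a name, for example), so it may be worth running both and comparing the output.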

tompe
03-14-2009, 03:52 PM
Well, that one didn't help me.

I mean this one:
http://www.mobipocket.com/forum/viewtopic.php?t=14952

I converted it to html. But, bad luck, there is nothing to tell the wrong line feeds apart from the right ones.
Any suggestions?

Check if the html code has a paragraph division.

EowynCarter
03-14-2009, 05:36 PM
Actually, it opens the <p> and doesn't close it.

Using capital letters would probably give some result, but I can't get the regexp right.

EowynCarter
03-15-2009, 06:25 AM
This is what I finally used. Not perfect, but the book is readable again.
I wish I didn't have to do the editor's job! I'm going to ask for a refund, or a proper prc.

<p height="0"><font color="black" face="Times New Roman" size="3"><span><font color="black">(([^a-z])(.)*[.,;:!?"])<o:p></o:p></font></span></font><div height="0"></div>
replace by -> <p>$1</p>

<p height="0"><font color="black" face="Times New Roman" size="3"><i><span><font color="black">(([^a-z])(.)*[.,;:!?"])<o:p></o:p></font></span></i></font><div height="0"></div>
replace by -> <p>$1</p>

<p height="0"><font color="black" face="Times New Roman" size="3"><span><font color="black">(([^a-z])(.)*)<o:p></o:p></font></span></font><div height="0"></div>
replace by -> <p>$1

<p height="0"><font color="black" face="Times New Roman" size="3"><i><span><font color="black">(([^a-z])(.)*)<o:p></o:p></font></span></i></font><div height="0"></div>
replace by -> <p>$1

<p height="0"><font color="black" face="Times New Roman" size="3"><span><font color="black">((.)*[.,;:!?"])<o:p></o:p></font></span></font><div height="0"></div>
replace by -> $1</p>

<p height="0"><font color="black" face="Times New Roman" size="3"><i><span><font color="black">((.)*[.,;:!?"])<o:p></o:p></font></span></i></font><div height="0"></div>
replace by -> $1</p>

<p height="0"><font color="black" face="Times New Roman" size="3"><span><font color="black">((.)*)<o:p></o:p></font></span></font><div height="0"></div>
replace by -> $1

<p height="0"><font color="black" face="Times New Roman" size="3"><i><span><font color="black">((.)*)<o:p></o:p></font></span></i></font><div height="0"></div>
replace by -> $1
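
For anyone who would rather script this than run eight search-and-replace passes by hand, here is a sketch that folds the rules above into one Python pass. It assumes every fragment is wrapped exactly as in the patterns above (with or without the `<i>` pair) and, like the rules above, it drops the italics. `clean` and `rewrap` are names made up for this example.

```python
import re

# The wrapper markup from the patterns above; the <i>...</i> pair is optional.
PREFIX = (r'<p height="0"><font color="black" face="Times New Roman" size="3">'
          r'(?:<i>)?<span><font color="black">')
SUFFIX = r'<o:p></o:p></font></span>(?:</i>)?</font><div height="0"></div>'
FRAGMENT = re.compile(PREFIX + r"(.*?)" + SUFFIX)

END_PUNCT = '.,;:!?"'


def rewrap(match):
    """Open a <p> if the fragment does not start with a-z, close it if it ends with punctuation."""
    text = match.group(1)
    starts_paragraph = bool(text) and not ("a" <= text[0] <= "z")
    ends_paragraph = bool(text) and text[-1] in END_PUNCT
    return ("<p>" if starts_paragraph else "") + text + ("</p>" if ends_paragraph else "")


def clean(html):
    """Apply the rewrap rule to every wrapped fragment in the html string."""
    return FRAGMENT.sub(rewrap, html)
```

A fragment that starts with a capital and ends with punctuation becomes a full `<p>...</p>`, one that only starts with a capital opens a paragraph, one that only ends with punctuation closes it, and anything else is left as bare continuation text.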

JSWolf
03-15-2009, 09:13 AM
Why?

Exactly what is the difference between using mobi2html and then mobigen or MobiPocket creator on the resulting html file and images compared to using calibre?

Since calibre processes the files more than mobi2html does, you increase the risk of not getting the same result back when using calibre-based tools.

Mobi2oeb will work if the file is compressed with the higher level of compression; mobi2html won't. That, to me, is the main difference.
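
If you want to see which compression a given file uses before choosing a tool, here is a minimal sketch. It assumes the commonly documented layout, where the first two bytes of record 0 hold the compression code (1 = none, 2 = PalmDOC, 17480 = HUFF/CDIC, which I take to be the "higher level" meant here). `compression_type` is a name made up for this example, not a function from calibre or mobiperl.

```python
import struct

# Compression codes as commonly documented for the PalmDOC header (assumption).
COMPRESSION_NAMES = {1: "none", 2: "PalmDOC", 17480: "HUFF/CDIC"}


def compression_type(data):
    """Return a readable name for the compression code stored at the start of record 0."""
    # Record count lives at byte 76 of the Palm database header.
    (num_records,) = struct.unpack_from(">H", data, 76)
    if num_records == 0:
        raise ValueError("file has no records")
    # The first record-info entry (byte 78) starts with the offset of record 0.
    (rec0,) = struct.unpack_from(">I", data, 78)
    (code,) = struct.unpack_from(">H", data, rec0)
    return COMPRESSION_NAMES.get(code, "unknown (%d)" % code)
```

Pass it the raw bytes of the file, read in binary mode.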

kevindorsey
03-18-2009, 04:16 PM
A few of my questions were answered, thanks JSWolf.