Old 11-12-2010, 07:01 AM   #1
oecherprinte
Zealot
oecherprinte began at the beginning.
 
Posts: 115
Karma: 20
Join Date: Jul 2010
Device: Kindle3 3G, Kindle Paperwhite 2
Extract table of contents from mobi file

Hi,

Are you aware of any way to extract a table of contents from a Mobi file? I would need the following information:

- chapter/section name
- location of chapter/section

Here is what I would love to do:

I would like to process the Kindle's "My Clippings" file and regroup each book's annotations under the chapters they belong to. Since I know the positions of the annotations, I could assign each one to a chapter once I know the chapter titles and start positions.
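
A minimal sketch of the grouping step, assuming the chapter start positions and the annotation positions have already been extracted and are expressed in the same unit. The function name and the sample data are hypothetical; parsing "My Clippings" itself is not shown.

```python
from bisect import bisect_right
from collections import defaultdict


def group_by_chapter(chapters, annotations):
    """Assign each annotation to the last chapter starting at or before it.

    chapters:    list of (start_position, title)
    annotations: list of (position, text)
    """
    chapters = sorted(chapters)
    starts = [start for start, _ in chapters]
    grouped = defaultdict(list)
    for position, text in sorted(annotations):
        # Index of the last chapter whose start is <= position, or -1.
        index = bisect_right(starts, position) - 1
        title = chapters[index][1] if index >= 0 else "(before first chapter)"
        grouped[title].append((position, text))
    return dict(grouped)


# Hypothetical sample data for illustration only.
chapters = [(100, "Chapter 1"), (500, "Chapter 2"), (900, "Chapter 3")]
annotations = [(120, "first note"), (510, "second note"),
               (50, "preface note"), (899, "late note")]
result = group_by_chapter(chapters, annotations)
```

The binary search keeps the lookup cheap even for books with many chapters, and annotations that precede the first chapter land in a separate bucket instead of being dropped.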

I would be fine with any programming language (python, ...).

Thanks,

Jens

P.S.: Of course it would be great if I could also extract the toc for DRMed kindle books.
Old 11-12-2010, 11:55 AM   #2
susan_cassidy
Wizard
susan_cassidy ought to be getting tired of karma fortunes by now.
 
Posts: 2,251
Karma: 3720310
Join Date: Jan 2009
Location: USA
Device: Kindle, iPad (not used much for reading)
TOCs don't work by referencing locations. They are HTML-based, and reference a tag on the chapter heading. If you run mobi2html (Perl program from MobiPerl), and look at the HTML output, there is usually just a chunk that is the list of hyperlinks forming the TOC.
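
As a rough illustration of pulling that list of hyperlinks out of the HTML, here is a sketch using only Python's standard library. It assumes the TOC entries are ordinary `<a href="...">` links; the class name and the sample markup are made up for the example, and real mobi2html output may look different.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the link currently being read
        self._text = []     # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


# Hypothetical TOC fragment for illustration only.
sample = ('<p><a href="#ch1">Chapter 1</a><br/>'
          '<a href="#ch2">Chapter <i>2</i></a></p>')
collector = LinkCollector()
collector.feed(sample)
toc_links = collector.links
```

This collects every link in the document, so the caller still has to pick out the chunk that is actually the TOC.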
Old 11-12-2010, 03:01 PM   #3
HarryT
eBook Enthusiast
HarryT ought to be getting tired of karma fortunes by now.
 
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
Quote:
Originally Posted by susan_cassidy View Post
TOCs don't work by referencing locations. They are HTML-based, and reference a tag on the chapter heading. If you run mobi2html (Perl program from MobiPerl), and look at the HTML output, there is usually just a chunk that is the list of hyperlinks forming the TOC.
You can find out where it is by looking at the "Guide" section of the Mobi's OPF file. The table of contents is referenced by the "toc" guide item.
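
A small sketch of reading that guide item with Python's standard library, assuming an OPF file is already at hand. The function name and the sample OPF are hypothetical. Element names are compared without their namespace because some OPF files declare one and some do not.

```python
import xml.etree.ElementTree as ET


def find_guide_reference(opf_text, ref_type="toc"):
    """Return the href of the first guide <reference> of the given type, or None."""
    root = ET.fromstring(opf_text)
    for element in root.iter():
        # Drop any "{namespace}" prefix so both namespaced and plain OPFs match.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "reference" and element.get("type") == ref_type:
            return element.get("href")
    return None


# Hypothetical OPF fragment for illustration only.
sample_opf = """<package xmlns="http://www.idpf.org/2007/opf" version="2.0">
  <guide>
    <reference type="toc" title="Table of Contents" href="book.html#toc"/>
    <reference type="text" title="Start" href="book.html#start"/>
  </guide>
</package>"""

toc_href = find_guide_reference(sample_opf)
```

The returned href only says where the TOC page lives; the chapter links themselves still have to be read from that page.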
Old 11-12-2010, 06:35 PM   #4
oecherprinte
Zealot
oecherprinte began at the beginning.
 
Posts: 115
Karma: 20
Join Date: Jul 2010
Device: Kindle3 3G, Kindle Paperwhite 2
Thanks for the answer. But excuse my ignorance:

What is the OPF file? Is it plain-text XML?

Thanks,

Jens
Old 11-13-2010, 04:58 AM   #5
HarryT
eBook Enthusiast
HarryT ought to be getting tired of karma fortunes by now.
 
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
The OPF file is the file that contains all the "instructions" on how to make a Mobipocket book. It contains all the metadata, the list of files that make up the book, and what order they should be arranged in, the "Guide" section (which contains links to the cover image, the table of contents, and the point at which the book should be initially opened) and so on.

You can unpack a Mobi file to its OEB source (which will include the OPF file) using the "ebook-convert" tool that's a part of Calibre.
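
For scripting, the call could be wrapped roughly as below. This is a sketch under two assumptions: that "ebook-convert" is on the PATH, and that (as I understand Calibre's documentation) an output path with no file extension is treated as a folder that receives the OEB files. The function names and file names are hypothetical.

```python
import shutil
import subprocess


def build_unpack_command(mobi_path, output_dir):
    """Build the ebook-convert command line.

    Assumption: an output path without an extension makes ebook-convert
    write an OEB folder instead of a single output file.
    """
    return ["ebook-convert", mobi_path, output_dir]


def unpack_mobi(mobi_path, output_dir):
    """Run ebook-convert; raises if the tool is missing or the run fails."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found on PATH; is Calibre installed?")
    subprocess.run(build_unpack_command(mobi_path, output_dir), check=True)


# Hypothetical file names for illustration only.
command = build_unpack_command("book.mobi", "book_oeb")
```

Calling `unpack_mobi("book.mobi", "book_oeb")` would then leave the OPF file somewhere under the output folder, ready to be parsed.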
Old 11-13-2010, 09:43 AM   #6
oecherprinte
Zealot
oecherprinte began at the beginning.
 
Posts: 115
Karma: 20
Join Date: Jul 2010
Device: Kindle3 3G, Kindle Paperwhite 2
That's cool. Thank you very much. So I could write a script to extract the toc and use that later on.
Old 04-16-2012, 10:57 AM   #7
bwdrennan
Junior Member
bwdrennan began at the beginning.
 
Posts: 3
Karma: 10
Join Date: Apr 2012
Device: HTC Evo Shift 4g
Originality

'You can unpack a Mobi file to its OEB source...using the "ebook-convert" tool that's a part of Calibre.'

Hi, when I performed this on a .mobi file, the resulting styles.css file contained Calibre's styles, the HTML files were laced with Calibre's styles, and the image file names were altered from the originals. Is there a way to unpack to the original files, bit for bit? Or, is all that original information left aside as the conversion software packs the file into a .mobi archive? Thanks!
Old 04-16-2012, 12:10 PM   #8
DiapDealer
Grand Sorcerer
DiapDealer ought to be getting tired of karma fortunes by now.
 
Posts: 27,463
Karma: 192992430
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Is there a way to unpack to the original files, bit for bit? Or, is all that original information left aside as the conversion software packs the file into a .mobi archive? Thanks!
MOBI isn't really an "archive." It's basically a binary database that's compiled from source files. Unless the source is still included with the package (kindlegen produces files which include the source files), it's simply impossible to rebuild/recreate the exact source files "bit for bit." Mobi_Unpack will allow you to extract the markup code that's contained within a mobi file with very little change, but it still won't represent the original source. Using the debug option, you can actually dump the raw contents of the mobi file, if you so choose.

Mobi_unPack.py
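
To make the "binary database" point concrete, here is a sketch that reads only the outer Palm Database container a MOBI file is stored in: the 78-byte header followed by one 8-byte index entry per record. It does not decode the MOBI header or the text records, and it is not how Mobi_Unpack is implemented. The function name and the synthetic sample bytes are made up for the example.

```python
import struct


def read_palmdb_index(data):
    """Return (name, type, creator, record_offsets) from a Palm Database image."""
    # Bytes 0-31: NUL-padded database name.
    name = data[:32].split(b"\x00", 1)[0].decode("latin-1")
    # Bytes 60-63 and 64-67: type and creator codes (b"BOOK" / b"MOBI" for Mobipocket).
    file_type, creator = data[60:64], data[64:68]
    # Bytes 76-77: big-endian record count; the record index starts at byte 78.
    (record_count,) = struct.unpack_from(">H", data, 76)
    offsets = []
    for i in range(record_count):
        # Each index entry is 8 bytes; the first 4 are the record's file offset.
        (offset,) = struct.unpack_from(">L", data, 78 + 8 * i)
        offsets.append(offset)
    return name, file_type, creator, offsets


# Synthetic two-record header for illustration only, not a real book.
sample = (b"Example".ljust(32, b"\x00") + bytes(28) + b"BOOK" + b"MOBI"
          + bytes(8) + struct.pack(">H", 2)
          + struct.pack(">LL", 94, 0) + struct.pack(">LL", 200, 2))
name, file_type, creator, offsets = read_palmdb_index(sample)
```

On a real file you would pass in the bytes read with `open(path, "rb").read()`; each record then spans from its offset to the next record's offset.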