MobileRead Forums - View Single Post - Kindle Software Preview Release and calibre 'Fetch News' items

nickredding · 02-11-2011, 06:48 PM

I've been looking into this and have found:

1) Instapaper files (sent from Instapaper) have a single section with the captured pages appearing as articles within that section. The BACK button does indeed return from an article to the corresponding cursor position on the Sections & Articles view.

2) If I convert the Instapaper MOBI file to HTML using ebook-convert I find the generated toc.ncx file to be devoid of any table of contents info. Specifically, this is all I got for an Instapaper file with 6 articles:

Code:

<?xml version='1.0' encoding='utf-8'?>
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1" xml:lang="en">
  <head>
    <meta content="6f4b85b8-9278-43e2-a6aa-a995e0ed3953" name="dtb:uid"/>
    <meta content="1" name="dtb:depth"/>
    <meta content="calibre (0.7.45)" name="dtb:generator"/>
    <meta content="0" name="dtb:totalPageCount"/>
    <meta content="0" name="dtb:maxPageNumber"/>
  </head>
  <docTitle>
    <text>Instapaper</text>
  </docTitle>
  <navMap/>
</ncx>

I also noted that the HTML file uses H1 tags for the article titles, and I surmise that the Kindle auto-generates the table of contents from those H1 tags. If I open the Instapaper MOBI file in Calibre or Mobipocket Reader, there is no table of contents.
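For anyone who wants to repeat this check on their own files, here is a rough Python sketch of how the two observations could be confirmed after running ebook-convert: count the navPoint entries in toc.ncx and list the H1 headings in the HTML. The function names (nav_points, h1_titles) and the sample strings are my own for illustration and are not part of Calibre.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

NCX_NS = {"ncx": "http://www.daisy.org/z3986/2005/ncx/"}


def nav_points(ncx_text):
    """Return (label, src) pairs for every navPoint in an NCX document."""
    root = ET.fromstring(ncx_text)
    entries = []
    for point in root.iter("{http://www.daisy.org/z3986/2005/ncx/}navPoint"):
        label = point.findtext("ncx:navLabel/ncx:text", default="", namespaces=NCX_NS)
        content = point.find("ncx:content", NCX_NS)
        src = content.get("src") if content is not None else None
        entries.append((label.strip(), src))
    return entries


class H1Collector(HTMLParser):
    """Collect the text of every <h1> element."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_h1 = False

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self._in_h1 = True
            self.titles.append("")

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False

    def handle_data(self, data):
        if self._in_h1:
            self.titles[-1] += data


def h1_titles(html_text):
    """Return the stripped text of each H1 heading, in document order."""
    collector = H1Collector()
    collector.feed(html_text)
    collector.close()
    return [title.strip() for title in collector.titles]


# Trimmed-down stand-in for the Instapaper toc.ncx shown above (empty navMap).
SAMPLE_NCX = """<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <docTitle><text>Instapaper</text></docTitle>
  <navMap/>
</ncx>"""

# Hypothetical HTML; the real article titles will differ.
SAMPLE_HTML = "<html><body><h1>First</h1><p>x</p><h1>Second <i>one</i></h1></body></html>"

if __name__ == "__main__":
    print(len(nav_points(SAMPLE_NCX)))  # 0 entries for an empty navMap
    print(h1_titles(SAMPLE_HTML))
```

An empty list from nav_points together with a non-empty list from h1_titles would match what I saw for the Instapaper file, though it says nothing about what the Kindle actually does with the H1 tags.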

I've also looked at the output from Calibre, again using ebook-convert to take it from MOBI to HTML. What I've found is that Calibre doesn't use H1 to identify article titles; instead it generates a complete toc.ncx file which references articles using ID labels.

There is a strange error in the ebook conversion from MOBI to HTML for files with more than one feed (i.e. section). The references to section TOCs in toc.ncx end up looking like this:

Code:

    <navPoint class="chapter" id="ff83e450-3d2b-42e9-a233-9a8f90f9fd82" playOrder="0">
      <navLabel>
        <text>Front Page</text>
      </navLabel>
      <content src="Daily_Caller.html#eed_0/index.html"/>
    </navPoint>

Note the missing "f" in src="Daily_Caller.html#eed_0/index.html"; it should be src="Daily_Caller.html#feed_0/index.html". During the conversion, ebook-convert also spat out a bunch of "file not found" messages, as in:

Code:

C:\Users\Nick\Calibre Library>ebook-convert daily.mobi dailydir1
1% Converting input to HTML...
InputFormatPlugin: MOBI Input running
on C:\Users\Nick\Calibre Library\daily.mobi
Parsing all content...
Forcing Daily_Caller.html into XHTML namespace
Referenced file 'feed_6/index.html' not found
Referenced file 'feed_2/index.html' not found
Referenced file 'feed_4/index.html' not found
Referenced file 'feed_3/index.html' not found
Referenced file 'feed_5/index.html' not found
Referenced file 'feed_0/index.html' not found
Referenced file '%20http%3a//twitter.com/Chris_Moody%20' not found
Referenced file 'feed_1/index.html' not found
Referenced file 'feed_10/index.html' not found
Referenced file 'feed_9/index.html' not found
Referenced file 'feed_8/index.html' not found
Referenced file '../2011/02/10/trump-at-cpac-points-to-white-house-run/' not found
34% Running transforms on ebook...
Merging user specified metadata...
Detecting structure...
Flattening CSS and remapping font sizes...
Source base font size is 12.00000pt
Cleaning up manifest...
Trimming unused files from manifest...
Trimming 'images/00002.jpg' from manifest
Trimming 'images/00001.jpg' from manifest
Trimming 'images/00003.jpg' from manifest
Creating OEB Output...
67% Creating OEB Output
OEB output written to C:\Users\Nick\Calibre Library\dailydir1
Output saved to   C:\Users\Nick\Calibre Library\dailydir1

I'm not sure whether this indicates a fault in the MOBI file itself or in ebook-convert's MOBI-to-HTML conversion.
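One way to narrow this down might be to cross-check every target in toc.ncx against the anchors that actually exist in the generated HTML. The Python sketch below does that under a few assumptions of mine: that the link targets appear as id= or name= attributes in the HTML, and that a fragment such as feed_0/index.html would be present there if the conversion were correct. The function name dangling_targets, the navPoint id, and the sample HTML are hypothetical.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from urllib.parse import urldefrag

NCX = "{http://www.daisy.org/z3986/2005/ncx/}"


class AnchorCollector(HTMLParser):
    """Collect every id= and name= value, i.e. every possible link target."""

    def __init__(self):
        super().__init__()
        self.anchors = set()

    def handle_starttag(self, tag, attrs):
        for key, value in attrs:
            if key in ("id", "name") and value:
                self.anchors.add(value)


def dangling_targets(ncx_text, html_text):
    """Return the NCX src values whose #fragment is not an anchor in the HTML."""
    collector = AnchorCollector()
    collector.feed(html_text)
    collector.close()
    missing = []
    for content in ET.fromstring(ncx_text).iter(NCX + "content"):
        src = content.get("src", "")
        fragment = urldefrag(src).fragment
        if fragment and fragment not in collector.anchors:
            missing.append(src)
    return missing


# The navPoint from the post, with a shortened hypothetical id.
SAMPLE_NCX = """<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
  <navMap>
    <navPoint class="chapter" id="p0" playOrder="0">
      <navLabel><text>Front Page</text></navLabel>
      <content src="Daily_Caller.html#eed_0/index.html"/>
    </navPoint>
  </navMap>
</ncx>"""

# Assumed shape of the anchor in Daily_Caller.html; not taken from the real file.
SAMPLE_HTML = '<html><body><a id="feed_0/index.html"></a><h2>Front Page</h2></body></html>'

if __name__ == "__main__":
    for src in dangling_targets(SAMPLE_NCX, SAMPLE_HTML):
        print("dangling:", src)
```

If the truncated "eed_" fragments show up as dangling while the "feed_" anchors exist in the HTML, that would point at the toc.ncx generation step rather than at the HTML itself. It would not by itself show whether the MOBI file or ebook-convert is at fault.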

Any suggestions from MOBI experts would be appreciated. There is definitely something wrong somewhere, and perhaps it is in Calibre after all, if only a mismatch between how the Kindle handles MOBI periodical files and how Calibre structures them.