10-16-2012, 06:57 AM   #930
Dylan Tomorrow
Connoisseur
Broken indexing, missing TOC on my Kindle and my conversion workaround

Hi everyone,

As before, I've been reading loads of fanfics and have mostly been very happy with this plugin and the fabulous job it does downloading them. Two issues remained when reading the MOBI files on my Kindle Touch, though, and I can at least share how I fixed them using two other tools (Howto at the end):

1. Broken indexing: This only happened to some of the MOBIs created by FFDL (most of them index without hassle), but it was very annoying, as the Kindle tries to index those files forever, draining the battery while doing so. I got to the bottom of this one for 3 files while trying to solve it. I figured some non-validating HTML tag soup must be responsible, and it turned out that was the case for those three. I let FFDL download the EPUB and then converted that to MOBI with the official KindleGen command line tool.

It was the three fictionalley stories I mentioned before. All of them had <p> tags that were not properly closed, which KindleGen corrected (well, for two of them; for the third it just gave me an "Enhanced Mobi building failure"). None of the other files gave me errors, but most of the MOBIs KindleGen put out indexed where the FFDL-created ones had not, no matter what I tried (adding only one unindexed file at a time, restarting the Kindle, etc.). So I figure they must have had some invalid syntax somewhere.
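For a quick first look at a suspicious chapter file, a rough heuristic is to compare how many <p> tags are opened and closed. This is only a sketch and no substitute for a real validator: the function name is made up for illustration, and it assumes the XHTML file has already been extracted from the EPUB (which is a zip archive).

```shell
# Rough heuristic: compare the number of opening and closing <p> tags in one
# (X)HTML file. Returns 0 if the counts match, 1 otherwise.
# p_tags_balanced is a hypothetical helper name, not part of any tool.
p_tags_balanced() {
    file=$1
    # '<p[ >]' matches <p> and <p class="..."> but not <pre> or <param>
    opened=$(grep -io '<p[ >]' "$file" | wc -l | tr -d ' ')
    closed=$(grep -io '</p>' "$file" | wc -l | tr -d ' ')
    [ "$opened" -eq "$closed" ]
}

# Example: p_tags_balanced chapter1.xhtml || echo "chapter1.xhtml looks unbalanced"
```

Matching counts do not prove the markup is valid (tags can still be nested wrongly), so treat a mismatch as a hint that the file is worth validating properly.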

2. Missing Table of Contents: This is an issue with all of the MOBIs created by FFDL, and it is a peculiar one. The MOBIs FFDL creates evidently have TOCs, as Calibre can navigate through them in the sidebar of its EbookViewer. But the TOC entries never showed up in the Kindle's Go to menu. You could only jump to the TOC page that was part of the ebook itself via Go to, which made navigation more cumbersome.

But the TOC does show up in the Go to menu for ebooks converted with KindleGen. To be more exact, it does in the KF8 part. You see, KindleGen creates a hybrid MOBI file that contains both the old Mobipocket/mobi7 format and the new KF8/AZW3 format. Because there were still some files refusing to index, I used MobiUnpack to split the MOBIs into .mobi (mobi7) and .azw3 (KF8) files. The .azw3 files are the ones where the TOC works as expected, and all of them index without a problem and also superfast.

Summary/Conversion Workaround Howto:
So to anyone else who has problems with indexing/missing TOCs on Kindle eReaders (Warning: This will probably break your previous notes and marks from the old MOBIs, so I'd advise doing it only for the non-indexing files and files without notes/marks):

Preparations (only necessary before first conversion):
  • configure duplicates automerge: Go to Preferences -> Import/Export -> Adding Books: Activate checkbox Automerge added books if they already exist in the calibre library, choose Create new record for each duplicate format in the dropdown menu, click Apply.
  • Download & Install KindleGen [mr wiki page] (install under linux: execute sudo cp -r kindlegen/ /opt/kindlegen/ ; sudo ln -s /opt/kindlegen/kindlegen /usr/local/bin/kindlegen), download & unpack MobiUnpack [mr wiki page] (use the python script in Mobi_Unpack_v054.zip, as the calibre plugin unfortunately cannot work on more than one file at a time), copy the contents of its /lib/ subfolder into your conversion folder
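Before starting a batch it can help to confirm that the tools from the preparations are actually reachable. A minimal sketch, assuming a POSIX shell; the function name check_tools is made up for illustration:

```shell
# Report which of the given commands are missing from PATH.
# Returns 0 if all are present, 1 otherwise.
# check_tools is a hypothetical helper name.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example: check_tools kindlegen python || echo "install the missing tools first"
```

This only checks that the commands can be found, not that they are the right versions.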
The conversion workaround itself:
  1. Download Fanfics as EPUB with FFDL
  2. Export EPUBs (and MOBIs, if any) to the conversion directory with calibre's save to disk in a single directory action. Delete the OPF files. You can keep the old MOBIs, if any, as backup, but move them to another directory so that KindleGen won't overwrite them.
  3. batch convert EPUBs with kindlegen; e.g. in the Linux bash console:
    Code:
    for b in *.epub ; do kindlegen "$b" -c2 -verbose && echo "$b converted successfully." ; done
  4. batch split .azw3s from .mobis, collect .azw3s from MobiUnpack-created subfolders in conversion folder and delete leftover MU subfolders; e.g. in the Linux bash console:
    Code:
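    # -s splits the hybrid file into its mobi7 and KF8 parts
    for b in *.mobi ; do python mobi_unpack.py -s -d "$b" && echo "$b unpacked successfully." ; done
    mv */*.azw3 .
    # Careful: this deletes ALL subfolders of the current directory, not just the MobiUnpack ones
    rm -rf ./*/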
  5. delete any left-over old MOBIs from the fanfics with the remove files of a specific format from selected books action
  6. import .azw3 files from your conversion directory into calibre via drag&drop or the Add Books menu entry, they are automagically added to the right entries
  7. Put files onto Kindle, experience working TOC and indexing
  8. PROFIT!
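Steps 3 and 4 can also be combined per book, so that only the folder created for that book gets deleted instead of every subfolder. This is a sketch under assumptions, not a tested replacement for the loops above: it assumes mobi_unpack.py accepts an output folder as a second argument and puts the .azw3 directly into it, and the function name and the _unpacked folder suffix are made up for illustration.

```shell
# Both default commands are taken from the steps above; override if needed.
KINDLEGEN="${KINDLEGEN:-kindlegen}"
MOBI_UNPACK="${MOBI_UNPACK:-python mobi_unpack.py}"

# Convert one EPUB to a hybrid MOBI, split it, and keep only the .azw3.
# Returns non-zero if a stage fails, leaving files in place for inspection.
# epub_to_azw3 is a hypothetical helper name.
epub_to_azw3() {
    epub=$1
    base=${epub%.epub}
    outdir="${base}_unpacked"   # hypothetical folder name, any unused name works

    # KindleGen's exit status is not relied on; the check is whether a .mobi appeared.
    $KINDLEGEN "$epub" -c2 -verbose || echo "kindlegen exited non-zero for $epub" >&2
    [ -f "$base.mobi" ] || return 1

    # Assumption: mobi_unpack.py takes the output folder as second argument.
    $MOBI_UNPACK -s "$base.mobi" "$outdir" || return 1

    # Keep the KF8 part, then remove only the folder created above.
    mv "$outdir"/*.azw3 . || return 1
    rm -rf "$outdir"
}

# Example: for b in *.epub ; do epub_to_azw3 "$b" || echo "$b failed" ; done
```

Run it from inside the conversion folder, since the .azw3 files are moved to the current directory.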

EDIT:
I did not save the KindleGen log for the first conversion batch I did, so I just ran the corresponding FFDL created EPUBs through Epubcheck 3. Almost all of them had similar HTML syntax errors (mostly unclosed <p> tags, also faulty element nesting, missing attributes, inline elements in the <body> outside of an enclosing block element). ALL of the fictionalley fanfics produced errors, also some from AO3 and fanfiction.net.

So I assume those kinds of syntax errors explain why the files did not index. If only all fanfic sites enforced proper XHTML/HTML5!
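To find out in advance which EPUBs are likely to cause trouble, the Epubcheck run can be batched. A minimal sketch, assuming Epubcheck is started with java -jar and exits with a non-zero status when it finds errors; the jar location, the function name and the log file naming are assumptions for illustration.

```shell
# Assumption: epubcheck.jar sits in the current folder; adjust the path as needed.
EPUBCHECK="${EPUBCHECK:-java -jar epubcheck.jar}"

# Print the name of every EPUB among the arguments that fails validation.
# The validator's full output is kept next to each file as <name>.epubcheck.log.
# list_invalid_epubs is a hypothetical helper name.
list_invalid_epubs() {
    for epub in "$@"; do
        [ -f "$epub" ] || continue
        if ! $EPUBCHECK "$epub" >"$epub.epubcheck.log" 2>&1; then
            echo "$epub"
        fi
    done
}

# Example: list_invalid_epubs *.epub > needs_kindlegen.txt
```

The resulting list gives the candidates for the KindleGen workaround, so files that validate cleanly can be left alone.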

Last edited by Dylan Tomorrow; 10-16-2012 at 03:10 PM. Reason: +Epubcheck results