Originally Posted by ichrispa
I suggest we define some common ground for looking into this topic. First off, I am running Firmware Version: 2.3.2
I do not strictly object to people contributing knowledge on older firmware versions. As an engineer I have rarely seen database structures change over the time of a product's deployment, because it makes a user-side conversion during firmware updates mandatory. And there is soooo much that can go wrong there. So I don't think that versions >=2.0.0 should differ too much on the subject.
The database structure changes with most firmware releases, but it is rare for the new database not to be backward compatible with the previous one. They add tables and columns. I've never seen them remove anything and I've only seen two or three changes in datatype.
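If you want to check this yourself, one way is to compare the column lists of two copies of the database taken before and after a firmware update. The following is only a rough sketch in Python. The two in-memory tables are made-up stand-ins, not real Kobo schemas, and table_columns and schema_diff are hypothetical helper names.

```python
import sqlite3

def table_columns(conn, table):
    """Return {column name: declared type} for a table via PRAGMA table_info."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return {row[1]: row[2] for row in rows}

def schema_diff(old_conn, new_conn, table="content"):
    """List columns that were added, removed, or changed declared type."""
    old = table_columns(old_conn, table)
    new = table_columns(new_conn, table)
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    retyped = sorted(c for c in set(old) & set(new) if old[c] != new[c])
    return added, removed, retyped

# Toy stand-ins for an "old" and a "new" database (assumed schemas).
old_db = sqlite3.connect(":memory:")
old_db.execute(
    "CREATE TABLE content (ContentID TEXT, ContentType INTEGER, MimeType TEXT)"
)
new_db = sqlite3.connect(":memory:")
new_db.execute(
    "CREATE TABLE content (ContentID TEXT, ContentType INTEGER, "
    "MimeType TEXT, EpubType INTEGER)"
)

added, removed, retyped = schema_diff(old_db, new_db)
print("added:", added, "removed:", removed, "retyped:", retyped)
```

Against real copies you would pass the two database files to sqlite3.connect instead of ":memory:". A backward-compatible update should show entries only under "added".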
Inferring from davidfor's post and cross-referencing with the contents of my database, the column ContentType is:
1 - unknown
2 - unknown
3 - unknown
4 - unknown
5 - unknown
6 - EBook, epub (sideloaded)
7 - unknown
8 - unknown
9 - EBook, pdf (sideloaded)
10 - Newspaper, epub
It is logical to deduce that images, HTML files, text files, previews, etc. all have their own content type. It might also be true that Kobo purchased/encrypted books and papers are a different content type than sideloaded books. Unfortunately I cannot try that out, as I have not purchased any ebooks from Kobo.
For epubs and HTML the entry can also identify a chapter, as long as it is in a separate file in the epub archive and the heading contains an anchor. For example:
sqlite> select ContentID,ContentType,MimeType from content;
file:///mnt/onboard/Zeit Online [Wed, 26 Dec 2012]_2012-12-26.epub#(132)feed_13/article_0/index_u136.html|9|application/epub+zip
is a chapter from a calibre-synthesized newspaper, which is correctly flagged as ContentType epub.
To summarize: ContentType does affect where the Kobo interface places the epub, but it does not affect the reader application.
The meaning of 6 and 9 isn't correct. There is a ContentType 6 row for every item on the device except the newspapers. This describes the item. It doesn't matter what type of book it is, or if it's an image or HTML or something else. Other columns describe the type of book. Look at MimeType for the type.
The ContentType 9 rows describe the chapters in the book. For an epub this is the TOC. It doesn't matter if the chapters are in a single file in the epub, or in separate files. There is one 9 row per chapter. For other types, there is usually one 9 record.
For kepubs, there are also ContentType 899 rows. As with the 9 rows, these seem to be TOC entries. These are new. I think they started being created with 2.2.0. The 899 rows have the chapter title, but the 9 rows have the file names. The TOC and navigation use these somehow.
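To see the 6 and 9 split for yourself, counting rows per ContentType is a quick check. Here is a minimal sketch in Python. The table is a toy in-memory copy with only the three columns from the query quoted above, the book path and chapter file names are made up, and storing ContentType as an integer is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE content (ContentID TEXT, ContentType INTEGER, MimeType TEXT)"
)

book = "file:///mnt/onboard/example.epub"  # hypothetical sideloaded book
rows = [
    (book, 6, "application/epub+zip"),                        # the item itself
    (book + "#(0)chapter1.html", 9, "application/epub+zip"),  # one row per chapter
    (book + "#(1)chapter2.html", 9, "application/epub+zip"),
]
conn.executemany("INSERT INTO content VALUES (?, ?, ?)", rows)

# How many rows exist for each ContentType?
counts = dict(
    conn.execute("SELECT ContentType, COUNT(*) FROM content GROUP BY ContentType")
)

# The chapter rows (ContentType 9) that belong to this book.
chapters = [
    r[0]
    for r in conn.execute(
        "SELECT ContentID FROM content "
        "WHERE ContentType = 9 AND ContentID LIKE ? ORDER BY ContentID",
        (book + "#%",),
    )
]
print(counts)
print(chapters)
```

On a real database, a kepub should also show up with 899 rows in the counts if the description above holds.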
ContentType 10 appears to be for newspapers and the articles in them. They seem to be either epubs or kepubs. But they have a multilevel structure that is reflected in the database. The "BookId" is used to do this.
@davidfor: When you change the suffix to .kepub.epub, is the contentType being detected automatically? How is the EpubType entry affected by this change? Mine is stuck at -1, but this might affect the reader application.
When the epub is renamed so that it ends in ".kepub.epub", everything is automatic. All the appropriate rows were created automatically. I haven't paid attention to the EpubType. Looking right now, all the downloaded kepubs have 1 except for the previews. For these, EpubType is 3.
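If anyone wants to compare EpubType values on their own device, grouping by that column gives a quick overview. This is just a sketch in Python against a toy in-memory table. The rows are invented to mirror the values mentioned in this thread (1, 3 and -1), the ContentIDs are hypothetical, and which rows carry EpubType on a real device is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE content (ContentID TEXT, EpubType INTEGER)")

# Invented rows: two downloaded kepubs, one preview, one sideloaded epub.
rows = [
    ("kepub-a", 1),
    ("kepub-b", 1),
    ("preview-a", 3),
    ("file:///mnt/onboard/sideloaded.epub", -1),
]
conn.executemany("INSERT INTO content VALUES (?, ?)", rows)

# Count how many rows carry each EpubType value.
epub_types = dict(
    conn.execute("SELECT EpubType, COUNT(*) FROM content GROUP BY EpubType")
)
print(epub_types)
```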