01-04-2013, 09:23 AM   #10
ichrispa
Device: Kobo Touch
Hi everyone,

First off: wow. I didn't think I would get this much interest. Thanks!

Secondly @Uschiekid: You are absolutely right about the title. I apologize.

Thirdly, seeing Uschiekid's question, I realize I might not have been all too clear on what I am trying to accomplish here: I want to analyze the onboard database of the Kobo Touch reader (possibly shared with other devices like the Kobo Mini or Glo). This database is located under .kobo/KoboReader.sqlite. Tampering with this database can impair the functionality of your reader. Do not attempt to open/edit this file unless you know what you are doing. The discussion here is not centered on the user interface side or stuff you can do with calibre or other management tools; this is strictly a development discussion.
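
Given the warning above, one cautious way to poke at the database is to never let SQLite touch the original file at all. The following is only a sketch under assumptions: the helper name `open_copy_readonly` is made up for illustration, and the path you pass in depends on where your computer mounts the reader.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path


def open_copy_readonly(db_path):
    """Copy the database into a temp directory and open the copy read-only.

    Hypothetical helper: db_path would be something like
    <mount point>/.kobo/KoboReader.sqlite on your machine.
    """
    src = Path(db_path)
    work_dir = Path(tempfile.mkdtemp(prefix="kobo_db_"))
    copy_path = work_dir / src.name
    shutil.copy2(src, copy_path)  # SQLite never opens the original file
    # mode=ro makes SQLite reject every write on this connection.
    return sqlite3.connect(copy_path.as_uri() + "?mode=ro", uri=True)
```

Any accidental INSERT or UPDATE on the returned connection raises `sqlite3.OperationalError` instead of changing anything.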

If we can find out how to do this, I will probably write a Python script with the ability to "sideload" a publication already on the Kobo to the newspaper category. I seriously doubt, however, that this will be very user friendly.

Now, on to the interesting stuff...

I suggest we define some common ground for looking into this topic. First off, I am running Firmware Version: 2.3.2

I do not strictly object to people contributing knowledge on older firmware versions. As an engineer, I have rarely seen database structures change over the lifetime of a deployed product, because such a change makes a user-side conversion during firmware updates mandatory. And there is soooo much that can go wrong there. So I don't think that versions >=2.0.0 should differ too much on the subject.

Now davidfor has raised an interesting point. The ContentType table would obviously index the content type of a kepub that flags it as paper. To be on the same page here, my 2.3.0 and 2.3.2 database structure had the following tables:

Code:
AbTest              Event               content_keys        user              
Achievement         Rules               content_settings    volume_shortcovers
Bookmark            Shelf               publications        volume_tabs       
DbVersion           ShelfContent        ratings           
Dictionary          content             shortcover_page
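
For anyone who wants to compare their own table list with the one above from a script rather than the sqlite shell, a small sketch follows. It relies only on the standard `sqlite_master` catalog; the function name `list_tables` is an illustration, not part of any Kobo tooling.

```python
import sqlite3


def list_tables(conn):
    """Return the names of all tables in the database, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]
```

Note that the default sort is case-sensitive, so capitalized names such as AbTest come before lowercase ones such as content.
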
We will not get around analyzing how the primary keys are cross-indexed among the various tables, but the content table would obviously hold the "interesting" stuff. The columns of my content table look like this:

Code:
CREATE TABLE content( ContentID TEXT NOT NULL,               
                      ContentType TEXT NOT NULL, 
                      MimeType TEXT NOT NULL,
                      BookID TEXT,
                      BookTitle TEXT,
                      ImageId TEXT,  
                      Title TEXT COLLATE NOCASE,
                      Attribution TEXT COLLATE NOCASE,
                      Description TEXT,
                      DateCreated TEXT,      
                      ShortCoverKey TEXT, 
                      adobe_location TEXT,
                      Publisher TEXT,  
                      IsEncrypted BOOL,   
                      DateLastRead TEXT, 
                      FirstTimeReading BOOL,
                      ChapterIDBookmarked TEXT, 
                      ParagraphBookmarked INTEGER, 
                      BookmarkWordOffset INTEGER, 
                      NumShortcovers INTEGER,  
                      VolumeIndex INTEGER,   
                      ___NumPages INTEGER,  
                      ReadStatus INTEGER,  
                      ___SyncTime TEXT,   
                      ___UserID TEXT NOT NULL,
                      PublicationId TEXT,
                      ___FileOffset INTEGER,
                      ___FileSize INTEGER,
                      ___PercentRead INTEGER,
                      ___ExpirationStatus INTEGER,
                      FavouritesIndex NOT NULL DEFAULT -1,
                      Accessibility INTEGER DEFAULT 1,
                      ContentURL TEXT, 
                      Language TEXT,
                      BookshelfTags TEXT,
                      IsDownloaded BIT NOT NULL DEFAULT 1,
                      FeedbackType INTEGER DEFAULT 0,
                      AverageRating INTEGER DEFAULT 0, 
                      Depth INTEGER,
                      PageProgressDirection TEXT, 
                      InWishlist BOOL NOT NULL DEFAULT FALSE,
                      ISBN TEXT,
                      WishlistedDate TEXT DEFAULT "0000-00-00T00:00:00.000",
                      FeedbackTypeSynced INTEGER DEFAULT 0,
                      IsSocialEnabled BOOL NOT NULL DEFAULT TRUE,
                      EpubType INT NOT NULL DEFAULT -1, 
                      Monetization INTEGER DEFAULT 2,
                      ExternalId TEXT, 
                      Series TEXT, 
                      SeriesNumber TEXT,
                      Subtitle TEXT,
                      WordCount INTEGER DEFAULT -1,
                      PRIMARY KEY (ContentID)   
);

CREATE INDEX content_attribution_index ON content (Attribution);
CREATE INDEX content_bookid_index ON content (BookID);
CREATE INDEX content_date_last_read_index ON content (DateLastRead);
CREATE INDEX content_mime_type ON content (MimeType);
CREATE INDEX content_title_index ON content (Title);
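
To check whether the content table really is the same across firmware versions, the column lists of two database copies could be compared mechanically. This is a rough sketch using SQLite's `PRAGMA table_info`; both function names are made up for illustration.

```python
import sqlite3


def table_columns(conn, table):
    """Return {column name: declared type} for one table."""
    # PRAGMA arguments cannot be bound as parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    rows = conn.execute(f"PRAGMA table_info({quoted})")
    return {row[1]: row[2] for row in rows}  # row[1] = name, row[2] = type


def diff_columns(old, new):
    """Return (removed, added, retyped) column names between two schemas."""
    removed = sorted(set(old) - set(new))
    added = sorted(set(new) - set(old))
    retyped = sorted(c for c in set(old) & set(new) if old[c] != new[c])
    return removed, added, retyped
```

Run `table_columns(conn, "content")` on a copy from each firmware version and feed both results to `diff_columns`; three empty lists would mean the table layout did not change.
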
Inferring from davidfor's post and cross-referencing with the contents of my database, the values of the ContentType column are:

Code:
1   - unknown
2   - unknown
3   - unknown
4   - unknown
5   - unknown
6   - EBook, epub (sideloaded)
7   - unknown
8   - unknown
9   - EBook, pdf (sideloaded)
10  - Newspaper, epub
It is logical to deduce that images, HTML files, text files, previews, etc. all have their own content type. It might also be true that books and papers purchased from Kobo (and therefore encrypted) have a different content type than sideloaded books. Unfortunately I cannot try that out, as I have not purchased any ebooks from Kobo.
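
To help fill in the unknown values, anyone with a more varied library could tally which ContentType values occur together with which MimeType. A possible sketch, assuming only the content table and the two columns from the schema above (the function name is illustrative):

```python
import sqlite3


def content_type_summary(conn):
    """Return (ContentType, MimeType, row count) tuples, sorted numerically by type."""
    query = """
        SELECT ContentType, MimeType, COUNT(*) AS n
        FROM content
        GROUP BY ContentType, MimeType
        ORDER BY CAST(ContentType AS INTEGER), MimeType
    """
    return conn.execute(query).fetchall()
```

The CAST is there because ContentType is declared as TEXT, so a plain sort would put 10 before 6.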

For epubs and HTML, the entry can also identify a chapter, as long as it is in a separate file in the epub archive and the heading contains an anchor. For example:

Code:
sqlite> select ContentID,ContentType,MimeType from content;
...
file:///mnt/onboard/Zeit Online [Wed, 26 Dec 2012]_2012-12-26.epub#(132)feed_13/article_0/index_u136.html|9|application/epub+zip
...
is a chapter from a calibre synthesized newspaper, which is correctly flagged as ContentType epub.
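
Chapter entries like the one above can be separated from whole-book entries by splitting the ContentID at the first "#". The sketch below assumes that the part before "#" is always the book file and does not try to interpret the "(132)" prefix, whose meaning is not established here; the function name is made up.

```python
def split_content_id(content_id):
    """Split a ContentID into (book URL, path inside the book or None)."""
    book, sep, inner = content_id.partition("#")
    return book, (inner if sep else None)


example = (
    "file:///mnt/onboard/Zeit Online [Wed, 26 Dec 2012]_2012-12-26.epub"
    "#(132)feed_13/article_0/index_u136.html"
)
book_url, chapter_path = split_content_id(example)
```

A file name that itself contains "#" would be split in the wrong place, so treat the result as a heuristic.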

To summarize: ContentType affects where the Kobo interface places the epub, but not which reader application opens it.
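
If that summary holds, the "sideload to newspaper" script could boil down to a single UPDATE. The following is an untested sketch, not a confirmed method: it assumes that changing ContentType alone is enough, that 10 is the newspaper value as listed above, and that chapter rows do not need to be touched. The function name is hypothetical. Try it on a copy of the database first.

```python
import sqlite3


def set_content_type(conn, content_id, new_type):
    """Set ContentType for one exact ContentID; return the number of rows changed."""
    with conn:  # commits on success, rolls back if the UPDATE fails
        cur = conn.execute(
            "UPDATE content SET ContentType = ? WHERE ContentID = ?",
            (str(new_type), content_id),  # the column is declared as TEXT
        )
    return cur.rowcount
```

A return value of 0 means no row had that ContentID, so nothing was changed.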

Quote:
So, I tried changing the extension to .kepub.epub. That worked as well. Opening it from the "News & Magazines" section and the reader app used was different to the normal kepub. But, the TOC and a few other things didn't work correctly.
@davidfor: When you change the suffix to .kepub.epub, is the contentType being detected automatically? How is the EpubType entry affected by this change? Mine is stuck at -1, but this might affect the reader application.

Regards,

ichrispa