07-24-2014, 09:57 AM   #1
gulielmus
Junior Member
gulielmus began at the beginning.
 
Posts: 4
Karma: 10
Join Date: Jan 2014
Device: iPad
Marvin's Annotation Format (importing iBooks annotations)

Hi, so I'd really like to love Marvin. It doesn't seem like it'd be a hard thing to do. (I'm slightly disappointed by the ePub3 warnings, but that's about it so far.)

What I'm particularly interested in is exporting my annotations (notes, highlights, bookmarks) from iBooks. Yes, I know iBooks provides no official means of extracting these, but iBooks on Mavericks offers an opportunity that previously required tools like iOS Backup Extractor: the iBooks SQLite databases are sitting right there on disk, available for perusal.


% cd Library/Containers/com.apple.iBooksX/Data/Documents

% sqlite3 BKLibrary/BKLibrary-1-091020131601.sqlite
sqlite> SELECT
ZAUTHOR,
ZTITLE,
ZASSETID
FROM
ZBKLIBRARYASSET
WHERE
ZTITLE = 'Ulysses';
James Joyce|Ulysses|AFD7FD1805E4A50A67FDB24A0A7C5F39


% sqlite3 AEAnnotation/AEAnnotation_v10312011_1727_local.sqlite
sqlite> SELECT
ZANNOTATIONLOCATION,
ZANNOTATIONMODIFICATIONDATE,
ZANNOTATIONNOTE,
ZANNOTATIONSELECTEDTEXT,
ZANNOTATIONSTYLE,
ZANNOTATIONTYPE
FROM
ZAEANNOTATION
WHERE
ZANNOTATIONDELETED = 0
AND
ZANNOTATIONASSETID = 'AFD7FD1805E4A50A67FDB24A0A7C5F39';
epubcfi(/6/36[episode-12]!/4/2[episode-12]/774/1,:0,:423)|379400715||They believe in rod, the scourger almighty, creator of hell upon earth, and in Jacky Tar, the son of a gun, who was conceived of unholy boast, born of the fighting navy, suffered under rump and dozen, was scarified, flayed and curried, yelled like bloody hell, the third day he arose again from the bed, steered into haven, sitteth on his beamend till further orders whence he shall come to drudge for a living and be paid.|3|2
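
For anyone who would rather script this than type it into the sqlite3 prompt, here is a rough Python sketch that chains the two lookups above. The table and column names are the ones from the queries in this post; the function names and the two database paths you pass in are placeholders of my own. The date helper rests on an assumption: that ZANNOTATIONMODIFICATIONDATE is a Core Data timestamp (seconds since 2001-01-01 UTC), which is how Core Data normally stores dates but is not something the database itself confirms.

```python
import sqlite3
from contextlib import closing
from datetime import datetime, timedelta, timezone

# Assumption: the date column is a Core Data timestamp,
# i.e. seconds elapsed since 2001-01-01 00:00:00 UTC.
CORE_DATA_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)


def core_data_to_datetime(seconds):
    """Convert a Core Data style timestamp to an aware UTC datetime."""
    return CORE_DATA_EPOCH + timedelta(seconds=seconds)


def fetch_annotations(library_db, annotation_db, title):
    """Return the non-deleted annotations for every book with this title.

    library_db and annotation_db are paths to the BKLibrary and
    AEAnnotation SQLite files (hypothetical parameter names).
    """
    # Step 1: look up the asset IDs for the title in the library database.
    with closing(sqlite3.connect(library_db)) as lib:
        asset_ids = [row[0] for row in lib.execute(
            "SELECT ZASSETID FROM ZBKLIBRARYASSET WHERE ZTITLE = ?",
            (title,))]

    # Step 2: pull the matching annotations from the annotation database.
    results = []
    with closing(sqlite3.connect(annotation_db)) as ann:
        ann.row_factory = sqlite3.Row
        for asset_id in asset_ids:
            rows = ann.execute(
                "SELECT ZANNOTATIONLOCATION, ZANNOTATIONMODIFICATIONDATE, "
                "ZANNOTATIONNOTE, ZANNOTATIONSELECTEDTEXT, "
                "ZANNOTATIONSTYLE, ZANNOTATIONTYPE "
                "FROM ZAEANNOTATION "
                "WHERE ZANNOTATIONDELETED = 0 AND ZANNOTATIONASSETID = ?",
                (asset_id,))
            results.extend(dict(row) for row in rows)
    return results
```

Using bound parameters instead of pasting the asset ID into the query also sidesteps quoting mistakes. Under the Core Data assumption, the 379400715 in the row above would decode to 2013-01-09 05:05:15 UTC.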



Apple is using EPUB-CFI (http://www.idpf.org/epub/linking/cfi/epub-cfi.html) for their annotations! Not some quirky custom method, but the standards-compliant portable specification for annotations! At this point I'm doing a little happy dance.

So, satisfied with this, I went off to see how easy it would be to get this CFI into Marvin (probably by way of a Calibre column or something…). I extracted the Marvin container from an iTunes backup and opened mainDb.sqlite to see what fields I might need to fill in.

% sqlite3 com.appstafarian.MarvinIP/Library/mainDb.sqlite
sqlite> .schema
...

CREATE TABLE "Highlights" (
"ID" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE,
"BookID" INTEGER NOT NULL REFERENCES "Books"("ID") ON DELETE CASCADE,
"Section" INTEGER NOT NULL,
"Colour" INTEGER NOT NULL,
"Note" TEXT NOT NULL,
"UUID" TEXT NOT NULL,
"StartXPath" TEXT NOT NULL,
"EndXPath" TEXT NOT NULL,
"StartOffset" INTEGER NOT NULL,
"EndOffset" INTEGER NOT NULL,
"Text" TEXT NOT NULL,
"Deleted" INTEGER NOT NULL,
"AncestorXPath" TEXT NOT NULL,
"NoteDateTime" REAL NOT NULL
);

...




Uh oh. What are StartXPath, EndXPath, and AncestorXPath? Where’s the CFI field? I've got a bad feeling about this...

sqlite> SELECT
Section,
StartXPath,
EndXPath,
StartOffset,
EndOffset,
Text,
AncestorXPath
FROM
Highlights;
18| /x:html[1]/x:body[1]/x:div[1]/x:div[1]/x:section[1]/x[387]/text()[1]|/x:html[1]/x:body[1]/x:div[1]/x:div[1]/x:section[1]/x[387]/text()[1]|0|423|They believe in rod, the scourger almighty, creator of hell upon earth, and in Jacky Tar, the son of a gun, who was conceived of unholy boast, born of the fighting navy, suffered under rump and dozen, was scarified, flayed and curried, yelled like bloody hell, the third day he arose again from the bed, steered into haven, sitteth on his beamend till further orders whence he shall come to drudge for a living and be paid.|/x:html[1]/x:body[1]/x:div[1]/x:div[1]/x:section[1]/x[387]


Oi. It looks like my project has died before it even began. In order to convert an iBooks EPUB CFI into the relevant XPaths, I would have to open each book, open the package document to discover the spine location, then open that spine document to resolve the rest of the CFI, then convert all of that into XPath.
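
To make the size of that job concrete, here is a rough Python sketch of the last two steps: splitting a range CFI and walking an already-extracted spine document to build an XPath in the style of the Highlights rows. It is a sketch under several assumptions, not Marvin's actual logic. All function names are made up. It only handles the simple case seen above, where the start and end parts are bare character offsets. The `x:` prefix and same-name sibling indexing are copied from the row shown. Halving the last spine step to get "Section" is a guess (36 / 2 = 18 does at least line up with the Section value in that row, as do the 0 and 423 offsets). It does not account for the extra div/div/section levels that appear in Marvin's XPath, and real-world XHTML with named entities may need a more forgiving parser than ElementTree.

```python
import re
import xml.etree.ElementTree as ET

# One CFI step: "/N" optionally followed by an "[id]" assertion.
STEP = re.compile(r"/(\d+)(?:\[[^\]]*\])?")


def parse_range_cfi(cfi):
    """Split epubcfi(parent,start,end) into spine steps, content steps, offsets."""
    match = re.fullmatch(r"epubcfi\((.+)\)", cfi.strip())
    if match is None:
        raise ValueError("not an epubcfi(...) string")
    # Naive split: assumes no commas inside [...] assertions.
    parent, start, end = match.group(1).split(",")
    package_path, content_path = parent.split("!")
    spine_steps = [int(n) for n in STEP.findall(package_path)]
    content_steps = [int(n) for n in STEP.findall(content_path)]
    offsets = []
    for part in (start, end):
        m = re.fullmatch(r":(\d+)", part)
        if m is None:
            raise ValueError("only bare character offsets are handled")
        offsets.append(int(m.group(1)))
    return spine_steps, content_steps, offsets[0], offsets[1]


def local_name(element):
    """Strip the '{namespace}' part that ElementTree puts in front of tags."""
    return element.tag.rsplit("}", 1)[-1]


def content_steps_to_xpath(root, steps):
    """Resolve positional CFI steps against a parsed document into an XPath."""
    xpath = f"/x:{local_name(root)}[1]"
    node = root
    for i, step in enumerate(steps):
        children = list(node)
        if step % 2 == 0:
            # Even step N selects the (N/2)-th child element.
            child = children[step // 2 - 1]
            same_tag = [c for c in children if c.tag == child.tag]
            xpath += f"/x:{local_name(child)}[{same_tag.index(child) + 1}]"
            node = child
        else:
            # Odd step selects a text slot between child elements.
            if i != len(steps) - 1:
                raise ValueError("text step must be the last step")
            slots = [node.text] + [c.tail for c in children]
            k = (step - 1) // 2
            # XPath text() only counts text nodes that actually exist.
            text_index = sum(1 for s in slots[:k] if s) + 1
            xpath += f"/text()[{text_index}]"
    return xpath


def cfi_to_marvin_fields(cfi, xhtml_source):
    """Map a simple range CFI to guessed values for Marvin's Highlights columns."""
    spine_steps, content_steps, start, end = parse_range_cfi(cfi)
    xpath = content_steps_to_xpath(ET.fromstring(xhtml_source), content_steps)
    return {
        "Section": spine_steps[-1] // 2,  # guess, see the note above
        "StartXPath": xpath,
        "EndXPath": xpath,
        "StartOffset": start,
        "EndOffset": end,
    }
```

Note that this still needs the XHTML of the right spine item handed to it, which is exactly the "open each book" part: the CFI steps are purely positional, while the XPath needs element names that only the document itself can supply.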

So I have a question for Marvin’s developers: do you plan on supporting EPUB-CFI in the near future? I’m downright dismayed that Marvin appears to have rolled its own solution to content location within EPUB files when there’s a standard for portable content reference. I was expecting to have to wrangle whatever form Apple was using in iBooks to CFI; I never dreamed the problem would be the reverse.

If Marvin were open source, I’d see how I could add support for CFIs (migrating existing XPath references would be painful though, and impossible without opening each individual book). As it is, I’m left with a massive amount of notes in iBooks that I now know I can extract easily (yay!) but that I cannot easily import into Marvin. It's particularly disappointing since one of the key advantages that eBooks have over print is that marginalia can be portable. I just wish eBook reading software understood that.

Last edited by gulielmus; 07-24-2014 at 10:00 AM.