05-26-2023, 02:33 PM   #16
Junior Member
ryanwwest began at the beginning.
Posts: 6
Karma: 10
Join Date: Apr 2023
Device: Boox Max Lumi, Leaf 2
Thanks cst, good to know!

I'm currently building out an understanding of the annotation storage formats in Zotero and KOReader for a two-way sync, probably implemented as a Zotero plugin. Some new discussions about it, for anyone interested:
- Zotero itemAnnotations SQLite table understanding:
- KOReader .sdr metadata.pdf.lua contents understanding (and WIP refactor):

I should probably make a post for this somewhere here or in the Zotero forums, but here are the main findings so far from experimentation, plus todos:
- For KOReader -> Zotero highlights, KOReader's `highlight.pboxes` basically corresponds to Zotero's itemAnnotations.position.rects, and I think it would be relatively easy to convert to/from. The main challenge is itemAnnotations.sortIndex, which has the format (page index)|(closest text offset)|(Y coordinate from the top), where (closest text offset) is the character index from, I think, the start of the page. So we'd need to use some Zotero function to get this value from the x/y coordinates of the first word in the highlight.
- For Zotero -> KOReader highlights, you mainly just need to generate the pos0 and pos1 variables, which are duplicated in highlight and bookmarks. When PDF reflow is on in KOReader, their structure is just an x, y, and page value. I think the x and y values correspond to the pixel-center of the first (and last, for pos1) word in the highlight, and inexact values seem to be fine (it jumps to the nearest word). When PDF reflow is off, pos0 and pos1 include additional info about rotation and zoom (and x/y may be different), but both seem to work, so I'd prefer to use the simple option when possible.
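
To make the two conversions above concrete, here is a rough Python sketch. Several things in it are assumptions on my part rather than anything confirmed: that a pbox is a table of x, y, w, h measured from the top-left of the page, that a Zotero rect is [x1, y1, x2, y2] measured from the bottom-left, that sortIndex fields are zero-padded to 5, 6 and 5 digits, and that both sides use the same units. All function names are made up for illustration, and page-number offsets between the two apps are not handled.

```python
from typing import Dict, List, Sequence, Tuple

Box = Dict[str, float]  # assumed KOReader pbox shape: {"x", "y", "w", "h"}


def pbox_to_rect(pbox: Box, page_height: float) -> List[float]:
    """KOReader-style box (assumed top-left origin) -> Zotero-style rect
    [x1, y1, x2, y2] (assumed bottom-left origin)."""
    top = page_height - pbox["y"]  # flip the Y axis
    return [pbox["x"], top - pbox["h"], pbox["x"] + pbox["w"], top]


def rect_to_pbox(rect: Sequence[float], page_height: float) -> Box:
    """Inverse of pbox_to_rect, under the same assumptions."""
    x1, y1, x2, y2 = rect
    return {"x": x1, "y": page_height - y2, "w": x2 - x1, "h": y2 - y1}


def make_sort_index(page_index: int, text_offset: int, top_y: float,
                    widths: Tuple[int, int, int] = (5, 6, 5)) -> str:
    """Build (page index)|(closest text offset)|(Y from top).
    The padding widths are an assumption; text_offset must come from elsewhere."""
    parts = (page_index, text_offset, top_y)
    return "|".join(str(int(v)).zfill(w) for v, w in zip(parts, widths))


def simple_positions(pboxes: Sequence[Box], page: int) -> Tuple[dict, dict]:
    """Approximate the simple {x, y, page} form of pos0/pos1.
    Picks a point just inside the first and last box, since inexact
    values reportedly snap to the nearest word."""
    first, last = pboxes[0], pboxes[-1]
    inset_first = min(first["w"], first["h"]) / 2
    inset_last = min(last["w"], last["h"]) / 2
    pos0 = {"x": first["x"] + inset_first,
            "y": first["y"] + first["h"] / 2, "page": page}
    pos1 = {"x": last["x"] + last["w"] - inset_last,
            "y": last["y"] + last["h"] / 2, "page": page}
    return pos0, pos1
```

The (closest text offset) part is deliberately left as an input here, since getting it still needs something on the Zotero side that knows the page text.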

You're right that there isn't a great way to match Zotero and KOReader highlights, as they use different storage methods. The stored text from the highlight works until you have multiple instances of that text on the page. Zotero's itemAnnotations has a unique itemID and KOReader has bookmarks.datetime, which may be sufficient as identifiers, but it's difficult to map them one-to-one (especially if you modify a highlight on one end: how do you avoid creating a duplicate while leaving the old version in place?).
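
As a small illustration of where text matching works and where it breaks, here is a Python sketch that pairs annotations only when the (page, text) combination is unique on both sides, and reports everything else as ambiguous. The dict keys ("id", "page", "text") and the function names are hypothetical; the ids stand in for Zotero's itemID and KOReader's datetime.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def _index(items: List[dict]) -> Dict[Tuple[int, str], list]:
    """Group annotation ids by (page, whitespace-normalized text)."""
    idx = defaultdict(list)
    for item in items:
        key = (item["page"], " ".join(item["text"].split()))
        idx[key].append(item["id"])
    return idx


def match_by_text(zotero_items: List[dict], koreader_items: List[dict]):
    """Return (pairs, ambiguous): pairs are (zotero_id, koreader_id) for keys
    that occur exactly once on each side; ambiguous lists the keys that occur
    more than once on either side and need another tiebreaker (e.g. position)."""
    z_idx, k_idx = _index(zotero_items), _index(koreader_items)
    pairs, ambiguous = [], []
    for key in z_idx.keys() & k_idx.keys():
        if len(z_idx[key]) == 1 and len(k_idx[key]) == 1:
            pairs.append((z_idx[key][0], k_idx[key][0]))
        else:
            ambiguous.append(key)
    return sorted(pairs), sorted(ambiguous)
```

This does nothing for the edited-highlight case, which is the part that seems to need a persistent mapping rather than matching on content.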

I wonder if the answer is some universal annotation format (maybe sqlite, maybe json) that can convert annotation data to and from various applications like Zotero and KOReader independently, and store the most information possible there. It would then watch for external changes to either the .sdr/metadata.pdf.lua file or the itemAnnotations table, and when one is updated, immediately update the central/universal format, followed by any other external annotation datastores. So if KOReader adds a new highlight and then quits, the central datastore updates itself on quit and then updates the Zotero DB, since it maintains a mapping between itself and all external annotations.
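
A minimal sketch of what that central datastore could look like, using Python's built-in sqlite3 module. The table names, column names, and functions are all invented for illustration, and a real version would store far more than page and text. The point is only the mapping table: each central annotation keeps one row per external application, so an edit coming from one side updates the existing record instead of creating a duplicate.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS annotation (
    id   INTEGER PRIMARY KEY,
    page INTEGER NOT NULL,
    text TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS external_ref (
    annotation_id INTEGER NOT NULL REFERENCES annotation(id),
    app           TEXT NOT NULL,   -- e.g. 'zotero' or 'koreader'
    external_id   TEXT NOT NULL,   -- e.g. itemID or bookmark datetime
    UNIQUE (app, external_id)
);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def link(conn, central_id: int, app: str, external_id) -> None:
    """Record that `external_id` in `app` is the same annotation as `central_id`."""
    conn.execute("INSERT INTO external_ref VALUES (?, ?, ?)",
                 (central_id, app, str(external_id)))


def upsert(conn, app: str, external_id, page: int, text: str) -> int:
    """Apply a change seen in one app: update the mapped central record if
    one exists, otherwise create it. Returns the central id."""
    row = conn.execute(
        "SELECT annotation_id FROM external_ref WHERE app = ? AND external_id = ?",
        (app, str(external_id))).fetchone()
    if row:
        conn.execute("UPDATE annotation SET page = ?, text = ? WHERE id = ?",
                     (page, text, row[0]))
        return row[0]
    cur = conn.execute("INSERT INTO annotation (page, text) VALUES (?, ?)",
                       (page, text))
    link(conn, cur.lastrowid, app, external_id)
    return cur.lastrowid


def apps_missing(conn, central_id: int, apps) -> list:
    """Apps that have no copy of this annotation yet and need it pushed."""
    have = {r[0] for r in conn.execute(
        "SELECT app FROM external_ref WHERE annotation_id = ?", (central_id,))}
    return [a for a in apps if a not in have]
```

In the KOReader-quits example, the watcher would call upsert for the new highlight, see from apps_missing that Zotero has no copy, write it to Zotero, and then call link with the new itemID.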

I know this is beyond the scope of what you're doing, so I will find a discussion place elsewhere. Eventually I want to expand this central annotation datastore idea to at least HTML and EPUB annotations as well. But I thought you might have interest or comments, since you were looking at this.