View Poll Results: Want LLM/AI (e.g., Gemini) features in Calibre Viewer?
Yes, this would significantly enhance my reading experience: 0 votes (0%)
No, I prefer using external tools or don't need this feature: 7 votes (100%)
Voters: 7. You may not vote on this poll.

#1 - Yesterday, 03:53 AM
amirthfultehrani (Junior Member)
Posts: 1 | Join Date: Aug 2025 | Device: Windows 11

Feature Proposal: Gemini/LLM API Integration in E-book Viewer

Hello Calibre developers and community, may you all be well!

As a long-time user, I admire and appreciate the continuous effort that has made Calibre the indispensable application it is today. My sincere gratitude to all involved.

Gratitude aside, here is a reflection on the reading experience that many may be able to attest to.

Through countless hours of reading, I've often found myself pausing, grappling with questions a page (or pages) alone cannot answer. Such unanswered questions frequently necessitate reaching for external tools - be it search engines or, increasingly, Large Language Models (LLMs) like ChatGPT - to bring the reader (me) to deeper comprehension.

Think extracting the logical structure of an argument in Spinoza's Ethics, or identifying experimental findings from a highly technical paper in Science magazine. Sure, one can meticulously re-read, cross-reference, and manually outline these works to eventually reach the "answers," but an LLM offers what traditional tools cannot: an immediate, context-aware synthesis that drastically reduces the cognitive overhead and time spent in pursuit of genuine understanding. After all, effort saved is focus gained.

To expand briefly on the point: unlike static resources that provide isolated facts or definitions (e.g., search engines or dictionaries), LLMs can take a paragraph or passage as complex as Kant's Critique of Pure Reason or as dense as a molecular biology research paper from Nature and synthesize its core ideas, explain complex relationships, or rephrase it in simpler terms, all while retaining important context.

This has led me to a realization that I have finally decided to act upon: while having a separate LLM window (ChatGPT, Claude, Grok, Gemini, etc.) works, imagine the transformative potential if this capability were integrated directly into the Calibre application itself.

With that vision, I have developed and tested a new feature that integrates Google's Gemini API (and can be abstracted to any compatible LLM) directly into the Calibre E-book Viewer. My aim is to give users in-context AI tools, removing the need to leave the reading environment. The result: instant text summarization, clarification of complex topics, grammar correction, translation, and more, enhancing the reading and research experience.

Key Features Implemented:
  • Dockable Side Panel: persistent, dockable panel for direct interaction with selected text.
  • Customizable Quick Actions: grid of user-configurable buttons for one-click common tasks (e.g., "Summarize," "Explain Simply").
  • Custom Prompts: dedicated field for arbitrary, user-defined LLM queries.
  • Quick Access: includes a global keyboard shortcut (Ctrl+Shift+G) to toggle the panel's visibility, and a button on the floating text selection bar for immediate access.
  • Comprehensive Settings: robust settings dialog (accessible from the panel) for managing the API key, selecting LLM models, and fully customizing quick actions.
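For readers curious how such user-configurable quick actions might be wired up, here is a minimal sketch: each button label maps to a prompt template that wraps the currently selected text before it is sent to the LLM. The labels, templates, and function name below are illustrative assumptions, not taken from the actual patch.

```python
# Illustrative sketch only: a registry of quick actions, where each button
# label maps to a prompt template that wraps the reader's text selection.
# These labels and templates are hypothetical, not from the proposed patch.
QUICK_ACTIONS = {
    "Summarize": "Summarize the following passage concisely:\n\n{text}",
    "Explain Simply": "Explain the following passage in simple terms:\n\n{text}",
    "Fix Grammar": "Correct the grammar of the following passage:\n\n{text}",
}

def build_prompt(action: str, selected_text: str) -> str:
    """Fill one quick action's template with the current text selection."""
    if action not in QUICK_ACTIONS:
        raise ValueError(f"Unknown quick action: {action!r}")
    return QUICK_ACTIONS[action].format(text=selected_text)
```

Under this scheme, adding or editing a button in the settings dialog reduces to editing the stored mapping.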

Implementation & Stability:
Stability has been a priority throughout development. Earlier attempts with multi-file architectures or the Calibre plugin system resulted in untraceable startup crashes in the viewer process. Learning from this, I kept the feature self-contained within src/calibre/gui2/viewer/: all necessary UI classes (panel, settings dialogs) and API integration logic are defined in ui.py and its co-located modules (e.g., gemini_panel.py, gemini_settings.py).
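As an illustration of the kind of API integration logic such a module might contain, here is a minimal, standard-library-only sketch of building a Gemini generateContent request. The endpoint shape follows Google's public REST API; the model name and key are placeholders, and nothing here is taken from the actual patch.

```python
# Illustrative sketch only: builds (but does not send) a Gemini
# "generateContent" request using only the Python standard library.
import json
from urllib import request

GEMINI_ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/models/"
    "{model}:generateContent?key={api_key}"
)

def build_request(prompt: str, model: str = "gemini-1.5-flash",
                  api_key: str = "YOUR_API_KEY") -> request.Request:
    """Build an HTTP POST request carrying one LLM query."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return request.Request(
        GEMINI_ENDPOINT.format(model=model, api_key=api_key),
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Sending it requires a real key and network access, e.g.:
#   with request.urlopen(build_request("Summarize: ...")) as resp:
#       reply = json.load(resp)
#       text = reply["candidates"][0]["content"]["parts"][0]["text"]
```

In the real integration the call would of course run off the UI thread so the viewer stays responsive while waiting on the network.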

Screenshots of the feature in action and of its settings panel are attached below.
The Ask:
I have the complete, working code for ui.py (and the additional files for the panel/settings/configuration within src/calibre/gui2/viewer/) ready for review. Before proceeding with a formal pull request on GitHub, I wanted to present this proposal to the community to gauge interest and gather initial feedback. I believe this feature would be a valuable and frequently used addition to Calibre's capabilities and would be delighted to guide it through the contribution process.

Thank you for your time and consideration, amazing Calibre community.

Best regards,
Amir
Attached: CalibreGeminiPanelIntegrationScreenshot.png (368.3 KB), CalibreGeminiSettingsPanelScreenshot.png (20.9 KB)
#2 - Yesterday, 06:18 AM
DNSB (Bibliophagist)
Posts: 46,490 | Join Date: Jul 2010 | Location: Vancouver | Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
For me, this would not add anything to my use of the calibre viewer. While I have to admire the work you have put into this, the number of times that AI generates results totally out of touch with reality makes me want to run screaming. Looking back over the last few years on MobileRead, there have been quite a few discussions of using AI to 'improve' calibre, none of which amounted to much.

As in one recent discussion (General Discussions => Using AI for reading), in message #45:

Quote:
Originally Posted by DNSB View Post
Just for the heck of it, I had 10 two-page-maximum summaries of the first 11 chapters of 20 books generated. About 50% of them weren't bad, 35% were iffy and 15% had me questioning if I existed in the same universe as the AI.
And at this time, I can't kid myself that it is getting better. Going by OpenAI's own testing, as reported in an article on LiveScience, their newer reasoning models hallucinate more often, with o3 and o4-mini hallucinating 33% and 48% of the time.

AI hallucinates more frequently as it gets more advanced — is there any way to stop it from happening, and should we even try?

Last edited by DNSB; Yesterday at 06:23 AM.
#3 - Yesterday, 06:37 AM
kovidgoyal (creator of calibre)
Posts: 45,389 | Join Date: Oct 2006 | Location: Mumbai, India | Device: Various
Looks fine to me; I am sure some people will find it useful, like the Lookup panel already present in the viewer. I assume this requires the user to configure which LLM they want to interact with? How are the API keys/passwords/whatever stored? Does it support querying a local LLM?

One concern is that it should also be implemented in the content server viewer, though that can be done in a later iteration.
#4 - Yesterday, 07:02 AM
BetterRed (null operator, he/him)
Posts: 21,766 | Join Date: Mar 2012 | Location: Sydney Australia | Device: none
Moderator Notice
Users of the Viewer might not visit the Development subforum, so I'm about to move this to the Viewer forum.


Pretty sure Kovid has said Viewer plugins are on his to-do list.

I use some AI tools, but I prefer using them as stand-alone applications. The trigger to turn to them could stem from an e-book, a podcast, a newspaper article, somewhere like Quillette… or even a paper book.

BR
#5 - Yesterday, 07:34 AM
rantanplan (Weirdo)
Posts: 848 | Join Date: Nov 2019 | Location: Wuppertal, Germany | Device: Kobo Sage, Kobo Libra 2, Boox Note Air 2+
Quote:
Originally Posted by amirthfultehrani View Post
(...) Such unanswered questions frequently necessitate reaching for external tools - be it search engines or, increasingly, Large Language Models (LLMs) like ChatGPT - to bring the reader (me) to deeper comprehension.

Think extracting the logical structure of an argument in Spinoza's Ethics, or identifying experimental findings from a highly technical paper in Science magazine. Sure, one can meticulously re-read, cross-reference, and manually outline these works to eventually reach the "answers," but an LLM offers what traditional tools cannot: an immediate, context-aware synthesis that drastically reduces the cognitive overhead and time spent in pursuit of genuine understanding. After all, effort saved is focus gained. (...)
What exactly is the benefit for you? Having an LLM write a synopsis in easy-to-understand language for you is the same as reading the plot of a novel on Wikipedia and thinking that this will replace reading the novel or watching the movie. To understand these concepts, you need to figure them out yourself; having an LLM do the work for you defeats the purpose. This is a sure way to mess up your critical thinking skills.
#6 - Yesterday, 08:07 PM
DNSB (Bibliophagist)
Quote:
Originally Posted by rantanplan View Post
What exactly is the benefit for you?
As long as I am not forced to use the feature, I would file this under the multiple features of calibre and its plugins that are of little use to me but that others may find useful.

Much like the earlier discussion about AI generated summaries, not my cuppa tea but there seems to be a decent number of people who are interested.

For what it may be worth, I re-ran my earlier ebook summary test using o4-mini and the results were worse. Pretty much in line with the information from OpenAI using their test suite.

Quote:
Research conducted by OpenAI found that its latest and most powerful reasoning models, o3 and o4-mini, hallucinated 33% and 48% of the time, respectively, when tested by OpenAI's PersonQA benchmark. That's more than double the rate of the older o1 model. While o3 delivers more accurate information than its predecessor, it appears to come at the cost of more inaccurate hallucinations.
#7 - Yesterday, 10:23 PM
BetterRed (null operator, he/him)
Quote:
Originally Posted by rantanplan View Post
What exactly is the benefit for you? Having an LLM write a synopsis in easy to understand language for you is the the same as reading the plot of a novel on wikipedia and thinking that this will replace reading a novel or watching a movie. To understand these concepts, you need to figure them out yourself, having an LLM do the work for you is defeating the purpose. This is a sure way to mess up your critical thinking skills.
As I understand the OP's proposal, it is not about summarising what's in front of you; it's about exploring phrases, ideas, and concepts used within the text.

Example: The AU PM has been using the catchphrase "progressive patriotism" and it's been bandied about in the media without much 'exploration' of what he is talking about.

I put this to ChatGPT "What is 'progressive patriotism', where did it originate, what are its foundations?"

Amongst other things it led me to newspaper articles by Virginia Woolf and J.B. Priestley, Orwell's The Lion and the Unicorn, and Teddy Roosevelt's Bull Moose Party.

And I now know where the AU PM got the phrase: an article in a 2022 AU literary magazine, probably via Katharine Murphy, his erstwhile media czar.

BR

Tags
artificial intelligence, development, feature request, large language model, viewer



