Feature Proposal: Gemini/LLM API Integration in E-book Viewer

amirthfultehrani · 08-04-2025, 04:53 AM

Hello Calibre developers and community, may you all be well!

As a long-time user, I admire and appreciate the continuous effort that has made Calibre the indispensable application it is today. My sincere gratitude to all involved.

My gratitude aside, now a reflection on the reading experience (that many may be able to attest to).

Through countless hours of reading, I've often found myself pausing, grappling with questions a page (or pages) alone cannot answer. Such unanswered questions frequently necessitates reaching for external tools - be it search engines or, increasingly, Large Language Models (LLMs) like ChatGPT - to bring the reader (me) to deeper comprehension.

Think extracting the logical structure from an argument in Spinoza's Ethics or identifying experimental findings from a highly technical paper in Science magazine -- sure, one can meticulously re-read, cross-reference, and manually outline these works, to eventually get to the "answers," but an LLM offers that which traditional tools cannot provide: an immediate, context-aware synthesis, one that drastically reduces the cognitive overhead and time spent in the pursuit of genuine understanding; after all, effort saved is focus gained.

To expand briefly on the point: unlike static resources that provide isolated facts or definitions (e.g., search engines or dictionaries), LLMs can take a paragraph or passage as complex as Kant's Critique of Pure Reason or as dense as a molecular biology research paper from Nature and instead synthesize its core ideas, explain complex relationships, or rephrase it into simpler term -- and all while retaining important context.

It has led me to a realization that I have finally decided to act upon: while having a separate LLM window (ChatGPT, Claude, Grok, Gemini, etc.,) works, imagine the metamorphic potential if this capability were integrated directly into the Calibre application itself.

With that vision, I have developed and tested a new feature that integrates Google's Gemini API (which can be abstracted to any compatible LLM) directly into the Calibre E-book Viewer. My aim is to empower users with in-context AI tools, removing the need to leave the reading environment. The results: capability of instant text summarization, clarification of complex topics, grammar correction, translation, and more, enhancing the reading and research experience.

Key Features Implemented:

Dockable Side Panel: persistent, dockable panel for direct interaction with selected text.
Customizable Quick Actions: grid of user-configurable buttons for one-click common tasks (e.g., "Summarize," "Explain Simply").
Custom Prompts: dedicated field for arbitrary, user-defined LLM queries.
Quick Access: includes a global keyboard shortcut (Ctrl+Shift+G) to toggle the panel's visibility, and a button on the floating text selection bar for immediate access.
Comprehensive Settings: robust settings dialog (accessible from the panel) for managing the API key, selecting LLM models, and fully customizing quick actions.

Implementation & Stability:
Stability has been part and parcel throughout development. Previous attempts with multi-file architectures or the Calibre plugin system resulted in untraceable startup crashes within the viewer process. Learning from this, I engineered the entire feature to be self-contained within src/calibre/gui2/viewer/ui.py.
All necessary UI classes (panel, settings dialogs) and API integration logic are defined locally within this file and its adjacent co-located modules (e.g., gemini_panel.py, gemini_settings.py)

Here is a screenshot of the feature in action (attached as well):

Here is a screenshot of its settings panel (also attached):

The Ask:
I have the complete, working code for ui.py (and the additional files for the panel/settings/configuration within src/calibre/gui2/viewer/) ready for review. Before proceeding with a formal pull request on GitHub, I wanted to present this proposal to the community to gauge interest and gather initial feedback. I believe this feature would be a valuable and frequently used addition to Calibre's capabilities and would be delighted to guide it through the contribution process.

Thank you for your time and consideration, amazing Calibre community.

Best regards,
Amir

DNSB · 08-04-2025, 07:18 AM

For me, this would not add anything to my use of the calibre viewer. While I have to admire the work you have put into this, given the number of times that AI generates results that are totally out of touch with reality make me want to run screaming. Looking back over the last few years on MobileRead, there have been quite a few discussions of using AI to 'improve' calibre, none of which amounted to much.

As in one recent discussion (General Discussions => Using AI for reading ), in message #45:

Quote:

Originally Posted by DNSB

Just for the heck of it, I had 10 2 page maximum summaries of the first 11 chapters of 20 books generated. About 50% of them weren't bad, 35% were iffy and 15% had me questioning if I existed in the same universe as the AI.

And at this time, I can't kid myself it is getting better. Going by OpenAI's own testing as reported in an article on LiveScience, their newer reasoning models hallucinate more often with o3 and o4-mini hallucinating 33% and 48% of the time.

AI hallucinates more frequently as it gets more advanced — is there any way to stop it from happening, and should we even try?

kovidgoyal · 08-04-2025, 07:37 AM

Looks fine to me, am sure some people will find it useful, like the Lookup panel already present in the viewer. I assume this requires the user to configure which LLM they want to interact with? How are the api keys/passwords whatever stored? Does sit support querying a local LLM?

One concern is that it should also be implemented in the content server viewer, though that can be done in a later iteration.

BetterRed · 08-04-2025, 08:02 AM

Moderator Notice
Users of the Viewer might not visit the Development subforum, so I'm about to move this to the Viewer forum.

Pretty sure Kovid has said Viewer plugins are on his to do list.

I use some AI tools, but I prefer using them a stand alone. The trigger to turn to them could stem from an e-book, a podcast, a newspaper article, somewhere like Quillette… or even a paper book.

BR

rantanplan · 08-04-2025, 08:34 AM

Quote:

Originally Posted by amirthfultehrani

(...)Such unanswered questions frequently necessitates reaching for external tools - be it search engines or, increasingly, Large Language Models (LLMs) like ChatGPT - to bring the reader (me) to deeper comprehension.

Think extracting the logical structure from an argument in Spinoza's Ethics or identifying experimental findings from a highly technical paper in Science magazine -- sure, one can meticulously re-read, cross-reference, and manually outline these works, to eventually get to the "answers," but an LLM offers that which traditional tools cannot provide: an immediate, context-aware synthesis, one that drastically reduces the cognitive overhead and time spent in the pursuit of genuine understanding; after all, effort saved is focus gained.(...)

What exactly is the benefit for you? Having an LLM write a synopsis in easy to understand language for you is the the same as reading the plot of a novel on wikipedia and thinking that this will replace reading a novel or watching a movie. To understand these concepts, you need to figure them out yourself, having an LLM do the work for you is defeating the purpose. This is a sure way to mess up your critical thinking skills.

DNSB · 08-04-2025, 09:07 PM

Quote:

Originally Posted by rantanplan

What exactly is the benefit for you?

As long as I am not forced to use the feature, I would file this under the multiple features of calibre and it's plugins that are of little use to me but others may find useful.

Much like the earlier discussion about AI generated summaries, not my cuppa tea but there seems to be a decent number of people who are interested.

For what it may be worth, I re-ran my earlier ebook summary test using 04-mini and the results were worse. Pretty much like the information from OpenAI using their test suite.

Quote:

Research conducted by OpenAI found that its latest and most powerful reasoning models, o3 and o4-mini, hallucinated 33% and 48% of the time, respectively, when tested by OpenAI's PersonQA benchmark. That's more than double the rate of the older o1 model. While o3 delivers more accurate information than its predecessor, it appears to come at the cost of more inaccurate hallucinations.

BetterRed · 08-04-2025, 11:23 PM

Quote:

Originally Posted by rantanplan

What exactly is the benefit for you? Having an LLM write a synopsis in easy to understand language for you is the the same as reading the plot of a novel on wikipedia and thinking that this will replace reading a novel or watching a movie. To understand these concepts, you need to figure them out yourself, having an LLM do the work for you is defeating the purpose. This is a sure way to mess up your critical thinking skills.

As I understand the OP's proposal it is not about summarising what's in front of you, it's about exploring phrases, ideas, and concepts used within the text.

Example: The AU PM has been using the catchphrase "progressive patriotism" and it's been bandied about in the media without much 'exploration' of what he is talking about.

I put this to ChatGPT "What is 'progressive patriotism', where did it originate, what are its foundations?"

Amongst other things it lead me to newspaper articles by Virginia Woolf, J.B. Priestley and Orwell's - The Lion and the Unicorn - and Teddy Rooselvelt's Bull Moose Party.

And I now know where the AU PM got the phrase, an article in a 2022 AU literary magazine, probably via Katherine Murphy his erstwhile media czar.

BR

amirthfultehrani · 08-06-2025, 02:20 AM

Quote:

Originally Posted by DNSB

For me, this would not add anything to my use of the calibre viewer. While I have to admire the work you have put into this, given the number of times that AI generates results that are totally out of touch with reality make me want to run screaming. Looking back over the last few years on MobileRead, there have been quite a few discussions of using AI to 'improve' calibre, none of which amounted to much.

As in one recent discussion (General Discussions => Using AI for reading ), in message #45:

And at this time, I can't kid myself it is getting better. Going by OpenAI's own testing as reported in an article on LiveScience, their newer reasoning models hallucinate more often with o3 and o4-mini hallucinating 33% and 48% of the time.

AI hallucinates more frequently as it gets more advanced — is there any way to stop it from happening, and should we even try?

DNSB, thank you for taking the time to review my suggestion + provide such detailed feedback + your compliment. I appreciate it all. Furthermore, I completely understand your reservations about AI's limitations, especially concerning "hallucinations" (factual inaccuracies) and very much appreciate you linking that article. Not only did I learn a new word from it (confabulation - fun word!), but I was able to glean other perspectives that I didn't know existed (e.g., everything an LLM producing being, in essence, a "hallucination" - both the good, and the bad).

I will bring up one critical distinction between what I have implemented vs. what I think you are referring to, DNSB. It seems in-house, OpenAI uses SimpleQA and PersonQA. In their own words, the former uses "fact-seeking questions with short answers...measures model accuracy for attempted answers" and for the latter, "questions and publicly available facts about people that measures...accuracy on attempted answers." What my implementation does is a great mitigation of this hallucination risk. How? It forces the LLM to primarily work within the provided text, but, of course, it does not eliminate it, in particular, when external knowledge (via the explicit web grounding) is involved. In any case too, I liked the suggestions the article provided to address the hallucination issues: making the model structured (e.g., get it to check its own outputs, compare different perspectives, and go through a reasoned thought process) and/or design the model to acknowledge when they are unsure (e.g., flag uncertainty and defer to human judgment).

All this said, DNSB, I am glad I added the poll of yes/no. It seems, at least for our excellent Calibre developer community, my suggestion may not be of great use (if any). I hope, perhaps in some other way, I can still contribute to the wonderful development of Calibre. I saw that there is no official way for plugins to access the Calibre viewer. Maybe that is where I shall start if it is deemed significant. Thank you for your thoughts, DNSB.

amirthfultehrani · 08-06-2025, 02:42 AM

Quote:

Originally Posted by kovidgoyal

Looks fine to me, am sure some people will find it useful, like the Lookup panel already present in the viewer. I assume this requires the user to configure which LLM they want to interact with? How are the api keys/passwords whatever stored? Does sit support querying a local LLM?

One concern is that it should also be implemented in the content server viewer, though that can be done in a later iteration.

David, thank you for your direct engagement and specific questions! Practicality, security, and future compatibility - what excellent things I hope to work toward, especially from an excellent lead developer like you! Thank you for the great work. It is inspiring.

It seems my poll (as of time of viewing) speaks such that, at the very least, our Calibre development community may not gain much from this feature. Maybe this was an issue in the wording of the answers (maybe I should have eliminated "significantly" and just left it at "enhanced"). In any case, to answer your questions, David:

Yes, the user configures the LLM. Within the Gemini panel, there's a settings icon, that, when clicked, opens a dedicate dialog where the user can select their preferred model from a predefined list (currently Gemini 1.5 Pro and Gemini 1.5 Flash, as these are the "stable," general-purpose models available via Google's Generative Language API). The dropdown displays human-readable names, while the backend uses the specific model IDs (attachment shown; also showed attachment of the "quick actions" editing).
The API key is stored using Calibre's existing 'vprefs' system (calibre.gui2.viewer.config.vprefs). If I understand Calibre's architecture correctly, that means the key is saved within Calibre's configuration files (relevant screenshot attached; prefs.json/viewer.json <- something like this I imagine is the innate origin). That means leveraging Calibre's established preference management, and since I worked with the source build, that would mean (hopefully) the sensitive data would be safe (b/c it'd be specific to the user's Calibre installation).
Currently, my implementation supports solely cloud-based LLMs accessible via HTTP API calls (right now, only Google's Generative Language API endpoint). It does not support querying a local LLM directly. Integrating with local LLMs (e.g., Ollama, local web servers for LLama.cpp, etc.,) would require a different set of API clients/local process communication, which I could certainly do! My only concern would be if it's a valuable future enhancement.
David, I apologize, but what do you mean when you refer to "content server viewer"? I just looked it up on the Calibre site manual, and this sounds like a very cool feature of Calibre that I did not know about (viewing books in a browser??)! I would be happy to explore this if it is desired that this core feature be implemented (or anything adjacent) and if there's interest in proceeding.

Lastly, the entire feature, including the 'GeminiPanel' and 'GeminiSettingsDialog' classes, are contained within 'ui.py' and its next door co-located modules ('gemini_panel.py,' 'gemini_settings.py'). Thankfully, in my testing, it has proven highly stable and avoids prior issues I faced with top-level imports that caused crashes.

In ending, David, I am unsure given our friends comments and poll results if this would be something worth moving forward. However, I am very keen on contributing to Calibre, and if it is okay, I ask you to please let me know what would be something "similar" (architecturally, conceptually, any "...ally") that I could help with.

Thank you again, David, for the excellent work & thoughts!

amirthfultehrani · 08-06-2025, 02:46 AM

Quote:

Originally Posted by BetterRed

Users of the Viewer might not visit the Development subforum, so I'm about to move this to the Viewer forum.

Pretty sure Kovid has said Viewer plugins are on his to do list.

I use some AI tools, but I prefer using them a stand alone. The trigger to turn to them could stem from an e-book, a podcast, a newspaper article, somewhere like Quillette… or even a paper book.

BR

BetterRed, thank you for (if it has been done, I am unsure - in any case, my appreciation stands) moving the thread to the Viewer forum; may that lead to broader user feedback!

Additionally, that is fantastic news regarding David's intention to add viewer plugins. A framework like that could have prevented a lot of hassle I had to go through, and, no doubt, shall open up the door to a lot of robust and excellent Calibre additional features.

Pertaining to standalone AI tools - that is my view as well. My aim with my integration was to address the specific friction points I encountered while actively reading an e-book in Calibre -- where context is immediately available, in-situ comprehension directly on the digital page.

Thank you again for your valuable input, BetterRed.

amirthfultehrani · 08-06-2025, 03:12 AM

Quote:

Originally Posted by rantanplan

What exactly is the benefit for you? Having an LLM write a synopsis in easy to understand language for you is the the same as reading the plot of a novel on wikipedia and thinking that this will replace reading a novel or watching a movie. To understand these concepts, you need to figure them out yourself, having an LLM do the work for you is defeating the purpose. This is a sure way to mess up your critical thinking skills.

Thank you for raising such important points, rantanplan. I completely agree that relying on an LLM to "do the work for you" (e.g., simply generating summaries or explanations without further engagement
would indeed be counterproductive to developing two things we desperately need in our world: critical thinking skills and true understanding. That was never the intended purpose for my suggestion.

The benefit, as I have envisioned and experienced, is not to replace the act of reading, or the act of critically thinking, but instead about reducing friction and accelerating the learning process when faced with challenging text.

I urge you to please consider these scenarios, which an LLM can resolve far more efficiently than re-reading or traditional search, allowing the reader to then re-engage with the original text more effectively:

Quote:

"I have, therefore, found it necessary to deny knowledge, in order to make room for faith."

- Immanuel Kant's Critique of Pure Reason

A famous, short, but often misunderstood quote in that everyone thinks its about faith in a religious sense, or about choosing irrational belief over empirical evidence (i.e., abandon scientific knowledge to embrace religious dogma), when really its about making room for moral + practical reasoning (thus, faith in a philosophical sense) in a world where pure speculative reason (knowledge) cannot definitively prove (or disprove) concepts like God, freedom, and immortality. I have included a couple screenshots of my feature's interpretation from simply asking it to explain in laymen terms. Apologies for the book quality!

Click image for larger version

Name: ExamplePart1OfLLMResponseToDensePhilosophy.png
Views: 109
Size: 439.1 KB
ID: 217328

Click image for larger version

Name: ExamplePart2OfLLMResponseToDensePhilosophy.png
Views: 109
Size: 47.9 KB
ID: 217329

Quote:

Though the Earth, and all inferior Creatures be common to all Men, yet every Man has a Property in his own Person. This no Body has any Right to but himself. The Labour of his Body, and the Work of his Hands, we may say, are properly his. Whatsoever then he removes out of the State that Nature hath provided, and left it in, he hath mixed his Labour with, and joyned to it something that is his own, and thereby makes it his Property. It being by him removed from the common state Nature placed it in, hath by this labour something annexed to it, that excludes the common right of other Men. For this Labour being the unquestionable Property of the Labourer, no man but he can have a right to what that is once joyned to, at least where there is enough, and as good left in common for others.

- John Locke's Second Treatise of Government

For this, the prompt I gave to the LLM was "Summarize the logical structure of this passage regarding property." The results are attached as screenshots.

Click image for larger version

Name: JohnLockePart1.png
Views: 123
Size: 376.4 KB
ID: 217330

Click image for larger version

Name: JohnLockePart2.png
Views: 113
Size: 50.0 KB
ID: 217331

I could also give examples of bridging knowledge gaps - for example, you read "Congress of Vienna" or "liminal spaces" and you want to quickly know about them. I could also give examples of confirmation of understanding - like asking whether a sentence is a supporting example of the narrator being unreliable when you think it is (or isn't). The point is to be able to do all this in a simple no-need-to-go-to-a-separate-tab manner. The result would be a substantial time-saver for those who are already using external LLMs for such tasks, or for those who just may want to potentially try something out to enhance their reading.

Ultimately, my intention was that this'd be a user-configurable option. That'd mean if one is concerned, they can simply choose not to enable or use it, and it wouldn't interfere with their existing Calibre experience. For those who find value in LLMs (and there is a growing number who do), it offers a streamlined + integrated way to utilize them.

Thank you again for your candid and thoughtful reply, rantanplan. It has provided me with valuable perspective, and I truly appreciate the engagement.

kovidgoyal · 08-06-2025, 04:03 AM

@amirthfultehrani rather than using google's bespoke API, have you considered using something like openrouter.ai which allows one to query any model you like via a single API. As for local models, they dont have to be queryable right away, but when designing the config UI and preferences implementation, keep in mind the possibility for their eventual support.

When storing the api keys in prefs, at the very least, obfuscate it, see how relay_password is stored, for an example of doing that.

The content server viewer is indeed the calibre viewer running in a browser.
https://manual.calibre-ebook.com/ser...he-book-viewer

Dont worry too much about the poll results, this feature will be useful to some people in some contexts, as an alternative or even supplement to the existing Lookup feature. And given it is opt in and has zero cost when not used, it wont bother anyone that doesnt want to use it.

Another question, how is the panel trigerred? By a shortcut? By a button in the highlight bar? both?

And finally, the name's Kovid not David.

DNSB · 08-06-2025, 05:37 AM

One other item to remember is that many of the people posting on MobileRead are well into the old curmudgeon stage of their existence. Even minor changes tend to meet with resistance. Recent example was Sigil going to a non-native file dialog when running under Windows which was already used for MacOS and Linux. Baby duckling syndrome is very real.

Quoth · 08-06-2025, 01:28 PM

Even if off by default, it's the thin end of a wedge that could import nonsense and eventually cost the user and environment.

People that don't know what LLM really is and costs will turn it on.

People can separately use LLM / generative AI if they want. No need to spoil the best ebook management system and set of tools.

amirthfultehrani · 08-09-2025, 10:58 PM

Quote:

Originally Posted by kovidgoyal

@amirthfultehrani rather than using google's bespoke API, have you considered using something like openrouter.ai which allows one to query any model you like via a single API. As for local models, they dont have to be queryable right away, but when designing the config UI and preferences implementation, keep in mind the possibility for their eventual support.

When storing the api keys in prefs, at the very least, obfuscate it, see how is stored, for an example of doing that.

The content server viewer is indeed the calibre viewer running in a browser.
https://manual.calibre-ebook.com/ser...he-book-viewer

Dont worry too much about the poll results, this feature will be useful to some people in some contexts, as an alternative or even supplement to the existing Lookup feature. And given it is opt in and has zero cost when not used, it wont bother anyone that doesnt want to use it.

Another question, how is the panel trigerred? By a shortcut? By a button in the highlight bar? both?

And finally, the name's Kovid not David.

Kovid, thank you very much for your reply, my friend, and my dearest apologies for my accident of calling you David. I am 95% sure during my volley of last replies, I was a bit sleep deprived, and I think I, likely, without notice, switched the "Ko" (Kovid) for "Da" (David). I hope you can please forgive me!

Apologies aside, Kovid, a few updates from the last time of contact:

I have switched from the Google Gemini API to now, the one of OpenRouter.ai (a great suggestion, thank you; much smoother, faster, and saves a lot of potential work - compatibility with all types of LLMs)!
The API key has now been obfuscated using the same polyglot.binary methods as relay_password
Code:
```
conf.set('relay_password', as_hex_unicode(password))
pw=from_hex_unicode(opts.relay_password)
```
The panel is triggered, as of right now, via one primary way: a global keyboard shortcut (Ctrl + Shift + G). I experimented and I believe managed to also do it with the floating text selection bar, but scrapped it b/c I didn't know how useful it'd be.

Also, Kovid, thank you for the link to the content server viewer and for your comments on the poll results (not even 1 vote for a "Yes"

). My only next question, dear Kovid, is what my next steps shall be.

Thank you for all your time & support thus far. May I hear from you soon, Kovid.

View Poll Results: Want LLM/AI (e.g., Gemini) features in Calibre Viewer?
Yes, this would significantly enhance my reading experience.	4	14.29%
No, I prefer using external tools or don't need this feature.	24	85.71%
Voters: 28. You may not vote on this poll

08-04-2025, 04:53 AM	#1
amirthfultehrani Junior Member Posts: 9 Karma: 10 Join Date: Aug 2025 Device: Windows 11	Feature Proposal: Gemini/LLM API Integration in E-book Viewer Hello Calibre developers and community, may you all be well! As a long-time user, I admire and appreciate the continuous effort that has made Calibre the indispensable application it is today. My sincere gratitude to all involved. My gratitude aside, now a reflection on the reading experience (that many may be able to attest to). Through countless hours of reading, I've often found myself pausing, grappling with questions a page (or pages) alone cannot answer. Such unanswered questions frequently necessitates reaching for external tools - be it search engines or, increasingly, Large Language Models (LLMs) like ChatGPT - to bring the reader (me) to deeper comprehension. Think extracting the logical structure from an argument in Spinoza's Ethics or identifying experimental findings from a highly technical paper in Science magazine -- sure, one can meticulously re-read, cross-reference, and manually outline these works, to eventually get to the "answers," but an LLM offers that which traditional tools cannot provide: an immediate, context-aware synthesis, one that drastically reduces the cognitive overhead and time spent in the pursuit of genuine understanding; after all, effort saved is focus gained. To expand briefly on the point: unlike static resources that provide isolated facts or definitions (e.g., search engines or dictionaries), LLMs can take a paragraph or passage as complex as Kant's Critique of Pure Reason or as dense as a molecular biology research paper from Nature and instead synthesize its core ideas, explain complex relationships, or rephrase it into simpler term -- and all while retaining important context. It has led me to a realization that I have finally decided to act upon: while having a separate LLM window (ChatGPT, Claude, Grok, Gemini, etc.,) works, imagine the metamorphic potential if this capability were integrated directly into the Calibre application itself. With that vision, I have developed and tested a new feature that integrates Google's Gemini API (which can be abstracted to any compatible LLM) directly into the Calibre E-book Viewer. My aim is to empower users with in-context AI tools, removing the need to leave the reading environment. The results: capability of instant text summarization, clarification of complex topics, grammar correction, translation, and more, enhancing the reading and research experience. Key Features Implemented: Dockable Side Panel: persistent, dockable panel for direct interaction with selected text. Customizable Quick Actions: grid of user-configurable buttons for one-click common tasks (e.g., "Summarize," "Explain Simply"). Custom Prompts: dedicated field for arbitrary, user-defined LLM queries. Quick Access: includes a global keyboard shortcut (Ctrl+Shift+G) to toggle the panel's visibility, and a button on the floating text selection bar for immediate access. Comprehensive Settings: robust settings dialog (accessible from the panel) for managing the API key, selecting LLM models, and fully customizing quick actions. Implementation & Stability: Stability has been part and parcel throughout development. Previous attempts with multi-file architectures or the Calibre plugin system resulted in untraceable startup crashes within the viewer process. Learning from this, I engineered the entire feature to be self-contained within src/calibre/gui2/viewer/ui.py. All necessary UI classes (panel, settings dialogs) and API integration logic are defined locally within this file and its adjacent co-located modules (e.g., gemini_panel.py, gemini_settings.py) Here is a screenshot of the feature in action (attached as well): Here is a screenshot of its settings panel (also attached): The Ask: I have the complete, working code for ui.py (and the additional files for the panel/settings/configuration within src/calibre/gui2/viewer/) ready for review. Before proceeding with a formal pull request on GitHub, I wanted to present this proposal to the community to gauge interest and gather initial feedback. I believe this feature would be a valuable and frequently used addition to Calibre's capabilities and would be delighted to guide it through the contribution process. Thank you for your time and consideration, amazing Calibre community. Best regards, Amir Attached Thumbnails

Similar Threads
Thread	Thread Starter	Forum	Replies	Last Post
New Feature Proposal - Leverage ChatGPT AI for Automatic Book Summary and Analysis	mikhail_fil	Calibre	27	10-02-2023 11:24 AM
Calibre - Feature proposal for server	Mr. Nietzsche	Server	2	08-21-2019 10:48 AM
E-Book Viewer feature suggestion?	arthurh3535	Calibre	3	10-20-2018 12:00 AM
Feature request: E-book Viewer TOC doesn't show where you are	alessandro	Calibre	5	11-21-2013 11:16 AM
Feature proposal: attachments	chrisberkhout	Calibre	1	08-07-2013 11:40 PM

08-04-2025, 07:37 AM	#3
kovidgoyal creator of calibre Posts: 45,617 Karma: 28549044 Join Date: Oct 2006 Location: Mumbai, India Device: Various	Looks fine to me, am sure some people will find it useful, like the Lookup panel already present in the viewer. I assume this requires the user to configure which LLM they want to interact with? How are the api keys/passwords whatever stored? Does sit support querying a local LLM? One concern is that it should also be implemented in the content server viewer, though that can be done in a later iteration.

08-04-2025, 08:02 AM	#4
BetterRed null operator (he/him) Posts: 22,032 Karma: 30277294 Join Date: Mar 2012 Location: Sydney Australia Device: none	Moderator Notice Users of the Viewer might not visit the Development subforum, so I'm about to move this to the Viewer forum. Pretty sure Kovid has said Viewer plugins are on his to do list. I use some AI tools, but I prefer using them a stand alone. The trigger to turn to them could stem from an e-book, a podcast, a newspaper article, somewhere like Quillette… or even a paper book. BR

08-06-2025, 04:03 AM	#12
kovidgoyal creator of calibre Posts: 45,617 Karma: 28549044 Join Date: Oct 2006 Location: Mumbai, India Device: Various	@amirthfultehrani rather than using google's bespoke API, have you considered using something like openrouter.ai which allows one to query any model you like via a single API. As for local models, they dont have to be queryable right away, but when designing the config UI and preferences implementation, keep in mind the possibility for their eventual support. When storing the api keys in prefs, at the very least, obfuscate it, see how relay_password is stored, for an example of doing that. The content server viewer is indeed the calibre viewer running in a browser. https://manual.calibre-ebook.com/ser...he-book-viewer Dont worry too much about the poll results, this feature will be useful to some people in some contexts, as an alternative or even supplement to the existing Lookup feature. And given it is opt in and has zero cost when not used, it wont bother anyone that doesnt want to use it. Another question, how is the panel trigerred? By a shortcut? By a button in the highlight bar? both? And finally, the name's Kovid not David.

08-06-2025, 05:37 AM	#13
DNSB Bibliophagist Posts: 48,276 Karma: 174315444 Join Date: Jul 2010 Location: Vancouver Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos	One other item to remember is that many of the people posting on MobileRead are well into the old curmudgeon stage of their existence. Even minor changes tend to meet with resistance. Recent example was Sigil going to a non-native file dialog when running under Windows which was already used for MacOS and Linux. Baby duckling syndrome is very real.

08-06-2025, 01:28 PM	#14
Quoth Still reading Posts: 15,033 Karma: 111111255 Join Date: Jun 2017 Location: Ireland Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper	Even if off by default, it's the thin end of a wedge that could import nonsense and eventually cost the user and environment. People that don't know what LLM really is and costs will turn it on. People can separately use LLM / generative AI if they want. No need to spoil the best ebook management system and set of tools.

Advert

Advert