| 
 | |||||||
| View Poll Results: Want LLM/AI (e.g., Gemini) features in Calibre Viewer? | |||
| Yes, this would significantly enhance my reading experience. |      | 4 | 14.29% | 
| No, I prefer using external tools or don't need this feature. |      | 24 | 85.71% | 
| Voters: 28. You may not vote on this poll | |||
|  | 
|  | Thread Tools | Search this Thread | 
|  08-04-2025, 03:53 AM | #1 | 
| Junior Member  Posts: 9 Karma: 10 Join Date: Aug 2025 Device: Windows 11 |  Feature Proposal: Gemini/LLM API Integration in E-book Viewer  Hello Calibre developers and community, may you all be well! As a long-time user, I admire and appreciate the continuous effort that has made Calibre the indispensable application it is today. My sincere gratitude to all involved. My gratitude aside, now a reflection on the reading experience (that many may be able to attest to). Through countless hours of reading, I've often found myself pausing, grappling with questions a page (or pages) alone cannot answer. Such unanswered questions frequently necessitates reaching for external tools - be it search engines or, increasingly, Large Language Models (LLMs) like ChatGPT - to bring the reader (me) to deeper comprehension. Think extracting the logical structure from an argument in Spinoza's Ethics or identifying experimental findings from a highly technical paper in Science magazine -- sure, one can meticulously re-read, cross-reference, and manually outline these works, to eventually get to the "answers," but an LLM offers that which traditional tools cannot provide: an immediate, context-aware synthesis, one that drastically reduces the cognitive overhead and time spent in the pursuit of genuine understanding; after all, effort saved is focus gained. To expand briefly on the point: unlike static resources that provide isolated facts or definitions (e.g., search engines or dictionaries), LLMs can take a paragraph or passage as complex as Kant's Critique of Pure Reason or as dense as a molecular biology research paper from Nature and instead synthesize its core ideas, explain complex relationships, or rephrase it into simpler term -- and all while retaining important context. It has led me to a realization that I have finally decided to act upon: while having a separate LLM window (ChatGPT, Claude, Grok, Gemini, etc.,) works, imagine the metamorphic potential if this capability were integrated directly into the Calibre application itself. With that vision, I have developed and tested a new feature that integrates Google's Gemini API (which can be abstracted to any compatible LLM) directly into the Calibre E-book Viewer. My aim is to empower users with in-context AI tools, removing the need to leave the reading environment. The results: capability of instant text summarization, clarification of complex topics, grammar correction, translation, and more, enhancing the reading and research experience. Key Features Implemented: 
 Implementation & Stability: Stability has been part and parcel throughout development. Previous attempts with multi-file architectures or the Calibre plugin system resulted in untraceable startup crashes within the viewer process. Learning from this, I engineered the entire feature to be self-contained within src/calibre/gui2/viewer/ui.py. All necessary UI classes (panel, settings dialogs) and API integration logic are defined locally within this file and its adjacent co-located modules (e.g., gemini_panel.py, gemini_settings.py) Here is a screenshot of the feature in action (attached as well): Here is a screenshot of its settings panel (also attached): The Ask: I have the complete, working code for ui.py (and the additional files for the panel/settings/configuration within src/calibre/gui2/viewer/) ready for review. Before proceeding with a formal pull request on GitHub, I wanted to present this proposal to the community to gauge interest and gather initial feedback. I believe this feature would be a valuable and frequently used addition to Calibre's capabilities and would be delighted to guide it through the contribution process. Thank you for your time and consideration, amazing Calibre community.  Best regards, Amir | 
|   |   | 
|  08-04-2025, 06:18 AM | #2 | |
| Bibliophagist            Posts: 47,985 Karma: 174315100 Join Date: Jul 2010 Location: Vancouver Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos | 
			
			For me, this would not add anything to my use of the calibre viewer. While I have to admire the work you have put into this, given the number of times that AI generates results that are totally out of touch with reality make me want to run screaming. Looking back over the last few years on MobileRead, there have been quite a few discussions of using AI to 'improve' calibre, none of which amounted to much. As in one recent discussion (General Discussions => Using AI for reading ), in message #45: Quote: 
 AI hallucinates more frequently as it gets more advanced — is there any way to stop it from happening, and should we even try? Last edited by DNSB; 08-04-2025 at 06:23 AM. | |
|   |   | 
|  08-04-2025, 06:37 AM | #3 | 
| creator of calibre            Posts: 45,598 Karma: 28548962 Join Date: Oct 2006 Location: Mumbai, India Device: Various | 
			
			Looks fine to me, am sure some people will find it useful, like the Lookup panel already present in the viewer. I assume this requires the user to configure which LLM they want to interact with? How are the api keys/passwords whatever stored? Does sit support querying a local LLM? One concern is that it should also be implemented in the content server viewer, though that can be done in a later iteration. | 
|   |   | 
|  08-04-2025, 07:02 AM | #4 | 
| null operator (he/him)            Posts: 22,006 Karma: 30277294 Join Date: Mar 2012 Location: Sydney Australia Device: none | Moderator Notice Users of the Viewer might not visit the Development subforum, so I'm about to move this to the Viewer forum. Pretty sure Kovid has said Viewer plugins are on his to do list. I use some AI tools, but I prefer using them a stand alone. The trigger to turn to them could stem from an e-book, a podcast, a newspaper article, somewhere like Quillette… or even a paper book. BR | 
|   |   | 
|  08-04-2025, 07:34 AM | #5 | |
| Weirdo            Posts: 918 Karma: 11941602 Join Date: Nov 2019 Location: Wuppertal, Germany Device: Kobo Sage, Kobo Libra 2, Boox Note Air 2+ | Quote: 
 | |
|   |   | 
|  08-04-2025, 08:07 PM | #6 | |
| Bibliophagist            Posts: 47,985 Karma: 174315100 Join Date: Jul 2010 Location: Vancouver Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos | 
			
			As long as I am not forced to use the feature, I would file this under the multiple features of calibre and it's plugins that are of little use to me but others may find useful. Much like the earlier discussion about AI generated summaries, not my cuppa tea but there seems to be a decent number of people who are interested. For what it may be worth, I re-ran my earlier ebook summary test using 04-mini and the results were worse. Pretty much like the information from OpenAI using their test suite. Quote: 
 | |
|   |   | 
|  08-04-2025, 10:23 PM | #7 | |
| null operator (he/him)            Posts: 22,006 Karma: 30277294 Join Date: Mar 2012 Location: Sydney Australia Device: none | Quote: 
 Example: The AU PM has been using the catchphrase "progressive patriotism" and it's been bandied about in the media without much 'exploration' of what he is talking about. I put this to ChatGPT "What is 'progressive patriotism', where did it originate, what are its foundations?" Amongst other things it lead me to newspaper articles by Virginia Woolf, J.B. Priestley and Orwell's - The Lion and the Unicorn - and Teddy Rooselvelt's Bull Moose Party. And I now know where the AU PM got the phrase, an article in a 2022 AU literary magazine, probably via Katherine Murphy his erstwhile media czar. BR | |
|   |   | 
|  08-06-2025, 01:20 AM | #8 | |
| Junior Member  Posts: 9 Karma: 10 Join Date: Aug 2025 Device: Windows 11 | Quote: 
 I will bring up one critical distinction between what I have implemented vs. what I think you are referring to, DNSB. It seems in-house, OpenAI uses SimpleQA and PersonQA. In their own words, the former uses "fact-seeking questions with short answers...measures model accuracy for attempted answers" and for the latter, "questions and publicly available facts about people that measures...accuracy on attempted answers." What my implementation does is a great mitigation of this hallucination risk. How? It forces the LLM to primarily work within the provided text, but, of course, it does not eliminate it, in particular, when external knowledge (via the explicit web grounding) is involved. In any case too, I liked the suggestions the article provided to address the hallucination issues: making the model structured (e.g., get it to check its own outputs, compare different perspectives, and go through a reasoned thought process) and/or design the model to acknowledge when they are unsure (e.g., flag uncertainty and defer to human judgment). All this said, DNSB, I am glad I added the poll of yes/no. It seems, at least for our excellent Calibre developer community, my suggestion may not be of great use (if any). I hope, perhaps in some other way, I can still contribute to the wonderful development of Calibre. I saw that there is no official way for plugins to access the Calibre viewer. Maybe that is where I shall start if it is deemed significant. Thank you for your thoughts, DNSB. | |
|   |   | 
|  08-06-2025, 01:42 AM | #9 | |
| Junior Member  Posts: 9 Karma: 10 Join Date: Aug 2025 Device: Windows 11 | Quote: 
 David, thank you for your direct engagement and specific questions! Practicality, security, and future compatibility - what excellent things I hope to work toward, especially from an excellent lead developer like you! Thank you for the great work. It is inspiring. It seems my poll (as of time of viewing) speaks such that, at the very least, our Calibre development community may not gain much from this feature. Maybe this was an issue in the wording of the answers (maybe I should have eliminated "significantly" and just left it at "enhanced"). In any case, to answer your questions, David: 
 Lastly, the entire feature, including the 'GeminiPanel' and 'GeminiSettingsDialog' classes, are contained within 'ui.py' and its next door co-located modules ('gemini_panel.py,' 'gemini_settings.py'). Thankfully, in my testing, it has proven highly stable and avoids prior issues I faced with top-level imports that caused crashes. In ending, David, I am unsure given our friends comments and poll results if this would be something worth moving forward. However, I am very keen on contributing to Calibre, and if it is okay, I ask you to please let me know what would be something "similar" (architecturally, conceptually, any "...ally") that I could help with. Thank you again, David, for the excellent work & thoughts! | |
|   |   | 
|  08-06-2025, 01:46 AM | #10 | |
| Junior Member  Posts: 9 Karma: 10 Join Date: Aug 2025 Device: Windows 11 | Quote: 
 Additionally, that is fantastic news regarding David's intention to add viewer plugins. A framework like that could have prevented a lot of hassle I had to go through, and, no doubt, shall open up the door to a lot of robust and excellent Calibre additional features. Pertaining to standalone AI tools - that is my view as well. My aim with my integration was to address the specific friction points I encountered while actively reading an e-book in Calibre -- where context is immediately available, in-situ comprehension directly on the digital page. Thank you again for your valuable input, BetterRed. | |
|   |   | 
|  08-06-2025, 02:12 AM | #11 | |||
| Junior Member  Posts: 9 Karma: 10 Join Date: Aug 2025 Device: Windows 11 | Quote: 
 would indeed be counterproductive to developing two things we desperately need in our world: critical thinking skills and true understanding. That was never the intended purpose for my suggestion. The benefit, as I have envisioned and experienced, is not to replace the act of reading, or the act of critically thinking, but instead about reducing friction and accelerating the learning process when faced with challenging text. I urge you to please consider these scenarios, which an LLM can resolve far more efficiently than re-reading or traditional search, allowing the reader to then re-engage with the original text more effectively: Quote: 
 A famous, short, but often misunderstood quote in that everyone thinks its about faith in a religious sense, or about choosing irrational belief over empirical evidence (i.e., abandon scientific knowledge to embrace religious dogma), when really its about making room for moral + practical reasoning (thus, faith in a philosophical sense) in a world where pure speculative reason (knowledge) cannot definitively prove (or disprove) concepts like God, freedom, and immortality. I have included a couple screenshots of my feature's interpretation from simply asking it to explain in laymen terms. Apologies for the book quality! Quote: 
 For this, the prompt I gave to the LLM was "Summarize the logical structure of this passage regarding property." The results are attached as screenshots. I could also give examples of bridging knowledge gaps - for example, you read "Congress of Vienna" or "liminal spaces" and you want to quickly know about them. I could also give examples of confirmation of understanding - like asking whether a sentence is a supporting example of the narrator being unreliable when you think it is (or isn't). The point is to be able to do all this in a simple no-need-to-go-to-a-separate-tab manner. The result would be a substantial time-saver for those who are already using external LLMs for such tasks, or for those who just may want to potentially try something out to enhance their reading. Ultimately, my intention was that this'd be a user-configurable option. That'd mean if one is concerned, they can simply choose not to enable or use it, and it wouldn't interfere with their existing Calibre experience. For those who find value in LLMs (and there is a growing number who do), it offers a streamlined + integrated way to utilize them. Thank you again for your candid and thoughtful reply, rantanplan. It has provided me with valuable perspective, and I truly appreciate the engagement. | |||
|   |   | 
|  08-06-2025, 03:03 AM | #12 | 
| creator of calibre            Posts: 45,598 Karma: 28548962 Join Date: Oct 2006 Location: Mumbai, India Device: Various | 
			
			@amirthfultehrani rather than using google's bespoke API, have you considered using something like openrouter.ai which allows one to query any model you like via a single API. As for local models, they dont have to be queryable right away, but when designing the config UI and preferences implementation, keep in mind the possibility for their eventual support. When storing the api keys in prefs, at the very least, obfuscate it, see how relay_password is stored, for an example of doing that. The content server viewer is indeed the calibre viewer running in a browser. https://manual.calibre-ebook.com/ser...he-book-viewer Dont worry too much about the poll results, this feature will be useful to some people in some contexts, as an alternative or even supplement to the existing Lookup feature. And given it is opt in and has zero cost when not used, it wont bother anyone that doesnt want to use it. Another question, how is the panel trigerred? By a shortcut? By a button in the highlight bar? both? And finally, the name's Kovid not David. | 
|   |   | 
|  08-06-2025, 04:37 AM | #13 | 
| Bibliophagist            Posts: 47,985 Karma: 174315100 Join Date: Jul 2010 Location: Vancouver Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos | 
			
			One other item to remember is that many of the people posting on MobileRead are well into the old curmudgeon stage of their existence. Even minor changes tend to meet with resistance. Recent example was Sigil going to a non-native file dialog when running under Windows which was already used for MacOS and Linux. Baby duckling syndrome is very real.
		 | 
|   |   | 
|  08-06-2025, 12:28 PM | #14 | 
| Still reading            Posts: 14,922 Karma: 110507267 Join Date: Jun 2017 Location: Ireland Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper | 
			
			Even if off by default, it's the thin end of a wedge that could import nonsense and eventually cost the user and environment. People that don't know what LLM really is and costs will turn it on. People can separately use LLM / generative AI if they want. No need to spoil the best ebook management system and set of tools. | 
|   |   | 
|  08-09-2025, 09:58 PM | #15 | |
| Junior Member  Posts: 9 Karma: 10 Join Date: Aug 2025 Device: Windows 11 | Quote: 
 Apologies aside, Kovid, a few updates from the last time of contact: 
 Also, Kovid, thank you for the link to the content server viewer and for your comments on the poll results (not even 1 vote for a "Yes"  ). My only next question, dear Kovid, is what my next steps shall be. Thank you for all your time & support thus far. May I hear from you soon, Kovid. | |
|   |   | 
|  | 
| Tags | 
| artificial intelligence, development, feature request, large language model, viewer | 
| Thread Tools | Search this Thread | 
| 
 | 
|  Similar Threads | ||||
| Thread | Thread Starter | Forum | Replies | Last Post | 
| New Feature Proposal - Leverage ChatGPT AI for Automatic Book Summary and Analysis | mikhail_fil | Calibre | 27 | 10-02-2023 10:24 AM | 
| Calibre - Feature proposal for server | Mr. Nietzsche | Server | 2 | 08-21-2019 09:48 AM | 
| E-Book Viewer feature suggestion? | arthurh3535 | Calibre | 3 | 10-19-2018 11:00 PM | 
| Feature request: E-book Viewer TOC doesn't show where you are | alessandro | Calibre | 5 | 11-21-2013 10:16 AM | 
| Feature proposal: attachments | chrisberkhout | Calibre | 1 | 08-07-2013 10:40 PM |