Old 08-06-2025, 01:20 AM   #8
amirthfultehrani
Junior Member
Quote:
Originally Posted by DNSB
For me, this would not add anything to my use of the calibre viewer. While I have to admire the work you have put into this, the number of times that AI generates results that are totally out of touch with reality makes me want to run screaming. Looking back over the last few years on MobileRead, there have been quite a few discussions of using AI to 'improve' calibre, none of which amounted to much.

As in one recent discussion (General Discussions => Using AI for reading), in message #45:

And at this time, I can't kid myself that it is getting better. Going by OpenAI's own testing, as reported in an article on LiveScience, their newer reasoning models hallucinate more often, with o3 and o4-mini hallucinating 33% and 48% of the time, respectively.

AI hallucinates more frequently as it gets more advanced — is there any way to stop it from happening, and should we even try?
DNSB, thank you for taking the time to review my suggestion, provide such detailed feedback, and offer that compliment; I appreciate all of it. I completely understand your reservations about AI's limitations, especially concerning "hallucinations" (factual inaccuracies), and I very much appreciate you linking that article. Not only did I learn a new word from it (confabulation - fun word!), but I also gleaned perspectives I didn't know existed (e.g., the view that everything an LLM produces is, in essence, a "hallucination" - both the good and the bad).

I will bring up one critical distinction between what I have implemented and what I think you are referring to, DNSB. In-house, OpenAI evaluates hallucination with SimpleQA and PersonQA. In their own words, the former uses "fact-seeking questions with short answers...measures model accuracy for attempted answers," and the latter uses "questions and publicly available facts about people that measures...accuracy on attempted answers." Both benchmarks ask the model to answer from its own stored knowledge, with no source text supplied. My implementation mitigates, though it does not eliminate, that hallucination risk: it forces the LLM to work primarily within the provided text, and hallucination remains possible in particular when external knowledge (via the explicit web grounding) is involved. I also liked the suggestions the article offered for addressing hallucination: making the model's process more structured (e.g., having it check its own outputs, compare different perspectives, and go through a reasoned thought process) and/or designing the model to acknowledge when it is unsure (e.g., flag uncertainty and defer to human judgment).
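To make that mitigation concrete, here is a rough sketch (in Python, since that is calibre's language) of the kind of prompt scaffolding I have in mind. The function names and the ask_llm callable are hypothetical placeholders of my own, not calibre or OpenAI APIs; the point is only that the selected passage is the sole permitted source and the model is told to cite it or admit it does not know.

Code:
# Minimal sketch of the grounding idea described above -- not the actual implementation.
# `ask_llm` is whatever backend the viewer would eventually call; it is injected here
# so the sketch runs with a mock and no API key.
from typing import Callable

GROUNDED_TEMPLATE = """You are answering a question about the passage below.
Rules:
1. Use ONLY the passage as your source unless web grounding is explicitly enabled.
2. Quote the sentence(s) from the passage that support your answer.
3. If the passage does not contain the answer, reply exactly: Not stated in the passage.
4. If you are unsure, say so and defer to the reader's judgment.

Passage:
{passage}

Question: {question}
"""

def build_grounded_prompt(passage: str, question: str) -> str:
    """Wrap the reader's selected text in instructions that discourage hallucination."""
    return GROUNDED_TEMPLATE.format(passage=passage.strip(), question=question.strip())

def grounded_answer(passage: str, question: str, ask_llm: Callable[[str], str]) -> str:
    """Build the grounded prompt and hand it to the injected LLM backend."""
    return ask_llm(build_grounded_prompt(passage, question))

if __name__ == "__main__":
    # Mock backend so the example is self-contained.
    demo = lambda prompt: "Not stated in the passage."
    print(grounded_answer("Call me Ishmael.", "Who wrote this novel?", demo))

Passing the backend in as a callable keeps the grounding logic testable without any network access, which is also how I would want it structured inside a viewer feature.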

All this said, DNSB, I am glad I added the yes/no poll. It seems that, at least for our excellent Calibre developer community, my suggestion may not be of much use (if any). I hope I can still contribute to the wonderful development of Calibre in some other way. I saw that there is no official way for plugins to access the Calibre viewer; maybe that is where I shall start, if it is deemed worthwhile. Thank you for your thoughts, DNSB.