Go Back   MobileRead Forums > E-Book Software > Calibre > Library Management

08-23-2021, 07:29 PM   #1
McStubb
Connoisseur
Posts: 52
Karma: 10
Join Date: May 2014
Device: None
Guess Book Language

How hard would it be to create a tool that scans through the ebooks in a library and attempts to identify the language each book is written in? And would anyone be interested in building it?
08-23-2021, 08:34 PM   #2
jhowell
Grand Sorcerer
Posts: 7,070
Karma: 91577715
Join Date: Nov 2011
Location: Charlottesville, VA
Device: Kindles
I am curious: what would be the use case for such a tool?

All of the e-book formats that I am aware of encode the language of the content.
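
For EPUB, as one example, the declared language sits in the `dc:language` element of the OPF package file, which `META-INF/container.xml` points to. Here is a minimal sketch of reading it with only the Python standard library. The function name `declared_language` is made up for illustration, and the sketch assumes an unencrypted, well-formed EPUB.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces defined by the EPUB container and Dublin Core specifications.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
DC_LANGUAGE = "{http://purl.org/dc/elements/1.1/}language"


def declared_language(epub_path):
    """Return the first non-empty dc:language value in an EPUB, or None."""
    with zipfile.ZipFile(epub_path) as zf:
        # container.xml names the OPF package file inside the archive.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        if rootfile is None or not rootfile.get("full-path"):
            return None
        opf = ET.fromstring(zf.read(rootfile.get("full-path")))

    for element in opf.iter(DC_LANGUAGE):
        if element.text and element.text.strip():
            return element.text.strip()
    return None
```

Something like `declared_language("book.epub")` would then return a code such as `en`, or `None` when the book declares no language at all.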
08-23-2021, 11:03 PM   #3
Sarmat89
Fanatic
Posts: 516
Karma: 2268308
Join Date: Nov 2015
Device: none
Automatic conversions from PDF, RTF, plaintext, or HTML, probably. Those usually don't encode any language.
08-24-2021, 07:09 AM   #4
McStubb
Connoisseur
Posts: 52
Karma: 10
Join Date: May 2014
Device: None
Correct. I have also run across some ebooks where the language was left at the default of English even though the contents were in another language.
08-29-2021, 12:55 PM   #5
McStubb
Connoisseur
Posts: 52
Karma: 10
Join Date: May 2014
Device: None
Looks like the Python package langdetect does exactly what I am looking for. I cheated and wrote something that scans the filesystem and flags those books for review.
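
For anyone wanting to try the same thing, here is a rough sketch of that kind of scan. It is not the actual script, only an illustration under some assumptions: the books are already available as plain-text `.txt` files (real ebook formats would need their text extracted first), the expected language is `en`, and the names `detect_language` and `flag_for_review` are made up. The only langdetect pieces used are `detect`, `DetectorFactory.seed`, and `LangDetectException`.

```python
import os


def detect_language(text):
    """Guess the language code of `text` with langdetect, or None on failure."""
    # Imported lazily so the scanning helper below loads without the package.
    from langdetect import DetectorFactory, detect
    from langdetect.lang_detect_exception import LangDetectException

    DetectorFactory.seed = 0  # langdetect results vary between runs unless seeded
    try:
        return detect(text)
    except LangDetectException:
        # Raised when the sample has nothing usable, e.g. it is empty.
        return None


def flag_for_review(root, expected="en", detector=detect_language,
                    sample_chars=5000):
    """Yield (path, guess) for .txt files under `root` whose guess != expected.

    Files where detection fails are yielded too, with a guess of None.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.lower().endswith(".txt"):
                continue
            path = os.path.join(dirpath, name)
            # A few thousand characters is usually enough for a guess.
            with open(path, encoding="utf-8", errors="replace") as fh:
                sample = fh.read(sample_chars)
            guess = detector(sample)
            if guess != expected:
                yield path, guess
```

With langdetect installed (`pip install langdetect`), looping over `flag_for_review("/path/to/books")` lists the files worth a manual look. The `detector` parameter is there so a different guesser can be swapped in.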
All times are GMT -4.