Old 09-27-2020, 06:06 AM   #1381
davidfor
Grand Sorcerer
Posts: 24,905
Karma: 47303824
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
Quote:
Originally Posted by scorpion2782 View Post
I get an error when I try to download metadata from Goodreads.
This is the log:

Spoiler:
Code:
Count Page/Word Statistics
do_count_statistics - book_path=C:\Users\dma02\AppData\Local\Temp\calibre_s0fbfhfj\oppqayyg_count_pages\1539.epub, pages_algorithm=2, page_count_mode=Download, statistics_to_run=['PageCount', 'WordCount', 'FleschReading', 'FleschGrade', 'GunningFog'], custom_chars_per_page=1500, icu_wordcount=True
do_count_statistics - job started for file book_path=C:\Users\dma02\AppData\Local\Temp\calibre_s0fbfhfj\oppqayyg_count_pages\1539.epub
-------------------------------
Logfile for book ID 1539 (Ninfee nere)
	Method of counting _page_count_mode=Download _download_sources=[('goodreads', '30831231')]
	results= {'PageCount': None, 'WordCount': 100640, 'FleschReading': 57.49500249521196, 'FleschGrade': 7.180003232503573, 'GunningFog': 12.521758139287016}
	FAILED TO GET PAGE COUNT FROM WEBSITE
	Found 100640 words
	Computed 57.5 Flesch Reading
	Computed 7.2 Flesch-Kincaid Grade
	Computed 12.5 Gunning Fog Index
1539
do_statistics_for_book:  C:\Users\dma02\AppData\Local\Temp\calibre_s0fbfhfj\oppqayyg_count_pages\1539.epub 2 Download [('goodreads', '30831231')] ['PageCount', 'WordCount', 'FleschReading', 'FleschGrade', 'GunningFog'] 1500 True
DownloadPagesWorker::run - source_id=30831231, source_name=goodreads
DownloadPagesWorker::run - PAGE_DOWNLOADS[source_name]={'URL': 'http://www.goodreads.com/book/show/%s', 'pages_xpath': '//div[@id="details"]/div[@class="row"]/span[@itemprop="numberOfPages"]/text()', 'name': 'Goodreads', 'id': 'goodreads', 'icon': 'images/goodreads.png', 'active': True}
DownloadPagesWorker::run - self.pages_regex=None
Download source book url: 'http://www.goodreads.com/book/show/30831231'
Failed to parse download source details page: 'http://www.goodreads.com/book/show/30831231'
	Word count using icu_wordcount - trying to count_words
	Word count - used count_words: 100640
	Word count: 100640
	Results of NLTK text analysis:
	  Number of characters: 545458
	  Number of words: 111245
	  Number of sentences: 14245
	  Number of syllables: 185952
	  Number of complex words: 26137
	  Average words per sentence: 7.809406809406809
For this book, using language=ita
	Flesch Reading Ease: 57.49500249521196
	Flesch Kincade Grade: 7.180003232503573
	Gunning Fog: 12.521758139287016
Traceback (most recent call last):
  File "calibre_plugins.count_pages.download", line 77, in _get_details
  File "site-packages\lxml\html\__init__.py", line 875, in fromstring
  File "site-packages\lxml\html\__init__.py", line 761, in document_fromstring
  File "src/lxml/etree.pyx", line 3237, in lxml.etree.fromstring
  File "src/lxml/parser.pxi", line 1896, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1777, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1082, in lxml.etree._BaseParser._parseUnicodeDoc
  File "src/lxml/parser.pxi", line 615, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 725, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 654, in lxml.etree._raiseParseError
  File "", line 1
lxml.etree.XMLSyntaxError: encoding not supported USC4 little endian, line 1, column 1
That is an encoding or language issue. I have attached a beta that should fix this. I have also added the Czech support that @seeder supplied last month, which I hadn't had a chance to integrate until now.

The changes in the beta are:
  • Fix: Wasn't getting the series info.
  • New: Czech translation - thanks to seeder
  • New: Add download page count from databazeknih.cz and cbdb.cz - thanks to seeder

I haven't done a lot of testing of these changes. The language makes it a little difficult for me. If anyone sees any problems, please report them here with examples so that I can look at them.


Edit:
I have replaced the attachment as I realised I had left a debug statement in the code that would break on most systems. But, I don't think anyone downloaded the beta.
Attached Files
File Type: zip Count Pages-beta.zip (307.1 KB, 271 views)

Last edited by davidfor; 09-27-2020 at 06:36 AM. Reason: Updated attachment as I left a debug statement in.
Old 09-27-2020, 05:39 PM   #1382
scorpion2782
Enthusiast
Posts: 30
Karma: 10
Join Date: Aug 2011
Device: Kobo Glo HD
Quote:
Originally Posted by davidfor View Post
That is an encoding or language issue. I have attached a beta that should fix this. I have also added the Czech support that @seeder supplied last month, which I hadn't had a chance to integrate until now.

The changes in the beta are:
  • Fix: Wasn't getting the series info.
  • New: Czech translation - thanks to seeder
  • New: Add download page count from databazeknih.cz and cbdb.cz - thanks to seeder

I haven't done a lot of testing of these changes. The language makes it a little difficult for me. If anyone sees any problems, please report them here with examples so that I can look at them.


Edit:
I have replaced the attachment as I realised I had left a debug statement in the code that would break on most systems. But, I don't think anyone downloaded the beta.

Thank you very much. There are no more errors in the log, as shown below, but the number of pages is still set to the calculated one (in this case 12) instead of the value downloaded from Goodreads (400).

Code:
Count Page/Word Statistics
do_count_statistics - book_path=C:\Users\dma02\AppData\Local\Temp\calibre_lxwdkr41\1ihzbrpw_count_pages\1539.epub, pages_algorithm=2, page_count_mode=Download, statistics_to_run=['PageCount', 'WordCount', 'FleschReading', 'FleschGrade', 'GunningFog'], custom_chars_per_page=1500, icu_wordcount=True
do_count_statistics - job started for file book_path=C:\Users\dma02\AppData\Local\Temp\calibre_lxwdkr41\1ihzbrpw_count_pages\1539.epub
-------------------------------
Logfile for book ID 1539 (Ninfee nere)
	Method of counting _page_count_mode=Download _download_sources=[('goodreads', '30831231')]
	results= {'download_source': 'goodreads', 'PageCount': 400, 'WordCount': 100640, 'FleschReading': 57.49500249521196, 'FleschGrade': 7.180003232503573, 'GunningFog': 12.521758139287016}
	Downloaded page count from Goodreads: 400
	Found 100640 words
	Computed 57.5 Flesch Reading
	Computed 7.2 Flesch-Kincaid Grade
	Computed 12.5 Gunning Fog Index
1539
do_statistics_for_book:  C:\Users\dma02\AppData\Local\Temp\calibre_lxwdkr41\1ihzbrpw_count_pages\1539.epub 2 Download [('goodreads', '30831231')] ['PageCount', 'WordCount', 'FleschReading', 'FleschGrade', 'GunningFog'] 1500 True
DownloadPagesWorker::run - source_id=30831231, source_name=goodreads
DownloadPagesWorker::run - PAGE_DOWNLOADS[source_name]={'URL': 'http://www.goodreads.com/book/show/%s', 'pages_xpath': '//div[@id="details"]/div[@class="row"]/span[@itemprop="numberOfPages"]/text()', 'name': 'Goodreads', 'id': 'goodreads', 'icon': 'images/goodreads.png', 'active': True}
DownloadPagesWorker::run - self.pages_regex=None
Download source book url: 'http://www.goodreads.com/book/show/30831231'
_get_details: len(raw)= 479061
_parse_page_count: start
_parse_page_count: root.__class__= _ElementTree
_parse_page_count: pages_xpath='//div[@id="details"]/div[@class="row"]/span[@itemprop="numberOfPages"]/text()', =pages_regex='None'
_parse_page_count: pages= ['400 pages']
_parse_page_count: pages[0]= 400 pages
_parse_page_count: pages_regex= None
_parse_page_count: pages_text= 400
_parse_page_count: end
	Word count using icu_wordcount - trying to count_words
	Word count - used count_words: 100640
	Word count: 100640
	Results of NLTK text analysis:
	  Number of characters: 545458
	  Number of words: 111245
	  Number of sentences: 14245
	  Number of syllables: 185952
	  Number of complex words: 26137
	  Average words per sentence: 7.809406809406809
For this book, using language=ita
	Flesch Reading Ease: 57.49500249521196
	Flesch Kincade Grade: 7.180003232503573
	Gunning Fog: 12.521758139287016
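For anyone following the `_parse_page_count` lines in the log above: the scraped text is '400 pages', no `pages_regex` is set, and the value that comes out is 400. Here is a minimal sketch of that kind of extraction. It is not the plugin's actual code. The function name `extract_page_count` and the fallback of taking the first run of digits are assumptions made for illustration.

```python
import re
from typing import Optional


def extract_page_count(pages_text: str, pages_regex: Optional[str] = None) -> Optional[int]:
    """Pull an integer page count out of scraped text such as '400 pages'.

    Hypothetical helper: the real plugin may do this differently.
    """
    if pages_regex:
        match = re.search(pages_regex, pages_text)
        if not match:
            return None
        # Use the first capture group if the pattern has one, else the whole match.
        pages_text = match.group(1) if match.groups() else match.group(0)
    # Take the first run of digits in whatever text is left.
    digits = re.search(r"\d+", pages_text)
    return int(digits.group(0)) if digits else None


print(extract_page_count("400 pages"))  # 400
```

A site-specific pattern can be passed as the second argument when the page count is buried in a longer string.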
Old 09-27-2020, 08:36 PM   #1383
davidfor
Quote:
Originally Posted by scorpion2782 View Post
Thank you very much. There are no more errors in the log, as shown below, but the number of pages is still set to the calculated one (in this case 12) instead of the value downloaded from Goodreads (400).

The log doesn't show any problems. It does suggest you have a problem with the plugin configuration. As the page count is being set to "12", and the "Gunning Fog" value is "12.5", I suspect that you have these two values going into the same column. Or swapped or something like that.
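A quick way to check for this kind of clash is to look for a column that receives more than one statistic. This is only a sketch: the statistic names come from the log, but the dictionary layout and the `#...` lookup names are hypothetical and do not reflect how the plugin actually stores its settings.

```python
from collections import defaultdict


def find_shared_columns(column_map):
    """Return {column: [statistics]} for every column that receives more than one statistic."""
    by_column = defaultdict(list)
    for statistic, column in column_map.items():
        if column:  # skip statistics that are not stored anywhere
            by_column[column].append(statistic)
    return {col: stats for col, stats in by_column.items() if len(stats) > 1}


# Hypothetical mapping of statistic -> custom column lookup name.
example = {
    "PageCount": "#pages",
    "WordCount": "#words",
    "FleschReading": "#flesch",
    "FleschGrade": "#grade",
    "GunningFog": "#pages",  # same column as PageCount, so one value would overwrite the other
}

print(find_shared_columns(example))  # {'#pages': ['PageCount', 'GunningFog']}
```

An empty result means every statistic has its own column, in which case the problem is more likely two columns being swapped.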
Old 10-30-2020, 07:26 AM   #1384
jaad34
Junior Member
Posts: 3
Karma: 10
Join Date: Oct 2020
Location: Somewhere in Spain
Device: iPad
Game over?

The plugin does not work on calibre 5.4.1 (64-bit).

The following error message appears: "ModuleNotFoundError: No module named 'Win32com' "

Thanks in advance.
Old 10-30-2020, 09:05 AM   #1385
mbovenka
Wizard
Posts: 2,084
Karma: 14079267
Join Date: Oct 2007
Location: Almere, The Netherlands
Device: Kobo Sage
Quote:
Originally Posted by jaad34 View Post
The plugin does not work on calibre 5.4.1 (64-bit).

The following error message appears: "ModuleNotFoundError: No module named 'Win32com' "
WorksForMe™
Old 10-30-2020, 09:17 AM   #1386
davidfor
Quote:
Originally Posted by jaad34 View Post
The plugin does not work on calibre 5.4.1 (64-bit).

The following error message appears: "ModuleNotFoundError: No module named 'Win32com' "
To the best of my knowledge, and search abilities, the Count Pages plugin does not use the Win32com module. And I have used it with calibre 5.4.1 without error. Can you post the full details of the error message? Then I can see exactly where it is failing. Plus, what type of book did you run the plugin on?
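While collecting the full error, it can also help to check whether the module named in the message can be found at all. The sketch below uses only the standard library and would need to be run in the same Python environment that raises the error. The helper name `module_available` is made up for illustration. Note that Python matches module names case-sensitively by default, so `Win32com` and `win32com` are different names as far as the import system is concerned.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the import system can locate a module called `name`."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised for some malformed names or broken parent packages.
        return False


# Compare the spelling from the error message with the all-lowercase spelling.
for candidate in ("Win32com", "win32com"):
    print(candidate, module_available(candidate))
```

The output depends on the environment, so none is shown here. It is only a first check and does not replace the full traceback.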
Old 10-30-2020, 02:28 PM   #1387
theducks
Well trained by Cats
Posts: 31,164
Karma: 60406498
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by jaad34 View Post
The plugin does not work on calibre 5.4.1 (64-bit).

The following error message appears: "ModuleNotFoundError: No module named 'Win32com' "

Thanks in advance.
Count Pages works for me (note: version 1.10.1).
Old 10-30-2020, 04:17 PM   #1388
BetterRed
null operator (he/him)
Posts: 21,881
Karma: 30277270
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by theducks View Post
Count Pages works for me (note: version 1.10.1).


However, since 1.10.1 was released in June, a beta was attached to post #1381 in Sept by davidfor.

BR
Old 10-30-2020, 07:26 PM   #1389
davidfor
Quote:
Originally Posted by BetterRed View Post


However, since 1.10.1 was released in June, a beta was attached to post #1381 in Sept by davidfor.
I forgot all about that. But, it doesn't have changes in it that could be related to that error.
Old 10-30-2020, 07:38 PM   #1390
davidfor
Update to version 1.11.0

I have just released version 1.11.0 of the plugin. The changes are:
  • Fix: Errors parsing non-English pages when downloading page count.
  • New: Czech translation - thanks to seeder
  • New: Add download page count from databazeknih.cz and cbdb.cz - thanks to seeder.

The changes are the same as the beta released in September.

Calibre should notify you of the update in the next hour or so. If there are any problems, please report them here.
Old 11-01-2020, 09:51 AM   #1391
jist
Enthusiast
Posts: 40
Karma: 10
Join Date: May 2018
Device: Onyx Note Lite - Win10
I think I mentioned this before a long time ago somewhere: this is a great plugin.
It might well be a candidate for integration into calibre by default.
So thanks, davidfor, for creating and maintaining it!

I have noticed one problem when counting PDFs, though.
Some PDFs are problematic to count.
The process keeps running without end and without a result.
These PDFs may be slightly corrupt, very complicated, not up to standard, etc. I don't know.

The problem is that for those there will be no pop-up or notification that they are problematic.
If you get impatient, you can click 'jobs', and if it shows something like 1%, and after 5 minutes it still shows that, you know you'd better cancel the process.

The second issue is, when you do that, there is a process that will keep running in Windows.
The process is called pdftohtml.exe
It will even keep running if you close or restart Calibre.

Cut to the chase:
1. Perhaps show a notification when a PDF seems problematic to count, and ask whether you'd like to end the process?
2. When the plugin is stopped manually or calibre is shut down, stop the pdftohtml.exe process as well?

Last edited by jist; 11-01-2020 at 09:54 AM.
Old 11-01-2020, 09:57 AM   #1392
theducks
Read the conversion Sticky on PDF

The same applies here: the process you needed to kill is the same one used for conversion, because a conversion is what is being done. The output is just binned after the count.
PDF is a cr*p format to do anything except PRINT
Old 11-01-2020, 10:04 AM   #1393
jist
Quote:
Originally Posted by theducks View Post
Read the conversion Sticky on PDF
I know and understand that PDF is a problematic format for conversion.

I am not complaining that the page count plugin is not able to count all PDFs.
(for me it does well on the majority of them though)

This is about (not) notifying the user, and (not) ending a process that keeps running in a hidden fashion even after ending the plugin or restarting Calibre.

edit:
A new thought came up:
While I can imagine that counting words from a pdf can be problematic and process-freezing sometimes, I am guessing that counting the pages usually won't pose serious problems.

So maybe when counting words 'gets stuck', propose to only write the page count and terminate word count?

Last edited by jist; 11-01-2020 at 11:23 AM.
Old 11-01-2020, 03:20 PM   #1394
compurandom
Wizard
Posts: 1,018
Karma: 500000
Join Date: Jun 2015
Device: Rocketbook, kobo aura h2o, kobo forma, kobo libra color
Some of these "uncountable" PDFs seem to get into what looks like an infinite loop extracting small images. I looked in the tmp dir for one and found several thousand identical (same md5sum) very small images after I aborted it. I might be able to dig up a URL for one if you need an example.
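In case anyone wants to check their own temp directory for the same symptom, here is a small sketch that groups files by MD5 digest and reports the digests shared by several files. The function names are made up for illustration, and the demo at the end uses a throwaway directory with invented file names rather than real extraction output.

```python
import hashlib
import tempfile
from collections import defaultdict
from pathlib import Path


def group_by_md5(directory):
    """Map each MD5 digest to the files under `directory` (recursively) that have it."""
    groups = defaultdict(list)
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file():
            groups[hashlib.md5(path.read_bytes()).hexdigest()].append(path)
    return groups


def duplicated_files(directory, min_copies=2):
    """Return only the digests shared by at least `min_copies` files."""
    return {d: p for d, p in group_by_md5(directory).items() if len(p) >= min_copies}


# Demo on a throwaway directory: three identical files and one different file.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(3):
        Path(tmp, f"img{i}.png").write_bytes(b"same bytes")
    Path(tmp, "other.png").write_bytes(b"different bytes")
    demo_counts = sorted(len(paths) for paths in duplicated_files(tmp).values())

print(demo_counts)  # [3]
```

A single digest with thousands of files would match the pattern described above.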
Old 11-02-2020, 04:18 AM   #1395
davidfor
Quote:
Originally Posted by jist View Post
I know and understand that PDF is a problematic format for conversion.

I am not complaining that the page count plugin is not able to count all PDFs.
(for me it does well on the majority of them though)

This is about (not) notifying the user, and (not) ending a process that keeps running in a hidden fashion even after ending the plugin or restarting Calibre.

edit:
A new thought came up:
While I can imagine that counting words from a pdf can be problematic and process-freezing sometimes, I am guessing that counting the pages usually won't pose serious problems.

So maybe when counting words 'gets stuck', propose to only write the page count and terminate word count?
The issue is defining "gets stuck". If a PDF with one word on one page takes hours to count, it is reasonable to think something went wrong. But for a 100MB file, how do you know? It could just be taking a very long time and will work in a number of hours. Or it could have hung.

Separating the page and word count for PDF might make sense. I have to check the code. I don't know exactly how they work. The page count might come from a metadata attribute of some sort. If so, it might be possible to run them as separate jobs. That way you would get the page count even if you didn't get the word count. But, I'm honestly not sure if it is worth the hassle. How often does it actually fail?
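To illustrate why a page count can be much cheaper than a word count, here is a crude heuristic that counts page objects directly in the raw PDF bytes without extracting any text. This is not how the plugin or calibre does it, and the function name is made up. It will undercount PDFs that keep their page objects in compressed object streams and may overcount files with incremental updates, so treat it as a sketch only.

```python
import re

# Matches "/Type /Page" but not "/Type /Pages" (the page-tree node).
_PAGE_OBJECT = re.compile(rb"/Type\s*/Page(?![a-zA-Z])")


def rough_pdf_page_count(pdf_bytes: bytes) -> int:
    """Count page objects by pattern matching on raw bytes (crude heuristic)."""
    return len(_PAGE_OBJECT.findall(pdf_bytes))


# Synthetic fragment with one page-tree node and two page objects.
sample = (
    b"1 0 obj << /Type /Pages /Count 2 >> endobj "
    b"2 0 obj << /Type /Page >> endobj "
    b"3 0 obj <</Type/Page>> endobj"
)
print(rough_pdf_page_count(sample))  # 2
```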

For killing the job, I think it is out of my hands. Calibre is doing that and I don't know if the plugin gets a chance to kill any sub-processes. I'll look when I have a chance.
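For reference, the general pattern for not leaving a child process behind looks something like the sketch below. It is plain Python and not calibre's job system: `run_with_timeout` is a made-up name, the timeout values are arbitrary, and the Python interpreter itself stands in for pdftohtml.exe. It only helps if the code that started the child gets a chance to run its cleanup, which is exactly the open question here, and `kill()` only reaches the direct child, not anything that child started.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_seconds):
    """Run `cmd` and return (finished, stdout). The child is reaped before returning."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
    try:
        stdout, _ = proc.communicate(timeout=timeout_seconds)
        return True, stdout
    except subprocess.TimeoutExpired:
        return False, b""
    finally:
        # Runs on timeout and on any other exception raised while waiting.
        if proc.poll() is None:
            proc.kill()
            proc.communicate()  # reap the process and drain its pipes


# Stand-in commands: one that finishes quickly and one that "hangs".
quick = [sys.executable, "-c", "print('done')"]
stuck = [sys.executable, "-c", "import time; time.sleep(60)"]

quick_result = run_with_timeout(quick, 30)
stuck_result = run_with_timeout(stuck, 1)
print(quick_result[0], stuck_result[0])  # True False
```

Picking the timeout is the hard part, for the reason given above: a fixed limit that suits a small file will cut off a large one that is merely slow.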

And for the record, I didn't create the plugin. The kiwidude is the one to congratulate for that. And a pile of other excellent plugins.