Old 01-08-2016, 05:17 PM   #841
JSWolf
Resident Curmudgeon
Posts: 80,031
Karma: 147977995
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
The book I noticed the counting errors with...

Old Method - 169541
ICU Method - 168183
Word 2016 - 168187

For Word, I used Calibre to convert to text and loaded the text version into Word to get the count. So yes, the ICU method is a lot more accurate.
Old 01-08-2016, 05:21 PM   #842
BetterRed
null operator (he/him)
Posts: 21,826
Karma: 30277270
Join Date: Mar 2012
Location: Sydney Australia
Device: none
I thought the existence of a space adjacent to an ellipsis was germane to why the ellipsis is there.

From memory, CMS puts a space before and after the ellipsis to indicate missing words; to indicate an unfinished sentence, it uses no space before and a period after. MLA puts [] around the ellipsis to indicate missing words; IMO that looks better if words are omitted from the start of a sentence.

But as Jefferson said - "On matters of style swim with the current, on matters of principle stand like a rock."

BR

Last edited by BetterRed; 01-09-2016 at 02:22 AM.
Old 01-09-2016, 03:37 PM   #843
Divingduck
Wizard
Posts: 1,166
Karma: 1410083
Join Date: Nov 2010
Location: Germany
Device: Sony PRS-650
@davidfor,
Thanks for the beta. I ran a test against some German EPUB files in different genres. As a reference, I converted each book to DOCX and took the MS Word statistics. Here are the results.

1. A Federal Agency for Civic Education document
(Link:http://www.bpb.de/system/files/datei/APuZ_2015-52.epub)
Count Page/Word Statistics
Word count - old method: 30,349
Word count: 30,539
MS Word 2013 Statistics
Words: 29,864
Spoiler:
Count Page/Word Statistics
Logfile for book ID 15709 (Europäische Integration in der Krise)
Found 30539 words
Computed 11.9 Flesch-Kincaid Grade
Computed 39.7 Flesch Reading
Computed 12.0 Gunning Fog Index
Found 111 pages
15709
InputFormatPlugin: EPUB Input running
on C:\Users\xxxxx\AppData\Local\Temp\calibre_mediha\o zhs7n_count_pages\15709.epub
Found HTML cover OEBPS/00000000000_cover.html
Page count: 111.0
Word count - old method: 30349
Word count: 30539
Results of NLTK text analysis:
Number of characters: 199944
Number of words: 32658
Number of sentences: 1888
Number of syllables: 57866
Number of complex words: 4280
Average words per sentence: 17
Flesch Reading Ease: 39.6790997612
Flesch Kincade Grade: 11.9481633903
Gunning Fog: 12.0422071162
MS Word 2013 Statistics
Characters: 200404
Characters w space: 229956
paragraphs: 693
Words: 29864


2. Dale Brown - Außer Kontrolle (Thriller)
isbn:9783641111038
Count Page/Word Statistics
Word count - old method: 109,218
Word count: 106,689
MS Word 2013 Statistics
Words: 107,033
Spoiler:
Count Page/Word Statistics
Logfile for book ID 15535 (Außer Kontrolle)
Found 106689 words
Computed 9.3 Flesch-Kincaid Grade
Computed 56.7 Flesch Reading
Computed 10.1 Gunning Fog Index
Found 322 pages
15535
InputFormatPlugin: EPUB Input running
on C:\Users\xxxxx\AppData\Local\Temp\calibre_mediha\c lqzsw_count_pages\15535.epub
Found HTML cover titlepage.xhtml
Page count: 322.0
Word count - old method: 109218
Word count: 106689
Results of NLTK text analysis:
Number of characters: 648986
Number of words: 115430
Number of sentences: 6878
Number of syllables: 182710
Number of complex words: 10629
Average words per sentence: 16
Flesch Reading Ease: 56.6846993849
Flesch Kincade Grade: 9.32779606688
Gunning Fog: 10.0832712466
MS Word 2013 Statistic
Characters: 639284
Characters w space: 744117
paragraphs: 2857
Words: 107033


3. Umberto Eco - Die große Zukunft des Buches (non-fiction)
isbn:9783446236165
Count Page/Word Statistics
Word count - old method: 69,745
Word count: 68,944
MS Word 2013 Statistics
Words: 68,916
Spoiler:
Count Page/Word Statistics
Logfile for book ID 14875 (Die große Zukunft des Buches)
Found 68944 words
Computed 8.2 Flesch-Kincaid Grade
Computed 64.4 Flesch Reading
Computed 9.9 Gunning Fog Index
Found 212 pages
14875
InputFormatPlugin: EPUB Input running
on C:\Users\xxxxx\AppData\Local\Temp\calibre_mediha\i 6wxog_count_pages\14875.epub
Page count: 212.0
Word count - old method: 69745
Word count: 68944
Results of NLTK text analysis:
Number of characters: 383108
Number of words: 74867
Number of sentences: 4560
Number of syllables: 111639
Number of complex words: 6615
Average words per sentence: 16
Flesch Reading Ease: 64.4424975623
Flesch Kincade Grade: 8.24573911069
Gunning Fog: 9.93426743425
MS Word 2013 Statistics
Characters: 384109
Characters w space: 452814
paragraphs: 934
Words: 68916


4. Helena Marten - Die Kaffeemeisterin (historical fiction)
isbn:9783641059606
Count Page/Word Statistics
Word count - old method: 149,472
Word count: 148,460
MS Word 2013 Statistics
Words: 148,883
Spoiler:
Count Page/Word Statistics
Logfile for book ID 15637 (Die Kaffeemeisterin)
Found 148460 words
Computed 7.1 Flesch-Kincaid Grade
Computed 70.7 Flesch Reading
Computed 9.4 Gunning Fog Index
Found 454 pages
15637
InputFormatPlugin: EPUB Input running
on C:\Users\xxxxx\AppData\Local\Temp\calibre_mediha\r o4vpd_count_pages\15637.epub
Page count: 454.0
Word count - old method: 149472
Word count: 148460
Results of NLTK text analysis:
Number of characters: 813152
Number of words: 158706
Number of sentences: 10017
Number of syllables: 226847
Number of complex words: 13451
Average words per sentence: 15
Flesch Reading Ease: 70.6866814109
Flesch Kincade Grade: 7.12637304198
Gunning Fog: 9.39016798357
MS Word 2013 Statistics
Characters: 801879
Characters w space: 948783
paragraphs: 3534
Words: 148883
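
To make the four comparisons above easier to read side by side, here is a small sketch that expresses each calibre count as a signed percentage difference from the MS Word count. The numbers are copied from the results in this post; treating the Word figure as the reference is only an assumption for the sake of comparison, and the names `counts`, `percent_diff` and `deviations` are made up for this illustration.

```python
# Word counts copied from the four test books in this post.
# "old" = old method, "new" = new word count, "word" = MS Word 2013.
counts = {
    "APuZ 2015-52": {"old": 30349, "new": 30539, "word": 29864},
    "Außer Kontrolle": {"old": 109218, "new": 106689, "word": 107033},
    "Die große Zukunft des Buches": {"old": 69745, "new": 68944, "word": 68916},
    "Die Kaffeemeisterin": {"old": 149472, "new": 148460, "word": 148883},
}


def percent_diff(count, reference):
    """Signed difference of count from reference, as a percentage of reference."""
    return 100.0 * (count - reference) / reference


def deviations(book):
    """Percentage difference of both calibre counts from the Word count."""
    return {m: round(percent_diff(book[m], book["word"]), 2) for m in ("old", "new")}


for title, book in counts.items():
    d = deviations(book)
    print(f"{title}: old {d['old']:+.2f}%, new {d['new']:+.2f}%")
```

On these figures the new count is closer to the Word count than the old one in three of the four books; the first document is the exception.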
Old 01-09-2016, 08:24 PM   #844
Ravensknight
Serpent Rider
Posts: 1,123
Karma: 10219804
Join Date: Jun 2009
Device: Sony 350; Nook STR; Oasis
These past couple of pages are why I so infrequently post. Why be a complete bother to the creator of the plugin AND annoy others who use it? And all to no purpose...
Old 01-09-2016, 09:15 PM   #845
theducks
Well trained by Cats
Posts: 31,134
Karma: 60406498
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
If I were getting paid 'by the word', I could see the need for 'the most accurate precision'. (Please, please, do not get 'Weights and Measures' involved. None of the current methods would be acceptable.)

But I don't.

I find ADE page estimates (I use an RMSDK device) meet my needs for book size, just like 'heft' did in the bookstore of days past.
Old 01-10-2016, 12:19 AM   #846
DoctorOhh
US Navy, Retired
Posts: 9,897
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Kindle PaperWhite SE 11th Gen
Quote:
Originally Posted by theducks View Post
I find ADE page estimates (I use an RMSDK device) meet my needs for book size, just like 'heft' did in the bookstore of days past.
Agreed.
Old 01-10-2016, 01:22 AM   #847
BetterRed
null operator (he/him)
Quote:
Originally Posted by Ravensknight View Post
These past couple of pages are why I so infrequently post. Why be a complete bother to the creator of the plugin AND annoy others who use it? And all to no purpose...
- one might say, "Triumph of the Pedants", but one shouldn't of course. Leni Riefenstahl

BR
Old 01-10-2016, 01:47 AM   #848
rpgmaker
Connoisseur
Posts: 86
Karma: 10
Join Date: Oct 2014
Device: Kindle Paperwhite 2
Quote:
Originally Posted by Ravensknight View Post
These past couple of pages are why I so infrequently post. Why be a complete bother to the creator of the plugin AND annoy others who use it? And all to no purpose...
Bother the creator? The person who seems to be in charge of the plugin now said that he/she was bored and decided to look into it; it might as well have been ignored, but it wasn't. Annoy users? Why are people annoyed by this? The benefit is marginal, but this is nothing more than looking for a way to make the count more "accurate". I don't see the need to hate on this or consider it annoying in any way. I don't think this makes the user a "pedant"; these kinds of marginal changes are made every day in all kinds of projects, and they improve them in the long term.
Old 01-10-2016, 04:47 AM   #849
Divingduck
Wizard
Quote:
Originally Posted by Ravensknight View Post
These past couple of pages are why I so infrequently post. Why be a complete bother to the creator of the plugin AND annoy others who use it? And all to no purpose...
Anyway, it's up to you to read, ignore or post, agree or disagree.

Quote:
Originally Posted by theducks View Post
I find ADE page estimates (I use an RMSDK device) meet my needs for book size, just like 'heft' did in the bookstore of days past.
Agreed too, especially as page references are common everywhere in the world and in all kinds of fields. For this I need one method, and that is mainly the Adobe page count.

In the end, JSWolf is right to say there is something wrong with the word counting, because that is the reality. On the other hand, there is no simple solution, as the different samples show (and users who work with this information know this very well too; e.g. for technical documentation the gap is much bigger). This discussion is a good one, as it puts facts on the table for users and developers who like to think outside the box to find the optimum solution for their requirements. That is exactly why calibre is what it is, right?

Best regards,
DivingDuck

Last edited by Divingduck; 01-10-2016 at 04:49 AM.
Old 01-10-2016, 07:19 AM   #850
davidfor
Grand Sorcerer
Posts: 24,905
Karma: 47303824
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
Quote:
Originally Posted by Ravensknight View Post
These past couple of pages are why I so infrequently post. Why be a complete bother to the creator of the plugin AND annoy others who use it? And all to no purpose...
Others have commented, but personally, I see the last few pages as a healthy discussion about the plugin. Someone reported a possible problem, and the discussion has been about how useful it is to fix the bug. I like seeing this level of involvement as it helps to work out the best solution.
Old 01-10-2016, 06:29 PM   #851
BetterRed
null operator (he/him)
Quote:
Originally Posted by davidfor View Post
Others have commented, but personally, I see the last few pages as a healthy discussion about the plugin. Someone reported a possible problem, and the discussion has been about how useful it is to fix the bug. I like seeing this level of involvement as it helps to work out the best solution.
A discussion about word counting without any discussion about when hyphenation should be used is a somewhat barren discussion. The latter is to some degree a matter of style, it's also a matter where there are no hard and fast rules. Google 'when to use hyphens' and read what the writerly commentariat and grammarians have to say on hyphenation.

The existing algorithm is NOT a bug - putting a Whitworth nut on a Metric bolt is a bug. But that JSWolf regards all opinions, other than his own, as bugs, is a proven fact

It's an issue of which algorithm to use. The one that has stood those who use it in good stead for nigh on 5 years. Or one that has only become available in recent times. There would be no discussion, from me at least, if the proposal was to add an option to use the existing or the ICU algorithms when computing word counts. IMO, adding an option would be in the 'spirit' of the original developer, who usually (always ?) protected 'legacy' features. If at all possible, existing 'installs' would set the option to use the 'legacy' algorithm, new installs would default to the 'ICU' algorithm.

Support forums are riddled with complaints about Apple, MS, Google etc blithely clobbering/discontinuing existing features. Less so with IBM, if you're minded you can definitely run IEBGENER and probably DISSOS or PROFS on your shiny new z/OS system.

Facetiously, one might suggest an option to include the components of hyphenated words in the word count if they are present in designated dictionaries. Thus, the compound word 'so-called' would likely be counted as two words, whereas 'topsy-turvy' would likely be counted as one. But realistically one wouldn't — would one?

===============

An unrelated feature I'd like to see in Count Pages, is an option to use the format file with the latest file system modification date as the basis for counting. In my workflow that would avoid in-flight conversions to EPUB - because in 99% of cases, I Convert from non-EPUB to EPUB immediately prior to running Count Pages.

NB: EPUB is not even close to being near the top of my preferred input format list, although it is my designated output format. I rarely need to convert from EPUB, when I do it's unlikely I would then run Count Pages. I would typically attach the output format file to an email, send it, and then remove the format from the library.

BR
Old 01-10-2016, 06:37 PM   #852
Katsunami
Grand Sorcerer
Posts: 6,111
Karma: 34000001
Join Date: Mar 2008
Device: KPW1, KA1
For what it is worth, soft hyphenation with the Hyphenate This plugin messes up Count Pages (or the word count functions it uses). It counts a huge number of extra pages and words if you run it after running Hyphenate This.

Therefore Hyphenate This is the last plugin I run, and I only run it on the AZW3 format, which is the one I use for the Kindle Paperwhite 1.
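
As a rough illustration of why soft hyphens can inflate a count, here is a minimal sketch. It is not the plugin's or calibre's code: `naive_count` is a made-up counter that treats every non-letter character, including the soft hyphen (U+00AD), as a word break, and `count_ignoring_soft_hyphens` strips the soft hyphens first.

```python
import re

SOFT_HYPHEN = "\u00ad"  # invisible hyphenation hint inserted inside words


def naive_count(text):
    """Count runs of letters; any other character, including U+00AD, ends a word."""
    return len(re.findall(r"[^\W\d_]+", text))


def count_ignoring_soft_hyphens(text):
    """Remove soft hyphens before counting, so hyphenated words stay whole."""
    return naive_count(text.replace(SOFT_HYPHEN, ""))


# "Die Kaffeemeisterin" with soft hyphens inserted at hyphenation points.
sample = "Die Kaf\u00adfee\u00admeis\u00adte\u00adrin"
print(naive_count(sample))                  # 6
print(count_ignoring_soft_hyphens(sample))  # 2
```

Under that assumption, a two-word title is counted as six words once hyphenation points have been inserted, which matches the kind of inflation described above.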
Old 01-10-2016, 11:49 PM   #853
davidfor
Grand Sorcerer
Quote:
Originally Posted by BetterRed View Post
A discussion about word counting without any discussion about when hyphenation should be used is a somewhat barren discussion. The latter is to some degree a matter of style, it's also a matter where there are no hard and fast rules. Google 'when to use hyphens' and read what the writerly commentariat and grammarians have to say on hyphenation.
Sorry, I don't think the discussion is about when to use hyphenation. It's about what is considered a word. Or, for the sake of using an algorithm that doesn't need a dictionary or full grammar, what is a word delimiter.
Quote:
The existing algorithm is NOT a bug - putting a Whitworth nut on a Metric bolt is a bug. But that JSWolf regards all opinions, other than his own, as bugs, is a proven fact
JSWolf reported three cases. The ellipsis example is a bug (always two words) and the em-dash is probably a bug (two words in most circumstances). The en-dash is harder. I think it would be two words in more cases than one word, but I'm not sure. And Katsunami has pointed to another example: the soft hyphen should not be a word delimiter. That to me is a worse problem.
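
To make the delimiter question concrete, here is a minimal sketch, assuming a counter that splits only on whitespace and a variant that also treats the ellipsis character (U+2026) and the em-dash (U+2014) as delimiters. Neither function is the plugin's or calibre's actual code, and the delimiter set is an assumption chosen for this example; the en-dash is deliberately left out because that case is undecided.

```python
import re

# Assumed delimiter set: whitespace, horizontal ellipsis (U+2026), em dash (U+2014).
DELIMITERS = re.compile(r"[\s\u2026\u2014]+")


def count_whitespace_only(text):
    """Count tokens separated by whitespace only."""
    return len(text.split())


def count_with_delimiters(text):
    """Count tokens separated by whitespace, ellipsis characters or em dashes."""
    return len([token for token in DELIMITERS.split(text) if token])


sample = "wait\u2026what and stop\u2014now"
print(count_whitespace_only(sample))   # 3
print(count_with_delimiters(sample))   # 5

# The en dash (U+2013) is not in the assumed set, so this stays two tokens.
print(count_with_delimiters("pages 10\u201320"))  # 2
```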
Quote:
It's an issue of which algorithm to use. The one that has stood those who use it in good stead for nigh on 5 years. Or one that has only become available in recent times. There would be no discussion, from me at least, if the proposal was to add an option to use the existing or the ICU algorithms when computing word counts. IMO, adding an option would be in the 'spirit' of the original developer, who usually (always ?) protected 'legacy' features. If at all possible, existing 'installs' would set the option to use the 'legacy' algorithm, new installs would default to the 'ICU' algorithm.
The problem with the option is that there is no real long-term benefit in having it. If we were having options to use "ICU" vs "Simple spaces delimiter" vs "Some wacky word count method I found on the net", then yes, options for the choice. To me, the number of people that are interested in the exact number of words is few. And most of them will be horrified that the count is wrong and want it fixed. For most of the rest, an approximation is good enough and that is why my initial reaction was "who cares". When I looked again, the language/locale issues were what made me decide to look at the changes.

Saying that it has served the users for five years is a problem. The code for both methods is in calibre. Are you sure that the existing method hasn't changed in five years? Are you sure it won't change in the future? I'm a little surprised that when Kovid implemented the ICU method that he didn't remove the old method. Sure, he would have left the interface, but that would have just pointed to the new code.

And for changes to the algorithm, if it had been implemented completely inside Count Pages and the issue was that, for example, the ellipsis character was not in the list of word delimiter characters, I would have had no hesitation in adding it. Would you expect an option to keep the old in that case?
Quote:
Support forums are riddled with complaints about Apple, MS, Google etc blithely clobbering/discontinuing existing features. Less so with IBM, if you're minded you can definitely run IEBGENER and probably DISSOS or PROFS on your shiny new z/OS system.
"PROFS", I haven't heard mention of that for a LONG time.
Quote:
Facetiously, one might suggest an option to include the components of hyphenated words in the word count if they are present in designated dictionaries. Thus, the compound word 'so-called' would likely be counted as two words, whereas 'topsy-turvy' would likely be counted as one. But realistically one wouldn't — would one?
Sorry, I would count a hyphenated compound word as one word. Anything with hyphens connecting the parts is to me one word. That isn't the problem, it's working out what a hyphen is. And what other characters should not be treated as word delimiters.
Quote:
===============

An unrelated feature I'd like to see in Count Pages, is an option to use the format file with the latest file system modification date as the basis for counting. In my workflow that would avoid in-flight conversions to EPUB - because in 99% of cases, I Convert from non-EPUB to EPUB immediately prior to running Count Pages.

NB: EPUB is not even close to being near the top of my preferred input format list, although it is my designated output format. I rarely need to convert from EPUB, when I do it's unlikely I would then run Count Pages. I would typically attach the output format file to an email, send it, and then remove the format from the library.
I hadn't looked at that part of the code before, but when choosing which format to use, your preferred input format is used. So, if the epub is always being counted, it must be above the other formats in your calibre preferences. Looking for the most recently changed is possible, but it's enough outside the existing code that I can't say that I'm interested in adding it.
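
For anyone wondering what the two selection rules would look like, here is a minimal sketch. It is not the plugin's code: `pick_by_preference`, `pick_most_recent` and the `paths_by_format` mapping are names invented for this illustration, and the only real call used is `os.path.getmtime` from the Python standard library.

```python
import os


def pick_by_preference(paths_by_format, preferred_order):
    """Return the first format in preferred_order that the book has, else None."""
    for fmt in preferred_order:
        if fmt in paths_by_format:
            return fmt
    return None


def pick_most_recent(paths_by_format, getmtime=os.path.getmtime):
    """Return the format whose file has the latest modification time, else None."""
    if not paths_by_format:
        return None
    return max(paths_by_format, key=lambda fmt: getmtime(paths_by_format[fmt]))
```

With a book that has both an EPUB and a DOCX file, the first rule returns whichever format comes first in the preference list, while the second returns whichever file was written last, regardless of the list.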
Old 01-11-2016, 12:05 AM   #854
kovidgoyal
creator of calibre
Posts: 45,448
Karma: 27757438
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Quote:
Originally Posted by davidfor View Post
I'm a little surprised that when Kovid implemented the ICU method that he didn't remove the old method. Sure, he would have left the interface, but that would have just pointed to the new code.
That is because the old code is only used by conversion heuristics, and conversion heuristics is a bit of the conversion pipeline I don't maintain. I dislike making changes in other people's code unless there is some compelling reason to do so.
Old 01-11-2016, 03:13 AM   #855
JSWolf
Resident Curmudgeon
Quote:
Originally Posted by BetterRed View Post
The existing algorithm is NOT a bug - putting a Whitworth nut on a Metric bolt is a bug. But that JSWolf regards all opinions, other than his own, as bugs, is a proven fact
What I posted are indeed true bugs. Those six words were counted as three words when they should be six words. This is not an opinion. It's a fact. If you don't like bug reports, just ignore them.