#961 | |
Grand Sorcerer
Posts: 24,905
Karma: 47303824
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
Quote:
I suppose the same could be done for AZW3, but I have no idea of the format's internal structure, so I don't know whether it is constructed in a suitable way. Plus, whatever compression is used will affect it. I'll put it on the list to look at, but I can't promise anything.
|
#962 | |
Resident Curmudgeon
Posts: 80,083
Karma: 147983159
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Quote:
ICU on
Code:
Page count: 255.0
Word count using icu_wordcount - trying to count_words
Word count - used count_words: 94052
Word count: 94052
Code:
Page count: 255.0
Word count using older method - trying to count_words
Word count: 96049
Last edited by JSWolf; 01-04-2017 at 07:40 PM.
|
#963 |
null operator (he/him)
Posts: 21,836
Karma: 30277270
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
@JSWolf - maybe hyphenated words are counted as one word by ICU and as multiple words by non-ICU.
Select Tools->Reports->Words in the calibre book editor, filter with '-', save to CSV, and get your spreadsheet to total the Times Used column. Nota bene: if my suggestion is true, then forget-me-not would be counted as three words by non-ICU.
BR
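To make the hyphenation hypothesis concrete, here is a minimal Python sketch of two counting rules: one keeps a hyphenated compound as a single word, the other splits it at each hyphen. Both regular expressions and both function names are illustrative assumptions; neither is taken from calibre, ICU, or the plugin, and the real methods may well differ in other ways.

```python
import re

# Assumption: a "word" is a run of letters or digits, optionally joined by
# apostrophes. These patterns only model the hypothesis; they are not the
# rules used by calibre, ICU, or the plugin.
HYPHEN_JOINED = re.compile(r"[^\W_]+(?:['-][^\W_]+)*")  # forget-me-not is 1 word
HYPHEN_SPLIT = re.compile(r"[^\W_]+(?:'[^\W_]+)*")      # forget-me-not is 3 words


def count_hyphen_joined(text: str) -> int:
    """Count words, treating a hyphenated compound as one word."""
    return len(HYPHEN_JOINED.findall(text))


def count_hyphen_split(text: str) -> int:
    """Count words, treating each part of a hyphenated compound separately."""
    return len(HYPHEN_SPLIT.findall(text))


sample = "A forget-me-not grew by the well-worn path."
print(count_hyphen_joined(sample))  # 7
print(count_hyphen_split(sample))   # 10
```

If the hypothesis holds, the gap between the two methods should be roughly the number of hyphens in each hyphenated word multiplied by its Times Used value, summed over the filtered report.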
#964 | |
Grand Sorcerer
Posts: 24,905
Karma: 47303824
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
Quote:
In any case, go back to the discussion in this thread at this time last year. You were the one who started that "discussion" by pointing out a possible bug. And that was the point of adding the ICU method, as it seemed to handle some things in a better way and was language aware. Back then, I did post explanations of some of the differences if you want to look.
Also, both methods rely on code in calibre. If that is updated, then it might change the count the plugin produces.
Personally, I expect both numbers to be wrong. I tend to think the ICU method is the more accurate, but that is based on me counting very small samples. I take all the statistics produced by the plugin as approximations. And during last year's discussion I was very tempted to introduce a "nearest 1000" option. Of course, that would raise the argument of rounding vs truncating.
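For what the rounding vs truncating argument would look like in practice, here is a small Python sketch. The function names and the `step` parameter are hypothetical and not part of the plugin; integer arithmetic is used so that a count ending in exactly 500 always rounds up.

```python
def round_to_nearest(count: int, step: int = 1000) -> int:
    """Round a non-negative count to the nearest multiple of step (half rounds up)."""
    return (count + step // 2) // step * step


def truncate_to(count: int, step: int = 1000) -> int:
    """Drop everything below the last full multiple of step."""
    return count // step * step


# The first two counts are the ones reported earlier in the thread;
# 94500 is a made-up value chosen to show where the two options differ.
for count in (94052, 96049, 94500):
    print(count, round_to_nearest(count), truncate_to(count))
```

For the two counts from the thread, both options give the same result (94000 and 96000), so the choice only matters for counts in the upper half of each thousand.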
|
#965 |
Resident Curmudgeon
Posts: 80,083
Karma: 147983159
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
The ICU count is more accurate than the non-ICU count; it gets closer to the correct word count. It differs by only 36 words from the count Word 2016 produces when the ePub is converted to RTF. Not a lot and not enough to be bothered with. I've scrambled and posted the ePub in the message with the counts if you are interested in seeing it.
|
#966 | |
Grand Sorcerer
Posts: 24,905
Karma: 47303824
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
Quote:
And for reference, the editor is using the ICU algorithm. There is a difference in the counts between the plugin and the editor. I haven't gotten around to looking at what the difference is yet. |
|
#967 |
Resident Curmudgeon
Posts: 80,083
Karma: 147983159
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
I have it... Convert to text, count the words, delete the text version and done.
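A rough Python sketch of that convert, count, delete idea follows. It assumes calibre's `ebook-convert` command-line tool is on the PATH, and it counts whitespace-separated tokens, which is a simplification and not necessarily what the plugin or Word does. The function names are made up for illustration.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Callable


def convert_with_calibre(source: Path, target: Path) -> None:
    """Convert a book to plain text. Assumes ebook-convert is on the PATH."""
    subprocess.run(["ebook-convert", str(source), str(target)], check=True)


def count_words_via_text(
    source: Path,
    convert: Callable[[Path, Path], None] = convert_with_calibre,
) -> int:
    """Convert to text, count the words, and discard the text version."""
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "book.txt"
        convert(source, target)
        text = target.read_text(encoding="utf-8", errors="replace")
    # The temporary directory and the text file are gone at this point.
    # Simplification: a "word" is any whitespace-separated token.
    return len(text.split())


# Usage (needs calibre installed): count_words_via_text(Path("book.epub"))
```

The converter is passed in as a parameter so the counting step can be tried with any text source without running a real conversion.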
|
#968 | |
Grand Sorcerer
Posts: 24,905
Karma: 47303824
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
Quote:
As I said, I take either count as an approximation. Until someone demonstrates that one or the other is wrong, and how, I am going to accept that they work. |
|
#969 | |
null operator (he/him)
Posts: 21,836
Karma: 30277270
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
I often remove the description, subjects, etc. from the content.opf (with Sigil) before I use calibre's spell checker. Why calibre's spell checker and not Sigil's? Because it's multilingual. I would prefer that the content.opf file not be included in the plugin's calculations. I'd also quite like some front and back matter to be excluded, but that's a much bigger ask.
BR
|
#970 |
Well trained by Cats
Posts: 31,147
Karma: 60406498
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
|
#971 | |||
Grand Sorcerer
Posts: 24,905
Karma: 47303824
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
Quote:
Quote:
Quote:
|
|||
#972 | |
null operator (he/him)
Posts: 21,836
Karma: 30277270
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
IIRC someone wrote in the earlier discussion, "The only time word count accuracy matters is if someone is paying for or being paid for the words." BR |
|
#973 |
Grand Sorcerer
Posts: 24,905
Karma: 47303824
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
YAB - Use Preferred Input Format
OK, here's another beta with the option that BR asked for. The text for the label and tooltip is somewhere between what I had and BR had.
Other than fine-tuning the labels and tooltips or if someone finds a bug, I'm planning for this to be the last beta. |
#974 |
null operator (he/him)
Posts: 21,836
Karma: 30277270
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
@davidfor - works for me, and as I hoped (knew) it would be, it's a heck of a lot faster now that it's not redoing the conversion.
Thanks again.
BR
I don't think I'll ever understand why it didn't work this way from the get-go.
#975 | |
Resident Curmudgeon
Posts: 80,083
Karma: 147983159
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Quote:
I will leave it up to you to say if this difference is enough to warrant any more changes. |
|
Tags |
count, count pages, page count, pages, plugin |