#1
Addict
Posts: 300
Karma: 2000410
Join Date: Jan 2012
Device: Kindle 4

Easiest way to count the occurrence of a word across a few EPUB books?
I see we have dedicated threads for several different e-book programs, which is good. I have no idea which program's thread to ask in, so maybe I should ask here?
I want to count the occurrences of a word across a few EPUB books. What's the easiest or best way to do that? The search function didn't really give me a clue about this type of search.

#2
350 Hoarder
Posts: 3,574
Karma: 8281267
Join Date: Dec 2010
Location: Midwest USA
Device: Sony PRS-350, Kobo Glo & Glo HD, PW2

I'm more familiar with Sigil since I use it most often. When you search for a word in Sigil, click the "Count All" button in the lower right corner and it will show you the number of times the word occurs. Then do the same for each of the other books.
Others may have their own favorite software for this kind of thing.

#3
hopeless n00b
Posts: 5,110
Karma: 19597086
Join Date: Jan 2009
Location: in the middle of nowhere
Device: PW4, PW3, Libra H2O, iPad 10.5, iPad 11, iPad 12.9

I keep plain-text copies of my ebooks and just use the Notepad++ "Find in Files" feature when I need to search for something.
If you're running Linux, I think there may be a Linux-specific plugin for Calibre that will do full-text searches.
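For anyone who prefers a command line, the same "Find in Files" idea can be sketched in a few lines of shell. This is only a rough sketch, not what Notepad++ does internally: the function name count_in_texts is made up for illustration, and it assumes the books have already been saved as plain-text files. It uses grep -o so that every occurrence is counted, because grep -c would count matching lines instead.

```shell
#!/bin/bash
# Hypothetical helper (not from this thread): count_in_texts WORD FILE...
# Prints "count<TAB>file" for each file, counting whole-word,
# case-insensitive occurrences of WORD.
count_in_texts() {
    local word=$1 f n
    shift
    for f in "$@"; do
        # -o prints each match on its own line, so wc -l counts occurrences.
        # "|| true" keeps a file with zero matches from looking like an error.
        n=$( { grep -o -i -w -- "$word" "$f" || true; } | wc -l | tr -d ' ' )
        printf '%s\t%s\n' "$n" "$f"
    done
}
```

After sourcing the file, something like count_in_texts sea *.txt prints one line per book.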

#4
Karmaniac
Posts: 2,553
Karma: 11499146
Join Date: Oct 2008
Location: Miami FL
Device: PRS-505, Jetbook, + Mini, +Color, Astak Ez Reader Pro, PPW1, Aura H2O

Well, your best bet is Kindle's X-Ray.
It reports per book, so if you have a Kindle with X-Ray, just check each book for the word in X-Ray. That's the easiest solution. Another option is to find and download a digital version of the book online (TXT, EPUB, PDF, ...) and search it with a word processor on a PC.

#5
Addict
Posts: 300
Karma: 2000410
Join Date: Jan 2012
Device: Kindle 4

Thanks to all!

#6
Addict
Posts: 281
Karma: 7724454
Join Date: Sep 2017
Location: Bethesda, MD, USA
Device: Kobo Aura H20, Kobo Clara HD

On the "Search" dialog, change the "Mode" dropdown at the bottom from "Current File" to "All HTML Files". Then run "Count All" again.

#7
Grand Sorcerer
Posts: 28,341
Karma: 203719646
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD

For the record: Sigil doesn't dissect the book into individual html files; the ebook's creator does (because it's common practice to do so). Sigil will happily allow someone to create an epub with one monstrous html file if they like.

#8
350 Hoarder
Posts: 3,574
Karma: 8281267
Join Date: Dec 2010
Location: Midwest USA
Device: Sony PRS-350, Kobo Glo & Glo HD, PW2

Quote:

#9
hopeless n00b
Posts: 5,110
Karma: 19597086
Join Date: Jan 2009
Location: in the middle of nowhere
Device: PW4, PW3, Libra H2O, iPad 10.5, iPad 11, iPad 12.9

Quote:
Screenshots are attached for reference.
For me, Notepad++ (or a similar text editor) is the easiest option, since I sometimes have to search through hundreds of different ebooks (typically for lines that I remember but didn't annotate, or whose annotation I lost).
Last edited by ilovejedd; 02-28-2018 at 10:55 AM.

#10
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook

Quote:
https://www.mobileread.com/forums/sh...d.php?t=169744
merge all of them into one behemoth EPUB, and then open it in Calibre's Editor and use Tools > Check Spelling.
AZARDI is an EPUB reader that lets you search across multiple EPUBs:
http://azardi.infogridpacific.com/
You have to right-click each EPUB and "Index" it, but after that you can search across them freely.
May I ask what the use case is? Are you trying to see how often a character is mentioned in a series?
Last edited by Tex2002ans; 02-28-2018 at 10:19 PM.

#11
Bibliophagist
Posts: 44,568
Karma: 167913281
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos

Hmmm... Enter the text to find in the Find: box, set Mode to Normal and All HTML Files (Up/Down doesn't matter if Wrap is checked), and click Count All.
Last edited by DNSB; 03-01-2018 at 01:10 PM. Reason: fat fingers cause typos...

#12
Karmaniac
Posts: 2,553
Karma: 11499146
Join Date: Oct 2008
Location: Miami FL
Device: PRS-505, Jetbook, + Mini, +Color, Astak Ez Reader Pro, PPW1, Aura H2O

Unpack the ebooks with a compression tool (an EPUB file is a ZIP archive).
Search Google for a program. Options include '7z' (free program), 'winrar' (free trial), or 'winzip' (evaluation).
Then open the unpacked files in an HTML editor, a word processor, or, worst case, an advanced notepad, and search there:
- Free HTML editor: NVU, Kompozer, Microsoft FrontPage
- Free office and word editor: Apache OpenOffice
- Notepads (free): Notepad++, or the Notepad or WordPad included with Windows.

#13
Guru
Posts: 733
Karma: 5797160
Join Date: Jun 2010
Location: Istanbul
Device: Kobo Libra

Under macOS or Linux it is pretty easy. Create this bash script and make it executable:
Code:
#!/bin/bash
# Usage: grep-epub PATTERN book1.epub [book2.epub ...]
# Note: grep -P (Perl regex) requires GNU grep.
PAT=${1:?"Usage: grep-epub PAT *.epub files to grep"}
shift
: "${1:?Need epub files to grep}"
for i in "$@"; do
    echo "$0" "$i"
    # Dump the text files in the archive, strip tags, show each match with up to 60 characters of context.
    unzip -p "$i" "*.htm*" "*.xml" "*.opf" |
        perl -lpe 's![<][^>]{1,200}?[>]!!g;' |
        grep -Pinaso ".{0,60}$PAT.{0,60}" |
        grep -Pi --color "$PAT"
done
Code:
sudo ln -s ~/Apps/CLI/grep-epub.sh /usr/local/bin/grep-epub
To find all occurrences of sea in Dubliners:
Code:
grep-epub "sea" Dubliners.epub
Code:
grep-epub " sea " Dubliners.epub
In Windows, you can use this with the Windows Subsystem for Linux.
Last edited by GERGE; 03-10-2018 at 03:48 AM.
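If only the totals are wanted rather than the matching lines, a counting variant of the same unzip-and-strip idea might look like the sketch below. The function names count_word and count_in_epubs are hypothetical. The sed expression only strips tags that open and close on the same line, and only the *.htm* members of each EPUB are read, so treat the numbers as approximate.

```shell
#!/bin/bash
# Hypothetical helper: count_word WORD
# Reads HTML/XHTML on stdin and prints how many times WORD occurs
# as a whole word, ignoring case.
count_word() {
    # Replace tags with a space, print each match on its own line, count lines.
    # "|| true" keeps zero matches from being treated as a failure.
    sed -e 's/<[^>]*>/ /g' | { grep -o -i -w -- "$1" || true; } | wc -l | tr -d ' '
}

# Hypothetical wrapper: count_in_epubs WORD book1.epub [book2.epub ...]
# Prints "count<TAB>book" per EPUB, then a grand total.
count_in_epubs() {
    local word=$1 total=0 n book
    shift
    for book in "$@"; do
        # unzip -p writes the matching archive members to stdout.
        n=$(unzip -p "$book" '*.htm*' 2>/dev/null | count_word "$word")
        printf '%s\t%s\n' "$n" "$book"
        total=$((total + n))
    done
    printf '%s\ttotal\n' "$total"
}
```

After sourcing the file, count_in_epubs sea *.epub prints one line per book plus the total. The -w flag limits matches to the whole word, so "seaside" is not counted.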

#14
Resident Curmudgeon
Posts: 78,985
Karma: 144284074
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3

Quote: