#1
Member
Posts: 13
Karma: 2386
Join Date: Nov 2012
Device: Kindle Touch
Advanced search within ebook using application or regex
I'm using the Calibre plugin Quality Check to search books in my library. I'd like to be able to find words related to specific contexts within a book, probably searching the whole book for two or more words appearing in either a single sentence or a paragraph (possibly a section) in any order. For example, if I'm searching for "protein" as it relates to "batter", I'd like to be able to search the book for those two terms without getting results for protein as it relates to meat, nuts, etc.
I've played around with the regex for this a bit, but can't seem to get it, and my regex skills are still fairly limited, so I'd really appreciate some help on this. Or is there another program or plugin more suited to this task? Cheers!
#2
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
You are asking for the sort of search that Google spent millions of man-hours developing, and for free? That's a tall order.
Try writing out the logic of the tests you need to code, e.g. find "protein", then find "batter", but quit if you hit a full stop before you've found "batter"... then repeat with the words flipped. For two words within a "section", well, you'd have to figure out what marks the end of what you consider to be a section, then code a test for that... have fun.
You may have more success if you convert the book(s) back into an uncompressed format like .txt, let Windows index the folder and file contents (tick "index contents"), and then use Windows Search on the containing folder. That brings a lot of firepower to bear, assuming you use Windows.
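The logic described above (find one word, then the other, give up at a full stop, then repeat with the words flipped) can be folded into a single regular expression. What follows is only a minimal Python sketch, not a tested Quality Check recipe. The helper names `same_sentence_pattern` and `paragraphs_with_all` are hypothetical, a sentence is assumed to end at `.`, `!` or `?`, and a paragraph is assumed to be delimited by blank lines, which may not hold for every converted .txt.

```python
import re


def same_sentence_pattern(word_a, word_b, stop=".!?"):
    """Build a regex matching both words, in either order, with no
    sentence-ending character between them (hypothetical helper)."""
    a, b = re.escape(word_a), re.escape(word_b)
    # Lazily consume anything that is not a sentence terminator.
    gap = rf"[^{stop}]*?"
    return re.compile(
        rf"\b{a}\b{gap}\b{b}\b|\b{b}\b{gap}\b{a}\b",
        re.IGNORECASE,
    )


def paragraphs_with_all(text, words):
    """Return the paragraphs that contain every word in `words`.

    Assumes paragraphs are separated by one or more blank lines.
    """
    patterns = [
        re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE) for word in words
    ]
    paragraphs = (p.strip() for p in re.split(r"\n\s*\n", text))
    return [
        p for p in paragraphs
        if p and all(rx.search(p) for rx in patterns)
    ]
```

Calling `same_sentence_pattern("protein", "batter").search(text)` returns a match only when both words sit between the same pair of sentence terminators. Two known limits of this sketch: abbreviations such as "e.g." count as sentence ends, and the `\b` word boundaries mean "proteins" does not match "protein".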
#3
null operator (he/him)
Posts: 21,716
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
@Earthlark
For Linux there is the Recoll Integration plugin.
In Windows I do what cybmole suggests: retain a copy in a format that Windows Search can index (doc, rtf, txt, odt; sadly not epub, mobi, etc.). After doing a Windows Search I massage the results (a list of files) with a couple of Notepad++ macros to create a CSV, which I feed into the Import List plugin so that the list of books shows up (marked) in the Calibre GUI.
I usually search within a specific library folder (e.g. X:/Libraries/Journals) or across all libraries (X:/Libraries).
I thought about automating it, but a) I like being able to winnow the results list before submitting it to Import List, b) I'm lazy, and c) I prefer to see what's going on.
The same thing could be done on a Mac with Spotlight, which I'm pretty certain will search in epub and other ebook formats.
BR
Last edited by BetterRed; 02-04-2014 at 03:52 AM. Reason: clarity
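For anyone who does want to automate the file-list step, here is a rough Python sketch of the same idea: scan a folder of .txt copies, keep the files where all the search words share a paragraph, and write the hits to a CSV that can still be winnowed by hand. The function names, the `title,path` column layout and the blank-line paragraph rule are all assumptions made for illustration. The columns Import List actually expects depend on how that plugin is configured, so treat the layout as a placeholder.

```python
import csv
import re
from pathlib import Path


def files_mentioning_together(folder, words):
    """Yield .txt files under `folder` in which at least one paragraph
    contains every word in `words` (hypothetical helper)."""
    patterns = [
        re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE) for word in words
    ]
    for path in sorted(Path(folder).rglob("*.txt")):
        text = path.read_text(encoding="utf-8", errors="replace")
        # Assumption: paragraphs are separated by blank lines.
        for paragraph in re.split(r"\n\s*\n", text):
            if all(rx.search(paragraph) for rx in patterns):
                yield path
                break


def write_match_list(folder, words, csv_path):
    """Write one CSV row per matching file and return the row count."""
    matches = list(files_mentioning_together(folder, words))
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["title", "path"])  # placeholder column layout
        for path in matches:
            # The file name without extension stands in for the title.
            writer.writerow([path.stem, str(path)])
    return len(matches)
```

A call such as `write_match_list("X:/Libraries/Journals", ["protein", "batter"], "matches.csv")` would produce the list; the resulting file can be opened and pruned before it goes anywhere near Calibre.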
#4
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
If we are making lists, then there's also Amazon's allegedly wonderful X-Ray tool, but you need a real Kindle for that and I don't know how good it is for challenges like this.
Plain ol' Google is very good if you can also feed it an actual book title in quotes. It seems to have a lot of book body text available to its search engine, so it has often found me a quote from within a book, even where the book itself is under copyright and cannot be viewed in its entirety.