No, I cannot recall exactly what I did to index EPUBs. My guess, and it really is just a guess, is that I handled .epub as an alternate extension for ZIP files. Only the text portions of the archive are relevant to indexing, so ignoring the other stuff in an EPUB (ToC, spine, OPF, etc.) would be fine. Images might have been ignored as well, which wouldn't really matter as I wouldn't be searching on them anyway.
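To illustrate the idea (this is not a reconstruction of what I actually did), here is a rough Python sketch that opens an EPUB as an ordinary ZIP archive and pulls out only the text of its HTML/XHTML members. The function name `epub_text` and the extension list are my own assumptions, and a real EPUB may keep a navigation page in XHTML that this would pick up too.

```python
import re
import zipfile
from html.parser import HTMLParser

# Assumption: the readable content lives in members with these extensions.
# The OPF package file, NCX table of contents, CSS and images are skipped.
TEXT_EXTENSIONS = (".xhtml", ".html", ".htm")


class _TextExtractor(HTMLParser):
    """Collects character data, ignoring <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def epub_text(path):
    """Return the text of an EPUB, treating the file as a plain ZIP archive."""
    parts = []
    with zipfile.ZipFile(path) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(TEXT_EXTENSIONS):
                continue  # not a text member, so nothing worth indexing
            markup = archive.read(name).decode("utf-8", errors="replace")
            parser = _TextExtractor()
            parser.feed(markup)
            parser.close()
            # Joining with spaces keeps adjacent paragraphs from running together.
            parts.append(" ".join(parser.chunks))
    # Collapse runs of whitespace so the result is easy to tokenize.
    return re.sub(r"\s+", " ", " ".join(parts)).strip()
```

Calling `epub_text("some_book.epub")` (a hypothetical file name) would give back one long string that could be handed to whatever indexer you use.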
All that I am certain of is that I almost maxed out Indexing at 1.97 GB after going through my entire library.
PS: If you can figure out how to build an EPUB IFilter, you might have a lucrative niche product on your hands, with a growing market and no competitors in sight.