06-24-2024, 07:44 PM   #9
Comfy.n
Well, I've only used EPOM for extracting translators and original titles via the FTS index, and that was not too challenging. In your case it would be better if Dalton could help, but unfortunately he's been away from MR for almost a year. Or maybe some regex power user could chime in.

I don't see an easy way to detect the exact beginning of the text, given how much the ebooks' structure varies. However, you could try something like this:

- set the tweak MAXIMUM_LENGTH_TO_ACCEPT to a large value
- then populate the #first-pars column using a regex that matches, say, the first 1000 characters in the book
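
To give an idea of what the regex in the second step could look like, here is a rough sketch in plain Python. It only demonstrates the pattern itself. I'm assuming the regex field accepts Python-style syntax, and `first_chars` is a made-up helper name for illustration, not something from EPOM or calibre.

```python
import re

# Illustrative pattern (an assumption, not taken from the plugin):
# (?s) lets "." match newlines, \A anchors at the very start of the text,
# and .{1,1000} greedily takes up to 1000 characters.
FIRST_CHARS = re.compile(r"(?s)\A.{1,1000}")


def first_chars(text: str) -> str:
    """Return up to the first 1000 characters of text, or '' if it is empty."""
    match = FIRST_CHARS.search(text)
    return match.group(0) if match else ""
```

Note that this grabs whatever comes first in the book, so front matter such as the title page or copyright notice will likely end up in the column too. You would still need to trim that afterwards.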