Hello
I am trying to extract the date on which the content of the book was created. This date is found in title.xhtml with this format:
<p class="ePUBfirma"><strong class="sans">Wolfman2408</strong> <code class="ePUBfecha sans">24.05.13</code></p>
I have used the following regular expression in a "TXT Query":
^(?!.*(\bfax\b|\bisbn\b|\blegal\b)).*((0?[1-9]|[123]\d)[-](0?[1-9]|1[012])[-]([1][9]|[2][0])?\d\d)|((0?[1-9]|[123]\d)[\/](0?[1-9]|1[012])[\/]([1][9]|[2][0])?\d\d)|((0?[1-9]|[123]\d)[\.](0?[1-9]|1[012])[\.]([1][9]|[2][0])?\d\d)$
It is this complex because I was getting false positives from numbers that follow the 'isbn', 'fax' or 'legal deposit' expressions, and because the dates appear in several numeric formats with different separators.
I have checked my regular expression in pythex:
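To make the intent clearer, here is a rough Python sketch of the logic I am after. This is not what the plugin runs, only plain Python; the names SKIP, DATE and find_date are made up for illustration, and I have limited the day to 01-31 here, which is stricter than my expression. It skips any line containing one of the problem words and then looks for a date whose two separators are the same character.

```python
import re

# Words that mark a line as a likely false positive (illustrative list).
SKIP = re.compile(r"\b(?:fax|isbn|legal)\b", re.IGNORECASE)

# day, separator, month, the same separator again (backreference \2), year.
DATE = re.compile(
    r"\b(0?[1-9]|[12]\d|3[01])([-/.])(0?[1-9]|1[0-2])\2((?:19|20)?\d\d)\b"
)

def find_date(text):
    """Return the first date found on a line without a skip word, else None."""
    for line in text.splitlines():
        if SKIP.search(line):
            continue  # ignore lines that mention fax, isbn or legal
        match = DATE.search(line)
        if match:
            return match.group(0)
    return None

sample = (
    '<p class="ePUBfirma"><strong class="sans">Wolfman2408</strong> '
    '<code class="ePUBfecha sans">24.05.13</code></p>'
)
print(find_date(sample))
```

Checking the problem words separately, line by line, avoids the leading lookahead, and the single [-/.] class with a backreference stands in for the three near-identical alternatives.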
https://pythex.org/?regex=%5E(%3F!.*...ll=1&verbose=0
And it works, inserting the date in a column called "generated".
The problem is that it takes almost 30 seconds per book, and when I select 40 or more books the program hangs.
Is this normal, or is it a fault in the regex or in the plugin?
Is there a way to format the output so that the date separators '.' and '-' are replaced with '/'?
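What I mean by this, as a rough Python sketch outside the plugin (normalize_date is just a name I made up, and I do not know whether the plugin offers anything similar):

```python
import re

def normalize_date(date_text):
    """Replace a '.' or '-' that sits between two digits with '/'."""
    return re.sub(r"(?<=\d)[.-](?=\d)", "/", date_text)

print(normalize_date("24.05.13"))
print(normalize_date("24-05-2013"))
```

Dates that already use '/' would be left unchanged.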
Thank you