MobileRead Forums - View Single Post

DaltonST · 03-13-2023, 04:57 PM

The DEBUG log shows exactly why MD was picked over HTML: the 'normalizing factor' was set backwards. The divisor I used should have been the dividend, and vice versa. This is fixed in the new beta version uploaded to the prior post.

Code:

current_column:  #ris_abstract
MD : regex match:  (\*|\_)+(\S+)(\*|\_)+
MD : regex match:  (^(\W{1})(\s)(.*)(?:$)?)+
MD : regex match:  (\'{1})(.*)(\'{1})
HTML : regex match:   href=
HTML : regex match:  <a|</a|<a href=
HTML : regex match:  <div|</div
HTML : regex match:  <li|</li
HTML : regex match:  <p|</p
HTML : regex match:  <span|</span
HTML : regex match:  <ul|</ul
HTML : regex match:  <u|</u
guessing:
  MD: normalizing factor:  0.9
  MD: scores & ratios:  3 0.1875 0.16875
  HTML: scores & ratios:  8 0.25806451612903225
  --->>> best guess:  html
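
For anyone trying to follow the numbers in that log, here is a rough sketch of the arithmetic they appear to reflect. This is not the plugin's actual code. The function name `guess_format` and its parameters are made up for illustration, and the denominators 16 and 31 are only inferred from the logged ratios (3/16 = 0.1875 and 8/31 = 0.258...); what those denominators actually count is an assumption. The sketch also assumes the normalizing factor is applied only to the MD ratio, since the log shows a third MD value equal to 0.1875 * 0.9.

```python
def guess_format(md_score, md_total, html_score, html_total, md_factor):
    """Hypothetical reconstruction of the scoring shown in the DEBUG log.

    md_total and html_total are assumed denominators inferred from the
    logged ratios; they are not named anywhere in the log itself.
    """
    # Raw ratio: number of regex matches divided by the assumed denominator.
    md_ratio = md_score / md_total
    html_ratio = html_score / html_total

    # The normalizing factor scales only the MD ratio (assumption based on
    # the log line "MD: scores & ratios:  3 0.1875 0.16875").
    md_normalized = md_ratio * md_factor

    # Pick whichever format has the higher final ratio.
    best = "md" if md_normalized > html_ratio else "html"
    return {
        "md_ratio": md_ratio,
        "md_normalized": md_normalized,
        "html_ratio": html_ratio,
        "best_guess": best,
    }


if __name__ == "__main__":
    # Values taken from the log above; 16 and 31 are inferred, not logged.
    print(guess_format(3, 16, 8, 31, 0.9))
```

With the logged values, the scaled MD ratio comes out below the HTML ratio, which agrees with the "best guess: html" line.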