Old 09-22-2020, 04:10 PM   #18
DNSB
Bibliophagist
Quote:
Originally Posted by Sarmat89 View Post
It should be trivial to simplify the markup with a few regexes, as both the paragraph separation and the whitespace are present. Locate important formatting like italics, and remove everything else.
Sadly, that markup defines where everything is placed on the page. If you remove it, you are left with the not-so-fun task of making the epub look decent again. I tend to agree with Hitch that it is not a simple task. I've attached two lines from "Harry Potter: A Journey Through a History of Magic". And yes, each quote defines a single line of text.

Quote:
Code:
  <div class="liw c6E c7" style="font-size:0;top:326px;min-width:379px;line-height:24px;z-index:289">
    <span class="w c87 c160 f32 c161 c78" style="width:70px"> Portrait </span><span class="w c160 f32 c161 c78" style="width:23px;letter-spacing:-0.23px">of </span><span class="w c87 c160 f32 c161 c78" style="width:80px">Professor </span><span class="w c87 c160 f32 c161 c78" style="width:109px">McGonagall </span><span class="w c1D c160 f32 c161 c78">by </span><span class="w c13 c160 f32 c161 c78" style="letter-spacing:-0.05px">Jim </span><span class="w c13 c160 f32 c161 c78" style="text-align:right;letter-spacing:.09px">Kay</span>
  </div>
Quote:
Code:
  <div class="liw nw c7" style="font-size:0;top:122px;left:293px;min-width:46px;line-height:100px;z-index:261">
    <span style="font-size:68px" class="w c116 c1705 f1"> F</span>
  </div>

  <div class="liw nw c7 c7CB" style="font-size:0;top:136px;left:335px;min-width:253px;line-height:77px">
    <span class="w c1705 f1" style="width:252px"><span class="c1705 c19C c" style="width:39px"> A</span><span class="bl c1705 c c38" style="top:4px;font-size:52px">N</span><span class="bl c1705 cA c16F c c5E">T</span><span class="bl c1705 c1839 c c30" style="font-size:50px">A</span><span class="bl c1705 cA"><span class="c1705 c16F c c63">S</span><span class="c1705 c19C c c88">T</span></span><span class="bl c1705 c1839 c19C c cD">I </span><span class="w c36 c1705 f1 c19C"> C</span></span>
  </div>

  <div class="liw nw c7" style="font-size:0;top:195px;left:364px;min-width:235px;line-height:105px;z-index:272">
    <span class="w c1705 f1" style="width:234px"><span class="c1705 c c38" style="font-size:71px"> B</span><span class="bl c1705 c1C4 c c5C" style="top:-7px">E</span><span class="bl c1705 c1C4 c c88" style="top:-11px">A</span><span class="bl c1705 c c2D" style="top:-6px;font-size:44px">S</span><span style="top:-3px;font-size:53px" class="bl c1705"><span class="c c1705 c65">T</span><span class="c c1705 cD">S </span></span></span>
  </div>

  <div class="liw c7 c44 c45 c46 c47 c48 cAC cAD cAE cAF cB0 c6E c1EC c70" style="top:330px;z-index:287">
    <span class="w cADB f2" style="width:148px"><span class="cADB c71 c78 c c1C"> F</span><span class="cADB c72 c78"><span class="c cADB c1B">A</span><span class="c cADB c14">N</span><span class="c cADB c28">T</span><span class="c cADB c25">A</span><span class="c cADB c32">S</span><span class="c cADB c25">T</span><span class="c cADB c163">I</span><span class="c cD cADB">C </span></span></span><span class="w cA4 cADB f2"><span class="cADB c71 c78 c" style="width:21px">B</span><span class="cADB c72 c78"><span class="c cADB c25">E</span><span class="c cADB c25">A</span><span class="c cADB c32">S</span><span class="c cADB c25">T</span><span class="c cD cADB">S </span></span></span><span class="w c81 c1AE cADB f2 c72 c78"><span class="c cADB c1B">A</span><span class="c cADB c14">N</span><span class="c cD cADB">D </span></span><span class="w c101 cADB f2"><span class="cADB c71 c78 c c29">W</span><span class="cADB c72 c78"><span class="c cADB c1C">H</span><span class="c cADB c28">E</span><span class="c cADB c1B">R</span><span class="c cD cADB">E </span></span></span><span class="w c1AE cADB f2 c72 c78" style="width:48px"><span class="c cADB c25">T</span><span class="c cD cADB">O </span></span><span class="w c127 cADB f2"><span class="cADB c71 c78 c c27">F</span><span class="cADB c72 c78"><span class="c cADB c2A">I</span><span class="c cADB c14">N</span><span class="c cD cADB">D </span></span></span><span class="w c1D8 cADB f2"><span class="cADB c71 c78 c c14">T</span><span class="cADB c72 c78"><span class="c cADB c1C">H</span><span class="c cADB c28">E</span><span class="c cD cADB">M </span></span></span><span class="w cADB f22 c71" style="width:57px"><span class="c c63 cADB">w</span><span class="c cADB c1B">a</span><span class="c cD cADB">s </span></span>
  </div>
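
For a rough sense of what "remove everything else" leaves behind, here is a minimal Python sketch using only the standard library's html.parser. It is an illustration, not a conversion tool: the helper name visible_text and the class _TextCollector are made up for this example, and the two snippets are cut-down versions of the quotes above with most class and style attributes trimmed.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects character data and ignores every tag and attribute."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def visible_text(markup):
    """Return the text left once all markup is dropped (hypothetical helper)."""
    parser = _TextCollector()
    parser.feed(markup)
    parser.close()
    # Collapse whitespace runs, as a simple cleanup pass might.
    return " ".join("".join(parser.chunks).split())


# Trimmed version of the first quote (the picture caption).
caption = (
    '<div class="liw c6E c7" style="top:326px;min-width:379px">'
    '<span class="w c87" style="width:70px"> Portrait </span>'
    '<span class="w" style="width:23px;letter-spacing:-0.23px">of </span>'
    '<span class="w c87" style="width:80px">Professor </span>'
    '<span class="w c87" style="width:109px">McGonagall </span>'
    '<span class="w c1D">by </span>'
    '<span class="w c13">Jim </span>'
    '<span class="w c13">Kay</span>'
    '</div>'
)

# Trimmed version of the start of the second quote (drop cap plus one word).
heading = (
    '<div class="liw nw c7"><span style="font-size:68px"> F</span></div>\n'
    '<div class="liw nw c7 c7CB"><span class="w c1705 f1">'
    '<span> A</span><span style="top:4px;font-size:52px">N</span><span>T</span>'
    '<span style="font-size:50px">A</span><span>S</span><span>T</span>'
    '<span>I </span><span> C</span></span></div>'
)

print(visible_text(caption))  # Portrait of Professor McGonagall by Jim Kay
print(visible_text(heading))  # F ANTASTI C
```

The caption comes through as readable text, but every width, offset and letter-spacing value that positioned it is gone. The heading fares worse: with the per-letter spans removed, the drop cap and the stray spaces inside the spans turn the word into "F ANTASTI C", so even the text itself needs repair before any layout work starts.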