06-30-2017, 12:01 AM   #1
stumped
Wizard
Posts: 3,305
Karma: 10259306
Join Date: May 2016
Device: kobo forma, Kobo Libra, Huawei media Tab, fire HD10, PW3 HDX8.9,
Help editing a very badly coded Kindle edition: a style per word

I bought what must rate as one of the worst-coded books ever: an expensive official game guide in Kindle edition, where each word sits in its own span with its own inline style, which makes whole paragraphs non-reflowable.
Can someone please suggest how to regex out some of this complexity with the calibre editor?

A snippet follows: a single paragraph. There are hundreds of these.

Unsurprisingly, all attempts to convert to another format fail; they hang for hours at 47%.


Code:
 <p class="para">
        <span class="line fs1">
          <span class="word si fs1" style="left: 167px; top: 127px; width: 122px; ">TALOS</span>
          <span class="word si fs1" style="left: 301px; top: 127px; width: 10px; ">I</span>
          <span class="word si fs1" style="left: 322px; top: 127px; width: 127px; ">LOBBY</span>
        </span>
        <span class="line fs1">
          <span class="word si fs1" style="left: 813px; top: 127px; width: 235px; ">HARDWARE</span>
          <span class="word si fs1" style="left: 1059px; top: 127px; width: 98px; ">LABS</span>
        </span>
        <span class="line fs17">
          <span class="word si fs17" style="left: 120px; top: 198px; width: 37px; ">KEY</span>
          <span class="word si fs17" style="left: 163px; top: 198px; width: 104px; ">FACILITIES:</span>
        </span>
        <span class="line fs7">
          <span class="word si fs7" style="left: 120px; top: 238px; width: 97px; ">TRANSTAR</span>
          <span class="word si fs7" style="left: 223px; top: 238px; width: 71px; ">EXHIBIT</span>
        </span>
        <span class="line fs7">
          <span class="word si fs7" style="left: 120px; top: 278px; width: 105px; ">EXECUTIVE</span>
          <span class="word si fs7" style="left: 231px; top: 278px; width: 82px; ">OFFICES</span>
        </span>
        <span class="line fs7">
          <span class="word si fs7" style="left: 120px; top: 317px; width: 56px; ">SALES</span>
          <span class="word si fs7" style="left: 182px; top: 317px; width: 90px; ">DIVISION</span>
        </span>
        <span class="line fs7">
          <span class="word si fs7" style="left: 120px; top: 357px; width: 77px; ">HUMAN</span>
          <span class="word si fs7" style="left: 203px; top: 357px; width: 116px; ">RESOURCES</span>
        </span>
        <span class="line fs7">
          <span class="word si fs7" style="left: 442px; top: 238px; width: 14px; ">IT</span>
          <span class="word si fs7" style="left: 463px; top: 238px; width: 91px; ">SECURITY</span>
        </span>
        <span class="line fs7">
          <span class="word si fs7" style="left: 442px; top: 278px; width: 82px; ">TRAUMA</span>
          <span class="word si fs7" style="left: 531px; top: 278px; width: 75px; ">CENTER</span>
        </span>
        <span class="line fs7">
          <span class="word si fs7" style="left: 442px; top: 317px; width: 53px; ">STAFF</span>
          <span class="word si fs7" style="left: 502px; top: 317px; width: 85px; ">LOUNGE</span>
        </span>
        <span class="line fs0">
          <span class="word si fs0" style="left: 120px; top: 405px; width: 61px; ">When</span>
          <span class="word si fs0" style="left: 188px; top: 405px; width: 84px; ">TranStar</span>
          <span class="word si fs0" style="left: 279px; top: 405px; width: 98px; ">acquired</span>
          <span class="word si fs0" style="left: 383px; top: 405px; width: 35px; ">the</span>
          <span class="word si fs0" style="left: 425px; top: 405px; width: 67px; ">space</span>
          <span class="word si fs0" style="left: 499px; top: 405px; width: 71px; ">station</span>
          <span class="word si fs0" style="left: 576px; top: 405px; width: 18px; ">in</span>
          <span class="word si fs0" style="left: 601px; top: 405px; width: 55px; ">2030,</span>
          <span class="word si fs0" style="left: 663px; top: 405px; width: 47px; ">they</span>
        </span>
        <span class="line fs0">
          <span class="word si fs0" style="left: 120px; top: 436px; width: 74px; ">spared</span>
          <span class="word si fs0" style="left: 201px; top: 436px; width: 27px; ">no</span>
          <span class="word si fs0" style="left: 235px; top: 436px; width: 90px; ">expense</span>
          <span class="word si fs0" style="left: 332px; top: 436px; width: 18px; ">in</span>
          <span class="word si fs0" style="left: 356px; top: 436px; width: 123px; ">refurbishing</span>
          <span class="word si fs0" style="left: 486px; top: 436px; width: 35px; ">the</span>
          <span class="word si fs0" style="left: 528px; top: 436px; width: 67px; ">lobby,</span>
          <span class="word si fs0" style="left: 602px; top: 436px; width: 43px; ">with</span>
          <span class="word si fs0" style="left: 652px; top: 436px; width: 35px; ">the</span>
        </span>
        <span class="line fs0">
          <span class="word si fs0" style="left: 120px; top: 468px; width: 48px; ">goal</span>
          <span class="word si fs0" style="left: 175px; top: 468px; width: 21px; ">of</span>
          <span class="word si fs0" style="left: 203px; top: 468px; width: 109px; ">projecting</span>
          <span class="word si fs0" style="left: 319px; top: 468px; width: 14px; ">a</span>
          <span class="word si fs0" style="left: 340px; top: 468px; width: 60px; ">warm</span>
          <span class="word si fs0" style="left: 407px; top: 468px; width: 43px; ">and</span>
          <span class="word si fs0" style="left: 457px; top: 468px; width: 75px; ">inviting</span>
          <span class="word si fs0" style="left: 539px; top: 468px; width: 129px; ">atmosphere</span>
          <span class="word si fs0" style="left: 675px; top: 468px; width: 22px; ">to</span>
        </span>
        <span class="line fs0">
          <span class="word si fs0" style="left: 120px; top: 500px; width: 67px; ">guests</span>
          <span class="word si fs0" style="left: 193px; top: 500px; width: 43px; ">and</span>
          <span class="word si fs0" style="left: 243px; top: 500px; width: 124px; ">employees.</span>
          <span class="word si fs0" style="left: 374px; top: 500px; width: 124px; ">Connected</span>
          <span class="word si fs0" style="left: 504px; top: 500px; width: 22px; ">to</span>
          <span class="word si fs0" style="left: 533px; top: 500px; width: 35px; ">the</span>
          <span class="word si fs0" style="left: 574px; top: 500px; width: 72px; ">Shuttle</span>
          <span class="word si fs0" style="left: 653px; top: 500px; width: 45px; ">Bay,</span>
        </span>
        <span class="line fs0">
          <span class="word si fs0" style="left: 120px; top: 532px; width: 35px; ">the</span>
          <span class="word si fs0" style="left: 161px; top: 532px; width: 60px; ">lobby</span>
          <span class="word si fs0" style="left: 229px; top: 532px; width: 65px; ">serves</span>
          <span class="word si fs0" style="left: 300px; top: 532px; width: 23px; ">as</span>
          <span class="word si fs0" style="left: 330px; top: 532px; width: 14px; ">a</span>
          <span class="word si fs0" style="left: 351px; top: 532px; width: 75px; ">central</span>
          <span class="word si fs0" style="left: 434px; top: 532px; width: 48px; ">hub,</span>
          <span class="word si fs0" style="left: 488px; top: 532px; width: 100px; ">providing</span>
          <span class="word si fs0" style="left: 595px; top: 532px; width: 49px; ">easy</span>
          <span class="word si fs0" style="left: 652px; top: 532px; width: 75px; ">access</span>
        </span>
        <span class="line fs0">
          <span class="word si fs0" style="left: 120px; top: 563px; width: 22px; ">to</span>
          <span class="word si fs0" style="left: 148px; top: 563px; width: 35px; ">the</span>
          <span class="word si fs0" style="left: 190px; top: 563px; width: 114px; ">Neuromod</span>
          <span class="word si fs0" style="left: 311px; top: 563px; width: 85px; ">Division,</span>
          <span class="word si fs0" style="left: 403px; top: 563px; width: 151px; ">Psychotronics,</span>
          <span class="word si fs0" style="left: 561px; top: 563px; width: 43px; ">and</span>
          <span class="word si fs0" style="left: 611px; top: 563px; width: 105px; ">Hardware</span>
        </span>
        <span class="line fs0">
          <span class="word si fs0" style="left: 120px; top: 595px; width: 55px; ">Labs.</span>
          <span class="word si fs0" style="left: 181px; top: 595px; width: 37px; ">The</span>
          <span class="word si fs0" style="left: 225px; top: 595px; width: 53px; ">main</span>
          <span class="word si fs0" style="left: 285px; top: 595px; width: 30px; ">lift,</span>
          <span class="word si fs0" style="left: 322px; top: 595px; width: 85px; ">located</span>
          <span class="word si fs0" style="left: 414px; top: 595px; width: 18px; ">in</span>
          <span class="word si fs0" style="left: 438px; top: 595px; width: 35px; ">the</span>
          <span class="word si fs0" style="left: 480px; top: 595px; width: 70px; ">center</span>
          <span class="word si fs0" style="left: 557px; top: 595px; width: 21px; ">of</span>
          <span class="word si fs0" style="left: 585px; top: 595px; width: 35px; ">the</span>
          <span class="word si fs0" style="left: 627px; top: 595px; width: 67px; ">lobby,</span>
          <span class="word si fs0" style="left: 700px; top: 595px; width: 13px; ">is</span>
        </span>
        <span class="line fs0">
          <span class="word si fs0" style="left: 120px; top: 626px; width: 120px; ">connected</span>
          <span class="word si fs0" style="left: 247px; top: 626px; width: 22px; ">to</span>
          <span class="word si fs0" style="left: 275px; top: 626px; width: 35px; ">the</span>
          <span class="word si fs0" style="left: 317px; top: 626px; width: 115px; ">Arboretum</span>
          <span class="word si fs0" style="left: 439px; top: 626px; width: 43px; ">and</span>
          <span class="word si fs0" style="left: 488px; top: 626px; width: 36px; ">Life</span>
          <span class="word si fs0" style="left: 531px; top: 626px; width: 88px; ">Support.</span>
        </span>
        <span class="line fs17">
          <span class="word si fs17" style="left: 765px; top: 198px; width: 37px; ">KEY</span>
          <span class="word si fs17" style="left: 808px; top: 198px; width: 104px; ">FACILITIES:</span>
        </span>
        <span class="line fs7">
          <span class="word si fs7" style="left: 765px; top: 238px; width: 173px; ">DEMONSTRATION</span>
          <span class="word si fs7" style="left: 943px; top: 238px; width: 80px; ">THEATER</span>
        </span>
        <span class="line fs7">
          <span class="word si fs7" style="left: 765px; top: 278px; width: 138px; ">COMBUSTION</span>
          <span class="word si fs7" style="left: 909px; top: 278px; width: 37px; ">LAB</span>
        </span>
        <span class="line fs7">
          <span class="word si fs7" style="left: 765px; top: 317px; width: 106px; ">CHEMICAL</span>
          <span class="word si fs7" style="left: 878px; top: 317px; width: 37px; ">LAB</span>
        </span>
        <span class="line fs7">
          <span class="word si fs7" style="left: 765px; top: 357px; width: 102px; ">BALLISTICS</span>
          <span class="word si fs7" style="left: 874px; top: 357px; width: 37px; ">LAB</span>
        </span>
        <span class="line fs7">
          <span class="word si fs7" style="left: 1088px; top: 238px; width: 68px; ">BEAMS</span>
          <span class="word si fs7" style="left: 1162px; top: 238px; width: 45px; ">AND</span>
          <span class="word si fs7" style="left: 1213px; top: 238px; width: 66px; ">WAVES</span>
          <span class="word si fs7" style="left: 1285px; top: 238px; width: 37px; ">LAB</span>
        </span>
        <span class="line fs7">
          <span class="word si fs7" style="left: 1088px; top: 278px; width: 96px; ">MACHINE</span>
          <span class="word si fs7" style="left: 1190px; top: 278px; width: 54px; ">SHOP</span>
        </span>
        <span class="line fs7">
          <span class="word si fs7" style="left: 1088px; top: 317px; width: 89px; ">AIRLOCK</span>
        </span>
        <span class="line fs0">
          <span class="word si fs0" style="left: 765px; top: 405px; width: 103px; ">Hardware</span>
          <span class="word si fs0" style="left: 874px; top: 405px; width: 47px; ">Labs</span>
          <span class="word si fs0" style="left: 928px; top: 405px; width: 13px; ">is</span>
          <span class="word si fs0" style="left: 947px; top: 405px; width: 14px; ">a</span>
          <span class="word si fs0" style="left: 968px; top: 405px; width: 69px; ">secure</span>
          <span class="word si fs0" style="left: 1043px; top: 405px; width: 91px; ">research</span>
          <span class="word si fs0" style="left: 1140px; top: 405px; width: 42px; ">and</span>
          <span class="word si fs0" style="left: 1189px; top: 405px; width: 142px; ">development</span>
        </span>
        <span class="line fs0">
          <span class="word si fs0" style="left: 765px; top: 436px; width: 72px; ">facility.</span>
          <span class="word si fs0" style="left: 843px; top: 436px; width: 70px; ">Guests</span>
          <span class="word si fs0" style="left: 919px; top: 436px; width: 42px; ">and</span>
          <span class="word si fs0" style="left: 967px; top: 436px; width: 137px; ">unauthorized</span>
          <span class="word si fs0" style="left: 1110px; top: 436px; width: 102px; ">personnel</span>
          <span class="word si fs0" style="left: 1219px; top: 436px; width: 35px; ">are</span>
          <span class="word si fs0" style="left: 1260px; top: 436px; width: 69px; ">limited</span>
          <span class="word si fs0" style="left: 1335px; top: 436px; width: 21px; ">to</span>
        </span>
        <span class="line fs0">
          <span class="word si fs0" style="left: 765px; top: 468px; width: 68px; ">visiting</span>
          <span class="word si fs0" style="left: 840px; top: 468px; width: 34px; ">the</span>
          <span class="word si fs0" style="left: 880px; top: 468px; width: 53px; ">foyer</span>
          <span class="word si fs0" style="left: 939px; top: 468px; width: 42px; ">and</span>
          <span class="word si fs0" style="left: 988px; top: 468px; width: 153px; ">Demonstration</span>
          <span class="word si fs0" style="left: 1146px; top: 468px; width: 83px; ">Theater.</span>
          <span class="word si fs0" style="left: 1236px; top: 468px; width: 16px; ">A</span>
          <span class="word si fs0" style="left: 1258px; top: 468px; width: 81px; ">number</span>
          <span class="word si fs0" style="left: 1345px; top: 468px; width: 21px; ">of</span>
        </span>
        <span class="line fs0">
          <span class="word si fs0" style="left: 765px; top: 500px; width: 42px; ">labs</span>
          <span class="word si fs0" style="left: 813px; top: 500px; width: 42px; ">and</span>
          <span class="word si fs0" style="left: 862px; top: 500px; width: 34px; ">the</span>
          <span class="word si fs0" style="left: 902px; top: 500px; width: 92px; ">Machine</span>
          <span class="word si fs0" style="left: 1001px; top: 500px; width: 52px; ">Shop</span>
          <span class="word si fs0" style="left: 1059px; top: 500px; width: 35px; ">are</span>
          <span class="word si fs0" style="left: 1100px; top: 500px; width: 83px; ">located</span>
          <span class="word si fs0" style="left: 1189px; top: 500px; width: 82px; ">beyond</span>
          <span class="word si fs0" style="left: 1277px; top: 500px; width: 34px; ">the</span>
          <span class="word si fs0" style="left: 1318px; top: 500px; width: 79px; ">security</span>
        </span>
        <span class="line fs0">
          <span class="word si fs0" style="left: 765px; top: 532px; width: 123px; ">checkpoint.</span>
          <span class="word si fs0" style="left: 895px; top: 532px; width: 49px; ">Here</span>
          <span class="word si fs0" style="left: 950px; top: 532px; width: 119px; ">researchers</span>
          <span class="word si fs0" style="left: 1076px; top: 532px; width: 42px; ">and</span>
          <span class="word si fs0" style="left: 1124px; top: 532px; width: 102px; ">engineers</span>
          <span class="word si fs0" style="left: 1232px; top: 532px; width: 118px; ">experiment</span>
        </span>
        <span class="line fs0">
          <span class="word si fs0" style="left: 765px; top: 563px; width: 42px; ">with</span>
          <span class="word si fs0" style="left: 814px; top: 563px; width: 14px; ">a</span>
          <span class="word si fs0" style="left: 834px; top: 563px; width: 70px; ">variety</span>
          <span class="word si fs0" style="left: 911px; top: 563px; width: 21px; ">of</span>
          <span class="word si fs0" style="left: 938px; top: 563px; width: 100px; ">emerging</span>
          <span class="word si fs0" style="left: 1045px; top: 563px; width: 135px; ">technologies</span>
          <span class="word si fs0" style="left: 1186px; top: 563px; width: 21px; ">to</span>
          <span class="word si fs0" style="left: 1213px; top: 563px; width: 87px; ">develop</span>
        </span>
        <span class="line fs0">
          <span class="word si fs0" style="left: 765px; top: 595px; width: 101px; ">hardware</span>
          <span class="word si fs0" style="left: 873px; top: 595px; width: 118px; ">prototypes.</span>
          <span class="word si fs0" style="left: 997px; top: 595px; width: 100px; ">Currently,</span>
          <span class="word si fs0" style="left: 1104px; top: 595px; width: 34px; ">the</span>
          <span class="word si fs0" style="left: 1144px; top: 595px; width: 42px; ">labs</span>
          <span class="word si fs0" style="left: 1192px; top: 595px; width: 35px; ">are</span>
          <span class="word si fs0" style="left: 1233px; top: 595px; width: 87px; ">pursuing</span>
        </span>
        <span class="line fs0">
          <span class="word si fs0" style="left: 765px; top: 626px; width: 82px; ">multiple</span>
          <span class="word si fs0" style="left: 853px; top: 626px; width: 83px; ">projects</span>
          <span class="word si fs0" style="left: 943px; top: 626px; width: 42px; ">with</span>
          <span class="word si fs0" style="left: 991px; top: 626px; width: 14px; ">a</span>
          <span class="word si fs0" style="left: 1012px; top: 626px; width: 64px; ">broad</span>
          <span class="word si fs0" style="left: 1083px; top: 626px; width: 62px; ">range</span>
          <span class="word si fs0" style="left: 1151px; top: 626px; width: 21px; ">of</span>
          <span class="word si fs0" style="left: 1178px; top: 626px; width: 134px; ">applications.</span>
        </span>
      </p>
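
For anyone wanting to try this outside the editor first, here is a rough Python sketch of the kind of regex cleanup being asked about. It assumes every word span looks like the ones in the snippet above (a class starting with "word", an inline style, plain text inside with no nested tags) and that every line span has a class starting with "line". The function name flatten_paragraph is made up for illustration. It throws away the fs1/fs17/fs7 classes, so headings end up merged into the body text; treat it as a starting point, not a finished fix. The same two patterns could probably be tried in the calibre editor's Search & Replace in regex mode with \1 as the replacement, but I have not tested that on this book.

```python
import re

# One word-level span: class starts with "word", any other attributes,
# and only plain text (no nested tags) inside. Assumed from the snippet.
WORD_SPAN = re.compile(r'<span class="word[^"]*"[^>]*>([^<]*)</span>')

# One line-level span: class starts with "line". It only matches once the
# word spans inside it have been unwrapped, because it requires a tag-free body.
LINE_SPAN = re.compile(r'<span class="line[^"]*"[^>]*>([^<]*)</span>')


def flatten_paragraph(html: str) -> str:
    """Unwrap per-word and per-line spans, leaving plain text in the <p>."""
    # Step 1: replace each word span with the bare word.
    text = WORD_SPAN.sub(r'\1', html)
    # Step 2: replace each line span with its (now plain-text) contents.
    text = LINE_SPAN.sub(r'\1', text)
    # Step 3: collapse the leftover indentation and newlines to single spaces,
    # then trim the spaces sitting directly inside the remaining tags.
    text = re.sub(r'\s+', ' ', text)
    text = re.sub(r'>\s+', '>', text)
    text = re.sub(r'\s+<', '<', text)
    return text


if __name__ == "__main__":
    sample = '''<p class="para">
        <span class="line fs1">
          <span class="word si fs1" style="left: 167px; top: 127px; width: 122px; ">TALOS</span>
          <span class="word si fs1" style="left: 301px; top: 127px; width: 10px; ">I</span>
        </span>
        <span class="line fs0">
          <span class="word si fs0" style="left: 120px; top: 405px; width: 61px; ">When</span>
        </span>
      </p>'''
    print(flatten_paragraph(sample))
    # prints: <p class="para">TALOS I When</p>
```

The two-pass order matters: the line pattern refuses to match while a line span still contains tags, so anything the word pattern could not handle is left untouched and easy to spot afterwards.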

Last edited by stumped; 06-30-2017 at 12:07 AM.