Old 08-22-2017, 11:13 AM   #218
Doitsu
Quote:
Originally Posted by slowsmile
Therefore it would seem that the HTML Parser module that bs4 uses is the cause of this missing div tag problem.
I don't think that the HTML Parser module causes this problem because, when I preprocessed the HTML with the very same module, your plugin no longer created invalid HTML files. I've also used it in a couple of plugins, and it never produced invalid HTML code.

Run the following preprocessing code before running your plugin, and you'll see that the missing <div> error no longer occurs:

Code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys  # needed for sys.exit() at the bottom
from sigil_bs4 import BeautifulSoup

def run(bk):
    # id of the file to normalize
    html_id = 'LSH.xhtml'
    html = bk.readfile(html_id)
    # parse the file with html.parser
    soup = BeautifulSoup(html, 'html.parser')
    # serialize the parsed tree back to XHTML and write it to the book
    normalized_html = str(soup.prettyprint_xhtml(indent_level=0, eventual_encoding="utf-8", formatter="minimal", indent_chars="  "))
    bk.writefile(html_id, normalized_html)

    return 0

def main():
    print('I reached main when I should not have\n')
    return -1

if __name__ == "__main__":
    sys.exit(main())
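
If you want to narrow down which step actually loses the tag, a quick check along these lines might help. This is only a rough sketch and not part of either plugin: `DivCounter` and `div_balance` are names I made up for illustration, and it uses Python's standard `html.parser` module directly (without bs4) to count opening and closing <div> tags.

```python
from html.parser import HTMLParser


class DivCounter(HTMLParser):
    """Counts opening and closing <div> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.opened = 0
        self.closed = 0

    def handle_starttag(self, tag, attrs):
        # Called for every opening tag; tag names arrive lowercased.
        if tag == 'div':
            self.opened += 1

    def handle_endtag(self, tag):
        # Called for every closing tag.
        if tag == 'div':
            self.closed += 1


def div_balance(html):
    """Return (opened, closed) counts of <div> tags in the markup."""
    counter = DivCounter()
    counter.feed(html)
    counter.close()
    return counter.opened, counter.closed
```

Calling `div_balance()` on the text returned by `bk.readfile(html_id)` before and after each processing step should show where the two counts stop matching.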