#1
Issue parsing HTML tags
Hi guys,
I am developing an Edit Book plugin, but I can't seem to parse HTML tags. My `main.py` file looks like this:
import lxml.etree
from PyQt5.Qt import QAction, QInputDialog

# The base class that all tools must inherit from
from calibre.gui2.tweak_book.plugin import Tool
from calibre import force_unicode
from calibre.gui2 import error_dialog
from calibre.ebooks.oeb.polish.container import OEB_DOCS, serialize


class MyTool(Tool):
    name = 'my-tool'
    allowed_in_toolbar = True
    allowed_in_menu = True

    def create_action(self, for_toolbar=True):
        ac = QAction(get_icons('icon/icon.png'), 'My Tool', self.gui)
        if not for_toolbar:
            self.register_shortcut(ac, 'my-tool',
                                   default_keys=('Ctrl+Shift+A',))
        ac.triggered.connect(self.run)
        return ac

    def run(self):
        container = self.current_container
        # iterate over book files
        for name, media_type in container.mime_map.items():
            if media_type in OEB_DOCS:
                self.my_method(container.parsed(name))
                container.dirty(name)

    def my_method(self, root):
        for el in root.iter('div'):
            el.attrib['class'] = 'my_class'
            # debug
            print('found a div tag')
What am I getting wrong here? Many thanks!
#2
creator of calibre
This is XHTML, so the tags are namespaced and you can't use bare tag names. Use
root.xpath('//*[local-name()="div"]')
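To illustrate why the bare name finds nothing, here is a minimal sketch that runs outside calibre. It uses the standard library's `xml.etree.ElementTree` rather than lxml so it needs no extra install; the sample markup, the `XHTML_NS` constant, and the `set_div_class` helper are assumptions made up for this example and are not part of the plugin or of calibre.

```python
import xml.etree.ElementTree as ET

# Namespace that XHTML documents declare on their root element.
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical stand-in for one HTML file of a book.
SAMPLE = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<body><div>one</div><p>text</p><div>two</div></body>"
    "</html>"
)


def set_div_class(root, class_name="my_class"):
    """Set the class attribute on every XHTML <div>; return how many were changed."""
    count = 0
    # The parser stores the tag as '{namespace}div', so the lookup must
    # include the namespace in the same form.
    for el in root.iter("{%s}div" % XHTML_NS):
        el.set("class", class_name)
        count += 1
    return count


root = ET.fromstring(SAMPLE)

# A bare name only matches elements that have no namespace, so this is empty.
bare_matches = list(root.iter("div"))

changed = set_div_class(root)
print(len(bare_matches), changed)
```

lxml accepts the same `{namespace}tag` form in `iter()`, so inside the plugin either that form or the `local-name()` XPath from the reply above should work in `my_method`; the XPath version has the advantage of not depending on the exact namespace string.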
#3
Thanks so much!!
I need to learn about XPath.