Quote:
Originally Posted by Doitsu
If you have basic programming skills, you could also write an ad-hoc Sigil plugin using the BeautifulSoup library, which is bundled with Sigil, to manipulate tags. (The Sigil API documentation is here.)...
Thanks, this is very useful for the plugin I am trying to write.
I just need a little help with the syntax to make this modified code work:
Spoiler:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys, os
from sigil_bs4 import BeautifulSoup

def run(bk):
    # get all html files
    for (html_id, href) in bk.text_iter():
        file_name = os.path.basename(href)
        html = bk.readfile(html_id)
        # convert html to soup
        soup = BeautifulSoup(html, 'html.parser')
        orig_html = str(soup)
        # get all i tags
        italics = soup.find_all('i')  # how for 'i', 'b', 'small', 'br', 'h1/2/3...'
        for i in italics:
            if 'class' in i.attrs:
                print(file_name, 'found')  # finds
                if 'calibre' in i['class']:
                    # remove class attribute
                    print(file_name, 'found attrib')  # doesn't find "calibre3"
                    del i['class']
                    # # change <span> to <b>
                    # span.name = 'b'
                # else:
                    # # delete <span> tags with other classes
                    # span.unwrap()
            # else:
                # # delete <span> tags w/o classes
                # span.unwrap()
        # update file with changes
        if str(soup) != orig_html:
            bk.writefile(html_id, str(soup))
            print(file_name, 'updated')
    print('Done')
    return 0
1. How do I pass a list of tags as the argument to soup.find_all()?
2. How do I rework
Code:
if 'calibre' in tag['class']
so that it matches a substring of the class name, e.g., 'calibre15'?
3. Would the same code also work for selecting a <meta ... /> tag by its 'name' attribute and deleting it? If so, how?
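For reference, the three points above could be approached roughly as in the sketch below. It is only a sketch, not tested inside Sigil. The helper names (has_class_containing, strip_matching_classes, remove_meta) and the TAGS list are made up for illustration. It assumes the class attribute comes back as a list of class names, which would explain why 'calibre' in i['class'] does not match 'calibre3' (a list is checked for an exact item, not a substring). Each helper takes the soup object already built in run(bk), so the sigil_bs4 import stays where it is in the plugin.

```python
# Hypothetical helpers; all names here are illustrative, not part of Sigil's API.

# 1. find_all() accepts a list of tag names.
TAGS = ['i', 'b', 'small', 'br', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6']


def has_class_containing(classes, fragment):
    """Return True if any class name contains `fragment` (e.g. 'calibre15')."""
    if classes is None:
        return False
    if isinstance(classes, str):
        # Defensive: some parsers may hand back one space-separated string.
        classes = classes.split()
    return any(fragment in name for name in classes)


def strip_matching_classes(soup, tags=TAGS, fragment='calibre'):
    """2. Delete the class attribute from listed tags whose class matches."""
    changed = 0
    for tag in soup.find_all(tags):
        if has_class_containing(tag.get('class'), fragment):
            del tag['class']
            changed += 1
    return changed


def remove_meta(soup, name):
    """3. Remove every <meta> tag whose name attribute equals `name`."""
    # attrs={...} is needed because find_all's own first parameter is
    # also called `name` (it is the tag name).
    metas = soup.find_all('meta', attrs={'name': name})
    for meta in metas:
        meta.decompose()  # removes the tag from the tree
    return len(metas)
```

Inside the loop of run(bk) this would be used as strip_matching_classes(soup) and, for example, remove_meta(soup, 'generator'), before the existing comparison with orig_html. Note that, like the original code, this deletes the whole class attribute even if the tag carries other class names as well.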
Maybe it's trivial, but I am green: Python 2.x for GIMP is the farthest I have gone, and I couldn't make anything of your link.

Thanks!
* Sorry for the delay: too many irons...
** Is this getting off topic? (Would it fit better in the plug-ins forum?)