MobileRead Forums


carmenchu 04-08-2020 07:38 AM

Removing calibre classes...
 
As begun in this thread (moved from another one with a misleading title), I have been working on a plug-in to remove those 'calibre#' classes that usually implement just the CSS defaults for the tag, and also redundant/unnecessary <meta.../> tags.
My (working) code so far is:
Spoiler:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys, os, re

from sigil_bs4 import BeautifulSoup


def run(bk):

    # get all html files
    for (html_id, href) in bk.text_iter():

        deletec=[]
        deletem=[]

        file_name = os.path.basename(href)
        html = bk.readfile(html_id)

        # convert html to soup
        soup = BeautifulSoup(html, 'html.parser')
        orig_html = str(soup)

        # get all i,b,small,sup,sub,br,a,li tags with class='*calibre*' (containing all those characters, in fact)
        tags = soup.find_all(['i', 'b', 'small', 'sub', 'sup', 'br', 'a', 'li'], class_=re.compile("calibre"))
        for tag in tags:
            theclass = tag['class'] # list under 'html.parser': can be multivalued
            if len(theclass) == 1: # not a multivalued class
                # remove class attribute
                deletec = deletec + [(tag.name, theclass)]
                del tag['class']
            # else: # remove the *calibre* style from class? How?

        # this clears the plug-in console: add after
        metas = soup.find_all('meta', attrs={'name': True})
        for meta in metas:
            # exclude 'calibre:cover', remove others
            if not 'cover' in meta['name']: # works here: string, and NOT above: list
                deletem = deletem + [(meta.name, meta['name'])]
                meta.decompose() # all previous print statements lost here

        # update file with changes
        if str(soup) != orig_html:
            bk.writefile(html_id, str(soup))
            # write a list of changes for checking
            print(deletec, sep=' ')
            print(deletem, sep=' ')
            print(file_name, 'updated')

    print('Done')
    return 0

def main():
    print("I reached main when I should not have")
    return -1

if __name__ == "__main__":
    sys.exit(main())

- When run, it prints the full list of removals in the plug-in window, so that one can check that nothing unintended is affected. It gets rid of calibre classes on i, b, small, sup, sub, br, a and li, plus all <meta name="..."/> tags. Composite (multi-valued) classes and metas whose name contains 'cover' are excluded.
Now, I would appreciate further help from experts in improving the plug-in, allowing as options:
* edit the tag list
* include/exclude the metas removal
* include/exclude showing the (huge!) list of removals (i.e., a 'test mode')

As the Sigil_Plugin_Framework_rev12.epub lacks information on GUIs, please:
- For my needs, would Tkinter or PyQt5 be the simpler approach?
- Can some-helpful-body provide a very simple template (no bells or whistles) working in Sigil?

* Improvement of code welcome (I am learning)
** If somebody is interested, I can provide the plug-in in its present working state... for use only if those classes are in your way: maybe some 'user agents' require them?
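
On the open question in the code above (`# else: remove the *calibre* style from class? How?`), here is a minimal sketch of one possible answer. It assumes that the unwanted tokens are always the word `calibre` followed by optional digits, which is stricter than the `re.compile("calibre")` search used in the plug-in. The helper name `strip_calibre_tokens` is made up for illustration. With BeautifulSoup, the returned list could be assigned back to `tag['class']`, or the attribute deleted when the list comes back empty.

```python
import re

# Assumption: a "calibre class" is the bare word 'calibre' plus optional digits.
CALIBRE_TOKEN = re.compile(r"calibre\d*")


def strip_calibre_tokens(classes):
    """Return the class tokens that are NOT calibre defaults, order preserved."""
    return [c for c in classes if not CALIBRE_TOKEN.fullmatch(c)]


# A composite class keeps its non-calibre part; a sole calibre class is emptied.
print(strip_calibre_tokens(["calibre3", "quote"]))
print(strip_calibre_tokens(["calibre"]))
```

Using `fullmatch` rather than `search` means that a class such as `calibre_pb` is left alone, which may or may not be what is wanted.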

bravosx 04-08-2020 08:49 AM

I'm trying to use your plugin and I get this error:

Status: failed

Traceback (most recent call last):
File "C:\Program Files\Sigil\plugin_launchers\python\launcher.py", line 135, in launch
target_script = __import__(script_module)
File "C:\Users\xxxxxxxx\AppData\Local\sigil-ebook\sigil\plugins\MyPlugin\plugin.py", line 10
deletec=[]
^
IndentationError: expected an indented block
Error: expected an indented block (plugin.py, line 10)

carmenchu 04-08-2020 09:11 AM

Sorry, that is just the 'run' code, and my copy & paste deleted the Python indents, so it cannot work as posted. I have attached the plug-in that works for me.
Caveat:
* test it on a 'short' epub
* look at the plug-in window to see what has been deleted
* and test further with preview before saving the changed epub

bravosx 04-08-2020 09:34 AM

Ok. Now the plugin has worked.
Status: success

Thanks and best regards
bravosx

DiapDealer 04-08-2020 09:46 AM

Quote:

Originally Posted by carmenchu (Post 3973708)
As the Sigil_Plugin_Framework_rev12.epub lacks information on GUIs, please:
- For my needs, would Tkinter or PyQt5 be the simpler approach?
- Can some-helpful-body provide a very simple template (no bells or whistles) working in Sigil?

The plugin framework provides no extensions/improvements to python gui functionality, hence the reason there's no information provided in the Plugin Framework documentation. I'm sure some will be able to offer help in constructing guis with python using tkinter or pyqt5, but it's beyond the scope of our plugin framework guide. There are plenty of tutorials/examples available online for making tkinter and pyqt5 dialogs, and there are plenty of examples of integrating such dialogs with Sigil plugins in many of the available Sigil plugins.

"Simpler" is in the eye of the beholder. I've done a lot of tkinter guis for my various plugins, but I've always found tkinter to be terribly unintuitive for some reason. I'm not sure why; it might just be me. One advantage PyQt has is that the gui will match Sigil's gui style, and the plugin framework even offers ways to ensure that the plugin's dialogs will match Sigil's lightmode/darkmode user preference. Tkinter will only automatically do that on macOS (with newer versions of macOS). My plugins with GUI's will eventually all use PyQt5 instead of tkinter.

carmenchu 04-08-2020 10:03 AM

1 Attachment(s)
Quote:

Originally Posted by bravosx (Post 3973776)
Ok. Now the plugin has worked.
Status: success

Thanks and best regards
bravosx

Thank you for testing.
Attached is a slight modification: moving two lines of code improves the report ... I am learning :o

JSWolf 04-08-2020 11:19 AM

Any chance this could also be made as a Calibre plugin?

Thomas_AR 04-08-2020 12:01 PM

Hi,
I tested your plugin on a small epub. Not much to remove. The plugin successfully removed some unnecessary meta tags.
Working good so far.

Thomas

Doitsu 04-08-2020 12:52 PM

Quote:

Originally Posted by carmenchu (Post 3973708)
As the Sigil_Plugin_Framework_rev12.epub lacks information on GUIs, please:
- For my needs, would Tkinter or PyQt5 be the simpler approach?
- Can some-helpful-body provide a very simple template (no bells or whistles) working in Sigil?

My unofficial regex tester validation plugin has a very simple Tkinter GUI (just two radio buttons, a text box, a check box and two push buttons).

Also have a look at DiapDealer's plugins.

BTW, if you just want some customization options, you could "cheat," and simply read and write preference settings, which is very easy.

carmenchu 04-08-2020 01:46 PM

Quote:

Originally Posted by Thomas_AR (Post 3973869)
Hi,
I tested your plugin on a small epub. Not much to remove...

Thanks for the feedback: the plug-in is intended for epubs converted with calibre 'at some point of their history': almost 90% of the ones I've got.

Quote:

Originally Posted by Doitsu (Post 3973888)
My unofficial regex tester validation plugin has a very simple Tkinter GUI (just two radio buttons, a text box, a check box and two push buttons).

Tons of thanks! It seems to hit the bull's eye.
Quote:

Originally Posted by Doitsu (Post 3973888)
BTW, if you just want some customization options, you could "cheat," and simply read and write preference settings, which is very easy.

It doesn't seem to allow for editing the preferences, does it?
Of course, I intend to use preferences as well--those seem to have a documentation I can understand.
But my search into PyQt documentation has devolved into headache.
Maybe I have been spoiled by Gimp, which offers a framework for python plugins which makes their GUIs quite trivial.
:thanks:

DiapDealer 04-08-2020 02:29 PM

Quote:

Originally Posted by carmenchu (Post 3973918)
It doesn't seem to allow for editing the preferences, does it?

If the preferences are few and simple enough, directly editing the plugin's preferences json file is a way of editing preferences that many Sigil plugins utilize. Especially if they won't be changing very often. But if you're envisioning preferences that will be updated frequently (perhaps each time the plugin is run even), then you will probably want a gui dialog that retrieves/sets those preferences. There's plenty of examples of both approaches in the user-contributed plugins.
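
As a rough illustration of the "hand-edited JSON file" approach, the sketch below reads a small preferences file and falls back to defaults for any key the user has not set. It uses only the standard `json` module so that it runs outside Sigil; inside a plugin, the framework's own `bk.getPrefs()` and `bk.savePrefs()` take care of locating and writing the file. The file name `demo_prefs.json`, the helper names and the `report` key are assumptions made for the example; `tags` and `metas` are the keys mentioned in this thread.

```python
import json
import os
import tempfile

# Hypothetical defaults; 'report' is an invented key for illustration.
DEFAULTS = {"tags": "i,b,small,sub,sup,br,a,li", "metas": True, "report": True}


def read_prefs(path, defaults=DEFAULTS):
    """Load a JSON preferences file, using defaults for missing keys."""
    prefs = dict(defaults)
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            prefs.update(json.load(fh))
    return prefs


def write_prefs(path, prefs):
    """Write preferences as indented JSON so they are easy to edit by hand."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(prefs, fh, indent=2)


# Simulate a user who has only changed one setting by hand.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_prefs.json")
write_prefs(demo_path, {"metas": False})
print(read_prefs(demo_path))
```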

Doitsu 04-08-2020 04:15 PM

Quote:

Originally Posted by carmenchu (Post 3973918)
But my search into PyQt documentation has devolved into headache.

The Qt documentation really isn't great, but PyQt can also be relatively simple if you stick to basic controls. For example, this post shows how to display a message box.

Displaying an input box is even easier:
Spoiler:
Code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys
from PyQt5.QtWidgets import QApplication, QInputDialog
from PyQt5 import QtCore

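def run(bk):
    # a QApplication instance must exist before any widget is created
    app = QApplication(sys.argv)
    dialog = QInputDialog()  # named 'dialog' so the built-in input() is not shadowed
    dialog.setWindowFlags(QtCore.Qt.WindowCloseButtonHint) # hide "What is this?" icon [?]
    dialog.setWindowTitle("QInputDialog demo")
    # exec_() blocks until the dialog is closed; it returns 0 if the user cancelled
    if dialog.exec_():
        print("You entered:", dialog.textValue())
    else:
        print("Input cancelled")

    return 0
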
def main():
    print('I reached main when I should not have\n')
    return -1

if __name__ == "__main__":
    sys.exit(main())


carmenchu 04-10-2020 08:37 AM

1 Attachment(s)
Many thanks, everybody!
Attached is a new version of the plug-in, doing the same, but with:
- an improved (short!) report
- use of 'preferences', per PrefsExampleSimple_v0.0.2--edit them for options.
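
A short report of this kind could be produced by counting identical removals instead of listing each one. The sketch below is only one way to do it and is not taken from the attached plug-in. It assumes each removal is recorded as a `(tag, class)` pair of strings (for example by storing `theclass[0]` rather than the list itself, since lists cannot be counted this way); the function name `summarize` is made up.

```python
from collections import Counter


def summarize(removals):
    """Collapse (tag, class) pairs into sorted 'tag.class: count' lines."""
    counts = Counter(removals)
    return ["{}.{}: {}".format(tag, cls, n)
            for (tag, cls), n in sorted(counts.items())]


removed = [("i", "calibre1"), ("b", "calibre2"), ("i", "calibre1")]
for line in summarize(removed):
    print(line)
```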
--------
And now, some grumbling: adding a very simple GUI, with a frame, an editable text field and a couple of buttons, still seems (to me!) a headache.
What I sorely miss in Sigil is the (traditional) Gimp approach, which looks like this in code:
Spoiler:
Code:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
###################################################
# GIMP plugin to export the visible canvas/selection to image as indexed PNG
#        (NEVER put a multi-layer RGB/GRAY image in INDEXED mode: the results can be AWFUL)
# (c) carmen 2019
#
##### released under GNU General Public License v2
###################################################

import sys,os,os.path

sys.stderr = open('G:/Tests/Gimp_plug/python-fu-output.txt','a')
sys.stdout=sys.stderr # So that they both go to the same file
# From hint in https://www.gimp-forum.net/Thread-Debugging-python-fu-scripts-in-Windows
from gimpfu import *

def export_visible_as_indexed(image,GS,crop,numCols,WWW,ditherType,alphaDither):
        # common tasks
        name,_=os.path.splitext(image.filename) # _ stands for the extension: not wanted
        postFix='_xcf.png'
        # if GS: postfix='-GS' + postfix
        # if crop: postfix='-crop' + postfix
        # theName=name + postfix
        theName=name
        if GS: theName += '-GS'
        if crop: theName += '-crop'
        theName += postFix
        if os.path.exists(theName):
                pdb.gimp_message(theName + ' already exists: Nothing doing!')
                # does away with 'frozen green progress bar' in UI
        else:
                pdb.gimp_edit_copy_visible(image) # action: returns True if successful: Copy from the projection or selection.
                precision = pdb.gimp_image_get_precision(image)
                if precision > 150:
                        # pdb.gimp_message('Precision too high: Nothing doing!')
                        theImage=pdb.gimp_image_new_with_precision(pdb.gimp_image_width(image),pdb.gimp_image_height(image),pdb.gimp_image_base_type(image),150) # Precision needed for conversion to indexed
                        layer=pdb.gimp_layer_new_from_visible(image,theImage,'Pasted layer')
                        # layer = pdb.gimp_layer_new_from_visible(image, dest_image, name)
                        selection = pdb.gimp_image_get_selection(image)
                        theImage.insert_layer(layer,position=-1)
                        # pdb.gimp_image_insert_layer(theImage,layer,0,0) #### TypeError: wrong parameter type -> Bug
                        # pdb.gimp_image_insert_layer(image, layer, parent, position)
                        pdb.plug_in_autocrop_layer(theImage,selection)
                else:
                        theImage=pdb.gimp_edit_paste_as_new_image()# (void): Paste buffer to a new image.
                        layer=theImage.layers[0]
                # type=pdb.gimp_image_base_type(theImage)
                # print 'Initial base_type is "%s" ' % pdb.gimp_image_base_type(theImage)
                if GS and pdb.gimp_image_base_type(theImage) != 1: pdb.gimp_image_convert_grayscale(theImage)
                # print 'Resulting base_type is "%s" ' % pdb.gimp_image_base_type(theImage)
                if crop: pdb.plug_in_autocrop(theImage,layer)
                paletteType=2 if WWW else 0
                pdb.gimp_image_convert_indexed(theImage,ditherType,paletteType,numCols,alphaDither,0,0)
                # fails if precision > 150
                pdb.file_png_save(theImage,layer,theName,theName,0,9,0,0,0,0,0)
                pdb.gimp_image_delete(theImage)


register(
                "export_visible_as_indexed",
                "Export visible canvas/selection as indexed PNG",
                "Export visible canvas/selection as indexed PNG",
                "carmen",
                "carmen",
                "2019",
                "Export visible as indexed PNG...",
                "RGB*, GRAY*",
                [
                        (PF_IMAGE, 'image', 'Input image', None),
                        (PF_BOOL, 'GS', 'Make Grayscale?', False),
                        (PF_TOGGLE, 'crop', 'Autocrop image?', False),
                        (PF_SPINNER, 'numCols', 'Number of colors:', 256,(1, 256, 1)),
                        (PF_BOOL, 'WWW', 'WWW optimized?', False),
                        (PF_OPTION, 'ditherType',    'Dither colors:',    0, ['No dithering','Floyd-Steinberg','Floyd-Steinberg + reduced bleeding','Fixed']),
                        (PF_BOOL, 'alphaDither', 'Dither transparency?', False)
                ],
                [],
                export_visible_as_indexed,
                menu='<Image>/Export As/',
)

main()


(notice the register() at bottom: that's the plug-in GUI) and runs as:
Attachment 178226
the pop-up dialog being created by the PF_... entries, which also gather them as variables for the function in the script.
The main point is that one can focus on the code, and leave the main program to extrude a 'standard' dialog (advanced plug-in coders can also implement their own, non-standard dialogs).
I suspect that one should be able to do something of the kind in Sigil through 'class instances' (above my head!) as
'Seven classes for the seven options under the sky,
one class to bind them all and under Sigil run them...'
For wizards, of course!

Doitsu 04-10-2020 12:33 PM

IMHO, your plugin is almost ready for an official release. I only have one minor nitpick: the plugin never reads or writes the "metas" preference setting.

I also have a minor suggestion:

if you change:

Code:

        tags = soup.find_all(ttags, class_=re.compile("calibre"))
to:

Code:

        if ttags == ['']:
            tags = soup.find_all(class_=re.compile("calibre"))
        else:
            tags = soup.find_all(ttags, class_=re.compile("calibre"))

the user can use the following preference setting:

Code:

  "tags": ""
to tell the plugin to remove calibre classes from all tags.
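
The comparison with `['']` works because splitting an empty string still yields one (empty) item. The sketch below shows that, together with a slightly more forgiving parser. It assumes the "tags" preference is a comma-separated string, which the attached plug-in may or may not use; `parse_tags` is a hypothetical helper name. Returning `None` for "no filter" is convenient because `find_all(None, ...)` is the same as leaving the tag name out.

```python
def parse_tags(pref_value):
    """Turn a comma-separated preference string into a list of tag names.

    Returns None for an empty setting, meaning 'do not filter by tag name'.
    """
    names = [t.strip() for t in pref_value.split(",") if t.strip()]
    return names or None


print("".split(","))            # why the check is against [''] and not []
print(parse_tags(""))
print(parse_tags("i, b ,small"))
```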

carmenchu 04-10-2020 08:31 PM

1 Attachment(s)
Quote:

Originally Posted by Doitsu (Post 3974641)
IMHO, your plugin is almost ready for an official release. I only have one minor nitpick: the plugin never reads or writes the "metas" preference setting.

Thanks: I seem to have corrected it in the new attachment.
Query: can really
Code:

prefs['metas'] = prefs['metas']
be required? Seems sappy...
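
On the query above: as far as I understand the Sigil preferences object, values set in `prefs.defaults` are returned on lookup but are not written to the JSON file; only keys that were explicitly assigned are saved. The self-assignment therefore copies the default into the stored keys, so that it appears in the file and can be edited by hand. The toy class below only models that behaviour for illustration. It is not Sigil's code, and `ToyPrefs` is an invented name.

```python
import json


class ToyPrefs(dict):
    """Stand-in for a prefs object with separate defaults (illustration only)."""

    def __init__(self):
        super().__init__()
        self.defaults = {}

    def __getitem__(self, key):
        # Fall back to the defaults when the key was never stored.
        try:
            return super().__getitem__(key)
        except KeyError:
            return self.defaults[key]


prefs = ToyPrefs()
prefs.defaults["metas"] = True
print(json.dumps(prefs))         # what would be saved before the assignment
prefs["metas"] = prefs["metas"]  # copy the default into the stored keys
print(json.dumps(prefs))         # what would be saved afterwards
```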

Quote:

Originally Posted by Doitsu (Post 3974641)

I also have a minor suggestion:

if you change:

Code:

        tags = soup.find_all(ttags, class_=re.compile("calibre"))
to:

Code:

        if ttags == ['']:
            tags = soup.find_all(class_=re.compile("calibre"))
        else:
            tags = soup.find_all(ttags, class_=re.compile("calibre"))

the user can use the following preference setting:

Code:

  "tags": ""
to tell the plugin to remove calibre classes from all tags.

I have implemented a variant of your suggestion, and (I hope!) avoided the trouble of a void "tags": "", getting the "tags" preferences obliterated on next run.
It's an option which I wouldn't recommend, save for report purposes: one can achieve the same result by a simple search & replace of class="calibre\d*" (I purposely skip class="calibre\d* other" except in the report: my notion isn't to eliminate manual searches and edits, but only their mechanical part).
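
The search & replace mentioned here can be sketched with Python's `re` module. The pattern below removes a class attribute only when `calibre` plus optional digits is its sole value, so composite classes are left untouched, as described. It assumes double-quoted attributes with no extra spaces inside the quotes; `drop_sole_calibre` is a made-up name.

```python
import re

# Matches a class attribute whose ONLY value is 'calibre' + optional digits,
# including the whitespace in front of it.
SOLE_CALIBRE = re.compile(r'\s+class="calibre\d*"')


def drop_sole_calibre(markup):
    """Remove class="calibreN" attributes; leave composite classes alone."""
    return SOLE_CALIBRE.sub("", markup)


sample = '<i class="calibre3">x</i> <p class="calibre1 other">y</p>'
print(drop_sole_calibre(sample))
```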
By the way, is there a simple way of getting the list of tags ordered? It would be easier to read and edit, but I can't get it done...
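
For ordering the list, Python's built-in `sorted()` is probably enough; wrapping the list in `set()` first also drops duplicates. A minimal sketch, assuming the tags are held as a plain list of strings (the helper name `ordered_tags` is invented):

```python
def ordered_tags(tags):
    """Return the unique tag names in alphabetical order."""
    return sorted(set(tags))


tags = ['i', 'b', 'small', 'sub', 'sup', 'br', 'a', 'li', 'b']
print(ordered_tags(tags))
# Joined back into one string, e.g. for storing in a preference:
print(",".join(ordered_tags(tags)))
```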
----

Do people know that there is a downloadable epub version of the Python manual here? Consulting it in, e.g., the calibre viewer, with a full, searchable ToC and searchable text, is a huge improvement: try it!


All times are GMT -4.