|  01-02-2018, 10:23 AM | #256 | 
| Sigil Developer            Posts: 9,070 Karma: 6361556 Join Date: Nov 2009 Device: many | 
			
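			Hi Doitsu, I have slightly modified your GumboOffset example to do what we do inside Sigil, which makes it more general. Code: import os
import sys
# assumption: escape() comes from xml.sax.saxutils, which does not escape quotes by default
from xml.sax.saxutils import escape
import sigil_gumbo_bs4_adapter as gumbo_bs4
wspace = (" ", "\n", "\r", "\t", "\v", "\f")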
def preprocess(src):
    newsrc = src
    line_offset = 0
    pos_offset = 0
    n = len(src)
    if src.startswith("<?xml"):
        # remove any xml header line and trailing whitespace
        end = src.find('>',5)
        if end != -1:
            end = end + 1
            while end < n and src[end:end+1] in wspace:
                if src[end:end+1] == "\n":
                    line_offset += 1
                end += 1
        # only strip the header when its closing '>' was actually found
        if end != -1 and end < n:
            pos_offset = end
            newsrc = src[end:]
    return (newsrc, line_offset, pos_offset)
 
def run(bk):
    for id_type, id in bk.selected_iter():
        filename =  os.path.basename(bk.id_to_href(id))
        html = bk.readfile(id).replace('\r\n', '\n')
        (html, line_offset, pos_offset)  = preprocess(html)
        soup = gumbo_bs4.parse(html)
        
        for para in soup.find_all('p'):
            linenumber = para.line + line_offset
            colnumber = para.col
            offset =  para.offset + pos_offset
            message = escape(str(para)).replace('"', '&quot;')
            bk.add_extended_result('info', filename, linenumber, offset, 'Line: ' + str(linenumber) + ' Col: ' + str(colnumber) + ' Gumbo method: ' + message)
        
    return 0
        
def main():
    print('I reached main when I should not have\n')
    return -1
if __name__ == "__main__":
    sys.exit(main())
Last edited by KevinH; 01-02-2018 at 10:26 AM. | 
|   |   | 
|  01-02-2018, 11:24 AM | #257 | |
| Grand Sorcerer            Posts: 5,762 Karma: 24088559 Join Date: Dec 2010 Device: Kindle PW2 | Quote: 
  Thanks for the updated code, I'll check it out tomorrow. | |
|   |   | 
|  01-02-2018, 11:39 AM | #258 | 
| Guru            Posts: 899 Karma: 3501166 Join Date: Jan 2017 Location: Poland Device: Various | 
			
			From the fourth paragraph onward, the offset is shifted. The problem is related to double-byte diacritics. Code: <div>
<p>test</p>
<p>test</p>
<p>test</p>
<p xml:lang="es">información</p>
<p>here</p>
<p>here</p>
<p>here</p>
<p xml:lang="pl">żółtko</p>
<p>end</p>
<p>end</p>
<p>end</p>
</div> | 
|   |   | 
|  01-02-2018, 12:03 PM | #259 | 
| Sigil Developer            Posts: 9,070 Karma: 6361556 Join Date: Nov 2009 Device: many | 
			
			Yes, the offset that gumbo records is a byte offset from the start of a UTF-8 encoded file or string. The column number is "proper" in that it is measured in Unicode code points, not in bytes. The solution is to use the routine previously posted by Doitsu to convert line and column numbers inside Python to an offset in Unicode code points, if that is what you want. Offsets are hard to work with because they are encoding dependent, whereas line and column numbers given in code points should be easier to work with and to convert to any encoding you like. KevinH | 
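To illustrate the difference between the two kinds of offset, here is a minimal sketch in plain Python. It is not Doitsu's routine and not part of Sigil or gumbo; the function names are made up for this example, and it assumes 1-based line and column numbers and "\n" line endings (as in the plugin above, which replaces "\r\n" first).

```python
def byte_to_codepoint_offset(text, byte_offset):
    """Convert a UTF-8 byte offset into an offset counted in code points."""
    prefix = text.encode("utf-8")[:byte_offset]
    # errors="ignore" drops a trailing partial character if the byte
    # offset happens to fall inside a multi-byte sequence.
    return len(prefix.decode("utf-8", errors="ignore"))


def line_col_to_codepoint_offset(text, line, col):
    """Convert a (line, col) pair into a code point offset.

    Assumes 1-based line and column numbers and LF line endings.
    """
    lines = text.split("\n")
    # Each earlier line contributes its length plus one for the newline.
    return sum(len(entry) + 1 for entry in lines[:line - 1]) + (col - 1)


# Hypothetical sample: the second paragraph starts after a line
# that contains one two-byte character.
sample = "<p>información</p>\n<p>here</p>\n"
second_para_bytes = len("<p>información</p>\n".encode("utf-8"))
second_para_codepoints = byte_to_codepoint_offset(sample, second_para_bytes)
```

In this sample the byte offset and the code point offset of the second paragraph differ by one, because "ó" takes two bytes in UTF-8. Both helpers arrive at the same code point offset, one from the byte offset and one from line and column.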
|   |   | 
|  01-02-2018, 12:07 PM | #260 | 
| Sigil Developer            Posts: 9,070 Karma: 6361556 Join Date: Nov 2009 Device: many | 
			
			To make things even harder, a Qt QChar is a little-endian UTF-16 encoding, which makes offsets harder to work with unless you define exactly which basis you are using! The validation plugin should be passed the offset in Unicode code points if you need exact positioning in the validation result window inside Sigil. The gumbo line and col info can be used to accurately determine the offset in code points. If, on the other hand, you want to use offsets into Python UTF-8 bytestrings to extract things, the gumbo offsets can be used directly for that. Last edited by KevinH; 01-02-2018 at 12:17 PM. | 
|   |   | 
|  01-07-2018, 08:52 AM | #261 | 
| Connoisseur            Posts: 57 Karma: 600000 Join Date: Jan 2018 Device: Galaxy Tab S2 | 
				
				tk.withdraw
			 
			
			Hi, I hope this is now my last issue transferring my stuff from Win to Mac. I have one self-written plugin which isn't working correctly under Mac OS X, but works fine under Win. I've already installed the ActiveState Tcl, and the only problem I have now is with my own script. I've nailed that down to this: Code: #!/usr/bin/env python
# -*- coding: utf-8 -*-
 
# target script
 
import sys
from tkinter import *
from tkinter import messagebox
def run(bk):
 
    root = Tk()
#    root.withdraw()
 
    print('Start\n')
    if messagebox.askyesno("Testquestion", "Ok to go on?"):
        print('Middle\n')
        
    print('End\n')
    
    return 0
def main():
    print ('I reached main when I should not have\n')
    return -1
 
if __name__ == "__main__":
    sys.exit(main())
As I now can't give any input, the Launcher stops at "Start" and never gets to "Middle" or "End". What am I doing wrong? Greets Maui | 
|   |   | 
|  01-07-2018, 10:43 AM | #262 | 
| Grand Sorcerer            Posts: 5,762 Karma: 24088559 Join Date: Dec 2010 Device: Kindle PW2 | 
			
			I don't know why your code doesn't work, but if all you need is a message box, you might as well use PyQt5, which is bundled with Sigil 0.9.8 and higher: Code: #!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys
from PyQt5.QtWidgets import QApplication, QMessageBox
   
def run(bk):
    print('Start\n')
    app = QApplication(sys.argv)
    msg = QMessageBox()
    msg.setWindowTitle("QMessageBox demo")
    msg.setText("This is a QMessageBox.")
    msg.setStandardButtons(QMessageBox.Ok | QMessageBox.Cancel)
    buttonClicked = msg.exec_()
    
    if buttonClicked == QMessageBox.Ok:
        print('\nYou clicked OK.')
    else:
        print('\nYou clicked Cancel.')
    
    print('\nEnd')
    
    return 0
        
def main():
    print('I reached main when I should not have\n')
    return -1
if __name__ == "__main__":
    sys.exit(main()) | 
|   |   | 
|  01-07-2018, 11:22 AM | #263 | 
| Sigil Developer            Posts: 9,070 Karma: 6361556 Join Date: Nov 2009 Device: many | 
			
			On a Mac, Python Tk main windows do not automatically grab focus or come to the front. They are often hidden under other windows. You need to click on the Python launcher icon that gets added to the end of the Dock to force that window to the front and make it take focus. There is a workaround in Python to force the main window to the front and grab focus. See my FolderIn, FolderOut, or ePub3-itizer plugin code, which uses this workaround so that Tk graphics work as expected on a Mac. See the needed code here: https://github.com/kevinhendricks/eP.../src/plugin.py (the bulk of the Tk stuff starts near line 262; the added code is under the darwin if). Last edited by KevinH; 01-07-2018 at 11:33 AM. | 
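For readers who cannot follow the link, here is a minimal sketch of one common way to raise a Tk window on macOS. It is an assumption about the general technique, not a copy of the code in the linked plugin; the helper name bring_to_front and the ask() wrapper are made up for this example.

```python
import sys


def bring_to_front(root, platform=sys.platform):
    """Raise a Tk root window and give it focus on macOS.

    Returns True if the workaround was applied, False on other platforms.
    """
    if not platform.startswith("darwin"):
        return False
    root.lift()
    # Make the window topmost for a moment so it rises above other
    # applications, then release the flag once Tk is idle again.
    root.attributes("-topmost", True)
    root.after_idle(root.attributes, "-topmost", False)
    root.focus_force()
    return True


def ask(question_title, question_text):
    """Show a yes/no box with a hidden root window (needs a display)."""
    # Imported here so that the helper above stays usable without Tk.
    from tkinter import Tk, messagebox
    root = Tk()
    root.withdraw()
    bring_to_front(root)
    answer = messagebox.askyesno(question_title, question_text)
    root.destroy()
    return answer
```

Inside a plugin's run(bk), a call such as ask("Testquestion", "Ok to go on?") would replace the direct messagebox call. Whether this is enough on a given macOS and Tcl/Tk combination has to be tested locally.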
|   |   | 
|  01-09-2018, 10:23 AM | #264 | 
| Connoisseur            Posts: 57 Karma: 600000 Join Date: Jan 2018 Device: Galaxy Tab S2 | 
			
			Hi, thanks for your help. I have now successfully transferred my Sigil setup completely from Win 7 to Mac High Sierra. Maui | 
|   |   | 
|  01-16-2018, 11:26 AM | #265 | 
| Guru            Posts: 899 Karma: 3501166 Join Date: Jan 2017 Location: Poland Device: Various | 
			
			It's me again. Based on the Regex tester, the line number and offset work perfectly for XHTML files. But how do I calculate the line number for the OPF (e.g. in the metadata or guide)? For example, I want to check whether the language of the file is set to Polish and whether the guide section is present. For a missing element I need the line of the opening <metadata> or <guide> tag; if an error occurs, the exact location in the file. I've attached a sample plugin and a test file. | 
|   |   | 
|  01-16-2018, 11:38 AM | #266 | |
| Grand Sorcerer            Posts: 5,762 Karma: 24088559 Join Date: Dec 2010 Device: Kindle PW2 | Quote: 
 Code: coffset = charoffset(linenumber, colnumber, offlst)
if filename == 'content.opf':
    coffset += linenumber - 1
Last edited by Doitsu; 01-19-2018 at 07:49 AM. | |
|   |   | 
|  01-18-2018, 07:09 PM | #267 | 
| Witchman            Posts: 628 Karma: 788808 Join Date: May 2013 Location: Philippines Device: Android S5 | 
			
			@Becky... Regarding just getting the line number for the meta language identifier, you could perhaps put Kevin's code into a simple function and get the line number with something like this: Code: def getOPFLineNumber(opf, search_text):
    opf_data = opf.splitlines()     # split the opf string into separate lines of code
    
    # assign a line num to the search text when found
    linenum = ''    # stays empty if the search text is never found
    for index, line in enumerate(opf_data):
        if search_text in line:
            linenum = str(index + 1)    # line numbers are 1-based
            break
    return linenum
Code:
    dc_language = None
    meta_language = tree.find('.//{http://purl.org/dc/elements/1.1/}language')
    if hasattr(meta_language, 'text'):
        dc_language = meta_language.text
    if dc_language:
        if dc_language != "pl":
            opf_data = bk.get_opf()    # get all the opf data as a string
            linenumber = getOPFLineNumber(opf_data, "<dc:language>")
            message = "Language specified in metadata is other than 'pl' (Polish)"
            bk.add_extended_result("error", "content.opf", linenumber , 0, 'Becky Metadata #002' + ' -- ' + message)
    else:
        message = "Language not specified in metadata"
        bk.add_extended_result("error", "content.opf", 0, 0, 'Becky Metadata #001' + ' -- ' + message)
Also, I know that using bk.get_opf() will certainly work for an Edit plugin, but I'm not at all sure whether this bk method is available for a Validation plugin like yours. If it is not available, you could probably use xml.etree to get the OPF file contents as a string instead of using bk.get_opf(). Last edited by slowsmile; 01-18-2018 at 09:58 PM. | 
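One caveat with a plain substring search for "<dc:language>" is that it misses tags that carry attributes, such as an id. The sketch below is a hypothetical variant (the name find_tag_line is invented here) that matches the opening tag by name with a regular expression and returns the line number as an integer, or None if the tag is absent. It only uses the Python standard library and makes no assumptions about the bk object.

```python
import re


def find_tag_line(opf_text, tag_name):
    """Return the 1-based line of the first opening <tag_name ...>, or None."""
    # The lookahead requires whitespace, '>' or '/' after the name, so
    # "<meta" does not match "<metadata" and closing tags are skipped.
    pattern = re.compile(r"<" + re.escape(tag_name) + r"(?=[\s>/])")
    for number, line in enumerate(opf_text.splitlines(), start=1):
        if pattern.search(line):
            return number
    return None


# Hypothetical OPF fragment used only for illustration.
sample_opf = (
    '<?xml version="1.0"?>\n'
    '<package>\n'
    '  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">\n'
    '    <dc:language id="lang">en</dc:language>\n'
    '  </metadata>\n'
    '</package>\n'
)
language_line = find_tag_line(sample_opf, "dc:language")
guide_line = find_tag_line(sample_opf, "guide")
```

When an element such as guide is missing, a plugin could fall back to find_tag_line(opf_text, "metadata") or "package" so that the reported error still points at a sensible line.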
|   |   | 
|  01-19-2018, 06:29 AM | #268 | 
| Guru            Posts: 899 Karma: 3501166 Join Date: Jan 2017 Location: Poland Device: Various | 
			
			Thank you both: @Doitsu for the hint, @slowsmile for the solution. That should be enough for me. It is what I wanted, but I could not do it myself. I need one more thing: how can a plugin detect whether a file has been modified but not saved (an asterisk appears after the filename)? | 
|   |   | 
|  01-19-2018, 07:41 AM | #269 | |
| Grand Sorcerer            Posts: 28,860 Karma: 207000000 Join Date: Jan 2010 Device: Nexus 7, Kindle Fire HD | Quote: 
 Plugins work on copies of files, meaning changes aren't incorporated into the existing epub until after the plugin completes. There are some dictionaries in wrapper.py that can be accessed to get that info, but that's considered (somewhat) bad form, since unforeseen things can happen if those dictionary contents are accidentally modified. | |
|   |   | 
|  01-20-2018, 05:31 AM | #270 | 
| Grand Sorcerer            Posts: 5,762 Karma: 24088559 Join Date: Dec 2010 Device: Kindle PW2 | 
			
			I've got a related question: is there a built-in property that I can use to make Sigil display the dirty-flag asterisk even when no file was changed? (I know that I can simply add and remove a dummy file, but I'm wondering if there's a more elegant method using a built-in property.)
		 | 
|   |   | 