Old 09-04-2015, 06:22 AM   #1
Doitsu
Grand Sorcerer
Posts: 5,583
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
URL Checker plugin

[Plugin] URLChecker - Checks URLs

Updated: April 18, 2021
Current Version: "0.3.0"

Credits: The latest version was created by aokai. For more information, see his original post.

Installation:

To install the plugin open Sigil and select:

Plugins > Manage Plugins > Add Plugin > URLChecker_v0.3.0.zip > OK.

Also check the Use Bundled Python check box, if it isn't already checked.

Usage:

To run the plugin select:

Plugins > Validation > URLChecker > Start.

If broken URLs were found, the plugin will display them in the Validation Results window. Otherwise, it'll display a list of all working URLs. It'll also write a log file to the Desktop.

Warning: Since this plugin mimics a web browser, each visited website might log the IP address of your machine and/or store cookies on it. Theoretically, it might also install drive-by malware on your machine if your machine is insufficiently protected.
For this reason, you might want to use this plugin only with ebooks that you've created yourself or obtained from trustworthy sources.

Note: The plugin might report working URLs as broken if the URL was shortened or the website author has disabled web crawling for their site using robots.txt or user-agent-based countermeasures.
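To illustrate what "user agent" means here: a checker identifies itself to each site with a User-Agent header, and some sites reject identifications they don't recognise. The following is only a sketch using Python's standard-library urllib rather than whatever the plugin does internally; the BROWSER_UA string and the build_request helper are made up for illustration.

```python
import urllib.request

# Hypothetical browser-like identification string, chosen for illustration.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64) URLCheckerSketch/0.1"


def build_request(url, user_agent=BROWSER_UA):
    """Prepare (but do not send) a HEAD request with a custom User-Agent."""
    return urllib.request.Request(
        url, headers={"User-Agent": user_agent}, method="HEAD"
    )


req = build_request("http://example.com/")
# urllib stores header names capitalised as "User-agent".
print(req.get_method(), req.full_url, req.get_header("User-agent"))
```

Sending the request would be urllib.request.urlopen(req, timeout=10). A site that blocks unfamiliar user agents or HEAD requests can answer with an error even though the page loads fine in a browser, which is one way a working URL ends up reported as broken.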

License: GNU General Public License v3 (GPL-3)
Attached Files
File Type: zip URLChecker_v0.3.0.zip (3.8 KB, 968 views)

Last edited by Doitsu; 04-18-2021 at 10:35 AM. Reason: New version with more detailed error messages
Old 09-06-2015, 06:50 AM   #2
xingenter
Member Retired
Posts: 16
Karma: 10
Join Date: Apr 2013
Location: UK
Device: none
I was delighted when I saw this plugin and downloaded it to try. I have Python 2 AND 3 installed on my Linux Mint 17.2 PC. I am using version 0.8.6 of Sigil as created by DiapDealer. I followed the instructions to install 'requests' which I did for both versions of Python and I used the --upgrade parameter. So far so good.
When I ran the plugin on an ePub file with lots of URLs I got this message:

Status: failed

Traceback (most recent call last):
  File "/usr/local/share/sigil/plugin_launchers//python/launcher.py", line 134, in launch
    target_script = __import__(script_module)
  File "/home/markb/.local/share/sigil-ebook/sigil/plugins/URLChecker/plugin.py", line 4, in <module>
    from bs4 import BeautifulSoup
  File "/home/markb/.local/share/sigil-ebook/sigil/plugins/URLChecker/bs4/__init__.py", line 30, in <module>
    from .builder import builder_registry, ParserRejectedMarkup
  File "/home/markb/.local/share/sigil-ebook/sigil/plugins/URLChecker/bs4/builder/__init__.py", line 4, in <module>
    from bs4.element import (
  File "/home/markb/.local/share/sigil-ebook/sigil/plugins/URLChecker/bs4/element.py", line 6, in <module>
    from bs4.dammit import EntitySubstitution
  File "/home/markb/.local/share/sigil-ebook/sigil/plugins/URLChecker/bs4/dammit.py", line 12, in <module>
    from html.entities import codepoint2name
ImportError: No module named html.entities
Error: No module named html.entities

I assume I am missing some other dependency. Can anyone help me to get it working, please?

Thanks,

Mr B
Old 09-06-2015, 08:09 AM   #3
Doitsu
Grand Sorcerer
I was able to replicate this error on my old Linux machine. It appears to have been caused either by an incorrectly set Sigil Python 3 binary path or by problems from installing Python 3 alongside Python 2.

On my old machine, the Python 3 path was set to /usr/bin/python, which was linked to the Python 2 executable.

After changing it to /usr/bin/python3 and deleting the Python 2 path, the plugin worked fine.
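For anyone who wants to confirm this on their own system: html.entities exists only in Python 3 (Python 2 calls the module htmlentitydefs), so the ImportError above is what you get when the plugin is run by Python 2. Here is a minimal sketch that asks an interpreter path which major version it really launches; interpreter_major is a hypothetical helper, not part of Sigil or the plugin.

```python
import subprocess
import sys


def interpreter_major(path):
    """Return the major version (2 or 3) of the interpreter at `path`."""
    # Ask the interpreter itself; this one-liner is valid in Python 2 and 3.
    out = subprocess.check_output(
        [path, "-c", "import sys; print(sys.version_info[0])"]
    )
    return int(out.decode("ascii").strip())


# Replace sys.executable with the path shown in Sigil's preferences,
# e.g. "/usr/bin/python", to see what that path actually runs.
print(sys.executable, "is Python", interpreter_major(sys.executable))
```

If the path entered in the Python 3 field reports 2, it points at the wrong executable.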

If that doesn't work for you, try removing the embedded BeautifulSoup4 package and installing it separately with pip or pip3.

1. Open Sigil and select Preferences from the Edit menu.
2. Click Open Preferences Location. This will open the Sigil preferences folder.
3. Open the plugins folder, which contains the URLChecker folder.
4. Open the URLChecker folder and delete the bs4 folder.
5. Install the Python 2/3 versions of BeautifulSoup:

Code:
sudo pip install beautifulsoup4
Code:
sudo pip3 install beautifulsoup4
Then re-test the plugin with either the Python 2 or the Python 3 Sigil interpreter path empty. One of them should definitely work.

P.S. On my old machine I got an IncompleteError when I tried to install beautifulsoup4. If this happens to you, too, you'll have to re-install/update pip/pip3.
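After installing, a quick way to see whether an interpreter can find the packages is to ask it directly. This is just a sketch; module_available is a hypothetical helper, and it needs Python 3.4 or later because it relies on importlib.util.find_spec (under Python 2 you would simply try the import instead).

```python
import importlib.util


def module_available(name):
    """Return True if `name` can be imported by the running interpreter."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec raises if a parent package is missing or the name is odd.
        return False


# bs4 is the import name of beautifulsoup4; requests is the other dependency
# mentioned earlier in this thread.
for name in ("bs4", "requests"):
    print(name, "available:", module_available(name))
```

Run it with the same interpreter that Sigil is configured to use; pip and pip3 may install into different Python installations.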

Last edited by Doitsu; 09-06-2015 at 08:39 AM.
Old 09-11-2015, 03:51 AM   #4
xingenter
Member Retired
Getting it working

I followed your advice. Removing the path to Python 2 made the URL checker work. Removing Python 3 and keeping Python 2 didn't.
Then I deleted the bs4 folder and used pip/pip3 to install Beautiful Soup again. Once I'd done that I was able to run the plugin with either version of Python. That's great because I have several plugins now and some need Python 2, some need 3 so I have to have both.

The plugin does make life a lot easier. Thank you for writing it. I really like how the errors appear in the validation window at the bottom, like they do when running FlightCrew.
What I would find helpful though is to either: keep the window open with all the links scanned so it can be cut & pasted into a document OR offer the option to save to a text file too. Being able to navigate directly to the broken links in Sigil is marvellous but it would be great to have a definitive list of all links. I format books for a small publisher and I could send a file of broken links that URL Checker found.
I hope that makes sense.

I really appreciate your help.

Mr B
Old 09-11-2015, 05:21 AM   #5
eschwartz
Ex-Helpdesk Junkie
Posts: 19,422
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Slightly

Nice plugin. And I imagine it was the inspiration for URL Checker that is now present (builtin tool) in the latest version of calibre Editor.
Old 09-11-2015, 05:38 AM   #6
Doitsu
Grand Sorcerer
Quote:
Originally Posted by evilmrb
What I would find helpful though is to either: keep the window open with all the links scanned so it can be cut & pasted into a document OR offer the option to save to a text file too.
I've updated the tool so that it writes a log file to the Desktop folder (or the home folder if your system doesn't have a Desktop folder).
The log file name starts with "URLChecker", followed by the date and time and ".log", e.g. URLChecker_20150912-112720.log.

(I've attached the new version to the first post.)
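The naming scheme described above can be sketched roughly as follows. This is not the plugin's actual code: log_file_path is a hypothetical helper, and it assumes the Desktop folder is literally called "Desktop" inside the home folder, which isn't true on every system.

```python
import os
from datetime import datetime


def log_file_path(now=None, home=None):
    """Build a path such as <home>/Desktop/URLChecker_20150912-112720.log."""
    now = now or datetime.now()
    home = home or os.path.expanduser("~")
    desktop = os.path.join(home, "Desktop")
    # Fall back to the home folder when there is no Desktop folder.
    folder = desktop if os.path.isdir(desktop) else home
    name = now.strftime("URLChecker_%Y%m%d-%H%M%S.log")
    return os.path.join(folder, name)


print(log_file_path())
```

Because the timestamp is part of the name, each run produces a new file instead of overwriting the previous log.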

Last edited by Doitsu; 09-12-2015 at 05:17 AM.
Old 09-11-2015, 08:10 AM   #7
KevinH
Sigil Developer
Posts: 7,636
Karma: 5433388
Join Date: Nov 2009
Device: many
Hi Doitsu,

FYI, I just added this to the Sigil Plugin Index thread.

Thanks!
KevinH
Old 09-15-2015, 03:11 AM   #8
eathan
Junior Member
Posts: 7
Karma: 10
Join Date: Sep 2013
Device: Kindle paperwhite
Hi Doitsu,

Thank you for an interesting plugin. Taking this opportunity, I would like to ask if you would be interested in adding extra features, which are described in a separate thread:

https://www.mobileread.com/forums/sho...d.php?t=262322
Old 09-15-2015, 03:58 AM   #9
Doitsu
Grand Sorcerer
Quote:
Originally Posted by eathan
Taking this opportunity, I would like to ask if you would be interested in adding extra features, which are described in a separate thread:
At this time, I don't plan on adding any new features. The plugin is rather simple, the source code is sufficiently commented, and it uses the well-documented BeautifulSoup HTML parser, so you could easily add the extra features that you're interested in yourself.
Old 09-18-2015, 03:07 AM   #10
xingenter
Member Retired
Quote:
Originally Posted by Doitsu
I've updated the tool so that it writes a log file to the Desktop folder (or the home folder if your system doesn't have a Desktop folder).
The log file name starts with "URLChecker", followed by the date and time and ".log", e.g. URLChecker_20150912-112720.log.

(I've attached the new version to the first post.)
That's brilliant. Thank you so much!

Mr B
Old 09-18-2015, 03:11 AM   #11
xingenter
Member Retired
Quote:
Originally Posted by eschwartz
Slightly

Nice plugin. And I imagine it was the inspiration for URL Checker that is now present (builtin tool) in the latest version of calibre Editor.
I saw that too and thought it was a spooky coincidence. I tested both on an ePub file with lots of links in it. URL checker found a couple of bad links and Calibre found 30. However, I'm certain that there are not 30 bad links so I'd be wary of trusting the Calibre feature too much.
Old 09-18-2015, 03:44 AM   #12
Doitsu
Grand Sorcerer
Quote:
Originally Posted by evilmrb
I saw that too and thought it was a spooky coincidence. I tested both on an ePub file with lots of links in it. URL checker found a couple of bad links and Calibre found 30. However, I'm certain that there are not 30 bad links so I'd be wary of trusting the Calibre feature too much.
Since Kovid Goyal is a professional programmer and I can barely write one line of Python without errors, it's more likely that there's a bug in my very simple plugin.
BTW, it only checks URLs that start with "http". For example, links that start with "file", "www", "ftp" or domain names (e.g. google.com) won't be checked.
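To show the effect of that filter, here is a small sketch. It uses Python's standard-library html.parser instead of the BeautifulSoup parser the plugin really uses, and the names HrefCollector and checkable_urls are made up, so treat it as an illustration of the rule rather than the plugin's code.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def checkable_urls(html):
    """Return only the links that start with "http" (this includes https)."""
    parser = HrefCollector()
    parser.feed(html)
    return [h for h in parser.hrefs if h.startswith("http")]


sample = (
    '<a href="http://example.com/">a</a>'
    '<a href="https://example.org/">b</a>'
    '<a href="www.example.com">c</a>'
    '<a href="ftp://example.com/f">d</a>'
    '<a href="../Text/ch1.xhtml">e</a>'
)
print(checkable_urls(sample))
# prints ['http://example.com/', 'https://example.org/']
```

Links written without a scheme, such as www.example.com, are skipped, so a checker that also tests those could report more broken links for the same book.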

Can you please re-check your file and let me know what kind of broken links Calibre Editor found that my plugin missed?
Old 11-09-2015, 11:52 AM   #13
David Kudler
Enthusiast
Posts: 48
Karma: 10000
Join Date: Apr 2011
Device: iPad
Installing beautifulsoup on Mac OS X

Thanks so much, Doitsu!

BTW, it took me a couple of whacks to install beautifulsoup on my Mac. In case anyone else has the same problem, here's how I did it:
  1. Open Terminal
  2. Log in to an admin account in Terminal (if you're not already in one — but you shouldn't use an admin account for everyday work, right?):
    Code:
    login [admin username]
  3. Enter this command:
    Code:
    sudo easy_install beautifulsoup4
  4. Enter your password, sit back and relax.

Once the installation was done, the plugin worked beautifully.

One nice side benefit: it gives me a list of all of the external URLs used in the ebook — which is helpful for making sure all of them are correct. I've been having to search through one by one, which is a pain. (For example, you don't want any Amazon links in a file you're going to load to Apple!)
Old 11-17-2017, 09:17 AM   #14
AlanHK
Guru
Posts: 668
Karma: 929286
Join Date: Apr 2014
Device: PW-3, iPad, Android phone
This will be useful. I have to put similar links in a series of books, and this gives me a simple way to collect and check them and then paste them into the next book.

However, it reported all my links as "broken", as Sigil is blocked by my firewall. I would prefer an option to just collect links, with no checking. Also, it might check if it has any connectivity at all (e.g. ping Google.com) and, if not, report that first rather than alarming the user.
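A connectivity pre-check like the one suggested here could look something like this. It is only a sketch of the idea, not something the plugin does; can_connect is a hypothetical helper, and it tests a plain TCP connection rather than an ICMP ping.

```python
import socket


def can_connect(host, port=443, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and failed DNS lookups.
        return False
```

A checker could call can_connect("www.google.com") once before testing any links and, if it returns False, report "no network access" instead of marking every URL as broken.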
Old 11-19-2017, 05:09 AM   #15
Doitsu
Grand Sorcerer
Quote:
Originally Posted by AlanHK
However, it reported all my links as "broken", as Sigil is blocked by my firewall.
Feel free to use Calibre Editor. It has a built-in URL checker. (Tools > External Links > Check external links.)

Quote:
Originally Posted by AlanHK
Would prefer an option to just collect links, no checking.
The plugin should generate a log file in the Desktop folder or, if it can't be found, the Documents folder. (The location varies, depending on the OS.) The file name starts with "URLChecker", followed by the date and time and ".log", e.g. URLChecker_20171119-110141.log.

Quote:
Originally Posted by AlanHK
Also it might check if it has any connectivity at all (e.g. ping Google.com) and if not, report that first rather than alarming the user.
This plugin is intended for users who know that running a plugin that checks URLs on a machine with no Internet access is pointless.