Old 11-25-2019, 03:07 AM   #1
Vroni
Banned
Posts: 168
Karma: 10010
Join Date: Oct 2018
Device: Tolino/PRS 650/Tablet
Coping with countless images

Hi,

I have an EPUB with tons of pictures. Roughly 50% of them are just a fleuron, and all of those are the same.

The filenames are all numbered in sequence, so there is no chance to identify a picture by its name.

So I renamed one of the fleuron images to fleuron.jpg to be the master, and I want to change all other occurrences to that file. I can walk through all img elements, but unfortunately the fleuron images are surrounded by a lot of other, different images, and the preview does not make clear which picture the regex has currently matched. So this approach ends up as a lot of manual work: either I replace by trial and, if the wrong image was matched, roll back and try the next one, or I use "open picture in tab" to check whether the currently matched picture is the fleuron before pointing the reference at the master fleuron.

I tried another approach from the report, but you can't rename or jump from the picture list to anything else. You can only delete, which is not helpful.

If the preview highlighted what is selected in Code View, this would be easy, but that's not available.

Does anyone have another idea to speed this process up? Did I miss something?

\\\/roni

Last edited by Vroni; 11-25-2019 at 03:09 AM. Reason: typo
Old 11-25-2019, 04:58 AM   #2
Doitsu
Grand Sorcerer
Posts: 5,584
Karma: 22735033
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by Vroni View Post
Does anyone have another idea to speed this process up? Did I miss something?
Since you know some Python, you could use BeautifulSoup to get all image tags and PIL to get the image size. The following Edit plugin code should work for you:

Spoiler:
Code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys, os
from io import BytesIO
from PIL import Image
from sigil_bs4 import BeautifulSoup

def run(bk):

    # preferences
    max_width = 50
    fleuron_name = 'fleuron.jpg'

    # process file list
    for (html_id, file_href) in bk.text_iter():
        file_name = os.path.basename(file_href)
        print('Processing {}...\n'.format(file_name))
        html = bk.readfile(html_id)

        # load html code into BeautifulSoup
        soup = BeautifulSoup(html, 'html.parser')
        orig_soup = str(soup)
 
        # look for images
        img_tags = soup.find_all('img')
        for img in img_tags:
            if 'src' in img.attrs:
                href = img['src']
                base_name = os.path.basename(href)
                img_id = bk.basename_to_id(base_name)
                # skip unknown files and the master fleuron itself
                if img_id and base_name != fleuron_name:
                    # get image dimensions in pixels
                    imgdata = bk.readfile(img_id)
                    width, height = Image.open(BytesIO(imgdata)).size
                    if width <= max_width:
                        img['src'] = href.replace(base_name, fleuron_name)
                        print('{} replaced by {}'.format(base_name, fleuron_name))
            else:
                # no src attribute, so print the tag itself (img['src'] would raise KeyError)
                print(str(img) + ' skipped! (img tag without src)\n')

        if str(soup) != orig_soup:
            bk.writefile(html_id, str(soup.prettyprint_xhtml()))
            print('\n{} updated\n'.format(file_name))
    
    print('\nPlease click OK to close the Plugin Runner window.')
    
    return 0

def main():
    print('I reached main when I should not have\n')
    return -1

if __name__ == "__main__":
    sys.exit(main())


It looks for images with a width of up to 50 pixels and changes the file name in the img src attribute to fleuron.jpg.

If that code catches too many false positives, you might find KevinH's Access-Aide plugin helpful. Simply change the alt attribute of all fleurons to fleuron and then use a regex to change the file name of all images with a fleuron alt attribute.
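
The regex step of the alt-attribute route can also be sketched in Python. This is only an illustration, assuming the fleurons have already been given alt="fleuron". The function name `repoint_fleurons` and the sample file names are made up for the example, and it works on a plain HTML string rather than through Sigil's plugin API.

```python
import re

IMG_TAG = re.compile(r'<img\b[^>]*>')
ALT_FLEURON = re.compile(r'\salt="fleuron"')
# group 1: ' src="', group 2: optional folder part, group 3: closing quote
SRC_ATTR = re.compile(r'(\ssrc=")([^"]*/)?[^"/]+(")')


def repoint_fleurons(html, master='fleuron.jpg'):
    """Point every <img> marked alt="fleuron" at the master file."""

    def fix_src(match):
        folder = match.group(2) or ''
        return match.group(1) + folder + master + match.group(3)

    def fix_tag(match):
        tag = match.group(0)
        if not ALT_FLEURON.search(tag):
            return tag  # not marked as a fleuron: leave it alone
        return SRC_ATTR.sub(fix_src, tag, count=1)

    return IMG_TAG.sub(fix_tag, html)


# hypothetical sample: one marked fleuron, one ordinary image
sample = ('<img alt="fleuron" src="../Images/0012.jpg"/>'
          '<img alt="" src="../Images/0013.jpg"/>')
print(repoint_fleurons(sample))
```

Matching the whole img tag first and then checking for the alt value means the order of the attributes inside the tag does not matter, which a single search pattern would have to spell out.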
Old 11-25-2019, 06:22 AM   #3
Vroni
Quote:
Originally Posted by Doitsu View Post
Since you know some Python, ...
That's what I was afraid of.
Old 11-25-2019, 08:18 AM   #4
DiapDealer
Grand Sorcerer
Posts: 27,552
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by Vroni View Post
That's what I was afraid of.
???
You're afraid of having options that you wouldn't have if there were no plugin interface and a working knowledge of Python?
Old 11-25-2019, 08:31 AM   #5
Vroni
Afraid can mean anxiety or fear.

So I was afraid that Python is the only option I have.

And if you could look over my shoulder and see how slow I still am in Python...

My first (and only) plugin took weeks.
Old 11-25-2019, 08:44 AM   #6
DiapDealer
Understood. I just figured HAVING an option as opposed to NOT having any might actually be comforting in some small way. Call me crazy, though.
Old 11-25-2019, 09:37 AM   #7
Turtle91
A Hairy Wizard
Posts: 3,099
Karma: 18727053
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
Quote:
Originally Posted by Vroni View Post
Hi,

I have an EPUB with tons of pictures. Roughly 50% of them are just a fleuron, and all of those are the same.

The filenames are all numbered in sequence, so there is no chance to identify a picture by its name....
Is there some kind of surrounding tag that can be used to identify which image is used as a fleuron?

eg:

<div class="fleuron"><img alt="" src="../Images/01.jpg"/></div>
<div class="fleuron"><img alt="" src="../Images/02.jpg"/></div>
<div class="fleuron"><img alt="" src="../Images/15.jpg"/></div>

search: <div class="fleuron"><img alt="" src="\.\./Images/[^"]+\.jpg"/></div>
replace: <div class="fleuron"><img alt="" src="../Images/fleuron.jpg"/></div>

Then run a report and delete all images that are used 0 times.
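
The "used 0 times" check can also be approximated outside Sigil. The sketch below is not how Sigil's report works internally; it only counts img src references in HTML strings, and the function name `unused_images` and the sample data are assumptions made for the example.

```python
import re

# captures the src value of each <img> tag
IMG_SRC = re.compile(r'<img\b[^>]*?\ssrc="([^"]+)"')


def unused_images(image_names, html_docs):
    """Return the image file names that no <img> in html_docs refers to."""
    used = set()
    for html in html_docs:
        for src in IMG_SRC.findall(html):
            used.add(src.rsplit('/', 1)[-1])  # compare by base name only
    return sorted(set(image_names) - used)


# hypothetical book after the search and replace above
docs = [
    '<div class="fleuron"><img alt="" src="../Images/fleuron.jpg"/></div>',
    '<p><img alt="" src="../Images/03.jpg"/></p>',
]
print(unused_images(['01.jpg', '02.jpg', '03.jpg', 'fleuron.jpg'], docs))
```

Images referenced only from CSS or from SVG image elements are not seen by this pattern, so the result is a list of candidates to check, not a list that is safe to delete blindly.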
Old 11-25-2019, 09:38 AM   #8
Brett Merkey
Not Quite Dead
Posts: 194
Karma: 654170
Join Date: Jul 2015
Device: Paperwhite 4; Galaxy Tab
Quote:
The filenames are all numbered in sequence, so there is no chance to identify a picture by its name.
I don't use Sigil, so I don't know its behavior, but I have had this problem a few times and Calibre helped me out. There was no way to differentiate the images with a regex, so I used a regex that would find every image and just stepped through every one. At each found image, Calibre selected the code and showed me the image in the preview pane. For most images, pass. For the fleuron, "Replace and Find."

Not elegant, but sounds like much less effort than what you described.
Old 11-25-2019, 09:41 AM   #9
Turtle91
An alternative, if a little more manually intensive, is to open the Inspector. Hover the mouse over the different images in the Inspector list and it will highlight the image in the preview pane. Then you can note which image name is the fleuron.
Old 11-25-2019, 10:03 AM   #10
Tex2002ans
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
Let's say you're starting with images:

Code:
<img alt="" src="../Images/image001.png" />
[...]
<img alt="" src="../Images/image098.png" />
<img alt="" src="../Images/image099.png" />
This is potentially what I would do:

1. Go into Tools > Reports > Image Files.

2. Look through the images, any that are duplicate fleurons: Right-Click > Delete From Book:

[Attached image: Sigil.Report.Delete.Images.png]

3. Once you have gotten rid of all the fleurons, return to the main Sigil window.

Open the Images folder, Shift-Click to highlight all images, Right-Click > Rename:

[Attached image: Sigil.Rename.Images.png]

4. Rename to something completely different. Like "TempImages001".

Now, all your surviving images will be named "TempImages001", "TempImages002":

Code:
<img alt="" src="../Images/TempImages001.png" />
[...]
<img alt="" src="../Images/image098.png" />
<img alt="" src="../Images/TempImages002.png" />
while all the non-existent fleurons will be under the old naming convention.

5. Now you can use Regex to easily change all the old image code into "fleuron.png":

Search: <img alt="" src="[^"]+image\d+\.png" />
Replace: <img alt="" src="../Images/fleuron.png" />

6 (Optional). Now go through and give your surviving images all human-readable names.
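
Before running Replace All, the pattern from step 5 can be tried on a small sample outside Sigil. This is just a dry run in Python, assuming the Search string behaves the same way in Python's re module as in Sigil's regex engine for this simple pattern; the sample lines are the ones from step 4.

```python
import re

# Search and Replace strings copied from step 5
SEARCH = r'<img alt="" src="[^"]+image\d+\.png" />'
REPLACE = '<img alt="" src="../Images/fleuron.png" />'

sample = (
    '<img alt="" src="../Images/TempImages001.png" />\n'
    '<img alt="" src="../Images/image098.png" />\n'
    '<img alt="" src="../Images/TempImages002.png" />\n'
)

# subn returns the new text and the number of replacements made
result, count = re.subn(SEARCH, REPLACE, sample)
print(count)
print(result)
```

The renamed images survive because "TempImages001" never contains a lowercase "image" directly followed by digits. A temporary name such as "tempimage001" would match the pattern too, so the temporary name should be chosen so that the Search string cannot hit it.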
Old 11-25-2019, 10:37 AM   #11
Brett Merkey
Quote:
This is potentially what I would do...
Looks like a winner to me. Just tested it in Calibre. I didn't even know you could delete from the reports utility. Good to know. The process makes everything quite visible, no mistakes. Thanks!

Last edited by Brett Merkey; 11-25-2019 at 11:07 AM.
Old 11-25-2019, 01:33 PM   #12
BeckyEbook
Guru
Posts: 692
Karma: 2180740
Join Date: Jan 2017
Location: Poland
Device: Misc
Additional question for OP – is the fleuron file identical? (has the same bytes, same size, etc.)

If so – I prepared the plugin (based on the idea of Doitsu and my other private plugin).
I wanted to prepare it because I know that it may be useful to me in the future.
I tested the plugin, but of course I recommend running it on a copy of the epub file.

I know it can be optimized, but I wanted to make it work, not the perfect version.
Attached Files
File Type: zip TheSameImage.zip (20.0 KB, 163 views)
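
For readers who only want to see what "identical (same bytes)" means in practice, here is a minimal sketch. It is not the code of the attached plugin, whose internals are not shown in this thread; the function name `group_identical` and the input shape (a dict of file name to bytes) are assumptions made for the example.

```python
import hashlib
from collections import defaultdict


def group_identical(files):
    """Group file names whose contents are byte-for-byte identical.

    files: dict mapping file name -> file content as bytes.
    Returns a list of groups, keeping only groups with more than one member.
    """
    groups = defaultdict(list)
    for name, data in sorted(files.items()):
        # identical bytes always give the same SHA-256 digest
        groups[hashlib.sha256(data).hexdigest()].append(name)
    return [names for names in groups.values() if len(names) > 1]


# hypothetical data: the first two files have the same content
files = {'01.jpg': b'abc', '02.jpg': b'abc', '03.jpg': b'xyz'}
print(group_identical(files))
```

This only finds exact copies. Two scans of the same ornament that differ by a single pixel produce different digests and end up in different groups.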
Old 11-25-2019, 01:52 PM   #13
Tex2002ans
Quote:
Originally Posted by BeckyEbook View Post
Additional question for OP – is the fleuron file identical? (has the same bytes, same size, etc.)
Usually when I see "duplicate" fleurons in books, it's based on PDF scans... so each fleuron would have slightly different artifacts/specks/resolution.

But great to have more tools in the toolbelt.
Old 11-25-2019, 03:03 PM   #14
Vroni
Quote:
Originally Posted by Turtle91 View Post
Is there some kind of surrounding tag that can be used to identify which image is used as a fleuron?

eg:

<div class="fleuron"><img alt="" src="../Images/01.jpg"/></div>
<div class="fleuron"><img alt="" src="../Images/02.jpg"/></div>
<div class="fleuron"><img alt="" src="../Images/15.jpg"/></div>
Yes, they all have a surrounding <div class="figure">.

Quote:
Originally Posted by BeckyEbook View Post
Additional question for OP – is the fleuron file identical? (has the same bytes, same size, etc.)
Unfortunately not. They differ in pixel dimensions, like 400*80 or 401*80 or 402*79 and so on. And even when they have the same dimensions, the picture itself is not at exactly the same position on the canvas, resulting in a slightly different file size.
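
Since the dimensions quoted here all sit close to 400*80, a size test with a small tolerance might be enough to pick out the fleurons, for example in place of the plain width check in the plugin code posted earlier in the thread. The sketch below is only that idea in isolation: the reference size, the tolerance of 3 pixels, the function name `is_near_size` and the file names are all assumptions.

```python
def is_near_size(size, reference=(400, 80), tolerance=3):
    """True if (width, height) is within `tolerance` pixels of `reference`."""
    width, height = size
    ref_width, ref_height = reference
    return (abs(width - ref_width) <= tolerance
            and abs(height - ref_height) <= tolerance)


# hypothetical image sizes as (width, height)
sizes = {'0012.jpg': (400, 80), '0013.jpg': (402, 79), '0014.jpg': (640, 480)}
matches = [name for name, size in sizes.items() if is_near_size(size)]
print(matches)
```

Any other picture that happens to be about 400*80 would be caught as well, so the matches are worth a quick look before the references are changed.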
Old 11-25-2019, 03:54 PM   #15
theducks
Well trained by Cats
Posts: 29,813
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by Tex2002ans View Post
Let's say you're starting with images:

Code:
<img alt="" src="../Images/image001.png" />
[...]
<img alt="" src="../Images/image098.png" />
<img alt="" src="../Images/image099.png" />
This is potentially what I would do:

1. Go into Tools > Reports > Image Files.

2. Look through the images, any that are duplicate fleurons: Right-Click > Delete From Book:
Reports is my FAV