Old 01-21-2023, 09:52 AM   #1
philja
problems with names of image files in epub2

Yesterday I needed to make a small update to a book I published in mid 2021. At that time, epubcheck made a clean run with no errors.

Now I am using Sigil 1.9.20, and after modifying some small text entries in one chapter, I automatically ran epubcheck. There was a slight delay while epubcheck did its update search and downloaded and installed v5.0.0 and then I was confronted with a sizeable list of pink errors. These all concerned filenames.

The error messages are all similar to (my red coloring) :

Quote:
Col: -1: ERROR(RSC-001): File "Pictures/A%2Bm6%2BM6-Ans.gif" could not be found.
The book has some 130 gif images, many using a naming convention like:
A+m6+M6-Ans.gif and Bb+P5-M2-m6-Ans.gif

This convention is not arbitrary. The + and - signs denote a direction from the initial letter. I would like to maintain the convention if possible.

Somewhere since I made the epub, these + signs in the names are being converted to %2B in the xhtml files and also in the opf file. The - signs remain unchanged.

The file names are shown correctly (as their original names) in Sigil's Browser pane and the book displays correctly in the readers I have tried. So maybe I could just ignore the problem but I would like to know if there is a way of preserving the file names unchanged in Sigil.

Straight editing by replacing %2B with a + sign works in the xhtml files. In the opf file, it goes back to %2B as soon as the focus is changed. The result is another fail with a different error, this time for the xhtml page, saying that the file name is not referenced in the opf file.
Old 01-21-2023, 10:11 AM   #2
theducks
+ is a special character on some OSes (in the Windows copy command, for instance, it means concatenate the 2 files).
Obviously that does not apply in this case, but avoiding it prevents issues with poorer file systems.
Old 01-21-2023, 10:13 AM   #3
KevinH
Sigil Developer
Yes, plus signs are technically still not safe to use unencoded in urls, as they have a special meaning in the query at the end of the url.

See this link for its now deprecated use in url formation and the need to encode the plus sign
https://en.wikipedia.org/wiki/Query_string

Therefore when used in file names that will become part of a url, they still must be % encoded. %2B is an encoded plus sign.

Earlier versions of Sigil did not properly enforce this url spec.

If you want to keep that naming convention, you should encode it the exact same way, as %2B, in the OPF manifest and in all urls. OPF manifest entries are urls. The files themselves inside BookBrowser and inside the epub do not have to be changed, just how they are written when referenced in urls/opf/src/hrefs etc.
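As a rough illustration of that encoding, here is a minimal sketch using Python's standard urllib.parse. It is not Sigil's actual code, and the helper name href_for is made up for this example. The file keeps its real name, while the reference to it gets the %2B form:

```python
from urllib.parse import quote, unquote


def href_for(book_path):
    """Percent-encode a file's path inside the epub for use in href/src/OPF entries.

    Hypothetical helper for illustration only; '/' is kept as the path separator.
    """
    return quote(book_path, safe="/")


# The file itself keeps its plus signs ...
name = "Pictures/A+m6+M6-Ans.gif"
# ... but every reference to it is written in encoded form.
encoded = href_for(name)
print(encoded)           # Pictures/A%2Bm6%2BM6-Ans.gif
print(unquote(encoded))  # Pictures/A+m6+M6-Ans.gif
```

Decoding the reference gives back the original file name, which is why the files themselves can stay as they are.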

FWIW, I agree with theducks, and I personally would never use + symbols (or # or spaces or any other symbol that needs to be encoded or that has special meaning in any file system) inside epub file names, to prevent issues with all e-readers (especially older ones). But with the proper encoding in urls/href/src/opf manifests, etc., it *should* work just fine, allowing the file name to keep its "+" signs while all references to it are encoded.

If it helps, here is a routine borrowed long ago from calibre source to create a file name that is safe across many devices and mac, win, linux:

Code:
import re
import string

def cleanup_file_name(name):
    # Characters that cause trouble on at least one common platform or device
    _filename_sanitize = re.compile(r'[\xae\0\\|\?\*<":>\+/]')
    substitute = '_'
    # Drop anything that is not printable ASCII
    one = ''.join(char for char in name if char in string.printable)
    # Replace unsafe characters (including '+') and whitespace with the substitute
    one = _filename_sanitize.sub(substitute, one)
    one = re.sub(r'\s', '_', one).strip()
    # A name made up only of dots becomes a single substitute
    one = re.sub(r'^\.+$', '_', one)
    one = one.replace('..', substitute)
    # Windows doesn't like path components that end with a period
    if one.endswith('.'):
        one = one[:-1] + substitute
    # Mac and Unix don't like file names that begin with a full stop
    if len(one) > 0 and one[0:1] == '.':
        one = substitute + one[1:]
    return one
As you can see, the list of things to sanitize (replace with '_') is quite long and does include the '+' sign.

Last edited by KevinH; 01-21-2023 at 11:18 AM.
Old 01-21-2023, 12:33 PM   #4
philja
Thanks theducks and KevinH.
Quote:
Earlier versions of Sigil did not properly enforce this url spec.
That surely explains why I didn't pick this problem up when I wrote the first version in 2020/21. The LTS version of Linux that I was using in those days had a version of Sigil which was very out of date, v0.9-ish.

I've bitten the bullet and renamed the image files without the + sign, with corresponding edits in my image tracking system, and now all is good.
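For anyone doing the same kind of bulk rename, the planning step can be sketched roughly as below. This is only an illustration under assumptions: the function name plan_renames, the replacement token "_plus_" and the third sample file name are all made up here and are not the names actually used in the book. The sketch only builds the old-to-new mapping and checks for collisions; the references in the xhtml and opf files still have to be changed to match.

```python
def plan_renames(names, plus_token="_plus_"):
    """Map each old file name to a new one with '+' replaced by plus_token.

    plus_token is an arbitrary choice for this sketch. Raises ValueError if
    two different old names would end up with the same new name.
    """
    plan = {}
    seen = {}
    for old in names:
        new = old.replace("+", plus_token)
        if new in seen and seen[new] != old:
            raise ValueError(f"{old!r} and {seen[new]!r} both map to {new!r}")
        seen[new] = old
        if new != old:
            # Only files that actually change need renaming
            plan[old] = new
    return plan


# "C-M3-Ans.gif" is an invented example of a name without a plus sign.
plan = plan_renames(["A+m6+M6-Ans.gif", "Bb+P5-M2-m6-Ans.gif", "C-M3-Ans.gif"])
for old, new in plan.items():
    print(old, "->", new)
```

Keeping a distinct token for the plus sign means the direction information in the naming convention is not lost, and the collision check guards against two images ending up with the same name.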

Last edited by philja; 01-21-2023 at 02:07 PM.