01-21-2023, 10:13 AM   #3
KevinH
Sigil Developer
Posts: 8,878
Karma: 6120478
Join Date: Nov 2009
Device: many
Yes, plus signs are technically still illegal in urls unless they are encoded, because they can form part of the query at the end of a url, where they have a special meaning.

See this link for the plus sign's now-deprecated use in url formation and the need to encode it:
https://en.wikipedia.org/wiki/Query_string
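
To make the query-string issue concrete, here is a small sketch using Python's standard urllib.parse module. The "title" parameter and its values are made up for illustration. It shows a bare plus sign being decoded as a space, while %2B survives as a real plus sign:

```python
from urllib.parse import parse_qs

# In a query string, a bare "+" is decoded as a space...
plain = parse_qs("title=C+Primer")

# ...so a real plus sign has to be written as %2B to survive decoding.
encoded = parse_qs("title=C%2B%2B+Primer")

print(plain)    # {'title': ['C Primer']}
print(encoded)  # {'title': ['C++ Primer']}
```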

Therefore, when plus signs are used in file names that will become part of a url, they still must be percent-encoded. %2B is an encoded plus sign.

Earlier versions of Sigil did not properly enforce this url spec.

If you want to keep that naming convention, you should encode the plus sign the exact same way, as %2B, in the OPF manifest and in all urls. OPF manifest hrefs are urls. The files themselves inside BookBrowser and inside the epub do not have to be renamed; only the references to them in urls, OPF manifest entries, src and href attributes, etc. need the encoding.
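
As a rough sketch of what that looks like in practice, Python's urllib.parse.quote will produce the encoded form of a file name for use in a reference. The file name, the "Text/" folder, and the manifest id below are hypothetical, and this is only an illustration, not how Sigil itself does it:

```python
from urllib.parse import quote, unquote

# Hypothetical file name as it is stored inside the epub (unchanged on disk).
file_name = "chapter+one.xhtml"

# Percent-encode the name for use in a url; "+" becomes %2B.
href = quote(file_name)
print(href)  # chapter%2Bone.xhtml

# Hypothetical OPF manifest entry that uses the encoded form.
manifest_item = (
    f'<item id="ch1" href="Text/{href}" '
    'media-type="application/xhtml+xml"/>'
)
print(manifest_item)

# Decoding the href gives back the real file name.
print(unquote(href))  # chapter+one.xhtml
```

The same encoded form would go in any src or href attribute that points at the file.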

FWIW, I agree with theducks, and I personally would never use + symbols (or # or spaces or any other symbol that needs to be encoded or that has special meaning in any file system) inside epub file names, to prevent issues with e-readers (especially older ones). But with the proper encoding in urls, href and src attributes, OPF manifests, etc., it *should* work just fine: the file name keeps its "+" signs and all references to it are encoded.

If it helps, here is a routine borrowed long ago from the calibre source to create a file name that is safe across many devices and on Mac, Windows, and Linux:

Code:
import re
import string

# Characters that are unsafe in file names on at least one platform or device
_filename_sanitize = re.compile(r'[\xae\0\\|\?\*<":>\+/]')

def cleanup_file_name(name):
    substitute = '_'
    # Drop anything that is not a printable ASCII character
    one = ''.join(char for char in name if char in string.printable)
    # Replace each unsafe character, then each whitespace character
    one = _filename_sanitize.sub(substitute, one)
    one = re.sub(r'\s', '_', one).strip()
    # A name made up only of periods becomes a single substitute
    one = re.sub(r'^\.+$', '_', one)
    one = one.replace('..', substitute)
    # Windows doesn't like path components that end with a period
    if one.endswith('.'):
        one = one[:-1]+substitute
    # Mac and Unix don't like file names that begin with a full stop
    if len(one) > 0 and one[0:1] == '.':
        one = substitute+one[1:]
    return one
As you can see, the list of things to sanitize (replace with an underscore) is quite long and does include the '+' sign.

Last edited by KevinH; 01-21-2023 at 11:18 AM.