Old 01-14-2015, 05:16 PM   #211
jveth
Junior Member
Posts: 3
Karma: 10
Join Date: Jan 2015
Location: Hungary
Device: Kindle5
Incorrect non-standard hyphenation, e.g. Hungarian

Hi Saulius, thank you very much for this plugin. I am very glad I found it.
However, the plugin unfortunately does not support non-standard hyphenation, although the recent LibreOffice dictionaries do.
This results in incorrect soft hyphenation for me too, as others have already reported in posts
#101 by mattheo
#141 by karakai
#207 by imaginer

I am a bit surprised that only Hungarians report such problems, since, according to the very nice documentation in tb87nemeth.pdf, many more languages are affected, including English and German (although perhaps less frequently than Hungarian).

Anyway, there does not seem to have been any progress on non-standard hyphenation in this plugin since #101 (May 2003).
In order to try to help you (and us Hungarians as well), I decided to dig into this a bit more.


1) While the pyhyphen-2.0.5 module works as expected, the hyphenator-0.5.1 module cannot handle the non-standard cases.

Although I do not know Python, I succeeded in constructing the following simple test script:

Code:
from hyphen import Hyphenator as Hypy2				# pyhyphen-2.0.5
h2hu = Hypy2('hu_HU')
def hypins2hu(p): return '-'.join(h2hu.syllables(p))

from zzhyphenator import Hyphenator as Hyp0			# hyphenator-0.5.1 (Berendsen,2008)
h0hu = Hyp0('/usr/share/hyphen/hyph_hu_HU.dic')
hypins0hu = h0hu.inserted

def printithu(zz, txt):
    print '2.0.5: ' , hypins2hu(zz) , ' --- ' , '"' + zz + '", as reported by' , txt
    print '0.5.1: ' , hypins0hu(zz)
    return

printithu(u'valamennyit',	'mattheo #101')
printithu(u'valamennyi',	'mattheo #101')
printithu(u'rosszabb',		'mattheo #101')

printithu(u'poggyászomban',	'imaginer #207')
Without adding my own findings, only running the test for the reported cases:

Quote:
2.0.5: va-la-meny-nyit --- "valamennyit", as reported by mattheo #101
0.5.1: va-la-meny-nyt
2.0.5: va-la-meny-nyi --- "valamennyi", as reported by mattheo #101
0.5.1: va-la-meny-ny
2.0.5: rosz-szabb --- "rosszabb", as reported by mattheo #101
0.5.1: rosz-zabb
2.0.5: pogy-gyá-szom-ban --- "poggyászomban", as reported by imaginer #207
0.5.1: pogy-yá-szom-ban
These results clearly show that the soft-hyphenated words from 0.5.1 are corrupted. They differ from both the expected, validly hyphenated form and the unhyphenated original. Worse, they no longer match the original once the soft hyphens are removed, which is what happens most of the time when the text is reflowed and no actual hyphen is needed at that point.

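2) Unfortunately, even with a correct hyphenator, the non-standard break points cannot be used for SHY (the soft hyphen, U+00AD). A soft hyphen can only mark a position between existing letters; it cannot add or change letters the way these breaks require (e.g. valamennyi becoming va-la-meny-nyi). I hope this is obvious from the above examples.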
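
A minimal sketch of that constraint, written as a round-trip check: a hyphenation is usable for SHY only if deleting the hyphens gives back the original word. The helper name `is_standard_break` is made up for illustration and is not part of the plugin or of either module; the first two test words are the pyhyphen-2.0.5 outputs quoted above, and the third keeps only the break before "ban" from that output.

```python
# -*- coding: utf-8 -*-

def is_standard_break(original, hyphenated, hyphen=u'-'):
    """Return True if removing the hyphens restores the original word.

    Hypothetical helper for illustration; not part of the plugin.
    """
    return hyphenated.replace(hyphen, u'') == original

# Non-standard cases reported in this thread: letters change at the break.
print(is_standard_break(u'valamennyit', u'va-la-meny-nyit'))   # False
print(is_standard_break(u'rosszabb', u'rosz-szabb'))           # False
# A break that leaves the letters untouched passes the check.
print(is_standard_break(u'poggyászomban', u'poggyászom-ban'))  # True
```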

Still insisting that I do not know Python, I changed hjob.py in the plugin as follows:

Code:
                for w in wlist:
                    if len(w) >= min_len and u'-' not in w:
                        ww = w
                        for hh in h.iterate(ww):
                            #print 'Hyphenator hint: ' , hh[0] , '-' , hh[1]    # trace
                            # use only standard break points: both halves must rejoin to the original word
                            if hh[0] + hh[1] == ww:
                                #w = hh[0] + '-' + w[len(hh[0]):]               # *** TEST *** see all possibilities
                                w = hh[0] + u'\u00AD' + w[len(hh[0]):]
                    newt += w
This now seems to do a much better job.
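
The same filtering idea as a self-contained sketch that runs without the plugin. It assumes the hyphenator's hints arrive as (left, right) pairs, as the `hh[0]` / `hh[1]` usage in hjob.py suggests; the function name `insert_soft_hyphens` and the hand-written pairs (taken from the "pogy-gyá-szom-ban" output above) are illustrative only.

```python
# -*- coding: utf-8 -*-
SHY = u'\u00AD'  # soft hyphen

def insert_soft_hyphens(word, splits):
    """Insert SHY at every split whose two halves rejoin to `word`.

    `splits` is a list of (left, right) pairs standing in for what the
    hyphenator yields; that input format is an assumption of this sketch.
    """
    positions = sorted(
        {len(left) for left, right in splits if left + right == word},
        reverse=True,
    )
    chars = list(word)
    for pos in positions:  # right to left, so earlier offsets stay valid
        chars.insert(pos, SHY)
    return u''.join(chars)

# The last pair is the non-standard break: 'pogy' + 'gyászomban' has an
# extra 'y', so it does not rejoin to the original word and is skipped.
splits = [(u'poggyászom', u'ban'), (u'poggyá', u'szomban'), (u'pogy', u'gyászomban')]
result = insert_soft_hyphens(u'poggyászomban', splits)
print(result.replace(SHY, u'-'))  # poggyá-szom-ban
```

Sorting the offsets explicitly means the sketch does not depend on the order in which the hyphenator reports its hints.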

3) My first thought was, of course, to use the actual pyhyphen module. However, when I tried to include the pyhyphen module installed on my system, I had to conclude that, without knowing Python well enough, this is beyond me.

So the next thing for me was to check whether, ignoring the non-standard cases, the old hyphenator works well with current dictionaries.
Conclusion: using the 100,000-word sample from the Hungarian gigaword corpus referred to in tb87nemeth.pdf, I found that the old module still seems OK at the moment. That is, all standard hyphenation points are reported identically by 0.5.1 and 2.0.5.
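
One way such a comparison could be scripted is sketched below. This is not necessarily how the check above was run: the function names are made up, the two hyphenators are passed in as plain callables that return a '-'-joined string, and any word whose output does not rejoin to the original (a non-standard or corrupted case) is simply skipped.

```python
# -*- coding: utf-8 -*-

def standard_breaks(word, hyphenated, hyphen=u'-'):
    """Return the set of break offsets in `word`, or None if non-standard."""
    if hyphenated.replace(hyphen, u'') != word:
        return None
    offsets, pos = set(), 0
    for ch in hyphenated:
        if ch == hyphen:
            offsets.add(pos)  # break falls before character `pos` of word
        else:
            pos += 1
    return offsets

def compare_hyphenators(words, hyphenate_a, hyphenate_b):
    """Return the words whose standard break points differ between a and b."""
    mismatches = []
    for word in words:
        a = standard_breaks(word, hyphenate_a(word))
        b = standard_breaks(word, hyphenate_b(word))
        if a is None or b is None:
            continue  # non-standard or corrupted output: ignored here
        if a != b:
            mismatches.append(word)
    return mismatches

# Stand-in callables on an artificial word, only to show the call shape.
demo = compare_hyphenators([u'abcdef'], lambda w: u'abc-def', lambda w: u'ab-cdef')
print(demo)
```

With the test script from point 1, the call would be `compare_hyphenators(words, hypins2hu, hypins0hu)`, where `words` is the sample word list.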

Last edited by jveth; 01-30-2015 at 06:20 PM.
Old 02-06-2015, 09:54 AM   #212
SauliusP.
Plugin developer
Posts: 108
Karma: 24394
Join Date: Feb 2012
Location: Lithuania
Device: Kindle
Quote:
Originally Posted by jveth
I am a bit surprised that only Hungarians report such problems.
Hi there,

Probably not only Hungarians have this problem; others might just ignore it. I have encountered something similar myself in Lithuanian books, but I simply do not care.

As for Hungarian, I do not know much about it, but I like Aurora's "Kurvak, gengszterek". Maybe I'll try to incorporate pyhyphen just as a tribute to them :-)

Give me a few more weeks; I am quite busy with real life at the moment.
Old 02-11-2015, 05:00 AM   #213
098799
Junior Member
Posts: 2
Karma: 10
Join Date: Feb 2015
Device: Kindle Touch
Calibre 2.5 error

Hi, I tried to get this plugin to work, but I get the following error:

Code:
calibre, version 2.5.0
BŁĄD: Nieznany wyjątek: <b>LookupError</b>:unknown encoding: .ab4i

calibre 2.5  isfrozen: False is64bit: True
Linux-3.16.0-28-generic-x86_64-with-Ubuntu-14.10-utopic Linux ('64bit', 'ELF')
('Linux', '3.16.0-28-generic', '#38-Ubuntu SMP Fri Dec 12 17:37:40 UTC 2014')
Python 2.7.8
Linux: ('Ubuntu', '14.10', 'utopic')
Successfully initialized third party plugins: Hyphenate This!
Traceback (most recent call last):
  File "calibre_plugins.hyphenatethis.hyphenatethisaction", line 118, in hyphenate
  File "calibre_plugins.hyphenatethis.hyphenatethisaction", line 92, in _select_books
  File "calibre_plugins.hyphenatethis.hyphenator.hyphenator", line 165, in __init__
  File "calibre_plugins.hyphenatethis.hyphenator.hyphenator", line 89, in __init__
LookupError: unknown encoding: .ab4i
Any idea?
Old 02-11-2015, 05:28 AM   #214
SauliusP.
Plugin developer
Posts: 108
Karma: 24394
Join Date: Feb 2012
Location: Lithuania
Device: Kindle
Hi,

If you could upload your book and the dictionary you used somewhere and share the link via PM, that would be a big help.
Old 02-11-2015, 05:48 AM   #215
098799
Junior Member
Posts: 2
Karma: 10
Join Date: Feb 2015
Device: Kindle Touch
I made it work by downloading yet another dictionary from http://www.textcontrol.com/en_US/dow.../dictionaries/

Sorry to have bothered you! The plugin looks great. Hyphenation is probably the one thing I really wished I had in my Kindle.
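
The error message suggests that the first line of the original dictionary, which the hyphenator apparently treats as a character-set name (for example UTF-8), was a pattern (".ab4i") instead. A minimal sketch of a pre-check under that assumption; `dictionary_charset` is a made-up helper and not part of the plugin:

```python
import codecs
import io

def dictionary_charset(path):
    """Return the codec named on the first line of a .dic file, or None.

    Assumes the first line holds a charset name such as 'UTF-8',
    optionally prefixed with 'charset '. Hypothetical helper.
    """
    with io.open(path, 'rb') as f:
        first = f.readline().decode('ascii', 'ignore').strip()
    if first.lower().startswith('charset '):
        first = first[len('charset '):].strip()
    try:
        # codecs.lookup raises LookupError for names like '.ab4i'
        return codecs.lookup(first).name
    except (LookupError, ValueError):
        return None
```

A return value of None would indicate a dictionary file without a usable charset header, which may be what happened here.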
Old 02-11-2015, 05:49 AM   #216
Ruskie_it
Fanatic
Posts: 536
Karma: 1000000
Join Date: Dec 2011
Location: Rome, Italy
Device: Kindle PW5, Kindle PW4, Kindle 4 NT
Amen to that...
Old 02-16-2015, 03:06 PM   #217
EbokJunkie
Addict
Posts: 229
Karma: 13495
Join Date: Feb 2009
Location: SoCal
Device: Kindle 3, Kindle PW, Pocketbook 301+, Pocketbook Touch, Sony 950, 350
SauliusP.
May we hope you'll find time to convert the plugin to a command-line utility?
That would be great!

Last edited by EbokJunkie; 02-16-2015 at 03:09 PM.
Old 02-20-2015, 04:27 PM   #218
SauliusP.
Plugin developer
Posts: 108
Karma: 24394
Join Date: Feb 2012
Location: Lithuania
Device: Kindle
I've never considered that, nor have I investigated the capabilities of Calibre's CLI. I definitely would not develop the plugin as an independent standalone application, as it depends heavily on Calibre's features.

And I have no use case for it personally. Sorry :|
Old 02-22-2015, 01:57 PM   #219
eschwartz
Ex-Helpdesk Junkie
Posts: 19,422
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Perhaps you could take a look at how Modify EPUB does it.

Note: It uses calibre-debug to access the plugin through calibre APIs.
Note2: It is the only command-line plugin, probably for a reason -- it is essentially shtick. Do it if it makes you feel happy...
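
For reference, calibre-debug has a -r / --run-plugin option (calibre-debug -r "Plugin name" -- args) that passes the arguments to the plugin's cli_main method. A rough sketch of the argument parsing such an entry point could do is below. `parse_cli_args` and every option name in it are hypothetical; Hyphenate This! provides nothing like this today.

```python
import argparse

def parse_cli_args(argv):
    """Parse arguments for a hypothetical command-line entry point.

    Assumes argv[0] is the plugin/program name, as with sys.argv.
    """
    parser = argparse.ArgumentParser(prog='hyphenate-this')
    parser.add_argument('book', help='path to the book file to hyphenate')
    parser.add_argument('--dictionary', required=True,
                        help='path to a hyph_*.dic dictionary')
    parser.add_argument('--min-length', type=int, default=5,
                        help='shortest word to hyphenate (default is arbitrary)')
    return parser.parse_args(argv[1:])

args = parse_cli_args(['Hyphenate This!', 'book.azw3',
                       '--dictionary', 'hyph_pl_PL.dic'])
print(args.book, args.dictionary, args.min_length)
```

The actual hyphenation would still have to go through calibre's APIs, which is the part that makes a standalone tool impractical.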
Old 03-10-2015, 05:08 PM   #220
Aellea
Junior Member
Posts: 1
Karma: 10
Join Date: Mar 2015
Device: Kindle (keyboard)
Broken on Calibre 2.20?

I've just updated my Calibre to 2.20 (64-bit) together with this plugin. I used Hyphenate This! successfully in the past (older versions).

After this update it is no longer working. The confirmation dialog (format selection) opens, but nothing happens afterwards. There are no obvious errors either.

system: WIN 7 (64bit)
format: AZW3
lang: Polish (using hyph_pl_PL.dic)
log: Hyphenator using self LR: 2 2

Why? Do I need an older/other Calibre version or something?
Old 03-10-2015, 05:37 PM   #221
JSWolf
Resident Curmudgeon
Posts: 73,987
Karma: 128903378
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
I just tested the Hyphenate This! plugin with Calibre 64-bit 2.20 under Windows 8.1 and it worked without a problem. I did have the same problem you did when I mistakenly did not select the format I wanted to hyphenate. You have to click on the format you want even if it's the only format you have.
Old 03-11-2015, 02:18 AM   #222
SauliusP.
Plugin developer
Posts: 108
Karma: 24394
Join Date: Feb 2012
Location: Lithuania
Device: Kindle
Quote:
Originally Posted by Aellea
Why? Do I need an older/other Calibre version or something?
No, as another comment points out, you just need to explicitly click on the format you want to hyphenate. I have had this problem myself since the first version of Calibre 2. Calibre 1.x somehow preserved the selected format in the dialogue box. It's a bit annoying, as a "half-selected" format can be misleading: one clicks "OK" and nothing happens.
Old 03-12-2015, 08:39 AM   #223
nyrk
Junior Member
Posts: 6
Karma: 10
Join Date: Jun 2012
Location: Vienna, Austria
Device: Sony PRS 350, kindle, kindle paperwhite 2, sony prs T2
Saulius P.:

I just want to add my THANKS for this great plugin that turns reading on my kindle into an uninterrupted process and thus into a lot more fun!

I certainly don't take such efforts for granted, and I can only hope that you will continue to maintain and update this plugin, at least as long as devices don't offer hyphenation options in the first place.

I will indeed express my feelings with PayPal (from within the plugin settings)!

It may not be a huge amount but, as always with good freeware, I imagine that everyone who likes a free product can show their appreciation by contributing as much as they are willing to spare, so that many small contributions add up to something more substantial!
IMO, good work should be appreciated by those who benefit from it, and not solely through words.

Thank you - please keep up the good work!
Old 03-12-2015, 08:51 AM   #224
SauliusP.
Plugin developer
Posts: 108
Karma: 24394
Join Date: Feb 2012
Location: Lithuania
Device: Kindle
Quote:
Originally Posted by nyrk
I certainly don't take such efforts for granted, and I can only hope that you will continue to maintain and update this plugin, at least as long as devices don't offer hyphenation options in the first place.
Thank you for your kind words; that means a lot to me!
Old 03-18-2015, 02:51 PM   #225
dragonkore
Junior Member
Posts: 5
Karma: 10
Join Date: Mar 2015
Device: Kindle Voyage
This is amazing, thank you. It really does make a big difference in terms of formatting an ebook to look more like a printed one.