Go Back   MobileRead Forums > E-Book Software > Calibre > Plugins

Old 03-26-2013, 05:39 PM   #16
SauliusP.
Plugin Developer
Posts: 97
Karma: 23854
Join Date: Feb 2012
Location: Lithuania
Device: Kindle
Why, I think it could be quite simple. It is not about the line, but about the word break. So, if a word starts (and/or ends) with a syllable of two letters or fewer, the first two syllables (or the last two, respectively) would not be split up. Thus "hyphenate", instead of "hy-phe-na-te", would result in "hyphe-nate", and "hyphenated" would result in "hyphe-na-ted".

@veezh, please confirm whether I understood it correctly. It is quite a simple feature, which I could add without a problem.
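
The rule described above can be sketched in a few lines. This is only an illustration, not the plugin's code: the function name `merge_short_edges` and the `min_len` parameter are made up here, and the syllable lists are typed in by hand rather than produced by a hyphenation dictionary.

```python
def merge_short_edges(syllables, min_len=3):
    """Glue a leading or trailing syllable shorter than min_len to its neighbour."""
    parts = list(syllables)
    # Leading syllable too short: join it with the second syllable.
    if len(parts) > 1 and len(parts[0]) < min_len:
        parts[:2] = [parts[0] + parts[1]]
    # Trailing syllable too short: join it with the one before it.
    if len(parts) > 1 and len(parts[-1]) < min_len:
        parts[-2:] = [parts[-2] + parts[-1]]
    return parts


print("-".join(merge_short_edges(["hy", "phe", "na", "te"])))   # hyphe-nate
print("-".join(merge_short_edges(["hy", "phe", "na", "ted"])))  # hyphe-na-ted
```

Only the first and last syllables are examined, so short syllables in the middle of a word stay as possible break points.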

Last edited by SauliusP.; 03-26-2013 at 05:42 PM.
Old 03-26-2013, 06:07 PM   #17
theducks
Grand Sorcerer
Posts: 15,054
Karma: 5936659
Join Date: Aug 2009
Location: (The original) Silicon Valley, USA
Device: Galaxy Tab 2, Astak Pocket Pro, K4NT
I was thinking just like you: Word.

I can see where Line comes in:

You do need to know the line-character length to know where the break is needed in the last word.
The Minimums (before and after) test should then be applied.
If that fails, then the whole word is not hyphenated.
Old 03-26-2013, 06:25 PM   #18
JimmXinu
Plugin Developer
Posts: 1,780
Karma: 510215
Join Date: Dec 2011
Location: Midwest USA
Device: Nook STR w/Glowlight, Kindle 3g, Droid
Hi, a few issues to know/consider:

The file kpp-american-english-dictionary-797865-words-list.oxt wouldn't load in the plugin for me. Debug output would show "Starting oxt workflow: C:\Users\<user>\Downloads\kpp-american-english-dictionary-797865-words-list.oxt" but that's it. PI v0.0.4, Calibre 0.9.24[64bit] on Win7. kpp-canadian-english-dictionary-674277-word-list.oxt did load and shows up as "eng.dic - English".

The Index of plugins thread is still showing v0.0.3 for this plugin instead of v0.0.4.

Internally, I see that you're using threads to run the background jobs instead of full processes. I did something very similar when I first wrote FFDL and I was advised by more experienced plugin developers that the way I was handling background processing was not particularly safe and I should use background process jobs instead. That was Jan 05, 2012 on calibre 0.8.33. I don't have the PMs anymore, but I think it was kiwidude telling me that, and that it had to do with memory leaks. It might be worth asking around.

For anyone else trying this out--if you Tweak the output and just look at the html files in Firefox's view page source, the added conditional hyphen characters don't show up. But they are there when viewed in a suitable editor.
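
A quick way to confirm that the invisible characters are really there is to count them. The sketch below assumes the plugin writes the literal soft hyphen character (U+00AD) rather than the `&shy;` entity, which the thread does not state. The helper name `reveal_soft_hyphens` is made up for this example.

```python
SOFT_HYPHEN = "\u00ad"  # U+00AD, the conditional ("soft") hyphen


def reveal_soft_hyphens(text, marker="|"):
    """Return (count, copy of text with every soft hyphen shown as marker)."""
    return text.count(SOFT_HYPHEN), text.replace(SOFT_HYPHEN, marker)


sample = "<p>hyphe\u00adnate</p>"
count, visible = reveal_soft_hyphens(sample)
print(count, visible)  # 1 <p>hyphe|nate</p>
```

Reading one of the tweaked html files as UTF-8 and passing its contents to the function gives the number of break points that were added.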

Nook hyphenation is still (sometimes) broken. This is a Nook bug--nothing to do with this plugin--but I'd hoped that explicit soft hyphens would work around it. Applying style "adobe-hyphenate: none;" prevents the soft hyphens from being honored. Without that (or with "adobe-hyphenate: explicit;"), it does use the soft hyphens instead of its own placement, but hyphenation is still broken--the ends of hyphenated words are sometimes not shown.
Old 03-26-2013, 06:27 PM   #19
pirl8
Pest
Posts: 192
Karma: 239254
Join Date: Jan 2012
Location: Italy
Device: KT, KPW
Silly me. I understood there shouldn't be fewer than n characters left alone on a line. In that case, a minimum prefix/suffix setting would always have applied, and not only for single syllables.

A further possible enhancement could be an option to not hyphenate certain tags (or tags with a certain style).
Old 03-26-2013, 06:33 PM   #20
veezh
plus ça change
Posts: 97
Karma: 32134
Join Date: Dec 2009
Location: France
Device: Kindle PW2
Quote:
Originally Posted by SauliusP. View Post
Why, I think it could be quite simple. It is not about the line, but about the word break. So, if a word starts (and/or ends) with a syllable of two letters or fewer, the first two syllables (or the last two, respectively) would not be split up.
Exactly. As I understand it, it's not about the line, but about the word break.

Just to clarify, words in English (and in French and Dutch) should not be broken if two letters or fewer would be left stranded on either line. Only a syllable made up of three or more letters should be split off.

I don't know if this style rule applies in other languages, so it would be great if users themselves could define the minimum number of letters in broken syllables.
Old 03-26-2013, 06:43 PM   #21
shamanNS
Connoisseur
Posts: 62
Karma: 45872
Join Date: Feb 2010
Location: Serbia
Device: Kindle Paperwhite WiFi
Quote:
Originally Posted by pirl8 View Post
...
A further possible enhancement could be an option to not hyphenate certain tags (or tags with a certain style).
Indeed. Heading tags were the first ones to come to my mind.

edit:
@veezh: Those rules already exist inside hyph_*.dic files, at the very beginning:

Code:
LEFTHYPHENMIN 2
RIGHTHYPHENMIN 3
The question is whether this plugin obeys/supports them?
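
For illustration, reading those two values from the top of a hyph_*.dic file could look like the sketch below. It is not taken from the plugin or from the library it uses. The function name `read_hyphen_mins`, the fallback defaults of 2, and the choice to scan only the first ten lines are assumptions made for this example.

```python
from itertools import islice


def read_hyphen_mins(lines, default_left=2, default_right=2, header_lines=10):
    """Return (left_min, right_min) found near the top of a hyph_*.dic file."""
    left, right = default_left, default_right
    # Only the first few lines are scanned, since the settings sit at the top.
    for line in islice(lines, header_lines):
        fields = line.split()
        if len(fields) != 2 or not fields[1].isdecimal():
            continue  # encoding line, pattern line, or something else
        if fields[0] == "LEFTHYPHENMIN":
            left = int(fields[1])
        elif fields[0] == "RIGHTHYPHENMIN":
            right = int(fields[1])
    return left, right


header = ["UTF-8", "LEFTHYPHENMIN 2", "RIGHTHYPHENMIN 3", ".a2"]
print(read_hyphen_mins(header))  # (2, 3)
```

Any iterable of text lines works as input, so an open file object can be passed directly.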

Last edited by shamanNS; 03-26-2013 at 06:49 PM.
Old 03-26-2013, 10:00 PM   #22
oteksamptis
Junior Member
Posts: 6
Karma: 10
Join Date: Dec 2012
Device: NST
Hello,
After installation, it shows me this message:

Code:
calibre, version 0.9.24
ERROR: Unknown exception: <b>LookupError</b>:unknown encoding: 300677

Traceback (most recent call last):
  File "calibre_plugins.hyphenatethis.hyphenatethisaction", line 107, in hyphenate
  File "calibre_plugins.hyphenatethis.hyphenatethisaction", line 81, in _select_books
  File "calibre_plugins.hyphenatethis.hyphenator.hyphenator", line 165, in __init__
  File "calibre_plugins.hyphenatethis.hyphenator.hyphenator", line 89, in __init__
LookupError: unknown encoding: 300677
encoding utf-8
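
One possible reading of this traceback, which is an assumption and not confirmed anywhere in the thread, is that the first line of the selected .dic file was taken as the name of a character encoding. Hyphenation dictionaries name their character set there, whereas spelling dictionaries start with a word count, which would explain a value such as 300677. A defensive check could look like the following sketch. The helper name `dictionary_encoding` and the UTF-8 fallback are made up for illustration.

```python
import codecs


def dictionary_encoding(first_line, fallback="utf-8"):
    """Return a usable codec name for a dictionary's first line, or the fallback."""
    name = first_line.strip()
    if not name:
        return fallback
    try:
        # codecs.lookup raises LookupError for names Python does not know.
        return codecs.lookup(name).name
    except LookupError:
        return fallback


print(dictionary_encoding("UTF-8\n"))   # utf-8
print(dictionary_encoding("300677\n"))  # utf-8 (fallback, not a codec name)
```

Falling back silently hides the fact that the wrong kind of dictionary may have been picked, so a real implementation would probably also warn the user.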
Old 03-26-2013, 11:50 PM   #23
OddCosine
Connoisseur
Posts: 86
Karma: 1155373
Join Date: Jan 2012
Device: Kindle
Quote:
Originally Posted by JimmXinu View Post
The file kpp-american-english-dictionary-797865-words-list.oxt wouldn't load in the plugin for me. Debug output would show "Starting oxt workflow: C:\Users\<user>\Downloads\kpp-american-english-dictionary-797865-words-list.oxt" but that's it. PI v0.0.4, Calibre 0.9.24[64bit] on Win7. kpp-canadian-english-dictionary-674277-word-list.oxt did load and shows up as "eng.dic - English".
Same issue here. My setup: PI v0.0.4, Calibre 0.9.24[32bit] on Win7[64bit]

Nice plug-in. I didn't even know I wanted soft hyphenation till I saw this!
Old 03-27-2013, 01:07 AM   #24
itimpi
Wizard
Posts: 4,094
Karma: 780247
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
EDIT: I responded to a post without realising there were many replies already there!

Is that rule not another way of saying:
- make sure that a word contains at least 3 letters before the soft-hyphenation point.
- make sure that a word contains at least 3 letters after the soft-hyphenation point.
- do not soft-hyphenate words of less than 6 characters (implied by the above two rules)
If so, one does not need to know the actual line length, as the rules are part of the logic that embeds soft hyphens into words.
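
Those three rules can be sketched without any reference to line length. In the example below the candidate break positions are supplied by hand (in practice they would come from the hyphenation dictionary), and the names `allowed_breaks` and `soft_hyphenate` are made up for illustration.

```python
SOFT_HYPHEN = "\u00ad"  # U+00AD


def allowed_breaks(word, candidates, left_min=3, right_min=3):
    """Keep break positions with >= left_min letters before and >= right_min after."""
    return [i for i in sorted(candidates)
            if i >= left_min and len(word) - i >= right_min]


def soft_hyphenate(word, candidates, left_min=3, right_min=3):
    """Insert a soft hyphen at every allowed break position."""
    pieces, start = [], 0
    for i in allowed_breaks(word, candidates, left_min, right_min):
        pieces.append(word[start:i])
        start = i
    pieces.append(word[start:])
    return SOFT_HYPHEN.join(pieces)


# Candidate positions for hy|phe|na|te and hy|phe|na|ted.
print(soft_hyphenate("hyphenate", [2, 5, 7]).replace(SOFT_HYPHEN, "-"))   # hyphe-nate
print(soft_hyphenate("hyphenated", [2, 5, 7]).replace(SOFT_HYPHEN, "-"))  # hyphe-na-ted
```

A word shorter than `left_min + right_min` letters never gets a break, which is the third rule falling out of the first two.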

Last edited by itimpi; 03-27-2013 at 01:09 AM.
Old 03-27-2013, 04:09 AM   #25
SauliusP.
Plugin Developer
Posts: 97
Karma: 23854
Join Date: Feb 2012
Location: Lithuania
Device: Kindle
Hi all,

First, a few answers.
1. If nothing new appears after adding an OXT dictionary, try extracting the hyph_*.dic file from it and adding that instead. This is because OXT files do not have a strict structure, and I rely on a properly placed descriptor. If there is demand for this plugin, I will improve the dictionary setup routine, but it is low priority.
2. I have submitted a request to update the index of plugins, but beyond that it is out of my control. Unfortunately, there are sometimes significant delays.
3. @oteksamptis: please give me a download link to your book and the dictionary you used. I will try it out.
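
For the manual extraction mentioned in answer 1: an OXT file is an ordinary zip archive, so the hyph_*.dic member can be located with Python's standard `zipfile` module. The sketch below is only an illustration and not the plugin's setup routine. The function name `find_hyphen_dictionaries` and the archive layout in the demo are invented.

```python
import fnmatch
import io
import posixpath
import zipfile


def find_hyphen_dictionaries(oxt_file):
    """List archive members whose file name matches hyph_*.dic."""
    with zipfile.ZipFile(oxt_file) as archive:
        return [name for name in archive.namelist()
                if fnmatch.fnmatch(posixpath.basename(name), "hyph_*.dic")]


# Build a tiny stand-in archive in memory so the example runs on its own.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("dictionaries/hyph_en_US.dic", "UTF-8\nLEFTHYPHENMIN 2\n")
    archive.writestr("dictionaries/en_US.dic", "1\nword\n")
    archive.writestr("description.xml", "<description/>")

print(find_hyphen_dictionaries(demo))  # ['dictionaries/hyph_en_US.dic']
```

With a real download, pass the path of the .oxt file instead of the in-memory buffer, then use `ZipFile.extract` on the names that come back.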

Now, my near-term plan:
1. Ignoring short syllables. I used an open source library and don't know whether it honors the LEFTHYPHENMIN and RIGHTHYPHENMIN descriptors. If not, I will add support for them. I will also add a feature to override the dictionary's values for each language separately. In my mother tongue the only rules are not to split a syllable itself and not to hyphenate off a single letter.
2. Tags to ignore/tags to hyphenate. I will add two fields for entering tags to ignore and/or tags to hyphenate, so it will be convenient for everybody. Clearing all tags would result in the default (current) behaviour.
3. As for a different implementation of the conversion job, most probably I will not change anything. This is the least of my concerns. A possible memory leak might be bad for those who run Calibre constantly, 24/7. I do not. It is up for 10 minutes at most for me (except when developing). Anyway, I promise to give this plugin a load test and a smoke test, and will see whether that turns up any problems.
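
To make item 2 concrete, here is a rough sketch of the "tags to ignore" half, built on the standard library's `html.parser`. It is not how the plugin does it or will do it. The class and function names are invented, `hyphenate_word` stands in for whatever per-word hyphenator is used, and a real version would also need to leave script and style content alone.

```python
import re
from html.parser import HTMLParser

WORD = re.compile(r"[^\W\d_]+")  # runs of letters


class SkipTagHyphenator(HTMLParser):
    """Re-emit HTML, hyphenating text except inside the listed tags."""

    def __init__(self, hyphenate_word, skip_tags=("h1", "h2", "h3")):
        super().__init__(convert_charrefs=False)
        self.hyphenate_word = hyphenate_word
        self.skip_tags = set(skip_tags)
        self.skip_depth = 0  # > 0 while inside an ignored tag
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in self.skip_tags:
            self.skip_depth += 1
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in self.skip_tags and self.skip_depth:
            self.skip_depth -= 1
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        if not self.skip_depth:
            data = WORD.sub(lambda m: self.hyphenate_word(m.group()), data)
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)


def hyphenate_html(html, hyphenate_word, skip_tags=("h1", "h2", "h3")):
    parser = SkipTagHyphenator(hyphenate_word, skip_tags)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


def toy(word):
    # Stand-in hyphenator that only knows a single word.
    return "hyphe\u00adnate" if word == "hyphenate" else word


result = hyphenate_html("<h1>hyphenate</h1><p>hyphenate &amp; more</p>", toy)
print(result.replace("\u00ad", "|"))  # <h1>hyphenate</h1><p>hyphe|nate &amp; more</p>
```

The "tags to hyphenate" field would be the mirror image: start with hyphenation switched off and switch it on only while inside a listed tag.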

Question to the community: my plugin does not distinguish between en_US and en_UK dictionaries. They would overwrite each other if added one after another. Is there a need to install different English dictionaries and then offer a choice window for which dictionary to use when hyphenating an English book? Calibre internally has only generic "English".
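
One way to avoid the overwrite, sketched here purely as an illustration, is to key installed dictionaries by language and region instead of by language alone. The sketch assumes dictionary files follow the common hyph_XX_YY.dic naming. The names `register` and `registry` are invented and have nothing to do with the plugin's actual storage.

```python
import re
from collections import defaultdict

DIC_NAME = re.compile(r"^hyph_([a-z]{2,3})(?:_([A-Z]{2}))?\.dic$")


def register(registry, filename):
    """File a dictionary under registry[language][region]; return that key pair."""
    match = DIC_NAME.match(filename)
    if match is None:
        raise ValueError("not a hyphenation dictionary name: %r" % filename)
    language, region = match.group(1), match.group(2) or ""
    registry[language][region] = filename
    return language, region


registry = defaultdict(dict)
for name in ("hyph_en_US.dic", "hyph_en_GB.dic", "hyph_de_DE.dic"):
    register(registry, name)

print(sorted(registry["en"]))  # ['GB', 'US']
```

When a language ends up with more than one entry, that is the point where a choice window would be needed.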
Old 03-27-2013, 05:25 AM   #26
Jasmine GreenTea
Member
Posts: 21
Karma: 10
Join Date: Oct 2012
Device: Sony PRS T2
Has anyone tried this on the Sony (T1 or T2)?

I would really like to see it working on my T2...

I am not sure I understand where I am supposed to unpack the zip file. In the Calibre folder, I guess, but where exactly?

Last edited by Jasmine GreenTea; 03-27-2013 at 05:29 AM.
Old 03-27-2013, 05:38 AM   #27
calvin
DRM remover
Posts: 84
Karma: 10
Join Date: Dec 2009
Location: North of Germany
Device: Kindle 3, 4 & Touch, iPhone/iPad, Hanvon N516 (OpenInkpot)
@Jasmine GreenTea
you have to add it under: Preferences - Plugin - Load Plugin from file

@SauliusP
One language with two hyphenation dictionaries would be great. We have the same problem here in Germany: old books follow the old rules and newer books follow the new hyphenation rules.

I tried to add both dictionaries but the second overwrites the first one...
Old 03-27-2013, 07:03 AM   #28
Jasmine GreenTea
Member
Posts: 21
Karma: 10
Join Date: Oct 2012
Device: Sony PRS T2
@calvin

Thank you for your help. I have done that, it works.

But now... how to install the OXT Dictionary? I tried to do that from the Add Plug-In menu, but it does not see any file (it seems to expect only ZIPs...).
Old 03-27-2013, 08:51 AM   #29
pirl8
Pest
Posts: 192
Karma: 239254
Join Date: Jan 2012
Location: Italy
Device: KT, KPW
Quote:
Originally Posted by Jasmine GreenTea View Post
I tried to do that from the Add Plug-In menu, but it does not see any file (it seems to expect only ZIPs...).
You should do it from the "Hyphenate This!" configuration menu.

Click on the "down" arrow in the "Hyphenate This" plugin icon, and choose "Settings". You'll find an "Add dictionary" button.

Just add "hyphen" dictionaries (normally: hyph_XX_YY.dic).
Old 03-27-2013, 10:07 AM   #30
calvin
DRM remover
Posts: 84
Karma: 10
Join Date: Dec 2009
Location: North of Germany
Device: Kindle 3, 4 & Touch, iPhone/iPad, Hanvon N516 (OpenInkpot)
One idea for future development:

Would it be possible to fill a custom column after an ebook has been hyphenated?

So you could see if a book is already hyphenated without opening it in sigil or previewer.
All times are GMT -4.

