#511 | |
Guru
Posts: 714
Karma: 2003751
Join Date: Oct 2008
Location: Ottawa, ON
Device: Kobo Glo HD
|
Quote:
The process would most likely be SLOW. But it is done once, on a PC, when the book is made, and our desktops are very capable machines. "Present" and similar ambiguous cases in other languages are NOT a concern, since we are NOT doing machine-only hyphenation here. The database can recognize such cases and ask for human intervention. You resolve each one once, during book creation. I really don't see any problems with the method. |
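A minimal sketch of the database-assisted approach described in this post, assuming a plain in-memory dictionary as the "database" and a callback standing in for the human. The names `HYPHENATION_DB`, `MIN_LENGTH`, `apply_pattern`, `hyphenate_text` and the sample entries are illustrative assumptions, not part of any existing tool.

```python
import re

SOFT_HYPHEN = "\u00ad"  # Unicode soft hyphen, invisible unless a line breaks there
MIN_LENGTH = 5          # assumption: shorter words are never hyphenated

# Hypothetical database: word -> accepted hyphenations ("-" marks a break point).
HYPHENATION_DB = {
    "hyphenation": ["hy-phen-ation"],
    "database": ["data-base"],
    "present": ["pres-ent", "pre-sent"],  # ambiguous: depends on meaning
}


def apply_pattern(word, pattern):
    """Copy the break points of pattern onto word, keeping the word's casing."""
    if pattern.replace("-", "").lower() != word.lower():
        return word  # pattern does not describe this word; leave it alone
    pieces, pos = [], 0
    for part in pattern.split("-"):
        pieces.append(word[pos:pos + len(part)])
        pos += len(part)
    return SOFT_HYPHEN.join(pieces)


def hyphenate_text(text, db, ask):
    """Insert soft hyphens from db; call ask(word, candidates) when unsure.

    ask returns the chosen pattern, or None to leave the word unbroken.
    """
    def replace(match):
        word = match.group(0)
        if len(word) < MIN_LENGTH:
            return word
        candidates = db.get(word.lower(), [])
        if len(candidates) == 1:
            return apply_pattern(word, candidates[0])
        choice = ask(word, candidates)
        if choice and not candidates:
            # Unknown word: remember the answer so it is asked only once.
            db[word.lower()] = [choice]
        return apply_pattern(word, choice) if choice else word

    return re.sub(r"[A-Za-z]+", replace, text)


def first_choice(word, candidates):
    # Stand-in for a real prompt: take the first option, skip unknown words.
    return candidates[0] if candidates else None


print(repr(hyphenate_text("Present the database", dict(HYPHENATION_DB), first_choice)))
```

Ambiguous words are asked about at every occurrence because the right break depends on context, while unknown words are stored after the first answer. A real tool would replace `first_choice` with an interactive prompt.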
|
#512 |
Wizard
Posts: 1,790
Karma: 507333
Join Date: May 2009
Device: none
|
Never mind. What I wrote here doesn't make sense.
Last edited by ahi; 09-02-2009 at 02:52 PM. |
#513 | |
Wizard
Posts: 1,790
Karma: 507333
Join Date: May 2009
Device: none
|
Quote:
How would you go about building such a database, Ankh? Just processing oodles and oodles of PG eTexts and manually hyphenating the words therefrom? - Ahi |
|
#514 | |
Guru
Posts: 714
Karma: 2003751
Join Date: Oct 2008
Location: Ottawa, ON
Device: Kobo Glo HD
|
Quote:
Then yes, expect users to help with the growth of the database. The database-assisted hyphenation engine can ask for intervention whenever a word is not in the database. When the job is done, process the database, extract the words that were added into a basic text file, one hyphenated word per line, and submit that file back to the maintainer. The maintainer reviews it (using dictionaries and any other tools available), merges the changes, and releases a new version of the database. Open source. |
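The contribution loop in this post (export the added words one per line, send the file to the maintainer, merge) could look roughly like the sketch below. The function names and the dictionary layout are assumptions made for illustration. As a simplification, brand-new words are merged as proposed and only entries that disagree with an existing one are held back; the post asks for a review of everything before merging.

```python
from pathlib import Path


def export_additions(local_db, released_db, path):
    """Write hyphenations that exist locally but not in the released database.

    The output is a plain text file with one hyphenated word per line.
    """
    new_entries = sorted(
        pattern
        for word, patterns in local_db.items()
        for pattern in patterns
        if pattern not in released_db.get(word, [])
    )
    Path(path).write_text("\n".join(new_entries) + "\n", encoding="utf-8")
    return new_entries


def merge_submission(released_db, path):
    """Merge a submitted file; return (merged_db, needs_review)."""
    merged = {word: list(patterns) for word, patterns in released_db.items()}
    needs_review = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        pattern = line.strip().lower()
        if not pattern:
            continue
        word = pattern.replace("-", "")
        known = merged.setdefault(word, [])
        if pattern in known:
            continue  # already in the database
        if known:
            needs_review.append(pattern)  # disagrees with an existing entry
        else:
            known.append(pattern)  # brand-new word
    return merged, needs_review
```

A contributor would call `export_additions(local_db, released_db, "additions.txt")` after finishing a book, and the maintainer would run `merge_submission` on the received file and then look at everything in `needs_review` by hand.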
|
#515 | |
Wizard
Posts: 1,790
Karma: 507333
Join Date: May 2009
Device: none
|
Quote:
- Ahi |
|
#516 | |
Guru
Posts: 714
Karma: 2003751
Join Date: Oct 2008
Location: Ottawa, ON
Device: Kobo Glo HD
|
Quote:
A rigid review policy before submission is needed, or everything can easily fall apart. In the early stage, the tool and the database will be almost useless (too many misses), but that can change quickly, since the most frequently used words and their variant forms (whatever the reason for the slightly different form) will soon find their way into the database. A "perfect" database might never emerge, but a pretty complete one (hit ratio above 99%) would be more than useful. Your call; I am not ready to make such a commitment. Once soft hyphens are implemented on the PRS-505, I promise that I will use the tool and contribute to the growth of the database.
Last edited by Ankh; 09-02-2009 at 04:05 PM. |
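One way to put a number on the "hit ratio" mentioned here is to count how many word tokens of a sample text are already covered by the database. This is only a sketch; `hit_ratio` and its simple ASCII-only tokenisation are assumptions, not an existing tool.

```python
import re
from collections import Counter


def hit_ratio(text, known_words):
    """Return (share of word tokens found in known_words, Counter of misses)."""
    tokens = [w.lower() for w in re.findall(r"[A-Za-z]+", text)]
    if not tokens:
        # Nothing to look up; treated as full coverage by convention.
        return 1.0, Counter()
    misses = Counter(w for w in tokens if w not in known_words)
    hits = len(tokens) - sum(misses.values())
    return hits / len(tokens), misses


ratio, misses = hit_ratio("the cat and the dog", {"the", "and"})
print(f"{ratio:.0%} covered, most frequent misses: {misses.most_common(3)}")
```

Because the ratio is counted per token rather than per distinct word, adding the few most frequent misses raises it quickly, which matches the expectation in the post that frequent words fill in first.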
|
#517 | |
Still wondering why
Posts: 253
Karma: 800
Join Date: Jun 2009
Location: Athens, Greece
Device: PRS 505, (BlackBerry Bold ?)
|
Quote:
|
|
#518 |
Wizard
Posts: 1,790
Karma: 507333
Join Date: May 2009
Device: none
|
Must be some weird Unicode mangling...
- Ahi |
#519 |
Still wondering why
Posts: 253
Karma: 800
Join Date: Jun 2009
Location: Athens, Greece
Device: PRS 505, (BlackBerry Bold ?)
|
|
#520 | |
Grand Sorcerer
Posts: 5,187
Karma: 25133758
Join Date: Nov 2008
Location: SF Bay Area, California, USA
Device: Pocketbook Touch HD3 (Past: Kobo Mini, PEZ, PRS-505, Clié)
|
Quote:
However, a good, if not perfect, hyphenation algorithm could be made, based on linguistic analysis of the language. And it could be combined with a dictionary, so it would automatically put up flags for words that could be either compound words or identically-spelled words with different meanings. (I'd say "homonyms," but they might not have the same pronunciation.) It wouldn't fix all mishy-phens, but it'd allow the formatting person (whoever that is, author or editor) to quickly identify the possibilities, rather than doing a line-by-line proof every time the formatting shifts a bit.

As far as I know, "present" is always split "pre-sent," with any of its three possible pronunciations. However, unless I was using it in a sentence like "we knew the authorization would arrive later that week, so we pre- sent the package," I'd avoid hyphenating it to avoid confusion, because ending with "pre-" implies the long-e pronunciation.

There's no reason hyphenation software couldn't be as good as current spellcheck software--not perfect, but good enough to remove a lot of the gruntwork of proofreading, and good enough to reflow a book to avoid almost all troublesome hyphenations.

As amusing or sometimes annoying as bad hyphenations are, I'd rather publishers spent more time on actual typos, and apostrophe use, and a good table of contents. Oh, and an index for nonfiction books. I need good content before I need great formatting. |
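The flagging idea from this post, a dictionary that marks words needing a human look so nobody has to proof line by line, can be sketched as follows. The `AMBIGUOUS` table, its two sample entries and the `flag_ambiguous` function are illustrative assumptions; a usable list would have to come from a real dictionary.

```python
import re

# Hypothetical "needs a human look" list: identically spelled words whose
# break point may depend on meaning. The entries are examples only.
AMBIGUOUS = {
    "present": ["pres-ent", "pre-sent"],
    "record": ["rec-ord", "re-cord"],
}


def flag_ambiguous(text, ambiguous=AMBIGUOUS):
    """Return (line number, column, word, options) for every flagged word."""
    flags = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in re.finditer(r"[A-Za-z]+", line):
            word = match.group(0)
            options = ambiguous.get(word.lower())
            if options:
                flags.append((line_no, match.start(), word, options))
    return flags


sample = "We pre-sent the package.\nThe present was a record."
for line_no, column, word, options in flag_ambiguous(sample):
    print(f"line {line_no}, col {column}: {word!r} could be {' or '.join(options)}")
```

The report only points at candidates; deciding which break is right, or whether to forbid breaking the word at all, stays with whoever formats the book.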
|
#521 | |
Exwyzeeologist
Posts: 535
Karma: 3261
Join Date: Jun 2009
Device: :PRS-505::iPod touch:
|
Quote:
|
|
#522 |
Wizard
Posts: 1,790
Karma: 507333
Join Date: May 2009
Device: none
|
|
#523 | |
Somewhat clueless
Posts: 772
Karma: 9999999
Join Date: Nov 2008
Location: UK
Device: Kindle Oasis
|
Quote:
Striving for perfect (for somebody's arbitrary definition of perfect) typography on a book which is riddled with errors is simply polishing the proverbial you-know-what. For me (and I emphasise that this is for me - as I've said, different people want different things), if the content is accurate then I'm happy with a very basic layout - minor hyphenation errors or even stacks etc. don't really bother me. Having said that, I take your point that a format which mixed a reflowable format with specialised layouts for specific sizes would be ideal. My concern is that I can't see how publishers who can't even get the basics right are going to do it well - I'd rather they just concentrated on the fundamentals. /JB |
|
#524 | |
Wizard
Posts: 1,790
Karma: 507333
Join Date: May 2009
Device: none
|
Quote:
But, to be honest, if I did think it was an either/or proposition, I'd follow you in preferring that they do the proofing properly. After all, I can typeset the book myself more easily than I can proofread it. - Ahi |
|
#525 |
Exwyzeeologist
Posts: 535
Karma: 3261
Join Date: Jun 2009
Device: :PRS-505::iPod touch:
|
This completely ignores that proofreading and typesetting duties are generally handled by entirely different people or departments, especially in an organization of, oh, say, five or more people. To me, this further emphasizes that it's not an either/or proposition.
|