12-29-2006, 05:59 PM | #16 |
Connoisseur
Posts: 69
Karma: 34
Join Date: Dec 2006
Location: Dallas, TX
Device: PRS-500
|
Open issues
BTW, to see my list of open issues, please check out http://code.google.com/p/bbebinder/issues/list. If you want to add any new issues, you can either post them here in this forum or, I believe, submit your own issue at the Google Code page.
|
12-30-2006, 04:49 AM | #17 |
Enthusiast
Posts: 38
Karma: 36
Join Date: Dec 2006
Device: Sony Reader PRS-500
|
Are you happy for people to contribute code changes as well? I've got another week of not having to go to work.
|
12-30-2006, 08:55 AM | #18 | |
Connoisseur
Posts: 69
Karma: 34
Join Date: Dec 2006
Location: Dallas, TX
Device: PRS-500
|
Quote:
However, I think that if I just communicate with you beforehand about the changes that I intend to make, then we should be OK. Why don't you email me at cjmumford@gmail.com and we can coordinate? Take a look at the open issues on the project page to see if any of them pique your interest. If there are any other fixes/enhancements that you'd rather work on, then please feel free to submit an issue. |
|
01-01-2007, 01:35 PM | #19 |
Old Dog Learns New Tricks
Posts: 123
Karma: 142
Join Date: Nov 2006
Location: Maryland USA
Device: Sony PRS-500,PocketBook 301, Sony 650
|
Great tool. Do you think that orphan control should be added? Here is a reply I added to threads on other tools.
Orphan control is a feature that I think would be helpful in ALL the programs that generate file formats for the SONY Reader. I find reading a bit awkward when one part of a sentence is on one page and the rest is on the next page. Maybe it needs to be more complex than IF SENTENCE DOESN'T FIT GO TO NEW PAGE, because with a large font size (TR 16-TR 18) and a small display, long sentences would leave some pages very short as they moved to the next page {such as this sentence}. Maybe the rule would hold for sentences of less than ## characters, but for others a break in the sentence across pages would be "a necessary evil" <grin> |
01-01-2007, 02:17 PM | #20 | |
Connoisseur
Posts: 69
Karma: 34
Join Date: Dec 2006
Location: Dallas, TX
Device: PRS-500
|
Quote:
If instead you meant a viewable page, I'm not sure the BBeB format allows me to control this, given that text can be resized and, frankly, a new reader with 768x1024 resolution could be released. I have no way of knowing how the Sony Reader is going to lay out the eBook. I think that what you would want here is something like the MS Word paragraph styles "keep with next" and "page break before", and I'm not sure that BBeB supports this. BTW - thanks for taking the time to give the program a try. We're working on table of contents and image support, and assuming that the general consensus is that the quality is high enough, I'll announce this program to the rest of the forum readers. |
|
01-01-2007, 03:31 PM | #21 |
Addict
Posts: 205
Karma: 317
Join Date: Oct 2006
Location: England
Device: Sony PRS-505, iPad, Kindle 3
|
Nice work! Looks very promising.
You seem to be doing some Gutenberg-specific detection. A simple clean-up for the HTML versions is page number stripping; I do that in gutlrf.pl like so:
$_ =~ s#<span class='pagenum'>.*</span>## ;
$_ =~ s#<span class=\"pagenum\">.*</span>## ;
I'll post more bug reports to the Google Code site. |
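For anyone not working in Perl, the same page-number stripping might look roughly like this in Python. This is only a sketch, not code from gutlrf.pl or BBeBinder: the name `strip_page_numbers` is made up for illustration, and the pattern is deliberately non-greedy (`.*?`), which differs from the two substitutions quoted above, so that a line containing two page-number spans does not lose the text between them.

```python
import re

# Matches <span class='pagenum'>...</span> with either quote style.
# The backreference \1 requires the closing quote to match the opening one.
# Assumption: the page-number span does not itself contain a nested <span>.
PAGENUM_RE = re.compile(r"<span class=(['\"])pagenum\1>.*?</span>", re.DOTALL)


def strip_page_numbers(html: str) -> str:
    """Remove Gutenberg-style page-number spans from an HTML string."""
    return PAGENUM_RE.sub("", html)


if __name__ == "__main__":
    sample = "<p>Some text<span class='pagenum'>[Pg 12]</span> continues.</p>"
    print(strip_page_numbers(sample))
```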
01-01-2007, 04:15 PM | #22 | |
Connoisseur
Posts: 69
Karma: 34
Join Date: Dec 2006
Location: Dallas, TX
Device: PRS-500
|
Quote:
BTW, does Gutenberg have a recommended HTML format that you're aware of, or are they at the mercy of every submitter's idea of what good HTML is? If I wind up doing a bunch of HTML cleanup, then I'll probably implement it so that it reads various cleanup parameters (maybe like the two you put above) from a data file, so that users can add their own values. |
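One way the "cleanup parameters in a data file" idea could work is sketched below in Python. Nothing here reflects how BBeBinder actually does it: the file format (one `pattern<TAB>replacement` rule per line, `#` for comments) and the names `load_rules` and `apply_rules` are assumptions invented for this example.

```python
import re
from typing import Iterable, List, Pattern, Tuple

Rule = Tuple[Pattern[str], str]


def load_rules(lines: Iterable[str]) -> List[Rule]:
    """Parse cleanup rules from lines of a (hypothetical) rules file.

    Each rule line is 'regex<TAB>replacement'; the replacement may be empty.
    Blank lines and lines starting with '#' are skipped.
    """
    rules: List[Rule] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        pattern, _, replacement = line.partition("\t")
        rules.append((re.compile(pattern), replacement))
    return rules


def apply_rules(text: str, rules: List[Rule]) -> str:
    """Apply every rule to the text, in file order."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text
```

In use, `load_rules(open("cleanup.rules", encoding="utf-8"))` would read the rules (the file name is also hypothetical), and users could add their own lines without touching the program.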
|
01-01-2007, 08:08 PM | #23 |
Addict
Posts: 205
Karma: 317
Join Date: Oct 2006
Location: England
Device: Sony PRS-505, iPad, Kindle 3
|
Try 19337-h; most HTML ones have this page number span.
Yes, there is an HTML standard for Gutenberg that most people follow; alas, some don't. See http://www.gutenberg.org/wiki/Gutenberg:HTML_FAQ and http://gutenberg.hwg.org/index.html |
01-01-2007, 08:48 PM | #24 |
Technogeezer
Posts: 7,233
Karma: 1601464
Join Date: Nov 2006
Location: Virginia, USA
Device: Sony PRS-500
|
Vienna01, I'm confused. I have always heard that widow/orphan control refers to the last and first few lines on a physical page. Usually two lines are required on each. Thus, if the page breaks with only one line at the bottom of the page, that line is moved to the following page. If only one line would be at the start of the following page, the page break is moved back one line, unless that would leave only one line on the first page, in which case both are moved to the following page. I have never seen it applied to whole sentences or paragraphs.
More critical from my view would be keeping the headers together with the following text. Few things are more jarring than to see a header as the last line of a page and have to turn the page to start the associated paragraph. Just my two cents. |
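The line-based widow/orphan rule described in this post can be sketched as a small Python function. This is an illustration of the rule as stated, not anything BBeB or the Sony Reader is known to do; the function name and parameters are made up, and it assumes the remainder of the paragraph fits on the following page.

```python
def lines_on_current_page(para_lines: int, room: int, min_lines: int = 2) -> int:
    """Return how many lines of a paragraph to place on the current page.

    para_lines: total lines in the paragraph
    room:       lines still free on the current page
    min_lines:  minimum lines allowed at the bottom of one page
                and at the top of the next (two, per the usual rule)
    A result of 0 means the whole paragraph starts on the next page.
    """
    if para_lines <= room:
        return para_lines  # the paragraph fits; no break needed

    first = room
    if first < min_lines:
        return 0  # too few lines would be stranded at the bottom of the page

    if para_lines - first < min_lines:
        # Too few lines would carry over: move the break back so that
        # min_lines land on the next page.
        first = para_lines - min_lines
        if first < min_lines:
            return 0  # that would strand too few lines here; move everything

    return first
```

For example, a five-line paragraph with four lines of room gets split 3 + 2, while a three-line paragraph with two lines of room moves to the next page whole.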
01-01-2007, 09:17 PM | #25 | |
Connoisseur
Posts: 69
Karma: 34
Join Date: Dec 2006
Location: Dallas, TX
Device: PRS-500
|
Quote:
|
|
01-01-2007, 09:22 PM | #26 | |
Connoisseur
Posts: 69
Karma: 34
Join Date: Dec 2006
Location: Dallas, TX
Device: PRS-500
|
Quote:
|
|
01-03-2007, 02:24 PM | #27 |
Connoisseur
Posts: 76
Karma: 15
Join Date: Oct 2006
Device: Sony Reader
|
I've been using this wonderful tool a lot over the last few days - thanks again! I had a suggestion I thought I'd make about text handling.
I was recently re-reading the Marlowe plays, and thought I'd use the reader this time around. I downloaded the plain text versions from Gutenberg (most are only available in plain text). Most Gutenberg formatting apps, including this one, strip the single returns so the text flows on the display. However, with plays and poetry, you don't want that. I just used Wordpad to save as RTF (need that metadata so the book list looks nice), but it made me think: there are any number of operations that might need to be tweaked from book to book (like the page number stripping someone mentioned). I believe you said you might put things like that into a config file so people could modify/add to them. Perhaps it could be a little more dynamic: have a little "always on top" window of checkboxes listing each operation that can be done on the text. You could check/uncheck various operations and hit "apply" to see how it would look. It would also be nice, when TOC generation is working, to be able to change the pattern you use to find headings worthy of a TOC entry from within the program, rather than having to exit, change the config file, and try again. |
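To make the "strip the single returns" operation and its toggle concrete, here is a rough Python sketch. It is not BBeBinder's code; the function name and the `preserve_line_breaks` flag are invented for illustration, and it assumes paragraphs in the plain-text file are separated by blank lines.

```python
import re


def unwrap_paragraphs(text: str, preserve_line_breaks: bool = False) -> str:
    """Join hard-wrapped lines into flowing paragraphs.

    Paragraphs are assumed to be separated by one or more blank lines.
    With preserve_line_breaks=True (plays, poetry) the text is returned
    untouched so that every original line break survives.
    """
    if preserve_line_breaks:
        return text

    paragraphs = re.split(r"\n\s*\n", text.strip())
    joined = (
        " ".join(line.strip() for line in paragraph.splitlines())
        for paragraph in paragraphs
    )
    return "\n\n".join(joined)
```

A checkbox like the one suggested above would simply flip `preserve_line_breaks` for the current book.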
01-03-2007, 07:29 PM | #28 | |||
Connoisseur
Posts: 69
Karma: 34
Join Date: Dec 2006
Location: Dallas, TX
Device: PRS-500
|
Quote:
Quote:
I noticed that at the beginning of each paragraph there are names like _Cloan._ and _Iar._. Do you know what these mean? Quote:
<FingersCrossed>BTW I'm hoping that TOC and images will be coming in the next two weeks.</FingersCrossed> |
|||
01-04-2007, 12:29 AM | #29 | |
Connoisseur
Posts: 76
Karma: 15
Join Date: Oct 2006
Device: Sony Reader
|
Quote:
MOST lines in plays look something like one of the following, which makes it hard (and hence a good toggle):
HAMLET. Oh, what shall I do?
POLONIUS. Oh! I am slain.
Joe.Blow. Heya
So hard to recognize. There are also books that break, due to scanning badness, in mid-sentence. I detect those in my lame-o search-replace macros with lower-case letter followed by no punctuation mark, possibly a space, then a line break or two, followed by a lower case letter. So it picks up stuff like: And then the man jumped off the cliff. Shakespeare and others often line break on purpose, but almost always start the next line with a cap. ex:
NORTHUMBERLAND. What news, Lord Bardolph? every minute now
Should be the father of some stratagem:
QUEEN. No, be assur'd you shall not find me, daughter,
After the slander of most stepmothers,
Evil-ey'd unto you. You're my prisoner, but
Your gaoler shall deliver you the keys
That lock up your restraint. For you, Posthumus,
etc etc. Last edited by airlik; 01-04-2007 at 01:22 AM. |
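The mid-sentence break heuristic described in this post translates fairly directly into a regular expression. The Python below is a sketch of that description, not airlik's actual macros; the names `BROKEN_LINE_RE` and `rejoin_broken_sentences` are made up, and it only considers ASCII lower-case letters.

```python
import re

# A lower-case letter (so no punctuation right before the break),
# an optional space, one or two line breaks, then a lower-case letter.
# Lookbehind/lookahead keep the letters themselves out of the match.
BROKEN_LINE_RE = re.compile(r"(?<=[a-z]) ?\n{1,2}(?=[a-z])")


def rejoin_broken_sentences(text: str) -> str:
    """Replace accidental mid-sentence line breaks with a single space."""
    return BROKEN_LINE_RE.sub(" ", text)
```

Because the next line must start with a lower-case letter, deliberate verse breaks such as "every minute now" followed by "Should be the father" are left alone.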
|
01-04-2007, 04:43 AM | #30 |
Addict
Posts: 205
Karma: 317
Join Date: Oct 2006
Location: England
Device: Sony PRS-505, iPad, Kindle 3
|
Not sure if you know about GutenMark, but I'd suggest you look at it for Gutenberg text files. It really is the best for converting them into HTML; Gutenberg themselves recommend it. The easiest way would be to simply call it instead of spending your time implementing its functionality in BBeBinder, but at the least the source code would really help.
Just a suggestion on formatting: I've noticed you create BBeBs with paragraph formatting that follows the Web page version, i.e. with a blank line between paragraphs. Most ebooks skip this and simply start each paragraph on a new line; it does look better. |
|