Old 05-11-2008, 04:38 PM   #1
LittleDragon
Junior Member
Posts: 7
Karma: 10
Join Date: May 2008
Device: Sony PRS-505
Book Processor - Anything to LRF and HTML converter

Hi there

I see there are quite a few tools out there to convert all kinds of different files to LRF, but here I come with yet another one.

It's called Book Processor. It takes a source file as input and can output LRF and HTML. The source file can be created by hand, or from the original input file (as long as you have a program capable of reading the input file).

You can find the application, the documentation and an example book at http://stuf.ro/bp/

The project is in a pretty early stage, so do expect bugs.

Radu
Old 05-11-2008, 05:12 PM   #2
kovidgoyal
creator of calibre
Posts: 43,842
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Welcome to MobileRead. May I ask what missing features in existing converters you are trying to supply?
Old 05-11-2008, 05:12 PM   #3
Ervserver
Wizard
Posts: 2,624
Karma: 1008294
Join Date: Dec 2007
Location: Iowa, USA
Device: Nook Simple Touch
ok will give it a try, thanks
Old 05-11-2008, 06:04 PM   #4
LittleDragon
Quote:
Originally Posted by kovidgoyal View Post
Welcome to Mobileread.
Thank you

Quote:
Originally Posted by kovidgoyal View Post
May I ask what missing features in existing converters you are trying to supply?
Well, as far as I know from the tools I have used so far, none of the current projects implement ligatures, very few implement font or image embedding, and all of them target English readers and use English typographical conventions.

In some languages, dialogue is written as a line that begins with a dash, followed by the spoken text. Since the Reader always justifies text, the space between the dash and the first word usually ends up with a variable width. You can of course replace that space with a non-breaking space by hand in every document written in such a language, but I'm trying to automate tasks like this.
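The dash fix described above could be automated roughly like this. This is only a sketch and not Book Processor's actual code: the function name `fix_dialogue_dashes` and the assumption that a dialogue line starts with a hyphen, en dash, or em dash followed by ordinary spaces are mine.

```python
import re

NBSP = "\u00a0"  # non-breaking space

# Assumption: a dialogue line starts with a hyphen (-), an en dash (U+2013)
# or an em dash (U+2014), followed by one or more spaces or tabs.
DIALOGUE_START = re.compile(r"^([-\u2013\u2014])[ \t]+", flags=re.MULTILINE)

def fix_dialogue_dashes(text: str) -> str:
    """Replace the whitespace after a leading dialogue dash with one NBSP."""
    return DIALOGUE_START.sub(lambda m: m.group(1) + NBSP, text)

sample = "\u2014 Good day, he said.\nIt was a fine day - really."
fixed = fix_dialogue_dashes(sample)
```

Only a dash at the very start of a line is touched, so a dash used in the middle of a sentence keeps its normal, breakable spaces.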

The idea behind the project is to automate as much as possible and to always obtain a consistent result, regardless of the way the input file looks.

To that end, I also implemented some more advanced features that automate book organization without any intervention in the input text. For example, you can separate chapters using a regexp, or transform all chapter names with a single instruction (check out the example book, in which all chapter names are uppercase). This is something you would normally do by hand.
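As a rough illustration of regexp-based chapter splitting with a title transform, here is a small Python sketch. It does not use Book Processor's own source file syntax; the heading pattern `Chapter N ...` and the helper `split_chapters` are hypothetical.

```python
import re

# Hypothetical heading pattern: a whole line such as "Chapter 3: The Road".
CHAPTER_RE = re.compile(r"^(Chapter \d+.*)$", flags=re.MULTILINE)

def split_chapters(text, transform=str.upper):
    """Return (title, body) pairs, applying `transform` to every title."""
    parts = CHAPTER_RE.split(text)
    # parts[0] is whatever precedes the first heading; after that, titles
    # and bodies alternate because the pattern has one capturing group.
    chapters = []
    for title, body in zip(parts[1::2], parts[2::2]):
        chapters.append((transform(title.strip()), body.strip()))
    return chapters

book = "Chapter 1: Start\nFirst text.\nChapter 2: End\nSecond text.\n"
chapters = split_chapters(book)
```

Passing a different `transform` (for example `str.title`) changes every chapter name at once, which is the kind of single-instruction change described above.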

There's also the organization of chapters into "parts", something that is present in most large books but that I couldn't find in any of the LRF converters I tried (you can always use a hack, such as an empty chapter, as a part separator, but I'm not very comfortable with that).

There are some more features that I wanted but couldn't find (such as automatic OCR fixing and footnotes). Check out the manual for the complete feature list... Some of them already exist in other implementations, and some, to my knowledge, have never been implemented before.
Old 05-11-2008, 06:25 PM   #5
zelda_pinwheel
zeldinha zippy zeldissima
Posts: 27,827
Karma: 921169
Join Date: Dec 2007
Location: Paris, France
Device: eb1150 & is that a nook in her pocket, or she just happy to see you?
hello Little Dragon, some of the features you mention sound quite interesting to me, are you planning to support output in any other format besides lrf and html in future (like .prc / mobi) ?
Old 05-11-2008, 06:29 PM   #6
LittleDragon
Quote:
Originally Posted by zelda_pinwheel View Post
hello Little Dragon, some of the features you mention sound quite interesting to me, are you planning to support output in any other format besides lrf and html in future (like .prc / mobi) ?
I can only test LRF so far, since I only have a PRS-505... I also plan to support other languages besides English and Romanian, but I have to get up to date with their typographical conventions.
Old 05-11-2008, 06:41 PM   #7
zelda_pinwheel
thanks for the answer ! just so you know, you can use an emulator to test other formats, if you like (i know they are available for mobi .prc and for .imp, and probably for all others as well). i will definitely keep an eye on your project ; the format i use is .imp, which is based on html but is a "dead-end" format (you can't convert it to anything else), but i think when i make Project Gutenberg books to upload here it would be nice if i could also make a .prc version, since it can be read by a lot more people and easily converted if necessary. and so far i have not been very satisfied with the programs i have used for preparing my texts.
Old 05-11-2008, 07:10 PM   #8
LittleDragon
Quote:
Originally Posted by zelda_pinwheel View Post
thanks for the answer ! just so you know, you can use an emulator to test other formats, if you like (i know they are available for mobi .prc and for .imp, and probably for all others as well). i will definitely keep an eye on your project ; the format i use is .imp, which is based on html but is a "dead-end" format (you can't convert it to anything else), but i think when i make Project Gutenberg books to upload here it would be nice if i could also make a .prc version, since it can be read by a lot more people and easily converted if necessary. and so far i have not been very satisfied with the programs i have used for preparing my texts.
Hrm... I'll look around for emulators, but the output will definitely have to be tested on a real device. For example, the Reader emulator that comes with the Sony eBook Library software does a very good job of emulating, but I only found a good font size after several tries on the real device... If I now load an LRF in the emulator, it does display as it should, but the fonts look a lot larger on the screen than on the device.

One of the goals of the project, though, is consistency, and I'll definitely check out existing libraries and emulators for other formats. I think I could initially approximate the difference in font size by the difference in resolution and actual screen size...
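One way to make that approximation concrete is to compare pixel densities. The sketch below is only that, a sketch: the helper names are made up, the 600x800 pixel, 6 inch figures for the Reader's screen and the 1280x1024, 17 inch monitor are assumptions for illustration, and real rendering differences (e-ink versus LCD, fonts, hinting) are ignored.

```python
import math

def pixels_per_inch(width_px, height_px, diagonal_in):
    """Pixel density computed from the resolution and the physical diagonal."""
    return math.hypot(width_px, height_px) / diagonal_in

def scale_font_size(size, source_ppi, target_ppi):
    """Scale a pixel-based size so it keeps the same physical size."""
    return size * target_ppi / source_ppi

# Assumed values: 600x800 pixels on a 6 inch device screen, and a
# hypothetical 1280x1024, 17 inch desktop monitor running the emulator.
device_ppi = pixels_per_inch(600, 800, 6.0)
monitor_ppi = pixels_per_inch(1280, 1024, 17.0)

# A size that looks right in the emulator, scaled for the device.
device_size = scale_font_size(20, monitor_ppi, device_ppi)
```

Because the device packs more pixels per inch than the monitor in this example, the same pixel size comes out physically larger on the monitor, which matches the emulator showing bigger fonts than the device.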

In any case, I still have quite a bit of tweaking left to do...
Old 05-12-2008, 05:06 AM   #9
HarryT
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
You can download the Windows MobiPocket Reader from http://www.mobipocket.com to test out MobiPocket books.
Old 05-12-2008, 05:44 AM   #10
LittleDragon
Quote:
Originally Posted by HarryT View Post
You can download the Windows MobiPocket Reader from http://www.mobipocket.com to test out MobiPocket books.
Cool, thanks, I'm checking it out.
Old 05-12-2008, 03:05 PM   #11
LittleDragon
New version: 0.1.1

Changelog:
- rewrote the quote detection algorithm to take into account possible OCR errors
- added stderr warnings where unbalanced quotes are detected
- adjusted the OCR fixing heuristics
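For readers curious what an unbalanced-quote warning on stderr might look like, here is a minimal sketch. It is not the algorithm used in Book Processor (which also accounts for OCR errors); the function `check_quotes` and the choice of curly double quotes as delimiters are assumptions.

```python
import sys

def check_quotes(paragraphs, opening="\u201c", closing="\u201d"):
    """Return indexes of paragraphs whose quotes do not balance.

    A warning is written to stderr for every such paragraph.
    """
    unbalanced = []
    for index, paragraph in enumerate(paragraphs):
        depth = 0
        for ch in paragraph:
            if ch == opening:
                depth += 1
            elif ch == closing:
                depth -= 1
                if depth < 0:  # closing quote with no opening quote before it
                    break
        if depth != 0:
            unbalanced.append(index)
            print(f"warning: unbalanced quotes in paragraph {index}",
                  file=sys.stderr)
    return unbalanced

result = check_quotes([
    "\u201cHello,\u201d she said.",
    "\u201cUnfinished line.",
    "No quotes here.",
])
```

A check this simple also flags quotations that legitimately run over several paragraphs, so its warnings are hints to review rather than definite errors.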
Old 05-13-2008, 04:31 PM   #12
LittleDragon
New version: 0.1.2

Changelog:
- added table of contents to the HTML output
- fixed the position of the first part in BBeB
- adjusted the OCR fixing heuristics
Tags
html, lrf, prs500, prs505, reader

