Old 12-31-2007, 06:56 AM   #1
shen
Member
shen began at the beginning.
 
Posts: 20
Karma: 46
Join Date: Nov 2007
Location: Germany
Device: Sony PRS-505
From PDF to LRF via Mobipocket Creator and BD - works great :-)

Hi all,

Since my results converting PDF to LRF with Book Designer were often not as good as expected (italics lost, page numbers not detected and removed, images lost), I was looking for a new way to get better results.

As I knew that Mobipocket deals really well with PDFs, I tried converting PDFs with Mobipocket. But I had no luck converting these really good-looking PRCs to LRF, as Book Designer didn't recognize them.

So I gave Mobipocket Creator a try. Same result with these PRCs ... but ... Creator makes a temporary HTML file which works very well (at least in my tests) with BD.

So that's the way to go:

First install Mobipocket Creator from http://www.mobipocket.com/en/DownloadSoft/default.asp

Install it with the advanced features enabled.

Start it and choose "Import From Existing File" and import your PDF.

Click on "Build".

Now check "Open folder containing eBook" and click "OK".

There you will find an HTML file. Open this in Book Designer and save your LRF - that's all.

I know there are a few tools for converting PDF to LRF or PDF to HTML, but this is the easiest way with the best results I have found so far (at least for standard eBooks - technical documents may not work that well) - give it a try!

Stefan

Last edited by shen; 12-31-2007 at 08:18 AM.
Old 01-01-2008, 08:35 AM   #2
astra
The Introvert
astra ought to be getting tired of karma fortunes by now.
Posts: 8,308
Karma: 1000077497
Join Date: Jan 2007
Location: United Kingdom
Device: Sony Reader PRS-650 & 505 & 500
Does it keep italics?
Old 01-01-2008, 08:37 AM   #3
shen
Member
shen began at the beginning.
 
Posts: 20
Karma: 46
Join Date: Nov 2007
Location: Germany
Device: Sony PRS-505
Yes, italics are kept :-)

Stefan
Old 01-01-2008, 08:38 AM   #4
astra
The Introvert
astra ought to be getting tired of karma fortunes by now.
Posts: 8,308
Karma: 1000077497
Join Date: Jan 2007
Location: United Kingdom
Device: Sony Reader PRS-650 & 505 & 500
Quote:
Originally Posted by shen View Post
Yes, italics are kept :-)

Stefan
Thanks.
I will give it a try.
Old 01-02-2008, 11:45 AM   #5
astra
The Introvert
astra ought to be getting tired of karma fortunes by now.
Posts: 8,308
Karma: 1000077497
Join Date: Jan 2007
Location: United Kingdom
Device: Sony Reader PRS-650 & 505 & 500
Quote:
Originally Posted by shen View Post
Now check "Open folder containing eBook" and click "OK".
I didn't find this option.
Old 01-02-2008, 12:34 PM   #6
Stephanos
Connoisseur
Stephanos doesn't litter
 
Posts: 58
Karma: 133
Join Date: Oct 2007
Location: Minnesota, USA
Device: EB-1150, PRS-505, NST
I've used this method before and it works fairly well. You may want to copy the file called "pdf2xml.exe" from the "Mobipocket Reader" program folder into the "Mobipocket Creator" program folder. The latest version of the Reader (6.1) has a more recent version of this file, which works better in some cases.
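
If you'd rather script that copy than do it by hand, here is a minimal Python sketch. The two folder paths are whatever your installs use; I'm not assuming any particular location, and the function name `replace_pdf2xml` is just made up for illustration. It keeps a `.bak` copy of the file it overwrites so the swap can be undone.

```python
import shutil
from pathlib import Path


def replace_pdf2xml(reader_dir: Path, creator_dir: Path,
                    name: str = "pdf2xml.exe") -> Path:
    """Copy `name` from reader_dir into creator_dir, backing up any existing copy."""
    source = reader_dir / name
    target = creator_dir / name
    if not source.is_file():
        raise FileNotFoundError(f"{source} not found")
    if target.exists():
        # Keep the old converter next to the new one so the change can be undone.
        shutil.copy2(target, target.with_name(target.name + ".bak"))
    shutil.copy2(source, target)
    return target
```

You would call it with the two install folders, for example `replace_pdf2xml(Path(reader_folder), Path(creator_folder))`, where both variables are paths you fill in yourself.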

astra_lestat, look in the My Documents\My Publications folder. That is where the Creator will store the files by default.
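
For anyone who ends up with many projects in that folder, a small Python sketch like this can pick out the most recently written HTML file. The helper name `newest_html` is hypothetical, and you pass in the folder yourself (the default location above may differ on your machine).

```python
from pathlib import Path
from typing import Optional


def newest_html(folder: Path) -> Optional[Path]:
    """Return the most recently modified .html/.htm file under folder, or None."""
    candidates = [
        p for p in folder.rglob("*")
        if p.is_file() and p.suffix.lower() in {".html", ".htm"}
    ]
    # max() with default=None avoids an error when no HTML file exists yet.
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)
```

The file it returns is the one to open in Book Designer, assuming the last build was the book you care about.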
Old 01-02-2008, 05:29 PM   #7
JTravers
Groupie
JTravers ought to be getting tired of karma fortunes by now.
 
Posts: 166
Karma: 1052701
Join Date: Sep 2007
Device: iPad Air
Has anyone compared the results of this method to that of the pdf2lrf tool in libprs500?

I would be interested in hearing the results before going ahead and installing this myself.

Old 01-02-2008, 08:36 PM   #8
shen
Member
shen began at the beginning.
 
Posts: 20
Karma: 46
Join Date: Nov 2007
Location: Germany
Device: Sony PRS-505
I tested libprs500 in the past using the GUI, and in most cases I was not very happy with the results.
So I switched over to Book Designer and PdfLrf. Depending on my source PDF, one of them gave me acceptable results. Not perfect in most cases, but the LRFs were usable enough for me.
But there were a few cases where the results were not acceptable to me - no fun to read. That's why I was looking for alternatives. I've tested nearly all the known tools and methods from this forum and was looking for something new; that's how I found this Mobipocket Creator / BD combo, which gave me surprisingly better results on PDFs I had had no luck with.
I'm not looking for the 100% perfect conversion tool, nor for 100% perfect output (I know this can hardly be done - source PDFs differ so much in quality and layout). And I'm not willing to spend much time correcting conversions, e.g. in Book Designer. I just want to convert my PDFs fast and with little user interaction, put the LRF on my Sony, read it, and that's all.

What really does the trick here is the conversion and text reformatting from PDF to HTML, which Mobipocket does a great job on. Try it and open the output HTML in your browser. As PDFs are organized in hard-coded individual pages, they first have to be converted to a single flowing text on one large page, including images, and dealing with italics and bold text, eliminating page numbers, headers, footers and so on. MP deals with these tasks really well - fast and automatically.
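
To make the idea of "pages to flowing text" concrete, here is a rough Python sketch. It is not how Mobipocket does it (I don't know its internals); it only assumes that each page arrives as plain text, that paragraphs are separated by blank lines, and that a line holding nothing but a number is a page number. The name `reflow` is made up for the example.

```python
import re

# A line that is only a number, optionally prefixed with "page", is treated
# as a page number. This is a heuristic and will also drop genuine content
# lines that consist of a bare number.
PAGE_NUMBER = re.compile(r"^\s*(?:page\s+)?\d+\s*$", re.IGNORECASE)


def reflow(pages):
    """Merge per-page text into a list of flowing paragraphs."""
    paragraphs, current = [], []
    for page in pages:
        for line in page.splitlines():
            if PAGE_NUMBER.match(line):
                continue  # drop the page number, keep the paragraph open
            text = line.strip()
            if text:
                current.append(text)
            elif current:
                # A blank line closes the paragraph collected so far.
                paragraphs.append(" ".join(current))
                current = []
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs
```

Because a page boundary alone does not close a paragraph, text that runs over a page break is joined back together. Running headers and footers would need their own detection, which this sketch leaves out.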

Once you have it converted to such an HTML page, it's easy to create a good-looking LRF, which I do with Book Designer. Of course other utilities can also be used if you start with an HTML page, but I found that this combo does a great job on my literary PDFs - and they are German in most cases.

This may not apply to all PDFs, but I've tried a few and nothing gave me better results in such a short time.

Try it, it's worth a try - especially if you are not happy with the results you're getting right now, whatever tools you're using at the moment.

For a quick check, you can also install Mobipocket Reader, open your PDF and read the output on screen. This gives you a fast preview of the conversion you can expect from MP. And after a conversion to LRF in BD, the resulting LRF is in most cases at least that good, if not better.

I'm very happy with that. And to be honest ... there's not much left to do for Book Designer; most of the conversion tricks BD does are already done by MP. But MP keeps images and italics and eliminates page numbers, which BD didn't do very well in some cases.
At least you can define your preferred fonts, font sizes and other formatting options in BD.
So reformatting the text is done mostly in MP and the look and feel is done mostly by BD.
It's too bad that there's no converter from Mobipocket PRCs to Sony LRFs, because this could be a great combo.

Stefan

Last edited by shen; 01-02-2008 at 08:54 PM.
Old 01-07-2008, 06:07 AM   #9
astra
The Introvert
astra ought to be getting tired of karma fortunes by now.
Posts: 8,308
Karma: 1000077497
Join Date: Jan 2007
Location: United Kingdom
Device: Sony Reader PRS-650 & 505 & 500
I have tried it and I didn't like it, sorry.
Too many broken paragraphs, and at the same time too many paragraphs merged into one huge paragraph.
Old 01-20-2008, 06:53 AM   #10
marco62
Junior Member
marco62 began at the beginning.
 
Posts: 6
Karma: 10
Join Date: Nov 2007
Device: PRS 505
Stefan,

thx a lot, your suggestion is the best one I have found. Now I can have a good book in just 2 minutes. Sure, some page breaks are missing, but the result is very good and, most of all, FAST!

Now we are all waiting for the next firmware update (I have read it will be out in the first quarter of 2008), because IMHO this device only needs to read PDF files ... but perfectly. Thx again, Stefan.
Old 01-21-2008, 09:55 AM   #11
Surfergirl
Enthusiast
Surfergirl has a complete set of Star Wars action figures.
Posts: 48
Karma: 299
Join Date: Oct 2007
Location: South Wales, UK
Device: PRS-505 (Blue)/PRS-505 (Red)/iPhone 3GS
Quote:
Originally Posted by shen View Post
... I'm very happy with that. And to be honest ... there's not much left to do for Book Designer; most of the conversion tricks BD does are already done by MP. But MP keeps images and italics and eliminates page numbers, which BD didn't do very well in some cases.
At least you can define your preferred fonts, font sizes and other formatting options in BD.
So reformatting the text is done mostly in MP and the look and feel is done mostly by BD.
I've got a large technical manual PDF (including many screenprints and diagrams), and was having a lot of problems trying to convert it. Tried this combination, and am very happy with the results. The only problem I have is that if I use the MP-created HTML/image files in libprs500, I get all of the images included in an almost-correctly formatted document (only minor paragraph/line-spacing inaccuracies), but if I open the same HTML file in BD, I can sort out the little formatting problems, but the document doesn't pick up the image files. Does anyone have any ideas why BD would ignore the image files?
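
I can't say why BD skips the images, but one thing that is cheap to rule out is image references that don't resolve from the HTML file's own folder. The Python sketch below lists every relative `img src` in an HTML file that does not point at an existing file; the name `missing_images` is hypothetical, and the check says nothing about how BD itself resolves paths.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlparse


class _ImgCollector(HTMLParser):
    """Collect the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def missing_images(html_path: Path):
    """Return relative img src values that do not resolve to a file on disk."""
    parser = _ImgCollector()
    parser.feed(html_path.read_text(encoding="utf-8", errors="replace"))
    missing = []
    for src in parser.sources:
        parsed = urlparse(src)
        if parsed.scheme:
            continue  # http:, file:, drive letters etc. are not checked here
        local = html_path.parent / unquote(parsed.path)
        if not local.is_file():
            missing.append(src)
    return missing
```

If the list comes back empty, the references are fine on disk and the cause is somewhere else.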

At the moment, I'm using the libprs500-produced LRF and ignoring the little mis-formattings, and would recommend the conversion procedure for anyone who is having problems getting a readable PDF conversion. I've successfully used PDFLRF and other PDF converters in the past for different docs, but this particular manual was a real pain in the backside, and Shen's method was the only one which gave me an acceptable document.

Irene
Old 02-01-2008, 05:36 AM   #12
marco62
Junior Member
marco62 began at the beginning.
 
Posts: 6
Karma: 10
Join Date: Nov 2007
Device: PRS 505
The last problem for me now is code samples.

If a book contains well-indented code examples, the indentation is lost in the conversion, and the code is really unreadable throughout the book.

Does anyone know a fast way to set up code in BD or Mobi?
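
One possible workaround, sketched in Python, is to rewrite a code sample as HTML before it goes into the converter, turning the leading spaces of each line into non-breaking spaces so they are less likely to be collapsed. Whether BD or Mobipocket keeps `&nbsp;` runs intact is an assumption I have not verified, and `code_to_html` is a made-up helper name.

```python
import html


def code_to_html(code: str, tab_width: int = 4) -> str:
    """Render a code sample as HTML that carries its indentation explicitly."""
    lines = []
    for line in code.expandtabs(tab_width).splitlines():
        stripped = line.lstrip(" ")
        indent = len(line) - len(stripped)
        # Each leading space becomes &nbsp;; the rest of the line is escaped.
        lines.append("&nbsp;" * indent + html.escape(stripped))
    return "<p><code>" + "<br>\n".join(lines) + "</code></p>"
```

This only helps when you still have the original, correctly indented code text to feed in; it cannot recover indentation that the PDF conversion has already thrown away.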
Old 02-01-2008, 12:25 PM   #13
dgallina
Junior Member
dgallina began at the beginning.
 
Posts: 6
Karma: 22
Join Date: Nov 2007
Device: Sony PRS-505
Does MobiPocket convert the text by interpreting it with OCR?

The book I tried initially appeared to convert well (nice formatting), but the text accuracy was absolutely miserable. I ended up using the PDF cut-paste and manual fix method, since that at least preserved the correct text.
Old 02-01-2008, 01:20 PM   #14
dcalder
Zealot
dcalder knows what is on the back of the AURYN.
 
Posts: 113
Karma: 9856
Join Date: Dec 2007
Location: Ontario, Canada
Device: Sony PRS-300/Kindle Keyboard/iPad Mini
Anybody got a sample to test? I've got a sneaking suspicion that WordPerfect would handle it better/more easily - Reveal Codes is the WP user's friend. I've cleaned up plenty of ASCII-text from mailing list posts by running it through WP. What I'd suspect would work for the PDF would be to either open it in WP or cut-&-paste it in, turn on Reveal Codes, see what codes are being used at the end of lines versus end/beginning of paragraphs, and go from there. Regardless of whether there's a blank line between paragraphs or if the paragraphs are indicated by indentation alone, there will be something unique about the coding that separates them. Search and replace that with some sort of unique indicator word/phrase. Then search and replace the hard line feeds with the soft line feed code. One more search and replace to turn the indicator back into the proper paragraph separation code, then a quick once-over to confirm that things look good.
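
The same three search-and-replace passes can be sketched outside WordPerfect, for example in Python. This version assumes paragraphs are separated by at least one blank line; the indentation-only case mentioned above would need a different first pattern. `MARKER` and `unwrap` are names invented for the example.

```python
import re

# A placeholder that should never occur in real text.
MARKER = "\x00PARA\x00"


def unwrap(text: str) -> str:
    """Join hard-wrapped lines while keeping blank-line paragraph breaks."""
    text = text.replace("\r\n", "\n")
    # Pass 1: swap every paragraph break (one or more blank lines) for the marker.
    text = re.sub(r"\n(?:[ \t]*\n)+", MARKER, text)
    # Pass 2: the line feeds that remain are hard wraps; turn them into spaces.
    text = re.sub(r"[ \t]*\n[ \t]*", " ", text)
    # Pass 3: turn the marker back into a proper paragraph separator.
    return text.replace(MARKER, "\n\n")
```

A quick read-through afterwards is still worth it, since headings and lists that relied on single line breaks will have been joined as well.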

At that point, I'd probably run a macro to just go ahead and do the HTML conversion (mainly just a series of search-&-replaces to replace WP code with HTML for bold, italic, underline, etc.), add in any desired extra HTML coding, then save out as plain text. Rename the txt file to html and you're good to go. Why not just let WP save as HTML, you may ask. Simple - the same reason that I highly recommend not letting Word save as HTML - they both do a lousy job and include way too much unnecessary junk.
Old 05-04-2009, 01:04 PM   #15
chrisophus
Junior Member
chrisophus began at the beginning.
 
Posts: 9
Karma: 10
Join Date: Aug 2008
Device: Kindle
Complex PDF to HTML

I wrote a Python script to convert the output of pdf2xml (from Mobipocket Creator) to HTML that is suitable for converting to ebook formats. I wrote it specifically to handle code indentation properly. It uses the same source that Mobipocket Creator uses and tries to do an even better job. It is open source (GPL), so you can tweak it if you know Python. I posted about it at http://talkings.org/2009/05/03/complex-pdf-html/. The download link is there as well.

All times are GMT -4.