Old 09-25-2007, 11:36 AM   #31
NatCh
Gizmologist
Posts: 11,615
Karma: 929550
Join Date: Jan 2006
Location: Republic of Texas Embassy at Jackson, TN
Device: Pocketbook Touch HD3
Sounds like JSWolf just did a trial run, and wasn't impressed enough with it to do a full-up review.
Old 09-25-2007, 12:02 PM   #32
kovidgoyal
creator of calibre
Posts: 45,596
Karma: 28548962
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Quote:
Originally Posted by JSWolf View Post
I just tried Michelangelo 2.0 on a file in Word that had really messed up line ends. It didn't even get it right as far as ending at the end of a sentence. To be honest, it doesn't do a good enough job to warrant the fee. All it does is set the page size in Word to one for the Reader, and attempt to join badly formed line endings. As for the line endings, there is no way it knows where to fix things properly. I have hard returns now in the middle of sentences. There are also buttons for increasing and reducing the font size.

All this is is an add-in to Word to try to fix bad line endings, set the page size, and increase or decrease the font size. It does nothing you cannot do by hand and better at that. Maybe it is OK if the lines are not too messed up. But if they are rather messed up, this won't do it. I just clicked the button to increase the font size and either I've just locked up Word or this is really slow. All in all, do it by hand.
How well/badly does pdf2lrf's automatic line end detection do on a really messed up PDF?
Old 09-25-2007, 12:59 PM   #33
lanekko
Member
Posts: 17
Karma: 121
Join Date: Sep 2007
Device: PRS-500
Quote:
Originally Posted by NatCh View Post
Sounds like JSWolf just did a trial run, and wasn't impressed enough with it to do a full-up review.
I think a review should have a more detailed log, and maybe less bias against the product before testing it?

"I'm testing it now in a really bad situation and seeing how it functions. I have no hope for it to get it right or even close."

And of course testing should be done in different scenarios, not only in the "worst, really messed up" scenario?

hehe It's like testing the new Jaguar for collision resistance by dropping it from an airplane at 35,000 ft.

And sure, kovidgoyal, different tests using different applications will build a more objective review. Maybe if JSWolf doesn't have the time to do it, somebody else could request a trial version from the company for review purposes, or has anyone already?

Last edited by lanekko; 09-25-2007 at 01:02 PM. Reason: typo
Old 09-25-2007, 01:45 PM   #34
NatCh
Gizmologist
Posts: 11,615
Karma: 929550
Join Date: Jan 2006
Location: Republic of Texas Embassy at Jackson, TN
Device: Pocketbook Touch HD3
Quote:
Originally Posted by lanekko View Post
I think a review should have a more detailed log? and maybe less biases towards it before testing it?

"I'm testing it now in a really bad situation and seeing how it functions. I have no hope for it to get it right or even close."
JSWolf has kind of a blunt way of putting things, but if he felt like the results were worthwhile, I don't believe he'd say otherwise.

Besides, according to the claims on the originating site, it ought to swallow the worst case formatting PDF can offer and spit out a beautiful RTF with no fuss or muss. That's what it seems to be saying, anyway.
Old 09-25-2007, 01:48 PM   #35
lanekko
Member
Posts: 17
Karma: 121
Join Date: Sep 2007
Device: PRS-500
Quote:
Originally Posted by NatCh View Post
JSWolf has kind of a blunt way of putting things, but if he felt like the results were worthwhile, I don't believe he'd say otherwise.

Besides, according to the claims on the originating site, it ought to swallow the worst case formatting PDF can offer and spit out a beautiful RTF with no fuss or muss. That's what it seems to be saying, anyway.
sure and I respect his style, but we seem to need a more detailed and objective review...
Old 09-25-2007, 01:51 PM   #36
jasonkchapman
Guru
Posts: 767
Karma: 2347
Join Date: Jul 2007
Location: NYC
Device: Sony Reader, nook, Droid, nookColor, nookTablet
Quote:
Originally Posted by lanekko View Post
sure and I respect his style, but we seem to need a more detailed and objective review...
I don't.
Old 09-25-2007, 04:05 PM   #37
Patricia
Reader
Posts: 11,504
Karma: 8720163
Join Date: May 2007
Location: South Wales, UK
Device: Sony PRS-500, PRS-505, Asus EEEpc 4G
Perhaps lanekko might tell us exactly how this is an improvement on Stingo's excellent macro.
It would certainly help if he were to post some examples of how this program cleaned up PG text files.
Without actual concrete evidence it is difficult to take his claims seriously.
Old 09-25-2007, 04:14 PM   #38
JSWolf
Resident Curmudgeon
Posts: 80,650
Karma: 150249619
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
Originally Posted by lanekko View Post
Did you copy/paste text from a .pdf?
I used a text file found on the net: your standard screwed-up text file with poor line endings. Given that the paragraphs start as plain left-justified text, with no indent to mark them, there is no way any program can figure out what's a proper paragraph and what isn't.
Old 09-25-2007, 04:57 PM   #39
lanekko
Member
Posts: 17
Karma: 121
Join Date: Sep 2007
Device: PRS-500
Quote:
Originally Posted by Patricia View Post
Perhaps lanekko might tell us exactly how this is an improvement on Stingo's excellent macro.
It would certainly help if he were to post some examples of how this program cleaned up PG text files.
Without actual concrete evidence it is difficult to take his claims seriously.
The Word macro (the last version I used) uses the "find page break" / "replace with null" approach, which does remove the breaks (but only a certain kind) and doesn't preserve the breaks that were intentional (like after a full stop, a title, or a bullet point).

JSWolf, if you used some random text from the internet, then you weren't using it as suggested and advertised on their sales page, were you? That is: copying text from a .pdf and pasting it into Word, as simple as that.
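As a rough illustration of the find-and-replace limitation described in this post, here is a minimal Python sketch. It is not Stingo's macro and not Michelangelo; the function names and the sample text are hypothetical. The first function removes every line break blindly, so intentional breaks (such as the one after a title) are lost too. The second keeps a break only where a blank line separates paragraphs, which helps only when the source text actually contains blank lines.

```python
import re


def join_all_breaks(text: str) -> str:
    """Naive find-and-replace: turn every line break into a space."""
    return " ".join(text.split("\n"))


def join_single_breaks(text: str) -> str:
    """Join lines inside a paragraph, but keep blank-line paragraph breaks."""
    # Split wherever one or more blank lines separate two blocks of text.
    paragraphs = re.split(r"\n\s*\n", text)
    # Collapse the remaining (hard-wrap) line breaks inside each block.
    return "\n\n".join(" ".join(p.split()) for p in paragraphs)


# Hypothetical sample: a title, then two hard-wrapped paragraphs.
sample = "CHAPTER I\n\nThe first line of a\nwrapped paragraph.\n\nA second\nparagraph."

print(join_all_breaks(sample))     # title and paragraphs run together
print(join_single_breaks(sample))  # title and paragraphs stay separate
```

Text pasted from a PDF often has no blank lines at all, in which case the second function behaves just like the first.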

Sure, only because 2 of you are asking for it, I can take a random .pdf from Gutenberg, convert it and post the results.

Added:

I'm choosing Chance and Luck by Richard Proctor: http://www.gutenberg.org/files/17224/17224-pdf.pdf

Last edited by lanekko; 09-25-2007 at 05:00 PM.
Old 09-25-2007, 05:50 PM   #40
lanekko
Member
Posts: 17
Karma: 121
Join Date: Sep 2007
Device: PRS-500
Attached are the 2 files for Chance and Luck: "pretest", from before using the software (only the result of copying/pasting into Word), and "test", from after using the software. See all the line breaks that got removed: thousands and thousands.
This wasn't a good book to sample, because it didn't have bullets, subtitles, etc.
Attached Files
File Type: pdf pretest.pdf (782.8 KB, 510 views)
File Type: pdf test.pdf (866.3 KB, 596 views)
Old 09-25-2007, 06:04 PM   #41
JSWolf
Resident Curmudgeon
Posts: 80,650
Karma: 150249619
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
OK, I pasted some of the first few paragraphs from the above linked PDF and guess what? It did make mistakes. The first LONG paragraph has 3 extra returns, and it sometimes puts returns in places that are not the proper end of a sentence. So it's not too bad overall, but you'd still need to go through it against the PDF to fix the mistakes. There is no way any program can figure out exactly where every paragraph is supposed to start. It's not possible unless there are spaces or a tab at the beginning of each paragraph, or the program uses line ends/lengths to try to guess. If you have lines of, say, 60-80 characters and then a shorter line that is a proper sentence end, then yeah, it could do it. But when there is nothing to tell what is a paragraph and what isn't, it cannot figure it all out 100%.

Using 4 paragraphs from that PDF, I have 10 returns when I should only have 4.
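The line-length guess described in this post can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Michelangelo, pdf2lrf, or any other tool mentioned in the thread works: the function name, the `full_width` threshold, and the set of sentence-ending characters are all hypothetical. A line is treated as the end of a paragraph when it is shorter than a "full" line and ends with sentence-ending punctuation; a blank line or leading whitespace also starts a new paragraph.

```python
def rejoin_paragraphs(lines, full_width=60, enders=".!?\"'"):
    """Group hard-wrapped lines into paragraphs using a line-length guess."""
    paragraphs, current = [], []
    for raw in lines:
        line = raw.rstrip()
        if not line.strip():
            # A blank line always closes the paragraph in progress.
            if current:
                paragraphs.append(" ".join(current))
                current = []
            continue
        # Leading spaces or a tab signal the start of a new paragraph.
        if line[0] in " \t" and current:
            paragraphs.append(" ".join(current))
            current = []
        current.append(line.strip())
        # A short line that ends a sentence is taken as a paragraph end.
        if len(line) < full_width and line[-1] in enders:
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


# Hypothetical sample, wrapped at roughly 36 characters.
wrapped = [
    "The reader shows one page at a time,",
    "so every hard return matters. Short",
    "lines end a paragraph.",
    "A new paragraph starts here and runs",
    "on to the end.",
]
for paragraph in rejoin_paragraphs(wrapped, full_width=30):
    print(paragraph)
```

The guess fails in exactly the cases the post points out: a paragraph whose last line happens to fill the full width is merged into the next one, and a short line ending in an abbreviation such as "Mr." is split off too early.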
Old 09-25-2007, 06:24 PM   #42
bojan
Enthusiast
Posts: 48
Karma: 68
Join Date: Aug 2006
Location: Slovenia
Device: iRex iLiad
A quick look at your results shows that the first few pages have 100% correct "recognition" of paragraphs, but the rest of the document has 0%, which suggests to me that you corrected the first few pages by hand, just to "fake" the results.

And you are right, this is not a good example of a PDF, because it is untagged, and you cannot get better results using the copy/paste method.
Old 09-25-2007, 06:58 PM   #43
lanekko
Member
Posts: 17
Karma: 121
Join Date: Sep 2007
Device: PRS-500
Quote:
Originally Posted by bojan View Post
A quick look at your results shows that the first few pages have 100% correct "recognition" of paragraphs, but the rest of the document has 0%, which suggests to me that you corrected the first few pages by hand, just to "fake" the results.

And you are right, this is not a good example of a PDF, because it is untagged, and you cannot get better results using the copy/paste method.
Sick and tired of this... Logging off forever after this post. I won't waste my time any longer; I've got a lot to read. I've already found the solution, and I don't need you guys anymore, nor do I need to convince you about it. Have happy lives, read a lot, and enjoy your readers. I don't care whether you like it or not, or whether you buy it or not. This is my last post.

I didn't fake anything; the results are consistent all the way to the end. Page 698 of the pretest file, the page before the last one:




And page 821 of the test file, also the page before the last one. It contains the same text, with the breaks removed:




Signing off....
Old 09-26-2007, 09:15 AM   #44
JSWolf
Resident Curmudgeon
Posts: 80,650
Karma: 150249619
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Screen #1 is the original PDF
Screen #2 is the paste in Word
Screen #3 is the result

As you can see, it got the first paragraph wrong. I have two extra returns that are not even at a proper sentence end. So as we can clearly see, this is incorrect. Now multiply this over the entire document and the number of errors will increase by a lot. This sample is representative of the product. And if the messy line ends are worse, this program's results will be worse. The three paragraphs in screen #2 are very easy to see. If the program did paragraph finding using average line lengths, it would have been able to get it right.

Basically, it's an expensive program that doesn't work well.
Attached Thumbnails
screen1.PNG (171.0 KB, 372 views)
screen2.PNG (116.5 KB, 364 views)
screen3.PNG (112.3 KB, 395 views)
Old 10-01-2007, 02:45 PM   #45
kahn10
Junior Member
Posts: 3
Karma: 10
Join Date: May 2007
planted writing

What worries me most is that it is not really a challenge to write a shill post that doesn't make it obvious you are shilling for the product. Lanekko has not been capable of this, which makes me doubt his ability to write code.

I think it should be obvious to everyone that this is too much money to spend, even if it worked as well as other programs freely available on the internet. I actually wrote a program a few years back that could do the same thing, when I was converting symposium abstracts, and I will give it out for free if I can dig it up. I'll let you know.