09-25-2007, 11:36 AM | #31 |
Gizmologist
Posts: 11,615
Karma: 929550
Join Date: Jan 2006
Location: Republic of Texas Embassy at Jackson, TN
Device: Pocketbook Touch HD3
|
Sounds like JSWolf just did a trial run, and wasn't impressed enough with it to do a full-up review.
|
09-25-2007, 12:02 PM | #32 | |
creator of calibre
Posts: 43,858
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Quote:
|
|
|
09-25-2007, 12:59 PM | #33 | |
Member
Posts: 17
Karma: 121
Join Date: Sep 2007
Device: PRS-500
|
Quote:
"I'm testing it now in a really bad situation and seeing how it functions. I have no hope for it to get it right or even close." And of course testing should be done in different scenarios, not only in the "worst, really messed up" scenario? hehe. It's like testing the new Jaguar for collision resistance by dropping it from an airplane at 35,000 ft. And sure, kovidgoyal, different tests using different applications will make for a more objective review. Maybe if JSWolf doesn't have the time to do it, somebody else could request a trial version from the company for review purposes, or has anyone already? Last edited by lanekko; 09-25-2007 at 01:02 PM. Reason: typo |
|
09-25-2007, 01:45 PM | #34 | |
Gizmologist
Posts: 11,615
Karma: 929550
Join Date: Jan 2006
Location: Republic of Texas Embassy at Jackson, TN
Device: Pocketbook Touch HD3
|
Quote:
Besides, according to the claims on the originating site, it ought to swallow the worst case formatting PDF can offer and spit out a beautiful RTF with no fuss or muss. That's what it seems to be saying, anyway. |
|
09-25-2007, 01:48 PM | #35 | |
Member
Posts: 17
Karma: 121
Join Date: Sep 2007
Device: PRS-500
|
Quote:
|
|
|
09-25-2007, 01:51 PM | #36 |
Guru
Posts: 767
Karma: 2347
Join Date: Jul 2007
Location: NYC
Device: Sony Reader, nook, Droid, nookColor, nookTablet
|
|
09-25-2007, 04:05 PM | #37 |
Reader
Posts: 11,504
Karma: 8720163
Join Date: May 2007
Location: South Wales, UK
Device: Sony PRS-500, PRS-505, Asus EEEpc 4G
|
Perhaps lanekko might tell us exactly how this is an improvement on Stingo's excellent macro.
It would certainly help if he were to post some examples of how this program cleaned up PG text files. Without actual concrete evidence it is difficult to take his claims seriously. |
09-25-2007, 04:14 PM | #38 |
Resident Curmudgeon
Posts: 73,983
Karma: 128903378
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
I used a text file as found out on the net. Your standard screwed-up text file with poor line endings. Given that the starts of the paragraphs were just left-justified text, there is no way any program can figure out what's a proper paragraph or not.
|
09-25-2007, 04:57 PM | #39 | |
Member
Posts: 17
Karma: 121
Join Date: Sep 2007
Device: PRS-500
|
Quote:
Sure, only because 2 of you are asking for it, I can take a random .pdf from Gutenberg, convert it and post the results. Added: I'm choosing Chance and Luck by Richard Proctor http://www.gutenberg.org/files/17224/17224-pdf.pdf Last edited by lanekko; 09-25-2007 at 05:00 PM. |
|
09-25-2007, 05:50 PM | #40 |
Member
Posts: 17
Karma: 121
Join Date: Sep 2007
Device: PRS-500
|
Attached are the 2 files for Chance and Luck: "pretest" is from before using the software (only the result of copying/pasting into Word), and "test" is from after using the software. See all the line breaks that got removed, thousands and thousands.
This wasn't a good book to sample, because it didn't have bullets, subtitles, etc. |
09-25-2007, 06:04 PM | #41 |
Resident Curmudgeon
Posts: 73,983
Karma: 128903378
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Ok, I pasted some of the first few paragraphs from the above linked PDF and guess what? It did make mistakes. The first LONG paragraph has 3 extra returns. And it sometimes puts returns in where there is no proper end of a sentence. So, it's not too bad overall, but you'd still need to go through it with the PDF to fix the mistakes. There is no way any program can figure out exactly where every paragraph is supposed to start. It's not possible unless there were maybe spaces or a tab at the beginning of the paragraph, or it used line ends/lengths to try to guess. If you have lines of say 60-80 characters and then a shorter line that ends on a proper sentence end, then yeah, it could do it. But when there is nothing to tell what is a paragraph or not, it cannot 100% figure it all out.
Using 4 paragraphs from that PDF, I have 10 returns when I should only have 4. |
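The line-length idea described in this post can be sketched roughly as follows. This is only an illustration of the heuristic, not how the program under review works: the function name `join_paragraphs`, the `short_ratio` parameter, and the 0.75 default are all assumptions made for the example. A line is treated as the end of a paragraph when it is clearly shorter than the widest line and ends with sentence punctuation; a blank line also ends a paragraph.

```python
def join_paragraphs(lines, short_ratio=0.75):
    """Merge hard-wrapped lines into paragraphs with a line-length guess.

    Hypothetical helper for illustration only. A paragraph is assumed to
    end at a blank line, or at a line that is shorter than
    short_ratio * (widest line) and ends with sentence punctuation.
    """
    stripped = [line.strip() for line in lines]
    widths = [len(text) for text in stripped if text]
    if not widths:
        return []
    full_width = max(widths)  # stand-in for the wrap margin

    paragraphs, current = [], []
    for text in stripped:
        if not text:
            # A blank line always closes the paragraph in progress.
            if current:
                paragraphs.append(" ".join(current))
                current = []
            continue
        current.append(text)
        is_short = len(text) < short_ratio * full_width
        ends_sentence = text.endswith((".", "!", "?", '."', '!"', '?"'))
        if is_short and ends_sentence:
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


sample = [
    "The first line of this paragraph runs all the way to the margin",
    "and then it stops short.",
    "The second paragraph also fills a whole line before it wraps on",
    "to a short final line.",
]
for paragraph in join_paragraphs(sample):
    print(paragraph)
```

A guess like this still fails in the cases raised above: a paragraph whose last line happens to reach the margin gets merged into the next one, and a short line ending in an abbreviation can split a paragraph early, so the output would still need checking against the PDF.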
09-25-2007, 06:24 PM | #42 |
Enthusiast
Posts: 48
Karma: 68
Join Date: Aug 2006
Location: Slovenia
Device: iRex iLiad
|
A quick look at your results shows that the first few pages have 100% correct "recognition" of paragraphs, but the rest of the document has 0%, which shows to me that you corrected the first few pages by hand, just to "fake" the results.
And you are right, this is not a good example of a PDF, because it is untagged, and you cannot get better results using the copy/paste method. |
09-25-2007, 06:58 PM | #43 | |
Member
Posts: 17
Karma: 121
Join Date: Sep 2007
Device: PRS-500
|
Quote:
I didn't fake anything, the results are consistent all the way to the end: page 698, the one before the last one (pretest file). And page 821 of the test file, the page before the last one in the test file, contains the same text with the breaks removed: Signing off.... |
|
09-26-2007, 09:15 AM | #44 |
Resident Curmudgeon
Posts: 73,983
Karma: 128903378
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Screen #1 is the original PDF
Screen #2 is the paste in Word
Screen #3 is the result
As you can see, it got the first paragraph incorrect. I have two extra returns that are not even at a proper sentence end. So as we can clearly see, this is incorrect. Now multiply this over the entire document and you'll see the number of errors will increase by a lot. This sample is representative of the product. And if the messy line ends are worse, this program's results will be worse. As you can see, it's very easy to see the three paragraphs in screen #2. If the program did paragraph finding using average line lengths, it would have been able to get it right. Basically, it's an expensive program that doesn't work well. |
10-01-2007, 02:45 PM | #45 |
Junior Member
Posts: 3
Karma: 10
Join Date: May 2007
|
planted writing
What worries me most is that it is not really a challenge to write a shill post that doesn't make it obvious that you are shilling for the product. Lanekko has not been capable of this, which makes me doubt his ability to write code.
I think it should be obvious to everyone that this is too much money to spend, even if it worked as well as other programs freely available on the internet. I actually wrote a program a few years back that could do the same thing when I was converting symposium abstracts, and I will give it out for free if I can dig it up. I'll let you know. |
|