Old 04-27-2003, 02:54 PM   #31
macrotor
Connoisseur
Posts: 59
Karma: 2418
Join Date: Nov 2002
Location: Fremont, CA, USA
Device: Tungsten|C with Nokia6200
Okay, I got pdftohtml to work on Mac OS X. I was hoping to take a little more time and make it work directly from the Print window using the PDF scripting feature, but I just haven't had the time. In any case, here is how to compile it:

Get the source code from:
http://pdftohtml.sourceforge.net/

It was at version 0.35 when I did this. If it is now a later version, then the following fix may no longer be necessary:
In the "src" directory, you will find a file called HtmlOutputDev.cc. Open it in a text editor and go to line 791. Remove the "= 1" from the part that reads "firstPage = 1". Save and close.
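If you would rather not edit the file by hand, the same change can be sketched with sed. This is only a rough sketch: the function name drop_default_arg and the ".new" temporary file are made up for illustration, and line 791 is only right for the 0.35 source, so look at the line before overwriting anything.

```shell
# Remove the "= 1" default from "firstPage = 1" on a single line of a file.
# Assumption: the default line number (791) matches pdftohtml 0.35 only.
drop_default_arg() {
    file="$1"
    line="${2:-791}"
    # Rewrite only the given line; every other line passes through unchanged.
    sed "${line}s/firstPage *= *1/firstPage/" "$file" > "$file.new" &&
        mv "$file.new" "$file"
}

# Usage (from the top of the pdftohtml source tree):
#   drop_default_arg src/HtmlOutputDev.cc 791
```

Writing to a temporary file and moving it into place avoids relying on an in-place option that not every sed supports.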

Okay, open up your terminal. We have to set OS X to use the older gcc compiler. Type the following command:
sudo gcc_select 2

Now it's time to move into the pdftohtml directory and compile it using "make all".

Reset your compiler to the new version:
sudo gcc_select 3

There should now be a new "pdftohtml" executable file. That file is all you need, so put it somewhere in your path. Type "pdftohtml -help" to get basic instructions on how to use it.
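The compile steps can also be wrapped in one small shell function, so the compiler gets switched back even when the build fails. Treat this as a sketch: the function name build_pdftohtml is made up, and the only commands it runs are the gcc_select and make calls from the steps in this post.

```shell
# Build pdftohtml with the older compiler, then switch back to the newer one.
# The gcc_select arguments (2 and 3) are the ones used in the steps above.
build_pdftohtml() {
    src_dir="$1"
    sudo gcc_select 2 || return 1
    # Run make in a subshell so the caller's working directory is untouched.
    ( cd "$src_dir" && make all )
    status=$?
    # Restore the newer compiler even if the build failed.
    sudo gcc_select 3
    return "$status"
}

# Usage: build_pdftohtml /path/to/pdftohtml-source
```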

I always use the "-c" option so that I have a near exact replica of the formatted PDF. I then let iSilo do all the stripping. It depends on how fast you want it to work.

At this point, you'll have to help me experiment to find the best settings. Exact replicas don't always fit well on a Palm screen. I hope this works for you!
Old 05-08-2003, 07:23 PM   #32
daught
Enthusiast
Posts: 29
Karma: 2314
Join Date: Feb 2003
Jim,

Is there any way you can provide us very, extremely, terrifyingly timid terminal users with your compiled OS X compatible version of pdftohtml?

Gary
Old 05-09-2003, 07:35 PM   #33
macrotor
Connoisseur
Posts: 59
Karma: 2418
Join Date: Nov 2002
Location: Fremont, CA, USA
Device: Tungsten|C with Nokia6200
Wink It won't be THAT easy.

Okay, I'll post the binary here. However, I tested it on a clean system and found that you still need to install GhostScript. The absolute best way to do this is by using fink (http://fink.sourceforge.net).
You can use the Fink Commander application so that you can avoid the terminal. Just install the BINARY version of Ghostscript 8.00, then put this file somewhere in your path (like /sw/bin). I tried one standalone ghostscript installer, but it lacked png support, so you have to use Fink to get a full install.
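A quick way to see whether both pieces are in place is to check the PATH from the terminal. This is a minimal sketch: have_tool is a made-up helper name, and it assumes Ghostscript is installed under its usual command name gs, whose help text lists the output devices it was built with.

```shell
# Report whether a command can be found on the PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

for tool in pdftohtml gs; do
    if have_tool "$tool"; then
        echo "$tool: found at $(command -v "$tool")"
    else
        echo "$tool: not found on PATH"
    fi
done

# Ghostscript lists its output devices in the help text; look for a PNG one.
if have_tool gs; then
    if gs -h 2>/dev/null | grep png >/dev/null; then
        echo "gs: PNG output device available"
    else
        echo "gs: no PNG output device listed"
    fi
fi
```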

I'm afraid this doesn't get any easier if you want it for free. The opensource community requires a little elbow-grease from their users!

Here is an example command to get an accurate page for iSilo. Mind you, if you don't care about the tables and fixed formatting, then remove the '-c' option.

pdftohtml -c -noframes example.pdf example.html

This will create the example.html files and all the png graphics in the current directory. It would be great to wrap this in an installer with a GUI, but I have a pregnant wife that believes I have better things to do! At least I can finally carry all my tech manuals in my pocket with a searchable index. I hope this gets you started!
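If you have a whole folder of manuals, the same command can be run in a loop. A rough sketch, assuming pdftohtml is on your path: convert_all and the PDFTOHTML override variable are names invented here, and the only options used are the "-c" and "-noframes" from the example above.

```shell
# Convert every PDF in a directory with the options shown above.
# PDFTOHTML is a hypothetical override so a different binary can be swapped in.
convert_all() {
    dir="${1:-.}"
    for pdf in "$dir"/*.pdf; do
        # Skip the unexpanded pattern when the directory has no PDFs.
        [ -e "$pdf" ] || continue
        "${PDFTOHTML:-pdftohtml}" -c -noframes "$pdf" "${pdf%.pdf}.html" ||
            echo "failed: $pdf" >&2
    done
}

# Usage: convert_all ~/manuals
```

Drop the "-c" inside the function if you don't care about tables and fixed formatting.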
Attached Files
File Type: zip pdftohtml.zip (245.3 KB, 368 views)
Old 06-16-2003, 01:38 AM   #34
Alexander Turcic
Fully Converged
Posts: 17,094
Karma: 10000048
Join Date: Oct 2002
Location: Switzerland
Device: Sony PRS-650 / Nexus 7 / Kindle PW
If you can afford it: Adobe Acrobat 6.0 allows you to export any PDF file to HTML 3.2, HTML 4.01 with CSS, DOC, RTF, TXT, XML 1.0. Very cool!
Old 06-16-2003, 05:42 AM   #35
cbarnett
MR prodigal son
Posts: 1,085
Karma: 1083739
Join Date: Mar 2003
Location: Australia
Device: Galaxy Note, Nexus7
I have access to Acrobat v5 at work, if I want it. Can you export like that in v5?
Old 06-16-2003, 08:51 AM   #36
Alexander Turcic
Fully Converged
Posts: 17,094
Karma: 10000048
Join Date: Oct 2002
Location: Switzerland
Device: Sony PRS-650 / Nexus 7 / Kindle PW
Nope, the great export functionality is new in Acrobat 6.0. We have it here at work and I just tested a few PDF files. The outcome is amazing, and I don't have to tell you how much I enjoy reading them with iSilo now.
Old 06-16-2003, 07:09 PM   #37
BasilC
Zealot
Posts: 129
Karma: 60
Join Date: Feb 2003
Location: London England
Device: Palm Tungsten T3
Quote:
Originally posted by Alexander
If you can afford it: Adobe Acrobat 6.0 allows you to export any PDF file to HTML 3.2, HTML 4.01 with CSS, DOC, RTF, TXT, XML 1.0. Very cool!

The free Adobe Reader 6 allows you to save a pdf as a text file. It seems to work OK, except that it leaves in page headers and footers and page numbers. Adobe Reader 6 will also read pdf files out loud (in an American accent, naturally)!
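The stray page numbers, at least, are easy to filter out afterwards if they sit on lines of their own. A minimal sketch, assuming that layout: strip_page_numbers is a made-up name, and running headers and footers are left alone because they differ from one document to the next.

```shell
# Drop lines that contain nothing but a page number (optional spaces allowed).
# Running headers and footers are not handled; they vary per document.
strip_page_numbers() {
    grep -v -E '^[[:space:]]*[0-9]+[[:space:]]*$'
}

# Usage: strip_page_numbers < book.txt > book-clean.txt
```

Lines that merely contain digits, such as a sentence mentioning a year, are kept.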

Incidentally, how much does Acrobat cost? Does the standard version do the conversions to HTML or just the Professional version?
Old 06-16-2003, 07:13 PM   #38
BasilC
Zealot
Posts: 129
Karma: 60
Join Date: Feb 2003
Location: London England
Device: Palm Tungsten T3
Angry

Quote:
Originally posted by BasilC
Incidentally, how much does Acrobat cost?
I just found out the price on the Adobe website. Standard is £287, Professional £440. Forget it!
Old 08-07-2003, 04:31 AM   #39
wumpi
Enthusiast
Posts: 37
Karma: 2358
Join Date: Feb 2003
Do you guys know Raphael Fetzer's pdaConverter?

pdaConverter simplifies the creation of documents for PalmOS based handhelds and works only under MS Windows. The daily synchronisation of webpages/channels can be done, too.

pdaConverter supports the following filetypes: jpg, gif, png, pdf, html, rtf, hlp, wpd, txt, Aportis Doc
And can produce these filetypes: Plucker, Aportis Doc, zTXT

Really useful if you use Plucker!
Old 08-07-2003, 06:57 PM   #40
BasilC
Zealot
Posts: 129
Karma: 60
Join Date: Feb 2003
Location: London England
Device: Palm Tungsten T3
I downloaded pdaConverter, but it won't install properly on my computer, for some reason. Pity, it looks interesting.
Old 08-12-2003, 04:33 PM   #41
macrotor
Connoisseur
Posts: 59
Karma: 2418
Join Date: Nov 2002
Location: Fremont, CA, USA
Device: Tungsten|C with Nokia6200
pdftohtml a little disappointing.

Well, pdftohtml has been less than wonderful. If you want formatted text, it's fine. However, it converts all graphics on a page into a single background graphic. Not very good for iSilo.

Oh well, I'll keep looking.
Old 09-26-2003, 11:17 AM   #42
vitalyb
Member
Posts: 17
Karma: 10
Join Date: Aug 2003
Location: Israel
Device: HTC Touch Pro
Where is the option in Acrobat 6?
Old 01-11-2004, 04:52 PM   #43
Alexander Turcic
Fully Converged
Posts: 17,094
Karma: 10000048
Join Date: Oct 2002
Location: Switzerland
Device: Sony PRS-650 / Nexus 7 / Kindle PW
PDF Plain Text Extractor is another tool that can extract plain text from PDF files without the help of any PDF SDK or other third-party library.

You don't need any products from Adobe (neither Adobe Acrobat Reader nor Adobe Acrobat) installed on your computer. P2T focuses on text extraction from PDF files. It analyzes the raw PDF file directly and extracts plain text from it. The layout of the document is preserved.

I haven't tested it yet, but a trial is available; the full version costs $59.95.
Old 01-12-2004, 05:49 PM   #44
BasilC
Zealot
Posts: 129
Karma: 60
Join Date: Feb 2003
Location: London England
Device: Palm Tungsten T3
I finally managed to install pdaConverter. I can get it to convert pdf text to Plucker format pretty well, and it's then much quicker to read than using Adobe Reader for Palm. However, I can't get it to convert images, even though I tick the option to do this. In general, it looks like a very useful program, the only trouble is that there isn't a usable manual. Anyone figured out how to make full use of it?
Old 01-14-2004, 10:53 AM   #45
sas
Enthusiast
Posts: 26
Karma: 42
Join Date: Mar 2003
Device: T650 & T/T3
Quote:
Originally Posted by BasilC
.... However, I can't get it to convert images, even though I tick the option to do this.
You mean pictures included in the original PDF, or any picture? I was never able to convert images from the PDF either, but for JPGs and GIFs it works fine both with add file / from clipboard and through system integration.

If you did not find enough information in the help file, why not e-mail Raphael, or post here?

Enjoy