10-24-2007, 02:41 PM   #1
lilpretender
Enthusiast
Posts: 45
Karma: 10
Join Date: Oct 2007
Device: PRS-500
Conversion software

Are there any programs that will convert html documents to text? I
have a lot of files to convert to text and don't want to do them one
by one. And will it also convert some of them into one text?

Thank you.

10-24-2007, 02:46 PM   #2
RWood
Technogeezer
Posts: 7,233
Karma: 1596436
Join Date: Nov 2006
Location: Virginia, USA
Device: Sony PRS-500
For that task your best bet would be html2lrf, in the free libprs500 package. Click on the libprs500 link on the Conversion page in the MobileRead Wiki.

10-24-2007, 04:49 PM   #3
JSWolf
Resident Curmudgeon
Posts: 36,689
Karma: 17734032
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Sony Reader PRS-650, iPad, nook STR
I think the thread for html2lrf is a sticky in the Reader content section.

10-24-2007, 09:16 PM   #4
lilpretender
Enthusiast
Posts: 45
Karma: 10
Join Date: Oct 2007
Device: PRS-500
Quote:
Originally Posted by RWood
For that task your best bet would be html2lrf, in the free libprs500 package. Click on the libprs500 link on the Conversion page in the MobileRead Wiki.
Okay, I tried it. It did convert the HTML to another format. I haven't put the files on the memory stick or read them on the Reader yet, but I have saved them to disk. The only problem is that I converted more than one HTML document, about 10 pages so far, and each one is saved in a separate folder. So if I want to put them on disk or in the library, I have to open each folder to drag and drop the file.

10-24-2007, 10:01 PM   #5
RWood
Technogeezer
Posts: 7,233
Karma: 1596436
Join Date: Nov 2006
Location: Virginia, USA
Device: Sony PRS-500
Quote:
Originally Posted by lilpretender
Okay, I tried it. It did convert the HTML to another format. I haven't put the files on the memory stick or read them on the Reader yet, but I have saved them to disk. The only problem is that I converted more than one HTML document, about 10 pages so far, and each one is saved in a separate folder. So if I want to put them on disk or in the library, I have to open each folder to drag and drop the file.
Very strange. When I have used it, it consolidated all linked HTML files into one LRF.

10-24-2007, 10:03 PM   #6
JSWolf
Resident Curmudgeon
Posts: 36,689
Karma: 17734032
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Sony Reader PRS-650, iPad, nook STR
Maybe lilpretender is using the GUI (which I do not use) and it acts differently than the command-line version, which I do use.

10-24-2007, 10:10 PM   #7
kovidgoyal
creator of calibre
Posts: 25,797
Karma: 4998511
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
No, both the GUI and the CLI follow all links. I think he means he's got 10 different documents.

10-24-2007, 10:28 PM   #8
ebookfab
Member
Posts: 18
Karma: 10
Join Date: May 2005
Location: Indianapolis, IN
Device: Palm TX, Sony PRS505, Sony 700
There is also HTMLAsText by NirSoft. It is freeware.
Fred

10-24-2007, 10:29 PM   #9
RWood
Technogeezer
Posts: 7,233
Karma: 1596436
Join Date: Nov 2006
Location: Virginia, USA
Device: Sony PRS-500
If he has 10 different documents, as kovid suggests, then it would be a simple matter to write an index HTML file that links to all of them. html2lrf follows the links, so everything should end up in one LRF.
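
A rough sketch of such a linking page, generated with Python rather than written by hand. The function name `build_index` and the example file names are made up for illustration, and whether html2lrf accepts percent-encoded link targets (used here for names containing spaces) is an assumption that has not been checked.

```python
import html
from urllib.parse import quote


def build_index(files, title="Combined book"):
    """Return an HTML page that links to each file, in the order given.

    `files` should be paths relative to where the index page is saved.
    """
    items = []
    for name in files:
        # Use forward slashes and percent-encode spaces in the link target.
        href = quote(name.replace("\\", "/"))
        # Escape the visible link text so characters like & or < are safe.
        items.append(f'<li><a href="{href}">{html.escape(name)}</a></li>')
    return (
        "<html><head><title>" + html.escape(title) + "</title></head>\n"
        "<body>\n<ul>\n" + "\n".join(items) + "\n</ul>\n</body></html>\n"
    )


if __name__ == "__main__":
    # Hypothetical file names; replace with the real documents.
    page = build_index(["book1/index.html", "book2/index.html"])
    with open("all_books.html", "w", encoding="utf-8") as out:
        out.write(page)
```

The resulting all_books.html would then be the single file handed to the converter.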

10-24-2007, 10:43 PM   #10
RalphTrickey
Enthusiast
Posts: 42
Karma: 30
Join Date: Sep 2007
Device: Sony
Or he could just do a file search, and the results all show up in one 'virtual' folder. That flattens them out so you can drag and drop easily.
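
The same flattening can be scripted. This is only a sketch: the function name `collect_files`, the `*.lrf` pattern, and the folder names in the usage line are assumptions for illustration. It copies matching files from every subfolder into one destination folder and renames on a name clash instead of overwriting.

```python
import shutil
from pathlib import Path


def collect_files(source_root, dest_dir, pattern="*.lrf"):
    """Copy every file matching `pattern` under source_root into dest_dir."""
    source_root = Path(source_root).resolve()
    dest_dir = Path(dest_dir).resolve()
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    # sorted() takes a snapshot first, so files copied during the loop
    # are not picked up again even if dest_dir lies inside source_root.
    for path in sorted(source_root.rglob(pattern)):
        if dest_dir in path.parents:
            continue  # already in the destination folder
        target = dest_dir / path.name
        counter = 1
        # Two books with the same file name: add _1, _2, ... to the copy.
        while target.exists():
            target = dest_dir / f"{path.stem}_{counter}{path.suffix}"
            counter += 1
        shutil.copy2(path, target)
        copied.append(target)
    return copied
```

For example, `collect_files("converted", "converted/all")` would leave one folder to drag and drop from.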

10-24-2007, 11:18 PM   #11
JSWolf
Resident Curmudgeon
Posts: 36,689
Karma: 17734032
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Sony Reader PRS-650, iPad, nook STR
I think I get it now: 10 different HTML files in 10 different directories.

10-25-2007, 04:38 AM   #12
kacir
Wizard
Posts: 2,765
Karma: 3093505
Join Date: May 2006
Device: PocketBook 360, before it was Sony Reader, cassiopeia A-20
Quote:
Originally Posted by lilpretender
Are there any programs that will convert html documents to text? I
have a lot of files to convert to text and don't want to do them one
by one. And will it also convert some of them into one text?

Thank you.
For converting HTML manuals to text I use a browser called lynx.
lynx is a text-only browser. There are versions for Linux, Mac OS X, and Windows.

Just run it from the command line like this:
lynx -dump -width=10000 myHTMLfile.html > myHTMLfile.txt
That is it.
-dump means that the browser does not start "browsing" but converts the HTML page to text and dumps it to standard output
-width=10000 tells lynx not to wrap lines at the 80-character position
> myHTMLfile.txt means that the text lynx has dumped to standard output will be saved as myHTMLfile.txt
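
For anyone who would rather drive lynx from a script, here is a small Python sketch of the same command. The helper names `lynx_dump_args` and `html_to_text` are made up for this example, and it assumes a lynx executable is installed and can be found on the PATH (or is passed in explicitly).

```python
import subprocess
from pathlib import Path


def lynx_dump_args(html_file, width=10000, lynx="lynx"):
    """Return the argument list for: lynx -dump -width=N file.html"""
    return [lynx, "-dump", f"-width={width}", str(html_file)]


def html_to_text(html_file, lynx="lynx"):
    """Run lynx on one HTML file and save the dump next to it as .txt.

    Assumes the executable named by `lynx` exists; raises if lynx fails.
    """
    html_file = Path(html_file)
    out_file = html_file.with_suffix(".txt")
    # Opening the file and passing it as stdout plays the role of
    # the "> myHTMLfile.txt" redirection in the command above.
    with open(out_file, "wb") as out:
        subprocess.run(lynx_dump_args(html_file, lynx=lynx), stdout=out, check=True)
    return out_file
```

Calling `html_to_text("myHTMLfile.html")` should leave the text in myHTMLfile.txt, the same as the one-liner.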

you can get lynx (it is Free Software) here:
http://www.subir.com/lynx/binaries.html

you can see other options here:
http://linux.die.net/man/1/lynx

I suggest that you
- save the binary as c:\bin\lynx.exe
- start a console (command line)
- change to the directory with your books
using cd "C:/my books to convert"
- issue the command dir *.htm* /b > convert.bat
- that command creates a text file listing all your books
- edit convert.bat with your favourite text editor.
You get a file like this
.........
mybook1.htm
mybook2.htm
........
After editing, it should look like:
..........
c:\bin\lynx.exe -dump -width=10000 mybook1.htm > mybook1.htm.txt
c:\bin\lynx.exe -dump -width=10000 mybook2.htm > mybook2.htm.txt
..........
- run convert.bat, put your feet on the table and smugly watch as hundreds of files get converted in a couple of minutes.
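
The hand-editing step above can also be automated. The sketch below turns the output of dir *.htm* /b into the lines of convert.bat. The function name `make_batch_lines` is made up, and the double quotes around file names are an addition that is not in the steps above, included on the assumption that some book names contain spaces.

```python
def make_batch_lines(listing, lynx=r"c:\bin\lynx.exe", width=10000):
    """Turn a bare file listing (one name per line) into lynx commands."""
    lines = []
    for name in listing.splitlines():
        name = name.strip()
        if not name:
            continue  # skip blank lines in the listing
        # Quotes keep names with spaces in one piece on the command line.
        lines.append(f'{lynx} -dump -width={width} "{name}" > "{name}.txt"')
    return lines


if __name__ == "__main__":
    # Example listing as produced by: dir *.htm* /b
    listing = "mybook1.htm\nmybook2.htm\n"
    with open("convert.bat", "w") as bat:
        bat.write("\n".join(make_batch_lines(listing)) + "\n")
```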

10-27-2007, 11:26 PM   #13
lilpretender
Enthusiast
Posts: 45
Karma: 10
Join Date: Oct 2007
Device: PRS-500
Quote:
Originally Posted by RWood
For that task your best bet would be html2lrf, in the free libprs500 package. Click on the libprs500 link on the Conversion page in the MobileRead Wiki.
Now I have some other questions. How do you convert in batch? I can edit meta information in batch, but can't convert in batch.
Also, can I use this program for the PRS-505 too? I'm thinking of getting one.

10-28-2007, 12:11 AM   #14
RWood
Technogeezer
Posts: 7,233
Karma: 1596436
Join Date: Nov 2006
Location: Virginia, USA
Device: Sony PRS-500
The LRF files will work on both the 500 and the 505. The developer is working to add 505 features to the current functionality.

For batch conversion I use the command-line version and make a DOS batch file. I started on PCs years ago and have many years of DOS batch file processing behind me (and many years of IBM JCL scripts before that).