MobileRead Forums > E-Book Formats > ePub

Old 08-07-2013, 04:05 AM   #1
malc_b
Junior Member
malc_b began at the beginning.
 
Posts: 2
Karma: 10
Join Date: Jul 2013
Device: nook
Convert website with txt files to epub

I couldn't find anywhere how to convert a website that has some txt files to an epub, so here is my method. For a plain HTML website, all you need to do is save the site to disk with, say, WinHTTrack, then drag the index file into calibre. But this only works if all the links point to HTML files. It fails if some of the linked files are .txt, and I guess also if they are PDF files.

So here is the method:

1. Grab the site using WinHTTrack, setting the options to store the site's files in /web (flattening the structure).

2. Create a temp directory in the saved-site directory.

3. Use this batch file (run from the saved-site directory) to convert all the text files to HTML:

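---------------------------------------------------------------
@echo off
rem Run from the saved-site directory. Converts each .txt under web
rem into temp, then renames calibre's index*.html after the source file.
for /R web %%i IN (*.txt) DO (
    "C:\Program Files\Calibre2\ebook-convert.exe" "%%i" temp
    ren "temp\index*.html" "%%~ni.html"
)
---------------------------------------------------------------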

4. Edit the site's index HTML file to change all the .txt links to .html, and save it to the <saved-site>/temp directory.

5. Copy any other HTML files from <saved-site>/web to <saved-site>/temp.

6. Drag the index file from <saved-site>/temp into calibre.
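
Steps 4 and 5 could also be scripted rather than done by hand. What follows is only a minimal Python sketch, not part of the method above. It assumes Python 3 is installed, that the site index is web/index.html, and that the links are ordinary quoted href attributes ending in .txt. The names `rewrite_txt_links` and `prepare_temp` are made up for illustration.

```python
import shutil
from pathlib import Path
import re

# Matches href="something.txt" or href='something.txt', in any letter case.
# Assumption: links are plain quoted href attributes with no #fragment.
TXT_LINK = re.compile(r'(href\s*=\s*["\'][^"\']*?)\.txt(["\'])', re.IGNORECASE)


def rewrite_txt_links(html: str) -> str:
    """Return html with every href that ends in .txt pointing at .html."""
    return TXT_LINK.sub(r"\1.html\2", html)


def prepare_temp(web_dir: Path, temp_dir: Path, index_name: str = "index.html") -> None:
    """Write the edited index into temp_dir and copy the other HTML files."""
    temp_dir.mkdir(parents=True, exist_ok=True)
    for src in web_dir.iterdir():
        if src.suffix.lower() not in {".html", ".htm"}:
            continue
        dest = temp_dir / src.name
        if src.name == index_name:
            # latin-1 maps every byte to one character, so the page's real
            # encoding does not need to be known to round-trip it.
            text = src.read_text(encoding="latin-1")
            dest.write_text(rewrite_txt_links(text), encoding="latin-1")
        elif not dest.exists():
            # Leave files already produced by the txt conversion alone.
            shutil.copy2(src, dest)
```

Run from the saved-site directory, the call would be `prepare_temp(Path("web"), Path("temp"))`, after the batch file has finished.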

How this works: when ebook-convert.exe is given a directory (i.e. <saved-site>/temp) as its output, it dumps the intermediate HTML from the .txt conversion there and stops. So the batch file first converts each *.txt into HTML. The output is normally index.html but sometimes index1.html. The next line in the batch file renames that output to the same name as the text file but with an .html extension. When the batch file finishes, calibre has done a default conversion of every .txt file to an HTML file of the same name, stored in the temp directory. It's then just a case of copying the other files, editing the site index file to point to .html rather than .txt, and dragging the index file into calibre.
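
For anyone not on Windows, the same convert-then-rename loop might look roughly like this in Python. It is a sketch under assumptions: Python 3.9 or later, `ebook-convert` on the PATH (otherwise substitute the full path, as in the batch file), and the directory-output behaviour described above. The function name and the `run` parameter are made up here. `run` exists only so the loop can be tried without calibre installed.

```python
import subprocess
from pathlib import Path
from typing import Callable, Sequence

# Assumption: ebook-convert is on PATH; otherwise put the full path here.
EBOOK_CONVERT = "ebook-convert"


def _run_checked(cmd: Sequence[str]) -> None:
    """Run a command and raise if it exits with an error."""
    subprocess.run(list(cmd), check=True)


def convert_txt_files(
    web_dir: Path,
    temp_dir: Path,
    run: Callable[[Sequence[str]], None] = _run_checked,
) -> list[Path]:
    """Convert each .txt in web_dir to temp_dir/<same name>.html."""
    temp_dir.mkdir(parents=True, exist_ok=True)
    results = []
    for txt in sorted(web_dir.glob("*.txt")):
        # Given a directory as output, ebook-convert leaves its HTML there.
        run([EBOOK_CONVERT, str(txt), str(temp_dir)])
        # The output is usually index.html but sometimes index1.html.
        produced = sorted(temp_dir.glob("index*.html"))
        if produced:
            target = temp_dir / f"{txt.stem}.html"
            # Unlike ren, replace() overwrites an existing target.
            produced[0].replace(target)
            results.append(target)
    return results
```

As with the batch file, a text file whose own name starts with "index" would confuse the rename step, so this is best treated as a starting point.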
Old 08-07-2013, 07:46 AM   #2
mrmikel
Book Twiddler
mrmikel ought to be getting tired of karma fortunes by now.
 
Posts: 1,984
Karma: 1405001
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
If you work with Sigil, you can start with a blank document, add in all the HTML documents, arrange them in the desired order, and copy and paste in the text documents from any text editor.

For the PDFs, you will need to convert them to HTML, or copy and paste them from a PDF viewer, using either its display-text function or the normal display mode. This will likely leave extra spaces, line breaks, or carriage returns that will need to be cleaned up.

If you are unlucky and the PDFs are image PDFs containing no text, you will have to process them with an Optical Character Recognition program, whose output will also need to be cleaned up. An error rate of only 2% means an error on every page.

This procedure avoids calibre adding in its own hard-to-understand tags and CSS.
Old 08-08-2013, 06:49 AM   #3
malc_b
Junior Member
malc_b began at the beginning.
 
Posts: 2
Karma: 10
Join Date: Jul 2013
Device: nook
Yes, but suppose you want to convert a website with a hundred text documents? Or more. Copy and paste gets pretty boring after the first 10.
Old 08-08-2013, 09:20 AM   #4
mrmikel
Book Twiddler
mrmikel ought to be getting tired of karma fortunes by now.
 
Posts: 1,984
Karma: 1405001
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
Try Easy Text To HTML Converter (http://www.easyhtools.com/download.html). It's freeware and in my brief test, it worked ok. It will convert text files in bulk.

Tags
convert, epub, txt, website