05-06-2010, 11:28 PM   #1
FatDog
Witless protection Agent
Posts: 274
Karma: 1002898
Join Date: Nov 2009
Location: Los Angeles
Device: Kindle
Ahhhhh - Utility overload: BookDesigner, BookCreator, Textify, txt2lrf...too much

(I'm feeling a bit overwhelmed.)

What software (other than Calibre) would you guys use to reformat .txt files and put them into a 'universal' format for ebook readers?

Let me explain...

Nerd cred: I am a software engineer with years of Perl/Unix experience. I'm currently using a Mac to develop Java (and struggling with all the supporting technologies that have sprung up, but I digress).

Since the days when 9600 baud modems were $500, I have been randomly collecting text-based stories from BBSes, Usenet, fan fiction sites, etc. Think lots of 5-25 meg .txt files.

As a weekend project I have created some Perl modules that clean the files up, remove spam, and try to tease out titles, authors, chapters, keycodes, etc.

Then I use multi-edit (a programmer's editor with a powerful C-like macro language) to run through the files and re-paginate the paragraphs, find rows that end in hyphens and join them, remove more random crap and basically remove formatting.

Then more Perl scripts sort the stories by Title/Author and perhaps de-dup them.
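To make that re-pagination step concrete, here is a rough Python sketch of the idea. It is not the actual multi-edit macro: the function name `reflow` and the rule that blank lines separate paragraphs are assumptions made for illustration.

```python
import re


def reflow(text: str) -> str:
    """Join hard-wrapped lines into one line per paragraph.

    Assumption: paragraphs are separated by one or more blank lines.
    """
    paragraphs = re.split(r"\n\s*\n", text.strip())
    out = []
    for para in paragraphs:
        # Drop empty lines and surrounding whitespace inside the paragraph.
        lines = [ln.strip() for ln in para.splitlines() if ln.strip()]
        joined = ""
        for ln in lines:
            ends_in_word_hyphen = (
                len(joined) > 1
                and joined.endswith("-")
                and joined[-2].isalpha()
            )
            if ends_in_word_hyphen and ln[:1].islower():
                # Word split across a line break: drop the hyphen and join.
                joined = joined[:-1] + ln
            elif joined:
                joined += " " + ln
            else:
                joined = ln
        out.append(joined)
    return "\n\n".join(out)
```

One known weakness of this heuristic: a genuinely hyphenated compound that happens to break at the hyphen ("well-" / "known") gets glued into "wellknown", so the output still needs a spot check.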
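The sort and de-dup pass could look something like the following Python sketch. The `Story` record and both function names are hypothetical, and hashing the whitespace-normalized body is just one plausible way to spot duplicates; it is not necessarily what the original scripts do.

```python
import hashlib
import re
from dataclasses import dataclass


@dataclass
class Story:
    # Hypothetical record; the real scripts may track more fields.
    title: str
    author: str
    body: str


def fingerprint(body: str) -> str:
    """Hash the body after collapsing whitespace and case,
    so re-wrapped copies of the same story hash identically."""
    normalized = re.sub(r"\s+", " ", body).strip().lower()
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def sort_and_dedup(stories):
    """Sort by (title, author) and keep the first copy of each body."""
    seen = set()
    unique = []
    ordered = sorted(stories, key=lambda s: (s.title.lower(), s.author.lower()))
    for story in ordered:
        fp = fingerprint(story.body)
        if fp not in seen:
            seen.add(fp)
            unique.append(story)
    return unique
```

This only catches copies that are identical apart from whitespace and case. Copies with different headers, signatures, or spam lines would need a fuzzier comparison.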

For a while I studied XHTML and tried to find a standard DOCTYPE to use tags to mark things up. I looked at Project Gutenberg's styles and liked the Epic-Book-Chapter concepts but it was a very fat schema. I created a few of my own tags and used CSS to control the formatting. It worked on a story-by-story basis but load a 5 meg file with 150 stories into a browser to check things and it gets very, very slow. And I did not realize how important a table of contents was.

(But I had a lot of fun learning).

So my collection builds with no organization or focus.

Last week (thanks to the enthusiasm in the Sony section of this forum) I bid on several and finally won a Sony 505 on eBay.

In anticipation of the new arrival, I decided to try and re-format some of the 33 megs of .txt files on my hard drive into some semblance of order.


I LOVE all the utilities that people here have written. But many of the postings are 3 years old, which means either the tools are perfect/tried-and-true, or other tools have supplanted them.

I LOVE Calibre and it will probably be my main management software.

So what is the best way to convert my .txt files into .lrf?

I downloaded Book Designer 4.0 and pasted in a few stories. It seemed to give me total control over titles, subtitles, authors, but I had to multi-select lines to create paragraphs. This could take hours.

I tried Textify, which seemed to join lines into single rows. That would solve this problem, but it also joined lines that did not belong together.

I opened a 7 meg text file in Book Designer and to my surprise it seemed to correctly join rows into paragraphs. But when I started to run through the file to mark Title/Author, Author-Text, it crashed.

Am I frustrated? Hell no. This is fun.

OK, let's assume I take my ... Buffy fan-fiction and break the stories into .txt files by author. There are multiple chapters to one story, and a bunch of short stories. Can I put these all into 1 ebook/.lrf file or am I fighting the model with multiple books in 1? What tools should I use to pre-process the file before Book Designer?