11-15-2008, 11:10 PM | #1 |
Wizard
Posts: 2,409
Karma: 4132096
Join Date: Sep 2008
Device: Kindle Paperwhite/iOS Kindle App
|
Tell me there is an easier way!
I am looking for an easier way to convert a large batch of plain text ebooks that my sister sent me. There are lots of messy files with many paragraph breaks, or no paragraph breaks and other such issues. I was importing them into Open Office and manually going through to remove the breaks (using the CF option when I import them as some had no breaks at all) but still getting a lot of garbage in them once I converted them into pdb files to load on the ipod.
I finally figured out a way to get the books to appear in a satisfactory way in the finished file, but it is very labour-intensive, using Neo Office and Kompozer, which is an HTML program:
1) Open it in Neo Office and if it gives the option, say 'cf' only
2) Manually scan the document for large gaps and remove them
3) Save the file as a plain text document
4) Re-open the file
5) Select-all and copy
6) Paste it into a new window in Kompozer
7) In Kompozer, Select-all and copy
8) Paste this into a new Neo Office document
9) Save this as a Word file
10) Use conversion program to convert Word to PDB
Isn't there an easier way? All I want is regular old text, one line break between paragraphs, nothing fancy. It seems though that depending on the program originally used to make the text file, there are tabs or special characters used to indicate the line breaks, and I don't see them in Neo Office, but I do once the file is converted. It seems the only way to get "clean" text is to paste it into a web page program, which generates proper paragraph breaks where the line breaks are, and then when I paste that back into Neo Office, everything is fine. But this whole process can take upwards of 15 minutes per book!
I am on a Mac here and I don't have MS software on it. I have Neo Office and Pages. I am willing to buy a new program if need be, but as I am on a Mac, I suspect my options may be limited. Advice? |
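For readers comfortable with a little scripting, the cleanup described above (unify line endings, get rid of stray tabs, collapse large gaps) can be sketched in Python, which ships with Mac OS X. This is only a rough sketch, not something from the thread: the function name `normalize_paragraphs` is made up for illustration, and it assumes "one line break between paragraphs" means a single blank line between them.

```python
import re


def normalize_paragraphs(text: str) -> str:
    """Return text with exactly one blank line between paragraphs.

    Hypothetical helper; assumes a blank line is the paragraph separator.
    """
    # Unify Windows (CR+LF) and classic Mac (CR) line endings to LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Turn tabs and non-breaking spaces into ordinary spaces.
    text = text.replace("\t", " ").replace("\u00a0", " ")
    # Drop trailing spaces so lines that look blank really are empty.
    text = re.sub(r" +\n", "\n", text)
    # Collapse any run of blank lines into a single paragraph break.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip() + "\n"


if __name__ == "__main__":
    messy = "One.\r\n\r\n\r\n\r\nTwo.\t\r\n"
    print(repr(normalize_paragraphs(messy)))  # 'One.\n\nTwo.\n'
```

The invisible characters that only show up after conversion are most likely the carriage returns, tabs and non-breaking spaces handled in the first two steps, though that is a guess about these particular files.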
11-16-2008, 03:52 AM | #2 | |
Guru
Posts: 610
Karma: 4150
Join Date: Mar 2008
Device: Sony Reader PRS-T3, Kobo Libra H2O
|
Quote:
(It might be a good idea to do your editing in such an editor even if you decide learning regexps is too much work to be worth it - these editors tend to be much better in displaying special characters than your average office application.) |
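As an illustration of the kind of regexp meant here, the sketch below rejoins lines that were hard-wrapped in the middle of a paragraph while leaving blank-line paragraph breaks alone. It is written in Python rather than in any particular editor's search-and-replace dialog, the name `unwrap_lines` is invented for the example, and it assumes paragraphs are separated by at least one blank line.

```python
import re


def unwrap_lines(text: str) -> str:
    """Join hard-wrapped lines inside a paragraph into one line.

    Hypothetical helper; assumes blank lines mark paragraph breaks.
    """
    # Match a newline that is not next to another newline and is followed
    # by more text: that is a wrap inside a paragraph, so use a space.
    return re.sub(r"(?<!\n)\n(?=[^\n])", " ", text)


if __name__ == "__main__":
    sample = "The quick\nbrown fox.\n\nNext para\ncontinues.\n"
    print(repr(unwrap_lines(sample)))
    # 'The quick brown fox.\n\nNext para continues.\n'
```

Most regexp-capable editors accept a similar pattern in their find-and-replace box, although lookbehind support varies from editor to editor.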
|
11-16-2008, 07:29 AM | #3 |
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
What is the original source of the material? If you download free books from sources like PG, they should be reasonably well-formatted.
|
11-16-2008, 09:41 AM | #4 | |
Grand Sorcerer
Posts: 11,248
Karma: 35000000
Join Date: Jan 2008
Device: Pocketbook
|
Quote:
|
|
11-24-2008, 01:13 PM | #5 | |
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköping, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
|
Quote:
|
|
11-25-2008, 06:21 PM | #6 |
Boo-Frickety-Hoo-Erizer
Posts: 251
Karma: 686
Join Date: Oct 2007
Device: Kobo Glo HD!
|
Get yer shoppin' shoes on.
Go get TextSpresso from http://www.taylor-design.com/textspresso/overview.htm Yes, it's for PC or Mac, too. $25. With this, you can batch convert to remove html garbage, bad line feeds, recombine sentences, and be left with reasonably formatted text. Small handful of button pushes and yer done. -bjc |
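For anyone who would rather script the batch step than buy a tool, here is a minimal Python sketch of the same idea: run one cleanup function over every .txt file in a folder and write the results to a second folder. It has nothing to do with how TextSpresso works internally; `clean_text`, `clean_folder` and the folder names are all hypothetical, and the files are assumed to be UTF-8 (bytes that are not get replaced rather than stopping the run).

```python
import re
from pathlib import Path


def clean_text(text: str) -> str:
    """Unify line endings and collapse runs of blank lines (hypothetical)."""
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return re.sub(r"\n{3,}", "\n\n", text).strip() + "\n"


def clean_folder(src_dir: str, dst_dir: str) -> int:
    """Write a cleaned copy of every .txt file in src_dir into dst_dir.

    Returns the number of files processed. Originals are left untouched.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    count = 0
    for path in sorted(src.glob("*.txt")):
        # Assumption: UTF-8 input; undecodable bytes become U+FFFD.
        raw = path.read_text(encoding="utf-8", errors="replace")
        (dst / path.name).write_text(clean_text(raw), encoding="utf-8")
        count += 1
    return count


if __name__ == "__main__":
    # "books" and "books_clean" are placeholder folder names.
    print(clean_folder("books", "books_clean"), "files cleaned")
```

Writing to a separate folder keeps the original files intact, so a bad cleanup rule can be fixed and the batch simply run again.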
|