|11-15-2008, 11:10 PM||#1|
Join Date: Sep 2008
Device: Kobo Glo, iPad/iPhone
Tell me there is an easier way!
I am looking for an easier way to convert a large batch of plain-text ebooks that my sister sent me. Many of the files are messy: some have far too many paragraph breaks, others have none at all, and there are other such issues. I was importing them into Open Office and manually going through to remove the breaks (using the CF option on import, as some had no breaks at all), but I was still getting a lot of garbage in them once I converted them into pdb files to load on the iPod.
I finally figured out a way to get the books to appear in a satisfactory way in the finished file, but it is very labour-intensive. It uses Neo Office and Kompozer, which is an HTML editor:
1) Open the file in Neo Office and, if it gives the option, choose 'cf' only
2) Manually scan the document for large gaps and remove them
3) Save the file as a plain text document
4) Re-open the file
5) Select-all and copy
6) Paste it into a new window in Kompozer
7) In Kompozer, Select-all and copy
8) Paste this into a new Neo Office document
9) Save this as a Word file
10) Use a conversion program to convert Word to PDB
Isn't there an easier way? All I want is regular old text, one line break between paragraphs, nothing fancy. It seems, though, that depending on the program originally used to make the text file, tabs or special characters are used to indicate the line breaks. I don't see them in Neo Office, but I do once the file is converted. It seems the only way to get "clean" text is to paste it into a web page program, which generates proper paragraph breaks where the line breaks are; when I then paste that back into Neo Office, everything is fine. But this whole process can take upwards of 15 minutes per book!
I am on a Mac here and I don't have MS software on it. I have Neo Office and Pages. I am willing to buy a new program if need be, but as I am on a Mac, I suspect my options may be limited.
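For reference, the "one line break between paragraphs" cleanup described above can be sketched as a short Python script, which runs on a Mac without any extra software. This is only a minimal sketch under stated assumptions: the function name `normalize_paragraphs` and its `separator` parameter are made up for illustration, and it assumes paragraphs in the input are separated by at least one blank (or whitespace-only) line.

```python
def normalize_paragraphs(text: str, separator: str = "\n") -> str:
    """Return text with one `separator` between paragraphs.

    Assumption: a paragraph is a run of non-blank lines; blank or
    whitespace-only lines mark the paragraph boundaries.
    """
    # Unify Windows (CRLF) and classic Mac (CR) line endings to LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    paragraphs = []
    current = []
    for line in text.split("\n"):
        line = line.strip()  # drops leading tabs and trailing spaces
        if line:
            current.append(line)
        elif current:
            # A blank line closes the paragraph collected so far.
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))

    return separator.join(paragraphs) + "\n"


if __name__ == "__main__":
    sample = "Line one\r\nline two\r\n\r\n\r\n\r\n\tSecond para\r\n"
    print(normalize_paragraphs(sample), end="")
```

Note the limitation: a file with no blank lines at all would come out as one single paragraph, so files of that kind still need a different rule (or a manual pass) to find the paragraph boundaries.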
|11-16-2008, 03:52 AM||#2|
Join Date: Mar 2008
Device: Sony Reader PRS-505
(It might be a good idea to do your editing in such an editor even if you decide learning regexps is too much work to be worth it - these editors tend to be much better in displaying special characters than your average office application.)
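If no such editor is at hand, a few lines of Python can also reveal which hidden characters a file contains. The sketch below is an illustration only; the function name `invisible_characters` is hypothetical. It counts control characters, format characters, and non-standard spaces by their Unicode code point.

```python
import unicodedata
from collections import Counter


def invisible_characters(text: str) -> Counter:
    """Count characters that most word processors do not display."""
    counts = Counter()
    for ch in text:
        category = unicodedata.category(ch)
        # Cc: control characters (tab, CR, LF, form feed, ...)
        # Cf: format characters (soft hyphen, zero-width space, ...)
        # Zs other than " ": unusual spaces such as the no-break space
        if category in ("Cc", "Cf") or (category == "Zs" and ch != " "):
            counts[f"U+{ord(ch):04X}"] += 1
    return counts


if __name__ == "__main__":
    sample = "a\tb\r\nc\u00a0d"
    for code_point, n in sorted(invisible_characters(sample).items()):
        print(code_point, n)
```

A file full of `U+000D` entries, for example, was saved with CR or CRLF line endings, and `U+0009` entries are tabs.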
|11-16-2008, 07:29 AM||#3|
Join Date: Nov 2006
Device: Kindle PW2, iPad Retina Mini, iPhone 4, MS Surface Pro
What is the original source of the material? If you download free books from sources like Project Gutenberg (PG), they should be reasonably well-formatted.
|11-16-2008, 09:41 AM||#4|
Gentleman & Cynic
Join Date: Jan 2008
Location: 5 generation native Texan
Device: BeBook/Openinkpot, CYbook 3rd gen awaiting RTF software upgrade
|11-24-2008, 01:13 PM||#5|
Join Date: Oct 2007
Location: Linköping, Sweden
Device: Nexus 7, Nexus 4, iPad 2, Notion Ink Adam Qi, Kindle WiFi, Kindle PW
|11-25-2008, 06:21 PM||#6|
Join Date: Oct 2007
Device: SONY PRS 350!
Get yer shoppin' shoes on.
Go get TextSpresso from http://www.taylor-design.com/textspresso/overview.htm
Yes, it's for PC or Mac, too. $25.
With this, you can batch convert to remove HTML garbage, fix bad line feeds, recombine sentences, and be left with reasonably formatted text. Small handful of button pushes and yer done.
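For anyone who would rather script it, the same kind of batch cleanup can be roughly approximated in Python. To be clear, this is not how TextSpresso works internally; it is a rough sketch, and the names `clean` and `clean_folder` are invented for the example. It assumes the books are `.txt` files in one folder, that they are UTF-8 (undecodable bytes are replaced), and that paragraphs are separated by blank lines.

```python
import html
import re
from pathlib import Path

TAG = re.compile(r"<[^>]+>")            # crude match for leftover HTML tags
PARAGRAPH_BREAK = re.compile(r"\n\s*\n")  # one or more blank lines


def clean(text: str) -> str:
    """Strip tags, decode entities, and rejoin hard-wrapped lines."""
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Remove tags first, then decode entities such as &amp;
    text = html.unescape(TAG.sub("", text))
    paragraphs = PARAGRAPH_BREAK.split(text)
    # split()/join collapses wrapped lines and stray tabs into single spaces
    joined = (" ".join(p.split()) for p in paragraphs)
    return "\n".join(p for p in joined if p) + "\n"


def clean_folder(src: Path, dst: Path, encoding: str = "utf-8") -> list:
    """Clean every .txt file in src and write the result into dst."""
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(src.glob("*.txt")):
        raw = path.read_text(encoding=encoding, errors="replace")
        out = dst / path.name
        out.write_text(clean(raw), encoding="utf-8")
        written.append(out)
    return written


if __name__ == "__main__":
    # Hypothetical folder names; adjust to wherever the books live.
    for out in clean_folder(Path("books"), Path("books_clean")):
        print("wrote", out)
```

The originals are left untouched and the cleaned copies go to a separate folder, so a bad result on one book costs nothing.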