![]() |
#1 |
Wizard
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 1,087
Karma: 447222
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
|
Best way for .TXT to be edited?
I had a terribly badly formatted book that I had to go back to the raw TXT and try to start over
That file was clean ASCII with CR-LF's separating paragraphs (#1 and #2) I added it to Calibre and converted to epub using the default TXT input preferences. I didn't see any knobs to turn that would help. There were some H2 and H3 assumptions made and the text was divided in unexpected places. For example (#3) the TOC text from the TXT file was pretty much converted to the first 42 files with 2 or 3 lines per file. The bulk of the text was in the last 4 files. Since the TOC was in the ASCII text file two time it was converted 2 times, driving up the number of 2 line files. Q1 - why were there so many 2 or 3 line files created? What is the conversion logic that decided H2 and H3's and separate file? Q2 - is there a better way to add and convert txt files? Q3 - RegEx will clean or fix a lot. For example <p>6</p> into <h1>Chapter 6</h1> but it can still be a lot of fiddly work. Are there any options or plug ins that might help? Thanks |
![]() |
![]() |
![]() |
#2 | |
Well trained by Cats
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 30,920
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
![]() Personally, I am in no big rush so I use the EPUB Editor (Sigil in my case as I have dozens of saved searches) to do line various 'Join" cleanup. Then I do a spell check pass to find gross damage (gap (split) words or still hyphens,lost hyphens) And (section)file merge when the basics have been smoothed. |
|
![]() |
![]() |
Advert | |
|
![]() |
#3 |
Wizard
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 1,087
Karma: 447222
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
|
Ahh - much better
I used Paragraph style: Block and Formatting style: Plain and it gave MUCH cleaner outputs to use in the Editor. A lot of the text can be easily deleted (if there are 2 or more TOC html's, then you know it's been Calibre-ized multiple times) |
![]() |
![]() |
![]() |
#4 |
Resident Curmudgeon
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 79,291
Karma: 145488788
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Question...
Where did this badly formatted eBook come from? I have never seen an ePub or KF8 eBook from Penguin that was so badly formatted that you need to start with the raw text. |
![]() |
![]() |
![]() |
#5 |
Wizard
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 1,087
Karma: 447222
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
|
Doing a favor for a friend. I believe that the original file (probably an epub) was worked on by him and probably others.
I just decided to start with a clean slate instead of trying to un-do or correct things I decided that using TXT was as barebones as I could get |
![]() |
![]() |
Advert | |
|
![]() |
#6 |
Resident Curmudgeon
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 79,291
Karma: 145488788
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
What I'm really asking is did your friend buy this eBook?
|
![]() |
![]() |
![]() |
#7 |
Ex-Helpdesk Junkie
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
|
![]() |
![]() |
![]() |
#8 |
Resident Curmudgeon
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 79,291
Karma: 145488788
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
|
![]() |
![]() |
![]() |
#9 |
Ex-Helpdesk Junkie
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
|
![]() |
![]() |
![]() |
#10 |
Well trained by Cats
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 30,920
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
|
![]() |
![]() |
![]() |
#11 | |
Resident Curmudgeon
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 79,291
Karma: 145488788
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Quote:
If I am wrong, then my advice is to go back to the original file and work from that. It will be a lot easier to work from the original eBook file. |
|
![]() |
![]() |
![]() |
#12 |
Resident Curmudgeon
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 79,291
Karma: 145488788
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
|
![]() |
![]() |
![]() |
#13 | |
Ex-Helpdesk Junkie
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
Quote:
The OP has every right to say " ![]() The potential for a case to be a case of piracy is not valid grounds for concern. Thus you are prying. You are assigning extra dodginess to a case that of itself has both legal and illegal explanations, with no way of knowing which one it is, and your MO in such cases is that people are assumed guilty until proven innocent. Let it rest. |
|
![]() |
![]() |
![]() |
Thread Tools | Search this Thread |
|
![]() |
||||
Thread | Thread Starter | Forum | Replies | Last Post |
Slow txt to mobi convertion, performance at o(n^2) as lines of txt grow? | forceps | Calibre | 6 | 11-26-2012 11:47 AM |
Lost Edited Metadata | roxy62 | Library Management | 1 | 02-23-2011 09:56 PM |
epub edited - now it's messed up | NASCARaddicted | ePub | 13 | 08-19-2010 12:04 PM |
Unutterably Silly The last edited notification | ShortNCuddlyAm | Lounge | 1 | 03-21-2010 10:54 PM |