Go Back   MobileRead Forums > E-Book Software > Calibre > Conversion

Old 01-05-2015, 10:35 AM   #1
phossler
Wizard
Posts: 1,087
Karma: 447222
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
Best way for .TXT to be edited?

I had a book so badly formatted that I had to go back to the raw TXT and start over.

That file was clean ASCII with CR-LFs separating paragraphs (screenshots #1 and #2).

I added it to Calibre and converted it to EPUB using the default TXT input preferences. I didn't see any knobs to turn that would help.

Some H2 and H3 assumptions were made, and the text was divided in unexpected places. For example (#3), the TOC text from the TXT file was mostly converted into the first 42 files, with 2 or 3 lines per file, while the bulk of the text ended up in the last 4 files. Since the TOC appeared twice in the ASCII text file, it was converted twice, driving up the number of 2-line files.

Q1 - Why were so many 2- or 3-line files created? What conversion logic decided on the H2s and H3s and the separate files?

Q2 - is there a better way to add and convert txt files?

Q3 - RegEx will clean or fix a lot. For example

<p>6</p> into <h1>Chapter 6</h1>

but it can still be a lot of fiddly work. Are there any options or plug-ins that might help?
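
For anyone wanting to try the Q3 substitution outside an editor, here is a minimal Python sketch of it. The function name and the sample text are made up for illustration, and it assumes that a paragraph containing nothing but a number is always a chapter marker, which will not hold for every book.

```python
import re

# Matches a paragraph whose only content is a number, e.g. <p>6</p>.
# Assumption: bare-number paragraphs are always chapter markers.
BARE_NUMBER_PARAGRAPH = re.compile(r"<p>\s*(\d+)\s*</p>")


def promote_chapter_numbers(html: str) -> str:
    """Rewrite <p>N</p> as <h1>Chapter N</h1>; other paragraphs are left alone."""
    return BARE_NUMBER_PARAGRAPH.sub(r"<h1>Chapter \1</h1>", html)


sample = "<p>6</p>\n<p>It was a dark night.</p>\n<p>In 1984 he left.</p>"
print(promote_chapter_numbers(sample))
```

Paragraphs that merely contain a number, such as the "In 1984" line, do not match, because the pattern requires the number to be the whole paragraph.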

Thanks
Attached Thumbnails: 1.JPG (37.2 KB), 2.JPG (80.8 KB), 3.JPG (107.8 KB)
Old 01-05-2015, 12:01 PM   #2
theducks
Well trained by Cats
Posts: 30,920
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Have you played with Preferences: Input Options: (TXT input): Structure? Auto is not always best.

Personally, I am in no big rush, so I use the EPUB editor (Sigil in my case, as I have dozens of saved searches) to do the various line 'Join' cleanups.
Then I do a spell-check pass to find gross damage (split words with gaps, leftover hyphens, lost hyphens).
Finally, I merge the (section) files once the basics have been smoothed out.
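
One part of that cleanup, hunting for leftover hyphens, can be sketched in Python. This is only an illustration, not the saved searches mentioned above: the pattern and function names are hypothetical, and it assumes that a hyphen followed by whitespace in running text is left over from a line-break hyphenation. Genuine compounds can match too, so the candidates are meant for manual review.

```python
import re

# Assumption: "letters, hyphen, whitespace, lowercase letters" is a word that
# was hyphenated across a line break, e.g. "conver- sion".
LEFTOVER_HYPHEN = re.compile(r"\b([A-Za-z]+)-\s+([a-z]+)\b")


def find_leftover_hyphens(text: str) -> list:
    """Return the rejoined form of each suspicious split word, for review."""
    return [first + second for first, second in LEFTOVER_HYPHEN.findall(text)]


def join_leftover_hyphens(text: str) -> str:
    """Rejoin every match; only safe after the candidates have been checked."""
    return LEFTOVER_HYPHEN.sub(r"\1\2", text)


sample = "The conver- sion went well - mostly."
print(find_leftover_hyphens(sample))
print(join_leftover_hyphens(sample))
```

A spaced dash such as "well - mostly" is not touched, because the pattern requires the hyphen to sit directly after the letters.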
Old 01-05-2015, 12:54 PM   #3
phossler
Ahh - much better

I used Paragraph style: Block and Formatting style: Plain and it gave MUCH cleaner outputs to use in the Editor.
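
The same two settings can probably be applied from the command line as well. The sketch below builds an `ebook-convert` call, on the assumption that the GUI's "Paragraph style" and "Formatting style" correspond to calibre's `--paragraph-type` and `--formatting-type` TXT input options. The file names and helper functions are placeholders.

```python
import subprocess


def build_convert_command(txt_path: str, epub_path: str) -> list:
    """Build an ebook-convert call using block paragraphs and plain formatting."""
    return [
        "ebook-convert",
        txt_path,
        epub_path,
        "--paragraph-type", "block",    # assumed equivalent of Paragraph style: Block
        "--formatting-type", "plain",   # assumed equivalent of Formatting style: Plain
    ]


def convert(txt_path: str, epub_path: str) -> None:
    """Run the conversion; needs calibre's command-line tools on PATH."""
    subprocess.run(build_convert_command(txt_path, epub_path), check=True)


# Print the command rather than running it, so the sketch works without calibre.
print(" ".join(build_convert_command("book.txt", "book.epub")))
```

Calling `convert("book.txt", "book.epub")` would run the conversion itself; the sketch only prints the command so the options can be checked first.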

A lot of the text can be easily deleted (if there are 2 or more TOC html's, then you know it's been Calibre-ized multiple times)
Old 01-05-2015, 01:32 PM   #4
JSWolf
Resident Curmudgeon
Posts: 79,291
Karma: 145488788
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Question...

Where did this badly formatted eBook come from? I have never seen an ePub or KF8 eBook from Penguin that was so badly formatted that you need to start with the raw text.
Old 01-05-2015, 02:03 PM   #5
phossler
Doing a favor for a friend. I believe that the original file (probably an epub) was worked on by him and probably others.

I just decided to start with a clean slate instead of trying to un-do or correct things

I decided that using TXT was as barebones as I could get
Old 01-05-2015, 02:18 PM   #6
JSWolf
What I'm really asking is did your friend buy this eBook?
Old 01-05-2015, 02:38 PM   #7
eschwartz
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Quote:
Originally Posted by JSWolf View Post
What I'm really asking is did your friend buy this eBook?
There is no need to answer this question, considering its entire purpose is to stir up trouble without having any right to have an opinion.

Your needless suspicion is irritating.
Old 01-05-2015, 03:46 PM   #8
JSWolf
Quote:
Originally Posted by eschwartz View Post
There is no need to answer this question, considering its entire purpose is to stir up trouble without having any right to have an opinion.

Your needless suspicion is irritating.
This sounds dodgy is why I ask.
Old 01-05-2015, 04:32 PM   #9
eschwartz
Quote:
Originally Posted by JSWolf View Post
This sounds dodgy is why I ask.
Everything sounds dodgy to you, that's the problem.
Old 01-05-2015, 04:40 PM   #10
theducks
Quote:
Originally Posted by JSWolf View Post
This sounds dodgy is why I ask.
What it sounds like is: reconverting a converted file mess.
IIRC you frequently recommend avoiding that. (I agree. I also archive the Gold -original so I can work with the true original.)
Old 01-05-2015, 06:14 PM   #11
JSWolf
Quote:
Originally Posted by theducks View Post
What it sounds like is: reconverting a converted file mess.
IIRC you frequently recommend avoiding that. (I agree. I also archive the Gold -original so I can work with the true original.)
I would not have been suspicious if it was said that this eBook came from his friend. But it was said to have come from the friend and others. So that to me sounds like not paid for.

If I am wrong, then my advice is to go back to the original file and work from that. It will be a lot easier to work from the original eBook file.
Old 01-05-2015, 06:16 PM   #12
JSWolf
Quote:
Originally Posted by eschwartz View Post
Everything sounds dodgy to you, that's the problem.
That's because I read what's actually being said and if it sounds dodgy, I'll ask just in case.
Old 01-05-2015, 06:33 PM   #13
eschwartz
Quote:
Originally Posted by JSWolf View Post
That's because I read what's actually being said and if it sounds dodgy, I'll ask just in case.
Like I said, you are overly concerned where you need not and should not be.

The OP has every right to say "I don't care about your question" and you will have zero grounds for accusation.

The potential for a case to be a case of piracy is not valid grounds for concern.

Thus you are prying.

You are assigning extra dodginess to a case that of itself has both legal and illegal explanations, with no way of knowing which one it is, and your MO in such cases is that people are assumed guilty until proven innocent.

Let it rest.