Old 08-25-2013, 04:58 PM   #1
MelBr
Zealot
Posts: 105
Karma: 414068
Join Date: Feb 2013
Device: iPad Pro, Kobo Aura One
Dealing with bad formatting: "broken" lines inside paragraphs?

Hi all,

A lot of TXT files that I find on Archive.org (for example) have broken paragraphs and are formatted for 74-character displays. When converting such files to EPUB, I always end up with bad formatting and the text just doesn't 'flow'.

Is there an easy way to fix this when importing/converting such files? I've gone through the options in the Heuristic Processing panel, but I'm not sure which checkbox needs to be checked.

Thanks!
Old 08-25-2013, 06:02 PM   #2
Adoby
Handy Elephant
Posts: 1,737
Karma: 26785684
Join Date: Dec 2009
Location: Southern Sweden, far out in the quiet woods
Device: Samsung Galaxy Tab S8 Ultra
Try adjusting the "Line un-wrap factor". It decides when a line is assumed to be part of an ongoing paragraph. Try, for instance, 0.5 or 0.7.

Start with everything checked, then test. Uncheck the options that you think cause trouble. Expect to do some experimenting, and perhaps even some manual editing in Sigil if you want perfect results and not just usable ones.
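
To get a feel for why the factor matters, here is a toy Python sketch of a length-based un-wrap. It is an illustration only and not calibre's actual algorithm: the function name `unwrap_by_length` is made up, and the rule (lines at or above a chosen percentile of line length are treated as continuing, shorter lines as ending a paragraph) is an assumption for the example.

```python
def unwrap_by_length(text: str, factor: float = 0.5) -> str:
    """Toy un-wrap: join lines that look like hard-wrapped continuations.

    Assumption for illustration: `factor` picks a percentile of the
    non-blank line lengths; lines at least that long are assumed to
    continue on the next line, shorter lines end the paragraph.
    """
    lines = [line.strip() for line in text.splitlines()]
    lengths = sorted(len(line) for line in lines if line)
    if not lengths:
        return text
    # Line length found at the requested percentile.
    index = min(int(len(lengths) * factor), len(lengths) - 1)
    threshold = lengths[index]

    paragraphs = []
    current = ""
    for line in lines:
        if not line:
            # A blank line always closes the paragraph in progress.
            if current:
                paragraphs.append(current)
                current = ""
            paragraphs.append("")
            continue
        current = f"{current} {line}" if current else line
        if len(line) < threshold:
            # A short line is taken as the last line of its paragraph.
            paragraphs.append(current)
            current = ""
    if current:
        paragraphs.append(current)
    return "\n".join(paragraphs)


sample = "\n".join([
    "This line is long enough to be a wrapped line of",
    "a paragraph that continues over several rows of",
    "plain text.",
    "A second paragraph starts here and also runs on",
    "for a while.",
])
print(unwrap_by_length(sample, 0.5))
```

Raising the factor in this sketch raises the threshold, so fewer lines get joined. That is the kind of trade-off you end up tuning by trial and error.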
Old 08-25-2013, 06:08 PM   #3
Rizla
Member Retired
Posts: 3,183
Karma: 11721895
Join Date: Nov 2010
Device: Nook STR (rooted) & Sony T2
The .txt files probably contain unwanted carriage returns / newlines. You need to edit them out using some kind of search-and-replace feature (e.g. regex). A lot of people do it in MS Word.
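
A regex search and replace of this kind could look like the Python sketch below. It assumes paragraphs are separated by blank lines and that every single newline is an unwanted hard wrap; the function name `join_single_newlines` is only illustrative.

```python
import re


def join_single_newlines(text: str) -> str:
    """Replace lone newlines with a space; keep blank-line paragraph breaks."""
    # Normalise Windows (CRLF) and old Mac (CR) line endings first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # A newline that is neither preceded nor followed by another newline
    # is assumed to be a hard wrap inside a paragraph.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse any runs of spaces left behind.
    return re.sub(r" {2,}", " ", text)


sample = "It was a dark\nand stormy night.\n\nThe rain fell\nin torrents."
print(join_single_newlines(sample))
```

Text whose paragraphs are not separated by blank lines would be merged into one long paragraph by this pattern, so check a sample of the file first.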
Old 08-25-2013, 06:36 PM   #4
BetterRed
null operator (he/him)
Posts: 21,715
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Fix the text file. I'd use Notepad++ on Windows, TextWrangler on Mac and SciTE on Linux. And I'd save the following as a macro so I have it next time.
  1. replace all double new lines (end of paragraph) with, say, '###'
  2. replace all single new lines with a space
  3. replace all double spaces with a single space
  4. replace the '###' with a new line

You can eyeball-check the result by enabling word wrap in the text editor, then widening and narrowing the window to make sure the text flows.

The file should now be good for conversion.
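
For anyone who prefers a script to an editor macro, the four steps above can be sketched in Python roughly as follows. The function name `unwrap_paragraphs` and the guard against a marker that already occurs in the text are additions for the example, not part of the original recipe.

```python
def unwrap_paragraphs(text: str, marker: str = "###") -> str:
    """Apply the four search-and-replace steps to a hard-wrapped text."""
    # Normalise line endings so "double new line" means exactly "\n\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    if marker in text:
        # The placeholder must not collide with real content.
        raise ValueError("marker already occurs in the text; pick another one")
    text = text.replace("\n\n", marker)  # 1. protect paragraph ends
    text = text.replace("\n", " ")       # 2. single new lines become spaces
    while "  " in text:                  # 3. repeat until no double spaces remain
        text = text.replace("  ", " ")
    return text.replace(marker, "\n")    # 4. restore paragraph ends


sample = "It was a dark\nand stormy night.\n\nThe rain fell\nin torrents."
print(unwrap_paragraphs(sample))
```

To run it on a file, read the text with `pathlib.Path("book.txt").read_text()` and write the result back with `write_text()`; the filename is a placeholder.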

BR
Old 08-25-2013, 06:50 PM   #5
MelBr
Zealot
Posts: 105
Karma: 414068
Join Date: Feb 2013
Device: iPad Pro, Kobo Aura One
THANK YOU guys! Will try all these solutions to see what works best for me. Thanks again.

Yes, soft wrapping just doesn't work well with these hard-formatted TXT files. (I even found a bunch of EPUB files that are also messed up because they were OCRed from PDFs, etc.)
Old 08-26-2013, 12:10 AM   #6
DoctorOhh
US Navy, Retired
Posts: 9,896
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Kindle PaperWhite SE 11th Gen
Quote:
Originally Posted by MelBr
A lot of TXT files that I find on Archive.org (for example), have broken paragraphs and are formatted for 74 character displays. When converting such files to epub, I always end up with bad formatting and the text just doesn't 'flow'.

is there an easy way to fix this when importing/converting such files? I've gone through options in Heuristic Processing panel but I'm not sure which checkbox needs to be checked.
During conversion, look at the options in the Text Input area. Instead of leaving Paragraph style on auto, select one of the other options and keep trying until you get the proper fit. Read this section of the manual for insight into the various options.
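
The same experiment can be run from the command line with calibre's `ebook-convert` tool. The Python sketch below only builds the commands. The option names (`--paragraph-type`, `--enable-heuristics`, `--html-unwrap-factor`) and the list of paragraph types are given as I recall them from calibre's command-line documentation, so treat them as assumptions and check them against `ebook-convert --help` for your version. `build_command` and the file names are hypothetical.

```python
import shlex

# Paragraph styles as I recall them from the TXT input options (assumption).
PARAGRAPH_TYPES = ["auto", "block", "single", "print", "unformatted", "off"]


def build_command(src, dst, paragraph_type="auto", unwrap_factor=None):
    """Return an ebook-convert argument list for one conversion attempt."""
    if paragraph_type not in PARAGRAPH_TYPES:
        raise ValueError(f"unknown paragraph type: {paragraph_type}")
    cmd = ["ebook-convert", src, dst, "--paragraph-type", paragraph_type]
    if unwrap_factor is not None:
        # Heuristic processing has to be enabled for the factor to apply.
        cmd += ["--enable-heuristics", "--html-unwrap-factor", str(unwrap_factor)]
    return cmd


# Print one command per paragraph style so the results can be compared.
for ptype in ("block", "print", "unformatted"):
    print(shlex.join(build_command("book.txt", f"book-{ptype}.epub", ptype)))
```

Each printed line can be pasted into a terminal, or the list can be passed to `subprocess.run` once calibre's command-line tools are on the PATH.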
All times are GMT -4.