Old 02-01-2009, 09:25 AM   #1
daesdaemar
Addict
Posts: 210
Karma: 4282
Join Date: Oct 2008
Location: Florida
Device: Sony 505, Kindle 3, iPad 3
Line formatting question

We are all aware of those pesky text documents that have a hard return/line break at the end of each line. Fortunately, there are usually two line breaks at the end of a paragraph so it is relatively easy to format out the line breaks at the end of each line and yet keep the paragraph structure.

Here's my problem:

I have a book in lit format that I converted to lrf in Calibre and the line breaks were scattered all over the place and it looked awful. I then converted the lit file with ConvertLit to html and then to text/rtf. There is a hard line break at the end of every single line, and there are not two breaks at the end of paragraphs. If I remove the line breaks I have a 400 page document with no structure at all.

I have also tried to convert in BookDesigner with the same problem -- no paragraph structure.

Any ideas?
Old 02-01-2009, 02:11 PM   #2
kovidgoyal
creator of calibre
Posts: 43,771
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
The algorithm to use in such a case is based on average line length. First calculate the average line length, and wherever a line is significantly shorter than that, don't remove its break. That will take care of most of the breaks correctly.
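A minimal Python sketch of this heuristic, for illustration only (it is not calibre's implementation). The function name `rejoin_by_line_length` and the 0.8 cutoff for "significantly shorter" are assumptions; the cutoff would likely need tuning per book.

```python
from statistics import mean


def rejoin_by_line_length(text: str, ratio: float = 0.8) -> str:
    """Join hard-wrapped lines, keeping the break after unusually short lines."""
    lines = [line.strip() for line in text.splitlines()]
    lines = [line for line in lines if line]  # ignore blank lines
    if not lines:
        return ""

    # Assumption: "significantly shorter" means below ratio * average length.
    threshold = ratio * mean(len(line) for line in lines)

    paragraphs, current = [], []
    for line in lines:
        current.append(line)
        if len(line) < threshold:  # short line: probably the end of a paragraph
            paragraphs.append(" ".join(current))
            current = []
    if current:  # flush a final paragraph that ended on a full-width line
        paragraphs.append(" ".join(current))

    # Separate the recovered paragraphs with an empty line.
    return "\n\n".join(paragraphs)
```

A paragraph whose last line happens to fill the full width will be merged with the next one, so the result still needs a quick manual check.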
Old 02-01-2009, 02:17 PM   #3
daesdaemar
Addict
Quote:
Originally Posted by kovidgoyal View Post
The algorithm to use in such a case is based on average line length. First calculate the average line length, and wherever a line is significantly shorter than that, don't remove its break. That will take care of most of the breaks correctly.
I will try working on that... thanks.
Old 02-01-2009, 02:52 PM   #4
kacir
Wizard
Posts: 3,447
Karma: 10484861
Join Date: May 2006
Device: PocketBook 360, before it was Sony Reader, cassiopeia A-20
For such books I use a Vim script
(www.vim.org - Vim is a very powerful text editor)

You can write a command in Vim saying
"find every line NOT ending with a dot, question mark, exclamation point or closing quote (optionally followed by whitespace) and join it with the next line"
:vglobal/[.!?"']\s*$/join
I often abbreviate the above command this way:
:v/[.!?"']\s*$/j
That is it.
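For anyone who would rather not use Vim, the rule quoted above can be sketched in Python. This is an illustration of the rule as stated, not a line-for-line port of the Vim command; the function name `join_unfinished_lines` is made up for the example, and skipping blank lines is an assumption.

```python
import re

# Same character class as the Vim pattern: . ! ? " or ' followed by
# optional trailing whitespace at the end of the line.
SENTENCE_END = re.compile(r"""[.!?"']\s*$""")


def join_unfinished_lines(text: str) -> str:
    """Join each line that does not end a sentence with the line(s) after it."""
    out, pending = [], []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # assumption: blank lines carry no structure here
        pending.append(stripped)
        if SENTENCE_END.search(stripped):
            # The line ends with sentence punctuation: close the joined line.
            out.append(" ".join(pending))
            pending = []
    if pending:  # text ended in the middle of a sentence
        out.append(" ".join(pending))
    return "\n".join(out)
```

The sketch keeps appending lines until one ends with sentence punctuation, so every output line ends on a sentence boundary (except possibly the last).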

You can also say:
"find every line ending with . ! ? or a closing quote and insert an empty line after it"
"find every line shorter than (let's say) 50 characters and insert an empty line after it"
"find two empty lines and replace them with one empty line"
"Join paragraphs"
"delete empty lines"
That should take care of 99 percent of the excess newline characters.
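The steps listed above could be strung together roughly as follows. This is a Python sketch rather than Vim script; the function name `reflow` is hypothetical, and the 50-character limit is just the "let's say" value from the list.

```python
import re


def reflow(text: str, short: int = 50) -> str:
    """Mark paragraph ends, collapse empty lines, then join each paragraph."""
    marked = []
    for line in (raw.rstrip() for raw in text.splitlines()):
        marked.append(line)
        # Steps 1 and 2: insert an empty line after a line that ends with
        # sentence punctuation or is shorter than `short` characters.
        if line and (re.search(r"""[.!?"']$""", line) or len(line) < short):
            marked.append("")
    text = "\n".join(marked)

    # Step 3: collapse runs of empty lines into a single empty line.
    text = re.sub(r"\n{3,}", "\n\n", text)

    # Step 4: join the lines inside each paragraph with single spaces.
    paragraphs = [" ".join(p.split("\n")).strip() for p in text.split("\n\n")]

    # Step 5: drop the empty lines, leaving one paragraph per line.
    return "\n".join(p for p in paragraphs if p)
```

Note that a full-width line that happens to end with a period in the middle of a paragraph will still be treated as a paragraph end, which is one reason the steps need tweaking per book.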

You have to tweak the above steps for a particular book, because every single misformatted book is unique.

You can also look at the HTML file and try to distinguish between wanted and unwanted line breaks. Unfortunately, the HTML file is most often generated by MSWord, and MSWord is THE most horrible tool for producing HTML.
You can also try processing the HTML file with the program HTML Tidy: http://www.w3.org/People/Raggett/tidy/
Old 02-01-2009, 03:37 PM   #5
JSWolf
Resident Curmudgeon
Posts: 73,645
Karma: 127837858
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
This does not sound like it is a legal LIT file. Where did this LIT come from?
Old 02-01-2009, 05:00 PM   #6
daesdaemar
Addict
@kacir... very interesting. I will need to look into this program.

@JSWolf... this is actually a lit book that I legally purchased several years ago. I am trying to format it for my Sony 505. I have actually run across several lit books that do not convert easily.
Old 02-01-2009, 05:06 PM   #7
kovidgoyal
creator of calibre
These are LIT books that embed TXT files instead of HTML files.
Old 02-01-2009, 05:27 PM   #8
daesdaemar
Addict
Quote:
Originally Posted by kovidgoyal View Post
These are LIT books that embed TXT files instead of HTML files.
Based on my limited knowledge, that would explain some of the problems I've had with lit books.
Old 02-02-2009, 03:04 AM   #9
kacir
Wizard
Quote:
Originally Posted by daesdaemar View Post
@kacir... very interesting. I will need to look into this program.
If you need any help to get started with Vim, do not hesitate to ask.
Old 02-06-2009, 11:47 AM   #10
JSWolf
Resident Curmudgeon
With the wonky LIT, use lit2oeb to convert the LIT into its component parts. Then you can fix it up so it converts the way you want.