Old 11-11-2010, 04:20 AM   #1
GreenMonkey
DRM hater
Posts: 945
Karma: 2066176
Join Date: Jun 2010
Location: Michigan
Device: Nook ST glow, Kindle Voyage
RTF to EPUB...extra line breaks

Hey guys

I've noticed this on a lot of RTF files and I haven't been able to figure it out.

When I convert to EPUB, I get extra line breaks after each paragraph that weren't in the original RTF.

¶ here stands for Paragraph break in the original RTF:

Like this in the RTF:

This is a paragraph¶
This is paragraph two.

Comes out:

This is a paragraph

This is paragraph two.



If I ask Calibre to "Remove spacing between paragraphs" under Look & Feel...it removes them. But it removes ALL breaks, then...including the ones that ARE supposed to be there.

So if I have this in the RTF:
This is a paragraph¶

This is paragraph two.

I get:

This is a paragraph.
This is paragraph two.


Any fix for this kind of behavior? I searched the forum...tried tweaking the Xpath detection stuff with a \ (even though RTF isn't HTML related). I did open the doc and turn on all formatting characters to make sure there weren't any extra breaks. Nope. Just (p).

I seem to either get: spaces between all paragraphs, or, no space ever. What I don't get is why I'm getting blank lines between paragraphs at all - they aren't present in the original RTF.

Last edited by GreenMonkey; 11-11-2010 at 05:30 AM.
Old 11-11-2010, 05:19 AM   #2
itimpi
Wizard
Posts: 4,552
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
If you look at the HTML that is generated, I very much doubt if you will see any line breaks. What I expect you are seeing is paragraph breaks with a style (which is normal default for HTML) that specifies a paragraph break should add some white space.

Another point is that in HTML multiple consecutive paragraph breaks are typically treated as a single paragraph break.

On that basis, the behaviour you describe is exactly what I would expect.

It sounds as if in the original RTF file the author has tried to use blank lines to separate paragraphs? This is not abnormal in files where the author has mixed up the use of paragraph breaks to simply indicate end-of-line and also to indicate genuine paragraph breaks.

What I am not sure of from your message is under what circumstances you want there to be space between paragraphs, and when you do not want this behavior.
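
One rough way to check the point about paragraph breaks versus line breaks is to count the tags in the generated HTML. The sketch below is only an illustration: the names `BreakCounter` and `count_breaks` and the sample markup are made up here, and none of this is how Calibre itself works.

```python
from html.parser import HTMLParser


class BreakCounter(HTMLParser):
    """Count <p> and <br> tags, plus <p> elements with no visible text."""

    def __init__(self):
        super().__init__()
        self.paragraphs = 0
        self.line_breaks = 0
        self.empty_paragraphs = 0
        self._in_p = False
        self._text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.paragraphs += 1
            self._in_p = True
            self._text = ""
        elif tag == "br":
            self.line_breaks += 1

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            # A paragraph holding only whitespace counts as empty.
            if not self._text.strip():
                self.empty_paragraphs += 1
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self._text += data


def count_breaks(html):
    """Return (paragraphs, line_breaks, empty_paragraphs) for an HTML string."""
    parser = BreakCounter()
    parser.feed(html)
    parser.close()
    return parser.paragraphs, parser.line_breaks, parser.empty_paragraphs


# Hypothetical markup standing in for one HTML file from the converted book.
sample = "<body><p>This is a paragraph</p><p>This is paragraph two.</p></body>"
print(count_breaks(sample))
```

If the result shows paragraphs but no `<br>` tags and no empty paragraphs, the visible gap is coming from the paragraph style rather than from extra breaks. An EPUB is a ZIP archive, so the HTML files can be pulled out with any unzip tool or Python's `zipfile` module before running a check like this.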
Old 11-11-2010, 05:27 AM   #3
GreenMonkey
DRM hater
If there is a blank line, it should stay there.

If there is no blank line, there shouldn't be one.

I moved to paragraph symbols for this post and updated the original post (ASCII/ANSI shortcuts FTW)

So the behavior is basically...whenever a ¶ is there...I get a blank line from Calibre in the epub output.

If I ask it to "Remove spacing between paragraphs", it removes the 'extra' ones...but of course, also the ones that were supposed to be there.

I don't really get why I'm getting a blank line at every ¶ , even with that option to add it unchecked.

Last edited by GreenMonkey; 11-11-2010 at 05:31 AM.
Old 11-15-2010, 12:02 AM   #4
GreenMonkey
DRM hater
Anybody else have any thoughts on this?
Old 11-15-2010, 02:30 AM   #5
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
I just tested this out - rtf seems to do something similar to txt input - extra blank lines appear to be deleted as part of the standard processing. As itimpi noted, this is normal. The use case where this is actually problematic would be for soft breaks (the only case where it occasionally bothers me). If the source document had soft breaks then Calibre deletes them for both rtf and text by default. For text you can tune this (preserve spaces), but not much can be done for rtf.

If your concern is actually about soft breaks, you have two options:
  1. submit a bug/feature request - bugs.calibre-ebook.com
  2. replace all your blank lines with a soft break marker, e.g. * * *
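
Option 2 could be sketched roughly like this. It assumes you already have the document as a list of paragraph strings (one per ¶) and does not touch raw RTF markup. The function name `mark_soft_breaks` is hypothetical, and the default marker is just the `* * *` from the example above.

```python
def mark_soft_breaks(paragraphs, marker="* * *"):
    """Replace each run of empty paragraphs with one visible marker."""
    result = []
    in_blank_run = False
    for text in paragraphs:
        if text.strip():
            # Ordinary paragraph: keep it unchanged.
            result.append(text)
            in_blank_run = False
        elif not in_blank_run and result:
            # First blank paragraph after real text: emit a single marker.
            result.append(marker)
            in_blank_run = True
    # Drop a trailing marker so the text does not end on a break.
    if in_blank_run:
        result.pop()
    return result


paragraphs = ["This is a paragraph", "", "This is paragraph two."]
for line in mark_soft_breaks(paragraphs):
    print(line)
```

Because the marker is ordinary text, it should survive the conversion even when the blank paragraphs themselves are dropped.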
Old 11-17-2010, 01:18 AM   #6
GreenMonkey
DRM hater
OK, thanks. Maybe I'll submit something. The RTF conversions are very good - so close - just this paragraph / line break issue remains.
Old 11-17-2010, 08:21 AM   #7
oldbwl
Zealot
Posts: 122
Karma: 164
Join Date: Aug 2010
Location: Old Ynysybwl
Device: Sony PRS-300
I can replicate this with a simple file which I created from scratch. The original RTF is:

Title¶
By¶
Author¶

It is always converted in the EPUB to:

Title

By

Author


I have not entered a bug report as somewhere here I thought I had seen that the RTF to XXXX module was not being maintained at present. (I hope I got that right as I can't find it right now!).

Shame really, because in my own workflow I like to convert from the initial document to RTF, run a series of macros which do all my correcting (i.e. indents, ABC LIT Transformer yadayada removals etc.) and then convert to EPUB.

Some conversions from TXT files to RTF lose all ¶'s on some occasions and not others, without any reason I can discern.

PS: if anyone wants to easily extract text from a PDF into a Word or RTF file, install Foxit Reader. Select any word in the PDF with the text selection icon, and then 'CTRL+A' followed by 'CTRL+C' will select all and copy it, ready for pasting into whatever you want to edit it with. It makes the ABC Transformer stuff much easier to extract too.
Old 11-17-2010, 08:26 AM   #8
janvanmaar
Addict
Posts: 219
Karma: 404
Join Date: Nov 2010
Device: Kindle 3G, Samsung SIII
You can use conversion to odt to avoid this and similar problems. If your macros require rtf then you can take source->rtf->macros->odt->destination path. I use doc->odt->destination and rtf->odt->destination regularly and it works well.

Last edited by janvanmaar; 11-17-2010 at 08:29 AM.
Old 11-17-2010, 09:56 AM   #9
oldbwl
Zealot
I will try ODT - the macros run in Word, and my Word 2010 will load ODT.
Old 11-17-2010, 09:59 AM   #10
oldbwl
Zealot
ODT does not appear as an option in the conversion in my 0.7.28 Calibre, is there a downloadable plugin?

I can save the RTF as ODT but to get that converted back to epub there is no module?
Old 11-17-2010, 10:18 AM   #11
janvanmaar
Addict
That is weird. I did not do anything special, ODT input plugin just came with my Calibre (also 0.7.28) installation. I can simply add any ODT file to Calibre via the Add button and then convert to whatever.
Perhaps there is a difference in the plugins installed/enabled by default on different OSes (I am on Linux)? You can check under Preferences->Conversion Input Plugins whether ODT is there and green...

Last edited by janvanmaar; 11-17-2010 at 10:21 AM.
Old 11-17-2010, 10:38 AM   #12
oldbwl
Zealot
I can Add an ODT - not convert to. There is no ODT plugin in my setup, and I am using Win7
Old 11-17-2010, 10:44 AM   #13
kovidgoyal
creator of calibre
Posts: 43,771
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
You need to save your RTF as ODT in OpenOffice and then convert from the ODT in calibre.
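
For anyone converting many books, the two steps could in principle be scripted. This is only a sketch under assumptions not stated in the thread: that a LibreOffice/OpenOffice `soffice` binary supporting `--headless --convert-to` is on the PATH, and that calibre's `ebook-convert` command-line tool is on the PATH as well. The helper name `build_commands` is made up for illustration, and the code only builds and prints the commands rather than running them.

```python
from pathlib import Path


def build_commands(rtf_path, out_dir="."):
    """Build argv lists for RTF -> ODT (office suite) and ODT -> EPUB (calibre)."""
    rtf = Path(rtf_path)
    odt = Path(out_dir) / (rtf.stem + ".odt")
    epub = Path(out_dir) / (rtf.stem + ".epub")
    # Step 1: headless office conversion (assumed available on this machine).
    to_odt = ["soffice", "--headless", "--convert-to", "odt",
              "--outdir", str(out_dir), str(rtf)]
    # Step 2: calibre's command-line converter, input file then output file.
    to_epub = ["ebook-convert", str(odt), str(epub)]
    return to_odt, to_epub


for cmd in build_commands("book.rtf", "out"):
    print(" ".join(cmd))
```

To actually run the steps, each list could be passed to `subprocess.run(cmd, check=True)`. Note that this creates a new EPUB file on disk; it does not add the result to the calibre library or merge it with an existing book record.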
Old 11-17-2010, 11:05 AM   #14
janvanmaar
Addict
Ah sorry, did not get your question. Exactly as Kovid says (of course)
Old 11-17-2010, 11:48 AM   #15
oldbwl
Zealot
OK, so, I can save from RTF in Word 2010 to ODT, add that as a file - merge with existing book to maintain Metadata and then convert. Tried it and it worked.

I think I will stick with the gaps though, and save on the number of steps, as I am converting many books at the moment. Good thread though, learnt a lot as usual. Thanks everyone.