12-19-2010, 11:18 AM   #1
MacEvansCB
Enthusiast
Posts: 25
Karma: 10
Join Date: Nov 2010
Location: Somewhere in Iowa
Device: Nook Color
Getting extra Paragraphs

This one is going to be harder than the last one, which I found my own answer to.

When converting from PDF to RTF:
Every time a sentence starts at the beginning of a line within a paragraph, Calibre always starts a new paragraph.

Is there any way to keep this from happening?
12-19-2010, 12:06 PM   #2
MacEvansCB
I have also noticed that if a line ends in a capital letter, a new paragraph is started, even though there is no punctuation.
12-19-2010, 12:41 PM   #3
MacEvansCB
I also found a case where a line starting with a double quote (") started a new paragraph, again with no other punctuation.
12-19-2010, 12:45 PM   #4
MacEvansCB
And, hopefully finally, I've found a case where a typed-out ellipsis " . . . " at the end of a line starts a new paragraph.
12-19-2010, 01:22 PM   #5
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
There is no concept of a 'paragraph' in PDFs.

PDF unwrapping works from punctuation, because the HTML that pdftohtml generates doesn't provide any other clues as to what is and is not a paragraph. Unfortunately, you just need to fix it up yourself after conversion.

There is a new PDF engine that will probably get released someday which contains info such as indentation and spacing between lines, which could be used to determine paragraph boundaries. No telling when it will be ready, though.
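
To illustrate why punctuation-only unwrapping produces the extra paragraphs reported in this thread, here is a minimal Python sketch. It is not Calibre's actual code: the `unwrap` function and the `CONTINUES` rule are hypothetical, chosen only to reproduce the symptoms described above (a break after any line ending in a full stop, a capital letter, or an ellipsis, and before any line starting with a double quote).

```python
import re

# Hypothetical rule (an assumption, not taken from Calibre): a line is
# treated as continuing onto the next one only if it ends in a lowercase
# letter, a comma, or a semicolon. Any other ending counts as a paragraph end.
CONTINUES = re.compile(r"[a-z,;]$")


def unwrap(lines):
    """Merge hard-wrapped lines into paragraphs using punctuation alone."""
    paragraphs = []
    current = ""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines carry no information in this sketch
        # Join only when the text so far looks unfinished and the new line
        # does not open with a double quote.
        if current and CONTINUES.search(current) and not line.startswith('"'):
            current += " " + line
        else:
            if current:
                paragraphs.append(current)
            current = line
    if current:
        paragraphs.append(current)
    return paragraphs


if __name__ == "__main__":
    wrapped = [
        "The cat sat on the",
        "mat. It then moved to the sofa.",
        "The dog watched from the hall and",
        "did nothing.",
    ]
    for paragraph in unwrap(wrapped):
        print(paragraph)
```

If the four input lines were a single paragraph in the PDF, this heuristic still splits them in two, because the second line happens to end with a full stop. With only the text to go on, the converter cannot tell that case apart from a real paragraph break.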
12-19-2010, 02:46 PM   #6
MacEvansCB
Thanks.
I do know that this is a difficult area in PDF translation.
But even so, Calibre is light-years ahead of the translators inside Acrobat Pro. Acrobat Pro makes a total mess of a document's paragraph structure, even when it's clean.
I was hoping that I wouldn't have to scrub through the Calibre conversions, but ... oh, well.
At least there's a lot less to look for.

All times are GMT -4.