MobileRead Forums > E-Book Software > Calibre > Conversion

Old 02-03-2011, 08:43 AM   #1
bfollowell
Fanatic
Posts: 510
Karma: 1152752
Join Date: Aug 2010
Location: Evansville, IN, USA
Device: Amazon Kindle 3 Wi-Fi & B&N Nook Tablet & B&N Nook HD+
Chapters are one giant paragraph. How to fix?

I'm working to convert a couple of epubs and the original author/creator didn't create proper paragraphs. I have no idea what he or she was thinking but basically, each chapter is one giant paragraph with the individual "paragraphs" being created by adding a few breaks between them to create space.

I'm looking for an easy way to correct this that isn't quite so hands on but haven't had a lot of luck so far. I've looked through the convert options in Calibre and haven't seen anything I thought would relate.

Has anyone ever run into this? Is anyone aware of any convert options I might be missing that might help? If not, does anyone have any suggestions of an easier way to fix this?

Thanks to anyone that may be able to steer me in the right direction.

- Byron Followell
Old 02-03-2011, 09:00 AM   #2
Manichean
Wizard
Posts: 3,130
Karma: 80446
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
You could try to use the search & replace option to convert those breaks into closing and opening paragraph tags. If, for example, you had "<br><br>" between the individual "paragraphs", you could try replacing that with "</p><p>", which should create a real paragraph.
Old 02-03-2011, 09:16 AM   #3
user_none
Sigil & calibre developer
Posts: 2,433
Karma: 950001
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
If there isn't much additional formatting, you can try converting to TXT and then back to EPUB. The spacing should be preserved when converting to TXT and then detected as a paragraph break when converting back to EPUB.
Old 02-03-2011, 09:39 AM   #4
bfollowell
Quote:
Originally Posted by Manichean View Post
You could try to use the search & replace option to convert those breaks into closing and opening paragraph tags. If, for example, you had "<br><br>" between the individual "paragraphs", you could try replacing that with "</p><p>", which should create a real paragraph.
Thanks for the suggestion. This is the route I'll probably wind up taking. I'm certain it will help, but it will still be somewhat manual because there are so many variations. Some sections only have one or two breaks, some have four or five. Some have breaks with certain class codes assigned, others have different classes or none at all. It's really a horribly sloppy mess, hence the reason I want to try and clean it up. What can you do? Not everyone has the best ebook creation procedures.
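
The variations described above (runs of one to five breaks, with or without class attributes) could probably be covered by a single regular expression instead of one search per variant. What follows is only a minimal Python sketch under that assumption: the sample markup and the function name `breaks_to_paragraphs` are made up for illustration, and regex-based HTML edits are fragile, so the result needs checking by eye. If single breaks in a given book are genuine line breaks, the pattern would need to require at least two.

```python
import re

# A run of one or more <br> tags. Each tag may be written as <br>, <br/> or
# <br />, may carry attributes such as class="...", and consecutive tags may
# be separated by whitespace or newlines.
BREAK_RUN = re.compile(r"<br\b[^>]*>(?:\s*<br\b[^>]*>)*", re.IGNORECASE)


def breaks_to_paragraphs(fragment: str) -> str:
    """Replace every run of <br> tags with a paragraph boundary."""
    return BREAK_RUN.sub("</p>\n<p>", fragment)


# Hypothetical sample of the kind of markup described in the post.
sample = '<p>First part.<br class="x"/><br>Second part.<br />\n<br><br>Third part.</p>'
print(breaks_to_paragraphs(sample))
```

The same pattern might also be worth trying directly in the search & replace option mentioned earlier, if it accepts regular expressions in your calibre version.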

Quote:
Originally Posted by user_none View Post
If there isn't much additional formatting, you can try converting to TXT and then back to EPUB. The spacing should be preserved when converting to TXT and then detected as a paragraph break when converting back to EPUB.
I thought of that, but there are a lot of italics that I'd like to keep and I'd lose all that. If it was all just dumb text, I'd try that for sure.

Thanks for the suggestion. It's just not really an option here.

- Byron
Old 02-03-2011, 09:47 AM   #5
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
Quote:
Originally Posted by bfollowell View Post
I thought of that, but there are a lot of italics that I'd like to keep and I'd lose all that. If it was all just dumb text, I'd try that for sure.

Thanks for the suggestion. It's just not really an option here.
Actually, if you use Textile or Markdown syntax in your text output settings, italics and a lot of other formatting will be preserved. You should give it a quick try just to see if it will do the trick.
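
For anyone who prefers the command line to the conversion dialog, the round trip could be scripted roughly as below. This is a sketch, not a tested recipe: the file names are placeholders, and the option names `--txt-output-formatting` and `--formatting-type` are taken from the current `ebook-convert` documentation, so they may be named differently (or be missing) in older calibre releases.

```python
import os
import shutil
import subprocess


def round_trip_commands(src_epub: str, txt_path: str, dst_epub: str) -> list:
    """Build the two ebook-convert calls for an EPUB -> TXT -> EPUB round trip."""
    # Step 1: write the book out as text, keeping formatting as Markdown.
    to_txt = ["ebook-convert", src_epub, txt_path, "--txt-output-formatting=markdown"]
    # Step 2: read the text back in, telling calibre to treat it as Markdown.
    to_epub = ["ebook-convert", txt_path, dst_epub, "--formatting-type=markdown"]
    return [to_txt, to_epub]


# Placeholder file names, chosen only for this example.
commands = round_trip_commands("sloppy.epub", "book.txt", "clean.epub")
for cmd in commands:
    print(" ".join(cmd))

# Run the conversion only if calibre's command-line tools are installed
# and the placeholder input file actually exists.
if shutil.which("ebook-convert") and os.path.exists("sloppy.epub"):
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Stopping after the first command leaves the intermediate text file available for manual cleanup before converting back.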
Old 02-03-2011, 10:06 AM   #6
bfollowell
Quote:
Originally Posted by ldolse View Post
Actually, if you use Textile or Markdown syntax in your text output settings, italics and a lot of other formatting will be preserved. You should give it a quick try just to see if it will do the trick.
Really?! hmmm...

I'm curious now and the learning cap is on. I will have to experiment with this and let you know how it goes. I'll let you know if I hit a wall or have questions.

Thanks VERY much for this suggestion. Like user_none suggested, converting to text and then back would definitely take care of all the spacing and would help create proper paragraph tags. As long as I can keep most of the text formatting, I can easily fix the paragraph formatting. Because of the way this ebook was created, it was all going to need to be redone anyway. This would definitely save me tons of time.

- Byron
Old 02-03-2011, 10:39 AM   #7
bfollowell
ldolse,

That worked wonderfully!!

It didn't work perfectly but it did work extremely well.

There's just a little bit of text formatting cleanup to do and then I can just start adding in my paragraph formatting and I'll be done. Still a little bit of work, but at least now I'm starting from a clean slate. I'm sure I've noticed the Markdown option on there before and just didn't know what the heck it was and never looked into it. I'm glad you clued me in to it.

About the only problems I see are a handful of sections that should be italics but aren't; they have an underscore before and after the section. It looks like they are all instances where the text that should have been italicised was inside a set of parentheses or quotation marks. That shouldn't take me too long to fix though.

Thanks again for your excellent suggestion. I'll make sure to give you some karma points.

- Byron
Old 02-03-2011, 10:57 AM   #8
ldolse
Glad to hear it. An underscore before and after 'should' convert back to italics. Obviously italics themselves can't be maintained in plain text, so Markdown uses *word* or _word_ to represent them. I normally use *word*, but I'm pretty sure _word_ works just as well.

Read more about the syntax here:
http://daringfireball.net/projects/markdown/syntax#em
Old 02-03-2011, 11:03 AM   #9
user_none
I know that underscores that are not next to spaces trip up the Textile processor. I wonder if the Markdown processor has the same limitation. I will modify the italicize heuristic to account for quotes and parentheses. This way, if you enable that heuristic, anything missed by the Markdown processor will be caught and formatted properly.
Old 02-03-2011, 01:20 PM   #10
bfollowell
Quote:
Originally Posted by user_none View Post
I know that underscores that are not next to spaces trip up the Textile processor. I wonder if the Markdown processor has the same limitation. I will modify the italicize heuristic to account for quotes and parentheses. This way, if you enable that heuristic, anything missed by the Markdown processor will be caught and formatted properly.
After reading your post, I went back to the text file that resulted from my original conversion from sloppy epub to txt. I changed all the underscores to asterisks and, for whatever reason, the conversion back to epub went much smoother. Even the ones I mentioned that didn't seem to convert the first time around because they were inside quotes or parentheses converted. Your guys’ suggestions saved me a ton of work and I am EXTREMELY appreciative.

Thanks again.

- Byron
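
The underscore-to-asterisk change described in this post could also be scripted rather than done by hand. The sketch below is one possible way to do it and makes an assumption the thread does not state: that an underscore only counts as an emphasis marker when it is not glued to a letter, digit, or another underscore on its outer side, so names such as `file_name_here` are left alone. The function name and sample strings are invented for the example.

```python
import re

# An underscore-delimited span on a single line. The opening "_" must not
# follow a word character and the closing "_" must not precede one; the
# span itself must start and end with a non-space character.
UNDERSCORE_EMPHASIS = re.compile(
    r"(?<![A-Za-z0-9_])_(?=\S)(.+?)(?<=\S)_(?![A-Za-z0-9_])"
)


def underscores_to_asterisks(text: str) -> str:
    """Rewrite _emphasis_ as *emphasis*, leaving other underscores untouched."""
    return UNDERSCORE_EMPHASIS.sub(r"*\1*", text)


# Made-up sample lines, including the quoted and parenthesised cases.
print(underscores_to_asterisks('He said ("_not now_") and left.'))
print(underscores_to_asterisks("Keep file_name_here as it is."))
```

Running this over the intermediate text file before converting back to EPUB would mirror the manual edit; why asterisks converted more reliably than underscores is not explained in the thread.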
All times are GMT -4. The time now is 11:39 AM.

