Old 08-04-2009, 06:22 PM   #1
jmurphy
Zealot
Posts: 105
Karma: 1133068
Join Date: Sep 2007
Device: ipaq
Convert TXT to anything - simply wraps with < html > < body > ?

Am I doing something wrong, or is converting from TXT extremely limited?

In trying to convert from TXT to ePub, for instance, the txt simply gets wrapped with html and body tags.

As a result, the ePub has absolutely no paragraphs.

Is this the expected behaviour?

My (unreasonable?) expectation is that during the conversion to HTML, Calibre would convert hard returns (or double hard returns) to HTML paragraphs.

What I'd really like to see is rough round-tripping: take an ePub, save the content as text, run that text through Calibre, and get an ePub that at least resembles the original. In this scenario, Calibre would scan the text for keywords like "Chapter" and at least add a header tag to them during the HTML conversion, before converting to ePub.
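
To make the idea concrete, here is a minimal sketch of that kind of pass. It is not Calibre's code; the function name, the regular expression, and the choice of h2 are all assumptions made for illustration. It treats every non-empty line as a paragraph and promotes lines that start with "Chapter" plus a number or Roman numeral to a heading.

```python
import html
import re

# Hypothetical heuristic: "Chapter" followed by digits or a Roman numeral.
# It can misfire on ordinary sentences that happen to start the same way.
CHAPTER_RE = re.compile(r"^chapter\s+(\d+|[ivxlcdm]+)\b", re.IGNORECASE)


def lines_to_html(text):
    """Wrap each non-empty line in <p>, or <h2> if it looks like a chapter title."""
    out = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines carry no content in this scheme
        tag = "h2" if CHAPTER_RE.match(line) else "p"
        out.append(f"<{tag}>{html.escape(line)}</{tag}>")
    return "\n".join(out)


print(lines_to_html("Chapter 1\nIt was a dark night.\nThe end."))
```

A real round trip would need far more than this (emphasis, scene breaks, front matter), so take it only as the shape of the request.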

So, what should I expect from Calibre when converting a TXT file to another format?

John Murphy
Old 08-04-2009, 06:30 PM   #2
user_none
Sigil & calibre developer
Posts: 2,487
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
Conversion with TXT as the input format should run the text through Markdown, and that should create paragraphs. It should also unwrap hard line broken paragraphs. Open a bug at http://calibre.kovidgoyal.net and attach the text file so I don't forget to look into it.
Old 08-04-2009, 07:08 PM   #3
jmurphy
Quote:
Originally Posted by user_none View Post
It should also unwrap hard line broken paragraphs.
How does Calibre unwrap hard line broken paragraphs?

Or, more to the point: How does Calibre recognize paragraphs in TXT files? Is it expecting double spacing? The TXT file I tested with only had single spacing...

John
Old 08-04-2009, 07:41 PM   #4
user_none
Quote:
Originally Posted by jmurphy View Post
How does Calibre unwrap hard line broken paragraphs?

Or, more to the point: How does Calibre recognize paragraphs in TXT files? Is it expecting double spacing? The TXT file I tested with only had single spacing...

John
It treats single-spaced lines as part of the same paragraph. Paragraphs are separated by an empty line.

So:

Code:
line1 is right here.
line2  is part of the same paragraph as line1.
Code:
line 1 is its own.

line2 is also its own.
It almost sounds like the file is:

Code:
    para
    para
    para
In which case that would be interpreted as all one large paragraph.
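
For anyone who wants to check their own file against that rule, here is a minimal sketch of it in Python. It is neither Calibre's code nor a Markdown implementation (real Markdown has more rules, for example around indented lines); the function names are made up for illustration. Blocks are split on empty lines, and the lines inside each block are joined back together.

```python
import html
import re


def paragraphs(text):
    """Split on blank lines, then unwrap the hard line breaks inside each block."""
    blocks = re.split(r"\n\s*\n", text.strip())
    # " ".join(b.split()) collapses newlines and runs of spaces into single spaces
    return [" ".join(b.split()) for b in blocks if b.strip()]


def to_html(text):
    """Wrap each detected paragraph in <p> tags."""
    return "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs(text))


single_spaced = "    para\n    para\n    para"
print(len(paragraphs(single_spaced)))                      # one big paragraph
print(len(paragraphs("line 1 is its own.\n\nline2 is also its own.")))
```

Under this rule the single-spaced sample comes out as one paragraph, which matches the behaviour described in the first post.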
Old 08-06-2009, 12:33 AM   #5
jmurphy
Quote:
Originally Posted by user_none View Post
It treats single-spaced lines as part of the same paragraph. Paragraphs are separated by an empty line.

So:

Code:
line1 is right here.
line2  is part of the same paragraph as line1.
Code:
line 1 is its own.

line2 is also its own.
It almost sounds like the file is:

Code:
    para
    para
    para
In which case that would be interpreted as all one large paragraph.


Not "almost like the file is". As I said, that is exactly the way the file is. Single spaced. Every native Windows-based tool I've used uses a single hard return as a paragraph marker.

Is there a way to configure Calibre to recognize a single hard return as a paragraph marker?

Earlier in this thread you mentioned "markdown". What is "markdown"?

John
Old 08-06-2009, 01:03 AM   #6
slantybard
my parent's oops...
Posts: 485
Karma: 1477572
Join Date: Feb 2009
Device: Vx->Handera->Clie-> Axim->505->650->KPW/Aura ->L2->iOS/CBW
Markdown:
http://daringfireball.net/projects/markdown/syntax
Old 08-06-2009, 02:19 AM   #7
rogue_ronin
Banned
Posts: 475
Karma: 796
Join Date: Sep 2008
Location: Honolulu
Device: Nokia 770 (fbreader)
Quote:
Originally Posted by slantybard View Post

So, calibre is pre-processing text with markdown? That's cool. Good for folks to know, they might want to do a quick edit before adding to calibre.

m a r
Old 08-06-2009, 06:50 AM   #8
user_none
Quote:
Originally Posted by jmurphy View Post
Not "almost like the file is". As I said, that is exactly the way the file is. Single spaced. Every native Windows based tool I've used uses a single hard return as a paragraph marker.
Most of the plain text files people were expected to use are from Project Gutenberg. Almost all of those have hard line broken paragraphs, with empty lines separating the paragraphs, e.g. War and Peace.

Quote:
Originally Posted by jmurphy View Post
Is there a way to configure Calibre to recognize a single hard return as a paragraph marker?
Open a ticket requesting that as a feature so I don't forget about it.
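
Until something like that exists, a quick edit before adding the file to calibre can work around it. The following is a minimal sketch, assuming the file really does use one hard return per paragraph; the function name and the file names in the comments are hypothetical. It puts an empty line between every non-empty line so that each one is seen as its own paragraph.

```python
def double_space(text):
    """Insert an empty line between every non-empty line of the text."""
    # splitlines() copes with both Windows (\r\n) and Unix (\n) line endings;
    # strip() also drops leading indentation from each line.
    lines = [ln.strip() for ln in text.splitlines()]
    return "\n\n".join(ln for ln in lines if ln) + "\n"


# Example use on a file (names are placeholders):
#   with open("book.txt", encoding="utf-8") as src:
#       fixed = double_space(src.read())
#   with open("book-fixed.txt", "w", encoding="utf-8") as dst:
#       dst.write(fixed)
print(double_space("para one\npara two\npara three"))
```

Do not run this on Project Gutenberg style files, where single returns are only line wraps; it would turn every wrapped line into a separate paragraph.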
Tags
text conversion, text convert, txt conversion, txt convert