MobileRead Forums > E-Book Software > Calibre

Old 08-14-2010, 05:59 PM   #1
lunixer
Addict
Posts: 231
Karma: 928
Join Date: Aug 2010
Device: Kindle 3
Trouble with .txt files

Any time I try to convert a .txt ebook, I get no formatting in the result. Typically I try to convert from .txt to mobi. Is there any reason why this might be? I can't quite figure it out. I've attached an example of such a file.

Last edited by lunixer; 08-16-2010 at 10:38 PM.
Old 08-14-2010, 06:13 PM   #2
kovidgoyal
creator of calibre
Posts: 25,371
Karma: 4961459
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
http://calibre-ebook.com/user_manual...-txt-documents
Old 08-14-2010, 08:42 PM   #3
DoctorOhh
US Navy, Retired
Posts: 8,770
Karma: 12516053
Join Date: Feb 2009
Location: North Carolina
Device: Nexus 7
Quote:
Originally Posted by lunixer View Post
Any time I try to convert a .txt ebook, I get no formatting in the result. Typically I try to convert from .txt to mobi. Is there any reason why this might be? I can't quite figure it out. I've attached an example of such a file.
You have to select the method you want to use to format your text file within the Text Input area during conversion. For the file you attached you should check "Treat each line as a paragraph". Hover over each option to get an explanation.
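
As a rough illustration of what a "treat each line as a paragraph" mode does, here is a minimal Python sketch. It is not calibre's actual code; the function name `lines_to_paragraphs` and the sample text are hypothetical, and the only assumption is that every non-empty line becomes its own paragraph.

```python
from html import escape


def lines_to_paragraphs(text: str) -> str:
    """Wrap every non-empty line in its own <p> element (hypothetical helper)."""
    paragraphs = []
    # splitlines() accepts LF, CR and CRLF line endings alike.
    for line in text.splitlines():
        stripped = line.strip()
        if stripped:  # blank lines carry no content, so skip them
            paragraphs.append(f"<p>{escape(stripped)}</p>")
    return "\n".join(paragraphs)


# Made-up sample input: three short lines that should stay separate.
sample = "John Doe\n12 Example Street\nSometown"
print(lines_to_paragraphs(sample))
```

With this approach short lines such as an address stay on separate lines, at the cost of breaking up any paragraph that was hard-wrapped across several lines.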
Old 08-15-2010, 03:03 PM   #4
Agama
Guru
Posts: 637
Karma: 436517
Join Date: Jul 2010
Location: UK
Device: PRS-300, PW2
Applying Markdown to plain text source files can also be used to good effect with calibre. This can help improve the look-and-feel of the final book considerably, including automatically produced multilevel TOCs and formatting for headings/sub-headings. It takes about 15-30 mins per book. If you're interested then there are several threads in this forum that will get you started.
Old 08-16-2010, 10:39 PM   #5
lunixer
Addict
Posts: 231
Karma: 928
Join Date: Aug 2010
Device: Kindle 3
Thank you very much for your suggestions. Notice: I have deleted the attachment on the first post after being informed that it violated copyright law.
Old 08-17-2010, 01:23 AM   #6
grid
Junior Member
Posts: 2
Karma: 10
Join Date: Aug 2010
Device: nook
I'm sure the above suggestions are the proper solutions - but I've always had good results printing to pdf then converting the pdf to the desired format.
Old 08-17-2010, 09:01 AM   #7
Jabby
Jr. - Junior Member
Posts: 567
Karma: 2000170
Join Date: Aug 2010
Location: East Texas
Device: Archos, Asus, HP, Lenovo, Nexus and Samsung tablets in 7,8 and 10"
Quote:
Originally Posted by kovidgoyal View Post
Here is the problem, as I see it. All the text editors that I am aware of use CR/LF sequence to mark the end of line. Calibre looks for two consecutive CR/LFs to identify a paragraph but totally ignores the single CR/LF making it impossible to create bullet lines.

Looking at the attached file in Calibre Viewer illustrates the point.

Regards - John
Attached Files
File Type: txt Test File.txt (692 Bytes, 93 views)

Last edited by Jabby; 08-17-2010 at 09:17 AM. Reason: Forgot attachment
Old 08-17-2010, 02:46 PM   #8
jackie_w
Wizard
Posts: 2,689
Karma: 3818505
Join Date: Sep 2009
Location: UK
Device: Sony PRS-350, PB360, Kobo Glo/AuraHD/Aura6"
Quote:
Originally Posted by Jabby View Post
Here is the problem, as I see it. All the text editors that I am aware of use CR/LF sequence to mark the end of line. Calibre looks for two consecutive CR/LFs to identify a paragraph but totally ignores the single CR/LF making it impossible to create bullet lines.
Hi John, perhaps you did not investigate the link quoted in your post thoroughly enough.

I have attached your TXT file updated with a little Markdown to produce the EPUB also in the attached zip.

You will see that it has:
  1. Bullet points (note the asterisks in the TXT)
  2. The address at the top is also formatted correctly (note the 2 extra spaces at the end of each line).
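
The two Markdown conventions listed above can be sketched with a toy Python renderer. This only illustrates those two rules (an asterisk bullet, and two trailing spaces for a hard line break). It is not a Markdown implementation and not what calibre runs; the function name `render_toy_markdown` and the sample text are hypothetical.

```python
from html import escape


def render_toy_markdown(text: str) -> str:
    """Handle just two Markdown rules: '* ' bullets and two-space hard breaks."""
    html_lines = []
    in_list = False
    for line in text.split("\n"):
        if line.startswith("* "):
            if not in_list:  # open the list on the first bullet
                html_lines.append("<ul>")
                in_list = True
            html_lines.append(f"<li>{escape(line[2:].strip())}</li>")
            continue
        if in_list:  # a non-bullet line closes the list
            html_lines.append("</ul>")
            in_list = False
        if line.endswith("  "):  # two trailing spaces mean a hard line break
            html_lines.append(escape(line.rstrip()) + "<br />")
        elif line.strip():
            html_lines.append(escape(line.strip()))
    if in_list:
        html_lines.append("</ul>")
    return "\n".join(html_lines)


# Made-up sample: an address block followed by a bullet list.
sample = "John Doe  \n12 Example Street  \nSometown\n\n* first item\n* second item"
print(render_toy_markdown(sample))
```

Note that the trailing spaces are invisible in most editors, which is why they are easy to miss when comparing the TXT source with the converted result.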
Attached Files
File Type: zip Markdown2.zip (90.3 KB, 91 views)
Old 08-17-2010, 08:49 PM   #9
user_none
Sigil & calibre developer
Posts: 2,433
Karma: 950001
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
Quote:
Originally Posted by Jabby View Post
Here is the problem, as I see it. All the text editors that I am aware of use CR/LF sequence to mark the end of line. Calibre looks for two consecutive CR/LFs to identify a paragraph but totally ignores the single CR/LF making it impossible to create bullet lines.
The default is to use two new line markers as the paragraph boundary. This example illustrates why:

Code:
This is all
one paragraph

This is also all one paragraph. We use
two new line markers to separate
paragraphs for the following reason.

If I make a
list of items separated by 
a single new line marker then,
I can't tell if it's a paragraph
or a list. So I assume it's
a paragraph because they're
more common.
Your two options are to put a second new line after each item in the list or as jackie_w suggested use markdown to give a higher degree of formatting.

Also, only Windows uses this sequence. Unix-based systems (both Kovid and I use Linux, GRiker uses OS X) use LF only. Apple's OS 9 and earlier used CR only to denote a new line. TXT input must support all of these variations, including any combination of the above new line markers within the same file. Because of this, we cannot do something like: CR/LF denotes a new line and CR alone denotes items in a list. Internally, TXT input converts all new line markers to LF. This solves the problem of different operating systems using different markers and allows TXT input to easily match against a single new line character.
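
The normalize-then-split idea described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the actual TXT input code; the names `normalize_newlines` and `split_paragraphs` and the sample string are hypothetical.

```python
import re
from typing import List


def normalize_newlines(text: str) -> str:
    """Convert CRLF and lone CR line endings to LF."""
    # Replace CRLF first so it is not counted as two separate line breaks.
    return text.replace("\r\n", "\n").replace("\r", "\n")


def split_paragraphs(text: str) -> List[str]:
    """Split on one or more blank lines after normalizing line endings."""
    normalized = normalize_newlines(text)
    blocks = re.split(r"\n\s*\n", normalized)
    return [block.strip() for block in blocks if block.strip()]


# Made-up sample mixing Windows (CRLF), old Mac (CR) and Unix (LF) endings.
mixed = "First line\r\nsame paragraph\r\rSecond paragraph\n\nThird"
print(split_paragraphs(mixed))
```

Once every marker is an LF, the paragraph rule only has to look for a blank line, regardless of which operating system or editor last saved the file.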
Old 08-18-2010, 01:37 PM   #10
Jabby
Jr. - Junior Member
Posts: 567
Karma: 2000170
Join Date: Aug 2010
Location: East Texas
Device: Archos, Asus, HP, Lenovo, Nexus and Samsung tablets in 7,8 and 10"
Quote:
Originally Posted by user_none View Post

Also, only Windows uses this sequence. Unix-based systems (both Kovid and I use Linux, GRiker uses OS X) use LF only. Apple's OS 9 and earlier used CR only to denote a new line. TXT input must support all of these variations, including any combination of the above new line markers within the same file. Because of this, we cannot do something like: CR/LF denotes a new line and CR alone denotes items in a list. Internally, TXT input converts all new line markers to LF. This solves the problem of different operating systems using different markers and allows TXT input to easily match against a single new line character.
I'll admit I haven't written a line of code since C was a pup nor do I wish to. I'll also admit I can be pretty dense at times. But I just don't get it.

I took the test file that I attached earlier and replaced all CR+LF with a single LF and later with a single CR. All three versions displayed the same using the Calibre Viewer (name and address lines displayed on a single line).

So here is my question: if two consecutive EOL markers (CR+LF, CR or LF) can be detected and used as a paragraph marker, why can't a single instance be detected and used as a single EOL marker? Here is a list of allowed EOL terminators; all others are ignored.

CR Line terminator
LF Line terminator
CRLF Line terminator
CRCR Paragraph terminator
LFLF Paragraph terminator
CRLFCRLF Paragraph terminator

By the way, I really don't have a problem. I use a text editor to create address book, prescription and other personal info to keep handy. Text (.txt) files display just fine on all my eReaders if I transfer them directly. Calibre will convert Wordpad (.rtf) to epub or mobi correctly if you use shift+enter key to terminate single lines. So I'm in good shape.

Ah, the dichotomy of it all - John
Old 08-18-2010, 03:19 PM   #11
jackie_w
Wizard
Posts: 2,689
Karma: 3818505
Join Date: Sep 2009
Location: UK
Device: Sony PRS-350, PB360, Kobo Glo/AuraHD/Aura6"
It seems I wasted my time writing post #8 advising you exactly how to fix your TXT file in about 10 seconds. You seem to be hung up on CR and LF which are a non-issue.
Old 08-18-2010, 05:12 PM   #12
Jabby
Jr. - Junior Member
Posts: 567
Karma: 2000170
Join Date: Aug 2010
Location: East Texas
Device: Archos, Asus, HP, Lenovo, Nexus and Samsung tablets in 7,8 and 10"
Quote:
Originally Posted by jackie_w View Post
It seems I wasted my time writing post #8 advising you exactly how to fix your TXT file in about 10 seconds. You seem to be hung up on CR and LF which are a non-issue.
No, you did not waste your time. In fact, I had read the link info and examined your Markdown attachment carefully. I meant to thank you for your effort but something got in between - lazy maybe.

My original reply was to Kovid. After reading the documentation, this statement made me wonder: "by default calibre only groups lines in the input document into paragraphs. The default is to assume one or more blank lines are a paragraph boundary:" Here is my quibble: why group lines only into paragraphs? Why not into paragraphs and single lines? It is certainly possible, and then Markdown would only be necessary if you wanted to add other basic formatting.

By the way, showing that it could be done was what all that focus on CR and LF was all about - which was the issue.

Regards - John
Old 08-18-2010, 05:31 PM   #13
itimpi
Wizard
Posts: 4,022
Karma: 777817
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
I think that the key point is that in most modern ebook formats (that are HTML based) end-of-line is simply treated as white space that is equivalent to a single space. This is great for reflowing text which is what they aim to support to fit a wide variety of screen sizes and allow font zooming.

You then need some special logic for recognizing paragraph breaks, which then get their own special tag (typically <p> or an equivalent). The whole concept of end-of-line therefore tends to be stripped out.
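
To make the point about end-of-line being treated as white space concrete, here is a minimal Python sketch that renders one block of hard-wrapped text as a single reflowable paragraph. It is an illustration only, not taken from any ebook tool; the function name `block_to_html` and the sample text are hypothetical.

```python
from html import escape


def block_to_html(block: str) -> str:
    """Render one block of text as a <p>, folding line breaks into spaces."""
    # str.split() with no arguments splits on any run of whitespace,
    # including newlines, so each single line break collapses to one space.
    words = block.split()
    return "<p>" + escape(" ".join(words)) + "</p>"


# Made-up sample: one paragraph hard-wrapped over three lines.
hard_wrapped = "I am all one\nparagraph split\nalong multiple lines"
print(block_to_html(hard_wrapped))
```

The reading software is then free to wrap the paragraph wherever the screen width and font size require.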
Old 08-18-2010, 07:26 PM   #14
user_none
Sigil & calibre developer
Posts: 2,433
Karma: 950001
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
Quote:
Originally Posted by Jabby View Post
My original reply was to Kovid. After reading the documentation, this statement made me wonder: "by default calibre only groups lines in the input document into paragraphs. The default is to assume one or more blank lines are a paragraph boundary:"
Kovid isn't weighing in very much because I'm the author and maintainer of TXT input. For TXT files paragraphs are the only reliable components that they can be broken down into.

Quote:
Originally Posted by Jabby View Post
Here is my quibble: why group lines only into paragraphs? Why not into paragraphs and single lines? It is certainly possible, and then Markdown would only be necessary if you wanted to add other basic formatting.
There are two parts to this. The easy part is that Markdown was chosen as the method for adding formatting to TXT files. It is easy, quick, and the markup even looks good when just viewing it as a standard text file. We have one all-purpose formatting method that handles pretty much every case short of using HTML. Adding other formatting methods that do the same thing is unnecessary.

You're describing doing the following and it won't work.

Quote:
Originally Posted by Jabby
CR Line terminator
LF Line terminator
CRLF Line terminator
CRCR Paragraph terminator
LFLF Paragraph terminator
CRLFCRLF Paragraph terminator
1) Many TXT documents (look at Project Gutenberg) have this formatting:

Code:
I am all one
paragraph split
along multiple lines
with a single new
line character.
By your line ending description above it would turn into:

Code:
<p>
I am all one<br />
paragraph split<br />
along multiple lines<br />
with a single new<br />
line character.<br />
</p>

The whole point of TXT input is to take a fixed placement document and turn it into a reflowable format. calibre's conversion process actually requires this. Input -> reflowable intermediate format -> Output.

By doing the above, you've removed the entire idea of a reflowable paragraph that changes layout to fit the page width. TXT input is based on intent. Novels are the typical input, and it is designed to handle the majority of their formatting cases. There is a "Treat each line as a paragraph" option and Markdown to handle corner cases such as yours.

2) I'm not 100% clear, but are you implying that we should allow mixed CR/LF characters (aside from the standard Windows CRLF) within the document to denote different meanings, such as LFLF for a paragraph and CR for a new line? If so, no. CR and LF are invisible characters. They are all treated the same because that's how the majority of text editors treat them. Many users will edit a file on Windows and then on, say, OS X. Some editors will convert all new lines to the system's standard, and some insert their system's new line where indicated while still displaying correctly. Users will become very confused when viewing their converted documents if different lines behave in different ways, especially when they can't see that there is a CR instead of an LF character in the source TXT file. In this case, telling a user to open their document in a hex editor is not acceptable.
Old 08-19-2010, 06:43 PM   #15
Jabby
Jr. - Junior Member
Posts: 567
Karma: 2000170
Join Date: Aug 2010
Location: East Texas
Device: Archos, Asus, HP, Lenovo, Nexus and Samsung tablets in 7,8 and 10"
Quote:
Originally Posted by user_none View Post
1) Many TXT documents (look at Project Gutenberg) have this formatting:

Who woulda thunk it?
I thought this type of file went out with the typers.
Since you want to support this type of file your options are
definitely limited.


Code:
I am all one
paragraph split
along multiple lines
with a single new
line character.
By your line ending description above it would turn into:

Code:
<p>
I am all one<br />
paragraph split<br />
along multiple lines<br />
with a single new<br />
line character.<br />
</p>

Yep!
Sorry to have wasted your time.

Regards - John