08-14-2010, 05:59 PM | #1 |
Addict
Posts: 231
Karma: 928
Join Date: Aug 2010
Device: Kindle 3
|
Trouble with .txt files
Any time I try to convert a .txt ebook, I get no formatting in the result. Typically I try to convert from .txt to mobi. Is there any reason why this might be? I can't quite figure it out. I've attached an example of such a file.
Last edited by lunixer; 08-16-2010 at 10:38 PM. |
08-14-2010, 06:13 PM | #2 |
creator of calibre
Posts: 44,327
Karma: 23661992
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
|
08-14-2010, 08:42 PM | #3 |
US Navy, Retired
Posts: 9,865
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
|
You have to select the method you want to use to format your text file in the Text Input section of the conversion dialog. For the file you attached, you should check "Treat each line as a paragraph". Hover over each option to get an explanation.
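As a rough illustration of what a "treat each line as a paragraph" mode amounts to, here is a minimal Python sketch. It is not calibre's actual code; the function name `lines_to_paragraphs` and the sample text are made up for this example.

```python
from html import escape


def lines_to_paragraphs(text):
    """Wrap every non-blank line of the input in its own <p> element.

    Hypothetical helper for illustration only, not part of calibre.
    """
    paragraphs = []
    # splitlines() treats CR, LF and CRLF alike as line boundaries.
    for line in text.splitlines():
        stripped = line.strip()
        if stripped:  # blank lines carry no content, so skip them
            paragraphs.append("<p>" + escape(stripped) + "</p>")
    return "\n".join(paragraphs)


if __name__ == "__main__":
    sample = "First line of the file\nSecond line of the file\n\nThird line"
    print(lines_to_paragraphs(sample))
```

With this approach every source line stays on its own line in the converted book, which suits files that have exactly one paragraph per line.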
|
08-15-2010, 03:03 PM | #4 |
Guru
Posts: 776
Karma: 2751519
Join Date: Jul 2010
Location: UK
Device: PW2, Nexus7
|
Applying Markdown to plain text source files can also be used to good effect with calibre. This can help improve the look-and-feel of the final book considerably, including automatically produced multilevel TOCs and formatting for headings/sub-headings. It takes about 15-30 mins per book. If you're interested then there are several threads in this forum that will get you started.
|
08-16-2010, 10:39 PM | #5 |
Addict
Posts: 231
Karma: 928
Join Date: Aug 2010
Device: Kindle 3
|
Thank you very much for your suggestions. Notice: I have deleted the attachment on the first post after being informed that it violated copyright law.
|
08-17-2010, 01:23 AM | #6 |
Junior Member
Posts: 2
Karma: 10
Join Date: Aug 2010
Device: nook
|
I'm sure the above suggestions are the proper solutions - but I've always had good results printing to pdf then converting the pdf to the desired format.
|
08-17-2010, 09:01 AM | #7 | |
Jr. - Junior Member
Posts: 586
Karma: 2000358
Join Date: Aug 2010
Location: Alabama
Device: Archos, Asus, HP, Lenovo, Nexus and Samsung tablets in 7,8 and 10"
|
Quote:
Looking at the attached file in Calibre Viewer illustrates the point.
Regards - John
Last edited by Jabby; 08-17-2010 at 09:17 AM. Reason: Forgot attachment |
|
08-17-2010, 02:46 PM | #8 | |
Grand Sorcerer
Posts: 6,216
Karma: 16534894
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
|
Quote:
I have attached your TXT file updated with a little Markdown to produce the EPUB also in the attached zip. You will see that it has:
|
|
08-17-2010, 08:49 PM | #9 | |
Sigil & calibre developer
Posts: 2,487
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
Quote:
Code:
This is all one paragraph

This is also all one paragraph.

We use two new line markers to separate paragraphs for the following reason. If I make a list of items separated by a single new line marker, then I can't tell whether it's a paragraph or a list. So I assume it's a paragraph, because those are more common.

Also, only Windows uses this sequence. Unix-based systems (both Kovid and I use Linux, GRiker uses OS X) use LF only. Apple's OS 9 and earlier used CR only to denote a new line. TXT input must support all of these variations, including any combination of the above new line markers within the same file. Because of this we cannot do something like: CR/LF denotes a new line and CR alone denotes items in a list.

Internally, TXT input converts all new line markers to LF. This solves the problem of different operating systems using different markers and allows TXT input to match against a single new line character. |
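The normalization described above can be sketched in a few lines of Python. This is only an illustration of the idea (convert every new line marker to LF, then treat one or more blank lines as a paragraph boundary), not calibre's actual implementation; the function names are assumptions made for this example.

```python
import re
from typing import List


def normalize_newlines(text: str) -> str:
    """Convert CRLF and lone CR markers to LF.

    CRLF is replaced first so that it is not counted as two breaks.
    """
    return text.replace("\r\n", "\n").replace("\r", "\n")


def split_paragraphs(text: str) -> List[str]:
    """Split text into paragraphs on one or more blank lines.

    Single new lines inside a paragraph are folded into spaces,
    so each paragraph comes back as one reflowable string.
    """
    normalized = normalize_newlines(text)
    blocks = re.split(r"\n\s*\n", normalized)
    return [" ".join(block.split()) for block in blocks if block.strip()]


if __name__ == "__main__":
    # Mixed Windows, old Mac and Unix markers in the same input.
    mixed = "This is all\r\none paragraph\r\rThis is also\nall one paragraph."
    for paragraph in split_paragraphs(mixed):
        print(paragraph)
```

Because everything is reduced to LF before matching, the same rule works regardless of which operating system or editor wrote the file.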
|
08-18-2010, 01:37 PM | #10 | |
Jr. - Junior Member
Posts: 586
Karma: 2000358
Join Date: Aug 2010
Location: Alabama
Device: Archos, Asus, HP, Lenovo, Nexus and Samsung tablets in 7,8 and 10"
|
Quote:
I took the test file that I attached earlier and replaced all CR+LF with a single LF, and later with a single CR. All three versions displayed the same in the Calibre Viewer (name and address lines displayed on a single line).

So here is my question: if two consecutive EOL markers (CR+LF, CR or LF) can be detected and used as a paragraph marker, why can't a single instance be detected and used as a single EOL marker?

Here is a list of allowed EOL terminators; all others are ignored.
CR: Line terminator
LF: Line terminator
CRLF: Line terminator
CRCR: Paragraph terminator
LFLF: Paragraph terminator
CRLFCRLF: Paragraph terminator

By the way, I really don't have a problem. I use a text editor to create address book, prescription and other personal info files to keep handy. Text (.txt) files display just fine on all my eReaders if I transfer them directly. Calibre will convert Wordpad (.rtf) to epub or mobi correctly if you use the shift+enter key to terminate single lines. So I'm in good shape.
Ah, the dichotomy of it all - John |
|
08-18-2010, 03:19 PM | #11 |
Grand Sorcerer
Posts: 6,216
Karma: 16534894
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
|
It seems I wasted my time writing post #8 advising you exactly how to fix your TXT file in about 10 seconds. You seem to be hung up on CR and LF which are a non-issue.
|
08-18-2010, 05:12 PM | #12 | |
Jr. - Junior Member
Posts: 586
Karma: 2000358
Join Date: Aug 2010
Location: Alabama
Device: Archos, Asus, HP, Lenovo, Nexus and Samsung tablets in 7,8 and 10"
|
Quote:
My original reply was to Kovid. After reading the documentation, this statement made me wonder: "by default calibre only groups lines in the input document into paragraphs. The default is to assume one or more blank lines are a paragraph boundary:"
Here is my quibble: why group lines only into paragraphs? Why not into paragraphs and single lines? It is certainly possible, and then Markdown would only be necessary if you wanted to add other basic formatting. By the way, showing that it could be done was what all that focus on CR and LF was about, which was the issue.
Regards - John |
|
08-18-2010, 05:31 PM | #13 |
Wizard
Posts: 4,553
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
|
I think that the key point is that in most modern ebook formats (that are HTML based) end-of-line is simply treated as white space that is equivalent to a single space. This is great for reflowing text which is what they aim to support to fit a wide variety of screen sizes and allow font zooming.
You then need some special logic for recognizing paragraph breaks, which then get their own special tag (typically <p> or an equivalent). The whole concept of end-of-line therefore tends to be stripped out. |
08-18-2010, 07:26 PM | #14 | |||
Sigil & calibre developer
Posts: 2,487
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
Quote:
Quote:
You're describing doing the following and it won't work. Quote:
Code:
I am all one
paragraph split
along multiple lines
with a single new
line character.

Code:
<p>
I am all one<br />
paragraph split<br />
along multiple lines<br />
with a single new<br />
line character.<br />
</p>

The whole point of TXT input is to take a fixed placement document and turn it into a reflowable format. calibre's conversion process actually requires this: Input -> reflowable intermediate format -> Output. By doing the above you've removed the entire idea of a reflowable paragraph that changes layout to fit the page width.

The TXT input is based on intent. Novels are the typical input, and it is designed to handle the majority of their formatting cases. There is a "Treat each line as a paragraph" option, and Markdown, to handle corner cases such as yours.

2) I'm not 100% clear, but are you implying that we should allow mixed CR/LF characters (aside from the standard Windows CRLF) within the document to denote different meanings, such as LFLF for a paragraph and CR for a new line? No. CR and LF are invisible characters. They are all treated the same because that's how the majority of text editors treat them. Many users will edit a file on Windows and then on, say, OS X. Some editors convert all new lines to the system's standard, and some insert their system's new line where indicated while still displaying correctly. Users will become very confused when viewing their converted documents and different lines behave in different ways, especially when they can't see that there is a CR instead of an LF character in the source TXT file. In this case telling a user to open their document in a hex editor is not acceptable. |
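To make the trade-off concrete, here is a small Python sketch contrasting the two conversions of the same hard-wrapped paragraph. Both function names are hypothetical and neither is calibre's code; the first keeps every source line break as `<br />` (joining lines rather than appending one after the last line), the second folds the lines into one reflowable paragraph.

```python
from html import escape


def as_line_breaks(text):
    """Keep every source line break by joining lines with <br />.

    The result is fixed-placement: it cannot rewrap to the screen width.
    """
    lines = [escape(line.strip()) for line in text.splitlines() if line.strip()]
    return "<p>" + "<br />".join(lines) + "</p>"


def as_reflowable(text):
    """Fold single new lines into spaces so the reader can rewrap the text."""
    return "<p>" + escape(" ".join(text.split())) + "</p>"


if __name__ == "__main__":
    wrapped = (
        "I am all one\n"
        "paragraph split\n"
        "along multiple lines\n"
        "with a single new\n"
        "line character."
    )
    print(as_line_breaks(wrapped))
    print(as_reflowable(wrapped))
```

The first output pins the text to the line lengths of the source file, while the second lets the reading device choose where lines wrap for its screen width and font size.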
|||
08-19-2010, 06:43 PM | #15 | |
Jr. - Junior Member
Posts: 586
Karma: 2000358
Join Date: Aug 2010
Location: Alabama
Device: Archos, Asus, HP, Lenovo, Nexus and Samsung tablets in 7,8 and 10"
|
Quote:
Regards - John |
|
|