|
|
#1 |
|
Addict
Posts: 230
Karma: 928
Join Date: Aug 2010
Device: Kindle 3
|
Trouble with .txt files
Last edited by lunixer; 08-16-2010 at 10:38 PM. |
|
|
|
|
|
#2 |
|
Creator of calibre
Posts: 22,660
Karma: 3473290
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
__________________
Get calibre Notice to all: I can not provide assistance with DRM removal, for legal reasons, so please do not contact me about it. |
|
|
|
|
Enthusiast
|
|
|
|
#3 |
|
US Navy, Retired
Posts: 7,420
Karma: 11289119
Join Date: Feb 2009
Location: North Carolina
Device: Nexus 7, Sony PRS-950, Sony PRS-505, PRS-300
|
You have to select the method you want to use to format your text file within the Text Input area during conversion. For the file you attached you should check Treat each line as a paragraph. Hover over each option to get an explanation.
__________________
-- Good Reading, Walt -- |
|
|
|
|
|
#4 |
|
Fanatic
Posts: 555
Karma: 422221
Join Date: Jul 2010
Location: UK
Device: Sony PRS-300 & Kindle PW
|
Applying Markdown to plain text source files can also be used to good effect with calibre. This can help improve the look-and-feel of the final book considerably, including automatically produced multilevel TOCs and formatting for headings/sub-headings. It takes about 15-30 mins per book. If you're interested then there are several threads in this forum that will get you started.
|
|
|
|
|
|
#5 |
|
Addict
Posts: 230
Karma: 928
Join Date: Aug 2010
Device: Kindle 3
|
Thank you very much for your suggestions. Notice: I have deleted the attachment on the first post after being informed that it violated copyright law.
|
|
|
|
|
|
#6 |
|
Junior Member
Posts: 2
Karma: 10
Join Date: Aug 2010
Device: nook
|
I'm sure the above suggestions are the proper solutions - but I've always had good results printing to pdf then converting the pdf to the desired format.
|
|
|
|
|
|
#7 | |
|
Jr. - Junior Member
Posts: 533
Karma: 2000170
Join Date: Aug 2010
Location: East Texas
Device: DXG, K3, Jetbook(+Lite), eSlick, Nook, PRS350, PB301+, PB360
|
Quote:
Looking at the attached file in Calibre Viewer illustrates the point. Regards - John Last edited by Jabby; 08-17-2010 at 09:17 AM. Reason: Forgot attachment |
|
|
|
|
|
|
#8 | |
|
Wizard
Posts: 2,134
Karma: 3391252
Join Date: Sep 2009
Location: UK
Device: Sony PRS-350/650/T1, PB360, KoboGlo, KoboAuraHD
|
Quote:
I have attached your TXT file, updated with a little Markdown, along with the EPUB it produces, in the attached zip. You will see that it has:
|
|
|
|
|
|
|
#9 | |
|
Sigil & calibre developer
Posts: 2,384
Karma: 848775
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
Quote:
Code:
This is all one paragraph This is also all one paragraph.

We use two newline markers to separate paragraphs for the following reason. If I make a list of items separated by a single newline marker, I can't tell whether it's a paragraph or a list. So I assume it's a paragraph because they're more common.

Also, only Windows uses the CR+LF sequence. Unix-based systems (both Kovid and I use Linux, GRiker uses OS X) use LF only. Apple's OS 9 and earlier used CR only to denote a new line. TXT input must support all of these variations, including any combination of the above newline markers within the same file. Because of this we cannot do something like: CR/LF denotes a new line and CR only denotes items in a list.

Internally, TXT input converts all newline markers to LF. This resolves the problem of different operating systems using different markers and lets TXT input match against a single newline character. |
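The normalization and paragraph rule described above can be sketched roughly as follows. This is a minimal illustration and not calibre's actual TXT input code; the function names `normalize_newlines` and `split_paragraphs` are hypothetical and chosen only for this example.

```python
import re


def normalize_newlines(text: str) -> str:
    """Convert CR+LF (Windows) and lone CR (classic Mac OS) to LF."""
    # Replace CR+LF first so that it does not turn into two LFs.
    return text.replace("\r\n", "\n").replace("\r", "\n")


def split_paragraphs(text: str) -> list[str]:
    """Treat one or more blank lines as a paragraph boundary.

    Single newlines inside a paragraph are joined with a space,
    so the paragraph can reflow. (Hypothetical helper, not calibre's API.)
    """
    normalized = normalize_newlines(text).strip()
    blocks = re.split(r"\n\s*\n", normalized)
    paragraphs = []
    for block in blocks:
        lines = [line.strip() for line in block.split("\n") if line.strip()]
        if lines:
            paragraphs.append(" ".join(lines))
    return paragraphs


if __name__ == "__main__":
    windows = "This is all\r\none paragraph\r\n\r\nThis is also all\r\none paragraph."
    for para in split_paragraphs(windows):
        print(para)
```

Because everything is converted to LF before any matching happens, a file with Windows, Unix, classic Mac, or mixed line endings ends up with the same paragraphs.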
|
|
|
|
|
|
#10 | |
|
Jr. - Junior Member
Posts: 533
Karma: 2000170
Join Date: Aug 2010
Location: East Texas
Device: DXG, K3, Jetbook(+Lite), eSlick, Nook, PRS350, PB301+, PB360
|
Quote:
I'll also admit I can be pretty dense at times. But I just don't get it.

I took the test file that I attached earlier and replaced all CR+LF with a single LF and later with a single CR. All three versions displayed the same using the Calibre Viewer (name and address lines displayed on a single line).

So here is my question: if two consecutive EOL markers (CR+LF, CR or LF) can be detected and used as a paragraph marker, why can't a single instance be detected and used as a single EOL marker?

Here is a list of allowed EOL terminators; all others are ignored.

CR Line terminator
LF Line terminator
CRLF Line terminator
CRCR Paragraph terminator
LFLF Paragraph terminator
CRLFCRLF Paragraph terminator

By the way, I really don't have a problem. I use a text editor to create address book, prescription and other personal info to keep handy. Text (.txt) files display just fine on all my eReaders if I transfer them directly. Calibre will convert Wordpad (.rtf) to epub or mobi correctly if you use the shift+enter key to terminate single lines. So I'm in good shape.

Ah, the dichotomy of it all - John
|
|
|
|
|
|
|
#11 |
|
Wizard
Posts: 2,134
Karma: 3391252
Join Date: Sep 2009
Location: UK
Device: Sony PRS-350/650/T1, PB360, KoboGlo, KoboAuraHD
|
It seems I wasted my time writing post #8 advising you exactly how to fix your TXT file in about 10 seconds. You seem to be hung up on CR and LF which are a non-issue.
|
|
|
|
|
|
#12 | |
|
Jr. - Junior Member
Posts: 533
Karma: 2000170
Join Date: Aug 2010
Location: East Texas
Device: DXG, K3, Jetbook(+Lite), eSlick, Nook, PRS350, PB301+, PB360
|
Quote:
My original reply was to Kovid. After reading the documentation, this statement made me wonder: "by default calibre only groups lines in the input document into paragraphs. The default is to assume one or more blank lines are a paragraph boundary:"

Here is my quibble: why group lines only into paragraphs? Why not into paragraphs and single lines? It is certainly possible, and then Markdown would only be necessary if you wanted to add other basic formatting.

By the way, showing that it could be done was what all that focus on CR and LF was about, which was the issue. Regards - John |
|
|
|
|
|
|
#13 |
|
Wizard
Posts: 3,630
Karma: 561147
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (stanza/iBooks/QuickReader)
|
I think that the key point is that in most modern ebook formats (that are HTML based) end-of-line is simply treated as white space that is equivalent to a single space. This is great for reflowing text which is what they aim to support to fit a wide variety of screen sizes and allow font zooming.
You then need some special logic for recognizing paragraph breaks, which then get their own special tag (typically <p> or an equivalent). The whole concept of end-of-line therefore tends to be stripped out. |
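As a rough illustration of the point above, the sketch below turns plain text into HTML paragraphs in which single line ends collapse to ordinary white space. It is an assumption-laden example, not how calibre or any particular reader implements this; `txt_to_html` is a hypothetical name, and the name-and-address sample text is made up.

```python
import html
import re


def txt_to_html(text: str) -> str:
    """Wrap blank-line-separated blocks in <p> tags (hypothetical helper).

    Inside a block, any run of white space, including a single
    end-of-line, collapses to one space, which is how HTML-based
    formats treat it when reflowing text.
    """
    # Normalize CR+LF and lone CR to LF before looking for blank lines.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    blocks = re.split(r"\n\s*\n", text.strip())
    out = []
    for block in blocks:
        flowed = re.sub(r"\s+", " ", block).strip()
        if flowed:
            out.append("<p>" + html.escape(flowed) + "</p>")
    return "\n".join(out)


if __name__ == "__main__":
    sample = "John Smith\n1 Main St\nSpringfield\n\nNotes: call A & B"
    print(txt_to_html(sample))
```

With this rule the three address lines in the sample end up in one `<p>` and flow together on a single line, which matches the behaviour reported earlier in the thread for name and address lines.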
|
|
|
|
|
#14 | |||
|
Sigil & calibre developer
Posts: 2,384
Karma: 848775
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
Quote:
Quote:
You're describing doing the following and it won't work. Quote:
Code:
I am all one
paragraph split
along multiple lines
with a single new
line character.

Code:
<p>
I am all one<br />
paragraph split<br />
along multiple lines<br />
with a single new<br />
line character.<br />
</p>

The whole point of TXT input is to take a fixed placement document and turn it into a reflowable format. calibre's conversion process actually requires this: Input -> reflowable intermediate format -> Output. By doing the above you've removed the entire idea of a reflowable paragraph that changes layout to fit the page width. TXT input is based on intent. Novels are the typical input, and it is designed to handle the majority of their formatting cases. There is a "Treat each line as a paragraph" option and Markdown to handle corner cases such as yours.

2) I'm not 100% clear, but if you are implying that we allow mixed CR/LF characters (aside from the standard Windows CRLF) within the document to denote different meanings, such as LFLF for paragraph and CR for new line: no. CR and LF are invisible characters. They are all treated the same because that's how the majority of text editors treat them. Many users will edit a file on Windows and then on, say, OS X. Some editors convert all new lines to the system's standard, and some insert their system's new line where indicated while still displaying correctly. Users will become very confused when viewing their converted documents if different lines behave in different ways, especially when they can't see that there is a CR instead of an LF character in the source TXT file. In this case telling a user to open their document in a hex editor is not acceptable. |
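To make the "invisible characters" point concrete, here is a small sketch that counts which newline markers a file actually contains, without resorting to a hex editor. It is only an illustration: `count_newline_styles` is a hypothetical helper and is not part of calibre.

```python
import re


def count_newline_styles(data: bytes) -> dict[str, int]:
    """Count CR+LF pairs, lone CRs and lone LFs in raw file bytes."""
    return {
        "CRLF": len(re.findall(rb"\r\n", data)),
        # A CR that is not followed by an LF.
        "CR": len(re.findall(rb"\r(?!\n)", data)),
        # An LF that is not preceded by a CR.
        "LF": len(re.findall(rb"(?<!\r)\n", data)),
    }


if __name__ == "__main__":
    mixed = b"edited on Windows\r\nedited on OS 9\redited on Linux\n"
    print(count_newline_styles(mixed))
```

When checking a real file, read it in binary mode (`open(path, "rb")`), since Python's text mode translates line endings by default and would hide the differences.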
|||
|
|
|
|
|
#15 | |
|
Jr. - Junior Member
Posts: 533
Karma: 2000170
Join Date: Aug 2010
Location: East Texas
Device: DXG, K3, Jetbook(+Lite), eSlick, Nook, PRS350, PB301+, PB360
|
Quote:
Regards - John |
|
|
|
|