11-23-2010, 06:28 AM | #1 |
Enthusiast
Posts: 34
Karma: 10
Join Date: Nov 2010
Location: Netherlands
Device: Kindle 3
|
convert txt to epub, no new line formatting
Hello,
I use calibre's ebook-convert.exe to convert my txt files to epub for use on a BeBook, but the conversion seems to ignore 'paragraphs'. If the txt is like this:
-------
First sentence [new line]
Second sentence ....[new line]
-------
I want it in epub also to be:
-------
First sentence.
Second sentence.
-------
But I get:
-------
First sentence. Second sentence.
-------
I thought I was clever and tried to modify the txt to look like this:
-------
First sentence [new line]
[new line]
Second sentence ....[new line]
-------
But now the epub output will also have that empty line:
-------
First sentence.

Second sentence.
-------
What is the solution to this? Thanks for any help. |
11-23-2010, 06:50 AM | #2 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
You need to enable the "treat each line as a paragraph" option.
|
11-23-2010, 06:51 AM | #3 |
Wizard
Posts: 4,552
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
|
There is an option in the txt conversion settings that specifies whether each line should be treated as a paragraph, or whether paragraph separators should be blank lines. Is that what you are looking for?
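The difference between the two modes can be pictured with a small Python sketch. This is not calibre's actual implementation: `split_paragraphs` and `to_html` are hypothetical helpers made up for illustration, and the sketch only assumes that each detected paragraph ends up wrapped in its own `<p>` element.

```python
import html


def split_paragraphs(text, single_line=False):
    """Split plain text into paragraphs (illustrative only, not calibre code).

    single_line=False: blank lines separate paragraphs, and the lines
    inside one paragraph are joined with spaces.
    single_line=True: every non-empty line is its own paragraph.
    """
    lines = text.splitlines()
    if single_line:
        return [line.strip() for line in lines if line.strip()]

    paragraphs, current = [], []
    for line in lines:
        if line.strip():
            current.append(line.strip())
        elif current:
            # A blank line closes the paragraph collected so far.
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


def to_html(paragraphs):
    """Wrap each paragraph in a <p> element, one element per line."""
    return "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)


sample = "First sentence.\nSecond sentence.\n"
print(to_html(split_paragraphs(sample)))                    # blank-line mode
print(to_html(split_paragraphs(sample, single_line=True)))  # one line = one paragraph
```

With blank-line separation, the two lines of the sample collapse into a single paragraph, which matches the "one block of text" symptom described in the first post. In single-line mode each line becomes its own paragraph.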
|
11-23-2010, 06:58 AM | #4 |
Enthusiast
Posts: 34
Karma: 10
Join Date: Nov 2010
Location: Netherlands
Device: Kindle 3
|
I found the option, it's --single-line-paras, but it still does not work: there are extra newlines.
Input .txt:
------
"Go to the Dragon Reborn," Lan called to him. "Or to your queen's army. Either of them will take you."[newline]
"And you? You will ride all the way to the Seven Towers without supplies?"[newline]
"I'll forage."[newline]
------
Output .epub:
------
"Go to the Dragon Reborn," Lan called to him. "Or to your queen's army. Either of them will take you."

"And you? You will ride all the way to the Seven Towers without supplies?"

"I'll forage."
------
And if I convert without the option, it's just one single block of text:
------
"Go to the Dragon Reborn," Lan called to him. "Or to your queen's army. Either of them will take you." "And you? You will ride all the way to the Seven Towers without supplies?" "I'll forage."
------
Last edited by JeanC; 11-23-2010 at 07:23 AM. |
11-23-2010, 07:13 AM | #5 |
Wizard
Posts: 4,552
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
|
Yes - shows it is a small world.
I have found that my normal username is not one commonly used elsewhere by others.
Last edited by itimpi; 11-23-2010 at 07:16 AM. |
11-23-2010, 07:14 AM | #6 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
User manual is here:
http://www.calibre-ebook.com/user_manual/
Also, ebook-convert help is context sensitive: after you pass each argument, the help changes based on what you're trying to do. So when it sees that you're trying to convert a .txt file to .epub, it shows you the txt/epub options. For example, here is what I get when I try to convert txt to epub:
Code:
PC:Contact (2369) userid$ ebook-convert Contact\ -\ Carl\ Sagan.txt Contact\ -\ Carl\ Sagan.epub -h
Usage: ebook-convert input_file output_file [options]

Convert an ebook from one format to another.

input_file is the input and output_file is the output. Both must be specified as the first two arguments to the command.

The output ebook format is guessed from the file extension of output_file. output_file can also be of the special format .EXT where EXT is the output file extension. In this case, the name of the output file is derived the name of the input file. Note that the filenames must not start with a hyphen. Finally, if output_file has no extension, then it is treated as a directory and an "open ebook" (OEB) consisting of HTML files is written to that directory. These files are the files that would normally have been passed to the output plugin.

After specifying the input and output file you can customize the conversion by specifying various options. The available options depend on the input and output file types. To get help on them specify the input and output file and then use the -h option.

For full documentation of the conversion system see http://calibre-ebook.com/user_manual/conversion.html

Whenever you pass arguments to ebook-convert that have spaces in them, enclose the arguments in quotation marks.

Options:
  --version
      show program's version number and exit
  -h, --help
      show this help message and exit
  --input-profile=INPUT_PROFILE
      Specify the input profile. The input profile gives the conversion system information on how to interpret various information in the input document. For example resolution dependent lengths (i.e. lengths in pixels). Choices are:cybookg3, cybook_opus, default, hanlinv3, hanlinv5, illiad, irexdr1000, irexdr800, kindle, msreader, mobipocket, nook, sony, sony300, sony900
  --output-profile=OUTPUT_PROFILE
      Specify the output profile. The output profile tells the conversion system how to optimize the created document for the specified device. In some cases, an output profile is required to produce documents that will work on a device. For example EPUB on the SONY reader. Choices are:cybookg3, cybook_opus, default, hanlinv3, hanlinv5, illiad, ipad, irexdr1000, irexdr800, jetbook5, kindle, kindle_dx, kobo, msreader, mobipocket, nook, nook_color, bambook, sony, sony300, sony900, sony-landscape, tablet
  --list-recipes
      List builtin recipes

INPUT OPTIONS:
Options to control the processing of the input txt file
  --markdown
      Run the text input through the markdown pre-processor. To learn more about markdown see http://daringfireball.net/projects/markdown/
  --markdown-disable-toc
      Do not insert a Table of Contents into the output text.
  --input-encoding=INPUT_ENCODING
      Specify the character encoding of the input document. If set this option will override any encoding declared by the document itself. Particularly useful for documents that do not declare an encoding or that have erroneous encoding declarations.
  --print-formatted-paras
      Normally calibre treats blank lines as paragraph markers. With this option it will assume that every line starting with an indent (either a tab or 2+ spaces) represents a paragraph. Paragraphs end when the next line that starts with an indent is reached.
  --preserve-spaces
      Normally extra spaces are condensed into a single space. With this option all spaces will be displayed.
  --single-line-paras
      Normally calibre treats blank lines as paragraph markers. With this option it will assume that every line represents a paragraph instead.

OUTPUT OPTIONS:
Options to control the processing of the output epub
  --no-svg-cover
      Do not use SVG for the book cover. Use this option if your EPUB is going to be used on a device that does not support SVG, like the iPhone or the JetBook Lite. Without this option, such devices will display the cover as a blank page.
  --preserve-cover-aspect-ratio
      When using an SVG cover, this option will cause the cover to scale to cover the available screen area, but still preserve its aspect ratio (ratio of width to height). That means there may be white borders at the sides or top and bottom of the image, but the image will never be distorted. Without this option the image may be slightly distorted, but there will be no borders.
  --no-default-epub-cover
      Normally, if the input file has no cover and you don't specify one, a default cover is generated with the title, authors, etc. This option disables the generation of this cover.
  --extract-to=EXTRACT_TO
      Extract the contents of the generated EPUB file to the specified directory. The contents of the directory are first deleted, so be careful.
  --pretty-print
      If specified, the output plugin will try to create output that is as human readable as possible. May not have any effect for some output plugins.
  --flow-size=FLOW_SIZE
      Split all HTML files larger than this size (in KB). This is necessary as most EPUB readers cannot handle large file sizes. The default of 260KB is the size required for Adobe Digital Editions.
  --dont-split-on-page-breaks
      Turn off splitting at page breaks. Normally, input files are automatically split at every page break into two files. This gives an output ebook that can be parsed faster and with less resources. However, splitting is slow and if your source file contains a very large number of page breaks, you should turn off splitting on page breaks.

LOOK AND FEEL:
Options to control the look and feel of the output
  --base-font-size=BASE_FONT_SIZE
      The base font size in pts. All font sizes in the produced book will be rescaled based on this size. By choosing a larger size you can make the fonts in the output bigger and vice versa. By default, the base font size is chosen based on the output profile you chose.
  --disable-font-rescaling
      Disable all rescaling of font sizes.
  --font-size-mapping=FONT_SIZE_MAPPING
      Mapping from CSS font names to font sizes in pts. An example setting is 12,12,14,16,18,20,22,24. These are the mappings for the sizes xx-small to xx-large, with the final size being for huge fonts. The font rescaling algorithm uses these sizes to intelligently rescale fonts. The default is to use a mapping based on the output profile you chose.
  --line-height=LINE_HEIGHT
      The line height in pts. Controls spacing between consecutive lines of text. By default no line height manipulation is performed.
  --linearize-tables
      Some badly designed documents use tables to control the layout of text on the page. When converted these documents often have text that runs off the page and other artifacts. This option will extract the content from the tables and present it in a linear fashion.
  --extra-css=EXTRA_CSS
      Either the path to a CSS stylesheet or raw CSS. This CSS will be appended to the style rules from the source file, so it can be used to override those rules.
  --smarten-punctuation
      Convert plain quotes, dashes and ellipsis to their typographically correct equivalents. For details, see http://daringfireball.net/projects/smartypants
  --margin-top=MARGIN_TOP
      Set the top margin in pts. Default is 5.0. Note: 72 pts equals 1 inch
  --margin-left=MARGIN_LEFT
      Set the left margin in pts. Default is 5.0. Note: 72 pts equals 1 inch
  --margin-right=MARGIN_RIGHT
      Set the right margin in pts. Default is 5.0. Note: 72 pts equals 1 inch
  --margin-bottom=MARGIN_BOTTOM
      Set the bottom margin in pts. Default is 5.0. Note: 72 pts equals 1 inch
  --change-justification=CHANGE_JUSTIFICATION
      Change text justification. A value of "left" converts all justified text in the source to left aligned (i.e. unjustified) text. A value of "justify" converts all unjustified text to justified. A value of "original" (the default) does not change justification in the source file. Note that only some output formats support justification.
  --insert-blank-line
      Insert a blank line between paragraphs. Will not work if the source file does not use paragraphs (<p> or <div> tags).
  --remove-paragraph-spacing
      Remove spacing between paragraphs. Also sets an indent on paragraphs of 1.5em. Spacing removal will not work if the source file does not use paragraphs (<p> or <div> tags).
  --remove-paragraph-spacing-indent-size=REMOVE_PARAGRAPH_SPACING_INDENT_SIZE
      When calibre removes inter paragraph spacing, it automatically sets a paragraph indent, to ensure that paragraphs can be easily distinguished. This option controls the width of that indent.
  --asciiize
      Transliterate unicode characters to an ASCII representation. Use with care because this will replace unicode characters with ASCII. For instance it will replace "Михаил Горбачёв" with "Mikhail Gorbachiov". Also, note that in cases where there are multiple representations of a character (characters shared by Chinese and Japanese for instance) the representation used by the largest number of people will be used (Chinese in the previous example).
  --remove-header
      Use a regular expression to try and remove the header.
  --header-regex=HEADER_REGEX
      The regular expression to use to remove the header.
  --remove-footer
      Use a regular expression to try and remove the footer.
  --footer-regex=FOOTER_REGEX
      The regular expression to use to remove the footer.

STRUCTURE DETECTION:
Control auto-detection of document structure.
  --chapter=CHAPTER
      An XPath expression to detect chapter titles. The default is to consider <h1> or <h2> tags that contain the words "chapter","book","section" or "part" as chapter titles as well as any tags that have class="chapter". The expression used must evaluate to a list of elements. To disable chapter detection, use the expression "/". See the XPath Tutorial in the calibre User Manual for further help on using this feature.
  --chapter-mark=CHAPTER_MARK
      Specify how to mark detected chapters. A value of "pagebreak" will insert page breaks before chapters. A value of "rule" will insert a line before chapters. A value of "none" will disable chapter marking and a value of "both" will use both page breaks and lines to mark chapters.
  --prefer-metadata-cover
      Use the cover detected from the source file in preference to the specified cover.
  --remove-first-image
      Remove the first image from the input ebook. Useful if the first image in the source file is a cover and you are specifying an external cover.
  --insert-metadata
      Insert the book metadata at the start of the book. This is useful if your ebook reader does not support displaying/searching metadata directly.
  --page-breaks-before=PAGE_BREAKS_BEFORE
      An XPath expression. Page breaks are inserted before the specified elements.
  --preprocess-html
      Attempt to detect and correct hard line breaks and other problems in the source file. This may make things worse, so use with care.
  --html-unwrap-factor=HTML_UNWRAP_FACTOR
      Scale used to determine the length at which a line should be unwrapped if preprocess is enabled. Valid values are a decimal between 0 and 1. The default is 0.40, just below the median line length. This will unwrap typical books with hard line breaks, but should be reduced if the line length is variable.

TABLE OF CONTENTS:
Control the automatic generation of a Table of Contents. By default, if the source file has a Table of Contents, it will be used in preference to the automatically generated one.
  --level1-toc=LEVEL1_TOC
      XPath expression that specifies all tags that should be added to the Table of Contents at level one. If this is specified, it takes precedence over other forms of auto-detection.
  --level2-toc=LEVEL2_TOC
      XPath expression that specifies all tags that should be added to the Table of Contents at level two. Each entry is added under the previous level one entry.
  --level3-toc=LEVEL3_TOC
      XPath expression that specifies all tags that should be added to the Table of Contents at level three. Each entry is added under the previous level two entry.
  --toc-threshold=TOC_THRESHOLD
      If fewer than this number of chapters is detected, then links are added to the Table of Contents. Default: 6
  --max-toc-links=MAX_TOC_LINKS
      Maximum number of links to insert into the TOC. Set to 0 to disable. Default is: 50. Links are only added to the TOC if less than the threshold number of chapters were detected.
  --no-chapters-in-toc
      Don't add auto-detected chapters to the Table of Contents.
  --use-auto-toc
      Normally, if the source file already has a Table of Contents, it is used in preference to the auto-generated one. With this option, the auto-generated one is always used.
  --toc-filter=TOC_FILTER
      Remove entries from the Table of Contents whose titles match the specified regular expression. Matching entries and all their children are removed.

METADATA:
Options to set metadata in the output
  --title=TITLE
      Set the title.
  --authors=AUTHORS
      Set the authors. Multiple authors should be separated by ampersands.
  --title-sort=TITLE_SORT
      The version of the title to be used for sorting.
  --author-sort=AUTHOR_SORT
      String to be used when sorting by author.
  --cover=COVER
      Set the cover to the specified file or URL
  --comments=COMMENTS
      Set the ebook description.
  --publisher=PUBLISHER
      Set the ebook publisher.
  --series=SERIES
      Set the series this ebook belongs to.
  --series-index=SERIES_INDEX
      Set the index of the book in this series.
  --rating=RATING
      Set the rating. Should be a number between 1 and 5.
  --isbn=ISBN
      Set the ISBN of the book.
  --tags=TAGS
      Set the tags for the book. Should be a comma separated list.
  --book-producer=BOOK_PRODUCER
      Set the book producer.
  --language=LANGUAGE
      Set the language.
  --pubdate=PUBDATE
      Set the publication date.
  --timestamp=TIMESTAMP
      Set the book timestamp (used by the date column in calibre).

DEBUG:
Options to help with debugging the conversion
  -v, --verbose
      Level of verbosity. Specify multiple times for greater verbosity.
  -d DEBUG_PIPELINE, --debug-pipeline=DEBUG_PIPELINE
      Save the output from different stages of the conversion pipeline to the specified directory. Useful if you are unsure at which stage of the conversion process a bug is occurring. |
11-23-2010, 07:32 AM | #7 | |
Enthusiast
Posts: 34
Karma: 10
Join Date: Nov 2010
Location: Netherlands
Device: Kindle 3
|
Quote:
Thanks to both of you, but as you can read above, I still have problems. ebook-convert makes paragraphs with <p></p>, but my BeBook shows empty lines between them. How can I remedy that? |
|
11-23-2010, 07:45 AM | #8 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
You can try adding the --remove-paragraph-spacing option. Alternatively, edit the book in Sigil, or use the Tweak EPUB feature in calibre, to adjust the top and bottom paragraph margins in the CSS to what you like.
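As a rough sketch of how the options discussed so far combine on one command line, the Python snippet below assembles the argument list. The flag names are the ones shown in the help output quoted earlier in this thread and may differ in other calibre versions; `build_convert_command` and the file names `book.txt` / `book.epub` are hypothetical, chosen only for illustration.

```python
import shlex


def build_convert_command(src, dst, single_line_paras=True,
                          remove_paragraph_spacing=True):
    """Return an ebook-convert argument list for a txt-to-epub conversion.

    Only flags that appear in the help output quoted in this thread are used.
    """
    cmd = ["ebook-convert", src, dst]
    if single_line_paras:
        # Treat every line of the txt file as its own paragraph.
        cmd.append("--single-line-paras")
    if remove_paragraph_spacing:
        # Drop the gap between paragraphs; calibre adds an indent instead.
        cmd.append("--remove-paragraph-spacing")
    return cmd


cmd = build_convert_command("book.txt", "book.epub")
print(shlex.join(cmd))  # the command as it could be typed in a shell

# To actually run the conversion (requires calibre's ebook-convert on PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids having to quote file names that contain spaces.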
|
11-23-2010, 07:55 AM | #9 |
Grand Sorcerer
Posts: 6,212
Karma: 16534894
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
|
You could also try adding the --extra-css option e.g.
Code:
--extra-css="p {margin-top:0; margin-bottom:0; text-indent:1em;}" |
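The help text quoted earlier says --extra-css accepts either raw CSS or the path to a stylesheet. A minimal sketch of the file-based variant follows, which sidesteps shell quoting of the braces and spaces. The helper name `extra_css_args` is hypothetical, and the CSS is the rule suggested in the post above.

```python
import tempfile

# The rule suggested in the post above, stored as a stylesheet.
CSS = "p {margin-top:0; margin-bottom:0; text-indent:1em;}\n"


def extra_css_args(css_text):
    """Write css_text to a temporary .css file.

    Returns (path, args), where args is the list of extra command-line
    arguments to append to an ebook-convert invocation.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".css", delete=False,
                                     encoding="utf-8") as handle:
        handle.write(css_text)
    return handle.name, [f"--extra-css={handle.name}"]


path, args = extra_css_args(CSS)
print(args)  # append these to the ebook-convert argument list
```

The temporary file is not deleted automatically, so it can be reused across conversions or removed by hand afterwards.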
11-23-2010, 07:57 AM | #10 |
Enthusiast
Posts: 34
Karma: 10
Join Date: Nov 2010
Location: Netherlands
Device: Kindle 3
|
Yes, that did the trick, together with --remove-paragraph-spacing-indent-size.
I also noticed that you can specify some CSS, so you could probably remove the space between <p></p> that way too, but the above also works, so I am happy now! Thanks again. |
11-23-2010, 07:58 AM | #11 |
Enthusiast
Posts: 34
Karma: 10
Join Date: Nov 2010
Location: Netherlands
Device: Kindle 3
|
|
|