Old 11-23-2010, 06:28 AM   #1
JeanC
Enthusiast
Posts: 34
Karma: 10
Join Date: Nov 2010
Location: Netherlands
Device: Kindle 3
convert txt to epub, no new line formatting

Hello,

I use the calibre ebook-convert.exe to convert my txt files to epub for use on a bebook.

But the conversion seems to ignore 'paragraphs'.

If the txt is like this:

-------
First sentence [new line]
Second sentence ....[new line]
-------

I want it in epub also to be:

-------
First sentence.
Second sentence.
-------

But I get:
-------
First sentence. Second sentence.
-------

I thought I was clever and tried to modify the txt to look like this:

-------
First sentence [new line]
[new line]
Second sentence ....[new line]
-------

But now the epub output also has that empty line:

-------
First sentence.

Second sentence.
-------

What is the solution to this??
Thanks for any help.
Old 11-23-2010, 06:50 AM   #2
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
You need to enable the "treat each line as a paragraph" option.
Old 11-23-2010, 06:51 AM   #3
itimpi
Wizard
Posts: 4,552
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
There is an option in the txt conversion settings that specifies whether each line should be treated as a paragraph, or whether paragraph separators should be blank lines. Is that what you are looking for?
Old 11-23-2010, 06:58 AM   #4
JeanC
Enthusiast
I found the option, it's --single-line-paras, but it still does not work: there are extra blank lines between the paragraphs.

Input .txt:

------
"Go to the Dragon Reborn," Lan called to him. "Or to your queen's army. Either of them will take you."[newline]
"And you? You will ride all the way to the Seven Towers without supplies?"[newline]
"I'll forage."[newline]
------

Output .epub:

------
"Go to the Dragon Reborn," Lan called to him. "Or to your queen's army. Either of them will take you."

"And you? You will ride all the way to the Seven Towers without supplies?"

"I'll forage."
------

And if I do convert without the option it's just one single block of text:

------
"Go to the Dragon Reborn," Lan called to him. "Or to your queen's army. Either of them will take you." "And you? You will ride all the way to the Seven Towers without supplies?" "I'll forage."
------
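
For anyone puzzled by the three results above, here is a rough sketch of the two paragraph-detection modes, imitated with awk. It is not calibre's own code; the function names and the sample file are made up for illustration. In the default mode only a blank line ends a paragraph, so lines without blank lines between them are joined into one block. In the one-line-per-paragraph mode every non-empty line gets its own <p>, and any gap you then see between paragraphs comes from the reader's default paragraph margins, not from extra lines in the text.

```shell
#!/bin/sh
# Illustration only: imitates the two paragraph-detection modes discussed
# in this thread. This is NOT calibre's implementation.

# Default mode: blank lines separate paragraphs, so consecutive lines
# are joined into a single paragraph.
paras_blank_line() {
    awk 'BEGIN { RS = "" } { gsub(/\n/, " "); print "<p>" $0 "</p>" }' "$1"
}

# One-line-per-paragraph mode (what --single-line-paras asks for):
# every non-empty line becomes its own paragraph.
paras_single_line() {
    awk 'NF { print "<p>" $0 "</p>" }' "$1"
}

# Hypothetical sample: one sentence per line, no blank lines.
sample=$(mktemp)
printf '%s\n' 'First sentence.' 'Second sentence.' > "$sample"

echo "blank-line mode:"
paras_blank_line "$sample"
echo "single-line mode:"
paras_single_line "$sample"
```

The first call prints one <p> holding both sentences; the second prints one <p> per line.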


Last edited by JeanC; 11-23-2010 at 07:23 AM.
Old 11-23-2010, 07:13 AM   #5
itimpi
Wizard
Quote:
Originally Posted by JeanC View Post
Ps. Itimpi from newsbin?
Yes - shows it is a small world.

I have found that my normal username is not one commonly used elsewhere by others.

Last edited by itimpi; 11-23-2010 at 07:16 AM.
Old 11-23-2010, 07:14 AM   #6
ldolse
Wizard
User manual is here:
http://www.calibre-ebook.com/user_manual/

Also ebook-convert help is context sensitive - after you pass each argument the help will change based on what you're trying to do. So when it sees that you're trying to convert a .txt file to .epub it will show you the txt/epub options.

For example, here is when I try to convert txt to epub:
Code:
PC:Contact (2369) userid$ ebook-convert Contact\ -\ Carl\ Sagan.txt Contact\ -\ Carl\ Sagan.epub -h
Usage: ebook-convert input_file output_file [options]

Convert an ebook from one format to another.

input_file is the input and output_file is the output. Both must be specified as the first two arguments to the command.

The output ebook format is guessed from the file extension of output_file. output_file can also be of the special format .EXT where EXT is the output file extension. In this case, the name of the output file is derived the name of the input file. Note that the filenames must not start with a hyphen. Finally, if output_file has no extension, then it is treated as a directory and an "open ebook" (OEB) consisting of HTML files is written to that directory. These files are the files that would normally have been passed to the output plugin.

After specifying the input and output file you can customize the conversion by specifying various options. The available options depend on the input and output file types. To get help on them specify the input and output file and then use the -h option.

For full documentation of the conversion system see
http://calibre-ebook.com/user_manual/conversion.html

Whenever you pass arguments to ebook-convert that have spaces in them, enclose the arguments in quotation marks.

Options:
  --version             show program's version number and exit

  -h, --help            show this help message and exit

  --input-profile=INPUT_PROFILE
                        Specify the input profile. The input profile gives the
                        conversion system information on how to interpret
                        various information in the input document. For example
                        resolution dependent lengths (i.e. lengths in pixels).
                        Choices are:cybookg3, cybook_opus, default, hanlinv3,
                        hanlinv5, illiad, irexdr1000, irexdr800, kindle,
                        msreader, mobipocket, nook, sony, sony300, sony900

  --output-profile=OUTPUT_PROFILE
                        Specify the output profile. The output profile tells
                        the conversion system how to optimize the created
                        document for the specified device. In some cases, an
                        output profile is required to produce documents that
                        will work on a device. For example EPUB on the SONY
                        reader. Choices are:cybookg3, cybook_opus, default,
                        hanlinv3, hanlinv5, illiad, ipad, irexdr1000,
                        irexdr800, jetbook5, kindle, kindle_dx, kobo,
                        msreader, mobipocket, nook, nook_color, bambook, sony,
                        sony300, sony900, sony-landscape, tablet

  --list-recipes        List builtin recipes


INPUT OPTIONS:
    Options to control the processing of the input txt file

    --markdown          Run the text input through the markdown pre-processor.
                        To learn more about markdown see
                        http://daringfireball.net/projects/markdown/

    --markdown-disable-toc
                        Do not insert a Table of Contents into the output
                        text.

    --input-encoding=INPUT_ENCODING
                        Specify the character encoding of the input document.
                        If set this option will override any encoding declared
                        by the document itself. Particularly useful for
                        documents that do not declare an encoding or that have
                        erroneous encoding declarations.

    --print-formatted-paras
                        Normally calibre treats blank lines as paragraph
                        markers. With this option it will assume that every
                        line starting with an indent (either a tab or 2+
                        spaces) represents a paragraph. Paragraphs end when
                        the next line that starts with an indent is reached.

    --preserve-spaces   Normally extra spaces are condensed into a single
                        space. With this option all spaces will be displayed.

    --single-line-paras
                        Normally calibre treats blank lines as paragraph
                        markers. With this option it will assume that every
                        line represents a paragraph instead.


OUTPUT OPTIONS:
    Options to control the processing of the output epub

    --no-svg-cover      Do not use SVG for the book cover. Use this option if
                        your EPUB is going to be used on a device that does
                        not support SVG, like the iPhone or the JetBook Lite.
                        Without this option, such devices will display the
                        cover as a blank page.

    --preserve-cover-aspect-ratio
                        When using an SVG cover, this option will cause the
                        cover to scale to cover the available screen area, but
                        still preserve its aspect ratio (ratio of width to
                        height). That means there may be white borders at the
                        sides or top and bottom of the image, but the image
                        will never be distorted. Without this option the image
                        may be slightly distorted, but there will be no
                        borders.

    --no-default-epub-cover
                        Normally, if the input file has no cover and you don't
                        specify one, a default cover is generated with the
                        title, authors, etc. This option disables the
                        generation of this cover.

    --extract-to=EXTRACT_TO
                        Extract the contents of the generated EPUB file to the
                        specified directory. The contents of the directory are
                        first deleted, so be careful.

    --pretty-print      If specified, the output plugin will try to create
                        output that is as human readable as possible. May not
                        have any effect for some output plugins.

    --flow-size=FLOW_SIZE
                        Split all HTML files larger than this size (in KB).
                        This is necessary as most EPUB readers cannot handle
                        large file sizes. The default of 260KB is the size
                        required for Adobe Digital Editions.

    --dont-split-on-page-breaks
                        Turn off splitting at page breaks. Normally, input
                        files are automatically split at every page break into
                        two files. This gives an output ebook that can be
                        parsed faster and with less resources. However,
                        splitting is slow and if your source file contains a
                        very large number of page breaks, you should turn off
                        splitting on page breaks.


LOOK AND FEEL:
    Options to control the look and feel of the output

    --base-font-size=BASE_FONT_SIZE
                        The base font size in pts. All font sizes in the
                        produced book will be rescaled based on this size. By
                        choosing a larger size you can make the fonts in the
                        output bigger and vice versa. By default, the base
                        font size is chosen based on the output profile you
                        chose.

    --disable-font-rescaling
                        Disable all rescaling of font sizes.

    --font-size-mapping=FONT_SIZE_MAPPING
                        Mapping from CSS font names to font sizes in pts. An
                        example setting is 12,12,14,16,18,20,22,24. These are
                        the mappings for the sizes xx-small to xx-large, with
                        the final size being for huge fonts. The font
                        rescaling algorithm uses these sizes to intelligently
                        rescale fonts. The default is to use a mapping based
                        on the output profile you chose.

    --line-height=LINE_HEIGHT
                        The line height in pts. Controls spacing between
                        consecutive lines of text. By default no line height
                        manipulation is performed.

    --linearize-tables  Some badly designed documents use tables to control
                        the layout of text on the page. When converted these
                        documents often have text that runs off the page and
                        other artifacts. This option will extract the content
                        from the tables and present it in a linear fashion.

    --extra-css=EXTRA_CSS
                        Either the path to a CSS stylesheet or raw CSS. This
                        CSS will be appended to the style rules from the
                        source file, so it can be used to override those
                        rules.

    --smarten-punctuation
                        Convert plain quotes, dashes and ellipsis to their
                        typographically correct equivalents. For details, see
                        http://daringfireball.net/projects/smartypants

    --margin-top=MARGIN_TOP
                        Set the top margin in pts. Default is 5.0. Note: 72
                        pts equals 1 inch

    --margin-left=MARGIN_LEFT
                        Set the left margin in pts. Default is 5.0. Note: 72
                        pts equals 1 inch

    --margin-right=MARGIN_RIGHT
                        Set the right margin in pts. Default is 5.0. Note: 72
                        pts equals 1 inch

    --margin-bottom=MARGIN_BOTTOM
                        Set the bottom margin in pts. Default is 5.0. Note: 72
                        pts equals 1 inch

    --change-justification=CHANGE_JUSTIFICATION
                        Change text justification. A value of "left" converts
                        all justified text in the source to left aligned (i.e.
                        unjustified) text. A value of "justify" converts all
                        unjustified text to justified. A value of "original"
                        (the default) does not change justification in the
                        source file. Note that only some output formats
                        support justification.

    --insert-blank-line
                        Insert a blank line between paragraphs. Will not work
                        if the source file does not use paragraphs (<p> or
                        <div> tags).

    --remove-paragraph-spacing
                        Remove spacing between paragraphs. Also sets an indent
                        on paragraphs of 1.5em. Spacing removal will not work
                        if the source file does not use paragraphs (<p> or
                        <div> tags).

    --remove-paragraph-spacing-indent-size=REMOVE_PARAGRAPH_SPACING_INDENT_SIZE
                        When calibre removes inter paragraph spacing, it
                        automatically sets a paragraph indent, to ensure that
                        paragraphs can be easily distinguished. This option
                        controls the width of that indent.

    --asciiize          Transliterate unicode characters to an ASCII
                        representation. Use with care because this will
                        replace unicode characters with ASCII. For instance it
                        will replace "Михаил Горбачёв" with "Mikhail
                        Gorbachiov". Also, note that in cases where there are
                        multiple representations of a character (characters
                        shared by Chinese and Japanese for instance) the
                        representation used by the largest number of people
                        will be used (Chinese in the previous example).

    --remove-header     Use a regular expression to try and remove the header.

    --header-regex=HEADER_REGEX
                        The regular expression to use to remove the header.

    --remove-footer     Use a regular expression to try and remove the footer.

    --footer-regex=FOOTER_REGEX
                        The regular expression to use to remove the footer.


STRUCTURE DETECTION:
    Control auto-detection of document structure.

    --chapter=CHAPTER   An XPath expression to detect chapter titles. The
                        default is to consider <h1> or <h2> tags that contain
                        the words "chapter","book","section" or "part" as
                        chapter titles as well as any tags that have
                        class="chapter". The expression used must evaluate to
                        a list of elements. To disable chapter detection, use
                        the expression "/". See the XPath Tutorial in the
                        calibre User Manual for further help on using this
                        feature.

    --chapter-mark=CHAPTER_MARK
                        Specify how to mark detected chapters. A value of
                        "pagebreak" will insert page breaks before chapters. A
                        value of "rule" will insert a line before chapters. A
                        value of "none" will disable chapter marking and a
                        value of "both" will use both page breaks and lines to
                        mark chapters.

    --prefer-metadata-cover
                        Use the cover detected from the source file in
                        preference to the specified cover.

    --remove-first-image
                        Remove the first image from the input ebook. Useful if
                        the first image in the source file is a cover and you
                        are specifying an external cover.

    --insert-metadata   Insert the book metadata at the start of the book.
                        This is useful if your ebook reader does not support
                        displaying/searching metadata directly.

    --page-breaks-before=PAGE_BREAKS_BEFORE
                        An XPath expression. Page breaks are inserted before
                        the specified elements.

    --preprocess-html   Attempt to detect and correct hard line breaks and
                        other problems in the source file. This may make
                        things worse, so use with care.

    --html-unwrap-factor=HTML_UNWRAP_FACTOR
                        Scale used to determine the length at which a line
                        should be unwrapped if preprocess is enabled. Valid
                        values are a decimal between 0 and 1. The default is
                        0.40, just below the median line length. This will
                        unwrap typical books  with hard line breaks, but
                        should be reduced if the line length is variable.


TABLE OF CONTENTS:
    Control the automatic generation of a Table of Contents. By default,
    if the source file has a Table of Contents, it will be used in
    preference to the automatically generated one.

    --level1-toc=LEVEL1_TOC
                        XPath expression that specifies all tags that should
                        be added to the Table of Contents at level one. If
                        this is specified, it takes precedence over other
                        forms of auto-detection.

    --level2-toc=LEVEL2_TOC
                        XPath expression that specifies all tags that should
                        be added to the Table of Contents at level two. Each
                        entry is added under the previous level one entry.

    --level3-toc=LEVEL3_TOC
                        XPath expression that specifies all tags that should
                        be added to the Table of Contents at level three. Each
                        entry is added under the previous level two entry.

    --toc-threshold=TOC_THRESHOLD
                        If fewer than this number of chapters is detected,
                        then links are added to the Table of Contents.
                        Default: 6

    --max-toc-links=MAX_TOC_LINKS
                        Maximum number of links to insert into the TOC. Set to
                        0 to disable. Default is: 50. Links are only added to
                        the TOC if less than the threshold number of chapters
                        were detected.

    --no-chapters-in-toc
                        Don't add auto-detected chapters to the Table of
                        Contents.

    --use-auto-toc      Normally, if the source file already has a Table of
                        Contents, it is used in preference to the auto-
                        generated one. With this option, the auto-generated
                        one is always used.

    --toc-filter=TOC_FILTER
                        Remove entries from the Table of Contents whose titles
                        match the specified regular expression. Matching
                        entries and all their children are removed.


METADATA:
    Options to set metadata in the output

    --title=TITLE       Set the title.

    --authors=AUTHORS   Set the authors. Multiple authors should be separated
                        by ampersands.

    --title-sort=TITLE_SORT
                        The version of the title to be used for sorting.

    --author-sort=AUTHOR_SORT
                        String to be used when sorting by author.

    --cover=COVER       Set the cover to the specified file or URL

    --comments=COMMENTS
                        Set the ebook description.

    --publisher=PUBLISHER
                        Set the ebook publisher.

    --series=SERIES     Set the series this ebook belongs to.

    --series-index=SERIES_INDEX
                        Set the index of the book in this series.

    --rating=RATING     Set the rating. Should be a number between 1 and 5.

    --isbn=ISBN         Set the ISBN of the book.

    --tags=TAGS         Set the tags for the book. Should be a comma separated
                        list.

    --book-producer=BOOK_PRODUCER
                        Set the book producer.

    --language=LANGUAGE
                        Set the language.

    --pubdate=PUBDATE   Set the publication date.

    --timestamp=TIMESTAMP
                        Set the book timestamp (used by the date column in
                        calibre).


DEBUG:
    Options to help with debugging the conversion

    -v, --verbose       Level of verbosity. Specify multiple times for greater
                        verbosity.

    -d DEBUG_PIPELINE, --debug-pipeline=DEBUG_PIPELINE
                        Save the output from different stages of the
                        conversion pipeline to the specified directory. Useful
                        if you are unsure at which stage of the conversion
                        process a bug is occurring.
Old 11-23-2010, 07:32 AM   #7
JeanC
Enthusiast
Quote:
Originally Posted by itimpi View Post
Yes - shows it is a small world.

I have found that my normal username is not one commonly used elsewhere by others.
You bet.

Thanks to both of you but as you can read above I still have problems.

ebook-convert makes paragraphs with <p></p> but my bebook makes empty lines between them. How can I remedy that?
Old 11-23-2010, 07:45 AM   #8
ldolse
Wizard
You can try adding the --remove-paragraph-spacing option. Alternatively, edit the book in Sigil, or use Tweak EPUB in calibre, and adjust the top and bottom paragraph margins in the CSS to whatever you like.
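
A possible command line for the first suggestion, as a sketch only. The flags are the ones listed in the help output earlier in this thread; the wrapper function convert_no_gaps, the EBOOK_CONVERT variable and the fallback indent value of 1.0 are invented for the example, and the indent is assumed to be in em because the help text mentions a 1.5em default. Option names may differ in other calibre releases, so it is worth checking the -h output of your own version first.

```shell
#!/bin/sh
# Sketch only. EBOOK_CONVERT, convert_no_gaps and the 1.0 fallback are
# hypothetical names/values for this example; the flags are taken from
# the help output quoted in this thread.
EBOOK_CONVERT=${EBOOK_CONVERT:-ebook-convert}

convert_no_gaps() {
    # $1 = input .txt, $2 = output .epub, $3 = indent width (optional,
    # assumed to be in em)
    indent=${3:-1.0}
    "$EBOOK_CONVERT" "$1" "$2" \
        --single-line-paras \
        --remove-paragraph-spacing \
        --remove-paragraph-spacing-indent-size="$indent"
}

# Usage (file names are placeholders):
#   convert_no_gaps book.txt book.epub 0.5
```

Keeping the converter name in a variable makes it easy to point at ebook-convert.exe on Windows, or at echo for a dry run that only prints the arguments.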
Old 11-23-2010, 07:55 AM   #9
jackie_w
Grand Sorcerer
Posts: 6,171
Karma: 16228536
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
You could also try adding the --extra-css option, e.g.:
Code:
--extra-css="p {margin-top:0; margin-bottom:0; text-indent:1em;}"
Edit: but not at the same time as the --remove-paragraph-spacing option
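
Put together as a full command, the CSS route might look like the sketch below. The wrapper function convert_txt and the EBOOK_CONVERT variable are hypothetical; the flags and the CSS rule are the ones quoted in this thread. --remove-paragraph-spacing is deliberately left out, per the note above.

```shell
#!/bin/sh
# Sketch only. EBOOK_CONVERT and convert_txt are hypothetical names for
# this example; the flags and the CSS rule are the ones quoted in this
# thread. --remove-paragraph-spacing is intentionally not used here.
EBOOK_CONVERT=${EBOOK_CONVERT:-ebook-convert}

convert_txt() {
    # $1 = input .txt, $2 = output .epub
    "$EBOOK_CONVERT" "$1" "$2" \
        --single-line-paras \
        --extra-css="p {margin-top:0; margin-bottom:0; text-indent:1em;}"
}

# Usage (file names are placeholders):
#   convert_txt book.txt book.epub
```

The double quotes around the --extra-css value matter, because the rule contains spaces and semicolons that the shell would otherwise split or interpret.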
Old 11-23-2010, 07:57 AM   #10
JeanC
Enthusiast
Yes, that did the trick, together with --remove-paragraph-spacing-indent-size.

I also noticed that you can specify some CSS, so you could probably remove the space between the <p></p> paragraphs that way too, but the option above works as well, so I am happy now!

Thanks again.

Quote:
Originally Posted by ldolse View Post
You can try adding the --remove-paragraph-spacing option. Alternatively edit the doc in Sigil or using the Tweak epub in Calibre to adjust the top and bottom margins in the css to what you like.
Old 11-23-2010, 07:58 AM   #11
JeanC
Enthusiast
You were just ahead of me!

Thanks, fine forum.

Quote:
Originally Posted by jackie_w View Post
You could also try adding the --extra-css option e.g.
Code:
--extra-css="p {margin-top:0; margin-bottom:0; text-indent:1em;}"
Edit: but not at the same time as the --remove-paragraph-spacing option