Old 09-25-2013, 06:57 AM   #1
Haim_gds
Member
Posts: 18
Karma: 10
Join Date: Jun 2013
Device: [M92] Onyx boox note
Text file loses its previous formatting (paragraphs and blank lines) when converting

Text files lose their previous formatting (paragraphs and blank lines) when converting large text files to any of the major formats.

I'm sorry if this question was already answered, but I've searched the forum and didn't find that the problem had come up before.
First of all, I have a Boox M92 reader with the "coolreader" and "FBReader" reading applications, and I'm using calibre version 1.4.0. Most of my books are English plain-text files that were formatted for old text-mode reading: 80 characters per line, 25 lines per screen, each paragraph starting with 4 space characters, and no blank line between paragraphs.

When I use calibre to convert text files to EPUB format, I use the following options:
In "Look & Feel" I set only the text justification, to justify text.
In "EPUB Output" I tick 'Do not split on page breaks', 'No default cover', 'No SVG cover' and 'Flatten EPUB file structure'. I also set 'Split files larger than:' to a large number (3260).
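
For anyone who prefers the command line, the GUI choices listed above can be sketched as an `ebook-convert` call built from Python. This is only a sketch: it assumes the option names below correspond to those checkboxes in the installed calibre version (`ebook-convert book.txt book.epub --help` lists what is actually available), and the file names are placeholders.

```python
import shlex


def build_epub_command(txt_path, epub_path, flow_size_kb=3260):
    """Return an ebook-convert argument list mirroring the options listed above.

    The option names are assumed to match the GUI checkboxes; verify them
    against `ebook-convert --help` before relying on this.
    """
    return [
        "ebook-convert", txt_path, epub_path,
        "--change-justification", "justify",   # Look & Feel: justify text
        "--dont-split-on-page-breaks",         # EPUB Output checkboxes
        "--no-default-epub-cover",
        "--no-svg-cover",
        "--epub-flatten",
        "--flow-size", str(flow_size_kb),      # 'Split files larger than' (KB)
    ]


if __name__ == "__main__":
    # Print the command so it can be inspected or pasted into a shell.
    print(shlex.join(build_epub_command("book.txt", "book.epub")))
```

Building the list separately from running it makes it easy to compare the exact options used for a small file and for a large one.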

With small text files it works well: the paragraphs in the output start with some space characters and it is easy to read. But for large text files (more than 100 KB) the output contains only the text characters, without the newlines and spaces. This happens both in the M92 reading applications and in calibre.

I tried different options and that didn't fix the problem. I also tried different output formats (MOBI, HTML, FB2) and none of them was any better. When I read a large EPUB file (one that did not come from the conversion), it looks normal and well formatted.

Is there a way to fix this problem?
Old 09-25-2013, 08:02 AM   #2
DoctorOhh
US Navy, Retired
Posts: 9,865
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
Quote:
Originally Posted by Haim_gds
Most of my books are English plain-text files that were formatted for old text-mode reading: 80 characters per line, 25 lines per screen, each paragraph starting with 4 space characters, and no blank line between paragraphs.

When I use calibre to convert text files to EPUB format, I use the following options:
In "Look & Feel" I set only the text justification, to justify text.
In "EPUB Output" I tick 'Do not split on page breaks', 'No default cover', 'No SVG cover' and 'Flatten EPUB file structure'. I also set 'Split files larger than:' to a large number (3260).

You don't mention the most important item to adjust, and that is the Text Input. It should be successful with the settings on automatic, but try setting the paragraph style to Print.

From the calibre manual:
Quote:
Paragraph Style: Print
Assumes that every paragraph starts with an indent (either a tab or 2+ spaces). Paragraphs end when the next line that starts with an indent is reached:
Good Luck.
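
To make the quoted rule concrete, here is a minimal Python sketch of "Print"-style paragraph detection. It is not calibre's actual code; the function name `split_print_paragraphs` and the choice to skip blank lines are assumptions made for illustration.

```python
import re

# A line opens a new paragraph if it starts with a tab or 2+ spaces.
INDENT = re.compile(r"^(\t| {2,})")


def split_print_paragraphs(text):
    """Group hard-wrapped lines into paragraphs using leading indents."""
    paragraphs, current = [], []
    for line in text.splitlines():      # splitlines() handles CRLF and LF
        if not line.strip():
            continue                    # simplification: ignore blank lines
        if INDENT.match(line) and current:
            # An indented line ends the paragraph collected so far.
            paragraphs.append(" ".join(current))
            current = []
        current.append(line.strip())
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


if __name__ == "__main__":
    sample = "    First line of one\r\nparagraph.\r\n    Second paragraph\r\nhere.\r\n"
    for p in split_print_paragraphs(sample):
        print(repr(p))
```

Running a problem file through a check like this shows whether its indents are regular enough for the Print setting to find paragraph starts.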
Old 09-25-2013, 12:50 PM   #3
Haim_gds
Thanks, but it didn't help.
By default the Text Input is set to automatic. I changed it to 'Print' but that didn't solve the problem. I also tried reformatting the input text file with a blank line after each paragraph, and that didn't help either.
Old 09-25-2013, 05:07 PM   #4
Sabardeyn
Guru
Posts: 644
Karma: 1242364
Join Date: May 2009
Location: The Right Coast
Device: PC (Calibre), Nexus 7 2013 (Moon+ Pro), HTC HD2/Leo (Freda)
Quote:
Originally Posted by Haim_gds
With small text files it works well: the paragraphs in the output start with some space characters and it is easy to read. But for large text files (more than 100 KB) the output contains only the text characters, without the newlines and spaces. This happens both in the M92 reading applications and in calibre.

Perhaps I have not understood what you want, but the emphasized portion of your comment above is what draws my attention.

E-books are considered to be reflowable text, changing placement based on font size, margins, and other factors. Therefore newlines are not needed except at the end of a paragraph. In an optimal layout there should not be any kind of positional or stylistic data included within the book's content. So no spaces or tabs, no blank lines, no heading sizes or spacing, etc. should be included.

These items should be in a CSS stylesheet which would have to be added to the ebook (since text files don't have any markup associated with them). You could generate a single CSS stylesheet and re-use it for all your books. You would have to add the necessary markup to the body (text) of the ebook, which shouldn't be much more complex than writing a forum message with bbcode included since your needs are minimal.
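
As a rough illustration of keeping layout in a stylesheet and only minimal markup in the text, the Python sketch below wraps plain paragraphs in `<p>` tags and links one shared CSS file. The CSS rules, the file name `stylesheet.css` and the function name are illustrative assumptions, not something calibre or Sigil prescribes.

```python
from html import escape

# One reusable stylesheet: indentation and justification live here,
# not as spaces or blank lines inside the text.
SHARED_CSS = "p { margin: 0; text-indent: 1.5em; text-align: justify; }\n"


def paragraphs_to_xhtml(paragraphs, title, css_href="stylesheet.css"):
    """Wrap plain paragraphs in <p> tags and link the shared stylesheet."""
    body = "\n".join(f"<p>{escape(p)}</p>" for p in paragraphs)
    return (
        '<?xml version="1.0" encoding="utf-8"?>\n'
        '<html xmlns="http://www.w3.org/1999/xhtml">\n'
        f"<head><title>{escape(title)}</title>\n"
        f'<link rel="stylesheet" type="text/css" href="{escape(css_href)}"/></head>\n'
        f"<body>\n{body}\n</body>\n</html>\n"
    )


if __name__ == "__main__":
    print(paragraphs_to_xhtml(["First paragraph.", "Second paragraph."], "Sample"))
```

The same `SHARED_CSS` text can be saved once and reused for every book, which is the point of separating it from the content.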

You could try using Sigil to edit the files to produce what you want. Sigil can be found here on MobileRead.
Old 09-26-2013, 05:45 AM   #5
Haim_gds
Sorry, English isn't my native tongue.

Quote:
E-books are considered to be reflowable text, changing placement based on font size, margins, and other factors. Therefore newlines are not needed except at the end of a paragraph. In an optimal layout there should not be any kind of positional or stylistic data included within the book's content. So no spaces or tabs, no blank lines, no heading sizes or spacing, etc. should be included.
It does make sense, and it was what I thought at the start. But when I look inside the EPUB file (I'm using Total Commander, but it is also possible by renaming the file to .zip and using an unzip application), the content of 'index.html' contains the structure we talked about.
In the small file, each paragraph is left-justified with a blank line after it, but the large file has only continuous text, without spaces or new lines.
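
The same inspection can be done without renaming anything, since an EPUB is a ZIP archive. The Python sketch below counts paragraph tags and newlines in each HTML file inside the archive, so a good small book and a broken large one can be compared. The function name and the choice of statistics are assumptions for illustration.

```python
import re
import zipfile


def summarize_epub_html(epub_path):
    """Map each HTML file in the EPUB to (characters, <p> tags, newlines)."""
    summary = {}
    with zipfile.ZipFile(epub_path) as epub:
        for name in epub.namelist():
            if name.lower().endswith((".html", ".xhtml", ".htm")):
                data = epub.read(name).decode("utf-8", errors="replace")
                # Match <p> and <p class=...> but not <pre> or <param>.
                p_tags = len(re.findall(r"<p[\s>]", data))
                summary[name] = (len(data), p_tags, data.count("\n"))
    return summary


if __name__ == "__main__":
    import sys
    for path in sys.argv[1:]:
        for name, stats in summarize_epub_html(path).items():
            print(path, name, stats)
```

A large file that reports very few `<p>` tags relative to its size would confirm that paragraph detection failed during conversion rather than in the reader application.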
Quote:
You could generate a single CSS stylesheet and re-use it for all your books.
I didn't understand. The CSS stylesheet exists inside the EPUB file; it is created by calibre according to the conversion input parameters. Should I edit each EPUB file?

Is calibre capable of dealing with all kinds of text formats (DOS, Unix, ...), or do I need to convert the text file to some particular format? Currently, the EOL symbol in my files is 0x0d, 0x0a.
Old 09-26-2013, 08:06 AM   #6
Sabardeyn
Yes, CSS stylesheets are created by calibre during conversion, but the results are not always optimal. Calibre can pick up on extremely minor variations in the material and create extra, unnecessary styles whose differences no user could see with the naked eye.

If a book does not look correct, you can edit the EPUB file by hand using Sigil. Sigil allows you to edit the text of the book as well as the CSS stylesheet, table of contents and other portions of the book. Removing some of the style variants and correcting the text accordingly may improve the speed at which the book is displayed, as well as correct layout errors. But you do need to correct both: the text, so it doesn't call the altered styles, and the stylesheet, to correct or remove styles that are changed or not needed. Once you have a cleaned-up CSS stylesheet, you could save it and re-use it for each book you convert, so you would only need to change the text portion of the ebooks you convert after that. (Re-using the CSS stylesheet means you don't have to do as much work for each book that needs to be corrected.)
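
If the re-use step gets tedious in Sigil, it could also be scripted. The Python sketch below writes a copy of an EPUB with its stylesheet replaced by a saved one. It assumes the stylesheet inside the book is called `stylesheet.css`, which should be checked for each book; the function name is hypothetical, and the text still has to use class names that the saved stylesheet defines.

```python
import zipfile


def replace_stylesheet(src_epub, dst_epub, css_text, css_name="stylesheet.css"):
    """Copy src_epub to dst_epub, swapping in css_text for css_name.

    Returns how many entries were replaced. The original file is not touched.
    """
    replaced = 0
    with zipfile.ZipFile(src_epub) as src, zipfile.ZipFile(dst_epub, "w") as dst:
        # Copying entries in their original order, with their original ZipInfo,
        # keeps 'mimetype' first and uncompressed, as EPUB readers expect.
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename.split("/")[-1] == css_name:
                data = css_text.encode("utf-8")
                replaced += 1
            dst.writestr(item, data)
    return replaced


if __name__ == "__main__":
    import sys
    if len(sys.argv) == 4:
        # Usage: script.py book.epub fixed_copy.epub clean.css
        with open(sys.argv[3], encoding="utf-8") as css_file:
            print(replace_stylesheet(sys.argv[1], sys.argv[2], css_file.read()))
```

Because it always writes to a new file, this follows the same advice as below about working only on copies.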

Sigil is fairly easy to use, but because of the things it can do, it is also easy to make an error. I would suggest only using copies of your converted ebooks.

I don't know whether calibre can work with all types of text files or whether you would have to convert to a specific one. Perhaps that is the reason the spaces between words are lost on conversion. You could try converting one of your text files to a different text format and see if that fixes the problem. 0x0d, 0x0a (CRLF) is the standard EOL for DOS/Windows text files; Unix text files use 0x0a (LF) alone.
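
One quick way to test the line-ending idea is to count the EOL styles in a file and write a copy converted to Unix line endings, then convert both copies in calibre and compare. The Python sketch below does that on raw bytes; the function names are hypothetical, and whether line endings are really the cause here is unconfirmed.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF, bare LF and bare CR line endings in raw file bytes."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        "LF": data.count(b"\n") - crlf,   # LF not preceded by CR
        "CR": data.count(b"\r") - crlf,   # CR not followed by LF
    }


def to_unix_line_endings(data: bytes) -> bytes:
    """Convert CRLF and bare CR line endings to LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


if __name__ == "__main__":
    import sys
    if len(sys.argv) == 3:
        # Usage: script.py original.txt unix_copy.txt
        with open(sys.argv[1], "rb") as src:
            raw = src.read()
        print(count_line_endings(raw))
        with open(sys.argv[2], "wb") as dst:
            dst.write(to_unix_line_endings(raw))
```

A file that reports a mix of styles, for example mostly CRLF with some bare CR, would be worth normalizing before trying the conversion again.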


Lastly, your English is very good. Sometimes my reading and comprehending skills could use improvement though.