#1 |
Zealot
Posts: 115
Karma: 10
Join Date: Jan 2011
Device: none
|
Text formatting (newbie questions)
I've been trying to make an EPUB out of an RTF file using three methods: directly in calibre, with Writer2EPUB, and by converting RTF to TXT in calibre and then importing into Sigil. In all cases the result was the same: a lot of broken lines, and blank lines between paragraphs instead of indents, which makes the book very ugly and difficult to read. Removing all the </p> <p> pairs manually, even using the Find & Replace feature (searching for "<p>[a-z]" with Find Next), takes forever, and being a newbie (my second day with Sigil) I don't know where to start with removing the blank lines. I would also like to be able to decrease the font size in the resulting ebook. My desktop ADE switches to a three-column display if I do it there, and I like an ebook to be as much like the paper one as possible. I apologize in advance if my questions are too dumb, but help would be very much appreciated. |
#2 |
Addict
Posts: 302
Karma: 185297
Join Date: Sep 2009
Location: Ankh Morpork
Device: calibre
|
It sounds as if the file may have hardcoded line feeds in it. Luca (the creator of Writer2EPUB) also has a Writer plugin called text cleaner, which can be helpful in removing blank lines. Give that a try, then run the file through Writer2EPUB. The result should be a lot better, and Sigil can do the tidying up afterwards.
By setting a smaller font in Writer you should get a smaller font after running it through Writer2EPUB, or you could modify the CSS with Sigil to get whatever size you want. Experiment with "font-size: xem", where x is a multiplier of the inherited font size: less than 1 is smaller, greater than 1 is larger. |
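For anyone who prefers to script the cleanup described in the first post, here is a minimal Python sketch. It assumes the chapter markup uses bare `<p>` tags without attributes, that blank lines show up as paragraphs containing only whitespace, `&nbsp;` or `<br/>`, and that a paragraph starting with a lowercase letter is really the continuation of a broken line. The function name `clean_paragraphs` and the sample text are made up for illustration; this is not what text cleaner or Sigil does internally.

```python
import re

# Paragraphs holding only whitespace, &nbsp; or <br/>: the unwanted "blank lines".
EMPTY_PARA = re.compile(r"<p>(?:\s|&nbsp;|<br\s*/?>)*</p>\s*")

# A paragraph break followed by a lowercase letter: assumed to be a broken line.
BROKEN_BREAK = re.compile(r"</p>\s*<p>(?=[a-z])")


def clean_paragraphs(xhtml: str) -> str:
    """Drop empty paragraphs, then join paragraphs that were split mid-sentence."""
    without_blanks = EMPTY_PARA.sub("", xhtml)
    return BROKEN_BREAK.sub(" ", without_blanks)


# Hypothetical sample resembling the broken output described above.
sample = (
    "<p>It was a dark and</p>\n"
    "<p>stormy night.</p>\n"
    "<p>&nbsp;</p>\n"
    "<p>The rain fell.</p>"
)
print(clean_paragraphs(sample))
```

The lowercase test will miss breaks that fall before a capitalised word (a name, for example), so the result still needs a read-through. The same two patterns may also work in a regex-capable Find & Replace, though that is untested here.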
#3 |
Zealot
Posts: 115
Karma: 10
Join Date: Jan 2011
Device: none
|
Thanks a lot, going to try right now.
|
#4 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
What is the quality of your RTF? Usually it will be garbage in, garbage out.
There are also other RTF to HTML converters out there. You can import the resulting HTML directly in Sigil. |
#5 |
Well trained by Cats
Posts: 30,891
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Once you solve the other issues, font size is usually a piece of cake (except when: see @Toxaris post). In the style sheet, find the class used in the <body> tag and change its font-size value. Test the results by looking at many different types of pages in the book. If only a few areas have issues, spot-fix the font-size in their classes. If not, you are going to have to work backwards through the nested 'box model' to find the culprit. IMHO, 300+ lines of CSS for a simple book = a GIGO issue. |
#6 | |
Zealot
Posts: 115
Karma: 10
Join Date: Jan 2011
Device: none
|
The text cleaner did an admirable job of connecting most of the broken lines, but at the same time it replaced all the apostrophes, quotes and dashes with question marks, which in turn disappeared completely when imported into Sigil. Is there a way around this problem? If at least the ?'s stayed in Sigil, I could try to fix them with Find & Replace. The other question, about the font size, really is as simple as you said. Thanks, and if you have any further suggestions I would very much welcome them. |
|
#7 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
I still suspect your source file is not good. Convert your file first to HTML and try to repair it there.
|
#8 | |
Well trained by Cats
Posts: 30,891
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
|
|
#9 |
Addict
Posts: 302
Karma: 185297
Join Date: Sep 2009
Location: Ankh Morpork
Device: calibre
|
I have come across this character-encoding problem a couple of times in the last year. As far as I can tell, it is not caused by text cleaner but by something in the original file. It seems that some Microsoft software does not apply the correct encoding, and the error propagates when other programs get confused and try to re-encode the text. The main problem seems to be LibreOffice (or OpenOffice) trying to correct errors in Word. Without being biased, it seems that many US Windows installs are not internationally aware.
Have you tried the other suggestion of using Calibre to convert to HTML first? This should solve the encoding issue, and if the line feeds are still there, text cleaner should just remove them. |
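To illustrate how curly quotes and dashes can end up as question marks, here is a small Python sketch. It assumes the source text was stored in Windows-1252, which is a guess and not something confirmed for the file in this thread; the sample string is invented. It shows one common mechanism only: forcing such text into ASCII replaces every character ASCII lacks with "?", while converting to UTF-8 keeps them.

```python
# Typographic characters that exist in Windows-1252 but not in ASCII:
# curly double quotes, a curly apostrophe and an en dash.
text = "\u201cDon\u2019t\u201d \u2013 she said"

# Bytes as a Windows-1252 file would store them (assumed source encoding).
raw = text.encode("cp1252")

# Squeezing the decoded text into ASCII turns each such character into '?'.
lossy = raw.decode("cp1252").encode("ascii", errors="replace").decode("ascii")

# Decoding with the right codec and re-encoding as UTF-8 loses nothing.
restored = raw.decode("cp1252").encode("utf-8").decode("utf-8")

print(lossy)
print(restored == text)
```

If the real file uses a different encoding, the codec name would have to change accordingly. Once the characters have been replaced by "?", the information is gone, so the fix has to happen before that step.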
#10 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
Calibre also has a line unwrap feature under heuristics, and its non-ASCII handling for RTF was improved a while back as well.
If neither Calibre nor Writer2EPUB can handle the non-ASCII characters correctly, then use OpenOffice or Word to convert to HTML. Then use the HTML source in Calibre. |
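As a rough picture of what unwrapping hard line feeds involves, here is a simplified Python sketch for plain text. It is not Calibre's actual heuristic: the rule used here (a line ends a paragraph only if it is blank, or if it is clearly shorter than the longest line and ends in sentence punctuation) and the names `unwrap_lines` and `short_ratio` are assumptions made for illustration.

```python
def unwrap_lines(text: str, short_ratio: float = 0.7) -> str:
    """Join hard-wrapped lines into paragraphs, one paragraph per output line."""
    lines = [line.strip() for line in text.splitlines()]
    longest = max((len(line) for line in lines), default=0)
    paragraphs = []
    current = []

    def flush():
        # Close the paragraph being built, if any.
        if current:
            paragraphs.append(" ".join(current))
            current.clear()

    for line in lines:
        if not line:
            flush()  # a blank line always ends the paragraph
            continue
        current.append(line)
        ends_sentence = line[-1] in ".!?\"'"
        if ends_sentence and len(line) < longest * short_ratio:
            flush()  # short line with closing punctuation: real paragraph end
    flush()
    return "\n".join(paragraphs)


# Hypothetical hard-wrapped sample.
sample = (
    "The quick brown fox jumps over\n"
    "the lazy dog and keeps running\n"
    "home.\n"
    "A new paragraph starts here and\n"
    "it also wraps."
)
print(unwrap_lines(sample))
```

A paragraph whose last line happens to fill the full width will be merged with the next one under this rule, so the threshold needs tuning per book.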