#1 |
Enthusiast
Posts: 34
Karma: 4132
Join Date: Jun 2011
Device: Bookeen Cybook Opus
|
Epub -> txt with italic/bold characters
I'm converting a few epubs to txt with the intention of viewing them on a device that doesn't support any other format.
Problem is, Calibre removes all italics and bold, which makes certain books a lot harder to understand. I'm aware the txt format doesn't support anything other than plain text, so I've taken to converting to html instead and removing all the html tags with the search-replace feature until I'm left with a file containing only the text and the bold and italic tags. At that point I swap <i> and </i> for the slash character and <b> and </b> for two asterisks. The net result: "He said what?! How rude!" (with "what" in bold and "rude" in italics) would be converted to: "He said **what**?! How /rude/!".
This works, but the procedure is painfully slow and prone to mistakes that can cause screwups in parts of the text I can't immediately see. I'm looking for some form of automatic conversion that'll do all this from an epub without having to disassemble the html.
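The manual search-replace described above can be approximated with a short script. This is only a minimal sketch, assuming the HTML marks emphasis with plain <i>/<em> and <b>/<strong> tags; the names `EmphasisToText` and `html_to_marked_text` are made up for illustration, and the slash and double-asterisk markers simply follow the convention in the post.

```python
import re
from html.parser import HTMLParser

# Markers follow the post: slashes for italic, two asterisks for bold.
MARKERS = {"i": "/", "em": "/", "b": "**", "strong": "**"}
# Closing one of these tags ends a paragraph-like block.
BLOCK_TAGS = {"p", "div", "h1", "h2", "h3", "h4", "h5", "h6", "li"}
# Text inside these tags is not part of the book's body.
SKIP_TAGS = {"head", "script", "style"}


class EmphasisToText(HTMLParser):
    """Collect the text of an HTML document, keeping emphasis as markers."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
        elif self.skip_depth:
            return
        elif tag in MARKERS:
            self.parts.append(MARKERS[tag])
        elif tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth:
            return
        elif tag in MARKERS:
            self.parts.append(MARKERS[tag])
        elif tag in BLOCK_TAGS:
            self.parts.append("\n\n")

    def handle_data(self, data):
        if not self.skip_depth:
            self.parts.append(data)


def html_to_marked_text(markup):
    """Return plain text with /italic/ and **bold** markers."""
    parser = EmphasisToText()
    parser.feed(markup)
    parser.close()
    text = "".join(parser.parts)
    # Collapse runs of blank lines left behind by nested block tags.
    return re.sub(r"\n{3,}", "\n\n", text).strip()


print(html_to_marked_text("<p>He said <b>what</b>?! How <i>rude</i>!</p>"))
```

Emphasis applied through CSS classes on <span> elements would not be caught by this sketch, since it only looks at tag names.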
#2 |
Well trained by Cats
Posts: 30,909
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
I have a feeling that you have the wrong extension set.
There is no Bold or Italics in a 'Text' file, just alphanumeric characters and punctuation. What you might have is a Rich Text (RTF) document. (If you see bold in the editor, you might change the file type to RTF and see how calibre fares.) Your second example is a marked-up text file; notice that it still follows the rule above. Calibre interprets the marks as it proceeds.
Last edited by theducks; 07-03-2013 at 08:57 AM. Reason: I was confused
#3 | |||
Grand Sorcerer
Posts: 13,309
Karma: 78876004
Join Date: Nov 2007
Location: Toronto
Device: Libra H2O, Libra Colour
|
Quote:
Markdown is described at http://en.wikipedia.org/wiki/Markdown and Textile at http://en.wikipedia.org/wiki/Textile_(markup_language).
As an example, I tried converting a test document from Kovid for showing off the DOCX conversion. Converting to Markdown gave me
Quote:
Quote:
|
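For anyone who prefers the command line, the Markdown or Textile conversion can probably be scripted. This is a sketch under assumptions: the post does not show which setting was used, so it assumes calibre's TXT output formatting option, which `ebook-convert` exposes as `--txt-output-formatting` (values `plain`, `markdown`, `textile`). The helper names `build_txt_command` and `convert_to_txt` are hypothetical.

```python
import subprocess

# Values accepted by calibre's TXT output formatting option.
ALLOWED_FORMATTING = ("plain", "markdown", "textile")


def build_txt_command(epub_path, txt_path, formatting="markdown"):
    """Build the ebook-convert argument list for a TXT conversion."""
    if formatting not in ALLOWED_FORMATTING:
        raise ValueError(f"unsupported formatting: {formatting!r}")
    return [
        "ebook-convert",
        epub_path,
        txt_path,
        "--txt-output-formatting",
        formatting,
    ]


def convert_to_txt(epub_path, txt_path, formatting="markdown"):
    """Run the conversion; requires calibre's ebook-convert on PATH."""
    subprocess.run(build_txt_command(epub_path, txt_path, formatting), check=True)


# Example (not executed here): convert_to_txt("book.epub", "book.txt")
print(" ".join(build_txt_command("book.epub", "book.txt")))
```

With Markdown output, emphasis should come out as asterisk-delimited text that stays readable on a plain-text device.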
#4 | |
US Navy, Retired
Posts: 9,889
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Kindle PaperWhite SE 11th Gen
|
The OP is not the one confused.
He has an ePub that he is converting to txt to use on a device that only handles text. He still wants some indicators in the text so he can tell bold and italics since this emphasis is often quite necessary to understand what the author is conveying. Quote:
#5 | |
Well trained by Cats
Posts: 30,909
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
#6 |
Enthusiast
Posts: 34
Karma: 4132
Join Date: Jun 2011
Device: Bookeen Cybook Opus
|
|
#7 |
Padawan Learner
Posts: 243
Karma: 1085815
Join Date: May 2009
Location: www.OutlawGalaxy.com, Foothills of NY's Adirondack mountains
Device: My PC...using Puppy Linux (FBReader, Calibre, Kindle Cloud Reader,
|
Another way to do this (without using Calibre) might be to convert Epub to HTML, then go into the HTML code and convert <b>bold</b> to *bold* and <i>italic</i> to _italic_ (the tags might also be <strong> and <em> instead of <b> and <i>).
Then go back into the regular browser, highlight all of the text, and copy and paste it into a text editor. Don't know if this helps, but I think it would be a pretty foolproof way to do things.
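The tag swap and the copy-and-paste step could also be done in one pass with regular expressions. This is a rough sketch only: regexes are fragile on real-world HTML, the function name `tags_to_plain_markers` is invented for illustration, and it assumes <script> and <style> blocks have already been removed from the markup.

```python
import html
import re

# Opening or closing bold tags, with or without attributes.
BOLD = re.compile(r"</?(?:b|strong)\b[^>]*>", re.IGNORECASE)
# Opening or closing italic tags, with or without attributes.
ITALIC = re.compile(r"</?(?:i|em)\b[^>]*>", re.IGNORECASE)
# Tags that should become a line break in the text output.
BLOCK_END = re.compile(r"</(?:p|div|h[1-6]|li)\s*>|<br\b[^>]*>", re.IGNORECASE)
# Whatever markup is left after the substitutions above.
ANY_TAG = re.compile(r"<[^>]+>")


def tags_to_plain_markers(markup):
    """Turn bold into *bold* and italic into _italic_, then drop other tags."""
    text = BOLD.sub("*", markup)
    text = ITALIC.sub("_", text)
    text = BLOCK_END.sub("\n", text)
    text = ANY_TAG.sub("", text)
    # Decode entities such as &amp; that a browser would have rendered.
    return html.unescape(text).strip()


print(tags_to_plain_markers("<p>He said <b>what</b>?! How <i>rude</i>!</p>"))
```

The `\b` word boundary keeps the patterns from matching tags such as <body>, <br> or <img>, which start with the same letters.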
#8 | |
Enthusiast
Posts: 34
Karma: 4132
Join Date: Jun 2011
Device: Bookeen Cybook Opus
|
Quote:
I've manually converted one book like that because it was slightly clearer and I eventually got it right, but the one I wanted to convert next was essentially unfeasible. I dunno, it might be possible to do something with regular expressions, but honestly life's too short.
|
#9 |
Sigil & calibre developer
Posts: 2,487
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
HTMLZ output has an option to convert styles to tags (How to handle CSS = tag). A lot of styling will be lost using this option because only a very small subset of styles can be represented as tags. However, in this sort of situation it's not that big a deal.
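A possible way to script this, as a sketch under assumptions: on the command line the "How to handle CSS" setting is assumed to correspond to `ebook-convert`'s `--htmlz-css-type tag`, and the HTMLZ file is assumed to be a zip archive whose main document is named `index.html`. The helper names `build_htmlz_command` and `read_htmlz_markup` are hypothetical.

```python
import zipfile


def build_htmlz_command(epub_path, htmlz_path):
    """Build the ebook-convert argument list that turns styles into tags."""
    return ["ebook-convert", epub_path, htmlz_path, "--htmlz-css-type", "tag"]


def read_htmlz_markup(htmlz_path, member="index.html"):
    """Return the HTML stored inside an HTMLZ archive.

    The member name is an assumption about how calibre lays out the
    archive; list the archive contents if it differs.
    """
    with zipfile.ZipFile(htmlz_path) as archive:
        return archive.read(member).decode("utf-8")


# Example (not executed here), with calibre installed:
#   subprocess.run(build_htmlz_command("book.epub", "book.htmlz"), check=True)
#   markup = read_htmlz_markup("book.htmlz")
print(" ".join(build_htmlz_command("book.epub", "book.htmlz")))
```

Once the styles are expressed as tags, the remaining markup can be swapped for text markers without having to interpret any CSS.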
|