04-05-2009, 03:58 PM | #1 |
Enthusiast
Posts: 37
Karma: 10
Join Date: Mar 2009
Device: sony 505
|
Convert Unicode TXT to EPUB, failed
I am trying to convert some txt files into EPUB for the Sony 505.
The txt files are Unicode, saved by Notepad. They start with 0xFF 0xFE, which I believe is standard. In Calibre, I specify UTF-8 as the source encoding (which is correct). As soon as the conversion starts, it complains about an unknown character 0xFF at position 0 and stops. Any tricks? |
04-05-2009, 05:28 PM | #2 |
Sigil & calibre developer
Posts: 2,488
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
I don't have any ideas for the currently released version. However, there is a lot of work going on with the input and output in the pluginize branch. One of the major changes is unicode support all through the conversion pipeline. I've been working on txt input and output for pluginize. Is it possible for you to attach or send me the txt files you are having problems with so I can make sure they work with the new pipeline?
|
04-05-2009, 05:56 PM | #3 |
Wizard
Posts: 3,442
Karma: 300001
Join Date: Sep 2006
Location: Belgium
Device: PRS-500/505/700, Kindle, Cybook Gen3, Words Gear
|
FF FE is the UTF-16 (little-endian) BOM, not UTF-8. Notepad's "Unicode" option saves as UTF-16, so make sure you choose "UTF-8" in Notepad when saving, not "Unicode".
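For anyone who wants to check a file before converting it, here is a minimal Python sketch of BOM sniffing. It is not what calibre does internally; the names `sniff_encoding` and `decode_text` are made up for illustration, and falling back to UTF-8 when no BOM is found is an assumption.

```python
import codecs

# BOM signatures, longest first, so the UTF-32 LE BOM (FF FE 00 00)
# is not mistaken for the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_encoding(data: bytes, default: str = "utf-8") -> str:
    """Return a codec name based on the BOM, or `default` if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return default


def decode_text(data: bytes) -> str:
    """Decode bytes using the sniffed codec; these codecs consume the BOM."""
    return data.decode(sniff_encoding(data))
```

A file that Notepad saved as "Unicode" starts with FF FE, so `sniff_encoding` reports `utf-16` for it, while decoding the same bytes as UTF-8 fails on the first byte, which matches the error in the first post.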
|
04-05-2009, 06:34 PM | #4 |
Enthusiast
Posts: 37
Karma: 10
Join Date: Mar 2009
Device: sony 505
|
I have tried UTF-16: no complaints, but the output was empty.
I have also tried Unicode; it says "Encoding not supported". The original file is Simplified Chinese, saved as Unicode in Notepad, and it can be read in Reader without problems. Here is the file: Thanks. |
04-05-2009, 11:43 PM | #5 |
creator of calibre
Posts: 43,842
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Try converting your txt file to HTML (open it in Word and save it as HTML).
|
04-06-2009, 07:41 PM | #6 |
Enthusiast
Posts: 37
Karma: 10
Join Date: Mar 2009
Device: sony 505
|
Converting to HTML with Word does work.
However, it first creates a large 1.5 MB file, which then takes almost 10 minutes to convert. Also, batch processing is not possible with this method. This did inspire me to try saving the file as UTF-8 in Notepad, which works and takes only a couple of seconds to convert. So it looks like there is a bug in the support for Unicode TXT. UTF-8 TXT works perfectly. |
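Re-saving each file by hand in Notepad does not scale to a batch, so here is a rough Python sketch that does the same re-encoding for a whole folder. The function name `resave_as_utf8` and the folder layout are hypothetical, and it assumes every `.txt` file in the source folder is UTF-16 with a BOM, as Notepad's "Unicode" option produces.

```python
from pathlib import Path


def resave_as_utf8(src_dir, dst_dir, source_encoding="utf-16"):
    """Re-encode every .txt file in src_dir as UTF-8 into dst_dir.

    The "utf-16" codec reads the BOM to pick the byte order and strips it.
    Line endings may be normalized to the platform default on write.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for txt in sorted(src.glob("*.txt")):
        text = txt.read_text(encoding=source_encoding)
        out = dst / txt.name
        out.write_text(text, encoding="utf-8")  # UTF-8 without a BOM
        written.append(out)
    return written
```

The files in the output folder can then be converted with UTF-8 as the source encoding, which is the combination reported to work above.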
|