Old 02-26-2012, 09:08 PM   #1
pshute
Enthusiast
Posts: 26
Karma: 189896
Join Date: Feb 2012
Device: Sony PRS-T1
Extended ASCII characters in txt file

I have a txt file I want to convert to ePub format for my Sony PRS-T1, but it contains many instances of extended ASCII line-drawing characters. These show up on the reader as various letters with accents, etc, instead of lines. (E.g. ASCII value C4 hex should be a horizontal line.)

I assume this means the reader can't handle extended ASCII characters.

Can someone give me some advice about how to fix it? Is there a way to make the device display them properly? Or should I be doing a search and replace to change them to something else? If the latter, what should I change them to?
Old 02-26-2012, 10:34 PM   #2
theducks
Grand Sorcerer
Posts: 14,213
Karma: 5495470
Join Date: Aug 2009
Location: The (original) Silicon Valley, USA
Device: Galaxy Tab 2, Astak Pocket Pro, K4NT
The keyword is codepage (there is an optional setting for it on the conversion form).
You need to select the encoding the file actually uses, e.g. CP1252.
Old 02-26-2012, 11:25 PM   #3
pshute
Enthusiast
Thanks for that. But which codepage should I pick?

I've tried CP1252, ascii, Latin 1 and utf-8. Only ascii made it look any different: boxes in place of every line-drawing character.

I also don't understand why choosing ascii made the conversion take several minutes compared to about half a minute for the others.
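
One way to narrow down the codepage outside Calibre is to decode a known byte under each candidate and see which one yields the expected character. The sketch below does that for the byte C4 hex from the first post. The candidate list is an assumption: it holds the encodings tried above plus CP437, the original IBM PC code page, which is a common source of line-drawing characters in old text files.

```python
# Decode the single byte 0xC4 under several candidate encodings.
# errors="replace" substitutes U+FFFD where the byte is not valid
# in that encoding, instead of raising UnicodeDecodeError.
SAMPLE = b"\xc4"

def decode_with(encodings, data=SAMPLE):
    """Return {encoding name: decoded text} for each candidate encoding."""
    return {name: data.decode(name, errors="replace") for name in encodings}

results = decode_with(["cp1252", "latin-1", "ascii", "utf-8", "cp437"])
for name, text in results.items():
    print(f"{name:8} -> {text!r} (U+{ord(text):04X})")
```

Under CP1252 and Latin 1 the byte comes out as an accented letter, under ascii and utf-8 it is not a valid character on its own, and only CP437 gives a horizontal line.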
Old 02-27-2012, 05:55 AM   #4
pshute
Enthusiast
I tried them all, and none worked. Then I realised I can type anything in the character encoding box, so I entered CP437, which Wikipedia indicated was the right one.

That worked (in the Calibre reader; I haven't tried it on the Sony yet), but now I see the columns are misaligned. It looks like the whitespace has been collapsed.

I tried ticking Preserve Whitespace, but it didn't help. Now what?

I just tried it on the device, and now the box drawing characters are all question marks. Maybe I should be replacing the characters instead?

Anyone know if a Sony can do box drawing characters?
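
If replacing the characters turns out to be the way to go, a minimal sketch of one approach follows. It assumes the text has already been decoded from CP437, and the choice of `-`, `=`, `|` and `+` as stand-ins is only a suggestion. The function names are made up for illustration. It covers the Unicode Box Drawing block (U+2500 to U+257F) only; shaded block characters are left untouched.

```python
HORIZONTAL = "\u2500\u2501"   # light and heavy horizontal lines
DOUBLE_HORIZONTAL = "\u2550"  # double horizontal line
VERTICAL = "\u2502\u2503\u2551"  # light, heavy and double vertical lines

def ascii_fallback_table():
    """Build a str.translate table for the Box Drawing block."""
    # Default: corners, junctions and anything not listed become "+".
    table = {code: "+" for code in range(0x2500, 0x2580)}
    for ch in HORIZONTAL:
        table[ord(ch)] = "-"
    for ch in DOUBLE_HORIZONTAL:
        table[ord(ch)] = "="
    for ch in VERTICAL:
        table[ord(ch)] = "|"
    return table

def replace_box_drawing(text):
    """Swap box-drawing characters for plain ASCII look-alikes."""
    return text.translate(ascii_fallback_table())

# A small box, written as CP437 bytes and decoded first.
sample = b"\xda\xc4\xc4\xbf\n\xb3Hi\xb3\n\xc0\xc4\xc4\xd9".decode("cp437")
print(replace_box_drawing(sample))
```

Because every replacement is a single character, column widths stay the same, so the result still needs a monospaced font to line up.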
Old 02-27-2012, 07:41 AM   #5
itimpi
Wizard
Posts: 4,022
Karma: 777817
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
If you want ASCII art to stay aligned then you will need to make sure that a monospaced font is selected.

Question marks on the device tend to mean that the selected font does not have the relevant characters in it.
Old 02-27-2012, 08:25 AM   #6
Rob Lister
Fanatic
Posts: 533
Karma: 3293888
Join Date: Oct 2011
Location: Virginia
Device: Nook Simple Touch
Can you show us the resulting css and xhtml?
Old 02-27-2012, 09:17 AM   #7
theducks
Grand Sorcerer
Also remember that HTML collapses runs of whitespace.

If you wrap the ASCII art in <pre> tags, that should stop that part of the problem.

You may need to embed (and specify) a font that has the old upper ASCII chars if your device does not have one for you to specify.
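
A minimal sketch of the <pre> idea, assuming the source file is CP437 and that the resulting HTML is converted in place of the original .txt. The function name and the bare-bones page layout are made up for illustration. The `font-family: monospace` rule only asks for a monospaced font; whether that font contains the line-drawing glyphs still depends on the device.

```python
import html

def text_to_pre_html(raw_bytes, encoding="cp437"):
    """Decode raw text and wrap it in a single <pre> block.

    html.escape keeps any literal <, > or & in the text from being
    read as markup; quote=False leaves quotation marks alone.
    """
    text = raw_bytes.decode(encoding)
    body = html.escape(text, quote=False)
    return (
        "<html><head><style>pre { font-family: monospace; }</style></head>"
        f"<body><pre>{body}</pre></body></html>"
    )

page = text_to_pre_html(b"\xc4\xc4  <a>")
print(page)
```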
Old 02-27-2012, 03:53 PM   #8
pshute
Enthusiast
Quote:
Originally Posted by Rob Lister
Can you show us the resulting css and xhtml?
How do I get to it, please?
Old 02-27-2012, 04:00 PM   #9
pshute
Enthusiast
I just found a list of supplied fonts at:
http://wiki.mobileread.com/wiki/PRST...or_xhtml_files
and a method of adding more at:
http://blog.the-ebook-reader.com/201...t1-no-rooting/

Whether any of the supplied fonts support these characters, I think I'll have to work out by trial and error. It'll definitely have to be a monospaced font, or there's no point to it.

Can I just wrap the whole document in pre tags? I dread having to work out some regex to detect lines with ascii art in them. I can't just edit the file manually because it's regularly updated and I'll just end up having to do it again and again.
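
Detecting the ASCII art lines may not need a regex. A rough sketch, assuming the text is already decoded and that any line containing a character from the Unicode Box Drawing or Block Elements ranges (U+2500 to U+259F) counts as art: consecutive art lines are grouped into one <pre>, and everything else goes into a plain <p>. The function names are hypothetical, and a table row with no line-drawing character in it would split the <pre> in two, so this is a starting point rather than a finished filter.

```python
import html
from itertools import groupby

def has_box_drawing(line):
    """True if the line contains a box-drawing or block-element character."""
    return any("\u2500" <= ch <= "\u259f" for ch in line)

def mark_up(text):
    """Wrap runs of art lines in <pre> and runs of other lines in <p>."""
    parts = []
    # groupby yields consecutive lines that share the same True/False key.
    for is_art, lines in groupby(text.splitlines(), key=has_box_drawing):
        chunk = html.escape("\n".join(lines), quote=False)
        tag = "pre" if is_art else "p"
        parts.append(f"<{tag}>{chunk}</{tag}>")
    return "\n".join(parts)

sample = "Intro\n\u250c\u2500\u2510\n\u2514\u2500\u2518\nOutro"
print(mark_up(sample))
```

Since the file is regularly updated, a script like this could be rerun on each new version before conversion.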
Old 02-28-2012, 06:35 AM   #10
Rob Lister
Fanatic
Quote:
Originally Posted by pshute
How do I get to it, please?
If you don't have an epub editor, the easiest way is to change the file extension of the book to .zip, like ...

mybook.epub

to

mybook.zip

Your computer will 'warn' you, but do it anyway. It will be ok.

The book will now open like a folder.

Open it and there are the folders and .xhtml files that make up your book.

Hunt around till you find the section that contains the offending text.

When you're done, rename it back to .epub.

That's it.
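
The same look inside the book can be taken without renaming anything, since an epub is a zip archive. A minimal sketch using Python's standard zipfile module; the function names are made up for illustration, and the demo builds a tiny stand-in archive in memory (with hypothetical file names) rather than reading a real book.

```python
import io
import zipfile

MARKUP_SUFFIXES = (".xhtml", ".html", ".htm", ".css")

def list_markup_files(epub):
    """Return the HTML/XHTML and CSS member names of an epub.

    `epub` may be a path such as "mybook.epub" or an open binary file object.
    """
    with zipfile.ZipFile(epub) as book:
        return [name for name in book.namelist()
                if name.lower().endswith(MARKUP_SUFFIXES)]

def read_member(epub, name):
    """Return one member of the epub as text, assuming it is UTF-8."""
    with zipfile.ZipFile(epub) as book:
        return book.read(name).decode("utf-8")

# Stand-in archive so the sketch runs on its own.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as book:
    book.writestr("mimetype", "application/epub+zip")
    book.writestr("index.xhtml", "<html><body><pre>demo</pre></body></html>")
    book.writestr("stylesheet.css", "pre { font-family: monospace; }")

print(list_markup_files(demo))
print(read_member(demo, "index.xhtml"))
```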

OR!!!

Download and install an epub editor.

Sigil is what I use (I don't think a better one exists). Search the forums for a download link ... or google it.

Last edited by Rob Lister; 02-28-2012 at 06:44 AM.
Old 02-28-2012, 06:57 AM   #11
frostschutz
Linux User
Posts: 738
Karma: 2030839
Join Date: Sep 2010
Device: iriver Story HD
If possible try to replace ascii art with sensible HTML such as <table> or <hr>.

Otherwise you'll always run into problems with that document. Even if you find a font that displays the characters, it will only display properly as long as you also choose a font size small enough to fit the width of the ASCII art on the screen, which could be a rather small font.
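
As a small illustration of the <hr> suggestion, the sketch below replaces any line made up entirely of horizontal line-drawing characters with an <hr/> element and escapes the remaining lines. It assumes the text is already decoded, handles only full-width rules (not boxes or tables), and the function name is made up for illustration.

```python
import html

# Light, heavy and double horizontal line characters.
HORIZONTAL = set("\u2500\u2501\u2550")

def rules_to_hr(text):
    """Turn lines consisting only of horizontal rules into <hr/>."""
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and set(stripped) <= HORIZONTAL:
            out.append("<hr/>")
        else:
            out.append(html.escape(line, quote=False))
    return "\n".join(out)

print(rules_to_hr("Title\n\u2500\u2500\u2500\u2500\u2500\nBody text"))
```

An <hr/> scales with the screen, so it avoids the font-size problem described above for that one case.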