Lots of ???


ted13b
04-15-2011, 01:37 PM
A friend gave me an ebook in epub format. It has a strange issue where every third or fourth character is a question mark. Is this caused by an improper file conversion? Any ideas for repairing it?

JSWolf
04-15-2011, 01:40 PM
Yeah, see if a legal version exists and buy it.

ted13b
04-15-2011, 01:58 PM
That's the problem. I already have everything by Alastair Reynolds in hardcover, but now that I read on my Sony I'd like to have the books on it, and the thought of buying them again annoys me.

DaleDe
04-15-2011, 02:00 PM
A ? where a character belongs typically means that the character set is not supported on the reading device. It may need an embedded character set in the ePub itself. This is often the case if the document is not in English.

spaze
04-15-2011, 02:03 PM
Teenagers use a lot of question marks when they ask something on forums. Don't you think????????

Usually a question mark stands in for a character that is not in any of the reader's character sets.

ted13b
04-15-2011, 02:45 PM
I'm not a teenager; my use of multiple question marks was a facetious attempt at humor based on the excessive number of them in my ebook. Characters in the book are not being converted into question marks; extra ones are inserted repeatedly in the text.

DaleDe
04-15-2011, 07:37 PM
These can be non-visible characters. To figure out what is happening some experimentation will be required. Do you see those question marks on a PC running Adobe Digital Editions?

Dale
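
If you would rather check the file directly than guess, something like the following might help. It is only a rough sketch: it counts invisible format and control characters (soft hyphens, zero-width spaces and the like) in a piece of text. The names count_invisible and scan_epub are made up for illustration, and scan_epub assumes the book is DRM-free with UTF-8 encoded HTML files inside.

```python
import unicodedata
import zipfile
from collections import Counter


def count_invisible(text):
    """Count format (Cf) and control (Cc) characters, ignoring ordinary whitespace."""
    counts = Counter()
    for ch in text:
        if ch in "\n\r\t":
            continue  # normal line breaks and tabs are not suspicious
        if unicodedata.category(ch) in ("Cf", "Cc"):
            label = f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNNAMED')}"
            counts[label] += 1
    return counts


def scan_epub(path):
    """Hypothetical helper: total the invisible characters in every HTML file of an epub.

    An epub is a zip archive, so the standard zipfile module can read it.
    Assumes the content files are UTF-8; undecodable bytes are replaced.
    """
    totals = Counter()
    with zipfile.ZipFile(path) as book:
        for name in book.namelist():
            if name.lower().endswith((".html", ".xhtml", ".htm")):
                text = book.read(name).decode("utf-8", errors="replace")
                totals.update(count_invisible(text))
    return totals


# Small self-contained check: two soft hyphens hidden inside one word.
sample = "inter\u00adstel\u00adlar space"
report = count_invisible(sample)
print(report)
```

A large count for a single code point would suggest the reader is showing "?" for that one character. If the count comes back empty, the question marks are probably literal "?" characters in the file.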

Faster
04-16-2011, 01:10 PM
Is this caused by an improper file conversion? Any ideas for repairing it?

MS Word was probably used as part of the conversion process. The question marks are most likely where there were optional hyphens (preferred places to split a word at the end of a line when the text is justified). They should have been removed in Word before converting. Other applications may not recognize them and replace them with question marks.
You can confirm that they are in fact replacements for optional hyphens by checking that they always occur inside words, never at the start or end of a word.
As the ebook is already in epub format, you will need to edit it in something like Sigil.
You should be able to remove them in Sigil using a regular expression.
Go into Code View and select
Look in: All HTML Files
and check Regular expression.
Find: (\w)\?(\w)
Replace: \1\2

This avoids removing legitimate question marks at the end of sentences.
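
For anyone who wants to try the pattern outside Sigil first, here is a rough Python sketch of the same find and replace. The function names and the sample sentence are made up for illustration. It also includes a lookaround variant, which is my own addition rather than part of the recipe above: the plain pattern consumes the letter on each side of a match, so it can skip a question mark when only one letter separates two of them.

```python
import re

# The pattern from the post: a word character, a literal "?", a word character.
SIMPLE = re.compile(r"(\w)\?(\w)")

# Variant (an assumption, not from the post): lookbehind/lookahead only check
# the neighbouring letters without consuming them.
LOOKAROUND = re.compile(r"(?<=\w)\?(?=\w)")


def strip_simple(text):
    """Apply Find: (\\w)\\?(\\w)  Replace: \\1\\2 in a single pass."""
    return SIMPLE.sub(r"\1\2", text)


def strip_lookaround(text):
    """Remove every "?" that sits directly between two word characters."""
    return LOOKAROUND.sub("", text)


sample = "Was the light?hug?ger in?ter?stel?lar? Yes."
cleaned = strip_simple(sample)
print(cleaned)

# Edge case: question marks separated by a single letter.
print(strip_simple("a?b?c"))
print(strip_lookaround("a?b?c"))
```

Either pattern will also match a "?" inside a link such as page.php?id=1 in the HTML, so it is worth stepping through the first few replacements by hand before using Replace All.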

benham
04-16-2011, 01:23 PM
Lots of ??? means lots of ???. You might get a better response to your one question if you gave the thread a clearer name. Sounds like you want to read illegal books without any problems.

dwig
04-16-2011, 08:18 PM
...Is this caused by an improper file conversion?..

Possibly. They could be characters that the reader can't display, with the reader substituting a "?" to indicate the unknown character. This happens when font encoding is not properly handled during conversion or when things like Word's soft hyphens are embedded in the text. If so, there may be ways of "repairing" the file. Other posts have made reasonable suggestions to try.

They could also be real question marks, or identical undisplayable characters, placed in the text by the OCR software that was used to convert scanned pages into actual text. If that is the case, there is no way to fix the file by reprocessing it. You would have to edit the text manually and insert the correct characters.