Old 06-11-2010, 10:26 AM   #1
kilgoretrout
Junior Member
kilgoretrout began at the beginning.
 
Posts: 5
Karma: 10
Join Date: Jun 2010
Location: Petersburg VA
Device: ipad
Numbers in pdfs not converting

I have inserted an example of what I'm getting. The "squares" are supposed to be numbers.

The Destruction of Dresden was first published in Great Britain by William Kimber & Co. Ltd on April , ; in a revised and updated edition by Corgi Books Ltd in ; and by Papermac, a division of Macmillan Publishers Ltd, in .

I am using Calibre to convert a PDF to EPUB. The excerpt above is from the EPUB. I have Sigil but don't have a clue how to isolate these occurrences of numbers throughout the book and use a different font.

I am using a MacBook Pro and an iPad.
Old 06-11-2010, 11:27 AM   #2
Jellby
frumious Bandersnatch
Jellby ought to be getting tired of karma fortunes by now.
 
Posts: 6,251
Karma: 4801165
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
It's possible that the PDF is using some custom glyphs in the font for the numbers, instead of the standard number positions. For instance, some fonts include both "old-style" numbers and "normal" numbers. One set is placed in the usual positions for numbers, while the other occupies other positions and is therefore not recognized as numbers.
Old 06-23-2010, 02:22 PM   #3
kilgoretrout
How to fix numbers

I am sure there are a million possible reasons WHY the numbers aren't converting, but is there a way to choose a default font for an entire document, or for every occurrence of, say, numbers?
Old 06-23-2010, 04:52 PM   #4
Freeshadow
temp. out of service
Freeshadow ought to be getting tired of karma fortunes by now.
 
Posts: 2,323
Karma: 12984432
Join Date: May 2010
Location: Duisburg (DE)
Device: BeBook mini
They'd be lowercase numbers, I assume.
The problem is technically the same as removing ligatures.
Old 06-24-2010, 09:43 AM   #5
frabjous
Wizard
frabjous can solve quadratic equations while standing on his or her head reciting poetry in iambic pentameter
 
Posts: 1,213
Karma: 12890
Join Date: Feb 2009
Location: Amherst, Massachusetts, USA
Device: Sony PRS-505
Am I the only one who can see the numbers in this quote?

Quote:
The Destruction of Dresden was first published in Great Britain by William Kimber & Co. Ltd on April , ; in a revised and updated edition by Corgi Books Ltd in ; and by Papermac, a division of Macmillan Publishers Ltd, in .
They're Unicode characters U+F730 through U+F739 (for 0 through 9), which are in the "private use area" of Unicode. I don't know which fonts support them, but on my system, the fallback seems to be the Norasi Unicode fonts (designed for Thai).

Yes, their appearance is of "lowercase" or "old style" numerals.
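To check whether a given file really contains these characters, something along these lines may help. It is only a sketch: the function name count_pua_digits is made up for illustration, and it assumes the file is UTF-8 encoded, where U+F730 through U+F739 are the bytes EF 9C B0 through EF 9C B9.

```shell
#!/bin/bash
# Count occurrences of U+F730..U+F739 in the named files (or on stdin).
# In UTF-8 each of these is the bytes EF 9C followed by one byte in B0..B9,
# written below as octal escapes (357 234 260 .. 357 234 271).
# count_pua_digits is a hypothetical helper name, not part of any tool.
count_pua_digits() {
  pattern=$(printf '\357\234[\260-\271]')
  # LC_ALL=C makes grep match raw bytes; -a treats the input as text,
  # -o prints each match on its own line so wc -l can count them.
  cat "$@" | LC_ALL=C grep -a -o "$pattern" | wc -l | tr -d ' '
}
```

Running count_pua_digits book.html (file name is just an example) before and after a find and replace should show whether any of the private-use numerals are left.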

It should be possible to do a find and replace on them using this information. Rather than starting with calibre, you might want to use pdftohtml and pdfreflow to get an html file, and then find and replace in the html and then use calibre to convert to ePub. This is all scriptable. Something like:

Code:
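#!/bin/bash
# Usage: fixnumbers.sh book.pdf
filename="$1"
base="${filename%.pdf}"
# U+F730..U+F739 (private-use old-style digits 0-9), written as UTF-8 bytes
pua=$'\xef\x9c\xb0\xef\x9c\xb1\xef\x9c\xb2\xef\x9c\xb3\xef\x9c\xb4\xef\x9c\xb5\xef\x9c\xb6\xef\x9c\xb7\xef\x9c\xb8\xef\x9c\xb9'
pdftohtml -xml "$filename"
pdfreflow "$base.xml"
# GNU sed syntax; a UTF-8 locale is assumed so each digit counts as one character
sed -i -e "y/$pua/0123456789/" "$base.html"
ebook-convert "$base.html" "$base.epub"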
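If the sed on your system does not handle multi-byte characters in a y command, a byte-wise variant might be easier to get working. This is a rough sketch rather than a tested recipe for any particular sed version: fix_pua_digits is a name invented here, the input is assumed to be UTF-8, and it works as a filter (stdin to stdout) instead of editing the file in place.

```shell
#!/bin/bash
# Replace U+F730..U+F739 with ASCII 0..9, reading stdin and writing stdout.
# fix_pua_digits is a hypothetical helper name used for illustration.
fix_pua_digits() {
  prefix=$(printf '\357\234')   # first two UTF-8 bytes shared by all ten
  script=''
  digit=0
  # Octal values of the third byte for U+F730 .. U+F739 (hex B0 .. B9).
  for last in 260 261 262 263 264 265 266 267 270 271; do
    # Append one "s/<three bytes>/<digit>/g" command per numeral,
    # separated from the previous command by a semicolon.
    script="${script:+$script;}s/${prefix}$(printf "\\$last")/${digit}/g"
    digit=$((digit + 1))
  done
  # LC_ALL=C makes sed treat the text as plain bytes.
  LC_ALL=C sed -e "$script"
}
```

It could be used as fix_pua_digits < book.html > book-fixed.html (both file names are placeholders), leaving the original html untouched in case the result needs checking.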

Last edited by frabjous; 06-24-2010 at 10:00 AM.
Old 06-25-2010, 10:57 AM   #6
kilgoretrout
I sent in the exact view of what I'm seeing. They are legit fonts. The publisher is aware that the numbers use a different font.
All I want to know is THIS:
How do you isolate a font globally in a PDF or EPUB and use a more generic font?
If you have a fix for this, I would like to hear it. I'm sure it's quite simple.
Old 06-25-2010, 10:58 AM   #7
kilgoretrout
That sounds doable. I think I'll try it!
Old 06-25-2010, 11:20 AM   #8
kilgoretrout
OK.
Here's what I did for the Nuremberg.pdf file:

#!/bin/bash
filename="$1"
pdftohtml -xml "$NUREMBERG"
pdfreflow "${NUREMBERG%.pdf}.xml"
sed -i -e 'y//0123456789/' "${NUREMBERG%.pdf}.html"
ebook-convert "${NUREMBERG%.pdf}.html" "${NUREMBERG%.pdf}.epub"

But apparently I haven't compiled the reflow thing right. I ran configure, then makefile.

Here's the output I got when I ran the above:
-bash: ebook-convert: command not found
MacBookPro:~ wth$

My machine is called MacBookPro, and wth, of course, is my home name.
Thanks for your help.
Old 06-25-2010, 01:16 PM   #9
Jellby
"ebook-convert" is part of Calibre. Do you have Calibre correctly installed? Is it in your PATH?
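One way to check is sketched below. The helper name find_tool is invented for this example, and the directory /Applications/calibre.app/Contents/MacOS is only an assumption about where a Mac install of Calibre keeps its command-line tools; adjust it if your install differs.

```shell
#!/bin/bash
# Print the full path of a command-line tool, looking first in PATH and
# then in any extra directories given as further arguments.
# find_tool is a hypothetical helper name used for illustration.
find_tool() {
  tool=$1
  shift
  if command -v "$tool" >/dev/null 2>&1; then
    command -v "$tool"
    return 0
  fi
  for dir in "$@"; do
    if [ -x "$dir/$tool" ]; then
      printf '%s\n' "$dir/$tool"
      return 0
    fi
  done
  return 1
}

# ASSUMPTION: typical location of Calibre's command-line tools on a Mac.
calibre_dir="/Applications/calibre.app/Contents/MacOS"
if converter=$(find_tool ebook-convert "$calibre_dir"); then
  echo "ebook-convert found at: $converter"
else
  echo "ebook-convert not found; add Calibre's tool directory to PATH" >&2
fi
```

If the tool turns up only in the extra directory, either call it by that full path in the script or add the directory to PATH in your shell profile.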
Old 06-25-2010, 06:18 PM   #10
frabjous
Quote:
Originally Posted by kilgoretrout View Post
OK.
Here's what I did for the Nuremberg.pdf file:

#!/bin/bash
filename="$1"
pdftohtml -xml "$NUREMBERG"
pdfreflow "${NUREMBERG%.pdf}.xml"
sed -i -e 'y//0123456789/' "${NUREMBERG%.pdf}.html"
ebook-convert "${NUREMBERG%.pdf}.html" "${NUREMBERG%.pdf}.epub"
Not sure what $NUREMBERG is, but I had it set up so that it would use whatever you added as first argument to the script. So if you saved the script as fixnumbers.sh you could put in:

fixnumbers.sh nuremberg.pdf

and it should work.

Quote:
But apparently I haven't compiled the reflow thing right. I ran configure, then makefile.
It's just a script. No compiling is necessary. Just save the script in a text file and make it executable.
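To show what "save it and make it executable" amounts to, here is a small stand-in, assuming a temporary directory is acceptable for the experiment. The name shownames.sh is hypothetical, and the stand-in only prints the file names that would be derived from the first argument; it does not run any conversion.

```shell
#!/bin/bash
# Write a tiny stand-in script (hypothetical name: shownames.sh) that only
# prints the file names derived from its first argument.
workdir=$(mktemp -d)
cat > "$workdir/shownames.sh" <<'EOF'
#!/bin/sh
filename="$1"               # first argument given on the command line
base="${filename%.pdf}"     # same name with a trailing ".pdf" removed
echo "xml:  $base.xml"
echo "html: $base.html"
echo "epub: $base.epub"
EOF

chmod +x "$workdir/shownames.sh"                   # make it executable
output=$("$workdir/shownames.sh" nuremberg.pdf)    # run it with one argument
echo "$output"
rm -rf "$workdir"
```

The same two steps (chmod +x, then running the file with the PDF name as its argument) apply to the real script. Unless the script's directory is in PATH, it has to be started with a path in front, such as ./fixnumbers.sh nuremberg.pdf.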

Quote:
Here's the output I got when I ran the above:
-bash: ebook-convert: command not found
MacBookPro:~ wth$

My machine is called MacBookPro, and wth, of course, is my home name.
Thanks for your help.
As Jellby says, ebook-convert is part of calibre. If it's in your path, it should find it. If not, well, if you remove that line, does it successfully create an html file? If so, you could try opening it in a browser and see if the numbers are fixed, and if so, manually import the html file into calibre for converting to whatever.

Last edited by frabjous; 06-25-2010 at 06:22 PM.