Old 05-04-2009, 04:23 PM   #46
dauwhe
<geek type="xml"/>
Posts: 22
Karma: 276
Join Date: Dec 2008
Location: Greenfield, Massachusetts, USA
Device: Kindle, Kindle 2, Sony Reader, iPod Touch
Quote:
Originally Posted by tirsales View Post
Yes, but it should be possible to create XHTML and ePub from the original source rather than from the PDF, shouldn't it?
Or at least to extract the text (or have the complete text in advance) and reformat that...
Things are slightly better with "application" files (InDesign, etc.). If done by a decent typesetter, the split paragraph problem shouldn't happen, for example. But I do remember a book where most of the text appeared twice when first extracted from Quark. The (bad) typesetter had left almost another complete copy of the book "hidden" in a text box. The extraction program dutifully found all the text, whether hidden or not.
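
For what it's worth, a duplicate like that is fairly easy to catch after extraction. A minimal sketch in Python, assuming the extracted text has already been split into paragraphs; the function names and the `min_words` cutoff are made up for illustration and do not come from any real extraction tool:

```python
from __future__ import annotations

from collections import Counter


def normalize(paragraph: str) -> str:
    """Collapse whitespace and case so trivially different copies compare equal."""
    return " ".join(paragraph.split()).lower()


def find_repeated_paragraphs(paragraphs: list[str], min_words: int = 8) -> list[str]:
    """Return normalized paragraphs that occur more than once.

    Short paragraphs (running heads, chapter labels) are skipped because
    they repeat legitimately. The cutoff of 8 words is an arbitrary assumption.
    """
    counts = Counter(
        normalize(p) for p in paragraphs if len(p.split()) >= min_words
    )
    return [text for text, n in counts.items() if n > 1]
```

If most of a book shows up in that list, a hidden second copy in the layout file is a likely suspect.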

Dave
Old 05-04-2009, 04:26 PM   #47
dauwhe
<geek type="xml"/>
Posts: 22
Karma: 276
Join Date: Dec 2008
Location: Greenfield, Massachusetts, USA
Device: Kindle, Kindle 2, Sony Reader, iPod Touch
Quote:
Originally Posted by JSWolf View Post
It is so sad that PDF is used in cases where it's all wrong and it causes too many hassles. Just don't accept PDF.
A recent favorite... the PDF looked normal, but if I copied the text there would be extra spaces inside words: "mea sure ment".

Turns out the PDF had "mea " (with a space on the end) and then moved the next letter back so it overlapped the space. Looked fine in the PDF, but we had to clean up the mess to do an eBook.

Dave
Old 05-04-2009, 04:47 PM   #48
JSWolf
Resident Curmudgeon
Posts: 73,660
Karma: 127838196
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
Originally Posted by dauwhe View Post
A recent favorite... the PDF looked normal, but if I copied the text there would be extra spaces inside words: "mea sure ment".

Turns out the PDF had "mea " (with a space on the end) and then moved the next letter back so it overlapped the space. Looked fine in the PDF, but we had to clean up the mess to do an eBook.

Dave
Another good reason to turn around and say, we want the source you used to make the PDF. Send that over.
Old 05-04-2009, 05:18 PM   #49
Ankh
Guru
Posts: 714
Karma: 2003751
Join Date: Oct 2008
Location: Ottawa, ON
Device: Kobo Glo HD
Quote:
Originally Posted by dauwhe View Post
A recent favorite... the PDF looked normal, but if I copied the text there would be extra spaces inside words: "mea sure ment".
Well, it is hard to say based on a single word, but those "extra spaces" might easily turn out to be "soft hyphen" characters. A soft hyphen tells the reading/printing software where it is acceptable to break a word that falls at the end of a line. Most likely cut & paste "filtered" out those codes and gave you spaces.
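
One quick way to test that theory, sketched in Python under the assumption that the extracted text still contains the soft hyphens as U+00AD characters (the helper names here are hypothetical):

```python
SOFT_HYPHEN = "\u00ad"  # Unicode SOFT HYPHEN


def count_soft_hyphens(text: str) -> int:
    """Count soft hyphens so you can tell them apart from real spaces."""
    return text.count(SOFT_HYPHEN)


def strip_soft_hyphens(text: str) -> str:
    """Remove soft hyphens; ordinary spaces are left untouched."""
    return text.replace(SOFT_HYPHEN, "")
```

If the count comes back zero and the gaps are still there, the file contains real spaces and this explanation does not apply.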
Old 05-04-2009, 05:27 PM   #50
dauwhe
<geek type="xml"/>
Posts: 22
Karma: 276
Join Date: Dec 2008
Location: Greenfield, Massachusetts, USA
Device: Kindle, Kindle 2, Sony Reader, iPod Touch
Quote:
Originally Posted by Ankh View Post
Well, it is hard to say based on a single word, but those "extra spaces" might easily turn out to be "soft hyphen" characters. A soft hyphen tells the reading/printing software where it is acceptable to break a word that falls at the end of a line. Most likely cut & paste "filtered" out those codes and gave you spaces.
That was my first thought, too. But we had someone look at the internals of the PDF, and they found actual space characters, with a text block backed up on top of them, overwriting the spaces. Wish I could have seen the file that the PDF was created from!
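
To illustrate the kind of cleanup involved, here is a rough sketch in Python. It assumes the PDF text has already been pulled out as positioned runs; the `Run` class, the fixed glyph width, and the `tolerance` value are all hypothetical simplifications, not how any real PDF library represents text:

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Run:
    text: str
    x: float           # left edge of the run on the page
    char_width: float   # assumed fixed glyph width, to keep the sketch simple

    @property
    def end(self) -> float:
        """Right edge of the run, given the fixed-width assumption."""
        return self.x + len(self.text) * self.char_width


def join_runs(runs: list[Run], tolerance: float = 0.5) -> str:
    """Join runs, dropping a trailing space that the next run is drawn on top of."""
    pieces = []
    for current, following in zip(runs, runs[1:] + [None]):
        text = current.text
        if following is not None and text.endswith(" "):
            # x position where the trailing space glyph begins
            space_start = current.end - current.char_width
            # The next run starts at (or before) the space, so it covers the space.
            if following.x <= space_start + tolerance:
                text = text[:-1]
        pieces.append(text)
    return "".join(pieces)
```

With runs like "mea " at x=0, "sure " at x=30 and "ment" at x=70 (width 10 per glyph), each trailing space is covered by the next run and gets dropped, while a run that starts after the space keeps it.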

Dave
Old 05-04-2009, 07:14 PM   #51
Ankh
Guru
Posts: 714
Karma: 2003751
Join Date: Oct 2008
Location: Ottawa, ON
Device: Kobo Glo HD
Quote:
Originally Posted by dauwhe View Post
Wish I could have seen the file that the PDF was created from!
Agreed. Most of the time, it is a conversion from one format to another that causes oddities like that to appear.
Old 05-04-2009, 07:17 PM   #52
AnemicOak
Bookaholic
Posts: 14,391
Karma: 54969924
Join Date: Oct 2007
Location: Minnesota
Device: iPad Mini 4, AuraHD, iPhone XR +
Quote:
Originally Posted by Ankh View Post
Agreed. Most of the time, it is a conversion from one format to another that causes oddities like that to appear.
Especially something like PDF that wasn't designed to be exported from in the first place.
Old 05-04-2009, 07:35 PM   #53
zerospinboson
"Assume a can opener..."
Posts: 755
Karma: 1942109
Join Date: Mar 2008
Location: Local Cluster
Device: iLiad v2, DR1000
Quote:
Originally Posted by dauwhe View Post
Things are slightly better with "application" files (InDesign, etc.). If done by a decent typesetter, the split paragraph problem shouldn't happen, for example. But I do remember a book where most of the text appeared twice when first extracted from Quark. The (bad) typesetter had left almost another complete copy of the book "hidden" in a text box. The extraction program dutifully found all the text, whether hidden or not.

Dave
Odd, that. I'd always thought that "the industry" would appreciate LaTeX for its strengths over those WYSIWYG-type things. Admittedly LaTeX doesn't have a very up-to-date (or widely adopted) IDE, but still.
Oh, well. Another myth dispelled.
Old 05-04-2009, 08:04 PM   #54
dauwhe
<geek type="xml"/>
Posts: 22
Karma: 276
Join Date: Dec 2008
Location: Greenfield, Massachusetts, USA
Device: Kindle, Kindle 2, Sony Reader, iPod Touch
Quote:
Originally Posted by zerospinboson View Post
Odd, that. I'd always thought that "the industry" would appreciate LaTeX for its strengths over those WYSIWYG-type things. Admittedly LaTeX doesn't have a very up-to-date (or widely adopted) IDE, but still.
Oh, well. Another myth dispelled.
I've probably been around for ten or fifteen thousand books. One was done in TeX (a computer science author who insisted). I think the vast majority of trade books are now done in InDesign; it used to be almost all Quark. Some college textbooks are done in fancier systems like 3B2 or Arbortext, but trade publishers are used to total control over everything. They will complain about the justification of a single line, so it's much easier to deal with that kind of thing in a WYSIWYG environment.

If it were up to me, I'd use XSL-FO!

Dave
Old 05-04-2009, 08:31 PM   #55
AnemicOak
Bookaholic
Posts: 14,391
Karma: 54969924
Join Date: Oct 2007
Location: Minnesota
Device: iPad Mini 4, AuraHD, iPhone XR +
Quote:
Originally Posted by dauwhe View Post
...I think the vast majority of trade books are now done in InDesign; it used to be almost all Quark. Some college textbooks are done in fancier systems like 3B2 or Arbortext, but trade publishers are used to total control over everything. They will complain about the justification of a single line, so it's much easier to deal with that kind of thing in a WYSIWYG environment.
Yeah, but InDesign (which I use all day, every day) and Quark are layout programs used to get your prepress files 'just right' for print work. There should be a pre-layout source that you dump into your design/layout software, and that's where the ebook would/should start from. Using a well-edited word processor file to make an ebook doesn't really take much. Heck, I just scanned a book I wanted that I'm sure won't see a commercial ebook for a long time, if ever, and it took me about eight hours' work to scan, proof and produce a nice ebook. There is really very little design that can be done (compared to traditional typesetting), at least when talking about fiction books.

The problem is that publishers need to understand that PDF isn't usable (well, it is, but you know what I mean) as a source, and that whatever was used to make the PDF, or whatever was dumped into InDesign, is what they need to provide. PDF was designed by Adobe to be a dead end. Yes, they and others have provided ways to get text out of it, but for the most part it wasn't designed for that. Most publishers just don't understand what's what, or don't want to understand. A lot of them still aren't really into the idea of ebooks.
Old 05-05-2009, 10:49 AM   #56
DaleDe
Grand Sorcerer
Posts: 11,470
Karma: 13095790
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2 & Air 2, iPhone 7
Quote:
Originally Posted by tirsales View Post
Yes, but it should be possible to create XHTML and ePub from the original source rather than from the PDF, shouldn't it?
Or at least to extract the text (or have the complete text in advance) and reformat that...
You are assuming there is an original source. This is not always the case. The source is sent in to the publisher, who makes a PDF and may then make changes directly to the PDF itself. At that point the original source is no longer valid. The process does not generally support the idea of source control, although hopefully this will change in the future.

The publishing industry needs several changes to support eBooks in a reasonable fashion, and then there is the question of which eBook format to support. Often there are specific changes required for particular formats. None of this complexity is free for the publisher.

Dale
Old 05-05-2009, 11:12 AM   #57
tirsales
MIA ... but returning som
Posts: 1,600
Karma: 511342
Join Date: Nov 2007
Location: Germany
Device: PRS-505 and *Really* not owning a PRS-700
Quote:
Originally Posted by DaleDe View Post
You are assuming there is an original source. This is not always the case. The source is sent in to the publisher, who makes a PDF and may then make changes directly to the PDF itself. At that point the original source is no longer valid. The process does not generally support the idea of source control, although hopefully this will change in the future.
Yes - but this is an awful *** workflow.
Old 05-06-2009, 03:03 PM   #58
T3_reader
Enthusiast
T3_reader began at the beginning.
 
Posts: 24
Karma: 10
Join Date: Apr 2009
Device: IPad & Android phone
I bought 'The Hobbit' the first day it was available on eReader.com. Every five sentences or so, a space was missing between two words. Today I downloaded the file again, and in the new version this is fixed. I guess it is time to add a revision number to ebooks.
Old 05-12-2009, 08:26 PM   #59
JSWolf
Resident Curmudgeon
Posts: 73,660
Karma: 127838196
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
What needs to happen is that eBook shops put a system in place to inform customers of new editions of eBooks they have purchased, so we can download the new versions. Otherwise, we may never know when an eBook has been updated.
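
The check itself would not need to be complicated. A minimal sketch in Python, assuming every eBook carries an integer revision number and the shop can report the current one; the `Purchase` class and `updated_titles` function are hypothetical and do not describe any real store's system:

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Purchase:
    title: str
    revision: int  # revision the customer downloaded (assumed to exist)


def updated_titles(purchases: list[Purchase], store_revisions: dict[str, int]) -> list[str]:
    """Return titles whose current store revision is newer than the purchased one."""
    return [
        p.title
        for p in purchases
        # Titles the store no longer lists are treated as unchanged.
        if store_revisions.get(p.title, p.revision) > p.revision
    ]
```

A shop could run something like this against each customer's library and send a notice for every title returned.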