Old 07-23-2020, 04:23 AM   #1
Blaineoreski
Zealot
Posts: 119
Karma: 16268
Join Date: Apr 2020
Device: none
Have Good Source File But In Conversion to PDF Nearly Every Other Word is Split

Hi,

I am converting a file to PDF. I have used the same source for these files many times.

Never had this anomaly before.

Once the file arrives in PDF format, it LOOKS great!

But when I copy and paste text from it, it seems that many of the words in the actual text have been split.

So that, if you copy and paste the words:

with security was demoralizing

You get...

with se curity was demor aliz ing

This does not appear in the PDF itself. All the words look perfect in the PDF.

I've tried running the conversion with Smarten Punctuation and with Unsmarten Punctuation on.

It is a pure text-to-text conversion. Not from a scanned image.

What do you think is happening?

Sincerely,

Blaine

Last edited by Blaineoreski; 07-23-2020 at 05:01 AM.
Old 07-23-2020, 05:39 AM   #2
Bookstooge
Member Retired
Posts: 805
Karma: 2091358
Join Date: May 2019
Device: Kindle Oasis 1st Gen, PB Era
What format is the source?
Old 07-23-2020, 05:46 AM   #3
kovidgoyal
creator of calibre
Posts: 45,253
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Nothing particularly surprising. In PDF, individual font glyphs are often positioned one by one, not as complete words or sentences. So when extracting text from a PDF, such as for copying, programs have to guess where the word boundaries are based on glyph positions, and they sometimes guess wrong.
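
To make that kind of guess concrete, here is a minimal Python sketch. It is not calibre's code or any PDF viewer's actual algorithm; the glyph tuples, the `layout` and `guess_words` helpers, the gap threshold and the 1.5 pt nudge are all assumptions made up for illustration. It places glyphs one by one with no stored spaces, then rebuilds the text by inserting a space wherever the horizontal gap looks big enough.

```python
from typing import Dict, List, Optional, Tuple

# One positioned glyph: (character, x of left edge, advance width), in points.
Glyph = Tuple[str, float, float]


def layout(words: List[str],
           word_gap: float = 3.0,
           glyph_width: float = 5.0,
           nudges: Optional[Dict[Tuple[int, int], float]] = None) -> List[Glyph]:
    """Hypothetical layout step: place glyphs left to right without storing
    any space characters. `nudges` maps (word index, char index) to an extra
    gap inserted before that glyph, standing in for small positioning
    adjustments such as kerning or justification."""
    nudges = nudges or {}
    glyphs: List[Glyph] = []
    x = 0.0
    for wi, word in enumerate(words):
        for ci, ch in enumerate(word):
            x += nudges.get((wi, ci), 0.0)
            glyphs.append((ch, x, glyph_width))
            x += glyph_width
        x += word_gap  # the only trace of a "space" is this wider gap
    return glyphs


def guess_words(glyphs: List[Glyph], gap_threshold: float = 1.0) -> str:
    """Rebuild text from positions, guessing a word boundary whenever the gap
    between one glyph's right edge and the next glyph's left edge exceeds
    gap_threshold."""
    out: List[str] = []
    prev_right: Optional[float] = None
    for ch, x, width in glyphs:
        if prev_right is not None and x - prev_right > gap_threshold:
            out.append(" ")
        out.append(ch)
        prev_right = x + width
    return "".join(out)


clean = layout(["with", "security"])
nudged = layout(["with", "security"], nudges={(1, 2): 1.5})

print(guess_words(clean))                       # with security
print(guess_words(nudged))                      # with se curity
print(guess_words(nudged, gap_threshold=2.0))   # with security
```

On the page the nudged glyphs would still look like a single word, but an extractor with a tight threshold reads the 1.5 pt gap as a space. A looser threshold recovers the word here, which is why different programs can give different results when copying from the same PDF.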
Old 07-23-2020, 07:13 AM   #4
Blaineoreski
Hi!

Feel kinda worried because this is an IMPORTANT file for the project I'm doing and I'll need to rely on it for lots of quotes.

I can use ANY destination filetype. Is there any other filetype to convert to that may avoid the problem?

Thanks!!!!!!!!!!!!!!!!!!!
Old 07-23-2020, 08:30 AM   #5
kovidgoyal
Not sure what you are asking?? PDF is the ONLY filetype with this issue.
Old 07-23-2020, 09:22 AM   #6
Blaineoreski
Hi Kovid,

Ah! Yep - just tried RTF and it's all good.

More proof of the POWER of Calibre!

Can't thank you enough for developing this application. When I think about all the help you're giving people I feel impressed. Thank you, Kovid!

Sincerely,

Blaine
Old 07-23-2020, 09:40 AM   #7
kovidgoyal
You're welcome
Old 07-24-2020, 05:13 AM   #8
Blaineoreski
Hi Kovid,

Is there any FONT I could use in the text source > PDF conversion that would largely avoid the glyph problem? Somehow, I'm thinking a sans-serif font would make things easier?

Or...a monospaced font?

Here's my thinking: converting to Word - which worked PERFECTLY - lost all the chapter links. So, my main goal is:

text source > any filetype

that will preserve the navigation structure.

What do you think?

Sincerely,

Blaine
Old 07-24-2020, 05:31 AM   #9
kovidgoyal
It doesn't have anything to do with fonts. And conversion to DOCX will preserve links. If it does not, open a bug report and attach a sample showing the issue.
Old 07-24-2020, 07:14 AM   #10
Blaineoreski
Hi Kovid,

Ah! You're right. The conversion to Word does keep all the chapters. And! Solves the word spacing problem.

Challenge is...the tools I need for editing are all...built around my PDF app. So, need a way to get the file BACK to PDF with the chapters / section navigation.

Seems like Word's own Word > PDF didn't respect the chapter structure.

Trying with other tools.

So the idea is:

pure text source > word > PDF with chapter divisions/navigation

Any suggestions?

Sincerely,

Blaine
Old 07-24-2020, 08:22 AM   #11
kovidgoyal
Sorry, no, I'm really not a big PDF guy.
Old 07-26-2020, 12:32 AM   #12
Blaineoreski
Yep. Yep. Thanks just the same! : )))))
Old 07-26-2020, 09:01 PM   #13
xfrank
Junior Member
Posts: 4
Karma: 10
Join Date: Jul 2020
Device: Kindle
I noticed the same issue when converting from EPUB to PDF. Try reverting from Calibre 4 to Calibre 3. In version 3, text rendering in the PDF output is different, and it is possible to select the text and copy it elsewhere without the "chopping" effect.