Old 08-06-2014, 03:29 AM   #1
E-Books
Junior Member
Posts: 4
Karma: 10
Join Date: May 2013
Device: I-Pad 1 & Kindle Fire
How to Copy & Paste text from Chinese text PDF?

I am unable to extract the Chinese text, as it appears on the page, from an editable PDF using "Copy & Paste" or "Save as RTF/Doc" in Acrobat 9 Professional.

This is a text PDF with embedded fonts (Type 1 fonts with custom encoding), not a scanned PDF. Copying and pasting (or saving as RTF) produces only garbled text rather than the actual Chinese text (please refer to the attached Screenshot 1).

I also tried extracting the fonts embedded in the PDF so that the copied text would render properly, but without success: the font files were extracted, but they did not work. A sample PDF is also attached for your reference.
Note: I do not want to use OCR, because I would then need to proofread the output and correct any errors.


Any suggestions would be greatly appreciated.
Attached Thumbnails: Screenshot-1.jpg (109.7 KB)
Attached Files: Sample.pdf (375.5 KB)
Old 08-08-2014, 06:17 AM   #2
Janet16
Enthusiast
Posts: 36
Karma: 487354
Join Date: Jul 2011
Device: iPad
I just got a PDF converter.
Try Zamzar: http://www.zamzar.com/
I have also had good experience with Cisdem PDFConverterOCR, which can extract content from PDFs in different languages. It only has a Mac version, though, and I think you are on Windows, right?

Last edited by Janet16; 08-08-2014 at 06:21 AM.
Old 08-08-2014, 09:00 AM   #3
E-Books
Junior Member
Posts: 4
Karma: 10
Join Date: May 2013
Device: I-Pad 1 & Kindle Fire
Thanks for the reply.
I tried http://www.zamzar.com/ and got only garbled text (see below).

Example text:
------------------------
; =
3ѨT ^% -ߺ13E;
16SNɿ3 3Hװ1+ v 0 dP'/ kcϿ (
!"#$%&'( )*+,
------------------------

Moreover, I am not interested in OCR: the source is a text PDF, and OCR requires manual proofreading and correction, which is not an easy job for Chinese text.
Old 08-09-2014, 06:06 AM   #4
Jellby
frumious Bandersnatch
Posts: 6,309
Karma: 4898871
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
Quote:
Originally Posted by E-Books View Post
This is actually font embedded text PDF (Used: type 1 fonts & custom encoding)
I guess the "custom encoding" is the problem. The copied text would come in that custom encoding. Can you extract the font, copy-paste the text and display it with the extracted font?
Old 08-09-2014, 10:22 AM   #5
willus
Fuzzball, the purple cat
Posts: 582
Karma: 2526455
Join Date: Jun 2011
Location: California
Device: Kindle 2, iPad
Quote:
Originally Posted by Jellby View Post
I guess the "custom encoding" is the problem. The copied text would come in that custom encoding. Can you extract the font, copy-paste the text and display it with the extracted font?
Exactly right. If you are copying and pasting Chinese text into MS Word, the font encoding needs to be correctly mapped to Unicode (UTF-16) characters. This PDF's encoding is not. You would have to figure out how the encoded characters map to Unicode and write a mapping filter.
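
A mapping filter of this kind could be sketched in Python roughly as follows. The entries in `CODE_TO_UNICODE` are invented placeholders; the real table would have to be worked out by comparing the pasted text with the rendered page, and the function names are hypothetical.

```python
# Hypothetical mapping from pasted (garbled) characters to the intended
# Chinese characters. These entries are placeholders, not the real table.
CODE_TO_UNICODE = {
    "!": "中",
    '"': "文",
    "#": "字",
}


def build_filter(mapping):
    """Turn the mapping into a table usable by str.translate."""
    return str.maketrans(mapping)


def remap(text, table):
    """Replace every mapped character; leave unmapped ones untouched."""
    return text.translate(table)


def unmapped_characters(text, mapping):
    """Return the sorted non-space characters that still lack a mapping."""
    return sorted({ch for ch in text if ch not in mapping and not ch.isspace()})


table = build_filter(CODE_TO_UNICODE)
print(remap('!"#', table))                           # prints 中文字
print(unmapped_characters('!"#$', CODE_TO_UNICODE))  # prints ['$']
```

Listing the unmapped characters after each pass shows which codes still need an entry. One caveat: if the PDF uses several fonts, each with its own custom encoding, a single table would not be enough and the text would need to be split by font first.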

Last edited by willus; 08-09-2014 at 10:25 AM.
All times are GMT -4.