08-06-2014, 03:29 AM | #1 |
Junior Member
Posts: 4
Karma: 10
Join Date: May 2013
Device: I-Pad 1 & Kindle Fire
|
How to Copy & Paste text from Chinese text PDF?
I am unable to extract the Chinese text (as it appears on the page) from an editable PDF using "Copy & Paste" or "Save as RTF/Doc" in Acrobat 9 Professional.
This is a text PDF with embedded fonts (Type 1 fonts with custom encoding), not a scanned PDF. Copy & Paste (or saving as RTF) gives only garbled text rather than the actual Chinese text (please refer to the attached Screenshot 1). I also tried extracting the fonts embedded in the PDF so that the copied text would render properly, but without success (the font files were extracted, but they did not work). A sample PDF is also attached for reference. Note: I do not want OCR, because I would then need to proofread the result and correct any errors. Any suggestions would be greatly appreciated. |
08-08-2014, 06:17 AM | #2 |
Enthusiast
Posts: 41
Karma: 2621116
Join Date: Jul 2011
Device: iPad
|
Just get a PDF converter.
Try Zamzar: http://www.zamzar.com/ I also have good experience with Cisdem PDFConverterOCR, which can extract content from PDFs in different languages. It only has a Mac version, though, and I think you are on Windows, right? Last edited by Janet16; 08-08-2014 at 06:21 AM. |
08-08-2014, 09:00 AM | #3 |
Junior Member
Posts: 4
Karma: 10
Join Date: May 2013
Device: I-Pad 1 & Kindle Fire
|
Thanks for the reply.
I tried http://www.zamzar.com/ and got only garbled text (see below).
Example text:
------------------------
; È ›ÙÚÛ=ÜÖ × Ý É ¦3§¡Ñ¨T Þ^—%ø š-…ó‰ßº1¦3§àEÏ; ñ16S¶·N¾É¿3áâ ¦3§ãH—±Ð×°±›ä1+ v å0 æçÜ „ó©ªdPèóé'/ê kcëÖäÏ¿œìí§ ( !"#$%&'( )*+,
------------------------
Moreover, I am not interested in OCR: the source is a text PDF, and OCR output requires manual proofreading and correction, which is not an easy job for Chinese. |
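For anyone wanting to check a converter's output quickly, here is a minimal sketch in Python that measures how much of the extracted text is actually Chinese. The function name `cjk_ratio` is made up for illustration, and it only looks at the basic CJK Unified Ideographs block (U+4E00 to U+9FFF), so it is a rough sanity check rather than a full test.

```python
def cjk_ratio(text):
    """Return the fraction of non-whitespace characters that are CJK ideographs.

    Assumption: only the basic CJK Unified Ideographs block (U+4E00..U+9FFF)
    is counted; extension blocks and punctuation are ignored.
    """
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    cjk = sum(1 for c in chars if "\u4e00" <= c <= "\u9fff")
    return cjk / len(chars)


if __name__ == "__main__":
    # A fragment of the garbled output quoted above.
    print(cjk_ratio("È ›ÙÚÛ=ÜÖ × Ý É"))
    # Properly extracted Chinese text for comparison.
    print(cjk_ratio("中文文本"))
```

A ratio near zero for a document that is visibly Chinese on screen suggests the PDF's fonts do not map to Unicode, so trying further converters is unlikely to help.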
08-09-2014, 06:06 AM | #4 |
frumious Bandersnatch
Posts: 7,541
Karma: 19001081
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
|
08-09-2014, 10:22 AM | #5 |
Fuzzball, the purple cat
Posts: 1,287
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Exactly right. If you are copying and pasting Chinese text into MS Word, the font encoding needs to be correctly mapped to Unicode (UTF-16) code points. This PDF's encoding is not. You would have to figure out how the encoded characters map to Unicode and write a mapping filter.
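A mapping filter of that kind could look roughly like the Python sketch below. Everything in it is hypothetical: the names `GLYPH_TO_UNICODE`, `remap`, and `unmapped_chars` are invented for illustration, and the three table entries are placeholders, not the real mapping for this PDF. The actual pairs would have to be worked out by comparing the copied characters against the page as rendered, and if the PDF uses several fonts, each with its own custom encoding, a separate table per font would be needed.

```python
# Hypothetical table: garbled character as copied from the PDF -> intended
# Chinese character. These entries are placeholders for illustration only.
GLYPH_TO_UNICODE = {
    "È": "中",
    "Ù": "文",
    "Ú": "字",
}


def remap(text, table=GLYPH_TO_UNICODE):
    """Replace every character found in the table; leave all others unchanged."""
    return text.translate(str.maketrans(table))


def unmapped_chars(text, table=GLYPH_TO_UNICODE):
    """List the non-whitespace characters that still have no table entry."""
    return sorted({c for c in text if not c.isspace() and c not in table})


if __name__ == "__main__":
    copied = "ÈÙÚ Û"
    print(remap(copied))           # mapped characters are replaced
    print(unmapped_chars(copied))  # characters still to be identified
```

Running `unmapped_chars` over the whole copied text gives a worklist of characters to identify, which makes building the table less tedious.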
Last edited by willus; 08-09-2014 at 10:25 AM. |