02-21-2023, 06:51 AM | #2011 |
the rook, bossing Never.
Posts: 11,729
Karma: 87663463
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
|
I discovered a PDF (version 9 acrobat?) that wouldn't work properly in the Kobo Sage or Kobo Elipsa Crop. Also on Kobo you could only change page using the navigation slider; neither tap, swipe nor buttons would change page. It's actual text, so I might convert it to epub if I can export it sensibly. It does import to LO Draw, but using that to format anything other than paper size or margins is tedious.
Anyhow, cropping it before importing to Calibre sorted it, though the file got a little larger. After the crop you could page through it. It's not OCR text, just text. A copy and paste into any desktop program does give the text in the right order, but each line break in the PDF becomes a new line, so it can't be reflowed: the paragraph breaks look the same as the line breaks. Can K2pdfopt export the text with a newline only at paragraph breaks rather than at the end of every line in the PDF? It's not an OCR layer, only text. Last edited by Quoth; 02-21-2023 at 06:53 AM. |
02-21-2023, 03:04 PM | #2012 |
Grand Sorcerer
Posts: 5,326
Karma: 98809518
Join Date: Apr 2011
Device: pb360
|
Adding asciidoc or markdown markup should be almost trivial. Then newlines are ignored except for empty lines, which indicate a new paragraph.
Extremely long lines are editor unfriendly and are annoying, so I actually prefer text exports that haven't stripped newlines. |
02-21-2023, 04:15 PM | #2013 |
Fuzzball, the purple cat
Posts: 1,274
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
The information I get about a PDF file from the muPDF library is simply a list of characters and their X,Y positions on the page. There is nothing to indicate either a new line or a paragraph. You have to infer this solely from the character positions, so I'd have to deduce from the line spacings or indentation whether a paragraph was intended, which would be quite error prone. It's probably easier to hand edit the Unicode text out of k2pdfopt (-ocrout option).
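To illustrate why this is error prone, here is a minimal Python sketch of the kind of inference described above. It is not k2pdfopt's or muPDF's code. The `Char` record, the function names, and the `y_tol`, `gap_factor` and `indent_tol` thresholds are all assumptions made up for the example, and it assumes y grows down the page and that space characters appear in the list like any other character.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Char:
    """Hypothetical record: one character and its position on the page."""
    ch: str
    x: float
    y: float  # assumed to increase down the page


def group_lines(chars, y_tol=2.0):
    """Group characters whose y positions are within y_tol into lines."""
    lines = []
    for c in sorted(chars, key=lambda c: (c.y, c.x)):
        if lines and abs(c.y - lines[-1][-1].y) <= y_tol:
            lines[-1].append(c)
        else:
            lines.append([c])
    return [sorted(line, key=lambda c: c.x) for line in lines]


def split_paragraphs(lines, gap_factor=1.5, indent_tol=5.0):
    """Guess paragraph breaks from extra line spacing or indentation."""
    if not lines:
        return []
    left = min(line[0].x for line in lines)  # left margin of the text
    gaps = [b[0].y - a[0].y for a, b in zip(lines, lines[1:])]
    typical = median(gaps) if gaps else 0.0
    paragraphs = [[lines[0]]]
    for prev, cur in zip(lines, lines[1:]):
        gap = cur[0].y - prev[0].y
        indented = cur[0].x - left > indent_tol
        if gap > gap_factor * typical or indented:
            paragraphs.append([cur])  # looks like a new paragraph
        else:
            paragraphs[-1].append(cur)  # same paragraph, join with a space
    return [
        " ".join("".join(c.ch for c in line) for line in para)
        for para in paragraphs
    ]
```

Every threshold here is a guess that will be wrong for some document: headings, block quotes, lists and multi-column pages all break the spacing and indent assumptions.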
|
02-22-2023, 09:47 AM | #2014 |
the rook, bossing Never.
Posts: 11,729
Karma: 87663463
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
|
Paragraphs have either more indent or more space or start with a bolded word (all caps too usually) or are numbered list items. Only the ALL CAPS and list numbers survive text copy.
Rats. PDFs are evil to do anything with except print or read on a giant screen |
02-22-2023, 10:09 AM | #2015 |
the rook, bossing Never.
Posts: 11,729
Karma: 87663463
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
|
Actually k2pdfopt with the -ocrout option put two newlines at paragraph breaks, whereas I'd only got one with a copy/paste.
So I replaced all \n\n with ¬, then replaced all \n with a space, then replaced ¬ with \n. (But I wasn't using Windows Notepad.) Now it only needs a day's work to make into an epub. Almost looks sane in LO Writer. |
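For anyone who would rather script it, the three replacements above can be sketched in a few lines of Python. This is only an illustration of the same idea, not something k2pdfopt provides: the function name `reflow` is made up, and normalising Windows line endings first is an extra precaution added here.

```python
import re


def reflow(text: str) -> str:
    """Keep blank-line paragraph breaks, join all other line breaks with a space."""
    text = text.replace("\r\n", "\n")  # precaution: normalise Windows line endings
    paragraphs = re.split(r"\n{2,}", text.strip())  # blank lines mark paragraphs
    # Collapse each paragraph's internal newlines (and runs of spaces) to one space.
    return "\n".join(" ".join(p.split()) for p in paragraphs)
```

Splitting on blank lines avoids needing a placeholder character such as ¬, so it also works if the text happens to contain that character.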
02-22-2023, 11:11 AM | #2016 |
the rook, bossing Never.
Posts: 11,729
Karma: 87663463
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
|
First pass edit done; next I'll proof it as an epub and edit my annotations back in.
Thanks, @willus. |
02-25-2023, 12:38 PM | #2017 |
Fuzzball, the purple cat
Posts: 1,274
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
|
02-25-2023, 03:30 PM | #2018 |
the rook, bossing Never.
Posts: 11,729
Karma: 87663463
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
|
Only 20 minutes ago I finished the major edit/format/style pass in LO Writer and converted an extra docx save to epub2 in Calibre.
Still, beats typing it up from looking at the paperback! |
03-06-2023, 11:58 AM | #2019 |
Junior Member
Posts: 6
Karma: 10
Join Date: Mar 2023
Device: kindle oasis
|
I'm having problems downloading and installing k2pdfopt. I'm using a Mac, but the download always stalls. I click reload and it downloads, but when I install it I get the following:
Last login: Mon Mar 6 16:54:47 on ttys000
Davids-MacBook-Air:~ david$ /Users/david/Desktop/k2pdfopt ; exit;
/Users/david/Desktop/k2pdfopt: line 1: syntax error near unexpected token `<'
/Users/david/Desktop/k2pdfopt: line 1: `<html><head><title>Willus.com's K2pdfopt Download Page</title>'
logout
Saving session...
...copying shared history...
...saving history...truncating history files...
...completed.
[Process completed]
Any suggestions? |
03-07-2023, 01:51 AM | #2020 |
Junior Member
Posts: 6
Karma: 10
Join Date: Mar 2023
Device: kindle oasis
|
I am wondering whether k2pdfopt isn't downloading properly, as I have to reload it, or whether it isn't running correctly. I'm not sure what the syntax error near the unexpected token `<' means or how I could correct it.
I'd be very grateful for any suggestions or advice, as k2pdfopt looks like exactly what I'm looking for and there doesn't seem to be anything better as far as I can find. Many thanks in anticipation. |
03-07-2023, 10:07 AM | #2021 |
Member
Posts: 15
Karma: 50592
Join Date: Dec 2021
Device: Kobo Libra 2
|
You're downloading the source code of the download page, not the program.
The first line of the page source reads: <html><head><title>Willus.com's K2pdfopt Download Page</title> Fill in the captcha and click on the relevant software (v2.54 Mac OSX x86 64-bit?). |
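A quick way to confirm this diagnosis before trying to run a download is to look at the first bytes of the file. The following Python sketch is only an illustration: the function name `looks_like_html` and the 512-byte probe size are arbitrary choices, and the path in the usage note is the one from the terminal output above.

```python
def looks_like_html(path, probe=512):
    """Return True if the file starts like an HTML page rather than a binary."""
    with open(path, "rb") as f:
        head = f.read(probe).lstrip().lower()
    # A saved web page normally begins with one of these markers.
    return head.startswith((b"<html", b"<!doctype html"))
```

For example, `looks_like_html("/Users/david/Desktop/k2pdfopt")` returning True would mean the browser saved the download page itself, which matches the `<html>` line the shell complained about.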
03-07-2023, 12:00 PM | #2022 |
Junior Member
Posts: 6
Karma: 10
Join Date: Mar 2023
Device: kindle oasis
|
Thanks for explaining that. One step closer! However, I am clicking on the relevant software and that is what is downloading, so I'm not quite sure what is going on. Very frustrating.
|
03-07-2023, 02:09 PM | #2023 |
Member
Posts: 15
Karma: 50592
Join Date: Dec 2021
Device: Kobo Libra 2
|
It seems you are right-clicking (or control-clicking, on a Mac) on the rectangle (e.g. Mac OSX...) and then selecting Save as..., which saves the page, not the program.
Or try a different browser or wait till Willus responds. |
03-07-2023, 03:26 PM | #2024 |
Junior Member
Posts: 6
Karma: 10
Join Date: Mar 2023
Device: kindle oasis
|
Great! Good idea - I think a different browser might be better. I was wondering if perhaps the captcha is timing out before the download has fully completed? I'll try your suggestions - Thanks.
|
03-07-2023, 10:16 PM | #2025 |
Fuzzball, the purple cat
Posts: 1,274
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
|
Maybe the video on the mac help page will help?
|