Old 02-21-2023, 06:51 AM   #2011
Quoth
the rook, bossing Never.
Posts: 11,171
Karma: 85874891
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
I discovered a PDF (version 9 Acrobat?) that wouldn't work properly in the Kobo Sage or Kobo Elipsa Crop. Also, on Kobo you could only change page using the navigation slider; neither tap, swipe nor buttons would change page. It's actual text, so I might convert to epub if I can export it sensibly. It does import to LO Draw, but using that to format anything other than paper size or margins is tedious.

Anyhow, cropping it before importing to Calibre sorted it, though it got a little larger. After crop you could page through it.

It's not OCR text, just text. A copy and paste in any desktop program does give the text in the right order, but each line break in the PDF becomes a newline, so it can't be reflowed: paragraph breaks look the same as line breaks.

Can K2pdfopt export the text with a newline only at paragraph breaks rather than at the end of every line in the PDF? It's not an OCR layer, only text.

Last edited by Quoth; 02-21-2023 at 06:53 AM.
Old 02-21-2023, 03:04 PM   #2012
j.p.s
Grand Sorcerer
Posts: 5,285
Karma: 98804578
Join Date: Apr 2011
Device: pb360
Adding asciidoc or markdown markup should be almost trivial. Then newlines are ignored except for empty lines, which indicate a new paragraph.

Extremely long lines are editor unfriendly and are annoying, so I actually prefer text exports that haven't stripped newlines.
Old 02-21-2023, 04:15 PM   #2013
willus
Fuzzball, the purple cat
Posts: 1,273
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
The information I get about a PDF file from the MuPDF library is simply a list of characters and their X,Y positions on the page. There is no information to indicate either a new line or a paragraph; you have to infer this solely from the character positions, so I'd have to deduce from the line spacings or indentation whether a paragraph break was intended, which would be quite error prone. It's probably easier to hand edit the unicode text output from k2pdfopt (-ocrout option).
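
To illustrate why this is error prone, here is a rough Python sketch of the kind of inference involved. It is not how k2pdfopt or MuPDF works; the `Line` record, the `group_paragraphs` function and the two thresholds are hypothetical, and it assumes the characters have already been grouped into lines with a left edge and a baseline position.

```python
from dataclasses import dataclass
from statistics import median

@dataclass
class Line:
    x: float   # left edge of the first character (hypothetical units)
    y: float   # baseline position, increasing down the page
    text: str

def group_paragraphs(lines, gap_factor=1.5, indent_tol=5.0):
    """Guess paragraph breaks from line spacing and indentation only."""
    if not lines:
        return []
    # Typical distance between consecutive baselines on this page.
    gaps = [b.y - a.y for a, b in zip(lines, lines[1:])]
    typical_gap = median(gaps) if gaps else 0.0
    left_margin = min(line.x for line in lines)

    paragraphs = [[lines[0].text]]
    for prev, cur in zip(lines, lines[1:]):
        # A break is assumed when the gap is unusually large
        # or the line starts to the right of the left margin.
        big_gap = typical_gap > 0 and (cur.y - prev.y) > gap_factor * typical_gap
        indented = (cur.x - left_margin) > indent_tol
        if big_gap or indented:
            paragraphs.append([cur.text])
        else:
            paragraphs[-1].append(cur.text)
    return [" ".join(p) for p in paragraphs]
```

Centered headings, hanging indents, block quotes and multi-column pages would all fool thresholds like these, which is the sort of failure meant by "error prone" above.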
Old 02-22-2023, 09:47 AM   #2014
Quoth
Paragraphs have either more indent or more space, or start with a bolded word (usually all caps too), or are numbered list items. Only the ALL CAPS and list numbers survive text copy.

Rats. PDFs are evil to do anything with except print or read on a giant screen.
Old 02-22-2023, 10:09 AM   #2015
Quoth
Actually k2pdfopt with the -ocrout option put two newlines at paragraph breaks, whereas I'd only got one with a copy/paste.

So replaced all \n\n with ¬
replaced all \n with a "space"
replaced ¬ with \n

(But I wasn't using Windows Notepad.)
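
The three replacements above can be sketched in Python. This is only an illustration of the same steps, not anything built into k2pdfopt; the function name `unwrap_paragraphs` is made up, and it assumes the -ocrout text has been read into a string in which ¬ does not otherwise occur.

```python
def unwrap_paragraphs(text: str) -> str:
    """Join wrapped lines, keeping one newline per paragraph break."""
    # Normalise Windows and old Mac line endings first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    placeholder = "\u00ac"  # the ¬ character used as a temporary marker
    if placeholder in text:
        raise ValueError("placeholder already occurs in the text")

    text = text.replace("\n\n", placeholder)  # protect paragraph breaks
    text = text.replace("\n", " ")            # join lines inside a paragraph
    return text.replace(placeholder, "\n")    # restore paragraph breaks

if __name__ == "__main__":
    sample = "First line of a\nparagraph.\n\nSecond\nparagraph."
    print(unwrap_paragraphs(sample))
```

Words hyphenated across a line break are not rejoined by this and still need fixing by hand.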

Now only needs a day's work to make into an epub. Almost looks sane in LO Writer.
Old 02-22-2023, 11:11 AM   #2016
Quoth
First pass edit done, now proof later as an epub and edit back annotations.
Thanks, @willus.
Old 02-25-2023, 12:38 PM   #2017
willus
Quote:
Originally Posted by Quoth
First pass edit done, now proof later as an epub and edit back annotations.
Thanks, @willus.
Glad you had some success.
Old 02-25-2023, 03:30 PM   #2018
Quoth
Only 20 minutes ago finished major edit/format/style in LO Writer and conversion of extra docx save in Calibre to epub2.

Still, beats typing it up from looking at the paperback!
Old 03-06-2023, 11:58 AM   #2019
bamboohouses
Junior Member
Posts: 6
Karma: 10
Join Date: Mar 2023
Device: kindle oasis
I'm having problems downloading and installing k2pdfopt. I'm using a Mac, but the download always stalls. I click reload and it downloads, but when I install it I get the following:
Last login: Mon Mar 6 16:54:47 on ttys000
Davids-MacBook-Air:~ david$ /Users/david/Desktop/k2pdfopt ; exit;
/Users/david/Desktop/k2pdfopt: line 1: syntax error near unexpected token `<'
/Users/david/Desktop/k2pdfopt: line 1: `<html><head><title>Willus.com's K2pdfopt Download Page</title>'
logout
Saving session...
...copying shared history...
...saving history...truncating history files...
...completed.

[Process completed]

Any suggestions?
Old 03-07-2023, 01:51 AM   #2020
bamboohouses
I am wondering whether k2pdfopt isn't downloading properly, as I have to reload it, or whether it isn't running correctly. I'm not sure what the syntax error near the unexpected token `<' means or how I could correct it.

I'd be very grateful for any suggestions or advice, as k2pdfopt looks like exactly what I'm looking for and there doesn't seem to be anything better as far as I can find.

Many thanks in anticipation.
Old 03-07-2023, 10:07 AM   #2021
taddymack
Member
Posts: 15
Karma: 50592
Join Date: Dec 2021
Device: Kobo Libra 2
You're downloading the source code of the download page, not the program.
The first line of the source page reads:
<html><head><title>Willus.com's K2pdfopt Download Page</title>
Fill in the captcha and click on the relevant software (v2.54 Mac OSX x86 64-bit?).
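
If you want to check a download before trying to run it, a small Python sketch along these lines would tell the two cases apart. The function name `classify_download` and the path in the usage comment are only illustrative; the byte patterns are the standard Mach-O magic numbers used by macOS executables.

```python
# Leading bytes of macOS executables (Mach-O) as they appear on disk.
MACHO_MAGICS = (
    b"\xcf\xfa\xed\xfe",  # 64-bit Mach-O, little-endian
    b"\xce\xfa\xed\xfe",  # 32-bit Mach-O, little-endian
    b"\xca\xfe\xba\xbe",  # universal ("fat") binary
)

def classify_download(path: str) -> str:
    """Return 'html', 'mach-o' or 'unknown' based on the first bytes."""
    with open(path, "rb") as f:
        head = f.read(512)
    if head.startswith(MACHO_MAGICS):
        return "mach-o"
    # A saved web page starts with markup rather than binary data.
    text = head.lstrip().lower()
    if text.startswith((b"<html", b"<!doctype html")):
        return "html"
    return "unknown"

# Example (path taken from the terminal output earlier in the thread):
# print(classify_download("/Users/david/Desktop/k2pdfopt"))
```

A result of "html" means the browser saved the download page itself, which matches the syntax error near `<' shown above.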
Old 03-07-2023, 12:00 PM   #2022
bamboohouses
Thanks for explaining that. One step closer! However, I am clicking on the relevant software and that is what is downloading, so I'm not quite sure what is going on. Very frustrating.
bamboohouses is offline   Reply With Quote
Old 03-07-2023, 02:09 PM   #2023
taddymack
It seems you must be right-clicking on the rectangle (e.g. Mac OSX...), or control-clicking it on a Mac, and then selecting Save as..., which saves the page, not the program.
Or try a different browser or wait till Willus responds.
Old 03-07-2023, 03:26 PM   #2024
bamboohouses
Great! Good idea - I think a different browser might be better. I was wondering if perhaps the captcha is timing out before the download has fully completed? I'll try your suggestions - Thanks.
Old 03-07-2023, 10:16 PM   #2025
willus
Maybe the video on the Mac help page will help?