Old 08-12-2010, 01:35 PM   #31
vastav
Member
Posts: 18
Karma: 38
Join Date: Sep 2009
Location: San Francisco Bay Area
Device: none
Quote:
Originally Posted by Grauheim View Post
...I tried it on a fairly typical computer science paper in two-column ACM format. I've attached the pdf and the resulting ePub so you can look for yourselves....
Grauheim, I stumbled upon this thread from last year and got intrigued by your PDF. I ran it through my conversion software available at pdf2epub.com. I am attaching two resulting ePubs for your review -

1. HR GI 2009 - Camera Ready.epub: This was produced from your original PDF with just one change: I deleted all of the existing tags in the PDF, which looked wrong. The ePub comes out mostly fine, apart from some formula images getting scattered.

2. HR GI 2009 - Camera Ready_mod.epub: This was produced after I spent a couple of minutes tagging the problematic formulas in the PDF as images before creating the ePub. Details on how to fix tagging issues like this are available on my website under Help.

Our primary solution is an Acrobat plugin, but we also have a web-based solution available at http://www.pdf2epub.com. I'd be interested in hearing your feedback and would be glad to help resolve any issues you find.
Attached Files
File Type: epub HR GI 2009 - Camera Ready.epub (315.5 KB, 297 views)
File Type: epub HR GI 2009 - Camera Ready_mod.epub (278.0 KB, 279 views)
File Type: pdf HR GI 2009 - Camera Ready.pdf (520.2 KB, 362 views)

Last edited by vastav; 08-12-2010 at 02:46 PM.
Old 08-12-2010, 01:48 PM   #32
JSWolf
Resident Curmudgeon
Posts: 73,957
Karma: 128903250
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
The solution is not to start with PDF as a source.
Old 08-12-2010, 01:56 PM   #33
vastav
Member
Posts: 18
Karma: 38
Join Date: Sep 2009
Location: San Francisco Bay Area
Device: none
Quote:
Originally Posted by JSWolf View Post
The solution is not to start with PDF as a source.
Certainly that is the ideal scenario if you have access to the original source document that created the PDF. But often that is not the case, which is why a variety of conversion solutions exist...
Old 01-04-2011, 09:58 AM   #34
johnnyb
Cloud Reader
Posts: 1,110
Karma: 4000066
Join Date: Aug 2010
Device: Kindle Oasis, Kindle Scribe, iPad Pro 11
This app is free right now and has done a great job on all the (simple) PDFs I've fed into it...
The downside is that it doesn't produce an ePub-style ToC; instead it creates a hyperlinked ToC at the beginning of the document and deletes all the hyperlinks that were there before...
For $0 I can still recommend it for many scenarios...
Old 01-04-2011, 01:33 PM   #35
screwballl
NewKindler
Posts: 504
Karma: 1865773
Join Date: Dec 2010
Location: NWFL
Device: Kindle3 Wifi
Not sure where you found that; they list it at $39.95.
Old 01-04-2011, 01:37 PM   #36
frabjous
Wizard
Posts: 1,213
Karma: 12890
Join Date: Feb 2009
Location: Amherst, Massachusetts, USA
Device: Sony PRS-505
The website link here says "PDF to EPUB Authors Promtion [sic]
Get PDF to EPUB FREE until the end of 2010. PDF to EPUB normally priced at US$39.95. Get it free, CLICK HERE".

Since it is no longer 2010, I don't know whether or not filling out this form would still work.

Since it's for Windows only, it's useless to me, however.
Old 01-04-2011, 01:46 PM   #37
screwballl
NewKindler
Posts: 504
Karma: 1865773
Join Date: Dec 2010
Location: NWFL
Device: Kindle3 Wifi
oh yeah... that stupid useless program... it has already been covered and shot down, failing miserably here:

https://www.mobileread.com/forums/sho...d.php?t=113371
Old 08-30-2018, 06:25 PM   #38
kofii12345
Junior Member
Posts: 2
Karma: 10
Join Date: Nov 2017
Device: Android phone and Kindle
The website is dead and I cannot verify my licence in the program. How can I get that to work? This was the best PDF converter. Any other recommendations?
Old 08-30-2018, 06:32 PM   #39
fjtorres
Grand Sorcerer
Posts: 11,732
Karma: 128354696
Join Date: May 2009
Location: 26 kly from Sgr A*
Device: T100TA,PW2,PRS-T1,KT,FireHD 8.9,K2, PB360,BeBook One,Axim51v,TC1000
Quote:
Originally Posted by kofii12345 View Post
The website is dead and I cannot verify my licence in the program. How can I get that to work? This was the best PDF converter. Any other recommendations?
Try the trial for FlexiPDF here:

https://www.softmaker.com/en/downloads/trials

(Their office software is pretty good for the price. And it exports workable epub. Good enough for proofing, anyway.)

FlexiPDF works about as well as any pdf converter I've seen (which is to say headers and footers are always a mess) but it is cheap and regularly on sale.
There doesn't seem to be much demand for pdf converters anymore and I suspect pdf use is declining dramatically.

Last edited by fjtorres; 08-30-2018 at 06:35 PM.
Old 08-30-2018, 06:56 PM   #40
sealbeater
Banned
Posts: 666
Karma: 1752814
Join Date: Jan 2008
Device: Sony Reader PRS-505 : Onyx Boox Max : Sony PRS-900 : Onyx Kepler Pro
This makes me want to write a script that can do it.
Old 08-30-2018, 10:31 PM   #41
rcentros
eReader Wrangler
Posts: 7,441
Karma: 48453105
Join Date: Mar 2013
Location: Boise, ID
Device: PB HD3, GL3, Tolino Vision 4, Voyage, Clara HD
For the last PDF book I converted, I used the Poppler utility pdftotext with the -layout and -nopgbrk options, cleaned it up in Jstar (my text editor), moved it into LibreOffice, and saved it as ODT. Then I just used Calibre to convert it to ePub. Worked out pretty well.

I use Linux, but the Poppler utilities are available for Windows as well.

http://blog.alivate.com.au/poppler-windows/
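
For anyone who wants to script the workflow described above, here is a minimal sketch in Python. The pdftotext flags are the ones named in the post; everything else is an assumption: the headless LibreOffice call (soffice --convert-to) stands in for the manual "save as ODT" step, ebook-convert is Calibre's command-line converter, and the hand cleanup in a text editor is skipped entirely. The function names build_commands and run_pipeline are made up for illustration.

```python
import subprocess
from pathlib import Path


def build_commands(pdf_path: str) -> list[list[str]]:
    """Return the command lines for a pdftotext -> ODT -> EPUB pipeline."""
    pdf = Path(pdf_path)
    txt = pdf.with_suffix(".txt")
    odt = pdf.with_suffix(".odt")
    epub = pdf.with_suffix(".epub")
    return [
        # Poppler: keep the physical layout, do not emit page-break characters.
        ["pdftotext", "-layout", "-nopgbrk", str(pdf), str(txt)],
        # Assumption: headless LibreOffice replaces the manual save-as-ODT step.
        ["soffice", "--headless", "--convert-to", "odt",
         "--outdir", str(odt.parent), str(txt)],
        # Calibre's command-line converter picks formats from the file extensions.
        ["ebook-convert", str(odt), str(epub)],
    ]


def run_pipeline(pdf_path: str) -> None:
    """Run each step in order and stop at the first failure."""
    for cmd in build_commands(pdf_path):
        subprocess.run(cmd, check=True)
```

Calling run_pipeline("book.pdf") requires all three tools on the PATH. In practice the text file almost always needs manual cleanup between the first and second step, which is the part this sketch leaves out.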
Old 08-30-2018, 10:36 PM   #42
rkomar
Wizard
Posts: 2,986
Karma: 18343081
Join Date: Oct 2010
Location: Sudbury, ON, Canada
Device: PRS-505, PB 902, PRS-T1, PB 623, PB 840, PB 633
Quote:
Originally Posted by sealbeater View Post
This makes me want to write a script that can do it.
Do what? Somehow verify the license?

If you mean convert PDF to EPUB, then it probably wouldn't be a simple script. The PDF structure is well defined and public knowledge; you can get the reference manuals for it from Adobe. It is, however, really complicated. PDF files are mostly giant programs to be executed within a state engine for rendering pixels in an image, quite different from something put together using a markup language like EPUB. You could do a rough job by scraping the data out of the program, but that would miss a lot of what else is in there, and would also depend on the data being placed in some acceptable order within the program (which it doesn't have to be).
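
To illustrate the ordering problem, here is a rough sketch in Python. It assumes the text has already been scraped into positioned fragments (the Fragment type, its fields, and the 2-point line tolerance are all hypothetical, not part of any PDF library) and rebuilds reading order for a simple single-column page from coordinates alone, ignoring the order in which the fragments appeared in the file.

```python
from typing import NamedTuple


class Fragment(NamedTuple):
    x: float   # left edge, in PDF points
    y: float   # baseline; the PDF origin is bottom-left, so larger y is higher up
    text: str


def reading_order(fragments: list[Fragment], line_tolerance: float = 2.0) -> list[str]:
    """Group fragments into lines by baseline, then read top-to-bottom, left-to-right."""
    lines: list[list[Fragment]] = []
    for frag in sorted(fragments, key=lambda f: -f.y):
        # Fragments whose baselines are within the tolerance share a line.
        if lines and abs(lines[-1][0].y - frag.y) <= line_tolerance:
            lines[-1].append(frag)
        else:
            lines.append([frag])
    return [" ".join(f.text for f in sorted(line, key=lambda f: f.x)) for line in lines]


# Fragments deliberately listed out of order, as a content stream may store them.
page = [
    Fragment(72, 686, "second line"),
    Fragment(110, 700.5, "world"),
    Fragment(72, 700, "Hello"),
]
print(reading_order(page))
```

This only works when the page really is one column of horizontal text; rotated text, tables, and multiple columns all break the "sort by position" assumption.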
Old 08-31-2018, 04:16 PM   #43
sealbeater
Banned
Posts: 666
Karma: 1752814
Join Date: Jan 2008
Device: Sony Reader PRS-505 : Onyx Boox Max : Sony PRS-900 : Onyx Kepler Pro
Quote:
Originally Posted by rkomar View Post
Do what? Somehow verify the license?

If you mean convert PDF to EPUB, then it probably wouldn't be a simple script. The PDF structure is well defined and public knowledge; you can get the reference manuals for it from Adobe. It is, however, really complicated. PDF files are mostly giant programs to be executed within a state engine for rendering pixels in an image, quite different from something put together using a markup language like EPUB. You could do a rough job by scraping the data out of the program, but that would miss a lot of what else is in there, and would also depend on the data being placed in some acceptable order within the program (which it doesn't have to be).
I don't think it would be very difficult at all, actually. Depends on the nature of the PDF, of course. Even if it's just images, it may be doable. It's just a matter of leveraging already existing tools. pdftotext, pdftohtml, pdftops and pdftodvi all already exist. I've never tried to make an epub before, so I would need to read up on the structure. Anyway, it's not something I have time for anytime soon; I was just commenting that 99 bucks seems high for something that could probably be scripted out in 20 minutes.
Old 08-31-2018, 06:21 PM   #44
jackie_w
Grand Sorcerer
Posts: 6,212
Karma: 16534894
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
Quote:
Originally Posted by sealbeater View Post
I don't think it would be very difficult at all, actually. Depends on the nature of the PDF, of course. Even if it's just images, it may be doable. It's just a matter of leveraging already existing tools. pdftotext, pdftohtml, pdftops and pdftodvi all already exist. I've never tried to make an epub before, so I would need to read up on the structure. Anyway, it's not something I have time for anytime soon; I was just commenting that 99 bucks seems high for something that could probably be scripted out in 20 minutes.
This sounds like a statement from someone who has not fully understood the problem!

Creating the epub is the easy bit. In my experience, extracting the PDF contents and creating algorithms to rejoin the text fragments into paragraphs with the correct text in the correct order is the hard bit. Not to mention making sure you don't lose bold, italics and scene breaks in the process. Getting rid of page headers/footers and unwanted end-of-line hyphens also presents a challenge. Extracting all the images is also a hit-and-miss affair. I could go on ...
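
As a small illustration of the paragraph and hyphen problem, here is a naive sketch in Python. It assumes plain text where paragraphs are separated by blank lines and treats a trailing hyphen followed by a lowercase letter as a split word. That heuristic is a guess, not a rule: it will wrongly glue genuinely hyphenated words such as "self-" / "contained", which is exactly why this step is harder than it looks. The function name rejoin_paragraphs is made up for illustration.

```python
import re


def rejoin_paragraphs(text: str) -> list[str]:
    """Split on blank lines, then merge each block's hard-wrapped lines into one paragraph."""
    paragraphs = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        joined = ""
        for line in lines:
            if joined.endswith("-") and line[:1].islower():
                # Probably a word split across lines: drop the hyphen, glue the halves.
                joined = joined[:-1] + line
            elif joined:
                joined += " " + line
            else:
                joined = line
        if joined:
            paragraphs.append(joined)
    return paragraphs


sample = "The conver-\nsion was mostly\nfine.\n\nNext paragraph."
print(rejoin_paragraphs(sample))
```

Bold, italics, and scene breaks are not addressed at all here, because plain-text extraction has already thrown that information away.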

Once you've sorted out the above for simple fiction books you'll need to solve the problem of PDFs with text in multiple columns and handling footnotes if you're going to convert non-fiction PDFs.
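
A sketch of the simplest possible column handling, in Python, shows how many assumptions it takes. It assumes exactly two columns, a gutter at the middle of the page, fragments given as hypothetical (x, y, text) tuples with a bottom-left origin, and no full-width titles, figures, or footnotes. Real two-column papers violate most of these.

```python
def two_column_order(fragments: list[tuple[float, float, str]], page_width: float) -> list[str]:
    """Read the whole left column top-to-bottom, then the whole right column."""
    gutter = page_width / 2  # assumption: the columns split at the page midpoint
    left = [f for f in fragments if f[0] < gutter]
    right = [f for f in fragments if f[0] >= gutter]

    def top_down(column):
        # Larger y is higher on the page, so sort by descending y.
        return [text for _, _, text in sorted(column, key=lambda f: -f[1])]

    return top_down(left) + top_down(right)


page = [(320, 700, "R1"), (72, 700, "L1"), (72, 686, "L2"), (320, 686, "R2")]
print(two_column_order(page, page_width=612))
```

A title spanning both columns, or a footnote at the bottom of the left column, would end up in the wrong place with this approach.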

If you think you can come up with a generic "magic button" solution for converting any/all PDFs to high-quality epub by writing a script in 20 minutes (or 20 hours or 20 days) I suggest you drop all your current projects, including your day job. I suspect you'd be able to retire on the proceeds. You may even be considered the New Messiah.
Old 08-31-2018, 07:00 PM   #45
sealbeater
Banned
Posts: 666
Karma: 1752814
Join Date: Jan 2008
Device: Sony Reader PRS-505 : Onyx Boox Max : Sony PRS-900 : Onyx Kepler Pro
Quote:
Originally Posted by jackie_w View Post
This sounds like a statement from someone who has not fully understood the problem!
Perhaps. Perhaps you are overthinking it? What do I know, I've never ever ever had to convert or deal with PDFs before.

/s for those who missed it.

Quote:
Originally Posted by jackie_w View Post
Creating the epub is the easy bit. In my experience extracting the PDF contents and creating algorithms to rejoin the text fragments into paragraphs with correct text in the correct order is the hard bit.
You've used the Poppler tools?


Quote:
Originally Posted by jackie_w View Post
Not to mention making sure you don't lose bold, italics and scene breaks in the process.
I'm not that big of a stickler but if PostScript supports it, I'm sure I could extract it.


Quote:
Originally Posted by jackie_w View Post
Getting rid of page headers/footers and unwanted end-of-line hyphens also presents a challenge.
Sed works wonders.
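
For what a line-oriented filter of this kind might look like, here is a sketch in Python rather than sed. It assumes the text has been split into pages (for example on the form-feed characters pdftotext emits when -nopgbrk is not used) and that a running header or footer is the first or last line of most pages, with only the page number changing. The function name strip_running_heads and the 50% threshold are assumptions for illustration.

```python
import re
from collections import Counter


def _normalize(line: str) -> str:
    """Collapse digits so 'Page 3' and 'Page 12' compare equal."""
    return re.sub(r"\d+", "#", line.strip())


def strip_running_heads(pages: list[list[str]], min_share: float = 0.5) -> list[list[str]]:
    """Drop first/last lines that repeat on more than min_share of the pages."""
    counts: Counter = Counter()
    for page in pages:
        if page:
            # Count each edge pattern once per page.
            counts.update({_normalize(page[0]), _normalize(page[-1])})
    threshold = len(pages) * min_share
    repeated = {key for key, n in counts.items() if n > threshold}

    cleaned = []
    for page in pages:
        body = list(page)
        if body and _normalize(body[0]) in repeated:
            body = body[1:]
        if body and _normalize(body[-1]) in repeated:
            body = body[:-1]
        cleaned.append(body)
    return cleaned


pages = [
    ["My Book", "text a", "Page 1"],
    ["My Book", "text b", "Page 2"],
    ["My Book", "text c", "Page 3"],
]
print(strip_running_heads(pages))
```

Headers that alternate between book title and chapter title on odd and even pages would fall under the threshold and survive, so the threshold needs tuning per book.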

Quote:
Originally Posted by jackie_w View Post
Extracting all the images is also a hit-and-miss affair. I could go on ...

What's hit or miss about it?


Quote:
Originally Posted by jackie_w View Post
Once you've sorted out the above for simple fiction books you'll need to solve the problem of PDFs with text in multiple columns and handling footnotes if you're going to convert non-fiction PDFs.
That is something I would have to think about but I believe when there's a will, there's a way. Maybe something with PostScript and multi-line justification. I would have to investigate should I ever have enough time and interest.




Quote:
Originally Posted by jackie_w View Post
If you think you can come up with a generic "magic button" solution for converting any/all PDFs to high-quality epub by writing a script in 20 minutes (or 20 hours or 20 days) I suggest you drop all your current projects, including your day job. I suspect you'd be able to retire on the proceeds. You may even be considered the New Messiah.
I think I could do a good enough job to meet my needs. I like how much you qualified my statement with requirements. "Any/all PDFs", "high-quality epub" (whatever that means), etc. I think I could come up with a good solution. I know I would try my hand at it rather than shell out 99 bucks. I appreciate your suggestion that I drop all my current projects, including my day job, to pursue this, but my day job pays quite well, and it's my skill at coming up with solutions to problems like this that gets me paid so well. I already am able to retire if I were to choose to do so.

You are free to regard me as your New Messiah if you like, however.