Old 03-11-2017, 06:58 AM   #1
Difermo
Member
Difermo began at the beginning.
 
Posts: 20
Karma: 10
Join Date: Nov 2013
Device: none
Mathematical equations, technical mechanics and all kinds of diagrams

I'm sorry for opening a new thread if one already exists. I searched on Google and everything I found is too old, dated 2012 or 2009, and I guess a lot has changed since then.

I need to put some books on my e-reader, a Kindle Paperwhite. But if MOBI or AZW3 cannot support that, I'm ready to buy an EPUB reader.

The books I need to put on it are for civil engineering, so there is a lot of math: trigonometry, equations, integrals, lots of diagrams, and tables.
Here are some examples:

[Attached example images]

Any suggestions on how to resolve this?
Old 03-11-2017, 04:54 PM   #2
Tex2002ans
Wizard
Tex2002ans ought to be getting tired of karma fortunes by now.
 
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by Difermo View Post
The books I need to put on it are for civil engineering, so there is a lot of math: trigonometry, equations, integrals, lots of diagrams, and tables.
I suspect these are just PDF scans of books?

Your best bet would probably be reading this as a PDF on a larger screen (tablet/monitor).

Depending on the PDF, you may be able to do some cropping to make it a bit easier to read on your device. For example, using a tool like k2pdfopt:

https://www.mobileread.com/forums/sh...d.php?t=144711

but most of the time it just might not be possible to shrink a very large and complex 8.5"x11" page into a smaller screen.
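
To put rough numbers on that, here is a minimal sketch. It assumes a 6" e-reader screen with a 3:4 aspect ratio (about 3.6" x 4.8", an assumption rather than a measured spec) and 10 pt body text in the original; the function name `fit_scale` is only for illustration.

```python
def fit_scale(page_w, page_h, screen_w, screen_h):
    """Largest uniform scale that fits the whole page on the screen."""
    return min(screen_w / page_w, screen_h / page_h)


# Assumed dimensions in inches: US Letter page, 6" screen at 3:4 (3.6 x 4.8).
scale = fit_scale(8.5, 11.0, 3.6, 4.8)

# Assumed original body text size in points.
effective_pt = 10 * scale

print(f"scale factor: {scale:.2f}")
print(f"10 pt text appears as about {effective_pt:.1f} pt")
```

On those assumptions the text ends up at less than half its printed size, which is why cropping the margins only helps so much.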

Converting this type of complex material to a proper ebook is EXTREMELY labor intensive... and if the publisher doesn't release an ebook directly from the source material... it probably wouldn't be worth the time invested for a single individual to OCR (easily tens/hundreds of hours).

The more complex the layout (multi-column, lots of footnotes, tables, figures, captions, equations, [...]) the harder the books are to convert using OCR + the more manual intervention would be needed to fix all the broken formatting.

Quote:
Originally Posted by Difermo View Post
I need to put some books on my e-reader, a Kindle Paperwhite. But if MOBI or AZW3 cannot support that, I'm ready to buy an EPUB reader.
Dedicated ereaders can read PDFs... but the experience is typically very poor: slow/sluggish page turns, having to pan/scan, not being able to easily highlight text or take notes, can't resize text, etc. etc.

For example, here is The Digital Reader showing off PDFs on a Kobo Aura One (Kindles/Nooks/others are similar):

http://the-digital-reader.com/2016/0...just-no-video/

Quote:
Originally Posted by Difermo View Post
I'm sorry for opening a new thread if one already exists. I searched on Google and everything I found is too old, dated 2012 or 2009, and I guess a lot has changed since then.
The largest change in this kind of material is probably MathML in EPUB3.

Equations/Formulas? Each equation is going to have to be included as a bitmap image (or SVG or MathML).

Each and every equation would require laborious double-checking to make sure it is correct and require some serious markup.

The only program/engine I know of that handles OCRing Formulas is InftyReader:

http://www.sciaccess.net/en/InftyReader/

and that costs $800+.

Side Note: Back in 2013 I wrote a topic, "Tutorial: Formulas to PNG", where I sort of show off one method of digitizing equations (using LibreOffice Math):

https://www.mobileread.com/forums/sh...d.php?t=223254

I have used this to recreate formulas in books that had <50 equations... now I tend to prefer using LaTeX as a middleman... but it STILL requires a massive amount of manual work per equation. I shudder to think how long it would take working on a book that is as full as your example pages.
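
As a rough illustration of what keying in equations through LaTeX can look like (a generic textbook example, not taken from the books in question), here are the maximum bending moment and midspan deflection of a simply supported beam of span $L$ under a uniform load $w$, with flexural rigidity $EI$:

```latex
\begin{equation}
  M_{\max} = \frac{w L^{2}}{8},
  \qquad
  \delta_{\max} = \frac{5 w L^{4}}{384\, E I}
\end{equation}
```

Every symbol, exponent, and fraction has to be typed and then checked against the printed page, which is where the time goes.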

Last edited by Tex2002ans; 03-11-2017 at 05:10 PM.
Old 03-11-2017, 08:23 PM   #3
Nate the great
Sir Penguin of Edinburgh
Nate the great ought to be getting tired of karma fortunes by now.
 
Posts: 12,375
Karma: 23555235
Join Date: Apr 2007
Location: DC Metro area
Device: Shake a stick plus 1
Get an iPad. Use Goodreader.

Seriously, this type of content requires a _large_ screen. And honestly, the iPad is the best for this.
Old 03-12-2017, 06:32 AM   #4
Difermo
Member
Difermo began at the beginning.
 
Posts: 20
Karma: 10
Join Date: Nov 2013
Device: none
Thanks @Tex2002ans for the very detailed answer.
@Nate the great, thanks to you too.

I guess it would be best to use a 9.7" tablet (iPad or some Android device).
Some of the pictures are A4 paper format and some are smaller, but 90% are books with dimensions of 16 cm x 24 cm.
These are books for the building profession. The professor advised us to make maximum use of today's technology. Earlier it was difficult, actually not possible, to haul all the books to the construction site, yet you always need to be reminded of some little thing. So I thought I would process the books for the Kindle. Given that this is too complicated, it is better to keep them in PDF format on the tablet. It is better to carry a tablet than a suitcase with 20 books.

It is too bad that the Kindle or Kobo aren't that powerful. On a tablet, in a PDF, I can't hold a word and see what it means (if I do not understand it). That was the second reason for me to put the books on my Kindle Paperwhite.
Old 03-12-2017, 07:34 AM   #5
Notjohn
mostly an observer
Notjohn ought to be getting tired of karma fortunes by now.
 
Posts: 1,519
Karma: 987654
Join Date: Dec 2012
Device: Kindle
Quote:
Dedicated ereaders can read PDFs... but the experience is typically very poor: slow/sluggish page turns, having to pan/scan, not being able to easily highlight text or take notes, can't resize text, etc. etc.
A publisher once sent me a review copy as a PDF, and it was two pages up! So yes, I had to do a lot of finger work on my 7-inch Fire tablet, but I did read the book, and it was better than ordering up a commercial copy. So that was what? -- 8.5 by 11 inches? Of course I only had to read half of it at any given moment, so maybe that's not a fair example.
Old 03-12-2017, 03:29 PM   #6
BetterRed
null operator (he/him)
BetterRed ought to be getting tired of karma fortunes by now.
 
Posts: 21,736
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by Difermo View Post
It is too bad that the Kindle or Kobo aren't that powerful. On a tablet, in a PDF, I can't hold a word and see what it means (if I do not understand it). That was the second reason for me to put the books on my Kindle Paperwhite.
I assume you mean something like this

[Attached image: Capture.JPG]

All I did was highlight the word 'manuscript' (double click) in the PDF (opened in PDF xChange) and press Ctrl+Shift+`; voila, WordWeb popped up with a definition. The only customisation was to assign that particular key sequence as the WordWeb hotkey. I have another gadget (clickto) that will look up the highlighted text on the Web (Google, Wikipedia, Wiktionary, etc., or even MobileRead).

If something similar can't be done on Android or iOS I'd be surprised, but if not, get a Surface; it's a walk in the park with Windows.

One of my dislikes of Android (and iOS I would guess, were I to use it) is the difficulty of getting 'apps' to interoperate, walled gardens full of walled gardens, or as I prefer - nests of recursive arboreta.

BR
Old 03-12-2017, 04:08 PM   #7
PeterT
Grand Sorcerer
PeterT ought to be getting tired of karma fortunes by now.
 
Posts: 13,533
Karma: 78910202
Join Date: Nov 2007
Location: Toronto
Device: Libra H2O, Libra Colour
I expect that the ability to press on a word in a PDF will depend on whether or not there is a text layer (not sure of the real term) in the PDF. If the PDF is purely a scanned image I doubt that any lookup would be possible.
Old 03-12-2017, 04:56 PM   #8
BetterRed
null operator (he/him)
BetterRed ought to be getting tired of karma fortunes by now.
 
Posts: 21,736
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
True - maybe the Textify TSR (or whatever they're called now) might help.

Failing that, screenshot the page [Alt/PrtScn], paste it into a OneNote note [WinKey+N, Ctrl+V], use its Copy Text from Picture feature [Menu Key, x], paste the resultant text into the note [Ctrl+V], and now do the lookup from the just-pasted text (correct if necessary).

That might even be doable with the Android and iOS versions of OneNote.

But for scanned image PDFs, Evernote might be better: it can embed a PDF in a note and do the OCR in one fell swoop. Lawyers love it. Evernote runs on most platforms except Linux (or last time I looked it didn't). And you can access your notebooks via the web without buggering about with DropBox and the like.

BR
Old 03-12-2017, 05:39 PM   #9
Difermo
Member
Difermo began at the beginning.
 
Posts: 20
Karma: 10
Join Date: Nov 2013
Device: none
@BetterRed

But that works only if the text can be selected.
I do not have the original PDFs of all the books. I'm taking pictures of them, so to be able to select text they must be OCRed.
I think the OCR will be very bad and destroy lines, tables, etc., so the work to fix everything will probably be huge.
I'm still searching for the best way to create a PDF from pictures. They are not all the same size since my hand is not always at the same distance. I will have to make some DIY book scanner.
Old 03-12-2017, 07:43 PM   #10
Tex2002ans
Wizard
Tex2002ans ought to be getting tired of karma fortunes by now.
 
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by Difermo View Post
I do not have the original PDFs of all the books. I'm taking pictures of them, so to be able to select text they must be OCRed.
I think the OCR will be very bad and destroy lines, tables, etc., so the work to fix everything will probably be huge.
As PeterT said, you could create a PDF with the image layer on top and the invisible text layer (OCR) on the bottom.

For example, that is how you can search through all the books on Archive.org:

https://archive.org/details/engineeringbook00yeom

The most accurate Open Source program is probably tesseract:

https://github.com/tesseract-ocr/tesseract

but it is commandline only (there are a few programs based off of it that do have a GUI).
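
For reference, a minimal sketch of driving the tesseract command line from Python to get that kind of searchable PDF (page image plus an invisible text layer). It assumes tesseract and its English language data are installed; the file names and helper function names are hypothetical.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, output_base, lang="eng"):
    """Return the argument list for: tesseract IMAGE OUTPUTBASE -l LANG pdf."""
    # The trailing "pdf" config asks tesseract to write OUTPUTBASE.pdf,
    # a PDF containing the page image with an invisible OCR text layer.
    return ["tesseract", image_path, output_base, "-l", lang, "pdf"]


def ocr_to_searchable_pdf(image_path, output_base, lang="eng"):
    """Run tesseract on one page image (requires tesseract on the PATH)."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    subprocess.run(build_tesseract_command(image_path, output_base, lang), check=True)
    return output_base + ".pdf"


# Hypothetical file names, shown only to illustrate the resulting command line.
command = build_tesseract_command("page_001.png", "page_001")
print(" ".join(command))
```

Calling `ocr_to_searchable_pdf("page_001.png", "page_001")` would then write `page_001.pdf` next to the image, provided tesseract is actually installed.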

I haven't tested it in years, but last I tested there were serious inaccuracies with Formatted Text (carrying over Italics/Bold/Smallcaps/Superscript/Subscript) and you had to do a ton of finagling with dictionaries + training. I also have no idea how well it handles complex formatting like Tables or Charts/Graphs with captions.

The most accurate Proprietary OCR is ABBYY Finereader (this is what I use):

https://www.abbyy.com/en-us/finereader/

It costs a bit of money ($199 for the latest version), but if you value your time, it will save you A TON of headaches.

The examples you gave of written Maths or complex equations are just not going to work well with ANY OCR programs... but at least you would be able to have all of the normal text in a book OCRed/searchable + accurate. :P

Quote:
Originally Posted by Difermo View Post
I'm still searching for the best way to create a PDF from pictures. They are not all the same size since my hand is not always at the same distance. I will have to make some DIY book scanner.
The worse your input, the worse the OCR... and the worse your output will be.

Taking pictures with your shaky hand/phone is not ideal because you would most likely get very fuzzy text. This is ok if you are a human trying to quickly read the image, but disastrous for OCR.

The DIY Book Scanner forums discuss quite a few designs people have rigged up + their workflows:

https://forum.diybookscanner.org/

and we also discussed quite a lot of this in the topic, "Delicate text digitalizing + scanning issues":

https://www.mobileread.com/forums/sh...d.php?t=234146
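
On the "pictures are not all the same size" problem, one possible approach is to rescale every photo to a common page width before bundling them into a single image-only PDF. A minimal sketch, assuming the third-party Pillow library is installed; the function names and the 1240-pixel default width are illustrative assumptions.

```python
def target_size(width, height, page_width=1240):
    """Scale (width, height) to page_width, preserving the aspect ratio."""
    scale = page_width / width
    return page_width, round(height * scale)


def photos_to_pdf(paths, out_path, page_width=1240):
    """Resize each photo to the same width and save them all as one PDF."""
    from PIL import Image  # Pillow (third-party), imported only when needed

    if not paths:
        raise ValueError("no input photos given")
    pages = []
    for path in paths:
        with Image.open(path) as img:
            size = target_size(img.width, img.height, page_width)
            # PDF output needs a mode without an alpha channel, hence RGB.
            pages.append(img.convert("RGB").resize(size))
    pages[0].save(out_path, "PDF", save_all=True, append_images=pages[1:])


# Example of the size calculation alone: a 3000 x 4000 photo at width 1500.
print(target_size(3000, 4000, 1500))
```

This only evens out the page dimensions; it does not deskew or sharpen the photos, so it does nothing for OCR quality.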
Old 03-12-2017, 08:14 PM   #11
BetterRed
null operator (he/him)
BetterRed ought to be getting tired of karma fortunes by now.
 
Posts: 21,736
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by Difermo View Post
@BetterRed

But that works only if the text can be selected.
I do not have the original PDFs of all the books. I'm taking pictures of them, so to be able to select text they must be OCRed.
I think the OCR will be very bad and destroy lines, tables, etc., so the work to fix everything will probably be huge.
I'm still searching for the best way to create a PDF from pictures. They are not all the same size since my hand is not always at the same distance. I will have to make some DIY book scanner.
Ah-ha, I assumed, as others did, that the books in question were already scanned into image PDF's.
Old 03-12-2017, 08:28 PM   #12
Difermo
Member
Difermo began at the beginning.
 
Posts: 20
Karma: 10
Join Date: Nov 2013
Device: none
Thanks for the answer. I have ABBYY but I'm not going to use it this time.
OCRing only parts and merging them with diagrams and pictures to create a searchable PDF is too much work and time, and there are lots of books to do. I would probably give up after 10-20 pages once I realised how much work there is to be done. Since I have very little free time, it is best just to take pictures and merge them into one PDF.
Maybe 10 or 20 years from now, when someone has done it, I will buy the books.

I'll just make a book scanner, take pictures, create a PDF from them and use it like that. If I need to look up a word I don't know, I will check it manually in the dictionary.

Quote:
Originally Posted by BetterRed View Post
Ah-ha, I assumed, as others did, that the books in question were already scanned into image PDF's.
Some books are in PDF format (maybe 20%), but lots of them are not. And there are lots of my notes.

Last edited by Difermo; 03-12-2017 at 08:32 PM.