#1
Member
Posts: 20
Karma: 10
Join Date: Nov 2013
Device: none
Mathematical equations, technical mechanics, and all kinds of diagrams
I'm sorry for opening a new thread if one already exists. I searched on Google, and everything I found is too old, dated 2009 or 2012. I guess a lot has changed since then.
I need to put some books on my e-reader, a Kindle Paperwhite. If MOBI or AZW3 can't support this, I'm ready to buy an EPUB reader. The books are for civil engineering, so there is a lot of math: trigonometry, equations, integrals, plus lots of diagrams and tables. Here are some examples: [images]

Any suggestions on how to resolve this?

#2
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
Your best bet would probably be reading this as a PDF on a larger screen (tablet/monitor).

Depending on the PDF, you may be able to do some cropping to make it a bit easier to read on your device, for example with a tool like k2pdfopt:

https://www.mobileread.com/forums/sh...d.php?t=144711

But most of the time it just might not be possible to shrink a very large and complex 8.5"x11" page onto a smaller screen.

Converting this type of complex material to a proper ebook is EXTREMELY labor intensive... and if the publisher doesn't release an ebook directly from the source material, it probably wouldn't be worth the time invested for a single individual to OCR it (easily tens or hundreds of hours). The more complex the layout (multi-column, lots of footnotes, tables, figures, captions, equations, [...]), the harder the book is to convert using OCR and the more manual intervention is needed to fix all the broken formatting.
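To put the shrinking problem above in rough numbers, here is a minimal sketch. The screen figures are assumptions, not from the post: a 6-inch diagonal with a 3:4 aspect ratio (typical for a Paperwhite-class reader) and 10 pt body text on the printed page.

```python
import math

def screen_size_inches(diagonal_in, aspect_w, aspect_h):
    """Return (width, height) in inches for a given diagonal and aspect ratio."""
    unit = diagonal_in / math.hypot(aspect_w, aspect_h)
    return aspect_w * unit, aspect_h * unit

def fit_scale(page_w, page_h, screen_w, screen_h):
    """Largest uniform scale at which the whole page still fits on the screen."""
    return min(screen_w / page_w, screen_h / page_h)

# Assumed values: 6-inch diagonal, 3:4 aspect ratio, 10 pt text on a Letter page.
screen_w, screen_h = screen_size_inches(6.0, 3, 4)
scale = fit_scale(8.5, 11.0, screen_w, screen_h)
effective_pt = 10.0 * scale

print(f"screen {screen_w:.1f} x {screen_h:.1f} in, "
      f"scale {scale:.2f}, 10 pt text -> {effective_pt:.1f} pt")
```

Under these assumptions the page is displayed at well under half its printed size, which is why cropping margins or reflowing with a tool such as k2pdfopt only helps up to a point.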
For example, here is The Digital Reader showing off PDFs on a Kobo Aura One (Kindles/Nooks/others are similar):

http://the-digital-reader.com/2016/0...just-no-video/
Equations/Formulas? Each equation is going to have to be included as a bitmap image (or SVG or MathML). Each and every equation would require laborious double-checking to make sure it is correct and require some serious markup.

The only program/engine I know of that handles OCRing Formulas is InftyReader:

http://www.sciaccess.net/en/InftyReader/

and that costs $800+.

Side Note: Back in 2013 I wrote a topic, "Tutorial: Formulas to PNG", where I sort of show off one method of digitizing equations (using LibreOffice Math):

https://www.mobileread.com/forums/sh...d.php?t=223254

I have used this to recreate formulas in books that had <50 equations... now I tend to prefer using LaTeX as a middleman... but it STILL requires a massive amount of manual work per equation. I shudder to think how long it would take working on a book that is as full as your example pages.

Last edited by Tex2002ans; 03-11-2017 at 05:10 PM.
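As an illustration of what retyping one equation in LaTeX might look like, here is a minimal sketch. The formula (maximum deflection of a simply supported beam under uniform load) is a stock textbook example chosen for illustration, not taken from the sample pages, and the `standalone` class with a 2pt border is an assumption about one possible setup for producing a tightly cropped output that can then be converted to PNG or SVG.

```latex
\documentclass[border=2pt]{standalone}
\usepackage{amsmath}
\begin{document}
% One equation per file; the compiled page is cropped to the formula.
$\displaystyle \delta_{\max} = \frac{5\,w\,L^{4}}{384\,E\,I}$
\end{document}
```

Every symbol still has to be typed and proofread by hand, which is where the per-equation effort mentioned above comes from.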

#3
Sir Penguin of Edinburgh
Posts: 12,375
Karma: 23555235
Join Date: Apr 2007
Location: DC Metro area
Device: Shake a stick plus 1
Get an iPad. Use GoodReader.
Seriously, this type of content requires a _large_ screen. And honestly, the iPad is the best for this.

#4
Member
Posts: 20
Karma: 10
Join Date: Nov 2013
Device: none
Thanks, @Tex2002ans, for the very detailed answer.
@Nate the great, thanks to you too.

Some of the pictures are A4 format and some are smaller, but 90% of the books measure 16 cm x 24 cm. These are books for the building profession. Our professor advised us to make maximum use of today's technology. It used to be difficult, actually impossible, to haul all the books to the construction site, and you always need a quick reminder of something. So I thought I would process the books for the Kindle. Given that this is too complicated, it is better to keep them in PDF format on a tablet. It is better to carry a tablet than a suitcase with 20 books.

It is too bad that the Kindle and Kobo aren't that powerful. In a PDF on a tablet, I can't press and hold a word to see what it means (if I don't understand it). That was the second reason I wanted to put the books on my Kindle Paperwhite.

#5
mostly an observer
Posts: 1,519
Karma: 987654
Join Date: Dec 2012
Device: Kindle

#6
null operator (he/him)
Posts: 21,736
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
All I did was highlight the word 'manuscript' (double-click) in the PDF (opened in PDF-XChange) and press Ctrl+Shift+`. Voilà, WordWeb popped up with a definition. The only customisation was to assign that particular key sequence as the WordWeb hotkey. I have another gadget (clickto) that will look up the highlighted text on the web (Google, Wikipedia, Wiktionary, etc., or even MobileRead).

I'd be surprised if something similar can't be done on Android or iOS, but if not, get a Surface; it's a walk in the park with Windows.

One of my dislikes of Android (and iOS, I would guess, were I to use it) is the difficulty of getting 'apps' to interoperate: walled gardens full of walled gardens, or as I prefer, nests of recursive arboreta.

BR

#7
Grand Sorcerer
Posts: 13,533
Karma: 78910202
Join Date: Nov 2007
Location: Toronto
Device: Libra H2O, Libra Colour
I expect that the ability to press on a word in a PDF will depend on whether or not there is a text layer (not sure of the real term) in the PDF. If the PDF is purely a scanned image I doubt that any lookup would be possible.

#8
null operator (he/him)
Posts: 21,736
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
True. Maybe the Textify TSR (or whatever they're called now) might help.
Failing that, screenshot the page [Alt+PrtScn], paste it into a OneNote note [WinKey+N, Ctrl+V], use its Copy Text from Picture feature [Menu key, x], paste the resulting text into the note [Ctrl+V], and do the lookup from the just-pasted text (correcting it if necessary). That might even be doable with the Android and iOS versions of OneNote.

But for scanned-image PDFs, Evernote might be better: it can embed a PDF in a note and do the OCR in one fell swoop. Lawyers love it. Evernote runs on most platforms except Linux (or it didn't last time I looked). And you can access your notebooks via the web without buggering about with Dropbox and the like.

BR

#9
Member
Posts: 20
Karma: 10
Join Date: Nov 2013
Device: none
@BetterRed
But that only works if the text can be selected. I don't have the original PDFs of all the books; I'm taking pictures of them. So for the text to be selectable, they must be OCRed. I think the OCR will be very bad and will destroy lines, tables, etc., so the work to fix everything will probably be huge.

I'm still searching for the best way to create a PDF from pictures. They are not all the same size, since my hand is not always at the same distance. I will have to make some kind of DIY book scanner.
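One way to deal with photos of differing sizes is to scale each one uniformly onto a fixed page size before merging. The sketch below only does the geometry, using the 16 cm x 24 cm book format mentioned earlier in the thread; the pixel dimensions and the function names are hypothetical, and writing the actual PDF is left to whatever tool is used.

```python
CM_PER_INCH = 2.54
POINTS_PER_INCH = 72  # PDF page sizes are measured in points (1/72 inch)

def cm_to_points(cm):
    """Convert centimetres to PDF points."""
    return cm / CM_PER_INCH * POINTS_PER_INCH

def fit_to_page(img_w, img_h, page_w, page_h):
    """Scale an image uniformly to fit inside the page and center it.

    Returns (scale, x_offset, y_offset, new_w, new_h) in page units.
    """
    scale = min(page_w / img_w, page_h / img_h)
    new_w, new_h = img_w * scale, img_h * scale
    x = (page_w - new_w) / 2
    y = (page_h - new_h) / 2
    return scale, x, y, new_w, new_h

PAGE_W = cm_to_points(16)
PAGE_H = cm_to_points(24)

# Hypothetical photo sizes in pixels, varying because the camera distance varied.
photos = [(3000, 4000), (2400, 3800), (3500, 4100)]
placements = [fit_to_page(w, h, PAGE_W, PAGE_H) for w, h in photos]

for (w, h), (scale, x, y, new_w, new_h) in zip(photos, placements):
    print(f"{w}x{h} px -> {new_w:.0f}x{new_h:.0f} pt at offset ({x:.0f}, {y:.0f})")
```

Every photo ends up on an identically sized page with its aspect ratio preserved, so the merged PDF does not jump in size from page to page.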

#10
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
For example, that is how you can search through all the books on Archive.org:

https://archive.org/details/engineeringbook00yeom

The most accurate Open Source program is probably tesseract:

https://github.com/tesseract-ocr/tesseract

but it is command-line only (there are a few programs based on it that do have a GUI). I haven't tested it in years, but last I tested there were serious inaccuracies with Formatted Text (carrying over Italics/Bold/Smallcaps/Superscript/Subscript) and you had to do a ton of finagling with dictionaries + training. I also have no idea how well it handles complex formatting like Tables or Charts/Graphs with captions.

The most accurate Proprietary OCR is ABBYY FineReader (this is what I use):

https://www.abbyy.com/en-us/finereader/

It costs a bit of money ($199 for the latest version), but if you value your time, it will save you A TON of headaches.

The examples you gave of written Maths or complex equations are just not going to work well with ANY OCR program... but at least you would be able to have all of the normal text in a book OCRed/searchable + accurate. :P
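For the command-line route, a minimal sketch of driving tesseract from a script might look like this. It assumes tesseract is installed and on the PATH with English language data; the file name `page_001.png` and both function names are hypothetical. Passing the `pdf` config at the end asks tesseract to write a PDF with a searchable text layer over the page image.

```python
import shutil
import subprocess
from pathlib import Path

def tesseract_pdf_command(image_path, language="eng"):
    """Build the tesseract command line for a searchable-PDF run."""
    image = Path(image_path)
    output_base = image.with_suffix("")  # tesseract appends ".pdf" itself
    return ["tesseract", str(image), str(output_base), "-l", language, "pdf"]

def ocr_page(image_path, language="eng"):
    """Run tesseract on one page image and return the path of the PDF it writes."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    subprocess.run(tesseract_pdf_command(image_path, language), check=True)
    return Path(image_path).with_suffix(".pdf")

# Show the command without running it (hypothetical file name).
print(tesseract_pdf_command("page_001.png"))
```

This only makes the ordinary text searchable; as noted above, equations and complex tables should not be expected to come through accurately.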
Taking pictures with a shaky hand/phone is not ideal because you would most likely get very fuzzy text. This is OK if you are a human trying to quickly read the image, but disastrous for OCR.

The DIY Book Scanner forums discuss quite a few designs people have rigged up + their workflows:

https://forum.diybookscanner.org/

and we also discussed quite a lot of this in the topic, "Delicate text digitalizing + scanning issues":

https://www.mobileread.com/forums/sh...d.php?t=234146

#11
null operator (he/him)
Posts: 21,736
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none

#12
Member
Posts: 20
Karma: 10
Join Date: Nov 2013
Device: none
Thanks for the answer. I have ABBYY, but I'm not going to use it this time.
OCRing only parts and merging them with diagrams and pictures to create a searchable PDF is too much work and time, and there are lots of books to do. I would probably give up after 10-20 pages once I realised how much work there is. Since I have very little free time, it is best just to take pictures and merge them into one PDF. Maybe 10 or 20 years from now, when someone has done it, I will buy the books.

I'll just make a book scanner, take pictures, create a PDF from them, and use it like that. If I need to look up a word I don't know, I will check it manually in the dictionary.

Some books are in PDF format (maybe 20%), but lots of them are not. And there are lots of my notes.

Last edited by Difermo; 03-12-2017 at 08:32 PM.