Thread: getxbook
Old 11-09-2013, 04:13 AM   #1
brianinmaine
Evangelist
Posts: 457
Karma: 1287375
Join Date: Jan 2013
Location: West Gardiner, Maine
Device: Touch (5.3.7)
getxbook

http://njw.me.uk/getxbook/

source: http://njw.me.uk/getxbook/getxbook-1.1.tar.bz2

I compiled this and ripped convert and tesseract-ocr from Debian, then put together a few scripts to try it out. I did not bother with the GUI, as it's Tcl/Tk.

result: It works terribly. It can't download all the files needed to convert properly.

why the heck did I post this: I thought maybe someone else might be interested enough to mess with it. I'm done, but if anyone wants it, I can supply a larger file with the tessdata directory to make tesseract work. It's 34 MB, so I didn't post it yet.

directions: In a web browser, find a book on Google Books that you can preview. Write down the code after the "id=" part of the address. In the KUAL button for getxbook, type "./getgbook.sh code" and it should download all the pages (mostly JPGs and PNGs) to a directory inside the current one. "ls" the directory name to check what arrived. "mkpdf.sh directoryname" should try to build a PDF from the images. mkocrtxt.sh converts the images to TIFF, then OCRs them to text files. I couldn't figure out getbnbook or getabook. There are lots of other smart people out there, so try "./getbnbook.sh -h"...
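
For anyone who would rather not copy the code by hand, here is a rough sketch of pulling it out of the address with sed. The book_id function and the example address are made up for illustration and are not part of getxbook. The commented lines at the end just repeat the steps above, assuming the download directory is named after the code and that mkocrtxt.sh takes the directory as its argument, neither of which I have confirmed.

```shell
#!/bin/sh
# Hypothetical helper (not part of getxbook): print the value of the
# id= parameter from a Google Books address.
book_id() {
    # Match "?id=" or "&id=", capture up to the next "&" or "#".
    printf '%s\n' "$1" | sed -n 's/.*[?&]id=\([^&#]*\).*/\1/p'
}

# Example address; the id value is invented for illustration.
url='https://books.google.com/books?id=EXAMPLEID123&printsec=frontcover'
code=$(book_id "$url")
echo "$code"

# The steps from the directions, commented out because they need the
# scripts from the attached zip and a network connection:
# ./getgbook.sh "$code"    # download the page images
# ls "$code"               # assumption: directory is named after the code
# ./mkpdf.sh "$code"       # build a PDF from the images
# ./mkocrtxt.sh "$code"    # assumption: takes the directory as argument
```

If the address has no id= part, book_id prints nothing, so it is worth checking that the code is not empty before passing it to getgbook.sh.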

Have a nice day.
Attached Files
File Type: zip getxbook.zip (4.20 MB, 291 views)