Old 01-18-2008, 02:52 PM   #1
tribble
iLiad Maniac
Posts: 1,382
Karma: 2369
Join Date: Apr 2006
Location: Germany
Device: Bookeen Opus (i love that thing) and iPad (what an irony)
PDF search

Hi!

I just found this.
http://ruby-gnome2.sourceforge.jp/hi...er&key=poppler

poppler::Page has a find_text() function.

Can we somehow use that to get a search function inside PDFs on our iLiad? It's some kind of poppler library, so it should somehow relate to the iLiad. But I have no clue what they are doing there.

Maybe someone with more insight could shed some light here.

Thanks
Old 01-18-2008, 04:37 PM   #2
-Thomas-
Addict
Posts: 325
Karma: 1725
Join Date: Dec 2007
Location: Münster, Germany
Device: iRex iLiad v2
This function is also included in the poppler lib used on the iLiad. It looks like the function searches for a string within a specific page of an opened PDF file. At least I found the following in the poppler sources (glib/test-poppler-lib.c):
Code:
  list = poppler_page_find_text (page, "Bitwise");
  printf ("\n");  
  printf ("\tFound text \"Bitwise\" at positions:\n");
So to search for a string globally we would have to traverse all directories, open all PDF files, go through all pages and let the function do the rest...
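The directory traversal part of such a global search could be sketched roughly as follows. This is only an illustration under assumptions: `is_pdf_name`, `pdf_visitor` and `walk_pdfs` are hypothetical names, not part of poppler or the iLiad software, and the callback is where the per-page `poppler_page_find_text` loop would go.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Returns 1 if the name ends in ".pdf" (case-insensitive) and has at least
 * one character before the extension, else 0. */
static int is_pdf_name(const char *name)
{
    size_t len = strlen(name);
    if (len <= 4)
        return 0;
    const char *ext = name + len - 4;
    return ext[0] == '.' &&
           tolower((unsigned char)ext[1]) == 'p' &&
           tolower((unsigned char)ext[2]) == 'd' &&
           tolower((unsigned char)ext[3]) == 'f';
}

/* Hypothetical callback type: called once for every PDF path found. */
typedef void (*pdf_visitor)(const char *path, void *user_data);

/* Recursively walks `dir` and calls `visit` for each regular file whose name
 * looks like a PDF. Returns the number of PDFs visited, or -1 if `dir`
 * itself cannot be opened. Unreadable subdirectories are skipped. */
static int walk_pdfs(const char *dir, pdf_visitor visit, void *user_data)
{
    assert(dir != NULL);

    DIR *d = opendir(dir);
    if (d == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(d)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0)
            continue;

        char path[4096];
        int n = snprintf(path, sizeof path, "%s/%s", dir, entry->d_name);
        if (n < 0 || (size_t)n >= sizeof path)
            continue; /* path does not fit in the buffer: skip it */

        struct stat st;
        if (stat(path, &st) != 0)
            continue;

        if (S_ISDIR(st.st_mode)) {
            int sub = walk_pdfs(path, visit, user_data);
            if (sub > 0)
                count += sub;
        } else if (S_ISREG(st.st_mode) && is_pdf_name(entry->d_name)) {
            visit(path, user_data);
            count++;
        }
    }
    closedir(d);
    return count;
}
```

A caller would pass a function that opens the given path with poppler and runs the page-by-page search. Note that `stat` follows symbolic links, so a link loop is only stopped by the path buffer limit.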
Old 01-18-2008, 05:29 PM   #3
tribble
Do you know how well the function works with hyphenated text, if at all? Does it find hyphenated text that spans multiple pages, and how does it handle different languages? Is it UTF-8?
What info gets returned? Will it then be simple to somehow mark the found word?

It would be great if we could get a search going on the iLiad. I am willing to take a look into this as well, but I know nothing about C++ programming and can only start in mid-February.

A simple text search in a single PDF would suffice for me at the moment. A global text search on the iLiad could easily be very expensive.
Old 01-18-2008, 07:56 PM   #4
-Thomas-
According to the docs it takes UTF-8 encoded input and returns a list of rectangles, one for each occurrence of the text on the page (in PDF points).
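To mark a found word on screen, each rectangle would have to be converted from PDF points (1/72 inch) to display pixels. The following is a minimal sketch of that conversion, not poppler code: `PointRect`, `PixelRect` and `points_to_pixels` are hypothetical names, and the code assumes the rectangles use the PDF convention of an origin at the bottom-left of the page, which should be checked against the poppler version in use.

```c
#include <assert.h>

/* A rectangle in PDF points. The field layout mirrors the x1/y1/x2/y2
 * fields of poppler's PopplerRectangle. Assumption: origin is bottom-left. */
typedef struct { double x1, y1, x2, y2; } PointRect;

/* A rectangle in device pixels with the origin at the top-left. */
typedef struct { int x, y, width, height; } PixelRect;

/* Rounds a non-negative value to the nearest integer. */
static int round_nonneg(double v)
{
    return (int)(v + 0.5);
}

/* Converts a rectangle in points to pixels for a page rendered at `dpi`,
 * flipping the y-axis using the page height in points. */
static PixelRect points_to_pixels(PointRect r, double page_height_pt, double dpi)
{
    assert(dpi > 0.0);

    double scale  = dpi / 72.0; /* one PDF point is 1/72 inch */
    double left   = r.x1 < r.x2 ? r.x1 : r.x2;
    double right  = r.x1 < r.x2 ? r.x2 : r.x1;
    double bottom = r.y1 < r.y2 ? r.y1 : r.y2;
    double top    = r.y1 < r.y2 ? r.y2 : r.y1;

    PixelRect out;
    out.x      = round_nonneg(left * scale);
    out.y      = round_nonneg((page_height_pt - top) * scale);
    out.width  = round_nonneg((right - left) * scale);
    out.height = round_nonneg((top - bottom) * scale);
    return out;
}
```

For example, a match at (72, 72)-(144, 90) on a 792-point-high page rendered at 144 dpi would map to a 144 x 36 pixel box at pixel position (144, 1404).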

Hyphenation doesn't work at all; I've tried it against the current Debian version of libpoppler.
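One possible workaround, sketched here under assumptions, would be to search the extracted plain text of a page and skip line-break hyphens while matching. This is not how poppler works internally: `soft_break_len` and `find_across_hyphens` are hypothetical helpers, the sketch only handles a hyphen directly followed by a line break, it does not span pages, and it returns a text offset rather than rectangles.

```c
#include <assert.h>
#include <stddef.h>

/* If `p` points at a line-break hyphen ("-\n" or "-\r\n"), returns the
 * number of characters to skip; otherwise returns 0. */
static size_t soft_break_len(const char *p)
{
    if (p[0] != '-')
        return 0;
    if (p[1] == '\n')
        return 2;
    if (p[1] == '\r' && p[2] == '\n')
        return 3;
    return 0;
}

/* Returns the offset in `text` of the first occurrence of `needle`,
 * ignoring line-break hyphens inside the matched word, or -1 if there
 * is no match. */
static long find_across_hyphens(const char *text, const char *needle)
{
    assert(text != NULL && needle != NULL);

    if (needle[0] == '\0')
        return 0;

    for (size_t start = 0; text[start] != '\0'; start++) {
        size_t t = start; /* position in text */
        size_t n = 0;     /* position in needle */
        while (needle[n] != '\0') {
            if (text[t] == needle[n]) {
                t++;
                n++;
                continue;
            }
            /* Only skip a break once part of the word has matched. */
            size_t skip = n > 0 ? soft_break_len(text + t) : 0;
            if (skip == 0)
                break;
            t += skip;
        }
        if (needle[n] == '\0')
            return (long)start;
    }
    return -1;
}
```

With this, searching for "search" in the text "full-text sea-\nrch here" finds the word at offset 10, while an ordinary hyphen as in "sea-rch" is not skipped.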

For those who are interested I've added a proof of concept for a single-file search. I couldn't compile it to run on the iLiad, but maybe someone can help out. It currently prints a list of all pages the string occurs on and exits 0 if matches were found.
Attached Files
File Type: gz poppler-find.c.gz (534 Bytes, 410 views)
Old 01-19-2008, 02:41 AM   #5
tribble
That looks rather easy. Now we will have to do a few things:
1) Integrate the search into ipdf.
2) Store the results in some global variable.
3) Add a search icon that opens the keyboard and runs the search on Enter.
4) Add a GUI for the results: display a list, go to the page on click, and somehow render a box or overlay on the search word. (The bookmark ipdf could give hints on this.)
5) When there is a result set, change the search icon so that one click shows the result set and a second click opens the keyboard for a new search.
6) Rewrite poppler to find hyphenated text.

Anyone up to the challenge?
Old 01-29-2008, 08:48 AM   #6
PhilT
Enthusiast
PhilT began at the beginning.
 
Posts: 25
Karma: 10
Join Date: Nov 2007
Device: iRex iLiad
I would love to see searching within PDFs working and I know many others would too. It just so happens that I know C++ too although I'm a little rusty as I tend to program in Java and Ruby these days. My Linux knowledge is also pretty limited but I'm willing to give it a try if someone could let me know what I need to set up my environment.

Regards,
Phil
Old 04-16-2008, 09:13 AM   #7
mvoosten
Zealot
mvoosten began at the beginning.
 
Posts: 128
Karma: 41
Join Date: Nov 2007
Device: Hanlin V3
Is there any progress on this? Searching in PDFs is a #1 item for me, and I assume for a lot of people, so make us happy.
Old 04-16-2008, 09:27 AM   #8
mvoosten
Another possible option to borrow from:
http://sourceforge.net/projects/pdfsearch/
Old 04-16-2008, 07:31 PM   #9
-Thomas-
Teasing 3 :D

I've made a concept of a global PDF search, see attached screenshot...

Just a few things:
  • Results will be shown in content lister (in the future)
  • I don't know much about threaded programming (application freezes), so I definitely need some time
  • It's terribly slow, even in internal memory

Edit: I don't know much about ipdf hacking, so does anybody have an idea how to search in a single PDF file?
Attached Thumbnails
Name: iliad_080417_002457.png (99.6 KB)

Last edited by -Thomas-; 04-16-2008 at 07:34 PM.