Old 02-11-2005, 01:15 AM   #1
hacker
Technology Mercenary
Posts: 617
Karma: 2561
Join Date: Feb 2003
Location: East Lyme, CT
Device: Direct Neural Implant
MapQuester World Atlas Conversion Tool

As promised over here, I've taken some time to clean up and cannibalize part of a spider I use daily for Plucker, and modified it to allow you to run it and spider MapQuest to build yourself a World Atlas with images, country data, and lots of other bits.

The whole script is only 87 lines of actual code! (161 with comments and liberal spacing for readability.) I prefer writing clean, tight, well-commented code. My code is my business card, and this is no exception.

The script is written in Perl, my language of choice, but all modules used are either in core, or available via CPAN (perl -MCPAN -e 'install "Module::Name"'). It should be easy to run and figure out. I've commented it where required.
My only requirements for using this are that you don't rip off the code and claim you wrote it, or parts of it, and that you provide some feedback so I can improve it: good, bad, feature requests, bugs you find, whatever. I'd like to know!
Unfortunately, I cannot redistribute the completed version of the maps in mobile format, because that would violate MapQuest's copyright and Terms of Use, but you can see how good it looks in the screenshots below.

The entire script is attached below. Just grab the script and run it in an empty directory. It will spider and fetch the 238-or-so separate pages from MapQuest's World Atlas pages, strip out the unnecessary HTML, JavaScript, stylesheets, and other non-visible bits, and write each country to its own file. All of the external links to country data are rewritten to reference the local copies. The only pieces fetched remotely are the images themselves.
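To illustrate the two clean-up steps described above (stripping non-visible markup and pointing links at local copies), here is a minimal sketch. It is not taken from the attached script: the helper names `strip_invisible` and `rewrite_links`, the `example.com` URLs, and the regex-based approach are all assumptions made for illustration, and a real HTML parser would be more robust than these patterns.

```perl
use strict;
use warnings;

# Hypothetical helper: drop <script> and <style> blocks plus HTML comments.
# Regexes are only a rough approximation of real HTML parsing.
sub strip_invisible {
    my ($html) = @_;
    $html =~ s{<script\b.*?</script\s*>}{}gis;
    $html =~ s{<style\b.*?</style\s*>}{}gis;
    $html =~ s{<!--.*?-->}{}gs;
    return $html;
}

# Hypothetical helper: rewrite <a href="..."> targets that have a local copy.
# %$local_for maps a remote URL to the local file written for it.
# Image sources are left alone, so images are still fetched remotely.
sub rewrite_links {
    my ($html, $local_for) = @_;
    $html =~ s{(<a\b[^>]*\bhref\s*=\s*")([^"]+)(")}
              { $1 . (exists $local_for->{$2} ? $local_for->{$2} : $2) . $3 }gie;
    return $html;
}

# Made-up sample page; the URLs are placeholders, not MapQuest's.
my $page = <<'HTML';
<html><head><style>b { color: red }</style>
<script type="text/javascript">alert("hi");</script></head>
<body><a href="http://example.com/atlas/france">France</a>
<img src="http://example.com/img/france.gif"></body></html>
HTML

my %local_for = ('http://example.com/atlas/france' => 'france.html');
my $clean = rewrite_links(strip_invisible($page), \%local_for);
print $clean;
```

Both helpers use only core Perl, so the sketch runs without any CPAN modules.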

When the spidering is complete, it outputs a top-level index file for you to point your mobile creation tool towards, so you can then spider the content yourself, and convert it to the format of your choice.
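A top-level index of this kind can be very small. The sketch below is an assumption about what such a file might look like, not the output of the attached script; `build_index`, the page names, and the `index.html` file name are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: build a minimal HTML index from a hash that maps
# a display name to the local file written for that page.
sub build_index {
    my ($title, $pages) = @_;
    my $items = '';
    for my $name (sort keys %$pages) {
        $items .= qq{  <li><a href="$pages->{$name}">$name</a></li>\n};
    }
    return "<html><head><title>$title</title></head>\n"
         . "<body>\n<h1>$title</h1>\n<ul>\n$items</ul>\n</body></html>\n";
}

# Made-up entries standing in for the spidered country files.
my %pages = (France => 'france.html', Canada => 'canada.html');
my $index = build_index('World Atlas', \%pages);

# Write the index into the current directory.
open my $fh, '>', 'index.html' or die "Cannot write index.html: $!";
print {$fh} $index;
close $fh or die "Cannot close index.html: $!";
```

A conversion tool can then be pointed at `index.html` and told to follow the local links from there.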

Hopefully many users will find this useful. Enjoy!
Attached Thumbnails: 01.png (26.2 KB), 02.png (39.7 KB), 03.png (17.6 KB), 04.png (43.9 KB), 05.png (62.3 KB)
Attached Files
File Type: txt mapquester.txt (4.4 KB, 578 views)

Last edited by hacker; 02-11-2005 at 01:58 AM.
Old 02-11-2005, 01:36 AM   #2
Chaos
Evangelist
Posts: 418
Karma: 281
Join Date: Jul 2004
Location: Canada
Device: Assorted older devices
Oooooh... I think I'll play with this on the weekend.

Looks really nice.
Old 02-11-2005, 04:28 AM   #3
Alexander Turcic
Fully Converged
Posts: 18,163
Karma: 14021202
Join Date: Oct 2002
Location: Switzerland
Device: Too many to count here.
Thank you David!
Old 02-11-2005, 04:46 AM   #4
Colin Dunstan
Is papyrophobic!
Posts: 1,926
Karma: 1009999
Join Date: Aug 2003
Location: USA
Device: Dell Axim
The script gave me the following error message:
Code:
$ perl mapquester.txt
Can't locate HTML/SimpleLinkExtor.pm in @INC (@INC contains: /usr/local/lib/perl5/site_perl/5.6.1/mach /usr/local/lib/perl5/site_perl/5.6.1 /usr/local/lib/perl5/site_perl /usr/local/lib/perl5/5.6.1/BSDPAN /usr/local/lib/perl5/5.6.1/mach /usr/local/lib/perl5/5.6.1 .) at mapquester.txt line 38.
BEGIN failed--compilation aborted at mapquester.txt line 38.
$
I then tried to install the missing module following your instructions, but got another error message:
Code:
# perl -MCPAN -e 'install "HTML:SimpleLinkExtor"'
Going to read /root/.cpan/sources/authors/01mailrc.txt.gz
Going to read /root/.cpan/sources/modules/02packages.details.txt.gz
  Database was generated on Thu, 10 Feb 2005 22:38:20 GMT
CPAN: HTTP::Date loaded ok

  There's a new CPAN.pm version (v1.76) available!
  [Current version is v1.59_54]
  You might want to try
    install Bundle::CPAN
    reload cpan
  without quitting the current session. It should be a seamless upgrade
  while we are running...


Going to read /root/.cpan/sources/modules/03modlist.data.gz
Warning: Cannot install HTML:SimpleLinkExtor, don't know what it is.
Try the command

    i /HTML:SimpleLinkExtor/

to find objects with matching identifiers.
#
Any help would be appreciated!
Old 02-11-2005, 08:54 AM   #5
Chaos
That should be HTML::SimpleLinkExtor.

Always two colons.
Old 02-11-2005, 08:55 AM   #6
hacker
Quote:
Originally Posted by Morpheus
Code:
  # perl -MCPAN -e 'install "HTML:SimpleLinkExtor"'
  
  i /HTML:SimpleLinkExtor/
Any help would be appreciated!
You'll want to use two colons between the parts of the module name (the namespace and the module within it), not just one as above. It should be executed as follows:
Code:
perl -MCPAN -e 'install "HTML::SimpleLinkExtor"'
Try that and see if it helps.
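For anyone wondering why the colons matter: Perl turns each `::` in a module name into a directory separator and looks for the resulting `.pm` file in every `@INC` directory, which is why the earlier error mentions `HTML/SimpleLinkExtor.pm`. Here is a small sketch of that mapping, plus a check for whether a module can be loaded. The helper names `module_path` and `module_available` are made up for this example.

```perl
use strict;
use warnings;

# Convert a module name such as Foo::Bar into the relative path that
# Perl searches for in each @INC directory (Foo/Bar.pm).
sub module_path {
    my ($module) = @_;
    (my $path = $module) =~ s{::}{/}g;
    return "$path.pm";
}

# Return 1 if the module can be loaded from the current @INC, else 0.
sub module_available {
    my ($module) = @_;
    my $file = module_path($module);
    return eval { require $file; 1 } ? 1 : 0;
}

# File::Spec ships with Perl; HTML::SimpleLinkExtor comes from CPAN.
for my $module ('File::Spec', 'HTML::SimpleLinkExtor') {
    printf "%-24s %-28s %s\n", $module, module_path($module),
        module_available($module) ? 'installed' : 'missing';
}
```

Running this before the spider shows up front which modules still need installing from CPAN.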
Old 02-11-2005, 09:09 AM   #7
Colin Dunstan
That worked! Thanks
Old 02-11-2005, 09:25 AM   #8
Colin Dunstan
Quote:
Originally Posted by hacker
Unfortunately, I cannot redistribute the completed version of the maps in mobile format, because that would violate MapQuest's copyright and Terms of Use, but you can see how good it looks in the screenshots below.
But we could share the generated .html files here, couldn't we?
Old 02-11-2005, 12:27 PM   #9
hacker
I've just created three pre-compiled versions, for those users without the right Perl modules installed, using PAR. These standalone executables contain all of the modules + the Perl stub to run it.

Just drop one of these files in an empty directory, and run it. No muss, no fuss.

Versions for FreeBSD, Linux and Windows are attached below. I haven't written any docs or README to go with it, but it's self-explanatory. If it becomes popular enough, I'll repackage it as a "real" application in the same fashion with docs.

Enjoy!
Attached Files
File Type: zip mapquester_win32.zip (1.16 MB, 615 views)
File Type: zip mapquester_linux.zip (1.46 MB, 505 views)
File Type: zip mapquester_bsd.zip (1.35 MB, 483 views)

Last edited by hacker; 02-11-2005 at 12:33 PM.
Old 03-04-2005, 03:39 AM   #10
albertc
Junior Member
Posts: 7
Karma: 10
Join Date: Mar 2005
Location: Near Barcelona
Device: Treo 650, LifeDrive
Quote:
Originally Posted by hacker
Versions for FreeBSD, Linux and Windows are attached below. I haven't written any docs or README to go with it, but it's self-explanatory. If it becomes popular enough, I'll repackage it as a "real" application in the same fashion with docs.
Amazing! (as usual, David)

I have a request: could you please precompile a version for OS X, too?

Thanks
--
Albert
Old 08-26-2009, 05:00 PM   #11
gmorgan_va
Junior Member
Posts: 2
Karma: 10
Join Date: Aug 2009
Device: HP 2795b
Thanks for creating this great Perl script. I needed proxy support and ended up adding my proxy manually by inserting the following line after your $ua->agent line:

Code:
$ua->proxy('http', 'http://your-http-proxy.com:proxyport/');
And it worked like a charm through the proxy. BTW, if you are behind a proxy and have not added that line, the script just writes the base continent files with 500 Not found errors in them. Not sure how to fix that so that it prints an error message instead.
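One possible way to get an error message instead of a file full of error text is to check the response before writing anything. This is a sketch only, since the relevant part of the script is not shown in this thread: `save_if_ok` is a made-up helper, and it assumes the object passed in behaves like the HTTP::Response returned by LWP::UserAgent (it calls `is_success`, `status_line` and `decoded_content`).

```perl
use strict;
use warnings;

# Hypothetical helper: write a fetched page to disk only when the request
# succeeded. $response is expected to behave like an HTTP::Response, i.e.
# to provide is_success, status_line and decoded_content.
sub save_if_ok {
    my ($response, $url, $file) = @_;

    unless ($response->is_success) {
        # Report the failure on STDERR and leave no file behind.
        warn "Skipping $url: " . $response->status_line . "\n";
        return 0;
    }

    open my $fh, '>:encoding(UTF-8)', $file or die "Cannot write $file: $!";
    print {$fh} $response->decoded_content;
    close $fh or die "Cannot close $file: $!";
    return 1;
}
```

With LWP::UserAgent this would be called along the lines of `save_if_ok($ua->get($url), $url, $file)`. As an alternative to hard-coding the proxy, `$ua->env_proxy` picks up proxy settings from environment variables such as `http_proxy`.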

Thanks for your beautiful script. Now to figure out how to iSilo these files.

Last edited by gmorgan_va; 08-26-2009 at 05:03 PM. Reason: Silly emoticon in the middle of my code snippet.
Old 08-26-2009, 10:05 PM   #12
nrapallo
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
Interesting thread....
Old 08-27-2009, 08:28 AM   #13
gmorgan_va
Missing the Caribbean

So for some reason MapQuest decided to put the Caribbean under North America. Not sure why but I'm also not a cartographer. I need to examine your script some more to see how it could be modified to gather a 2nd level index. In the short term, my hack to include the files for the Caribbean is to add it to the index. iSiloX's error log clued me into the fact that the Caribbean pages weren't being grabbed (had an error for each page...must be linked to from elsewhere in the site). Anyway, now the iSiloX conversion succeeded. Can't wait to see what this looks like on my PocketPC!
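Gathering an extra index level amounts to following links one hop deeper. The sketch below shows the idea as a depth-limited, breadth-first walk over an in-memory link table. It does not reflect how the attached script is structured: `collect_pages`, the `%links_of` table, and the page names are invented for illustration, and a real spider would fill the table by fetching pages.

```perl
use strict;
use warnings;

# Hypothetical helper: walk a link graph breadth-first from $start and
# return every page reachable within $max_depth hops. %$links_of maps a
# page name to the list of pages it links to.
sub collect_pages {
    my ($links_of, $start, $max_depth) = @_;
    my %depth_of = ($start => 0);
    my @queue    = ($start);
    my @found;

    while (defined(my $page = shift @queue)) {
        push @found, $page;
        next if $depth_of{$page} >= $max_depth;    # do not go deeper
        for my $next (@{ $links_of->{$page} || [] }) {
            next if exists $depth_of{$next};       # already queued
            $depth_of{$next} = $depth_of{$page} + 1;
            push @queue, $next;
        }
    }
    return @found;
}

# Made-up site layout: the Caribbean hangs off North America.
my %links_of = (
    index         => ['north_america', 'europe'],
    north_america => ['canada', 'caribbean'],
    caribbean     => ['jamaica'],
);

print join(', ', collect_pages(\%links_of, 'index', 1)), "\n";
print join(', ', collect_pages(\%links_of, 'index', 3)), "\n";
```

Raising the depth limit is what pulls in pages that sit under a sub-index, such as the Caribbean pages in this made-up layout.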

All times are GMT -4.