02-11-2005, 01:15 AM | #1 |
Technology Mercenary
Posts: 617
Karma: 2561
Join Date: Feb 2003
Location: East Lyme, CT
Device: Direct Neural Implant
|
MapQuester World Atlas Conversion Tool
As promised over here, I've taken some time to clean up and cannibalize part of a spider I use daily for Plucker, and modified it to allow you to run it and spider MapQuest to build yourself a World Atlas with images, country data, and lots of other bits.
The whole script is only 87 lines of actual code (161 with comments and liberal spacing for readability). I prefer writing clean, tight, well-commented code. My code is my business card, and this is no exception.

The script is written in Perl, my language of choice, but all modules used are either in core or available via CPAN (perl -MCPAN -e 'install "Module::Name"'). It should be easy to run and figure out. I've commented it where required.

My only requirements for using this are that you don't rip off the code and claim you wrote it, or parts of it, and that you provide some feedback so I can improve it: good, bad, feature requests, bugs you find, whatever. I'd like to know!

Unfortunately, I cannot redistribute the completed version of the maps in mobile format, because that would violate MapQuest's copyright and Terms of Use, but you can see how good it looks in the screenshots below.

The entire script is attached below. Just grab the script and run it in an empty directory. It will spider and fetch the 238-or-so separate pages from MapQuest's World Atlas, strip out the unnecessary HTML, JavaScript, stylesheets, and other non-visible bits, and write each country to its own file. All of the external links to country data are rewritten to reference the local copies. The only pieces fetched remotely are the images themselves.

When the spidering is complete, it outputs a top-level index file for you to point your mobile creation tool at, so you can then spider the content yourself and convert it to the format of your choice.

Hopefully many users will find this useful. Enjoy!
Last edited by hacker; 02-11-2005 at 01:58 AM. |
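For readers who want a feel for the cleanup step described above before opening the attachment, here is a minimal sketch. It is not the attached script: the helper names strip_invisible and localize_links, the example.com URLs, and the use of plain regular expressions are all assumptions made for illustration. A real HTML parser is more robust than regexes on messy markup.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: drop the non-visible parts of a fetched page.
sub strip_invisible {
    my ($html) = @_;
    $html =~ s{<script\b.*?</script\s*>}{}gis;   # JavaScript blocks
    $html =~ s{<style\b.*?</style\s*>}{}gis;     # embedded stylesheets
    $html =~ s{<link\b[^>]*>}{}gi;               # <link> tags (external stylesheets etc.)
    $html =~ s{<!--.*?-->}{}gs;                  # HTML comments
    return $html;
}

# Hypothetical helper: rewrite <a href="..."> targets to local file names.
# $local_for is a hash ref mapping a remote URL to the local copy's name.
# URLs that are not in the map are left untouched.
sub localize_links {
    my ($html, $local_for) = @_;
    $html =~ s{(<a\b[^>]*?\bhref\s*=\s*)(["'])(.*?)\2}{
        my ($prefix, $quote, $url) = ($1, $2, $3);
        $prefix . $quote . ($local_for->{$url} // $url) . $quote;
    }gise;
    return $html;
}

# Made-up sample page; the URLs are placeholders, not MapQuest's.
my $page = <<'HTML';
<html><head><style>body { color: red }</style>
<script type="text/javascript">var x = 1;</script></head>
<body><a href="http://example.com/atlas/france">France</a>
<img src="http://example.com/img/france.gif"></body></html>
HTML

my %local_for = ('http://example.com/atlas/france' => 'france.html');
my $clean = localize_links(strip_invisible($page), \%local_for);
print $clean;
```

Only anchor tags are rewritten, so image references stay remote, which matches the behaviour described in the post.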
02-11-2005, 01:36 AM | #2 |
Evangelist
Posts: 418
Karma: 281
Join Date: Jul 2004
Location: Canada
Device: Assorted older devices
|
Oooooh... I think I'll play with this on the weekend.
Looks really nice. |
02-11-2005, 04:28 AM | #3 |
Fully Converged
Posts: 18,163
Karma: 14021202
Join Date: Oct 2002
Location: Switzerland
Device: Too many to count here.
|
Thank you David!
|
02-11-2005, 04:46 AM | #4 |
Is papyrophobic!
Posts: 1,926
Karma: 1009999
Join Date: Aug 2003
Location: USA
Device: Dell Axim
|
The script gave me the following error message:
Code:
$ perl mapquester.txt
Can't locate HTML/SimpleLinkExtor.pm in @INC (@INC contains: /usr/local/lib/perl5/site_perl/5.6.1/mach /usr/local/lib/perl5/site_perl/5.6.1 /usr/local/lib/perl5/site_perl /usr/local/lib/perl5/5.6.1/BSDPAN /usr/local/lib/perl5/5.6.1/mach /usr/local/lib/perl5/5.6.1 .) at mapquester.txt line 38.
BEGIN failed--compilation aborted at mapquester.txt line 38.
$
Code:
# perl -MCPAN -e 'install "HTML:SimpleLinkExtor"'
Going to read /root/.cpan/sources/authors/01mailrc.txt.gz
Going to read /root/.cpan/sources/modules/02packages.details.txt.gz
Database was generated on Thu, 10 Feb 2005 22:38:20 GMT
CPAN: HTTP::Date loaded ok
There's a new CPAN.pm version (v1.76) available!
[Current version is v1.59_54]
You might want to try
install Bundle::CPAN
reload cpan
without quitting the current session. It should be a seamless upgrade while we are running...
Going to read /root/.cpan/sources/modules/03modlist.data.gz
Warning: Cannot install HTML:SimpleLinkExtor, don't know what it is.
Try the command
i /HTML:SimpleLinkExtor/
to find objects with matching identifiers.
# |
02-11-2005, 08:54 AM | #5 |
Evangelist
Posts: 418
Karma: 281
Join Date: Jul 2004
Location: Canada
Device: Assorted older devices
|
That should be HTML::SimpleLinkExtor.
Always two colons. |
02-11-2005, 08:55 AM | #6 | |
Technology Mercenary
Posts: 617
Karma: 2561
Join Date: Feb 2003
Location: East Lyme, CT
Device: Direct Neural Implant
|
Quote:
Code:
perl -MCPAN -e 'install "HTML::SimpleLinkExtor"' |
|
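The error in post #4 and the fix above both come down to how Perl names modules: a package name such as HTML::SimpleLinkExtor is looked up in @INC as the file HTML/SimpleLinkExtor.pm, so a single colon names nothing that Perl or CPAN can find. As a rough sketch, a pre-flight check like the one below could report missing modules before the spider runs. The helper names module_path and module_available are made up for this example and are not part of the attached script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Convert a package name into the relative file path Perl searches for in @INC.
sub module_path {
    my ($module) = @_;
    (my $path = $module) =~ s{::}{/}g;
    return "$path.pm";
}

# Return 1 if the module can be loaded from @INC, 0 otherwise.
sub module_available {
    my ($module) = @_;
    my $file = module_path($module);
    return eval { require $file; 1 } ? 1 : 0;
}

# 'strict' ships with Perl; HTML::SimpleLinkExtor must come from CPAN.
for my $module (qw(strict HTML::SimpleLinkExtor)) {
    printf "%-24s %-8s (%s)\n",
        $module,
        module_available($module) ? 'found' : 'MISSING',
        module_path($module);
}
```

Any module reported as MISSING can then be installed with the perl -MCPAN command quoted above, using its double-colon name.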
02-11-2005, 09:09 AM | #7 |
Is papyrophobic!
Posts: 1,926
Karma: 1009999
Join Date: Aug 2003
Location: USA
Device: Dell Axim
|
That worked! Thanks
|
02-11-2005, 09:25 AM | #8 | |
Is papyrophobic!
Posts: 1,926
Karma: 1009999
Join Date: Aug 2003
Location: USA
Device: Dell Axim
|
Quote:
|
|
02-11-2005, 12:27 PM | #9 |
Technology Mercenary
Posts: 617
Karma: 2561
Join Date: Feb 2003
Location: East Lyme, CT
Device: Direct Neural Implant
|
I've just created three pre-compiled versions using PAR, for those users without the right Perl modules installed. These standalone executables contain all of the modules plus the Perl stub to run it.
Just drop one of these files in an empty directory and run it. No muss, no fuss. Versions for FreeBSD, Linux and Windows are attached below.
I haven't written any docs or README to go with it, but it's self-explanatory. If it becomes popular enough, I'll repackage it as a "real" application in the same fashion, with docs. Enjoy!
Last edited by hacker; 02-11-2005 at 12:33 PM. |
03-04-2005, 03:39 AM | #10 | |
Junior Member
Posts: 7
Karma: 10
Join Date: Mar 2005
Location: Near Barcelona
Device: Treo 650, LifeDrive
|
Quote:
I have a request: could you please precompile a version for OS X, too? Thanks -- Albert |
|
08-26-2009, 05:00 PM | #11 |
Junior Member
Posts: 2
Karma: 10
Join Date: Aug 2009
Device: HP 2795b
|
Thanks for creating this great Perl script. I needed proxy support and ended up adding my proxy manually by inserting this line after your $ua->agent line:
Code:
$ua->proxy('http', 'http://your-http-proxy.com:proxyport/');
Thanks for your beautiful script. Now to figure out how to iSilo these files.
Last edited by gmorgan_va; 08-26-2009 at 05:03 PM. Reason: Silly emoticon in the middle of my code snippet. |
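A variation on the proxy line above, for anyone who would rather not hard-code the address: LWP::UserAgent can also pick up proxy settings from the environment via its env_proxy method. This is only a sketch under assumptions. The MAPQUESTER_PROXY variable and the agent string are invented for the example, and it assumes the script's $ua is an ordinary LWP::UserAgent object.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;
$ua->agent('MapQuester-example/0.1');    # hypothetical agent string

# Prefer an explicit proxy from the (hypothetical) MAPQUESTER_PROXY variable,
# e.g. MAPQUESTER_PROXY=http://proxy.example.com:8080/
# Otherwise fall back to the conventional http_proxy-style variables.
if (my $proxy = $ENV{MAPQUESTER_PROXY}) {
    $ua->proxy('http', $proxy);
}
else {
    $ua->env_proxy;
}

# Calling proxy() with only a scheme returns the current setting.
print "HTTP proxy: ", ($ua->proxy('http') || 'none'), "\n";
```

With neither variable set, the script reports no proxy and requests go out directly.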
08-26-2009, 10:05 PM | #12 |
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
|
Interesting thread....
|
08-27-2009, 08:28 AM | #13 |
Junior Member
Posts: 2
Karma: 10
Join Date: Aug 2009
Device: HP 2795b
|
Missing the Caribbean
So for some reason MapQuest decided to put the Caribbean under North America. Not sure why, but I'm also not a cartographer. I need to examine your script some more to see how it could be modified to gather a second-level index.
In the short term, my hack to include the files for the Caribbean is to add them to the index. iSiloX's error log clued me in to the fact that the Caribbean pages weren't being grabbed (it had an error for each page, so they must be linked to from elsewhere in the site). Anyway, now the iSiloX conversion succeeded. Can't wait to see what this looks like on my PocketPC!
|
|