Quote:
Originally Posted by Nergal
Nate the Great, I just downloaded it; unpacked, it has a size of 371.6 MB.
In case you have some Unix derivative at hand:
less google-renewals-all-20080624.xml
and I had the first record within a second, though with all the tags around it.
|
A Windows console version of less is available from the Less home page:
http://www.greenwoodsoftware.com/less/index.html
Quote:
cat google-renewals-all-20080624.xml | grep 'Tolkien'
|
"grep Tolkien google-renewals-all-20080624.xml" also works. No need for the cat and pipeline.
A Windows console version of grep is available as part of a set of GNU utilities for Windows, here:
http://gnuwin32.sourceforge.net/packages/grep.htm
Quote:
and I found pretty quickly that there are indeed some books still under copyright. Now I think some simple XML viewer for this file is needed.
|
The challenge will be the file size. I tried a couple of XML viewers/editors, and they choked on it with out-of-memory errors. (I have a 3.1 GHz Pentium box running XP Pro with 1 GB of RAM.)
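One way around the memory problem might be a streaming parser, which reads the file one element at a time instead of loading the whole tree. Below is a minimal sketch in Python using the standard library's xml.etree.ElementTree.iterparse. It is only an illustration under assumptions: the function name find_records is made up, and the element name "Record" is a guess, since I haven't checked what the records in the file are actually called. If the file uses XML namespaces, the tag would need the namespace prefix as well.

```python
import xml.etree.ElementTree as ET


def find_records(source, needle, record_tag="Record"):
    """Yield the joined text of each record_tag element that contains needle.

    source may be a file name or a binary file object.
    record_tag is a hypothetical element name; check the file for the real one.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == record_tag:
            # Collect the text of the record and its children, minus whitespace.
            text = " ".join(t.strip() for t in elem.itertext() if t.strip())
            if needle in text:
                yield text
            # Discard records already processed so memory use stays roughly flat.
            root.clear()


# Possible usage (assumes the file is in the current directory):
#   for hit in find_records("google-renewals-all-20080624.xml", "Tolkien"):
#       print(hit)
```

Because records are thrown away as soon as they have been checked, the memory needed should depend on the size of one record rather than on the size of the file.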
______
Dennis