Jason Bartholme wrote on his blog:
I have been working on an internal tool for our sales and marketing departments. They wanted the ability to provide the URL of a company and return various information. One of the elements they wanted was the number of pages indexed in Google.
So he tried scraping Google SERPs with ColdFusion. For example, in the case of the Rebol Tutorial site, I would want the number of pages Google reports as indexed for reboltutorial.com.
You can compare the ColdFusion version with the Rebol version found here (I just corrected a little glitch and made it more verbose so that Rebol newbies can follow it), which uses Rebol's parse function:
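goog: http://www.google.com/search?hl=en&btnG=Google+Search&q=
domain: "reboltutorial.com" ;modify this string for your search term/keyword
replace/all domain " " "+" ;spaces are not allowed in the query string
query: rejoin [goog domain] ;build the full search URL
parse-rule: [
    thru "Results" ;skip past the word "Results"
    thru "of about " ;skip past the phrase in front of the count
    thru <b> ;skip past the opening bold tag
    copy num to </b> ;capture the text up to the closing tag
    to end ;consume the rest so that parse succeeds
]
html-content: read query ;download the results page as a string
parse/all html-content parse-rule
print num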
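If you want to see what the parse rule does without going out to the network, here is a minimal sketch that runs the same rule against a hard-coded string. The sample text is made up for illustration and only mimics the "of about <b>N</b>" pattern the rule expects; the markup Google actually returns may well be different. The check on the result of parse is also an addition, so that a page without the expected pattern gives a message instead of an error on an unset num.

```rebol
; Made-up sample standing in for the downloaded page
; (an assumption for illustration, not real Google markup)
sample: {Results <b>1</b> - <b>10</b> of about <b>57</b> for <b>reboltutorial.com</b>}

num: none ; stays none if the rule never reaches the copy step

parse-rule: [
    thru "Results"      ; skip past the word "Results"
    thru "of about "    ; skip past the phrase in front of the count
    thru <b>            ; skip past the opening bold tag
    copy num to </b>    ; capture the text up to the closing tag
    to end              ; consume the rest so that parse returns true
]

; parse returns true only when the whole rule matched
either parse/all sample parse-rule [
    print num
] [
    print "Pattern not found"
]
```

With this sample, num ends up holding the text between the third pair of bold tags, because the first two pairs come before "of about ".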
Neat, isn’t it?
This is the output on the Rebol console if you copy and paste the code above:

This is very interesting.
I have tried it with one of my clients’ sites and it gave me the exact number of pages Google has indexed, but Google hasn’t indexed all the pages. Why is this?
Maybe try adding a Google sitemap.
I was the one who replied to Jason’s blog post with the example you link to. Thanks for highlighting this useful, real-world example.
Oh, it was you!