Sunday, July 12, 2009

the "rain Google Columbus" question

In his video about Parallax, a faceted browser for Freebase, David Huynh describes how to use the application to answer the following question: give me a list of schools attended by the children of Republican presidents.

This is another example of a query that a search engine cannot answer. In fact, it is a Semantic Web-type query.

I attended the Freebase hack-day and spoke to David about Parallax. While describing Wolfram|Alpha (W|A), I posed another query that Alpha may answer one day and that Mathematica (with a suitably written program) can answer today. The question is:

How many days in 2009 did the Google stock price decline on the same day that it rained in Columbus, Ohio?

It is interesting to note that, according to a Metaweb engineer, Freebase cannot answer the Google component of this question either! :-)

This is a simple calculation using Mathematica primitives and curated data. It is explained in detail in the short series rain Google Columbus Part 1, Part 2 and Part 3. There are also the Google query results. As stated earlier, it is not a fair question to ask Google.

We wanted to answer the rain Google Columbus question. The answer is: there are 26 trading days in the first half of 2009 when the Google stock price declined (compared to the previous day) and more than 0.3 inches of rain fell in Columbus, Ohio on the same day.

Of course, this could be done with some other stock (not GOOG), some other finance-related measure (not a decline in the closing price), some other city (not Columbus) and some other weather-related measure (not rainfall).

We derived this answer using Mathematica's curated WeatherData[] and FinancialData[]. We wrote a routine to return the precipitation in Columbus on a particular day. Then we computed the days on which the stock price declined. Finally, we passed that list of dates to the precipitation routine and restricted the list to rainy days.
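
The steps above can be sketched roughly as follows. This is a Python illustration of the logic only, not the Mathematica code from the series: the prices and rainfall figures are made-up sample values (not real GOOG or Columbus data), and the names precipitation, decline_days and rainy_decline_days are hypothetical stand-ins for the routines built on WeatherData[] and FinancialData[].

```python
from datetime import date

# Made-up sample data for illustration only (NOT real GOOG closes).
closes = [
    (date(2009, 1, 5), 100.0),
    (date(2009, 1, 6), 98.5),
    (date(2009, 1, 7), 99.0),
    (date(2009, 1, 8), 97.0),
    (date(2009, 1, 9), 96.0),
]

# Made-up daily rainfall in inches (NOT real Columbus data).
rain_inches = {
    date(2009, 1, 6): 0.45,
    date(2009, 1, 8): 0.10,
    date(2009, 1, 9): 0.31,
}


def precipitation(day):
    """Return rainfall in inches for a day; days with no record count as 0."""
    return rain_inches.get(day, 0.0)


def decline_days(series):
    """Return the dates whose close is lower than the previous trading day's."""
    return [d for (_, prev), (d, cur) in zip(series, series[1:]) if cur < prev]


def rainy_decline_days(series, threshold=0.3):
    """Restrict the decline days to those with rainfall above the threshold."""
    return [d for d in decline_days(series) if precipitation(d) > threshold]


result = rainy_decline_days(closes)
print(len(result), result)
```

With this sample data, three days show a decline and two of them exceed the 0.3 inch threshold. Swapping in a different stock, city or measure only means changing the data source and the two predicates.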

In addition to the curated data functions, we also looked at some other Mathematica constructs such as:
1) functional programming (i.e. no loops are used, only Map[], Apply[], etc.)
2) pure and nested pure functions
3) assorted list manipulation techniques (e.g. interleaving)
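
For readers who do not use Mathematica, here is a minimal Python sketch of rough analogues of these three constructs. It is only an approximation: map and reduce stand in loosely for Map[] and Apply[], lambdas stand in for pure functions, and the sample values are invented for illustration.

```python
from functools import reduce
from itertools import chain

# Invented sample values for illustration only.
days = ["Mon", "Tue", "Wed"]
changes = [-1.5, 0.5, -2.0]

# 1) Functional style: map a pure function over a list instead of looping.
declined = list(map(lambda c: c < 0, changes))

# 2) Nested pure functions: a lambda that returns another lambda.
below = lambda limit: (lambda c: c < limit)

# Fold the mapped flags into a count (True adds 1, False adds 0).
count = reduce(lambda acc, flag: acc + flag, map(below(0), changes), 0)

# 3) Interleaving: alternate the elements of two lists into one flat list.
interleaved = list(chain.from_iterable(zip(days, changes)))

print(declined)
print(count)
print(interleaved)
```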