User earney asked in a comment where I am getting the raw data.
Just like blockexplorer.com, I'm using the getblock patch by jgarzik to build a custom bitcoind that exposes a new RPC method, getblock. The getblock method returns all data about a given block number in JSON format, but the output still requires some post-processing to get at all of the details.
I've written some Java utility classes that iterate through the current block chain and store some of the attributes in memory. I then post-process this data to extract interesting stats - such as the Bitcoin top 100 'Rich List'.
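As a rough illustration of that kind of post-processing, here is a minimal sketch of a 'Rich List' query. It is not the actual code: the `RichList` class and `topN` method are hypothetical names, and it assumes the balances are already held in a map of address to balance, stored as a `long` in the smallest currency unit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only.
class RichList {
    /**
     * Returns up to n entries with the largest balances, richest first.
     * Balances are assumed to be in the smallest currency unit.
     */
    static List<Map.Entry<String, Long>> topN(Map<String, Long> balances, int n) {
        // Copy the entries so that sorting does not touch the original map.
        List<Map.Entry<String, Long>> entries = new ArrayList<>(balances.entrySet());
        // Sort by balance, largest first.
        entries.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        // Clamp n so that small maps and non-positive n are handled safely.
        int limit = Math.max(0, Math.min(n, entries.size()));
        return new ArrayList<>(entries.subList(0, limit));
    }
}
```

Calling `RichList.topN(balances, 100)` would give the top 100 under these assumptions. Sorting the whole map is simple but does more work than strictly needed; a bounded priority queue would be an alternative for very large address sets.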
For testing my own code and verifying the results I used blockexplorer.com - the UI is quite nice and makes it easy to navigate between transactions, blocks and addresses, and the data there always appears to be up to date.
If I find the time I'll either clean up the code and open source it, or look for a way to publish it online in an easy-to-use format.
User earney had a follow-up question about the use of gettransaction.
I extract all of the data I need using only the getblock command, and don't use gettransaction at all.
Basically, I start at block 0, retrieve each block in turn, and store the data about all of its transactions in memory. When I get to the end of the block chain (i.e. the newest block), everything I need is available in memory:
* a Map of transaction ID to transaction details
* a Map of address to current balance
Then, depending on what kind of analysis I want to do, my program iterates over the transactions, the address balances, or both.
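The replay described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the real implementation: the `Block`, `Tx`, `Input` and `Output` classes are hypothetical, simplified shapes rather than the actual getblock JSON layout, each output is assumed to pay a single address, and a coinbase transaction is modelled as a transaction with no inputs. Fetching and parsing the getblock output is left out.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: replays blocks in order and builds the two maps.
class ChainReplay {
    static class Output {
        final String address; final long value;
        Output(String address, long value) { this.address = address; this.value = value; }
    }
    static class Input {
        final String prevTxId; final int prevIndex;
        Input(String prevTxId, int prevIndex) { this.prevTxId = prevTxId; this.prevIndex = prevIndex; }
    }
    static class Tx {
        final String id; final List<Input> inputs; final List<Output> outputs;
        Tx(String id, List<Input> inputs, List<Output> outputs) {
            this.id = id; this.inputs = inputs; this.outputs = outputs;
        }
    }
    static class Block {
        final List<Tx> txs;
        Block(List<Tx> txs) { this.txs = txs; }
    }

    final Map<String, Tx> txById = new HashMap<>();             // transaction ID -> details
    final Map<String, Long> balanceByAddress = new HashMap<>(); // address -> current balance

    /** Applies one block; blocks must be applied in chain order, starting at block 0. */
    void apply(Block block) {
        for (Tx tx : block.txs) {
            // Each input spends an earlier output, so debit that output's address.
            for (Input in : tx.inputs) {
                Tx prev = txById.get(in.prevTxId);
                if (prev == null) continue; // unknown previous transaction: skipped in this sketch
                Output spent = prev.outputs.get(in.prevIndex);
                balanceByAddress.merge(spent.address, -spent.value, Long::sum);
            }
            // Each output credits its receiving address.
            for (Output out : tx.outputs) {
                balanceByAddress.merge(out.address, out.value, Long::sum);
            }
            txById.put(tx.id, tx);
        }
    }
}
```

Keeping every transaction in `txById` is what makes it possible to resolve an input back to the address and amount it spends, which is why a single pass over getblock data can be enough in this model.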
What kind of stats are you trying to extract about bitcoin transactions? I might be able to publish the processed data somewhere public, like a Google spreadsheet or fusion table that will let others get access and do their own analysis. Does this sound useful to anyone?