An extension for Google Refine to compute elementary statistics.
Run Google Refine. On the starting page, click the “Browse workspace directory” link in the lower left corner. Make a folder called “extensions” within the window that pops up. Copy the stats folder into the extensions folder. The final path should be:
/PATH/TO/Google/Refine/extensions/stats/...
Restart Google Refine.
(Alternatively, you can copy the stats folder into the extensions folder of your Google Refine installation, but it may be lost during upgrades.)
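The workspace install described above can also be done from a terminal. This is a minimal sketch, assuming a Unix-like shell. The function name install_stats is hypothetical, and both arguments are placeholders for the location of your stats folder and your Google Refine workspace directory. You still need to restart Google Refine afterwards.

```shell
# Hypothetical helper: copy the stats folder into a workspace's extensions folder.
# $1 = path to the stats folder, $2 = Google Refine workspace directory.
install_stats() {
    src="$1"
    workspace="$2"

    # Refuse to continue if the source folder is missing.
    if [ ! -d "$src" ]; then
        echo "stats folder not found: $src" >&2
        return 1
    fi

    # Create the extensions folder if it does not exist yet.
    mkdir -p "$workspace/extensions" || return 1

    # Copy the whole folder so it ends up at <workspace>/extensions/stats.
    cp -R "$src" "$workspace/extensions/"
}
```

A call might look like `install_stats ./stats /PATH/TO/Google/Refine`, with the second argument replaced by the directory that the “Browse workspace directory” link opens.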
Select “Column statistics” from the drop-down menu of any column header. Statistics will be calculated based on filtered rows, so you can facet your dataset in different ways and calculate statistics for each subset.
WARNING: THIS IS NOT REALLY TESTED. WE WILL ADD BETTER INSTRUCTIONS SOON.
If you modify this extension, you can build it by changing into the stats directory and running a command such as:
ant -Drefine.dir=/Users/YOU/src/google-refine/main -Dserver.lib.dir=/Users/YOU/src/google-refine/server/lib build
refine-stats is a Newsapps project. Development by Joe Germuska and Christopher Groskopf.
Released under the MIT license.
Note: refine-stats includes the Apache Commons Math library, which is licensed under the Apache Software License. A copy of this license can be found in /stats/module/MOD-INF/lib/.