- Members:
Operational Lead: Carl Shan carlshan
He Ma sunnymh
Siyang Zeng (Sunny) SunnySunnia
Alex Chao alexchao56
Theresa Andrasfay tandrasfay
David Barrera jest4pun
Xiaorui (Sherry) Xia xsherryxia : x_sherry_xia@berk
Iteration 1 Goals:
- To create a presentation that explains the ETAS model (you can see the presentation here: https://docs.google.com/presentation/d/1yf3W22eAIX-bPgmVdqkRFfsfIgCHmbtXclF3-S92Us8/edit?usp=sharing)
SMART Goals:
[S]pecific
-- We will create a Google Presentation with slides that explain the 4 parameters of the ETAS model as presented by Professor Stark.
[M]easurable
-- The presentation will be 4 minutes long, comprising about 8 slides.
[A]ttainable
-- You can view our presentation here: https://docs.google.com/presentation/d/1yf3W22eAIX-bPgmVdqkRFfsfIgCHmbtXclF3-S92Us8/edit?usp=sharing
[R]elevant
-- Understanding the ETAS model will help the class understand which inputs and parameters are important to consider in earthquake modeling.
[T]ime-Bound
-- We will have the presentation completed by 11:59pm on 11/4/2013.
ROADBLOCKS
-- The biggest and most obvious roadblock we've encountered is understanding the ETAS model itself, which is mathematically complex.
We're overcoming this roadblock by working through Luen's dissertation, as well as various papers by Ogata et al. A list of the resources we've used can be found in our repo: https://github.com/SunnySunnia/TheQuakers
======
Iteration 2 Goals:
- Plot the inter-arrival times of the earthquakes, binned by magnitude, for the analyzers, as Prof. Stark suggested.
- Study Luen's code and try to reproduce his process with both his data and our data.
- Try to draw some initial conclusions from what we observe while visualizing, and share them with the class.
Our slides are here: https://docs.google.com/presentation/d/1X4uL6_auhQZESuqVPTj9bZPF0vfhcdAE8MwNcEgtW9k/edit#slide=id.p
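A minimal sketch of the binning step behind the first goal, in Python with only the standard library. It assumes each earthquake is a (time, magnitude) pair, that the inter-arrival time of an event is the gap to the next event of any magnitude, and that each gap is binned by the magnitude of the earlier event. The function name `interarrival_by_magnitude` and the 0.5 bin width are illustrative choices, not taken from the repo.

```python
from collections import defaultdict
import math


def interarrival_by_magnitude(events, bin_width=0.5):
    """Group the time to the next event by the magnitude bin of the current event.

    events: iterable of (time, magnitude) pairs, times in any consistent unit.
    Returns a dict mapping the lower edge of each magnitude bin to a list of gaps.
    """
    ordered = sorted(events)  # sort by time (ties broken by magnitude)
    gaps = defaultdict(list)
    # Pair each event with its successor; the last event has no gap.
    for (t0, m0), (t1, _m1) in zip(ordered, ordered[1:]):
        lower_edge = math.floor(m0 / bin_width) * bin_width
        gaps[lower_edge].append(t1 - t0)
    return dict(gaps)


# Made-up events for illustration only: (time in days, magnitude).
toy_events = [(0.0, 4.6), (2.0, 5.2), (3.0, 4.7), (7.0, 4.9)]
toy_gaps = interarrival_by_magnitude(toy_events)
```

The per-bin lists can then be passed to any histogram or box-plot routine to produce one panel per magnitude bin.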
- ROADBLOCKS
-- Studying Luen's code took most of our time, and we still do not know how he picked the parameters for his ETAS models or where the parameter K in Ku^M is embedded in his code. (We will ask Disi, who is in contact with Luen, to pass these questions on to him.)
-- Although the curators notified us that all the data sets and relevant code were ready for use, the only data with a nice time format that we could get was the '250' set, because the curators did not upload the data set that all the code is based on to the data-curator repo. (Later we talked to one of the curators and found out they had 'hidden' the data set on their wiki instead.)
===========
Iteration 2.5 Task: (Nov 12th)
- Plot the 'wait-times' of the successors of an earthquake so that the analyzers can inspect them and fit a window function.
Plots: https://github.com/SunnySunnia/TheQuakers/blob/master/MDA/SuccessorsTimePlot.md
==========
- Look at the upgraded version of the MDA window length function suggested by Prof. Stark and develop an algorithm for tuning parameters and producing error diagrams.
- Assemble questions for Luen conference call.
Roadblock: So far we do not understand Luen's code for generating the error curve.
Luen Conference-Call notes:
https://github.com/SunnySunnia/TheQuakers/blob/master/Resources/Luen-Conference-Call-Notes.md
Modify the algorithm and start trying different parameters. Figure out a way to tune the parameters. (Since there are 2 free parameters, hold one fixed and vary the other, or change both of them together in some way.)
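One possible reading of "hold one fixed and test on the other" is an alternating sweep over two candidate grids, sketched below in Python. Everything here is an assumption made for illustration: `score` is a hypothetical stand-in for whatever quantity is being minimized (for example, an area under an error diagram), and the grids and number of rounds are arbitrary.

```python
def alternating_sweep(score, a_values, b_values, rounds=3):
    """Tune two parameters by holding one fixed while sweeping the other.

    score: callable (a, b) -> float, where lower is better (hypothetical).
    a_values, b_values: non-empty lists of candidate values.
    Returns (best_a, best_b, best_score) after the given number of rounds.
    """
    a, b = a_values[0], b_values[0]
    for _ in range(rounds):
        # Hold b fixed and pick the best a, then hold a fixed and pick the best b.
        a = min(a_values, key=lambda x: score(x, b))
        b = min(b_values, key=lambda x: score(a, x))
    return a, b, score(a, b)


def toy_score(a, b):
    """Made-up objective with its minimum at a=2, b=3 (illustration only)."""
    return (a - 2) ** 2 + (b - 3) ** 2


best = alternating_sweep(toy_score, [0, 1, 2, 3], [1, 2, 3, 4])
```

An alternating sweep can stall at a local optimum when the two parameters interact strongly; evaluating every (a, b) pair on the grid is the slower but more thorough alternative.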
Roadblock: we are not able to understand the logic of Luen's way of producing the error diagram.
So, while still waiting for Luen's explanation, we decided to try another approach: using specific percentiles of the waiting time to the next event as the window lengths for the corresponding magnitudes.
Results:
https://github.com/SunnySunnia/TheQuakers/tree/master/Quantile-Method
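The percentile idea can be sketched as follows, in Python with only the standard library. This is an illustration under stated assumptions rather than the code in the Quantile-Method folder: waiting times are assumed to be already grouped by magnitude bin, the quantile is taken by inverting the empirical CDF, and the names `empirical_quantile` and `quantile_windows` are hypothetical.

```python
import math


def empirical_quantile(values, p):
    """Return the smallest observed value whose ECDF is at least p (0 < p <= 1)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered)))  # 1-based rank in the sorted list
    return ordered[rank - 1]


def quantile_windows(waits_by_bin, p):
    """Map each magnitude bin to the p-th quantile of its waiting times.

    waits_by_bin: dict {magnitude_bin: [waiting times to the next event]}.
    Bins with no observations are skipped.
    """
    return {
        mag_bin: empirical_quantile(waits, p)
        for mag_bin, waits in waits_by_bin.items()
        if waits
    }


# Made-up waiting times (in days) for illustration only.
toy_waits = {4.5: [2.0, 4.0, 1.0, 3.0], 5.0: [1.0, 9.0]}
median_windows = quantile_windows(toy_waits, 0.5)
upper_windows = quantile_windows(toy_waits, 0.75)
```

Raising `p` lengthens every window, so sweeping `p` traces out a family of alarm strategies that trade more alarm time for fewer missed events.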
Should have a set of 'best' parameters for our window function and present our results. If ours cannot beat any of the models, maybe try the 'Optimization approach'. (Update: we were able to get the parameters from Professor Stark and are trying to create a few updated models based on his suggestions.)
Thanksgiving (group members can continue to work on tasks). Communicate with each other via GitHub, email, and Facebook.
Successfully developed our own algorithm for producing error diagrams and comparing models by the integrated areas under their error diagrams.
Details including functions can be found here:
https://github.com/SunnySunnia/TheQuakers/tree/master/skeleton
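For readers who do not want to open the repo, here is a rough Python sketch of what such a comparison can look like. It is not the skeleton code itself. It assumes a Molchan-style error diagram, where each alarm strategy gives one point (tau, nu), with tau the fraction of time covered by alarms and nu the fraction of events that occur outside any alarm. It also assumes that every event at time t with magnitude m turns on an alarm for the interval (t, t + window(m)], and that all events fall inside the observation period. All names are hypothetical.

```python
def error_point(events, window, t_start, t_end):
    """Return (tau, nu) for one window function.

    events: iterable of (time, magnitude) pairs inside [t_start, t_end].
    window: callable magnitude -> alarm duration after an event.
    """
    ordered = sorted(events)
    alarm_end = float("-inf")  # latest end time of any alarm raised so far
    covered = 0.0              # total time under alarm, counted without overlap
    missed = 0
    for t, m in ordered:
        if t > alarm_end:      # no earlier alarm is still on at time t
            missed += 1
        new_end = t + window(m)
        # Add only the part of the new alarm not already covered, clipped to t_end.
        covered += max(0.0, min(new_end, t_end) - max(t, alarm_end))
        alarm_end = max(alarm_end, new_end)
    return covered / (t_end - t_start), missed / len(ordered)


def error_diagram_area(points):
    """Trapezoid-rule area under the curve through the (tau, nu) points.

    The trivial strategies (0, 1) and (1, 0) are added as end points.
    """
    pts = sorted(set(points) | {(0.0, 1.0), (1.0, 0.0)})
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


# Made-up catalogue and a constant 2-day window, for illustration only.
toy_events = [(0.0, 5.0), (1.0, 5.0), (5.0, 5.0), (6.0, 5.0)]
tau, nu = error_point(toy_events, lambda m: 2.0, t_start=0.0, t_end=10.0)
area = error_diagram_area([(tau, nu)])
```

Under this convention a smaller area means a better model, and random guessing corresponds to the diagonal with area 0.5. A family of window functions, for example the same shape rescaled by several factors, contributes one point each to the curve.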
Gather outputs from analyzers. Use these to make graphs for the final presentation. Compare different parameters found by different groups. Determine the best model to beat ETAS.
Regroup and see whether or not we have a model ready to present or if we should create a wrap-up summary presentation of all the work we have done so that we'll have something to show.
Our Final Quake Presentation: https://docs.google.com/a/berkeley.edu/presentation/d/18xqnLrV0QYs6DH1QRwFwGcOcU2ff9X6nhPDliJxy9gk/edit#slide=id.p
Choose best graphs for data science fair. Updated results on: https://docs.google.com/a/berkeley.edu/document/d/1YHruJF6JhOOQ3yputnTgFFLCiTmZqHTg-j5oCipsIBE/edit#heading=h.d23q731yqkzp
Any last-minute changes. Provide a description of the process to the presenters for the Dec 12 Data Science Faire.
Attend Data Science Faire
=================
### Final Results:
- ECDF: https://github.com/SunnySunnia/TheQuakers/tree/master/Quantile-Method
- Scaled MDA
- Scaled by division: http://github.com/SunnySunnia/TheQuakers/tree/master/skeleton/MDAScaledByDiv
- Scaled by subtraction: https://github.com/SunnySunnia/TheQuakers/tree/master/skeleton/MDAScaledBySub
- *These are the main takeaways of our project, but take a closer look at our repositories for more details on what we did.
===============
Links to datasets:
data: https://github.com/stat157/analyzers/wiki/Data-from-Curators
data with NICE time format: (mag>=4,5)
https://www.dropbox.com/s/gttgdef4j0z02hd/CleanDataWithCorrectTime.csv
Nice full data set: https://www.dropbox.com/s/tzx4qqxhh9u9iz2/DataFrame.csv