Merge pull request #2 from Pangbo15/master

update 0810
Pangbo15 · Aug 10, 2020 · f178b2d
2 parents c26947e + c629ece
commit f178b2d
Showing 236 changed files with 581,306 additions and 236,766 deletions.
diff --git a/SIQEF Assignment/VAR time series/README.md b/SIQEF Assignment/VAR time series/README.md
@@ -0,0 +1,16 @@
+# VAR and NN time series analysis
+
+Materials for the lecture `Time Series Forecasting Using Statistics and Machine Learning` by Dr. Jeffrey Yau at UC Berkeley. The video and slides are available at the links below.
+
+
+## Reference
+https://www.youtube.com/watch?v=i40Road82No
+
+Slides:
+https://d2pnv7vfbsu458.cloudfront.net/uploads/video/presentation/2291/Day2-1_copy.pdf
+
+or (backup resource)
+
+https://www.iteblog.com/sparksummit2018/time-series-forecasting-using-recurrent-neural-network-and-vector-autoregressive-model-when-and-how-with-jeffrey-yau-119149690-iteblog.pdf
+
+
diff --git a/...nment/VAR time series/Time Series Forecasting Using Statistics and Machine Learning.ipynb b/...nment/VAR time series/Time Series Forecasting Using Statistics and Machine Learning.ipynb
@@ -0,0 +1,320 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The video shows only a small part of his code. Unfortunately, I could not find the original code, and I tried to contact Dr. Yau but received no reply. \n",
+    "\n",
+    "So I replicate the code from his slides here and will try to reconstruct the full program from my own knowledge over the following week. It is unlikely to be complete, though, because we do not know the data or the preprocessing he used."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# VAR Part"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Index of Consumer Sentiment"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Imports inferred from the aliases used below (they are not shown on the slides)\n",
+    "import matplotlib.pyplot as plt\n",
+    "import seaborn as sns\n",
+    "import statsmodels.tsa.api as smt\n",
+    "\n",
+    "def tsplot2(y, title, lags=None, figsize=(12,8)):\n",
+    "    '''Examine the patterns of ACF and PACF, along with the time series plot and histogram.\n",
+    "    '''\n",
+    "    fig = plt.figure(figsize=figsize)\n",
+    "    layout = (2,2)\n",
+    "    ts_ax = plt.subplot2grid(layout,(0,0))\n",
+    "    hist_ax = plt.subplot2grid(layout,(0,1))\n",
+    "    acf_ax = plt.subplot2grid(layout,(1,0))\n",
+    "    pacf_ax = plt.subplot2grid(layout,(1,1))\n",
+    "    \n",
+    "    y.plot(ax=ts_ax)\n",
+    "    ts_ax.set_title(title, fontsize=14, fontweight='bold')\n",
+    "    y.plot(ax=hist_ax, kind='hist', bins=25)\n",
+    "    hist_ax.set_title('Histogram')\n",
+    "    smt.graphics.plot_acf(y, lags=lags, ax=acf_ax)\n",
+    "    smt.graphics.plot_pacf(y, lags=lags, ax=pacf_ax)\n",
+    "    # Start the lag axis at 0 on both correlation plots\n",
+    "    for ax in (acf_ax, pacf_ax):\n",
+    "        ax.set_xlim(0)\n",
+    "    sns.despine()\n",
+    "    plt.tight_layout()\n",
+    "    return ts_ax, acf_ax, pacf_ax"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Transforming the Series"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
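+    "import numpy as np\n",
+    "# Log-transform both columns of `series` (the raw data; its source is not given in the slides)\n",
+    "series_transformed['UMCSENT'] = np.log(series.iloc[:,0])\n",
+    "series_transformed['beer'] = np.log(series.iloc[:,1])"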
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## VAR Model Estimation and Output"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
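+    "import statsmodels.api as sm\n",
+    "\n",
+    "# Fit a VAR(3) with a constant term via VARMAX (order=(p, q) with q=0)\n",
+    "model = sm.tsa.VARMAX(y_train, order=(3,0), trend='c')\n",
+    "model_result = model.fit(maxiter=1000, disp=False)\n",
+    "print(model_result.summary())"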
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## VAR Model Diagnostic"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "model = sm.tsa.VARMAX(y_train, order=(3,0), trend='c')\n",
+    "model_result = model.fit(maxiter=1000, disp=False)\n",
+    "model_result.plot_diagnostics()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## VAR Model Selection"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "aic = []\n",
+    "for i in range(1, 6):\n",
+    "    model = sm.tsa.VARMAX(y_train, order=(i,0), trend='c')\n",
+    "    model_result = model.fit(maxiter=1000, disp=False)\n",
+    "    aic.append(model_result.aic)  # keep the AIC of each order for comparison\n",
+    "    print('Order =', i)\n",
+    "    print('AIC:', model_result.aic)\n",
+    "    print('BIC:', model_result.bic)\n",
+    "    print('HQIC:', model_result.hqic)\n",
+    "    "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## VAR Model Forecast"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from math import sqrt\n",
+    "from sklearn.metrics import mean_squared_error\n",
+    "\n",
+    "VAR_forecast_beer = np.exp(z['beer'])*series['beer'][-3:]\n",
+    "VAR_forecast_UMCSENT = np.exp(z['UMCSENT'])*series['UMCSENT'][-3:]\n",
+    "\n",
+    "rmse_beer = sqrt(mean_squared_error(series['beer'][-3:], VAR_forecast_beer))\n",
+    "rmse_UMCSENT = sqrt(mean_squared_error(series['UMCSENT'][-3:], VAR_forecast_UMCSENT))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "  "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# NN Part"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from keras.models import Sequential\n",
+    "from keras.layers import Dense\n",
+    "from keras.layers import LSTM\n",
+    "\n",
+    "# Design the network architecture\n",
+    "model = Sequential()\n",
+    "model.add(LSTM(60,\n",
+    "               dropout=0.1,\n",
+    "               recurrent_dropout=0.2,\n",
+    "               return_sequences=True,\n",
+    "               input_shape=(n_lookback,X_scaled_train_reshape.shape[2])))\n",
+    "\n",
+    "model.add(LSTM(36))\n",
+    "model.add(Dense(X_scaled_train_reshape.shape[2]))\n",
+    "model.compile(loss='mae', optimizer='RMSprop')\n",
+    "\n",
+    "# Model Training\n",
+    "n_epochs = 500\n",
+    "batchSize = 40\n",
+    "model.fit(X_scaled_train_reshape, y_scaled_train, epochs=n_epochs,\n",
+    "         batch_size=batchSize, verbose=0, shuffle=False)\n",
+    "\n",
+    "# make a prediction\n",
+    "yhat_scale = model.predict(X_scaled_test_reshape)\n",
+    "\n",
+    "# Inverse-scaling for forecast\n",
+    "inv_yhat = np.concatenate((X_scaled_test, yhat_scale), axis=1)\n",
+    "inv_yhat = scaler.inverse_transform(inv_yhat)\n",
+    "\n",
+    "# Model Evaluation: calculate RMSE\n",
+    "from math import sqrt\n",
+    "from sklearn.metrics import mean_squared_error\n",
+    "print('Test RMSE: %.3f' % sqrt(mean_squared_error(y_test[:,0], inv_yhat[:,0])))\n",
+    "print('Test RMSE: %.3f' % sqrt(mean_squared_error(y_test[:,1], inv_yhat[:,1])))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## LSTM: Forecast Results"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Model Evaluation: calculate RMSE\n",
+    "from math import sqrt\n",
+    "from sklearn.metrics import mean_squared_error\n",
+    "print('Test RMSE: %.3f' % sqrt(mean_squared_error(y_test[:,0], inv_yhat[:,0])))\n",
+    "print('Test RMSE: %.3f' % sqrt(mean_squared_error(y_test[:,1], inv_yhat[:,1])))"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "print(str(round((sqrt(mean_squared_error(y_test[:,0], inv_yhat[:,0]))/y_test[:,0].mean())*100,2)) + '%')\n",
+    "print(str(round((sqrt(mean_squared_error(y_test[:,1], inv_yhat[:,1]))/y_test[:,1].mean())*100,2)) + '%')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.6"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
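Note on the notebook above: the NN cells rely on arrays such as `X_scaled_train_reshape` with shape `(samples, n_lookback, features)` and report a per-column RMSE, both as a raw value and as a percentage of the column mean, but the preprocessing itself is not in the slides. The following is a minimal sketch of one common way to build such windows and compute those metrics with NumPy only. The helper names `make_windows` and `rmse_per_column` and the toy data are assumptions made for illustration, not Dr. Yau's code.

```python
import numpy as np


def make_windows(data, n_lookback):
    """Slice a (n_steps, n_features) array into LSTM-style samples.

    Returns X with shape (n_samples, n_lookback, n_features) and
    y with shape (n_samples, n_features), where y[i] is the row that
    immediately follows the window X[i].
    """
    data = np.asarray(data, dtype=float)
    n_samples = data.shape[0] - n_lookback
    if n_samples <= 0:
        raise ValueError("n_lookback must be smaller than the series length")
    X = np.stack([data[i:i + n_lookback] for i in range(n_samples)])
    y = data[n_lookback:]
    return X, y


def rmse_per_column(y_true, y_pred):
    """Return (RMSE, RMSE as a percentage of the column mean) per column."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2, axis=0))
    pct = 100.0 * rmse / y_true.mean(axis=0)
    return rmse, pct


# Made-up two-column series standing in for the (unknown) lecture data.
toy = np.column_stack([np.arange(10.0), 100.0 + 2.0 * np.arange(10.0)])
X, y = make_windows(toy, n_lookback=3)
print(X.shape, y.shape)

# Naive "last value" forecast: predict the final row of each window.
naive_pred = X[:, -1, :]
rmse, pct = rmse_per_column(y, naive_pred)
print(rmse, pct)
```

Computing both metrics column by column in one helper keeps each forecast paired with its own ground-truth column, which is easy to get wrong when the indices are written out by hand. A naive last-value forecast like the one above is also a useful baseline to compare against the VAR and LSTM results.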
diff --git a/SIQEF Assignment/cvxpy/DQCP_composition_rule.jpg b/SIQEF Assignment/cvxpy/DQCP_composition_rule.jpg
diff --git a/SIQEF Assignment/cvxpy/README.md b/SIQEF Assignment/cvxpy/README.md
@@ -0,0 +1,8 @@
+# CVXPY package
+
+Author: Huang Kenghua
+
+CVXPY is a Python-embedded modeling language for convex optimization problems. It allows you to express your problem in a natural way that follows the math, rather than in the restrictive standard form required by solvers.
+
+## Reference
+https://github.com/cvxgrp/cvxpy
diff --git a/SIQEF Assignment/cvxpy/curvature.jpg b/SIQEF Assignment/cvxpy/curvature.jpg