Vectorized implementations of 3/4 partner selection approaches #6

franz101 · 2021-02-17T20:39:34Z

Coded mostly in NumPy, pandas and SciPy.
I have focused on different strategies to improve the speed further.

Avoid for-loops as much as possible
Use broadcasting and matrix multiplication instead
Experiment with advanced indexing and one-hot-encoded matrix multiplication
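
To illustrate the strategies listed above, here is a minimal sketch that is not taken from the PR code. The function names and the toy tasks (pairwise squared distances, per-group sums) are assumptions chosen only to show a loop replaced by broadcasting, and a group-by replaced by a one-hot matrix product.

```python
import numpy as np


def pairwise_sq_dist_loop(x):
    """Reference version: explicit double loop over all row pairs."""
    n = x.shape[0]
    out = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            diff = x[i] - x[j]
            out[i, j] = diff @ diff
    return out


def pairwise_sq_dist_broadcast(x):
    """Same result without Python loops.

    x[:, None, :] has shape (n, 1, d) and x[None, :, :] has shape (1, n, d),
    so the subtraction broadcasts to shape (n, n, d).
    """
    diff = x[:, None, :] - x[None, :, :]
    return (diff ** 2).sum(axis=-1)


def group_sums_one_hot(values, labels, n_groups):
    """Sum the rows of `values` per group with a single matrix product."""
    # Advanced indexing into the identity matrix builds the one-hot matrix:
    # one_hot[i, g] == 1 exactly when row i belongs to group g.
    one_hot = np.eye(n_groups)[labels]   # shape (n, n_groups)
    return one_hot.T @ values            # shape (n_groups, d)
```

The broadcast version trades memory for speed, since it materialises an (n, n, d) intermediate array, so it may need chunking for large inputs.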

In this PR the following is included:

The base structure for a module that can be used for partner selection.
A stock downloader which is built on top of y-finance
A notebook with a tutorial on how to use it.

Outlook:

I have not yet finished the vectorised copula approach.
I wanted to build a few dashboards outside of Python
Unit testing is still behind
An architecture that makes it more usable, for example with static methods.

If you have any questions, I'm looking forward to discussing them.

Jackal08

Hi Franz,

Overall good application. I like that you went the extra mile to improve the quality, even though you didn't manage to finish all 4 approaches.

Will be setting up an interview.

Glad to see you started building unit tests
Cool package name
I like that you set up the framework for a python package
Good quality writeup. I would have liked it to be more verbose but I like that you took the effort to add a template and detail your process.
Good import order
Added hard typing
Used decorators
Used hidden and public methods
Docstrings match our style
Added error handling and assertions
Used class inheritance
Good use of inline comments
Good design choices
req.txt missing version numbers
Notebook is good. I would have liked to see some LaTeX. I like your visualizations, good to see the geometric selector.
Used cumsum where it should have been cumprod.

Jackal08 · 2021-02-19T13:37:08Z

FranzKrekeler/README.md

@@ -0,0 +1,13 @@
+The vinecopulaslab is a basic implementation of the paper [Statistical Arbitrage with Vine Copulas [Stübinger, Mangold, Krauss (2016)]](https://www.econstor.eu/bitstream/10419/147450/1/870932616.pdf)


Cool name: vinecopulaslab

Jackal08 · 2021-02-19T13:39:19Z

FranzKrekeler/tests/unit/datafetchertest.py

+
+class TestDownload(TestCase):
+    def test_download_sample(self):
+        pass


At least the thought is there :)

Jackal08 · 2021-02-19T13:40:06Z

FranzKrekeler/tests/unit/geometrictest.py

+
+class TestDownload(TestCase):
+    def test_distance_to_diagonal(self):
+        # test template


Start comments with a capital letter (our convention).

Jackal08 · 2021-02-19T13:40:17Z

FranzKrekeler/tests/unit/geometrictest.py

+        # TODO: add tests
+        line = np.array([1, 1, 1])
+        pts = np.array([[0, 0, 0]])
+        self.assertEqual(GeometricSelection.distance_to_line(line, pts), 0, "Should be 0") 


Jackal08 · 2021-02-19T13:40:48Z

FranzKrekeler/vinecopulaslab/__init__.py

@@ -0,0 +1,3 @@
+from vinecopulaslab.partnerselection import TraditionalSelection, ExtendedSelection, GeometricSelection, \


Nice to see that you put the framework in place to build your own package.

Jackal08 · 2021-02-19T13:58:45Z

FranzKrekeler/vinecopulaslab/partnerselection/extended.py

+        target_stock = group.name
+        partner_stocks = group.STOCK_PAIR.tolist()
+        stock_selection = [target_stock] + partner_stocks
+        # We create a subset of our ecdf dataframe to increase lookup speed.


Good use of inline comments.

Jackal08 · 2021-02-19T14:00:55Z

FranzKrekeler/vinecopulaslab/partnerselection/extremal.py

+        # Here the math from the Mangold 2015 paper begins
+        permut_mat = np.array(list(itertools.product([-1, 1], repeat=d)), dtype=np.int8)
+        sub_mat = permut_mat @ permut_mat.T
+        F = (d + sub_mat) / 2


I'm not a fan of variables that are all CAPS unless they are env variables. Maybe you tried to match the notation in the paper? If not then perhaps a more descriptive name would help.
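
As a sketch of what more descriptive names could look like here: for sign vectors in {-1, 1}^d, the dot product of two vectors equals (agreements - disagreements), so `(d + dot) / 2` is the number of positions where they agree. The names `sign_patterns` and `agreement_matrix` below are hypothetical suggestions, not names from the PR or the paper.

```python
import itertools

import numpy as np


def sign_patterns(d):
    """All 2**d vectors with entries in {-1, 1}, one per row."""
    return np.array(list(itertools.product([-1, 1], repeat=d)), dtype=np.int8)


def agreement_matrix(d):
    """Entry (i, j) counts the positions where patterns i and j agree.

    dot = agreements - disagreements and agreements + disagreements = d,
    hence agreements = (d + dot) / 2.
    """
    patterns = sign_patterns(d)
    dot_products = patterns @ patterns.T
    return (d + dot_products) / 2
```

If the single letter is meant to mirror the paper's notation, a short comment pointing to the relevant equation would serve the same purpose.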

Jackal08 · 2021-02-19T14:03:42Z

FranzKrekeler/vinecopulaslab/requirements.txt

@@ -0,0 +1,6 @@
+numpy


Nice, thank you for including this. However, you need to add the version numbers.

Jackal08 · 2021-02-19T14:04:21Z

FranzKrekeler/vinecopulaslab/universe/universe.py

+        :return: (List[str]) returns a list of SP500 symbols
+        """
+        url = "https://raw.githubusercontent.com/datasets/s-and-p-500-companies/master/data/constituents_symbols.txt"
+        r = requests.get(url)


r is a bad variable name.
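
A possible rewrite with a descriptive name, as a sketch only. The function names, the `timeout` value, the `raise_for_status()` call and the one-symbol-per-line parsing are assumptions and are not part of the reviewed code.

```python
from typing import List

import requests

SYMBOLS_URL = (
    "https://raw.githubusercontent.com/datasets/"
    "s-and-p-500-companies/master/data/constituents_symbols.txt"
)


def parse_symbols(body: str) -> List[str]:
    """Split a text body into symbols, assuming one symbol per line."""
    return [line.strip() for line in body.splitlines() if line.strip()]


def get_sp500_symbols(url: str = SYMBOLS_URL, timeout: float = 10.0) -> List[str]:
    """
    Download the symbol list.

    :return: (List[str]) returns a list of SP500 symbols
    """
    # `response` says what the object is; `r` does not.
    response = requests.get(url, timeout=timeout)
    # Fail loudly on HTTP errors instead of parsing an error page.
    response.raise_for_status()
    return parse_symbols(response.text)
```

Keeping the parsing in its own function also makes it testable without network access.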

Jackal08 · 2021-02-19T14:09:53Z

FranzKrekeler/submission_and_tutorial.ipynb

+{
+ "cells": [
+  {
+   "cell_type": "markdown",


Why did you take the cumulative sum of the returns? That's wrong. It should be the cumulative product; otherwise you are not compounding your returns.

sp500_prices[sample_partners].pct_change(fill_method='ffill').dropna(how='all').cumsum(axis=0).plot();
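
A small sketch of the difference, using made-up prices and hypothetical ticker names rather than the notebook's data. Simple returns compound multiplicatively, so the cumulative return is `(1 + r).cumprod() - 1`. Summing them is only an approximation that drifts as returns grow.

```python
import pandas as pd


def cumulative_returns(prices: pd.DataFrame):
    """Return (summed, compounded) cumulative simple returns per column."""
    returns = prices.pct_change(fill_method=None).dropna(how="all")
    summed = returns.cumsum(axis=0)                 # ignores compounding
    compounded = (1 + returns).cumprod(axis=0) - 1  # matches the price path
    return summed, compounded


# Toy prices: AAA goes +10% then -10%, BBB goes +10% then +20%.
toy_prices = pd.DataFrame(
    {"AAA": [100.0, 110.0, 99.0], "BBB": [50.0, 55.0, 66.0]}
)
summed, compounded = cumulative_returns(toy_prices)

# AAA ends at 99, i.e. -1% overall: the compounded series gives about -0.01,
# while the summed series gives about 0.0.
# BBB ends at 66, i.e. +32% overall: compounded gives about 0.32, summed 0.30.
```

The compounded series always agrees with `prices / prices.iloc[0] - 1`, which is a quick sanity check for a plot like the one above.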

franz101 added 19 commits February 17, 2021 05:37

First commit with basic structure, comments and writeup will follow

18529c4

fixed wrong axis calculation in geometric sum

908d4a4

added further documentation and graphics

0d16b42

small restructering and added UML

45ff58b

fixed photo path

f7f7393

added write up

b46012c

moved partnerselection

323f2cc

added cached files to skip downloads

5803450

added cached files to skip downloads

ad1fb49

functionality refactoring

a845d99

further structuring

1b080c8

inheritance refactoring

98ce248

added another static method

b3e9eb0

removed tensorflow

af8b690

removed tf from requirments

e8243a5

removed unescarry function

382787f

added the extremal approach for developing

4839ac3

typos

bc24416

added further comments

c5f53a6

Jackal08 approved these changes Feb 19, 2021

Jackal08 assigned franz101 Feb 19, 2021

Jackal08 added the Skillset Challenge label Feb 19, 2021

franz101 added 2 commits March 3, 2021 10:29

fixed miscalculation of est1 [extended approach] pointed out by vijay

201346a

[extended approach] fixed wrong axes summation

cf281a9
