---
title: "Betting Strategy and Ⓜodel Validation - Part I"
subtitle: "Betting Model Analysis on Sportsbook Consultancy Firm A"
author: "[®γσ, Eng Lian Hu](https://englianhu.github.io/) <img src='figure/ShirotoNorimichi2.jpg' width='24'> 白戸則道®"
date: "`r Sys.Date()`"
output:
  tufte::tufte_html:
    toc: yes
  tufte::tufte_handout:
    citation_package: natbib
    latex_engine: xelatex
  tufte::tufte_book:
    citation_package: natbib
    latex_engine: xelatex
bibliography: skeleton.bib
link-citations: yes
---
```{r libs, message = FALSE, warning = FALSE, cache = TRUE, include = FALSE}
## tufte::tufte_html:
## self_contained: no
suppressMessages(require('tufte', quietly = TRUE))
# invalidate cache when the tufte version changes
knitr::opts_chunk$set(tidy = FALSE, cache.extra = packageVersion('tufte'))
options(htmltools.dir.version = FALSE)
## Setup Options, Loading Required Libraries and Preparing Environment
## Loading the packages and setting adjustment
suppressMessages(require('utils', quietly = TRUE))
suppressMessages(source('function/libs.R'))
```
# Abstract
This academic study applies statistical analysis in R to agency A of an existing betting consultancy, firm A. Following *Dixon and Pope (2004)*^[Kindly refer to the 24th paper in [Reference for industry knowdelege and academic research portion for the paper.](http://rstudio-pubs-static.s3.amazonaws.com/208636_147313e52754447e89b38cc3456c5009.html#reference-for-industry-knowdelege-and-academic-research-portion-for-the-paper.) in **7.4 References**], I refer to the parties as agency A and firm A in this paper for reasons of business confidentiality and privacy. **The purpose of the analysis is to measure the staking model of firm A**. For more examples of using R for soccer betting, see <http://rpubs.com/englianhu>. For references on R Markdown, see [rmarkdown](http://rmarkdown.rstudio.com/authoring_basics.html) and [An Introduction to R Markdown](http://rpubs.com/mansun_kuo/24330). You are welcome to read *Tony Hirst (2014)*^[Kindly refer to the 1st paper in [Reference for technical research on programming and coding portion for the paper.](http://rstudio-pubs-static.s3.amazonaws.com/208636_147313e52754447e89b38cc3456c5009.html#reference-for-technical-research-on-programming-and-coding-portion-for-the-paper.) in **7.4 References**] if you are interested in writing a data analysis on sportsbooks.
# 1. Introduction to Betting Strategies
- Section [1.1 Introducing Betting Strategies] - Introduce Betting Strategies
- Section [1.2 Value Betting] - Odds Price and Overrounds Charged by Bookmakers
- Section [1.3 Professional Gambler] - Punters' Life and How Hedge Funds Work
## 1.1 Introducing Betting Strategies
As players, we know that gambling is an activity in which we bet against bankers. Gamblers normally apply a few betting strategies to make money from bankers or perhaps from other players. For more information, see [R and Python in Data Science (数据科学中的R和Python) - Data Science is the art of turning data into actions - How to beat the casino at blackjack? (如何用21点来击败赌场?)](http://xccds1977.blogspot.my/2012/03/21.html) or [Betting Strategy](https://en.wikipedia.org/wiki/Betting_strategy).
Here I introduce some sports betting strategies used by a company called **Sports Insights** (treat them as reference material only, since I have never subscribed to their service; I refer to the company as **SI**) that might help you improve your winning percentage and start making money investing in sports. The following concepts represent some of the most lucrative historical betting trends and are the same tools used by sharp bettors to turn consistent profits.
- **Sports Betting Strategy 1** - [Betting Against the Public](https://www.sportsinsights.com/how-to-bet-on-sports/betting-against-the-public/)
Betting Against the Public is one of the most popular and simplest methods used by **SI** to maximize value in the sports betting marketplace.
- **Sports Betting Strategy 2** - [Reverse-Line Movement](https://www.sportsinsights.com/how-to-bet-on-sports/reverse-line-movement/)
**SI** will show how analyzing [betting trends data](https://www.sportsinsights.com/betting-trends/) and line movement can help you identify which games the sharp money (wagers placed by sharps, wiseguys or betting syndicates) is taking.
- **Sports Betting Strategy 3** - [Major Line Moves](https://www.sportsinsights.com/major-line-moves/)
**SI**'s major line move analysis explains how to interpret line moves across the sports betting marketplace in order to find value.
- **Sports Betting Strategy 4** - [Shading Sports Betting Lines](https://www.sportsinsights.com/how-to-bet-on-sports/shading-sports-betting-lines/)
This article explains how sportsbooks shade their lines to exploit human tendencies and how you can take advantage by using **SI**'s Betting Against the Public strategy.
- **Sports Betting Strategy 5** - [Shopping for the Best Line](https://www.sportsinsights.com/shopping-for-the-best-line/)
Shopping for the best possible number is an easy way to improve your winning percentage over the course of an entire season.
- **Sports Betting Strategy 6** - [The Importance of Units Won](https://www.sportsinsights.com/sportsbettingarticles/the-importance-of-true-units-won-in-evaluating-betting-strategies/)
Understanding the importance of units won vs winning percentage will help you evaluate the true worth of any sports betting system.
Here are some websites and companies that provide sportsbook trading/betting services.
- [StrataBet](https://app.stratabet.com/home)
<iframe width="560" height="315" src="https://www.youtube.com/embed/V_5O0GUEPo4" frameborder="0" allowfullscreen></iframe>
- [4lowin2](http://www.4lowin.com/)
<iframe width="560" height="315" src="https://www.youtube.com/embed/eFYS0jdSiWc" frameborder="0" allowfullscreen></iframe>
- [BetAngel](http://www.betangel.com/)
<iframe width="560" height="315" src="https://www.youtube.com/embed/cBNfUU_kcic" frameborder="0" allowfullscreen></iframe> <iframe width="560" height="315" src="https://www.youtube.com/embed/itlYsN0TZRs" frameborder="0" allowfullscreen></iframe>
- [Sports Betting Tips For Profit](http://sportsbettingtipsforprofit.com/)
[16]*Maurizio Montone (2015)* takes 82 operators as sample data for research on arbitrage and bookmakers' characteristics. [17]*Steven D. Levitt (2004)* analyses a bet-slip breakdown that is similar to section 3 of this paper.
## 1.2 Value Betting
Section [1.1 Introducing Betting Strategies] describes some basic concepts of betting strategies. We now focus on [Value betting](http://www.valuepunter.com/valuebetting.htm), a popular and efficient staking strategy, since money management is the key to any betting strategy.
> It's not a matter of life or death. But if that team, that result or that referee's decision goes against them, the lives of their wives and their children are affected. The mortgage does not get paid. Holidays are cancelled. They are not players or managers. They are football's professional gamblers.
> It is their full-time job to win money betting on the game. There are not many successful enough to survive. It is estimated, by the gamblers interviewed here, that fewer than 3 percent of gamblers who have what it takes to "go pro" can earn a living from betting.
> No wonder they are a secretive, paranoid bunch. Never do they reveal exactly how they win their money or how much. Their greatest secret is what is known as "the edge". That nugget of information which tells them that the odds on the football betting market is wrong. Only then do they hand over their hard-won dough.
> It takes hours of eye-bleeding research to find "the edge". Most pros spend hours, and thousands of pounds, building statistical models. Others will employ specialists—analysts and statisticians—to build a complex algorithm for them. If successful enough, they will attract wealthy investors who will hand over thousands, sometimes millions, to bet for them and be promised a healthy return.
> Tony Bloom, a legendary gambler known as "The Lizard" is one such operator. So revered, Bloom runs Star Lizard, a company that employs a raft of people to analyse football matches for his millionaire-only investors. Bloom is rumoured to be worth more than £1 billion and owns Brighton, the Premier League wannabes.
source : [Mugs and Millionaires: Inside the Murky World of Professional Football Gambling](http://bleacherreport.com/articles/2200795-mugs-and-millionaires-inside-the-murky-world-of-professional-football-gambling)
The best and most successful punters are money managers looking for ideal situations, defined as matches that offer a high percentage return. In any individual match, luck plays into the outcome in a way that no amount of odds compiling can overcome, but in the long run a disciplined punter will win more of those lucky games than they lose.
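The idea of a value bet can be sketched in a few lines of R. This is only an illustration under stated assumptions: odds are quoted as decimal (EU) prices, the win probability is the punter's own estimate, and the helper `expected_value()` and all numbers are hypothetical rather than part of firm A's model. A bet has value when the estimated probability multiplied by the decimal odds exceeds 1.

```r
# Hypothetical helper: expected profit per unit staked at decimal (EU) odds.
# p    : the punter's own estimate of the win probability (an assumption)
# odds : decimal odds offered by the bookmaker (stake included)
expected_value <- function(p, odds) {
  stopifnot(all(p >= 0 & p <= 1), all(odds > 1))
  # Profit when the bet wins, weighted by p, minus the stake lost otherwise.
  p * (odds - 1) - (1 - p)
}

# Illustrative numbers only: the same price is or is not a value bet,
# depending on the punter's probability estimate.
ev_good <- expected_value(p = 0.55, odds = 1.95)  # p * odds = 1.0725 > 1
ev_bad  <- expected_value(p = 0.50, odds = 1.95)  # p * odds = 0.9750 < 1

is_value_bet <- c(good = ev_good > 0, bad = ev_bad > 0)
print(round(c(good = ev_good, bad = ev_bad), 4))
print(is_value_bet)
```

With these illustrative inputs only the first bet has a positive expected value, which is the long-run sense in which a disciplined punter comes out ahead.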
## 1.3 Professional Gambler
Nowadays, operators impose many restrictions to increase their profits, for example: maximum stakes per single bet per account, triggers on the stake per bet, maximum stakes per single match per account, and vigorish/spread margins (described in [2.2 Overrounds / Vigorish]). A professional gambler therefore requires a high level of mathematical skill in order to take profit from operators. Below are some articles about sports betting data analysis.
- [Play Data, Play Ball! Exploring Baseball Data with R](http://suensummit.github.io/playBall/#1)
- [openWAR](http://statsinthewild.com/openwar/)
- [How Predictable is the English Premier League?](http://www.r-bloggers.com/how-predictable-is-the-english-premier-league/)
- [It's boffins versus bookies on the World Cup Rankings](http://www.theregister.co.uk/2010/06/27/world_cup_update/)
As we know, George Soros and Jim Rogers are two of the most successful punters in the financial markets; when they were working at the Quantum Fund, they used to analyse more than 25 companies through their financial reports and their businesses. The articles below describe the working environment and life of punters.
- [Preparing for a Career as a Sports Statistician: Two Interviews with People in the Field](http://stattrak.amstat.org/2012/08/01/sports-statistician/)
- [How hedge funds work](http://www.economist.com/blogs/economist-explains/2015/03/economist-explains-16?fsrc=scn%2Ffb%2Fwl%2Fee%2Fhowhedgefundswork)
- [Rob Mastrodomenico uses data to estimate the outcomes of sports events for professional punters, and it's a complicated business](http://www.theguardian.com/money/2011/jun/11/working-life-quantitative-analyst)
- [ATASS - Work to have fun](http://www.atass.com/about-us)
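- [Is mathematics the crystal ball of the betting industry? (数学是不是博彩业的水晶球?)](http://www.xinminweekly.com.cn/News/Content/878)
- [The Punters Club: how does a mysterious gambling syndicate that made 15.6 billion in three years "win nine bets out of ten"? (庞特俱乐部:三年狂赚156亿神秘赌博集团如何“十赌九赢”?)](http://finance.sina.com.cn/roll/20120804/013612756650.shtml)
Now that you have some basic knowledge of betting strategies and concepts, you can go on to learn sportsbook staking modelling and take on the challenge.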
# 2. Data
- Section [2.1 Collect and Reprocess the Data] - Data from Firm A
- Section [2.2 Overrounds / Vigorish] - Odds Price and Overrounds Charged by Bookmakers
## 2.1 Collect and Reprocess the Data
I collected the data-set of worldwide soccer matches from 2011 to 2015 from a British betting consultancy named firm A. All bets placed are displayed in Hong Kong dollars (HKD), and the odds prices are likewise quoted as Hong Kong prices.
I tried to apply `RSelenium` on [**®Studio Server Centos7**](http://rstudio.scibrokes.com) to scrape the data, including the odds prices, from a live-score website, but the [phantomjs](https://github.com/ariya/phantomjs/issues/12948#issuecomment-131267076) binary is not available for Linux, and I am also not familiar with installing `Java` or setting the path for `rJava`. Kindly refer to [Natural Language Analysis](http://rpubs.com/englianhu/natural-language-analysis) for more information about team name matching.
```{r load-data, echo = FALSE, results = 'asis'}
## Load saved dataset to save the loading time.
## directly load the dataset from running chunk `read-data-summary-table` and also chunk `scrap-data`. The spboData for filtering leagues and matches scores purpose.
load('./regressionApps/shinyData.RData')
```
```{r read-data-summary-table, eval = FALSE, echo = FALSE, results = 'asis'}
## Setup Options, Loading Required Libraries and Preparing Environment
## Loading the packages and setting adjustment again due to unable find the functions
suppressMessages(library('utils'))
suppressMessages(library('DT'))
suppressMessages(source('function/libs.R'))
## Read the data
## Refer to **Testing efficiency of coding.Rmd** at chunk `get-data-summary-table-2.1`
years <- seq(2011, 2015)
## Here I take the majority leagues setting profile which are "league-10-12"
## fMYPriceB = Back with vigorish price; fMYPriceL = Lay with vigorish price
## Here we term as Fair Odds
lProfile <- c(AH = 0.10, OU = 0.12)
mbase <- readfirmData(years = years, pth = './data/') %>% arrfirmData(lProfile = lProfile)
## In order to analyse the AHOU, here I need to filter out all soccer matches other than AHOU. (For example : Corners, Total League Goals etc.)
## the stakes amount display as $1 = $10,000
#'@ mbase$datasets[!(mbase$datasets$Home %in% mbase$corners)|!(mbase$datasets$Away %in% mbase$corners),]
dat <- mbase$datasets %>% filter((!Home %in% mbase$others)|(!Away %in% mbase$others)) %>% mutate(Stakes = Stakes/10000, Return = Return/10000, PL = PL/10000)
rm(years, readfirmData, arrfirmData)
rm(mbase) ## We need to scrap the livescore data based on the raw data mbase without filter, but this is not the point in this research paper.
```
```{r print-table, echo = FALSE, results = 'asis'}
#'@ pander(head(dat)) # exactly same layout with kable(x)
#'@ kable(head(dat)) ## example of the dataset in the research paper
suppressMessages(require('dplyr', quietly = TRUE))
suppressMessages(require('DT', quietly = TRUE))
dat %>% datatable(
caption = "Table 2.1.1 : Firm A Staking Data (in $0,000)",
escape = FALSE, filter = "top", rownames = FALSE,
extensions = list("ColReorder" = NULL, "RowReorder" = NULL,
"Buttons" = NULL, "Responsive" = NULL),
options = list(dom = 'BRrltpi', autoWidth = TRUE, scrollX = TRUE,
lengthMenu = list(c(10, 50, 100, -1), c('10', '50', '100', 'All')),
ColReorder = TRUE, rowReorder = TRUE,
buttons = list('copy', 'print',
list(extend = 'collection',
buttons = c('csv', 'excel', 'pdf'),
text = 'Download'), I('colvis'))))
```
*table 2.1.1* : `r paste0(dim(dat), collapse = ' x ')` : *Sample data collected for the research.*
> About 90 percent of money wagered will be on the Asian handicap, a market that allows the team expected to win a "head start" of a quarter of a goal or more to the opposition. The rest of the money staked will go on a market for over or under a certain number of goals and the match-result market.
source : [Mugs and Millionaires: Inside the Murky World of Professional Football Gambling](http://bleacherreport.com/articles/2200795-mugs-and-millionaires-inside-the-murky-world-of-professional-football-gambling)
In order to analyse the AHOU (Asian Handicap and Over/Under) markets, I have filtered out all other soccer markets (for example: Corners, Total League Goals, etc.) throughout this research paper; the result is the table shown above. Please refer to [Natural Language Analysis](http://rpubs.com/englianhu/natural-language-analysis) to see the firm A staking raw data-set.
Feel free to read [Asian Handicap](http://www.bettingarbitrage.eu/asian-handicap.html) and [Arbitrage of Synthetic Asian Handicap Bets](http://www.sportstradingnetwork.com/arbitrage-synthetic-asian-handicap-bets/) for some basic lessons on Asian Handicap bets.
The `Rebates` value in the dataset is not counted in `Return` and `PL`, since it is not part of the win/loss profit but is only awarded once a certain amount of stakes is reached. `Rebates` are a marketing strategy that bankers and agents use to compete for revenue and market share.
- Normally `Rebates` are offered only by credit-market sportsbook makers.
- The maximum stake per bet in the `cash market` is normally only up to HKD10,000, while the credit market can go up to millions (HKD1,000,000).
- Normally only high-volume agents, small bookmakers and high-volume sportsbook consultancy firms are able to get `Rebates`.
## 2.2 Overrounds / Vigorish
**Fair odds**: the odds that would be offered if the sum of the probabilities for all possible outcomes were exactly 1 (100%). For example, supposing we had a market with three possible outcomes {A, B, C} with probabilities of success $P(A) = 0.5, P(B) = 0.4$ and $P(C) = 0.1$, the fair odds would be `2.00`, `2.50`, and `10.00` respectively, which are just the inverse of the estimated probabilities.
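The three-outcome example above can be reproduced directly in R. This is a minimal sketch; the vector name `p` and the use of decimal odds are my own assumptions for illustration.

```r
# Probabilities of the three outcomes from the example above.
p <- c(A = 0.5, B = 0.4, C = 0.1)

# A fair book sums to exactly 100%.
stopifnot(isTRUE(all.equal(sum(p), 1)))

# Fair decimal odds are the reciprocal of each probability.
fair_odds <- 1 / p
print(fair_odds)  # A = 2.0, B = 2.5, C = 10.0
```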
**Overround**: Also called vigorish (or vig for short) in American sports betting, the over-round is a measure of the bookmaker’s edge over the gambler. The bookmaker will never offer fair odds on a market. In practice, the payout offered on each selection will be reduced, which in turn increases the reflected probability of an event. When odds have been adjusted in this way the sum of the probabilities for all events will exceed 1 `(100%)`. The over-round is the amount by which the sum of all probabilities exceeds `100%` and it is the bookmaker’s profit margin.
For example, if we had a market with two possible outcomes {A, B}, where $P(A) = P(B) = 0.5$, the fair odds on each selection would be `2.00`. However, the bookmaker may offer payouts of `1.85` on each selection. The corresponding probabilities for each selection are now 1/1.85 = `r 1/1.85`, and the sum of the probabilities for all outcomes is `r 1/1.85` x 2 = `r 1/1.85*2`. The over-round is `8.1%`, and for every `$100` paid out to winning gamblers the bookmaker expects to make a profit of `8.1` dollars (equivalently, about `7.5` dollars per `$100` staked), assuming that there are balanced bets on both A and B.
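The two-outcome example can be checked with a short sketch. The helper `overround()` and the balanced stakes of 50 per side are assumptions made for illustration; the sketch also reports the bookmaker's profit relative to the amount paid out and relative to the total stakes, since the two bases give different percentages.

```r
# Hypothetical helper: overround of a book quoted in decimal odds.
overround <- function(odds) sum(1 / odds) - 1

offered <- c(A = 1.85, B = 1.85)
implied <- 1 / offered         # implied probability of each selection
margin  <- overround(offered)  # amount by which the book exceeds 100%

# Balanced book: 50 staked on each side, so exactly one side is paid out.
stakes <- c(A = 50, B = 50)
payout <- stakes[["A"]] * offered[["A"]]  # same payout whichever side wins
profit <- sum(stakes) - payout

result <- round(c(implied_prob            = implied[["A"]],
                  overround               = margin,
                  profit_per_100_paid_out = 100 * profit / payout,
                  profit_per_100_staked   = 100 * profit / sum(stakes)), 4)
print(result)
```

The overround of about 8.1% matches the profit per 100 paid out, while the profit per 100 staked is 7.5.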
```{r scrap-data, eval = FALSE, echo = FALSE, results = 'asis'}
## Scrape the leagues and also overrounds which provides by a sportsbookmaker named Firm B
#'@ lnk <- 'http://data.nowgoal.com/history/handicap.htm'
## Above website provides odds price history but sendkeyElements cannot work in RSelenium, will follow-up
## https://github.com/ropensci/RSelenium/issues/55
## Therefore scrape spbo link to know the league of matches
## Besides, need to scrap the final-scores / half-time scores / result of soccer matches
#'@ dateID <- sort(unique(mbase$datasets$Date)); spboDate <- gsub('-','',dateID)
#'@ lnk <- paste0('http://www8.spbo.com/history.plex?day=', spboDate, '&l=en')
## kick-off time(GMT+8) - 12hrs since livescore website start count a day from 12pm(GMT+8)
dateID <- as.Date(sort(unique(dat$Date) - hm('12:00'))); spboDate <- gsub('-', '', dateID)
## Due to the scrapSPBO function scrapped unmatched data, example lnk[827],
## therefore I rewrite the function as scrapSPBO2
#'@ suppressAll(source('function/scrapSPBO2.R'))
#'@ scrapSPBO2(lnk = lnk, dateID = dateID, path = 'livescore', parallel = TRUE)
## Read spbo livescore datasets.
spboData <- readSPBO(dateID = dateID, parallel = FALSE)$data
## Apply stringdist() to 'exactly matching' and 'approximate matching' team names
#'@ method <- c('osa', 'lv', 'dl', 'hamming', 'lcs', 'qgram', 'cosine', 'jaccard', 'jw', 'soundex')
#'@ source(paste0(getwd(),'/function/arrTeamID.R'))
#'@ tmID <- arrTeamID(mbase, spboData, parallel = FALSE)
tmIDdata <- read.csv('./data/teamID.csv', header = TRUE, sep = ',') %>% mutate_each(funs(as.character)) %>% data.frame %>% tbl_df %>% filter(spbo != 'Kuban Krasnodar')
spboData %<>% filter((Home %in% tmIDdata$spbo)|(Away %in% tmIDdata$spbo))
## filter the bet slips with spbo live scores matches.
#'@ dat %<>% filter(as.Date(DateUK) %in% as.Date(spboData$DateUK), Home %in% tmIDdata$teamID, Away %in% tmIDdata$teamID)
#'@ spboData %<>% filter(as.Date(DateUK) %in% as.Date(dat$DateUK), Home %in% tmIDdata$teamID, Away %in% tmIDdata$teamID)
tmIDdata %<>% mutate(Home = factor(teamID), Away = factor(teamID), spboHome = factor(spbo), spboAway = factor(spbo)) %>% .[-c(1:3)]
dat <- join_all(list(dat, tmIDdata[c(1, 3)]), by = c('Home'), type = 'inner') %>% tbl_df
dat <- join_all(list(dat, tmIDdata[c(2, 4)]), by = c('Away'), type = 'inner') %>% tbl_df
names(spboData)[names(spboData) == 'Home'] <- 'spboHome'
names(spboData)[names(spboData) == 'Away'] <- 'spboAway'
names(spboData)[names(spboData) == 'DateUK'] <- 'spboDateUK'
names(spboData)[names(spboData) == 'Time'] <- 'spboTime'
dat <- dat[order(dat$Date, dat$Time, decreasing = FALSE), ]
#'@ names(dat)
# [1] c("No", "Sess", "Month", "Day", "DateUK", "Date", "Time", "Home", "Away", "Selection", "HCap", "EUPrice", "HKPrice", "Stakes", "CurScore", "Mins", "Result", "Return", "PL", "Rebates", "Picked", "AHOU", "fMYPriceB", "fMYPriceL", "pHKRange", "fHKPriceL", "pMYRange", "pHKRange2", "pMYRange2", "InPlay", "Mins2", "InPlay2", "ipRange", "HG", "AG", "FHFTET", "Picked2", "ipHCap", "CurScore2", "netProbB", "netProbL", "favNetProb", "undNetProb", "spboHome", "spboAway")
## join data to get a completed data with leagues, final scores in order to run the simulation in staking and Poisson section.
#'@ dat1 <- join_all(list(dat, spboData), by = c('Date', 'spboHome', 'spboAway'), type = 'full') %>% tbl_df %>% na.omit ## Due to join_all() will cause the Time variable became '0' numeric value. Therefore use merge instead.
dat <- merge(dat, spboData, by = c('Date', 'spboHome', 'spboAway')) %>% tbl_df
#'@ names(dat)
# [1] c("Date", "spboHome", "spboAway", "No.x", "Sess", "Month", "Day", "DateUK", "Time", "Home", "Away", "Selection", "HCap", "EUPrice", "HKPrice", "Stakes", "CurScore", "Mins", "Result", "Return", "PL", "Rebates", "Picked", "AHOU", "fMYPriceB", "fMYPriceL", "pHKRange", "fHKPriceL", "pMYRange", "pHKRange2", "pMYRange2", "InPlay", "Mins2", "InPlay2", "ipRange", "HG", "AG", "FHFTET", "Picked2", "ipHCap", "CurScore2", "netProbB", "netProbL", "favNetProb", "undNetProb", "No.y", "X", "matchID", "LeagueColor", "League", "spboDateUK", "spboTime", "Finished", "FTHG", "FTAG", "HTHG", "HTAG", "H.Card", "A.Card", "HT.matchID", "HT.graph1", "HT.graph2")
## Due to the daily financial settlement of Asian bookmakers based on 12.00PM (GMT + 8), therefore I just leave the merged data name alignment above to ease for refer the time and score.
dat %<>% mutate(Stakes = as.numeric(Stakes), League = factor(League),
Home = factor(Home), Away = factor(Away),
spboHome = factor(spboHome), spboAway = factor(spboAway))
## saveImage after filter to ease the whole research and efficiency of read data.
#'@ save.image("./regressionApps/shinyData.RData")
rm(dateID, spboDate, scrapSPBO, readSPBO, tmIDdata, spboData, nms)
```
I obtain the lay price simply by applying the equation below.
$$P_i^{HK_{Lay}} = \frac{1}{P_i^{HK_{Back}}} - \nu_{j}$$ *equation 2.2.1*
Here $\nu$ is the vigorish and $j \in \{1, 2\}$ indexes the market types, which are `r names(lProfile)[1]`=`r lProfile[1]` and `r names(lProfile)[2]`=`r lProfile[2]`. This gives the lay Fair Odds (the odds price with vigorish, as offered by operators). I then apply a settings profile, termed `lProfile` (you can freely edit the soccer match profile settings), to get the Real Odds (the net odds price without vigorish), as well as the value, $\text{Value} = \text{Real Price} / \text{Fair Odds}$. The stake can then be worked out with a bet stake calculator such as the [Kelly Staking Calculator](http://www.sportsbettingcalculator.co.uk/kelly-staking-calculator/). I simply work backwards from the value $\Re$ to get the estimated $P_{i}^{EM}$ (firm A), which we will discuss in Section [4.2 Linear Ⓜodel](http://rstudio-pubs-static.s3.amazonaws.com/208636_147313e52754447e89b38cc3456c5009.html#linear-omdel)^[I wrote a simple odds conversion from fixed odds to Asian Handicap at [*regressionApps*](https://beta.rstudioconnect.com/content/1807/), which is a Shiny app. You can change the vigorish freely and observe the effect on the odds price.].
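Equation 2.2.1 can be sketched as a small function. This is an illustration only: `lay_price_hk()` is a hypothetical helper and not the function used inside `arrfirmData()`, the back prices are made up, and `lProfile` simply repeats the profile values set in the data-loading chunk.

```r
# Vigorish profile with the same values as in the data-loading chunk.
lProfile <- c(AH = 0.10, OU = 0.12)

# Hypothetical helper implementing equation 2.2.1 on Hong Kong prices:
# lay price = 1 / back price - vigorish of the market type.
lay_price_hk <- function(back_hk, market = c("AH", "OU"), profile = lProfile) {
  market <- match.arg(market)
  stopifnot(all(back_hk > 0))
  1 / back_hk - profile[[market]]
}

# Illustrative back prices (not taken from firm A's data).
back <- c(0.85, 0.95, 1.05)
lay_table <- data.frame(back   = back,
                        lay_AH = round(lay_price_hk(back, "AH"), 4),
                        lay_OU = round(lay_price_hk(back, "OU"), 4))
print(lay_table)
```

Changing the values in `lProfile` shows how a larger vigorish lowers the lay price for the same back price.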
Besides, *John Fingleton & Patrick Waldron (1999)*^[Kindly refer to the 13th paper in [Reference for industry knowdelege and academic research portion for the paper.] in **7.4 References**. I have summarised the paper and outlined some points in section [4.2 Linear Ⓜodel](http://rstudio-pubs-static.s3.amazonaws.com/208636_147313e52754447e89b38cc3456c5009.html#linear-omdel).] researched the use of vigorish (bookmakers need to pay tax, labour costs, overheads and other costs, so odds prices without vigorish would definitely make a loss for bookmakers), how to differentiate punters from ordinary gamblers, and the optimal odds price; the Shin model measures the threat posed by smart punters and their proportion among gamblers.
```{r data-prob-table1, echo = FALSE, results = 'asis'}
## Calculate the Real Odds (Net Odds Price without Vigorish)
## fMYPriceB = Back without vigorish net price; netProbL = Lay without vigorish net price
## fMYPriceB and fMYPriceL are MYPrice, here we term as Real Odds
## Please refer to function arrfirmDatasets()
suppressMessages(require('formattable', quietly = TRUE))
dat %>% select(No.x, EUPrice, HKPrice, fHKPriceL, fMYPriceB, fMYPriceL, netProbB, netProbL) %>% head %>% formattable %>% as.htmlwidget # kable(caption = 'Table 2.2.1 : Sample Data of Virogish/Overrounds and Odds Price')
```
*table 2.2.1* : `r paste0(dim(dat), collapse = ' x ')` : *Vigorish, price and probabilities sample table.*
*Table 2.2.1* above just provides a sample of the odds prices and over-round; refer to *table 2.1.1* for details. Meanwhile, you can learn more about the *return on investment*, *conversion* and *region of origin* of different odds types/styles based on the *same probabilities* via the [Betting Odds Converter](http://www.sportsbookreview.com/betting-tools/odds-converter/) or simply by googling.
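The conversion between odds styles for the same probability can be sketched as follows. The function `convert_odds()` is a hypothetical helper written for illustration, using the standard definitions of Hong Kong, Malay and American prices; it is not the conversion code behind the dataset columns.

```r
# Hypothetical converter from decimal (EU) odds to other common styles.
convert_odds <- function(eu) {
  stopifnot(all(eu > 1))
  hk <- eu - 1                                # Hong Kong: net profit per unit staked
  my <- ifelse(hk <= 1, hk, -1 / hk)          # Malay: negative when HK price is above 1
  us <- ifelse(eu >= 2, 100 * hk, -100 / hk)  # American moneyline
  data.frame(EU = eu, HK = hk, MY = my, US = us, prob = 1 / eu)
}

# Illustrative prices only.
odds <- convert_odds(c(1.50, 2.00, 2.50))
print(round(odds, 4))
```

Each row describes the same implied probability in four notations, for example decimal `2.50` is Hong Kong `1.50`, Malay `-0.6667` and American `+150`.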
# 3. Summarise the Staking Model
- Section [3.1 Summarise Diversified Periodic Stakes] - Summarise the Stakes and Return
- Section [3.2 Summarise the Staking Handicap] - Summarise the Staking Handicap Breakdown
- Section [3.3 Summarise the Staking Prices] - Summarise the Staking Price Range Breakdown
- Section [3.4 Summarise the In-Play Staking Timing] - Summarise the In-Play Staking Breakdown by Time Range
- Section [3.5 Summarise the In-Play Staking Based on Current Score] - Summarise the In-Play Staking Breakdown by Current Score
## 3.1 Summarise Diversified Periodic Stakes
Before we start analysing the staking model, let us first look at some periodic breakdowns of the Stakes and Profit & Loss of agency A.
```{r data-annum-summary-plot, echo = FALSE, results = 'asis'}
## Load googleVis again due to unable find the function.
suppressMessages(require('googleVis', quietly = TRUE))
suppressMessages(require('plyr', quietly = TRUE))
## Annual summary graph.
m <- ddply(dat, .(Sess), summarise, Stakes = sum(Stakes), Return = sum(Return), PL = sum(PL)) %>% data.frame %>% tbl_df
gvis.options <- list(title = "Annum Stakes & PL ('0,000)", series = "[{targetAxisIndex:0},{targetAxisIndex:1}]", hAxis = "{title:'Year'}", vAxis = "{title:'Stakes'},{title:'PL'}", width = 'automatic', height = 'automatic')
col.gvis <- gvisColumnChart(xvar = 'Sess', yvar = c('Stakes','Return','PL'), data = m, options = gvis.options)
plot(col.gvis)
rm(m)
```
*graph 3.1.1* : *Investment Annual summary graph.*
The graph above shows that the investment of firm A through agency A generates a positive return (profit). Please refer to *table 4.1.1* for more details about the investment analysis.
```{r data-month-summary-table, echo = FALSE, results = 'asis'}
suppressMessages(require('magrittr', quietly = TRUE))
## Monthly summary table.
## PL.percent is PL as a percentage of total stakes, hence the factor of 100.
dt1 <- dat %>% group_by(Sess, Month) %>%
  summarise(S.total = sum(Stakes), S.median = median(Stakes),
            S.mean = round(mean(Stakes), 4), S.sd = round(sd(Stakes), 4),
            Count = length(PL),
            minHKPrcB = min(HKPrice), maxHKPrcB = max(HKPrice),
            minProbB = min(netProbB), maxProbB = max(netProbB),
            Return = sum(PL, Stakes), PL = sum(PL),
            PL.percent = sprintf(100 * sum(PL) / sum(Stakes), fmt = '%1.4f%%'),
            Samples = paste(Stakes, collapse = ',')) %>%
  mutate(Series = Samples)
#'@ names(dt1)[1] <- 'Sess'
ddim <- dim(dt1)
## daily stakes but not grouped monthly staking to plot a daily staking chart.
## r <- dt1 %>% group_by(Sess, Month) %>% .$S.total %>% range
#'@ r <- dat$Stakes %>% range
#'@ line_string <- "type: 'line', lineColor: 'black', fillColor: '#ccc', highlightLineColor: 'orange', highlightSpotColor: 'orange'"
#'@ cb_line <- JS(paste0("function (oSettings, json) { $('.spark:not(:has(canvas))').sparkline('html', { ", line_string, ", chartRangeMin: ", r[1], ", chartRangeMax: ", r[2], " }); }"), collapse = "")
#'@ box_string <- "type: 'box', lineColor: 'black', whiskerColor: 'black', outlierFillColor: 'black', outlierLineColor: 'black', medianColor: 'black', boxFillColor: 'orange', boxLineColor: 'black'"
#'@ cd <- list(list(targets = 15, render = JS("function(data, type, full){ return '<span class=sparkSamples>' + data + '</span>' }")), list(targets = 16, render = JS("function(data, type, full){ return '<span class=sparkSeries>' + data + '</span>' }")))
#'@ cb = JS(paste0("function (oSettings, json) {\n $('.sparkSeries:not(:has(canvas))').sparkline('html', { ", line_string, " });\n $('.sparkSamples:not(:has(canvas))').sparkline('html', { ", box_string, " });\n}"), collapse = "")
dt1 %<>% datatable(
caption = "Table 3.1.1 : Summary of Monthly Stakes and Return. ('0,000)",
escape = FALSE, filter = 'top', rownames = FALSE,
extensions = list('ColReorder' = NULL, 'RowReorder' = NULL,
'Buttons' = NULL, 'Responsive' = NULL),
options = list(dom = 'BRrltpi', autoWidth = TRUE, scrollX = TRUE,
lengthMenu = list(c(10, 50, 100, -1), c('10', '50', '100', 'All')),
ColReorder = TRUE, rowReorder = TRUE,
buttons = list('copy', 'print',
list(extend = 'collection',
buttons = c('csv', 'excel', 'pdf'),
text = 'Download'), I('colvis'))))#,
#'@ columnDefs = cd, fnDrawCallback = cb))
## convert the data into inline graph
#'@ dt1$dependencies <- append(dt1$dependencies, htmlwidgets:::getDependency('sparkline'))
dt1
rm(dt1)#, r)
```
*table 3.1.1* : `r paste0(ddim, collapse = ' x ')` : *Investment monthly breakdown table.*
The table above shows that the Asian agency A makes a profit every year by following the British sports betting consultancy firm A. Thousands of bets are placed per month, and the stake sizes are shaped by the maximum bet limit setting, the league profile setting and value betting (presumably based on the Kelly model), so the mean stake is likely to be biased. The median is therefore a more accurate measure of the typical stake than the mean.
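A minimal sketch of why the median is preferred here, using made-up stakes rather than the real data: a handful of bets placed at an assumed maximum bet limit pull the mean far away from the typical stake, while the median stays put.

```r
## Hypothetical stakes (not real data): eight small bets plus two bets
## placed at an assumed maximum bet limit of 500.
stakes <- c(rep(10, 8), 500, 500)

mean(stakes)    # 108: pulled upwards by the two large bets
median(stakes)  # 10: still the typical stake size
```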
```{r data-month-summary-plots, echo = FALSE, results = 'asis'}
## Load package 'zoo', which provides the function as.yearmon()
suppressMessages(require('zoo', quietly = TRUE))
## Monthly summary graph.
## the stakes amount display as $1 = $10,000
m <- ddply(data.frame(Month = as.yearmon(dat$Date,'%b'), dat), .(Month), summarise, Stakes = sum(Stakes), Return = sum(Return), PL = sum(PL)) %>% mutate(Month = as.character(Month)) %>% tbl_df # mutate Month as.character() at last to make the Month hAxis arrange in order after summarise
gvis.options <- list(title = "Monthly Stakes & PL ('0,000)", series = "[{targetAxisIndex:0},{targetAxisIndex:1}]", hAxis = "{title:'Month'}", vAxis = "{title:'Stakes'},{title:'PL'}", width = 'automatic', height = 'automatic')
line.gvis <- gvisLineChart(xvar = 'Month', yvar = c('Stakes', 'Return', 'PL'), data = m, options = gvis.options)
plot(line.gvis)
rm(m)
```
*graph 3.1.2* : *Investment monthly trend graph.*
```{r data-daily-summary, message = FALSE, warning = FALSE, echo = FALSE, results = 'asis'}
suppressMessages(library('tidyr'))
dt2 <- dat %>% group_by(Sess, Month, Day) %>% summarise(S.total = sum(Stakes), S.median = median(Stakes), S.mean = round(mean(Stakes), 4), S.sd = round(sd(Stakes), 4), Count = length(PL), minHKPrcB = min(HKPrice), maxHKPrcB = max(HKPrice), minProbB = min(netProbB), maxProbB = max(netProbB), Return = sum(PL, Stakes), PL = sum(PL), PL.percent = sprintf(sum(PL) / sum(Stakes) * 100, fmt = "%1.4f%%"))#, Samples = paste(Stakes, collapse = ",")) %>% mutate(Series = Samples)
#'@ names(dt2)[1] <- 'Sess'
ddim <- dim(dt2)
## Copy for plot graph
m <- dt2 %>% tbl_df %>% mutate(Day = paste(Day, Month, Sess)) %>% select(Day, Stakes = S.total, Return, PL)
#'@ m <- dt2 %>% tbl_df %>% unite(Day, Day, Month, Sess) %>% select(Day, Stakes = S.total, Return, PL)
#'@ r <- dt2 %>% group_by(Sess, Month, Day) %>% .$S.total %>% range
#'@ cb_line <- JS(paste0("function (oSettings, json) { $('.spark:not(:has(canvas))').sparkline('html', { ", line_string, ", chartRangeMin: ", r[1], ", chartRangeMax: ", r[2], " }); }"), collapse = "")
#'@ cb = JS(paste0("function (oSettings, json) {\n $('.sparkSeries:not(:has(canvas))').sparkline('html', { ", line_string, " });\n $('.sparkSamples:not(:has(canvas))').sparkline('html', { ", box_string, " });\n}"), collapse = "")
dt2 %<>% datatable(
caption = "Table 3.1.2 : Summary of Daily Stakes and Return. ('0,000)",
escape = FALSE, filter = 'top', rownames = FALSE,
extensions = list('ColReorder' = NULL, 'RowReorder' = NULL,
'Buttons' = NULL, 'Responsive' = NULL),
options = list(dom = 'BRrltpi', autoWidth = TRUE, scrollX = TRUE,
lengthMenu = list(c(10, 50, 100, -1), c('10', '50', '100', 'All')),
ColReorder = TRUE, rowReorder = TRUE,
buttons = list('copy', 'print',
list(extend = 'collection',
buttons = c('csv', 'excel', 'pdf'),
text = 'Download'), I('colvis'))))#,
#'@ columnDefs = cd, fnDrawCallback = cb))
## convert the data into inline graph
#'@ dt2$dependencies <- append(dt2$dependencies, htmlwidgets:::getDependency('sparkline'))
dt2
rm(dt2)#, r, line_string, cb_line, box_string, cd, cb)
```
*table 3.1.2* : `r paste0(ddim, collapse = ' x ')` : *Investment daily breakdown table.*
```{r data-daily-summary-plot, echo = FALSE, results = 'asis'}
## Daily summary graph.
## the stakes amount display as $1 = $10,000
gvis.options <- list(title = "Daily Stakes & PL ('0,000)", series = "[{targetAxisIndex:0},{targetAxisIndex:1}]", hAxis = "{title:'Day'}", vAxis = "{title:'Stakes'},{title:'PL'}", width = 'automatic', height = 'automatic')
line.gvis <- gvisLineChart(xvar = 'Day', yvar = c('Stakes', 'Return', 'PL'), data = m, options = gvis.options)
plot(line.gvis)
rm(ddim, line.gvis, m)
```
*graph 3.1.3* : *Investment daily trend graph.*
The graph above shows the daily Stakes, Return and Profit & Loss, while *table 3.1.2* gives the daily breakdown, including the total, median, mean and standard deviation of the stakes.
## 3.2 Summarise the Staking Handicap
```{r data-hdp-summary-table1a, echo = FALSE, results = 'asis'}
## Handicap breakdown summary
funs <- expression(sum, mean, median, sd, length)
ahBets <- llply(funs, function(x)
ddply(dat, ~ HCap + AHOU + Picked, .drop = TRUE, colwise(.fun = x, .cols = ~ Stakes + Return + PL))) %>% join_all(by = c('HCap', 'AHOU', 'Picked')) %>% .[-c((ncol(.)-1):ncol(.))] %>% data.frame
names(ahBets) <- suppressWarnings(c('HCap', 'AHOU', 'Picked', 'Stakes', 'Return', 'PL', 'S.mean', 'R.mean', 'PL.mean', 'S.median', 'R.median', 'PL.median', 'S.sd', 'R.sd', 'PL.sd', 'Count'))
ahBets %<>% map_if(is.numeric, round, 2) %>% data.frame %>% mutate(R.percent = sprintf(Return / Stakes * 100, fmt = '%1.4f%%'), PL.percent = sprintf(PL / Stakes * 100, fmt = '%1.4f%%')) %>% tbl_df
lst <- ahBets %>% dlply(.(AHOU))
lst[[1]] %>% datatable(
caption = "Table 3.2.1A : Asian Handicap - Handicap Breakdown ('0,000)",
escape = FALSE, filter = 'top', rownames = FALSE,
extensions = list('ColReorder' = NULL, 'RowReorder' = NULL,
'Buttons' = NULL, 'Responsive' = NULL),
options = list(dom = 'BRrltpi', autoWidth = TRUE, scrollX = TRUE,
lengthMenu = list(c(10, 50, 100, -1), c('10', '50', '100', 'All')),
ColReorder = TRUE, rowReorder = TRUE,
buttons = list('copy', 'print',
list(extend = 'collection',
buttons = c('csv', 'excel', 'pdf'),
text = 'Download'), I('colvis'))))
```
*table 3.2.1a* : `r paste0(dim(lst[[1]]), collapse = ' x ')` : *Asian Handicap - handicap breakdown table.*
```{r data-hdp-summary-table1b, echo = FALSE, results = 'asis'}
lst[[2]] %>% datatable(
caption = "Table 3.2.1B : Goal Line - Handicap Breakdown ('0,000)",
escape = FALSE, filter = 'top', rownames = FALSE,
extensions = list('ColReorder' = NULL, 'RowReorder' = NULL,
'Buttons' = NULL, 'Responsive' = NULL),
options = list(dom = 'BRrltpi', autoWidth = TRUE, scrollX = TRUE,
lengthMenu = list(c(10, 50, 100, -1), c('10', '50', '100', 'All')),
ColReorder = TRUE, rowReorder = TRUE,
buttons = list('copy', 'print',
list(extend = 'collection',
buttons = c('csv', 'excel', 'pdf'),
text = 'Download'), I('colvis'))))
```
*table 3.2.1b* : `r paste0(dim(lst[[2]]), collapse = ' x ')` : *Goal Line - handicap breakdown table.*
```{r data-hdp-summary-table1c, echo = FALSE, results = 'asis'}
m <- ddply(dat, ~ HCap + AHOU, .drop = TRUE, colwise(.fun = sum, .cols = ~ Stakes + Return + PL)) %>% mutate(R.percent = Return / Stakes * 100, PL.percent = PL / Stakes * 100) %>% dlply(.(AHOU)) %>% llply(arrange, Stakes) %>% ldply(tail, 4, .id = NULL) %>% tbl_df
m %>% formattable(list(
HCap = color_tile('white', 'darkgoldenrod'),
PL = formatter('span', style = x ~ style(color = ifelse(x > 0, 'green', 'red')), x ~ icontext(ifelse(x > 0, 'ok', 'remove'), x)),
R.percent = formatter('span', style = x ~ style(color = ifelse(rank(-x) <= 3, 'green', 'silver')), x ~ sprintf('%.2f%% (rank: %02d)', x, rank(-x))),
PL.percent = formatter("span", style = x ~ style(color = ifelse(rank(-x) <= 3, 'green', 'silver')), x ~ sprintf('%.2f%% (rank: %02d)', x, rank(-x))))) %>% as.htmlwidget
#'@ m <- ddply(dat, ~ HCap + AHOU, .drop = TRUE, colwise(.fun = sum, .cols = ~ Stakes + Return + PL)) %>% mutate(R.percent = Return / Stakes * 100, PL.percent = PL / Stakes * 100) %>% tbl_df
## To reduce the file size, I omit the styled DT::datatable here and use a static formattable instead.
## https://beta.rstudioconnect.com/englianhu/Programming-Assignment-2-Submission/#read-data
```
*table 3.2.1c* : `r paste0(dim(m), collapse = ' x ')` : *Handicap, stakes and PL sample data.*
From the tables above, firm A placed most of its Asian Handicap stakes with agency A on the handicap that concedes/takes `r m %>% filter(AHOU == 'AH') %>% filter(Stakes == max(Stakes)) %>% .$HCap` ball(s). Meanwhile, the handicap `r m %>% filter(AHOU == 'AH') %>% filter(R.percent == max(R.percent)) %>% .$HCap` is the most profitable by return rate among the handicaps sampled in *table 3.2.1c*.
Secondly, on the Goal Line the stakes were mostly placed on the `r ahBets %>% filter(AHOU == 'OU') %>% filter(Stakes == max(Stakes)) %>% .$Picked` selection at `r ahBets %>% filter(AHOU == 'OU') %>% filter(Stakes == max(Stakes)) %>% .$HCap` balls. (The Dutch, Japanese, Spanish and women's soccer leagues tend to see more goals, while the Portuguese, Italian and French leagues tend to see fewer; the English leagues average around `2.5` balls.)
```{r data-hdp-summary-plot1a, echo = FALSE, results = 'asis'}
## Plot the Stakes and P&L on AH handicap graphs
## the stakes amount display as $1 = $10,000
lst <- ahBets %>% select(HCap, AHOU, Picked, Stakes, Return, PL) %>% dlply(.(AHOU), gather, Type, Stakes, Stakes:PL) %>% llply(unite, Category, Picked, Type, sep = '.') %>% llply(spread, Category, Stakes) %>% llply(., function(x) tbl_df(data.frame(x)))
gvis.options <- list(title = "Asian Handicap - Handicap Breakdown ('0,000)", series = "[{targetAxisIndex:0},{targetAxisIndex:1}]", hAxis = "{title:'HCap'}", vAxis = "{title:'Stakes'},{title:'PL'}", width = 'automatic', height = 'automatic')
line.gvis <- gvisLineChart(xvar = 'HCap', yvar = names(lst[[1]])[-c(1:2)], data = lst[[1]], options = gvis.options)
plot(line.gvis)
rm(ahBets)
```
*graph 3.2.1a* : *Asian Handicap - handicap breakdown staking graph.*
```{r data-hdp-summary-plot1b, echo = FALSE, results = 'asis'}
## Plot the Stakes and P&L on OU handicap graphs
## the stakes amount display as $1 = $10,000
gvis.options <- list(title = "Goal Line - Handicap Breakdown ('0,000)", series = "[{targetAxisIndex:0},{targetAxisIndex:1}]", hAxis = "{title:'HCap'}", vAxis = "{title:'Stakes'},{title:'PL'}", width = 'automatic', height = 'automatic')
line.gvis <- gvisLineChart(xvar = 'HCap', yvar = names(lst[[2]])[-c(1:2)], data = lst[[2]], options = gvis.options)
plot(line.gvis)
rm(line.gvis)
```
*graph 3.2.1b* : *Goal Line - handicap breakdown staking graph.*
The graphs above show the Stakes breakdown for both `r names(lProfile)[1]` and `r names(lProfile)[2]`.
## 3.3 Summarise the Staking Prices
```{r data-hdp-summary-table2a, echo = FALSE, results = 'asis'}
## Favorite or Underdog and price range breakdown
fuBets <- llply(funs, function(x)
ddply(dat, ~ AHOU + pHKRange + Picked, .drop = TRUE, colwise(.fun = x, .cols = ~ Stakes + Return + PL))) %>% join_all(by = c('AHOU', 'pHKRange', 'Picked')) %>% .[-c((ncol(.) - 1):ncol(.))] %>% data.frame
names(fuBets) <- suppressWarnings(c('AHOU', 'pHKRange', 'Picked', 'Stakes', 'Return', 'PL', 'S.mean', 'R.mean', 'PL.mean', 'S.median', 'R.median', 'PL.median', 'S.sd', 'R.sd', 'PL.sd', 'Count'))
fuBets %<>% map_if(is.numeric, round, 2) %>% data.frame %>% mutate(R.percent = sprintf(Return / Stakes * 100, fmt = '%1.4f%%'), PL.percent = sprintf(PL / Stakes * 100, fmt = '%1.4f%%')) %>% tbl_df
lst <- fuBets %>% dlply(.(AHOU))
lst[[1]] %>% datatable(
caption = "Table 3.3.1A : Asian Handicap - Price Range Breakdown ('0,000)",
escape = FALSE, filter = 'top', rownames = FALSE,
extensions = list('ColReorder' = NULL, 'RowReorder' = NULL,
'Buttons' = NULL, 'Responsive' = NULL),
options = list(dom = 'BRrltpi', autoWidth = TRUE, scrollX = TRUE,
lengthMenu = list(c(10, 50, 100, -1), c('10', '50', '100', 'All')),
ColReorder = TRUE, rowReorder = TRUE,
buttons = list('copy', 'print',
list(extend = 'collection',
buttons = c('csv', 'excel', 'pdf'),
text = 'Download'), I('colvis'))))
```
*table 3.3.1a* : `r paste0(dim(lst[[1]]), collapse = ' x ')` : *Asian Handicap - price range breakdown table.*
```{r data-hdp-summary-table2b, echo = FALSE, results = 'asis'}
lst[[2]] %>% datatable(
caption = "Table 3.3.1B : Goal Line - Price Range Breakdown ('0,000)",
escape = FALSE, filter = 'top', rownames = FALSE,
extensions = list('ColReorder' = NULL, 'RowReorder' = NULL,
'Buttons' = NULL, 'Responsive' = NULL),
options = list(dom = 'BRrltpi', autoWidth = TRUE, scrollX = TRUE,
lengthMenu = list(c(10, 50, 100, -1), c('10', '50', '100', 'All')),
ColReorder = TRUE, rowReorder = TRUE,
buttons = list('copy', 'print',
list(extend = 'collection',
buttons = c('csv', 'excel', 'pdf'),
text = 'Download'), I('colvis'))))
```
*table 3.3.1b* : `r paste0(dim(lst[[2]]), collapse = ' x ')` : *Goal Line - price range breakdown table.*
```{r data-hdp-summary-table2c, echo = FALSE, results = 'asis'}
#'@ m <- ddply(dat, ~ pHKRange + pMYRange, .drop = TRUE, colwise(.fun = sum, .cols = ~ Stakes + Return + PL)) %>% tbl_df %>% mutate(R.percent = sprintf(Return / Stakes, fmt = '%1.4f%%'), PL.percent = sprintf(PL / Stakes, fmt = '%1.4f%%')) %>% arrange(Stakes) %>% tail
m <- ddply(dat, ~ AHOU + pHKRange + pMYRange, .drop = TRUE, colwise(.fun = sum, .cols = ~ Stakes + Return + PL)) %>% mutate(R.percent = Return / Stakes * 100, PL.percent = PL / Stakes * 100) %>% dlply(.(AHOU)) %>% llply(arrange, Stakes) %>% ldply(tail, 4, .id = NULL) %>% tbl_df
m %>% formattable(list(
HCap = color_tile('white', 'darkgoldenrod'),
PL = formatter('span', style = x ~ style(color = ifelse(x > 0, 'green', 'red')), x ~ icontext(ifelse(x > 0, 'ok', 'remove'), x)),
R.percent = formatter('span', style = x ~ style(color = ifelse(rank(-x) <= 3, 'green', 'silver')), x ~ sprintf('%.2f%% (rank: %02d)', x, rank(-x))),
PL.percent = formatter("span", style = x ~ style(color = ifelse(rank(-x) <= 3, 'green', 'silver')), x ~ sprintf('%.2f%% (rank: %02d)', x, rank(-x))))) %>% as.htmlwidget
#'@ m <- ddply(dat, ~ pHKRange + pMYRange, .drop = TRUE, colwise(.fun = sum, .cols = ~ Stakes + Return + PL)) %>% mutate(R.percent = Return / Stakes * 100, PL.percent = PL / Stakes * 100) %>% tbl_df
## To reduce the file size, I omit the styled DT::datatable here and use a static formattable instead.
## https://beta.rstudioconnect.com/englianhu/Programming-Assignment-2-Submission/#read-data
```
*table 3.3.1c* : `r paste0(dim(m), collapse = ' x ')` : *Price range, stakes and PL sample table.*
From the tables above, most stakes were placed in the price range `r m %>% filter(Stakes == max(Stakes)) %>% .$pHKRange %>% as.character`. Comparing the stakes and the returns/profit between `0.70~0.80` and `1.10~1.20`, and between `0.60~0.70` and `1.20~1.30`, shows how important the price is in **Value Betting**.
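To illustrate why the price matters, consider a simplified bet that is either won or lost in full (Asian Handicap half-wins, half-losses and pushes are ignored here). With HK odds `o` and an assumed win probability `p`, the expected profit per unit staked is `p * o - (1 - p)`. The helper names `ev_hk` and `breakeven` and all the numbers below are illustrative assumptions, not part of the analysis above.

```r
## Expected profit per unit stake for HK odds `o` and win probability `p`,
## assuming the bet is either fully won (+o) or fully lost (-1).
ev_hk <- function(p, o) p * o - (1 - p)

## Holding the assumed win probability fixed, compare two prices.
p <- 0.55
ev_hk(p, o = 0.75)  # negative expectation at the lower price
ev_hk(p, o = 1.15)  # positive expectation at the higher price

## Break-even win probability for a given price: p = 1 / (1 + o).
breakeven <- function(o) 1 / (1 + o)
breakeven(c(0.75, 1.15))
```

In practice the win probability differs between price ranges, so the comparison only shows the mechanism: a bet has value when the estimated win probability exceeds the break-even probability implied by the price.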
```{r data-hdp-summary-plot2a, echo = FALSE, results = 'asis'}
## Plot the Stakes and P&L on AH handicap graphs
## the stakes amount display as $1 = $10,000
lst <- fuBets %>% select(AHOU, pHKRange, Picked, Stakes, Return, PL) %>% dlply(.(AHOU), gather, Type, Stakes, Stakes:PL) %>% llply(unite, Category, Picked, Type, sep = '.') %>% llply(spread, Category, Stakes) %>% llply(., function(x) tbl_df(data.frame(x)))
gvis.options <- list(title = "Asian Handicap - Price Range Breakdown ('0,000)", series = "[{targetAxisIndex:0},{targetAxisIndex:1}]", hAxis = "{title:'pHKRange'}", vAxis = "{title:'Stakes'},{title:'PL'}", width = 'automatic', height = 'automatic')
line.gvis <- gvisLineChart(xvar = 'pHKRange', yvar = names(lst[[1]])[-c(1:2)], data = lst[[1]], options = gvis.options)
plot(line.gvis)
rm(fuBets)
```
*graph 3.3.1a* : *Asian Handicap - price range staking graph.*
```{r data-hdp-summary-plot2b, echo = FALSE, results = 'asis'}
## Plot the Stakes and P&L on OU handicap graphs
## the stakes amount display as $1 = $10,000
gvis.options <- list(title = "Goal Line - Price Range Breakdown ('0,000)", series = "[{targetAxisIndex:0},{targetAxisIndex:1}]", hAxis = "{title:'pHKRange'}", vAxis = "{title:'Stakes'},{title:'PL'}", width = 'automatic', height = 'automatic')
line.gvis <- gvisLineChart(xvar = 'pHKRange', yvar = names(lst[[2]])[-c(1:2)], data = lst[[2]], options = gvis.options)
plot(line.gvis)
rm(line.gvis)
```
*graph 3.3.1b* : *Goal Line - price range staking graph.*
The graphs above show the Stakes and P&L across the different price ranges. The MY odds style is in fact easier to count and interpret statistically, and easier to plot, since the return per unit stake (whether won or lost) ranges ONLY from `-1` to `1`, while under the HK/European odds style it ranges from `-1` to `Inf`. I nevertheless keep both HKOdds and MYOdds; please refer to *table 2.2.1* for more details.
However, because of the stake amounts, here I simply use the HK odds so that the Stakes and Return/PL match the dataset exactly.
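The two ranges can be checked with a small sketch. It uses the usual conversion between the two styles (HK odds up to `1` are unchanged, HK odds above `1` become `-1 / HK`) and again assumes bets that are fully won or fully lost. The function names `hk2my`, `pl_hk` and `pl_my` are hypothetical helpers written for this illustration.

```r
## Convert HK odds to Malay (MY) odds.
## HK odds up to 1 are unchanged; HK odds above 1 become -1 / HK.
hk2my <- function(hk) ifelse(hk <= 1, hk, -1 / hk)

## Profit or loss per nominal unit stake under HK odds.
## `won` is a single TRUE/FALSE applied to every price.
pl_hk <- function(hk, won) if (won) hk else rep(-1, length(hk))

## Profit or loss per nominal unit stake under MY odds.
## Positive MY odds: win `my`, lose 1. Negative MY odds: win 1, lose |my|.
pl_my <- function(my, won) {
  win  <- ifelse(my > 0, my, 1)
  lose <- ifelse(my > 0, -1, my)
  if (won) win else lose
}

hk <- c(0.80, 1.00, 1.25, 4.00)
my <- hk2my(hk)
my

range(pl_hk(hk, TRUE), pl_hk(hk, FALSE))  # unbounded above as odds grow
range(pl_my(my, TRUE), pl_my(my, FALSE))  # always within -1 and 1
```

Under negative MY odds the amount at risk is smaller than the nominal stake, which is one reason to stay with HK odds when the stakes have to match the dataset.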
## 3.4 Summarise the In-Play Staking Timing
```{r data-ip-summary-table1a, echo = FALSE, results = 'asis'}
## In-Play : time range
funs <- expression(sum, mean, median, sd, length)
ipBets1 <- llply(funs, function(x)
ddply(dat, ~ InPlay2 + AHOU + ipRange, .drop = TRUE, colwise(.fun = x, .cols = ~ Stakes + Return + PL))) %>% join_all(by = c('InPlay2', 'AHOU', 'ipRange')) %>% .[-c((ncol(.) - 1):ncol(.))] %>% data.frame
names(ipBets1) <- suppressWarnings(c('InPlay2', 'AHOU', 'ipRange', 'Stakes', 'Return', 'PL', 'S.mean', 'R.mean', 'PL.mean', 'S.median', 'R.median', 'PL.median', 'S.sd', 'R.sd', 'PL.sd', 'Count'))
ipBets1 %<>% map_if(is.numeric, round, 2) %>% data.frame %>% mutate(R.percent = sprintf(Return / Stakes * 100, fmt = '%1.4f%%'), PL.percent = sprintf(PL / Stakes * 100, fmt = '%1.4f%%')) %>% tbl_df
lst <- ipBets1 %>% dlply(.(AHOU))
lst[[1]] %>% datatable(
caption = "Table 3.4.1A : Asian Handicap - In-Play Time Range Breakdown ('0,000)",
escape = FALSE, filter = 'top', rownames = FALSE,
extensions = list('ColReorder' = NULL, 'RowReorder' = NULL,
'Buttons' = NULL, 'Responsive' = NULL),
options = list(dom = 'BRrltpi', autoWidth = TRUE, scrollX = TRUE,
lengthMenu = list(c(10, 50, 100, -1), c('10', '50', '100', 'All')),
ColReorder = TRUE, rowReorder = TRUE,
buttons = list('copy', 'print',
list(extend = 'collection',
buttons = c('csv', 'excel', 'pdf'),
text = 'Download'), I('colvis'))))
rm(funs)
```
*table 3.4.1a* : `r paste0(dim(lst[[1]]), collapse = ' x ')` : *Asian Handicap - In-Play time range breakdown table.*
```{r data-ip-summary-table1b, echo = FALSE, results = 'asis'}
lst[[2]] %>% datatable(
caption = "Table 3.4.1B : Goal Line - In-Play Time Range Breakdown ('0,000)",
escape = FALSE, filter = 'top', rownames = FALSE,
extensions = list('ColReorder' = NULL, 'RowReorder' = NULL,
'Buttons' = NULL, 'Responsive' = NULL),
options = list(dom = 'BRrltpi', autoWidth = TRUE, scrollX = TRUE,
lengthMenu = list(c(10, 50, 100, -1), c('10', '50', '100', 'All')),
ColReorder = TRUE, rowReorder = TRUE,
buttons = list('copy', 'print',
list(extend = 'collection',
buttons = c('csv', 'excel', 'pdf'),
text = 'Download'), I('colvis'))))
```
*table 3.4.1b* : `r paste0(dim(lst[[2]]), collapse = ' x ')` : *Goal Line - In-Play time range breakdown table.*
In the tables above, `Breaks` covers the stakes placed before Extra-Time (once the 90-minute game has been played), at Half-Time and Full-Time in both the 90-minute game and Extra-Time, and during injury time, other breaks etc. (that is, all stakes placed after the kick-off whistle and before the final result). `No` denotes the stakes and P&L summary of the pre-game bets.
```{r data-ip-summary-plots1a, echo = FALSE, results = 'asis'}
## Plot the Stakes and P&L on AH handicap graphs
## the stakes amount display as $1 = $10,000
lst <- ipBets1 %>% select(InPlay2, AHOU, ipRange, Stakes, Return, PL) %>% dlply(.(AHOU), gather, Type, Stakes, Stakes:PL) %>% llply(unite, Category, InPlay2, Type, sep = '.') %>% llply(spread, Category, Stakes) %>% llply(., function(x) tbl_df(data.frame(x)))
gvis.options <- list(title = " Asian Handicap - In-Play Time Range Breakdown ('0,000)", series = "[{targetAxisIndex:0},{targetAxisIndex:1}]", hAxis = "{title:'ipRange'}", vAxis = "{title:'Stakes'},{title:'PL'}", width = 'automatic', height = 'automatic')
line.gvis <- gvisLineChart(xvar = 'ipRange', yvar = names(lst[[1]])[-c(1:2)], data = lst[[1]], options = gvis.options)
plot(line.gvis)
rm(ipBets1)
```
*graph 3.4.1a* : *Asian Handicap - In-Play time range graph.*
```{r data-ip-summary-plots1b, echo = FALSE, results = 'asis'}
## Plot the Stakes and P&L on OU handicap graphs
## the stakes amount display as $1 = $10,000
gvis.options <- list(title = "Goal Line - In-Play Time Range Breakdown ('0,000)", series = "[{targetAxisIndex:0},{targetAxisIndex:1}]", hAxis = "{title:'ipRange'}", vAxis = "{title:'Stakes'},{title:'PL'}", width = 'automatic', height = 'automatic')
line.gvis <- gvisLineChart(xvar = 'ipRange', yvar = names(lst[[2]])[-c(1:2)], data = lst[[2]], options = gvis.options)
plot(line.gvis)
rm(line.gvis)
```
*graph 3.4.1b* : *Goal Line - In-Play time range graph.*
The graphs above show the In-Play stakes: the first time range `(0,10]` attracts the most stakes, while the stakes start dropping from `(55,60]`. The `<NA>` category includes all stakes placed while the players are not playing on the field (Pre-game, Half-Time, Full-Time, Extra-Time, injury time, break time etc.).
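Interval labels such as `(0,10]` are what base R's `cut()` produces when it bins a numeric variable. The sketch below assumes `ipRange` was built in a similar way; the source does not show that step, and the minutes and break points here are invented for illustration.

```r
## Hypothetical minutes at which bets were placed; NA = not during play.
minute <- c(3, 10, 12, 58, 90, NA)

## cut() labels each value with a right-closed interval such as "(0,10]".
## Missing values stay NA, which a table can report as its own category.
ip_range <- cut(minute, breaks = seq(0, 90, by = 10))
table(ip_range, useNA = 'ifany')
```

Because the intervals are right-closed, a bet placed exactly at minute `10` falls into `(0,10]` rather than `(10,20]`.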
## 3.5 Summarise the In-Play Staking Based on Current Score
```{r data-ip-summary-table2a, echo = FALSE, results = 'asis'}
## In-Play : time range, current score, current ah/ou and also picked home team or favorite team etc.
funs <- expression(sum, mean, median, sd, length)
ipBets2 <- llply(funs, function(x)
ddply(dat, ~ AHOU + CurScore + ipHCap, .drop = TRUE, colwise(.fun = x, .cols = ~ Stakes + Return + PL))) %>% join_all(by = c('AHOU', 'CurScore', 'ipHCap')) %>% .[-c((ncol(.) - 1):ncol(.))] %>% data.frame
names(ipBets2) <- suppressWarnings(c('AHOU', 'CurScore', 'ipHCap', 'Stakes', 'Return', 'PL', 'S.mean', 'R.mean', 'PL.mean', 'S.median', 'R.median', 'PL.median', 'S.sd', 'R.sd', 'PL.sd', 'Count'))
ipBets2 %<>% map_if(is.numeric, round, 2) %>% data.frame %>% mutate(R.percent = sprintf(Return / Stakes * 100, fmt = '%1.4f%%'), PL.percent = sprintf(PL / Stakes * 100, fmt = '%1.4f%%')) %>% tbl_df
lst <- ipBets2 %>% dlply(.(AHOU))
lst[[1]] %>% datatable(
caption = "Table 3.5.1A : Asian Handicap - In-Play Current Score and Handicap Breakdown ('0,000)",
escape = FALSE, filter = "top", rownames = FALSE,
extensions = list("ColReorder" = NULL, "RowReorder" = NULL,
"Buttons" = NULL, "Responsive" = NULL),
options = list(dom = 'BRrltpi', autoWidth = TRUE, scrollX = TRUE,
lengthMenu = list(c(10, 50, 100, -1), c('10', '50', '100', 'All')),
ColReorder = TRUE, rowReorder = TRUE,
buttons = list('copy', 'print',
list(extend = 'collection',
buttons = c('csv', 'excel', 'pdf'),
text = 'Download'), I('colvis'))))
rm(funs)
```
*table 3.5.1a* : `r paste0(dim(lst[[1]]), collapse = ' x ')` : *Asian Handicap - In-Play state-space staking breakdown table.*
```{r data-ip-summary-table2b, echo = FALSE, results = 'asis'}
lst[[2]] %>% datatable(
caption = "Table 3.5.1B : Goal Line - In-Play Current Score and Handicap Breakdown ('0,000)",
escape = FALSE, filter = "top", rownames = FALSE,
extensions = list("ColReorder" = NULL, "RowReorder" = NULL,
"Buttons" = NULL, "Responsive" = NULL),
options = list(dom = 'BRrltpi', autoWidth = TRUE, scrollX = TRUE,
lengthMenu = list(c(10, 50, 100, -1), c('10', '50', '100', 'All')),
ColReorder = TRUE, rowReorder = TRUE,
buttons = list('copy', 'print',
list(extend = 'collection',
buttons = c('csv', 'excel', 'pdf'),
text = 'Download'), I('colvis'))))
```
*table 3.5.1b* : `r paste0(dim(lst[[2]]), collapse = ' x ')` : *Goal Line - In-Play state-space staking breakdown table.*
The tables above give a more detailed breakdown of the In-Play stakes, including the current score and the current handicap conceded/given during In-Play, while `<NA>` during a Break means Break-Time, pre-Extra-Time etc. The dimensions of the complete data are `r paste0(llply(lst, dim)[[1]], collapse = ' x ')` for AH and `r paste0(llply(lst, dim)[[2]], collapse = ' x ')` for OU.
```{r data-ip-summary-plots2a, echo = FALSE, results = 'asis'}
## Plot the Stakes and P&L on AH handicap graphs
## the stakes amount display as $1 = $10,000
lst <- ipBets2 %>% select(AHOU, CurScore, ipHCap, Stakes, Return, PL) %>% dlply(.(AHOU), gather, Type, Stakes, Stakes:PL) %>% llply(unite, Category, ipHCap, Type, sep = '.') %>% llply(spread, Category, Stakes) %>% llply(., function(x) tbl_df(data.frame(x)))
gvis.options <- list(title="Asian Handicap - In-Play Current Score and Handicap Breakdown ('0,000)", series = "[{targetAxisIndex:0},{targetAxisIndex:1}]", hAxis = "{title:'CurScore'}", vAxis = "{title:'Stakes'},{title:'PL'}", width = 'automatic', height = 'automatic')
line.gvis <- gvisLineChart(xvar='CurScore', yvar = names(lst[[1]])[-c(1:2)], data = lst[[1]], options = gvis.options)
plot(line.gvis)
rm(ipBets2)
```
*graph 3.5.1a* : *Asian Handicap - In-Play state-space graph.*
```{r data-ip-summary-plots2b, echo = FALSE, results = 'asis'}
## Plot the Stakes and P&L on OU handicap graphs
## the stakes amount display as $1 = $10,000
gvis.options <- list(title = "Goal Line - In-Play Current Score and Handicap Breakdown ('0,000)", series = "[{targetAxisIndex:0},{targetAxisIndex:1}]", hAxis = "{title:'CurScore'}", vAxis = "{title:'Stakes'},{title:'PL'}", width = 'automatic', height = 'automatic')
line.gvis <- gvisLineChart(xvar = 'CurScore', yvar = names(lst[[2]])[-c(1:2)], data = lst[[2]], options = gvis.options)
plot(line.gvis)
rm(line.gvis)
```
*graph 3.5.1b* : *Goal Line - In-Play state-space graph.*
Section 3 summarises the investment of firm A in breakdown tables and graphs. Basically, soccer sports investment needs to consider the criteria below :
- **Handicap** in [3.2 Summarise the Staking Handicap]
- **Price** in [3.3 Summarise the Staking Prices]
- **Timing** in [3.4 Summarise the In-Play Staking Timing]
- **Current Score** in [3.5 Summarise the In-Play Staking Based on Current Score]
The linear model in the later sections will also use the above criteria for investment. You can also refer to my previous research, [Odds Modelling and Testing Inefficiency of Sports-Bookmakers](https://www.dropbox.com/sh/ifwczokjptt6re0/AADv1VarJoQ6IgIitZBzG5c6a?dl=0).
# [4. Staking Ⓜodel](http://rstudio-pubs-static.s3.amazonaws.com/208636_147313e52754447e89b38cc3456c5009.html#staking-odel)
- Section [4.1 Basic Equation](http://rstudio-pubs-static.s3.amazonaws.com/208636_147313e52754447e89b38cc3456c5009.html#basic-equation) - Analyse the Odds Price and Probabilities
- Section [4.2 Linear Ⓜodel](http://rstudio-pubs-static.s3.amazonaws.com/208636_147313e52754447e89b38cc3456c5009.html#linear-odel) - Reverse-engineer the EMOdds derived from Stakes
- Section [4.3 Kelly Ⓜodel](http://rstudio-pubs-static.s3.amazonaws.com/208636_147313e52754447e89b38cc3456c5009.html#kelly-odel) - Test the Kelly Model.
- Section [4.4 Poisson Ⓜodel](http://rstudio-pubs-static.s3.amazonaws.com/208636_147313e52754447e89b38cc3456c5009.html#poisson-odel) - Soccer Scores, Odds Price and Stakes modelling.
- Section [4.5 Staking Ⓜodel and Ⓜoney Ⓜanagement](http://rstudio-pubs-static.s3.amazonaws.com/208636_147313e52754447e89b38cc3456c5009.html#staking-odel-and-oney-anagement) - Simulate the staking model.
- Section [4.6 Expectation Ⓜaximization and Staking Simulation](http://rstudio-pubs-static.s3.amazonaws.com/208636_147313e52754447e89b38cc3456c5009.html#expectation-aximization-and-staking-simulation) - Enhance by weighted function on Staking model and Simulation.
# [5. ®esult](http://rstudio-pubs-static.s3.amazonaws.com/208636_147313e52754447e89b38cc3456c5009.html#esult)
- Section [5.1 Comparison of the ®esults](http://rstudio-pubs-static.s3.amazonaws.com/208636_147313e52754447e89b38cc3456c5009.html#comparison-of-the-esults) - Comparison of the Returns of Staking Models.
- Section [5.2 Ⓜarket Basket](http://rstudio-pubs-static.s3.amazonaws.com/208636_147313e52754447e89b38cc3456c5009.html#arket-basket) - Analyse the Hedging or Double up Invest by Firm A.
```{r, echo = FALSE, results = 'asis'}
## Set options back to original options
options(op)
```
# [6. Conclusion](http://rstudio-pubs-static.s3.amazonaws.com/208636_147313e52754447e89b38cc3456c5009.html#conclusion)
- Section [6.1 Conclusion](http://rstudio-pubs-static.s3.amazonaws.com/208636_147313e52754447e89b38cc3456c5009.html#conclusion-1) - Conclusion of this Research Paper.
- Section [6.2 Future Works](http://rstudio-pubs-static.s3.amazonaws.com/208636_147313e52754447e89b38cc3456c5009.html#future-works) - Future Research or Enhancement.
# [7. Appendices](http://rstudio-pubs-static.s3.amazonaws.com/208636_147313e52754447e89b38cc3456c5009.html#appendices)
- Section [7.1 Documenting File Creation](http://rstudio-pubs-static.s3.amazonaws.com/208636_147313e52754447e89b38cc3456c5009.html#documenting-file-creation) - Information of the Paper.
- Section [7.2 Versions' Log](http://rstudio-pubs-static.s3.amazonaws.com/208636_147313e52754447e89b38cc3456c5009.html#versions-log) - Version Log of the Paper.
- Section [7.3 Speech and Blooper](http://rstudio-pubs-static.s3.amazonaws.com/208636_147313e52754447e89b38cc3456c5009.html#speech-and-blooper) - Speech and Blooper during Conducting the Research.
- Section [7.4 References](http://rstudio-pubs-static.s3.amazonaws.com/208636_147313e52754447e89b38cc3456c5009.html#references) - Reference Papers for the Paper.
## 7.1 Documenting File Creation
It's useful to record some information about how your file was created.
- File creation date: 2015-07-22
- File latest updated date: `r Sys.Date()`
- `r R.version.string`
- R version (short form): `r getRversion()`
- [**rmarkdown** package](https://github.com/rstudio/rmarkdown) version: `r packageVersion('rmarkdown')`
- [**tufte** package](https://github.com/rstudio/tufte) version: `r packageVersion('tufte')`
- File version: 1.0.1
- Author Profile: [®γσ, Eng Lian Hu](https://beta.rstudioconnect.com/englianhu/ryo-eng/)
- GitHub: [Source Code](https://github.com/scibrokes/betting-strategy-and-model-validation)
- Additional session information
```{r info, echo = FALSE, warning = FALSE, results = 'asis'}
suppressMessages(require('dplyr', quietly = TRUE))
suppressMessages(require('formattable', quietly = TRUE))
lubridate::now()
sys1 <- devtools::session_info()$platform %>% unlist %>% data.frame(Category = names(.), session_info = .)
rownames(sys1) <- NULL
sys1 %>% formattable %>% as.htmlwidget
data.frame(Sys.info()) %>% mutate(Category = rownames(.)) %>% .[2:1] %>% rename(Category = Category, Sys.info = Sys.info..) %>% formattable %>% as.htmlwidget
rm(sys1)
```
**Powered by - Copyright® Intellectual Property Rights of <img src='figure/oda-army.jpg' width='24'> [Scibrokes®](http://www.scibrokes.com)個人の経営企業**