---
title: '**The Top (0.4%) Tail of Finland**: A Tale of Incomes, Tax Rates, and Elasticities from 2009 to 2013'
author: "Kyle Ott and Cornelius Schneider"
date: "12 December 2014"
output:
  pdf_document:
    number_sections: yes
    toc: yes
bibliography:
- main1.bib
- packages1.bib
---
\pagebreak
```{r, include=FALSE, echo=FALSE, warning=FALSE, message=FALSE}
pkgs <- c('httr', 'dplyr', 'XML', 'ggplot2', 'stringr', 'car', 'devtools', 'rsdmx', 'stargazer', 'knitr', 'tidyr', 'reshape2', 'sandwich', 'lmtest', 'plm' )
repmis::LoadandCite(pkgs, file = 'packages1.bib')

# set your working directory here
setwd("/Users/Kyle/Dropbox/!Fall_2014/Collab_Data/Final_Project/")

##################################
# Final Assignment: Data Science Course
# Kyle Ott & Cornelius Schneider
# December 12, 2014
##################################

# Load packages
library(httr)
library(dplyr)
library(XML)
library(ggplot2)
library(stringr)
library(car)
library(devtools)
library(rsdmx)
library(stargazer)
library(knitr)
library(tidyr)
library(reshape2)
library(sandwich)
library(lmtest)
library(plm)

load("cleaned.RData")

```

# Introduction

The question of an “optimal” income tax is usually discussed against the background of classic economic theories: income taxation should maximize a given social welfare function that reflects a society’s preference for equality. Furthermore, the sacrifice theory of income taxation holds that redistribution should take place up to the point where marginal utilities are equalized. However, these theories neglect *behavioral responses* to taxation. According to the Laffer curve, there comes a point where a further increase in the tax rate results in a loss of tax revenue due to negative labor supply responses. In a relatively recent work, @saez2004reported identifies additional reasons why trends in top-income shares are correlated with tax rates: evasion, avoidance, and bargaining responses.

Indeed, the rationale behind disruptive changes in income taxation schemes, such as the heavy reductions in marginal income tax rates in the US in the 1980s, rested almost exclusively on supply-side economics: lower tax rates were believed to trigger substantial increases in economic activity and therefore higher tax revenues. Against this background, many researchers focused their analysis only on behavioral responses such as labor supply, savings, and retirement. The current research frontier challenges this emphasis on supply-side economics and steps beyond those “conventional” behavioral responses. @saez2004reported states that the additional behavioral responses, such as tax-deductible activities, the form of compensation (e.g. wages versus untaxed fringe benefits), unmeasured effort, career choices, saving decisions, and compliance, have “substantial effects on economic activity of high-income earners” (p. 118). Ultimately, these determinants of reported income lead to more elastic responses to taxation than initially assumed [@piketty2014inequality].

According to @giertz2007elasticity, this effect is driven mainly by high-income earners: analysis of the US tax reforms of the 1980s provides strong evidence that highly paid employees in particular were able to retime (i.e. temporally shift) their compensation to take advantage of the reforms. Moreover, and apparently also related to the behavior of the top tail of the income distribution, the timing of capital gains realizations seems to be highly sensitive to changes in capital gains tax rates [@auerbach1988capital]. Finally, cuts in the top individual tax rate to below the corporate tax rate triggered a massive shift of corporate income towards the individual income sector (legal entities that are taxed only at the individual level) [@auerbach1997economic]. All in all, the evidence strongly suggests that taxpayers with high incomes are much more responsive to tax changes than individuals in the middle class.

The relevance of these mechanisms is illustrated by the share of total income that top earners account for: in the US, for example, the top 1% received almost 20% of total income in 2010 [@alvaredo2013top]. This shows how tax-burden-minimizing behavior by the ultra-wealthy can be an enormous source of inefficiency for a whole economy. In our opinion, these patterns need to be discussed not only from a purely economic perspective (welfare loss, spillover effects, etc.), but also with regard to the equality of the income distribution. Therefore, a detailed and well-justified analysis of the actual mechanisms is essential.

Whereas poverty is studied extensively in economics through surveys and welfare programs, the current debate still lacks information about the top of the income distribution. The aim of this project is to inspect exactly this upper end. To do so, we analyze micro-level data on Finnish taxpayers for the years 2009 to 2013.

In Finland, the tax on earned income is levied according to a progressive tax scale: each taxpayer pays a basic amount that depends on earned income, plus the tax rate of the respective tax bracket applied to income within that bracket. The concrete tax schedule is decided annually by the parliament. The tax rates of the top tax bracket were as follows:

* 2009: 30.5%
* 2010: 30.0%
* 2011: 30.0%
* 2012: 29.75%
* 2013: 31.75%
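
The progressive scale described above can be written compactly. This is only a sketch in our own notation: the lower bracket limits $y_k$, the basic amounts $B_k$, and the bracket rates $\tau_k$ are symbols introduced here for illustration, and we assume that the bracket rate applies to income above the bracket's lower limit.

$$T(y) = B_k + \tau_k \, (y - y_k) \quad \text{for } y_k \le y < y_{k+1}$$

For incomes in the top bracket, $\tau_k$ corresponds to the rates listed above.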

Against this background, our paper’s purpose is to identify anomalies in the tax patterns of ultra-wealthy Finns. Specifically, our research question is: What is the responsiveness of reported taxable income to changes in average tax rates?

In a first step, we visualize how the share of the top 0.4% (approximately the 15,000 highest-income individuals) in total income-tax revenue changed between 2009 and 2013. We then use inferential statistics to dig deeper into income-tax payers’ behavior. With the elasticity of taxable income (ETI), we apply a broader measure than the labor supply elasticity alone in order to gain a better understanding of the efficiency costs of taxation. Finally, it is worth mentioning that many of these issues also apply to any tax base [@saez2012elasticity], which makes the topic all the more worth investigating.
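
For reference, the ETI can be written as follows. The notation ($z$ for reported taxable income, $\tau$ for the tax rate) is ours, and the definition is the standard one from the literature rather than anything specific to our data:

$$e = \frac{1 - \tau}{z} \, \frac{\partial z}{\partial (1 - \tau)} = \frac{\partial \log z}{\partial \log (1 - \tau)}$$

An elasticity of $e$ thus means that a 1% increase in the net-of-tax rate $1 - \tau$ is associated with an $e$% increase in reported taxable income.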

# Literature Review

Since the 1990s, several researchers have analyzed the elasticity of taxable income (ETI) in different settings, with different data and different methodologies. Most of these analyses focus on the United States due to data limitations; nonetheless, there are a few studies on Canada and Western Europe. The following section provides an overview, focusing on empirical approaches, data, and the respective findings.

The first attempt to investigate the share of reported aggregate income going to high earners was made by @feenberg1993income. Using **aggregated time-series data**, they calculated the share of adjusted gross income (AGI) received by the top 0.5 percent of US households in the period from 1951 to 1990. They identified an increase in this share from about 6% in 1970 to more than 12% in 1988. They attribute this to behavioral responses, since the pattern in the time series is consistent with the reductions in top marginal tax rates around a significant tax reform in 1986.

Subsequently, @slemrod1996high analyzed inequality for 1954 to 1990 using the same data as @feenberg1993income. In his time-series model, Slemrod used the high-income share of AGI and four other components of income as dependent variables. The explanatory variables were one-year lagged and one-year leading top tax rates for both individual income and capital gains. To control for exogenous income trends, he included a measure of earnings inequality between the 90th and the 10th percentiles, as well as macroeconomic variables such as the level of stock prices. Slemrod found no evidence that changes in the top tax rate explain the high-income share of AGI until 1985; rather, wage inequality seems to be the underlying driver. For the period from 1985 to 1990, however, he found that the high-income share of AGI was most likely driven by changes to the structure of the tax base: different incentives and opportunities to report income were introduced for high-income earners.

@saez2004reported uses data from the *Internal Revenue Service* (IRS) from 1960 to 2000 to regress the log of average income of the top 1% of the distribution on the log of the average net-of-tax rate for the top 1% over the period. Even though he included time trends and other controls, he found large elasticities. Like @slemrod1996high, he concludes that a large proportion of the rise in top incomes can be explained by income shifting. Nonetheless, he was not able to determine how much of the change in top income shares is explained by shifting, or whether other, non-tax-related increases in earnings are at work.

Whereas the previous paragraphs presented aggregated time-series methodologies, the following paragraphs cover the literature using **panel data**. @moffitt1998taxation analyzed the same period around the Tax Reform Act of 1986 but, in contrast to previous studies, were among the first to use panel data. Due to data constraints, their study covered AGI rather than taxable income. In a two-stage least-squares regression, they used instruments for the change in the net-of-tax rate such as education and illiquid assets. These instruments succeeded in distinguishing between high-income taxpayers and the balance of the population. Taking different controls into account, their major finding was that the high-income tax cut of 1986 and the corresponding increase in the taxable income of ultra-wealthy people were **not** accompanied by an increase in hours worked.

@gruber2002elasticity wrote a seminal, highly influential paper whose methodological framework has been used by multiple subsequent papers. Again, the period around the Tax Reform Act of 1986 (1979-1990) was the subject of the analysis. They examined both broad income responses and taxable income responses for all income levels by measuring behavioral changes over three-year intervals. Moreover, they instrumented the net-of-tax rate assuming that “each filer’s income grows at the rate of overall nominal income growth between the base and the subsequent year” [@saez2012elasticity, p. 38] and included a rich set of controls such as year fixed effects and dummies for marital status. For broad income they estimated an elasticity of 0.12, which is considerably smaller than that of taxable income at 0.65. This result suggests that much of the taxable income response indeed comes through channels such as deductions, exemptions, and exclusions. They conclude that this is evidence of significant efficiency costs caused by avoidance opportunities in the current tax system.

Along the same lines, @kopczuk2005tax runs a slightly different model and finds a post-1986 ETI of 0.36. His results imply that the availability of deductions is linked to behavioral responses: without access to deductions, a taxpayer does not respond to changes in tax rates [-@kopczuk2005tax]. @giertz2007elasticity examined a relatively large panel data set from 1979 to 2001, applying the methodology laid down by @gruber2002elasticity. Again, the estimated ETI of broad income (0.12) is considerably lower than that of taxable income (0.39), consistent with the finding of @kopczuk2005tax that it is the availability of deductions and exemptions that determines the ETI.

In general, one can conclude that these studies found a) patterns in behavioral responses consistent with tax reforms, and b) that those behavioral responses, reflected in high income shares, are due not to labor supply responses but rather to access to avoidance opportunities. Nonetheless, @saez2012elasticity outline some overarching methodological issues: the models cannot adequately control for exogenous income trends, which biases the ETI estimates. Furthermore, the models fail to identify potentially important types of income shifting, such as shifting between the individual and corporate income tax bases.

Given that this paper’s subject of interest is Finland, it is worth noting two recent papers on Denmark that made use of the especially rich data available in all Scandinavian countries, which include a variety of demographic variables that are not available in the US. Another advantage is that the income distribution in Scandinavia has been relatively stable compared to other parts of the world, which makes it easier to identify the effects of income taxation.
@jacobsen2011estimating analyzed the period from 1984 to 2005 using the methodology of @gruber2002elasticity and controlled for a rich set of base-year incomes. The elasticities they identified were modest compared to the previous studies. Nonetheless, they observed some important structural patterns: elasticities are larger for the self-employed than for employees, and at high income levels (top quintile) the elasticities are two to three times larger than in the bottom quintile of the distribution. Finally, @chetty2011adjustment estimated bunching around kink points, also exploiting population tax files from Denmark. Their key finding is that large changes in marginal tax rates are associated with large elasticities, whereas elasticities are small for small tax changes. They attribute this effect to large adjustment costs in the Danish tax system.

# Our Unique, Tidy, Open, Reproducible Dataset

The Nordic countries of Finland, Sweden, and Norway have a tradition of publishing everyone’s income and tax details every year. Whereas in Sweden and Norway these data are accessible only to citizens and only after filing an official request, in Finland top earners are effectively public figures as a result of heightened media scrutiny. Finland’s largest online business daily newspaper, @taloussanomat, published the figures in a format suitable for our purpose: the top 15,000 income earners over the age of 18 are listed on a yearly basis from 2009 to 2013, including their name, total income (income, profits, and capital gains), taxes paid, and average tax rate.
In a first step, we scraped the data from their [dedicated website](http://www.taloussanomat.fi/verotiedot/2009/suurituloisimmat/). Since some observations are missing in the first three years (2009 to 2011), the final data set contains 70,402 observations. The scraped variables are: full name, total income (in euros), total taxes paid (in euros), average tax rate (in percent), and the tax year. We also created two new variables that are helpful for our analysis: each individual’s income rank in a given year and the share of total income-tax revenue paid by the top 15,000 taxpayers. Furthermore, we created two different datasets for analysis, a time-series and a panel dataset (more in the methodology section). The panel dataset uses the taxpayers’ names as unique identifiers, which allows us to track 4,867 individuals across all 5 years.

To estimate the share of the population that our top income earners comprise, we used data from @oecdstatspopulation on the working population of Finland in each year (3,547,335 in 2009, shrinking to 3,508,645 by 2013). We then divided the number of observations in our dataset in a given year by the working population in that year. Thus, we arrive at a dataset consisting of the top 0.4% of income earners in Finland each year (note that for 2009 there are missing observations, so the number of observations is lower, resulting in a population share of 0.34%). Moreover, we are able to track the share of total income-tax revenue paid by the top 0.4 percent over our time period. In the figure below, the share increases from about 5 percent to almost 8 percent over the course of five years. The year-to-year movements coincide with the changes in the top income tax rate, which likely contributed to the changes in the tax revenue collected from the top 0.4 percent. Most importantly for our analysis, the graph shows a clear shift in the share from 2012 to 2013 (the year of the tax reform). This provides a simple yet powerful graphical analysis in our attempt to flesh out latent patterns in taxpayers’ behavior given our dataset.
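
As a rough check of the 0.4% figure, the calculation for 2013 can be sketched as follows, assuming that the 2013 list contains the full 15,000 observations (an assumption on our part for this illustration):

$$\frac{15{,}000}{3{,}508{,}645} \approx 0.0043 \approx 0.4\%$$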

```{r, include=FALSE, echo=FALSE, warning=FALSE, message=FALSE}
# figure on shares
# clean$total2009 <- with(clean, sum(clean[yr2009==1, "taxes_paid"]))
# clean$total2010 <- with(clean, sum(clean[yr2010==1, "taxes_paid"])) 
# clean$total2011 <- with(clean, sum(clean[yr2011==1, "taxes_paid"]))  
# clean$total2012 <- with(clean, sum(clean[yr2012==1, "taxes_paid"])) 
# clean$total2013 <- with(clean, sum(clean[yr2013==1, "taxes_paid"])) 

# clean$share2009 <- clean$total2009/clean$Total_Tax_Revenue
# clean$share2010 <- clean$total2010/clean$Total_Tax_Revenue 
# clean$share2011 <- clean$total2011/clean$Total_Tax_Revenue 
# clean$share2012 <- clean$total2012/clean$Total_Tax_Revenue 
# clean$share2013 <- clean$total2013/clean$Total_Tax_Revenue 
# since we only have 5 datapoints, easier to graph manually...

share <- c(0.04898281, 0.06022747, 0.06752648, 0.06347982, 0.0770184)
year <- c(2009, 2010, 2011, 2012, 2013)
shares <- data.frame(year, share)

# dev.off()  # not needed inside a knitr chunk; closing the device here can interfere with plot capture
```
```{r, results='asis', echo=FALSE, error=FALSE, warning=FALSE}
# qplot() has no 'caption' argument; 'main' sets the plot title
sharefigure <- qplot(shares$year, shares$share, main='Top 0.4% Share of Total Paid Finnish Taxes\n', ylim=c(0.04, 0.08), geom='line', ylab='Total Finnish Taxes Share Paid by Top 0.4%\n', xlab='\nYear')
sharefigure + theme_bw(base_size = 13)
```

```{r, include=FALSE, echo=FALSE, warning=FALSE, message=FALSE}
cleaned2 <- group_by(cleaned, year)


obs_all <- tally(cleaned2)
t1_totinc <- summarise(cleaned2, mean1=mean(total_inc),median1=median(total_inc))
t1_tottax <- summarise(cleaned2, mean2=mean(taxes_paid),median2=median(taxes_paid))
t1_rat <- summarise(cleaned2, mean3=mean(ratio),median3=median(ratio))

t1a <- merge(obs_all, t1_totinc,
                  by = c('year'))
t1b <- merge(t1a, t1_tottax,
             by = c('year'))
t1c <- merge(t1b, t1_rat,
             by = c('year'))

load("clean.RData")

obs_all_sum <- tally(clean)
t1_totinc_sum <- summarise(clean, mean1=mean(total_inc),median1=median(total_inc))
t1_tottax_sum <- summarise(clean, mean2=mean(taxes_paid),median2=median(taxes_paid))
t1_rat_sum <- summarise(clean, mean3=mean(ratio),median3=median(ratio))

obs_all_sum$year <- c('All')
t1_totinc_sum$year <- c('All')
t1_tottax_sum$year <- c('All')
t1_rat_sum$year <- c('All')

t1a_sum <- merge(obs_all_sum, t1_totinc_sum,
             by = c('year'))
t1b_sum <- merge(t1a_sum, t1_tottax_sum,
             by = c('year'))
t1c_sum <- merge(t1b_sum, t1_rat_sum,
             by = c('year'))

summarytableTS <- rbind(t1c, t1c_sum)
```

```{r, results='asis', echo=FALSE, error=FALSE, warning=FALSE}
knitr::kable(summarytableTS, align ='c', digits = 0, format='latex', 
             col.names=c("Year", "N", "Mean Inc (EUR)", "Med Inc (EUR)",
                         "Mean Paid (EUR)", "Med Paid (EUR)",
                         "Mean Tax (%)", "Med Tax (%)"))
```

The table above provides a quick overview of the mean and median of the relevant variables by year. As expected, the median income and median taxes paid fall below the respective means in every year, indicating that the highest earners (and taxpayers) skew the data. Only in 2013 do we observe two extreme outliers (see figure below): two tech multi-millionaires working for an iOS [video game developer](http://www.supercell.net/about), which currently has the two top-grossing iPad games in 122 countries.

```{r, results='asis', echo=FALSE, error=FALSE, warning=FALSE}
load("clean.RData")
# shows the income outliers in 2013
qplot(year, total_inc, data=clean, main="Total Income by Year Highlighting 2013 Outliers", ylab = 'Total Annual Income') + theme_bw(base_size = 13)
```

For our analysis, we took the natural log of reported taxable income in order to obtain a more normal distribution. The plot below shows that across all five years, the logged incomes follow a similar distribution. The distribution is not as normal as we would like; however, since we are working with a specific subset, namely the ultra-wealthy, this is to be expected.

```{r, results='asis', echo=FALSE, error=FALSE, warning=FALSE}
clean$year2 <- as.character(clean$year)
# plot log total income (not taxes paid) so the data match the axis label and title
m2 <- ggplot(clean, aes(x = log(total_inc), y= ..count.., group=year2))
m2 + geom_density(aes(colour=year2)) +
  scale_colour_brewer(palette="Set1") +
  theme_bw(base_size = 13) +
  xlab("\nLog Total Income (Euros)") +
  ylab("Count\n") +
  xlim(8,18)+
  ggtitle("Observation Count of Log Total Income by Year\n")
```

The figure below shows observation counts of the average tax rate paid across the five years of our data. All five years follow a parallel, bimodal pattern, with peaks centered around 30 and 47 percent, respectively. We cannot disentangle behavioral responses from this graph alone, because we are not privy to the sources of our taxpayers’ taxable income. However, it is interesting to note that at the first peak 2012 and 2013 look identical, despite the tax reform taking effect on January 1, 2013. Even more interesting, 2012 has the highest number of observations at the second peak.

```{r, results='asis', echo=FALSE, error=FALSE, warning=FALSE}

clean$year2 <- as.character(clean$year)
m <- ggplot(clean, aes(x = ratio, y= ..count.., group=year2))
m + geom_density(aes(colour=year2)) +
  scale_colour_brewer(palette="Set1") +
  theme_bw(base_size = 13) +
  xlab("\nAverage Tax Rate (%)") +
  ylab("Count\n") +
  ggtitle("Observation Counts of Average Tax Rate by Year\n")
```

The next figure plots the log of total annual income against the average tax rate. It provides a good visual overview of our data, since we are interested in patterns between these two key variables. There are clearly multiple instances where high-income individuals have a low average tax rate (even instances with an average tax rate of zero), which is likely due to deductions or to tax avoidance, for example shifting income to capital gains.

```{r, results='asis', echo=FALSE, error=FALSE, warning=FALSE}
# plotting income against tax rate
qplot(ratio, log(total_inc), data=clean, ylab  = 'Log Total Annual Income', xlab ='Average Tax Rate', main = "Plot of Average Tax Rate Against Annual Total Income") + theme_bw(base_size = 13)
```

Finally, inspired by the methodology of @riihela2010trends, we look at the mean of the average tax rate by income decile bins across the five years. With the exception of 2011, the mean holds stable across all deciles. We cannot explain why the 2011 rate decreases as income grows. However, we are most interested in the between-year comparisons. The average tax rates for 2009 and 2010 always fall below the mean rates for 2012 and 2013, with 2013, as expected, having the highest mean across all deciles. In order to dig deeper into the data, we present our methodology and results in the following sections.
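
One way to compute mean tax rates by income decile within each year is sketched below. This is a minimal illustration on invented toy data, using non-overlapping bins built with `dplyr::ntile()`; it is not necessarily the exact binning behind our figure, and only the column names `year`, `total_inc`, and `ratio` follow our dataset.

```r
library(dplyr)

# Toy data (invented values): 100 hypothetical taxpayers in each of two years
set.seed(1)
toy <- data.frame(
  year      = rep(c(2012, 2013), each = 100),
  total_inc = exp(rnorm(200, mean = 12, sd = 1)),  # right-skewed incomes
  ratio     = runif(200, min = 20, max = 50)       # average tax rate in percent
)

decile_means <- toy %>%
  group_by(year) %>%
  mutate(decile = ntile(total_inc, 10)) %>%        # income deciles within each year
  group_by(year, decile) %>%
  summarise(mean_rate = mean(ratio), n = n(), .groups = "drop")

head(decile_means)
```

Because `ntile()` is evaluated within each year group, every year gets its own decile cut-offs, so the bins stay comparable when the income distribution shifts between years.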

```{r, echo=FALSE, warning=FALSE, message=FALSE}
## Creating graph, bin plot like Finnish top income paper
bins <- group_by(clean, year)

bins1 <- subset(bins, total_inc <= quantile(total_inc, 0.1))
binplot1 <- mutate(bins1, avetax1 = mean(ratio))
binplot1a <- summarise(binplot1, median(avetax1))

bins2 <- subset(bins, total_inc <= quantile(total_inc, 0.2))
binplot2 <- mutate(bins2, avetax2 = mean(ratio))
binplot2a <- summarise(binplot2, median(avetax2))

binplots <- merge(binplot1a, binplot2a,
               by = c('year'))

bins3 <- subset(bins, total_inc <= quantile(total_inc, 0.3))
binplot3 <- mutate(bins3, avetax3 = mean(ratio))
binplot3a <- summarise(binplot3, median(avetax3))

binplots <- merge(binplots, binplot3a,
                  by = c('year'))
                  
bins4 <- subset(bins, total_inc <= quantile(total_inc, 0.4))
binplot4 <- mutate(bins4, avetax4 = mean(ratio))
binplot4a <- summarise(binplot4, median(avetax4))

binplots <- merge(binplots, binplot4a,
                  by = c('year'))
bins5 <- subset(bins, total_inc <= quantile(total_inc, 0.5))
binplot5 <- mutate(bins5, avetax5 = mean(ratio))
binplot5a <- summarise(binplot5, median(avetax5))

binplots <- merge(binplots, binplot5a,
                  by = c('year'))
bins6 <- subset(bins, total_inc <= quantile(total_inc, 0.6))
binplot6 <- mutate(bins6, avetax6 = mean(ratio))
binplot6a <- summarise(binplot6, median(avetax6))

binplots <- merge(binplots, binplot6a,
                  by = c('year'))
bins7 <- subset(bins, total_inc <= quantile(total_inc, 0.7))
binplot7 <- mutate(bins7, avetax7 = mean(ratio))
binplot7a <- summarise(binplot7, median(avetax7))

binplots <- merge(binplots, binplot7a,
                  by = c('year'))
bins8 <- subset(bins, total_inc <= quantile(total_inc, 0.8))
binplot8 <- mutate(bins8, avetax8 = mean(ratio))
binplot8a <- summarise(binplot8, median(avetax8))

binplots <- merge(binplots, binplot8a,
                  by = c('year'))
bins9 <- subset(bins, total_inc <= quantile(total_inc, 0.9))
binplot9 <- mutate(bins9, avetax9 = mean(ratio))
binplot9a <- summarise(binplot9, median(avetax9))

binplots <- merge(binplots, binplot9a,
                  by = c('year'))

## melting and plotting the bins

binplots <- plyr::rename(x = binplots,
                      replace = c("median(avetax1)" = "1",
                                  "median(avetax2)" = "2",
                                  "median(avetax3)" = "3",
                                  "median(avetax4)" = "4",
                                  "median(avetax5)" = "5",
                                  "median(avetax6)" = "6",
                                  "median(avetax7)" = "7",
                                  "median(avetax8)" = "8",
                                  "median(avetax9)" = "9"
                      ))

binplots2 <- melt(binplots, id=c("year"))
binplots2 <- group_by(binplots2, year)

binplots2$year <- as.character(binplots2$year)

ggplot(data = binplots2, aes(x = variable,y = value, group=year)) +
          geom_line(aes(color = year)) + theme_bw(base_size = 13) +
          xlab("\nDecile Bin") +
          ylab("Average Tax Rate (%)\n") +
          ggtitle("Average Tax Rate by Deciles\n") +
          scale_colour_brewer(palette="Set1")
```

# Methodology

The methodology in this paper is derived from the literature described above, in particular the methodological framework of @saez2004reported for a time-series model and of @giertz2007elasticity for a panel analysis. Since both approaches are prominent in the recent literature and we were not sure how sensitive our data are to the respective model, we decided to include both in our paper. Due to data constraints, we had to adjust these models slightly, as explained in the following paragraphs.

In our **time-series model** we regress the log of average income of the top 15,000 Finnish income earners on the log of the average *net-of-tax rate* over the period 2009-2013 (5 observations). The *net-of-tax rate* is the share of total income that remains after taxes (net-of-tax rate = 1 - average tax rate).
@saez2004reported wrote about his model: “A simple OLS regression of log average incomes on the log of the net-of-tax rate, always displays insignificant elasticity coefficients. Therefore, the aggregate data displays no evidence of significant behavioral responses of reported incomes relative to changes in the average marginal tax rate” (p. 138). Following this reasoning, and in order to control for exogenous real income growth, we also added a time trend.
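
In equation form, the time-series specification with the time control can be sketched as follows. The symbols are our own shorthand: $z_t$ is the average income of the top 15,000 earners in year $t$, $\tau_t$ the average tax rate, and $e$ the elasticity of interest.

$$\log z_t = \alpha + e \, \log(1 - \tau_t) + \beta \, t + \varepsilon_t$$

Dropping the term $\beta \, t$ gives the version without the time control.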

For our **panel analysis** we follow the model laid down by @giertz2007elasticity and exploit the panel structure by comparing only individuals who filed tax returns in all years and were part of the top 15,000 income earners in every year from 2009 to 2013. This leaves us with 4,867 individuals. The dependent variable in this model is the log of total income in the future year (*income t+2*) divided by income in the base year (*income t*), where the future year is two years after the base year. Thus, we end up observing three pairs of years (2009/2011, 2010/2012, 2011/2013).
The key independent variable is the log of the average net-of-tax rate in the future year (t+2) divided by the average net-of-tax rate in the base year (t). We use these time differences in our variables to reduce endogeneity between the tax rate and income.
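
Written out, the panel specification described above takes the following form. The notation is ours: $z_{it}$ is the total income of individual $i$ in year $t$, $\tau_{it}$ the average tax rate, and $e$ the elasticity. The individual effect $\alpha_i$ applies to a fixed-effects version of the model; in a pooled OLS version it is replaced by a common intercept.

$$\log\left(\frac{z_{i,t+2}}{z_{it}}\right) = \alpha_i + e \, \log\left(\frac{1 - \tau_{i,t+2}}{1 - \tau_{it}}\right) + \varepsilon_{it}$$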

![(Source: Giertz 2007)](Figures/model3.png)
(Image Source: @giertz2007elasticity)

Our panel model differs from the original in that the future year is two rather than three years after the base year, for the sake of an additional year pair. Moreover, @giertz2007elasticity used several control variables, such as *marital status*, that we are not able to obtain due to data constraints.
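
The construction of the two-year pairs can be sketched as follows. This is a minimal illustration on invented toy data, not our actual data-preparation script: the column names `justname`, `year`, `total_inc`, and `ratio` follow our dataset, while `net_of_tax`, `dep`, and `indep` are built here under the assumption that `ratio` is the average tax rate in percent and that every individual is observed in consecutive years.

```r
library(dplyr)

# Toy panel (invented values): two hypothetical taxpayers observed 2009-2013
toy <- data.frame(
  justname  = rep(c("A", "B"), each = 5),
  year      = rep(2009:2013, times = 2),
  total_inc = c(100, 110, 120, 130, 140, 200, 190, 210, 220, 260) * 1000,
  ratio     = c(30, 30, 31, 31, 33, 45, 44, 44, 45, 47)  # average tax rate in percent
)

toy_pairs <- toy %>%
  mutate(net_of_tax = 1 - ratio / 100) %>%       # assumption: ratio is in percent
  group_by(justname) %>%
  arrange(year, .by_group = TRUE) %>%
  mutate(
    dep   = lead(total_inc, 2) / total_inc,      # income in t+2 over income in t
    indep = lead(net_of_tax, 2) / net_of_tax     # net-of-tax rate in t+2 over t
  ) %>%
  ungroup() %>%
  filter(!is.na(dep))                            # keeps base years 2009, 2010, 2011

toy_pairs
```

Each remaining row is one base-year/future-year pair, so `log(dep)` and `log(indep)` can be passed directly to a regression of the kind described above.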

Again, we run two versions of this model: first a simple pooled OLS and then, because of the presence of unobserved, time-invariant effects, a within estimation (fixed effects). One of the most limiting constraints of this project is the short time span available relative to previous research, which focuses entirely on long-term behavioral elasticities. Here we are only able to observe short-term elasticities. Moreover, we are limited to only one income group, namely the top tax bracket. The consequences of these limitations are discussed in the *Discussion* section.


# Results

## Time Series Results
Estimating the time-series model without a time control yields an ETI of -0.035, highly significant at the 1% level (with a confidence interval between -0.040 and -0.029). Interestingly, the sign of this effect changes when the time control is included (consistent with the literature). Moreover, the effect shrinks in magnitude to an ETI of 0.009, which is still significant at the 1% level (with a confidence interval between 0.003 and 0.014). Since the regressor is the log of the net-of-tax rate, this implies that a 1% increase in the average net-of-tax rate (i.e. a lower average tax rate) is associated with a 0.009% increase in reported taxable income, ceteris paribus. The time trend, which is meant to capture exogenous income growth, is also significant at the 1% level and increases the adjusted R squared from 0.002 to 0.246.


```{r, include=FALSE, echo=FALSE, warning=FALSE, message=FALSE}
load("cleaned.RData")
M1 <- lm(log(avg_inc) ~ log(net_of_tax), data = cleaned, na.action = NULL)
# we have strong autocorrelation, so Newey-West SEs would be preferable
# M1a <- coeftest(M1,vcov=NeweyWest)
# stargazer prints only the coefficients and SEs of the Newey-West results, no other info
M2 <- lm(log(avg_inc) ~ log(net_of_tax) + year,data = cleaned, na.action = NULL)
# due to problems with matrix calculations, we cannot compute Newey-West SEs for M2
# therefore, we will argue that our coefficients are right, but the SEs will be different
# Create cleaner covariate labels
labels <- c('Elasticity', 'Time Trend')
```

```{r, results='asis', echo=FALSE, error=FALSE, warning=FALSE}
stargazer::stargazer(M1, M2, covariate.labels = labels,
                     title = 'Time Series Elasticities',
                     column.labels = c("Without Time Trend", "With Time Trend"),
                     dep.var.labels = c("Log of Average Taxable Income"),
                     type = 'latex', header = FALSE)
```

## Panel Results

Running the first panel model, pooled OLS yields an ETI of 0.105, highly significant at the 1% level (with a confidence interval between 0.040 and 0.170). Knowing that panel-specific effects are present, we proceeded with a within estimator. This yields an ETI of 0.640, highly significant at the 1% level (with a confidence interval between 0.381 and 0.897). Since the regressor is the log change in the net-of-tax rate, this implies that a 1% increase in the average net-of-tax rate would increase reported taxable income by 0.640%.
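
To make the magnitude concrete, consider a purely hypothetical example (the tax rates are illustrative and not taken from our data). Raising the average tax rate from 30% to 31% lowers the net-of-tax rate from 0.70 to 0.69, a change of about $-1.43\%$. With the within estimate of 0.640, the implied change in reported taxable income $z$ is approximately:

$$\frac{\Delta z}{z} \approx 0.640 \times \frac{0.69 - 0.70}{0.70} \approx -0.9\%$$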


```{r, include=FALSE, echo=FALSE, warning=FALSE, message=FALSE, results='hide'}

# from creating panel dataset
load("panelmodel3.RData")
  
#OLS
ols1 <- lm(log(dep) ~ log(indep), data=panelmodel3)
summary(ols1)
confint(ols1)
# Durbin-Watson test for autocorrelation
# dwt(ols1)
# we do not have strong autocorr

# panelmodel3$id2 <- as.character(panelmodel3$id2)
# still can't get the panel to run using id2, using justname from now on

# Fixed

fixed <- plm(log(dep) ~ log(indep), data=panelmodel3, index=c("justname", "year"), model="within")
# summary(fixed)
# confint(fixed)
# Testing for fixed effects, null: OLS better than fixed
# pFtest(fixed, ols1) 

#BP LM test of cross-sectional dependence
# pcdtest(fixed, test = c("lm"))
# yes cross-sectional dependence?

# testing for serial correlation
# pbgtest(fixed)
# yes serial correlation?

# testing for heterosk, BP test
# bptest(dep ~ indep + factor(justname), data = panelmodel3, studentize=F)

# Create cleaner covariate labels
labels2 <- c('Elasticity')
```

```{r, results='asis', echo=FALSE, error=FALSE, warning=FALSE}
stargazer::stargazer(ols1, fixed, covariate.labels = labels2,
                     title = 'Panel Elasticities',
                     column.labels = NULL,
                     dep.var.labels = c("Log of Average Taxable Income"),
                     type = 'latex', header = FALSE)
```

#Discussion

While our estimated ETIs seem to fall in line with results found in the literature, we should nonetheless keep the following caveats in mind. This section highlights the weaknesses and problems of our model in detail. We structure the issues into *data*, *statistical*, and *contextual* difficulties. 

## Data Issues
We start with our most critical weakness: our data contain only *average* tax rates on **different aggregated** sources of income, not *marginal* tax rates on labor income alone. This significantly limits the interpretability of our results. First of all, we cannot open the “black box” and observe the actual behavior of ultra-wealthy Finns. For example, to what extent do they exploit loopholes, shift income, or time their reported income? In that sense we are only able to observe an aggregated effect of the behavioral responses that happen within the “black box.” Along the same lines, we cannot exclude the possibility that our results are driven by changes in income sources other than labor income; for instance, a change in the tax scheme for capital gains realizations could indeed have a significant impact on the behavior of individuals at the upper end of the income distribution. 

Nevertheless, we argue that there is evidence that marginal income taxation is a significant driver of behavioral responses, even when analysing only average tax rates: the “shares” plot (page 5) shows that the *share of the tax revenue paid by the top 0.4%* reacts consistently with changes in top marginal tax rates. After a relatively small cut in top tax rates in 2011, the share of total tax revenue paid by the top 0.4% decreases slightly, while after a 2% increase in top tax rates it increases significantly from 2012 to 2013. In addition, the top 15,000 individuals are all well above the top cut-off, so a change in the top marginal tax rate translates into a significant change in their total tax burden (due to the progressive nature of the Finnish tax scheme). Against this background, we believe that the average tax rate has at least *some* explanatory power. 

Moreover, in contrast to all previous studies, we are only able to observe five years (2009-2013), which only allows us to examine short-term responses to taxation. This further limits the interpretability of our results: if taxpayers retime their reported taxable income, e.g. from 2012 to 2013, we would observe a large *short*-term elasticity, but we would expect the long-term response to be much smaller. @saez2012elasticity suggest a way to address this: cross-country time-series analysis, which takes advantage of the varying timing of tax rate changes across countries. With more time and resources, this would be a legitimate and realistic approach. Not only do all Scandinavian countries publish their income tax data, they are also structurally quite similar (stable income distribution, etc.). Given tax reforms at different points in time, this approach has high potential for further findings on behavioral responses to income taxation. 

Furthermore, our top 15,000 income earners do not provide any information about the behavioral responses of other parts of the income distribution. Thus, we are unfortunately not able to relate the estimated results for the top tail of income earners to the rest of the population. 

## Statistical Issues

In both our models, the time-series and the panel analysis, we found the expected issues of heteroskedasticity, autocorrelation and cross-sectional dependence. While these problems do not bias the coefficients themselves, they bias the estimated standard errors, so the usual t- and F-tests are no longer valid. Thus, we should be careful in interpreting the significance of our results: although all estimates are significant at the 1% level, this might no longer be the case once the standard errors are corrected. Unfortunately, it was beyond the scope of our abilities to run the time-series and panel models with Newey-West corrected standard errors, so we leave this for further analysis. 
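
For such further analysis, one possible route is sketched below on simulated data. It is not a result from our dataset: the toy series, the AR(1) error process, and the lag length of 4 are assumptions chosen for illustration. The sketch uses `NeweyWest()` from the sandwich package and `coeftest()` from the lmtest package, both of which we already load. It leaves the coefficients unchanged and only replaces the standard errors.

```r
library(sandwich)
library(lmtest)

set.seed(1)

# Hypothetical toy series with autocorrelated errors (all values assumed)
n          <- 200
net_of_tax <- runif(n, min = 0.45, max = 0.60)
ar_error   <- as.numeric(arima.sim(list(ar = 0.6), n = n, sd = 0.05))
avg_inc    <- exp(12 + 0.3 * log(net_of_tax) + ar_error)
toy        <- data.frame(avg_inc = avg_inc, net_of_tax = net_of_tax)

# Same log-log form as the time-series model without a trend
fit <- lm(log(avg_inc) ~ log(net_of_tax), data = toy)

# Conventional OLS inference
ols_table <- coeftest(fit)

# Newey-West (HAC) covariance matrix; the lag of 4 is an arbitrary choice
nw_vcov  <- NeweyWest(fit, lag = 4, prewhite = FALSE)
nw_table <- coeftest(fit, vcov. = nw_vcov)

print(ols_table)
print(nw_table)
```

The corrected standard errors could then be passed to stargazer through its `se` argument, for example `se = list(sqrt(diag(nw_vcov)))`, so that the regression table reports them instead of the default ones.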

## Issues in Context

At this point it is important to emphasize the underlying assumptions of the models in use, as summarized by @saez2012elasticity: individuals respond immediately and permanently, have a perfect understanding of the tax structure, and choose their behavior knowing the exact realization of their potential income. Against the background of the Finnish tax system, where the concrete tax scheme is decided anew each year by parliament, these assumptions are highly contestable. Adjusting income behavior every year requires considerable effort and thus comes with high adjustment costs. It is also questionable whether individuals actually fully understand the tax structure. Most likely the upper tail of the income distribution hires the best tax consultants, but these are only assumptions that cannot be examined in this paper. 

In addition, with more time we could have developed a much deeper understanding of the actual Finnish tax system: how different income sources were affected by tax reforms and how the tax base potentially changed in those reforms. A change in the tax base, for example, significantly constrains the comparability of self-reported taxable income over time. We would also like to better understand how prone the Finnish tax system actually is to tax avoidance, evasion, and deductions, among other responses. Only by answering these questions would we gain a better understanding of whether these tax reforms had a substantial impact on behavior. Recently, @chetty2011adjustment conclude in their analysis of Denmark that small tax changes only trigger small elasticities, a fact to bear in mind when interpreting our results. 


#Conclusion

Our unique dataset has allowed us to take a first look at the top income earners in Finland, thanks to the openness of the Finnish administration and to the finance newspaper that hosts the data in table format. Our conclusion is that there seem to be underlying mechanisms which trigger patterns of behavioral responses, as reflected in our estimated ETIs. Due to our data constraints, we faced special circumstances that qualify our findings. Our short time span allows us to identify only short-run changes. Furthermore, the lack of important control variables at the micro level certainly weakens our results. These limitations, and the ones discussed previously, are also faced in the literature. There is simply a lack of high-quality data, and therefore of research, relative to other fields that are similarly important to welfare economics. It is also important to note the potentially high efficiency costs of this lack of research on behavioral responses to taxation among high-income earners. If more countries followed the example set in Scandinavia in providing and publishing tax data, the field would benefit greatly, as there is still much to be learned. Countries publishing their data would help in the sense that we are trying to estimate an equation where both sides are unknown. The biggest challenge in the field of taxation is that researchers are working blindly, unable to officially observe either the income of taxpayers or their payments to tax offices.

***
Final word count: 5,593

This project used @CiteRStudio and @R-rsdmx to create this assignment.

\pagebreak

#References