Including children age 17 in the child tax credit calculator possible error? #2630

Closed
JackLandry opened this issue Oct 27, 2021 · 15 comments
@JackLandry

JackLandry commented Oct 27, 2021

Part of the child tax credit is calculated as follows:

if CTC_include17:
    childnum = n24 + max(0, XTOT - n21 - n1820 - n24 - num)
else:
    childnum = n24
line1 = CTC_c * childnum + CTC_c_under6_bonus * nu06

It seems to me that the formula for childnum when CTC_include17=true is not correct. The total number of exemptions (basically the total number of people in the filing unit) minus the number of people age 21 and over, the number of people age 18-20, the number of people under age 17, and the indicator for marital status (2 when married filing jointly, otherwise 1) seems like it will always be negative or zero. For instance, imagine a married couple with one child age 17. The formula would work out as follows: 3 - 2 - 0 - 0 - 2 = -1. Alternatively, a single-parent household with one child age 17: 2 - 1 - 0 - 0 - 1 = 0. Any children age 0-16 cancel out, because each one increases the exemptions and is then subtracted again through n24. Am I missing something, or is this an error? It relates to issue #2571.
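
The two examples above can be checked with a minimal sketch. The function name childnum_original is hypothetical; the expression and variable names are copied from the snippet quoted in this issue, and the examples assume (as the comment does) that all adults in the unit are age 21 or over.

```python
def childnum_original(XTOT, n21, n1820, n24, num, CTC_include17=True):
    """Original childnum expression as quoted in this issue.

    XTOT: total exemptions, n21: people age 21+, n1820: people age 18-20,
    n24: CTC-eligible children (under 17), num: 2 if married filing
    jointly, otherwise 1.
    """
    if CTC_include17:
        return n24 + max(0, XTOT - n21 - n1820 - n24 - num)
    return n24


# Married couple with one child age 17: 3 - 2 - 0 - 0 - 2 = -1
married = childnum_original(XTOT=3, n21=2, n1820=0, n24=0, num=2)

# Single parent with one child age 17: 2 - 1 - 0 - 0 - 1 = 0
single = childnum_original(XTOT=2, n21=1, n1820=0, n24=0, num=1)

# Same married couple plus two children under 17: the two extra
# exemptions are offset by n24, so the age-17 term is still -1
married_plus_two = childnum_original(XTOT=5, n21=2, n1820=0, n24=2, num=2)

print(married, single, married_plus_two)  # 0 0 2
```

In all three cases the 17-year-old is never added to childnum, which is the behavior the comment describes.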

@jdebacker
Member

cc'ing @MaxGhenis and @kpomerleau who may be able to help here (I know they've both thought about this).

@Thirdhuman

I remember noticing this issue before.
If I recall correctly, it's due to one of the nu# variables being misleadingly titled.

@JackLandry
Author

@Thirdhuman, n24 is misleadingly titled (it is the number of people under age 17), but in the (possible) error above I describe it correctly. Maybe I'm misunderstanding, and the total number of exemptions is not equivalent to the total number of people in the filing unit?

@martinholmer
Collaborator

@JackLandry and @Thirdhuman, This is a complicated issue (which is probably why you haven't heard much from the Tax-Calculator maintainers over the last two weeks). The first thing to always keep in mind is that there are many possibilities here: there could be a variable documentation error, there could be a Tax-Calculator coding error, there could be internally inconsistent (CPS and/or PUF) data being used by Tax-Calculator users, we could be confused, or there could be a combination of those things going on all at the same time.

Over the past few days, I've been tabulating dump output for 2021 generated by the tc CLI tool that comes with the Tax-Calculator taxcalc package. The taxcalc package has CPS data built into it and that is what I've been tabulating. I don't have access to the taxdata PUF data/weights that are used by some Tax-Calculator users. For example, the ARPA discussion between Cody and Kyle in issue #2571 was, almost certainly, based on their use of some kind of IRS-PUF-derived input data file.

I'll be sharing some of my tabulation results here in the coming days, but so far it seems to me that things are not right and that @JackLandry was right to raise an issue about this matter.

One question I have for the two of you: how are you using Tax-Calculator? On the web or on your own computer? If the latter, are you writing Python scripts or are you using the tc CLI tool? Are you looking at the CTC using CPS-derived data or PUF-derived data?

@JackLandry
Author

Glad to hear I'm not totally off-base! I've been using CPS data and the command line tool, though I construct my own tax units in the CPS rather than using the taxcalc built-in CPS data.

@martinholmer I agree that there could be something going on between IRS-PUF vs. CPS given the comments in #2571

Also, this issue has an easy workaround (changing ages directly in the data), so no worries about the delay. I could make a quick example if that's helpful.

@martinholmer
Collaborator

@JackLandry said:

I've been using CPS data and the command line tool, though I construct my own tax units in the CPS rather than using the taxcalc built-in CPS data.

That's very interesting.
Which CPS survey year are you using to construct your own tax units?
Are you extrapolating your tax unit data to future years in any way?

@martinholmer
Collaborator

martinholmer commented Nov 10, 2021

@JackLandry (Jack) raised issue #2630 by making this observation:

Part of the child tax credit is calculated as follows:

if CTC_include17:
    childnum = n24 + max(0, XTOT - n21 - n1820 - n24 - num)
else:
    childnum = n24
line1 = CTC_c * childnum + CTC_c_under6_bonus * nu06

It seems to me that the formula for childnum when CTC_include17=true
is not correct.
[snip]...gives some examples to illustrate his concerns...[snip]
Am I missing something, or is this an error?
It relates to issue #2571 [a discussion of ARPA reform provisions]

I have done some tabulation of 2021 data dumped from a current-law-policy
run executed by the Tax-Calculator tc CLI tool using the built-in CPS data,
which are constructed in the taxdata repository.

My conclusion is that the expression:

if CTC_include17:
    childnum = n24 + max(0, XTOT - n21 - n1820 - n24 - num)

is logically incorrect as Jack indicated.

The above statement generates estimates of age17 eligibles that are
extremely small: 0.140 million when considering all CPS tax units, and
0.015 million when tabulating only CPS tax units whose pre-CTC tax
liability is positive.

However, there is a logically correct method of estimating the number
of age17 eligibles, and that method generates more sensible estimates:
4.993 million when considering all CPS tax units (even those without
any tax liability), and 1.906 million when tabulating only CPS tax units
whose pre-CTC income tax liability is positive. The latter estimate is
close to the PUF-derived estimate of 2.0 million age17 eligibles mentioned
by Kyle Pomerleau (Tax Foundation) in an issue 2571 comment.

The logically correct method computes the number of dependents under
age 18 and then subtracts the number of CTC eligibles (who are under
age 17) to get an estimate of the number of dependents who are age 17.
The nu18 variable is the number of people (not dependents as some
people mistakenly assume) in the tax unit who are under age 18. So, to
get the number of dependents who are under age 18, we need to subtract
from nu18 one if the unit's taxpayer is under age 18 (using the age_head
variable) and subtract another one if the unit's spouse is under age 18 (using
the age_spouse variable).

One way to code the logically correct method is this:

if CTC_include17:
    tu18 = int(age_head < 18)   # taxpayer is under age 18
    su18 = int(MARS == 2 and age_spouse < 18)  # spouse is under age 18
    childnum = n24 + max(0, nu18 - tu18 - su18 - n24)
else:
    childnum = n24

When the data being processed by Tax-Calculator are logically
consistent, the max(0, ...) guard is unnecessary. But my tabulation
work found that even the pure CPS data built into the tc CLI tool
has some inconsistencies, and the PUF-derived data have even more
because of the capping of variables by IRS-SOI.
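
As a minimal sketch of how the suggested logic behaves, the fragment above can be wrapped in a standalone function. The function names childnum_suggested and age17_unguarded are hypothetical, as are the example records; the variable names come from the fragment, and the age_spouse value of 0 for unmarried units is an assumption made for illustration.

```python
def age17_unguarded(n24, nu18, age_head, age_spouse, MARS):
    """Dependents under 18 minus CTC eligibles, without the max(0, ...) guard."""
    tu18 = int(age_head < 18)                  # taxpayer is under age 18
    su18 = int(MARS == 2 and age_spouse < 18)  # spouse is under age 18
    return nu18 - tu18 - su18 - n24


def childnum_suggested(n24, nu18, age_head, age_spouse, MARS, CTC_include17=True):
    """Suggested childnum logic from this comment, as a standalone function."""
    if not CTC_include17:
        return n24
    return n24 + max(0, age17_unguarded(n24, nu18, age_head, age_spouse, MARS))


# Married couple (ages 45 and 43) with one child age 17
couple = childnum_suggested(n24=0, nu18=1, age_head=45, age_spouse=43, MARS=2)

# Unmarried 17-year-old taxpayer with one infant: nu18 counts the
# taxpayer too, so tu18 removes that person from the dependent count
teen_parent = childnum_suggested(n24=1, nu18=2, age_head=17, age_spouse=0, MARS=1)

# Logically inconsistent record (more under-17 eligibles than people under 18)
raw = age17_unguarded(n24=2, nu18=1, age_head=40, age_spouse=0, MARS=1)
guarded = childnum_suggested(n24=2, nu18=1, age_head=40, age_spouse=0, MARS=1)

print(couple, teen_parent, raw, guarded)  # 1 1 -1 2
```

The last record shows why the guard matters for inconsistent data: without it the age-17 count would be negative and would reduce childnum below n24.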

I have no access to the puf data/weights generated in the taxdata repository,
so I have not been able to do any of these tabulations on dumped data from
a tc run using the puf data. In particular, I have no idea whether or not
the taxdata puf data would generate an estimate close to the 2.0 million
cited by Kyle Pomerleau.

REPLICATION NOTE: I dumped tc output as an SQLite database, so my tabulation
code is written in SQL. I'm happy to share that code with anybody that is interested;
just let me know if you want it. I will send a zip file containing dumpvars, tab.sql, and cpstab.exp, and you can replicate my analysis by doing:

(taxcalc-dev) ~% tc cps.csv 2021 --sqldb --dvars dumpvars    
You loaded data for 2014.
Tax-Calculator startup automatically extrapolated your data to 2021.
(taxcalc-dev) ~% sqlite3 cps-21-#-#-#.db <tab.sql >cpstab.act
(taxcalc-dev) ~% diff cpstab.act cpstab.exp

and getting no differences.
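
The tab.sql file itself is not shown in this thread. Purely as an illustration of what a weighted tabulation of age17 eligibles could look like, here is a sketch using Python's sqlite3 module on a toy in-memory table. The table name units, the weight column, and all four rows are invented for this example; they are not the layout or contents of the dumped database.

```python
import sqlite3

# Toy data only: (weight, MARS, age_head, age_spouse, nu18, n24)
rows = [
    (1000.0, 2, 45, 43, 1, 0),  # couple with one 17-year-old
    (1500.0, 1, 17, 0, 2, 1),   # 17-year-old taxpayer with one infant
    (2000.0, 4, 38, 0, 3, 2),   # two under-17 children and one 17-year-old
    (500.0, 2, 50, 49, 1, 2),   # inconsistent record: n24 exceeds nu18
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE units (weight REAL, MARS INTEGER, age_head INTEGER, "
    "age_spouse INTEGER, nu18 INTEGER, n24 INTEGER)"
)
conn.executemany("INSERT INTO units VALUES (?, ?, ?, ?, ?, ?)", rows)

# SQLite comparisons evaluate to 0 or 1, so they can be subtracted directly
AGE17 = "nu18 - (age_head < 18) - (MARS = 2 AND age_spouse < 18) - n24"

# Weighted count with the guard (two-argument MAX is a scalar function)
guarded_total = conn.execute(
    f"SELECT SUM(weight * MAX(0, {AGE17})) FROM units"
).fetchone()[0]

# Weighted count without the guard, and number of negative records
unguarded_total = conn.execute(
    f"SELECT SUM(weight * ({AGE17})) FROM units"
).fetchone()[0]
negative_records = conn.execute(
    f"SELECT COUNT(*) FROM units WHERE ({AGE17}) < 0"
).fetchone()[0]

print(guarded_total, unguarded_total, negative_records)  # 3000.0 2500.0 1
conn.close()
```

Comparing the guarded and unguarded totals, along with the count of negative records, is one way to gauge how much inconsistent age data affects the estimate.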

NOTE ON THE 1.906 ESTIMATE: Strictly speaking 2021 tax liability before the CTC should have been computed with CTC_include17 being false. But Tax-Calculator version 3.2.1 has it correctly being true in 2021. Repeating the above tabulations with a reform that sets CTC_include17 equal to false for 2021 produces an estimate of 1.910 (rather than the 1.906 reported above) for the logically correct method and no change in the logically incorrect estimates.

@martinholmer
Collaborator

@jdebacker and @MattHJensen,
It has been two months since @JackLandry submitted the bug report in issue #2630. Do the two of you, who are the maintainers of Tax-Calculator, agree that this is a bug? If you don't think this is a bug, please explain why you think the code is correct (and the analysis in issue #2630 is wrong). If you agree that there is a problem in the code, what is the plan for fixing the bug?

@kpomerleau

@martinholmer

I responded about a month ago replying to this thread directly through email, but it didn't seem to work because I don't see my comments. I agree that the code in question has a logical flaw. Here is what I wrote responding to your suggested code.

From November 12th 2021:

Hi Martin,

I tried your suggestion using the PUF-derived data. While I follow the logic of your suggestion, it ends up producing an increase in eligible children of about 22 million. The revenue impact of extending the CTC to 17-year-olds also reflects this. The original flawed code in question produces a revenue impact of $1.75 billion a year while your code produces a revenue impact of $20 billion a year.

If I drop the 'max(0,' test, the increase in eligible children drops from 22 million to 10 million and the revenue impact per year is $7.44 billion. More reasonable, but 80 million total children still seems a little large. Also worth noting that without the max statement, we end up with 101 observations with negative age 17 children.

Kyle

@martinholmer
Collaborator

@kpomerleau (Kyle) wrote on 2021-12-29:

I responded about a month ago [on 2021-11-12] replying to this thread directly through email, but it didn't seem to work because I don't see my comments [in issue #2630]. I agree that the code in question has a logical flaw. Here is what I wrote responding to your suggested code. [begin-paraphrase] When using Tax-Calculator PUF data, the "suggested code" implies way too many 17-year olds [end-paraphrase].

Kyle, thanks for posting this comment. So, I guess we are in agreement that the current Tax-Calculator logic is incorrect. But your point is that when using my "suggested code" (which does seem logical to me and to @JackLandry) along with the Tax-Calculator PUF data, there are way too many 17-year-olds, even though the "suggested code" gives about the correct number of 17-year-olds when using the Tax-Calculator CPS data.

To me, this points to serious problems with the age variables in the Tax-Calculator PUF data. I have already posted two issues that document (less severe) problems with CPS age variables at the taxdata site, but have received no response from the maintainer of that PSL repository, @andersonfrailey.

The Tax-Calculator PUF data is known to have problems with its age variables as described in issue #2469. This issue (originally posted about a year and a half ago) is still open, so I assume nothing has been done in taxdata to fix that problem.

The obvious solution to this problem is to fix the age variables in the Tax-Calculator PUF data and then fix the invalid logic (being discussed in this issue #2630) when the new PUF data is introduced as part of Tax-Calculator 4.0, for which there is already a GitHub development branch.

Does this make sense to you, @kpomerleau?
@jdebacker and @MattHJensen, does this plan make sense to you?
@andersonfrailey, does this bug-fix plan make sense to you?

@MattHJensen
Contributor

@JackLandry, @kpomerleau, and @martinholmer thanks for your investigation of this issue!

Tax-Calculator accommodates numerous datasets from other open source projects as well as user-submitted data. I suggest that this project fix the known bug here and issue a release with the fix now, rather than sequence our work based on another project's timeline.

TaxData contributors (I'm sure all of us are very welcome) can tackle the issue over there on their own timeline.

TaxData PUF users should review that project's open issues to understand the TaxData PUF's advantages and limitations. If we think TaxData users don't do that already, we could revise Tax-Calculator's documentation, here, to suggest it.

@JackLandry or @martinholmer, do you have any interest in opening the Tax-Calculator PR, or would you prefer for me to do so building on your work here?

@MattHJensen MattHJensen added the bug label Jan 3, 2022
@martinholmer
Collaborator

@MattHJensen said in issue #2630:

@JackLandry or @martinholmer, do you have any interest in opening the Tax-Calculator PR, or would you prefer for me to do so building on your work here?

Speaking for myself, I have the interest (which is why I investigated this issue), but don't have the "bandwidth" to prepare and test a pull request. So, it makes more sense for you to build on @JackLandry's insight.

@JackLandry
Author

@MattHJensen, thanks for your attention to this. I'm such a novice Python user that I think it would make sense for you to take the lead on this. Thanks!

@MattHJensen
Contributor

I will tee up the PR. Thanks again all for your help with this issue!

@MattHJensen
Contributor

Resolved by #2644.
