Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Overly restrictive and inconsistent "pinned" package versions #1350

Closed
martinholmer opened this issue May 5, 2017 · 5 comments
Closed

Overly restrictive and inconsistent "pinned" package versions #1350

martinholmer opened this issue May 5, 2017 · 5 comments

Comments

@martinholmer
Copy link
Collaborator

martinholmer commented May 5, 2017

This afternoon I tried to follow the taxcalc package installation instructions in the user documentation. I did this by creating a new user account on my Mac (which had the Apple version of Python and nothing from Continuum Analytics) and following the installation instructions.

I had no problem installing the offered Anaconda Python 2.7 distribution, which is the 4.3.1 version.
That version of Anaconda came with all the packages used by Tax-Calculator except for the mock package, which is not used in Tax-Calculator calculations, but is used in one of the thee-hundred some unit tests. Among those packages included in Anaconda2 version 4.3.1 are the three packages that have "pinned" versions in the environment.yml file (which sets the development environment). But those three packages have newer versions in 4.3.1 than in the environment.yml file. Also, the conda.recipie/meta.yaml (which specifies package versions for the taxcalc package) has "pinned" versions that differ from those in the environment.yml file. Here is a summary of the the three different sets of "pinned" package versions.

------------------------------------------------------------
package    environment.yml   meta.yaml   Mac Anaconda2 4.3.1
------------------------------------------------------------
numpy          =1.11.2        =1.11.2         1.11.3
pandas         =1.18.0       <=0.18.0         1.19.2
bokeh          =0.12.3        MISSING         0.12.4
------------------------------------------------------------

But, at least on a Mac, the Tax-Calculator works just fine with the more recent versions of these three packages. After downloading a zip file containing the Tax-Calculator code at the tip of the master branch, bringing to the new tcuser account a copy of the puf.csv file, and conda installing the mock package, I passed all 305 unit tests.

So, this experiment seems to prove (at least on a Mac) that the Tax-Calculator will work just fine when using the newest versions of numpy and pandas and bokeh. Given this, why shouldn't all the = and <= symbols in the environment.yml and conda.recipie/meta.yaml files be converted to >= symbols?

Unless I get a cogent reason for not making these changes soon, I'm going to prepare a pull request that makes those symbol changes. These changes are essential otherwise the users who download the taxcalc package after installing Anaconda are going to be subjected to a whole set of DOWNGRADE messages when doing the conda install of taxcalc. Here is what I got before I installed the mock package:

iMac2:~ tcuser$ conda install -c ospc taxcalc --dry-run
Fetching package metadata ...........
Solving package specifications: .

Package plan for installation in environment /Users/tcuser/anaconda:

The following NEW packages will be INSTALLED:
  Do
    mock:         2.0.0-py27_0           
    pbr:          1.10.0-py27_0          
    taxcalc:      0.8.2-np111py27_0  ospc

The following packages will be UPDATED:

    anaconda:     4.3.1-np111py27_0       --> custom-py27_0     
    conda:        4.3.14-py27_0           --> 4.3.17-py27_0     

The following packages will be DOWNGRADED due to dependency conflicts:

    mkl:          2017.0.1-0              --> 11.3.3-0          
    numexpr:      2.6.1-np111py27_2       --> 2.6.1-np111py27_1 
    numpy:        1.11.3-py27_0           --> 1.11.2-py27_0     
    pandas:       0.19.2-np111py27_1      --> 0.18.0-np111py27_0
    scikit-learn: 0.18.1-np111py27_1      --> 0.18.1-np111py27_0
    scipy:        0.18.1-np111py27_1      --> 0.18.1-np111py27_0

I have no idea why the anaconda change from 4.3.1 to custom is considered an UPDATE when 4.3.1 is the latest offering on the Continuum Analytics download page.

The Tax-Calculator code runs just fine without these DOWNGRADES. So, why should the taxcalc package require these downgrades?

It seems to me there needs to be a careful rethink of environment.yml and conda.recipie/meta.yaml file contents.

@MattHJensen @feenberg @zrisher @salimfurth @PeterDSteinberg @brittainhard

@talumbau
Copy link
Member

talumbau commented May 7, 2017

Some general comments/questions about this issue. First, I made some comments on #1351 to hopefully answer why the package pinning is done in the way it is. The following questions occur to me when reading the documentation and then this issue:

  • Why is it bad/undesirable to downgrade a package in an environment?
    This doesn't seem like an important thing to me (if installing a package requires downgrading some other package), but this issue makes it seem like it is somehow an important issue. I can see two use cases for someone uses Python with this project:
  1. the user wants to use Python for the exclusive purpose of using Tax-Calculator

  2. the user wants to use Python to use Tax-Calculator and do something else with Python (develop/run other programs, do analysis, etc.)

In the first case, the user would not care about any versions of any other packages, because the only reason those packages are on the user's system is to use Tax-Calculator.

For the second case, the user would create a conda environment for the purpose of using Tax-Calculator (in which case, any and all packages installed in to that environment would be there for express purpose of using/developing Tax-Calculator). For any and all other desired uses of Python, if the user did not like/want the packages that are installed in to the environment for using Tax-Calculator, the user could do as follows: 1. exit the environment by typing source deactivate (or deactivate on Windows), which would take him/her back to the "root' environment. 2. (optionally) the user could activate/create some other environment for the other uses of Python he/she had in mind.

In general, a conda environment should be treated as disposable and useful for a particular purpose. The entire goal is to not have to maintain a single Python installation that satisfies many needs: one can have a Python installation for each use case one has for Python. Packages that are useful for that purpose can be added, updated, or removed, as needed.

  • Why it is recommended to execute conda upgrade anaconda in order to the use Tax-Calculator?

According to the instructions, one is asked to execute conda update anaconda while inside the taxcalc-dev conda environment. I don't know the impact of doing such a thing, and I don't know of other Python projects where this is standard practice. It seems to me to be an odd thing to do.

@feenberg
Copy link
Contributor

feenberg commented May 7, 2017 via email

@MattHJensen
Copy link
Contributor

Why it is recommended to execute conda upgrade anaconda in order to the use Tax-Calculator?

According to the instructions, one is asked to execute conda update anaconda while inside the taxcalc-dev conda environment. I don't know the impact of doing such a thing, and I don't know of other Python projects where this is standard practice. It seems to me to be an odd thing to do.

I think this might be a voodoo trick that kind of sort of seemed to fix common installation problems in the early days of the project. It is probably out of context now if it ever did truly work.

@MattHJensen
Copy link
Contributor

MattHJensen commented May 8, 2017

If we move to >= in environment.yml, will we ever be able to guarantee that the taxcalc works, given that the packages on which we depend can push updates anytime w/o triggering our tests?

@martinholmer
Copy link
Collaborator Author

Issue #1350 has been resolved by the merger of pull request #1351.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants