Feature idea resolve jars from maven repositories #900
Are you familiar with Groovy Grape? A similar feature available from CPython via JPype would be amazing. Grape works by annotating import statements with which GAV (group, artifact, version) it's from, and optionally which Maven repository, and Grape takes care of the rest: downloading the artifact if needed, and then loading the class with a dynamic class loader.
Similarly, BeakerX has
Regarding jgo: it works by invoking
It is very light relative to Groovy or Maven, as it is a dependency manager and not a build system. I selected it for my project's internal use mainly because it is light, doesn't take over the build system, and can be used from the command line with minimal setup (in other words, perfect for a Python project that just uses Java packages occasionally). For JPype it requires two files to use Ivy:
I am not sure that I can exactly match the syntax in Python, as annotations do not apply to Python import statements. I can likely make it act as functions that are called before the import statement, though. The API would likely be something like (stealing from the Grab example)
Ivy is the dependency manager for Ant and supports pulling files, including all dependencies specified in the POM files. It can do things like publishing and other similar features, but for our purposes we are interested only in "resolve" and "retrieve". The setup is pretty easy. You first have to configure it with settings, though I believe that the defaults are likely pretty close to usable. Then you construct a resource using the usual notation. The weird issue is that classifiers are not natively an Ivy concept, but you can add them in the process (at least I know how to do it in the files; I need to find out how to do it programmatically). Then you set up the resolve options. Unfortunately I am less familiar with this part: there are a lot of options and, not being a Maven user, some of them don't make much sense to me. You then call resolve, which scans the POM files and decides what you are going to pull.
There is the somewhat weird concept of configurations. My config files usually just push the jar file request into the default and pull it. But there is a set of default configurations that it supports. Assuming we just want jars, master seems like the right one.
The resolve pulls the parsed POM files into home/.ivy2/cache. They are big XML files (well, this came from Ant, so that is to be expected). It generates a report object that you can scan for errors to see if something was missing in the request. The next stage is retrieve. It follows the same pattern as before: use the result from resolve to pass the requested artifacts into the retrieve options. You then set the option arguments and, most importantly, a pattern. Ivy requires you to define how the pulled resources are laid out on disk.
One issue is that Ivy is rather noisy. I tried to turn the logging down to minimal, but it still seems to print out a few random log messages.
Relevant links: https://en.wikipedia.org/wiki/Apache_Ivy
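To illustrate what a retrieve pattern expresses, here is a minimal Python sketch of expanding an Ivy-style pattern into a file path. It assumes the documented Ivy conventions that tokens are written in square brackets (`[artifact]`, `[revision]`, `[ext]`) and that a parenthesised part is optional. `expand_pattern` is a hypothetical helper written for this example; it is a simplification and not Ivy's own implementation.

```python
import re

# Ivy-style patterns: tokens look like [name]; a parenthesised part is optional.
_OPTIONAL = re.compile(r"\(([^()]*)\)")
_TOKEN = re.compile(r"\[([A-Za-z]+)\]")


def expand_pattern(pattern: str, attrs: dict) -> str:
    """Expand an Ivy-style pattern with the given attribute values.

    Simplified illustration (hypothetical helper, not Ivy code):
    an optional part is kept only if every token inside it has a value,
    and tokens without a value outside optional parts are left untouched.
    """

    def keep_or_drop(match):
        body = match.group(1)
        names = _TOKEN.findall(body)
        return body if all(attrs.get(name) for name in names) else ""

    def fill(match):
        value = attrs.get(match.group(1))
        return str(value) if value else match.group(0)

    without_optionals = _OPTIONAL.sub(keep_or_drop, pattern)
    return _TOKEN.sub(fill, without_optionals)


if __name__ == "__main__":
    pattern = "lib/[artifact]-[revision](-[classifier]).[ext]"
    print(expand_pattern(pattern, {"artifact": "h2", "revision": "1.4.200", "ext": "jar"}))
```

With a pattern like this, an artifact without a classifier lands at `lib/h2-1.4.200.jar`, while one with the classifier `sources` lands at `lib/h2-1.4.200-sources.jar`.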
@ctrueden so I guess I will tag you as someone interested in seeing this feature added. @marscher This is likely one of those features that it actually makes sense to include in a separate package like jpype-ivy or jpype-deps. The package will be self-contained and run on all architectures. Further, including a 1 MB jar file for a special feature seems like a bit much. We could incorporate only a portion of Ivy into the project, but it includes crypto, so I think that would be a poor choice.
This feature should definitely be a separate package. Why would you like to include only a subset of Ivy, or do you mean that the Python wrapper only binds to a subset? Does it include cryptography forbidden in the US, or why do you consider it a bad choice then?
We will only be using a subset of Ivy for pulling, but I suspect it would be the majority. We aren't going to support pushing or making new packages from Python, so there would be no benefit to including the Ivy source and stripping it down versus including the whole jar. If it was included in JPype rather than a separate package, it would force the crypto warning on JPype, which would be way too much paperwork on my side. Better to make it a new package and just include the Ivy jar as is, pointing to their crypto warning. Their crypto message says it is exempt, but I am not sure why they require a disclaimer in the first place unless it was on the US export-controlled list. The ASM library, on the other hand, is a good candidate to include, as it has no encumbrances, and many packages like jacoco require different versions, so we are best to include it and rename it internally to prevent conflicts.
You might be interested in a package I maintain (inherited) which speaks to a bespoke Java build system (built on top of Gradle): https://gitlab.cern.ch/scripting-tools/cmmnbuild-dep-manager. Of course, the implementation there isn't something that can be picked up directly (not least because the library needs some significant improvements), but there are some interesting synergies and/or lessons that we could draw on.
One of the first things I would say is: avoid import-time behaviour as much as possible. It is an anti-pattern in Python. I opened a discussion at #933 which was motivated by this point (I didn't want to muddy the waters here). Downloading JARs at import time is a good example of the kind of import-time behaviour we should avoid. In maintaining the aforementioned package I've also found that it is highly brittle and not a tenable approach for operational code. Gradually I've been building out tools to move the JAR downloading as early as possible - in my case I have developer tools which allow virtual environment installation where both pip packages and JARs are downloaded at the same time (i.e. at "install time"). In order to do this effectively I recommend having package metadata to declare Java dependencies, not metadata declared in code (which is only available at runtime) - this mirrors the idea of declaring Python dependencies which we do in
For the record, it is no longer viable to assume that a
I prototyped enabling custom metadata in the
Another challenging aspect of runtime resolution of JARs is deciding where to put the JARs you download. There is no guarantee that a user has write permission to an environment, so you end up having to make some compromises down the line. If you do this at install time then you know that the person doing the install is the owner of the environment, and therefore has write permission to store the JARs appropriately.
It may sound trivial, but this specific issue has been a real challenge for the package I maintain, and has been an endless cause of runtime issues - the compromise that was made in that library was to use the user site-packages, which has the terrible effect of being put on the path of all Pythons (even the ones you thought were well isolated in a virtual environment).
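The write-permission problem described above can be made concrete with a small Python sketch. It probes a list of candidate directories and returns the first one in which a file can actually be created. `first_writable_dir` is a hypothetical helper, and the choice and ordering of candidates is an assumption left to the caller; the sketch only illustrates the check and does not settle the install-time versus runtime question.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Optional


def first_writable_dir(candidates: Iterable[Path]) -> Optional[Path]:
    """Return the first candidate directory we can really write into.

    Hypothetical helper for illustration. Each candidate is created if
    missing, then probed by creating (and discarding) a temporary file.
    """
    for directory in candidates:
        try:
            directory.mkdir(parents=True, exist_ok=True)
            # Probe by creating a real file; it is removed when closed.
            with tempfile.TemporaryFile(dir=directory):
                pass
        except OSError:
            # Not creatable or not writable: try the next candidate.
            continue
        return directory
    return None
```

A caller might pass a directory inside the active environment first and a per-user cache directory second, and treat a `None` result as a hard error rather than silently falling back to a location shared by every Python on the machine.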
Well, I haven't done much on this beyond the prototype. This behavior is part of scyjava, so there is at least some usage. I personally don't have much use for this type of system because, as you point out, it is very brittle for production code (can you get to the Maven repo, do you have write privileges, is there already a version on the system?). That said, it would be very nice to have a way to automatically install jars using pip for a project that uses JPype. This Ivy pattern may or may not be part of that solution.
One issue that people run into when distributing Python packages with jars is that the jar file needs to become part of the Python module and be installed in site-packages. It is possible that we could add an alternative method such as
jpype.ivy.addArtifact("com.h2database:h2:1.4.200")
which would automatically download the jar file and all its dependencies and insert them into the class path loader. It would depend on having Apache Ivy already available in the class path. Here is a prototype of the system.
I know that scyjava uses a similar system via jgo. Is this a feature which people are likely to use? Does it belong in JPype?
@ctrueden any thoughts on this?
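For readers unfamiliar with the coordinate string passed to `addArtifact` above, here is a minimal Python sketch of how a `group:artifact:version` string maps to a jar location in a Maven-layout repository. `Coordinate`, `parse_coordinate`, and `jar_url` are hypothetical names used only for illustration and are not part of JPype or the prototype. The optional fourth `classifier` field and the use of Maven Central as the default repository are assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed default; any repository using the standard Maven layout would do.
MAVEN_CENTRAL = "https://repo1.maven.org/maven2"


@dataclass(frozen=True)
class Coordinate:
    group: str
    artifact: str
    version: str
    classifier: Optional[str] = None  # assumed optional fourth field


def parse_coordinate(text: str) -> Coordinate:
    """Parse 'group:artifact:version' or 'group:artifact:version:classifier'."""
    parts = text.split(":")
    if len(parts) not in (3, 4) or not all(parts):
        raise ValueError(f"expected group:artifact:version[:classifier], got {text!r}")
    return Coordinate(*parts)


def jar_url(coord: Coordinate, repo: str = MAVEN_CENTRAL) -> str:
    """Build the jar URL using the standard Maven repository layout."""
    group_path = coord.group.replace(".", "/")  # dots become directories
    suffix = f"-{coord.classifier}" if coord.classifier else ""
    filename = f"{coord.artifact}-{coord.version}{suffix}.jar"
    return f"{repo.rstrip('/')}/{group_path}/{coord.artifact}/{coord.version}/{filename}"


if __name__ == "__main__":
    print(jar_url(parse_coordinate("com.h2database:h2:1.4.200")))
```

This only computes where a single jar would live; resolving transitive dependencies from the POM files is the part that a tool such as Ivy would take care of.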