Add conversion backend #3

Open
alecandido opened this issue Jan 17, 2022 · 33 comments
Labels
good first issue Good for newcomers

Comments

@alecandido
Member

Since we have several MC generators available as back-ends (at the moment mg5 and yadism), we should add a conversion back-end powered by the pineappl conversion scripts.

Indeed, we are not able to produce all of the grids we need (and we won't be for quite some time), as some of them are the result of MC runs with MC codes that are not publicly available.
In these cases we are kindly given the runcards, so we should download them from somewhere else (or have the user running rr download them), and then convert to pineappl.

@cschwan
Contributor

cschwan commented Jan 17, 2022

What exactly should we convert? Can you give an example?

@alecandido
Member Author

I'm not so familiar with our APPLgrids/fastNLO grids, but for example (if I remember correctly) we should have grids from NNLOjet, for which we don't have the code, and we won't soon (if ever) have another code to produce them.

Even though, I guess most of the things are actually K-factors...
(but we agreed to drop K-factors, didn't we?)

@cschwan
Contributor

cschwan commented Jan 18, 2022

Do you mean those: https://ploughshare.web.cern.ch/ploughshare/? For them we have appl2pine and fnlo2pine, of course.

@alecandido
Member Author

I was not even aware of the website, and I wonder if there are others, since I remember that at the time of 4.0 publication NNPDF contacted people in order to check if it was fine to make grids public (that to me meant some grids were not public yet, but given directly to NNPDF).

@alecandido
Member Author

In any case, whether it is these or others, since those grids are not PineAPPL grids and we need to convert them, I was thinking about making appl2pine and fnlo2pine part of the runcards runner, so that you have a uniform way (a single command) to generate them, even if they come from a conversion.

Maybe, it would be appropriate to warn explicitly in case of conversion, but this can always be done.

@felixhekhorn
Contributor

> Maybe, it would be appropriate to warn explicitly in case of conversion, but this can always be done.

Why do we need a warning? It is just generated by some specific program, i.e. wget, more or less.

@felixhekhorn
Contributor

felixhekhorn commented Jan 18, 2022

@cschwan to give you a bit more context: yesterday we were brainstorming a bit about the new "theory layout" (theory in Emanuele's sense) and we figured it should be sufficient to have a list of PineAPPL grids together with a file which spells out how to combine the grids to match the experimental datasets. The specific format of that file is to be discussed in https://github.com/NNPDF/fktables/issues/12

@alecandido
Member Author

The specific program will be appl2pine (for example), and thus pineappl, and wget-like.

In any case, I thought that by default a computation from scratch is expected, since this should run (based on the runcards). If a conversion of a former run is happening behind the scenes, better to warn, or not?

@felixhekhorn
Contributor

felixhekhorn commented Jan 18, 2022

Ok, I see - but actually it's a bit more complex: the user only specifies the theory card, (which, as you said, would be ignored since we have no other choice), but the "MC runcard" is always in the (runcards) repo and not explicit (so is by chance implicitly the one the NNLOjet people used)

@alecandido
Member Author

I guess we have no access to the NNLOjet runcard, since we do not even have access to NNLOjet.

So I believe in the corresponding runcards folder, there will be only metadata, and the actual runcard will be replaced with information needed to retrieve the grid...

@scarlehoff
Member

I think NNLOjet has only been used for K-factors?

@alecandido
Member Author

Yeah, but the idea was to get rid of K-factors, and burn them into PineAPPL grids, if I'm not wrong...

@scarlehoff
Member

Really? In any case, what information do you need from NNLOJET (or any other program)? The K-factors are a bit of a "god-given" number.

I think the idea is (or should be) getting rid of K-factors and using NNLO grids.

@alecandido
Member Author

But how do you generate NNLO grids? And in particular, how do you generate NNLO grids in the next few weeks/months?

@scarlehoff
Member

You don't. That's why I'm not entirely sure why this is relevant. In the next few weeks/months the K-factors are "god-given" numbers that are already in NNPDF and are applied on top of the NLO grids.

@alecandido
Member Author

alecandido commented Jan 18, 2022

Indeed, but we won't have a step to apply K-factors (and we explicitly stated we won't provide one).
Are you suggesting we should only do NLO fits for the time being?

@scarlehoff
Member

No, we can do an NNLO fit, but whenever we need a K-factor, it is a multiplicative factor applied by validphys, so it doesn't matter whether the underlying fktable is pineappl or not.
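
To make "multiplicative factor" concrete, here is a minimal sketch. The function name `apply_kfactors` and all numbers are invented for illustration; this is not validphys code, only the bin-by-bin arithmetic it implies.

```python
def apply_kfactors(nlo_predictions, kfactors):
    """Multiply each NLO bin prediction by the K-factor of the same bin."""
    if len(nlo_predictions) != len(kfactors):
        raise ValueError("need exactly one K-factor per bin")
    return [nlo * k for nlo, k in zip(nlo_predictions, kfactors)]


# Hypothetical per-bin values, for illustration only.
nlo = [10.0, 20.0, 40.0]
kfac = [1.5, 1.25, 1.0]
approx_nnlo = apply_kfactors(nlo, kfac)
```

The NLO numbers can come from any FK table, converted or not, since the factor is applied afterwards.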

@alecandido
Member Author

Ok, so we simply do not deliver NNLO grids for the time being.
I guess if we stick to NLO, mg5 can do everything. (Is there something missing?)

In any case, we won't have runcards even for all NLO we need for a while, so we need to run appl2pine for the time being.
But I guess we can keep it in fkutil, and then add here only the runcard, whenever it will be ready.

I'm going to close this, does everyone agree?

@cschwan
Contributor

cschwan commented Jan 18, 2022

> Ok, so we simply do not deliver NNLO grids for the time being.

I don't think we can deliver full NNLO FK tables, but instead we do what NNPDF4.0 already did:

  • NLO predictions from grids plus
  • NNLO evolution and
  • NNLO K factors,

which is approximately NNLO. To avoid confusing ourselves we should give this a name: how about "NLO grids + NNLO evolution", or for short "(N)NLO FK tables"?

> In any case, we won't have runcards even for all NLO we need for a while, so we need to run appl2pine for the time being.

We already have that: https://github.com/NNPDF/fktables/blob/main/convert_applgrids.sh.

@felixhekhorn
Contributor

> I'm going to close this, does everyone agree?

Mmm, I'm not sure. I think we should still do something here.

> Ok, so we simply do not deliver NNLO grids for the time being. I guess if we stick to NLO, mg5 can do everything. (Is there something missing?)

True, but this is not the problem here, right? We still need to convert grids (even at NLO, I think, e.g. DIS jets).

> In any case, we won't have runcards even for all NLO we need for a while, so we need to run appl2pine for the time being. But I guess we can keep it in fkutil, and then add here only the runcard, whenever it will be ready.

> We already have that: https://github.com/NNPDF/fktables/blob/main/convert_applgrids.sh.

I think I'd like to move that to this repo since, as said here, to me a new theory is "(list of PineAPPL files) + (list of {dataset}.yaml)" and, to me, runcards is responsible for generating the first list ...

Furthermore, stuff like this should really be spelled out locally, and that could be done exactly here ...

@cschwan
Contributor

cschwan commented Jan 18, 2022

@felixhekhorn I see, now I understand what you're after. That would certainly be convenient, but it's probably not a priority right now (!?).

@cschwan
Contributor

cschwan commented Jan 18, 2022

In the easiest case we could write a postrun.sh in the corresponding dataset directory which

  • fetches the APPLgrids/fastNLO tables,
  • converts them and
  • adds the correct metadata binning info.

./rr would have to make sure we have appl2grid and fastNLO (and their dependencies ...) and must make sure not to run either mg5 or yadism.
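
The three steps above could be sketched roughly as follows, written in Python only to keep the example self-contained and testable. Everything named here is an assumption for illustration: `convert_dataset`, the three callables and the `.pineappl` suffix are hypothetical, and the actual fetching and conversion (e.g. a call to appl2pine or fnlo2pine) are left to the callables rather than spelled out.

```python
from pathlib import Path
from typing import Callable, Dict


def convert_dataset(
    source: str,
    workdir: Path,
    fetch: Callable[[str, Path], Path],
    convert: Callable[[Path, Path], None],
    annotate: Callable[[Path, Dict[str, str]], None],
    metadata: Dict[str, str],
) -> Path:
    """Fetch a foreign grid, convert it and attach metadata, in that order."""
    workdir.mkdir(parents=True, exist_ok=True)
    # Step 1: fetch the APPLgrid/fastNLO table (download, copy, ...).
    original = fetch(source, workdir)
    # Step 2: convert it; the output suffix is an assumption of this sketch.
    converted = workdir / (original.stem + ".pineappl")
    convert(original, converted)
    # Step 3: add the metadata and binning information.
    annotate(converted, metadata)
    return converted
```

Keeping the steps as injected callables means the same skeleton would serve both converters, with only the `convert` argument changing.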

@alecandido
Member Author

Yes, so maybe we can just implement a void backend (some sort of noop) and rely on postrun.sh.
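
A minimal sketch of what I have in mind, assuming a hypothetical runner interface with `run` and `postrun` methods (these names are not the actual rr API) and assuming `bash` is available to execute the script:

```python
import subprocess
from pathlib import Path


class NoopRunner:
    """Hypothetical backend that generates nothing by itself."""

    def __init__(self, dataset_dir: Path):
        self.dataset_dir = Path(dataset_dir)

    def run(self) -> None:
        # Deliberately empty: neither mg5 nor yadism is invoked.
        return None

    def postrun(self) -> bool:
        """Run postrun.sh in the dataset directory, if there is one."""
        script = self.dataset_dir / "postrun.sh"
        if not script.is_file():
            return False
        # check=True raises CalledProcessError if the script fails.
        subprocess.run(["bash", str(script)], cwd=self.dataset_dir, check=True)
        return True
```

All the real work (fetching and converting) would then live in postrun.sh.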

Most of the dependencies we already have, I guess we just need meson and to compile them.

@cschwan
Contributor

cschwan commented Jan 18, 2022

> Yes, so maybe we can just implement a void backend (some sort of noop) and rely on postrun.sh.

That sounds good!

> Most of the dependencies we already have, I guess we just need meson and to compile them.

You can install meson and ninja using pip, so that should be easy!
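
As a rough sketch of how a runner could check for these tools before compiling anything (the helper name `missing_tools` is made up for illustration; `meson` and `ninja` are the PyPI package names):

```python
import shutil
from typing import Iterable, List


def missing_tools(tools: Iterable[str]) -> List[str]:
    """Return the executables from `tools` that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools(["meson", "ninja"])
if missing:
    # Only a hint is printed here; nothing is installed automatically.
    print("missing build tools, try: python -m pip install " + " ".join(missing))
```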

@alecandido
Member Author

I hope to get back here soon (even if not immediately), and I was looking back even at NNPDF/pinecards#124 (comment).

Do we want to move even appl2pine and fnlo2pine in here? Otherwise I can simply download them alongside pineappl, and install from there.

As I wrote above, the runner can be simply a void one, just running postrun.sh, in which there will be a suitable call to the proper converter (provided by rr) on a suitably named grid, that has to be provided by the user.

@cschwan
Contributor

cschwan commented Feb 18, 2022

> Do we want to move even appl2pine and fnlo2pine in here? Otherwise I can simply download them alongside pineappl, and install from there.

What exactly do you mean by "move"?

@alecandido
Member Author

Get the code in here.

Of course it would be a problem if it gets out of sync with the code in the pineappl repository, but the issue is that the examples are not packaged with pineappl (even though they might be considered distributed alongside it in the GitHub release).

Maybe it's just enough to use the code from pineappl repository, since the examples are officially maintained (and arguably part of the distribution).

@alecandido
Member Author

Speaking of, maybe we should start using a PineAPPL release, instead of master. What do you think @cschwan?

@cschwan
Contributor

cschwan commented Feb 18, 2022

> Speaking of, maybe we should start using a PineAPPL release, instead of master. What do you think @cschwan?

I agree! Version 0.5.0 should support everything we need, otherwise I'll make a point release.

@cschwan
Contributor

cschwan commented Feb 18, 2022

> Of course it would be a problem if it gets out of sync with the code in the pineappl repository, but the issue is that the examples are not packaged with pineappl (even though they might be considered distributed alongside it in the GitHub release).

I'd like to promote them from examples to proper programs, but appl2pine requires ROOT and the NNPDF-modified APPLgrid, where the former is big and complicated to install. fnlo2pine requires the official fastNLO. So the problem is to package all these dependencies.

I was thinking about integrating the fastNLO converter into the PineAPPL CLI as pineappl import, but I don't know how difficult it is.

@felixhekhorn
Contributor

> I'd like to promote them from examples to proper programs, but appl2pine requires ROOT and the NNPDF-modified APPLgrid, where the former is big and complicated to install. fnlo2pine requires the official fastNLO. So the problem is to package all these dependencies.

> I was thinking about integrating the fastNLO converter into the PineAPPL CLI as pineappl import, but I don't know how difficult it is.

Maybe not inside the pineappl library? (such that you can opt-out) You could do a pineappl_utils (or similar name) alongside pineappl_cli etc. Of course we would prefer also a python binding ...

On the other hand, appl2pine seems sufficiently complicated that I wonder whether we should just leave it where it is (of course the complication will become relevant here, as e.g. we would need to download its dependencies ...)

@alecandido
Member Author

> I agree! Version 0.5.0 should support everything we need, otherwise I'll make a point release.

This I'm going to do in a separate PR (and this way I'll even drop the dependency on pygit2, and a considerable amount of Git overhead).

@alecandido
Member Author

> I'd like to promote them from examples to proper programs, but appl2pine requires ROOT and the NNPDF-modified APPLgrid, where the former is big and complicated to install. fnlo2pine requires the official fastNLO. So the problem is to package all these dependencies.
> I was thinking about integrating the fastNLO converter into the PineAPPL CLI as pineappl import, but I don't know how difficult it is.

> Maybe not inside the pineappl library? (such that you can opt-out) You could do a pineappl_utils (or similar name) alongside pineappl_cli etc. Of course we would prefer also a python binding ...

> On the other hand, appl2pine seems sufficiently complicated that I wonder whether we should just leave it where it is (of course the complication will become relevant here, as e.g. we would need to download its dependencies ...)

I'd say that the CLI would be optimal, I'm thinking about how to do it in practice. I'm going to move it to a dedicated PineAPPL discussion, most likely this is not the best place.


4 participants