Move the parsing of fktables to pineko #17

scarlehoff · 2022-03-29T15:37:56Z

This is basically what it would be required to have vp depend on pineko.

Cons:
- It is a bit of a complication for something that is only depending on vp (that is, the format of the dataframe)

Pros:
- Gives us a bit more freedom on how to do anything since the only limitation is that at the end we have to have a dataframe as defined in pineappl_to_fktable. How do we get there is up to us. Vp doesn't need to know how the initial state were define, or what the luminosity means or anything.

alecandido · 2022-03-30T09:35:38Z

Just to tell it: I'm not a fan of linearizing dimensions.

We have the luck of to have a simple and powerful framework at hand, that is able to deal with N-dimensional tensors in a fast and expressive way.

FkTables are genuinely 4 dimensional objects (bin, lumi, x1, x2), so we should use 4 dimensional data structures, not try to fit it into a 2 dimensional one.

If you want labels, concatenation and other excel-like swag, giving up on part of the simplicity, than you should consider to use xarray, still based on numpy, but at least it doesn't have the tradeoff on the rank.

scarlehoff · 2022-03-30T09:47:07Z

And you haven't seen the terrible crap in the other side to get the pandas df and recover exactly the 4rank tensor that I had at the beginning :__

Sadly vp uses the df everywhere (most notably, for the old tables). I can move the tensor to df part to the vp parser so that this side is clean. I think that makes sense.

alecandido · 2022-03-30T09:48:50Z

😞

alecandido

First batch, I still have to review the main API pineappl_to_fktable

src/pineko/parser.py

scarlehoff · 2022-03-30T10:18:37Z

I still have to review the main API pineappl_to_fktable

Wait for that one though, I'm going to remove the df part and move it back to vp.

scarlehoff · 2022-03-30T10:50:08Z

I will run through the yaml db to standarize how the apfelcomb flags are added since many were added manually anyway. That way I can fix most issues at once without adding checks for it in here.

scarlehoff · 2022-03-30T11:00:42Z

apfelcomb:
    normalization:
        ATLAS_TTB_8TEV_2L_TTRAP: [2.0]
    repetition_flag:
        CMSTTBARTOT8TEV-TOPDIFF8TEVTOT: True

So that instead of relying on things being by chance in the same order or whatever, the behavior is explicit for each subfktable. This is for ATLAS_TOPDIFF_DILEPT_8TEV_TTRAPNORM for instance.

If you guys agree I'll go ahead. In this case for instance only the ATLAS_TTB_8TEV_2L_TTRAP fktable needs a factor and only the CMSTTBARTOT8TEV-TOPDIFF8TEVTOT needs to set the repetition flag.

felixhekhorn · 2022-03-30T11:44:13Z

In this case for instance only the ATLAS_TTB_8TEV_2L_TTRAP fktable needs a factor and only the CMSTTBARTOT8TEV-TOPDIFF8TEVTOT needs to set the repetition flag

If things decouple, perfect!

scarlehoff · 2022-03-30T13:52:29Z

Meh, there's no good solution for this.

The apfelcomb flags basically set up the dataframe for vp so either everything is in vp or everything is in here.
I would prefer to have the apfelcomb flags here so that they actually get removed, if they get into the vp side they will be abused forever just like apfelcomb was.

alecandido · 2022-03-30T20:56:44Z

The apfelcomb flags basically set up the dataframe for vp so either everything is in vp or everything is in here. I would prefer to have the apfelcomb flags here so that they actually get removed, if they get into the vp side they will be abused forever just like apfelcomb was.

Fine, then I'd immediately open a separate issue for apfelcomb flags total disruption: as soon as we completed theory uncertainties and we have new commondata, we are stable enough and we drop backward compatibility (the last faints of it).

felixhekhorn · 2022-04-04T12:11:03Z

I'm still no big fan of this:

it adds highly non-trivial functionality beyond the scope of the package: the unique scope in my eyes is the production of FK tables
this should instead belong to a CommonDataReader (as also said here: https://github.com/NNPDF/nnpdf/blob/646dd01c2b9e60ebf1ac98cf52b4ed5c6efa6f62/validphys2/src/validphys/pineparser.py#L16) - which most likely is the long-term solution
for the short term I suggest to add a condensed, 5-line version of if: read the file and concatenate the content of operands
- Pros: quick and easy
- Cons: slight code duplication, though only until CDR gets settled
the other unfavoured option is to give up on datasets:
- Pros: no additional dependency what so ever
- Cons: you have to know FK table member names, which seems unreasonable to users (but not to scripts): ATLAS_WP_JET_8TEV_PT-atlas-atlas-wjets-arxiv-1711.03296-xsec002
the least favoured option is to depend on vp:
- Pros: well ...
- Cons: infinite (truly - since I'd like to get rid of non-trivial dependencies)

scarlehoff · 2022-04-04T12:14:43Z

I'm currently reducing the amount of stuff here but

it adds highly non-trivial functionality beyond the scope of the package: the unique scope in my eyes is the production of FK tables

Indeed

this should instead belong to a CommonDataReader

That's what will happen (but for now there is no such thing)

The (my) idea is that only the apfelcomb flags will remain here. I am very interested on this being the case because that way nobody can rely on them in vp (since they are coming from some external package).

felixhekhorn · 2022-04-04T12:19:12Z

The (my) idea is that only the apfelcomb flags will remain here. I am very interested on this being the case because that way nobody can rely on them in vp (since they are coming from some external package).

for sure we can intermediate stuff here, since everything is still under heavy development - but I believe the correct place for this should be NNPDF/pinefarm#3 as I consider this under correct grid generation and not FK generation

scarlehoff · 2022-04-04T12:23:07Z

But this is not only for converted grids, these flags sometimes are necessary because of a mismatch between cfactors / cuts etc and what pineappl grids have.

tbh, it can be anywhere but it needs to be something vp can import and be somewhere the people writing vp cannot touch.

felixhekhorn · 2022-04-04T12:27:57Z

But this is not only for converted grids, these flags sometimes are necessary because of a mismatch between cfactors / cuts etc and what pineappl grids have.

Okay I take your point - still it should belong to pinecardrunner: as we agreed the pinecards can and in general will be NNPDF dependent (hence dependent on cfactors and what ever) so we can even adjust existing pinecards

move the parsing of fktables to pineko

2405364

scarlehoff requested review from felixhekhorn and alecandido March 29, 2022 15:37

swap order

4f5b9d2

alecandido requested changes Mar 30, 2022

View reviewed changes

scarlehoff added 2 commits March 30, 2022 15:11

modify parser to work with the modified yamls [wip]

19853f5

modify column function

2b5b26b

try to ensure that pineko does\nt need to know about vp

ed74226

felixhekhorn marked this pull request as draft July 12, 2022 12:25

felixhekhorn added the refactor Refactor code label Jul 12, 2022

felixhekhorn closed this Dec 14, 2022

felixhekhorn deleted the pineko_does_the_parsing branch December 14, 2022 10:43

Move the parsing of fktables to pineko #17

Move the parsing of fktables to pineko #17

Uh oh!

Conversation

scarlehoff commented Mar 29, 2022

Uh oh!

alecandido commented Mar 30, 2022

Uh oh!

scarlehoff commented Mar 30, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

alecandido commented Mar 30, 2022

Uh oh!

alecandido left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

scarlehoff commented Mar 30, 2022

Uh oh!

scarlehoff commented Mar 30, 2022

Uh oh!

scarlehoff commented Mar 30, 2022

Uh oh!

felixhekhorn commented Mar 30, 2022

Uh oh!

scarlehoff commented Mar 30, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

alecandido commented Mar 30, 2022

Uh oh!

felixhekhorn commented Apr 4, 2022

Uh oh!

scarlehoff commented Apr 4, 2022

Uh oh!

felixhekhorn commented Apr 4, 2022

Uh oh!

scarlehoff commented Apr 4, 2022

Uh oh!

felixhekhorn commented Apr 4, 2022

Uh oh!

Uh oh!

scarlehoff commented Mar 30, 2022 •

edited

Loading

scarlehoff commented Mar 30, 2022 •

edited

Loading