Commit d982317

updating lessons
1 parent 744f087 commit d982317

6 files changed

+906
-446
lines changed

_episodes/01-introduction.md

+221-43
@@ -1,48 +1,59 @@
11
---
2-
title: "1 - Introduction"
3-
teaching: 15
4-
exercises: 30
2+
title: "1 - Matrix Element Generator"
3+
teaching: 20
4+
exercises: 40
55
questions:
66
- "What are Monte Carlo Generators?"
77
- "Why are we using simulated samples in CMS?"
88
- "How are simulated samples created in CMS?"
99
objectives:
1010
- "Use the MadGraph generator in standalone mode and get familiar with the basic syntax"
11-
- "Analyse the produced LHE files"
11+
- "Analyze the produced LHE files"
1212
keypoints:
1313
- "MadGraph is a widely used tool to generate matrix-element predictions for the hard scatter for SM and BSM processes."
14-
- "MadGraph can be used interactively, or steered using text-based cards"
15-
- "Gridpacks are used for large scale productions"
16-
- "MadAnalysis is a tool that allows for quick checks of kinematic distributions"
14+
- "Standalone MadGraph can be run interactively or steered with predefined text scripts"
15+
- "Gridpacks are used for large-scale productions, where they guarantee consistency"
16+
- "LHE-level information is not physical; a parton shower is needed to describe the full physics"
1717
---
1818

1919
# Introduction and first steps
2020

21-
Samples of simulated events originating from certain processes are essential in high energy physics.
22-
They are used for studies of physics objects, background predictions or signal efficiency and acceptance determinations.
23-
Processes at vastly different energy regimes are involved, from the hard scattering process to hadronization and parton showering.
24-
Luckily, these different processes factorize which allows us to separate the treatment of processes happening at different momentum transfer scales.
25-
26-
The hard scatter of the incoming partons happens at the highest involved scale, and can be treated perturbatively.
27-
Soft processes that finally lead to the formation of the observed final state hadrons cannot yet be calculated from first principles and therefore need to be modeled.
28-
Secondary interactions of other constituent partons of the colliding hadrons are called underlying event.
29-
Although the hard and soft process are distinct, they are connected by an evolutionary Markov process that leads to parton showering.
30-
The partons produced in this process eventually participate in the hadron formation (hadronization) where color singlet states are formed.
31-
Monte Carlo techniques can be used for simulating the Markov process, efficient integration of the high dimensional hard scatter problem, and the hadronization models.
32-
33-
34-
## Using Madgraph to simulate the hard scatter process
35-
36-
In the first part of the exercise, we will use the matrix element generator MadGraph5 _aMC@NLO, or in short MG5.
37-
MG5 can perform automatic matrix element predictions for many processes at leading and next-to-leading order accuracy in QCD.
38-
Because of its ease of use for processes both in and beyond the standard model, it is one of the most widely used software tools to model the hard interaction.
39-
40-
We will first use the interactive prompt of MG5 to generate proton proton collision events that produce W bosons.
21+
Although quite old, [this review](https://arxiv.org/pdf/1304.6677.pdf) is great reading material for a general overview of Monte Carlo event generators.
22+
Monte Carlo event generators are essential components of almost all experimental analyses and are also widely used by theorists and experimentalists to make predictions and prepare for future experiments.
23+
It is one of the topics where CMS experimentalists and theorists work most closely together: theorists provide predictions and experimentalists test them against the actual data.
24+
Although Monte Carlo event generators are extremely important tools in HEP, they are often used as black boxes whose output we more or less treat as "data".
25+
Our aim is to get a minimal understanding of how these tools work and to analyze their output using generator-level information.
26+
27+
Samples used by the CMS experiment go through several simulation steps:
28+
1. Monte Carlo event generator
29+
2. Detector simulation
30+
3. Pileup mixing
31+
4. Trigger emulation
32+
5. Object reconstruction
33+
34+
We focus on "1. Monte Carlo event generator" in this tutorial.
35+
The Monte Carlo event generator step can be further divided into several pieces, since each piece factorizes and can be handled by a separate calculation:
36+
1. Parton distribution function (PDF)
37+
2. Hard scattering (matrix element calculation)
38+
3. Parton shower & hadronization
39+
First of all, the LHC is a proton-proton collider, so we need to know how partons (quarks and gluons) are distributed inside the proton (PDF).
40+
Hard scattering is the interaction of the incoming partons with the largest momentum transfer (usually the physics process we are interested in); it is the part that can be treated perturbatively.
41+
Parton shower & hadronization then describe how the particles involved in the hard scattering evolve down to lower momentum scales, eventually reaching the point where perturbative calculations break down.
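
The factorization described above is commonly summarized by the collinear factorization formula for a hadronic cross section. The notation here (parton flavors $a, b$, momentum fractions $x_1, x_2$, factorization and renormalization scales $\mu_F, \mu_R$, squared collider energy $s$) is the usual textbook one and is not taken from this lesson:

$$
\sigma_{pp \to X} = \sum_{a,b} \int_0^1 dx_1 \int_0^1 dx_2 \; f_a(x_1, \mu_F^2) \, f_b(x_2, \mu_F^2) \; \hat{\sigma}_{ab \to X}(x_1 x_2 s; \mu_F^2, \mu_R^2)
$$

Here $f_a$ and $f_b$ are the PDFs (step 1) and $\hat{\sigma}_{ab \to X}$ is the partonic cross section obtained from the matrix element (step 2). The parton shower and hadronization (step 3) then act on the partons in the final state $X$.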
42+
43+
44+
## Using Standalone Madgraph
45+
46+
In the first part of the exercise, we will use the matrix element generator MadGraph5_aMC@NLO, or MadGraph for short ([link](https://launchpad.net/mg5amcnlo)).
47+
MadGraph can perform the calculations for many different physics processes (both SM and BSM) at leading and next-to-leading order (LO & NLO) in QCD.
48+
Because of its easy user interface and its flexibility with UFO models, you can test a wide variety of physics models.
49+
We will first see how MadGraph runs interactively in standalone mode, using a simple `W+` (wplus) process as an example.
50+
51+
We will first use the interactive prompt of MadGraph to generate proton proton collision events that produce W bosons.
4152
First, log in to a new session on the LPC cluster (`ssh -Y <USERNAME>@cmslpc-el8.fnal.gov`).
4253
Make sure you have completed the <a href="../setup.html">setup</a> steps!
4354
Then, start the interactive prompt of Madgraph:
4455
~~~bash
45-
cd ~/nobackup/cmsdas_2025_gen/MG5_aMC_v2_6_5/
56+
cd ~/nobackup/cmsdas_2025_gen/MG5_aMC_v3_5_2/
4657
./bin/mg5_aMC
4758
~~~
4859
{: .source}
@@ -87,7 +98,7 @@ You can also use the `ps2pdf` program to convert the post script files into PDFs
8798

8899
Alternatively, remove `-nojpeg` from the output line and look at the diagrams in jpeg format using `display`.
89100

90-
Now that MG has figured out the feynman diagrams you can start the actual computation within the MG5 prompt with
101+
Now that MadGraph has figured out the Feynman diagrams, you can start the actual computation within the MG5 prompt with
91102
~~~bash
92103
launch
93104
~~~
@@ -99,9 +110,143 @@ Hint: if you closed the interactive MG session for some reason you can still lau
99110
launch wplustest_4f_LO
100111
~~~
101112
{: .source}
102-
MG will ask you a few more questions. The first one you can just skip by pressing \<RETURN\>.
103-
Once asked about the `run_card`, one can either use a default run card by just inserting `2` and hitting \<RETURN\> to edit the default `run_card` by hand, or provide a path to a run card of one's choice.
104-
Please provide the path to the pre-made run_card: `wplustest_4f_LO_run_card.dat`
113+
MadGraph will ask you a few more questions. Press `tab` to turn off the timer (otherwise, MadGraph will move on by itself after 60 seconds).
114+
~~~
115+
/===========================================================================\
116+
| 1. Choose the shower/hadronization program shower = Not Avail. |
117+
| 2. Choose the detector simulation program detector = Not Avail. |
118+
| 3. Choose an analysis package (plot/convert) analysis = Not Avail. |
119+
| 4. Decay onshell particles madspin = OFF |
120+
| 5. Add weights to events for new hypp. reweight = Not Avail. |
121+
\===========================================================================/
122+
~~~
123+
{: .output}
124+
The first one you can just skip by pressing \<RETURN\>. Since we did not install any `shower`, `detector`, or `analysis` package, these options are shown as `Not Avail.`.
125+
126+
~~~
127+
Do you want to edit a card (press enter to bypass editing)?
128+
/------------------------------------------------------------\
129+
| 1. param : param_card.dat |
130+
| 2. run : run_card.dat |
131+
\------------------------------------------------------------/
132+
you can also
133+
- enter the path to a valid card or banner.
134+
- use the 'set' command to modify a parameter directly.
135+
The set option works only for param_card and run_card.
136+
Type 'help set' for more information on this command.
137+
- call an external program (ASperGE/MadWidth/...).
138+
Type 'help' for the list of available command
139+
[0, done, 1, param, 2, run, enter path][90s to answer]
140+
~~~
141+
{: .output}
142+
143+
144+
Let's take a look at the `param_card` and see how the values are set: press `1` and `ENTER` (\<RETURN\>) to inspect the parameter settings.
145+
~~~
146+
###################################
147+
## INFORMATION FOR MASS
148+
###################################
149+
Block mass
150+
5 4.700000e+00 # MB
151+
6 1.730000e+02 # MT
152+
15 1.777000e+00 # MTA
153+
23 9.118800e+01 # MZ
154+
25 1.250000e+02 # MH
155+
156+
...
157+
158+
###################################
159+
## INFORMATION FOR DECAY
160+
###################################
161+
DECAY 6 1.491500e+00 # WT
162+
DECAY 23 2.441404e+00 # WZ
163+
DECAY 24 2.047600e+00 # WW
164+
DECAY 25 6.382339e-03 # WH
165+
~~~
166+
{: .output}
167+
168+
Let's take a look at the `run_card` and see how the values are set: press `2` and `ENTER` (\<RETURN\>) to inspect the run settings.
169+
~~~
170+
#*********************************************************************
171+
# Number of events and rnd seed *
172+
# Warning: Do not generate more than 1M events in a single run *
173+
#*********************************************************************
174+
10000 = nevents ! Number of unweighted events requested
175+
0 = iseed ! rnd seed (0=assigned automatically=default))
176+
177+
...
178+
179+
#*********************************************************************
180+
# Collider type and energy *
181+
# lpp: 0=No PDF, 1=proton, -1=antiproton, *
182+
# 2=elastic photon of proton/ion beam *
183+
# +/-3=PDF of electron/positron beam *
184+
# +/-4=PDF of muon/antimuon beam *
185+
#*********************************************************************
186+
1 = lpp1 ! beam 1 type
187+
1 = lpp2 ! beam 2 type
188+
6500.0 = ebeam1 ! beam 1 total energy in GeV
189+
6500.0 = ebeam2 ! beam 2 total energy in GeV
190+
191+
...
192+
193+
#*********************************************************************
194+
# Standard Cuts *
195+
#*********************************************************************
196+
# Minimum and maximum pt's (for max, -1 means no cut) *
197+
#*********************************************************************
198+
10.0 = ptl ! minimum pt for the charged leptons
199+
-1.0 = ptlmax ! maximum pt for the charged leptons
200+
{} = pt_min_pdg ! pt cut for other particles (use pdg code). Applied on particle and anti-particle
201+
{} = pt_max_pdg ! pt cut for other particles (syntax e.g. {6: 100, 25: 50})
202+
203+
...
204+
205+
#*********************************************************************
206+
# Minimum and maximum invariant mass for pairs *
207+
#*********************************************************************
208+
0.0 = mmll ! min invariant mass of l+l- (same flavour) lepton pair
209+
-1.0 = mmllmax ! max invariant mass of l+l- (same flavour) lepton pair
210+
{} = mxx_min_pdg ! min invariant mass of a pair of particles X/X~ (e.g. {6:250})
211+
{'default': False} = mxx_only_part_antipart ! if True the invariant mass is applied only
212+
! to pairs of particle/antiparticle and not to pairs of the same pdg codes.
213+
214+
...
215+
216+
#*********************************************************************
217+
# maximal pdg code for quark to be considered as a light jet *
218+
# (otherwise b cuts are applied) *
219+
#*********************************************************************
220+
4 = maxjetflavor ! Maximum jet pdg code
221+
~~~
222+
{: .output}
223+
224+
Try editing the beam energies (`ebeam1` and `ebeam2`) from `6500` to `6800`, since the LHC now runs at a center-of-mass energy of 13.6 TeV (6.8 TeV per beam).
225+
When you are done editing, save the changes and exit the text editor.
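
The run card excerpt shown above uses a `value = name ! comment` layout, with `#` starting a comment line. As a minimal sketch, assuming that layout holds for every setting, the card can be read into a dictionary with a few lines of Python. The helper name `parse_run_card` is hypothetical and not part of MadGraph. For two head-on beams, and neglecting the proton mass, the center-of-mass energy is simply the sum of the two beam energies.

```python
def parse_run_card(text):
    """Read 'value = name ! comment' lines into a {name: value} dict of strings.

    Hypothetical helper for illustration only; it is not part of MadGraph.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment banners
        body = line.split("!", 1)[0]  # drop the trailing comment
        if "=" not in body:
            continue  # e.g. a comment continuation line starting with '!'
        value, name = body.rsplit("=", 1)
        settings[name.strip()] = value.strip()
    return settings


# Excerpt copied from the default run card shown in this lesson.
excerpt = """\
#*********************************************************************
# Number of events and rnd seed                                      *
#*********************************************************************
  10000 = nevents ! Number of unweighted events requested
     1        = lpp1    ! beam 1 type
     6500.0     = ebeam1  ! beam 1 total energy in GeV
     6500.0     = ebeam2  ! beam 2 total energy in GeV
 10.0  = ptl       ! minimum pt for the charged leptons
   {'default': False} = mxx_only_part_antipart ! if True the invariant mass is applied only
                       ! to pairs of particle/antiparticle and not to pairs of the same pdg codes.
"""

settings = parse_run_card(excerpt)

# Center-of-mass energy in GeV for head-on beams (proton mass neglected).
sqrt_s = float(settings["ebeam1"]) + float(settings["ebeam2"])
print("nevents =", settings["nevents"])
print("sqrt(s) =", sqrt_s, "GeV")
```

Running the same helper on the card after your edit is a quick way to confirm that both beam energies were actually changed.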
226+
227+
MadGraph also lets you change settings directly from the prompt, for example:
228+
~~~
229+
set run_card nevents 5000
230+
~~~
231+
{: .output}
232+
233+
Take a look at the run card again and check that the number of events to generate (`nevents`) has changed to `5000`.
234+
Then change it back to `10000` using the same command and check again.
235+
236+
As shown above, there are several phase space cuts set by default (e.g. `10.0 = ptl`).
237+
There is a handy command that removes all phase space cuts at once (instead of doing `set run_card ptl 0`, `set run_card ptj 0`, ... one by one by hand).
238+
~~~
239+
set no_parton_cut
240+
~~~
241+
{: .output}
242+
243+
Take a look at the card again and check that the lepton pt cut (`ptl`) has changed to `0`.
244+
Keep in mind that the cuts you give before doing `set no_parton_cut` will be removed by this command.
245+
So run `set no_parton_cut` first, and only then set the cuts you actually want.
246+
247+
248+
Once you are done, please provide the path to the pre-made run_card: `wplustest_4f_LO_run_card.dat`
249+
105250

106251
What is the cross section determined by Madgraph?
107252

@@ -119,7 +264,7 @@ What is the cross section determined by Madgraph?
119264
> > ~~~
120265
> > === Results Summary for run: run_01 tag: tag_1 ===
121266
> >
122-
> > Cross-section : 2.752e+04 +- 36.14 pb
267+
> > Cross-section : 2.715e+04 +- 39.45 pb
123268
> > Nb of events : 10000
124269
> >
125270
> > INFO: No version of lhapdf. Can not run systematics computation
@@ -132,7 +277,7 @@ What is the cross section determined by Madgraph?
132277
> > INFO: Done
133278
> > ~~~
134279
> > {: .output}
135-
> > The cross section calculated by MG is `2.752e+04 +- 36.14 pb`.
280+
> > The cross section calculated by MG is `2.715e+04 +- 39.45 pb`.
136281
> {: .solution}
137282
{: .challenge}
138283
@@ -185,6 +330,18 @@ The LHE file is plain text, so it's usually a good idea to use some compression
185330
> {: .solution}
186331
{: .challenge}
187332
333+
> ## What does each column mean?
334+
>
335+
> > ## Solution
336+
> > `ID`, `status`, `mother1`, `mother2`, `color`, `anticolor`, `px`, `py`, `pz`, `E`, `mass`, `lifetime`, and `spin`
337+
> > ~~~output
338+
> > -11 1 3 3 0 0 -2.3393803385e+01 -7.4187481776e+00 -1.5274153214e+02 1.5470062541e+02 0.0000000000e+00 0.0000e+00 1.0000e+00
339+
> > ~~~
340+
> > {: .output}
341+
> > This line tells you that a positron (`ID` = -11) is an outgoing particle (`status` = 1) whose mother is the 3rd particle in the event (`mother1` = `mother2` = 3), which in this W+ sample is the W+ boson (`ID=24`), and that it carries no color (`color` and `anticolor` are both 0), ...
342+
> {: .solution}
343+
{: .challenge}
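
The column meanings listed in the solution above can be made concrete with a short script. This is a minimal sketch, assuming the 13-column particle line layout given there. The helper name `parse_lhe_particle` is hypothetical and unrelated to the `LHEReader.py` script used later. It parses the positron line quoted above, computes its transverse momentum, and checks that $E^2 - |\vec{p}|^2$ is compatible with the zero mass in the `mass` column.

```python
import math

# Column names follow the list given in the solution above.
LHE_COLUMNS = ["ID", "status", "mother1", "mother2", "color", "anticolor",
               "px", "py", "pz", "E", "mass", "lifetime", "spin"]
INTEGER_COLUMNS = set(LHE_COLUMNS[:6])  # the first six columns are integers


def parse_lhe_particle(line):
    """Turn one particle line of an LHE event into a dict keyed by column name."""
    values = line.split()
    if len(values) != len(LHE_COLUMNS):
        raise ValueError(f"expected {len(LHE_COLUMNS)} columns, got {len(values)}")
    particle = {}
    for name, value in zip(LHE_COLUMNS, values):
        particle[name] = int(value) if name in INTEGER_COLUMNS else float(value)
    return particle


# The positron line quoted in the solution above.
line = ("-11 1 3 3 0 0 -2.3393803385e+01 -7.4187481776e+00 "
        "-1.5274153214e+02 1.5470062541e+02 0.0000000000e+00 0.0000e+00 1.0000e+00")
positron = parse_lhe_particle(line)

# Transverse momentum and invariant mass squared from the four-momentum.
pt = math.hypot(positron["px"], positron["py"])
m2 = positron["E"] ** 2 - (positron["px"] ** 2 + positron["py"] ** 2 + positron["pz"] ** 2)

print(f"ID={positron['ID']} status={positron['status']} mother={positron['mother1']}")
print(f"pT = {pt:.2f} GeV, E^2 - |p|^2 = {m2:.2e} GeV^2")
```

The same function can be applied to every particle line inside an `<event>` block of `unweighted_events.lhe` once the header line of the event has been skipped.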
344+
188345
> ## MadGraph syntax
189346
> If you want to add another process, e.g. production of W- in the above example, you can add another process with `add process p p > w-, w- > ell- vl~`
190347
>
@@ -228,11 +385,11 @@ The LHE file is plain text, so it's usually a good idea to use some compression
228385
> > ~~~
229386
> > === Results Summary for run: run_01 tag: tag_1 ===
230387
> >
231-
> > Cross-section : 4.732e+04 +- 57.08 pb
388+
> > Cross-section : 4.667e+04 +- 63.91 pb
232389
> > Nb of events : 10000
233390
> > ~~~
234391
> > {: .output}
235-
> > The cross section calculated by MG is `4.732e+04 +- 57.08 pb`.
392+
> > The cross section calculated by MadGraph is `4.667e+04 +- 63.91 pb`.
236393
> > While one would naively expect the cross section to double by including W- bosons, we only get a cross section that is ~70% larger.
237394
> > The simplified explanation is that the initial state protons contain more up valence quarks than down valence quarks.
238395
> {: .solution}
@@ -267,7 +424,7 @@ We will be generating a gridpack with cards similar to the commands we've used i
267424
The cards are located in the MG section of the genproductions directory
268425
269426
~~~bash
270-
cd ~/nobackup/cmsdas_2025_gen/genproductions_mg265/bin/MadGraph5_aMCatNLO
427+
cd ~/nobackup/cmsdas_2025_gen/genproductions_mg352/bin/MadGraph5_aMCatNLO
271428
time ./gridpack_generation.sh wplustest_4f_LO cards/examples/wplustest_4f_LO local
272429
~~~
273430
{: .source}
@@ -277,13 +434,13 @@ time ./gridpack_generation.sh wplustest_4f_LO cards/examples/wplustest_4f_LO loc
277434
> For a given process name $NAME, the input cards must be named as $NAME_run_card.dat, $NAME_proc_card.dat, etc...
278435
{: .callout}
279436

280-
The cards for Run2 UL samples that were generated with MG can be found in `bin/MadGraph5_aMCatNLO/cards/production/2017/13TeV/` of the [genproductions repo](https://github.com/cms-sw/genproductions/tree/master/bin/MadGraph5_aMCatNLO/cards/production/2017/13TeV).
437+
<!-- The cards for Run2 UL samples that were generated with MG can be found in `bin/MadGraph5_aMCatNLO/cards/production/2017/13TeV/` of the [genproductions repo](https://github.com/cms-sw/genproductions/tree/master/bin/MadGraph5_aMCatNLO/cards/production/2017/13TeV). -->
281438

282439

283440
~~~bash
284441
mkdir work
285442
cd work
286-
tar xf ../wplustest_4f_LO_slc7_amd64_gcc700_CMSSW_10_6_19_tarball.tar.xz
443+
tar xf ../wplustest_4f_LO_el8_amd64_gcc10_CMSSW_12_4_8_tarball.tar.xz
287444

288445
NEVENTS=10000
289446
RANDOMSEED=12345
@@ -297,14 +454,35 @@ This will produce an output LHE file called `cmsgrid_final.lhe`.
297454
## Comparing the LHE output
298455

299456
There are multiple ways of analyzing an LHE file, each of which has its own advantages and disadvantages.
300-
For the purpose of this exercise, we will use the most straightforward tool: MadAnalysis (MA).
457+
For the purpose of this exercise, we will use a pre-made pyroot script.
458+
~~~bash
459+
cd ~/nobackup/cmsdas_2025_gen/
460+
cp /eos/uscms/store/user/cmsdas/2025/short_exercises/generators/LHEReader.py .
461+
~~~
462+
{: .source}
463+
464+
#### Convert LHE output to root format
465+
466+
> ## Make sure the Python virtual environment is deactivated
467+
> If you are still using the virtual environment, you need to unset it in order to not interfere with the CMSSW environment.
468+
> {: .source}
469+
{: .callout}
470+
~~~bash
471+
cd CMSSW_12_4_8/src; cmsenv; cd -
472+
python3.9 LHEReader.py --input MG5_aMC_v3_5_2/wplustest_4f_LO/Events/run_01/unweighted_events.lhe --output standalone.root
473+
python3.9 LHEReader.py --input genproductions_mg352/bin/MadGraph5_aMCatNLO/work/cmsgrid_final.lhe --output cmsgrid.root
474+
~~~
475+
{: .source}
476+
477+
Feel free to experiment here and plot various quantities. What are the shapes of the lepton pT distributions? What is the shape of the pT distribution of the W system? Are these shapes physical?
478+
479+
<!-- The most straightforward tool: MadAnalysis (MA).
301480
MA is a tool designed to be used by theorists to analyze parton-level LHE files, particle-level HEPMC files or even events with DELPHES detector simulation.
302481
We can install MA directly from the MG5 console.
303482
304483
~~~bash
305-
cd $CDGPATH/MG5_aMC_v2_6_5/
484+
cd ~/nobackup/cmsdas_2025_gen/MG5_aMC_v3_5_2/
306485
./bin/mg5_aMC
307-
308486
~~~
309487
{: .source}
310488
@@ -356,7 +534,7 @@ The analysis output can be viewed as HTML. Since there is no web browser on cmsl
356534
firefox ANALYSIS_0/Output/HTML/MadAnalysis5job_0/index.html
357535
358536
What observations do you make? Are the two datasets consistent? What are the shapes of the lepton pT distributions? What is the shape of the pT distribution of the W system? Are these shapes physical?
359-
Feel free to experiment here and plot other quantities you find interesting. You can refer to the [user manual](https://arxiv.org/pdf/1206.1599.pdf) to learn about the syntax.
537+
Feel free to experiment here and plot other quantities you find interesting. You can refer to the [user manual](https://arxiv.org/pdf/1206.1599.pdf) to learn about the syntax. -->
360538

361539
{% include links.md %}
362540
