Added documentation regarding model generation. Updated paths in Data_prepare.py (#320)

Co-authored-by: James Minock <[email protected]>
jminock and James Minock authored Nov 26, 2024
1 parent 0b1f89f commit 30a9cb9
Showing 2 changed files with 18 additions and 7 deletions.
21 changes: 16 additions & 5 deletions configfiles/MuonFitter/README.md
@@ -5,7 +5,7 @@
**********************

Date created: 2024-10-02
Date last updated: 2024-11-22
Date last updated: 2024-11-26
The MuonFitter toolchain attempts to fit muons using hit information.

The Tool has two modes. The first mode is pre-reconstruction: it takes input information from the ANNIEEvent and generates a text file containing hit information for the RNN. It is advisable to include minimal tools in this ToolChain, as the same data must be re-analysed with ToolAnalysis later.
@@ -25,12 +25,13 @@ More detailed instructions are below.
To generate (train) a model:
============================

1. Data_prepare.py
1. Run a ToolChain containing the MuonFitter Tool configured with "RecoMode 0". This will generate two files: ev_ai_eta_R{RUN}.txt, where {RUN} is the WCSim run number, and true_track_len.txt. You should not include any Tools further along the ToolChain for this step.

2. RNN_train.py
2. Run Data_prepare.py with the generated ev_ai_eta file and true_track_len file. This will generate the data files used to train the model in the next step (a minimal sketch of this step's inputs is shown below).

NOTE: Data_prepare.py requires input files that do not yet exist on the ANNIE gpvms; as a result the model cannot at present be re-trained.
Previously generated models are stored in /pnfs/annie/persistent/simulations/models/MuonFitter/ which may be used.
3. Run RNN_train.py with the data files generated in the previous step. This will generate model.pth, which is used as the model for data analysis.

Previously generated models and data are stored in /pnfs/annie/persistent/simulations/models/MuonFitter/ and may be used instead of re-training.

Please update any paths in the Tool configuration and Fit_data.py accordingly, or copy the appropriate model files to the configfiles/MuonFitter/RNNFit directory.
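
For orientation, step 2 boils down to reading the two step-1 text files into pandas; this mirrors the read_csv calls in the Data_prepare.py change further down, while the merge at the end is only an illustration of pairing hits with true track lengths, not necessarily how Data_prepare.py itself combines them:

    import pandas as pd

    # Hit-level RNN input from the RecoMode 0 run: event id, track segment (ai), eta
    dataX = pd.read_csv("ev_ai_eta_R0.0.txt", sep=",", header=None, names=["id", "ai", "eta"])
    # True tank track lengths keyed by the same event id
    dataY = pd.read_csv("true_track_len.txt", sep=",", header=None, names=["id", "truetracklen"])

    # Illustrative pairing of hits with their true track length
    paired = dataX.merge(dataY, on="id")
    print(paired.head())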

@@ -44,3 +45,13 @@ To analyse data:
NOTE: the script RNNFit/rnn_fit.sh can act as a helper to process multiple ev_ai_eta_R{RUN}.txt files. Be sure to update the path in the script to point to your ev_ai_eta text files from step 1.

3. Finally, run a ToolChain containing the MuonFitter Tool configured with "RecoMode 1". Please set the paths for ev_ai_eta_R{RUN}.txt and tanktrackfitfile_r{RUN}_RNN.txt in the MuonFitter config file accordingly. You may include any downstream tools you desire for further analysis. See UserTools/MuonFitter/README.md for short descriptions of the information saved to the DataModel and how to access it.
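
To keep the per-run file names straight when looping over several runs (as the NOTE about rnn_fit.sh above suggests), here is a small sketch of how the {RUN} token ties the step-1 input to the fit file the config expects; the two directory variables are placeholders for your own locations:

    import glob
    import os
    import re

    # Placeholders: point these at your ev_ai_eta files and fit outputs
    EV_AI_ETA_DIR = "."
    FIT_DIR = "."

    for ev_file in sorted(glob.glob(os.path.join(EV_AI_ETA_DIR, "ev_ai_eta_R*.txt"))):
        match = re.search(r"ev_ai_eta_R(.+)\.txt$", os.path.basename(ev_file))
        if match is None:
            continue
        run = match.group(1)
        # RecoMode 1 needs this matching pair of files per run in the MuonFitter config
        fit_file = os.path.join(FIT_DIR, f"tanktrackfitfile_r{run}_RNN.txt")
        print(f"run {run}: {ev_file} -> {fit_file}")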

Storage for intermediate files:
===============================

Intermediate files can be stored in the following directories under /pnfs/annie/persistent/simulations/models/MuonFitter/:
truetanktracklength/ - would contain true_track_len.txt
tanktrackfit/ - would contain tanktrackfitfile_r{RUN}_RNN.txt
ev_ai_eta/ - would contain ev_ai_eta_R{RUN}.txt

These directories are empty as of 2024-11-26 and will be filled upon Production-level generation of ntuples using MuonFitter.
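
A short sketch of the full paths this layout implies for one run (the run number here is only illustrative):

    import os

    MODELS_AREA = "/pnfs/annie/persistent/simulations/models/MuonFitter"
    run = "0.0"  # illustrative run number

    true_len_file = os.path.join(MODELS_AREA, "truetanktracklength", "true_track_len.txt")
    fit_file = os.path.join(MODELS_AREA, "tanktrackfit", f"tanktrackfitfile_r{run}_RNN.txt")
    ev_ai_eta_file = os.path.join(MODELS_AREA, "ev_ai_eta", f"ev_ai_eta_R{run}.txt")

    for path in (true_len_file, fit_file, ev_ai_eta_file):
        print(path)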
4 changes: 2 additions & 2 deletions configfiles/MuonFitter/RNNFit/Data_prepare.py
@@ -6,8 +6,8 @@


# # Prepare data into pandas DataFrame
dataX = pd.read_csv("/home/jhe/annie/analysis/Muon_vertex/X.txt",sep=',',header=None,names=['id','ai','eta']) #ai is track segment
dataY = pd.read_csv("/home/jhe/annie/analysis/Muon_vertex/Y.txt",sep=',',header=None,names=['id','truetracklen'])
dataX = pd.read_csv("ev_ai_eta_R0.0.txt",sep=',',header=None,names=['id','ai','eta']) #ai is track segment
dataY = pd.read_csv("true_track_len.txt",sep=',',header=None,names=['id','truetracklen'])
# dataX['combine'] = dataX[['X','Y']].values.tolist()


