Added documentation regarding model generation. Updated paths in Data_prepare.py (#320)

Co-authored-by: James Minock <[email protected]>
jminock and James Minock authored Nov 26, 2024
1 parent 0b1f89f commit 30a9cb9
Showing 2 changed files with 18 additions and 7 deletions.
21 changes: 16 additions & 5 deletions configfiles/MuonFitter/README.md
@@ -5,7 +5,7 @@
**********************

Date created: 2024-10-02
Date last updated: 2024-11-22
Date last updated: 2024-11-26
The MuonFitter toolchain attempts to fit muons using hit information.

The Tool has two modes. The first mode is pre-reconstruction: it takes input information from the ANNIEEvent and generates a text file containing hit information for the RNN. It is advisable to include minimal tools in this ToolChain, as the same data must be re-analysed with ToolAnalysis later.
@@ -25,12 +25,13 @@ More detailed instructions are below.
To generate (train) a model:
============================

1. Data_prepare.py
1. Run a ToolChain containing the MuonFitter Tool configured with "RecoMode 0". This will generate two files: ev_ai_eta_R{RUN}.txt, where {RUN} is the WCSim run number, and true_track_len.txt. You should not include any Tools further along the ToolChain for this step.

2. RNN_train.py
2. Run Data_prepare.py with the generated ev_ai_eta file and true_track_len file. This will generate the data files used to train the model in the next step (a minimal sketch of this step's inputs is shown below).

NOTE: Data_prepare.py requires input files that do not yet exist on the ANNIE gpvms; as a result the model cannot at present be re-trained.
Previously generated models are stored in /pnfs/annie/persistent/simulations/models/MuonFitter/ which may be used.
3. Run RNN_train.py with the data files generated in the previous step. This will generate model.pth, which is used as the model for data analysis.

Previously generated models and data are stored in /pnfs/annie/persistent/simulations/models/MuonFitter/ and may be used instead of re-training.

Please update any paths in the Tool configuration and Fit_data.py accordingly, or copy the appropriate model files to the configfiles/MuonFitter/RNNFit directory.
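
For orientation, step 2 boils down to reading the two step-1 text files into pandas; this mirrors the read_csv calls in the Data_prepare.py change further down, while the merge at the end is only an illustration of pairing hits with true track lengths, not necessarily how Data_prepare.py itself combines them:

    import pandas as pd

    # Hit-level RNN input from the RecoMode 0 run: event id, track segment (ai), eta
    dataX = pd.read_csv("ev_ai_eta_R0.0.txt", sep=",", header=None, names=["id", "ai", "eta"])
    # True tank track lengths keyed by the same event id
    dataY = pd.read_csv("true_track_len.txt", sep=",", header=None, names=["id", "truetracklen"])

    # Illustrative pairing of hits with their true track length
    paired = dataX.merge(dataY, on="id")
    print(paired.head())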

@@ -44,3 +45,13 @@ To analyse data:
NOTE: the script RNNFit/rnn_fit.sh can act as a helper to process multiple ev_ai_eta_R{RUN}.txt files. Be sure to update the path in the script to point to your ev_ai_eta text files from step 1.

3. Finally, run a ToolChain containing the MuonFitter Tool configured with "RecoMode 1". Please set the paths for ev_ai_eta_R{RUN}.txt and tanktrackfitfile_r{RUN}_RNN.txt in the MuonFitter config file accordingly. You may include any downstream tools you desire for further analysis. See UserTools/MuonFitter/README.md for short descriptions of the information saved to the DataModel and how to access it.
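
To keep the per-run file names straight when looping over several runs (as the NOTE about rnn_fit.sh above suggests), here is a small sketch of how the {RUN} token ties the step-1 input to the fit file the config expects; the two directory variables are placeholders for your own locations:

    import glob
    import os
    import re

    # Placeholders: point these at your ev_ai_eta files and fit outputs
    EV_AI_ETA_DIR = "."
    FIT_DIR = "."

    for ev_file in sorted(glob.glob(os.path.join(EV_AI_ETA_DIR, "ev_ai_eta_R*.txt"))):
        match = re.search(r"ev_ai_eta_R(.+)\.txt$", os.path.basename(ev_file))
        if match is None:
            continue
        run = match.group(1)
        # RecoMode 1 needs this matching pair of files per run in the MuonFitter config
        fit_file = os.path.join(FIT_DIR, f"tanktrackfitfile_r{run}_RNN.txt")
        print(f"run {run}: {ev_file} -> {fit_file}")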

Storage for intermediate files:
===============================

Intermediate files can be stored in the following directories under /pnfs/annie/persistent/simulations/models/MuonFitter/:
truetanktracklength/ - would contain true_track_len.txt
tanktrackfit/ - would contain tanktrackfitfile_r{RUN}_RNN.txt
ev_ai_eta/ - would contain ev_ai_eta_R{RUN}.txt

These directories are empty as of 2024-11-26 and will be filled upon Production-level generation of ntuples using MuonFitter.
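
A short sketch of the full paths this layout implies for one run (the run number here is only illustrative):

    import os

    MODELS_AREA = "/pnfs/annie/persistent/simulations/models/MuonFitter"
    run = "0.0"  # illustrative run number

    true_len_file = os.path.join(MODELS_AREA, "truetanktracklength", "true_track_len.txt")
    fit_file = os.path.join(MODELS_AREA, "tanktrackfit", f"tanktrackfitfile_r{run}_RNN.txt")
    ev_ai_eta_file = os.path.join(MODELS_AREA, "ev_ai_eta", f"ev_ai_eta_R{run}.txt")

    for path in (true_len_file, fit_file, ev_ai_eta_file):
        print(path)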
4 changes: 2 additions & 2 deletions configfiles/MuonFitter/RNNFit/Data_prepare.py
@@ -6,8 +6,8 @@


# # Prepare data into pandas DataFrame
dataX = pd.read_csv("/home/jhe/annie/analysis/Muon_vertex/X.txt",sep=',',header=None,names=['id','ai','eta']) #ai is track segment
dataY = pd.read_csv("/home/jhe/annie/analysis/Muon_vertex/Y.txt",sep=',',header=None,names=['id','truetracklen'])
dataX = pd.read_csv("ev_ai_eta_R0.0.txt",sep=',',header=None,names=['id','ai','eta']) #ai is track segment
dataY = pd.read_csv("true_track_len.txt",sep=',',header=None,names=['id','truetracklen'])
# dataX['combine'] = dataX[['X','Y']].values.tolist()


