Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vertex Reconstruction Update #288

Open
wants to merge 31 commits into
base: Application
Choose a base branch
from
Open

Conversation

flemmons
Copy link
Contributor

@flemmons flemmons commented Oct 2, 2024

Update to muon vertex reconstruction, to bring up to current work.
-Added current/updated toolchain configfiles PhaseIIRecoDir (PhaseIIRecoFirinne is a working version that comes with no guarantee), and PhaseIIRecoTruthDOE directories, due to some testing for comparison with old results; these are disorganized and can be rejected from the merge without issue; I'll fix them up for actual use for the next update.
-Updated DigitBuilder and EventSelector with check for LAPPD Multiplicity for vertex reconstruction purposes
-Updated DigitBuilder for reduction of LAPPD hits into strip-hits. Partial functionality; is a work in progress.
-Added DirectionGridCheck debug tool, which makes simple plots comparing various track-direction seeding methods
-Updated LikelihoodFitterCheck tool to make plots for multiple events in a single run
-Updated VertexGeometryCheck tool to show individual LAPPD hits, for purpose of debugging or analyzing in-depth effects of strip-hit reduction

flemmons and others added 30 commits March 8, 2023 07:48
Removed redundant GenerateFineGrid function in VtxSeedGenerator
Removed experimental penalty term calculation in FoMCalculator
…r, and handling in VtxExtendedVertexFinder and MinuitOptimizer. This change is experimental, but should only trigger with appropriate config variables.
…arge.root, which is the file containing pdf plot for use within new charge fitter.
… of the old DOE MC files;

(2) Updated EventSelectorDoE to avoid segmentation fault;
(3) Updated the  config files under PhaseIIRecoTruthDoE to run the reconstruction for the old DoE files
(2) Fixed a bug in Digitbuilder to avoid empty digit times
Added mulit-event functionality to LikelihoodFitterCheck
if (not has_LAPPDMult) {
Log("EventSelector Tool: No count of hit LAPPDs; proceeding without LAPPD multiplicity cut.", v_warning, verbosity);
fLAPPDMultMin = 0;
fHitLAPPDs = new std::vector<int>;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This Tool doesn't seem to call m_data->Stores.at("RecoEvent")->Set("HitLAPPDs",XXX), which means if HitLAPPDs doesn't exist for the first Execute call, it probably won't exist for any of them, and you'll be creating a new std::vector<int> on every loop, leaking them as you go.
Is there any reason this vector even needs to be on the heap? At the least please only new the vector if it doesn't exist.

m_data->Stores.at("RecoEvent")->Set("RecoPDGVector",cluster_reco_pdg);
m_data->Stores.at("RecoEvent")->Set("PDG",fRecoPDG);
bool passRecoPDGCut = 0;
std::cout << "fRecoPDG: " << fRecoPDG << endl;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::cout usage...

m_data->Stores.at("RecoEvent")->Set("PDG", fRecoPDG);
}

bool passExtendedCut = this->EventSelectionByTriggerExtended();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpicking but this keeps coming up: there are only a very small number of obscure cases involving derived class pointers being dynamically cast to base class pointers where this-> is required. Just call the function:
passExtendedCut = EventSelectionByTriggerExtended();

@@ -691,7 +693,7 @@ std::vector<RecoCluster*>* HitCleaner::RecoClusters(std::vector<RecoDigit*>* myD
}
}

//std::cout <<"fClusterList->size() = "<<fClusterList->size()<<std::endl;
std::cout <<"Number of clusters = "<<fClusterList->size()<<std::endl;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cout

if (!pdftest) {
Log("LikelihoodFitterCheck Tool: Error retrieving pdffile; running without!", v_error, verbosity);
fUsePDFFile = 0;
return false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this return false? The printout suggests that it will continue without, but it doesn't. At least, not until the next Execute call.

seedY = trueVtxY - 50*dy + j*dy;
seedZ = trueVtxZ - 50*dz + j*dz;
int nbins = 400;
double dlpara[400], dlfom[400];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there any reason to accumulate these in an array? It seems like dlfom[j] is set to fom, and then the only time it's used is in the next line to access dlfom[j], where fom is still in scope. Just use fom? Same for dlpara.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same remark for dltrans further down

Log("VtxExtendedVertexFinder: pdffile does not exist", v_error, verbosity);
return false;
}
pdf = *(TH1D*)f1.Get("zenith");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be nice to check if zenith exists, otherwise this will segfault.

} // else if we're splitting subtriggers, loop from [current MCTriggernum] to [current MCTriggernum+1]

if (!splitSubtriggers) MCTriggernum = 0;
int MaxEventNr = (splitSubtriggers)? (MCTriggernum + 1) : (WCSimEntry->wcsimrootevent->GetNumberOfEvents());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason to have replaced this code with a version that removes the explanatory comments?

@@ -631,18 +636,24 @@ bool LoadWCSim::Execute(){
} else {
// instead take the true time of the first photon
std::vector<int> photonids = digihit->GetPhotonIds(); // indices of the digit's photons
std::cout << "LoadWCSim Tool: number of photons in this hit: " << photonids.size() << endl;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::cout usage

digittime = earliestphotontruetime;
}
else {

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...? Is there supposed to be some alternative here? Otherwise if firstHit is false, we just leave the digit times uninitialised. 🤨

// so downstream tools know it's a new event
int reported_triggernum = splitSubtriggers ? MCTriggernum-1 : 0;
m_data->Stores.at("ANNIEEvent")->Set("MCTriggernum",reported_triggernum);
m_data->Stores.at("ANNIEEvent")->Set("MCTriggernum",MCTriggernum-1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

has this been validated? Why was this change made?

@@ -47,7 +47,7 @@ public :
WCSimRootOptions *wcsimrootopts=nullptr;

// TODO implement a verbosity arg
int verbose=1;
int verbose=5;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's no point changing this as both constructors will set the verbosity, so this value will always be overwritten.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's also bad practice to set default verbosities so high. Generally they should set to only print errors and warnings.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feels like none of the changes in this file should be committed....

Log("MeanTimeCheck Tool: Executing", v_debug, verbosity);
auto* reco_event = m_data->Stores["RecoEvent"];
if (!reco_event) {
Log("Error: The MeanTimeCheck tool could not find the ANNIEEvent Store", 0, verbosity);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again name of store doesn't tie up with the error message, and this isn't a good way to do this. Same comments as DirectionGridCheck.

double timeMin = deltaTime1.at(int((n - 1)*0.05)); // 5% of the total entries
double timeMax = deltaTime1.at(int((n - 1)*0.90)); // 90% of the total entries
int nbins = int(n / 5);
TH1D *hDeltaTime = new TH1D("hDeltaTime", "hDeltaTime", nbins, timeMin, timeMax);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does this need to be on the heap? heap allocation is slow and risks memory leaks, this could just as easily be declared on the stack with no such issues.

return true;
}

bool VertexGeometryCheck::Finalise(){
fOutput_tfile->cd();
fOutput_tfile->Write();
if (StripTimePlot > 0) StripHits1->Write("lappdtimegradient");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this name doesn't seem consistent with the usage in the above loop. Is this correct?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general we don't add binary files to source code repositories. Please instead make the path to this file a configuration variable if it isn't already, and put a copy of this file somewhere appropriate on /pnfs/annie/persistent, with the default value of the configured path directed there.

bool VtxSeedFineGrid::Execute(){
Log("VtxSeedFineGrid Tool: Executing", v_debug, verbosity);

auto* annie_event = m_data->Stores["RecoEvent"];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment as made before regarding check.

bool EventCutstatus = false;
auto get_evtstatus = m_data->Stores.at("RecoEvent")->Get("EventCutStatus", EventCutstatus);
if (!get_evtstatus) {
Log("Error: The PhaseITreeMaker tool could not find the Event selection status", v_error, verbosity);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

incorrect Tool name

auto get_flags = m_data->Stores.at("RecoEvent")->Get("EventFlagged",fEventStatusFlagged);
if(!get_flagsapp || !get_flags) {
Log("PhaseITreeMaker tool: No Event status applied or flagged bitmask!!", v_error, verbosity);
//return false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's concerning to see error printouts with a commented-out return false.
It seems like either this should be an error and it should fail, or it isn't an error and should be delegated to a warning. That said, I have no idea what the implications of this warning would be. Could you clarify?

}

if (useSimpleDir) {
Log("Using simple direction", v_debug, verbosity);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do these checks/printouts need to be in Execute? It seems like they only need to run once in Initialise.

else { Center.push_back(vSeedVtxList->at(centerIndex[0]).GetPosition()); }
this->GenerateFineGrid();
for (int i = 0; i < Center.size(); i++) {
std::cout << "Center " << i << ": " << Center.at(i).X() << ", " << Center.at(i).Y() << ", " << Center.at(i).Z() << endl;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cout usage



bool VtxSeedFineGrid::Finalise(){
Log("VtxSeedFineGrid exitting", v_debug, verbosity);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just one t in exiting.

RecoVertex iSeed;
RecoVertex thisCenterSeed;

if (verbosity > 0) cout << "True vertex = (" << trueVtxX << ", " << trueVtxY << ", " << trueVtxZ << ", " << trueVtxT << ", " << trueDirX << ", " << trueDirY << ", " << trueDirZ << ")" << endl;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

verbosity levels are, in general:
0: error
1: warning
2: message
3+: debug
This does not seem like a warning.

return result;
}

RecoVertex* VtxSeedFineGrid::FindSimpleDirection(RecoVertex* myVertex) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK so this is the fifth instance of FindSimpleDirection ...!?


// set vertex and direction
// ========================
RecoVertex* newVertex = new RecoVertex(); // Note: pointer must be deleted by the invoker
Copy link
Collaborator

@marc1uk marc1uk Dec 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: why does this function return a pointer to an object on the heap? Can it not return an object? Can it not take an object reference and modify it? Can it wrap the returned pointer in a smart pointer to avoid the need for the caller to remember to delete it?

myFoMCalculator.TimePropertiesLnL(meantime, timefom);
myFoMCalculator.ConePropertiesFoM(coneAngle, conefom);
fom = timefom * 0.5 + conefom * 0.5;
std::cout << "seed (" << seedX<<","<<seedY<<","<<seedZ<<") fom= " << fom << " best= " << bestFOM[0] << endl;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cout

if (numtracksinev > 1) Log("Multiple tracks need work; just using first for now", v_debug, verbosity);
double gradx, grady, theta, phi;
Direction startVertex, endVertex, result;
BoostStore* thisTrack = &(Tracks->at(0));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as before this will throw an exception if there are no MRDTracks

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which demonstrates the point that when code is duplicated, bugs in that code will need to be fixed in multiple locations. Please consolidate this function with the other instance(s).

@marc1uk
Copy link
Collaborator

marc1uk commented Dec 18, 2024

One of the reasons this PR has been held up so long is the number of changed files. We want to minimize the number of changed files to make the PR easier to review.
Yet there are 8 files in this PR that have nothing but whitespace changes:
DataModel/MinuitOptimizer.cpp
DataModel/MinuitOptimizer.h
UserTools/LikelihoodFitterCheck/README.md
UserTools/VertexGeometryCheck/README.md
UserTools/VtxExtendedVertexFinder/README.md
UserTools/VtxExtendedVertexFinder/VtxExtendedVertexFinder.h
UserTools/VtxSeedGenerator/README.md
UserTools/VtxSeedGenerator/VtxSeedGenerator.h
All of these can be checked out from the main Application branch.
Likewise, as i've commented on a previous instance of this pull request, DataModel/FoMCalculator.cpp has 1,181 lines changed including whitespace changes, but only 126 changed lines without whitespace changes. Again, that makes reviewal and git diff comparisons in the future much harder to read. Please don't change whitespace unnecessarily.

wcsimT.h also appears to have no changes other than printouts that don't appear to be particularly useful, so can also be checked out from the current Application branch.

@marc1uk
Copy link
Collaborator

marc1uk commented Dec 18, 2024

Even excluding the 9 files with no meaningful changes, that still amounts to 40 files changed. These changes are not all related. This can and should be split up into a number of separate PRs.

@marc1uk
Copy link
Collaborator

marc1uk commented Dec 18, 2024

For those that impact the vertex reconstruction, given their importance and impact on all analyses I think it's critical to do proper validation before we merge in any changes. It would be a good idea to get some assistance from others on doing this, to ensure different people checking out the code in different repositories, analysing different files and with different configurations don't encounter issues. We should evaluate the impact on a variety of different particle types and energies, check that it works on MC and data, and since there are cuts on whether vertex reconstruction succeeds or not, check how this impacts the number of successful reconstructions, and what gets rejected.
Validation code can be added into this repository, along with a docdb link to the results.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants