ThreadingCalculations
Calculations can be sped up in two ways:
(1) Using compiled XOP code as supplied in SANSAnalysis.xop. True compiled C code is much faster than Igor's language, which is "semi-compiled".
(2) Multithreading the calculation to make use of multiple processors.
Below is a snippet of code from Cylinder_PolyRadius.ipf showing how the normal function definition:
Function Cyl_PolyRadius(cw,yw,xw) : FitFunc
can be augmented to dispatch the calculation to N processors. The normal function definition dispatches to a "helper" function (with a "_T" suffix) that works on one segment of the calculation per thread. Compiler directives (#if) select between Igor code and XOP code, using the XOP code preferentially if it exists. Currently, not all functions are threaded, since only calculations that are "slow enough" will benefit from threading: there is timing overhead in creating and dispatching threads. For this example, the SmearedCylinder_PolyRadius function was calculated using N=2 processors and 265 data points x 5 trials (summed times are reported). The timings are:
No threading, Igor Function = 46.87 s
No threading, XOP = 6.47 s
Threading, XOP = 3.95 s
So using an XOP gives a speedup of 7.2x, and threading gives another factor of 1.64x (N=2), for a total speedup of 11.9x.
This example is a triple numerical integral, so it benefits greatly. Pay no attention to the actual number of seconds the test takes on my old computer; only the ratios are meaningful. Your mileage may vary.
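As a quick check of the ratios quoted above, here is a minimal Python sketch (Python is used only for illustration; it is not part of the Igor procedure files). The variable names are made up for this example, and the "efficiency" figure is an extra derived quantity, not something reported in the original test.

```python
# Summed timings (seconds) as reported above: 265 points x 5 trials, N=2.
igor_s = 46.87          # no threading, Igor function
xop_s = 6.47            # no threading, XOP
xop_threaded_s = 3.95   # threading, XOP
n_threads = 2

xop_speedup = igor_s / xop_s                # gain from compiled XOP code
thread_speedup = xop_s / xop_threaded_s     # extra gain from threading
total_speedup = igor_s / xop_threaded_s     # combined gain

# Fraction of the ideal N-fold threading gain actually achieved
# (derived here for illustration only).
thread_efficiency = thread_speedup / n_threads

print(f"XOP speedup:       {xop_speedup:.1f}x")
print(f"Threading speedup: {thread_speedup:.2f}x")
print(f"Total speedup:     {total_speedup:.1f}x")
print(f"Thread efficiency: {thread_efficiency:.0%}")
```

The threading factor falls short of 2x, which is consistent with the thread creation and dispatch overhead mentioned above.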
//
// Fit function that is actually a wrapper to dispatch the calculation to N threads
//
// nthreads is 1 or an even number, typically 2
// it doesn't matter if npt is odd. In this case, fractional point numbers are passed
// and the wave indexing works just fine - I tested this with test waves of 7 and 8 points
// and the points "2.5" and "3.5" evaluate correctly as 2 and 3
//
Function Cyl_PolyRadius(cw,yw,xw) : FitFunc
	Wave cw,yw,xw
#if exists("Cyl_PolyRadiusX")
	Variable npt=numpnts(yw)
	Variable i,nthreads= ThreadProcessorCount
	variable mt= ThreadGroupCreate(nthreads)
	// Variable t1=StopMSTimer(-2)
	for(i=0;i<nthreads;i+=1)
		// Print (i*npt/nthreads),((i+1)*npt/nthreads-1)
		ThreadStart mt,i,Cyl_PolyRadius_T(cw,yw,xw,(i*npt/nthreads),((i+1)*npt/nthreads-1))
	endfor
	do
		variable tgs= ThreadGroupWait(mt,100)
	while( tgs != 0 )
	variable dummy= ThreadGroupRelease(mt)
	// Print "elapsed time = ",(StopMSTimer(-2) - t1)/1e6
#else
	yw = fCyl_PolyRadius(cw,xw) //the Igor, non-XOP, non-threaded calculation
#endif
	return(0)
End
//// experimental threaded version...
// don't try to thread the smeared calculation, it's good enough
// to thread the unsmeared version
//threaded version of the function
ThreadSafe Function Cyl_PolyRadius_T(cw,yw,xw,p1,p2)
	WAVE cw,yw,xw
	Variable p1,p2
#if exists("Cyl_PolyRadiusX") //this check is done in the calling function, simply hide from compiler
	yw[p1,p2] = Cyl_PolyRadiusX(cw,xw)
#else
	yw[p1,p2] = fCyl_PolyRadius(cw,xw)
#endif
	return 0
End
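
The wrapper/helper pattern above can also be sketched outside Igor. The Python below is only an illustration of the structure, not the Igor implementation: `segment_bounds`, `worker`, `threaded_fill` and the stand-in `model` are hypothetical names invented for this example. It assumes, as the comment in the wrapper reports, that fractional point numbers are truncated downward when used as wave indices (so "2.5" and "3.5" act as 2 and 3). Note that CPython threads will not actually speed up a pure-Python loop like this one; the point is only how the point range is split and how the caller waits for every segment.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def segment_bounds(npt, nthreads):
    """Inclusive (first, last) point ranges, one per thread.

    Uses the same expressions as the ThreadStart call above, with the
    ASSUMPTION that fractional point numbers truncate downward.
    """
    bounds = []
    for i in range(nthreads):
        first = math.floor(i * npt / nthreads)
        last = math.floor((i + 1) * npt / nthreads - 1)
        bounds.append((first, last))
    return bounds


def model(x):
    # Hypothetical stand-in for the real (slow) model calculation.
    return x * x


def worker(yw, xw, first, last):
    # Plays the role of the "_T" helper: fills only yw[first..last].
    for p in range(first, last + 1):
        yw[p] = model(xw[p])


def threaded_fill(xw, nthreads=2):
    # Plays the role of the wrapper: dispatch segments, then wait for all.
    yw = [0.0] * len(xw)
    with ThreadPoolExecutor(max_workers=nthreads) as pool:
        futures = [
            pool.submit(worker, yw, xw, first, last)
            for first, last in segment_bounds(len(xw), nthreads)
        ]
        for f in futures:
            f.result()  # blocks until done; re-raises any worker error
    return yw


print(segment_bounds(7, 2))  # odd number of points
print(segment_bounds(8, 2))  # even number of points
print(threaded_fill([0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]))
```

Under the truncation assumption, each segment ends exactly one point before the next one starts, so every point is computed once whether the number of points is odd or even.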