Get code up to date #37
base: main
Conversation
…g buffer size. Added EPIC_multi_gpu.py to have two copies of fft_grid run on two GPUs simultaneously
…ing. It includes Matt's numpy speedups, fits file writing and Hari's optimizations
Overall this PR looks pretty straightforward. Just a few cleanup tasks, and a couple quick discussions to have.
#try:
# bf_romein.execute(udata, gdata)
#except NameError:
# bf_romein = Romein()
# bf_romein.init(self.locs, gphases, self.grid_size, polmajor=False)
# bf_romein.execute(udata, gdata)
Remove blocks of commented code.
#bifrost.map(
# "a(i,j,k,l) += (b(i,j,k,l/2) * b(i,j,k,l%2).conj())",
# {"a": autocorrs, "b": udata, "t": self.ntime_gulp},
# axis_names=("i", "j", "k", "l"),
# shape=(self.ntime_gulp, nchan, nstand, npol ** 2),
#)
Remove blocks of commented code.
#bifrost.map(
# "a(i,j,p,k,l) += b(0,i,j,p/2,k,l)*b(0,i,j,p%2,k,l).conj()",
# {"a": crosspol, "b": gdata},
# axis_names=("i", "j", "p", "k", "l"),
# shape=(self.ntime_gulp, nchan, npol ** 2, self.grid_size, self.grid_size),
#)
Remove blocks of commented code.
#try:
# bf_romein_autocorr.execute(autocorrs_av, autocorr_g)
#except NameError:
# bf_romein_autocorr = Romein()
# bf_romein_autocorr.init(
# autocorr_lo, autocorr_il, self.grid_size, polmajor=False
# )
# bf_romein_autocorr.execute(autocorrs_av, autocorr_g)
Remove blocks of commented code.
"--duration", | ||
type=int, | ||
default=3600, | ||
help="Duration of EPIC (seconds)", |
help="Duration of EPIC (seconds)", | |
help="Total duration of EPIC observation (seconds)", |
- if args.removeautocorrs:
-     raise NotImplementedError(
-         "Removing autocorrelations is not yet properly implemented."
-     )
+ #if args.removeautocorrs:
+ #    raise NotImplementedError(
+ #        "Removing autocorrelations is not yet properly implemented."
+ #    )
There have been some changes to the autocorr code, but has it been tested? As far as I know we don't run with this option, so I'm not sure this NotImplementedError should be removed yet.
I'm not even sure if the auto-correlation removal ever worked.
self.iring.resize(igulp_size, buffer_factor= 8)
self.oring.resize(ogulp_size, buffer_factor= 128) # , obuf_size)
I think @jaycedowell had some concerns about this. IIRC, this looks like we're making large buffers to account for a slow startup, when really we should be aiming to fix the startup issue instead. I don't have a solution, just flagging this for discussion.
Yeah, that buffer_factor of 128 is really large. I had done some simple benchmarking and it looked like the problem was with the GenerateLocations call in MOFFCorrelatorOp.main being slow. I don't know if that was all of it, but it is a factor.
For solutions we could think about something like compute-and-cache. The first time a particular frequency range is encountered we take the hit on computing and cache the results to disk. On subsequent encounters with that frequency setup we use what is on disk.
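As a rough sketch of that pattern (the generate_locations callable and its (freqs, grid_size) arguments are placeholders here, not the actual GenerateLocations signature):

import hashlib
import os
import numpy as np

CACHE_DIR = "locations_cache"  # hypothetical on-disk cache directory

def cached_locations(freqs, grid_size, generate_locations):
    # Key the cache on the frequency setup and the grid size.
    key = hashlib.sha256(
        np.asarray(freqs).tobytes() + str(grid_size).encode()
    ).hexdigest()
    path = os.path.join(CACHE_DIR, key + ".npz")
    if os.path.exists(path):
        # Cache hit: reuse the locations computed on an earlier run.
        return np.load(path)["locs"]
    # Cache miss: take the compute hit once, then persist the result to disk.
    locs = generate_locations(freqs, grid_size)
    os.makedirs(CACHE_DIR, exist_ok=True)
    np.savez(path, locs=locs)
    return locs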
That makes a lot of sense. Perhaps we could also reorder so that even if we need to generate the locations it's done before the buffer is allocated/used, so we don't need to be juggling data while we're still setting up.
- self.iring.resize(igulp_size)
- self.oring.resize(ogulp_size, buffer_factor=5)
+ self.iring.resize(igulp_size, buffer_factor=128)
+ self.oring.resize(ogulp_size, buffer_factor=256)
Here is another large buffer in MOFFCorrelatorOp. The buffer_factor of 256 on the output ring could be one of the big things driving the GPU memory utilization.
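For a rough sense of scale, assuming the total ring allocation is roughly gulp size times buffer_factor (the 64 MiB gulp below is a made-up placeholder, not the real ogulp_size):

ogulp_size = 64 * 1024 * 1024  # hypothetical 64 MiB output gulp, for illustration only
for buffer_factor in (5, 128, 256):
    # Total ring allocation assumed to scale linearly with buffer_factor.
    print(f"buffer_factor={buffer_factor:3d} -> {ogulp_size * buffer_factor / 2**30:.1f} GiB")
# For this example gulp: ~0.3 GiB at 5, 8.0 GiB at 128, and 16.0 GiB at 256.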
- cores = [0, 2, 3, 4, 5, 6, 7]
- gpus = [0, 0, 0, 0, 0, 0, 0]
+ cores = [3, 4, 5, 6, 7]
+ gpus = [0, 0, 0, 0, 0]
Control of cores and gpus should probably be set by a command line flag. gpus could probably be reduced to just a single gpu argument that is used for everything.
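A minimal sketch of what that could look like (flag names and defaults here are placeholders, not existing EPIC options):

import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--cores",
    type=lambda s: [int(c) for c in s.split(",")],
    default=[3, 4, 5, 6, 7],
    help="Comma-separated CPU cores to bind the pipeline blocks to",
)
parser.add_argument(
    "--gpu",
    type=int,
    default=0,
    help="Single GPU device ID shared by all GPU-bound blocks",
)
args = parser.parse_args()
cores = args.cores
gpus = [args.gpu] * len(cores)  # one GPU id reused for every block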
This is an all-in-one PR to grab changes made over the last couple years. Some things included here:
This supersedes #20 and #36.