Merge pull request #170 from nlesc-dirac/libdirac
libmvec vectorization
SarodYatawatta authored Dec 28, 2022
2 parents 3e74113 + 159a58a commit 840fa37
Showing 12 changed files with 288 additions and 169 deletions.
2 changes: 1 addition & 1 deletion CMakeLists.txt
@@ -4,7 +4,7 @@ cmake_minimum_required(VERSION 3.10)
project (SAGECal)
set(PROJECT_VERSION_MAJOR 0)
set(PROJECT_VERSION_MINOR 7)
-set(PROJECT_VERSION_PATCH 8)
+set(PROJECT_VERSION_PATCH 9)
set(PROJECT_VERSION_REVISION 0)
set(PROJECT_VERSION
"${PROJECT_VERSION_MAJOR}.${PROJECT_VERSION_MINOR}.${PROJECT_VERSION_PATCH}")
3 changes: 2 additions & 1 deletion Docker/ubuntu2004-cpu/Dockerfile
@@ -23,7 +23,8 @@ RUN git clone --depth 1 --branch master \
mkdir build-ubuntu && cd build-ubuntu && \
cmake -DCMAKE_INSTALL_PREFIX=/opt/sagecal \
-DBLA_VENDOR=OpenBLAS \
--DCMAKE_CXX_FLAGS="-O3" -DCMAKE_C_FLAGS="-O3" .. && \
+-DCMAKE_CXX_FLAGS="-O3 -fopenmp -ffast-math -lmvec -lm" \
+-DCMAKE_C_FLAGS="-O3 -fopenmp -ffast-math -lmvec -lm" .. && \
make -j4 && \
make install
RUN ls -alsrt /opt/sagecal && \
3 changes: 2 additions & 1 deletion Docker/ubuntu2004-cuda/Dockerfile
@@ -33,7 +33,8 @@ RUN git clone --depth 1 --branch master \
-DHAVE_CUDA=ON -DCUDA_NVCC_FLAGS="-gencode arch=compute_75,code=sm_75 -O3" \
-DNUM_GPU=2 -DBLA_VENDOR=OpenBLAS \
-DCMAKE_CXX_COMPILER=g++-8 -DCMAKE_C_COMPILER=gcc-8 \
--DCMAKE_CXX_FLAGS="-O3" -DCMAKE_C_FLAGS="-O3" .. && \
+-DCMAKE_CXX_FLAGS="-O3 -fopenmp -ffast-math -lmvec -lm" \
+-DCMAKE_C_FLAGS="-O3 -fopenmp -ffast-math -lmvec -lm" .. && \
make -j4 && \
make install
RUN ls -alsrt /opt/sagecal && \
109 changes: 6 additions & 103 deletions INSTALL.md
@@ -1,4 +1,4 @@
-do 24 feb 2022 1:25:28 CET
+vr 9 dec 2022 1:23:12 CET
# SAGECal Installation

## Cmake Build
@@ -12,7 +12,7 @@ Run cmake (with GPU support) for example like
mkdir build && cd build
cmake .. -DHAVE_CUDA=ON -DCMAKE_CXX_FLAGS='-DMAX_GPU_ID=0' -DCMAKE_CXX_COMPILER=g++-8 -DCMAKE_C_FLAGS='-DMAX_GPU_ID=0' -DCMAKE_C_COMPILER=gcc-8 -DCUDA_NVCC_FLAGS='-gencode arch=compute_75,code=sm_75' -DBLA_VENDOR=OpenBLAS
```
-where *MAX_GPU_ID=0* is when there is only one GPU (ordinal 0). If you have more GPUs, increase this number to 1,2, and so on. This will produce *sagecal_gpu* and *sagecal-mpi_gpu* binary files (after running *make* of course).
+where *MAX_GPU_ID=0* is when there is only one GPU (ordinal 0). If you have more GPUs, increase this number to 1,2, and so on. This will produce *sagecal_gpu* and *sagecal-mpi_gpu* binary files (after running *make* of course). You can also use *-DNUM_GPU* to specify the number of GPUs to use, for example *-DNUM_GPU=4*.

The CPU-only version can be built as
```
@@ -31,20 +31,14 @@ to make a symbolic link to libgfortran.so.5 or whatever version that is installe

To only build *libdirac* (shared) library, use *-DLIB_ONLY=1* option (also *-DBLA_VENDOR* to select the BLAS flavour). This library can be used with pkg-config using *lib/pkgconfig/libdirac.pc*.

-### Requirements for older installations
-#### das5
+### Vectorized math operations (New)
+SAGECal can use ***libmvec*** vectorized math operations, both in GPU and CPU versions. In order to enable this, use compiler options *-fopenmp -ffast-math -lmvec -lm* for both gcc and g++. Also *-mavx*, *-mavx2* etc. can be added. Here is an example for CPU version

-Load the modules below before compiling SageCal.
```
-module load cmake/3.8.2
-module load mpich/ge/gcc/64/3.2
-module load gcc/4.9.3
-module load casacore/2.3.0-gcc-4.9.3
-module load wcslib/5.13-gcc-4.9.3
-module load wcslib/5.16-gcc-4.9.3
-module load cfitsio/3.410-gcc-4.9.3
+cmake .. -DCMAKE_CXX_FLAGS='-g -O3 -Wall -fopenmp -ffast-math -lmvec -lm -mavx2' -DCMAKE_C_FLAGS='-g -O3 -Wall -fopenmp -ffast-math -lmvec -lm -mavx2'
```
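
As an illustration of what these flags enable (a sketch, not code from this commit): on x86_64 systems with glibc, GCC built code using *-O3 -ffast-math* may replace a scalar `sin` call inside a dependency-free loop with a libmvec vector variant. The function name `sine_array` and its arguments are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper, not part of SAGECal: y[i] = sin(x[i]) for i in [0, n).
// Each iteration is independent, so the auto-vectorizer is free to process
// several elements at once. Whether it actually emits libmvec calls depends
// on the compiler version, the flags, and the target architecture.
void sine_array(const double* x, double* y, std::size_t n) {
  assert(x != nullptr && y != nullptr);
  for (std::size_t i = 0; i < n; i++) {
    y[i] = std::sin(x[i]);
  }
}
```

Without the flags the same code still compiles and runs, only with scalar `sin` calls. One way to check the result is to look for `_ZGV`-prefixed symbols (for example `_ZGVdN4v_sin` when *-mavx2* is used) in the disassembly of the object file. With *-fopenmp*, GCC also honours `#pragma omp simd` hints on such loops.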

+### Requirements for older installations
checkout the source code and compile it with the instructions below(in source folder):
```
git clone https://github.com/nlesc-dirac/sagecal.git
@@ -111,94 +105,3 @@ MPI support is automatically detected, otherwise, it can be forced with:
```
cmake -DENABLE_MPI=ON
```

-## GPU Support
-
-### Loading modules on DAS5
-See scripts folder for the modules.
-```
-source ./scripts/load_das5_modules_gcc6.sh
-```
-
-### Compiling with GPU support
-```
-mkdir -p build && cd build
-cmake -DCUDA_DEBUG=ON -DDEBUG=ON -DHAVE_CUDA=ON ..
-make VERBOSE=1
-```
-
-
-
-## Installation via Anaconda (WIP)
-```
-conda install -c sagecal=0.6.0
-```
-
-
-
-## Manual installation
-For expert users, and for custom architectures (GPU), the manual install is recommended.
-### 1 Prerequisites:
-- CASACORE http://casacore.googlecode.com/
-- glib http://developer.gnome.org/glib
-- BLAS/LAPACK
-Highly recommended is OpenBLAS http://www.openblas.net/
-Also, to avoid any linking issues (and to get best performance), build OpenBLAS from source and link SAGECal with the static library (libopenblas***.a) and NOT libopenblas***.so
-- Compilers gcc/g++ or Intel icc/icpc
-- If you have NVIDIA GPUs,
--- CUDA/CUBLAS/CUSOLVER and nvcc
--- NVML Nvidia management library
-- If you are using Intel Xeon Phi MICs.
--- Intel MKL and other libraries
-- Get the source for SAGECal
-```
-git clone -b master https://[email protected]/nlesc-dirac/sagecal.git
-```
-
-### 2 The basic way to build is
-1.a) go to ./src/lib/Dirac and ./src/lib/Radio and run make (which will create libdirac.a and libradio.a)
-1.b) go to ./src/MS and run make (which will create the executable)
-
-
-### 3 Build settings
-In ./src/lib/Dirac and ./src/lib/Radio and ./src/MS you MUST edit the Makefiles to suit your system. Some common items to edit are:
-- LAPACK: directory where LAPACK/OpenBLAS is installed
-- GLIBI/GLIBL: include/lib files for glib
-- CASA_LIBDIR/CASA_INCDIR/CASA_LIBS : casacore include/library location and files:
-Note with new CASACORE might need two include paths, e.g.
--I/opt/casacore/include/ -I/opt/casacore/include/casacore
-- CUDAINC/CUDALIB : where CUDA/CUBLAS/CUSOLVER is installed
-- NVML_INC/NVML_LIB : NVML include/lib path
-- NVCFLAGS : flags to pass to nvcc, especially -arch option to match your GPU
-- MKLROOT : for Intel MKL
-
-Example makefiles:
-Makefile : plain build
-Makefile.gpu: with GPU support
-Note: Edit ./lib/Radio/Radio.h MAX_GPU_ID to match the number of available GPUs, e.g., for 2 GPUs, MAX_GPU_ID=1
-
-
-
-## SAGECAL-MPI Manual Installation
-This is for manually installing the distributed version of sagecal (sagecal-mpi), the cmake build will will work for most cases.
-## 1 Prerequsites:
-- Same as for SAGECal.
-- MPI (e.g. OpenMPI)
-
-## 2 Build ./src/lib/Dirac ./src/lib/Radio as above (using mpicc -DMPI_BUILD)
-
-## 3 Build ./src/MPI using mpicc++
-
-
-
-## BUILDSKY Installation
-
-- See INSTALL in ./src/buildsky
-
-
-## RESTORE Installation
-
-- See INSTALL in ./src/restore
-
-
-
2 changes: 1 addition & 1 deletion src/MPI/main.cpp
@@ -34,7 +34,7 @@ using namespace Data;

void
print_copyright(void) {
-cout<<"SAGECal-MPI 0.7.8 (C) 2011-2022 Sarod Yatawatta"<<endl;
+cout<<"SAGECal-MPI 0.7.9 (C) 2011-2023 Sarod Yatawatta"<<endl;
}


46 changes: 32 additions & 14 deletions src/MS/data.cpp
@@ -245,18 +245,23 @@ Data::readAuxData(const char *fname, Data::IOData *data, Data::LBeam *binfo) {
ROArrayColumn<double> chan_width(_freq, "CHAN_WIDTH");
data->deltaf=(double)data->Nchan*(chan_width(0).data()[0]);

-/* UTC time */
-binfo->time_utc=new double[data->tilesz];
-/* no of elements in each station */
-binfo->Nelem=new int[data->N];
-/* positions of stations */
-binfo->sx=new double[data->N];
-binfo->sy=new double[data->N];
-binfo->sz=new double[data->N];
-/* coordinates of elements */
-binfo->xx=new double*[data->N];
-binfo->yy=new double*[data->N];
-binfo->zz=new double*[data->N];
+try {
+/* UTC time */
+binfo->time_utc=new double[data->tilesz];
+/* no of elements in each station */
+binfo->Nelem=new int[data->N];
+/* positions of stations */
+binfo->sx=new double[data->N];
+binfo->sy=new double[data->N];
+binfo->sz=new double[data->N];
+/* coordinates of elements */
+binfo->xx=new double*[data->N];
+binfo->yy=new double*[data->N];
+binfo->zz=new double*[data->N];
+} catch (const std::bad_alloc& e) {
+cout<<"Allocating memory for data failed. Quitting."<< e.what() << endl;
+exit(1);
+}

Table antfield;
bool isDipole=false;
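
The hunk above wraps the allocations in one try block and exits on `std::bad_alloc`. A hedged variation on the same pattern, for callers that would rather report failure than call `exit(1)`, might look like the sketch below. `StationArrays` and `allocate_station_arrays` are hypothetical names that do not exist in SAGECal, and only a subset of the arrays is shown.

```cpp
#include <cstddef>
#include <iostream>
#include <memory>
#include <new>
#include <utility>

// Hypothetical stand-in for some of the per-station arrays in the hunk above.
struct StationArrays {
  std::unique_ptr<double[]> time_utc;    // one entry per time slot
  std::unique_ptr<int[]> Nelem;          // number of elements per station
  std::unique_ptr<double[]> sx, sy, sz;  // station positions
};

// Allocate all arrays, or none. Returns true on success. On failure the
// error is reported and 'out' is left empty, so nothing leaks.
bool allocate_station_arrays(std::size_t tilesz, std::size_t N,
                             StationArrays& out) {
  try {
    StationArrays tmp;
    tmp.time_utc.reset(new double[tilesz]);
    tmp.Nelem.reset(new int[N]);
    tmp.sx.reset(new double[N]);
    tmp.sy.reset(new double[N]);
    tmp.sz.reset(new double[N]);
    out = std::move(tmp);  // commit only after every allocation succeeded
    return true;
  } catch (const std::bad_alloc& e) {
    std::cerr << "Allocating memory for data failed: " << e.what() << std::endl;
    out = StationArrays();
    return false;
  }
}
```

Building into a temporary and moving it into `out` at the end means a failure halfway through releases the earlier arrays automatically, which the raw `new[]` calls in the diff do not need to worry about only because the program exits.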
@@ -706,10 +711,13 @@ Data::loadData(Table ti, Data::IOData iodata, LBeam binfo, double *fratio) {
}
/* counters for finding flagged data ratio */
int countgood=0; int countbad=0;
+/* get antenna pair of first row for recording time */
+uInt ant_i=a1(0);
+uInt ant_j=a2(0);
for(int row = 0; row < nrow && row0<iodata.tilesz*iodata.Nbase; row++) {
uInt i = a1(row); //antenna1
uInt j = a2(row); //antenna2
-if (!i && !j) {/* use baseline 0-0 to extract time */
+if (i==ant_i && j==ant_j) {/* baseline ant_i-ant_j to extract time */
double tt=tut(row);
/* convert MJD (s) to JD (days) */
binfo.time_utc[rowt++]=(tt/86400.0+2400000.5); /* no +0.5 added */
@@ -806,6 +814,12 @@ Data::loadData(Table ti, Data::IOData iodata, LBeam binfo, double *fratio) {
iodata.flag[row]=1;

}
+/* also set time to last valid one */
+if (rowt>0 && rowt<iodata.tilesz) {
+for(int rowtt=rowt; rowtt<iodata.tilesz; rowtt++) {
+binfo.time_utc[rowtt]=binfo.time_utc[rowt-1];
+}
+}
/* set uvw and data to 0 to eliminate any funny business */
memset(&iodata.u[row0],0,sizeof(double)*(size_t)(iodata.tilesz*iodata.Nbase-row0));
memset(&iodata.v[row0],0,sizeof(double)*(size_t)(iodata.tilesz*iodata.Nbase-row0));
@@ -1159,10 +1173,14 @@ Data::loadDataMinibatch(Table ti, Data::IOData iodata, LBeam binfo, int minibatc
int nrow=t.nrow();
int row0=rowoffset;
int rowt=rowtoffset;

+/* get antenna pair of first row for recording time */
+uInt ant_i=a1(0);
+uInt ant_j=a2(0);
for(int row = 0; row < nrow && row0<iodata.tilesz*iodata.Nbase; row++) {
uInt i = a1(row); //antenna1
uInt j = a2(row); //antenna2
-if (!i && !j) {/* use baseline 0-0 to extract time */
+if (i==ant_i && j==ant_j) {/* use baseline ant_i-ant_j to extract time */
double tt=tut(row);
/* convert MJD (s) to JD (days) */
binfo.time_utc[rowt++]=(tt/86400.0+2400000.5); /* no +0.5 added */
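
The time-stamp changes in this file can be summarized in a small standalone sketch: take the antenna pair of the first row as the reference baseline (presumably so that data without a 0-0 autocorrelation row still yields time stamps), convert each matching row's time from MJD seconds to JD days, and fill any unused slots with the last valid value. The function names and the use of `std::vector` are assumptions for illustration. The real code writes into `binfo.time_utc` and pads only in the branch shown in the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Same arithmetic as in the diff: MJD given in seconds to JD in days.
inline double mjd_seconds_to_jd_days(double mjd_s) {
  return mjd_s / 86400.0 + 2400000.5;
}

// Hypothetical helper, not SAGECal API: one time stamp per time slot.
std::vector<double> collect_tile_times(const std::vector<unsigned>& ant1,
                                       const std::vector<unsigned>& ant2,
                                       const std::vector<double>& time_mjd_s,
                                       std::size_t tilesz) {
  assert(ant1.size() == ant2.size() && ant1.size() == time_mjd_s.size());
  std::vector<double> time_utc(tilesz, 0.0);
  if (ant1.empty() || tilesz == 0) return time_utc;

  // Reference baseline: the antenna pair of the first row.
  const unsigned ref_i = ant1[0];
  const unsigned ref_j = ant2[0];

  std::size_t rowt = 0;
  for (std::size_t row = 0; row < ant1.size() && rowt < tilesz; row++) {
    if (ant1[row] == ref_i && ant2[row] == ref_j) {
      time_utc[rowt++] = mjd_seconds_to_jd_days(time_mjd_s[row]);
    }
  }

  // Pad unused slots with the last valid time. Row 0 always matches the
  // reference baseline, so rowt >= 1 at this point.
  for (std::size_t k = rowt; k < tilesz; k++) {
    time_utc[k] = time_utc[rowt - 1];
  }
  return time_utc;
}
```

Unlike the loop in the diff, this sketch also stops once `tilesz` time stamps have been recorded. That bound is an addition made here to keep the example safe on arbitrary input.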
2 changes: 1 addition & 1 deletion src/MS/main.cpp
@@ -35,7 +35,7 @@ using namespace Data;

void
print_copyright(void) {
-cout<<"SAGECal 0.7.8 (C) 2011-2022 Sarod Yatawatta"<<endl;
+cout<<"SAGECal 0.7.9 (C) 2011-2023 Sarod Yatawatta"<<endl;
}

