Skip to content

Latest commit

 

History

History
236 lines (159 loc) · 7.42 KB

lapack-3.4.0.txt

File metadata and controls

236 lines (159 loc) · 7.42 KB

LAPACK 3.4.0

Release date: Fr 11/11/11.

This material is based upon work supported by the National Science Foundation and the Department of Energy under Grant No. NSF-OCI-1032861, NSF-CCF-00444486, NSF-CNS 0325873, NSF-EIA 0122599, NSF-ACI-0090127, DOE-DE-FC02-01ER25478, DOE-DE-FC02-06ER25768.

LAPACK is a software package provided by Univ. of Tennessee, Univ. of California, Berkeley, Univ. of Colorado Denver and NAG Ltd..

1. Support and questions:

2. LAPACK 3.4.0: What’s new

  1. xGEQRT: QR factorization (improved interface). Contribution by Rodney James (UC Denver). xGEQRT is analogous to xGEQRF with a modified interface which enables better performance when the blocked reflectors need to be reused. The companion subroutines xGEMQRT apply the reflectors.

  2. xGEQRT3: recursive QR factorization. Contribution by Rodney James (UC Denver). The recursive QR factorization enables cache-oblivious and enables high performance on tall and skinny matrices. See reference [1] below.

  3. xTPQRT: Communication-Avoiding QR sequential kernels. Contribution by Rodney James (UC Denver). These subroutines are useful for updating a QR factorization and are used in sequential and parallel Communication Avoiding QR. These subroutines support the general case Triangle on top of Pentagon which includes as special cases so-called Triangle on top of Triangle and Triangle on top of Square. This is the right-looking version of the subroutines and the subroutines are blocked. The T matrices and the block size are part of the interface. The companion subroutines xTPMQRT apply the reflectors.

  4. CMAKE build system. We are striving to help our users install our libraries seamlessly on their machines. The CMAKE team contributed to our effort to port LAPACK and ScaLAPACK under the CMAKE build system. Building under Windows has never been easier. This also allows us to release dll for Windows, so users no longer need a Fortran compiler to use LAPACK under Windows.

  5. Doxygen documentation. LAPACK routine documentation has never been more accessible. See http://www.netlib.org/lapack/explore-html/.

  6. New website allowing for easier navigation.

  7. LAPACKE - Standard C language APIs for LAPACK. Since LAPACK 3.3.0, LAPACK includes new C interfaces. With the LAPACK 3.4.0 release, LAPACKE is directly integrated within the LAPACK library and has been enriched by the full set of LAPACK subroutines. See here.

  8. References

[1] [[1]]:: E. Elmroth and F.G. Gustavson. ``Applying recursion to serial and
parallel QR factorization leads to better performance.'' IBM Journal of
Research and Development 44(4): 605-624 (2000)


4. External Contributors
------------------------

    * CMAKE team
    * Intel MKL team
    * Craig Lucas (University of Manchester and NAG)

5. Thanks
---------

Thanks for bug-report/patches/suggestions to:
Ming Gu (UC Berkeley),
Hatem Ltaif (KAUST, Saudi Arabia),
omitrofa (intel team),
Tiago Requeijo,
Edward Smyth (NAG), and
Clint Whaley (University of Texas at San Antonio, USA).

6. Developer list
-----------------

.Principal Investigators

    * Jim Demmel (University of California,  Berkeley, USA)
    * Jack Dongarra (University of Tennessee and ORNL, USA)
    * Julien Langou (University of Colorado Denver, USA)

.LAPACK developers involved in this release

    * Sven Hammarling (NAG Ltd. and University of Manchester, UK)
    * Igor Kozachenko  (University of California,  Berkeley, USA)
    * Julie Langou (University of Tennessee, USA)
    * Benjamin Lipshitz  (University of California,  Berkeley, USA)
    * Rodney James (University of Colorado Denver, USA)

7. More details
----------------

7.1.  xGEQRT: QR factorization (improved interface)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.contributors

Rodney James (UC Denver)

.contribution

xGEQRT is analogous to xGEQRF with a modified interface which enables better
performance when the blocked reflectors need to be reused. The companion
subroutines xGEMQRT apply the reflectors.

.public interfaces

   A SRC/sgeqrt.f    A SRC/sgemqrt.f   A SRC/sgeqrt2.f
   A SRC/cgeqrt.f    A SRC/cgemqrt.f   A SRC/cgeqrt2.f
   A SRC/dgeqrt.f    A SRC/dgemqrt.f   A SRC/dgeqrt2.f
   A SRC/zgeqrt.f    A SRC/zgemqrt.f   A SRC/zgeqrt2.f

7.2.  xGEQRT3: recursive QR factorization
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.contributors

Rodney James (UC Denver)

.reference

Elmroth, E., and Gustavson, F.G. ``Applying recursion to serial and
parallel QR factorization leads to better performance.'' IBM Journal of
Research and Development 44(4): 605-624 (2000)

.contribution

The recursive QR factorization enables cache-oblivious and enables high
performance on tall and skinny matrices.

.public interfaces

   A SRC/sgeqrt3.f
   A SRC/cgeqrt3.f
   A SRC/dgeqrt3.f
   A SRC/zgeqrt3.f

7.3.  xTPQRT: Communication-Avoiding QR sequential kernels
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.contributors

Rodney James (UC Denver)

.contribution

These subroutines are useful for updating a QR factorization and are used in
sequential and parallel Communication Avoiding QR. These subroutines support
the general case Triangle on top of Pentagon which includes as special cases
so-called Triangle on top of Triangle and Triangle on top of Square. This is
the right-looking version of the subroutines and the subroutines are blocked.
The T matrices and the block size are part of the interface. The companion
subroutines xTPMQRT apply the reflectors.

.public interfaces

   A SRC/stpmqrt.f   A SRC/stpqrt.f    A SRC/stpqrt2.f   A SRC/stprfb.f
   A SRC/ctpmqrt.f   A SRC/ctpqrt.f    A SRC/ctpqrt2.f   A SRC/ctprfb.f
   A SRC/dtpmqrt.f   A SRC/dtpqrt.f    A SRC/dtpqrt2.f   A SRC/dtprfb.f
   A SRC/ztpmqrt.f   A SRC/ztpqrt.f    A SRC/ztpqrt2.f   A SRC/ztprfb.f


7.4.  CMAKE build system
~~~~~~~~~~~~~~~~~~~~~~~~

.contributors

CMAKE team

.lapacker

Julie Langou (University of Tennessee)

.contribution

We are striving to help our users install our libraries seamlessly on their
machines.  The CMAKE team contributed to our effort to port LAPACK and
ScaLAPACK under the CMAKE build system. Building under Windows has never been
easier. This also allows us to release dll for Windows, so users no longer need
a Fortran compiler to use LAPACK under Windows.

7.5.  Doxygen documentation
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.lapacker

Julie Langou (University of Tennessee)

.contribution

LAPACK routine documentation has never been more
accessible. See http://www.netlib.org/lapack/explore-html/.

7.6.  New website allowing for easier navigation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.lapacker

Julie Langou (University of Tennessee)

7.7.  LAPACKE - Standard C language APIs for LAPACK
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.contributors

Intel MKL team

.lapacker

Julie Langou (University of Tennessee)

.contribution

Since LAPACK 3.3.0, LAPACK includes new C interfaces. With the LAPACK 3.4.0
release, LAPACKE is directly integrated within the LAPACK library and has been
enriched with the new set of LAPACK subroutines. See link:lapacke.html[here].

8.  Bug Fix
-----------

link:errata_from_331_to_340.html[see here]

// vim: set syntax=asciidoc: