@IST-DASLab

IST Austria Distributed Algorithms and Systems Lab

Popular repositories

  1. gptq Public

    Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".

    Python 1.9k 154

  2. sparsegpt Public

    Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".

    Python 722 96

  3. marlin Public

    FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.

    Python 624 47

  4. qmoe Public

    Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".

    Python 262 22

  5. PanzaMail Public

    Python 257 14

  6. QUIK Public

    Repository for the QUIK project, enabling the use of 4-bit kernels for generative inference (EMNLP 2024).

    C++ 173 12

Repositories

Showing 10 of 48 repositories
  • PanzaMail Public
    Python 257 Apache-2.0 14 3 5 Updated Nov 19, 2024
  • EvoPress Public
    Python 5 1 0 0 Updated Nov 15, 2024
  • LDAdam Public

    LDAdam - Adaptive Optimization from Low-Dimensional Gradient Statistics

    Python 4 Apache-2.0 0 0 0 Updated Nov 6, 2024
  • GridSearcher Public

    GridSearcher simplifies running grid searches for machine learning projects in Python, emphasizing parallel execution and GPU scheduling without dependencies on SLURM or other workload managers.

    Python 2 Apache-2.0 0 0 0 Updated Oct 27, 2024
  • torch_cgx Public

    PyTorch distributed backend extension with compression support

    C++ 17 AGPL-3.0 0 4 0 Updated Oct 17, 2024
  • ISTA-DASLab-Optimizers Public
    Python 7 Apache-2.0 0 0 0 Updated Sep 5, 2024
  • Sparse-Marlin Public

    Boosting 4-bit inference kernels with 2:4 Sparsity

    Cuda 51 Apache-2.0 3 1 0 Updated Sep 4, 2024
  • marlin Public

    FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.

    Python 624 Apache-2.0 47 25 4 Updated Sep 4, 2024
  • sparsegpt Public

    Code for the ICML 2023 paper "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot".

    Python 722 Apache-2.0 96 14 1 Updated Aug 20, 2024
  • peft-rosa Public

    A fork of the PEFT library, supporting Robust Adaptation (RoSA)

    Python 13 Apache-2.0 3 1 0 Updated Aug 16, 2024