**Describe the feature request**

Add support for llama.cpp inference and benchmarking.

**Describe the solution you'd like**

- Change `modelling_llama_skip.py` to support exporting the model to GGUF (export sketch below)
- Add an inference path that dispatches to llama.cpp using the sparse transformers GGUF (dispatch sketch below)
- Update `run_benchmark.py` to support llama.cpp as a backend (benchmark sketch below)
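
For the export step, one option is the `gguf` Python package that ships with llama.cpp. This is a minimal sketch, not the actual `modelling_llama_skip.py` contract: the helper name `export_skip_model_to_gguf`, the `sparse_transformers.skip_threshold` KV key, and the tensor naming are all assumptions, and a real export would also quantize and remap Hugging Face tensor names to the names llama.cpp expects.

```python
# Hypothetical export helper; KV keys and the state-dict layout are
# assumptions, not the repo's actual modelling_llama_skip.py contract.
import torch
from gguf import GGUFWriter  # pip install gguf (maintained in the llama.cpp repo)


def export_skip_model_to_gguf(model: torch.nn.Module, path: str) -> None:
    writer = GGUFWriter(path, arch="llama")

    # Standard llama metadata llama.cpp expects, read from the HF config.
    writer.add_block_count(model.config.num_hidden_layers)
    writer.add_context_length(model.config.max_position_embeddings)
    writer.add_embedding_length(model.config.hidden_size)
    writer.add_head_count(model.config.num_attention_heads)

    # Hypothetical custom KV pair carrying the sparsity threshold so the
    # llama.cpp side could reconstruct the skip behaviour.
    writer.add_float32("sparse_transformers.skip_threshold", 0.1)

    # Dump weights as float32; a real export would quantize and rename
    # tensors to llama.cpp's expected layout.
    for name, tensor in model.state_dict().items():
        writer.add_tensor(name, tensor.detach().float().cpu().numpy())

    writer.write_header_to_file()
    writer.write_kv_data_to_file()
    writer.write_tensors_to_file()
    writer.close()
```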
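
For the dispatch step, the `llama-cpp-python` bindings are one way to wire this in without shelling out. A minimal sketch, assuming the exported GGUF loads as a regular llama-architecture model; the model path and generation parameters are placeholders:

```python
# Minimal dispatch sketch using llama-cpp-python
# (pip install llama-cpp-python); path and params are placeholders.
from llama_cpp import Llama


def run_llamacpp_inference(gguf_path: str, prompt: str, max_tokens: int = 128) -> str:
    # Load the sparse transformers GGUF like any other llama.cpp model.
    llm = Llama(model_path=gguf_path, n_ctx=2048, n_threads=8, verbose=False)
    out = llm(prompt, max_tokens=max_tokens, temperature=0.7)
    return out["choices"][0]["text"]


if __name__ == "__main__":
    print(run_llamacpp_inference("llama-skip.gguf", "The capital of France is"))
```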
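
For the benchmarking step, `run_benchmark.py` could shell out to llama.cpp's `llama-bench` tool and parse its JSON output. A sketch under stated assumptions: I don't know the script's current structure, so the `--backend` flag, the `bench_llamacpp` helper, and the reported fields are illustrative only.

```python
# Illustrative run_benchmark.py extension; the --backend flag and the
# parsing of llama-bench output are assumptions about how the existing
# script could be wired up, not its current interface.
import argparse
import json
import subprocess


def bench_llamacpp(gguf_path: str, n_prompt: int = 512, n_gen: int = 128) -> list[dict]:
    # llama-bench ships with llama.cpp; "-o json" requests machine-readable output.
    result = subprocess.run(
        ["llama-bench", "-m", gguf_path,
         "-p", str(n_prompt), "-n", str(n_gen), "-o", "json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--backend", choices=["torch", "llamacpp"], default="torch")
    parser.add_argument("--model", required=True)
    args = parser.parse_args()
    if args.backend == "llamacpp":
        for row in bench_llamacpp(args.model):
            print(f"{row.get('n_prompt', 0)}p/{row.get('n_gen', 0)}g: "
                  f"{row.get('avg_ts', 0.0):.2f} tok/s")
```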