This repository contains the code to run the model designed in 'Adaptive Vision Transformers for Efficient Processing of Video Data in Automotive Applications'. Below you'll find details about the code structure, how to set up the environment, run inference, and interpret the benchmarking results.
model contains the primary implementation, extending the mmengine and mmsegmentation frameworks.
- encoder-decoder: Modified encoder-decoder implementation.
- token reducing vision transformer: Token-reducing Vision Transformer module.
- setup: Notebook to install model weights and dependencies.
- example.ipynb: Example notebook demonstrating inference with the model (see the inference sketch after this list).
- benchamrking: Benchmarking results and analysis.
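
For orientation, a minimal inference sketch along the lines of example.ipynb might look as follows, assuming mmsegmentation >= 1.0 with its mmengine-based API. The config, checkpoint, and image paths are placeholders, and the exact registration of the custom modules may differ; example.ipynb is the authoritative reference.

```python
# Minimal inference sketch (hypothetical paths; see example.ipynb for the real workflow).
from mmseg.apis import init_model, inference_model, show_result_pyplot

config_file = 'configs/adaptive_vit.py'           # hypothetical config name
checkpoint_file = 'checkpoints/adaptive_vit.pth'  # hypothetical checkpoint name

# Build the segmentor; the custom encoder-decoder and token-reducing ViT must be
# registered (e.g. via custom_imports in the config or by importing the model package).
model = init_model(config_file, checkpoint_file, device='cuda:0')

# Run inference on a single frame and save the overlaid prediction.
result = inference_model(model, 'demo/frame_0001.jpg')
show_result_pyplot(model, 'demo/frame_0001.jpg', result, out_file='result.png', show=False)
```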
benchamrking contains NumPy files with the benchmarking results; a loading example follows the list below.
- Files starting with 'encode_times': Time taken to run the encoder, in seconds.
- Files starting with 'pixel_wise_acc': Pixel-wise accuracy loss compared to the original model, in %.
- Files starting with 'reduced_tokens_heatmap': Heatmaps of the locations where tokens are pruned most often.
- Files starting with 'pruned_tokens': Counts of pruned tokens, in absolute numbers.
- Files ending with '0.xx': Fixed threshold with a standard reduction interval of 8.
- Files containing 'lin': Linear threshold.
- Files containing 'all_layers': Reduction interval set to 1.
- Files containing 'int_x': Varying reduction intervals.
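
As a rough illustration, the result files can be inspected with NumPy as sketched below; the file names are hypothetical examples assembled from the prefixes and suffixes described above and may not match the actual files verbatim.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical file names built from the naming scheme above; check the
# benchamrking directory for the exact files.
encode_times = np.load('benchamrking/encode_times_0.05.npy')    # encoder run times, seconds
acc_loss = np.load('benchamrking/pixel_wise_acc_0.05.npy')      # accuracy loss vs. original, %
heatmap = np.load('benchamrking/reduced_tokens_heatmap_0.05.npy')

print(f'mean encoder time: {encode_times.mean():.4f} s')
print(f'mean pixel-wise accuracy loss: {acc_loss.mean():.2f} %')

# Visualize where tokens are pruned most often.
plt.imshow(heatmap, cmap='viridis')
plt.colorbar(label='times pruned')
plt.title('Most pruned token locations')
plt.savefig('reduced_tokens_heatmap.png')
```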