- Dresden, Germany
- claas.plus
- @claas.plus
Stars
noDRM / DeDRM_tools
Forked from apprenticeharper/DeDRM_tools
DeDRM tools for ebooks
A simple, but powerful UI toolset for various ROS2 utilities, with additional partial CLI support.
Collection of scripts to build small-scale datasets for fine-tuning video generation models.
[Nature Reviews Bioengineering🔥] Application of Large Language Models in Medicine. A curated list of practical guide resources of Medical LLMs (Medical LLMs Tree, Tables, and Papers)
AlignCLIP: Improving Cross-Modal Alignment in CLIP
📚 LookScanned.io - Make your PDFs look scanned
A project page template for academic papers. Demo at https://eliahuhorwitz.github.io/Academic-project-page-template/
🎥 Python and OpenCV-based scene cut/transition detection program & library.
Official repository of the GraSP dataset and implementation of TAPIS
Extract frames and motion vectors from H.264 and MPEG-4 encoded video.
Witness the aha moment of VLM with less than $3.
An efficient video loader for deep learning with smart shuffling that's super easy to digest
Implementation of the "Learn No to Say Yes Better" paper.
PyTorch code and models for V-JEPA self-supervised learning from video.
🐍 The official Python client library for Google's discovery based APIs.
[MICCAI'23] Foundation Model for Endoscopy Video Analysis via Large-scale Self-supervised Pre-train
[ECCV 2024] Official PyTorch implementation of DreamLIP: Language-Image Pre-training with Long Captions
Official implementation of Pix2SG, the first location-free scene graph generation method, as well as the corresponding heuristic tree search-based evaluation implemented in C++.
Google Drive Public File Downloader when Curl/Wget Fails
Surgical Visual Question Answering. A transformer-based surgical VQA model. Official implementation of "Surgical-VQA: Visual Question Answering in Surgical Scenes using Transformers", MICCAI 2022.
Document (PDF, Word, PPTX ...) extraction and parsing API using state-of-the-art OCRs + Ollama-supported models. Anonymize documents. Remove PII. Convert any document or picture to structured …
A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.
A CLI tool to convert your codebase into a single LLM prompt with source tree, prompt templating, and token counting.