tjmlabs/ColiVara-docs

Welcome

Welcome to the ColiVara documentation! Here you'll get an overview of all the features ColiVara offers to help you build a state-of-the-art retrieval system.

ColiVara is a suite of services that allows you to store, search, and retrieve documents based on their visual embeddings.

Why visual embeddings?

Documents are visually rich structures that convey information through text as well as tables, figures, page layouts, and charts. While legacy document retrieval systems perform well on query-to-text matching, they struggle to pass visual cues efficiently to large language models, which hinders their performance in practical document retrieval applications such as Retrieval-Augmented Generation (RAG).

ColiVara is a web-first implementation of the ColPali paper that uses ColQwen2 as its underlying model. From the end user's standpoint it works exactly like RAG, but it uses vision models instead of chunking and text processing for documents. No OCR, no text extraction, no broken tables, and no missing images. What you see is what you get.
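To make "visual embeddings" more concrete, here is a minimal sketch of the late-interaction (MaxSim) scoring idea used by ColPali-style retrievers: each query token and each page patch gets its own vector, and a page's score is the sum, over query vectors, of the best dot product against any patch vector. This is an illustration only, not ColiVara's code or API; the function names and the tiny 2-dimensional vectors are hypothetical.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def late_interaction_score(query_vecs, page_vecs):
    """MaxSim-style score: for each query vector, take the best-matching
    page-patch vector, then sum those maxima."""
    return sum(max(dot(q, p) for p in page_vecs) for q in query_vecs)


# Hypothetical toy embeddings (real models use far more dimensions and patches).
query = [[1.0, 0.0], [0.0, 1.0]]
pages = {
    "page_a": [[0.9, 0.1], [0.2, 0.8]],
    "page_b": [[0.5, 0.5], [0.4, 0.3]],
}

# Rank pages by descending score.
ranked = sorted(
    pages,
    key=lambda name: late_interaction_score(query, pages[name]),
    reverse=True,
)
print(ranked)
```

Because the score is computed against patches of the page image, tables, figures, and layout contribute to the match without any text extraction step.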

Performance

{% hint style="info" %} ColiVara's performance is near state-of-the-art for Retrieval-Augmented Generation on the ViDoRe leaderboard. We significantly outperformed current methods for document parsing and processing, such as OCR and captioning. {% endhint %}

Our detailed Benchmark Performance Evaluation illustrates ColiVara's performance across diverse benchmarks. Two metrics were recorded for a comprehensive analysis: the NDCG@5 score (Normalized Discounted Cumulative Gain at rank 5) and latency.

| Benchmark | ColiVara Score (NDCG@5) | Avg Latency (s) (lower is better) | Num Docs |
| --- | --- | --- | --- |
| Average | 86.8 | N/A | N/A |
| ArxivQA | 87.6 | 3.2 | 500 |
| DocVQA | 54.8 | 2.9 | 500 |
| InfoVQA | 90.1 | 2.9 | 500 |
| Shift Project | 87.7 | 5.3 | 1000 |
| Artificial Intelligence | 98.7 | 4.3 | 1000 |
| Energy | 96.4 | 4.5 | 1000 |
| Government Reports | 96.8 | 4.4 | 1000 |
| Healthcare Industry | 98.5 | 4.5 | 1000 |
| TabFQuad | 86.6 | 3.7 | 280 |
| TatQA | 70.9 | 8.4 | 1663 |
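For readers unfamiliar with the metric, the following is a minimal sketch of how NDCG@5 can be computed for a single query. It is not the evaluation code behind the table; it assumes linear gain, assumes the input list holds the relevance label of each returned result in rank order, and assumes the ideal ordering is obtained by sorting that same list. The table appears to report the value on a 0 to 100 scale.

```python
import math


def dcg_at_k(relevances, k=5):
    """Discounted cumulative gain over the top-k results.
    Index 0 is rank 1, so the discount is log2(rank + 1) = log2(index + 2)."""
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=5):
    """DCG of the actual ranking divided by the DCG of the ideal ranking."""
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no relevant results at all
    return dcg_at_k(relevances, k) / ideal_dcg


# The single relevant page was returned at rank 2 instead of rank 1.
print(round(ndcg_at_k([0, 1, 0, 0, 0]), 4))
```

A score of 1.0 means the relevant pages were ranked in the best possible order within the top 5; pushing a relevant page lower in the ranking reduces the score logarithmically.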

Key Findings

  • ColiVara dominated visual-heavy benchmarks such as ArxivQA and InfoVQA with an NDCG@5 score of 88.1, double the performance of captioning-based systems.
  • Even on text-centric benchmarks such as DocVQA, and on multimodal benchmarks such as InfoVQA, ColiVara outperformed traditional methods by up to 30%.
  • On the more comprehensive, domain-specific benchmarks (Sustainability, Energy, AI, Government Reports, Healthcare), where answering a query depends on analyzing visual and textual content together, ColiVara clearly outperforms competing systems. It scores between 96.4 and 98.7 on Energy, Artificial Intelligence, Government Reports, and Healthcare Industry, and 87.7 on Shift Project. This is due to ColiVara's Holistic Multimodal Integration and Spatial Context Awareness.

Jump right in

  • Getting Started: Create your first RAG pipeline with 2 lines of code (quickstart.md)
  • Guides: Learn everything ColiVara has to offer
  • API Reference: Try the API live
