colpali v1.3 by AndrewOgn #427
base: main
Conversation
To check the values for the tests, I used code examples from here.
Resolved (outdated) review threads on:
fastembed/late_interaction_multimodal/late_interaction_multimodal_embedding_base.py
fastembed/late_interaction_multimodal/late_interaction_multimodal_embedding_base.py
fastembed/late_interaction_multimodal/late_interaction_multimodal_embedding.py
PAD_TOKEN = "<pad>"
QUERY_MARKER_TOKEN_ID = [2, 9413]
IMAGE_PLACEHOLDER_SIZE = (3, 448, 448)
EMPTY_TEXT_PLACEHOLDER = np.array([257152] * 1024 + [2, 50721, 573, 2416, 235265, 108])
These are actually the token ids of the string '<image>' * 1024 + '<bos>Describe the image.\n'.
Could we make it nicer? It's not really readable at the moment.
EVEN_ATTENTION_MASK is also not really readable; maybe instead of having this even_attention_mask
we could assign 1030 to a constant, which seems a bit more reasonable.
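For illustration, something along these lines might be more readable. The names IMAGE_TOKEN_ID, NUM_IMAGE_TOKENS, and EVEN_ATTENTION_MASK_LEN are made up here, and the exact shape of the attention mask in the PR may differ:

```python
import numpy as np

# Hypothetical names, shown only to illustrate the suggestion above.
IMAGE_TOKEN_ID = 257152                                # id of '<image>'
NUM_IMAGE_TOKENS = 1024                                # number of image placeholder tokens
TEXT_SUFFIX_IDS = [2, 50721, 573, 2416, 235265, 108]   # '<bos>Describe the image.\n'

EMPTY_TEXT_PLACEHOLDER = np.array([IMAGE_TOKEN_ID] * NUM_IMAGE_TOKENS + TEXT_SUFFIX_IDS)

# 1024 image tokens + 6 text tokens = 1030 positions to attend to.
EVEN_ATTENTION_MASK_LEN = len(EMPTY_TEXT_PLACEHOLDER)  # 1030
EVEN_ATTENTION_MASK = np.ones(EVEN_ATTENTION_MASK_LEN, dtype=np.int64)
```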
Those are the tokens, right. But there are some slight differences between the real text input and this placeholder.
As you can see here, the original preprocessor does not add the '\n' token there, while it should be added everywhere else. So we would need to refactor the tokenize logic and trigger the tokenizer each time (with constant output).
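Roughly something like this, assuming a tokenizers.Tokenizer is already loaded; the helper name build_empty_text_placeholder and the special-token handling are only a sketch and would need to match the original preprocessor:

```python
import numpy as np
from tokenizers import Tokenizer

def build_empty_text_placeholder(tokenizer: Tokenizer, num_image_tokens: int = 1024) -> np.ndarray:
    # Let the tokenizer produce the ids instead of hardcoding them, so the
    # placeholder stays in sync with however real text inputs are tokenized
    # (including whether the trailing '\n' token ends up being added or not).
    text = "<image>" * num_image_tokens + "<bos>Describe the image.\n"
    encoding = tokenizer.encode(text, add_special_tokens=False)
    return np.array(encoding.ids)
```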
descriptions, docs, black
This is a draft of the second iteration of work on colpali #394.