OCR Multimodal Retrieval Using the ColPali Model

Rejected

Session Description

I'll discuss ColPali, a model that extends PaliGemma-3B to generate ColBERT-style multi-vector representations for both text and images. We'll explore how ColPali improves multimodal retrieval and search across text and image data, and look at its recent benchmark results on vision retrieval tasks. The session will also cover the key concepts behind ColBERT-style multi-vector representations and their impact on retrieval performance.
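To make the idea of ColBERT-style multi-vector scoring concrete, here is a minimal sketch of late-interaction (MaxSim) scoring in plain Python. It is not ColPali's code: the function names (`l2_normalize`, `maxsim_score`, `rank_pages`) and the toy 2-dimensional vectors are assumptions made for illustration. In a real system the query vectors would come from the model's embeddings of query tokens and the page vectors from its embeddings of image patches.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products behave like cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def maxsim_score(query_vecs, doc_vecs):
    """Late-interaction score: for each query vector, take the best-matching
    document vector (maximum dot product), then sum those maxima."""
    q = [l2_normalize(v) for v in query_vecs]
    d = [l2_normalize(v) for v in doc_vecs]
    total = 0.0
    for qv in q:
        total += max(sum(a * b for a, b in zip(qv, dv)) for dv in d)
    return total


def rank_pages(query_vecs, pages):
    """Return (page_name, score) pairs sorted from best to worst match."""
    scored = [(name, maxsim_score(query_vecs, vecs)) for name, vecs in pages.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy data (hypothetical): one vector per query token, one vector per page patch.
query = [[1.0, 0.0], [0.0, 1.0]]
pages = {
    "page_a": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    "page_b": [[1.0, 0.0], [-1.0, 0.0]],
}

ranking = rank_pages(query, pages)
for name, score in ranking:
    print(name, round(score, 3))
```

Because every query vector is matched against every page vector separately, a page can score well when different regions of it answer different parts of the query. A single pooled embedding per page cannot express that.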

GitHub link: https://github.com/samvardhan777/Colipali

Slides: https://github.com/samvardhan777/Colipali/blob/main/ppt/ColiPali%20Model%20OCR.pdf

Key Takeaways

None

References

https://medium.com/@samvardhan777/colbert-paligemma-colipali-for-document-retrieval-3bb8cc80326c

Session Categories

FOSS

Speakers

sam

Data Engineer, Thoughtworks

Reviews

Approvability: 0 %

Rather narrowly focused on a topic that not many attendees will have insight into, IMO.

Reviewer #1

Rejected