Talk
Intermediate

Mitigating Hallucinations in Multimodal LLMs with Google's HALVA

Rejected

Session Description

HALVA: Hallucination Attenuated Language and Vision Assistant

A new contrastive tuning strategy mitigates hallucinations while retaining general performance in multimodal LLMs.

Data-augmented contrastive tuning has been introduced to mitigate object hallucination in multimodal LLMs (MLLMs). The method mitigates object hallucination and related errors while retaining or improving performance on general vision-language tasks. Moreover, the contrastive tuning is simple and fast, requires minimal training, and adds no overhead at inference. The approach may also apply elsewhere; for example, it might be adapted to mitigate bias and harmful language generation.
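The core idea can be sketched as a per-token contrastive objective: given a correct caption and a data-augmented negative in which an object is swapped for a hallucinated one, tuning raises the model's likelihood of the correct token relative to the hallucinated counterpart. The function below is an illustrative simplification, not the paper's exact objective; the names and the softplus-margin form are assumptions for this sketch.

```python
import math

def contrastive_loss(logp_correct, logp_hallucinated):
    """Sketch of a data-augmented contrastive objective (illustrative,
    not HALVA's exact loss). Both arguments are per-token
    log-probabilities from the same model: one list scored on the
    correct caption, one on its augmented (hallucinated) variant."""
    loss = 0.0
    for lp_pos, lp_neg in zip(logp_correct, logp_hallucinated):
        # Softplus margin: near zero when the correct token already
        # dominates, growing as the hallucinated token becomes likelier.
        loss += math.log(1.0 + math.exp(lp_neg - lp_pos))
    return loss / len(logp_correct)
```

Minimizing this pushes the model's probability mass toward grounded tokens without any extra machinery at inference time, which matches the session's claim of no inference overhead.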

Key Takeaways

None

References

Session Categories

FOSS

Speakers

Jayita Bhattacharyya
AI Evangelist, Deloitte

Reviews

100%
Approvability
1
Approvals
0
Rejections
0
Not Sure
This is very interesting - it's an algorithm/method, not a project itself. In the paper it was applied to an open-source model, so I hope the speaker doesn't bring up closed-source models.
Reviewer #1
Approved