(English below)

Vietnamese Vision-Language Model (Vietnamese-VLM)

Chúng tôi là ai?

Vietnamese Vision-Language Model (Vietnamese-VLM) là một dự án nghiên cứu tập trung vào lĩnh vực multimodal, tích hợp cả thị giác và ngôn ngữ cho tiếng Việt.

Nhóm nghiên cứu của Vi-VLM mong muốn đóng góp, phát triển những mô hình và bộ dữ liệu chất lượng cao nhất nhằm thúc đẩy sự phát triển của Trí tuệ nhân tạo trong lĩnh vực Vision-Language.

Những đóng góp hiện tại

Vista: Bộ dữ liệu lớn cho vision-language được xây dựng dựa trên LLAVA, ShareGPT4V, WIT.
Vistral-V (Vistral-Vision): Visual Instruction cho model Vistral - Mô hình hình ảnh và ngôn ngữ lớn cho tiếng Việt.

Các thành viên

Oanh Tran
Hop Bui
Hoang Ha
Phan Phuc

Who are we?

Vietnamese Vision-Language Model (Vietnamese-VLM) is a research project on multimodal AI that integrates vision and language for Vietnamese.

The Vi-VLM research team aims to develop and contribute high-quality models and datasets that advance Artificial Intelligence in the vision-language field.

Current contributions

Vista: A large vision-language dataset built on LLaVA, ShareGPT4V, and WIT.
Vistral-V (Vistral-Vision): Visual instruction tuning for the Vistral model, giving a large vision-language model for Vietnamese.

Members

Oanh Tran
Hop Bui
Hoang Ha
Phan Phuc