Jñāna - Multimodal LLM App

Jñāna is a multimodal LLM app built with Gradio and hosted on Hugging Face Spaces. It accepts image, audio, or text input, or any combination of the three. Jñāna uses the microsoft/phi-2 LLM, trained following the approach of LLaVA 1.0 and LLaVA 1.5. QLoRA was used to fine-tune microsoft/phi-2.
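
As a rough illustration of how a Gradio app can accept any combination of image, audio, and text, here is a minimal sketch. It is not the app's actual code: the names describe_inputs and build_demo are hypothetical, and the handler only reports which inputs arrived instead of calling the fine-tuned model.

```python
from typing import Optional


def describe_inputs(
    image: Optional[str], audio: Optional[str], text: Optional[str]
) -> str:
    """Report which modalities were supplied.

    Hypothetical stand-in for the real model call, which is not shown here.
    """
    provided = [
        name
        for name, value in (("image", image), ("audio", audio), ("text", text))
        if value  # None and "" both count as "not supplied"
    ]
    if not provided:
        return "Please provide an image, an audio clip, text, or any combination."
    return "Received: " + ", ".join(provided)


def build_demo():
    """Wire the handler into a Gradio interface (assumes gradio is installed)."""
    import gradio as gr  # imported lazily so the handler can be used without it

    return gr.Interface(
        fn=describe_inputs,
        inputs=[
            gr.Image(type="filepath", label="Image"),
            gr.Audio(type="filepath", label="Audio"),
            gr.Textbox(label="Text"),
        ],
        outputs=gr.Textbox(label="Response"),
        title="Jñāna - Multimodal LLM App",
    )
```

Calling build_demo().launch() would start the interface locally. Each input is optional, so the handler receives None (or an empty string for the text box) for anything the user leaves out.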

Try out the app: Jnana-Multimodal LLM

Check the code: GitHub

Read my detailed blog post on building this app: Blog

View the YouTube video demo: YouTube Demo Video (1:15 min)