GPT-4 is a powerful natural language processing system that can generate coherent and diverse text across many topics and domains. However, its weights are not publicly available, and a model of its scale requires substantial computational resources to run. There is therefore a need for alternative models that offer similar capabilities but are accessible and free for anyone to use. A few such models can understand prompts and provide informative, engaging responses.
General overview of each model:
- Vicuna: Vicuna is a chat assistant fine-tuned from LLaMA, a foundation language model, on user-shared conversations. Its performance is expected to be similar to that of Koala, another chat assistant fine-tuned from LLaMA.
- Koala: Koala is a chatbot fine-tuned from LLaMA on user-shared conversations and open-source datasets. It performs similarly to Vicuna.
- ChatGLM: ChatGLM is an open bilingual dialogue language model that can understand and respond to text in both English and Chinese. Unlike the chat models above, it is not derived from LLaMA; it is built on the General Language Model (GLM) architecture. It is open source and can be used for a variety of natural language processing tasks.
- Alpaca: Alpaca is a model fine-tuned from LLaMA on 52K instruction-following demonstrations. It is an instruction-following language model trained on that specific dataset, rather than a chatbot or chat assistant.
- LLaMA: LLaMA is an open and efficient foundation language model that can be used for a variety of natural language processing tasks. It can understand and generate text in multiple languages and has a wide range of potential applications. It is the base model from which several others, such as Vicuna, Koala, and Alpaca, have been fine-tuned.
How Is Vicuna Leading the Race?
Vicuna-13B is a chatbot developed by a collaborative project involving members from several leading institutions, including UC Berkeley, CMU, Stanford, UC San Diego, and MBZUAI. The team introduces Vicuna-13B as an open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT.
The team claims that a preliminary evaluation using GPT-4 as a judge shows Vicuna-13B achieving more than 90%* of the quality of OpenAI ChatGPT and Google Bard, while outperforming other models such as LLaMA and Stanford Alpaca in more than 90%* of cases. The cost of training Vicuna-13B is around $300. The training and serving code, along with an online demo, are publicly available for non-commercial use.
The team presents examples of Alpaca and Vicuna responses to its benchmark questions. After fine-tuning Vicuna on 70K user-shared ChatGPT conversations, the team found that Vicuna generates more detailed and better-structured answers than Alpaca, with quality on par with ChatGPT.
With recent advancements in GPT-4, the team was curious whether its capabilities have reached a human-like level that could enable an automated evaluation framework for benchmark generation and performance assessment. Their initial findings indicate that GPT-4 can produce highly consistent rankings and detailed assessments when comparing chatbots' answers. Preliminary evaluations based on GPT-4 show that Vicuna achieves 90%* of the capability of Bard/ChatGPT.
While the proposed framework shows potential for automating chatbot assessment, it is not yet a rigorous approach. Building an evaluation system for chatbots remains an open question requiring further research. More details are provided in the sections below.
Vicuna Model Training: The team created Vicuna by fine-tuning a LLaMA base model on around 70,000 conversations gathered from ShareGPT.com through its public APIs. They ensured data quality by filtering out inappropriate or low-quality samples, converting the HTML back to markdown, and dividing lengthy conversations into smaller segments that fit the model's maximum context length. The team also made several improvements to the training recipe: expanding the maximum context length to 2048 so the model can handle longer context, and reducing training cost by using SkyPilot managed spot instances. Finally, they built a serving system that can run on cheaper spot instances to reduce serving costs.
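The step of dividing lengthy conversations into segments that fit the context length can be illustrated with a small sketch. This is not the team's actual preprocessing code: the function names, the greedy packing strategy, and the whitespace-based token counter are all assumptions made for illustration. A real pipeline would count tokens with the model's own tokenizer.

```python
from typing import Callable, List

MAX_CONTEXT_TOKENS = 2048  # maximum context length mentioned in the text


def count_tokens_whitespace(text: str) -> int:
    """Hypothetical stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def split_conversation(
    turns: List[str],
    max_tokens: int = MAX_CONTEXT_TOKENS,
    count_tokens: Callable[[str], int] = count_tokens_whitespace,
) -> List[List[str]]:
    """Greedily pack consecutive turns into segments of at most max_tokens tokens."""
    segments: List[List[str]] = []
    current: List[str] = []
    current_len = 0
    for turn in turns:
        n = count_tokens(turn)
        if n > max_tokens:
            # Assumption: a single turn that cannot fit is dropped in this sketch;
            # a real pipeline might truncate it instead.
            if current:
                segments.append(current)
                current, current_len = [], 0
            continue
        if current and current_len + n > max_tokens:
            # Adding this turn would overflow the segment, so start a new one.
            segments.append(current)
            current, current_len = [], 0
        current.append(turn)
        current_len += n
    if current:
        segments.append(current)
    return segments


if __name__ == "__main__":
    demo_turns = ["a b c", "d e", "f g h i", "j"]
    for segment in split_conversation(demo_turns, max_tokens=5):
        print(segment)
```

Splitting only at turn boundaries keeps each question and answer intact, at the cost of leaving some segments shorter than the limit.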
How to Evaluate a Chatbot? Evaluating AI chatbots is challenging because it requires examining language understanding, reasoning, and context awareness. The team proposes an evaluation framework based on GPT-4 to automate chatbot performance assessment. They devised eight question categories to test various aspects of a chatbot's performance and collected answers from five chatbots, including Vicuna. They then asked GPT-4 to rate the quality of the answers based on helpfulness, relevance, accuracy, and detail. GPT-4 prefers Vicuna over state-of-the-art open-source models on more than 90% of the questions, and Vicuna achieves competitive performance against proprietary models: its total score is 92% of ChatGPT's. While the proposed evaluation framework demonstrates potential, developing a comprehensive, standardized evaluation system for chatbots remains an open question requiring further research.
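To make the two headline numbers concrete, the sketch below shows one plausible way to aggregate per-question judge scores. It assumes that the "total score" figure is the ratio of summed scores and that "prefers" means a strictly higher score on a question; the text does not spell out either definition. The function names are hypothetical, and the scores in the usage example are made-up placeholders, not results from the Vicuna evaluation.

```python
from typing import List, Tuple

# One (candidate_score, baseline_score) pair per benchmark question,
# as assigned by the judge model.
ScorePairs = List[Tuple[float, float]]


def relative_total_score(pairs: ScorePairs) -> float:
    """Candidate's summed score as a fraction of the baseline's summed score."""
    candidate_total = sum(c for c, _ in pairs)
    baseline_total = sum(b for _, b in pairs)
    if baseline_total == 0:
        raise ValueError("baseline total score is zero")
    return candidate_total / baseline_total


def preference_rate(pairs: ScorePairs) -> float:
    """Fraction of questions on which the candidate scores strictly higher."""
    if not pairs:
        raise ValueError("no score pairs given")
    wins = sum(1 for c, b in pairs if c > b)
    return wins / len(pairs)


if __name__ == "__main__":
    # Made-up placeholder scores for illustration only.
    placeholder_pairs = [(8.0, 9.0), (9.0, 8.0), (7.0, 7.0), (8.0, 10.0)]
    print(f"relative total score: {relative_total_score(placeholder_pairs):.2%}")
    print(f"preference rate: {preference_rate(placeholder_pairs):.2%}")
```

The two metrics answer different questions: the ratio of totals measures how close the overall quality is, while the preference rate measures how often one chatbot beats another, regardless of margin.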
How Vicuna-13B Chatbot Relates to LLaMA and Botonic
Vicuna-13B is an open-source chatbot that uses LLaMA as its base model. LLaMA is a large language model that can generate text for different domains and tasks. The Vicuna team fine-tuned LLaMA on user-shared conversations from ShareGPT, a site where users share their ChatGPT conversations. The result is a high-quality open-source chatbot that can converse on a wide range of topics.
Botonic is a separate open-source chatbot framework that lets you build chatbots with React and TensorFlow.js. It is not part of Vicuna's published training or serving code, but it is one option for building a front end around a model such as Vicuna-13B. With Botonic you can create text and graphical interfaces for your chatbots and deploy them on different channels, such as web, mobile, social media, and voice assistants. Botonic also supports natural language understanding and dialogue management.
What is Open Source? Open source is a way of developing software that makes the source code available to everyone. Anyone can use, modify, customize, and distribute the software, subject to the terms of its license. Open source is beneficial because it encourages collaboration, innovation, and transparency among developers and users. Vicuna-13B is an example of an open-source chatbot that follows this philosophy, although its code and demo are released for non-commercial use.
Conclusion
Vicuna-13B is a top-performing open-source chatbot that surpasses other open models in preliminary tests. Fine-tuning LLaMA on user-shared conversations from ShareGPT enables it to deliver comprehensive and well-organized responses. While the team's automated evaluation framework shows promise, further work is required to establish a robust evaluation system for chatbots.
source - https://vicuna.lmsys.org/
demo link - https://chat.lmsys.org/