Have open-source large models really made more progress and greater achievements over the past year than closed-source large models?
The Impact of Open Source and Closed Source Large Models
Closed Source Large Models Surpass Open Source in Performance
Open source large models are still significantly outperformed by their closed source counterparts. A comparative study in a recent paper highlights this gap across various tasks.
It is worth noting that GPT-4, rumored to have finished training as early as 2022 and released in March 2023, is no longer considered "new". Yet on many tasks, even ChatGPT can outperform the open source Llama-2-70B-chat.
Currently, the two best-performing large models, OpenAI’s ChatGPT/GPT-4 and Google’s Gemini, remain closed source. They significantly outperform open source models such as Meta’s Llama and Alibaba’s Tongyi Qianwen.
The Indispensable Role of Open Source Large Models in Advancing the Field
Open source large models, particularly Llama, have been crucial in advancing the field. Notable Llama-based models and derivatives include:
- LLaMA Pro: An extended-training version of Llama from Tencent.
- TinyLlama: A capable small model (1.1B parameters).
- LlaMaVAE
- Lag-Llama: For time series prediction.
- lawyer-llama: Legal domain application.
- fin-llama: Specialized in finance.
- code-llama: Focused on coding.
Without the open sourcing of models like Llama, such research initiatives would have been difficult to begin.
The impact of these models is also evident in their academic citations. For instance, Llama-2, released in July, has only about half the citations of GPT-4, released in March, yet a higher proportion of its citations are high-impact or methodological; GPT-4, by contrast, is more often cited merely in the background sections of papers.
The difference in influence is even more apparent when the comparison is made against Llama-1, released just three weeks before GPT-4.
Hence, open source large models play an indispensable role in driving the field forward. They may not be the leaders, but they are essential contributors to its progress.
The Progress and Impact of Open Source Large Language Models
During a live AI salon, someone jokingly slipped in this question. It is hard to answer, first because we do not know how much progress closed source large models have actually made: closed source models operate in secrecy, while open source models are transparent. Moreover, closed source models can learn from open source ones, picking up new ideas and experience, while the reverse is hardly feasible given how few technical details closed source models disclose.
Therefore, a direct comparison is impractical. It is undeniable, however, that open source large models have made significant achievements and progress over the past year. Examples include Tongyi, ChatGLM, Baichuan, and Yi from China, and Llama, MistralAI’s models, and Falcon from abroad, all of which are outstanding open source models.
Meta has been a forerunner in the open source Large Language Model (LLM) movement. In February 2023, Meta released Llama, followed by Llama-2 in July.
Yann LeCun is not only a critic of ChatGPT but also a staunch supporter of open source.
MistralAI, a French AI startup, pioneered a magnet-link release style: announcing new models simply by posting a BitTorrent magnet (download) link, straightforward and direct.
MistralAI not only open-sourced an 8x7B mixture-of-experts (MoE) model but also announced plans to open source a "GPT-4 level model" in 2024.
Near the end of 2023, Microsoft open-sourced phi-2, showcasing the potential of smaller models. Higher-quality training data may push the performance of these small models further still, and this year we will likely see more of them running locally on mobile devices and PCs (even on CPU).
Open source LLMs are undoubtedly meaningful. The open source community fosters more intellectual exchange and verification, advancing our understanding of the technology and enabling academia to play a more collaborative role in the progress of LLMs.
The field of large models naturally requires significant investment in open source projects…