WritingQwen (Alibaba Cloud)Qwen (Alibaba Cloud)published Nov 15, 2024seen 6d

Extending the Context Length to 1M Tokens!

Open original ↗

Captured source

source ↗
published Nov 15, 2024seen 6dcaptured 3dhttp 200method plain

Extending the Context Length to 1M Tokens! | Qwen

We have a new blog! View this page at qwen.ai . This page will automatically redirect in 5 seconds. If you are not redirected automatically, please click the button below. Go Now

Extending the Context Length to 1M Tokens! November 15, 2024 · 11 min · 2314 words · Qwen Team | Translations: 简体中文

API Documentation (Chinese) HuggingFace Demo ModelScope Demo Introduction # After the release of Qwen2.5, we heard the community’s demand for processing longer contexts. In recent months, we have made many optimizations for the model capabilities and inference performance of extremely long context. Today, we are proud to introduce the new Qwen2.5-Turbo version, which features: Longer Context Support : We have extended the model’s context length from 128k to 1M , which is approximately 1 million English words or 1.5 million Chinese characters , equivalent to 10 full-length novels, 150 hours of speech transcripts, or 30,000 lines of code. The model achieves 100% accuracy in the 1M length Passkey Retrieval task and scores 93.1 on the long text evaluation benchmark RULER, surpassing GPT-4’s 91.6 and GLM4-9B-1M’s 89.9. Additionally, the model maintains very strong competitiveness in short sequence capabilities, on par with GPT-4o-mini. Faster Inference Speed : Using sparse attention mechanisms, we successfully reduced the time to first token for processing a context of 1M tokens from 4.9 minutes to 68 seconds, achieving a 4.3x speedup . Lower Cost : The price remains ¥0.3 / 1M tokens. At the same cost, Qwen2.5-Turbo can process 3.6 times the number of tokens as GPT-4o-mini.

Now, you can use it through the API service of Alibaba Cloud Model Studio [Chinese] , or through HuggingFace Demo or ModelScope Demo . Demo # We have prepared some demos to showcase the new uses that Qwen2.5-Turbo can bring, covering deep understanding of long novels, repository-level code assistant, and reading multiple papers. Example: Understanding Long Novels Next

User [Upload the three Chinese novels of The Three-Body Problem , which contain 690k tokens.] Provide a summary of the plots of these three novels in English

Qwen2.5-Turbo The three novels you mentioned from the “Earth’s Past” series are interconnected and form a trilogy that delves into the complex interactions between Earth and an alien civilization known as the Trisolarans. First Novel: “Earth’s Past” In the first novel, the narrative focuses on the development of a top-secret project called “Red Coast” (红岸) initiated by the Chinese military. The protagonist, Ye Wenjie, is a scientist who discovers that the Trisolarans have begun listening to the universe. Her actions, particularly her decision to respond to the Trisolarans, drastically alter the course of human history. She sets up a powerful signal to send to the Trisolarans using the sun as an amplifier. This action marks the beginning of a new era where humanity faces a significant existential threat. Second Novel: “Dark Forest” The second novel, titled “Dark Forest,” delves into the concept of the “Dark Forest Theory.” According to this theory, the universe is a vast battlefield where civilizations must remain silent to avoid detection and subsequent destruction. The Trisolarans and humans engage in a complex strategic game, with the Trisolarans deploying advanced weapons known as “water drops” (水滴). These water drops are capable of destroying entire planets by disrupting their ecosystems. Meanwhile, humans develop their own strategies, including the “Wall-Bearing Plan” (面壁计划), which involves four individuals who devise plans to counter the Trisolaran threat without revealing their intentions to the outside world. The story follows one of these face-bearers, Luo Ji, as he struggles with the ethical implications of his mission. Third Novel: “Death’s End” In “Death’s End,” Luo Ji becomes the ultimate guardian of the Dark Forest Theory. He holds the key to the survival of both human and Trisolaran civilizations. However, his position as the guardian (执剑人) brings him immense personal struggle and psychological turmoil. As Luo Ji’s time as the guardian nears its end, he must make a critical decision about the future of humanity. Meanwhile, the Trisolarans continue their efforts to colonize and assimilate other worlds. The novel explores the consequences of this decision and the broader impact on the universe. It also introduces a new element: the creation of micro-black holes (黑域) by humans, which could potentially isolate them from the rest of the universe. However, the effectiveness of this strategy is questioned when six of these micro-black holes begin to move and accelerate towards Earth. The trilogy as a whole explores themes of survival, technological advancement, and the ethical dilemmas faced by humanity in the face of extraterrestrial threats. Each novel builds upon the previous one, revealing more about the intricate relationship between the two civilizations and the universe they inhabit.

Example: Understanding Repository Next

User [Upload a text file containing all code files from the Qwen-Agent repository, which has a length of 133k tokens.] Please read the codes carefully and ensure your response is comprehensive. Introduce this repository

Qwen2.5-Turbo Qwen-Agent is a framework designed for developing LLM (Large Language Model) applications. It is based on the instruction-following, tool usage, planning, and memory capabilities of Qwen. Here are some key points and functionalities: Agent Class : Provides a high-level interface for integrating tool calls and LLM interactions. It supports role-playing, automatic planning, and tool invocation. DocQA Agent : Specialized for document question answering, capable of handling long documents and providing context-aware responses. Function Calling : Supports function calling, enabling interaction with tools or APIs directly from the model’s responses. RAG (Retrieval-Augmented Generation) : Integrates document retrieval with generation, enhancing the model’s ability to recall and use information from documents. Tool Integration : Includes various built-in and custom tools, such as code interpreters, web extractors, and image generators, which can be used to perform tasks like code execution, web scraping, and…

Excerpt shown — open the source for the full document.

Notability

notability 8.0/10

Significant context length extension, notable research.

Qwen (Alibaba Cloud) has a writing signal matching infrastructure, product and customer.