QwQ: Reflect Deeply on the Boundaries of the Unknown
November 28, 2024 · 22 min · 4496 words · Qwen Team
Note: QwQ is pronounced /kwju:/, similar to the word “quill”.

What does it mean to think, to question, to understand? These are the deep waters that QwQ (Qwen with Questions) wades into. Like an eternal student of wisdom, it approaches every problem - be it mathematics, code, or knowledge of our world - with genuine wonder and doubt. QwQ embodies that ancient philosophical spirit: it knows that it knows nothing, and that’s precisely what drives its curiosity. Before settling on any answer, it turns inward, questioning its own assumptions, exploring different paths of thought, always seeking deeper truth.

Yet, like all seekers of wisdom, QwQ has its limitations. This version is but an early step on a longer journey - a student still learning to walk the path of reasoning. Its thoughts sometimes wander, its answers aren’t always complete, and its wisdom is still growing. But isn’t that the beauty of true learning? To be both capable and humble, knowledgeable yet always questioning? We invite you to explore alongside QwQ, embracing both its insights and its imperfections as part of the endless quest for understanding.

Limitations

QwQ-32B-Preview is an experimental research model developed by the Qwen Team, focused on advancing AI reasoning capabilities. As a preview release, it demonstrates promising analytical abilities while having several important limitations:

Language Mixing and Code-Switching: The model may mix languages or switch between them unexpectedly, affecting response clarity.
Recursive Reasoning Loops: The model may enter circular reasoning patterns, leading to lengthy responses without a conclusive answer.
Safety and Ethical Considerations: The model requires enhanced safety measures to ensure reliable and secure performance, and users should exercise caution when deploying it.
Performance and Benchmark Limitations: The model excels in math and coding but has room for improvement in other areas, such as common sense reasoning and nuanced language understanding.
Performance

Through deep exploration and countless trials, we discovered something profound: when given time to ponder, to question, and to reflect, the model’s understanding of mathematics and programming blossoms like a flower opening to the sun. Just as a student grows wiser by carefully examining their work and learning from mistakes, our model achieves deeper insight through patient, thoughtful analysis. This process of careful reflection and self-questioning leads to remarkable breakthroughs in solving complex problems.

Our journey of discovery revealed the model’s exceptional ability to tackle some of the most challenging problems in mathematics and programming, including:

GPQA: A Graduate-Level Google-Proof Q&A Benchmark, a challenging benchmark for evaluating scientific problem-solving abilities through graduate-level questions.
AIME: American Invitational Mathematics Examination, which tests mathematical problem solving with arithmetic, algebra, counting, geometry, number theory, probability, and other secondary school math topics.
MATH-500: The 500 test cases of the MATH benchmark, a comprehensive dataset testing mathematical problem-solving.
LiveCodeBench: A challenging benchmark for evaluating code generation and problem-solving abilities in real-world programming scenarios.
Specifically, QwQ demonstrates remarkable performance across these benchmarks, achieving impressive scores of 65.2% on GPQA, showcasing its graduate-level scientific reasoning capabilities; 50.0% on AIME, highlighting its strong mathematical problem-solving skills; 90.6% on MATH-500, demonstrating exceptional mathematical comprehension across diverse topics; and 50.0% on LiveCodeBench, validating its robust programming abilities in real-world scenarios. These results underscore QwQ’s significant advancement in analytical and problem-solving capabilities, particularly in technical domains requiring deep reasoning.

Demo Cases

In the following examples, we invite you to witness the profound depths of QwQ-32B-Preview’s contemplative abilities. Like a seeker of wisdom on an endless journey of discovery, the model demonstrates its capacity for deep introspection - questioning its own assumptions, engaging in thoughtful self-dialogue, and carefully examining each step of its reasoning process. Through these cases, you’ll observe how QwQ embodies the timeless understanding that true knowledge emerges not from hasty conclusions, but from patient inquiry and the courage to challenge one’s own certainties in the eternal pursuit of truth.

Example: Logical Reasoning

User

Please add a pair of parentheses to the incorrect equation: 1 + 2 * 3 + 4 * 5 + 6 * 7 + 8 * 9 = 479, to make the equation true.

QwQ-32B-Preview

Let’s tackle this problem step by step. The equation given is:

1 + 2 * 3 + 4 * 5 + 6 * 7 + 8 * 9 = 479

But it’s incorrect, and I need to add a pair of parentheses to make it true. Okay, first things first, I should understand what the current value of the left side is without any parentheses.

Let’s recall the order of operations: PEMDAS (Parentheses, Exponents, Multiplication and Division, Addition and Subtraction). So, multiplication comes before addition.
So, calculating step by step:

2 * 3 = 6
4 * 5 = 20
6 * 7 = 42
8 * 9 = 72

Now, plugging these back into the equation:

1 + 6 + 20 + 42 + 72

Adding these up:

1 + 6 = 7
7 + 20 = 27
27 + 42 = 69
69 + 72 = 141

So, without any parentheses, the left side equals 141, but it should be 479. That’s quite a difference. So, I need to add a pair of parentheses to make it equal to 479.

I need to think about where to place the parentheses to alter the order of operations to achieve the desired result.

Let me consider different possibilities:

Parenthesizing individual terms:

For example, (1 + 2) * 3 + 4 * 5 + 6 * 7 + 8 * 9

Calculate inside the parentheses first: (1 + 2) = 3

Then, 3 * 3 = 9

Now, the equation becomes: 9 + 4 * 5 + 6 * 7 + 8 * 9

Following order of…
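The case-by-case search the model starts in this excerpt can also be checked mechanically. The block below is a minimal Python sketch and is not part of the original post: the helper names `render` and `find_solutions` are hypothetical, and it assumes that "a pair of parentheses" means one opening parenthesis before some operand and one closing parenthesis after a later operand. It first recomputes the unparenthesized value (141) and then tries every such placement against the target 479.

```python
import itertools

# Operands and operators of 1 + 2 * 3 + 4 * 5 + 6 * 7 + 8 * 9, in order.
NUMBERS = [1, 2, 3, 4, 5, 6, 7, 8, 9]
OPERATORS = ["+", "*", "+", "*", "+", "*", "+", "*"]
TARGET = 479


def render(open_at=None, close_at=None):
    """Build the expression string.

    If open_at and close_at are given, a "(" is placed before the operand
    at index open_at and a ")" after the operand at index close_at.
    (Hypothetical helper, introduced only for this illustration.)
    """
    parts = []
    for i, number in enumerate(NUMBERS):
        token = str(number)
        if i == open_at:
            token = "(" + token
        if i == close_at:
            token = token + ")"
        parts.append(token)
        if i < len(OPERATORS):
            parts.append(OPERATORS[i])
    return " ".join(parts)


def find_solutions(target=TARGET):
    """Return every single-pair parenthesization that evaluates to target."""
    solutions = []
    # combinations() yields (start, end) with start < end, so each pair
    # of parentheses always wraps at least two operands.
    for start, end in itertools.combinations(range(len(NUMBERS)), 2):
        expression = render(start, end)
        # eval is acceptable here only because the string is built locally
        # from fixed digits, operators, and parentheses.
        if eval(expression) == target:
            solutions.append(expression)
    return solutions


baseline = eval(render())      # value with no parentheses added
solutions = find_solutions()

print("Without parentheses:", baseline)
for expression in solutions:
    print(expression, "=", TARGET)
```

One placement this search reports is 1 + 2 * (3 + 4 * 5 + 6) * 7 + 8 * 9: the parenthesized part is 3 + 20 + 6 = 29, so the left side becomes 1 + 2 * 29 * 7 + 72 = 1 + 406 + 72 = 479. Exhaustive enumeration is cheap here because nine operands give only 36 candidate placements.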