- formatting
- images
- links
- math
- code
- blockquotes
- external-services
-
[3min-Paper] The_Illusion_of_Thinking
Can AI really reason? Or is the problem with how we test it?
-
The Illusion of the Illusion of Thinking: When AI Evaluation Methods Become Traps for Capability Assessment
This commentary paper argues that we often mistake the limitations of AI evaluation methods for limitations of the AI systems themselves. It finds that many cases counted as AI reasoning failures are actually misjudgments caused by poorly designed evaluation frameworks.
-
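[Chinese version] The Illusion of the Illusion of Thinking: When AI Evaluation Methods Become Traps for Capability Assessment
Chinese-language edition of the commentary: we often mistake the limitations of AI evaluation methods for limitations of the AI systems themselves, and many cases counted as AI reasoning failures are actually misjudgments caused by poorly designed evaluation frameworks.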
-
Persona Features Control Emergent Misalignment
An OpenAI research team explores how language models generalize behaviors from the training distribution to broader deployment distributions, focusing on emergent misalignment. The study finds that controlling persona features can effectively manage a model's misaligned behavior, offering important insights for AI safety.
-
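[Chinese version] Persona Features Control Emergent Misalignment
Chinese-language edition: an OpenAI research team examines how language-model behavior changes when generalizing from the training distribution to broader deployment distributions, focusing on emergent misalignment. The study finds that controlling persona features can effectively manage a model's misaligned behavior, offering important insights for AI safety.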