Researchers from MIT, Yale, McGill University and others found that adapting the Sequential Monte Carlo algorithm can...
AI research
Training LLMs on trajectories of reasoning and tool use makes them superior at multi-step reasoning tasks.
Not all AI scaling strategies are equal. Longer reasoning chains are not a sign of higher intelligence. More...
DeepCoder-14B competes with frontier models like o3 and o1—and the weights, code, and optimization platform are open...
Reward models holding back AI? DeepSeek’s SPCT creates self-guiding critiques, promising more scalable intelligence for enterprise LLMs.
CoTools uses hidden states and in-context learning to enable LLMs to use more than 1,000 tools very...
Anthropic has developed a new method for peering inside large language models like Claude, revealing for the...
METASCALE uses a three-stage approach to dynamically choose the right reasoning technique for each problem.
With multiple sampling and self-verification, Gemini 1.5 Pro can outperform o1-preview in reasoning tasks.
SEARCH-R1 trains LLMs to gradually think and conduct online search as they generate answers for reasoning problems.