— Rex (@12exyz) February 18, 2025

‘Think’ Button For AI Reasoning and Deep Search

A standout feature in Grok 3 is its “Think” button, which lets users request a more detailed, analytical response by giving the AI additional processing time. The goal is to improve reasoning accuracy and enhance the model’s ability to tackle complex tasks. The button enables chain-of-thought reasoning, which, like OpenAI’s o1 and o3 models and DeepSeek R1, aims to give users results based on multi-step reasoning.

Grok 3 also introduces its own AI-driven research feature, Deep Search, similar to OpenAI’s Deep Research and Google Gemini’s Deep Research. The tool allows Grok 3 to pull and synthesize real-time information, making it a competitor to both deep research products and Perplexity AI, which also just launched its own deep research implementation.

Andrej Karpathy, a former Tesla AI director who got early access to Grok 3, found that with ‘Think’ mode enabled, the model successfully estimated the training FLOPs required for OpenAI’s GPT-2, a task that even OpenAI’s most powerful thinking model, o1-pro, failed. Karpathy noted, “Grok 3 with Thinking solves it great, while o1 pro (GPT thinking model) fails.”

For real-time research, Deep Search gives Grok 3 an edge over many models, but its accuracy issues put it behind OpenAI’s Deep Research and Perplexity AI. Karpathy says Grok 3 generates “hallucinated URLs” and avoids citing X unless explicitly asked to, which limits its effectiveness as a research tool.

In terms of reasoning, Grok 3’s new Think mode allows it to match OpenAI’s o1-pro in some logic-heavy tasks. However, it still struggles with spatial reasoning, as demonstrated by its failed tic-tac-toe board generation test. This places it behind GPT-4o, which has been noted for its advanced logic capabilities.

Creativity remains another weak point.
Claude has been widely praised for its natural and engaging writing style, while Grok 3 still produces responses that feel formulaic.

In another test, Grok 3 was able to correctly generate a Settlers of Catan board setup, a challenge that many AI models struggle with. However, when asked to generate tricky tic-tac-toe boards, the model failed, producing nonsensical layouts. Karpathy observed, “It solved a few tic tac toe boards I gave it with a pretty nice/clean chain of thought… but failed on generating tricky ones.”

I was given early access to Grok 3 earlier today, making me I think one of the first few who could run a quick vibe check. Thinking ✅ First, Grok 3 clearly has an around state of the art thinking model ("Think" button) and did great out of the box on my Settler's of Catan… pic.twitter.com/qIrUAN1IfD
