Google Secures Top Spot for Accuracy in AI Search Tool Comparison

Google's AI mode outperforms rivals in a comprehensive evaluation by The Washington Post; the study highlights strengths and weaknesses across industry leaders including OpenAI, Perplexity, and Microsoft Bing Copilot

2025-08-29     MHN

A recent comparative assessment has revealed that Google's AI mode ranks as the most accurate among leading AI search tools. The Washington Post (WP), in collaboration with librarians from American public and university libraries, conducted a rigorous performance test on nine AI search tools, publishing the results on August 27 (local time).

The evaluation included Google’s AI mode and AI Overview, OpenAI's ChatGPT (GPT-5 and GPT-4 Turbo), Anthropic's Claude, Meta AI, xAI's Grok, Perplexity, and Microsoft Bing Copilot, all using only their free versions as of July and August. The Washington Post tested these platforms across five categories—quizzes, expert resource searches, recent events, inherent bias, and image recognition—posing 30 challenging questions and scoring a total of 900 responses.

According to the results, Google's AI mode achieved the highest overall score, receiving 60.2 out of 100 points. ChatGPT, running on the GPT-5 model, followed in second place with 55.1 points, while Perplexity took third at 51.3 points. Grok 3, from Elon Musk's xAI, trailed in eighth place with 40.1 points, and Meta AI ranked last with 33.7 points. Grok 4, the latest version, was excluded from the test because it had no free version at the time.

The detailed analysis revealed notable differences in strengths across platforms. Google’s AI mode excelled in quizzes and recent events, Bing Copilot stood out in authoritative resource searches, and Perplexity led in image recognition. Notably, GPT-4 Turbo demonstrated the most balanced responses with minimal bias.

Although GPT-5 generally showed improved performance, earning second place, it scored worse than GPT-4 Turbo in certain areas. The evaluation also underscored the persistent challenges facing AI tools: many still struggled to answer routine questions, particularly those requiring up-to-date information or reliable source assessments, and sometimes confidently delivered incorrect responses.

The Washington Post emphasized that without verifying sources, checking for recent relevance, and applying critical thinking, users risk receiving "noise instead of accurate knowledge." The report advises users to critically evaluate AI-generated answers by cross-checking links and sources, confirming timeframes, and practicing search literacy, rather than accepting AI outputs at face value.

While AI has not yet replaced traditional search entirely, the report suggests users can find "better answers" by understanding each tool’s unique strengths and using them in combination for specific needs.

Note: “This article was translated from the original Korean version using AI assistance, and subsequently edited by a native-speaking journalist.”

Photo: AP News, Reuters