OpenAI Claims GPT-5 Matches Human Performance Across Various Professions

OV News Desk | September 26, 2025 | Last Updated: September 26, 2025



OpenAI has unveiled a new benchmark, GDPval, designed to evaluate the performance of its AI models against human professionals across various industries. Released on Thursday, this benchmark aims to assess how close OpenAI’s systems are to achieving their goal of artificial general intelligence (AGI). Early results indicate that the GPT-5 model and Anthropic’s Claude Opus 4.1 are nearing the quality of work produced by industry experts, although OpenAI acknowledges that the benchmark currently covers a limited range of tasks.

Understanding GDPval and Its Scope

The GDPval benchmark focuses on nine key industries that significantly contribute to the U.S. gross domestic product, including healthcare, finance, manufacturing, and government. It evaluates AI performance across 44 different occupations, such as software engineers, nurses, and journalists. For the initial version, GDPval-v0, OpenAI enlisted experienced professionals to compare AI-generated reports with those created by their peers. Participants were asked to choose the better of the two reports. For instance, investment bankers were asked to analyze a competitor landscape in the last-mile delivery sector and compare their findings with AI-generated reports. The results were then averaged to determine an AI model’s “win rate” against human-generated reports across all evaluated occupations.
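As a rough illustration of the averaging described above, the following Python sketch computes a win rate from grader verdicts. It is not OpenAI's scoring code. The function name, the verdict labels, the sample data, and the choice to average per occupation before averaging overall are all assumptions made for this example. Counting ties in the model's favor is likewise an assumption, based on the "better than or on par with" wording of the reported results.

```python
from collections import defaultdict

# Hypothetical verdict labels for this sketch; GDPval's real labels are not given in the article.
VALID_OUTCOMES = {"model", "human", "tie"}


def win_rate(verdicts, count_ties=True):
    """Average the model's share of favourable verdicts across occupations.

    verdicts: iterable of (occupation, outcome) pairs.
    count_ties: if True, a tie counts in the model's favour (an assumption).
    """
    per_occupation = defaultdict(list)
    for occupation, outcome in verdicts:
        if outcome not in VALID_OUTCOMES:
            raise ValueError(f"unknown outcome: {outcome!r}")
        favourable = outcome == "model" or (count_ties and outcome == "tie")
        per_occupation[occupation].append(1.0 if favourable else 0.0)

    if not per_occupation:
        raise ValueError("no verdicts supplied")

    # First average within each occupation, then across occupations,
    # so that heavily sampled occupations do not dominate the result.
    rates = [sum(scores) / len(scores) for scores in per_occupation.values()]
    return sum(rates) / len(rates)


# Made-up sample data, for illustration only.
sample = [
    ("investment banker", "model"),
    ("investment banker", "human"),
    ("nurse", "human"),
    ("nurse", "tie"),
    ("nurse", "human"),
    ("nurse", "human"),
]

print(f"win rate (ties count): {win_rate(sample):.1%}")
print(f"win rate (wins only):  {win_rate(sample, count_ties=False):.1%}")
```

Whether ties count for the model changes the headline figure noticeably, which is why the flag is exposed rather than hard-coded.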

Performance Metrics of AI Models

In the first round of testing, OpenAI’s enhanced GPT-5-high model achieved a win rate of 40.6%, meaning it was rated as better than or on par with industry experts in roughly two out of every five tasks assessed. Anthropic’s Claude Opus 4.1 model performed better still, with a win rate of 49%. OpenAI attributes Claude’s high score in part to its tendency to produce visually appealing graphics, rather than solely to its performance in generating text-based reports. Despite these promising results, OpenAI cautions that the GDPval benchmark only evaluates a narrow aspect of professional work, primarily focused on report generation.

Future Directions and Industry Implications

OpenAI recognizes that the current GDPval test does not encompass the full range of tasks performed by professionals. The company plans to develop more comprehensive assessments that will account for a wider variety of industries and interactive workflows. This evolution is crucial as the company aims to demonstrate the practical applications of its AI models in real-world scenarios. In an interview, OpenAI’s chief economist, Dr. Aaron Chatterji, expressed optimism about the benchmark’s implications, suggesting that as AI models improve, professionals can leverage these tools to focus on more meaningful and higher-value tasks.

Tejal Patwardhan, who leads OpenAI’s evaluations, highlighted the significant progress made since the release of the GPT-4o model, which scored only 13.7% in similar evaluations about 15 months ago. The substantial improvement in GPT-5’s performance reflects a trend that Patwardhan expects to continue as AI capabilities advance.
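To put the two reported scores side by side, the short Python sketch below computes the gap between GPT-4o's 13.7% and GPT-5-high's 40.6%. It assumes the two figures were measured on a comparable scale, which the article describes only as "similar evaluations". The function name is hypothetical.

```python
def compare_win_rates(old_pct, new_pct):
    """Return (gain in percentage points, ratio of new to old), rounded."""
    if old_pct <= 0:
        raise ValueError("old_pct must be positive to compute a ratio")
    gain_points = new_pct - old_pct
    ratio = new_pct / old_pct
    return round(gain_points, 1), round(ratio, 2)


# Figures quoted in the article: GPT-4o at 13.7%, GPT-5-high at 40.6%.
points, ratio = compare_win_rates(13.7, 40.6)
print(f"gain: {points} percentage points, ratio: {ratio}x")
```

On these figures the win rate rose by 26.9 percentage points, a little under a threefold increase.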

The Importance of Robust Benchmarks

As the field of artificial intelligence evolves, benchmarks like GDPval are becoming increasingly vital for assessing AI models’ capabilities. Silicon Valley employs various benchmarks to measure AI progress, including AIME 2025 and GPQA Diamond, which focus on competitive math problems and PhD-level science questions, respectively. However, many AI models are nearing their limits on these existing benchmarks, prompting researchers to call for more effective tests that can evaluate AI proficiency in real-world applications.

OpenAI’s GDPval could play a significant role in this ongoing conversation, as the company seeks to establish its AI models as valuable tools across diverse industries. Nevertheless, to convincingly demonstrate that its AI systems can outperform human professionals, OpenAI will need to expand the scope and depth of the GDPval benchmark in future iterations.
