GPT-5 Benchmarking: Raising the Bar for Finance AI

by Jeremy Isnard and Pascal Hauri
Aug 29, 2025

At Unique, precision and trust aren’t negotiable. That’s why we ran extensive benchmarks on the new GPT-5 family, comparing GPT-5, GPT-5-mini, and GPT-5-nano against our previous baseline, GPT-4o.

Our goal: find the model that best serves finance professionals in high-stakes contexts like investment research, due diligence, or KYC reviews, and that reduces the friction of their day-to-day tasks.

The results were clear. While the smaller variants had interesting traits, only the main GPT-5 consistently met the standard these contexts require, and it now powers our AI platform for finance professionals.

Here's what our benchmarks revealed about all three variants, and why nano and mini didn't make the cut.

 

The Three Flavors: A Tale of Different Personalities

 

GPT-5 delivers exactly what finance teams need: predictable, executive-ready outputs. It consistently structures responses into 3-5 clear pillars with high-level insights, providing slide-deck-ready analysis that won't overwhelm busy decision-makers.

GPT-5-mini weighs on the reader with a compliance-style tone, adding exhaustive lists packed with disclaimers and cross-references. It is great for audit trails, but it does not allow for quick and accurate insights.

GPT-5-nano takes a mechanisms-to-implications approach, connecting operational details to strategic outcomes. While thorough, it is often wordy and redundant, and it asks too many clarifying questions before delivering answers.

For instance, when asked "What's Tesla's Q1 2024 net income from our knowledge base?", the models responded as follows:

  • GPT-5 gave a quick, accurate answer with proper sourcing.

  • GPT-5-mini provided a correct answer buried in extensive context about reporting methodologies.

  • GPT-5-nano asked clarifying questions about fiscal vs. calendar year before answering.

 

Eval Results: Numbers Don't Lie

 

Our testing across 180 queries on finance industry topics revealed clear winners:

 

| Model | GPT-5 | GPT-5-mini | GPT-5-nano | GPT-4o |
| --- | --- | --- | --- | --- |
| Accuracy | 99.3% | 99.3% | 92.2% | 98.6% |
| Avg Response Time | 9.2s | 9.4s | 10.0s | 8.5s |
| Source Reliability | 98.6% | 80.9% | 94.3% | 99.3% |

 
 

While GPT-5 and GPT-5-mini matched on accuracy (99.3%), nano struggled with errors (92.2%) and mini failed to consistently source references (80.9% source reliability).

The GPT-5 base model, by contrast, maintained consistent referencing throughout our testing, properly citing internal documents and external sources, which is critical for regulatory compliance and audit trails. With minimal thinking mode enabled, its average response time stayed within range for executive workflows.
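To make the three metrics in the table above concrete, here is a minimal sketch of how such figures could be aggregated from graded eval records. It is not our actual evaluation harness: the `EvalRecord` type, its field names, and the `summarize` function are hypothetical, and it assumes each query has already been graded for correctness and sourcing.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class EvalRecord:
    """One graded query. All field names here are hypothetical."""
    model: str
    correct: bool      # the answer was judged correct
    sourced: bool      # the answer cited its sources properly
    latency_s: float   # wall-clock response time in seconds


def summarize(records: list[EvalRecord]) -> dict[str, dict[str, float]]:
    """Aggregate accuracy, source reliability (both in %) and mean latency per model."""
    by_model: dict[str, list[EvalRecord]] = {}
    for record in records:
        by_model.setdefault(record.model, []).append(record)

    summary: dict[str, dict[str, float]] = {}
    for model, rows in by_model.items():
        n = len(rows)
        summary[model] = {
            "accuracy_pct": 100.0 * sum(r.correct for r in rows) / n,
            "source_reliability_pct": 100.0 * sum(r.sourced for r in rows) / n,
            "avg_response_time_s": mean(r.latency_s for r in rows),
        }
    return summary


if __name__ == "__main__":
    # Toy records for illustration only, not real benchmark data.
    toy = [
        EvalRecord("model-a", correct=True, sourced=True, latency_s=1.0),
        EvalRecord("model-a", correct=False, sourced=True, latency_s=3.0),
    ]
    print(summarize(toy))
```

Keeping correctness and sourcing as separate flags is what lets a model score well on one and poorly on the other, as GPT-5-mini does in the table.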

 

Why Not Mini or Nano?

 

We evaluated mini and nano primarily to test cost savings and speed improvements. But despite being smaller, they did not deliver faster results: their average response times were 9.4s and 10.0s, against 9.2s for GPT-5.

That said, the smaller models still have niche uses:

  • GPT-5-mini is particularly suited for compliance review or exhaustive audit records.

  • GPT-5-nano can help with structured classification tasks, such as rating or re-sorting file fragments for inclusion in LLM context.

But for most of our customer use cases, the base model offers the best balance between accuracy, speed, and clarity.
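As an illustration of the fragment re-sorting use case mentioned for GPT-5-nano, the sketch below orders fragments by a relevance rating and keeps the ones that fit a context budget. It is a hedged example, not our production pipeline: the `Fragment` type, the `select_fragments` function, and the token budget are hypothetical, and the ratings are assumed to have been produced beforehand (for example by a small model).

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """A file fragment with a precomputed rating. Field names are hypothetical."""
    text: str
    rating: float  # relevance rating, assumed to come from a small model
    tokens: int    # token count, assumed to be precomputed


def select_fragments(fragments: list[Fragment], token_budget: int) -> list[Fragment]:
    """Sort fragments by rating (highest first) and keep those that fit the budget."""
    selected: list[Fragment] = []
    used = 0
    for frag in sorted(fragments, key=lambda f: f.rating, reverse=True):
        # Skip fragments that would overflow the budget; lower-rated but
        # smaller fragments may still fit afterwards.
        if used + frag.tokens <= token_budget:
            selected.append(frag)
            used += frag.tokens
    return selected


if __name__ == "__main__":
    # Toy fragments for illustration only.
    toy = [
        Fragment("a", rating=0.2, tokens=50),
        Fragment("b", rating=0.9, tokens=60),
        Fragment("c", rating=0.5, tokens=50),
    ]
    print([f.text for f in select_fragments(toy, token_budget=110)])
```

The greedy pass is a deliberate simplification: it favors the highest-rated fragments rather than searching for the best possible packing of the budget.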

 

Beyond GPT Models: The Broader Landscape

 

GPT-5 significantly outperforms our previous GPT-4o implementation with 50% fewer failures to provide an answer and more consistent formatting.
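A quick worked check of the "50% fewer failures" figure, under one assumption this post does not state explicitly: that the failure rate is the complement of the accuracy reported in the results table (98.6% for GPT-4o, 99.3% for GPT-5). The function name is hypothetical.

```python
def relative_failure_reduction(baseline_accuracy_pct: float, new_accuracy_pct: float) -> float:
    """Return the relative drop in failure rate, in percent.

    Assumes failure rate = 100 - accuracy, which is an interpretation
    of the reported figures rather than a stated definition.
    """
    baseline_failures = 100.0 - baseline_accuracy_pct
    new_failures = 100.0 - new_accuracy_pct
    if baseline_failures <= 0:
        raise ValueError("baseline has no failures to reduce")
    return 100.0 * (baseline_failures - new_failures) / baseline_failures


if __name__ == "__main__":
    # GPT-4o accuracy 98.6% -> 1.4% failures; GPT-5 accuracy 99.3% -> 0.7% failures.
    reduction = relative_failure_reduction(98.6, 99.3)
    print(f"{reduction:.0f}% fewer failures")  # prints "50% fewer failures"
```

Under that reading, a gap of only 0.7 percentage points in accuracy still halves the number of queries that go unanswered.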

Compared to other leading models:

  • vs. o3: GPT-5 offers better structure and clarity. o3 provides deeper technical compliance details but sometimes fails to provide context.

  • vs. Gemini: GPT-5 delivers more consistent tone and formatting, with clearer, straight-to-the-point answers.

  • vs. GPT-4.1: GPT-5 avoids the information overload that GPT-4.1 shares with GPT-5-mini, making it less cognitively expensive to parse.

 

The Verdict

 

For finance professionals who need reliable, executive-ready analysis without the noise, the GPT-5 base model hits the sweet spot. It is fast and accurate, and it delivers insights in a format that is quick to digest and ready to reuse in most cases.

Mini and nano might find homes in specialized compliance or educational settings, but for mainstream financial analysis, the GPT-5 base model is our clear choice.

 

Ready to experience GPT-5 in action? Activate GPT-5 in your existing spaces!