Model Benchmarks
Business-centric LLM benchmarks of the top models. The ORCFLO Index consists of 40+ real-world business tasks: writing, analysis, summarization, and extraction. All are judged against a rigorous rubric and validated by a 4-judge frontier model panel to ensure objective, thorough results.
Published Benchmarks
11 model evaluations published to date. Newest first.
Anthropic · April 20, 2026
Claude Opus 4.7
ORCFLO Index Benchmark
Anthropic · April 13, 2026
Claude Opus 4.6
ORCFLO Index Benchmark
Anthropic · April 6, 2026
Claude Sonnet 4.6
ORCFLO Index Benchmark
OpenAI · March 30, 2026
GPT 5
ORCFLO Index Benchmark
OpenAI · March 23, 2026
GPT 5.4
ORCFLO Index Benchmark
Google · March 16, 2026
Gemini 2.5 Pro
ORCFLO Index Benchmark
Google · March 9, 2026
Gemini 3 Pro (Preview)
ORCFLO Index Benchmark
OpenAI · March 2, 2026
GPT 5.1
ORCFLO Index Benchmark
OpenAI · February 23, 2026
GPT 5.2 Pro
ORCFLO Index Benchmark
Anthropic · February 16, 2026
Claude Opus 4.5
ORCFLO Index Benchmark
Google · February 9, 2026
Gemini 3 Flash (Preview)
ORCFLO Index Benchmark