Benchmark Report
Benchmarks · MTEB · Embeddings

MTEB Benchmark: Why SHARC-Embed-Code-001 Outperforms Commercial Models

A deep dive into the MTEB leaderboard showing how SHARC-Embed-Code-001 achieves its #2 overall ranking.

December 12, 2024 · 8 min read

[Figure: MTEB benchmark visualization]
Key Insight: SHARC-Embed-Code-001 leads the next best commercial model by +9.45 points on MTEB overall.

MTEB Overall Scores

Comparison across embedding providers. Higher is better.

| Model  | MTEB Overall |
|--------|--------------|
| SHARC  | 70.58        |
| Cohere | 61.13        |
| OpenAI | 58.96        |
| Voyage | 58.46        |

+9.45 points vs. best commercial competitor

Models Compared

Dimension, price, and rank breakdown for the compared models.

| Model                          | MTEB Rank | Score | Dimensions | Price     |
|--------------------------------|-----------|-------|------------|-----------|
| SHARC-Embed-Code-001           | #2        | 70.58 | 4096       | $0.05/M\* |
| Cohere-embed-multilingual-v3.0 | #15       | 61.13 | 1024       | $0.10/M   |
| text-embedding-3-large (OpenAI)| #18       | 58.96 | 3072       | $0.13/M   |
| voyage-3.5 (Voyage)            | #24       | 58.46 | 1024       | $0.06/M   |

\* Pricing for SHARC is not finalized.

Performance by Task Category

SHARC leads all listed MTEB task categories.

| Category           | SHARC | Cohere | OpenAI | Voyage |
|--------------------|-------|--------|--------|--------|
| Retrieval          | 88.96 | 80.35  | 76.62  | 79.26  |
| Classification     | 88.85 | 74.44  | 73.43  | 70.42  |
| STS                | 90.54 | 84.27  | 82.70  | 77.92  |
| BitextMining       | 86.34 | 83.68  | 69.95  | 68.93  |
| PairClassification | 85.84 | 80.10  | 80.07  | 76.30  |
| Reranking          | 65.63 | 64.07  | 63.89  | 64.16  |
| Clustering         | 56.21 | 49.06  | 48.50  | 45.30  |
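The per-category lead can be quantified as SHARC's margin over the best competing model in each row. A minimal sketch computing those margins from the scores above:

```python
# Margin of SHARC over the best competitor per MTEB task category,
# using the scores from the table above.
scores = {
    "Retrieval":          {"SHARC": 88.96, "Cohere": 80.35, "OpenAI": 76.62, "Voyage": 79.26},
    "Classification":     {"SHARC": 88.85, "Cohere": 74.44, "OpenAI": 73.43, "Voyage": 70.42},
    "STS":                {"SHARC": 90.54, "Cohere": 84.27, "OpenAI": 82.70, "Voyage": 77.92},
    "BitextMining":       {"SHARC": 86.34, "Cohere": 83.68, "OpenAI": 69.95, "Voyage": 68.93},
    "PairClassification": {"SHARC": 85.84, "Cohere": 80.10, "OpenAI": 80.07, "Voyage": 76.30},
    "Reranking":          {"SHARC": 65.63, "Cohere": 64.07, "OpenAI": 63.89, "Voyage": 64.16},
    "Clustering":         {"SHARC": 56.21, "Cohere": 49.06, "OpenAI": 48.50, "Voyage": 45.30},
}

def margin(row: dict) -> float:
    """SHARC's score minus the best non-SHARC score in the row."""
    best_rival = max(v for k, v in row.items() if k != "SHARC")
    return round(row["SHARC"] - best_rival, 2)

margins = {category: margin(row) for category, row in scores.items()}
```

Every margin is positive, from +1.47 on Reranking (the tightest category) to +14.41 on Classification.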

Code Search Relevant Benchmarks

Benchmarks most relevant to semantic retrieval quality in code workflows.

| Benchmark                  | SHARC | Cohere | OpenAI | Voyage |
|----------------------------|-------|--------|--------|--------|
| StackOverflowQA            | 94.75 | 89.42  | 92.44  | 94.66  |
| StackExchangeClustering.v2 | 79.41 | 59.10  | 62.51  | 58.48  |
| SCIDOCS                    | 32.74 | 19.34  | 23.07  | 21.34  |
| STSBenchmark               | 93.60 | 88.79  | 83.56  | 82.28  |

Cost-Performance Analysis

Score-per-dollar view from MTEB average score and list pricing.

| Model  | MTEB Score | Cost / M tokens | Score / $ |
|--------|------------|-----------------|-----------|
| SHARC  | 70.58      | $0.05           | 1412      |
| Cohere | 61.13      | $0.10           | 611       |
| OpenAI | 58.96      | $0.13           | 454       |
| Voyage | 58.46      | $0.06           | 974       |
Value Snapshot: SHARC currently offers the strongest score-per-dollar profile in this comparison set.
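The Score / $ column is simply the MTEB overall score divided by the list price per million tokens. A small sketch reproducing it from the numbers above:

```python
# Reproduces the Score / $ column: MTEB overall score divided by
# list price per million tokens, rounded to the nearest integer.
pricing = {
    "SHARC":  (70.58, 0.05),
    "Cohere": (61.13, 0.10),
    "OpenAI": (58.96, 0.13),
    "Voyage": (58.46, 0.06),
}

def score_per_dollar(score: float, price_per_m_tokens: float) -> int:
    return round(score / price_per_m_tokens)

value = {name: score_per_dollar(s, p) for name, (s, p) in pricing.items()}
```

Note this metric is sensitive to pricing changes; since SHARC's pricing is not yet finalized, its score-per-dollar figure should be treated as provisional.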

Conclusion

Why SHARC-Embed-Code-001 is positioned for production code retrieval.

  • #2 on MTEB overall with strong retrieval and code-relevant benchmark performance.
  • 4096-dimensional vectors preserve semantic precision for complex codebases.
  • Consistent category leadership across retrieval, STS, classification, and reranking subsets.
  • Strong cost-efficiency profile based on available benchmark score and pricing data.
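In practice, the 4096-dimensional vectors highlighted above are consumed like any other embedding: retrieval ranks candidate code snippets by cosine similarity against the query vector. A minimal sketch (the vectors here are synthetic placeholders, not real SHARC output):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Synthetic stand-ins for 4096-dimensional embeddings.
dim = 4096
query = [math.sin(i) for i in range(dim)]
match = [math.sin(i) + 0.01 for i in range(dim)]  # near-duplicate of the query
unrelated = [math.cos(i) for i in range(dim)]     # largely orthogonal signal
```

Ranking candidates by `cosine_similarity(query, candidate)` puts `match` well above `unrelated`, which is the behavior the retrieval benchmarks measure.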

Data Source

Metrics are sourced from the official MTEB leaderboard and SHARC internal benchmark snapshots. Refer to the official leaderboard for the latest standings.
