Blog · Data Analysis
Some Tech Publications Consistently Score Products 5–7 Points Higher Than Others
After normalizing scores from 16,000+ professional reviews, we found that certain publications consistently rate products more generously than others. The bias is systematic, persistent, and visible once you control for which products are being reviewed.
March 3, 2026
Criticaster aggregates professional reviews from hundreds of publications and normalizes every score to a common 0–100 scale. Some outlets rate on 5-star scales, others use 10-point scales, letter grades, or percentages. We normalize all of them so they're directly comparable.
After running this normalization across 16,062 scored reviews from 280+ sources, a pattern emerged: certain publications consistently land above or below the global weighted average of 79.3/100. This isn't about individual product opinions—it's a systematic tendency that holds across hundreds of products and dozens of categories.
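To make the conversion concrete, here is a minimal sketch of linear rescaling to 0–100. The scale labels and the letter-grade table are illustrative assumptions, not Criticaster's actual mapping.

```python
# Illustrative sketch of score normalization. The scale names and the
# letter-grade table below are assumptions, not Criticaster's real tables.

# Assumed letter-grade mapping; actual mappings may differ.
LETTER_GRADES = {
    "A+": 100, "A": 95, "A-": 90,
    "B+": 87,  "B": 83, "B-": 80,
    "C+": 77,  "C": 73, "C-": 70,
    "D": 60,   "F": 40,
}

def normalize_score(value, scale):
    """Convert a native score to a common 0-100 scale."""
    if scale == "stars_5":        # e.g. 4 out of 5 stars -> 80
        return value / 5 * 100
    if scale == "points_10":      # e.g. 8.5 out of 10 -> 85
        return value / 10 * 100
    if scale == "percent":        # already on a 0-100 scale
        return float(value)
    if scale == "letter":         # e.g. "B+" -> 87 under the assumed table
        return float(LETTER_GRADES[value])
    raise ValueError(f"unknown scale: {scale}")

print(normalize_score(4, "stars_5"))      # 80.0
print(normalize_score(8.5, "points_10"))  # 85.0
print(normalize_score("B+", "letter"))    # 87.0
```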
The generous and the critical
Among major publications with 50+ reviews in our database, the spread between the most generous and most critical outlets is over 13 points: Engadget averages 86.1, Consumer Reports 73.0. These are professional reviewers covering the same market with fundamentally different calibration.
Average normalized score by publication
Major publications with 50+ scored reviews. Dashed line = global weighted average (79.3).
The pattern isn't random. Outlets like Stuff, Engadget, and Pocket-lint consistently score 4–7 points above average. Consumer Reports, Tom's Hardware, and Wired consistently score 2–6 points below. These biases persist across product categories and hold up over hundreds of reviews.
Controlling for product selection
A reasonable objection: maybe generous outlets just review better products. To control for this, we looked at paired comparisons: cases where two publications reviewed the exact same products. This isolates scoring tendency from product selection.
The bias holds. When Stuff and Wired review the same 32 products, Stuff averages 89.2 and Wired averages 81.0—an 8.2-point gap. When Consumer Reports and TechRadar cover the same 81 products, Consumer Reports averages 6.6 points lower.
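Here is a minimal sketch of that pairing step, assuming the reviews are available as (outlet, product, normalized score) rows; the function and field names are illustrative, not our production code.

```python
from collections import defaultdict
from statistics import mean

def paired_comparison(reviews, outlet_a, outlet_b, min_shared=20):
    """Compare two outlets on the products they both reviewed.

    `reviews` is an iterable of (outlet, product, normalized_score) tuples.
    Returns (avg_a, avg_b, n_shared), or None if too few shared products.
    """
    scores = defaultdict(dict)          # product -> {outlet: score}
    for outlet, product, score in reviews:
        scores[product][outlet] = score

    shared = [p for p, by_outlet in scores.items()
              if outlet_a in by_outlet and outlet_b in by_outlet]
    if len(shared) < min_shared:
        return None

    avg_a = mean(scores[p][outlet_a] for p in shared)
    avg_b = mean(scores[p][outlet_b] for p in shared)
    return avg_a, avg_b, len(shared)

# Example with made-up rows:
# rows = [("Stuff", "Widget X", 90), ("Wired", "Widget X", 82), ...]
# paired_comparison(rows, "Stuff", "Wired", min_shared=1)
```

Something like this, run over every outlet pair with 20+ shared products, is the idea behind the head-to-head comparison below.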
Head-to-head: same products, different scores
Average score when both outlets reviewed the same products. Minimum 20 shared products.
Why the gap exists
This is not about quality of journalism. There are several structural reasons why scoring tendencies differ:
- Scale usage: Some outlets use the full range of their scale. Consumer Reports, Which, and TechGearLab regularly give scores in the 50s and 60s. Others effectively never go below 70, compressing their entire range into the top 30% of the scale.
- Selection bias: Outlets that review everything in a category—including mediocre products—will have lower averages than those that cherry-pick only products likely to score well. Some publications only review products they can recommend; others review products to explicitly warn you away from them.
- Review culture: Enthusiast outlets tend to score higher. A gaming publication reviewing a gaming headset is more likely to find things to like than a general-interest outlet applying broader standards.
- Normalization artifacts: When we convert a 4-out-of-5 star rating to a 0–100 scale, it maps to 80. But many 4-star reviews are expressing "good, not great"—which on a 10-point scale might have been a 7.5 (75). Different native scales introduce slight calibration differences even after normalization.
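The arithmetic behind that last point, spelled out as a two-line illustration:

```python
# The same "good, not great" verdict, expressed on two different native
# scales, lands 5 points apart after linear rescaling to 0-100.
four_stars_of_five = 4 / 5 * 100          # 80.0
seven_point_five_of_ten = 7.5 / 10 * 100  # 75.0
print(four_stars_of_five - seven_point_five_of_ten)  # 5.0
```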
What this means for aggregated scores
Scoring bias is precisely the problem that aggregation is designed to solve. When one outlet runs 7 points high and another runs 5 points low, a score built from 8–15 reviews from different sources will land closer to the "true" consensus than any individual review.
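A toy simulation of that cancelling effect: the outlet names, bias offsets, and noise level below are made up for illustration and are not fitted to the dataset.

```python
import random
from statistics import mean

random.seed(0)

# Hypothetical per-outlet calibration offsets, in points on the 0-100 scale.
OUTLET_BIAS = {
    "generous_outlet": +7, "enthusiast_mag": +4, "mid_site_1": +1,
    "mid_site_2": 0, "mid_site_3": -1, "general_press": -2,
    "strict_lab": -5, "consumer_tester": -6,
}

def simulate_aggregate(true_quality, n_reviews=12):
    """Average scores drawn from randomly chosen biased outlets."""
    k = min(n_reviews, len(OUTLET_BIAS))
    outlets = random.sample(list(OUTLET_BIAS), k=k)
    scores = [true_quality + OUTLET_BIAS[o] + random.gauss(0, 2) for o in outlets]
    return mean(scores)

# A single generous review overstates the product; the aggregate sits much
# closer to the underlying quality because opposing biases partially cancel.
print(simulate_aggregate(true_quality=80, n_reviews=8))
```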
The data also suggests that reading a single review and taking its score at face value is less informative than it appears. An 85 from Stuff and an 85 from Consumer Reports represent very different levels of enthusiasm. The number alone doesn't tell you much without knowing the reviewer's baseline.
For Criticaster, this is a feature of the model rather than a bug. The whole point of aggregating is that individual biases—both generous and critical—cancel out when you combine enough independent signals. The wider the range of outlets included, the more the resulting score reflects the actual consensus rather than any single publication's calibration.
Data in this post reflects Criticaster's database as of March 3, 2026, covering 16,062 normalized reviews from 280+ sources. Scores are normalized to a 0–100 scale from each outlet's native format. Only publications with 50+ reviews are shown in the main chart; paired comparisons require a minimum of 20 shared products. Critic Scores are aggregated using our standard methodology.