🦉 WE READ 630 OWNER COMMENTS

GPT-5: what owners actually say

Owners see diminishing returns from scaling, with basic reasoning failures and massive power costs overshadowing genuine coding utility

LEMMY · 486 HACKERNEWS · 75 YOUTUBE · 44 STACKEXCHANGE · 14 REDDIT · 10 PRODUCTHUNT · 1

What owners complain about

  • Scaling wall SOME

    Multiple commenters assert that GPT-5 disproves the belief that LLM capability scales linearly with training data, and that the field has hit a wall.

  • Basic reasoning failures SOME

    Users report models failing trivial tasks like counting letters in 'blueberry,' then apologizing and retrying without fixing the error, undermining confidence in PhD-level competence claims.

  • Power consumption COMMON

    Commenters are alarmed that newer models use even more power than their predecessors. Some note that DeepSeek demonstrated that much of the cost stems from inefficient code rather than fundamental algorithmic requirements.

  • Missing cost transparency SOME

    Users who look for inference cost data on benchmark leaderboards note that it is absent, and infer that the undisclosed metrics are withheld deliberately because they are damning.

  • MoE underperformance SOME

    Mixture-of-experts (MoE) models tested poorly on survival benchmarks: Qwen 3.5 397B had only a 29% survival rate with negative ROI, and DeepSeek V3.2 had a 62% survival rate but still ended up in the red.
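
The letter-counting failure mentioned above has a deterministic ground truth, which is part of why owners find it so damaging. The comments do not say which letter the model was asked about, so this minimal sketch simply tallies every letter in the word. The helper name `letter_counts` is made up for illustration.

```python
from collections import Counter


def letter_counts(word: str) -> dict:
    """Return how many times each alphabetic character appears in `word`."""
    return dict(Counter(ch for ch in word.lower() if ch.isalpha()))


if __name__ == "__main__":
    counts = letter_counts("blueberry")
    # Print each letter with its count, in order of first appearance.
    for letter, n in counts.items():
        print(f"{letter}: {n}")
```

A check like this costs nothing to run, which is why owners can verify the answer instantly and why a confident wrong answer stands out.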

What owners love

  • Capable coding partner

    Users report successfully building complex systems with AI assistance, including an ad pacing system using PID controllers recommended by Claude Opus, with Claude Code verifying the spec.

  • Dense models punch above weight

    The dense 31B Gemma 4 outperformed every MoE model tested, including much larger ones, which testers found genuinely surprising.

  • Rapid prototyping

    Users describe having AI build functional initial versions in a single session that they then iteratively extend, enabling fast scaffolding of real systems.
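
The ad pacing system is the most concrete technical detail in these comments. As a rough illustration of what a PID controller does in that setting, here is a minimal textbook sketch. The class, the gains, the spend figures and the `simulate_pacing` helper are all hypothetical, since the comments do not describe the owner's actual design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PIDController:
    """Textbook discrete PID controller (illustrative, not the owner's code)."""

    kp: float  # proportional gain
    ki: float  # integral gain
    kd: float  # derivative gain
    integral: float = 0.0
    prev_error: Optional[float] = None

    def update(self, target: float, measured: float, dt: float) -> float:
        """Return a control signal from the gap between target and measurement."""
        error = target - measured
        self.integral += error * dt
        # No derivative term on the first call, since there is no previous error.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate_pacing(steps: int = 50) -> float:
    """Toy loop: nudge a bid multiplier until spend matches a target rate."""
    pid = PIDController(kp=0.5, ki=0.1, kd=0.0)  # assumed gains
    target_rate = 100.0  # hypothetical desired spend per interval
    base_rate = 80.0     # hypothetical spend per interval at multiplier 1.0
    multiplier = 1.0
    spend = base_rate
    for _ in range(steps):
        spend = base_rate * multiplier
        adjustment = pid.update(target_rate, spend, dt=1.0)
        # Apply a small fraction of the signal and never bid a negative amount.
        multiplier = max(0.0, multiplier + 0.01 * adjustment)
    return spend


if __name__ == "__main__":
    print(f"spend after pacing: {simulate_pacing():.2f}")
```

In this toy model the spend overshoots the target slightly and then settles near it. A real pacing system would also have to handle noisy delivery, budget caps and time-of-day effects, none of which are modelled here.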

Surprising patterns

  • Only 1 out of 23 tested models needed custom JSON output sanitization to work, suggesting that output format reliability is not uniform across models and that the outliers are hard to predict.
  • Commenters actively debate whether superintelligence even implies agency, with several arguing the AI-doomer position conflates intelligence with desire to exert power — a distinction that shapes how they evaluate model risk.
  • A psychiatrist listed what AI lacks: childhood dependency, experience overcoming suffering, eating and restroom breaks, and acceptance of loss — framing limitations in terms no benchmark would capture.
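
The comments do not say what the custom JSON sanitization for that one model involved. A common form of it, sketched below under that assumption, is stripping Markdown code fences or surrounding chatter before parsing. The function name `sanitize_json_output` is hypothetical.

```python
import json
import re

# Build the fence marker programmatically so this example can itself sit
# inside a fenced block without confusing the renderer.
_FENCE_MARK = "`" * 3
_FENCE = re.compile(
    _FENCE_MARK + r"(?:json)?\s*(.*?)\s*" + _FENCE_MARK,
    re.DOTALL | re.IGNORECASE,
)


def sanitize_json_output(raw: str):
    """Best-effort extraction of a JSON value from a model's text reply."""
    text = raw.strip()

    # Case 1: the JSON is wrapped in a Markdown code fence.
    match = _FENCE.search(text)
    if match:
        text = match.group(1)

    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass

    # Case 2: the JSON is embedded in prose. Slice from the first opening
    # bracket to the last closing bracket and try again.
    starts = [i for i in (text.find("{"), text.find("[")) if i != -1]
    end = max(text.rfind("}"), text.rfind("]"))
    if not starts or end <= min(starts):
        raise ValueError("no JSON object or array found in model output")
    return json.loads(text[min(starts):end + 1])


if __name__ == "__main__":
    reply = 'Sure! Here it is: {"ok": true} Hope that helps.'
    print(sanitize_json_output(reply))
```

This is deliberately lenient and will still raise an error on truly malformed JSON. Whether that leniency is acceptable depends on how much the calling code trusts the model.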

WHO SHOULD SKIP IT

Buyers who need reliable answers to seemingly simple questions without hallucination-and-apology cycles, or those sensitive to power and inference costs, should look elsewhere.

5.9/10 GYIBB verdict

Synthesised from 630 real owner comments across 6 platforms. Every point is grounded in the comments: no marketing, no AI guessing.