Google

Gemma 4 31B

TL;DR

While users value Gemma 4 for its local privacy and impressive vision capabilities, many express frustration regarding its heavy hardware requirements and occasional logical failures or repetitive hallucinations.

YouTube · Reddit · Hacker News
530 comments analyzed · Apr 30, 2026

Popularity metrics

Last 30 days · 25 items analyzed
  • Related views
    3.3M
  • Related endorsements
    114.0k
  • Related comments
    4.9k
  • Buzz
    Steady

Comment distribution

  • Overall
    41% positive / 59% negative
  • YouTube
    44% positive / 56% negative
  • Reddit
    48% positive / 52% negative

Comment summary

Privacy Benefits and Competitive Positioning

68 comments

Many users appreciate the model as a private, local alternative to frontier APIs, noting its superior tool-calling abilities compared to previous Gemma versions.

Hardware Requirements and Inference Performance

64 comments

Users debate VRAM constraints and tokens-per-second throughput, asking how the 31B model and its various quantizations perform on consumer GPUs.
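As a rough guide to the VRAM debate, weight memory for a dense model scales with parameter count times bits per weight. The sketch below uses approximate bits-per-weight figures for common GGUF quantization levels and a flat overhead allowance; both numbers are assumptions for illustration, not measurements from the discussion.

```python
# Rough VRAM estimate for a dense model at a given quantization.
# Assumptions (not from the source): weights dominate memory, and
# KV cache plus runtime overhead is approximated as a flat 1.5 GiB.

def vram_gib(params_b: float, bits_per_weight: float, overhead_gib: float = 1.5) -> float:
    """Approximate VRAM in GiB for `params_b` billion parameters."""
    weight_bytes = params_b * 1e9 * bits_per_weight / 8
    return weight_bytes / 2**30 + overhead_gib

# A 31B model at common quantization levels (bits/weight are approximate):
for name, bpw in [("Q4_K_M", 4.8), ("Q5_K_M", 5.7), ("Q8_0", 8.5)]:
    print(f"{name}: ~{vram_gib(31, bpw):.0f} GiB")
```

Under these assumptions a 31B model lands around 19 GiB at Q4_K_M, which is consistent with the comments about it being out of reach for 16 GB consumer cards without offloading.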

Logical Reliability and Hallucination Concerns

53 comments

Discussion highlights issues with model consistency, including endless output loops and hallucinations during coding tasks or complex logical reasoning tests.

Vision and OCR Capabilities

14 comments

The community shows high enthusiasm for vision performance, specifically praising the model's accuracy in OCR tasks and its ability to handle bounding boxes.

Top comments

  • Amazing! shrinking size while high intelligence, Apache 2, works on Phones. This is the 2026 news we need. Keep going Google.
    YouTube · @AmanBansil134
  • I installed Gemma 4 (gemma4:e2b 7.2GB) running locally derived a Constraint-Dynamical Hamiltonian for a Clifford algebra project I am working on. And you have to trust me what it provided was amazing... So I have a thinking model running locally using RAG to read files and can do quite advanced math... that is the absolute bomb.. :)
    YouTube · @JimNichols69
  • Gemma 4 welcome! 🎉 And thanks to everyone behind Gemma 4's development. We all appreciate the incredible work you all do.
    YouTube · @TravisLee3380
  • Not surprised. Gemma is just a mini Gemini, it's good with that stuff. Where GLM 5.1 shines is coding.
    Reddit · atape_179
  • I don't know how you ran it, if you're running it locally using llama.cpp, use the b8660 llama.cpp build (more recent versions have a regression, another tokenization issue) and use --temp 0.3 --top-p 0.9 --min-p 0.1 --top-k 20 I am sure the 26B will do much better. Also, Claude might favor better formatting etc., a boolean test is not good. Try the below prompt for the judge: I am benchmarking many AIs in many tasks. You are a judge. Go through them question by question, not LLM by LLM. Go through each question and, for every question, give all AIs a score out of 10, and be sure to be fair with them. Later, rank them all by their total score. MAKE SURE to evaluate them correctly, not based on vibe alone (check for misinformation, hallucinations, if they are useful or not, and not on formatting). PROMPT= AI 1: ... AI 2: ....
    Reddit · Sadman78247
  • LLM as judge = no thanks. It also depends how you're running Gemma 4 for the test. The new custom parser for gemma 4 in llama.cpp b8665 has fixed it for me. Before, it failed the test of just being given the image below. Now it solves it.
    Reddit · ambient_temp_xeno45
  • Super excited about the direction things are going. Next generation will be frontier quality for most daily uses and fit on a single solid GPU like the Intel B70. A couple more turbo quant type advances and we're there on SOTA phones, prob two generations. Genuinely concerned about the economy if the AI takeoff is entirely agents running on edge devices and the major labs' trillions in capital goes stale, but very glad we're leaning towards the good path where AI won't be controlled by the few.
    Reddit · LeucisticBear28
  • Gemma 4 is the first actual leap AI did in a "long" time. It makes it smaller but also use less computing power. I am running it on my PC and while it takes up 20gb its equivalent to a 400gb model… insane and on Apache 2.0 so you can make and sell any product you make with it.
    YouTube · @Mister_Morrigan25
  • Even since Gemma 2 it's been useful for being good at interacting instead of being a 'yes man' (girl). Agreeableness is a flaw and I don't like it in Qwen. (I'm absolutely right)
    Reddit · ambient_temp_xeno20
  • qwen3 coder next losing to the 4b at actual game logic is the most demoralizing benchmark result i've seen this week, playwright mcp doing the heavy lifting probably explains a lot of the variance here.
    Reddit · AngeloKappos13
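The sampler settings recommended in the comments above can be passed to llama.cpp's CLI roughly as follows. The GGUF filename and prompt are placeholders, and the build-pinning advice (b8660 vs. later builds) comes from a single commenter's regression report, not from official guidance.

```shell
# Run a local Gemma 4 GGUF with the sampler settings suggested above.
# Model filename is a placeholder; adjust to your local quantization.
./llama-cli \
  -m gemma-4-31b-Q4_K_M.gguf \
  --temp 0.3 \
  --top-p 0.9 \
  --min-p 0.1 \
  --top-k 20 \
  -p "Your prompt here"
```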

Source breakdown

Graph based on sampled comments per item (n≤30)