Gemma 4 Ollama setup: install it, pull the right tag, and test the API.
If you want the fastest realistic path to running Gemma 4 locally, use Ollama first. The basic flow is simple: install Ollama, pull gemma4 or a specific tag, verify the model with ollama list, run it, then confirm the local API on port 11434.
Part of the hub: this setup page belongs to our broader Gemma 4 guide, which also covers model tiers, VRAM planning, Mac usage, and comparison pages.
Quick setup steps
- Install Ollama.
- Pull gemma4 or a specific Gemma 4 tag.
- Use ollama list to confirm the model is available.
- Run the model in the CLI.
- Use the local API and ollama ps to verify it is serving correctly.
Recommended first move: start with gemma4 or gemma4:e4b, confirm the workflow, then decide whether larger tags are worth the extra memory cost.
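The checklist above can be sketched as a small Python helper. This is a minimal sketch, not an official workflow: the pull_and_check name is made up here, and it assumes ollama is on your PATH (it exits gracefully if not).

```python
import shutil
import subprocess

MODEL = "gemma4"  # assumption: swap in a specific tag such as gemma4:e4b if you prefer

def pull_and_check(model: str = MODEL) -> bool:
    """Pull the model, then confirm it shows up in `ollama list`."""
    if shutil.which("ollama") is None:
        print("ollama not found on PATH; install it first")
        return False
    # check=False: a failed pull just means the listing check below will come back False
    subprocess.run(["ollama", "pull", model], check=False)
    listing = subprocess.run(
        ["ollama", "list"], capture_output=True, text=True, check=True
    )
    return model in listing.stdout

if __name__ == "__main__":
    print("model available:", pull_and_check())
```

If this prints True, the CLI steps below should work unchanged.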
Install Ollama
Use the official installer from ollama.com. On Linux, the quick shell install is common. On Mac, the desktop app or Homebrew works. On Windows, the official installer is the simplest route.
curl -fsSL https://ollama.com/install.sh | sh
brew install --cask ollama
Pull Gemma 4 and pick the right tag
Ollama's official gemma4 model page shows the default run path first, and that is still the right starting point for most users.
ollama pull gemma4
ollama list
ollama run gemma4
If you want to control the size explicitly, pull a specific tag:
ollama pull gemma4:e2b
ollama pull gemma4:e4b
ollama pull gemma4:26b
ollama pull gemma4:31b
Once the model is installed, these commands are the most useful first checks:
ollama list
ollama ps
ollama run gemma4 "roses are red"
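If you are scripting these checks, ollama list output can be parsed with a few lines of Python. The helper and sample below are a sketch: the column layout matches the NAME/ID/SIZE/MODIFIED header ollama list prints, but the IDs and sizes in the sample are illustrative, not real values.

```python
def installed_models(list_output: str) -> list[str]:
    """Parse the NAME column out of `ollama list` output."""
    lines = list_output.strip().splitlines()
    # Skip the header row; the model name is the first whitespace-separated field
    return [line.split()[0] for line in lines[1:] if line.split()]

# Illustrative output shape (IDs and sizes are made up; widths vary by version):
sample = """NAME            ID              SIZE      MODIFIED
gemma4:latest   a1b2c3d4e5f6    5.4 GB    2 days ago
gemma4:e2b      f6e5d4c3b2a1    2.9 GB    3 hours ago"""

print(installed_models(sample))  # → ['gemma4:latest', 'gemma4:e2b']
```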
Use the local API
After the CLI works, test the API immediately. That is the fastest way to know whether your local setup is ready for scripts, tools, or a small app.
curl http://localhost:11434/api/generate \
-d '{"model":"gemma4","prompt":"Summarize why local AI matters."}'
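By default, /api/generate streams its answer as newline-delimited JSON: each line is an object carrying a "response" fragment, and the final line has "done": true. (Pass "stream": false in the request body if you want one JSON object instead.) Here is a small sketch that stitches the fragments back together; the sample stream is hand-written to show the shape, with most fields trimmed for clarity.

```python
import json

def assemble_stream(ndjson_lines) -> str:
    """Join the 'response' fragments from Ollama's streaming /api/generate output."""
    parts = []
    for line in ndjson_lines:
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)

# Shape of the stream (illustrative; real chunks carry more fields):
stream = [
    '{"model":"gemma4","response":"Local ","done":false}',
    '{"model":"gemma4","response":"AI matters.","done":false}',
    '{"model":"gemma4","response":"","done":true}',
]
print(assemble_stream(stream))  # → Local AI matters.
```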
What about Mac, Windows, and Linux?
- Mac: easiest if you already use Homebrew or the official desktop installer.
- Windows: use the official installer, then run the same pull and run commands in PowerShell or Terminal.
- Linux: the shell installer is usually the fastest first step.
Common issues
The model downloads, but inference is painfully slow
Your hardware is probably mismatched to the tag you chose, or your prompt/context size is larger than your machine can handle comfortably. Step down to a smaller tag or trim the context before assuming a deeper problem.
The model runs, but the system becomes unusable
That is usually memory pressure, not a mysterious Ollama failure. Step down the model size or close other heavy apps.
The API fails even though the CLI works
Check whether the model is still active with ollama ps, then verify your client timeout and request size before debugging anything more complex.
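Before debugging a client, it helps to confirm the server is reachable at all: Ollama answers plain GET requests on its root endpoint when it is up. This is a minimal reachability sketch (the ollama_reachable name is made up here), with a short timeout so a dead server fails fast instead of hanging your script.

```python
import urllib.error
import urllib.request

def ollama_reachable(base_url: str = "http://localhost:11434",
                     timeout: float = 2.0) -> bool:
    """Return True if the Ollama server answers a GET on its root endpoint."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout: server is not serving
        return False

if __name__ == "__main__":
    print("server up:", ollama_reachable())
```

If this returns False while the CLI works, check that you are pointing at the right host and port before touching timeouts or request sizes.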