Android Studio Updated 2026-04-05 6 min read

How to use Gemma 4 in Android Studio as your local coding model

Gemma 4 is Google's officially recommended local model for Android Studio's Agent Mode. Your code never leaves your machine, there's no API key required, and it works offline. This guide covers the exact steps to get it running, the hardware you actually need, and what Agent Mode can do once it's set up.

Why Gemma 4 specifically for Android Studio

Gemma 4 isn't just a generic local model that happens to work in Android Studio — it was explicitly trained on Android development patterns. That means it understands Kotlin idioms, Jetpack Compose, and Android-specific code structures at a deeper level than a general-purpose model would.

Three practical reasons to use it instead of a cloud model:

  • Privacy: proprietary code stays on your machine. No prompts are sent to Google or any API endpoint during inference.
  • No quota limits: run as many Agent Mode requests as you want without hitting rate limits or unexpected bills.
  • Offline use: works without internet once the model is downloaded. Useful in restricted corporate environments or on flights.

Hardware requirements

Google's official recommendation is the 26B MoE model for the best Agent Mode experience. The E4B is also supported for lower-spec machines.

Model Minimum RAM Storage needed Best for
Gemma 4 E4B 12 GB RAM total ~4 GB Laptops, lower-spec dev machines
Gemma 4 26B A4B ✅ recommended 24 GB RAM total ~17 GB Workstations, high-spec dev machines

Important: these RAM numbers include both Android Studio's own memory usage and the model. Android Studio typically needs 4–8 GB on its own. If you have exactly 16 GB total RAM, the 26B model will be tight — use E4B instead and you'll get a better experience than a constantly-swapping 26B.

GPU is not strictly required — both models can run on CPU — but a discrete GPU or Apple Silicon with enough unified memory makes generation noticeably faster.

Setup steps

The setup has four parts: install Android Studio, install an LLM provider, pull the model, and connect them. The whole process takes about 20–30 minutes depending on your download speed.

Step 1 — Install the latest Android Studio

You need a recent version of Android Studio (Meerkat or newer) that includes the local model provider integration. Download from developer.android.com/studio.

If you already have Android Studio installed, check for updates: Help → Check for Updates.

Step 2 — Install an LLM provider

Android Studio doesn't run the model directly — it connects to a local inference server. You need one of these two:

Option A: Ollama (recommended — command-line, fastest setup)

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull the recommended model
ollama pull gemma4:26b

# Start the server (runs automatically after install on Mac/Linux)
ollama serve

Option B: LM Studio (GUI — easier if you prefer no terminal)

  • Download from lmstudio.ai
  • Open the app, search "gemma4" in the model browser
  • Download Gemma 4 E4B or 26B A4B (pre-quantized GGUF files available)
  • Click "Start Server" in the Local Server tab

Step 3 — Connect the provider to Android Studio

  1. Open Android Studio
  2. Go to Settings → Tools → AI → Model Providers
  3. Click Add and select Ollama or LM Studio
  4. Enter the server address:
    • Ollama default: http://localhost:11434
    • LM Studio default: http://localhost:1234
  5. Click Test Connection — you should see a green checkmark
  6. Click Apply

Step 4 — Select Gemma 4 in Agent Mode

  1. Open the Gemini panel in Android Studio (the star icon in the toolbar)
  2. Switch to Agent Mode
  3. Click the model selector dropdown at the bottom of the panel
  4. Select Gemma 4 from the available local models

You're ready. Type a prompt and Gemma 4 will handle it entirely on your machine.

What Agent Mode can actually do with Gemma 4

Agent Mode with Gemma 4 goes beyond autocomplete. It can execute multi-step tasks across your codebase. Practical examples that work well:

Task Example prompt What Gemma 4 does
Build a feature "Build a settings screen with dark mode toggle" Generates Kotlin + Compose UI, wires ViewModel, follows Material 3 patterns
Refactor legacy code "Migrate all XML layouts in this activity to Compose" Reads existing files, rewrites them, preserves logic
Fix build errors "Build my project and fix any errors" Runs build, reads error output, applies fixes iteratively
Externalize strings "Extract all hardcoded strings and add them to strings.xml" Scans codebase, applies changes across multiple files
Write tests "Write unit tests for the UserRepository class" Generates JUnit tests with mocking, follows Android testing conventions

Gemma 4's native function calling is what makes these multi-step tasks reliable. It can call tools (read file, write file, run build) and chain them sequentially without losing context between steps.

Troubleshooting

Android Studio can't connect to Ollama

First confirm Ollama is running: open a terminal and run ollama ps. If nothing shows, run ollama serve. On Windows, check that Ollama appears in the system tray. Then verify the URL in Android Studio settings is exactly http://localhost:11434 with no trailing slash.

Generation is very slow

The 26B model needs 24 GB RAM headroom to avoid swapping. If Android Studio, Chrome, and other apps are open alongside, you may not have enough. Close memory-heavy apps, or switch to the E4B model which runs comfortably in 12 GB.

Model downloaded but doesn't appear in Agent Mode selector

Try restarting Android Studio after connecting the provider. The model list is populated when the IDE starts. Also confirm the Ollama server is running before you open Android Studio, not after.

Agent Mode starts but stops mid-task

This is usually a timeout issue. Ollama's default request timeout may be too short for complex multi-step tasks. Set a longer timeout by adding OLLAMA_KEEP_ALIVE=30m to your environment before starting Ollama.

Related guides