Chat with AI Models
On Your Android Device

Connect to your Ollama server or run models fully on-device: Gemma 4, Qwen 3, DeepSeek R1 Distill, and Phi-4 Mini, all powered by Google's LiteRT-LM. Now with real-time web chat sync: open all your conversations in any browser and send messages that your phone processes, streaming each reply back in real time.

Powerful Features

Everything you need for seamless AI conversations

💻

Web Chat — Chat from Any Browser

Open chat.html on any desktop or laptop. Sign in with the same account as your phone and all your conversations appear instantly. Send a message — your phone picks it up, runs AI inference, and the response streams back to your browser in real time. Works with Ollama and on-device LiteRT models alike.

✨ Free tier: 3 web messages/day  ·  Upgrade for unlimited →

📲

On-device AI with LiteRT-LM

Download and run Gemma 4, Gemma 3, Qwen 3, DeepSeek R1 Distill, and Phi-4 Mini directly on your phone via Google's LiteRT-LM runtime. Once a model is downloaded, inference needs no server and no network, and no data leaves the device.

💬

Intuitive Chat Interface

Experience smooth, real-time conversations with AI models through our beautifully designed Material Design 3 interface.

🖼️

Image Support

Share images with vision-capable AI models! Smart compression automatically optimizes images for fast, efficient conversations.

📚

Persistent Chat History

Never lose a conversation! All your chats are saved automatically to a local Room database on your device.

⚙️

Flexible Configuration

Connect to multiple Ollama servers, switch between AI models, and customize model parameters to your needs.

🎨

Modern & Beautiful UI

Built with Jetpack Compose and Material Design 3, featuring smooth animations and full dark theme support.

🔒

Privacy-First

Your chat history is stored in a local database on your phone. Conversations leave the device only when you send them to an Ollama server you have configured or sign in to Web Chat sync.

⏹️

Stop Anytime

Halt a streaming reply mid-response with a single tap — whatever has been generated so far is kept in the thread.

🎚️

Tune Model Parameters

Dial in temperature, top_p, top_k, context length, and seed per conversation to control how creative or precise each chat is.
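
For chats backed by an Ollama server, these settings correspond to entries in the `options` object of Ollama's chat request (`temperature`, `top_p`, `top_k`, `num_ctx` for context length, and `seed`). The Kotlin sketch below shows one way such a mapping could look. `ChatParams` and `toOllamaOptions` are hypothetical names used for illustration, not the app's actual code.

```kotlin
// Hypothetical holder for per-conversation settings; null means "use the model default".
data class ChatParams(
    val temperature: Double? = null,
    val topP: Double? = null,
    val topK: Int? = null,
    val contextLength: Int? = null,
    val seed: Int? = null,
)

// Maps the settings to the option names Ollama's API uses, skipping unset values.
fun ChatParams.toOllamaOptions(): Map<String, Number> =
    listOf<Pair<String, Number?>>(
        "temperature" to temperature,
        "top_p" to topP,
        "top_k" to topK,
        "num_ctx" to contextLength,
        "seed" to seed,
    ).mapNotNull { (key, value) -> value?.let { key to it } }.toMap()
```

A low temperature with a fixed seed, for example `ChatParams(temperature = 0.2, seed = 42)`, tends to give more repeatable answers, while a higher temperature makes replies more varied.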

📊

Token-Speed Stats

See tokens per second, token counts, and response time under each reply so you always know how your models are performing.
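
For Ollama-backed chats, a tokens-per-second figure can be derived from two fields Ollama reports with the final response: `eval_count` (tokens generated) and `eval_duration` (generation time in nanoseconds). The sketch below is an assumption about how such a stat could be computed. It is not necessarily how the app calculates it, and on-device LiteRT models may be measured differently.

```kotlin
// eval_count and eval_duration (nanoseconds) come from Ollama's final response.
fun tokensPerSecond(evalCount: Int, evalDurationNanos: Long): Double {
    if (evalDurationNanos <= 0L) return 0.0 // avoid dividing by zero on empty replies
    return evalCount * 1_000_000_000.0 / evalDurationNanos
}
```

For instance, 120 tokens generated in 4 seconds works out to 30 tokens per second.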

ℹ️

Model Details

Inspect any model's parameters, quantization, template, license, and full modelfile right inside the app.

🏷️

Labels & Organization

Label your chats and filter the thread list by them, on top of pinning and archiving, to keep everything tidy.

🌐

Global Search

Search across every conversation by message content and jump straight to the message you remember writing.

See It In Action

Beautiful screenshots from the app

Chat Threads

Text Chat

Image Chat

Model Selection

Server Management

Thread Settings

Premium Plans

Everything is free to try — upgrade in-app for unlimited web chat and an ad-free experience.

BEST VALUE

Web Sync + Ad Free — Yearly

$14.99/year

~$1.25/mo · Unlimited web messages · All ads removed in-app

  • ✅ Unlimited web chat messages
  • ✅ No banners, interstitials, or app-open ads
  • ✅ Cancel anytime, no commitment

💬

Web Sync + Ad Free — Monthly

$1.99/month

Unlimited web messages · All ads removed in-app

  • ✅ Unlimited web chat messages
  • ✅ No banners, interstitials, or app-open ads
  • ✅ Cancel anytime, no commitment

🚫

Ad-Free Only

from $0.99/month

Monthly ($0.99/mo), Yearly ($9.99/yr), or Lifetime (one-time)

  • ✅ No banners, interstitials, or app-open ads
  • ⚠️ Web messages still limited to 3/day

To subscribe, open Ollama AI Chat on your Android phone → Settings → Remove Ads / Go Premium.

💳 Two ways to pay, depending on your version

The app comes in two flavors that use different payment processors because they're distributed differently:

▶️ Play Store version
  • Paid through Google Play Billing
  • No extra sign-in — uses your Play account
  • Manage & cancel in Google Play

🐙 GitHub version (sideloaded APK)
  • Paid through Stripe Checkout (opens in your browser)
  • Google sign-in required before checkout
  • Manage & cancel via your emailed Stripe receipt

Why does the GitHub version ask me to sign in before buying?

Because it isn't distributed through the Play Store, there's no store account to attach a purchase to. The GitHub build uses Stripe and links your upgrade to your Google (Firebase) account instead. That sign-in is what lets your ad-free or Web Sync upgrade restore automatically when you reinstall or switch devices: your purchase follows your account, not a single phone. The Play Store version skips this step because Google Play already handles account-linked restores for you.

Frequently Asked Questions

Everything you need to know

Web Chat lets you access all your AI conversations from any desktop or laptop browser — no app install needed. Open chat.html, sign in with the same Google or email account you use on the app, and your threads appear immediately. When you send a message, your Android phone receives it via Firebase, runs AI inference (using Ollama or on-device LiteRT), and streams the response back to your browser in real time. You can even type messages while your phone is offline — they queue locally and dispatch automatically when the phone reconnects.
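
The queue-and-dispatch behaviour described above can be pictured with a small sketch. It is written in Kotlin for illustration only: the `Outbox` class and its members are hypothetical, the real transport is Firebase (whose API is not shown here), and the actual queue lives in the browser rather than in Kotlin code.

```kotlin
// Hypothetical in-memory outbox: holds messages while the phone is unreachable.
class Outbox(private val send: (String) -> Unit) {
    private val pending = ArrayDeque<String>()
    var phoneOnline = false
        private set

    // Number of messages still waiting for the phone to reconnect.
    val queuedCount: Int get() = pending.size

    // Sends immediately when the phone is online, otherwise queues the message.
    fun submit(message: String) {
        if (phoneOnline) send(message) else pending.addLast(message)
    }

    // Called when the phone's status changes; flushes queued messages in the order typed.
    fun onPhoneStatusChanged(online: Boolean) {
        phoneOnline = online
        if (!online) return
        while (pending.isNotEmpty()) send(pending.removeFirst())
    }
}
```

Flushing in first-in, first-out order keeps the conversation in the order it was typed, even if several messages were written while the phone was offline.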

Ollama AI Chat is an Android application that connects to your Ollama server, enabling you to chat with various AI models like Llama, Mistral, and more directly from your Android device. It features a modern Material Design 3 interface with real-time streaming responses.

You don't need an Ollama server: as of the latest release you can also run models fully on-device using Google's LiteRT-LM runtime. Add a LiteRT (on-device) backend from the Servers screen, download one of the built-in bundles, and start chatting. If you prefer remote inference, Ollama is still fully supported; visit ollama.ai to learn more.

The Models screen ships with a curated catalog of .litertlm bundles pulled from the public litert-community Hugging Face organization: Gemma 3 270M (~304 MB), Gemma 3 1B (~584 MB), Qwen 3 0.6B (~614 MB), Qwen 2.5 1.5B Instruct (~1.6 GB), DeepSeek R1 Distill Qwen 1.5B (~1.83 GB), Gemma 4 E2B (~2.58 GB), Gemma 4 E4B (~3.65 GB), and Phi-4 Mini Instruct (~3.91 GB). Downloads resume after network drops and a free-space check runs before each pull.
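
Resumable downloads and a free-space check can be built from two standard pieces: an HTTP `Range` request header that asks the server to continue from the bytes already on disk, and a comparison against the usable space on the storage volume. The Kotlin sketch below illustrates the idea under those assumptions. The function names are hypothetical and the app's actual implementation may differ.

```kotlin
import java.io.File

// True when the target directory has room for the bytes still to be downloaded.
fun hasRoomFor(dir: File, totalBytes: Long, alreadyDownloaded: Long): Boolean =
    dir.usableSpace >= totalBytes - alreadyDownloaded

// Value for the HTTP "Range" request header that continues a partial download;
// null means there is nothing to resume and the download starts from the beginning.
fun resumeRangeHeader(partial: File): String? =
    if (partial.exists() && partial.length() > 0) "bytes=${partial.length()}-" else null
```

A server that supports range requests answers with status 206 and only the remaining bytes, which matters for multi-gigabyte bundles such as Gemma 4 E4B (~3.65 GB).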

The app requires Android 13 (API Level 33) or higher. Make sure your device is running a compatible Android version before installing.

Your data stays private: conversations are stored in a local Room database on your phone, so your chat history never leaves the device unless you choose to share it or sign in to Web Chat. The app supports both HTTP and HTTPS connections to Ollama servers; prefer HTTPS whenever traffic leaves a network you trust.

You can connect to multiple Ollama servers and switch between AI models instantly. The app also lets you manage models: pull new ones and delete those you no longer need.

You can attach and send images in conversations with vision-capable AI models. The app compresses images automatically, keeping conversations fast while preserving quality.
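
One common way to shrink an image before sending it to a model is to cap its longer side while preserving the aspect ratio. The Kotlin sketch below shows only that size calculation. It is an illustration, not the app's actual compression logic, and both `fitWithin` and the 1024-pixel limit in the usage note are assumptions.

```kotlin
// Scales (width, height) down so the longer side is at most maxSide,
// preserving the aspect ratio. Images that already fit are left unchanged.
fun fitWithin(width: Int, height: Int, maxSide: Int): Pair<Int, Int> {
    require(width > 0 && height > 0 && maxSide > 0) { "dimensions must be positive" }
    val longest = maxOf(width, height)
    if (longest <= maxSide) return width to height // never upscale
    // Integer math avoids floating-point rounding surprises.
    val w = (width.toLong() * maxSide / longest).toInt()
    val h = (height.toLong() * maxSide / longest).toInt()
    return maxOf(1, w) to maxOf(1, h)
}
```

With a 1024-pixel cap, a 4000×3000 photo would be resized to 1024×768, while an 800×600 image would be sent at its original size.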

The app is open source and licensed under the MIT License. You can view the source code, contribute, and report issues on GitHub. We welcome contributions!

To connect to a server, enter your Ollama server URL (e.g., http://192.168.1.100:11434) in the app settings. The app supports both HTTP and HTTPS connections. You can add multiple servers and switch between them as needed.
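
A server URL of this shape can be sanity-checked before it is saved. The Kotlin sketch below accepts only `http` and `https` URLs that include a host and trims any trailing path. `normalizeServerUrl` is a hypothetical helper written for illustration, not part of the app.

```kotlin
import java.net.URI

// Returns a normalized base URL, or null if the input is not an http(s) URL with a host.
fun normalizeServerUrl(input: String): String? {
    val uri = runCatching { URI(input.trim()) }.getOrNull() ?: return null
    val scheme = uri.scheme?.lowercase() ?: return null
    if (scheme != "http" && scheme != "https") return null
    val host = uri.host ?: return null
    val port = if (uri.port == -1) "" else ":${uri.port}" // keep an explicit port only
    return "$scheme://$host$port"
}
```

Once a URL passes this check, a request to Ollama's `/api/tags` endpoint, which lists the models installed on the server, is a simple way to confirm the server is reachable.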

Ready to Get Started?

Download Ollama AI Chat and start chatting with AI models on your Android device today!

Requires Android 13 (API 33) or higher