▎ LocalLM lets you run large language models entirely on your Android device — no internet required, no data sent to the cloud.
▎
▎ Chat with AI models like Gemma, Qwen, Phi, and DeepSeek, running directly on your phone's CPU, GPU, or NPU. Your conversations stay private and never leave your device.
▎
▎ FEATURES
▎
▎ - On-device AI inference powered by Google's LiteRT-LM
▎ - Support for CPU, GPU, and NPU acceleration
▎ - Download models from a curated catalog or search HuggingFace
▎ - Connect to local network LLM servers (LM Studio, Ollama, llama.cpp)
▎ - Connect to cloud APIs (OpenAI, Anthropic)
▎ - Full conversation history with export support
▎ - Dark theme optimized for AMOLED displays
▎ - No account required, no tracking, no ads
▎
▎ SUPPORTED MODELS
▎
▎ - Gemma 3n (E2B, E4B)
▎ - Gemma 3 1B
▎ - Qwen 3, Qwen 2.5
▎ - DeepSeek R1
▎ - Phi 4 Mini
▎ - And more via HuggingFace search