A native app that showcases on-device Large Language Model (LLM) inference on Android and ChromeOS, accelerated by Arm technologies, letting you run models locally for AI chat.
• Run Locally, Fully Offline: All processing happens directly on your device.
• AI Chat: Engage in multi-turn conversations with performance metrics.
• System Prompts: Customize the entire conversation flow with built-in system prompts, or craft your own to give the model a unique voice.
• Benchmark: Run standard Prompt Processing (prefill) and Token Generation (decode) tests for quantitative results.
• Model Management: Choose from a variety of LLMs suited to your hardware capabilities, or import your own GGUF models.
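The benchmark above reports Prompt Processing (prefill) and Token Generation (decode) rates. As a rough sketch of what such figures mean, not the app's actual code, throughput in tokens per second can be derived from a token count and an elapsed time; the class and method names below are hypothetical:

```java
public class Throughput {
    // Hypothetical helper (not from the app): tokens processed over
    // elapsed milliseconds, expressed as tokens per second.
    static double tokensPerSec(int tokens, long elapsedMs) {
        return tokens * 1000.0 / elapsedMs;
    }

    public static void main(String[] args) {
        // Illustrative numbers only: a 128-token prompt prefilled in 160 ms,
        // then 64 tokens decoded in 2000 ms.
        double prefill = tokensPerSec(128, 160);  // prompt processing rate
        double decode = tokensPerSec(64, 2000);   // token generation rate
        System.out.printf("prefill: %.1f tok/s, decode: %.1f tok/s%n",
                prefill, decode);
    }
}
```

Prefill is typically much faster per token than decode, since the prompt can be processed in large batches while generation produces one token at a time; comparing the two rates across models and quantization levels is what makes the benchmark results quantitative.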