LLM Hub brings production-grade AI straight to your Android device — private, fast, and fully local. Run modern on-device LLMs (Gemma-3, Gemma-3n multimodal, Llama-3.2, Phi-4 Mini) with large context windows, persistent global memory, and retrieval-augmented generation (RAG) that grounds answers in indexed documents stored on-device. Create and store embeddings for documents and notes, run vector similarity search locally, and enrich responses with DuckDuckGo-powered web search when you need live facts. Everything important stays on your phone unless you explicitly export it: local-only memory, indexes, and embeddings protect your privacy while delivering high relevance and accuracy.
Key Features
On-device LLM inference: Fast, private responses without cloud dependency; choose models that match your device and needs.
Retrieval-Augmented Generation (RAG): Combine model reasoning with indexed document chunks and embeddings to produce fact-grounded answers (a retrieval sketch follows this list).
Persistent Global Memory: Save facts, documents, and knowledge to a persistent, device-local memory (Room DB) for long-term recall across sessions (a schema sketch follows this list).
Embeddings & Vector Search: Generate embeddings, index content locally, and retrieve the most relevant documents with efficient similarity search.
Multimodal Support: Use text + image capable models (Gemma-3n) for richer interactions when available.
Web Search Integration: Supplement local knowledge with DuckDuckGo-powered web results to fetch up-to-date information for RAG queries and instant answers (a web-search sketch follows this list).
Offline-Ready: Work without network access — models, memory, and indexes persist on-device.
GPU Acceleration (optional): Benefit from hardware acceleration where supported — for best results with larger GPU-backed models we recommend devices with at least 8GB RAM.
Privacy-First Design: Memory, embeddings, and RAG indexes remain local by default; no cloud upload unless you explicitly choose to share or export data.
Long-Context Handling: Support for models with large context windows so the assistant can reason over extensive documents and histories.
Developer-Friendly: Supports local inference, indexing, and retrieval use cases for apps that require private, offline AI.
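To make the persistent memory feature concrete, here is a minimal sketch of a device-local store built on Room, as the listing describes. The entity, DAO, and converter names are illustrative assumptions, not LLM Hub's actual schema.

```kotlin
// Hypothetical sketch of a device-local memory store backed by Room.
// Entity/DAO names are illustrative; the app's real schema is not documented here.
import androidx.room.*

@Entity(tableName = "memory_items")
data class MemoryItem(
    @PrimaryKey(autoGenerate = true) val id: Long = 0,
    val text: String,                 // the saved fact or document chunk
    val embedding: FloatArray         // embedding vector used for similarity search
)

// Room cannot persist FloatArray directly, so a converter packs it into a BLOB column.
class VectorConverters {
    @TypeConverter
    fun fromFloatArray(v: FloatArray): ByteArray {
        val buf = java.nio.ByteBuffer.allocate(v.size * 4)
        v.forEach { buf.putFloat(it) }
        return buf.array()
    }

    @TypeConverter
    fun toFloatArray(bytes: ByteArray): FloatArray {
        val buf = java.nio.ByteBuffer.wrap(bytes)
        return FloatArray(bytes.size / 4) { buf.getFloat(it * 4) }
    }
}

@Dao
interface MemoryDao {
    @Insert
    suspend fun insert(item: MemoryItem): Long

    @Query("SELECT * FROM memory_items")
    suspend fun allItems(): List<MemoryItem>
}

@Database(entities = [MemoryItem::class], version = 1)
@TypeConverters(VectorConverters::class)
abstract class MemoryDatabase : RoomDatabase() {
    abstract fun memoryDao(): MemoryDao
}
```

Because everything lives in a local Room database, memory survives app restarts and never needs to leave the device.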
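The RAG and vector-search features boil down to scoring stored embeddings against a query embedding and splicing the best chunks into the prompt. The sketch below shows that flow with plain cosine similarity; the embedding and generation calls themselves are assumed to come from whichever on-device model you load, and are not shown.

```kotlin
// Minimal sketch of local retrieval for RAG: rank indexed chunks by cosine
// similarity to the query embedding, then ground the prompt in the top hits.
import kotlin.math.sqrt

fun cosineSimilarity(a: FloatArray, b: FloatArray): Float {
    var dot = 0f; var normA = 0f; var normB = 0f
    for (i in a.indices) {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    return dot / (sqrt(normA) * sqrt(normB) + 1e-8f)
}

// Keep the k most relevant chunks from the local index (text paired with its embedding).
fun retrieveTopK(query: FloatArray, index: List<Pair<String, FloatArray>>, k: Int = 4): List<String> =
    index.sortedByDescending { (_, emb) -> cosineSimilarity(query, emb) }
        .take(k)
        .map { it.first }

// Assemble a prompt that asks the model to answer only from the retrieved context.
fun buildRagPrompt(question: String, chunks: List<String>): String = buildString {
    appendLine("Answer using only the context below.")
    chunks.forEachIndexed { i, c -> appendLine("[${i + 1}] $c") }
    appendLine()
    appendLine("Question: $question")
}
```

On-device indexes are small enough that a brute-force scan like this is usually fast; a dedicated vector index only matters once the document set grows large.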
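For the web-search supplement, the sketch below assumes DuckDuckGo's public Instant Answer endpoint at api.duckduckgo.com and the AbstractText field it returns; LLM Hub's actual integration may work differently, so treat the endpoint and parsing as assumptions.

```kotlin
// Hedged sketch: fetch a short web summary to mix into RAG context when live facts are needed.
// Endpoint and response field are assumptions, not the app's documented integration.
import java.net.HttpURLConnection
import java.net.URL
import java.net.URLEncoder
import org.json.JSONObject

fun fetchInstantAnswer(query: String): String? {
    val url = URL(
        "https://api.duckduckgo.com/?q=" +
            URLEncoder.encode(query, "UTF-8") +
            "&format=json&no_html=1"
    )
    val conn = url.openConnection() as HttpURLConnection
    return try {
        conn.requestMethod = "GET"
        val body = conn.inputStream.bufferedReader().readText()
        // AbstractText holds the short instant-answer summary when one exists.
        JSONObject(body).optString("AbstractText").ifBlank { null }
    } finally {
        conn.disconnect()
    }
}
```

A result like this can be appended to the retrieved local chunks before prompting, so the answer stays grounded while still reflecting current information.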
Why choose LLM Hub? LLM Hub is built to deliver private, accurate, and flexible AI on mobile. It merges the speed of local inference with the factual grounding of retrieval-based systems and the convenience of persistent memory — ideal for knowledge workers, privacy-conscious users, and developers building local-first AI features.
Supported Models: Gemma-3, Gemma-3n (multimodal), Llama-3.2, Phi-4 Mini — choose the model that fits your device capabilities and context needs.
Last updated
Sep 16, 2025