TokForge Local AI Offline Chat

Classification du contenu
Tout public
100+
Téléchargements
Classification du contenu
Tout public
En savoir plus
Capture d'écran
Capture d'écran
Capture d'écran
Capture d'écran
Capture d'écran
Capture d'écran
Capture d'écran
Capture d'écran
Capture d'écran
Capture d'écran
Capture d'écran
Capture d'écran
Capture d'écran
Capture d'écran
Capture d'écran
Capture d'écran
Capture d'écran
Capture d'écran
Capture d'écran
Capture d'écran
Capture d'écran
Capture d'écran
Capture d'écran
Capture d'écran

À propos de l'application

PRIVATE, LOCAL & OFFLINE AI. AD-FREE WITH NO SUBSCRIPTION


TokForge runs large language models directly on your Android device fast. No cloud, no subscription, and no data leaving your pocket.

Whether you need a local AI assistant for productivity or a talking AI friend offline, TokForge delivers high-performance inference without an internet connection.

WHAT CAN IT DO? TOKFORGE FEATURES:



Chat with AI Characters


💬 Your offline AI chat experience just got an upgrade. Import TavernAI V2 character cards (PNG/JSON), customize personalities, and have real conversations with streaming generation. TokForge is the ultimate AI friend offline, featuring Lorebooks, alternate greetings, and world info. Reasoning models even include collapsible thinking blocks for deep logic.

Attach Documents & Ask Questions


📄 Turn TokForge into a powerful local AI research tool. Drop in a PDF, DOCX, EPUB, or text file and ask me anything app offline style. Using RAPTOR tree indexing and BGE-small embeddings, the app finds relevant passages instantly. Follow-up questions stay fast thanks to delta KV cache preservation.

Hear Responses Read Aloud


🔊 A true voice assistant for Android offline. Featuring on-device Kokoro TTS with 11 voices and two quality tiers, your offline assistant can read responses back to you with no latency and zero data usage.

2x Faster with Speculative Decoding


⚡ Experience the fastest LLM performance on mobile. A small draft model predicts ahead while the main model verifies in batch. With a live tok/s indicator and smart backend routing, it’s the most efficient AI on-device solution available.

Three Backends, Five GPU Paths


· MNN with OpenCL and Vulkan GPU: Tuned kernels for Mali and Adreno. TQ4 TurboQuant hits 46–57 tok/s on small models.
· GGUF via llama.cpp: ARM i8mm, Vulkan cooperative matrix, flash attention, and full quantization range.
· Remote API: OpenAI-compatible streaming to Ollama, vLLM, or llama.cpp servers.
· SoC-Aware Auto-Routing: This ai local assistant automatically picks the fastest path for your specific chipset.

ADVANCED AI OFFLINE CHAT FEATURES:

Your AI Remembers You: Per-character persistent memory with background extraction. Knowledge graphs track entity relationships using hybrid keyword and semantic search.

Tune Your Device: ForgeLab benchmarks every ai model and backend combo on your hardware. AutoForge sweeps all configs to pick the fastest settings for your offline ai app.

Developer API: 120+ endpoints for full local control over HTTP. Load models, manage memory, and send messages programmatically.

TESTED ON REAL HARDWARE

- RedMagic 11 Pro: 21.0 tok/s — Qwen3-8B
- Galaxy S24 Ultra: 13.58 tok/s — Qwen3-4B
- OnePlus Ace 5 Ultra: 11.88 tok/s — Qwen3-8B
- Xiaomi Pad 7 Pro: 11.81 tok/s — Qwen3-4B

WHY TOKFORGE?


►This is the AI all in one app for users who refuse to compromise on speed or security.
►Zero analytics, zero telemetry, zero cloud dependency.
►Free ai chatbot offline: All inference happens on-device—airplane mode works perfectly.
►No accounts, no sign-up.
►17 curated models (0.6B–14B): Choose from Qwen3, DeepSeek-R1, Llama 3, Phi-4, and more.

Your smartphone is smarter and more powerful than you think. And by moving the brain of the AI directly onto your silicon, we've eliminated the lag, the costs, and the prying eyes of the cloud.

☑️Download this free offline AI powerhouse today and take control of your data.
Date de mise à jour
6 apl 2026

Sécurité des données

La sécurité, c'est d'abord comprendre comment les développeurs collectent et partagent vos données. Les pratiques concernant leur confidentialité et leur protection peuvent varier selon votre utilisation, votre région et votre âge. Le développeur a fourni ces informations et peut les modifier ultérieurement.
Aucune donnée partagée avec des tiers
En savoir plus sur la manière dont les développeurs déclarent le partage
Aucune donnée collectée
En savoir plus sur la manière dont les développeurs déclarent la collecte

Nouveautés

Lot's of changes vs last upload. TurboQuant added under advanced settings, Cache clearing, RAG + Attachment support (Very Beta), Metrics/API work, UI/UX cleaning and improvements from beta tester feedback
Classification du contenu
Tout public
En savoir plus

Assistance de l'appli

À propos du développeur
Isaac Maple
isaac.maple@defcon-one.io
United States