Cactus Chat lets you talk to AI, directly on your phone. This means it's free, runs offline, and your data stays on your device.
If you're a developer, use Cactus to benchmark the latency and throughput of various LLMs.
Updated on Aug 14, 2025
Productivity
Data safety
Safety starts with understanding how developers collect and share your data. Data privacy and security practices may vary based on your use, region, and age. The developer provided this information and may update it over time.
Learn more about how developers declare collection
Ratings and reviews
Phone
4.5
36 reviews
T Turner
October 31, 2025
It's a decent enough demo, but it isn't nearly as fast as the GitHub description says it should be; I'm only getting about 9 tokens per second on a Pixel 7 with the default model. If this weren't open source it would be rated much lower: applications such as PocketPal, using a 1B Gemma 3 model, are as fast or faster, with more features and a smarter LLM.
1 person found this review helpful
Robin Williams
May 2, 2025
Probably the smoothest and fastest app to set up and load your local and remote AI models with. It took me less than 15 seconds, and most of that was time spent moving my thumbs from key to key on my phone.
2 people found this review helpful
Super Alexander
October 16, 2025
Great tool. Initially I thought the app was always going to crash like some do, but I was fascinated by using it offline with privacy, which I like. Nice job by the developer. But I'd like a feature that lets us copy generated content.