Write your task in Screen Operator, and it simulates taps on the screen to complete it. To do this, a vision language model receives a system message containing the commands for operating the screen and the smartphone. Screen Operator takes screenshots and sends them to Gemini. Gemini responds with commands, which Screen Operator then executes using the Accessibility service permission.
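The screenshot, model, command cycle described above can be sketched roughly as follows. This is only an illustration, not Screen Operator's actual code: the command syntax (`tap(540, 1200)`, `done()`) and the function names `parse_commands`, `run_task`, `take_screenshot`, `ask_model` and `execute` are all hypothetical. The real commands are the ones defined in the app's system message.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical command syntax such as 'tap(540, 1200)'.
# The real command set is defined in the app's system message.
COMMAND_RE = re.compile(r"(\w+)\(([^)]*)\)")


def parse_commands(reply: str) -> List[Tuple[str, List[str]]]:
    """Extract (name, args) pairs from a model reply."""
    commands = []
    for name, raw_args in COMMAND_RE.findall(reply):
        args = [a.strip().strip('"') for a in raw_args.split(",") if a.strip()]
        commands.append((name, args))
    return commands


def run_task(
    task: str,
    take_screenshot: Callable[[], bytes],
    ask_model: Callable[[str, bytes], str],
    execute: Callable[[str, List[str]], None],
    max_steps: int = 10,
) -> int:
    """Repeat screenshot -> model -> commands until a hypothetical
    'done()' command arrives or max_steps screenshots were sent.
    Returns the number of commands that were executed."""
    executed = 0
    for _ in range(max_steps):
        reply = ask_model(task, take_screenshot())
        for name, args in parse_commands(reply):
            if name == "done":
                return executed
            execute(name, args)
            executed += 1
    return executed
```

In the app itself, the role of `execute` is played by the Accessibility service, and `ask_model` corresponds to the request sent to Gemini.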
Available models are
Gemini 2.0 Flash Lite,
Gemini 2.0 Flash,
Gemini 2.5 Flash Lite,
Gemini 2.5 Flash,
Gemini 2.5 Flash Live (coming soon),
Gemini 2.5 Pro,
Gemma 3n E4B it (cloud) and
Gemma 3 27B it.
Depending on the model, 5 to 30 responses per minute are possible.
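As a rough worked example of what these rates mean in practice, the sketch below converts responses per minute into the waiting time per response. It assumes one model response per step of a task, which is a simplification, and the function names are made up for illustration.

```python
def seconds_per_response(responses_per_minute: float) -> float:
    """Average time between two responses at the given rate."""
    if responses_per_minute <= 0:
        raise ValueError("responses_per_minute must be positive")
    return 60.0 / responses_per_minute


def task_duration_seconds(steps: int, responses_per_minute: float) -> float:
    """Estimated duration of a task, assuming one response per step."""
    return steps * seconds_per_response(responses_per_minute)
```

At 5 responses per minute this gives 12 seconds per response, and at 30 responses per minute it gives 2 seconds. Under the one-response-per-step assumption, a 20-step task would take between 40 and 240 seconds.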
If your Google account identifies you as under 18, you need an adult account, because Google (unreasonably) denies you the API key.
Get updates faster on GitHub: https://github.com/Android-PowerUser/ScreenOperator