clpeak is a synthetic micro-benchmark that measures the peak achievable compute performance of your device's GPU and CPU. It runs tight, hand-tuned vector, MAD (multiply-add) and MMA (matrix multiply-accumulate) loops to show what the silicon can do in isolation. The results are raw peak numbers, not a measure of real-world workload performance.
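As a rough illustration of how a timed MAD loop becomes a GFLOPS figure, the Python sketch below converts an operation count and an elapsed time into GFLOPS. The helper name `gflops` and the example numbers are hypothetical, and the sketch assumes the common convention that one multiply-add counts as two floating-point operations; clpeak's own kernels and accounting may differ.

```python
def gflops(work_items, mads_per_item, vector_width, seconds):
    """Convert a timed MAD loop into GFLOPS (hypothetical helper, not clpeak code)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # Assumption: one multiply-add (MAD) counts as 2 floating-point operations.
    flops = 2 * work_items * mads_per_item * vector_width
    return flops / seconds / 1e9


# Made-up example: 1,048,576 work-items, 4096 float4 MADs each, 50 ms of kernel time.
print(f"{gflops(1 << 20, 4096, 4, 0.050):.1f} GFLOPS")
```

The same arithmetic applies to any backend: only the measured kernel time changes, which is what makes side-by-side comparison on one device meaningful.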
Backends on Android: Vulkan, OpenCL, and a native CPU backend. Every available backend runs back-to-back, so you can compare how the different compute stacks perform on the same hardware.
What it measures:
• Single, half, double and bf16 floating-point compute (GFLOPS)
• Integer throughput and INT8 dot-product
• Matrix-engine / tensor throughput where supported
• Global and cache / local memory bandwidth
• Memory latency
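For the memory figures in the list above, a minimal Python sketch of the unit conversions might look like this. Both helper names are hypothetical, the bandwidth figure assumes 1 GB = 10^9 bytes, and the latency figure assumes a chain of dependent loads (pointer chasing) is being timed, which is one common approach and not necessarily the one clpeak uses.

```python
def bandwidth_gbps(bytes_moved, seconds):
    """Bytes transferred in a timed kernel -> GB/s (assumes 1 GB = 1e9 bytes)."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return bytes_moved / seconds / 1e9


def latency_ns(seconds, dependent_loads):
    """Total time for a chain of dependent loads -> average nanoseconds per load."""
    if dependent_loads <= 0:
        raise ValueError("dependent_loads must be positive")
    return seconds * 1e9 / dependent_loads


# Made-up numbers: a 256 MiB buffer read once in 10 ms,
# and 10,000 dependent loads completing in 2 ms.
print(f"{bandwidth_gbps(256 * 1024 * 1024, 0.010):.2f} GB/s")
print(f"{latency_ns(0.002, 10_000):.1f} ns per load")
```

Dividing by the number of dependent loads matters because each load must wait for the previous one, so the total time reflects latency and not throughput.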
Source, issues and details: https://github.com/krrishnarraj/clpeak