
energy efficiency

  1.

    Seeking M5 Max/Ultra telemetry: does higher bandwidth improve Tokens/Joule or just raw throughput?

    I built an open-source LLM inference telemetry suite that measures Tokens per Joule, an energy-efficiency metric, rather than raw speed alone. Current baseline is on an M1 Pro (32GB UMA): 2.42 Tokens/Joule on Qwen-3B Q4_K_M; 22 t/s on Llama-3.1-8B Q8_0 at 8192 context (13.7GB workload) at 35W...
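
For readers unfamiliar with the metric, here is a minimal sketch of how Tokens per Joule can be derived from throughput and power readings. It is not taken from the telemetry suite in the post: the function names, the trapezoidal integration of power samples, and the sample values are all assumptions made for illustration. The only identity relied on is that one watt is one joule per second, so tokens/s divided by watts gives tokens/joule.

```python
from typing import Sequence, Tuple


def tokens_per_joule(tokens_per_second: float, avg_watts: float) -> float:
    """Tokens/Joule from steady-state throughput and average power.

    1 W = 1 J/s, so (tokens/s) / (J/s) = tokens/J.
    """
    if avg_watts <= 0:
        raise ValueError("avg_watts must be positive")
    return tokens_per_second / avg_watts


def energy_joules(samples: Sequence[Tuple[float, float]]) -> float:
    """Integrate (time_s, watts) samples with the trapezoidal rule.

    Hypothetical helper: assumes samples are sorted by time.
    """
    total = 0.0
    for (t0, p0), (t1, p1) in zip(samples, samples[1:]):
        total += 0.5 * (p0 + p1) * (t1 - t0)
    return total


def tokens_per_joule_from_run(
    total_tokens: int, samples: Sequence[Tuple[float, float]]
) -> float:
    """Tokens/Joule for a whole run: tokens generated / energy consumed."""
    joules = energy_joules(samples)
    if joules <= 0:
        raise ValueError("energy must be positive")
    return total_tokens / joules


if __name__ == "__main__":
    # Illustrative inputs only: assumes a run sustaining 22 t/s at an
    # average draw of 35 W.
    print(round(tokens_per_joule(22.0, 35.0), 3))  # 0.629

    # Made-up power trace: (seconds, watts).
    trace = [(0.0, 30.0), (1.0, 40.0), (2.0, 30.0)]
    print(energy_joules(trace))  # 70.0
```

Integrating sampled power over the run, rather than multiplying a single wattage reading by wall-clock time, is one way to keep the figure meaningful when power draw fluctuates during prefill and decode.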