I really wonder what they have up their sleeve, because the current generation isn’t very good for LLMs. 3-5 tokens per second is too slow for the money they are charging.