MacRumors

macrumors bot
Original poster
Apr 12, 2001
67,557
37,935


xAI has launched a new Voice Mode for its Grok chatbot, introducing a feature called Grok Vision that lets users interact with the world through their smartphone camera. Much like ChatGPT and Google Gemini, Grok can now interpret what your phone sees and respond in real time.

Using Grok Vision, iPhone users can point their camera at an object and ask "What am I looking at?" – and the chatbot will reply aloud with a context-aware response. The feature is now available on the Grok iOS app, but Android users will need to wait a little longer for access.

In addition to visual recognition, Voice Mode now supports multilingual audio, allowing users to converse with Grok in multiple languages. Real-time search is also integrated, giving the chatbot the ability to provide instant answers that reflect the latest information.

The enhancements follow last week's rollout of Grok's new memory feature, which enables the chatbot to recall past interactions – including user preferences and previously asked questions – to deliver more personalized replies and suggestions.


xAI also recently released the first version of Studio, providing a workspace for generating documents and code, similar to ChatGPT's Canvas. Studio opens in a separate window and is designed to give users a more focused environment for content creation.

Grok is available now as a free download on the App Store.

Article Link: Grok AI Gains Vision and Voice Features in iOS App
 
グロクは日本語を話す is not great Japanese. It's a literal word-for-word translation of "Grok speaks Japanese", but this verb formation doesn't have the connotation of "being able to" that "speaks" includes. It's not incorrect, but it's not very idiomatic, like if you heard someone say "I talk English."

This is a very basic phrase, so it doesn't inspire confidence in the abilities of the tool.
 
Haha.. This reminded me of Mrs. Doubtfire when Robin Williams says “I..am..job”
 
How does ChatGPT advanced voice mode compare?
 
グロクは日本語を話す is not great Japanese.
Yes, and it tells us that Grok is based on English and is then translating.

話せる would be grammatically better, but it is still not a natural-sounding sentence.
 
Is Grokは日本語を話します any better (using Microsoft Translate)?
 
I've got a Premium (mid-level tier) subscription on X. It's currently priced at 8 USD/month. It seems to give me Grok 3, which testers seem to rate as a top-level LLM. That's a lot less than other paid subscriptions. We'll see what happens, but allegedly the rate of improvement at xAI is extraordinarily rapid, in part because they have figured out how to connect lots of Nvidia GPUs together and to perform the physical installation at breakneck speed.
 
Given the resentment toward smartphone-focused people (especially the young, 'smartphone zombies,' etc.) and the outright animosity toward social media I often see in the culture, I read this bit, "...lets users interact with the world through their smartphone camera," and figure it will throw gas on the fire.

Maybe for a slogan they could say something like 'Grok...bringing the real world closer than ever before.'

Probably a good thing I didn't choose a career in marketing.
 
Cool, now I can ask AI whether "he is just a friend" or my wife's boyfriend.
Better yet, when it can read people like that, it can start advising you on how to placate your wife, like those relationship video shorts on Facebook do, and those articles, self-help books and whatnot.

You just point the phone at her when she looks in a snit about something, and it says things like:

1.) Ask about her feelings.
2.) Validate whatever she says (whatever that means).
3.) Look concerned.
4.) Don't offer any suggestions on how the problem might actually be fixed.

I've seen an ad for a plant app that you point at sick plants, and it tells you what to do. Why not?
 