ChatGPT Can Now See Images and Talk to Answer Your Questions

OpenAI has added voice and image capabilities to ChatGPT, its conversational AI tool, so users can now speak with the assistant and show it pictures. Here’s what you need to know about the new features.

Voice Conversations with ChatGPT

ChatGPT now incorporates a new text-to-speech model that lets users hold spoken conversations with the AI. The model generates human-like audio, and OpenAI worked with professional voice actors to create five distinct voices for ChatGPT. In the other direction, Whisper, OpenAI’s open-source speech recognition system, transcribes the user’s spoken words into text.

To access voice conversations, open Settings in the mobile app, go to New Features, and opt in to voice conversations. Once enabled, you can start a voice chat by tapping the headphone button in the top-right corner of the home screen and choosing one of the five voices.

Image Queries with ChatGPT

The other major addition is image understanding. Users can now show images to ChatGPT and ask questions about them. For example, travelers can photograph a landmark to learn more about it, and someone with a smartphone problem can share a picture of the issue to get troubleshooting guidance. These features are powered by the multimodal GPT-3.5 and GPT-4 models. To use the feature, tap the photo button to capture or choose an image; on Android and iOS, tap the plus button first.

The voice and image capabilities are initially available only to Plus and Enterprise users, with the rollout set to take place within two weeks. Voice chat is limited to the Android and iOS apps, while image input will be available on all platforms.

Partnerships and Future Plans

OpenAI has already begun working with other companies on these capabilities. Spotify, for instance, is using the underlying voice technology to pilot its Voice Translation feature, which translates podcasts into additional languages in the podcaster’s own voice so that podcasters can reach a broader audience. OpenAI has also partnered with Be My Eyes, an app that assists blind and low-vision users, to explore how ChatGPT’s image understanding can enhance the app.

OpenAI acknowledges the risks associated with these capabilities and says it has taken precautions to encourage responsible use. The voice technology has initially been released only for voice chat, while the image functionality was tested by red teamers and alpha testers before release to limit misuse. OpenAI plans to make these features available for free eventually.

Now it’s your turn to experience the new ChatGPT features. Let us know in the comments below how you plan to leverage voice conversations and image queries in your interactions with ChatGPT!