ChatGPT Now Supports Voice and Image Inputs for Answers

Published by: Insights Desk
Released: Sep 26, 2023
Source: DemandTalk

Highlights:

The voice feature uses a new text-to-speech AI model that can produce human-like audio from a brief speech sample.
The voice update will roll out to ChatGPT Plus and Enterprise users on iOS and Android as an opt-in feature over the next two weeks.

People have long been able to hold text-based conversations with ChatGPT, OpenAI LP’s AI-powered chatbot, but the company recently announced that the chatbot will soon be able to hold spoken conversations as well.

Users will also be able to take photographs and engage in back-and-forth dialogues with the chatbot in order to learn more about the subject of the image.

The voice chat feature uses OpenAI’s open-source Whisper speech recognition model to transcribe what a person says into text that the chatbot can process. To reply aloud, it employs a new text-to-speech model that can generate human-sounding audio from text and just a few seconds of sample speech.

OpenAI reported that its developers worked with professional voice actors to create a variety of voices for the new experience. There are five distinct voices with natural-sounding names: “Juniper,” “Ember,” “Sky,” “Cove,” and “Breeze.” They include both male and female voices, have clear diction and intonation, and are suitable for storytelling, reading the news, and general conversation.

OpenAI added that it is also collaborating with Spotify on the pilot of Spotify’s new Voice Translation feature, which uses the new voice model to let podcasters translate their podcasts into other languages in their own voices.

The voice feature will roll out as an opt-in option to ChatGPT Plus and Enterprise users on iOS and Android over the next two weeks. Users can enable it in the New Features section of the mobile app’s settings and then start a voice conversation by tapping the headphones button.

Conversations Concerning Images

With images, users will be able to get even more out of ChatGPT by photographing a scene, an object, or anything else and then asking the AI about it. They can then converse with the chatbot about what it sees in order to solve a difficult math problem, construct a crèche, learn about a landmark, or get directions.

For instance, a user could photograph the contents of their refrigerator and, if enough ingredients are visible, ask what they could prepare for dinner. They could walk down a store aisle and photograph items to get product information from ChatGPT for comparison shopping. A user could also photograph a grill that sat in the garage all winter and now won’t light, and ChatGPT could look up the manual and help get it working again.

This new capability goes beyond existing tools such as Google Lens, which provides a powerful image search that can identify what is in a photograph. Google DeepMind, the artificial intelligence (AI) division of Google LLC, has also developed an AI model for Lookout, an Android app for people with visual impairments. The model describes photographs and lets users ask follow-up questions.

OpenAI explained that its experience with Be My Eyes, a free mobile app powered by GPT-4, informed the company’s approach to developing the new image capabilities integrated into ChatGPT.

By connecting real-world images to internet queries and conversing with the chatbot about them, users will have access to brand-new capabilities, and OpenAI is evidently testing the limits of what its models can do.

Additionally, the company emphasized that there are privacy implications when people may be in view. What happens, for instance, if someone photographs a person about whom the AI has public information that presumably shouldn’t be disclosed? OpenAI stated that it has taken measures to restrict the model’s ability to analyze individuals and make direct statements about them in order to respect their privacy, especially given that ChatGPT is not always accurate.
