Synthetic Voices shared this demonstration video, which suggests that voice synthesis is quickly moving beyond the so-called ‘uncanny valley’ and becoming close to indistinguishable from real voices.
The idea of the ‘uncanny valley’ is that the closer a robot, animated character or computer-generated video gets to being realistic, the creepier it becomes, because you immediately sense that something isn’t quite right.
In this example, Steve Jobs – from beyond the grave – explains the revolutionary technology of ChatGPT. It’s a compelling demonstration, because Steve Jobs’ voice is very familiar to many, yet the artificially generated voice goes beyond uncanny valley territory to sound like it could be an actual recording.
ChatGPT is a large language model, developed by OpenAI, that can be used for natural language processing tasks such as text generation and language translation.
Check out the demo and share your thoughts on the state of the art of voice synthesis in the comments!
ChatGPT: text spoken, title and description.
30 thoughts on “Steve Jobs Demonstrates How Voice Synthesis Has Moved Beyond The Uncanny Valley”
i miss jobs influence on apple. their products have gone down the stupid hole after microsoft’s idiotic infinite feature chase. my iPad does so many stupid things I can’t turn off now.
AI – still a toy.
Eliminating headphone jacks IMHO was one of the stupidest hardware changes. Also IMHO wireless earbuds suck, their battery time sucks, they always fall out, lag when trying to use them for sampling, more stuff you have to charge, etc. JMOs based on how something worked previously, and then a company messes it up and tries to tell you it’s better. 🙁
I get why it’s frustrating, but less than 1% of customers need and use a headphone jack. I understand their decision in that context when factored with a tiny device
the last thing I want is another device that needs to be charged and that adds latency.
wireless headphones are a joke.
but it works for apple.
hello buy new shiny phone for 1200$
and here are your 200$ headphones that are over the hill after 2 years.
Obviously less than 1% use a headphone jack now that they’ve removed it… when it was there I bet it was up around 50%. I challenge your figure.
They killed the headphone jack when they released Airpods in 2016 / 2017… so I’ll concede a lil bit; def some forced adoption. However, it was the right move from a strategic position. Probably no surprise, but if I lost my Airpods tomorrow, I’d probably order a new pair the same day. https://youtu.be/NfLu7GRMR7g
It’s not like more “Lightning to xxx products” than you can ever possibly expect to use don’t exist. In the realm of Lightning-to-audio jack products, at least 10 currently exist. While the analog audio that came out of iDevices with headphone jacks was convenient and not too bad sounding, it really doesn’t compare with the audio extracted directly from the iDevice’s digital stream to either the current Lightning connector or the ancient 40-pin connector. If you’re serious about using iPads, for instance, there are mixers, MIDI interfaces, and audio interfaces (some combining all functions) available. Although I agree that audio delay in Bluetooth earbuds is there, it is getting better (and it really is almost non-existent with 5.2-5.2 devices). Anyway, it is fine for casual music listening, which is what I expect over 99% of iDevice users use it for. Also, for music listening, I have yet to find a set of corded earbuds that sound better than my Sony WF-1000XM4 Bluetooth earbuds, or even last year’s AirPods Pro.
the last iPad I bought is 48 khz and has a headphone jack.
and then came all that “we are trying to be a Sony tv with dolby atmos and you dont need a headphone jack non-sense”.
No offense but you are totally missing the point John. I think I was pretty clear in my previous post about why I hate wireless earbuds, if you like them fine. The best time I have gotten from my wireless earbuds is 2-3 hours max. When I am out in bumfuck Egypt taking photos I don’t need to worry about earbud battery power. When I am hiking up a mountain side and they fall out of your ears it sucks, yes I have the rubber ear string thing but it sucks, when I am sampling at the Preston Castle or some other cool spot I hate the sample lag, not to mention I used to have my spare battery charging my iPhone all day, and not taking away the convenience of a dedicated headphone jack that didn’t need adapters and more goofy cables.
As for sound quality my BOSE wired earbuds sounded great for my needs for many years. The truth is if you really depend on wireless earbuds you need at least 2 pairs so you can use one set while the other is charging. Again, when I am out landscape photographing somewhere in Nevada or Idaho or Canada, wireless earbuds suck on many levels for many reasons in MY world. If they work for others, fine.
I could list 10 more specific examples of how taking away a headphone jack sucks for my purposes but Apple could care less so I deal with it.
It’s like when they fucked up 3rd party plug-ins for iMovie or made it so you couldn’t give custom playlists on iTunes anymore or changed their cables every few years blah blah… my first Apple was a IIe running an alphaSyntauri system so I have been dealing with Apple nonsense for many years. Life goes on. Not here to argue or fight just sharing my real world experiences. If others like basic features being taken away and then being told how much better it is…fine. JMHOs
I wasn’t specifically responding to you, but it was easier to collectively respond at the level of your post. Still, for about $8.00-$10 on Amazon you can get the Lightning to stereo phone jack adapter (even the “official” Apple one is $9.00). I don’t specifically know about Android devices without phone jacks, but I assume there would be a USB C adapter that provides the same function. So, I’m not sure I see what the problem is. If you used an adapter, you would have the same latency as if there was a physical phone jack on the phone, better quality audio (out the jack) than that provided by the obsolete phone jacks (at least for iPhones), and battery life that is, pretty much, the same as the phone’s when it’s not actively doing something else. If that still wouldn’t serve your purpose, I guess I’m still missing something.
I agree that the 125 ms latency with the AirPods Pro might be too much for a lot of audio applications, and their 4+ hr battery life may be too short, but I wasn’t suggesting they would be adequate for everybody or for every application. Also, I began my reply with the Lightning adapter solution, for which I have yet to find an audio application problem it wouldn’t solve.
i agree. 10000%
why fuck with technology that had a heritage going back about 100+years (to telephone switchboards) and did the job? decades of headphones now cannot be used without a fucking dongle… which will break…
boooo… boooo. boooooooooooooo
Voice generator https://app.uberduck.ai/speak#mode=tts-basic&voice=steve-jobs
its not enough AI voodoo
It needs to speak English, French, German in one go and know what language it is reading
Fooled me. If I hadn’t been told it was AI generated, I wouldn’t have known. Can’t tell the difference with the real person. Phrasing, pauses, accentuation are all spot on.
I wonder if its still convincing if Steve Jobs is trying to sell me fresh pink chemical bathroom smells from the afterlife?
they are cheating with the noise in the recording here
that makes it sound more real
if the words start with the same letter it doesn’t work at all
“as you would with a friend”
“would with” sounds like shit
it sounds like timestretch
sounds like all the other voice stuff you heard, if it talks too slow you can’t make it talk faster …
the demo is 2 simple tricks :
1. the noise in the recording
2. they show you the text of the thing you are supposed to hear
with number 2 its very easy to make ppl think they hear whatever is suggested …
and the noise tries to hide the errors.
if its a voice demo and they show you written text it means it doesn’t work 😉
the written text and the pic of Steve looking at you
together with the assembled and timestretched words
with the noise
create a strong illusion
if you listen to the audio alone its really obvious
it sounds like shit
There are a number of what sounds like jump cuts, air that was taken out from between sentences.
listen up when it says
“as you would with a friend”
that’s timestretch!
no AI miracle
yes something with the breathing aint right too
sometimes it has these little glitches in-between
bllep i dont know where it comes from but it sounds very computerish
it’s all from timestretch in some way I guess
(i dont mean the big cuts)
sometimes they tried to make him talk faster
(I guess the original material was slower?)
that sounds like a computer at once
i didn’t watch the video. I *never* watch the videos. lol.
sadly, this stuff and AI will be used mostly to displace people from their jobs. so many daily examples of text to speech usage in telephony instead of basic narration. next book covers, ad copy and art, commercial spots, modeling; anything that doesn’t require a brain, or a paycheck will be targeted.
i, for one, always knew Jobs was a droid.
The speech is a little syncopated, but it does sound human. I never paid any attention to Steve Jobs so I can’t say if the emulation sounds like him. Definitely it’s not a good voice. They should have done Morgan Freeman, Vincent Price, Ian McKellen…. the list is never ending.
That’s very convincing and it will only get better. 10 years ago the attempts to do this sounded quite bad. 10 years from now nobody will be able to tell the difference between real and fake unless you work in the field or have a job in a national intelligence agency.