Roger Ebert: Remaking my voice
These are my words, but this is not my voice.
This is Alex, the best computer voice I've been able to find, which comes as standard equipment on every Macintosh. For most of my life, I never gave a second thought to my ability to speak. It was like breathing. In those days, I was living in a fool's paradise. After surgeries for cancer took away my ability to speak, eat or drink, I was forced to enter this virtual world in which a computer does some of my living for me. For several days now, we have enjoyed brilliant and articulate speakers here at TED.
I used to be able to talk like that. Maybe I wasn't as smart, but I was at least as talkative. I want to devote my talk today to the act of speaking itself, and how the act of speaking or not speaking is tied so indelibly to one's identity as to force the birth of a new person when it is taken away. However, I've found that listening to a computer voice for any great length of time can be monotonous. So I've decided to recruit some of my TED friends to read my words aloud for me. I will start with my wife, Chaz. Chaz Ebert: "It was Chaz who stood by my side through three attempts to reconstruct my jaw and restore my ability to speak.
Going into the first surgery for a recurrence of salivary cancer in 2006, I expected to be out of the hospital in time to return to my movie review show, 'Ebert and Roeper at the Movies.' I had pre-taped enough shows to get me through six weeks of surgery and recuperation. The doctors took a fibula bone from my leg and some tissue from my shoulder to fashion into a new jaw. My tongue, larynx and vocal cords were still healthy and unaffected. (Laughter)
"I was optimistic, and all was right with the world.
The first surgery was a great success. I saw myself in the mirror and I looked pretty good. Two weeks later, I was ready to return home. I was using my iPod to play the Leonard Cohen song 'I'm Your Man' for my doctors and nurses. Suddenly, I had an episode of catastrophic bleeding. My carotid artery had ruptured. Thank God I was still in my hospital room and my doctors were right there. Chaz told me that if that song hadn't played for so long, I might have already been in the car, on the way home, and would have died right there and then. So thank you, Leonard Cohen, for saving my life. (Applause)
"There was a second surgery -- which held up for five or six days and then it also fell apart.
And then a third attempt, which also patched me back together pretty well, until it failed. A doctor from Brazil said he had never seen anyone survive a carotid artery rupture. And before I left the hospital, after a year of being hospitalized, I had seven ruptures of my carotid artery. There was no particular day when anyone told me I would never speak again; it just sort of became obvious.
Human speech is an ingenious manipulation of our breath within the sound chamber of our mouth and respiratory system. We need to be able to hold and manipulate that breath in order to form sounds. Therefore, the system must be essentially airtight in order to capture air. Because I had lost my jaw, I could no longer form a seal, and therefore my tongue and all of my other vocal equipment were rendered powerless." Dean Ornish: "At first, for a long time, I wrote messages in notebooks.
Then I tried typing words on my laptop and using its built-in voice. This was faster, and nobody had to try to read my handwriting. I tried out various computer voices that were available online, and for several months I had a British accent, which Chaz called Sir Lawrence." (Laughter) "It was the clearest I could find. Then Apple released the Alex voice, which was the best I'd heard. It knew things like the difference between an exclamation point and a question mark. When it saw a period, it knew how to make a sentence sound like it was ending instead of staying up in the air. There are all sorts of HTML codes you can use to control the time and inflection of computer voices, and I've experimented with them. For me, they share a fundamental problem: they're too slow. When I find myself in a conversational situation, I need to type fast and to jump right in. People don't have the time or the patience to wait for me to fool around with the codes for every word or phrase. But what value do we place on the sound of our own voice?
How does that affect who you are as a person? When people hear Alex speaking my words, do they experience a disconnect? Does that create a separation or a distance from one person to the next? How did I feel not being able to speak? I felt, and I still feel, a lot of distance from the human mainstream. I've become uncomfortable when I'm separated from my laptop. Even then, I'm aware that most people have little patience for my speaking difficulties. So Chaz suggested finding a company that could make a customized voice using my TV show voice from a period of 30 years.
At first I was against it. I thought it would be creepy to hear my own voice coming from a computer. There was something comforting about a voice that was not my own. But I decided then to just give it a try. So we contacted a company in Scotland that created personalized computer voices. They'd never made one from previously recorded materials. All of their voices had been made by a speaker recording original words in a control booth. But they were willing to give it a try. So I sent them many hours of recordings of my voice, including several audio commentary tracks that I'd made for movies on DVDs.
And it sounded like me, it really did. There was a reason for that; it was me. But it wasn't that simple. The tapes from my TV show weren't very useful because there were too many other kinds of audio involved -- movie soundtracks, for example, or Gene Siskel arguing with me." (Laughter) "And my words often had a particular emphasis that didn't fit into a sentence well enough. I'll let you hear a sample of that voice.
These are a few of the comments I recorded for use when Chaz and I appeared on the Oprah Winfrey program. And here's the voice we call Roger Jr. or Roger 2.0." Roger 2.0: Oprah, I can't tell you how great it is to be back on your show.
We have been talking for a long time, and now here we are again. This is the first version of my computer voice. It still needs improvement, but at least it sounds like me and not like HAL 9000. When I heard it the first time, it sent chills down my spine. When I type anything, this voice will speak whatever I type. When I read something, it will read in my voice. I have typed these words in advance, as I didn't think it would be thrilling to sit here watching me typing. The voice was created by a company in Scotland named CereProc.
It makes me feel good that many of the words you are hearing were first spoken while I was commenting on 'Casablanca' and 'Citizen Kane.' This is the first voice they've created for an individual. There are several very good voices available for computers, but they all sound like somebody else, while this voice sounds like me. I plan to use it on television, radio and the internet. People who need a voice should know that most computers already come with built-in speaking systems. Many blind people use them when they read pages on the Web to themselves. But I've got to say, in first grade, they said I talked too much, and now I still can. (Laughter)
Roger Ebert: As you can hear, it sounds like me, but the words jump up and down.
The flow isn't natural. The good people in Scotland are still improving my voice, and I'm optimistic about it. But so far, the Apple Alex voice is the best one I've heard. I wrote a blog about it and actually got a comment from the actor who played Alex. He said he recorded many long hours in various intonations to be used in the voice. A very large sample is needed. John Hunter: "All my life I was a motormouth.
Now I have spoken my last words, and I don't even remember for sure what they were. I feel like the hero of that Harlan Ellison story titled 'I Have No Mouth, and I Must Scream.' On Wednesday, David Christian explained to us what a tiny instant the human race represents in the time-span of the universe. For almost all of its millions and billions of years, there was no life on Earth at all. For almost all the years of life on Earth, there was no intelligent life. Only after we learned to pass knowledge from one generation to the next did civilization become possible. In cosmological terms, that was about 10 minutes ago. Finally came mankind's most advanced and mysterious tool, the computer. That has mostly happened in my lifetime. Some of the famous early computers were being built in my hometown of Urbana, the birthplace of HAL 9000.
When I heard the amazing Talk by Salman Khan on Wednesday, about the Khan Academy website that teaches hundreds of subjects to students all over the world, I had a flashback. It was about 1960. As a local newspaper reporter still in high school, I was sent over to the computer lab of the University of Illinois to interview the creators of something called PLATO. The initials stood for Programmed Logic for Automated Teaching Operations. This was a computer-assisted instruction system, which in those days ran on a computer named ILLIAC. The programmers said it could assist students in their learning. I doubt, on that day 50 years ago, they even dreamed of what Salman Khan has accomplished.
But that's not the point. The point is PLATO was only 50 years ago, an instant in time. It continued to evolve and operated in one form or another on more and more sophisticated computers, until only five years ago. I have learned from Wikipedia that, starting with that humble beginning, PLATO established forums, message boards, online testing, email, chat rooms, picture languages, instant messaging, remote screen sharing and multiple-player games. Since the first Web browser was also developed in Urbana, it appears that my hometown in downstate Illinois was the birthplace of much of the virtual, online universe we occupy today.
But I'm not here from the Chamber of Commerce." (Laughter) "I'm here as a man who wants to communicate. All of this has happened in my lifetime.
I started writing on a computer back in the 1970s when one of the first Atex systems was installed at the Chicago Sun-Times. I was in line at Radio Shack to buy one of the first Model 100s. And when I told the people in the press room at the Academy Awards that they better install some phone lines for Internet connections, they didn't know what I was talking about. When I bought my first desktop, it was a DEC Rainbow. Does anybody remember that?" (Applause) "The Sun-Times sent me to the Cannes Film Festival with a portable computer the size of a suitcase named the Porteram Telebubble. I joined CompuServe when it had fewer members than I currently have followers on Twitter." (Laughter)
CE: "All of this has happened in the blink of an eye.
It is unimaginable what will happen next. It makes me incredibly fortunate to live at this moment in history. Indeed, I am lucky to live in history at all, because without intelligence and memory there is no history. For billions of years, the universe evolved completely without notice. Now we live in the age of the Internet, which seems to be creating a form of global consciousness. And because of it, I can communicate as well as I ever could. We are born into a box of time and space. We use words and communication to break out of it and to reach out to others. For me, the Internet began as a useful tool and now has become something I rely on for my actual daily existence.
I cannot speak; I can only type so fast. Computer voices are sometimes not very sophisticated, but with my computer, I can communicate more widely than ever before. I feel as if my blog, my email, Twitter and Facebook have given me a substitute for everyday conversation. They aren't an improvement, but they're the best I can do. They give me a way to speak. Not everybody has the patience of my wife, Chaz. But online, everybody speaks at the same speed. This whole adventure has been a learning experience.
Every time there was a surgery that failed, I was left with a little less flesh and bone. Now I have no jaw left at all. While harvesting tissue from both my shoulders, the surgeries left me with back pain and reduced my ability to walk easily. Ironic that my legs are fine, and it's my shoulders that slow up my walk. When you see me today, I look like the Phantom of the Opera." But no, you don't.
(Laughter)
(Applause)
"It is human nature to look at someone like me and assume I have lost some of my marbles.
People --" (Applause) "People talk loudly --" I'm so sorry.
Excuse me. (Applause)
"People talk loudly and slowly to me.
Sometimes they assume I am deaf. There are people who don't want to make eye contact." Believe me, he didn't mean this as -- anyway, let me just read it.
(Laughter) You should never let your wife read something like this. (Laughter)
"It is human nature to look away from illness.
We don't enjoy a reminder of our own fragile mortality. That's why writing on the Internet has become a life-saver for me. My ability to think and write has not been affected. And on the Web, my real voice finds expression. I have also met many other disabled people who communicate this way. One of my Twitter friends can type only with his toes. One of the funniest blogs on the Web is written by a friend of mine named Smartass Cripple." (Laughter) "Google him and he will make you laugh. All of these people are saying, in one way or another, that what you see is not all you get. So I have not come here to complain.
I have much to make me happy and relieved. I seem, for the time being, to be cancer-free. I am writing as well as ever. I am productive. If I were in this condition at any point before a few cosmological instants ago, I would be as isolated as a hermit. I would be trapped inside my head. Because of the rush of human knowledge, because of the digital revolution, I have a voice, and I do not need to scream." RE: Wait.
I have one more thing to add. A guy goes into a psychiatrist. The psychiatrist says, "You're crazy." The guy says, "I want a second opinion." The psychiatrist says, "All right, you're ugly." (Laughter)
You all know the test for artificial intelligence -- the Turing test.
A human judge has a conversation with a human and a computer. If the judge can't tell the machine apart from the human, the machine has passed the test. I now propose a test for computer voices -- the Ebert test. If a computer voice can successfully tell a joke and do the timing and delivery as well as Henny Youngman, then that's the voice I want. (Applause)