Perched atop my computer is a shiny, high-tech video camera. Through the miracles of modern technology, I can have live video chats with friends or business associates on the other side of the country or the other side of the world, without even paying long-distance phone charges. Although I could opt for an audio-only conversation or even the text-only format of email or instant messaging, there’s something about seeing another person’s face that makes communication much richer and more satisfying. Using similar technology, I’ve participated in countless videoconferences involving multiple people in each of two or more locations, using cameras mounted on large video monitors and special microphones so that we can all see and hear each other. This is all good. But there’s one thing about the current state of the art in video communication that still bothers me greatly: the inability to make eye contact with the person or people on the other end. This was never a problem on Star Trek, which was of course the source of all my technological expectations.
Look at Me When You Say That
If you have ever tried video chats or videoconferencing yourself, you undoubtedly know what I mean. If not, let me describe what’s going on. The camera that’s pointing at your face is positioned above, below, or to the side of your display. This means the angle at which you’re viewing the screen is different from the angle at which the camera (and therefore the person on the other end) sees you—an effect known as parallax. Only if you were looking directly into the camera would the viewer have the impression you’re looking into his or her eyes. As a result, while you see your friend’s image on the screen, your friend appears to be looking down (or in some direction other than right at you), and you appear the same way on your friend’s screen. You could, of course, position the camera directly in front of your own screen, but then, the camera itself would block your view of the person on the other end.
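The size of this parallax effect is easy to estimate with a bit of trigonometry. Here's a rough sketch (the measurements are assumptions chosen for illustration, not figures from any particular camera or monitor):

```python
import math

def parallax_angle_deg(camera_offset_cm, viewing_distance_cm):
    """Angle between your line of sight to the screen and the camera's
    line of sight to you, given how far the camera sits from where you look."""
    return math.degrees(math.atan2(camera_offset_cm, viewing_distance_cm))

# Hypothetical setup: camera mounted 10 cm above the point on the screen
# you're looking at, with you sitting 60 cm from the display.
print(round(parallax_angle_deg(10, 60), 1))  # about 9.5 degrees
```

Even that modest-sounding angle is enough to make you appear to be gazing somewhere below the other person's eyes, which is exactly the effect described above.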
Eye contact is extremely important for meaningful communication, and after all, seeing the person you’re talking to is the whole point of videoconferencing. But if you can’t look that person in the eye, you lose much of the advantage video has over a regular phone call. Guides to effective business videoconferencing usually say you should look at the camera when speaking, to give the people on the other side the sense that you’re speaking directly to them. But this is unnatural, and it prevents you from seeing their reactions as you speak. What we really need is exactly what they have on the starship Enterprise: video displays that also somehow function as cameras, such that wherever you direct your gaze on the screen, that’s where your eyes will appear to be looking on the other end. Sure enough, engineers are trying to achieve this effect right now, working from several different angles (as it were).
It’s All Done with Mirrors
One fairly easy way to get eye-to-eye contact over a video link is to use technology borrowed from the television industry: the teleprompter. If you watch a news broadcast on TV, you’ll notice that the announcer is looking directly at the camera. TV news anchors don’t memorize their reports in advance; they read them from a special video screen that appears to be directly in front of the camera. In reality, the screen (an ordinary CRT or LCD) is positioned face-up just below and in front of the camera lens, with its text displayed as a mirror image. Above this display, and thus in front of the lens, is a partially silvered (or two-way) mirror positioned at a 45° angle. The announcer sees the text reflected in it from below, while the camera, shooting through the mirror, sees only the announcer.
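That mirror-image display is simple to produce in software: flip the video feed horizontally before it reaches the screen, so the 45° mirror re-reverses it for the announcer. A minimal sketch, representing a frame as a list of pixel rows (the function name is mine, not any real teleprompter's API):

```python
def mirror_frame(frame):
    """Flip each row of a raster frame left-to-right, the way a teleprompter
    mirrors its feed before the angled glass reverses it back."""
    return [row[::-1] for row in frame]

frame = [[1, 2, 3],
         [4, 5, 6]]
print(mirror_frame(frame))  # [[3, 2, 1], [6, 5, 4]]
```

Note that flipping twice returns the original frame, which is why the reflected text reads correctly to the announcer.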
Teleprompters are a simple and tested technology; they’ve been around for more than 50 years. When similar designs are used for video communication, they’re sometimes referred to as video tunnels. They do, however, have some problems. One issue is size: the equipment is by nature quite bulky, because it requires that angled mirror in front of the camera as well as special shielding to protect the camera from glare. So even a design that uses an LCD panel will end up being at least as large as a CRT display. Teleprompters also tend to be heavy, fragile, and expensive—all factors that make them unattractive for ordinary consumers.
There’s yet another problem, which comes into play when more than two people are involved in a videoconference. If I look directly into a camera, all the people who see me on the screen will perceive that I’m making eye contact, even if they’re in different locations. So the participants will not have the impression that my gaze shifts as I turn my attention from one person to another—nor can I tell who is looking at me (or my image) at any given time.
Just Like Being There
One solution to the problem of gaze direction, being developed by researchers at Keio University in Tokyo, is called MAJIC (multi-attendant joint interface for collaboration). This system replaces the two-way glass mirror of the teleprompter with a large, curved screen made of a thin, perforated material that is reflective on one side and mostly transparent from the other. Cameras behind the screen record the participants in one location, while ordinary video projectors display the images of other participants (in one or more locations) on the front of the screen. What’s unique about MAJIC is that behind each person’s image on the screen in each location is a separate camera that functions as that person’s virtual eyes for that location (along with a speaker to reproduce the person’s voice). The result is that I always appear to be looking at whichever participant I’m facing at the moment, and I can even tell when one participant is looking at another. An additional bonus: the life-size projections make it feel as though you’re really sitting across a table from the other participants. This is pretty much the effect we’re all looking for, but with all those cameras and projectors, the cost of such a system is quite high, and it also uses a tremendous amount of bandwidth to transmit all that video data. Not quite what we need for desktop or laptop use.
A very different approach, called gaze correction, is being studied by researchers at major companies such as HP, Microsoft, and AT&T. It starts with one or two ordinary video cameras mounted near a conventional computer display. A special video processor digitally modifies the image of each person’s face in real time so that it appears that his or her eyes are looking straight at the camera, even though they’re not. Early demonstrations of these systems appear relatively convincing—maybe even a bit spooky—but they are not yet ready for commercial use. They also have not yet been adapted to work well with multiple participants in a single location, or to permit selective eye contact with just one of several remote participants.
Yet another method of correcting for gaze is a system called GAZE-2, under development at Queen’s University in Kingston, Ontario. GAZE-2 uses multiple cameras in a video tunnel design along with a device that tracks eye movement. The system detects which part of the screen a user is looking at (corresponding to one of several remote participants), then automatically switches to the camera behind that part of the screen—thus ensuring that each participant is always making eye contact with the others, regardless of which image any person is looking at. The system can even rotate the images of other participants on the screen to show who is looking at whom.
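The camera-switching logic at the heart of GAZE-2 can be sketched in a few lines. This is my own illustration of the idea, not the actual system's code; the screen layout and camera names are assumptions:

```python
def pick_camera(gaze_x, regions):
    """Return the id of the camera mounted behind the screen region that
    contains the tracked gaze position, or None if the gaze is off-screen."""
    for left, right, camera_id in regions:
        if left <= gaze_x < right:
            return camera_id
    return None  # off-screen; a real system would keep the current camera

# Three remote participants tiled left to right across a 1920-pixel-wide
# screen, each with a camera behind their portion of the display.
regions = [(0, 640, "cam_left"), (640, 1280, "cam_center"), (1280, 1920, "cam_right")]
print(pick_camera(1000, regions))  # gaze on the middle participant: cam_center
```

The key design point is that the eye tracker, not the user, does the work: whichever image you look at, the system quietly selects the camera that makes you appear to be looking back.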
It’s exciting to see progress being made, but I’d still like to see a slim, flat-panel display with cameras mounted invisibly inside and seamless gaze correction for any number of users. I have an idea for a completely novel design that just might provide all that, but it would require a few hundred thousand dollars and several months of experimentation to prove the concept and develop a working prototype. Since those resources are far beyond my means for the foreseeable future, my idea will have to remain speculation for now. But with or without my help, I expect to see eye-to-eye video displays long before starships. —Joe Kissell
UPDATE #1: In January 2006, Apple was awarded a patent for an eye-to-eye video system in which a large array of microscopic cameras is embedded in a monitor along with the display elements; software combines all these thousands or millions of images into a single picture. Time will tell if, when, or in what form this technology becomes available to consumers. Apple’s approach wasn’t quite the idea I had in mind, but it’s nice to see that they have been worrying about the same problem and applying their considerable resources to solving it.
UPDATE #2: Bodelin Technologies’ See Eye 2 Eye (SE2E), introduced in 2007, brings teleprompter-like features to most desktop and laptop computers with either built-in or add-on video cameras. The SE2E is inexpensive (US$50–60) and relatively compact, but also has a rather small viewing area.