I don’t have any images from my Project Starline experience. Google had a strict “no photos, no videos” policy in place. No colleagues, either. Just me in a dark meeting room on the Shoreline Amphitheater grounds in Mountain View. You walk in and sit down in front of a table. In front of you is what looks like a big flat-screen TV.
A lip below the screen extends out in an arc, encased in a speaker. There are three camera modules on the screen’s edges, one on top and one flanking each side. They look a bit like Kinects, in that way all modern stereoscopic cameras seem to.
The all-too-brief seven-minute session is effectively an interview. A soft, blurry figure walks into frame and sits down, as the image’s focus sharpens. It appears to be both a privacy setting and a chance for the system to calibrate its subject. One of the key differences between this Project Starline prototype and the one Google showed off late last year is a dramatic reduction in hardware.
The team has cut the number of cameras from “several” to a few and dramatically shrunk the overall system, which previously resembled one of those diner booths. The trick here is developing a real-time 3D model of a person with far fewer camera angles. That’s where AI and ML step in, filling in the gaps in data, not entirely unlike the way the Pixel approximates backgrounds with tools like Magic Eraser, albeit with a three-dimensional render.
After my interview subject, a member of the Project Starline team, appears, it takes a bit of time for the eyes and brain to adjust. It’s a convincing hologram, especially for one being rendered in real time, with roughly the same sort of lag you would experience on a plain old two-dimensional Zoom call.
You’ll notice something a bit…off. Humans tend to be the most difficult subjects to render convincingly. We’ve evolved over millennia to identify the slightest deviation from the norm. I throw out the term “twitching” to describe the subtle movement on parts of the subject’s skin. He, more accurately, calls them “artifacts.” These are little instances the system didn’t quite nail, likely due to limitations in the data being collected by the on-board sensors. This includes portions lacking visual information, which appear as though the artist has run short on paint.
A lot of your own personal comfort level comes down to adjusting to this new presentation of digital information. Generally speaking, when most of us talk to another person, we don’t spend the entire conversation fixated on their corporeal form. You focus on the words and, if you’re attuned to such things, the subtle physical cues we drop along the way. Presumably, the more you use the system, the less calibration your brain requires.
Quoting from a Google research publication on the technology:
Our system achieves key 3D audiovisual cues (stereopsis, motion parallax, and spatialized audio) and enables the full range of communication cues (eye contact, hand gestures, and body language), yet does not require special glasses or body-worn microphones/headphones. The system consists of a head-tracked autostereoscopic display, high-resolution 3D capture and rendering subsystems, and network transmission using compressed color and depth video streams. Other contributions include a novel image-based geometry fusion algorithm, free-space dereverberation, and talker localization.
Effectively, Project Starline gathers information and presents it in a way that creates the perception of depth (stereopsis), taking advantage of the two spaced-out biological cameras in our skulls. Spatial audio, meanwhile, serves a similar function for sound, calibrating the speakers to give the impression that the other person’s voice is coming out of their virtual mouth.
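For the curious, the geometry behind stereopsis can be sketched in a few lines. This is a minimal illustration using the textbook pinhole stereo model, not Google's actual pipeline, and every name and number here is an assumption for illustration: the 63 mm eye spacing is a commonly cited average, and the focal length in pixels is arbitrary. The point is simply that nearby objects shift more between the two viewpoints than distant ones, and that shift is the depth cue.

```python
import math

# Assumed values for illustration only; not taken from Project Starline.
EYE_BASELINE_M = 0.063   # rough average spacing between human eyes
FOCAL_PX = 1000.0        # arbitrary focal length, expressed in pixels


def disparity_px(focal_px: float, baseline_m: float, depth_m: float) -> float:
    """Horizontal shift, in pixels, of a point seen from two parallel pinhole views."""
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * baseline_m / depth_m


def depth_from_disparity(focal_px: float, baseline_m: float, disparity: float) -> float:
    """Invert the relation above: recover depth from a measured disparity."""
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity


# A face half a meter away versus a wall two meters away.
near = disparity_px(FOCAL_PX, EYE_BASELINE_M, 0.5)
far = disparity_px(FOCAL_PX, EYE_BASELINE_M, 2.0)

print(f"near object disparity: {near:.1f} px")
print(f"far object disparity:  {far:.1f} px")
print(f"recovered near depth:  {depth_from_disparity(FOCAL_PX, EYE_BASELINE_M, near):.2f} m")
```

A display that shows each eye a view with the right disparity lets the brain do the rest of the work, which is why no glasses are required in principle.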
Google has been testing this specific prototype version for some time now with WeWork, T-Mobile and Salesforce, presumably the sorts of big corporate clients that would be interested in such a thing. The company says much of the feedback revolves around how true to life the experience is versus things like Google Meet, Zoom and Webex, platforms that saved our collective butts during the pandemic but still have plenty of limitations.
You’ve likely heard people complain (or complained yourself) about the things we lost as we moved from the in-person meeting to virtual. It’s an objectively true sentiment. Obviously Project Starline is still very much a virtual experience, but it can probably trick your brain into believing otherwise. For the sake of a workplace meeting, that’s frankly probably more than enough.
There’s no timeline here and no pricing. Google referred to it as a “technology project” during our meeting. Presumably the ideal outcome for all of the time and money spent on such a project is a saleable product. The eventual size and likely pricing will almost certainly put it out of reach for most of us. I could see a more modular version of the camera system that clips onto the side of a TV or computer doing well.
For most people in most situations, it’s overkill in its current form, but it’s easy to see how Google could well be pointing to the future of teleconferencing. It certainly beats your bosses making you take calls in an unfinished metaverse.
“Project Starline is the coolest work call you’ll ever take” by Brian Heater was originally published on TechCrunch.