The Critical 'I'

Read. React. Repeat.

Thursday, April 15, 2004

There's been lots of news about search technology lately: Amazon's entry, Google's email/search hybrid offering (accompanied by uncharacteristic criticism of the fair-haired Google), and even the possibility of search-based operations displacing conventional computer operating systems. But perhaps the most compelling development in this field is the emergence of next-generation search paradigms that eschew text queries in favor of visual, three-dimensional inputs, currently being eyed for industrial and technical use.

This new kind of search is being pioneered at two academic centers: Purdue University, under the direction of Karthik Ramani, and Princeton University, under Thomas Funkhouser. The Princeton link is especially interesting, because it offers a Web-based utility that lets you try out this visual search engine.

So how can computer programs look for objects? The breakthrough is the voxel.

Digital camera owners are familiar with pixels -- the basic element of a digital image. Each pixel is a tiny grain of color.

Similarly, a voxel is the basic element of a three-dimensional object that is represented in a computer. Each voxel represents the volume of the object at any given point.

In Ramani's program, for example, stored CAD designs and entries sketched by users are converted into voxels. Then voxel patterns are compared for similarities. Because the voxels represent volume rather than just shape, the program can sniff out, say, a coffee cup, which is mostly hollow but might have a solid handle.
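
To make the voxel comparison concrete, here is a minimal sketch in Python. It is not Ramani's actual method, which isn't spelled out here. It assumes a small 16x16x16 grid, three made-up shapes (`cup`, `thick_cup`, `solid_cylinder`), and a simple overlap score (intersection over union of filled voxels) standing in for whatever similarity measure the real system uses. Every name in it is hypothetical.

```python
from itertools import product

GRID = 16  # voxels per side; a hypothetical resolution chosen for speed


def voxelize(inside, size=GRID):
    """Return the set of (x, y, z) cells whose centers lie inside the shape."""
    cells = set()
    for x, y, z in product(range(size), repeat=3):
        # Map the cell center into the unit cube before testing it.
        cx, cy, cz = ((v + 0.5) / size for v in (x, y, z))
        if inside(cx, cy, cz):
            cells.add((x, y, z))
    return cells


def similarity(a, b):
    """Overlap of two voxel sets: 1.0 means identical, 0.0 means disjoint."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def radial2(x, y):
    """Squared distance from the vertical axis through the cube's center."""
    return (x - 0.5) ** 2 + (y - 0.5) ** 2


# Illustrative shapes only: each is a test for "is this point solid?"
def cup(x, y, z):
    wall = 0.3 ** 2 <= radial2(x, y) <= 0.4 ** 2
    base = radial2(x, y) <= 0.4 ** 2 and z < 0.15
    return wall or base


def thick_cup(x, y, z):
    wall = 0.25 ** 2 <= radial2(x, y) <= 0.4 ** 2
    base = radial2(x, y) <= 0.4 ** 2 and z < 0.15
    return wall or base


def solid_cylinder(x, y, z):
    return radial2(x, y) <= 0.4 ** 2


cup_v = voxelize(cup)
thick_cup_v = voxelize(thick_cup)
solid_v = voxelize(solid_cylinder)

print("cup vs. thick cup:     ", round(similarity(cup_v, thick_cup_v), 3))
print("cup vs. solid cylinder:", round(similarity(cup_v, solid_v), 3))
```

Seen from the outside, all three shapes have the same silhouette. Because the voxels record which parts are filled, the hollow cup scores closer to the other hollow cup than to the solid cylinder, which is the distinction the coffee-cup example is getting at.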

Princeton's Funkhouser believes 3-D searching should get even smarter. He believes the systems ought to learn from their users' queries and eventually recognize common patterns. A computer could eventually recognize that several different images all show a human, even if the people are in different poses.

For the foreseeable future, 3-D searching is likely to see only specialized business uses. However, Peter Norvig, Google's director of search quality, calls the technology "interesting" and adds, "If it starts to take off, we'll look more seriously at it."

Ramani is still fine-tuning the interface of his 3-D search engine, which is to be licensed by Imaginestics, where he is chief scientist. But he is already excited about the improvements in productivity that could result when objects, not just words, are accessible through computers.

"I think this," he said, "is the beginning of the information age."

What are the long-term implications of this? I think it could mark the very earliest steps in the replacement of language and communication as we now know them.

Think about it: What's the purpose of an alphabet and language? Fundamentally, it's to pass along ideas and information from one person to another. At root, words are representations of real-world concepts, designed to get the message across as effectively as possible. But no matter how precisely we structure our language, we can only approximate our intended meanings. That's the nature of language, and its limitation. That limitation becomes apparent in Internet search interfaces, which by necessity are grounded in text searching.

This new approach of using a generated image as the search input bypasses the inefficiency of expressing a query in text--or, more directly, in language. A visual representation does the job better. It's still dealing in the world of representation, in place of the original thing/idea, but it's more direct than words (reinforcing the "picture worth a thousand words" maxim). As this concept spreads, it could make the use of traditional language--written and even oral--superfluous.

What's the end result? Imagine a future point where ubiquitous digital input/output devices that we all carry, or have constant access to, are the instruments of interpersonal communication and information retrieval. Such devices would be ideal for getting across what people are trying to communicate in truer form than any current alphabet, pronunciation or other representation can. It would be an unrecognizable revolution in human interaction. If these new search frontiers are the baby steps, the revolution may not be too far off.