Daily Payload

ITU-T Leading Work on Telepresence Standardization with H.323

January 15, 2011

For those in the videoconferencing business, mass-market adoption of the technology has been a very long time coming. From the 1964 World’s Fair, where a concept of videoconferencing was introduced, until today, adoption of video has been very slow. A number of factors held it back, not the least of which were cost and technological limits.

During the first decade of the 21st century, we started to see an explosion of IP-based videoconferencing solutions. Today, videoconferencing functionality is appearing everywhere. A growing number of mobile phones have videoconferencing capabilities, calling services like Skype support videoconferencing, and, perhaps most important, all of the major telephony equipment manufacturers and service providers offer videoconferencing solutions.

What was once a vertical industry no longer is. Mainstream phones are equipped with videoconferencing capabilities. Grandparents now use the technology to see their grandchildren, rather than just hear their voices. Teens use the technology to keep in touch with their friends. And, of course, businesses use the technology to hold more intimate meetings with colleagues, customers, and partners.

The barriers to adoption have not disappeared entirely, but they are starting to fall. Two major barriers remain to be crossed before we can claim success in the videoconferencing industry: bandwidth and interoperability.

The Internet is fast, but there is not enough bandwidth to support high-quality video communications between all potential users. So, industry must invest more heavily in Internet infrastructure. The need for more bandwidth is not unique to the videoconferencing industry, though. We are starting to see huge amounts of bandwidth consumed by all kinds of applications. Virtually all mobile phones put bandwidth-consuming capabilities in the palm of users’ hands. More and more users rely on video streaming services like Netflix to watch movies and Hulu to watch TV. More and more people are enjoying music through streaming services like Pandora, Last.fm, and SHOUTcast. Online multi-player, graphics-intensive videogames with integrated VoIP are wildly popular. There are literally millions of video streams flowing from YouTube all of the time.

With so many applications on the Internet and with bandwidth usage ever-increasing, carriers are (or should be) working to increase bandwidth both in the network core and at the access points. While many of the applications are tolerant of network delays, real-time interactive voice and video are not. One can buffer YouTube or Netflix video and it plays quite nicely, but one cannot buffer a real-time video call with friends or family.
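To put rough numbers on that difference, here is a minimal Python sketch. The 150 ms one-way budget follows the commonly cited ITU-T G.114 guidance for interactive conversation; the 80 ms network delay, both buffer sizes, and the function name are made-up values chosen purely for illustration.

```python
# Illustrative one-way delay budget for interactive media, in milliseconds.
# 150 ms follows commonly cited ITU-T G.114 guidance; treat it as an assumption.
INTERACTIVE_BUDGET_MS = 150


def fits_interactive_budget(network_delay_ms, buffer_ms,
                            budget_ms=INTERACTIVE_BUDGET_MS):
    """Return True if network delay plus receiver buffering stays within budget."""
    return network_delay_ms + buffer_ms <= budget_ms


# A streaming player can hide jitter behind seconds of buffered video,
# but that much buffering would never fit a live conversation.
five_second_buffer_fits = fits_interactive_budget(network_delay_ms=80, buffer_ms=5000)

# A live call can only afford a small jitter buffer (hypothetical 40 ms).
small_jitter_buffer_fits = fits_interactive_budget(network_delay_ms=80, buffer_ms=40)

print(five_second_buffer_fits, small_jitter_buffer_fits)
```

The point is not the exact figures but the asymmetry: a one-way stream can trade seconds of startup delay for smooth playback, while a conversation has only a fraction of a second to spend on the network and the buffer combined.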

The second major hurdle is interoperability. Apple’s FaceTime is nice, but it works only with Apple products. Skype is cool, but works only with other Skype users. Much of the technology used is the same, and some of it might even be “standards-based,” but the products are nonetheless not interoperable with one another. At least, they are not interoperable unless one product supports the proprietary behavior of the other.

In the consumer space, this is perhaps much less of an issue than in the enterprise space. If one buys a Telepresence system for business, one should rightfully expect that product to interoperate with Telepresence systems elsewhere. What value is there in a videoconferencing product that cannot communicate with other systems?

H.323 is the dominant standard for videoconferencing. It defines everything a vendor needs to build interoperable videoconferencing systems, and it is continually updated: the most recent version of the standard was published in 2009.

Unfortunately, Telepresence systems are a more recent class of videoconferencing systems that, while “standards-based,” are nonetheless not interoperable with other systems. The reason is that these systems are different from the typical videoconferencing system. Perhaps the most notable difference a user sees when entering a Telepresence conference room is that there are often two or three video screens. There are usually multiple audio channels as well. Telepresence systems try to provide users with a real-world video communication experience, one that is just like meeting face-to-face.

While H.323 supports multiple media flows, the standard does not define which flow is associated with the left, center, or right video screen. There are other aspects of the Telepresence experience that need to be standardized as well.
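The screen-assignment gap can be illustrated with a small Python sketch. Nothing here is H.323 syntax: the `VideoFlow` type, its `position` label, and the `assign_screens` function are hypothetical names invented for this example. The sketch only shows why a receiver that gets three unlabeled video flows cannot know how to lay them out, while a receiver that gets an agreed-upon label can.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VideoFlow:
    flow_id: int
    # Hypothetical label ("left", "center", "right"); not part of H.323 today.
    position: Optional[str] = None


def assign_screens(flows):
    """Map each screen position to a flow id.

    Raises ValueError if a flow carries no position label or if two flows
    claim the same position, since the receiver would then have to guess.
    """
    layout = {}
    for flow in flows:
        if flow.position is None:
            raise ValueError(f"flow {flow.flow_id} has no screen position")
        if flow.position in layout:
            raise ValueError(f"position {flow.position!r} claimed twice")
        layout[flow.position] = flow.flow_id
    return layout


# With labels, the layout is unambiguous.
labeled = [VideoFlow(1, "left"), VideoFlow(2, "center"), VideoFlow(3, "right")]
print(assign_screens(labeled))

# Without labels, the receiver has nothing to go on.
try:
    assign_screens([VideoFlow(1), VideoFlow(2), VideoFlow(3)])
except ValueError as err:
    print("cannot lay out screens:", err)
```

Two vendors that each invent their own private label for this purpose end up with exactly the situation described above: both are built on the same standard, yet neither can place the other's flows on the correct screens.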

During the previous meeting of ITU-T SG16, held July 19-30, 2010, SG16 approved the formation of a new Question (i.e., “working group”) focused on the study of Telepresence. The first meeting of the experts was held in Research Triangle Park, North Carolina, at the campus of Cisco Systems during the week of November 29, 2010. While this was only the initial meeting, it was attended by virtually all of the major equipment manufacturers. LifeSize was notably absent, which was rather unfortunate. The group made progress in setting the stage for enhancing H.323 to fully support the features necessary to build interoperable H.323 Telepresence systems.

So, for those who have been unhappy with the current situation regarding interoperability, there is light at the end of the tunnel.