Daily Payload

SIP: An Emerging Technology?

April 19, 2004

Recently, some news media outlets have been posting stories about SIP (Session Initiation Protocol), calling it an emerging technology that is going to revolutionize telecommunication. I am curious—what is the timeframe in which something may properly be called emerging? Is it one year, two years, three years, or longer?

The formation of a star or a planet, and most certainly a galaxy, takes a very long time. After many years, one may refer to such astronomical entities as emerging. On the other hand, one only has a few minutes to see a chicken emerging from an egg. So, where is the line drawn with respect to how long one can point to a technology and say that it is emerging?

Apparently, in the case of SIP, it is at least 8 years. That's right. SIP has been with us now for 8 years, and people are still referring to it as an emerging technology. So is it correct or even reasonable to point to SIP and call it an emerging technology? I do not think so.

The irony is that Jeff Pulver and a few others have referred to H.323 as a dinosaur, suggesting that it is a much older and less sophisticated technology than SIP. Those remarks are ridiculous. H.323 is a standard with an ever-growing deployment base, and its functionality and sophistication surpass SIP's: it fully supports multipoint multimedia communication over IP networks. H.323 is now 9 years old, just one year older than SIP.

The SIP marketing engine is vibrant. I have witnessed companies advertising and promoting SIP even above their own brand name. It has been a very odd pattern of behavior. I recall attending a VON (Voice on the Net) show in the spring of 1999 in Atlanta. One company was there showing off the functionality of SIP, placing red dots on the show floor and handing out little red balls to those who cared to listen to their marketing spiel. One of the "features" advertised during the presentation was the ability to forward a call to a web page in the event that there was no answer. They had two computers acting as SIP "phones" and a web server set up just to demonstrate this "incredibly powerful functionality" to the public.

Why would one want to place a phone call to somebody and suddenly have a web page opened? Would the remote party always assume that the caller has the ability to display a web page? Perhaps one is using a standard telephone plugged into the wall and does not have the capability of displaying a web page. Then what? Even if one had a phone with an integrated web browser, would people be expected both to listen for sounds like busy signals and to watch the screen on the phone? There may be some applications in which this is useful, though it would probably not be useful for the majority of people. Even so, what the demonstrator did not point out, as the presentation was focused only on SIP, was that H.323 had the same capability. In fact, most of the time when some "cool" feature is touted in SIP, the same feature can also be found in H.323, but fewer people hear about it.

For whatever reason, companies in the VoIP business have never tried to market H.323. Perhaps the name is not sexy enough. Perhaps the companies deploying H.323 are simply too busy deploying. Perhaps they did not need to destabilize a growing, successful market in order to buy time for their technology to catch up.

Today H.323 has the lion's share of the market, with literally billions of minutes of billable voice traffic every month. H.323 designers were originally focused on multimedia, including video and data conferencing. As a result, H.323 has extremely good support for video and is the most widely used standards-based protocol on IP networks in that space. Even so, the number of deployments using H.323 only for voice continues to grow.

Recently, Vonage reported that they were the first company to reach 100,000 residential VoIP subscribers on their SIP-based network. They were the first in the United States, but certainly not the first in the world. In fact, that title was taken more than a year ago on H.323 networks by Fastweb. (In July 2003: more than 249,000 subscribers and more than 3,000,000 calls per day.) H.323 is, indeed, alive and well.

H.323 holds the leadership position in global long distance. A very large number of international telephone calls are now carried over H.323 networks around the world. China also uses H.323 to transport calls between cities, with a total number of minutes per month in the billions.

A few years ago, I noticed that SIP-focused companies had started to look for new ways to hype the protocol. I suppose companies felt that they could not win in the global long distance market, so they started to look for new markets for SIP, namely presence and instant messaging. I personally thought that was an odd move, as presence and instant messaging are quite different from telephony. H.323 does not have presence or instant messaging capability, though it does have the ability to register with a gatekeeper and transmit text between users. I suppose one could use H.323 as an instant messaging tool, but why? Instant messaging products abound in the market already. Products like AOL Instant Messenger and MSN Messenger, while proprietary, serve the public well, and products like Sametime and Jabber are available for enterprise users. Jabber, an open-source product, has the added benefit of being a technology that could be used outside the enterprise, and it is also a standards activity within the IETF that competes with SIP for instant-messaging applications. So why SIP, as opposed to anything else?

Perhaps it is possible to use SIP for several entirely unrelated tasks. I suppose that is an interesting academic exercise, but how does it increase the profitability of a company? Does it make an employee more productive? I do not believe that it does.

A strong argument can be made for converging voice (traditional telephone services) with data applications (traditional IP-based services) onto the same IP network. There is a business case for VoIP, and the future of telecommunication is IP-based. That future will include audio, video, and data-collaboration functions. What is not clear is the benefit of using the same protocol for all of them. H.323 intrinsically provides voice, video, and text (for deaf and hearing-impaired users). It integrates application sharing and whiteboarding through another protocol from the ITU called T.120. Together, these protocols provide extremely rich multimedia communications. Further, if two users want to "chat" in a multipoint conference, the functionality is there in H.323 and T.120.

SIP has very basic support for all features and virtually no support for video. Even today, more than 8 years since its introduction, SIP still does not have a base-level means of providing DTMF support (the ability to press keys on the telephone and generate audible tones at the remote end). There are, in fact, four ways of sending DTMF: RFC 2833 (widely adopted), using the INFO method, using the NOTIFY method, and KPML. With such diverse, unofficial choices, how can one deploy a SIP-based system and obtain international interoperability? How can one expect SIP to be used for serious, more complex communication like multipoint videoconferencing?
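To make the first of those four options concrete, here is a minimal sketch, in Python, of how a single RFC 2833 telephone-event payload is packed and unpacked. The 4-byte layout (event, end bit, reserved bit, volume, duration) and the DTMF event codes come from the RFC; the function names, the default volume, and the use of a plain dictionary are my own illustrative assumptions and do not come from any particular SIP or RTP stack.

```python
import struct

# DTMF event codes from RFC 2833: digits 0-9, then *, #, A-D.
DTMF_EVENTS = {str(d): d for d in range(10)}
DTMF_EVENTS.update({"*": 10, "#": 11, "A": 12, "B": 13, "C": 14, "D": 15})


def encode_telephone_event(digit, duration, volume=10, end=False):
    """Pack one telephone-event payload (hypothetical helper name).

    duration is in RTP timestamp units; volume is the 6-bit power
    level field (0-63); end sets the E bit on the final packet.
    """
    event = DTMF_EVENTS[digit.upper()]
    if not 0 <= volume <= 63:
        raise ValueError("volume must fit in 6 bits")
    if not 0 <= duration <= 0xFFFF:
        raise ValueError("duration must fit in 16 bits")
    # Second byte: E bit (0x80), reserved bit left at 0, 6-bit volume.
    flags = (0x80 if end else 0x00) | volume
    return struct.pack("!BBH", event, flags, duration)


def decode_telephone_event(payload):
    """Unpack a 4-byte telephone-event payload into its fields."""
    event, flags, duration = struct.unpack("!BBH", payload)
    return {
        "event": event,
        "end": bool(flags & 0x80),
        "volume": flags & 0x3F,
        "duration": duration,
    }


if __name__ == "__main__":
    packet = encode_telephone_event("5", duration=800, end=True)
    print(packet.hex())
    print(decode_telephone_event(packet))
```

This payload rides inside the RTP media stream. The INFO, NOTIFY, and KPML approaches instead carry the digit in SIP signaling messages, so two endpoints that each support only one of the two styles cannot exchange digits without some form of translation.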

One can probably point to these issues and understand why SIP has failed in the market where H.323 has succeeded. H.323, as mentioned earlier, has a very large and ever-growing deployment base. It is not likely, contrary to the marketing hype surrounding SIP, that H.323 will die anytime soon: H.323 meets customer requirements and therefore allows for deployment of services and the generation of revenue. H.323 is implemented in all major IP-based PBX systems used in enterprise networks today. It has a bright future.

On a final note, I have always been curious why a protocol would be hyped, rather than the technology behind the protocol. H.323, SIP, MGCP, and H.248 are all VoIP-enabling protocols. H.323 and SIP are competitive and both are complementary to MGCP and H.248, two competing media gateway control protocols. By and large, I would think that users would not care what protocol was selected, as long as the product met the user's requirements. When you buy a telephone today, you do not consider the protocol used in the PSTN. Enterprise customers largely do not care about the protocols used in their PBX systems: they just want to connect those to the service provider and that may be done in a few standard ways—more than one protocol has always existed for all of these interfaces. So, why then is there hype around SIP? It makes no business sense.

SIP might still be considered an emerging technology even after 8 years, but can the market wait for it to finally emerge?

Marketing forces should be put to work advertising long-since (more than 8 years) developed applications like voice and video over IP, electronic whiteboarding, application sharing, and other applications that make people's lives better and make employees more productive. It really makes no difference if the protocol of choice is H.323, SIP, T.120, H.248, MGCP, or anything else. Use the tools that have been developed and move forward! The telecom industry, which has been in a slump for a few years, needs to get past the protocol debate and move on to focus on services, functionality, and revenue. Leave the hype behind.