Tutorial: Voice over IP Quality of Service
Target: Telecommunications Companies / Internet / Voice over Internet
Telephone service provided over the Internet (Voice over the Internet Protocol, or VoIP) offers great savings to customers and service providers by using the world-wide network to pass voice data between the caller and the receiver. Because high-bandwidth Internet access is available at low, flat-rate pricing, the data of a typical phone call, including long distance, can be transferred at costs less than one cent a minute. Costs for implementing a VoIP solution are equivalent in most cases to implementing most phone services for businesses, such as installing a Private Branch Exchange (PBX).
A VoIP solution can be implemented over a dedicated private Local or Wide Area Network (LAN or WAN), such as a company's Intranet, without a noticeable difference from the Public Switched Telephone Network (PSTN), because the administrators of a private network can control the capacity and speed of the overall network. The public Internet, however, presents some problems for a universal VoIP solution, because data is transferred worldwide through different providers and equipment. This can lead to data packets being delayed or dropped, resulting in delayed or broken speech in a VoIP telephone call.
The clarity of the audio signal, delays in the transfer of the speech, and gaps in the audio transmission are factors affecting the Quality of Service (QOS) of a telephone call. The PSTN has been optimized to provide the best signal at the lowest bandwidth. The goal of VoIP is to allow the user to place a phone call that has the same QOS as the PSTN. This article discusses the problems facing VoIP on the public Internet and their solutions.
The PSTN
First, a short discussion of how telephone calls are made over the PSTN is in order.
Most residential users, and some business users, have analog connections over copper wires to the telephone company, called the "local loop". Once the local phone company has the call, it is usually converted to a digital signal and transmitted across the telephone network to the destination telephone company, which then converts the data back to an analog signal and transmits the call over its own local loop to the receiver.
The conversion from analog to digital and back to analog again is controlled by a pair of coder/decoders (codecs) in the telephone network. The human ear can handle frequencies as high as about 20 kHz (the high frequencies of a cymbal crash, for example), but most speech can be understood at frequencies ranging from 300 to 4000 Hz. A guiding principle behind analog to digital conversion is given in the Nyquist Theorem, developed by Harry Nyquist at Bell Telephone Laboratories in 1928. The principle states, in essence, that an analog signal must be sampled at a rate of at least twice the highest frequency in the signal. Therefore, by limiting the highest frequency of the telephone system to 4000 Hz, the telephone company need only sample the signal 8000 times per second (or once every 125 microseconds). This is the sampling rate used throughout the telephone network.
The encoding standard used by the PSTN is Pulse Code Modulation (PCM), with eight bits (one byte) representing a single sample. This has been standardized by the International Telecommunication Union Telecommunication Standardization Sector (ITU-T) as G.711. At a sampling rate of 8000 times per second (8 kHz), with each sample containing 8 bits, the data of a call through the telephone network requires a bandwidth of 64,000 bits per second (or 64 kbps). That is the bandwidth required to handle one telephone call, and is known as a voice-grade digital channel.
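The arithmetic behind the 8 kHz sampling rate and the 64 kbps channel can be checked with a few lines of Python. This is only a sketch of the calculation described above; the function and constant names are illustrative and not part of any standard.

```python
# Figures taken from the text: 4000 Hz top frequency, 8 bits per sample.
MAX_VOICE_FREQ_HZ = 4000
BITS_PER_SAMPLE = 8

def nyquist_rate(max_freq_hz):
    """Minimum sampling rate: twice the highest frequency in the signal."""
    return 2 * max_freq_hz

def sample_interval_us(sample_rate_hz):
    """Time between consecutive samples, in microseconds."""
    return 1_000_000 / sample_rate_hz

def pcm_bit_rate(sample_rate_hz, bits_per_sample):
    """Bits per second needed for one uncompressed voice channel."""
    return sample_rate_hz * bits_per_sample

rate = nyquist_rate(MAX_VOICE_FREQ_HZ)
print(rate)                                   # 8000 samples per second
print(sample_interval_us(rate))               # 125.0 microseconds
print(pcm_bit_rate(rate, BITS_PER_SAMPLE))    # 64000 bits per second
```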
Generally, the telephone company will group several telephone calls together, sending each call's samples over the same line, one after the other, every 125 microseconds. In the US, this occurs over a T1 line, which can carry up to 24 voice channels. Europe and certain other parts of the globe use an E1 line, which has 32 time slots, 30 of which normally carry voice. In either case, the method of interleaving the samples of each telephone call and extracting them at the receiving station is called multiplexing and demultiplexing.
On the sending side, the telephone company equipment must sample each phone call, multiplex it with the other calls, transmit the data, and be ready to acquire the next sample every 125 microseconds. On the receiving side, the telephone company must receive and demultiplex the incoming transmission, decode each digital sample to an analog signal, and be ready to receive the next set of samples every 125 microseconds. The method of multiplexing signals based on time divisions (there are other ways of multiplexing signals) is called Time Division Multiplexing (TDM) and is the way telephone companies send their traffic through the network.
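The interleaving idea behind TDM can be illustrated with a small sketch. It assumes each call is simply a list of integer samples of equal length and ignores the framing and signaling bits a real T1 or E1 line carries; the function names are hypothetical.

```python
def multiplex(channels):
    """Interleave one sample per call into each frame.

    channels: list of equal-length sample lists, one list per call.
    Returns a flat stream: frame 0 (one sample from each call), frame 1, ...
    """
    stream = []
    for frame in zip(*channels):
        stream.extend(frame)
    return stream

def demultiplex(stream, num_channels):
    """Recover each call's samples by taking every num_channels-th value."""
    return [stream[i::num_channels] for i in range(num_channels)]

calls = [[10, 11, 12], [20, 21, 22], [30, 31, 32]]
line = multiplex(calls)
print(line)                                    # [10, 20, 30, 11, 21, 31, 12, 22, 32]
print(demultiplex(line, len(calls)) == calls)  # True
```

Each group of three values in the stream corresponds to one sampling interval, so a given call always occupies the same position within a frame.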
The key that makes this system work reliably is that each telephone call establishes a connection with the far end, creating a circuit. If the circuit cannot be completed because there is not enough equipment to handle the call, the caller hears a fast busy tone. All the resources needed to make this circuit remain dedicated for this telephone conversation as long as the call remains active.
The Internet is a packet-based service, meaning the data is grouped into packets, and each packet is sent individually over the Internet to its destination. The data packets have additional bits added to contain routing information - destination address, port number, priority, etc. The public Internet cannot guarantee that all packets will be received in the order that they were sent, or, in some cases, that they will reach their destination at all. Under heavy volume, data packets may be delayed behind other packets. Because any data may be transmitted over the Internet, VoIP packets become just another piece of data, along with e-mail, file transfers, and web pages.
Rather than dedicating a circuit to handle each voice connection, VoIP uses a shared packet-based system (the Internet or dedicated WAN) to handle many voice connections. The individual voice packets are sent along with all the other traffic in the network. The packets may have to pass through several interconnections before reaching their destinations and may each take different routes, meaning they may arrive out of sequence or not at all.
The delay between speech and response and the quality of the voice from the sender are the factors which affect VoIP Quality of Service.
Factors Affecting VoIP Quality of Service
VoIP needs to compress the data to ensure it can be transported across the network. All networks have a limited throughput, and voice data must be compressed as much as possible to transport many calls simultaneously.
Higher compression generally results in poorer audio quality. Additionally, each compression algorithm has a certain level of computational complexity, which may cause noticeable delays when encoding and decoding the data.
Pulse Code Modulation (PCM) is the typical encoding technique for the PSTN, and sends 8000 bytes every second, or 64,000 bits every second (64 Kbps). It has a Mean Opinion Score (MOS), a subjective listener rating of voice quality, of 4.4 out of a possible 5.0. PCM can be encoded and decoded within 1 ms on each side using standard Digital Signal Processing equipment, a delay which is nearly imperceptible to the listener.
Adaptive Differential Pulse Code Modulation (ADPCM), also used in some PSTN networks, reduces the size of the data by transmitting only the differences between samples rather than the entire sample. ADPCM offers several compression levels and can reduce the 64 Kbps of PCM to 40, 32, 24, or 16 Kbps. 32 Kbps is typical and has an MOS of 4.2. Higher compression rates produce a corresponding drop in audio quality. ADPCM imposes a delay of roughly 1 ms per side, which is again nearly imperceptible.
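The core idea of sending differences rather than whole samples can be sketched as plain differential (delta) coding. This is a deliberately simplified illustration: real ADPCM also adapts its quantizer step size and is lossy, neither of which is modeled here, and the function names are hypothetical.

```python
def delta_encode(samples):
    """Keep the first sample, then store only the change from the previous one."""
    if not samples:
        return []
    deltas = [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        deltas.append(cur - prev)
    return deltas

def delta_decode(deltas):
    """Rebuild the samples by accumulating the differences."""
    samples = []
    total = 0
    for d in deltas:
        total += d
        samples.append(total)
    return samples

speech = [100, 102, 105, 104, 101, 99]
encoded = delta_encode(speech)
print(encoded)                          # [100, 2, 3, -1, -3, -2]
print(delta_decode(encoded) == speech)  # True
```

Because neighboring speech samples tend to be close in value, the differences are small numbers that can be represented with fewer bits than the samples themselves.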
Conjugate Structure Algebraic Code Excited Linear Prediction (CS-ACELP) is a fairly complex algorithm that compresses the PCM data to 8 Kbps, a compression ratio of 8:1. Its MOS can reach that of ADPCM at 4.2. CS-ACELP can create a delay of up to 10 ms, which will be noticed by most listeners.
Low Delay Code Excited Linear Prediction (LD-CELP) reduces the data to 16 Kbps, a ratio of 4:1. Its MOS can reach 4.2 as well. LD-CELP has a delay of about 3-5 ms, which is about the limit that users will accept.
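The compression ratios quoted for these codecs follow directly from their bit rates. The sketch below recomputes them and estimates how many calls would fit on a link if packet headers are ignored. The 1536 Kbps link size is an arbitrary example value, and the names are illustrative only.

```python
PCM_KBPS = 64

# Bit rates as given in the text (ADPCM at its typical 32 Kbps setting).
CODEC_KBPS = {"PCM": 64, "ADPCM": 32, "LD-CELP": 16, "CS-ACELP": 8}

def compression_ratio(codec_kbps, reference_kbps=PCM_KBPS):
    """How many times smaller the stream is than 64 Kbps PCM."""
    return reference_kbps / codec_kbps

def calls_per_link(link_kbps, codec_kbps):
    """Whole calls that fit on a link, ignoring all packet overhead."""
    return link_kbps // codec_kbps

LINK_KBPS = 1536  # hypothetical link capacity, chosen only for illustration
for name, kbps in CODEC_KBPS.items():
    print(name, compression_ratio(kbps), calls_per_link(LINK_KBPS, kbps))
```

In practice the header overhead of each packet reduces these call counts, sometimes considerably for small payloads.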
Intentional Packet Delay
VoIP systems typically send data packets representing between 10 ms and 80 ms of voice data. IP packets must contain a standard header containing the routing information, error control information, and the size of the data (the payload). Within the data itself, information for the Real-time Transport Protocol (RTP) accompanies the voice data and contains information such as packet sequence numbering, time-stamping, and payload type identification. These headers are present regardless of the size of the payload, so if the data is sent as it is retrieved (smaller payload size), more bandwidth will be required to handle the headers of each packet. Thus, more data can be sent using larger payloads, but at the expense of imposing a delay of about 40-80 ms to wait for all the data to arrive and be encoded.
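The trade-off between payload size and header overhead can be made concrete with a rough calculation. The sketch assumes RTP is carried over UDP and IPv4 with minimum-size headers (20 + 8 + 12 = 40 bytes per packet). UDP is not mentioned in the text above and link-layer framing is ignored, so treat the results as estimates.

```python
# Assumed minimum header sizes in bytes (IPv4 without options, UDP, RTP).
IP_HEADER = 20
UDP_HEADER = 8
RTP_HEADER = 12
HEADER_BYTES = IP_HEADER + UDP_HEADER + RTP_HEADER

def on_wire_kbps(codec_kbps, packet_ms):
    """Bandwidth of one voice stream including per-packet headers."""
    payload_bytes = codec_kbps * packet_ms / 8   # kbps * ms = bits
    packets_per_second = 1000 / packet_ms
    total_bits = (payload_bytes + HEADER_BYTES) * 8 * packets_per_second
    return total_bits / 1000

# An 8 Kbps codec with small and large packets.
print(on_wire_kbps(8, 10))   # 40.0 (headers dominate)
print(on_wire_kbps(8, 80))   # 12.0
# A 16 Kbps codec with 40 ms packets.
print(on_wire_kbps(16, 40))  # 24.0
```

With 10 ms packets the headers use four times as much bandwidth as the 8 Kbps voice data itself, which is why larger payloads are attractive despite the added delay.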
Another component affecting delay is the network itself. During congestion, a network, particularly the public Internet, may have significant delay (500 ms or more). This delay may also fluctuate between wide extremes, a variation known as "jitter".
Queuing Delays at the Receiver
The receiver must be able to accept data packets in any order, re-sequence them properly, discard voice packets that are too old to be of any use, decode the data, handle echo cancellation, and fill in gaps left by packets that were not received in time. In isolation, lost data of 80 ms will not affect a conversation, but several lost packets in sequence will create gaps that will be difficult to mask. The longer the receiver queues the data, the more packets will be received and sequenced. But if the queuing time is too long, a noticeable delay occurs and the QOS is reduced. Most VoIP receiving equipment has intelligent algorithms that dynamically adjust queuing time to account for network delay and jitter and still present the data to the user within an acceptable period of time.
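A fixed-size receive queue of this kind can be sketched as follows. The model is deliberately simple and rests on several assumptions: packets are numbered from 0, each is sent at its sequence number times the packet duration, sender and receiver share a clock, and a missing or late packet is marked with None where real equipment would apply some form of concealment. Adaptive queue sizing is not modeled, and all names are hypothetical.

```python
PACKET_MS = 40  # speech carried per packet (example value)

def playout_schedule(packets, buffer_ms, packet_ms=PACKET_MS):
    """Re-sequence packets and drop those that miss their playout time.

    packets: list of (seq, network_delay_ms, payload) in arrival order.
    Returns payloads in sequence order, with None for lost or late packets.
    """
    if not packets:
        return []
    on_time = {}
    for seq, delay_ms, payload in packets:
        sent_at = seq * packet_ms
        arrives_at = sent_at + delay_ms
        plays_at = sent_at + buffer_ms
        if arrives_at <= plays_at:      # arrived before it was needed
            on_time[seq] = payload
    last_seq = max(seq for seq, _, _ in packets)
    return [on_time.get(seq) for seq in range(last_seq + 1)]

arrivals = [(0, 200, "a"), (2, 180, "c"), (1, 260, "b"), (4, 210, "e"), (3, 450, "d")]
print(playout_schedule(arrivals, buffer_ms=300))  # ['a', 'b', 'c', None, 'e']
print(playout_schedule(arrivals, buffer_ms=500))  # ['a', 'b', 'c', 'd', 'e']
```

A longer queue recovers packet "d", but every packet is then played 200 ms later, which is the trade-off described above.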
Example VoIP Quality of Service
To look at an example, suppose LD-CELP is used, and that each packet represents 40 ms of speech. The time to encode the data at the sender and to decode the data at the receiver is 5 ms per side, or 10 ms in total. Network delay is averaging 200 ms, with spikes up to 400 ms, so the receiver decides to queue the data for 300 ms. Combined, the time from when speech is uttered to when it is heard at the receiver is 40 + 10 + 200 + 300, or 550 ms, which is more than half a second. Such a delay is not typical of the PSTN, and the voice quality is slightly lower than the user is accustomed to hearing.
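The same delay budget can be written as a small calculation, which makes it easy to try other packet sizes or queue lengths. The function name and parameter names are illustrative, and the figures are the ones used in the example.

```python
def mouth_to_ear_delay_ms(packetization_ms, codec_ms, network_ms, queue_ms):
    """Sum the delay components between speaking and hearing."""
    return packetization_ms + codec_ms + network_ms + queue_ms

total = mouth_to_ear_delay_ms(
    packetization_ms=40,  # speech collected per packet
    codec_ms=10,          # 5 ms encode + 5 ms decode
    network_ms=200,       # average network delay
    queue_ms=300,         # receiver queuing time
)
print(total)  # 550
```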
VoIP over the public Internet cannot guarantee the Quality of Service most customers are used to receiving, and its quality will typically be lower than that of the PSTN. On the other hand, VoIP will be acceptable for most callers if the cost savings are enough to offset the reduced QOS.