draft-ietf-payload-g7110-03.txt   draft-ietf-payload-g7110-04.txt 
Network Working Group M. Ramalho, Ed. Network Working Group M. Ramalho, Ed.
Internet-Draft P. Jones Internet-Draft P. Jones
Intended status: Standards Track Cisco Systems Intended status: Standards Track Cisco Systems
Expires: February 23, 2015 N. Harada Expires: June 26, 2015 N. Harada
NTT NTT
M. Perumal M. Perumal
Ericsson Ericsson
L. Miao L. Miao
Huawei Technologies Huawei Technologies
August 22, 2014 December 23, 2014
RTP Payload Format for G.711.0 RTP Payload Format for G.711.0
draft-ietf-payload-g7110-03 draft-ietf-payload-g7110-04
Abstract Abstract
This document specifies the Real-Time Transport Protocol (RTP) This document specifies the Real-Time Transport Protocol (RTP)
payload format for ITU-T Recommendation G.711.0. ITU-T Rec. G.711.0 payload format for ITU-T Recommendation G.711.0. ITU-T Rec. G.711.0
defines a lossless and stateless compression for G.711 packet defines a lossless and stateless compression for G.711 packet
payloads typically used in IP networks. This document also defines a payloads typically used in IP networks. This document also defines a
storage mode format for G.711.0 and a media type registration for the storage mode format for G.711.0 and a media type registration for the
G.711.0 RTP payload format. G.711.0 RTP payload format.
skipping to change at page 1, line 41 skipping to change at page 1, line 41
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet- working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/. Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
This Internet-Draft will expire on February 23, 2015. This Internet-Draft will expire on June 26, 2015.
Copyright Notice Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 2, line 25 skipping to change at page 2, line 25
2. Requirements Language . . . . . . . . . . . . . . . . . . . . 3 2. Requirements Language . . . . . . . . . . . . . . . . . . . . 3
3. G.711.0 Codec Background . . . . . . . . . . . . . . . . . . 3 3. G.711.0 Codec Background . . . . . . . . . . . . . . . . . . 3
3.1. General Information and Use of the ITU-T G.711.0 Codec . 3 3.1. General Information and Use of the ITU-T G.711.0 Codec . 3
3.2. Key Properties of G.711.0 Design . . . . . . . . . . . . 4 3.2. Key Properties of G.711.0 Design . . . . . . . . . . . . 4
3.3. G.711 Input Frames to G.711.0 Output Frames . . . . . . . 6 3.3. G.711 Input Frames to G.711.0 Output Frames . . . . . . . 6
3.3.1. Multiple G.711.0 Output Frames per RTP Payload 3.3.1. Multiple G.711.0 Output Frames per RTP Payload
Considerations . . . . . . . . . . . . . . . . . . . 8 Considerations . . . . . . . . . . . . . . . . . . . 8
4. RTP Header and Payload . . . . . . . . . . . . . . . . . . . 9 4. RTP Header and Payload . . . . . . . . . . . . . . . . . . . 9
4.1. G.711.0 RTP Header . . . . . . . . . . . . . . . . . . . 9 4.1. G.711.0 RTP Header . . . . . . . . . . . . . . . . . . . 9
4.2. G.711.0 RTP Payload . . . . . . . . . . . . . . . . . . . 10 4.2. G.711.0 RTP Payload . . . . . . . . . . . . . . . . . . . 10
4.2.1. Single G.711.0 Frame per RTP Payload Example . . . . 10 4.2.1. Single G.711.0 Frame per RTP Payload Example . . . . 11
4.2.2. G.711.0 RTP Payload Definition . . . . . . . . . . . 11 4.2.2. G.711.0 RTP Payload Definition . . . . . . . . . . . 11
4.2.3. G.711.0 RTP Payload Decoding Process . . . . . . . . 12 4.2.2.1. G.711.0 RTP Payload Encoding Process . . . . . . 13
4.2.4. G.711.0 RTP Payload for Multiple Channels . . . . . . 14 4.2.3. G.711.0 RTP Payload Decoding Process . . . . . . . . 13
5. Payload Format Parameters . . . . . . . . . . . . . . . . . . 17 4.2.4. G.711.0 RTP Payload for Multiple Channels . . . . . . 16
5.1. Media Type Registration . . . . . . . . . . . . . . . . . 17 5. Payload Format Parameters . . . . . . . . . . . . . . . . . . 18
5.2. Mapping to SDP Parameters . . . . . . . . . . . . . . . . 19 5.1. Media Type Registration . . . . . . . . . . . . . . . . . 18
5.3. Offer/Answer Considerations . . . . . . . . . . . . . . . 19 5.2. Mapping to SDP Parameters . . . . . . . . . . . . . . . . 20
5.4. SDP Examples . . . . . . . . . . . . . . . . . . . . . . 20 5.3. Offer/Answer Considerations . . . . . . . . . . . . . . . 21
5.4.1. SDP Example 1 . . . . . . . . . . . . . . . . . . . . 20 5.4. SDP Examples . . . . . . . . . . . . . . . . . . . . . . 21
5.4.2. SDP Example 2 . . . . . . . . . . . . . . . . . . . . 20 5.4.1. SDP Example 1 . . . . . . . . . . . . . . . . . . . . 21
6. G.711.0 Storage Mode Conventions and Definition . . . . . . . 21 5.4.2. SDP Example 2 . . . . . . . . . . . . . . . . . . . . 21
6.1. G.711.0 PLC Frame . . . . . . . . . . . . . . . . . . . . 21 6. G.711.0 Storage Mode Conventions and Definition . . . . . . . 22
6.2. G.711.0 Erasure Frame . . . . . . . . . . . . . . . . . . 22 6.1. G.711.0 PLC Frame . . . . . . . . . . . . . . . . . . . . 22
6.3. G.711.0 Storage Mode Definition . . . . . . . . . . . . . 23 6.2. G.711.0 Erasure Frame . . . . . . . . . . . . . . . . . . 23
7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 24 6.3. G.711.0 Storage Mode Definition . . . . . . . . . . . . . 24
8. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 24 7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 25
9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 24 8. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 25
10. Security Considerations . . . . . . . . . . . . . . . . . . . 24 9. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 25
11. Congestion Control . . . . . . . . . . . . . . . . . . . . . 26 10. Security Considerations . . . . . . . . . . . . . . . . . . . 26
12. References . . . . . . . . . . . . . . . . . . . . . . . . . 26 11. Congestion Control . . . . . . . . . . . . . . . . . . . . . 27
12.1. Normative References . . . . . . . . . . . . . . . . . . 26 12. References . . . . . . . . . . . . . . . . . . . . . . . . . 27
12.2. Informative References . . . . . . . . . . . . . . . . . 27 12.1. Normative References . . . . . . . . . . . . . . . . . . 27
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 28 12.2. Informative References . . . . . . . . . . . . . . . . . 28
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 29
1. Introduction 1. Introduction
The International Telecommunication Union (ITU-T) Recommendation The International Telecommunication Union (ITU-T) Recommendation
G.711.0 [G.711.0] specifies a stateless and lossless compression for G.711.0 [G.711.0] specifies a stateless and lossless compression for
G.711 packet payloads typically used in Voice over IP (VoIP) G.711 packet payloads typically used in Voice over IP (VoIP)
networks. This document specifies the Real-Time Transport Protocol networks. This document specifies the Real-Time Transport Protocol
(RTP) RFC 3550 [RFC3550] payload format and storage modes for this (RTP) RFC 3550 [RFC3550] payload format and storage modes for this
compression. compression.
skipping to change at page 4, line 12 skipping to change at page 4, line 12
G.711.0 may be employed end-to-end; in which case the RTP payload G.711.0 may be employed end-to-end; in which case the RTP payload
format specification and use is nearly identical to the G.711 RTP format specification and use is nearly identical to the G.711 RTP
specification found in RFC 3551 [RFC3551]. The only significant specification found in RFC 3551 [RFC3551]. The only significant
difference for G.711.0 is the required use of a dynamic payload type difference for G.711.0 is the required use of a dynamic payload type
(the static PT of 0 or 8 is presently almost always used with G.711 (the static PT of 0 or 8 is presently almost always used with G.711
even though dynamic assignment of other payload types is allowed) and even though dynamic assignment of other payload types is allowed) and
the recommendation not to use Voice Activity Detection (see the recommendation not to use Voice Activity Detection (see
Section 4.1). Section 4.1).
G.711.0, being both lossless and stateless, may also be employed as a G.711.0, being both lossless and stateless, may also be employed as a
lossless compression mechanism anywhere between end systems which lossless compression mechanism for G.711 payloads anywhere between
have negotiated use of G.711. Because the only significant difference between end systems which have negotiated use of G.711. Because the only
the G.711 RTP payload format header and the G.711.0 payload format significant difference between the G.711 RTP payload format header and the
header is the payload type, a G.711 RTP packet can be losslessly G.711.0 payload format header defined in this document is the payload
converted to a G.711.0 RTP packet simply by compressing the G.711 type, a G.711 RTP packet can be losslessly converted to a G.711.0 RTP
payload (thus creating a G.711.0 payload), changing the payload type packet simply by compressing the G.711 payload (thus creating a
to the dynamic value desired and copying all the remaining G.711 RTP G.711.0 payload), changing the payload type to the dynamic value
header fields into the corresponding G.711.0 RTP header. Conversely, desired and copying all the remaining G.711 RTP header fields into
the corresponding decompression of a G.711.0 RTP packet back to the the corresponding G.711.0 RTP header. In a similar manner, the
original source G.711 RTP packet can be accomplished by losslessly corresponding decompression of the G.711.0 RTP packet thus created
decompressing the G.711.0 payload back to the original source G.711 back to the original source G.711 RTP packet can be accomplished by
payload, changing the payload type back to the payload type of the losslessly decompressing the G.711.0 payload back to the original
original G.711 RTP packet and copying all the remaining G.711.0 RTP source G.711 payload, changing the payload type back to the payload
header fields into the corresponding G.711 RTP header. type of the original G.711 RTP packet and copying all the remaining
G.711.0 RTP header fields into the corresponding G.711 RTP header.
Negotiation specifics for this use case (lossless compression of
G.711 payloads carried in RTP) are not in scope for this document.
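The header handling described above can be sketched as follows. This is a minimal illustration, not part of either draft revision: the function names are hypothetical, the compressed (or decompressed) payload is assumed to come from a G.711.0 codec supplied by the caller, and packets using RTP padding (P bit) are rejected rather than handled.

```python
import struct

RTP_FIXED_HEADER_LEN = 12  # octets, per RFC 3550


def rtp_header_length(packet: bytes) -> int:
    """Return the length of the RTP header, including CSRCs and any extension."""
    if len(packet) < RTP_FIXED_HEADER_LEN:
        raise ValueError("packet shorter than the fixed RTP header")
    if packet[0] >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    length = RTP_FIXED_HEADER_LEN + 4 * (packet[0] & 0x0F)  # CSRC count
    if packet[0] & 0x10:  # X bit: a header extension follows the CSRC list
        if len(packet) < length + 4:
            raise ValueError("truncated header extension")
        (ext_words,) = struct.unpack_from("!H", packet, length + 2)
        length += 4 + 4 * ext_words
    if len(packet) < length:
        raise ValueError("truncated RTP header")
    return length


def swap_payload(packet: bytes, new_pt: int, new_payload: bytes) -> bytes:
    """Copy every header field unchanged except the 7-bit PT, then attach new_payload.

    Hypothetical helper: new_payload is assumed to be the G.711.0-compressed
    (or G.711-decompressed) form of the original payload.
    """
    if not 0 <= new_pt <= 127:
        raise ValueError("payload type must fit in 7 bits")
    if packet[0] & 0x20:
        raise ValueError("RTP padding (P bit) is not handled in this sketch")
    header = bytearray(packet[:rtp_header_length(packet)])
    header[1] = (header[1] & 0x80) | new_pt  # keep the marker bit, replace PT
    return bytes(header) + new_payload
```

Applying swap_payload a second time with the original payload type and the decompressed payload reproduces the original G.711 RTP packet, which is the round trip the paragraph above describes.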
It is worth noting that G.711.0, being both lossless and It is worth noting that G.711.0, being both lossless and
stateless, can be employed multiple times (e.g., on multiple, stateless, can be employed multiple times (e.g., on multiple,
individual hops or series of hops) of a given flow with no individual hops or series of hops) of a given flow with no
degradation of quality relative to end-to-end G.711. Stated another degradation of quality relative to end-to-end G.711. Stated another
way, multiple "lossless transcodes" from/to G.711.0/G.711 do not way, multiple "lossless transcodes" from/to G.711.0/G.711 do not
affect voice quality as typically occurs with lossy transcodes to/ affect voice quality as typically occurs with lossy transcodes to/
from dissimilar codecs. from dissimilar codecs.
Lastly, it is expected that G.711.0 will be used as an archival Lastly, it is expected that G.711.0 will be used as an archival
skipping to change at page 5, line 51 skipping to change at page 6, line 6
frame presented to it without signaling knowledge. frame presented to it without signaling knowledge.
A5 Accommodate G.711 payload sizes typically used in IP: G.711 input A5 Accommodate G.711 payload sizes typically used in IP: G.711 input
frames of length typically found in VoIP applications represent frames of length typically found in VoIP applications represent
SDP ptimes (see RFC 4566 [RFC4566]) of 5 ms, 10 ms, 20 ms, 30 SDP ptimes (see RFC 4566 [RFC4566]) of 5 ms, 10 ms, 20 ms, 30
ms or 40 ms. Since the dominant sampling frequency for G.711 ms or 40 ms. Since the dominant sampling frequency for G.711
is 8000 samples per second, G.711.0 was designed to compress is 8000 samples per second, G.711.0 was designed to compress
G.711 input frames of 40, 80, 160, 240 or 320 samples. G.711 input frames of 40, 80, 160, 240 or 320 samples.
A6 Bounded expansion: Since attribute A2 above requires G.711.0 to A6 Bounded expansion: Since attribute A2 above requires G.711.0 to
be lossless for any payload, by definition there exists at be lossless for any payload (which could consist of any
least one potential G.711 payload which must be combination of octets with each octet spanning the entire space
"uncompressible". Since the quantum of compression is an of 2^8 values), by definition there exists at least one
octet, the minimum expansion of such an uncompressible payload potential G.711 payload which must be "uncompressible". Since
was designed to be the minimum possible of one octet. Thus the quantum of compression is an octet, the minimum expansion
G.711.0 "compressed" frames can be of length one octet to X+1 of such an uncompressible payload was designed to be the
octets, where X is the size of the input G.711 frame in octets. minimum possible of one octet. Thus G.711.0 "compressed"
G.711.0 can therefore be viewed as a Variable Bit Rate (VBR) frames can be of length one octet to X+1 octets, where X is the
encoding in which the size of the G.711.0 output frame is a size of the input G.711 frame in octets. G.711.0 can therefore
function of the G.711 symbols input to it. be viewed as a Variable Bit Rate (VBR) encoding in which the
size of the G.711.0 output frame is a function of the G.711
symbols input to it.
A7 Algorithmic delay: G.711.0 was designed to have the algorithmic A7 Algorithmic delay: G.711.0 was designed to have the algorithmic
delay equal to the time represented by the number of samples in delay equal to the time represented by the number of samples in
the G.711 input frame (i.e., no "look-ahead"). the G.711 input frame (i.e., no "look-ahead").
A8 Low Complexity: Less than 1.0 WMOPS average and low memory A8 Low Complexity: Less than 1.0 Weighted Million Operations Per
footprint (~5k octets RAM, ~5.7k octets ROM and ~3.6 basic Second (WMOPS) average and low memory footprint (~5k octets
operations) [ICASSP] [G.711.0]. RAM, ~5.7k octets ROM and ~3.6 basic operations) [ICASSP]
[G.711.0].
A9 Both A-law and mu-law supported: G.711 has two operating laws, A9 Both A-law and mu-law supported: G.711 has two operating laws,
A-law and mu-law. These two laws are also known as PCMA and A-law and mu-law. These two laws are also known as PCMA and
PCMU in RTP applications RFC 3551 [RFC3551]. PCMU in RTP applications RFC 3551 [RFC3551].
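The arithmetic behind attributes A5 and A6 can be made concrete with a short sketch. The function and constant names are illustrative assumptions, not taken from the draft or from the ITU-T reference code.

```python
from typing import Tuple

G711_SAMPLE_RATE_HZ = 8000  # dominant G.711 sampling frequency (attribute A5)
SUPPORTED_FRAME_SAMPLES = (40, 80, 160, 240, 320)


def samples_for_ptime(ptime_ms: int) -> int:
    """Number of G.711 samples (one octet each) covered by ptime_ms milliseconds."""
    return G711_SAMPLE_RATE_HZ * ptime_ms // 1000


def output_frame_size_bounds(input_samples: int) -> Tuple[int, int]:
    """Smallest and largest G.711.0 output frame, in octets, for one input frame (A6)."""
    if input_samples not in SUPPORTED_FRAME_SAMPLES:
        raise ValueError("G.711.0 input frames must be 40, 80, 160, 240 or 320 samples")
    return 1, input_samples + 1
```

For example, a 20 ms ptime corresponds to 160 input octets, and the resulting G.711.0 frame is between 1 and 161 octets long.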
These attributes generally make it trivial to compress a G.711 input These attributes generally make it trivial to compress a G.711 input
frame consisting of 40, 80, 160, 240 or 320 samples. After the input frame consisting of 40, 80, 160, 240 or 320 samples. After the input
frame is presented to a G.711.0 encoder, a G.711.0 "self-describing" frame is presented to a G.711.0 encoder, a G.711.0 "self-describing"
output frame is produced. The number of samples contained within output frame is produced. The number of samples contained within
this frame is easily determined at the G.711.0 decoder by virtue of this frame is easily determined at the G.711.0 decoder by virtue of
skipping to change at page 7, line 20 skipping to change at page 7, line 20
| (where X MUST be 40, 80, | | (precise value dependent on | | (where X MUST be 40, 80, | | (precise value dependent on |
| 160, 240 or 320 octets) |<-----| G.711.0 ability to compress) | | 160, 240 or 320 octets) |<-----| G.711.0 ability to compress) |
|__________________________| B |______________________________| |__________________________| B |______________________________|
Figure 1 Figure 1
Note that the mapping is 1:1 (lossless) in both directions, subject Note that the mapping is 1:1 (lossless) in both directions, subject
to two constraints. The first constraint is that the input frame to two constraints. The first constraint is that the input frame
provided to the G.711.0 encoder (process "A") has a specific number provided to the G.711.0 encoder (process "A") has a specific number
of input G.711 symbols consistent with attribute A5 (40, 80, 160, 240 of input G.711 symbols consistent with attribute A5 (40, 80, 160, 240
or 320 octets). The second constraint is that the compression law or 320 octets). The second constraint is that the companding law
used to create the G.711 input frame (A-law or mu-law) must be known, used to create the G.711 input frame (A-law or mu-law) must be known,
consistent with attribute A9. consistent with attribute A9.
Subject to these two constraints, the input G.711 frame is processed Subject to these two constraints, the input G.711 frame is processed
by the G.711.0 encoder ("A") and produces a "self-describing" G.711.0 by the G.711.0 encoder ("process A") and produces a "self-describing"
output frame, consistent with attribute A4. Depending on the source G.711.0 output frame, consistent with attribute A4. Depending on the
G.711 symbols, the G.711.0 output frame can contain anywhere from 1 source G.711 symbols, the G.711.0 output frame can contain anywhere
to X+1 octets, where X is the number of input G.711 symbols. from 1 to X+1 octets, where X is the number of input G.711 symbols.
Compression is achieved for virtually every zero-mean acoustic signal Compression is achieved for virtually every zero-mean acoustic signal
encoded by G.711.0. encoded by G.711.0.
Since the G.711.0 output frame is "self-describing", a G.711.0 Since the G.711.0 output frame is "self-describing", a G.711.0
decoder (process "B") can losslessly reproduce the original G.711 decoder (process "B") can losslessly reproduce the original G.711
input frame with only the knowledge of which companding law was used input frame with only the knowledge of which companding law was used
(A-law or mu-law). The G.711.0 frame, being "self-describing", (A-law or mu-law). The first octet of a G.711.0 frame is called the
allows for the G.711.0 decoder ("B") to know precisely how many G.711 "Prefix Code" octet; the value of this octet conveys how many G.711
symbols to create. symbols the decoder is to create from a given G.711.0 input frame.
The Prefix Code value of 0x00 is used to denote zero G.711 source
symbols, which allows the use of 0x00 as a payload padding octet (to
be described later).
Since G.711.0 was designed with typical G.711 payload lengths as a Since G.711.0 was designed with typical G.711 payload lengths as a
design constraint (attribute A5), this lossless encoding can be design constraint (attribute A5), this lossless encoding can be
performed only with knowledge of the companding law being used. This performed only with knowledge of the companding law being used. This
information is anticipated to be signaled in SDP and will be information is anticipated to be signaled in SDP and will be
described later in this document. described later in this document.
If the original inputs were known to be from a zero-mean acoustic If the original inputs were known to be from a zero-mean acoustic
signal coded by G.711, an intelligent G.711.0 encoder could infer the signal coded by G.711, an intelligent G.711.0 encoder could infer the
G.711 companding law in use (via G.711 input signal amplitude G.711 companding law in use (via G.711 input signal amplitude
skipping to change at page 9, line 29 skipping to change at page 9, line 31
In this section we describe the precise format for G.711.0 frames In this section we describe the precise format for G.711.0 frames
carried via RTP. We begin with RTP header description relative to carried via RTP. We begin with RTP header description relative to
G.711, then provide two G.711.0 payload examples. G.711, then provide two G.711.0 payload examples.
4.1. G.711.0 RTP Header 4.1. G.711.0 RTP Header
Relative to G.711 RTP headers, the utilization of G.711.0 does not Relative to G.711 RTP headers, the utilization of G.711.0 does not
create any special requirements with respect to the contents of the create any special requirements with respect to the contents of the
RTP packet header. The only significant difference is that the RTP packet header. The only significant difference is that the
payload type (PT) RTP header field will have a value corresponding to payload type (PT) RTP header field MUST have a value corresponding to
the dynamic payload type assigned to the flow. This is in contrast the dynamic payload type assigned to the flow. This is in contrast
to most current uses of G.711 which typically use the static payload to most current uses of G.711 which typically use the static payload
assignment of PT = 0 (PCMU) or PT = 8 (PCMA) [RFC3551] even though assignment of PT = 0 (PCMU) or PT = 8 (PCMA) [RFC3551] even though
the negotiation and use of dynamic payload types is allowed for the negotiation and use of dynamic payload types is allowed for
G.711. G.711. With the exception of rare PT exhaustion cases, the existing
G.711 PT values of 0 and 8 MUST NOT be used for G.711.0 (helping to
avoid possible payload confusion with G.711 payloads).
Voice Activity Detection (VAD) SHOULD NOT be used when G.711.0 is Voice Activity Detection (VAD) SHOULD NOT be used when G.711.0 is
negotiated because G.711.0 obtains high compression during "VAD negotiated because G.711.0 obtains high compression during "VAD
silence intervals" and one of the advantages of G.711.0 over G.711 silence intervals" and one of the advantages of G.711.0 over G.711
with VAD is the lack of any VAD-inducing artifacts in the received with VAD is the lack of any VAD-inducing artifacts in the received
signal. However, if VAD is employed, the Marker bit (M) MUST be set signal. However, if VAD is employed, the Marker bit (M) MUST be set
in the first packet of a talkspurt (the first packet after a silence in the first packet of a talkspurt (the first packet after a silence
period in which packets have not been transmitted contiguously as per period in which packets have not been transmitted contiguously as per
rules specified in [RFC3551] for G.711 payloads). This definition, rules specified in [RFC3551] for G.711 payloads). This definition,
being consistent with the G.711 RTP VAD use, further allows lossless being consistent with the G.711 RTP VAD use, further allows lossless
skipping to change at page 12, line 22 skipping to change at page 12, line 30
|----------|---------|----------|---------|----------------| |----------|---------|----------|---------|----------------|
| First | Second | | Nth | Zero or more | | First | Second | | Nth | Zero or more |
| G.711.0 | G.711.0 | ... | G.711.0 | 0x00 | | G.711.0 | G.711.0 | ... | G.711.0 | 0x00 |
| Frame | Frame | | Frame | Padding Octets | | Frame | Frame | | Frame | Padding Octets |
|__________|_________|__________|_________|________________| |__________|_________|__________|_________|________________|
Figure 3 Figure 3
We note here that when there are multiple G.711.0 frames, the We note here that when there are multiple G.711.0 frames, the
individual frames can be, and generally are, of different lengths. individual frames can be, and generally are, of different lengths.
The decoding process in the following section is used to determine The decoding process described in Section 4.2.3 is used to determine
the frame boundaries. the frame boundaries.
Encoding Process: One or more G.711.0 frames are placed in the RTP Encoding Process: One or more G.711.0 frames are placed in the RTP
payload simply by concatenating the G.711.0 frames together. The payload simply by concatenating the G.711.0 frames together. The
amount of time represented by the G.711 symbols compressed in all the amount of time represented by the G.711 symbols compressed in all the
G.711.0 frames in the RTP payload MUST correspond to the ptime G.711.0 frames in the RTP payload MUST correspond to the ptime
signaled for applications using SDP. Although not generally desired, signaled for applications using SDP. Although not generally desired,
padding in the RTP payload SHOULD be placed after the last G.711.0 padding in the RTP payload SHOULD be placed after the last G.711.0
frame in the payload and MAY be created by placing one or more 0x00 frame in the payload and MAY be created by placing one or more 0x00
octets after the last G.711.0 frame. Such padding may be desired octets after the last G.711.0 frame. Such padding may be desired
based on security considerations (see Section 10). based on security considerations (see Section 10). Additional
encoding process details and considerations are specified later in
Section 4.2.2.1.
Decoding Process: As G.711.0 frames can be of varying length, the Decoding Process: As G.711.0 frames can be of varying length, the
payload decoding process described in the following section is used payload decoding process described in Section 4.2.3 is used to
to determine where the individual G.711.0 frame boundaries are. Any determine where the individual G.711.0 frame boundaries are. Any
padding octets inserted before or after any G.711.0 frame in the RTP padding octets inserted before or after any G.711.0 frame in the RTP
payload are silently (and safely) ignored by the G.711.0 decoding payload are silently (and safely) ignored by the G.711.0 decoding
process. process specified in Section 4.2.3.
4.2.2.1. G.711.0 RTP Payload Encoding Process
ITU-T G.711.0 supports five possible input frame lengths: 40, 80,
160, 240, and 320 samples per frame; the rationale for choosing
those lengths was given in the description of attribute A5 in
Section 3.2. Assuming 8000 samples per second, these lengths
correspond to input frames representing 5 ms, 10 ms, 20 ms, 30 ms or
40 ms. So while the standard assumed the input "bit stream"
consisted of G.711 symbols of some integer multiple of 5 ms in
length, it did not specify exactly what frame lengths to use as input
to the G.711.0 encoder itself. The intent of this section is to
provide some guidance for the selection.
Consider a typical IETF use case of 20 ms (160 octets) of G.711 input
samples represented in a G.711.0 payload and signaled by using the
SDP parameter ptime. As described in Section 3.3.1, the simplest way
to encode these 160 octets is to pass all 160 octets to the
G.711.0 encoder, resulting in precisely one G.711.0 compressed frame,
and put that single frame into the G.711.0 RTP payload. However,
neither the ITU-T G.711.0 standard nor this IETF payload format
mandates this. In fact 20 ms of input G.711 symbols can be encoded
as 1, 2, 3 or 4 G.711.0 frames in any one of six combinations (i.e.,
{20ms}, {10ms:10ms}, {10ms:5ms:5ms}, {5ms:10ms:5ms}, {5ms:5ms:10ms},
{5ms:5ms:5ms:5ms}) and any of these combinations would decompress
into the same source 160 G.711 octets.
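The count of six combinations can be checked by enumerating every ordered split of the packet duration into supported input frame durations. The sketch below is illustrative only; frame_splits is a hypothetical helper name.

```python
from typing import List

FRAME_DURATIONS_MS = (5, 10, 20, 30, 40)  # supported G.711.0 input frames at 8000 Hz


def frame_splits(total_ms: int) -> List[List[int]]:
    """Return every ordered sequence of supported frame durations summing to total_ms."""
    if total_ms == 0:
        return [[]]  # exactly one way to fill nothing: no frames
    splits = []
    for first in FRAME_DURATIONS_MS:
        if first <= total_ms:
            # Fix the first frame, then enumerate all ways to fill the remainder.
            for rest in frame_splits(total_ms - first):
                splits.append([first] + rest)
    return splits
```

Calling frame_splits(20) yields the six sequences listed above. An encoder seeking the highest compression could, in principle, try each split and keep the smallest resulting payload, since every split decompresses to the same 160 G.711 octets.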
Notwithstanding the above, we expect one of two encodings to be used
by implementers: the simplest possible (one 160-octet input to the
G.711.0 encoder, which usually results in the highest compression) or
the combination of possible input frames to a G.711.0 encoder that
results in the highest compression for the payload. The explicit
mention of this issue in this IETF document was deemed important
because the ITU-T G.711.0 standard is silent on this issue and there
is a desire for this issue to be documented in a formal Standards
Development Organization (SDO) document (i.e., here).
4.2.3. G.711.0 RTP Payload Decoding Process 4.2.3. G.711.0 RTP Payload Decoding Process
The G.711.0 decoding process is a standard part of G.711.0 bit stream The G.711.0 decoding process is a standard part of G.711.0 bit stream
decoding and is implemented in the ITU-T Rec. G.711.0 reference code. decoding and is implemented in the ITU-T Rec. G.711.0 reference code.
The decoding process algorithm described in this section is a slight The decoding process algorithm described in this section is a slight
enhancement of the ITU-T reference code to explicitly accommodate RTP enhancement of the ITU-T reference code to explicitly accommodate RTP
padding (as described above). padding (as described above).
Before describing the decoding, we note here that the largest Before describing the decoding, we note here that the largest
skipping to change at page 14, line 11 skipping to change at page 15, line 7
processed packets counter by one (set P = P + 1). If the processed packets counter by one (set P = P + 1). If the
result of this increment results in P >= N then STOP (as all result of this increment results in P >= N then STOP (as all
RTP Payload octets have been processed), otherwise go to H2. RTP Payload octets have been processed), otherwise go to H2.
H5 Process an individual G.711.0 frame (produce G.711 samples in the H5 Process an individual G.711.0 frame (produce G.711 samples in the
output frame): Pass the internal buffer to the G.711.0 decoder. output frame): Pass the internal buffer to the G.711.0 decoder.
The G.711.0 decoder will read the first octet (called the The G.711.0 decoder will read the first octet (called the
"prefix code" octet in ITU-T Rec. G.711.0 [G.711.0]) to "prefix code" octet in ITU-T Rec. G.711.0 [G.711.0]) to
determine the number of source G.711 samples, M, contained in determine the number of source G.711 samples, M, contained in
this G.711.0 frame. The G.711.0 decoder will produce exactly M this G.711.0 frame. The G.711.0 decoder will produce exactly M
G.711 source symbols. If K = 0, these M symbols will be the G.711 source symbols (M can only have values of 0, 40, 80, 160,
first in the output buffer and are placed at the beginning of 240 or 320). If K = 0, these M symbols will be the first in
the output buffer. If K != 0, concatenate these M symbols with the output buffer and are placed at the beginning of the output
the prior symbols in the output buffer (there are K prior buffer. If K != 0, concatenate these M symbols with the prior
symbols in the buffer). Set K = K + M (as there are now this symbols in the output buffer (there are K prior symbols in the
many G.711 source symbols in the output buffer). The G.711.0 buffer). Set K = K + M (as there are now this many G.711
decoder will have consumed some number of octets, Q, in the source symbols in the output buffer). The G.711.0 decoder will
internal buffer to produce the M G.711 symbols. Increment the have consumed some number of octets, Q, in the internal buffer
number of payload octet processed counter by this quantity (set to produce the M G.711 symbols. Increment the number of
P = P + Q). If the result of this increment results in P >= N payload octet processed counter by this quantity (set P = P +
then STOP (as all RTP Payload octets have been processed), Q). If the result of this increment results in P >= N then
otherwise go to H2. STOP (as all RTP Payload octets have been processed), otherwise
go to H2.
At this point, the output buffer will contain precisely K G.711 At this point, the output buffer will contain precisely K G.711
source symbols which should correspond to the ptime signaled if SDP source symbols which should correspond to the ptime signaled if SDP
was used and the encoding process was without error. was used and the encoding process was without error. If ptime was
signaled via SDP and the number of G.711 symbols in the output buffer
is other than what corresponds to ptime, the packet MUST be discarded
unless other system design knowledge allows for otherwise (e.g.,
occasional 5 ms clock slips causing one more or one less G.711.0
frame than nominal to be in the payload). Lastly, due to the buffer
reads in H2 being bounded (to 321 octets or less), N being bounded to
the size of the G.711.0 RTP payload, and M being bounded to the
number of source G.711 symbols, there is no buffer overrun risk.
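The decoding loop described above can be sketched in Python as follows. This is only an illustrative sketch, not the ITU-T reference code: decode_frame is a hypothetical stand-in for a G.711.0 frame decoder that returns the M decoded symbols and the number of octets Q it consumed, and the handling of 0x00 padding octets (skipped one at a time before a frame is passed to the decoder) is an assumption about the earlier heuristic steps. The names decode_payload, ptime_ms and rate are likewise assumptions made for illustration.

```python
from typing import Callable, Optional, Tuple

# Hypothetical decoder interface: takes the internal buffer and returns
# (samples, consumed), i.e. the M decoded G.711 symbols and the octet count Q.
FrameDecoder = Callable[[bytes], Tuple[bytes, int]]

MAX_FRAME_OCTETS = 321                  # bound on the internal buffer read
VALID_M = (0, 40, 80, 160, 240, 320)    # permitted values of M


def decode_payload(payload: bytes,
                   decode_frame: FrameDecoder,
                   ptime_ms: Optional[int] = None,
                   rate: int = 8000) -> Optional[bytes]:
    """Return the decoded G.711 symbols, or None if the packet is discarded."""
    n = len(payload)    # N: number of RTP payload octets
    p = 0               # P: number of payload octets processed
    out = bytearray()   # output buffer; K is len(out)

    while p < n:
        if payload[p] == 0x00:
            # Assumed padding handling: skip a single padding octet.
            p += 1
            continue
        # Bounded read into the internal buffer.
        internal = payload[p:p + MAX_FRAME_OCTETS]
        samples, q = decode_frame(internal)
        # Sanity checks so a misbehaving decoder cannot stall or overrun.
        if len(samples) not in VALID_M or q <= 0 or q > len(internal):
            return None
        out += samples  # concatenate with prior symbols (K = K + M)
        p += q          # P = P + Q

    # If ptime was signaled, K is expected to match it.
    if ptime_ms is not None and len(out) != ptime_ms * rate // 1000:
        return None
    return bytes(out)
```

A system that tolerates occasional 5 ms clock slips would relax the final ptime check instead of discarding the packet.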
We also note, as an aside, that the algorithm above (and the ITU-T We also note, as an aside, that the algorithm above (and the ITU-T
G.711.0 reference code) accommodates padding octets (0x00) placed G.711.0 reference code) accommodates padding octets (0x00) placed
anywhere between G.711.0 frames in the RTP payload as well as prior anywhere between G.711.0 frames in the RTP payload as well as prior
to or after any or all G.711.0 frames. The ITU-T G.711.0 reference to or after any or all G.711.0 frames. The ITU-T G.711.0 reference
code does not have Step H3 and H4 as separate steps (i.e., Step H5 code does not have Step H3 and H4 as separate steps (i.e., Step H5
immediately follows H2) at the added computational cost of some immediately follows H2) at the added computational cost of some
additional buffer passing to/from the G.711.0 frame decoder additional buffer passing to/from the G.711.0 frame decoder
functions. That is, the G.711.0 decoder in the reference code functions. That is, the G.711.0 decoder in the reference code
"silently ignores" 0x00 padding octets at the beginning of what it "silently ignores" 0x00 padding octets at the beginning of what it
skipping to change at page 17, line 12 skipping to change at page 18, line 20
in the RTP payload. The number of padding octets introduced at any in the RTP payload. The number of padding octets introduced at any
G.711.0 frame boundary therefore does not affect the number M of the G.711.0 frame boundary therefore does not affect the number M of the
source G.711 symbols produced. Thus the decision for padding MAY be source G.711 symbols produced. Thus the decision for padding MAY be
made on a per-superframe basis. made on a per-superframe basis.
5. Payload Format Parameters 5. Payload Format Parameters
This section defines the parameters that may be used to configure This section defines the parameters that may be used to configure
optional features in the G.711.0 RTP transmission. optional features in the G.711.0 RTP transmission.
The parameters defined here as a part of the media subtype The parameters defined here are a part of the media subtype
registration for the G.711.0 codec. Mapping of the parameters into registration for the G.711.0 codec. Mapping of the parameters into
Session Description Protocol (SDP) RFC 4566 [RFC4566] is also Session Description Protocol (SDP) RFC 4566 [RFC4566] is also
provided for those applications that use SDP. provided for those applications that use SDP.
5.1. Media Type Registration 5.1. Media Type Registration
Type name: audio Type name: audio
Subtype name: G711-0 Subtype name: G711-0
skipping to change at page 19, line 25 skipping to change at page 20, line 33
Author: Michael A. Ramalho Author: Michael A. Ramalho
Change controller: Change controller:
IETF Payload working group delegated from the IESG. IETF Payload working group delegated from the IESG.
5.2. Mapping to SDP Parameters 5.2. Mapping to SDP Parameters
The information carried in the media type specification has a The information carried in the media type specification has a
specific mapping to fields in the Session Description Protocol (SDP), specific mapping to fields in the Session Description Protocol (SDP),
which is commonly used to describe RTP sessions. When SDP is used to which is commonly used to describe an RTP session. When SDP is used
specify sessions employing G.711.0, the mapping is as follows: to specify sessions employing G.711.0, the mapping is as follows:
o The media type ("audio") goes in SDP "m=" as the media name. o The media type ("audio") goes in SDP "m=" as the media name.
o The media subtype ("G711-0") goes in SDP "a=rtpmap" as the o The media subtype ("G711-0") goes in SDP "a=rtpmap" as the
encoding name. encoding name.
o The required parameter "rate" also goes in "a=rtpmap" as the clock o The required parameter "rate" also goes in "a=rtpmap" as the clock
rate. rate.
o The parameters "ptime" and "maxptime" go in the SDP "a=ptime" and o The parameters "ptime" and "maxptime" go in the SDP "a=ptime" and
skipping to change at page 20, line 33 skipping to change at page 21, line 44
5.4.1. SDP Example 1 5.4.1. SDP Example 1
m=audio RTP/AVP 98 m=audio RTP/AVP 98
a=rtpmap:98 G711-0/8000 a=rtpmap:98 G711-0/8000
a=fmtp:98 complaw=mu a=fmtp:98 complaw=mu
In the above example the dynamic payload type 98 is mapped to G.711.0 In the above example the dynamic payload type 98 is mapped to G.711.0
via the "a=rtpmap" parameter. The mandatory "complaw" is on the via the "a=rtpmap" parameter. The mandatory "complaw" is on the
"a=fmtp" parameter line. Note that neither optional parameters "a=fmtp" parameter line. Note that neither optional parameters
"ptime" nor "channels" is present; although it is generally good form "ptime" nor "channels" is present; although it is generally good form
to include "ptime" in the SDP for session diagnostic purposes. to include "ptime" in the SDP if the session is a constant ptime
session for diagnostic purposes.
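As a rough illustration of the mapping in Section 5.2, the following Python sketch assembles the SDP lines for a G.711.0 stream. The function name g7110_sdp_lines and its parameters are hypothetical. The port argument is an assumption added here because the "m=" line syntax of RFC 4566 carries a port, which the abbreviated example above omits. Only the parameters shown in this example are handled.

```python
from typing import List, Optional


def g7110_sdp_lines(port: int,
                    payload_type: int,
                    complaw: str,
                    rate: int = 8000,
                    ptime: Optional[int] = None) -> List[str]:
    """Build the SDP media lines for one G.711.0 stream (illustrative only)."""
    lines = [
        # Media type "audio" goes in "m=" as the media name.
        f"m=audio {port} RTP/AVP {payload_type}",
        # Subtype "G711-0" and the required "rate" go in "a=rtpmap".
        f"a=rtpmap:{payload_type} G711-0/{rate}",
        # The mandatory "complaw" parameter goes on the "a=fmtp" line.
        f"a=fmtp:{payload_type} complaw={complaw}",
    ]
    if ptime is not None:
        # Optional, but good form for a constant ptime session.
        lines.append(f"a=ptime:{ptime}")
    return lines
```

Calling g7110_sdp_lines(49170, 98, "mu", ptime=20) would yield the three lines of Example 1 (with a port added) followed by an "a=ptime:20" line; the port value 49170 is arbitrary.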
5.4.2. SDP Example 2 5.4.2. SDP Example 2
The following example illustrates an offering endpoint requesting 2 The following example illustrates an offering endpoint requesting 2
channels, but the answering endpoint can only support (or render) one channels, but the answering endpoint can only support (or render) one
channel. channel.
Offer: Offer:
m=audio RTP/AVP 98 m=audio RTP/AVP 98
skipping to change at page 21, line 20 skipping to change at page 22, line 33
interpreted as one channel). As mentioned previously, it is interpreted as one channel). As mentioned previously, it is
considered good form to include "ptime" in the SDP for session considered good form to include "ptime" in the SDP for session
diagnostic purposes if the session is a constant ptime session. diagnostic purposes if the session is a constant ptime session.
6. G.711.0 Storage Mode Conventions and Definition 6. G.711.0 Storage Mode Conventions and Definition
The G.711.0 storage mode definition in this section is similar to The G.711.0 storage mode definition in this section is similar to
many other IETF codecs (e.g., iLBC, EVRC-NW) and is essentially a many other IETF codecs (e.g., iLBC, EVRC-NW) and is essentially a
concatenation of individual G.711.0 frames. concatenation of individual G.711.0 frames.
We note that something must be stored for any G.711.0 frames that not We note that something must be stored for any G.711.0 frames that are
received at the receiving endpoint, no matter what the cause. In not received at the receiving endpoint, no matter what the cause. In
this section we describe two mechanisms, a "G.711.0 PLC Frame" and a this section we describe two mechanisms, a "G.711.0 PLC Frame" and a
"G.711.0 Erasure Frame". These G.711.0 PLC and G.711.0 Erasure "G.711.0 Erasure Frame". These G.711.0 PLC and G.711.0 Erasure
Frames are described prior to the G.711.0 storage mode definition for Frames are described prior to the G.711.0 storage mode definition for
clarity. clarity.
6.1. G.711.0 PLC Frame 6.1. G.711.0 PLC Frame
When G.711 RTP payloads are not received by a rendering endpoint, a Packet When G.711 RTP payloads are not received by a rendering endpoint, a Packet
Loss Concealment (PLC) mechanism is typically employed to "fill in" Loss Concealment (PLC) mechanism is typically employed to "fill in"
the missing G.711 symbols with something that is auditorially the missing G.711 symbols with something that is auditorially
skipping to change at page 22, line 43 skipping to change at page 24, line 7
property for a G.711.0 erasure frame is for "non G.711.0 Erasure property for a G.711.0 erasure frame is for "non G.711.0 Erasure
Frame aware" endpoints to be able to play back a G.711.0 erasure frame Frame aware" endpoints to be able to play back a G.711.0 erasure frame
with the existing G.711.0 ITU-T reference code. with the existing G.711.0 ITU-T reference code.
A G.711.0 Erasure Frame is defined as any G.711.0 frame for which the A G.711.0 Erasure Frame is defined as any G.711.0 frame for which the
corresponding G.711 sample values are either the value 0++ or the corresponding G.711 sample values are either the value 0++ or the
value 0-- for the entirety of the G.711.0 frame. The levels of 0++ value 0-- for the entirety of the G.711.0 frame. The levels of 0++
and 0-- are defined to be the two levels above or below analog zero, and 0-- are defined to be the two levels above or below analog zero,
respectively. An entire frame of value 0++ or 0-- is expected to be respectively. An entire frame of value 0++ or 0-- is expected to be
extraordinarily rare when the frame was in fact generated by a extraordinarily rare when the frame was in fact generated by a
natural signal (on the order of one in 2^{ptime in samples, minus natural signal, as analog inputs such as speech and music are zero-
one}), as analog inputs such as speech and music are zero-mean and mean and are typically acoustically coupled to digital sampling
are typically acoustically coupled to digital sampling systems. Note systems. Note that the playback of a G.711.0 frame characterized as
that the playback of a G.711.0 frame characterized as an erasure an erasure frame is auditorially equivalent to a muted signal (a very
frame is auditorially equivalent to a muted signal (a very low value low value constant).
constant).
These G.711.0 erasure frames can be reasonably characterized as null These G.711.0 erasure frames can be reasonably characterized as null
or erasure frames while meeting the desired playback goal of being or erasure frames while meeting the desired playback goal of being
decoded by the G.711.0 ITU-T reference code. Thus, similarly to decoded by the G.711.0 ITU-T reference code. Thus, similarly to
G.711 PLC frames, the G.711.0 erasure frames appear as "normal" or G.711 PLC frames, the G.711.0 erasure frames appear as "normal" or
"ordinary" G.711.0 frames in the storage mode format. "ordinary" G.711.0 frames in the storage mode format.
6.3. G.711.0 Storage Mode Definition 6.3. G.711.0 Storage Mode Definition
The storage format is used for storing G.711.0 encoded frames. The The storage format is used for storing G.711.0 encoded frames. The
skipping to change at page 24, line 27 skipping to change at page 25, line 38
7. Acknowledgements 7. Acknowledgements
There have been many people contributing to G.711.0 in the course of There have been many people contributing to G.711.0 in the course of
its development. The people listed here deserve special mention: its development. The people listed here deserve special mention:
Takehiro Moriya, Claude Lamblin, Herve Taddei, Simao Campos, Yusuke Takehiro Moriya, Claude Lamblin, Herve Taddei, Simao Campos, Yusuke
Hiwasaki, Jacek Stachurski, Lorin Netsch, Paul Coverdale, Patrick Hiwasaki, Jacek Stachurski, Lorin Netsch, Paul Coverdale, Patrick
Luthi, Paul Barrett, Jari Hagqvist, Pengjun (Jeff) Huang, John Gibbs, Luthi, Paul Barrett, Jari Hagqvist, Pengjun (Jeff) Huang, John Gibbs,
Yutaka Kamamoto, and Csaba Kos. The review and oversight by the IETF Yutaka Kamamoto, and Csaba Kos. The review and oversight by the IETF
Payload Working Group chairs Ali Begen and Roni Even during the Payload Working Group chairs Ali Begen and Roni Even during the
development of this RFC is appreciated. Additionally, the careful development of this RFC is appreciated. Additionally, the careful
review and comments by Richard Barnes are likewise very much review by Richard Barnes and extensive review by David Black and the
appreciated. rest of the IESG are likewise very much appreciated.
8. Contributors 8. Contributors
The authors thank everyone who has contributed to this document. The authors thank everyone who has contributed to this document.
The people listed here deserve special mention: Ali Begen, Roni Even, The people listed here deserve special mention: Ali Begen, Roni Even,
and Hadriel Kaplan. and Hadriel Kaplan.
9. IANA Considerations 9. IANA Considerations
One media type (audio/G711-0) has been defined and requires IANA One media type (audio/G711-0) has been defined and requires IANA
skipping to change at page 26, line 21 skipping to change at page 27, line 31
resulted in very small G.711.0 frames (less than about 20% of the resulted in very small G.711.0 frames (less than about 20% of the
symbols of the corresponding G.711 input frame). Methods of symbols of the corresponding G.711 input frame). Methods of
introducing padding in the G.711.0 payloads have been provided in the introducing padding in the G.711.0 payloads have been provided in the
G.711.0 RTP payload definition in Section 4.2.2. G.711.0 RTP payload definition in Section 4.2.2.
11. Congestion Control 11. Congestion Control
The G.711 codec is a Constant Bit Rate (CBR) codec which does not The G.711 codec is a Constant Bit Rate (CBR) codec which does not
have a means to regulate the bitrate. The G.711.0 lossless have a means to regulate the bitrate. The G.711.0 lossless
compression algorithm typically compresses the G.711 CBR stream into compression algorithm typically compresses the G.711 CBR stream into
a smaller VBR stream. However, being lossless, it does not possess a lower bandwidth VBR stream. However, being lossless, it does not
means of further reducing the bitrate beyond the G.711.0-based possess means of further reducing the bitrate beyond the
compression result. The G.711.0 RTP payloads can be made arbitrarily G.711.0-based compression result. The G.711.0 RTP payloads can be
large by means of adding optional padding bytes (subject only to MTU made arbitrarily large by means of adding optional padding bytes
limitations). (subject only to MTU limitations).
Therefore, there are no explicit ways to regulate the bitrate of the Therefore, there are no explicit ways to regulate the bitrate of the
transmissions outlined in this RTP Payload format except by means of transmissions outlined in this RTP Payload format except by means of
modulating the number of optional padding bytes in the RTP payload. modulating the number of optional padding bytes in the RTP payload.
12. References 12. References
12.1. Normative References 12.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
 End of changes. 27 change blocks. 
101 lines changed or deleted 160 lines changed or added
