draft-ietf-payload-rtp-aptx-05.txt   rfc7310.txt 
Internet-Draft J. Lindsay
A/V Transport Payloads Working Group H.Foerster Internet Engineering Task Force (IETF) J. Lindsay
Intended status: Standards Track APT Ltd Request for Comments: 7310 H. Foerster
Expires: Aug 05, 2014 February 05, 2014 Category: Standards Track APT Ltd
ISSN: 2070-1721 July 2014
RTP Payload Format for Standard apt-X and Enhanced apt-X Codecs RTP Payload Format for Standard apt-X and Enhanced apt-X Codecs
draft-ietf-payload-rtp-aptx-05
Abstract Abstract
This document specifies a scheme for packetizing Standard apt-X, or This document specifies a scheme for packetizing Standard apt-X or
Enhanced apt-X, encoded audio data into Real-time Transport Protocol Enhanced apt-X encoded audio data into Real-time Transport Protocol
(RTP) packets. The document describes a payload format that permits (RTP) packets. The document describes a payload format that permits
transmission of multiple related audio channels in a single RTP transmission of multiple related audio channels in a single RTP
payload, and a means of establishing Standard apt-X and Enhanced payload and a means of establishing Standard apt-X and Enhanced apt-X
apt-X connections through the Session Description Protocol (SDP). connections through the Session Description Protocol (SDP).
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.
Comments are solicited and should be addressed to the A/V Transport
Payloads working group's mailing list at payload@ietf.org and/or the
authors.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at Status of This Memo
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on Aug 05, 2014. This is an Internet Standards Track document.
Submission Compliance for Internet-Drafts. This document is a product of the Internet Engineering Task Force
(IETF). It represents the consensus of the IETF community. It has
received public review and has been approved for publication by the
Internet Engineering Steering Group (IESG). Further information on
Internet Standards is available in Section 2 of RFC 5741.
This Internet-Draft is submitted in full conformance with the Information about the current status of this document, any errata,
provisions of BCP 78 and BCP 79. and how to provide feedback on it may be obtained at
http://www.rfc-editor.org/info/rfc7310.
Copyright and License Notice Copyright Notice
Copyright (c) 2014 IETF Trust and the persons identified as the Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License. described in the Simplified BSD License.
Table of Contents Table of Contents
1. Introduction   1. Introduction
2. Conventions   2. Conventions
3. Standard apt-X and Enhanced apt-X Codecs   3. Standard apt-X and Enhanced apt-X Codecs
4. Payload Format Capabilities   4. Payload Format Capabilities
4.1. Use of Forward Error Correction (FEC)   4.1. Use of Forward Error Correction (FEC)
5. Payload Format   5. Payload Format
5.1. RTP Header Usage   5.1. RTP Header Usage
5.2. Payload Structure   5.2. Payload Structure
5.3. Default Packetization Interval   5.3. Default Packetization Interval
5.4. Implementation Considerations   5.4. Implementation Considerations
5.5. Payload Example   5.5. Payload Example
6. Payload Format Parameters   6. Payload Format Parameters
6.1. Media Type Definition   6.1. Media Type Definition
6.2. Mapping to SDP   6.2. Mapping to SDP
6.2.1. SDP Usage Example   6.2.1. SDP Usage Examples
6.2.2. Offer/Answer Considerations   6.2.2. Offer/Answer Considerations
7. IANA Considerations   7. IANA Considerations
8. Security Considerations   8. Security Considerations
9. Acknowledgements   9. Acknowledgements
10. References   10. References
10.1. Normative References   10.1. Normative References
10.2. Informative References   10.2. Informative References
11. Authors' Addresses
1. Introduction 1. Introduction
This document specifies the payload format for packetization of audio This document specifies the payload format for packetization of audio
data, encoded with the Standard apt-X or Enhanced apt-X audio coding data encoded with the Standard apt-X or Enhanced apt-X audio coding
algorithms, into the Real-time Transport Protocol (RTP). [RFC3550]. algorithms into the Real-time Transport Protocol (RTP) [RFC3550].
The document outlines some conventions, a brief description of the The document outlines some conventions, a brief description of the
operating principles of the audio codecs, and the payload format operating principles of the audio codecs, and the payload format
capabilities. The RTP payload format is detailed and a relevant capabilities. The RTP payload format is detailed, and a relevant
example of the format is provided. The media type, its mappings to example of the format is provided. The media type, its mappings to
SDP [RFC4566] and its usage in the SDP offer/answer model are also SDP [RFC4566], and its usage in the SDP offer/answer model are also
specified. Finally, some security considerations are outlined. specified. Finally, some security considerations are outlined.
This document registers a media type (audio/aptx) for the RTP payload This document registers a media type (audio/aptx) for the RTP payload
format for the Standard apt-X and Enhanced apt-X audio codecs. format for the Standard apt-X and Enhanced apt-X audio codecs.
2. Conventions 2. Conventions
This document uses the normal IETF bit-order representation. Bit This document uses the normal IETF bit-order representation. Bit
fields in figures are read left to right and then down. The leftmost fields in figures are read left to right and then down. The leftmost
bit in each field is the most significant. The numbering starts from bit in each field is the most significant. The numbering starts from
0 and ascends, where bit 0 will be the most significant. 0 and ascends, where bit 0 will be the most significant.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119]. document are to be interpreted as described in RFC 2119 [RFC2119].
3. Standard apt-X and Enhanced apt-X Codecs 3. Standard apt-X and Enhanced apt-X Codecs
Standard apt-X and Enhanced apt-X are proprietary audio coding Standard apt-X and Enhanced apt-X are proprietary audio coding
algorithms, which can be licensed from CSR plc. and are widely algorithms, which can be licensed from CSR plc. and are widely
deployed in a variety of audio processing equipment. deployed in a variety of audio processing equipment. For commercial
For commercial reasons, the detailed internal operations of these reasons, the detailed internal operations of these algorithms are not
algorithms are not described in standards or reference documents. described in standards or reference documents. However, the data
However, the data interfaces to implementations of these algorithms interfaces to implementations of these algorithms are very simple and
are very simple, and allow easy RTP packetization of data coded allow easy RTP packetization of data coded with the algorithms
with the algorithms, without a detailed knowledge of the actual without detailed knowledge of the actual coded audio stream syntax.
coded audio stream syntax.
Both the Standard apt-X and Enhanced apt-X coding algorithms are Both the Standard apt-X and Enhanced apt-X coding algorithms are
based on Adaptive Differential Pulse Code Modulation principles. based on Adaptive Differential Pulse Code Modulation principles.
They produce a constant coded bit rate that is scaled according to They produce a constant coded bit rate that is scaled according to
the sample frequency of the uncoded audio. This constant rate is 1/4 the sample frequency of the uncoded audio. This constant rate is 1/4
of the bit rate of the uncoded audio, irrespective of the resolution of the bit rate of the uncoded audio, irrespective of the resolution
(number of bits) used to represent an uncoded audio sample. For (number of bits) used to represent an uncoded audio sample. For
example, a 1.536 Mbit/s stereo audio stream, composed of 2 channels example, a 1.536-Mbit/s stereo audio stream composed of two channels
of 16-bit Pulse Code Modulated (PCM) audio that is sampled at a of 16-bit Pulse Code Modulated (PCM) audio that is sampled at a
frequency of 48 kHz, is encoded at 384 kbit/s. frequency of 48 kHz is encoded at 384 kbit/s.
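The 4:1 relationship described above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the specification; the function names are hypothetical and chosen only for illustration.

```python
def uncoded_bit_rate(channels: int, bits_per_sample: int, sample_rate_hz: int) -> int:
    """Bit rate of the PCM input in bit/s."""
    return channels * bits_per_sample * sample_rate_hz


def coded_bit_rate(channels: int, bits_per_sample: int, sample_rate_hz: int) -> int:
    """Bit rate after apt-X coding, which is 1/4 of the uncoded rate.

    Four PCM samples become one coded sample of the same width, so the
    division is exact for the 16-bit and 24-bit resolutions in the text.
    """
    return uncoded_bit_rate(channels, bits_per_sample, sample_rate_hz) // 4


# The stereo example from the text: 2 channels, 16-bit PCM, 48 kHz.
print(uncoded_bit_rate(2, 16, 48000))  # 1536000 bit/s (1.536 Mbit/s)
print(coded_bit_rate(2, 16, 48000))    # 384000 bit/s (384 kbit/s)
```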
Standard apt-X and Enhanced apt-X do not enforce a coded frame Standard apt-X and Enhanced apt-X do not enforce a coded frame
structure, and the coded data forms a continuous coded sample stream structure, and the coded data forms a continuous coded sample stream
with each coded sample capable of regenerating 4 PCM samples when with each coded sample capable of regenerating four PCM samples when
decoded. The Standard apt-X algorithm encodes 4 successive 16-bit decoded. The Standard apt-X algorithm encodes four successive 16-bit
PCM samples from each audio channel into a single 16-bit coded sample PCM samples from each audio channel into a single 16-bit coded sample
per audio channel. The Enhanced apt-X algorithm encodes 4 successive per audio channel. The Enhanced apt-X algorithm encodes four
16-bit or 24-bit PCM samples from each audio channel and respectively successive 16-bit or 24-bit PCM samples from each audio channel and
produces a single 16-bit or 24-bit coded sample per channel. The respectively produces a single 16-bit or 24-bit coded sample per
same RTP packetisation rules apply for each of these algorithmic channel. The same RTP packetization rules apply for each of these
variations. algorithmic variations.
Standard apt-X and Enhanced apt-X coded data streams can optionally Standard apt-X and Enhanced apt-X coded data streams can optionally
carry synchronisation information and an auxiliary data channel carry synchronization information and an auxiliary data channel
within the coded audio data without additional overhead. These within the coded audio data without additional overhead. These
mechanisms can, for instance, be used when the IP system is cascaded mechanisms can, for instance, be used when the IP system is cascaded
with another transportation system and the decoder is acting as a with another transportation system and the decoder is acting as a
simple bridge between the two systems. Since auxiliary data channel simple bridge between the two systems. Since auxiliary data channel
and synchronisation information are carried within the coded audio and synchronization information are carried within the coded audio
data without additional overhead, RTP payload format rules do not data without additional overhead, RTP payload format rules do not
change if they are present. Out-of-band signalling is required change if they are present. Out-of-band signaling is required,
however to notify the receiver end when autosync and auxiliary data however, to notify the receiver end when autosync and auxiliary data
have been embedded in the apt-X stream. have been embedded in the apt-X stream.
Embedded auxiliary data is typically used to transport non-audio Embedded auxiliary data is typically used to transport non-audio data
data, and timecode information for synchronisation with video. The and timecode information for synchronization with video. The bit
bit rate of the auxiliary data channel is 1/4 of the sample rate of the auxiliary data channel is 1/4 of the sample frequency.
frequency. For example with a single audio channel encoded at Fs = For example, with a single audio channel encoded at Fs = 48 kHz, an
48kHz, an auxiliary data bit rate of 12 kbit/s can be embedded. auxiliary data bit rate of 12 kbit/s can be embedded.
apt-X further provides a means of stereo pairing apt-X channels so apt-X further provides a means of stereo-pairing apt-X channels so
that the embedded autosync and auxiliary data channel can be shared that the embedded autosync and auxiliary data channel can be shared
across the channel pair. In the case of a 1.536 Mbit/s stereo audio across the channel pair. In the case of a 1.536-Mbit/s stereo audio
stream, composed of 2 channels of 16-bit PCM audio that is sampled stream composed of two channels of 16-bit PCM audio that is sampled
at 48 kHz, a byte of auxiliary data would typically be fed into the at 48 kHz, a byte of auxiliary data would typically be fed into the
Standard or Enhanced apt-X encoder once every 32 uncoded left Standard apt-X or Enhanced apt-X encoder once every 32 uncoded left
channel samples. By default apt-X channels pairing is not enabled. channel samples. By default, apt-X channel-pairing is not enabled.
Out-of-band signalling is required to notify the receiver when the Out-of-band signaling is required to notify the receiver when the
option is being used. option is being used.
Standard apt-X and Enhanced apt-X decoders that have not be set up Standard apt-X and Enhanced apt-X decoders that have not been set up
with the correct embedded autosync, auxiliary data and stereo pairing with the correct embedded autosync, auxiliary data, and
information will playout uncoded PCM samples with a loss of decoding stereo-pairing information will play out uncoded PCM samples with a
quality. In the case of standard apt-X the loss of quality can be loss of decoding quality. In the case of Standard apt-X, the loss of
significant. quality can be significant.
Further details on the algorithm operation can be obtained from Further details on the algorithm operation can be obtained from
CSR plc. CSR plc.
Corporate HQ Corporate HQ
Churchill House Churchill House
Cambridge Business Park Cambridge Business Park
Cowley Road Cowley Road
Cambridge Cambridge
CB4 0WZ CB4 0WZ
UK UK
Tel: +44 1223 692000 Tel: +44 1223 692000
Fax: +44 1223 692001 Fax: +44 1223 692001
http://www.csr.com <http://www.csr.com>
4. Payload Format Capabilities 4. Payload Format Capabilities
This RTP payload format carries an integer number of Standard apt-X This RTP payload format carries an integer number of Standard apt-X
or Enhanced apt-X coded audio samples. When multiple related audio or Enhanced apt-X coded audio samples. When multiple related audio
channels are being conveyed within the payload, each channel channels are being conveyed within the payload, each channel
contributes the same integer number of apt-X coded audio samples to contributes the same integer number of apt-X coded audio samples to
the total carried by the payload. the total carried by the payload.
4.1. Use of Forward Error Correction (FEC) 4.1. Use of Forward Error Correction (FEC)
Standard apt-X and Enhanced apt-X do not inherently provide any Standard apt-X and Enhanced apt-X do not inherently provide any
mechanism for adding redundancy or error-control coding into the mechanism for adding redundancy or error-control coding into the
coded audio stream. Generic forward error correction schemes for RTP coded audio stream. Generic schemes for RTP, such as forward error
such as RFC 2198 [RFC2198] and RFC 5109 [RFC5109] can be used to add correction as described in RFC 5109 [RFC5109] and RFC 2733 [RFC2733],
redundant information to Standard apt-X and Enhanced apt-X RTP packet can be used to add redundant information to Standard apt-X and
streams, making them more resilient to packet losses at the expense Enhanced apt-X RTP packet streams, making them more resilient to
of a higher bit rate. packet losses at the expense of a higher bit rate.
5. Payload Format 5. Payload Format
The Standard apt-X and Enhanced apt-X algorithms encode 4 successive The Standard apt-X and Enhanced apt-X algorithms encode four
PCM samples from each audio channel and produce a single compressed successive PCM samples from each audio channel and produce a single
sample for each audio channel. The encoder MUST be presented with compressed sample for each audio channel. The encoder MUST be
an integer number S of input audio samples, where S is an arbitrary presented with an integer number S of input audio samples, where S is
multiple of 4. The encoder will produce exactly S/4 coded audio an arbitrary multiple of 4. The encoder will produce exactly S/4
samples. Since each coded audio sample is either 16 or 24 bits, the coded audio samples. Since each coded audio sample is either 16 or
amount of coded audio data produced upon each invocation of the 24 bits, the amount of coded audio data produced upon each invocation
encoding process will be an integer number of bytes. RTP of the encoding process will be an integer number of bytes. RTP
packetization of the encoded data SHALL be on a byte-by-byte basis. packetization of the encoded data SHALL be on a byte-by-byte basis.
5.1. RTP Header Usage 5.1. RTP Header Usage
Utilization of the Standard apt-X and Enhanced apt-X coding Utilization of the Standard apt-X and Enhanced apt-X coding
algorithms does not create any special requirements with respect to algorithms does not create any special requirements with respect to
the contents of the RTP packet header. Other RTP packet header the contents of the RTP packet header. Other RTP packet header
fields are defined as follows. fields are defined as follows.
o V - As per [RFC3550] o V - As per [RFC3550]
o P - As per [RFC3550] o P - As per [RFC3550]
o X - As per [RFC3550] o X - As per [RFC3550]
o CC - As per [RFC3550] o CC - As per [RFC3550]
o M - As per [RFC3551] Section 4.1 o M - As per [RFC3550] and [RFC3551] Section 4.1
o PT - A dynamic payload type; MUST be used [RFC3551]
o PT - A dynamic payload type, MUST be used. [RFC3551]
o SN - As per [RFC3550] o SN (sequence number) - As per [RFC3550]
o Timestamp - As per [RFC3550]. The RTP timestamp reflects the o Timestamp - As per [RFC3550]. The RTP timestamp reflects the
instant at which the first audio sample in the packet was sampled, instant at which the first audio sample in the packet was sampled,
that is, the oldest information in the packet. that is, the oldest information in the packet.
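As an illustration of the header usage listed above, the sketch below assembles the 12-byte fixed RTP header defined in RFC 3550 with no padding, no extension, and no contributing sources. The function name is hypothetical. The SSRC field is not discussed in this section and is included only because RFC 3550 requires it, and the 96-127 range check reflects the dynamic payload type range of RFC 3551.

```python
import struct


def build_rtp_header(payload_type: int, sequence_number: int,
                     timestamp: int, ssrc: int, marker: bool = False) -> bytes:
    """Pack a fixed RTP header (V=2, P=0, X=0, CC=0) in network byte order."""
    if not 96 <= payload_type <= 127:
        # apt-X has no static assignment; a dynamic payload type MUST be used.
        raise ValueError("payload type must be dynamic (96-127)")
    byte0 = 2 << 6                                # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | payload_type     # M bit and PT
    return struct.pack("!BBHII", byte0, byte1,
                       sequence_number & 0xFFFF,
                       timestamp & 0xFFFFFFFF,
                       ssrc & 0xFFFFFFFF)


header = build_rtp_header(payload_type=96, sequence_number=1,
                          timestamp=192, ssrc=0x12345678)
print(header.hex())
```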
Header field abbreviations are defined as follows. Header field abbreviations are defined as follows.
V - Version Number o V - Version Number
P - Padding o P - Padding
X - Extensions o X - Extensions
CC - Count of contributing sources o CC - Count of contributing sources
M - Marker
PT - Payload Type o M - Marker
PS - Payload Structure o PT - Payload Type
5.2. Payload Structure o PS - Payload Structure
5.2. Payload Structure
The RTP payload data for Standard apt-X and Enhanced apt-X MUST be The RTP payload data for Standard apt-X and Enhanced apt-X MUST be
structured as follows. structured as follows.
Standard and Enhanced apt-X coded samples are packed contiguously Standard apt-X and Enhanced apt-X coded samples are packed
into payload octets in "network byte order", also known as big-endian contiguously into payload octets in "network byte order", also known
order and starting with the most significant bit. Coded samples are as big-endian order, and starting with the most significant bit.
packed into the packet in time sequence beginning with the oldest Coded samples are packed into the packet in time sequence, beginning
coded sample. An integer number of coded samples MUST be within the with the oldest coded sample. An integer number of coded samples
same packet. MUST be within the same packet.
When multiple channels of Standard and E-APTX coded audio, such as When multiple channels of Standard apt-X and Enhanced apt-X coded
in a stereo program, are multiplexed into a single RTP stream, the audio, such as in a stereo program, are multiplexed into a single RTP
coded samples from each channel, at a single sampling instant, are stream, the coded samples from each channel, at a single sampling
interleaved into a coded sample block according to the following instant, are interleaved into a coded sample block according to the
standard audio channel ordering, [RFC3551]. Coded sample blocks are following standard audio channel ordering [RFC3551]. Coded sample
then packed into the packet in time sequence beginning with the blocks are then packed into the packet in time sequence beginning
oldest coded sample block. with the oldest coded sample block.
l left l left
r right r right
c center c center
S surround S surround
F front F front
R rear R rear
channels, description, channel channels description channel
1 2 3 4 5 6 1 2 3 4 5 6
_________________________________________________ ___________________________________________________
2 stereo l r 2 stereo l r
3 l r c 3 l r c
4 l c r S 4 l c r S
5 Fl Fr Fc Sl Sr 5 Fl Fr Fc Sl Sr
6 l lc c r rc S 6 l lc c r rc S
For the two-channel encoding example, the sample sequence is (left For the two-channel encoding example, the sample sequence is (left
channel, first sample), (right channel, first sample), (left channel, channel, first sample), (right channel, first sample), (left channel,
second sample), (right channel, second sample). Coded Samples for all second sample), (right channel, second sample). Coded samples for
channels, belonging to a single coded sampling instant, MUST be all channels, belonging to a single coded sampling instant, MUST be
contained in the same packet. All channels in the same RTP stream contained in the same packet. All channels in the same RTP stream
MUST be sampled at the same frequency. MUST be sampled at the same frequency.
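The interleaving rule above can be sketched as follows. This is a minimal illustration, assuming that coded samples are already available as unsigned integers, one list per channel in the channel order of the table; the function name and the data representation are hypothetical.

```python
def pack_payload(channels: list[list[int]], bytes_per_sample: int) -> bytes:
    """Interleave coded samples into sample blocks, oldest block first.

    `channels[c][t]` is the coded sample of channel c at coded sampling
    instant t. Each sample is written big-endian (network byte order).
    """
    if bytes_per_sample not in (2, 3):
        raise ValueError("coded samples are 16 or 24 bits wide")
    if not channels:
        raise ValueError("at least one channel is required")
    count = len(channels[0])
    if any(len(ch) != count for ch in channels):
        # Every channel contributes the same number of coded samples.
        raise ValueError("channels must carry the same number of samples")
    payload = bytearray()
    for t in range(count):            # time sequence, oldest first
        for ch in channels:           # one sample block: l, r, ...
            payload += ch[t].to_bytes(bytes_per_sample, "big")
    return bytes(payload)


# Two-channel, 16-bit example: L1, R1, L2, R2.
left, right = [0x0102, 0x0506], [0x0304, 0x0708]
print(pack_payload([left, right], 2).hex())  # 0102030405060708
```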
5.3. Default Packetization Interval 5.3. Default Packetization Interval
The default packetization interval MUST have a duration of 4 The default packetization interval MUST have a duration of
milliseconds. When an integer number of coded samples per channel 4 milliseconds. When an integer number of coded samples per channel
cannot be contained within this 4 milliseconds interval, the default cannot be contained within this 4-millisecond interval, the default
packet interval MUST be rounded down to the nearest packet interval packet interval MUST be rounded down to the nearest packet interval
that can contain a complete integer set of coded samples. that can contain a complete integer set of coded samples. For
For example when encoding audio with either Standard or Enhanced example, when encoding audio with either Standard apt-X or Enhanced
apt-X, sampled at 11025 Hz, 22050 Hz, or 44100 Hz, the packetization apt-X, sampled at 11025 Hz, 22050 Hz, or 44100 Hz, the packetization
interval MUST be rounded down to 3.99 milliseconds. interval MUST be rounded down to 3.99 milliseconds.
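The rounding rule can be worked through numerically. The sketch below is illustrative only (the function name is hypothetical): it computes how many whole coded samples per channel fit in the 4-millisecond target, then derives the resulting interval, remembering that each coded sample represents four PCM samples.

```python
from fractions import Fraction


def default_packet_interval(sample_rate_hz: int, target_ms: int = 4):
    """Return (coded samples per channel, packet interval in ms).

    The interval is rounded down to the largest value not exceeding
    `target_ms` that holds a whole number of coded samples.
    """
    pcm_per_coded = 4
    coded = (sample_rate_hz * target_ms) // (1000 * pcm_per_coded)
    interval_ms = Fraction(coded * pcm_per_coded * 1000, sample_rate_hz)
    return coded, interval_ms


for rate in (48000, 44100, 22050, 11025):
    coded, interval = default_packet_interval(rate)
    print(rate, coded, round(float(interval), 2))
# 48000 Hz yields 48 coded samples and exactly 4 ms; the 44.1 kHz
# family yields 44, 22, and 11 coded samples at roughly 3.99 ms.
```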
The packetization interval sets limits on the end-to-end delay; The packetization interval sets limits on the end-to-end delay;
shorter packets minimize the audio delay through a system at the shorter packets minimize the audio delay through a system at the
expense of increased bandwidth while longer packets introduce expense of increased bandwidth, while longer packets introduce less
less header overhead but increase delay and make packet loss header overhead but increase delay and make packet loss more
more noticeable. A default packet interval of 4 milliseconds noticeable. A default packet interval of 4 milliseconds maintains an
maintains an acceptable ratio of payload to header bytes and acceptable ratio of payload to header bytes and minimizes the
minimizes the end-to-end delay to allow viable interactive end-to-end delay to allow viable interactive applications based on
apt-X based applications. All implementations MUST support this apt-X. All implementations MUST support this default packetization
default packetization interval. interval.
5.4. Implementation Considerations 5.4. Implementation Considerations
An application implementing this payload format MUST understand all An application implementing this payload format MUST understand all
the payload parameters that are defined in this specification. Any the payload parameters that are defined in this specification. Any
mapping of these parameters to a signalling protocol MUST support all mapping of these parameters to a signaling protocol MUST support all
parameters. Implementation can always decide whether they are parameters. Implementations can always decide whether they are
capable of communicating based on the entities defined in this capable of communicating based on the entities defined in this
specification. specification.
5.5. Payload Example 5.5. Payload Example
As an example payload format, consider the transmission of an As an example payload format, consider the transmission of an
arbitrary 5.1 audio signal consisting of 6 channels of 24-bit PCM arbitrary 5.1 audio signal consisting of six channels of 24-bit PCM
data, sampled at a rate of 48 kHz and packetized on a RTP packet data, sampled at a rate of 48 kHz and packetized on an RTP packet
interval of 4 milliseconds. The total bit rate before audio coding interval of 4 milliseconds. The total bit rate before audio coding
is 6 * 24 * 48000 = 6.912 Mbits/s. Applying Enhanced apt-X coding, is 6 * 24 * 48000 = 6.912 Mbit/s. Applying Enhanced apt-X coding
with a coded sample size of 24 bits, results in a transmitted coded with a coded sample size of 24 bits results in a transmitted coded
bit rate of 1/4 of the uncoded bit rate, i.e. 1.728 Mbit/s. On packet bit rate of 1/4 of the uncoded bit rate, i.e., 1.728 Mbit/s. On
intervals of 4 milliseconds, packets contain 864 bytes of encoded packet intervals of 4 milliseconds, packets contain 864 bytes of
data that contain 48 Enhanced apt-X coded samples per channel. encoded data that contain 48 Enhanced apt-X coded samples per
channel.
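The figures in this example can be reproduced with the sketch below. The function name is hypothetical. The timestamp increment it reports is an inference rather than a statement from this section: it assumes the RTP clock rate equals the sampling rate, as the media type definition specifies, so the timestamp advances by the number of PCM samples per channel in each packet.

```python
def packet_sizing(channels: int, coded_bits: int, sample_rate_hz: int,
                  interval_ms: int = 4):
    """Return (coded samples per channel, payload bytes, timestamp increment)."""
    pcm_per_coded = 4
    coded_per_channel = sample_rate_hz * interval_ms // (1000 * pcm_per_coded)
    payload_bytes = coded_per_channel * channels * coded_bits // 8
    # Assumption: RTP clock rate == sampling rate, so one packet advances
    # the timestamp by the PCM samples per channel that it represents.
    timestamp_increment = coded_per_channel * pcm_per_coded
    return coded_per_channel, payload_bytes, timestamp_increment


# Six channels, 24-bit Enhanced apt-X, 48 kHz, 4 ms packets.
print(packet_sizing(6, 24, 48000))  # (48, 864, 192)
```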
For the example format, the diagram below shows how coded samples For the example format, the diagram below shows how coded samples
from each channel are packed into a sample block and how sample from each channel are packed into a sample block and how sample
blocks 1, 2, and 48 are subsequently packed into the RTP packet. blocks 1, 2, and 48 are subsequently packed into the RTP packet.
C: C:
Channel index: Left (l) = 1, left centre (lc) = 2, centre Channel index: Left (l) = 1, left center (lc) = 2,
(c) = 3, right (r) = 4, right centre (rc) = 5, surround (S) = 6. center (c) = 3, right (r) = 4, right center (rc) = 5,
and surround (S) = 6.
T: T:
Sample Block time index: The first sample block is 1, the final Sample Block time index: The first sample block is 1; the final
sample is 48. sample is 48.
S(C)(T): S(C)(T):
The Tth sample from channel C The Tth sample from channel C.
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| S(1)(1) | S(2)(1) | | S(1)(1) | S(2)(1) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| S(2)(1) | S(3)(1) | | S(2)(1) | S(3)(1) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| S(3)(1) | S(4)(1) | | S(3)(1) | S(4)(1) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
skipping to change at page 13, line 17 skipping to change at page 10, line 5
MSB: MSB:
Most Significant Byte of a 24-bit coded sample Most Significant Byte of a 24-bit coded sample
MB: MB:
Middle Byte of a 24-bit coded sample Middle Byte of a 24-bit coded sample
LSB: LSB:
Least Significant Byte of a 24-bit coded sample Least Significant Byte of a 24-bit coded sample
0 1 2 3 0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| MSB | MB | LSB | | | MSB | MB | LSB | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
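On the receiving side, the byte layout above can be reversed to recover per-channel coded samples. The following is a minimal sketch under the assumption that the channel count and coded sample width are already known from out-of-band signaling; the function name and the list-of-lists return type are hypothetical.

```python
def unpack_payload(payload: bytes, num_channels: int,
                   bytes_per_sample: int) -> list[list[int]]:
    """Split a payload into per-channel lists of coded samples.

    Each coded sample is read big-endian: MSB first, then (for 24-bit
    samples) the middle byte, then the LSB.
    """
    if num_channels < 1 or bytes_per_sample not in (2, 3):
        raise ValueError("unsupported channel count or sample width")
    block_size = num_channels * bytes_per_sample
    if len(payload) % block_size != 0:
        # A packet must hold a whole number of coded sample blocks.
        raise ValueError("payload is not a whole number of sample blocks")
    channels = [[] for _ in range(num_channels)]
    for offset in range(0, len(payload), bytes_per_sample):
        index = (offset // bytes_per_sample) % num_channels
        sample = int.from_bytes(payload[offset:offset + bytes_per_sample], "big")
        channels[index].append(sample)
    return channels


data = bytes([0x12, 0x34, 0x56, 0xAB, 0xCD, 0xEF])
print([[hex(s) for s in ch] for ch in unpack_payload(data, 2, 3)])
# [['0x123456'], ['0xabcdef']]
```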
6. Payload Format Parameters 6. Payload Format Parameters
This RTP payload format is identified using the media type audio/ This RTP payload format is identified using the media type
aptx, which is registered in accordance with RFC 4855 [RFC4855] and audio/aptx, which is registered in accordance with RFC 4855 [RFC4855]
using the template of RFC 6838 [RFC6838] and using the template of RFC 6838 [RFC6838].
6.1. Media Type Definition 6.1. Media Type Definition
Type name: audio Type name: audio
Subtype name: aptx Subtype name: aptx
Required parameters: Required parameters:
rate: rate:
RTP timestamp clock rate, which is equal to the sampling rate RTP timestamp clock rate, which is equal to the sampling rate
in Hz. -- RECOMMENDED values for rate are 8000, 11025, 16000, in Hz. RECOMMENDED values for rate are 8000, 11025, 16000,
22050, 24000, 32000, 44100 and 48000 samples per second. Other 22050, 24000, 32000, 44100, and 48000 samples per second. Other
values are permissible. values are permissible.
channels: channels:
The number of logical audio channels that are present in the The number of logical audio channels that are present in the
audio stream. audio stream.
variant: variant:
The variant of apt-X (i.e. Standard or Enhanced) that is being The variant of apt-X (i.e., Standard or Enhanced) that is being
used. The following variants can be signalled: used. The following variants can be signaled:
variant=standard variant=standard
variant=enhanced variant=enhanced
bitresolution: bitresolution:
The number of bits used by the algorithm to encode 4 PCM The number of bits used by the algorithm to encode four PCM
samples. This value MAY only be set to 16 for Standard apt-X samples. This value MAY only be set to 16 for Standard apt-X
and 16 or 24 for Enhanced apt-X. and 16 or 24 for Enhanced apt-X.
Optional parameters: Optional parameters:
ptime: ptime:
The recommended length of time (in milliseconds) represented by The recommended length of time (in milliseconds) represented by
the media in a packet. Defaults to 4 milliseconds. the media in a packet. Defaults to 4 milliseconds.
See Section 6 of [RFC4566]. See Section 6 of [RFC4566].
maxptime: maxptime:
The maximum length of time (in milliseconds) that can be The maximum amount of media that can be encapsulated in each
encapsulated in a packet. See Section 6 of [RFC4566]. packet, expressed as time in milliseconds. See Section 6 of
[RFC4566].
stereo-channel-pairs: stereo-channel-pairs:
Defines audio channels that are stereo paired in the stream. Defines audio channels that are stereo paired in the stream.
See Section 3. Each pair of audio channels is defined as two See Section 3. Each pair of audio channels is defined as two
comma-separated values that correspond to channel numbers in comma-separated values that correspond to channel numbers in
the range 1..channels. Each stereo channel pair is preceded the range 1..channels. Each stereo channel pair is preceded
by a '{', and followed by a '}'. Pairs of audio channels are by a '{' and followed by a '}'. Pairs of audio channels are
separated by a comma. A channel MUST NOT be paired with more separated by a comma. A channel MUST NOT be paired with more
than one other channel. Absence of this parameter signals that than one other channel. The absence of this parameter signals
each channel has been independently encoded. that each channel has been independently encoded.
embedded-autosync-channels: embedded-autosync-channels:
Defines channels that carry embedded autosync. Embedded- Defines channels that carry embedded autosync.
autosync-channels are defined as a list of comma-separated Embedded-autosync-channels is defined as a list of
values that correspond to channel numbers in the range 1.. comma-separated values that correspond to channel numbers in
channels. When a channel is stereo paired, embedded autosync the range 1..channels. When a channel is stereo paired, embedded
is shared across channels in the pair. The first channel autosync is shared across channels in the pair. The first channel
as defined in stereo-channel-pairs MUST be specified in the as defined in stereo-channel-pairs MUST be specified in the
embedded-autosync-channels list. embedded-autosync-channels list.
embedded-aux-channels: embedded-aux-channels:
Defines channels that carry embedded auxiliary data. Embedded- Defines channels that carry embedded auxiliary data.
aux-channels are defined as a list of comma-separated values Embedded-aux-channels is defined as a list of comma-separated
that correspond to channel numbers in the range 1..channels. values that correspond to channel numbers in the range
When a channel is stereo paired, embedded auxiliary data is 1..channels. When a channel is stereo paired, embedded auxiliary
shared across channels in the pair. The second channel as data is shared across channels in the pair. The second channel
defined in stereo-channel-pairs MUST be specified in the as defined in stereo-channel-pairs MUST be specified in the
embedded-autosync-channels list. embedded-aux-channels list.
Encoding considerations: This media type is framed in RTP and Encoding considerations: This media type is framed in RTP and
contains binary data; see Section 4.8 of [RFC6838]. contains binary data; see Section 4.8 of [RFC6838].
Security considerations: See Section 5 of [RFC4855] and Section 4 Security considerations: See Section 5 of [RFC4855] and Section 4
of [RFC4856]. of [RFC4856].
Interoperability considerations: none Interoperability considerations: none
Published specification: RFC 7310
Published specification: RFC XXXX Applications which use this media type: Audio streaming
Applications which use this media type: Audio streaming Fragment identifier considerations: None
Fragment identifier considerations: None Additional information: none
Additional information: none Person & email address to contact for further information:
John Lindsay <Lindsay@worldcastsystems.com>
Person & email address to contact for further information: John Intended usage: COMMON
Lindsay email:lindsay@worldcastsystems.com
Intended usage: Common
Restrictions on usage: This media type depends on RTP framing, Restrictions on usage: This media type depends on RTP framing,
and hence is only defined for transfer via RTP [RFC3550]. and hence is only defined for transfer via RTP [RFC3550].
Author/Change controller: "IETF Payload Working Group delegated Author/Change controller: IETF Payload Working Group delegated
from the IESG" from the IESG.
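Two of the constraints in the media type definition lend themselves to a mechanical check: the bitresolution values allowed per variant, and the rules for stereo-channel-pairs (channel numbers in the range 1..channels, and no channel in more than one pair). The sketch below is one possible way to check them; the function names are hypothetical, and rejecting a channel paired with itself is an assumption, not something the text states.

```python
import re

# One {a,b} group, as described for stereo-channel-pairs.
_PAIR_RE = re.compile(r"\{(\d+),(\d+)\}")
# The full value: one or more groups separated by commas.
_LIST_RE = re.compile(r"\{\d+,\d+\}(,\{\d+,\d+\})*")


def parse_stereo_channel_pairs(value: str, channels: int) -> list[tuple[int, int]]:
    """Parse a value such as '{1,2},{3,4}' and enforce range and uniqueness."""
    if not _LIST_RE.fullmatch(value):
        raise ValueError(f"malformed stereo-channel-pairs: {value!r}")
    pairs = [(int(a), int(b)) for a, b in _PAIR_RE.findall(value)]
    seen = set()
    for pair in pairs:
        for ch in pair:
            if not 1 <= ch <= channels:
                raise ValueError(f"channel {ch} outside 1..{channels}")
            if ch in seen:
                # Also rejects {n,n}; that case is an assumption of this sketch.
                raise ValueError(f"channel {ch} appears in more than one pairing")
            seen.add(ch)
    return pairs


def check_bitresolution(variant: str, bitresolution: int) -> None:
    """Raise ValueError unless bitresolution is permitted for the variant."""
    allowed = {"standard": {16}, "enhanced": {16, 24}}
    if variant not in allowed:
        raise ValueError(f"unknown variant: {variant!r}")
    if bitresolution not in allowed[variant]:
        raise ValueError(f"bitresolution {bitresolution} not valid for {variant}")
```

For example, parse_stereo_channel_pairs("{1,2},{3,4}", 6) yields the two pairs, while "{1,2},{2,3}" is rejected because channel 2 would be paired twice.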
6.2. Mapping to SDP 6.2. Mapping to SDP
The information carried in the media type specification has a The information carried in the media type specification has a
specific mapping to fields in the Session Description Protocol (SDP) specific mapping to fields in the Session Description Protocol (SDP)
[RFC4566] that is commonly used to describe RTP sessions. When SDP [RFC4566] that is commonly used to describe RTP sessions. When SDP
is used to describe sessions the media type mappings are as follows. is used to describe sessions, the media type mappings are as follows.
The type name ("audio") goes in SDP "m=" as the media name. o The type name ("audio") goes in SDP "m=" as the media name.
The subtype name ("aptx") goes in SDP "a=rtpmap" as the encoding o The subtype name ("aptx") goes in SDP "a=rtpmap" as the encoding
name. name.
The parameter "rate" also goes in "a=rtpmap" as clock rate. o The parameter "rate" also goes in "a=rtpmap" as the clock rate.
The parameter "channels" also goes in "a=rtpmap" as channel count. o The parameter "channels" also goes in "a=rtpmap" as the channel
count.
The parameter "maxptime" when present, MUST be included in the o The parameter "maxptime", when present, MUST be included in the
SDP "a=maxptime" attribute. SDP "a=maxptime" attribute.
The required parameters "variant" and "bitresolution" MUST be o The required parameters "variant" and "bitresolution" MUST be
included in the SDP "a=fmtp" attribute. included in the SDP "a=fmtp" attribute.
The optional parameters "stereo-channel-pairs", "embedded- o The optional parameters "stereo-channel-pairs",
autosync-channels", "embedded-aux-channels" when present, "embedded-autosync-channels", and "embedded-aux-channels", when
MUST be included in the SDP "a=fmtp" attribute. present, MUST be included in the SDP "a=fmtp" attribute.
The parameter "ptime", when present, goes in a separate SDP o The parameter "ptime", when present, goes in a separate SDP
attribute field and is signalled as "a=ptime:<value>", where attribute field and is signaled as "a=ptime:<value>", where
<value> is the number of milliseconds of audio represented by one <value> is the number of milliseconds of audio represented by
RTP packet. See Section 6 of [RFC4566]. one RTP packet. See Section 6 of [RFC4566].
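The mapping rules above can be sketched as a small function that turns media type parameters into SDP lines. This is an illustrative sketch only: the function name build_aptx_sdp and its argument names are hypothetical, the RTP/AVP profile is taken from the usage examples in this document, and the "; " separator between fmtp parameters mirrors those examples.

```python
def build_aptx_sdp(port, payload_type, rate, channels, variant, bitresolution,
                   ptime=None, maxptime=None, fmtp_optional=None):
    """Return the SDP lines for one audio/aptx media description."""
    # Required parameters "variant" and "bitresolution" go in a=fmtp.
    fmtp = [f"variant={variant}", f"bitresolution={bitresolution}"]
    # Optional parameters also go in a=fmtp, but only when present.
    for name in ("stereo-channel-pairs",
                 "embedded-autosync-channels",
                 "embedded-aux-channels"):
        if fmtp_optional and name in fmtp_optional:
            fmtp.append(f"{name}={fmtp_optional[name]}")
    lines = [
        f"m=audio {port} RTP/AVP {payload_type}",            # type name
        f"a=rtpmap:{payload_type} aptx/{rate}/{channels}",   # subtype, rate, channels
        f"a=fmtp:{payload_type} " + "; ".join(fmtp),
    ]
    # ptime and maxptime are separate SDP attributes, not fmtp parameters.
    if ptime is not None:
        lines.append(f"a=ptime:{ptime}")
    if maxptime is not None:
        lines.append(f"a=maxptime:{maxptime}")
    return lines
```

Unlike the printed examples, the sketch emits the a=fmtp attribute on a single unfolded line.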
6.2.1. SDP Usage Examples 6.2.1. SDP Usage Examples
Some example SDP session descriptions utilizing apt-X encodings Some example SDP session descriptions utilizing apt-X encodings
follow. In these examples, long a=fmtp lines are folded to meet the follow. In these examples, long "a=fmtp" lines are folded to meet
column width constraints of this document. the column width constraints of this document.
Example 1: A standard apt-X stream that encodes two independent Example 1: A Standard apt-X stream that encodes two independent
44.1kHz 16-bit PCM channels into a 4 milliseconds RTP packet. 44.1-kHz 16-bit PCM channels into a 4-millisecond RTP packet.
m=audio 5004 RTP/AVP 98 m=audio 5004 RTP/AVP 98
a=rtpmap:98 aptx/44100/2 a=rtpmap:98 aptx/44100/2
a=fmtp:98 variant=standard; bitresolution=16; a=fmtp:98 variant=standard; bitresolution=16;
a=ptime:4 a=ptime:4
Example 2: An enhanced apt-X stream that encodes two 48kHz 24-bit Example 2: An Enhanced apt-X stream that encodes two 48-kHz 24-bit
stereo channels into a 4 milliseconds RTP packet and that carries stereo channels into a 4-millisecond RTP packet and carries both an
both an embedded autosync and auxiliary data channel. embedded autosync and auxiliary data channel.
m=audio 5004 RTP/AVP 98 m=audio 5004 RTP/AVP 98
a=rtpmap:98 aptx/48000/2 a=rtpmap:98 aptx/48000/2
a=fmtp:98 variant=enhanced; bitresolution=24; a=fmtp:98 variant=enhanced; bitresolution=24;
stereo-channel-pairs={1,2}; embedded-autosync-channels=1; stereo-channel-pairs={1,2}; embedded-autosync-channels=1;
embedded-aux-channels=2 embedded-aux-channels=2
a=ptime:4 a=ptime:4
Example 3: An enhanced apt-X stream that encodes six 44.1kHz 24-bit Example 3: An Enhanced apt-X stream that encodes six 44.1-kHz 24-bit
channels into a 6 milliseconds RTP packet. Channels 1,2 and 3,4 are channels into a 6-millisecond RTP packet. Channels 1,2 and 3,4 are
stereo pairs. Both stereo pairs carry both an embedded autosync and stereo pairs. Both stereo pairs carry both an embedded autosync and
auxiliary data channel. auxiliary data channel.
m=audio 5004 RTP/AVP 98 m=audio 5004 RTP/AVP 98
a=rtpmap:98 aptx/44100/6 a=rtpmap:98 aptx/44100/6
a=fmtp:98 variant=enhanced; bitresolution=24; a=fmtp:98 variant=enhanced; bitresolution=24;
stereo-channel-pairs={1,2},{3,4}; embedded-autosync-channels=1,3; stereo-channel-pairs={1,2},{3,4}; embedded-autosync-channels=1,3;
embedded-aux-channels=2,4 embedded-aux-channels=2,4
a=ptime:6 a=ptime:6
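As a rough cross-check on the examples, the amount of coded audio per packet can be estimated from the definition of bitresolution alone (the number of bits used to encode four PCM samples). The sketch below is arithmetic under that assumption only; it does not model the packetization rules defined elsewhere in the specification, and the function name estimated_payload_bytes is hypothetical.

```python
from fractions import Fraction


def estimated_payload_bytes(rate, channels, bitresolution, ptime_ms):
    """Estimate coded audio bytes per packet from the SDP parameters."""
    # PCM samples per channel covered by one packet.
    pcm_per_channel = Fraction(rate * ptime_ms, 1000)
    # Four PCM samples are encoded into one coded sample of `bitresolution` bits.
    coded_per_channel = pcm_per_channel / 4
    return coded_per_channel * channels * Fraction(bitresolution, 8)
```

With the values of Example 2 (48000 Hz, 2 channels, 24 bits, 4 ms) the estimate is 288 bytes. With 44100 Hz the result is not a whole number, which suggests that ptime cannot correspond exactly to a whole number of coded samples at that rate; how a sender handles this is outside the scope of this sketch.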
6.2.2. Offer/Answer Considerations 6.2.2. Offer/Answer Considerations
The only negotiable parameter is the delivery method. All other The only negotiable parameter is the delivery method. All other
parameters are declarative. The offer, as described in [RFC3264], parameters are declarative. The offer, as described in [RFC3264],
may contain a large number of delivery methods per single fmtp may contain a large number of delivery methods per single fmtp
attribute, the answerer MUST remove every delivery method and attribute. The answerer MUST remove every delivery method and
configuration uri not supported. Apart from this exceptional case, configuration URI that is not supported. Apart from this exceptional
all parameters MUST NOT be altered on answer. case, all parameters MUST NOT be altered on answer.
7. IANA Considerations 7. IANA Considerations
One media type (audio/aptx) has been defined and needs registration One media type (audio/aptx) has been registered in the "Media Types"
in the media types registry. See Section 6.1 registry. See Section 6.1.
8. Security Considerations 8. Security Considerations
RTP packets using the payload format defined in this specification RTP packets using the payload format defined in this specification
are subject to the security considerations discussed in the RTP are subject to the security considerations discussed in the RTP
specification [RFC3550], and any appropriate RTP profile (for example specification [RFC3550] and any appropriate RTP profile (for example,
[RFC3551]). This implies that confidentiality of the media streams [RFC3551]). This implies that confidentiality of the media streams
is achieved by encryption. Because the audio coding used with this is achieved by encryption. Because the audio coding used with this
payload format is applied end-to-end, encryption may be performed payload format is applied end to end, encryption may be performed
after audio coding so there is no conflict between the two after audio coding so there is no conflict between the two
operations. A potential denial-of-service threat exists for audio operations. A potential denial-of-service threat exists for audio
coding techniques that have non-uniform receiver-end computational coding techniques that have non-uniform receiver-end computational
load. The attacker can inject pathological datagrams into the stream load. The attacker can inject pathological datagrams into the stream
which are complex to decode and cause the receiver to be overloaded. that are complex to decode and cause the receiver to be overloaded.
However, the Standard apt-X and Enhanced apt-X audio coding However, the Standard apt-X and Enhanced apt-X audio coding
algorithms do not exhibit any significant non-uniformity. As with algorithms do not exhibit any significant non-uniformity. As with
any IP-based protocol, in some circumstances a receiver may be any IP-based protocol, in some circumstances a receiver may be
overloaded simply by the receipt of too many packets, either desired overloaded simply by the receipt of too many packets, either desired
or undesired. Network-layer authentication may be used to discard or undesired. Network-layer authentication may be used to discard
packets from undesired sources, but the processing cost of the packets from undesired sources, but the processing cost of the
authentication itself may be too high. In a multicast environment, authentication itself may be too high. In a multicast environment,
pruning of specific sources may be implemented in future versions of pruning of specific sources may be implemented in future versions of
IGMP [RFC3376] and in multicast routing protocols to allow a receiver IGMP [RFC3376] and in multicast routing protocols to allow a receiver
to select which sources are allowed to reach it. to select which sources are allowed to reach it. [RFC6562] has
[draft-ietf-avtcore-srtp-vbr-audio] has highlighted potential highlighted potential security vulnerabilities of Variable Bit Rate
security vulnerabilities of Variable Bit Rate (VBR) codecs using (VBR) codecs using Secure RTP transmission methods. As the Standard
Secure RTP transmission methods. As the Standard apt-X and Enhanced apt-X and Enhanced apt-X codecs are Constant Bit Rate (CBR) codecs,
apt-X codecs are Constant Bit Rate (CBR) codecs, this security this security vulnerability is therefore not applicable.
vulnerability is therefore not applicable.
9. Acknowledgements 9. Acknowledgements
This specification was facilitated by earlier documents produced by This specification was facilitated by earlier documents produced by
Greg Massey, David Trainer, James Hunter and Derrick Rea along with Greg Massey, David Trainer, James Hunter, and Derrick Rea, along with
practical tests carried out by Paul McCambridge of APT Ltd. practical tests carried out by Paul McCambridge of APT Ltd.
10. References 10. References
10.1. Normative References 10.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997. Requirement Levels", BCP 14, RFC 2119, March 1997.
[RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model [RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
with Session Description Protocol (SDP)", RFC 3264, with Session Description Protocol (SDP)", RFC 3264,
June 2002. June 2002.
[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V.
Jacobson, "RTP: A Transport Protocol for Real-Time Jacobson, "RTP: A Transport Protocol for Real-Time
Applications", STD 64, RFC 3550, July 2003. Applications", STD 64, RFC 3550, July 2003.
[RFC3551] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and
Video Conferences with Minimal Control", STD 65, RFC 3551,
July 2003.
[RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session [RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
Description Protocol", RFC 4566, July 2006. Description Protocol", RFC 4566, July 2006.
[RFC3551] H. Schulzrinne, "RTP profile for audio and video
conferences with minimal control", RFC 3551, July 2003.
10.2. Informative References 10.2. Informative References
[RFC2198] Perkins, C., Kouvelas, I., Hodson, O., Hardman, V., [RFC2733] Rosenberg, J. and H. Schulzrinne, "An RTP Payload Format
Handley, M., Bolot, J., Vega-Garcia, A., and S. Fosse- for Generic Forward Error Correction", RFC 2733,
Parisis, "RTP Payload for Redundant Audio Data", RFC 2198, December 1999.
September 1997.
[RFC3376] Cain, B., Deering, S., Kouvelas, I., Fenner, B., and A. [RFC3376] Cain, B., Deering, S., Kouvelas, I., Fenner, B., and A.
Thyagarajan, "Internet Group Management Protocol, Version Thyagarajan, "Internet Group Management Protocol,
3", RFC 3376, October 2002. Version 3", RFC 3376, October 2002.
[RFC6838] Freed, N.,J. Klensin, and T. Hansen, "Media Type
Specifications and Registration Procedures", BCP 13,
RFC 6838, January 2013.
[RFC4855] Casner, S., "Media Type Registration of RTP Payload [RFC4855] Casner, S., "Media Type Registration of RTP Payload
Formats", RFC 4855, February 2007. Formats", RFC 4855, February 2007.
[RFC4856] Casner, S., "Media Type Registration of Payload Formats in [RFC4856] Casner, S., "Media Type Registration of Payload Formats in
the RTP Profile for Audio and Video Conferences", the RTP Profile for Audio and Video Conferences",
RFC 4856, February 2007. RFC 4856, February 2007.
[RFC5109] Li, A., "RTP Payload Format for Generic Forward Error [RFC5109] Li, A., Ed., "RTP Payload Format for Generic Forward Error
Correction", RFC 5109, December 2007. Correction", RFC 5109, December 2007.
[draft-ietf-avtcore-srtp-vbr-audio] Perkins, C., Valins, JM., [RFC6562] Perkins, C. and JM. Valin, "Guidelines for the Use of
"Guidelines for the Use of Variable Bit Rate Audio with Variable Bit Rate Audio with Secure RTP", RFC 6562,
Secure RTP", draft-ietf-avtcore-srtp-vbr-audio, March 2012. March 2012.
11. Authors' Addresses [RFC6838] Freed, N., Klensin, J., and T. Hansen, "Media Type
Specifications and Registration Procedures", BCP 13,
RFC 6838, January 2013.
Authors' Addresses
John Lindsay John Lindsay
APT Ltd APT Ltd
729 Springfield Road 729 Springfield Road
Belfast Belfast
Northern Ireland Northern Ireland
BT12 7FP BT12 7FP
UK UK
Phone: +44 2890 677200 Phone: +44 2890 677200
Email: Lindsay@worldcastsystems.com EMail: Lindsay@worldcastsystems.com
Hartmut Foerster Hartmut Foerster
APT Ltd APT Ltd
729 Springfield Road 729 Springfield Road
Belfast Belfast
Northern Ireland Northern Ireland
BT12 7FP BT12 7FP
UK UK
Phone: +44 2890 677200 Phone: +44 2890 677200
Email: foerster@worldcastsystems.com EMail: Foerster@worldcastsystems.com
 End of changes. 111 change blocks. 
321 lines changed or deleted 303 lines changed or added
