draft-ietf-payload-rtp-aptx-02.txt   draft-ietf-payload-rtp-aptx-03.txt 
Internet-Draft J. Lindsay Internet-Draft J. Lindsay
A/V Transport Payloads Working Group H.Foerster A/V Transport Payloads Working Group H.Foerster
Intended status: Standards Track APT Ltd Intended status: Standards Track APT Ltd
Expires: April 14, 2014 October 14, 2013 Expires: April 21, 2014 October 22, 2013
RTP Payload Format for Standard apt-X and Enhanced apt-X Codecs RTP Payload Format for Standard apt-X and Enhanced apt-X Codecs
draft-ietf-payload-rtp-aptx-02 draft-ietf-payload-rtp-aptx-03
Abstract Abstract
This document specifies a scheme for packetizing Standard apt-X, or This document specifies a scheme for packetizing Standard apt-X, or
Enhanced apt-X, encoded audio data into Real-time Transport Protocol Enhanced apt-X, encoded audio data into Real-time Transport Protocol
(RTP) packets. The document describes a payload format that permits (RTP) packets. The document describes a payload format that permits
transmission of multiple related audio channels in a single RTP transmission of multiple related audio channels in a single RTP
payload, and a means of establishing Standard apt-X and Enhanced payload, and a means of establishing Standard apt-X and Enhanced
apt-X connections through the Session Description Protocol (SDP). apt-X connections through the Session Description Protocol (SDP).
skipping to change at page 1, line 44 skipping to change at page 1, line 44
and may be updated, replaced, or obsoleted by other documents at any and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress." material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on April 14, 2014. This Internet-Draft will expire on April 21, 2014.
Submission Compliance for Internet-Drafts. Submission Compliance for Internet-Drafts.
This Internet-Draft is submitted in full conformance with the This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79. provisions of BCP 78 and BCP 79.
Copyright and License Notice Copyright and License Notice
Copyright (c) 2013 IETF Trust and the persons identified as the Copyright (c) 2013 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
skipping to change at page 6, line 8 skipping to change at page 6, line 8
bit in each field is the most significant. The numbering starts from bit in each field is the most significant. The numbering starts from
0 and ascends, where bit 0 will be the most significant. 0 and ascends, where bit 0 will be the most significant.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119]. document are to be interpreted as described in RFC 2119 [RFC2119].
3. Standard apt-X and Enhanced apt-X Codecs 3. Standard apt-X and Enhanced apt-X Codecs
Standard apt-X and Enhanced apt-X are proprietary audio coding Standard apt-X and Enhanced apt-X are proprietary audio coding
algorithms, which can be licensed from CSR plc and are widely algorithms, which can be licensed from CSR plc. and are widely
deployed in a variety of audio processing equipment. deployed in a variety of audio processing equipment.
For commercial reasons, the detailed internal operations of these For commercial reasons, the detailed internal operations of these
algorithms are not described in standards or reference documents. algorithms are not described in standards or reference documents.
However, the data interfaces to implementations of these algorithms However, the data interfaces to implementations of these algorithms
are very simple, and allow easy RTP packetization of data coded are very simple, and allow easy RTP packetization of data coded
with the algorithms, without a detailed knowledge of the actual with the algorithms, without a detailed knowledge of the actual
coded audio stream syntax. coded audio stream syntax.
Both the Standard apt-X and Enhanced apt-X coding algorithms are Both the Standard apt-X and Enhanced apt-X coding algorithms are
based on Adaptive Differential Pulse Code Modulation principles. based on Adaptive Differential Pulse Code Modulation principles.
skipping to change at page 11, line 10 skipping to change at page 11, line 10
For the two-channel encoding example, the sample sequence is (left For the two-channel encoding example, the sample sequence is (left
channel, first sample), (right channel, first sample), (left channel, channel, first sample), (right channel, first sample), (left channel,
second sample), (right channel, second sample). Coded Samples for all second sample), (right channel, second sample). Coded Samples for all
channels, belonging to a single coded sampling instant, MUST be channels, belonging to a single coded sampling instant, MUST be
contained in the same packet. All channels in the same RTP stream contained in the same packet. All channels in the same RTP stream
MUST be sampled at the same frequency. MUST be sampled at the same frequency.
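The channel ordering described above can be sketched in a few lines of Python. This is an illustration only: the function name interleave and the string placeholders standing in for coded samples are assumptions made for the example and do not appear in the draft.

```python
def interleave(channels):
    """Interleave per-channel lists of coded samples by sampling instant.

    `channels` is a list of lists, one inner list per audio channel,
    given in channel order (for stereo: left first, then right).
    """
    lengths = {len(ch) for ch in channels}
    if len(lengths) != 1:
        # Every sampling instant must carry one coded sample per channel.
        raise ValueError("all channels must carry the same number of coded samples")
    # zip(*channels) yields one tuple per sampling instant, in channel order.
    return [sample for instant in zip(*channels) for sample in instant]


left = ["L1", "L2"]
right = ["R1", "R2"]
print(interleave([left, right]))  # ['L1', 'R1', 'L2', 'R2']
```

Because the output is grouped by sampling instant, cutting it only at multiples of the channel count keeps all channels of one instant in the same packet.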
5.3. Default Packetization Interval 5.3. Default Packetization Interval
The default packetization interval MUST have a duration of 4 ms. The default packetization interval MUST have a duration of 4
When an integer number of coded samples per channel cannot be milliseconds. When an integer number of coded samples per channel
contained within this 4ms interval, the default packet interval MUST cannot be contained within this 4 milliseconds interval, the default
be rounded down to the nearest packet interval that can contain a packet interval MUST be rounded down to the nearest packet interval
complete integer set of coded samples. For example when encoding that can contain a complete integer set of coded samples.
audio with either Standard or Enhanced apt-X, sampled at 11025 Hz, For example when encoding audio with either Standard or Enhanced
22050 Hz, or 44100 Hz, the packetization interval MUST be rounded apt-X, sampled at 11025 Hz, 22050 Hz, or 44100 Hz, the packetization
down to 3.99 ms. interval MUST be rounded down to 3.99 milliseconds.
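The rounding rule can be checked with a short Python sketch. It assumes that one coded sample per channel represents 4 PCM samples, as implied by the bitresolution definition and the payload example in this document; the function name default_packet_interval is hypothetical.

```python
from fractions import Fraction

# Assumption taken from the bitresolution definition: one coded sample
# per channel is produced from 4 PCM samples.
PCM_SAMPLES_PER_CODED_SAMPLE = 4


def default_packet_interval(rate_hz, target_ms=4):
    """Return (coded samples per channel, actual packet interval in ms)."""
    # Exact number of PCM samples that fit in the target interval.
    pcm_per_packet = Fraction(rate_hz * target_ms, 1000)
    # Round down to a whole number of coded samples.
    coded = int(pcm_per_packet // PCM_SAMPLES_PER_CODED_SAMPLE)
    interval_ms = Fraction(coded * PCM_SAMPLES_PER_CODED_SAMPLE * 1000, rate_hz)
    return coded, float(interval_ms)


for rate in (11025, 22050, 44100, 48000):
    coded, ms = default_packet_interval(rate)
    print(rate, coded, round(ms, 2))
```

At 44100 Hz a 4 ms interval would hold 176.4 PCM samples, or 44.1 coded samples, so the packet carries 44 coded samples per channel and lasts about 3.99 ms. At 48000 Hz the interval is exactly 4 ms with 48 coded samples per channel.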
The packetization interval sets limits on the end-to-end delay; The packetization interval sets limits on the end-to-end delay;
shorter packets minimize the audio delay through a system at the shorter packets minimize the audio delay through a system at the
expense of increased bandwidth while longer packets introduce expense of increased bandwidth while longer packets introduce
less header overhead but increase delay and make packet loss less header overhead but increase delay and make packet loss
more noticeable. A default packet interval of 4 ms maintains an more noticeable. A default packet interval of 4 milliseconds
acceptable ratio of payload to header bytes and minimizes maintains an acceptable ratio of payload to header bytes and
the end-to-end delay to allow viable interactive apt-X based minimizes the end-to-end delay to allow viable interactive
applications. All implementations MUST support this default apt-X based applications. All implementations MUST support this
packetization interval. default packetization interval.
5.4. Implementation Considerations 5.4. Implementation Considerations
An application implementing this payload format MUST understand all An application implementing this payload format MUST understand all
the payload parameters that are defined in this specification. Any the payload parameters that are defined in this specification. Any
mapping of these parameters to a signaling protocol MUST support all mapping of these parameters to a signalling protocol MUST support all
parameters. Implementations can always decide whether they are parameters. Implementations can always decide whether they are
capable of communicating based on the entities defined in this capable of communicating based on the entities defined in this
specification. specification.
5.5. Payload Example 5.5. Payload Example
As an example payload format, consider the transmission of an As an example payload format, consider the transmission of an
arbitrary 5.1 audio signal consisting of 6 channels of 24-bit PCM arbitrary 5.1 audio signal consisting of 6 channels of 24-bit PCM
data, sampled at a rate of 48 kHz and packetized on a RTP packet data, sampled at a rate of 48 kHz and packetized on a RTP packet
interval of 4ms. The total bit rate before audio coding is interval of 4 milliseconds. The total bit rate before audio coding
6 * 24 * 48000 = 6.912 Mbits/s. Applying Enhanced apt-X coding, is 6 * 24 * 48000 = 6.912 Mbits/s. Applying Enhanced apt-X coding,
with a coded sample size of 24 bits, results in a transmitted coded with a coded sample size of 24 bits, results in a transmitted coded
bit rate of 1/4 of the uncoded bit rate, i.e. 1.728 Mbit/s. On packet bit rate of 1/4 of the uncoded bit rate, i.e. 1.728 Mbit/s. On packet
intervals of 4 ms, packets contain 864 bytes of encoded data that intervals of 4 milliseconds, packets contain 864 bytes of encoded
contain 48 Enhanced apt-X coded samples per channel. data that contain 48 Enhanced apt-X coded samples per channel.
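The arithmetic of this example can be reproduced with the following Python sketch. The function and key names are illustrative assumptions; the 4:1 ratio of PCM samples to coded samples is taken from the bitresolution definition in this document.

```python
def payload_size(channels, pcm_bits, rate_hz, bitresolution, packet_ms):
    """Recompute the bit rates and payload size for one apt-X RTP packet."""
    uncoded_bps = channels * pcm_bits * rate_hz
    # One coded sample of `bitresolution` bits encodes 4 PCM samples.
    coded_samples_per_sec = rate_hz // 4
    coded_bps = channels * bitresolution * coded_samples_per_sec
    coded_per_packet = coded_samples_per_sec * packet_ms // 1000
    payload_bytes = channels * coded_per_packet * bitresolution // 8
    return {
        "uncoded_bps": uncoded_bps,
        "coded_bps": coded_bps,
        "coded_samples_per_channel": coded_per_packet,
        "payload_bytes": payload_bytes,
    }


# 5.1 audio: 6 channels, 24-bit PCM, 48 kHz, Enhanced apt-X 24-bit, 4 ms.
print(payload_size(6, 24, 48000, 24, 4))
```

For the 5.1 example this yields 6912000 bit/s uncoded, 1728000 bit/s coded, 48 coded samples per channel and 864 payload bytes per packet, matching the figures in the text.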
For the example format, the diagram below shows how coded samples For the example format, the diagram below shows how coded samples
from each channel are packed into a sample block and how sample from each channel are packed into a sample block and how sample
blocks 1, 2, and 48 are subsequently packed into the RTP packet. blocks 1, 2, and 48 are subsequently packed into the RTP packet.
C: C:
Channel index: Left (l) = 1, left centre (lc) = 2, centre Channel index: Left (l) = 1, left centre (lc) = 2, centre
(c) = 3, right (r) = 4, right centre (rc) = 5, surround (S) = 6. (c) = 3, right (r) = 4, right centre (rc) = 5, surround (S) = 6.
T: T:
skipping to change at page 14, line 13 skipping to change at page 14, line 13
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
6. Payload Format Parameters 6. Payload Format Parameters
This RTP payload format is identified using the media type audio/ This RTP payload format is identified using the media type audio/
aptx, which is registered in accordance with RFC 4855 [RFC4855] and aptx, which is registered in accordance with RFC 4855 [RFC4855] and
using the template of RFC 6838 [RFC6838]. using the template of RFC 6838 [RFC6838].
6.1. Media Type Definition 6.1. Media Type Definition
Registration of media subtype audio/aptx. Type name: audio
MIME media type name: audio
MIME subtype name: aptx Subtype name: aptx
Required parameters: Required parameters:
rate: rate:
RTP timestamp clock rate, which is equal to the sampling rate RTP timestamp clock rate, which is equal to the sampling rate
in Hz. -- RECOMMENDED values for rate are 8000, 11025, 16000, in Hz. -- RECOMMENDED values for rate are 8000, 11025, 16000,
22050, 24000, 32000, 44100 and 48000 samples per second. Other 22050, 24000, 32000, 44100 and 48000 samples per second. Other
values are permissible. values are permissible.
channels: channels:
skipping to change at page 14, line 47 skipping to change at page 14, line 45
bitresolution: bitresolution:
The number of bits used by the algorithm to encode 4 PCM The number of bits used by the algorithm to encode 4 PCM
samples. This value MAY only be set to 16 for Standard apt-X samples. This value MAY only be set to 16 for Standard apt-X
and 16 or 24 for Enhanced apt-X. and 16 or 24 for Enhanced apt-X.
Optional parameters: Optional parameters:
ptime: ptime:
The recommended length of time (in milliseconds) represented by The recommended length of time (in milliseconds) represented by
the media in a packet. Defaults to 4 ms. See Section 6 of the media in a packet. Defaults to 4 milliseconds.
[RFC4566]. See Section 6 of [RFC4566].
maxptime: maxptime:
The maximum length of time (in milliseconds) that can be The maximum length of time (in milliseconds) that can be
encapsulated in a packet. See Section 6 of [RFC4566]. encapsulated in a packet. See Section 6 of [RFC4566].
stereo-channel-pairs: stereo-channel-pairs:
Defines audio channels that are stereo paired in the stream. Defines audio channels that are stereo paired in the stream.
See Section 3. Each pair of audio channels is defined as two See Section 3. Each pair of audio channels is defined as two
comma-separated values that correspond to channel numbers in comma-separated values that correspond to channel numbers in
the range 1..channels. Each stereo channel pair is preceded the range 1..channels. Each stereo channel pair is preceded
by a '{', and followed by a '}'. Pairs of audio channels are by a '{', and followed by a '}'. Pairs of audio channels are
separated by a comma. A channel MUST NOT be paired with more separated by a comma. A channel MUST NOT be paired with more
than one other channel. Absence of this parameter signals that than one other channel. Absence of this parameter signals that
each channel has been independently encoded. each channel has been independently encoded.
embedded-autosync-channels: embedded-autosync-channels:
Defines channels that carry embedded autosync. embedded- Defines channels that carry embedded autosync. Embedded-
autosync-channels is defined as a list of comma-separated autosync-channels are defined as a list of comma-separated
values that correspond to channel numbers in the range 1.. values that correspond to channel numbers in the range 1..
channels. When a channel is stereo paired, embedded autosync channels. When a channel is stereo paired, embedded autosync
is shared across channels in the pair. Only the first channel is shared across channels in the pair. The first channel
as defined in stereo-channel-pairs MUST be specified in the as defined in stereo-channel-pairs MUST be specified in the
embedded-autosync-channels list. embedded-autosync-channels list.
embedded-aux-channels: embedded-aux-channels:
Defines channels that carry embedded auxiliary data. embedded- Defines channels that carry embedded auxiliary data. Embedded-
aux-channel is defined as a list of comma-separated values aux-channels are defined as a list of comma-separated values
that correspond to channel numbers in the range 1..channels. that correspond to channel numbers in the range 1..channels.
When a channel is stereo paired, embedded auxiliary data is When a channel is stereo paired, embedded auxiliary data is
shared across channels in the pair. Only the second channel as shared across channels in the pair. The second channel as
defined in stereo-channel-pairs MUST be specified in the defined in stereo-channel-pairs MUST be specified in the
embedded-aux-channels list. embedded-aux-channels list.
Encoding considerations: This type is only defined for transfer Encoding considerations: This type is only defined for transfer
via RTP [RFC3550]. via RTP [RFC3550].
Security considerations: See Section 5 of [RFC4855] and Section 4 Security considerations: See Section 5 of [RFC4855] and Section 4
of [RFC4856]. of [RFC4856].
Interoperability considerations: none Interoperability considerations: none
Published specification: RFC XXXX Published specification: RFC XXXX
Applications which use this media type: Audio streaming Applications which use this media type: Audio streaming
Additional information: none Additional information: none
Person & email address to contact for further information: John Person & email address to contact for further information: "John
Lindsay email:lindsay@worldcastsystems.com Lindsay email:lindsay@worldcastsystems.com"
Intended usage: COMMON Intended usage: RTP applications only
Author/Change controller: John Lindsay Author/Change controller: "IETF Payload Working Group delegated
from the IESG"
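As a rough illustration of the stereo-channel-pairs syntax defined above, the following Python sketch parses a parameter value and applies the stated constraints: channel numbers in the range 1..channels, and no channel paired with more than one other channel. The function name and error messages are assumptions for this example, and rejecting whitespace inside the value is a choice made here, not a requirement stated in the draft.

```python
import re

# One pair is '{' n ',' m '}'; pairs are separated by commas.
_PAIR = r"\{(\d+),(\d+)\}"
_VALUE = re.compile(r"%s(,%s)*" % (_PAIR, _PAIR))


def parse_stereo_pairs(value, channels):
    """Parse a stereo-channel-pairs value such as '{1,2},{3,4}'."""
    if not _VALUE.fullmatch(value):
        raise ValueError("malformed stereo-channel-pairs value: %r" % value)
    pairs = [(int(a), int(b)) for a, b in re.findall(_PAIR, value)]
    seen = set()
    for pair in pairs:
        for ch in pair:
            if not 1 <= ch <= channels:
                raise ValueError("channel %d outside 1..%d" % (ch, channels))
            if ch in seen:
                # Also rejects a channel paired with itself.
                raise ValueError("channel %d appears in more than one pairing" % ch)
            seen.add(ch)
    return pairs


print(parse_stereo_pairs("{1,2},{3,4}", 6))  # [(1, 2), (3, 4)]
```

The sketch does not check the embedded-autosync-channels or embedded-aux-channels lists against the pairs; a fuller validator would add that step.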
6.2. Mapping to SDP 6.2. Mapping to SDP
The information carried in the media type specification has a The information carried in the media type specification has a
specific mapping to fields in the Session Description Protocol (SDP) specific mapping to fields in the Session Description Protocol (SDP)
[RFC4566] that is commonly used to describe RTP sessions. When SDP [RFC4566] that is commonly used to describe RTP sessions. When SDP
is used to describe sessions the media type mappings are as follows. is used to describe sessions the media type mappings are as follows.
The type name ("audio") goes in SDP "m=" as the media name. The type name ("audio") goes in SDP "m=" as the media name.
skipping to change at page 16, line 33 skipping to change at page 16, line 33
The required parameters "variant" and "bitresolution" MUST be The required parameters "variant" and "bitresolution" MUST be
included in the SDP "a=fmtp" attribute. included in the SDP "a=fmtp" attribute.
The optional parameters "stereo-channel-pairs", "embedded- The optional parameters "stereo-channel-pairs", "embedded-
autosync-channels", "embedded-aux-channels" when present, autosync-channels", "embedded-aux-channels" when present,
MUST be included in the SDP "a=fmtp" attribute. MUST be included in the SDP "a=fmtp" attribute.
The parameter "ptime", when present, goes in a separate SDP The parameter "ptime", when present, goes in a separate SDP
attribute field and is signalled as "a=ptime:<value>", where attribute field and is signalled as "a=ptime:<value>", where
<value> is the number of millseconds of audio represented by one <value> is the number of milliseconds of audio represented by one
RTP packet. See Section 6 of [RFC4566]. RTP packet. See Section 6 of [RFC4566].
6.2.1. SDP Usage Examples 6.2.1. SDP Usage Examples
Some example SDP session descriptions utilizing apt-X encodings Some example SDP session descriptions utilizing apt-X encodings
follow. In these examples, long a=fmtp lines are folded to meet the follow. In these examples, long a=fmtp lines are folded to meet the
column width constraints of this document. column width constraints of this document.
Example 1: A standard apt-X stream that encodes two independent Example 1: A standard apt-X stream that encodes two independent
44.1kHz 16-bit PCM channels into a 4ms RTP packet. 44.1kHz 16-bit PCM channels into a 4 milliseconds RTP packet.
m=audio 5004 RTP/AVP 98 m=audio 5004 RTP/AVP 98
a=rtpmap:98 aptx/44100/2 a=rtpmap:98 aptx/44100/2
a=fmtp:98 variant=standard; bitresolution=16; a=fmtp:98 variant=standard; bitresolution=16;
a=ptime:4 a=ptime:4
Example 2: An enhanced apt-X stream that encodes two 48kHz 24-bit Example 2: An enhanced apt-X stream that encodes two 48kHz 24-bit
stereo channels into a 4ms RTP packet and that carries both an stereo channels into a 4 milliseconds RTP packet and that carries
embedded autosync and auxiliary data channel. both an embedded autosync and auxiliary data channel.
m=audio 5004 RTP/AVP 98 m=audio 5004 RTP/AVP 98
a=rtpmap:98 aptx/48000/2 a=rtpmap:98 aptx/48000/2
a=fmtp:98 variant=enhanced; bitresolution=24; a=fmtp:98 variant=enhanced; bitresolution=24;
stereo-channel-pairs={1,2}; embedded-autosync-channels=1; stereo-channel-pairs={1,2}; embedded-autosync-channels=1;
embedded-aux-channels=2 embedded-aux-channels=2
a=ptime:4 a=ptime:4
Example 3: An enhanced apt-X stream that encodes six 44.1kHz 24-bit Example 3: An enhanced apt-X stream that encodes six 44.1kHz 24-bit
channels into a 6ms RTP packet. Channels 1,2 and 3,4 are stereo channels into a 6 milliseconds RTP packet. Channels 1,2 and 3,4 are
pairs. Both stereo pairs carry both an embedded autosync and stereo pairs. Both stereo pairs carry both an embedded autosync and
auxiliary data channel. auxiliary data channel.
m=audio 5004 RTP/AVP 98 m=audio 5004 RTP/AVP 98
a=rtpmap:98 aptx/44100/6 a=rtpmap:98 aptx/44100/6
a=fmtp:98 variant=enhanced; bitresolution=24; a=fmtp:98 variant=enhanced; bitresolution=24;
stereo-channel-pairs={1,2},{3,4}; embedded-autosync-channels=1,3; stereo-channel-pairs={1,2},{3,4}; embedded-autosync-channels=1,3;
embedded-aux-channels=2,4 embedded-aux-channels=2,4
a=ptime:6 a=ptime:6
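The mapping rules of Section 6.2 can be sketched as a small Python helper that emits the SDP lines for one apt-X stream. The function name aptx_sdp and its argument names are hypothetical, and the RTP/AVP profile is simply copied from the examples above. The helper writes each a=fmtp attribute on a single line, without the folding or the trailing semicolon seen in Example 1.

```python
OPTIONAL_FMTP = (
    "stereo-channel-pairs",
    "embedded-autosync-channels",
    "embedded-aux-channels",
)


def aptx_sdp(port, payload_type, rate, channels, variant, bitresolution,
             ptime=None, optional=None):
    """Build the SDP media-level lines for one apt-X RTP stream."""
    optional = optional or {}
    # Required parameters "variant" and "bitresolution" go in a=fmtp.
    fmtp = ["variant=%s" % variant, "bitresolution=%d" % bitresolution]
    # Optional parameters, when present, also go in a=fmtp.
    for name in OPTIONAL_FMTP:
        if name in optional:
            fmtp.append("%s=%s" % (name, optional[name]))
    lines = [
        "m=audio %d RTP/AVP %d" % (port, payload_type),
        "a=rtpmap:%d aptx/%d/%d" % (payload_type, rate, channels),
        "a=fmtp:%d %s" % (payload_type, "; ".join(fmtp)),
    ]
    # "ptime" is signalled in its own attribute, not in a=fmtp.
    if ptime is not None:
        lines.append("a=ptime:%d" % ptime)
    return lines


print("\n".join(aptx_sdp(
    5004, 98, 48000, 2, "enhanced", 24, ptime=4,
    optional={
        "stereo-channel-pairs": "{1,2}",
        "embedded-autosync-channels": "1",
        "embedded-aux-channels": "2",
    },
)))
```

Called with the values of Example 2, the helper produces the same four attributes as that example, with the a=fmtp content unfolded onto one line.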
6.2.2. Offer/Answer Considerations 6.2.2. Offer/Answer Considerations
 End of changes. 23 change blocks. 
44 lines changed or deleted 43 lines changed or added
