draft-ietf-payload-rtp-h265-03.txt   draft-ietf-payload-rtp-h265-04.txt 
Network Working Group Y.-K. Wang Network Working Group Y.-K. Wang
Internet Draft Qualcomm Internet Draft Qualcomm
Intended status: Standards track Y. Sanchez Intended status: Standards track Y. Sanchez
Expires: October 2014 T. Schierl Expires: November 2014 T. Schierl
Fraunhofer HHI Fraunhofer HHI
S. Wenger S. Wenger
Vidyo Vidyo
M. M. Hannuksela M. M. Hannuksela
Nokia Nokia
April 30, 2014 May 28, 2014
RTP Payload Format for High Efficiency Video Coding RTP Payload Format for High Efficiency Video Coding
draft-ietf-payload-rtp-h265-03.txt draft-ietf-payload-rtp-h265-04.txt
Abstract Abstract
This memo describes an RTP payload format for the video coding This memo describes an RTP payload format for the video coding
standard ITU-T Recommendation H.265 and ISO/IEC International standard ITU-T Recommendation H.265 and ISO/IEC International
Standard 23008-2, both also known as High Efficiency Video Coding Standard 23008-2, both also known as High Efficiency Video Coding
(HEVC) [HEVC] and developed by the Joint Collaborative Team on Video (HEVC) [HEVC] and developed by the Joint Collaborative Team on Video
Coding (JCT-VC). The RTP payload format allows for packetization of Coding (JCT-VC). The RTP payload format allows for packetization of
one or more Network Abstraction Layer (NAL) units in each RTP packet one or more Network Abstraction Layer (NAL) units in each RTP packet
payload, as well as fragmentation of a NAL unit into multiple RTP payload, as well as fragmentation of a NAL unit into multiple RTP
skipping to change at page 2, line 18 skipping to change at page 2, line 18
months and may be updated, replaced, or obsoleted by other documents months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress." reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt. http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html. http://www.ietf.org/shadow.html.
This Internet-Draft will expire on October 30, 2014. This Internet-Draft will expire on November 28, 2014.
Copyright and License Notice Copyright and License Notice
Copyright (c) 2014 IETF Trust and the persons identified as the Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved. document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of (http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents publication of this document. Please review these documents
skipping to change at page 3, line 10 skipping to change at page 3, line 10
respect to this document. Code Components extracted from this respect to this document. Code Components extracted from this
document must include Simplified BSD License text as described in document must include Simplified BSD License text as described in
Section 4.e of the Trust Legal Provisions and are provided without Section 4.e of the Trust Legal Provisions and are provided without
warranty as described in the Simplified BSD License. warranty as described in the Simplified BSD License.
Table of Contents Table of Contents
Abstract..........................................................1 Abstract..........................................................1
Status of this Memo...............................................1 Status of this Memo...............................................1
Table of Contents.................................................3 Table of Contents.................................................3
1 . Introduction..................................................5 1. Introduction...................................................5
1.1 . Overview of the HEVC Codec...............................5 1.1. Overview of the HEVC Codec................................5
1.1.1 Coding-Tool Features..................................5 1.1.1 Coding-Tool Features..................................5
1.1.2 Systems and Transport Interfaces......................7 1.1.2 Systems and Transport Interfaces......................7
1.1.3 Parallel Processing Support..........................14 1.1.3 Parallel Processing Support..........................14
1.1.4 NAL Unit Header......................................16 1.1.4 NAL Unit Header......................................16
1.2 . Overview of the Payload Format..........................17 1.2. Overview of the Payload Format...........................17
2 . Conventions..................................................18 2. Conventions...................................................18
3 . Definitions and Abbreviations................................18 3. Definitions and Abbreviations.................................18
3.1 Definitions...............................................18 3.1 Definitions...............................................18
3.1.1 Definitions from the HEVC Specification..............18 3.1.1 Definitions from the HEVC Specification..............18
3.1.2 Definitions Specific to This Memo....................20 3.1.2 Definitions Specific to This Memo....................20
3.2 Abbreviations.............................................22 3.2 Abbreviations.............................................22
4 . RTP Payload Format...........................................23 4. RTP Payload Format............................................23
4.1 RTP Header Usage..........................................23 4.1 RTP Header Usage..........................................23
4.2 Payload Header Usage......................................26 4.2 Payload Header Usage......................................26
4.3 Payload Structures........................................26 4.3 Payload Structures........................................26
4.4 Transmission Modes........................................27 4.4 Transmission Modes........................................27
4.5 Decoding Order Number.....................................28 4.5 Decoding Order Number.....................................28
4.6 Single NAL Unit Packets...................................30 4.6 Single NAL Unit Packets...................................30
4.7 Aggregation Packets (APs).................................31 4.7 Aggregation Packets (APs).................................31
4.8 Fragmentation Units (FUs).................................35 4.8 Fragmentation Units (FUs).................................35
4.9 PACI packets..............................................38 4.9 PACI packets..............................................38
4.9.1 Reasons for the PACI rules (informative).............41 4.9.1 Reasons for the PACI rules (informative).............41
4.9.2 PACI extensions (Informative)........................41 4.9.2 PACI extensions (Informative)........................41
4.10 Temporal Scalability Control Information.................43 4.10 Temporal Scalability Control Information.................43
5 . Packetization Rules..........................................45 5. Packetization Rules...........................................45
6 . De-packetization Process.....................................45 6. De-packetization Process......................................45
7 . Payload Format Parameters....................................48 7. Payload Format Parameters.....................................48
7.1 Media Type Registration...................................48 7.1 Media Type Registration...................................48
7.2 SDP Parameters............................................71 7.2 SDP Parameters............................................73
7.2.1 Mapping of Payload Type Parameters to SDP............71 7.2.1 Mapping of Payload Type Parameters to SDP............73
7.2.2 Usage with SDP Offer/Answer Model....................72 7.2.2 Usage with SDP Offer/Answer Model....................74
7.2.3 Usage in Declarative Session Descriptions............80 7.2.3 Usage in Declarative Session Descriptions............83
7.2.4 Parameter Sets Considerations........................81 7.2.4 Parameter Sets Considerations........................84
7.2.5 Dependency Signaling in Multi-Stream Transmission....82 7.2.5 Dependency Signaling in Multi-Stream Mode............84
8 . Use with Feedback Messages...................................82 8. Use with Feedback Messages....................................85
8.1 Picture Loss Indication (PLI).............................83 8.1 Picture Loss Indication (PLI).............................86
8.2 Slice Loss Indication.....................................83 8.2 Slice Loss Indication.....................................86
8.3 Use of HEVC with the RPSI Feedback Message................84 8.3 Use of HEVC with the RPSI Feedback Message................87
8.4 Full Intra Request (FIR)..................................85 8.4 Full Intra Request (FIR)..................................88
9 . Security Considerations......................................85 9. Security Considerations.......................................88
10 . Congestion Control..........................................87 10. Congestion Control...........................................90
11 . IANA Consideration..........................................88 11. IANA Consideration...........................................91
12 . Acknowledgements............................................88 12. Acknowledgements.............................................91
13 . References..................................................88 13. References...................................................91
13.1 Normative References.....................................88 13.1 Normative References.....................................91
13.2 Informative References...................................90 13.2 Informative References...................................93
14 . Authors' Addresses..........................................91 14. Authors' Addresses...........................................95
1. Introduction 1. Introduction
1.1. Overview of the HEVC Codec 1.1. Overview of the HEVC Codec
High Efficiency Video Coding [HEVC], formally known as ITU-T High Efficiency Video Coding [HEVC], formally known as ITU-T
Recommendation H.265 and ISO/IEC International Standard 23008-2, was Recommendation H.265 and ISO/IEC International Standard 23008-2, was
ratified by ITU-T in April 2013 and reportedly provides significant ratified by ITU-T in April 2013 and reportedly provides significant
coding efficiency gains over H.264 [H.264]. coding efficiency gains over H.264 [H.264].
skipping to change at page 20, line 21 skipping to change at page 20, line 21
bitstream, a target highest TemporalId, and a target layer bitstream, a target highest TemporalId, and a target layer
identifier list as inputs. identifier list as inputs.
random access: The act of starting the decoding process for a random access: The act of starting the decoding process for a
bitstream at a point other than the beginning of the bitstream. bitstream at a point other than the beginning of the bitstream.
sub-layer: A temporal scalable layer of a temporal scalable sub-layer: A temporal scalable layer of a temporal scalable
bitstream consisting of VCL NAL units with a particular value of the bitstream consisting of VCL NAL units with a particular value of the
TemporalId variable, and the associated non-VCL NAL units. TemporalId variable, and the associated non-VCL NAL units.
sub-layer representation: A subset of the bitstream consisting of
NAL units of a particular sub-layer and the lower sub-layers.
tile: A rectangular region of coding tree blocks within a particular tile: A rectangular region of coding tree blocks within a particular
tile column and a particular tile row in a picture. tile column and a particular tile row in a picture.
tile column: A rectangular region of coding tree blocks having a tile column: A rectangular region of coding tree blocks having a
height equal to the height of the picture and a width specified by height equal to the height of the picture and a width specified by
syntax elements in the picture parameter set. syntax elements in the picture parameter set.
tile row: A rectangular region of coding tree blocks having a height tile row: A rectangular region of coding tree blocks having a height
specified by syntax elements in the picture parameter set and a specified by syntax elements in the picture parameter set and a
width equal to the width of the picture. width equal to the width of the picture.
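Informative sketch (not part of either draft): the sub-layer and sub-layer representation definitions above can be illustrated by filtering NAL units on TemporalId. The sketch assumes each NAL unit is given as a bytes object that starts with the two-byte HEVC NAL unit header, in which the three least significant bits of the second byte carry nuh_temporal_id_plus1. The function names are hypothetical.

```python
from typing import Iterable, List


def temporal_id(nal_unit: bytes) -> int:
    """Return TemporalId, i.e. nuh_temporal_id_plus1 - 1, from the NAL unit header."""
    if len(nal_unit) < 2:
        raise ValueError("a NAL unit needs at least its two-byte header")
    return (nal_unit[1] & 0x07) - 1


def sub_layer_representation(nal_units: Iterable[bytes], highest_tid: int) -> List[bytes]:
    """Keep the NAL units of sub-layer highest_tid and of all lower sub-layers."""
    return [nal for nal in nal_units if temporal_id(nal) <= highest_tid]
```

Selecting highest_tid equal to 0 keeps only the lowest sub-layer; raising it adds one sub-layer at a time.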
3.1.2 Definitions Specific to This Memo 3.1.2 Definitions Specific to This Memo
dependent RTP stream: An RTP stream on which another RTP stream dependee RTP stream: An RTP stream on which another RTP stream
depends. All RTP streams in an MST except for the highest RTP depends. All RTP streams in an MSM except for the highest RTP
stream are all dependent RTP streams. stream are dependee RTP streams.
highest RTP stream: The packet stream on which no other RTP stream highest RTP stream: The RTP stream on which no other RTP stream
depends. The RTP stream in an SST is the highest RTP stream. depends. The RTP stream in an SSM is the highest RTP stream.
media aware network element (MANE): A network element, such as a media aware network element (MANE): A network element, such as a
middlebox, selective forwarding unit, or application layer gateway middlebox, selective forwarding unit, or application layer gateway
that is capable of parsing certain aspects of the RTP payload that is capable of parsing certain aspects of the RTP payload
headers or the RTP payload and reacting to their contents. headers or the RTP payload and reacting to their contents.
Informative note: The concept of a MANE goes beyond normal Informative note: The concept of a MANE goes beyond normal
routers or gateways in that a MANE has to be aware of the routers or gateways in that a MANE has to be aware of the
signaling (e.g. to learn about the payload type mappings of the signaling (e.g. to learn about the payload type mappings of the
media streams), and in that it has to be trusted when working media streams), and in that it has to be trusted when working
with SRTP. The advantage of using MANEs is that they allow with SRTP. The advantage of using MANEs is that they allow
packets to be dropped according to the needs of the media coding. packets to be dropped according to the needs of the media coding.
For example, if a MANE has to drop packets due to congestion on a For example, if a MANE has to drop packets due to congestion on a
certain link, it can identify and remove those packets whose certain link, it can identify and remove those packets whose
elimination produces the least adverse effect on the user elimination produces the least adverse effect on the user
experience. After dropping packets, MANEs must rewrite RTCP experience. After dropping packets, MANEs must rewrite RTCP
packets to match the changes to the RTP stream as specified in packets to match the changes to the RTP stream as specified in
Section 7 of [RFC3550]. Section 7 of [RFC3550].
multi-stream transmission (MST): Transmission of an HEVC bitstream multi-stream mode (MSM): Transmission of an HEVC bitstream using more
using more than one RTP stream. than one RTP stream.
NAL unit decoding order: A NAL unit order that conforms to the NAL unit decoding order: A NAL unit order that conforms to the
constraints on NAL unit order given in Section 7.4.2.4 in [HEVC]. constraints on NAL unit order given in Section 7.4.2.4 in [HEVC].
NAL-unit-like structure: A data structure that is similar to NAL NAL-unit-like structure: A data structure that is similar to NAL
units in the sense that it also has a NAL unit header and a payload, units in the sense that it also has a NAL unit header and a payload,
with a difference that the payload does not follow the start code with a difference that the payload does not follow the start code
emulation prevention mechanism required for the NAL unit syntax as emulation prevention mechanism required for the NAL unit syntax as
specified in Section 7.3.1.1 of [HEVC]. Examples of NAL-unit-like specified in Section 7.3.1.1 of [HEVC]. Examples of NAL-unit-like
structures defined in this memo are packet payloads of AP, PACI, and structures defined in this memo are packet payloads of AP, PACI, and
FU packets. FU packets.
NALU-time: The value that the RTP timestamp would have if the NAL NALU-time: The value that the RTP timestamp would have if the NAL
unit were transported in its own RTP packet. unit were transported in its own RTP packet.
packet stream: See [I-D.ietf-avtext-rtp-grouping-taxonomy]. Within RTP stream: See [I-D.ietf-avtext-rtp-grouping-taxonomy]. Within the
the scope of this memo, one RTP stream is utilized to transport one scope of this memo, one RTP stream is utilized to transport one or
or more temporal sub-layers. more temporal sub-layers.
single-stream transmission (SST): Transmission of an HEVC bitstream single-stream mode (SSM): Transmission of an HEVC bitstream using
using only one RTP stream. only one RTP stream.
transmission order: The order of packets in ascending RTP sequence transmission order: The order of packets in ascending RTP sequence
number order (in modulo arithmetic). Within an aggregation packet, number order (in modulo arithmetic). Within an aggregation packet,
the NAL unit transmission order is the same as the order of the NAL unit transmission order is the same as the order of
appearance of NAL units in the packet. appearance of NAL units in the packet.
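Informative sketch (not part of either draft): the "ascending RTP sequence number order (in modulo arithmetic)" used in the transmission order definition can be illustrated as follows. The half-window convention, under which a forward distance of 32768 or more is read as a step backwards, is an assumption made for this sketch, and all names are hypothetical.

```python
from typing import Iterable, List

SEQ_MOD = 1 << 16  # RTP sequence numbers are 16 bits wide


def seq_delta(a: int, b: int) -> int:
    """Signed distance from a to b modulo 65536, in the range [-32768, 32767]."""
    d = (b - a) % SEQ_MOD
    return d - SEQ_MOD if d >= SEQ_MOD // 2 else d


def precedes(a: int, b: int) -> bool:
    """True if sequence number a comes before b in transmission order."""
    return seq_delta(a, b) > 0


def in_transmission_order(seqs: Iterable[int], first: int) -> List[int]:
    """Sort sequence numbers by their forward distance from a known first one."""
    return sorted(seqs, key=lambda s: (s - first) % SEQ_MOD)
```

For example, sequence number 65535 precedes 0, because the counter wraps from 65535 to 0.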
3.2 Abbreviations 3.2 Abbreviations
AP Aggregation Packet AP Aggregation Packet
BLA Broken Link Access BLA Broken Link Access
CRA Clean Random Access CRA Clean Random Access
CTB Coding Tree Block CTB Coding Tree Block
CTU Coding Tree Unit CTU Coding Tree Unit
CVS Coded Video Sequence CVS Coded Video Sequence
DPH Decoded Picture Hash
FU Fragmentation Unit FU Fragmentation Unit
GDR Gradual Decoding Refresh GDR Gradual Decoding Refresh
HRD Hypothetical Reference Decoder HRD Hypothetical Reference Decoder
IDR Instantaneous Decoding Refresh IDR Instantaneous Decoding Refresh
IRAP Intra Random Access Point IRAP Intra Random Access Point
MANE Media Aware Network Element MANE Media Aware Network Element
MST Multi-Stream Transmission MSM Multi-Stream Mode
MTU Maximum Transfer Unit MTU Maximum Transfer Unit
NAL Network Abstraction Layer NAL Network Abstraction Layer
NALU Network Abstraction Layer Unit NALU Network Abstraction Layer Unit
PACI PAyload Content Information PACI PAyload Content Information
PHES Payload Header Extension Structure PHES Payload Header Extension Structure
PPS Picture Parameter Set PPS Picture Parameter Set
RADL Random Access Decodable Leading (Picture) RADL Random Access Decodable Leading (Picture)
RASL Random Access Skipped Leading (Picture) RASL Random Access Skipped Leading (Picture)
RPS Reference Picture Set RPS Reference Picture Set
SEI Supplemental Enhancement Information SEI Supplemental Enhancement Information
SPS Sequence Parameter Set SPS Sequence Parameter Set
SST Single-Stream Transmission SSM Single-Stream Mode
STSA Step-wise Temporal Sub-layer Access STSA Step-wise Temporal Sub-layer Access
TSA Temporal Sub-layer Access TSA Temporal Sub-layer Access
TCSI Temporal Scalability Control Information TCSI Temporal Scalability Control Information
VCL Video Coding Layer VCL Video Coding Layer
VPS Video Parameter Set VPS Video Parameter Set
skipping to change at page 24, line 28 skipping to change at page 24, line 28
Figure 2 RTP header according to [RFC3550] Figure 2 RTP header according to [RFC3550]
The RTP header information to be set according to this RTP payload The RTP header information to be set according to this RTP payload
format is set as follows: format is set as follows:
Marker bit (M): 1 bit Marker bit (M): 1 bit
Set for the last packet, carried in the current RTP stream, of Set for the last packet, carried in the current RTP stream, of
the access unit, in line with the normal use of the M bit in the access unit, in line with the normal use of the M bit in
video formats, to allow efficient playout buffer handling. video formats, to allow efficient playout buffer handling.
When MST is in use, if an access unit appears in multiple RTP When MSM is in use, if an access unit appears in multiple RTP
streams, the marker bit is set on each RTP stream's last packet streams, the marker bit is set on each RTP stream's last packet
of the access unit. of the access unit.
Informative note: The content of a NAL unit does not tell Informative note: The content of a NAL unit does not tell
whether or not the NAL unit is the last NAL unit, in decoding whether or not the NAL unit is the last NAL unit, in decoding
order, of an access unit. An RTP sender implementation may order, of an access unit. An RTP sender implementation may
obtain this information from the video encoder. If, however, obtain this information from the video encoder. If, however,
the implementation cannot obtain this information directly the implementation cannot obtain this information directly
from the encoder, e.g. when the bitstream was pre-encoded, and from the encoder, e.g. when the bitstream was pre-encoded, and
also there is no timestamp allocated for each NAL unit, then also there is no timestamp allocated for each NAL unit, then
skipping to change at page 25, line 16 skipping to change at page 25, line 16
44, inclusive, or 48 to 55, inclusive. 44, inclusive, or 48 to 55, inclusive.
Payload type (PT): 7 bits Payload type (PT): 7 bits
The assignment of an RTP payload type for this new packet format The assignment of an RTP payload type for this new packet format
is outside the scope of this document and will not be specified is outside the scope of this document and will not be specified
here. The assignment of a payload type has to be performed here. The assignment of a payload type has to be performed
either through the profile used or in a dynamic way. either through the profile used or in a dynamic way.
Informative note: It is not required to use different payload Informative note: It is not required to use different payload
type values for different RTP streams in MST. type values for different RTP streams in MSM.
Sequence number (SN): 16 bits Sequence number (SN): 16 bits
Set and used in accordance with RFC 3550. Set and used in accordance with RFC 3550.
Timestamp: 32 bits Timestamp: 32 bits
The RTP timestamp is set to the sampling timestamp of the The RTP timestamp is set to the sampling timestamp of the
content. A 90 kHz clock rate MUST be used. content. A 90 kHz clock rate MUST be used.
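Informative sketch (not part of either draft): with the mandated 90 kHz clock, a sampling instant in seconds maps to RTP timestamp ticks as shown below. The random initial offset follows RFC 3550; the function and parameter names are hypothetical, and exact fractions are used only to avoid rounding drift in the illustration.

```python
from fractions import Fraction

CLOCK_RATE_HZ = 90_000  # clock rate required by this payload format
TS_MOD = 1 << 32        # the RTP timestamp field is 32 bits wide


def rtp_timestamp(sample_time_s: Fraction, initial_offset: int = 0) -> int:
    """Map a sampling instant in seconds to a 32-bit RTP timestamp."""
    ticks = int(sample_time_s * CLOCK_RATE_HZ)
    return (initial_offset + ticks) % TS_MOD


def ticks_per_picture(picture_rate_hz: Fraction) -> Fraction:
    """Timestamp increment between consecutive pictures at a constant rate."""
    return Fraction(CLOCK_RATE_HZ) / picture_rate_hz
```

At 30 pictures per second the increment is 3000 ticks, and at 30000/1001 pictures per second it is 3003 ticks.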
skipping to change at page 25, line 43 skipping to change at page 25, line 43
Receivers MUST use the RTP timestamp for the display process, Receivers MUST use the RTP timestamp for the display process,
even when the bitstream contains picture timing SEI messages or even when the bitstream contains picture timing SEI messages or
decoding unit information SEI messages as specified in [HEVC]. decoding unit information SEI messages as specified in [HEVC].
However, this does not mean that picture timing SEI messages in However, this does not mean that picture timing SEI messages in
the bitstream should be discarded, as picture timing SEI messages the bitstream should be discarded, as picture timing SEI messages
may contain frame-field information that is important in may contain frame-field information that is important in
appropriately rendering interlaced video. appropriately rendering interlaced video.
Synchronization source (SSRC): 32 bits Synchronization source (SSRC): 32 bits
Used to identify the source of the RTP packets. In SST, by Used to identify the source of the RTP packets. In SSM, by
definition a single SSRC is used for all parts of a single definition a single SSRC is used for all parts of a single
bitstream. In MST, each SSRC is used for an RTP stream bitstream. In MSM, each SSRC is used for an RTP stream
containing a subset of the sub-layers for a single (temporally containing a subset of the sub-layers for a single (temporally
scalable) bitstream. A receiver is required to correctly scalable) bitstream. A receiver is required to correctly
associate the set of SSRCs that are parts of the same associate the set of SSRCs that are parts of the same
bitstream. bitstream.
Informative note: The term "bitstream" in this document is Informative note: The term "bitstream" in this document is
equivalent to the term "encoded stream" in [I-D.ietf-avtext- equivalent to the term "encoded stream" in [I-D.ietf-avtext-
rtp-grouping-taxonomy]. rtp-grouping-taxonomy].
4.2 Payload Header Usage 4.2 Payload Header Usage
skipping to change at page 27, line 16 skipping to change at page 27, line 16
This payload structure is specified in section 4.8. This payload structure is specified in section 4.8.
o PACI carrying RTP packet: Contains a payload header (that differs o PACI carrying RTP packet: Contains a payload header (that differs
from other payload headers for efficiency), a Payload Header from other payload headers for efficiency), a Payload Header
Extension Structure (PHES), and a PACI payload. This payload Extension Structure (PHES), and a PACI payload. This payload
structure is specified in section 4.9. structure is specified in section 4.9.
4.4 Transmission Modes 4.4 Transmission Modes
This memo enables transmission of an HEVC bitstream over a single This memo enables transmission of an HEVC bitstream over a single
packet stream or multiple RTP streams. The concept and working RTP stream or multiple RTP streams. The concept and working
principle are inherited from the design of what was called single and principle are inherited from the design of what was called single and
multiple session transmission in [RFC6190] and follows a similar multiple session transmission in [RFC6190] and follows a similar
design. If only one RTP stream is used for transmission of the HEVC design. If only one RTP stream is used for transmission of the HEVC
bitstream, the transmission mode is referred to as single-stream bitstream, the transmission mode is referred to as single-stream
transmission (SST); otherwise (more than one RTP stream is used for mode (SSM); otherwise (more than one RTP stream is used for
transmission of the HEVC bitstream), the transmission mode is transmission of the HEVC bitstream), the transmission mode is
referred to as multi-stream transmission (MST). referred to as multi-stream mode (MSM).
Dependency of one RTP stream on another RTP stream is typically Dependency of one RTP stream on another RTP stream is typically
indicated as specified in [RFC5583]. When an RTP stream A depends indicated as specified in [RFC5583]. When an RTP stream A depends
on another RTP stream B, the RTP stream B is referred to as a on another RTP stream B, the RTP stream B is referred to as a
dependent RTP stream of the RTP stream A. dependee RTP stream of the RTP stream A.
Informative note: An MST may involve one or more RTP sessions. Informative note: An MSM may involve one or more RTP sessions.
For example, each RTP stream in an MST may be in its own RTP For example, each RTP stream in an MSM may be in its own RTP
session. For another example, a set of multiple RTP streams in session. For another example, a set of multiple RTP streams in
an MST may belong to the same RTP session, e.g. as indicated by an MSM may belong to the same RTP session, e.g. as indicated by
the mechanism specified in [I-D.ietf-avtcore-rtp-multi-stream] or the mechanism specified in [I-D.ietf-avtcore-rtp-multi-stream] or
[I-D.ietf-mmusic-sdp-bundle-negotiation]. [I-D.ietf-mmusic-sdp-bundle-negotiation].
SST SHOULD be used for point-to-point unicast scenarios, while MST SSM SHOULD be used for point-to-point unicast scenarios, while MSM
SHOULD be used for point-to-multipoint multicast scenarios where SHOULD be used for point-to-multipoint multicast scenarios where
different receivers require different operation points of the same different receivers require different operation points of the same
HEVC bitstream, to improve bandwidth utilization efficiency. HEVC bitstream, to improve bandwidth utilization efficiency.
Informative note: A multicast may degrade to a unicast after all Informative note: A multicast may degrade to a unicast after all
but one of the receivers have left (this is a justification of the first but one of the receivers have left (this is a justification of the first
"SHOULD" instead of "MUST"), and there might be scenarios where "SHOULD" instead of "MUST"), and there might be scenarios where
MST is desirable but not possible e.g. when IP multicast is not MSM is desirable but not possible e.g. when IP multicast is not
deployed in a certain network (this is a justification of the deployed in a certain network (this is a justification of the
second "SHOULD" instead of "MUST"). second "SHOULD" instead of "MUST").
The transmission mode is indicated by the tx-mode media parameter The transmission mode is indicated by the tx-mode media parameter
(see section 7.1). If tx-mode is equal to "SST", SST MUST be used. (see section 7.1). If tx-mode is equal to "SSM", SSM MUST be used.
Otherwise (tx-mode is equal to "MST"), MST MUST be used. Otherwise (tx-mode is equal to "MSM"), MSM MUST be used.
Receivers MUST support both SST and MST. Receivers MUST support both SSM and MSM.
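Informative sketch (not part of either draft): a receiver could cross-check the signaled tx-mode against the number of RTP streams that carry the bitstream. The sketch uses the -04 token names "SSM" and "MSM" (the -03 draft spells them "SST" and "MST"); the function name and the way the stream count is obtained are assumptions.

```python
def tx_mode_is_consistent(tx_mode: str, num_rtp_streams: int) -> bool:
    """Check that the signaled mode matches the number of RTP streams in use."""
    if tx_mode == "SSM":
        # Single-stream mode: exactly one RTP stream carries the bitstream.
        return num_rtp_streams == 1
    if tx_mode == "MSM":
        # Multi-stream mode: more than one RTP stream carries the bitstream.
        return num_rtp_streams > 1
    raise ValueError(f"unknown tx-mode value: {tx_mode!r}")
```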
4.5 Decoding Order Number 4.5 Decoding Order Number
For each NAL unit, the variable AbsDon is derived, representing the For each NAL unit, the variable AbsDon is derived, representing the
decoding order number that is indicative of the NAL unit decoding decoding order number that is indicative of the NAL unit decoding
order. order.
Let NAL unit n be the n-th NAL unit in transmission order within an Let NAL unit n be the n-th NAL unit in transmission order within an
RTP stream. RTP stream.
If tx-mode is equal to "SST" and sprop-max-don-diff is equal to 0, If tx-mode is equal to "SSM" and sprop-max-don-diff is equal to 0,
AbsDon[n], the value of AbsDon for NAL unit n, is derived as equal AbsDon[n], the value of AbsDon for NAL unit n, is derived as equal
to n. to n.
Otherwise (tx-mode is equal to "MST" or sprop-max-don-diff is Otherwise (tx-mode is equal to "MSM" or sprop-max-don-diff is
greater than 0), AbsDon[n] is derived as follows, where DON[n] is greater than 0), AbsDon[n] is derived as follows, where DON[n] is
the value of the variable DON for NAL unit n: the value of the variable DON for NAL unit n:
o If n is equal to 0 (i.e. NAL unit n is the very first NAL unit in o If n is equal to 0 (i.e. NAL unit n is the very first NAL unit in
transmission order), AbsDon[0] is set equal to DON[0]. transmission order), AbsDon[0] is set equal to DON[0].
o Otherwise (n is greater than 0), the following applies for o Otherwise (n is greater than 0), the following applies for
derivation of AbsDon[n]: derivation of AbsDon[n]:
If DON[n] == DON[n-1], If DON[n] == DON[n-1],
skipping to change at page 30, line 4 skipping to change at page 30, line 4
layers or some SEI NAL units when there is congestion in the layers or some SEI NAL units when there is congestion in the
network. In another example, the first intra-coded picture of a network. In another example, the first intra-coded picture of a
pre-encoded clip is transmitted in advance to ensure that it is pre-encoded clip is transmitted in advance to ensure that it is
readily available in the receiver, and when transmitting the readily available in the receiver, and when transmitting the
first intra-coded picture, the originator does not exactly know first intra-coded picture, the originator does not exactly know
how many NAL units will be encoded before the first intra-coded how many NAL units will be encoded before the first intra-coded
picture of the pre-encoded clip follows in decoding order. Thus, picture of the pre-encoded clip follows in decoding order. Thus,
the values of AbsDon for the NAL units of the first intra-coded the values of AbsDon for the NAL units of the first intra-coded
picture of the pre-encoded clip have to be estimated when they picture of the pre-encoded clip have to be estimated when they
are transmitted, and gaps in values of AbsDon may occur. Another are transmitted, and gaps in values of AbsDon may occur. Another
example is MST where the AbsDon values must indicate cross-layer example is MSM where the AbsDon values must indicate cross-layer
decoding order for NAL units conveyed in all the RTP streams. decoding order for NAL units conveyed in all the RTP streams.
4.6 Single NAL Unit Packets 4.6 Single NAL Unit Packets
A single NAL unit packet contains exactly one NAL unit, and consists A single NAL unit packet contains exactly one NAL unit, and consists
of a payload header (denoted as PayloadHdr), a conditional 16-bit of a payload header (denoted as PayloadHdr), a conditional 16-bit
DONL field (in network byte order), and the NAL unit payload data DONL field (in network byte order), and the NAL unit payload data
(the NAL unit excluding its NAL unit header) of the contained NAL (the NAL unit excluding its NAL unit header) of the contained NAL
unit, as shown in Figure 3. unit, as shown in Figure 3.
skipping to change at page 30, line 36 skipping to change at page 30, line 36
Figure 3 The structure of a single NAL unit packet Figure 3 The structure of a single NAL unit packet
The payload header SHOULD be an exact copy of the NAL unit header of The payload header SHOULD be an exact copy of the NAL unit header of
the contained NAL unit. However, the Type (i.e. nal_unit_type) the contained NAL unit. However, the Type (i.e. nal_unit_type)
field MAY be changed, e.g. when it is desirable to handle a CRA field MAY be changed, e.g. when it is desirable to handle a CRA
picture as a BLA picture [JCTVC-J0107]. picture as a BLA picture [JCTVC-J0107].
The DONL field, when present, specifies the value of the 16 least The DONL field, when present, specifies the value of the 16 least
significant bits of the decoding order number of the contained NAL significant bits of the decoding order number of the contained NAL
unit. If tx-mode is equal to "MST" or sprop-max-don-diff is greater unit. If tx-mode is equal to "MSM" or sprop-max-don-diff is greater
than 0, the DONL field MUST be present, and the variable DON for the than 0, the DONL field MUST be present, and the variable DON for the
contained NAL unit is derived as equal to the value of the DONL contained NAL unit is derived as equal to the value of the DONL
field. Otherwise (tx-mode is equal to "SST" and sprop-max-don-diff field. Otherwise (tx-mode is equal to "SSM" and sprop-max-don-diff
is equal to 0), the DONL field MUST NOT be present. is equal to 0), the DONL field MUST NOT be present.
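The DONL presence rule above can be illustrated with a minimal Python sketch. It is not part of the draft. The function names are hypothetical, and the two-octet length of PayloadHdr is an assumption based on the payload header being a copy of the HEVC NAL unit header.

```python
import struct

# Assumption: PayloadHdr mirrors the two-octet HEVC NAL unit header.
PAYLOAD_HDR_LEN = 2


def donl_present(tx_mode, sprop_max_don_diff):
    # "MSM" is the -04 spelling and "MST" the -03 spelling of the
    # multi-stream transmission mode; both are accepted in this sketch.
    return tx_mode in ("MSM", "MST") or sprop_max_don_diff > 0


def parse_single_nal_unit_packet(payload, tx_mode="SSM", sprop_max_don_diff=0):
    """Return (payload_hdr, don, nal_payload); don is None when DONL is absent."""
    if len(payload) < PAYLOAD_HDR_LEN:
        raise ValueError("payload too short for PayloadHdr")
    payload_hdr = payload[:PAYLOAD_HDR_LEN]
    offset = PAYLOAD_HDR_LEN
    don = None
    if donl_present(tx_mode, sprop_max_don_diff):
        if len(payload) < offset + 2:
            raise ValueError("payload too short for DONL")
        # DONL is a 16-bit field in network byte order.
        (don,) = struct.unpack_from("!H", payload, offset)
        offset += 2
    return payload_hdr, don, payload[offset:]
```

The same octets are interpreted differently depending on the signalled parameters, which is why a receiver needs tx-mode and sprop-max-don-diff before it can parse any packet.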
4.7 Aggregation Packets (APs) 4.7 Aggregation Packets (APs)
Aggregation packets (APs) are introduced to enable the reduction of Aggregation packets (APs) are introduced to enable the reduction of
packetization overhead for small NAL units, such as most of the non- packetization overhead for small NAL units, such as most of the non-
VCL NAL units, which are often only a few octets in size. VCL NAL units, which are often only a few octets in size.
An AP aggregates NAL units within one access unit. Each NAL unit to An AP aggregates NAL units within one access unit. Each NAL unit to
be carried in an AP is encapsulated in an aggregation unit. NAL be carried in an AP is encapsulated in an aggregation unit. NAL
skipping to change at page 32, line 38 skipping to change at page 32, line 38
| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| : | :
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 5 The structure of the first aggregation unit in an AP Figure 5 The structure of the first aggregation unit in an AP
The DONL field, when present, specifies the value of the 16 least The DONL field, when present, specifies the value of the 16 least
significant bits of the decoding order number of the aggregated NAL significant bits of the decoding order number of the aggregated NAL
unit. unit.
If tx-mode is equal to "MST" or sprop-max-don-diff is greater than If tx-mode is equal to "MSM" or sprop-max-don-diff is greater than
0, the DONL field MUST be present in an aggregation unit that is the 0, the DONL field MUST be present in an aggregation unit that is the
first aggregation unit in an AP, and the variable DON for the first aggregation unit in an AP, and the variable DON for the
aggregated NAL unit is derived as equal to the value of the DONL aggregated NAL unit is derived as equal to the value of the DONL
field. Otherwise (tx-mode is equal to "SST" and sprop-max-don-diff field. Otherwise (tx-mode is equal to "SSM" and sprop-max-don-diff
is equal to 0), the DONL field MUST NOT be present in an aggregation is equal to 0), the DONL field MUST NOT be present in an aggregation
unit that is the first aggregation unit in an AP. unit that is the first aggregation unit in an AP.
An aggregation unit that is not the first aggregation unit in an AP An aggregation unit that is not the first aggregation unit in an AP
consists of a conditional 8-bit DOND field followed by a 16-bit consists of a conditional 8-bit DOND field followed by a 16-bit
unsigned size information (in network byte order) that indicates the unsigned size information (in network byte order) that indicates the
size of the NAL unit in bytes (excluding these two octets, but size of the NAL unit in bytes (excluding these two octets, but
including the NAL unit header), followed by the NAL unit itself, including the NAL unit header), followed by the NAL unit itself,
including its NAL unit header, as shown in Figure 6. including its NAL unit header, as shown in Figure 6.
skipping to change at page 33, line 30 skipping to change at page 33, line 30
| : | :
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 6 The structure of an aggregation unit that is not the first Figure 6 The structure of an aggregation unit that is not the first
aggregation unit in an AP aggregation unit in an AP
When present, the DOND field plus 1 specifies the difference between When present, the DOND field plus 1 specifies the difference between
the decoding order number values of the current aggregated NAL unit the decoding order number values of the current aggregated NAL unit
and the preceding aggregated NAL unit in the same AP. and the preceding aggregated NAL unit in the same AP.
If tx-mode is equal to "MST" or sprop-max-don-diff is greater than If tx-mode is equal to "MSM" or sprop-max-don-diff is greater than
0, the DOND field MUST be present in an aggregation unit that is not 0, the DOND field MUST be present in an aggregation unit that is not
the first aggregation unit in an AP, and the variable DON for the the first aggregation unit in an AP, and the variable DON for the
aggregated NAL unit is derived as equal to the DON of the preceding aggregated NAL unit is derived as equal to the DON of the preceding
aggregated NAL unit in the same AP plus the value of the DOND field aggregated NAL unit in the same AP plus the value of the DOND field
plus 1 modulo 65536. Otherwise (tx-mode is equal to "SST" and plus 1 modulo 65536. Otherwise (tx-mode is equal to "SSM" and
sprop-max-don-diff is equal to 0), the DOND field MUST NOT be sprop-max-don-diff is equal to 0), the DOND field MUST NOT be
present in an aggregation unit that is not the first aggregation present in an aggregation unit that is not the first aggregation
unit in an AP, and in this case the transmission order and decoding unit in an AP, and in this case the transmission order and decoding
order of NAL units carried in the AP are the same as the order in order of NAL units carried in the AP are the same as the order in
which the NAL units appear in the AP. which the NAL units appear in the AP.
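As an illustration of the DONL and DOND rules for aggregation units, the following Python sketch walks an AP payload and derives DON for each aggregated NAL unit. It is a sketch under stated assumptions, not the draft's reference procedure. The function name is hypothetical, the AP is assumed to start with a two-octet payload header, and the first aggregation unit is assumed to be laid out as DONL (when present), 16-bit size, NAL unit.

```python
import struct


def parse_aggregation_packet(payload, don_fields_present):
    """Return a list of (don, nal_unit) pairs; don is None without DONL/DOND."""
    offset = 2  # assumption: skip the two-octet AP payload header
    units = []
    don = None
    while offset < len(payload):
        if don_fields_present:
            if not units:
                # First aggregation unit: 16-bit DONL, network byte order.
                (don,) = struct.unpack_from("!H", payload, offset)
                offset += 2
            else:
                # Later aggregation units: 8-bit DOND; DON advances by
                # DOND plus 1, modulo 65536.
                dond = payload[offset]
                offset += 1
                don = (don + dond + 1) % 65536
        # 16-bit NAL unit size, including the NAL unit header.
        (size,) = struct.unpack_from("!H", payload, offset)
        offset += 2
        nal_unit = payload[offset:offset + size]
        if len(nal_unit) != size:
            raise ValueError("truncated aggregation unit")
        offset += size
        units.append((don, nal_unit))
    return units
```

When the DONL and DOND fields are absent, the list order returned by the sketch is already the decoding order of the aggregated NAL units.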
Figure 7 presents an example of an AP that contains two aggregation Figure 7 presents an example of an AP that contains two aggregation
units, labeled as 1 and 2 in the figure, without the DONL and DOND units, labeled as 1 and 2 in the figure, without the DONL and DOND
fields being present. fields being present.
skipping to change at page 37, line 27 skipping to change at page 37, line 27
fragment of a fragmented NAL unit, the E bit MUST be set to zero. fragment of a fragmented NAL unit, the E bit MUST be set to zero.
FuType: 6 bits FuType: 6 bits
The field FuType MUST be equal to the field Type of the The field FuType MUST be equal to the field Type of the
fragmented NAL unit. fragmented NAL unit.
The DONL field, when present, specifies the value of the 16 least The DONL field, when present, specifies the value of the 16 least
significant bits of the decoding order number of the fragmented NAL significant bits of the decoding order number of the fragmented NAL
unit. unit.
If tx-mode is equal to "MST" or sprop-max-don-diff is greater than If tx-mode is equal to "MSM" or sprop-max-don-diff is greater than
0, and the S bit is equal to 1, the DONL field MUST be present in 0, and the S bit is equal to 1, the DONL field MUST be present in
the FU, and the variable DON for the fragmented NAL unit is derived the FU, and the variable DON for the fragmented NAL unit is derived
as equal to the value of the DONL field. Otherwise (tx-mode is as equal to the value of the DONL field. Otherwise (tx-mode is
equal to "SST" and sprop-max-don-diff is equal to 0, or the S bit is equal to "SSM" and sprop-max-don-diff is equal to 0, or the S bit is
equal to 0), the DONL field MUST NOT be present in the FU. equal to 0), the DONL field MUST NOT be present in the FU.
A non-fragmented NAL unit MUST NOT be transmitted in one FU; i.e. A non-fragmented NAL unit MUST NOT be transmitted in one FU; i.e.
the Start bit and End bit MUST NOT both be set to one in the same FU the Start bit and End bit MUST NOT both be set to one in the same FU
header. header.
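The FU rules above can be sketched in Python as follows. This is illustrative only. The function name is hypothetical, and the sketch assumes that the FU header is the single octet after a two-octet payload header, with S, E, and the 6-bit FuType ordered from the most to the least significant bit.

```python
import struct


def parse_fu(payload, don_fields_present):
    """Return (start, end, fu_type, don, fragment) for one FU payload."""
    fu_header = payload[2]  # assumption: follows a two-octet payload header
    start = bool(fu_header & 0x80)  # S bit
    end = bool(fu_header & 0x40)    # E bit
    fu_type = fu_header & 0x3F      # FuType: 6 bits
    if start and end:
        # A non-fragmented NAL unit must not be carried in one FU.
        raise ValueError("S and E bits must not both be set")
    offset = 3
    don = None
    # DONL is carried only in the first fragment, and only when
    # decoding order numbers are in use for the stream.
    if don_fields_present and start:
        (don,) = struct.unpack_from("!H", payload, offset)
        offset += 2
    return start, end, fu_type, don, payload[offset:]
```

Because DONL appears only when the S bit is set, a receiver that loses the first fragment cannot recover the DON of that NAL unit from the later fragments.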
The FU payload consists of fragments of the payload of the The FU payload consists of fragments of the payload of the
fragmented NAL unit so that if the FU payloads of consecutive FUs, fragmented NAL unit so that if the FU payloads of consecutive FUs,
starting with an FU with the S bit equal to 1 and ending with an FU starting with an FU with the S bit equal to 1 and ending with an FU
with the E bit equal to 1, are sequentially concatenated, the with the E bit equal to 1, are sequentially concatenated, the
skipping to change at page 45, line 13 skipping to change at page 45, line 13
MUST be equal to 0. Reserved for future extensions. MUST be equal to 0. Reserved for future extensions.
The value of PHSsize MUST be set to 3. Receivers MUST allow other The value of PHSsize MUST be set to 3. Receivers MUST allow other
values of the fields F0, F1, F2, Y, and PHSsize, and MUST ignore any values of the fields F0, F1, F2, Y, and PHSsize, and MUST ignore any
additional fields, when present, beyond those specified above in the PHES. additional fields, when present, beyond those specified above in the PHES.
5. Packetization Rules 5. Packetization Rules
The following packetization rules apply: The following packetization rules apply:
o If tx-mode is equal to "MST" or sprop-max-don-diff is greater o If tx-mode is equal to "MSM" or sprop-max-don-diff is greater
than 0 for an RTP stream, the transmission order of NAL units than 0 for an RTP stream, the transmission order of NAL units
carried in the RTP stream MAY be different than the NAL unit carried in the RTP stream MAY be different than the NAL unit
decoding order. Otherwise (tx-mode is equal to "SST" and sprop- decoding order. Otherwise (tx-mode is equal to "SSM" and sprop-
max-don-diff is equal to 0 for an RTP stream), the transmission max-don-diff is equal to 0 for an RTP stream), the transmission
order of NAL units carried in the RTP stream MUST be the same as order of NAL units carried in the RTP stream MUST be the same as
the NAL unit decoding order. the NAL unit decoding order.
o A NAL unit of a small size SHOULD be encapsulated in an o A NAL unit of a small size SHOULD be encapsulated in an
aggregation packet together with one or more other NAL units in aggregation packet together with one or more other NAL units in
order to avoid the unnecessary packetization overhead for small order to avoid the unnecessary packetization overhead for small
NAL units. For example, non-VCL NAL units such as access unit NAL units. For example, non-VCL NAL units such as access unit
delimiters, parameter sets, or SEI NAL units are typically small delimiters, parameter sets, or SEI NAL units are typically small
and can often be aggregated with VCL NAL units without violating and can often be aggregated with VCL NAL units without violating
skipping to change at page 45, line 41 skipping to change at page 45, line 41
together with its associated VCL NAL unit, as typically a non-VCL together with its associated VCL NAL unit, as typically a non-VCL
NAL unit would be meaningless without the associated VCL NAL unit NAL unit would be meaningless without the associated VCL NAL unit
being available. being available.
o For carrying exactly one NAL unit in an RTP packet, a single NAL o For carrying exactly one NAL unit in an RTP packet, a single NAL
unit packet MUST be used. unit packet MUST be used.
6. De-packetization Process 6. De-packetization Process
The general concept behind de-packetization is to get the NAL units The general concept behind de-packetization is to get the NAL units
out of the RTP packets in an RTP stream and all the dependent RTP out of the RTP packets in an RTP stream and all RTP streams the RTP
streams, if any, and pass them to the decoder in the NAL unit stream depends on, if any, and pass them to the decoder in the NAL
decoding order. unit decoding order.
The de-packetization process is implementation dependent. The de-packetization process is implementation dependent.
Therefore, the following description should be seen as an example of Therefore, the following description should be seen as an example of
a suitable implementation. Other schemes may be used as well as a suitable implementation. Other schemes may be used as well as
long as the output for the same input is the same as the process long as the output for the same input is the same as the process
described below. The output is the same when the set of output NAL described below. The output is the same when the set of output NAL
units and their order are both identical. Optimizations relative to units and their order are both identical. Optimizations relative to
the described algorithms are possible. the described algorithms are possible.
All normal RTP mechanisms related to buffer management apply. In All normal RTP mechanisms related to buffer management apply. In
skipping to change at page 46, line 29 skipping to change at page 46, line 29
NAL units with NAL unit type values in the range of 0 to 47, NAL units with NAL unit type values in the range of 0 to 47,
inclusive, may be passed to the decoder. NAL-unit-like structures inclusive, may be passed to the decoder. NAL-unit-like structures
with NAL unit type values in the range of 48 to 63, inclusive, MUST with NAL unit type values in the range of 48 to 63, inclusive, MUST
NOT be passed to the decoder. NOT be passed to the decoder.
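The type-range rule above amounts to a simple filter. The Python sketch below is illustrative. The function names are hypothetical, and the bit position of the type field (the six bits following the first bit of the NAL unit header) is an assumption taken from the HEVC NAL unit header layout.

```python
def nal_unit_type(nal_unit):
    # Assumption: the type occupies the 6 bits after the first bit
    # of the first header octet.
    return (nal_unit[0] >> 1) & 0x3F


def may_pass_to_decoder(nal_unit):
    # Types 0..47 may be passed on; 48..63 are NAL-unit-like
    # structures that must not reach the decoder.
    return 0 <= nal_unit_type(nal_unit) <= 47
```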
The receiver includes a receiver buffer, which is used to compensate The receiver includes a receiver buffer, which is used to compensate
for transmission delay jitter within individual RTP streams and for transmission delay jitter within individual RTP streams and
across RTP streams, to reorder NAL units from transmission order to across RTP streams, to reorder NAL units from transmission order to
the NAL unit decoding order, and to recover the NAL unit decoding the NAL unit decoding order, and to recover the NAL unit decoding
order in MST, when applicable. In this section, the receiver order in MSM, when applicable. In this section, the receiver
operation is described under the assumption that there is no operation is described under the assumption that there is no
transmission delay jitter within a packet stream and across RTP transmission delay jitter within an RTP stream and across RTP
streams. To distinguish it from a practical receiver buffer that streams. To distinguish it from a practical receiver buffer that
is also used for compensation of transmission delay jitter, the is also used for compensation of transmission delay jitter, the
receiver buffer is hereafter called the de-packetization buffer in receiver buffer is hereafter called the de-packetization buffer in
this section. Receivers should also prepare for transmission delay this section. Receivers should also prepare for transmission delay
jitter; i.e. either reserve separate buffers for transmission delay jitter; i.e. either reserve separate buffers for transmission delay
jitter buffering and de-packetization buffering or use a receiver jitter buffering and de-packetization buffering or use a receiver
buffer for both transmission delay jitter and de-packetization. buffer for both transmission delay jitter and de-packetization.
Moreover, receivers should take transmission delay jitter into Moreover, receivers should take transmission delay jitter into
account in the buffering operation; e.g. by additional initial account in the buffering operation; e.g. by additional initial
buffering before starting of decoding and playback. buffering before starting of decoding and playback.
skipping to change at page 47, line 18 skipping to change at page 47, line 18
There are two buffering states in the receiver: initial buffering There are two buffering states in the receiver: initial buffering
and buffering while playing. Initial buffering starts when the and buffering while playing. Initial buffering starts when the
reception is initialized. After initial buffering, decoding and reception is initialized. After initial buffering, decoding and
playback are started, and the buffering-while-playing mode is used. playback are started, and the buffering-while-playing mode is used.
Regardless of the buffering state, the receiver stores incoming NAL Regardless of the buffering state, the receiver stores incoming NAL
units, in reception order, into the de-packetization buffer. NAL units, in reception order, into the de-packetization buffer. NAL
units carried in RTP packets are stored in the de-packetization units carried in RTP packets are stored in the de-packetization
buffer individually, and the value of AbsDon is calculated and buffer individually, and the value of AbsDon is calculated and
stored for each NAL unit. When MST is in use, NAL units of all RTP stored for each NAL unit. When MSM is in use, NAL units of all RTP
streams of a bitstream are stored in the same de-packetization streams of a bitstream are stored in the same de-packetization
buffer. When NAL units carried in any two RTP streams are available buffer. When NAL units carried in any two RTP streams are available
to be placed into the de-packetization buffer, those NAL units to be placed into the de-packetization buffer, those NAL units
carried in the RTP stream that is lower in the dependency tree are carried in the RTP stream that is lower in the dependency tree are
placed into the buffer first. For example, if RTP stream A depends placed into the buffer first. For example, if RTP stream A depends
on RTP stream B, then NAL units carried in RTP stream B are placed on RTP stream B, then NAL units carried in RTP stream B are placed
into the buffer first. into the buffer first.
Initial buffering lasts until condition A (the difference between Initial buffering lasts until condition A (the difference between
the greatest and smallest AbsDon values of the NAL units in the de- the greatest and smallest AbsDon values of the NAL units in the de-
skipping to change at page 48, line 33 skipping to change at page 48, line 33
The receiver MUST ignore any unrecognized parameter. The receiver MUST ignore any unrecognized parameter.
Media Type name: video Media Type name: video
Media subtype name: H265 Media subtype name: H265
Required parameters: none Required parameters: none
OPTIONAL parameters: OPTIONAL parameters:
profile-space, profile-id: profile-space, tier-flag, profile-id, profile-compatibility-
indicator, interop-constraints, and level-id:
The profile-space parameter indicates the context for These parameters indicate the profile, tier, default level,
interpretation of the profile-id parameter value. The and some constraints of the bitstream carried by the RTP
profile, which specifies the subset of coding tools that may stream and all RTP streams the RTP stream depends on, or a
have been used to generate the bitstream or that the receiver specific set of the profile, tier, default level, and some
supports, as specified in [HEVC], is defined by the constraints the receiver supports.
combination of profile-space and profile-id.
The profile and some constraints are indicated collectively by
profile-space, profile-id, profile-compatibility-indicator,
and interop-constraints. The profile specifies the subset of
coding tools that may have been used to generate the bitstream
or that the receiver supports.
Informative note: There are 32 values of profile-id, and
there are 32 flags in profile-compatibility-indicator, each
flag corresponding to one value of profile-id. According
to HEVC version 1 in [HEVC], when more than one of the 32
flags is set for a bitstream, the bitstream would comply
with all the profiles corresponding to the set flags.
However, in a draft of HEVC version 2 in [HEVC draft v2],
subclause A.3.5, 19 Format Range Extensions profiles have
been specified, all using the same value of profile-id (4),
differentiated by some of the 48 bits in interop-
constraints - this (rather unexpected way of profile
signalling) means that one of the 32 flags may correspond
to multiple profiles. To be able to support whatever HEVC
extension profile that might be specified and indicated
using profile-space, profile-id, profile-compatibility-
indicator, and interop-constraints in the future, it would
be safe to require symmetric use of these parameters in SDP
offer/answer unless recv-sub-layer-id is included in the
SDP answer for choosing one of the sub-layers offered.
The tier is indicated by tier-flag. The default level is
indicated by level-id. The tier and the default level specify
the limits on values of syntax elements or arithmetic
combinations of values of syntax elements that are followed
when generating the bitstream or that the receiver supports.
A set of profile-space, tier-flag, profile-id, profile-
compatibility-indicator, interop-constraints, and level-id
parameters ptlA is said to be consistent with another set of
these parameters ptlB if any decoder that conforms to the
profile, tier, level, and constraints indicated by ptlB can
decode any bitstream that conforms to the profile, tier,
level, and constraints indicated by ptlA.
In SDP offer/answer, when the SDP answer does not include the
recv-sub-layer-id parameter that is less than the sprop-sub-
layer-id parameter in the SDP offer, the following applies:
o The profile-space, tier-flag, profile-id, profile-
compatibility-indicator, and interop-constraints
parameters MUST be used symmetrically, i.e. the value of
each of these parameters in the offer MUST be the same as
that in the answer, either explicitly signalled or
implicitly inferred.
o The level-id parameter is changeable as long as the
highest level indicated by the answer is either equal to
or lower than that in the offer. Note that the highest
level is indicated by level-id and max-recv-level-id
together.
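The two rules in the list above can be checked mechanically. The following Python sketch is illustrative only and covers just this case, in which the answer does not select a lower sub-layer through recv-sub-layer-id. The function names are hypothetical, both parameter sets are assumed to be dictionaries with inferred defaults already filled in, and the highest level is assumed to be max-recv-level-id when present and level-id otherwise.

```python
SYMMETRIC = ("profile-space", "tier-flag", "profile-id",
             "profile-compatibility-indicator", "interop-constraints")


def highest_level(params):
    # Assumption: max-recv-level-id, when present, overrides level-id
    # as the highest level indicated.
    return params.get("max-recv-level-id", params["level-id"])


def answer_is_acceptable(offer, answer):
    """Both arguments are dicts with inferred defaults already applied."""
    # Each symmetric parameter must have the same value in offer and answer.
    if any(offer.get(name) != answer.get(name) for name in SYMMETRIC):
        return False
    # The level may change, but only to an equal or lower highest level.
    return highest_level(answer) <= highest_level(offer)
```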
In SDP offer/answer, when the SDP answer does include the
recv-sub-layer-id parameter that is less than the sprop-sub-
layer-id parameter in the SDP offer, the set of profile-space,
tier-flag, profile-id, profile-compatibility-indicator,
interop-constraints, and level-id parameters included in the
answer MUST be consistent with that for the chosen sub-layer
representation as indicated in the SDP offer, with the
exception that the level-id parameter in the SDP answer is
changeable as long as the highest level indicated by the answer
is either lower than or equal to that in the offer.
More specifications of these parameters, including how they
relate to the values of the profile, tier, and level syntax
elements specified in [HEVC] are provided below.
profile-space, profile-id:
The value of profile-space MUST be in the range of 0 to 3, The value of profile-space MUST be in the range of 0 to 3,
inclusive. The value of profile-id MUST be in the range of 0 inclusive. The value of profile-id MUST be in the range of 0
to 31, inclusive. to 31, inclusive.
If the profile-space and profile-id parameters are used to When profile-space is not present, a value of 0 MUST be
indicate properties of a bitstream, it indicates that, to inferred. When profile-id is not present, a value of 1 (i.e.
decode the bitstream, the minimum subset of coding tools a the Main profile) MUST be inferred.
decoder has to support is the profile specified by both
parameters.
If the profile-space and profile-id parameters are used for
capability exchange or session setup, it indicates the subset
of coding tools, which is equal to the profile, that the codec
supports for both receiving and sending.
If no profile-space is present, a value of 0 MUST be inferred
and if no profile-id is present the Main profile (i.e. a value
of 1) MUST be inferred.
When used to indicate properties of a bitstream, the profile- When used to indicate properties of a bitstream, profile-space
space and profile-id parameters are derived from the SPS or and profile-id are derived from the profile, tier, and level
VPS NAL units as follows, where general_profile_space, syntax elements in SPS or VPS NAL units as follows, where
general_profile_idc, sub_layer_profile_space[j], and general_profile_space, general_profile_idc,
sub_layer_profile_idc[j] are specified in [HEVC]. sub_layer_profile_space[j], and sub_layer_profile_idc[j] are
specified in [HEVC]:
If the RTP stream is the highest RTP stream, the following If the RTP stream is the highest RTP stream, the following
applies: applies:
o profile_space = general_profile_space o profile_space = general_profile_space
o profile_id = general_profile_idc o profile_id = general_profile_idc
Otherwise (the RTP stream is a dependent RTP stream), the Otherwise (the RTP stream is a dependee RTP stream), the
following applies, with j being the value of the sprop-sub- following applies, with j being the value of the sprop-sub-
layer-id parameter: layer-id parameter:
o profile_space = sub_layer_profile_space[j] o profile_space = sub_layer_profile_space[j]
o profile_id = sub_layer_profile_idc[j] o profile_id = sub_layer_profile_idc[j]
tier-flag, level-id: tier-flag, level-id:
The tier-flag parameter indicates the context for
interpretation of the level-id value. The default level,
which limits values of syntax elements or on arithmetic
combinations of values of syntax elements, as specified in
[HEVC], is defined by the combination of tier-flag and level-
id.
The value of tier-flag MUST be in the range of 0 to 1, The value of tier-flag MUST be in the range of 0 to 1,
inclusive. The value of level-id MUST be in the range of 0 inclusive. The value of level-id MUST be in the range of 0
to 255, inclusive. to 255, inclusive.
If the tier-flag and level-id parameters are used to indicate If the tier-flag and level-id parameters are used to indicate
properties of a bitstream, it indicates that, to decode the properties of a bitstream, they indicate the tier and the
bitstream the lowest level the decoder has to support is the highest level the bitstream complies with.
default level.
If the tier-flag and level-id parameters are used for If the tier-flag and level-id parameters are used for
capability exchange or session setup, the following applies. capability exchange, the following applies. If max-recv-
If max-recv-level-id is not present, the default level defined level-id is not present, the default level defined by level-id
by tier-flag and level-id indicates the highest level the indicates the highest level the codec wishes to support.
codec wishes to support. Otherwise, tier-flag and max-recv- Otherwise, max-recv-level-id indicates the highest level the
level-id indicate the highest level the codec supports for codec supports for receiving. For either receiving or
receiving. For either receiving or sending, all levels that sending, all levels that are lower than the highest level
are lower than the highest level supported MUST also be supported MUST also be supported.
supported.
If no tier-flag is present, a value of 0 MUST be inferred and If no tier-flag is present, a value of 0 MUST be inferred and
if no level-id is present, a value of 93 (i.e. level 3.1) MUST if no level-id is present, a value of 93 (i.e. level 3.1) MUST
be inferred. be inferred.
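The value ranges and inferred defaults stated above for profile-space, profile-id, tier-flag, and level-id can be summarized in a short Python sketch. It is illustrative rather than normative. The function name is hypothetical, and the input is assumed to be a semicolon-separated list of name=value pairs; parameters the sketch does not recognize are ignored.

```python
DEFAULTS = {"profile-space": 0, "profile-id": 1, "tier-flag": 0, "level-id": 93}
RANGES = {"profile-space": (0, 3), "profile-id": (0, 31),
          "tier-flag": (0, 1), "level-id": (0, 255)}


def profile_tier_level(fmtp):
    """Parse 'name=value;name=value' and return the four parameters."""
    given = {}
    for item in fmtp.split(";"):
        name, sep, value = item.strip().partition("=")
        if sep and name in RANGES:
            given[name] = int(value)
    result = {}
    for name, default in DEFAULTS.items():
        # Absent parameters take the inferred default value.
        value = given.get(name, default)
        low, high = RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} outside {low}..{high}")
        result[name] = value
    return result
```

With an empty parameter string the sketch yields the Main profile (profile-id 1) at level 3.1 (level-id 93) in tier 0, which matches the inferred values above.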
When used to indicate properties of a bitstream, the tier-flag When used to indicate properties of a bitstream, the tier-flag
and level-id parameters are derived from the SPS or VPS NAL and level-id parameters are derived from the profile, tier,
units as follows, where general_tier_flag, general_level_idc, and level syntax elements in SPS or VPS NAL units as follows,
where general_tier_flag, general_level_idc,
sub_layer_tier_flag[j], and sub_layer_level_idc[j] are sub_layer_tier_flag[j], and sub_layer_level_idc[j] are
specified in [HEVC]. specified in [HEVC]:
If the RTP stream is the highest RTP stream, the following If the RTP stream is the highest RTP stream, the following
applies: applies:
o tier-flag = general_tier_flag o tier-flag = general_tier_flag
o level-id = general_level_idc o level-id = general_level_idc
Otherwise (the RTP stream is a dependent RTP stream), the Otherwise (the RTP stream is a dependee RTP stream), the
following applies, with j being the value of the sprop-sub- following applies, with j being the value of the sprop-sub-
layer-id parameter: layer-id parameter:
o tier-flag = sub_layer_tier_flag[j] o tier-flag = sub_layer_tier_flag[j]
o level-id = sub_layer_level_idc[j] o level-id = sub_layer_level_idc[j]
interop-constraints: interop-constraints:
A base16 [RFC4648] (hexadecimal) representation of six bytes A base16 [RFC4648] (hexadecimal) representation of six bytes
of data, consisting of progressive_source_flag, of data, consisting of progressive_source_flag,
skipping to change at page 51, line 36 skipping to change at page 53, line 9
general_progressive_source_flag, general_progressive_source_flag,
general_interlaced_source_flag, general_interlaced_source_flag,
general_non_packed_constraint_flag, general_non_packed_constraint_flag,
general_frame_only_constraint_flag, general_frame_only_constraint_flag,
general_reserved_zero_44bits, general_reserved_zero_44bits,
sub_layer_progressive_source_flag[j], sub_layer_progressive_source_flag[j],
sub_layer_interlaced_source_flag[j], sub_layer_interlaced_source_flag[j],
sub_layer_non_packed_constraint_flag[j], sub_layer_non_packed_constraint_flag[j],
sub_layer_frame_only_constraint_flag[j], and sub_layer_frame_only_constraint_flag[j], and
sub_layer_reserved_zero_44bits[j] are specified in [HEVC]. sub_layer_reserved_zero_44bits[j] are specified in [HEVC]:
If the RTP stream is the highest RTP stream, the following If the RTP stream is the highest RTP stream, the following
applies: applies:
o progressive_source_flag = general_progressive_source_flag o progressive_source_flag = general_progressive_source_flag
o interlaced_source_flag = general_interlaced_source_flag o interlaced_source_flag = general_interlaced_source_flag
o non_packed_constraint_flag = o non_packed_constraint_flag =
general_non_packed_constraint_flag general_non_packed_constraint_flag
o frame_only_constraint_flag = o frame_only_constraint_flag =
general_frame_only_constraint_flag general_frame_only_constraint_flag
o reserved_zero_44bits = general_reserved_zero_44bits o reserved_zero_44bits = general_reserved_zero_44bits
Otherwise (the RTP stream is a dependent RTP stream), the Otherwise (the RTP stream is a dependee RTP stream), the
following applies, with j being the value of the sprop-sub- following applies, with j being the value of the sprop-sub-
layer-id parameter: layer-id parameter:
o progressive_source_flag = o progressive_source_flag =
sub_layer_progressive_source_flag[j] sub_layer_progressive_source_flag[j]
o interlaced_source_flag = o interlaced_source_flag =
sub_layer_interlaced_source_flag[j] sub_layer_interlaced_source_flag[j]
o non_packed_constraint_flag = o non_packed_constraint_flag =
sub_layer_non_packed_constraint_flag[j] sub_layer_non_packed_constraint_flag[j]
o frame_only_constraint_flag = o frame_only_constraint_flag =
sub_layer_frame_only_constraint_flag[j] sub_layer_frame_only_constraint_flag[j]
o reserved_zero_44bits = sub_layer_reserved_zero_44bits[j] o reserved_zero_44bits = sub_layer_reserved_zero_44bits[j]
When the interop-constraints parameter is used for capability Using interop-constraints for capability exchange results in a
exchange or session setup, for both the sent bitstream, when requirement on any bitstream to be compliant with the interop-
present, and the received bitstream, when present, the values constraints.
of general_progressive_source_flag,
general_interlaced_source_flag,
general_non_packed_constraint_flag,
general_frame_only_constraint_flag, and
general_reserved_zero_44bits in the SPS or VPS NAL units MUST
be equal to progressive_source_flag, interlaced_source_flag,
non_packed_constraint_flag, frame_only_constraint_flag, and
reserved_zero_44bits, respectively, and for any value of j,
the values of sub_layer_progressive_source_flag[j],
sub_layer_interlaced_source_flag[j],
sub_layer_non_packed_constraint_flag[j],
sub_layer_frame_only_constraint_flag[j], and
sub_layer_reserved_zero_44bits[j] in the SPS or VPS NAL units
MUST be equal to progressive_source_flag,
interlaced_source_flag, non_packed_constraint_flag,
frame_only_constraint_flag, and reserved_zero_44bits,
respectively.
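As an illustration of the six-byte layout, the following Python sketch unpacks an interop-constraints value. It assumes the five fields are packed most-significant-bit first in the order listed above (four flag bits followed by the 44 reserved bits, 48 bits in total). The function name parse_interop_constraints and the sample value are hypothetical and not part of this memo.

```python
def parse_interop_constraints(hex_value: str) -> dict:
    """Split a base16 interop-constraints value into its five fields.

    Assumption: fields are packed MSB-first in the order the memo lists them.
    """
    raw = bytes.fromhex(hex_value)
    if len(raw) != 6:
        raise ValueError("interop-constraints must encode exactly six bytes")
    bits = int.from_bytes(raw, "big")  # 48-bit integer, first field on top
    return {
        "progressive_source_flag": (bits >> 47) & 1,
        "interlaced_source_flag": (bits >> 46) & 1,
        "non_packed_constraint_flag": (bits >> 45) & 1,
        "frame_only_constraint_flag": (bits >> 44) & 1,
        "reserved_zero_44bits": bits & ((1 << 44) - 1),
    }


# Hypothetical sample: 0xB0 = 1011 0000, so three of the four flags are set.
fields = parse_interop_constraints("B00000000000")
```

A sender or receiver could compare the resulting fields against the corresponding syntax elements of the SPS or VPS when checking compliance.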
profile-compatibility-indicator: profile-compatibility-indicator:
A base16 [RFC4648] representation of the four bytes A base16 [RFC4648] representation of four bytes of data.
representing the 32 profile compatibility flags in the SPS or
VPS NAL units. A decoder conforming to a certain profile may
be able to decode bitstreams conforming to other profiles.
The profile-compatibility-indicator provides exact information
of the ability of a decoder conforming to a certain profile to
decode bitstreams conforming to another profile. More
concretely, if the profile compatibility flag corresponding to
the profile a decoder conforms to is set, then the decoder is
able to decode any bitstream with the flag set, irrespective
of the profile the bitstream conforms to (provided that the
decoder supports the highest level of the bitstream).
When profile-compatibility-indicator is used to indicate When profile-compatibility-indicator is used to indicate
properties of a bitstream, the following applies, where properties of a bitstream, the following applies, where
general_profile_compatibility_flag[j] and general_profile_compatibility_flag[j] and
sub_layer_profile_compatibility_flag[i][j] are specified in sub_layer_profile_compatibility_flag[i][j] are specified in
[HEVC]. [HEVC]:
The profile-compatibility-indicator in this case indicates
additional profiles to the profile defined by
profile_space, profile_id, and interop-constraints the
bitstream conforms to. A decoder that conforms to any of
all the profiles the bitstream conforms to would be capable
of decoding the bitstream. These additional profiles are
defined by profile-space, each set bit of profile-
compatibility-indicator, and interop-constraints.
If the RTP stream is the highest RTP stream, the following If the RTP stream is the highest RTP stream, the following
applies with j = 0..31: applies for each value of j in the range of 0 to 31,
inclusive:
o The 32 flags = general_profile_compatibility_flag[j] o bit j of profile-compatibility-indicator =
general_profile_compatibility_flag[j]
Otherwise (the RTP stream is a dependent RTP stream), the Otherwise (the RTP stream is a dependee RTP stream), the
following applies with i being the value of the sprop-sub- following applies for i equal to sprop-sub-layer-id and for
layer-id parameter and j = 0..31: each value of j in the range of 0 to 31, inclusive:
o The 32 flags = sub_layer_profile_compatibility_flag[i][j] o bit j of profile-compatibility-indicator =
sub_layer_profile_compatibility_flag[i][j]
When profile-compatibility-indicator is used for capability Using profile-compatibility-indicator for capability exchange
exchange or session setup, the values of results in a requirement on any bitstream to be compliant with
general_profile_compatibility_flag[j] with j = 0..31 MUST be the profile-compatibility-indicator. This is intended to
equal to bits 0 to 31, inclusive, of profile-compatibility- handle cases where any future HEVC profile is defined as an
indicator, respectively, and for any value of i, the values of intersection of two or more profiles.
sub_layer_profile_compatibility_flag[i][j] with j = 0..31 MUST
be equal to bits 0 to 31, inclusive, of profile-compatibility- If this parameter is not present, this parameter defaults to
indicator, respectively. the following: bit j, with j equal to profile-id, of profile-
compatibility-indicator is inferred to be equal to 1, and all
other bits are inferred to be equal to 0.
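The mapping between the four-byte value and the 32 flags, including the default that applies when the parameter is absent, can be sketched in Python as follows. This is a minimal sketch under one labeled assumption: bit 0 is taken to be the most significant bit of the first byte, mirroring the order in which the flags appear in the bitstream. The function name profile_compat_flags is hypothetical.

```python
from typing import List, Optional


def profile_compat_flags(profile_id: int,
                         indicator_hex: Optional[str] = None) -> List[int]:
    """Return the 32 profile compatibility flags as a list indexed by j."""
    if not 0 <= profile_id <= 31:
        raise ValueError("profile-id must be in the range 0..31")
    if indicator_hex is None:
        # Parameter absent: only bit j == profile-id is inferred to be 1.
        return [1 if j == profile_id else 0 for j in range(32)]
    raw = bytes.fromhex(indicator_hex)
    if len(raw) != 4:
        raise ValueError("profile-compatibility-indicator must be four bytes")
    word = int.from_bytes(raw, "big")
    # Assumption: bit j is counted from the MSB of the first byte.
    return [(word >> (31 - j)) & 1 for j in range(32)]


default_flags = profile_compat_flags(profile_id=1)
explicit_flags = profile_compat_flags(profile_id=1, indicator_hex="60000000")
```

With the sample value 60000000 (hypothetical), bits 1 and 2 are set, so under the stated bit-order assumption the bitstream would be declared compatible with the profiles identified by 1 and 2.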
sprop-sub-layer-id: sprop-sub-layer-id:
This parameter MAY be used to indicate the highest allowed This parameter MAY be used to indicate the highest allowed
value of TID in the bitstream. When not present, the value of value of TID in the bitstream. When not present, the value of
sprop-sub-layer-id is inferred to be equal to 6. sprop-sub-layer-id is inferred to be equal to 6.
The value of sprop-sub-layer-id MUST be in the range of 0 The value of sprop-sub-layer-id MUST be in the range of 0
to 6, inclusive. to 6, inclusive.
recv-sub-layer-id: recv-sub-layer-id:
This parameter MAY be used to signal a receiver's choice of This parameter MAY be used to signal a receiver's choice of
the offered or declared sub-layers in the sprop-vps. The the offered or declared sub-layer representations in the
value of recv-sub-layer-id indicates the TID of the highest sprop-vps. The value of recv-sub-layer-id indicates the TID
sub-layer of the bitstream that a receiver supports. When not of the highest sub-layer of the bitstream that a receiver
present, the value of recv-sub-layer-id is inferred to be supports. When not present, the value of recv-sub-layer-id is
equal to sprop-sub-layer-id. inferred to be equal to the value of the sprop-sub-layer-id
parameter in the SDP offer.
The value of recv-sub-layer-id MUST be in the range of 0 to 6, The value of recv-sub-layer-id MUST be in the range of 0 to 6,
inclusive. inclusive.
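The inference rules for the two sub-layer parameters can be summarized in a short Python sketch. The function name resolve_sub_layer_ids is hypothetical; the defaults and the range check follow the text above.

```python
from typing import Optional, Tuple


def resolve_sub_layer_ids(sprop_sub_layer_id: Optional[int] = None,
                          recv_sub_layer_id: Optional[int] = None) -> Tuple[int, int]:
    """Apply the defaults for sprop-sub-layer-id and recv-sub-layer-id."""
    # sprop-sub-layer-id is inferred to be 6 when not present.
    sprop = 6 if sprop_sub_layer_id is None else sprop_sub_layer_id
    # recv-sub-layer-id is inferred to be equal to sprop-sub-layer-id.
    recv = sprop if recv_sub_layer_id is None else recv_sub_layer_id
    for name, value in (("sprop-sub-layer-id", sprop), ("recv-sub-layer-id", recv)):
        if not 0 <= value <= 6:
            raise ValueError(name + " must be in the range 0..6")
    return sprop, recv


both_absent = resolve_sub_layer_ids()
offer_only = resolve_sub_layer_ids(sprop_sub_layer_id=2)
```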
max-recv-level-id: max-recv-level-id:
This parameter MAY be used, together with tier-flag, to This parameter MAY be used to indicate the highest level a
indicate the highest level a receiver supports. The highest receiver supports. The highest level the receiver supports is
level the receiver supports is equal to the value of max-recv- equal to the value of max-recv-level-id divided by 30.
level-id divided by 30 for the Main or High tier (as
determined by tier-flag equal to 0 or 1, respectively).
The value of max-recv-level-id MUST be in the range of 0 The value of max-recv-level-id MUST be in the range of 0
to 255, inclusive. to 255, inclusive.
When max-recv-level-id is not present, the value is inferred When max-recv-level-id is not present, the value is inferred
to be equal to level-id. to be equal to level-id.
max-recv-level-id MUST NOT be present when the highest level max-recv-level-id MUST NOT be present when the highest level
the receiver supports is not higher than the default level. the receiver supports is not higher than the default level.
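The relation between the signaled value and the level number is a division by 30. The following Python sketch applies it, together with the fallback to level-id when max-recv-level-id is absent. The function name highest_supported_level is hypothetical; the sample values 93 and 120 are the level-id values this memo uses for Level 3.1 and Level 4 in its dec-parallel-cap example.

```python
from typing import Optional


def highest_supported_level(level_id: int,
                            max_recv_level_id: Optional[int] = None) -> float:
    """Return the highest level a receiver supports, e.g. 3.1 or 4.0."""
    # When max-recv-level-id is absent, it is inferred to equal level-id.
    value = level_id if max_recv_level_id is None else max_recv_level_id
    if not 0 <= value <= 255:
        raise ValueError("value must be in the range 0..255")
    return value / 30


default_level = highest_supported_level(level_id=93)
raised_level = highest_supported_level(level_id=93, max_recv_level_id=120)
```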
tx-mode: tx-mode:
This parameter indicates whether the transmission mode is SST This parameter indicates whether the transmission mode is SSM
or MST. or MSM.
The value of tx-mode MUST be equal to either "MST" or "SST". The value of tx-mode MUST be equal to either "MSM" or "SSM".
When not present, the value of tx-mode is inferred to be equal When not present, the value of tx-mode is inferred to be equal
to "SST". to "SSM".
If the value is equal to "MST", MST MUST be in use. Otherwise If the value is equal to "MSM", MSM MUST be in use. Otherwise
(the value is equal to "SST"), SST MUST be in use. (the value is equal to "SSM"), SSM MUST be in use.
The value of tx-mode MUST be equal to "MST" for all RTP The value of tx-mode MUST be equal to "MSM" for all RTP
sessions in an MST. sessions in an MSM.
sprop-vps: sprop-vps:
This parameter MAY be used to convey any video parameter set This parameter MAY be used to convey any video parameter set
NAL unit of the bitstream. When present, the parameter MAY be NAL unit of the bitstream for out-of-band transmission of
used to indicate codec capability and sub-stream video parameter sets. The parameter MAY also be used for
characteristics (i.e. properties of sub-layer representations capability exchange and to indicate sub-stream characteristics
as defined in [HEVC]) as well as for out-of-band transmission (i.e. properties of sub-layer representations as defined in
of video parameter sets. The value of the parameter is a [HEVC]). The value of the parameter is a comma-separated
comma-separated (',') list of base64 [RFC4648] representations (',') list of base64 [RFC4648] representations of the video
of the video parameter set NAL units as specified in Section parameter set NAL units as specified in Section 7.3.2.1 of
7.3.2.1 of [HEVC]. [HEVC].
The sprop-vps parameter MAY contain one or more than one video
parameter set NAL unit. However, all other video parameter
sets contained in the sprop-vps parameter MUST be consistent
with the first video parameter set in the sprop-vps parameter.
A video parameter set vpsB is said to be consistent with
another video parameter set vpsA if any decoder that conforms
to the profile, tier, level, and constraints indicated by the
12 bytes of data starting from the syntax element
general_profile_space to the syntax element general_level_idc,
inclusive, in the first profile_tier_level( ) syntax structure
in vpsA can decode any bitstream that conforms to the profile,
tier, level, and constraints indicated by the 12 bytes of data
starting from the syntax element general_profile_space to the
syntax element general_level_idc, inclusive, in the first
profile_tier_level( ) syntax structure in vpsB.
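A receiver handling sprop-vps has to split the comma-separated value and base64-decode each item. The Python sketch below shows one way to do that and reads nal_unit_type from the first byte of the two-byte HEVC NAL unit header (32 identifies a video parameter set). The function names and the three-byte sample payload are hypothetical; a real VPS NAL unit is longer.

```python
import base64
from typing import List


def decode_sprop_list(value: str) -> List[bytes]:
    """Decode a comma-separated list of base64-encoded NAL units."""
    return [base64.b64decode(item, validate=True)
            for item in value.split(",") if item]


def nal_unit_type(nal: bytes) -> int:
    """Extract the 6-bit nal_unit_type from the HEVC NAL unit header."""
    return (nal[0] >> 1) & 0x3F


# Hypothetical, truncated payload used only to exercise the functions.
sample_value = base64.b64encode(b"\x40\x01\x0c").decode("ascii")
sample_types = [nal_unit_type(n) for n in decode_sprop_list(sample_value)]
```

The same splitting and decoding applies to the other comma-separated base64 parameters, with a different expected NAL unit type.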
sprop-sps: sprop-sps:
This parameter MAY be used to convey sequence parameter set This parameter MAY be used to convey sequence parameter set
NAL units of the bitstream for out-of-band transmission of NAL units of the bitstream for out-of-band transmission of
sequence parameter sets. The value of the parameter is a sequence parameter sets. The value of the parameter is a
comma-separated (',') list of base64 [RFC4648] representations comma-separated (',') list of base64 [RFC4648] representations
of the sequence parameter set NAL units as specified in of the sequence parameter set NAL units as specified in
Section 7.3.2.2 of [HEVC]. Section 7.3.2.2 of [HEVC].
skipping to change at page 56, line 22 skipping to change at page 57, line 42
The value of the parameter is a comma-separated (',') list of The value of the parameter is a comma-separated (',') list of
base64 [RFC4648] representations of SEI NAL units as specified base64 [RFC4648] representations of SEI NAL units as specified
in Section 7.3.2.4 of [HEVC]. in Section 7.3.2.4 of [HEVC].
Informative note: Intentionally, no list of applicable or Informative note: Intentionally, no list of applicable or
inapplicable SEI messages is specified here. Conveying inapplicable SEI messages is specified here. Conveying
certain SEI messages in sprop-sei may be sensible in some certain SEI messages in sprop-sei may be sensible in some
application scenarios and meaningless in others. However, application scenarios and meaningless in others. However,
a few examples are described below: a few examples are described below:
1) In an environment where the encoded bitstream was 1) In an environment where the bitstream was created from
created from film-based source material, and no splicing film-based source material, and no splicing is going to
is going to occur during the lifetime of the session, occur during the lifetime of the session, the film grain
the film grain characteristics SEI message or the tone characteristics SEI message or the tone mapping
mapping information SEI message are likely meaningful, information SEI message are likely meaningful, and
and sending them in sprop-sei rather than in the sending them in sprop-sei rather than in the bitstream
bitstream at each entry point may help save bits and at each entry point may help save bits and allows the
allows the renderer to be configured only once, avoiding renderer to be configured only once, avoiding unwanted
unwanted artifacts. artifacts.
2) The structure of pictures information SEI message in 2) The structure of pictures information SEI message in
sprop-sei can be used to inform a decoder of information sprop-sei can be used to inform a decoder of information
on the NAL unit types, picture order count values, and on the NAL unit types, picture order count values, and
prediction dependencies of a sequence of pictures. prediction dependencies of a sequence of pictures.
Having such knowledge can be helpful for error recovery. Having such knowledge can be helpful for error recovery.
3) Examples for SEI messages that would be meaningless to 3) Examples for SEI messages that would be meaningless to
be conveyed in sprop-sei include the decoded picture be conveyed in sprop-sei include the decoded picture
hash SEI message (it is close to impossible that all hash SEI message (it is close to impossible that all
decoded pictures have the same hash-tag), the display decoded pictures have the same hash-tag), the display
orientation SEI message when the device is a handheld orientation SEI message when the device is a handheld
device (as the display orientation may change when the device (as the display orientation may change when the
handheld device is turned around), or the filler payload handheld device is turned around), or the filler payload
SEI message (as there is no point in just having more SEI message (as there is no point in just having more
bits in SDP). bits in SDP).
max-lsr, max-lps, max-cpb, max-dpb, max-br, max-tr, max-tc: max-lsr, max-lps, max-cpb, max-dpb, max-br, max-tr, max-tc:
These parameters MAY be used to signal the capabilities of a These parameters MAY be used to signal the capabilities of a
receiver implementation. These parameters MUST NOT be used receiver implementation. These parameters MUST NOT be used
for any other purpose. The highest level (specified by tier- for any other purpose. The highest level (specified by max-
flag and max-recv-level-id) MUST be such that the receiver is recv-level-id) MUST be such that the receiver is fully capable
fully capable of supporting. max-lsr, max-lps, max-cpb, max- of supporting. max-lsr, max-lps, max-cpb, max-dpb, max-br,
dpb, max-br, max-tr, and max-tc MAY be used to indicate max-tr, and max-tc MAY be used to indicate capabilities of the
capabilities of the receiver that extend the required receiver that extend the required capabilities of the highest
capabilities of the highest level, as specified below. level, as specified below.
When more than one parameter from the set (max-lsr, max-lps, When more than one parameter from the set (max-lsr, max-lps,
max-cpb, max-dpb, max-br, max-tr, max-tc) is present, the max-cpb, max-dpb, max-br, max-tr, max-tc) is present, the
receiver MUST support all signaled capabilities receiver MUST support all signaled capabilities
simultaneously. For example, if both max-lsr and max-br are simultaneously. For example, if both max-lsr and max-br are
present, the highest level with the extension of both the present, the highest level with the extension of both the
picture rate and bitrate is supported. That is, the receiver picture rate and bitrate is supported. That is, the receiver
is able to decode bitstreams in which the luma sample rate is is able to decode bitstreams in which the luma sample rate is
up to max-lsr (inclusive), the bitrate is up to max-br up to max-lsr (inclusive), the bitrate is up to max-br
(inclusive), the coded picture buffer size is derived as (inclusive), the coded picture buffer size is derived as
specified in the semantics of the max-br parameter below, and specified in the semantics of the max-br parameter below, and
the other properties comply with the highest level specified the other properties comply with the highest level specified
by tier-flag and max-recv-level-id. by max-recv-level-id.
Informative note: When the OPTIONAL media type parameters Informative note: When the OPTIONAL media type parameters
are used to signal the properties of a bitstream, and max- are used to signal the properties of a bitstream, and max-
lsr, max-lps, max-cpb, max-dpb, max-br, max-tr, and max-tc lsr, max-lps, max-cpb, max-dpb, max-br, max-tr, and max-tc
are not present, the values of profile-space, profile-id, are not present, the values of profile-space, tier-flag,
tier-flag, and level-id must always be such that the profile-id, profile-compatibility-indicator, interop-
bitstream complies fully with the specified profile and constraints, and level-id must always be such that the
level. bitstream complies fully with the specified profile, tier,
and level.
max-lsr: max-lsr:
The value of max-lsr is an integer indicating the maximum The value of max-lsr is an integer indicating the maximum
processing rate in units of luma samples per second. The max- processing rate in units of luma samples per second. The max-
lsr parameter signals that the receiver is capable of decoding lsr parameter signals that the receiver is capable of decoding
video at a higher rate than is required by the highest level. video at a higher rate than is required by the highest level.
When max-lsr is signaled, the receiver MUST be able to decode When max-lsr is signaled, the receiver MUST be able to decode
bitstreams that conform to the highest level, with the bitstreams that conform to the highest level, with the
exception that the MaxLumaSR value in Table A-2 of [HEVC] for exception that the MaxLumaSR value in Table A-2 of [HEVC] for
skipping to change at page 63, line 40 skipping to change at page 65, line 19
The value of max-fps is not necessarily the picture rate at The value of max-fps is not necessarily the picture rate at
which the maximum picture size can be sent, it constitutes a which the maximum picture size can be sent, it constitutes a
constraint on maximum picture rate for all resolutions. constraint on maximum picture rate for all resolutions.
Informative note: The max-fps parameter is semantically Informative note: The max-fps parameter is semantically
different from max-lsr, max-lps, max-cpb, max-dpb, max-br, different from max-lsr, max-lps, max-cpb, max-dpb, max-br,
max-tr, and max-tc in that max-fps is used to signal a max-tr, and max-tc in that max-fps is used to signal a
constraint, lowering the maximum picture rate from what is constraint, lowering the maximum picture rate from what is
implied by other parameters. implied by other parameters.
The encoder SHOULD use a picture rate equal to or less than The encoder MUST use a picture rate equal to or less than this
this value. An exception is when sending a pre-encoded value. In cases where the max-fps parameter is absent the
bitstream, in which case the picture rate may be greater than encoder is free to choose any picture rate according to the
the value of max-fps. In cases where the max-fps parameter is highest level and any signaled optional parameters.
absent the encoder is free to choose any picture rate
according to the highest level and any signaled optional
parameters.
The value of max-fps MUST be smaller than or equal to the full The value of max-fps MUST be smaller than or equal to the full
picture rate that is implied by the highest level and, when picture rate that is implied by the highest level and, when
present, one or more of the parameters max-lsr, max-lps, and present, one or more of the parameters max-lsr, max-lps, and
max-br. max-br.
sprop-max-don-diff: sprop-max-don-diff:
The value of this parameter MUST be equal to 0, if the RTP The value of this parameter MUST be equal to 0, if the RTP
stream does not depend on other RTP streams and there is no stream does not depend on other RTP streams and there is no
skipping to change at page 64, line 30 skipping to change at page 66, line 6
units naluA and naluB, where naluA follows naluB in decoding units naluA and naluB, where naluA follows naluB in decoding
order and precedes naluB in transmission order. order and precedes naluB in transmission order.
The value of sprop-max-don-diff MUST be an integer in the The value of sprop-max-don-diff MUST be an integer in the
range of 0 to 32767, inclusive. range of 0 to 32767, inclusive.
When not present, the value of sprop-max-don-diff is inferred When not present, the value of sprop-max-don-diff is inferred
to be equal to 0. to be equal to 0.
When the RTP stream depends on one or more other RTP streams When the RTP stream depends on one or more other RTP streams
(in this case tx-mode MUST be equal to "MST" and MST is in (in this case tx-mode MUST be equal to "MSM" and MSM is in
use), this parameter MUST be present and the value MUST be use), this parameter MUST be present and the value MUST be
greater than 0. greater than 0.
Informative note: When the RTP stream does not depend on Informative note: When the RTP stream does not depend on
other RTP streams, either MST or SST may be in use. other RTP streams, either MSM or SSM may be in use.
sprop-depack-buf-nalus: sprop-depack-buf-nalus:
This parameter specifies the maximum number of NAL units that This parameter specifies the maximum number of NAL units that
precede a NAL unit in transmission order and follow the NAL precede a NAL unit in transmission order and follow the NAL
unit in decoding order. unit in decoding order.
The value of sprop-depack-buf-nalus MUST be an integer in the The value of sprop-depack-buf-nalus MUST be an integer in the
range of 0 to 32767, inclusive. range of 0 to 32767, inclusive.
When not present, the value of sprop-depack-buf-nalus is When not present, the value of sprop-depack-buf-nalus is
inferred to be equal to 0. inferred to be equal to 0.
When the RTP stream depends on one or more other RTP streams When the RTP stream depends on one or more other RTP streams
(in this case tx-mode MUST be equal to "MST" and MST is in (in this case tx-mode MUST be equal to "MSM" and MSM is in
use), this parameter MUST be present and the value MUST be use), this parameter MUST be present and the value MUST be
greater than 0. greater than 0.
sprop-depack-buf-bytes: sprop-depack-buf-bytes:
This parameter signals the required size of the de- This parameter signals the required size of the de-
packetization buffer in units of bytes. The value of the packetization buffer in units of bytes. The value of the
parameter MUST be greater than or equal to the maximum buffer parameter MUST be greater than or equal to the maximum buffer
occupancy (in units of bytes) of the de-packetization buffer occupancy (in units of bytes) of the de-packetization buffer
as specified in section 6. as specified in section 6.
The value of sprop-depack-buf-bytes MUST be an integer in the The value of sprop-depack-buf-bytes MUST be an integer in the
range of 0 to 4294967295, inclusive. range of 0 to 4294967295, inclusive.
When the RTP stream depends on one or more other RTP streams When the RTP stream depends on one or more other RTP streams
(in this case tx-mode MUST be equal to "MST" and MST is in (in this case tx-mode MUST be equal to "MSM" and MSM is in
use) or sprop-max-don-diff is present and greater than 0, this use) or sprop-max-don-diff is present and greater than 0, this
parameter MUST be present and the value MUST be greater than parameter MUST be present and the value MUST be greater than
0. 0.
Informative note: The value of sprop-depack-buf-bytes Informative note: The value of sprop-depack-buf-bytes
indicates the required size of the de-packetization buffer indicates the required size of the de-packetization buffer
only. When network jitter can occur, an appropriately only. When network jitter can occur, an appropriately
sized jitter buffer has to be available as well. sized jitter buffer has to be available as well.
depack-buf-cap: depack-buf-cap:
This parameter signals the capabilities of a receiver This parameter signals the capabilities of a receiver
implementation and indicates the amount of de-packetization implementation and indicates the amount of de-packetization
buffer space in units of bytes that the receiver has available buffer space in units of bytes that the receiver has available
for reconstructing the NAL unit decoding order from NAL units for reconstructing the NAL unit decoding order from NAL units
carried in one or more RTP streams. A receiver is able to carried in one or more RTP streams. A receiver is able to
handle any RTP stream, and its dependent RTP streams, when handle any RTP stream, and all RTP streams the RTP stream
present, for which the value of the sprop-depack-buf-bytes depends on, when present, for which the value of the sprop-
parameter is smaller than or equal to this parameter. depack-buf-bytes parameter is smaller than or equal to this
parameter.
When not present, the value of depack-buf-cap is inferred to When not present, the value of depack-buf-cap is inferred to
be equal to 4294967295. The value of depack-buf-cap MUST be be equal to 4294967295. The value of depack-buf-cap MUST be
an integer in the range of 1 to 4294967295, inclusive. an integer in the range of 1 to 4294967295, inclusive.
Informative note: depack-buf-cap indicates the maximum Informative note: depack-buf-cap indicates the maximum
possible size of the de-packetization buffer of the possible size of the de-packetization buffer of the
receiver only. When network jitter can occur, an receiver only. When network jitter can occur, an
appropriately sized jitter buffer has to be available as appropriately sized jitter buffer has to be available as
well. well.
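The comparison between a stream's sprop-depack-buf-bytes and a receiver's depack-buf-cap can be sketched in Python as follows. The function name receiver_can_handle is hypothetical; the default and the value ranges are those given above.

```python
from typing import Optional

DEPACK_BUF_CAP_DEFAULT = 4294967295  # inferred when depack-buf-cap is absent


def receiver_can_handle(sprop_depack_buf_bytes: int,
                        depack_buf_cap: Optional[int] = None) -> bool:
    """Check whether the required buffer size fits the receiver capability."""
    cap = DEPACK_BUF_CAP_DEFAULT if depack_buf_cap is None else depack_buf_cap
    if not 1 <= cap <= 4294967295:
        raise ValueError("depack-buf-cap must be in the range 1..4294967295")
    if not 0 <= sprop_depack_buf_bytes <= 4294967295:
        raise ValueError("sprop-depack-buf-bytes must be in the range 0..4294967295")
    return sprop_depack_buf_bytes <= cap


fits_default = receiver_can_handle(1_000_000)
fits_small_cap = receiver_can_handle(1_000_000, depack_buf_cap=65536)
```

This check covers the de-packetization buffer only; jitter buffering is a separate concern, as the informative notes point out.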
skipping to change at page 69, line 25 skipping to change at page 70, line 45
When not present, the value of dec-parallel-cap.max-lsr, dec- When not present, the value of dec-parallel-cap.max-lsr, dec-
parallel-cap.max-lps, or dec-parallel-cap.max-br is inferred parallel-cap.max-lps, or dec-parallel-cap.max-br is inferred
to be equal to the value of max-lsr, max-lps, or max-br, to be equal to the value of max-lsr, max-lps, or max-br,
respectively, outside the dec-parallel-cap parameter. respectively, outside the dec-parallel-cap parameter.
The general decoding capability, expressed by the set of The general decoding capability, expressed by the set of
parameters outside of dec-parallel-cap, is defined as the parameters outside of dec-parallel-cap, is defined as the
capability point that is determined by the following capability point that is determined by the following
combination of parameters: 1) the parallelism requirement combination of parameters: 1) the parallelism requirement
corresponding to the value of sprop-segmentation-id equal to 0 corresponding to the value of sprop-segmentation-id equal to 0
for a bitstream, 2) the profile determined by profile-space for a bitstream, 2) the profile determined by profile-space,
and profile-id, 3) the highest level determined by tier-flag profile-id, profile-compatibility-indicator, and interop-
and max-recv-level-id, and 4) the maximum processing rate, the constraints, 3) the tier and the highest level determined by
maximum picture size, and the maximum video bitrate determined tier-flag and max-recv-level-id, and 4) the maximum processing
by the highest level. The general decoding capability MUST rate, the maximum picture size, and the maximum video bitrate
NOT be included as one of the set of capability points in the determined by the highest level. The general decoding
dec-parallel-cap parameter. capability MUST NOT be included as one of the set of
capability points in the dec-parallel-cap parameter.
For example, the following parameters express the general For example, the following parameters express the general
decoding capability of 720p30 (Level 3.1) plus an additional decoding capability of 720p30 (Level 3.1) plus an additional
decoding capability of 1080p30 (Level 4) given that the decoding capability of 1080p30 (Level 4) given that the
spatially largest tile or slice used in the bitstream is equal spatially largest tile or slice used in the bitstream is equal
to or less than 1/3 of the picture size: to or less than 1/3 of the picture size:
a=fmtp:98 level-id=93;dec-parallel-cap={t:8;level-id=120} a=fmtp:98 level-id=93;dec-parallel-cap={t:8;level-id=120}
For another example, the following parameters express an For another example, the following parameters express an
skipping to change at page 70, line 14 skipping to change at page 71, line 34
a=fmtp:98 level-id=93;dec-parallel-cap={w:8; a=fmtp:98 level-id=93;dec-parallel-cap={w:8;
max-lsr=62668800;max-lps=2088960} max-lsr=62668800;max-lps=2088960}
Informative note: When min_spatial_segmentation_idc is Informative note: When min_spatial_segmentation_idc is
present in a bitstream and WPP is not used, [HEVC] present in a bitstream and WPP is not used, [HEVC]
specifies that there is no slice or no tile in the specifies that there is no slice or no tile in the
bitstream containing more than 4 * PicSizeInSamplesY / bitstream containing more than 4 * PicSizeInSamplesY /
( min_spatial_segmentation_idc + 4 ) luma samples. ( min_spatial_segmentation_idc + 4 ) luma samples.
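The bound in the note above is easy to evaluate numerically. The Python sketch below computes it for a 1920x1080 picture with min_spatial_segmentation_idc equal to 8, a value chosen here only for illustration. The function name max_segment_luma_samples is hypothetical.

```python
def max_segment_luma_samples(pic_size_in_samples_y: int,
                             min_spatial_segmentation_idc: int) -> float:
    """Upper bound on luma samples in any slice or tile (WPP not used)."""
    return 4 * pic_size_in_samples_y / (min_spatial_segmentation_idc + 4)


pic_size = 1920 * 1080  # PicSizeInSamplesY for a 1080p picture
bound = max_segment_luma_samples(pic_size, 8)
fraction_of_picture = bound / pic_size
```

For this input the bound is one third of the picture, which matches the "1/3 of the picture size" wording used with the value 8 in the dec-parallel-cap example.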
include-dph:
This parameter is used to indicate the capability and
preference to utilize or include decoded picture hash (DPH)
SEI messages (See Section D.3.19 of [HEVC]) in the bitstream.
DPH SEI messages can be used to detect picture corruption so
the receiver can request picture repair, see Section 8. The
value is a comma separated list of hash types that is
supported or requested to be used, each hash type provided as
an unsigned integer value (0-255), with the hash types listed
from most preferred to the least preferred. Example:
"include-dph=0,2", which indicates the capability for MD5
(most preferred) and Checksum (less preferred). If the
parameter is not included or the value contains no hash types,
then no capability to utilize DPH SEI messages is assumed.
Note that DPH SEI messages MAY still be included in the
bitstream even when there is no declaration of capability to
use them, as in general SEI messages do not affect the
normative decoding process and decoders are allowed to ignore
SEI messages.
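Parsing the include-dph value amounts to splitting a comma-separated list while keeping its preference order. The Python sketch below does this and treats an empty value as "no capability declared". The function name parse_include_dph is hypothetical; the hash type names follow the decoded picture hash SEI message, where 0 is MD5, 1 is CRC, and 2 is checksum.

```python
from typing import List

HASH_TYPE_NAMES = {0: "MD5", 1: "CRC", 2: "Checksum"}


def parse_include_dph(value: str) -> List[int]:
    """Return hash types in preference order, most preferred first."""
    hash_types = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # empty value: no hash types listed
        number = int(item)
        if not 0 <= number <= 255:
            raise ValueError("hash type must be in the range 0..255")
        hash_types.append(number)
    return hash_types


preferred = parse_include_dph("0,2")
preferred_names = [HASH_TYPE_NAMES.get(t, "unknown") for t in preferred]
```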
Encoding considerations: Encoding considerations:
This type is only defined for transfer via RTP (RFC 3550). This type is only defined for transfer via RTP (RFC 3550).
Security considerations: Security considerations:
See Section 9 of RFC XXXX. See Section 9 of RFC XXXX.
Public specification: Public specification:
skipping to change at page 71, line 28 skipping to change at page 73, line 28
o The clock rate in the "a=rtpmap" line MUST be 90000. o The clock rate in the "a=rtpmap" line MUST be 90000.
o The OPTIONAL parameters "profile-space", "profile-id", "tier- o The OPTIONAL parameters "profile-space", "profile-id", "tier-
flag", "level-id", "interop-constraints", "profile-compatibility- flag", "level-id", "interop-constraints", "profile-compatibility-
indicator", "sprop-sub-layer-id", "recv-sub-layer-id", "max-recv- indicator", "sprop-sub-layer-id", "recv-sub-layer-id", "max-recv-
level-id", "tx-mode", "max-lsr", "max-lps", "max-cpb", "max-dpb", level-id", "tx-mode", "max-lsr", "max-lps", "max-cpb", "max-dpb",
"max-br", "max-tr", "max-tc", "max-fps", "sprop-max-don-diff", "max-br", "max-tr", "max-tc", "max-fps", "sprop-max-don-diff",
"sprop-depack-buf-nalus", "sprop-depack-buf-bytes", "depack-buf- "sprop-depack-buf-nalus", "sprop-depack-buf-bytes", "depack-buf-
cap", "sprop-segmentation-id", "sprop-spatial-segmentation-idc", cap", "sprop-segmentation-id", "sprop-spatial-segmentation-idc",
and "dec-parallel-cap", when present, MUST be included in the "dec-parallel-cap", and "include-dph", when present, MUST be
"a=fmtp" line of SDP. This parameter is expressed as a media included in the "a=fmtp" line of SDP. This parameter is
type string, in the form of a semicolon separated list of expressed as a media type string, in the form of a semicolon
parameter=value pairs. separated list of parameter=value pairs.
o The OPTIONAL parameters "sprop-vps", "sprop-sps", and "sprop- o The OPTIONAL parameters "sprop-vps", "sprop-sps", and "sprop-
pps", when present, MUST be included in the "a=fmtp" line of SDP pps", when present, MUST be included in the "a=fmtp" line of SDP
or conveyed using the "fmtp" source attribute as specified in or conveyed using the "fmtp" source attribute as specified in
section 6.3 of [RFC5576]. For a particular media format (i.e. section 6.3 of [RFC5576]. For a particular media format (i.e.
RTP payload type), "sprop-vps", "sprop-sps", or "sprop-pps" MUST RTP payload type), "sprop-vps", "sprop-sps", or "sprop-pps" MUST
NOT be both included in the "a=fmtp" line of SDP and conveyed NOT be both included in the "a=fmtp" line of SDP and conveyed
using the "fmtp" source attribute. When included in the "a=fmtp" using the "fmtp" source attribute. When included in the "a=fmtp"
line of SDP, these parameters are expressed as a media type line of SDP, these parameters are expressed as a media type
string, in the form of a semicolon separated list of string, in the form of a semicolon separated list of
parameter=value pairs. When conveyed using the "fmtp" source parameter=value pairs. When conveyed in the "a=fmtp" line of SDP
for a particular payload type, the parameters "sprop-vps",
"sprop-sps", and "sprop-pps" MUST be applied to each SSRC with
the payload type. When conveyed using the "fmtp" source
attribute, these parameters are only associated with the given attribute, these parameters are only associated with the given
source and payload type as parts of the "fmtp" source attribute. source and payload type as parts of the "fmtp" source attribute.
Informative note: Conveyance of "sprop-vps", "sprop-sps", and Informative note: Conveyance of "sprop-vps", "sprop-sps", and
"sprop-pps" using the "fmtp" source attribute allows for out- "sprop-pps" using the "fmtp" source attribute allows for out-
of-band transport of parameter sets in topologies like Topo- of-band transport of parameter sets in topologies like Topo-
Video-switch-MCU as specified in [RFC5117]. Video-switch-MCU as specified in [RFC5117].
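The semicolon separated list of parameter=value pairs carried in the "a=fmtp" line can be split with a short Python sketch such as the one below. It is an illustration under stated assumptions: the helper name is hypothetical, the sample line joins the folded dec-parallel-cap example from earlier in this section onto one line, and semicolons inside braces are treated as part of the value because dec-parallel-cap uses them internally.

```python
from typing import Dict


def parse_fmtp_params(params: str) -> Dict[str, str]:
    """Split 'name=value;name=value' on semicolons outside of braces."""
    result: Dict[str, str] = {}
    depth = 0
    current = []
    for ch in params + ";":          # trailing ';' flushes the last pair
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == ";" and depth == 0:
            pair = "".join(current).strip()
            current = []
            if pair:
                name, _, value = pair.partition("=")
                result[name.strip()] = value.strip()
        else:
            current.append(ch)
    return result


line = ("a=fmtp:98 level-id=93;"
        "dec-parallel-cap={w:8;max-lsr=62668800;max-lps=2088960}")
payload_type, _, params = line[len("a=fmtp:"):].partition(" ")
print(payload_type, parse_fmtp_params(params))
```

Splitting naively on every semicolon would break dec-parallel-cap into fragments, which is why the sketch tracks brace depth.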
An example of media representation in SDP is as follows: An example of media representation in SDP is as follows:
skipping to change at page 72, line 26 skipping to change at page 74, line 28
7.2.2 Usage with SDP Offer/Answer Model 7.2.2 Usage with SDP Offer/Answer Model
When HEVC is offered over RTP using SDP in an Offer/Answer model When HEVC is offered over RTP using SDP in an Offer/Answer model
[RFC3264] for negotiation for unicast usage, the following [RFC3264] for negotiation for unicast usage, the following
limitations and rules apply: limitations and rules apply:
o The parameters identifying a media format configuration for HEVC o The parameters identifying a media format configuration for HEVC
are profile-space, profile-id, tier-flag, level-id, interop- are profile-space, profile-id, tier-flag, level-id, interop-
constraints, profile-compatibility-indicator, and tx-mode. These constraints, profile-compatibility-indicator, and tx-mode. These
media configuration parameters, except for level-id, MUST be used media configuration parameters, except level-id, MUST be used
symmetrically when the answerer does not include recv-sub-layer- symmetrically when the answerer does not include recv-sub-layer-
id in the answer for the media format (payload type). In other id in the answer for the media format (payload type) or the
words, the answerer MUST 1) maintain all configuration parameters included recv-sub-layer-id is equal to sprop-sub-layer-id in the
for the media format (payload type), 2) include recv-sub-layer-id offer. The answerer MUST
in the answer for the media format (payload type), or 3) remove
the media format (payload type) completely (when one or more of
the parameter values are not supported). The value of level-id
is changeable.
Informative note: The requirement for symmetric use does not 1) maintain all configuration parameters with the values
apply for level-id, and does not apply for the other remaining the same as in the offer for the media format
(payload type), with the exception that the value of level-
id is changeable as long as the highest level indicated by
the answer is not higher than that indicated by the offer;
2) include in the answer the recv-sub-layer-id parameter, with
a value less than the sprop-sub-layer-id parameter in the
offer, for the media format (payload type), and maintain all
configuration parameters with the values being the same as
signalled in the sprop-vps for the chosen sub-layer
representation, with the exception that the value of level-
id is changeable as long as the highest level indicated by
the answer is not higher than the level indicated by the
sprop-vps in offer for the chosen sub-layer representation;
or
3) remove the media format (payload type) completely (when one
or more of the parameter values are not supported).
Informative note: The above requirement for symmetric use
does not apply for level-id, and does not apply for the other
bitstream or RTP stream properties and capability parameters. bitstream or RTP stream properties and capability parameters.
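Option 1) above (the answer omits recv-sub-layer-id, or carries a value equal to sprop-sub-layer-id in the offer) can be sketched as a simple consistency check. The Python below is a hedged illustration: the function name is hypothetical, both sides are assumed to carry explicit parameter values (defaults for absent parameters are not modelled), and the "highest level" comparison is simplified to comparing level-id only.

```python
from typing import Dict, List

# Configuration parameters listed in the text; level-id is handled separately.
SYMMETRIC_PARAMS = ("profile-space", "profile-id", "tier-flag",
                    "interop-constraints", "profile-compatibility-indicator",
                    "tx-mode")


def check_answer_case1(offer: Dict[str, str],
                       answer: Dict[str, str]) -> List[str]:
    """Return a list of violations of option 1); an empty list means OK."""
    problems = []
    for name in SYMMETRIC_PARAMS:
        if offer.get(name) != answer.get(name):
            problems.append(f"{name} differs from the offer")
    # level-id may change, but must not exceed the level in the offer
    # (simplification: only level-id itself is compared here).
    if int(answer["level-id"]) > int(offer["level-id"]):
        problems.append("level-id in the answer is higher than in the offer")
    return problems


offer = {"profile-id": "1", "tier-flag": "0", "level-id": "120"}
print(check_answer_case1(offer, {**offer, "level-id": "93"}))
print(check_answer_case1(offer, {**offer, "level-id": "150"}))
```

An answerer that cannot satisfy this check and does not select a lower sub-layer via recv-sub-layer-id would fall under option 3) and remove the payload type.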
o The profile-compatibility-indicator, when offered as sendonly,
describes bitstream properties. The answerer MAY accept an RTP
payload type even if the decoder is not capable of handling the
profile indicated by the profile-space, profile-id, and interop-
constraints parameters, but capable of any of the profiles
indicated by the profile-space, profile-compatibility-indicator,
and interop-constraints. However, when the profile-
compatibility-indicator is used in a recvonly or sendrecv media
description, the bitstream using this RTP payload type is
required to conform to all profiles indicated by profile-space,
profile-compatibility-indicator, and interop-constraints.
o To simplify handling and matching of these configurations, the o To simplify handling and matching of these configurations, the
same RTP payload type number used in the offer SHOULD also be same RTP payload type number used in the offer SHOULD also be
used in the answer, as specified in [RFC3264]. The same RTP used in the answer, as specified in [RFC3264].
payload type number used in the offer MUST also be used in the
answer when the answer includes recv-sub-layer-id. When the o The same RTP payload type number used in the offer MUST be used
answer does not include recv-sub-layer-id, the answer MUST NOT in the answer when the answer includes recv-sub-layer-id. When
contain a payload type number used in the offer unless the the answer does not include recv-sub-layer-id, the answer MUST
NOT contain a payload type number used in the offer unless the
configuration is exactly the same as in the offer or the configuration is exactly the same as in the offer or the
configuration in the answer only differs from that in the offer configuration in the answer only differs from that in the offer
with a different value of level-id. The answer MAY contain the with a different value of level-id. The answer MAY contain the
recv-sub-layer-id parameter if an HEVC bitstream contains recv-sub-layer-id parameter if an HEVC bitstream contains
multiple operation points (using temporal scalability and sub- multiple operation points (using temporal scalability and sub-
layers) and sprop-vps is included in the offer where sub-layers layers) and sprop-vps is included in the offer where information
are present in the video parameter set. If the sprop-vps is of sub-layers is present in the first video parameter set
provided in an offer, an answerer MAY select a particular contained in sprop-vps. If the sprop-vps is provided in an
operation point in the received and/or in the sent bitstream. offer, an answerer MAY select a particular operation point
When recv-sub-layer-id is present in the answer, the media indicated in the first video parameter set contained in sprop-
configuration parameters MUST NOT be present in the answer. vps. When the answer includes recv-sub-layer-id that is less
Rather, the media configuration that the answerer will use for than sprop-sub-layer-id in the offer, all video parameter sets
receiving and/or sending is the one used for the selected contained in the sprop-vps parameter in the SDP answer and all
operation point as indicated in the offer. video parameter sets sent in-band for either the offerer-to-
answerer direction or the answerer-to-offerer direction MUST be
consistent with the first video parameter set in the sprop-vps
parameter of the offer (see the semantics of sprop-vps on one
video parameter set being consistent with another video parameter
set), and the bitstream sent in either direction MUST conform to
the profile, tier, level, and constraints of the chosen sub-layer
representation as indicated by the first profile_tier_level( )
syntax structure in the first video parameter set in the sprop-
vps parameter of the offer.
Informative note: When an offerer receives an answer that Informative note: When an offerer receives an answer that
does not include recv-sub-layer-id, it has to compare payload does not include recv-sub-layer-id, it has to compare payload
types not declared in the offer based on the media type (i.e. types not declared in the offer based on the media type (i.e.
video/H265) and the above media configuration parameters with video/H265) and the above media configuration parameters with
any payload types it has already declared. This will enable any payload types it has already declared. This will enable
it to determine whether the configuration in question is new it to determine whether the configuration in question is new
or if it is equivalent to a configuration already offered, or if it is equivalent to a configuration already offered,
since a different payload type number may be used in the since a different payload type number may be used in the
answer. The ability to perform operation point selection answer. The ability to perform operation point selection
enables a receiver to utilize the temporal scalable nature of enables a receiver to utilize the temporal scalable nature of
an HEVC bitstream. an HEVC bitstream.
o The parameters sprop-max-don-diff, sprop-depack-buf-nalus, and o The parameters sprop-max-don-diff, sprop-depack-buf-nalus, and
sprop-depack-buf-bytes describe the properties of an RTP stream, sprop-depack-buf-bytes describe the properties of an RTP stream,
and its dependent RTP streams, when present, that the offerer or and all RTP streams the RTP stream depends on, when present, that
the answerer is sending for the media format configuration. This the offerer or the answerer is sending for the media format
differs from the normal usage of the Offer/Answer parameters: configuration. This differs from the normal usage of the
normally such parameters declare the properties of the bitstream Offer/Answer parameters: normally such parameters declare the
or RTP stream that the offerer or the answerer is able to properties of the bitstream or RTP stream that the offerer or the
receive. When dealing with HEVC, the offerer assumes that the answerer is able to receive. When dealing with HEVC, the offerer
answerer will be able to receive media encoded using the assumes that the answerer will be able to receive media encoded
configuration being offered. using the configuration being offered.
Informative note: The above parameters apply for any RTP Informative note: The above parameters apply for any RTP
stream and its dependent RTP streams, when present, sent by a stream and all RTP streams the RTP stream depends on, when
declaring entity with the same configuration; i.e. they are present, sent by a declaring entity with the same
dependent on their source endpoint. Rather than being bound configuration; i.e. they are dependent on their source
to the payload type, the values may have to be applied to endpoint. Rather than being bound to the payload type, the
another payload type when being sent, as they apply for the values may have to be applied to another payload type when
configuration. being sent, as they apply for the configuration.
o The capability parameters max-lsr, max-lps, max-cpb, max-dpb, o The capability parameters max-lsr, max-lps, max-cpb, max-dpb,
max-br, max-tr, and max-tc MAY be used to declare further max-br, max-tr, and max-tc MAY be used to declare further
capabilities of the offerer or answerer for receiving. These capabilities of the offerer or answerer for receiving. These
parameters MUST NOT be present when the direction attribute is parameters MUST NOT be present when the direction attribute is
"sendonly". "sendonly".
o The capability parameter max-fps MAY be used to declare lower o The capability parameter max-fps MAY be used to declare lower
capabilities of the offerer or answerer for receiving. The capabilities of the offerer or answerer for receiving. The
parameter MUST NOT be present when the direction attribute is parameter MUST NOT be present when the direction attribute is
skipping to change at page 75, line 24 skipping to change at page 77, line 40
parallel-cap.spatial-seg-idc of the capability point. A parallel-cap.spatial-seg-idc of the capability point. A
bitstream that is sent based on choosing a capability point with bitstream that is sent based on choosing a capability point with
parallel tool type 't' from dec-parallel-cap MUST have parallel tool type 't' from dec-parallel-cap MUST have
entropy_coding_sync_enabled_flag equal to 0 and entropy_coding_sync_enabled_flag equal to 0 and
min_spatial_segmentation_idc equal to or larger than dec- min_spatial_segmentation_idc equal to or larger than dec-
parallel-cap.spatial-seg-idc of the capability point. parallel-cap.spatial-seg-idc of the capability point.
o An offerer has to include the size of the de-packetization o An offerer has to include the size of the de-packetization
buffer, sprop-depack-buf-bytes, as well as sprop-max-don-diff and buffer, sprop-depack-buf-bytes, as well as sprop-max-don-diff and
sprop-depack-buf-nalus, in the offer for an interleaved HEVC sprop-depack-buf-nalus, in the offer for an interleaved HEVC
bitstream or for the MST transmission mode. To enable the bitstream or for the MSM transmission mode. To enable the
offerer and answerer to inform each other about their offerer and answerer to inform each other about their
capabilities for de-packetization buffering in receiving RTP capabilities for de-packetization buffering in receiving RTP
streams, both parties are RECOMMENDED to include depack-buf-cap. streams, both parties are RECOMMENDED to include depack-buf-cap.
For interleaved RTP streams or in MST, it is also RECOMMENDED to For interleaved RTP streams or in MSM, it is also RECOMMENDED to
consider offering multiple payload types with different buffering consider offering multiple payload types with different buffering
requirements when the capabilities of the receiver are unknown. requirements when the capabilities of the receiver are unknown.
o The sprop-vps, sprop-sps, or sprop-pps, when present (included in o The sprop-vps, sprop-sps, or sprop-pps, when present (included in
the "a=fmtp" line of SDP or conveyed using the "fmtp" source the "a=fmtp" line of SDP or conveyed using the "fmtp" source
attribute as specified in section 6.3 of [RFC5576]), are used for attribute as specified in section 6.3 of [RFC5576]), are used for
out-of-band transport of the parameter sets (VPS, SPS, or PPS out-of-band transport of the parameter sets (VPS, SPS, or PPS
respectively). respectively).
o The answerer MAY use either out-of-band or in-band transport of o The answerer MAY use either out-of-band or in-band transport of
parameter sets for the bitstream it is sending, regardless of parameter sets for the bitstream it is sending, regardless of
whether out-of-band parameter sets transport has been used in the whether out-of-band parameter sets transport has been used in the
offerer-to-answerer direction. Parameter sets included in an offerer-to-answerer direction. Parameter sets included in an
answer are independent of those parameter sets included in the answer are independent of those parameter sets included in the
offer, as they are used for decoding two different bitstreams, offer, as they are used for decoding two different bitstreams,
one from the answerer to the offerer and the other in the one from the answerer to the offerer and the other in the
opposite direction. opposite direction.
o The capability parameter include-dph MAY be used to declare the
capability to utilize decoded picture hash SEI messages and which
types of hashes in any HEVC RTP streams received by the offerer
or answerer.
o The following rules apply to transport of parameter sets in the o The following rules apply to transport of parameter sets in the
offerer-to-answerer direction. offerer-to-answerer direction.
o An offer MAY include sprop-vps, sprop-sps, and/or sprop-pps. o An offer MAY include sprop-vps, sprop-sps, and/or sprop-pps.
If none of these parameters is present in the offer, then If none of these parameters is present in the offer, then
only in-band transport of parameter sets is used. only in-band transport of parameter sets is used.
o If the level to use in the offerer-to-answerer direction is o If the level to use in the offerer-to-answerer direction is
equal to the default level in the offer, the answerer MUST be equal to the default level in the offer, the answerer MUST be
prepared to use the parameter sets included in sprop-vps, prepared to use the parameter sets included in sprop-vps,
sprop-sps, and sprop-pps (either included in the "a=fmtp" sprop-sps, and sprop-pps (either included in the "a=fmtp"
line of SDP or conveyed using the "fmtp" source attribute) line of SDP or conveyed using the "fmtp" source attribute)
for decoding the incoming bitstream, e.g. by passing these for decoding the incoming bitstream, e.g. by passing these
parameter set NAL units to the video decoder before passing parameter set NAL units to the video decoder before passing
any NAL units carried in the RTP streams. Otherwise, the any NAL units carried in the RTP streams. Otherwise, the
answerer MUST ignore sprop-vps, sprop-sps, and sprop-pps answerer MUST ignore sprop-vps, sprop-sps, and sprop-pps
(either included in the "a=fmtp" line of SDP or conveyed (either included in the "a=fmtp" line of SDP or conveyed
using the "fmtp" source attribute) and the offerer MUST using the "fmtp" source attribute) and the offerer MUST
transmit parameter sets in-band. transmit parameter sets in-band.
o In MST, the answerer MUST be prepared to use the parameter o In MSM, the answerer MUST be prepared to use the parameter
sets out-of-band transmitted for the current RTP stream and sets out-of-band transmitted for the RTP stream and all RTP
its dependent RTP streams, when present, for decoding the streams the RTP stream depends on, when present, for decoding
incoming bitstream, e.g. by passing these parameter set NAL the incoming bitstream, e.g. by passing these parameter set
units to the video decoder before passing any NAL units NAL units to the video decoder before passing any NAL units
carried in the RTP streams. carried in the RTP streams.
o The following rules apply to transport of parameter sets in the o The following rules apply to transport of parameter sets in the
answerer-to-offerer direction. answerer-to-offerer direction.
o An answer MAY include sprop-vps, sprop-sps, and/or sprop-pps. o An answer MAY include sprop-vps, sprop-sps, and/or sprop-pps.
If none of these parameters is present in the answer, then If none of these parameters is present in the answer, then
only in-band transport of parameter sets is used. only in-band transport of parameter sets is used.
o The offerer MUST be prepared to use the parameter sets o The offerer MUST be prepared to use the parameter sets
included in sprop-vps, sprop-sps, and sprop-pps (either included in sprop-vps, sprop-sps, and sprop-pps (either
included in the "a=fmtp" line of SDP or conveyed using the included in the "a=fmtp" line of SDP or conveyed using the
"fmtp" source attribute) for decoding the incoming bitstream, "fmtp" source attribute) for decoding the incoming bitstream,
e.g. by passing these parameter set NAL units to the video e.g. by passing these parameter set NAL units to the video
decoder before passing any NAL units carried in the RTP decoder before passing any NAL units carried in the RTP
streams. streams.
o In MST, the offerer MUST be prepared to use the parameter o In MSM, the offerer MUST be prepared to use the parameter
sets out-of-band transmitted for the current RTP stream and sets out-of-band transmitted for the RTP stream and all RTP
its dependent RTP streams, when present, for decoding the streams the RTP stream depends on, when present, for decoding
incoming bitstream, e.g. by passing these parameter set NAL the incoming bitstream, e.g. by passing these parameter set
units to the video decoder before passing any NAL units NAL units to the video decoder before passing any NAL units
carried in the RTP streams. carried in the RTP streams.
o When sprop-vps, sprop-sps, and/or sprop-pps are conveyed using o When sprop-vps, sprop-sps, and/or sprop-pps are conveyed using
the "fmtp" source attribute as specified in section 6.3 of the "fmtp" source attribute as specified in section 6.3 of
[RFC5576], the receiver of the parameters MUST store the [RFC5576], the receiver of the parameters MUST store the
parameter sets included in sprop-vps, sprop-sps, and/or sprop-pps parameter sets included in sprop-vps, sprop-sps, and/or sprop-pps
and associate them with the source given as part of the "fmtp" and associate them with the source given as part of the "fmtp"
source attribute. Parameter sets associated with one source source attribute. Parameter sets associated with one source
(given as part of the "fmtp" source attribute) MUST only be used (given as part of the "fmtp" source attribute) MUST only be used
to decode NAL units conveyed in RTP packets from the same source to decode NAL units conveyed in RTP packets from the same source
skipping to change at page 78, line 25 skipping to change at page 81, line 8
offers, answers, direction attributes, with and without recv-sub- offers, answers, direction attributes, with and without recv-sub-
layer-id. Columns that do not indicate offer or answer apply to layer-id. Columns that do not indicate offer or answer apply to
both. both.
sendonly --+ sendonly --+
answer: recvonly, recv-sub-layer-id --+ | answer: recvonly, recv-sub-layer-id --+ |
recvonly w/o recv-sub-layer-id --+ | | recvonly w/o recv-sub-layer-id --+ | |
answer: sendrecv, recv-sub-layer-id --+ | | | answer: sendrecv, recv-sub-layer-id --+ | | |
sendrecv w/o recv-sub-layer-id --+ | | | | sendrecv w/o recv-sub-layer-id --+ | | | |
| | | | | | | | | |
profile-space C X C X P profile-space C D C D P
profile-id C X C X P profile-id C D C D P
tier-flag C X C X P tier-flag C D C D P
level-id C X C X P level-id D D D D P
interop-constraints C X C X P interop-constraints C D C D P
profile-compatibility-indicator C X C X P profile-compatibility-indicator C D C D P
tx-mode C X C X P tx-mode C C C C P
max-recv-level-id R R R R - max-recv-level-id R R R R -
sprop-max-don-diff P P - - P sprop-max-don-diff P P - - P
sprop-depack-buf-nalus P P - - P sprop-depack-buf-nalus P P - - P
sprop-depack-buf-bytes P P - - P sprop-depack-buf-bytes P P - - P
depack-buf-cap R R R R - depack-buf-cap R R R R -
sprop-segmentation-id P P P P P sprop-segmentation-id P P P P P
sprop-spatial-segmentation-idc P P P P P sprop-spatial-segmentation-idc P P P P P
max-br R R R R - max-br R R R R -
max-cpb R R R R - max-cpb R R R R -
max-dpb R R R R - max-dpb R R R R -
skipping to change at page 79, line 11 skipping to change at page 81, line 36
max-lps R R R R - max-lps R R R R -
max-tr R R R R - max-tr R R R R -
max-tc R R R R - max-tc R R R R -
max-fps R R R R - max-fps R R R R -
sprop-vps P P - - P sprop-vps P P - - P
sprop-sps P P - - P sprop-sps P P - - P
sprop-pps P P - - P sprop-pps P P - - P
sprop-sub-layer-id P P - - P sprop-sub-layer-id P P - - P
recv-sub-layer-id X O X O - recv-sub-layer-id X O X O -
dec-parallel-cap R R R R - dec-parallel-cap R R R R -
include-dph R R R R -
Legend: Legend:
C: configuration for sending and receiving bitstreams C: configuration for sending and receiving bitstreams
D: changeable configuration, same as C except that it is possible
to answer with a different but consistent value (see the
semantics of the six parameters related to profile, tier,
and level on these parameters being consistent)
P: properties of the bitstream to be sent P: properties of the bitstream to be sent
R: receiver capabilities R: receiver capabilities
O: operation point selection O: operation point selection
X: MUST NOT be present X: MUST NOT be present
-: not usable, when present SHOULD be ignored -: not usable, when present SHOULD be ignored
Parameters used for declaring receiver capabilities are in general Parameters used for declaring receiver capabilities are in general
downgradable; i.e. they express the upper limit for a sender's downgradable; i.e. they express the upper limit for a sender's
possible behavior. Thus, a sender MAY select to set its encoder possible behavior. Thus, a sender MAY select to set its encoder
using only lower/lesser or equal values of these parameters. using only lower/lesser or equal values of these parameters.
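The downgradable nature of receiver capability parameters can be pictured with the Python sketch below, in which a sender limits its desired encoder settings to the values the receiver declared. The function name and the desired values are hypothetical; the capability values reuse max-lsr and max-lps from the dec-parallel-cap example earlier in this section. How a sender treats a parameter the receiver did not declare is not covered by this paragraph, so the sketch simply keeps the desired value in that case.

```python
from typing import Dict


def clamp_to_receiver_caps(desired: Dict[str, int],
                           receiver_caps: Dict[str, int]) -> Dict[str, int]:
    """Limit each desired encoder value to the receiver's declared maximum."""
    chosen = {}
    for name, want in desired.items():
        cap = receiver_caps.get(name)
        # Assumption: with no declared capability the desired value is kept.
        chosen[name] = want if cap is None else min(want, cap)
    return chosen


receiver_caps = {"max-lsr": 62668800, "max-lps": 2088960}
desired = {"max-lsr": 124416000, "max-lps": 2088960}
print(clamp_to_receiver_caps(desired, receiver_caps))
```

Values lower than the declared capability pass through unchanged, which matches the "lower/lesser or equal" wording above.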
Parameters declaring a configuration point are not changeable, with When the answer does not include recv-sub-layer-id that is less than
the exception of the level-id parameter for unicast usage. This the sprop-sub-layer-id in the offer, parameters declaring a
expresses values a receiver expects to be used and MUST be used configuration point are not changeable, with the exception of the
verbatim on the sender side. If level-id is changed, an answerer level-id parameter for unicast usage, and these parameters express
MUST NOT include the recv-sub-layer-id parameter. values a receiver expects to be used and MUST be used verbatim in
the answer as in the offer.
When a sender's capabilities are declared, and non-changeable When a sender's capabilities are declared with the configuration
parameters are used in this declaration, these parameters express a parameters, these parameters express a configuration that is
configuration that is acceptable for the sender to receive acceptable for the sender to receive bitstreams. In order to
bitstreams. In order to achieve high interoperability levels, it is achieve high interoperability levels, it is often advisable to offer
often advisable to offer multiple alternative configurations. It is multiple alternative configurations. It is impossible to offer
impossible to offer multiple configurations in a single payload multiple configurations in a single payload type. Thus, when
type. Thus, when multiple configuration offers are made, each offer multiple configuration offers are made, each offer requires its own
requires its own RTP payload type associated with the offer. RTP payload type associated with the offer. However, it is possible
to offer multiple operation points using one configuration in a
single payload type by including sprop-vps in the offer and recv-
sub-layer-id in the answer.
A receiver SHOULD understand all media type parameters, even if it A receiver SHOULD understand all media type parameters, even if it
only supports a subset of the payload format's functionality. This only supports a subset of the payload format's functionality. This
ensures that a receiver is capable of understanding when an offer to ensures that a receiver is capable of understanding when an offer to
receive media can be downgraded to what is supported by the receiver receive media can be downgraded to what is supported by the receiver
of the offer. of the offer.
An answerer MAY extend the offer with additional media format An answerer MAY extend the offer with additional media format
configurations. However, to enable their usage, in most cases a configurations. However, to enable their usage, in most cases a
second offer is required from the offerer to provide the bitstream second offer is required from the offerer to provide the bitstream
skipping to change at page 81, line 19 skipping to change at page 84, line 13
- max-cpb - max-cpb
- max-dpb - max-dpb
- max-br - max-br
- max-tr - max-tr
- max-tc - max-tc
- max-fps - max-fps
- max-recv-level-id - max-recv-level-id
- depack-buf-cap - depack-buf-cap
- sprop-sub-layer-id - sprop-sub-layer-id
- dec-parallel-cap - dec-parallel-cap
- include-dph
o A receiver of the SDP is required to support all parameters and o A receiver of the SDP is required to support all parameters and
values of the parameters provided; otherwise, the receiver MUST values of the parameters provided; otherwise, the receiver MUST
reject (RTSP) or not participate in (SAP) the session. It falls reject (RTSP) or not participate in (SAP) the session. It falls
on the creator of the session to use values that are expected to on the creator of the session to use values that are expected to
be supported by the receiving application. be supported by the receiving application.
7.2.4 Parameter Sets Considerations 7.2.4 Parameter Sets Considerations
When out-of-band transport of parameter sets is used, parameter sets When out-of-band transport of parameter sets is used, parameter sets
MAY still be additionally transported in-band unless explicitly MAY still be additionally transported in-band unless explicitly
disallowed by an application, and some of these additionally in-band disallowed by an application, and some of these additionally in-band
transported parameter sets may update some of the out-of-band transported parameter sets may update some of the out-of-band
transported parameter sets. Update of a parameter set refers to transported parameter sets. Update of a parameter set refers to
sending of a parameter set of the same type using the same parameter sending of a parameter set of the same type using the same parameter
set ID but with different values for at least one other parameter of set ID but with different values for at least one other parameter of
the parameter set. the parameter set.
7.2.5 Dependency Signaling in Multi-Stream Transmission 7.2.5 Dependency Signaling in Multi-Stream Mode
If MST is used, the rules on signaling media decoding dependency in If MSM is used, the rules on signaling media decoding dependency in
SDP as defined in [RFC5583] apply. The rules on "hierarchical or SDP as defined in [RFC5583] apply. The rules on "hierarchical or
layered encoding" with multicast in Section 5.7 of [RFC4566] do not layered encoding" with multicast in Section 5.7 of [RFC4566] do not
apply, i.e. the notation for Connection Data "c=" SHALL NOT be used apply, i.e. the notation for Connection Data "c=" SHALL NOT be used
with more than one address. The order of session dependency is with more than one address. The order of session dependency is
given from the RTP stream containing the lowest temporal sub-layer given from the RTP stream containing the lowest temporal sub-layer
to the RTP stream containing the highest temporal sub-layer. to the RTP stream containing the highest temporal sub-layer.
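The dependency signaling described above can be sketched in code. The following is a minimal, non-normative illustration, assuming the "a=group:DDP", "a=mid" and "a=depend" attributes of [RFC5583]; the function name build_msm_sdp, the address 192.0.2.1, the ports, and the payload type numbers are hypothetical and chosen only for the example. The sketch lists every lower stream in each "a=depend" line and uses exactly one address per "c=" line.

```python
def build_msm_sdp(streams, addr="192.0.2.1"):
    """Build SDP lines for H.265 RTP streams with RFC 5583 dependencies.

    streams: list of (mid, port, payload_type) tuples, ordered from the
    stream carrying the lowest temporal sub-layer to the highest.
    addr: a single connection address (illustrative assumption).
    """
    # Decoding dependency group, lowest temporal sub-layer first.
    lines = ["a=group:DDP " + " ".join(mid for mid, _, _ in streams)]
    for i, (mid, port, pt) in enumerate(streams):
        lines.append(f"m=video {port} RTP/AVP {pt}")
        # Exactly one address: the multi-address "c=" notation is not used.
        lines.append(f"c=IN IP4 {addr}")
        lines.append(f"a=rtpmap:{pt} H265/90000")
        if i > 0:
            # Layered ("lay") dependency on all streams lower in the order.
            deps = " ".join(f"{m}:{p}" for m, _, p in streams[:i])
            lines.append(f"a=depend:{pt} lay {deps}")
        lines.append(f"a=mid:{mid}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_msm_sdp([("L1", 20000, 96), ("L2", 20002, 97)]))
```

With two streams, the second media section carries "a=depend:97 lay L1:96", and the group line orders L1 (lowest temporal sub-layer) before L2.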
8. Use with Feedback Messages 8. Use with Feedback Messages
As specified in section 6.1 of RFC 4585 [RFC4585], payload Specific As specified in section 6.1 of RFC 4585 [RFC4585], payload Specific
skipping to change at page 83, line 11 skipping to change at page 86, line 8
0: unassigned 0: unassigned
8-14: unassigned 8-14: unassigned
16-30: unassigned 16-30: unassigned
The following subsections define the use of the PLI, SLI, RPSI, and The following subsections define the use of the PLI, SLI, RPSI, and
FIR feedback messages with HEVC. FIR feedback messages with HEVC.
8.1 Picture Loss Indication (PLI) 8.1 Picture Loss Indication (PLI)
As specified in RFC 4585 section 6.3.1, the reception of a picture As specified in RFC 4585 section 6.3.1, the reception of a picture
loss indication by a media sender indicates the loss of "the loss of loss indication by a media sender indicates "the loss of an
an undefined amount of coded video data belonging to one or more undefined amount of coded video data belonging to one or more
pictures.". Without having any specific knowledge of the setup of pictures.". Without having any specific knowledge of the setup of
the bitstream (such as: use and location of in-band parameter sets, the bitstream (such as: use and location of in-band parameter sets,
non-IDR decoder refresh points, picture structures, and so forth) a non-IDR decoder refresh points, picture structures, and so forth) a
reaction to the reception of a PLI by an HEVC sender SHOULD BE to reaction to the reception of a PLI by an HEVC sender SHOULD be to
send an IDR picture and relevant parameter sets, potentially with send an IDR picture and relevant parameter sets, potentially with
sufficient redundancy so as to ensure correct reception. However, sufficient redundancy so as to ensure correct reception. However,
sometimes information about the bitstream structure is known. For sometimes information about the bitstream structure is known. For
example, state could have been established outside of the mechanisms example, state could have been established outside of the mechanisms
defined in this document that parameter sets are conveyed out of defined in this document that parameter sets are conveyed out of
band only, and stay static for the duration of the session. In that band only, and stay static for the duration of the session. In that
case, it is obviously unnecessary to send them in-band as a result case, it is obviously unnecessary to send them in-band as a result
of the reception of a PLI. Other examples could be devised based on of the reception of a PLI. Other examples could be devised based on
a priori knowledge of different aspects of the bitstream structure. a priori knowledge of different aspects of the bitstream structure.
In all cases, the timing and congestion control mechanisms of RFC In all cases, the timing and congestion control mechanisms of RFC
skipping to change at page 83, line 41 skipping to change at page 86, line 38
RFC 4585's Slice Loss Indication can be used to indicate, to a RFC 4585's Slice Loss Indication can be used to indicate, to a
sender, the loss of a number of Coded Tree Blocks (CTBs) in CTB sender, the loss of a number of Coded Tree Blocks (CTBs) in CTB
raster scan order of a picture. In the SLI's Feedback Control raster scan order of a picture. In the SLI's Feedback Control
Indication (FCI) field, the subfield "First" MUST be set to the CTB Indication (FCI) field, the subfield "First" MUST be set to the CTB
address of the first lost CTB. Note that the CTB address is in CTB address of the first lost CTB. Note that the CTB address is in CTB
raster scan order of a picture. For the first CTB of a slice raster scan order of a picture. For the first CTB of a slice
segment, the CTB address is the value of slice_segment_address when segment, the CTB address is the value of slice_segment_address when
present; or 0 when first_slice_segment_in_pic_flag is equal to 1; present; or 0 when first_slice_segment_in_pic_flag is equal to 1;
both syntax elements are in the slice segment header. The subfield both syntax elements are in the slice segment header. The subfield
"Number" MUST be set to the number of consecutive lost CTBs, again "Number" MUST be set to the number of consecutive lost CTBs, again
in CTB raster scan order of a picture. The subfield "PictureID" in CTB raster scan order of a picture. Note that, because both
MUST be set to the 6 least significant bits of a binary "First" and "Number" are counted in CTBs in CTB raster scan order
representation of the value of slice_pic_order_cnt_lsb of the of a picture, not in tile scan order (which is the bitstream order
picture for which the lost CTBs are indicated. Note that for IDR of CTBs), multiple SLI messages may be needed to report the loss of
pictures the syntax element slice_pic_order_cnt_lsb is not present, one tile covering multiple CTB rows but less wide than the picture.
but then the value is inferred to be equal to 0.
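As an illustration of why one tile can require several SLI messages, the following non-normative sketch converts a rectangular tile into ("First", "Number") pairs in CTB raster scan order. The function name tile_to_sli_runs and its parameters (tile position and size in CTB units) are assumptions made for this example only.

```python
def tile_to_sli_runs(pic_width_ctbs, tile_x, tile_y, tile_w, tile_h):
    """Return a list of (first, number) pairs covering one lost tile.

    All arguments are in units of CTBs. Addresses are in CTB raster
    scan order of the picture, not in tile scan order.
    """
    if tile_w <= 0 or tile_h <= 0:
        raise ValueError("tile dimensions must be positive")
    if tile_x < 0 or tile_y < 0 or tile_x + tile_w > pic_width_ctbs:
        raise ValueError("tile does not fit inside the picture width")
    if tile_w == pic_width_ctbs:
        # A full-width tile is one contiguous run in raster scan order.
        return [(tile_y * pic_width_ctbs, tile_w * tile_h)]
    # Otherwise each CTB row of the tile is a separate run.
    return [
        ((tile_y + row) * pic_width_ctbs + tile_x, tile_w)
        for row in range(tile_h)
    ]


if __name__ == "__main__":
    # Picture 10 CTBs wide; lost tile is the right half of the top 3 rows.
    print(tile_to_sli_runs(10, 5, 0, 5, 3))
```

In this example the tile is narrower than the picture, so three runs (one per CTB row) are produced, and each run would be reported in its own SLI message.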
The subfield "PictureID" MUST be set to the 6 least significant bits
of a binary representation of the value of PicOrderCntVal, as
defined in [HEVC], of the picture for which the lost CTBs are
indicated. Note that for IDR pictures the syntax element
slice_pic_order_cnt_lsb is not present, but then the value is
inferred to be equal to 0.
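The subfield rules above can be sketched as follows. This is a minimal, non-normative example assuming the SLI FCI layout of RFC 4585 (13 bits "First", 13 bits "Number", 6 bits "PictureID", in network byte order); the function name pack_sli_fci is hypothetical, and the handling of negative PicOrderCntVal values (two's-complement masking) is an assumption of this sketch.

```python
import struct


def pack_sli_fci(first, number, pic_order_cnt_val):
    """Pack one 32-bit SLI FCI entry.

    first: CTB address of the first lost CTB (raster scan order).
    number: count of consecutive lost CTBs (raster scan order).
    pic_order_cnt_val: PicOrderCntVal of the affected picture.
    """
    if not 0 <= first < (1 << 13):
        raise ValueError("First does not fit in 13 bits")
    if not 0 <= number < (1 << 13):
        raise ValueError("Number does not fit in 13 bits")
    # PictureID carries the 6 least significant bits of PicOrderCntVal.
    picture_id = pic_order_cnt_val & 0x3F
    word = (first << 19) | (number << 6) | picture_id
    return struct.pack("!I", word)


if __name__ == "__main__":
    # First = 5, Number = 5, PicOrderCntVal = 70 (PictureID = 6).
    print(pack_sli_fci(5, 5, 70).hex())
```

For an IDR picture, PicOrderCntVal is 0, so the PictureID subfield is 0 as well.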
As described in RFC 4585, an encoder in a media sender can use this As described in RFC 4585, an encoder in a media sender can use this
information to "clean up" the corrupted picture by sending intra information to "clean up" the corrupted picture by sending intra
information, while observing the constraints described in RFC4585, information, while observing the constraints described in RFC4585,
for example with respect to congestion control. In many cases, for example with respect to congestion control. In many cases,
error tracking is required to identify the corrupted region in the error tracking is required to identify the corrupted region in the
receiver's state (reference pictures) because of error import in receiver's state (reference pictures) because of error import in
uncorrupted regions of the picture through motion compensation, and uncorrupted regions of the picture through motion compensation.
reference picture selection can also be used to "clean up" the Reference picture selection can also be used to "clean up" the
corrupted picture, which is usually more efficient and less likely corrupted picture, which is usually more efficient and less likely
to generate congestion than sending intra information. to generate congestion than sending intra information.
In contrast to the video codecs contemplated in RFC 4585 and RFC In contrast to the video codecs contemplated in RFC 4585 and RFC
5104, in HEVC, the "macroblock size" is not fixed to 16x16 luma 5104, in HEVC, the "macroblock size" is not fixed to 16x16 luma
samples, but variable. That, however, does not create a conceptual samples, but variable. That, however, does not create a conceptual
difficulty with SLI, because the setting of the CTB size is a difficulty with SLI, because the setting of the CTB size is a
sequence-level functionality, and using a slice loss indication sequence-level functionality, and using a slice loss indication
across coded video sequence boundaries is meaningless as there is no across coded video sequence boundaries is meaningless as there is no
prediction across sequence boundaries. However, a proper use of SLI prediction across sequence boundaries. However, a proper use of SLI
skipping to change at page 90, line 24 skipping to change at page 93, line 24
December 2013. December 2013.
[3GPPFF] 3GPP TS 26.244, "Transparent end-to-end packet switched [3GPPFF] 3GPP TS 26.244, "Transparent end-to-end packet switched
streaming service (PSS); 3GPP file format (3GP)", v12.20, streaming service (PSS); 3GPP file format (3GP)", v12.20,
December 2013. December 2013.
[Girod99] Girod, B. and Faerber, F., "Feedback-based error control [Girod99] Girod, B. and Faerber, F., "Feedback-based error control
for mobile video transmission", Proceedings IEEE, Vol. 87, for mobile video transmission", Proceedings IEEE, Vol. 87,
No. 10, pp. 1707-1723, October 1999. No. 10, pp. 1707-1723, October 1999.
[HEVC draft v2]
Draft version 2 of HEVC, "High Efficiency Video Coding
(HEVC) Range Extensions text specification: Draft 7", JCT-
VC document JCTVC-Q1005, 17th JCT-VC meeting, 27 March - 4
April 2014, Valencia, Spain.
[I-D.ietf-avt-srtp-not-mandatory] [I-D.ietf-avt-srtp-not-mandatory]
Perkins, C. and M. Westerlund, "Securing the RTP Perkins, C. and M. Westerlund, "Securing the RTP
Protocol Framework: Why RTP Does Not Mandate a Single Protocol Framework: Why RTP Does Not Mandate a Single
Media Security Solution", draft-ietf-avt-srtp-not- Media Security Solution", draft-ietf-avt-srtp-not-
mandatory-16 (work in progress), January 2014. mandatory-16 (work in progress), January 2014.
[I-D.ietf-avtcore-rtp-security-options] [I-D.ietf-avtcore-rtp-security-options]
Westerlund, M. and C. Perkins, "Options for Securing RTP Westerlund, M. and C. Perkins, "Options for Securing RTP
Sessions", draft-ietf-avtcore-rtp-security-options-10 Sessions", draft-ietf-avtcore-rtp-security-options-10
(work in progress), January 2014. (work in progress), January 2014.
skipping to change at page 91, line 18 skipping to change at page 94, line 24
Mechanisms for Real-Time Transport", draft-ietf-avtext- Mechanisms for Real-Time Transport", draft-ietf-avtext-
rtp-grouping-taxonomy-01 (work in progress), February rtp-grouping-taxonomy-01 (work in progress), February
2014. 2014.
[ISOBMFF] ISO/IEC 14496-12 | 15444-12: "Information technology - [ISOBMFF] ISO/IEC 14496-12 | 15444-12: "Information technology -
Coding of audio-visual objects - Part 12: ISO base media Coding of audio-visual objects - Part 12: ISO base media
file format" | "Information technology - JPEG 2000 image file format" | "Information technology - JPEG 2000 image
coding system - Part 12: ISO base media file format", coding system - Part 12: ISO base media file format",
2012. 2012.
[JCTVC-J0107] Wang, Y.-K., Chen, Y., Joshi, R., and Ramasubramonian, [JCTVC-J0107]
K., "AHG9: On RAP pictures", JCT-VC document JCTVC-J0107, Wang, Y.-K., Chen, Y., Joshi, R., and Ramasubramonian, K.,
10th JCT-VC meeting, July 2012, Stockholm, Sweden. "AHG9: On RAP pictures", JCT-VC document JCTVC-J0107, 10th
JCT-VC meeting, July 2012, Stockholm, Sweden.
[MPEG2S] ISO/IEC 13818-1, "Information technology - Generic coding [MPEG2S] ISO/IEC 13818-1, "Information technology - Generic coding
of moving pictures and associated audio information: of moving pictures and associated audio information:
Systems", 2013. Systems", 2013.
[MPEGDASH] ISO/IEC 23009-1, "Information technology - Dynamic [MPEGDASH] ISO/IEC 23009-1, "Information technology - Dynamic
adaptive streaming over HTTP (DASH) - Part 1: Media adaptive streaming over HTTP (DASH) - Part 1: Media
presentation description and segment formats", 2012. presentation description and segment formats", 2012.
[RFC5109] Li, A., "RTP Payload Format for Generic Forward Error [RFC5109] Li, A., "RTP Payload Format for Generic Forward Error
 End of changes. 122 change blocks. 
340 lines changed or deleted 491 lines changed or added
