Internet Video Codec (Active WG)
Art Area: Adam Roach, Alexey Melnikov, Ben Campbell | 2015-May-18

IETF-99 netvc minutes

Session 2017-07-17 1550-1720: Athens/Barcelona - Audio stream - netvc chatroom


minutes-99-netvc-01 minutes

          IETF 99
          Monday 17 July 2017, Afternoon Session II
          -------- Agenda --------
          * No agenda bashing
          * Tim Terriberry will present Daala and Thomas Daede’s slides
          -------- Chair Slides --------
          Video codec requirements and evaluation methodology: updated (Alexey
          Filippov, Andrey Norkin, Jose Alvarez)
          * Mo Zanaty: requirements document is ready for progressing; currently
          at version section
                  most changes around section 3.1.1
                  - no other substantive changes
                  - last call at end of May
                  - current status: shepherd write-up and passing it on to AD
          * Earlier question: whether or not we publish it
                  - main impetus is that it is also used by other bodies
          -------- Test and Evaluation Criteria --------
          PRESENTER Tim Terriberry (slides by Thomas Daede)
          NETVC Testing
          * not a lot of changes to testing documents
          * have started exercising some of the subjective testing procedures for it
          * added a subjective test set (small subset of objective test set)
          Statistical analysis
          * Generally 12 viewers needed for results to be significant
          * SP 50 will be changed in future
          * CDEF (constrained directional enhancement filter) ended up being
          significantly better than CLPF for a number of videos
          * CLPF: These are all completed; you can still vote but the results have
          been calculated
          * Jonathan Lennox: do we really intend to have the Sintel video up
          there twice?
          * Answer: not sure
          Test: https://arewecompressedyet.com/
          * Mo Zanaty: for new subjective tests, we will start forwarding to the
          list if people are willing to give their feedback on them
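
          Note on the viewer count: the minutes do not record which statistical
          test is applied to the subjective votes. As a minimal sketch, assuming
          each viewer casts one pairwise preference (A or B, ties excluded) and
          that a two-sided exact sign test at the 5% level is used, the vote
          split needed for significance can be computed as follows. The function
          names are hypothetical and are not part of any AWCY tooling.

```python
from math import comb


def sign_test_p_value(prefer_a: int, prefer_b: int) -> float:
    """Two-sided exact sign-test p-value for a pairwise preference vote.

    Assumes each viewer independently prefers A or B with probability 1/2
    under the null hypothesis; ties must be removed by the caller.
    """
    n = prefer_a + prefer_b
    if n == 0:
        return 1.0
    k = max(prefer_a, prefer_b)
    # Probability of a split at least this lopsided in one tail ...
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # ... doubled for the two-sided test and clamped to 1.
    return min(1.0, 2 * tail)


def min_votes_for_significance(viewers: int, alpha: float = 0.05):
    """Smallest majority giving p < alpha, or None if no split can reach it."""
    for k in range((viewers + 1) // 2, viewers + 1):
        if sign_test_p_value(k, viewers - k) < alpha:
            return k
    return None


if __name__ == "__main__":
    for viewers in (5, 8, 12):
        print(viewers, min_votes_for_significance(viewers))
```

          Under these assumptions, very small panels cannot reach significance
          even with a unanimous vote, which is one way to see why a minimum
          panel size matters.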
          -------- Thor Update --------
          PRESENTER Steinar Midtskogen
          * No updates since IETF 98 (spring 2017)
          * Last consensus: have Thor and Daala converge
          * Wish list: a tool designed to improve screen content; this has not
          started yet
          * Concerns about buffer retirement: both filters can have vertical
          originally fixed by restricting second; quick fix
          * Steinar tried to find another fix: combined two passes into one
          * Used new subjective test framework AWCY
          * Tests were done in AV1, but the results are not expected to differ
          much for Thor
          * In all cases objective scores for CDEF are slightly better
          * high latency vs low latency results; high latency has more ties
          * Objective codec comparisons: did not use objective-2-fast b/c it breaks
          AV1 sometimes
          * AV1 compression history: decreased over the last year, compression gains
          are slightly more than 20%, most of that has come in the last three months
          * AV1 complexity history: y-axis is logarithmic, frames per minute not
          fps. In order to get a 20% compression gain, the complexity goes up by
          about 1000%
          * Tim Terriberry: not sure which commits Steinar measured, but there
          were changes that allowed you to make much quicker selections; expansion
          probably made it much slower, then sped up, faster again
          * Steinar Midtskogen: Complexity could go down
          * Mo Zanaty: data points? Steinar Midtskogen: Twice a month, same
          configuration. Selected whatever was in the repository first on the 15th
          of each month
          -------- Codec Comparison: Thor, VP9, AV1 --------
          * Thor and VP9 seemed to have the same complexity and compression
          trade-off, except Thor can have more compression at the cost of added
          complexity
          * AV1 performing better
          * If we limit sequence test set to screen content, Thor performs much
          better than VP9 but not as well as AV1
          * It’s possible to get Thor to perform roughly as well as AV1 but with
          a fraction of the tools and added complexity
          * Mo Zanaty: what amount of screen content is in earlier test set?
          * Steinar Midtskogen: at least one sequence had a BDR score of 80%
          better than Thor
          * Tim Terriberry: Wikipedia set (screen capture of someone scrolling
          through Wikipedia), a few Twitch videos (Minecraft)
          * Steinar Midtskogen: with CDEF we should get a slight improvement
          * Jonathan Lennox: you don’t anticipate any complexity costs?
          * Steinar Midtskogen: not that huge; for the entropy coder, some
          complexity but not a doubling or something like that. Screen content
          tool hasn’t been invented yet.
          -------- Daala Update --------
          PRESENTER Tim Terriberry
          * This change is something we discovered while working with the VP9
          * How this works for VP9 (VP9 slide)
          * Proposal for AV1, but AV1 has all the same problems as VP9 and more
          problems on top of that
          * Mo Zanaty: comment on resilience: with these frame numbers that have
          been added, you can have a much larger frame number (10-bit, 12-bit);
          if you drop one, you actually know that you dropped one
          * Right. Wanted to have some consistent way of solving this problem.
          * Before slide: basically the situation now. Each one has a buffer of
          actual pixels in it
          * After (proposed): move all the probabilities up into the reference
          frame; the global motion data moves up
          * Whatever is the first frame in your list of reference frames, you
          draw from it your reference pixels, all of your probabilities, and all
          of your motion data
          * Mo Zanaty: do you mean to say that before you can update a context
          after decoding a non-reference frame, but now you can’t?
          * TO DO slide: Global motion; relatively recent, proposal not complete
          yet, frame size prediction
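
          Note on the frame-number comment above: a minimal sketch of how a
          receiver could detect dropped frames from a wrapping frame counter.
          It assumes the counter increases by one per frame and wraps at
          2**bits; the function name and this counting rule are illustrative
          assumptions, not AV1 or Daala syntax.

```python
def frames_missing(prev_num: int, cur_num: int, bits: int) -> int:
    """Number of frames lost between two consecutively received frames.

    Assumes frame numbers increase by 1 per frame and wrap at 2**bits.
    A loss of 2**bits or more frames aliases to a smaller count, which is
    why a wider counter makes drops easier to detect.
    """
    modulus = 1 << bits
    # A normal successor differs by exactly 1 (mod 2**bits), giving 0 here.
    return (cur_num - prev_num - 1) % modulus


if __name__ == "__main__":
    # Counter values the sender would stamp on frame 0 and frame 5
    # (frames 1 to 4 are lost in transit).
    for bits in (2, 10):
        sent_prev = 0 % (1 << bits)
        sent_cur = 5 % (1 << bits)
        print(bits, frames_missing(sent_prev, sent_cur, bits))
```

          With a 2-bit counter the four lost frames go unnoticed because the
          counter wraps back to the expected value; with a 10-bit counter the
          same loss is reported.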
          -------- Chroma From Luma --------
          PRESENTER Tim Terriberry; Luc Trudeau, David Michael Barr did most of
          the work
          * Update to CfL proposal
          * This presentation topic: Solely used for intra-prediction
          * Originally designed CfL to work within Daala. Hard to do in other codecs
          * A lot of CfL proposals try to build a linear model implicitly from
          data; this is not very good
          * No longer require PVQ (perceptual vector quantization) b/c we’re
          doing everything in spatial domain
          * Decoder nice and simple, just use parameters that were sent
          * CfL: encoder side slide, to answer “Are the models going to
          have some constant offset?”
          * feed into search for best linear parameters
          * A couple choices made for efficiency reasons
          * Mo Zanaty: one question about your alphas; have you ever looked to
          see whether one plane is useful for another plane
          * We code them together, direction on that plane and magnitude in
          that direction; probability will increase to the extent that those
          are correlated
          * Boundary handling complicates things b/c (see slide)
          * 1 pixel, has a small effect on metrics, none visible in picture
          * Steinar Midtskogen: does it make sense to use something like alpha
          values to drive predictions [missed]
          * Tried in Daala, didn’t help. May be worth revisiting in AV1;
          answer is maybe.
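
          Note on the CfL discussion above: a minimal spatial-domain sketch of
          an explicitly signaled linear model. It assumes the prediction is the
          zero-mean luma block scaled by a signaled alpha and added to a chroma
          DC value, and that the encoder picks alpha by a plain least-squares
          fit. The slides may differ on both points (for example, the actual
          search and the quantization of alpha are not recorded in the minutes).
          Blocks are flat lists of samples, with luma assumed to be already at
          chroma resolution; all names are hypothetical.

```python
def cfl_predict(luma, alpha, chroma_dc):
    """Decoder side: apply the signaled alpha to the zero-mean luma block."""
    luma_mean = sum(luma) / len(luma)
    # Removing the luma mean means alpha only models the AC relationship;
    # the constant offset comes entirely from chroma_dc.
    return [chroma_dc + alpha * (y - luma_mean) for y in luma]


def fit_alpha(luma, chroma):
    """Encoder side (assumed): least-squares slope of chroma against luma."""
    n = len(luma)
    luma_mean = sum(luma) / n
    chroma_mean = sum(chroma) / n
    num = sum((y - luma_mean) * (c - chroma_mean) for y, c in zip(luma, chroma))
    den = sum((y - luma_mean) ** 2 for y in luma)
    # A flat luma block carries no AC information, so fall back to alpha = 0.
    return num / den if den else 0.0


if __name__ == "__main__":
    luma = [10, 20, 30, 40]
    chroma = [92.5, 97.5, 102.5, 107.5]
    alpha = fit_alpha(luma, chroma)
    chroma_dc = sum(chroma) / len(chroma)  # stand-in for a DC prediction
    print(alpha, cfl_predict(luma, alpha, chroma_dc))
```

          In this sketch the decoder only evaluates one multiply-add per sample
          from the transmitted parameter, which matches the point that the
          decoder stays simple while the search cost sits in the encoder.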

