50287 Commits

Author SHA1 Message Date
Nuo Mi f68f40736f avcodec/vvcdec: support mv wraparound
A 360 video specific tool
see https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=9503377

passed files:
    DMVR_A_Huawei_3.bit
    WRAP_D_InterDigital_4.bit
    WRAP_A_InterDigital_4.bit
    WRAP_B_InterDigital_4.bit
    WRAP_C_InterDigital_4.bit
    ERP_A_MediaTek_3.bit
2024-06-08 17:45:55 +08:00
Nuo Mi 685174069f avcodec/vvcdec: misc, reindent inter.c 2024-06-08 17:45:55 +08:00
Nuo Mi a4013e748a avcodec/vvcdec: refact out emulated_edge_no_wrap
prepare for reference wraparound
2024-06-08 17:45:55 +08:00
Nuo Mi 8abdf0a28e avcodec/vvcdec: misc, move src offset inside emulated_edge 2024-06-08 17:45:55 +08:00
Nuo Mi 2d98786fee avcodec/vvcdec: refact, remove emulated_edge_dmvr and emulated_edge_bilinear to simplify code 2024-06-08 17:45:55 +08:00
Lynne 714596bcbf aacdec_usac: zero out alpha values for the current frame 2024-06-08 00:22:41 +02:00
Lynne c2d459cb51 aacdec_usac: fix stereo alpha values for transients
Typo.
Also added comments and fixed the branch underneath.
2024-06-08 00:22:40 +02:00
Lynne 7223523335 aacdec_usac: use correct TNS values
The standard slightly modified the maximum TNS bands allowed.
2024-06-08 00:22:40 +02:00
Lynne 9b41cc0430 aacdec_usac: do not round noise amplitude values
Use floating point division instead of integer division.
2024-06-08 00:22:40 +02:00
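The rounding problem named in 9b41cc0430 comes from C's integer division, which truncates before the result is ever converted to float. A minimal sketch of the difference follows; the helper names `ratio_int` and `ratio_float` are hypothetical and are not taken from the decoder, which uses its own formula.

```c
#include <assert.h>

/* Hypothetical helper: both operands are int, so the division truncates
 * toward zero before the result is converted to float. */
static float ratio_int(int num, int den)
{
    return num / den;
}

/* Hypothetical helper: casting one operand to float makes the division
 * happen in floating point, so the fractional part is kept. */
static float ratio_float(int num, int den)
{
    return (float)num / den;
}
```

With these helpers, `ratio_int(7, 2)` yields 3.0 while `ratio_float(7, 2)` yields 3.5.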
Lynne a18d0659f4 aacdec_usac: skip coeff decoding if the number to be decoded is 0
Yet another thing not mentioned in the spec.
2024-06-08 00:22:39 +02:00
Lynne 1ad9a4008b aacdec_usac: decouple TNS active from TNS data present flag
The issue was that in case of common TNS parameters, TNS was
entirely skipped, as tns.present was set to 0.
2024-06-08 00:22:39 +02:00
Lynne c0fdb0cdfd aacdec_usac: do not continue parsing bitstream on core_mode == 1
Although LPD is not functional yet, the bitstream ends at that point.
2024-06-08 00:22:38 +02:00
Lynne 8ecaa64b9b aacdec_usac: respect tns_on_lr flag
This was left out, and due to av_unused, forgotten about.
2024-06-08 00:22:38 +02:00
Lynne 25b848a0bd aacdec_usac: correctly set and use the layout map 2024-06-08 00:22:38 +02:00
Lynne ae495b56ff aacdec_usac: remove fallback for custom maps with invalid position
Not needed as every possible index is mapped.
2024-06-08 00:22:37 +02:00
Lynne 91ab17e2fe aacdec_usac: tag LFE channels as such in the channel map
Missed.
2024-06-08 00:22:37 +02:00
Lynne 62cd6d9e59 aacdec_usac: clean up nb_elems on error
Require that there is a valid layout with a valid number of channels
before accepting nb_elems.
The value is required when flushing.

Thanks to kasper93 for figuring it out.
2024-06-08 00:22:37 +02:00
Lynne 5c328e6c1e aacdec: increase MAX_ELEM_ID to 64
In USAC, we set the max to 64.
2024-06-08 00:22:36 +02:00
Lynne 91fd6ca000 lavc: bump minor and add APIchanges entry for new USAC profile 2024-06-08 00:22:36 +02:00
Lynne 1c066867df aac: define a new profile for USAC
This allows users to determine whether a stream is USAC or not.
2024-06-08 00:22:35 +02:00
Lynne ee419804da mpeg4audio: explicitly define each AOT
This makes it far easier to figure out which AOT belongs to which
profile.
Also, explicitly highlight the holes.
2024-06-08 00:22:35 +02:00
Lynne 8a2fe8a5b9 mpeg4audio: rename AOT_USAC_NOSBR to AOT_USAC
The issue is that AOT 45 isn't defined anywhere, and looking at the git
blame, it seems to have sprung up through a reordering of the enum,
and adding a hole.

The spec does not define an explicit AOT for SBR and no SBR, and only
uses AOT 42 (previously AOT_USAC_NOSBR), so just rename it to AOT_USAC
and replace its use everywhere.
2024-06-08 00:22:31 +02:00
Michael Niedermayer dce69ba89e avcodec/libx264: Check init_get_bits8() return code
Fixes: CID1594529 Unchecked return value

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-06-07 21:42:25 +02:00
Michael Niedermayer 8a64a003b5 avcodec/ilbcdec: Remove dead code
Yes the same dead code is in "iLBC Speech Coder ANSI-C Source Code"

Fixes: CID1509370 Logically dead code

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-06-07 21:42:24 +02:00
Michael Niedermayer 9b76e49061 avcodec/vp8: Check cond init
Fixes: CID1598563 Unchecked return value

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-06-07 21:42:24 +02:00
Michael Niedermayer 4ac7405aaf avcodec/vp8: Check mutex init
Fixes: CID1598556 Unchecked return value

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2024-06-07 21:42:24 +02:00
Rémi Denis-Courmont 3152c684cb lavc/vc1dsp: R-V V vc1_inv_trans_4x4
T-Head C908 (cycles):
vc1dsp.vc1_inv_trans_4x4_c: 310.7
vc1dsp.vc1_inv_trans_4x4_rvv_i32: 120.0

We could use 1 `vlseg4e64.v` instead of 4 `vle16.v`, but that seems to
be about 7% slower.
2024-06-07 17:53:05 +03:00
Rémi Denis-Courmont 6ffa639c8a lavc/vc1dsp: R-V V vc1_inv_trans_4x8
T-Head C908 (cycles):
vc1dsp.vc1_inv_trans_4x8_c: 653.2
vc1dsp.vc1_inv_trans_4x8_rvv_i32: 234.0
2024-06-07 17:53:05 +03:00
Rémi Denis-Courmont a169f3bca5 lavc/vc1dsp: R-V V vc1_inv_trans_8x4
T-Head C908 (cycles):
vc1dsp.vc1_inv_trans_8x4_c:       626.2
vc1dsp.vc1_inv_trans_8x4_rvv_i32: 215.2
2024-06-07 17:53:05 +03:00
Rémi Denis-Courmont 04397a29de lavc/vc1dsp: R-V V vc1_inv_trans_8x8
T-Head C908 (cycles):
vc1dsp.vc1_inv_trans_8x8_c:       871.7
vc1dsp.vc1_inv_trans_8x8_rvv_i32: 286.7
2024-06-07 17:53:05 +03:00
Rémi Denis-Courmont c3dbbb316e lavc/flacdsp: fix sign extension in R-V V wasted33
We need to use either VWCVT.X.X.V or VSEXT.VF2. The latter is preferable
to avoid changing VTYPE.
2024-06-07 17:53:05 +03:00
Zhao Zhili 7d46ab9e12 avcodec/mediacodecenc: workaround the alignment requirement for H.265
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-06-07 13:14:46 +08:00
Zhao Zhili 2a68b2d643 avcodec/mediacodecenc: workaround the alignment requirement only for H.264
There is no bsf for other codecs to modify crop info except H.265.
For H.265, the assumption that FFALIGN(width, 16)xFFALIGN(height, 16)
is the video resolution can be wrong, since the encoder can use CTU
larger than 16x16. In that case, using FFALIGN(width, 16) - width
as crop_right is incorrect. So disable the workaround for H.265 now.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-06-07 13:14:46 +08:00
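The crop arithmetic discussed in 2a68b2d643 can be sketched as follows. `ALIGN_UP` reproduces the rounding of FFmpeg's `FFALIGN` macro; the helper `crop_right_for` and the 64-pixel alignment are assumptions chosen for illustration, not values taken from the encoder.

```c
#include <assert.h>

/* Same rounding as FFmpeg's FFALIGN(x, a); a must be a power of two. */
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Hypothetical helper: number of padding columns to crop on the right
 * if the coded width is `width` rounded up to a multiple of `align`. */
static int crop_right_for(int width, int align)
{
    return ALIGN_UP(width, align) - width;
}
```

For a 1000-pixel-wide picture, assuming 16-pixel alignment gives a crop of 8 columns, while an encoder that pads to 64 would need 24. A crop value derived from the 16-pixel assumption would then leave padding in the output.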
Zhao Zhili 680b3cee1f avcodec/h265_metadata: Add options to set width/height after crop
It's a common use case to request a video size after crop. Before
this patch, the user had to know the video size before crop, then set
crop_right/crop_bottom accordingly. Since HEVC can have different
CTU size, it's not easy to get/deduce the video size before crop.
With the new width/height options, there is no such requirement.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2024-06-07 13:14:46 +08:00
Ramiro Polla 2d24a80e5e avcodec/mpegvideo_enc: give magic number a name 2024-06-05 19:39:36 +02:00
Ramiro Polla 01b1f4c9a5 libavcodec/libxvid: code cleanup (replace magic numbers) 2024-06-05 19:39:35 +02:00
Rémi Denis-Courmont 0415bb74c8 lavc/vp8dsp: remove no longer used macros 2024-06-04 17:42:07 +03:00
Rémi Denis-Courmont 121fb846b9 lavc/vp7dsp: add R-V V vp7_idct_dc_add4uv
This is almost the same story as vp7_idct_add4y. We just have to use
strided loads of 2 64-bit elements to account for the different data
layout in memory.

T-Head C908:
vp7_idct_dc_add4uv_c:       7.5
vp7_idct_dc_add4uv_rvv_i64: 2.0
vp8_idct_dc_add4uv_c:       6.2
vp8_idct_dc_add4uv_rvv_i32: 2.2 (before)
vp8_idct_dc_add4uv_rvv_i64: 2.0

SpacemiT X60:
vp7_idct_dc_add4uv_c:       6.7
vp7_idct_dc_add4uv_rvv_i64: 2.2
vp8_idct_dc_add4uv_c:       5.7
vp8_idct_dc_add4uv_rvv_i32: 2.5 (before)
vp8_idct_dc_add4uv_rvv_i64: 2.0
2024-06-04 17:42:07 +03:00
Rémi Denis-Courmont 225de53c9d lavc/vp8dsp: rework R-V V idct_dc_add4y
DCT-related FFmpeg functions often add an unsigned 8-bit sample to a
signed 16-bit coefficient, then clip the result back to an unsigned
8-bit value. RISC-V has no signed 16-bit to unsigned 8-bit clip, so
instead our most common sequence is:
    VWADDU.WV
    set SEW to 16 bits
    VMAX.VV zero     # clip negative values to 0
    set SEW to 8 bits
    VNCLIPU.WI       # clip values over 255 to 255 and narrow

Here we use a different sequence which does not require toggling the
vector type. This assumes that the wide addend vector is biased by
-128:
    VWADDU.WV
    VNCLIP.WI    # clip values to signed 8-bit and narrow
    VXOR.VX 0x80 # flip sign bit (convert signed to unsigned)

Also the VMAX is effectively replaced by a VXOR of half-width. In this
function, this comes for free as we anyway add a constant to the wide
vector in the prologue.

On C908, this has no observable effects. On X60, this improves
microbenchmarks by about 20%.
2024-06-04 17:42:07 +03:00
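The equivalence of the two clipping sequences in 225de53c9d can be checked with a scalar model in plain C. This is only a sketch of the arithmetic, not the RISC-V assembly: the function names are hypothetical, and plain `int` stands in for the 16-bit wide vector elements.

```c
#include <assert.h>
#include <stdint.h>

/* Usual sequence: widen-add, clamp below at 0 (VMAX), then clamp
 * above at 255 while narrowing (VNCLIPU). */
static uint8_t add_clip_reference(uint8_t pixel, int16_t coeff)
{
    int sum = pixel + coeff;
    if (sum < 0)
        sum = 0;
    if (sum > 255)
        sum = 255;
    return (uint8_t)sum;
}

/* Alternative sequence: the wide addend is biased by -128, the sum is
 * clipped to signed 8-bit (VNCLIP), and the sign bit is flipped (VXOR). */
static uint8_t add_clip_biased(uint8_t pixel, int16_t coeff)
{
    int sum = pixel + (coeff - 128);
    if (sum < -128)
        sum = -128;
    if (sum > 127)
        sum = 127;
    return (uint8_t)sum ^ 0x80; /* signed 8-bit back to unsigned */
}

/* Compare both sequences for every pixel and a range of coefficients. */
static int count_mismatches(void)
{
    int mismatches = 0;
    for (int pixel = 0; pixel < 256; pixel++)
        for (int coeff = -512; coeff < 512; coeff++)
            if (add_clip_reference((uint8_t)pixel, (int16_t)coeff) !=
                add_clip_biased((uint8_t)pixel, (int16_t)coeff))
                mismatches++;
    return mismatches;
}
```

Flipping the sign bit adds 128 modulo 256, which undoes the bias: the signed clip bounds -128 and 127 map to 0 and 255.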
Rémi Denis-Courmont 4e120fbbbd lavc/vp8dsp: add R-V V vp7_idct_dc_add4y
As with idct_dc_add, most of the code is shared with, and replaces, the
previous VP8 function. To improve performance, we break down the 16x4
matrix into 4 rows, rather than 4 squares. Thus strided loads and
stores are avoided, and the 4 DC calculations are vectored.
Unfortunately this requires a vector gather to splat the DC values, but
overall this is still a win for performance:

T-Head C908:
vp7_idct_dc_add4y_c:       7.2
vp7_idct_dc_add4y_rvv_i32: 2.2
vp8_idct_dc_add4y_c:       6.2
vp8_idct_dc_add4y_rvv_i32: 2.2 (before)
vp8_idct_dc_add4y_rvv_i32: 1.7

SpacemiT X60:
vp7_idct_dc_add4y_c:       6.2
vp7_idct_dc_add4y_rvv_i32: 2.0
vp8_idct_dc_add4y_c:       5.5
vp8_idct_dc_add4y_rvv_i32: 2.5 (before)
vp8_idct_dc_add4y_rvv_i32: 1.7

I also tried to provision the DC values using indexed loads. It ends up
slower overall, especially for VP7, as we then have to compute 16 DC's
instead of just 4.
2024-06-04 17:40:41 +03:00
Rémi Denis-Courmont 30797e4ff6 lavc/vp8dsp: add R-V V vp7_idct_dc_add
This just computes the direct coefficient and hands over to code shared
with VP8. Accordingly the bulk of changes are just rewriting the VP8
code to share.

Nothing to write home about:
vp7_idct_dc_add_c:       1.7
vp7_idct_dc_add_rvv_i32: 1.2
2024-06-04 17:40:36 +03:00
Frank Plowman d866f49791 lavc/vvc: Reallocate pixel buffers if pixel shift changes
Allocations in the following lines depend on the pixel shift, and so
these buffers must be reallocated if the pixel shift changes.  Patch
fixes segmentation faults in fuzzed bitstreams.

Signed-off-by: Frank Plowman <post@frankplowman.com>
2024-06-04 20:13:47 +08:00
Anton Khirnov 9576a00527 lavc/hevcdec: drop unused HEVCContext.width/height 2024-06-04 11:46:27 +02:00
Anton Khirnov a13b892080 lavc/hevcdec: deduplicate calling hwaccel decode_params() 2024-06-04 11:46:27 +02:00
Anton Khirnov e4601cc339 lavc/hevc*: move to hevc/ subdir 2024-06-04 11:46:27 +02:00
Anton Khirnov ba56a300a9 lavc/hevcdec: drop HEVCContext.frame
It is merely a redundant pointer to cur_frame->f
2024-06-04 11:46:27 +02:00
Anton Khirnov db84c1c6ef lavc/hevcdec: rename HEVCFrame.frame to just f
This is shorter, loses no information, and is consistent with other
similar structs.
2024-06-04 11:46:23 +02:00
Anton Khirnov 9226514ced lavc/hevcdec: rename HEVCContext.ref to cur_frame
Since it stores a pointer to the current frame.
2024-06-04 11:44:37 +02:00
Anton Khirnov 7ad9400952 lavc/hevcdec: drop HEVCContext.HEVClc
It is merely a pointer to local_ctx[0], which we can just as well use
directly.
2024-06-04 11:36:51 +02:00
Anton Khirnov 67ca18dd56 lavc/hevcdec: drop HEVCLocalContext.gb
In all HEVCLocalContext instances except the first one, the bitreader is
never used for actually reading bits, but merely for passing the buffer
to ff_init_cabac_decoder(), which is better done directly.

The instance that actually is used for bitreading gets moved to stack in
decode_nal_unit(), which makes its lifetime clearer.
2024-06-04 11:36:51 +02:00