Fixes: 3.82046e+18 is outside the range of representable values of type 'unsigned int'
Fixes: 62276/clusterfuzz-testcase-minimized-ffmpeg_dem_WEBM_DASH_MANIFEST_fuzzer-6381436594421760
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 5053074104798691550 + 5053074104259715104 cannot be represented in type 'long'
Fixes: 62276/clusterfuzz-testcase-minimized-ffmpeg_dem_WAV_fuzzer-6515315309936640
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 9223372036854775807 - -8000000 cannot be represented in type 'long'
Fixes: 62276/clusterfuzz-testcase-minimized-ffmpeg_dem_SBG_fuzzer-5133181743136768
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: 62276/clusterfuzz-testcase-minimized-ffmpeg_dem_RPL_fuzzer-4677434693517312
Fixes: signed integer overflow: 5555555555555555556 * 8 cannot be represented in type 'long long'
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: negation of -2147483648 cannot be represented in type 'int'; cast to an unsigned type to negate this value to itself
Fixes: 62276/clusterfuzz-testcase-minimized-ffmpeg_dem_JACOSUB_fuzzer-5401294942371840
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: 62276/clusterfuzz-testcase-minimized-ffmpeg_dem_CONCAT_fuzzer-6434245599690752
Fixes: signed integer overflow: 9223372026773000000 + 22337000000 cannot be represented in type 'long'
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: -2147483648 + -25122315 cannot be represented in type 'int'
Fixes: 62285/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_WAVARC_fuzzer-6199806972198912
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
mb_change_bits is given space based on height >> 2, while more data is read
Fixes: out of array access
Fixes: 62285/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_TRUEMOTION1_fuzzer-5201925062590464.fuzz
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: -1575944192 + -602931200 cannot be represented in type 'int'
Fixes: 62285/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_QOA_fuzzer-6470469339185152
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Since log statements printing int64 were made portable in
4464b7eeb1, let us include
inttypes.h explicitly (as it is unclear where PRId64 and
such are coming from now).
Reported-by: Leo Izen <leo.izen@gmail.com>
Signed-off-by: Marth64 <marth64@proxyid.net>
The VP8 compressed header may not be byte-aligned due to boolean
coding. Round up byte count for accurate data positioning.
Signed-off-by: Jianhui Dai <jianhui.j.dai@intel.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
This commit adds value range checks to cbs_vp8_read_unsigned_le,
migrates fixed() to use it, and enforces little-endian consistency for
all read methods.
Signed-off-by: Jianhui Dai <jianhui.j.dai@intel.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
As the plain neon qpel_h functions process two rows at a time,
we need to allocate storage for h+8 rows instead of h+7.
By allocating storage for h+8 rows, incrementing the stack
pointer won't end up at the right spot in the end. Store the
intended final stack pointer value in a register x14 which we
store on the stack.
AWS Graviton 3:
put_hevc_qpel_bi_hv4_8_c: 385.7
put_hevc_qpel_bi_hv4_8_neon: 131.0
put_hevc_qpel_bi_hv4_8_i8mm: 92.2
put_hevc_qpel_bi_hv6_8_c: 701.0
put_hevc_qpel_bi_hv6_8_neon: 239.5
put_hevc_qpel_bi_hv6_8_i8mm: 191.0
put_hevc_qpel_bi_hv8_8_c: 1162.0
put_hevc_qpel_bi_hv8_8_neon: 228.0
put_hevc_qpel_bi_hv8_8_i8mm: 225.2
put_hevc_qpel_bi_hv12_8_c: 2305.0
put_hevc_qpel_bi_hv12_8_neon: 558.0
put_hevc_qpel_bi_hv12_8_i8mm: 483.2
put_hevc_qpel_bi_hv16_8_c: 3965.2
put_hevc_qpel_bi_hv16_8_neon: 732.7
put_hevc_qpel_bi_hv16_8_i8mm: 656.5
put_hevc_qpel_bi_hv24_8_c: 8709.7
put_hevc_qpel_bi_hv24_8_neon: 1555.2
put_hevc_qpel_bi_hv24_8_i8mm: 1448.7
put_hevc_qpel_bi_hv32_8_c: 14818.0
put_hevc_qpel_bi_hv32_8_neon: 2763.7
put_hevc_qpel_bi_hv32_8_i8mm: 2468.0
put_hevc_qpel_bi_hv48_8_c: 32855.5
put_hevc_qpel_bi_hv48_8_neon: 6107.2
put_hevc_qpel_bi_hv48_8_i8mm: 5452.7
put_hevc_qpel_bi_hv64_8_c: 57591.5
put_hevc_qpel_bi_hv64_8_neon: 10660.2
put_hevc_qpel_bi_hv64_8_i8mm: 9580.0
Signed-off-by: Martin Storsjö <martin@martin.st>
As the plain neon qpel_h functions process two rows at a time,
we need to allocate storage for h+8 rows instead of h+7.
AWS Graviton 3:
put_hevc_qpel_uni_w_hv4_8_c: 422.2
put_hevc_qpel_uni_w_hv4_8_neon: 140.7
put_hevc_qpel_uni_w_hv4_8_i8mm: 100.7
put_hevc_qpel_uni_w_hv8_8_c: 1208.0
put_hevc_qpel_uni_w_hv8_8_neon: 268.2
put_hevc_qpel_uni_w_hv8_8_i8mm: 261.5
put_hevc_qpel_uni_w_hv16_8_c: 4297.2
put_hevc_qpel_uni_w_hv16_8_neon: 802.2
put_hevc_qpel_uni_w_hv16_8_i8mm: 731.2
put_hevc_qpel_uni_w_hv32_8_c: 15518.5
put_hevc_qpel_uni_w_hv32_8_neon: 3085.2
put_hevc_qpel_uni_w_hv32_8_i8mm: 2783.2
put_hevc_qpel_uni_w_hv64_8_c: 57254.5
put_hevc_qpel_uni_w_hv64_8_neon: 11787.5
put_hevc_qpel_uni_w_hv64_8_i8mm: 10659.0
Signed-off-by: Martin Storsjö <martin@martin.st>
As the plain neon qpel_h functions process two rows at a time,
we need to allocate storage for h+8 rows instead of h+7.
By allocating storage for h+8 rows, incrementing the stack
pointer won't end up at the right spot in the end. Store the
intended final stack pointer value in a register x14 which we
store on the stack.
AWS Graviton 3:
put_hevc_qpel_uni_hv4_8_c: 384.2
put_hevc_qpel_uni_hv4_8_neon: 127.5
put_hevc_qpel_uni_hv4_8_i8mm: 85.5
put_hevc_qpel_uni_hv6_8_c: 705.5
put_hevc_qpel_uni_hv6_8_neon: 224.5
put_hevc_qpel_uni_hv6_8_i8mm: 176.2
put_hevc_qpel_uni_hv8_8_c: 1136.5
put_hevc_qpel_uni_hv8_8_neon: 216.5
put_hevc_qpel_uni_hv8_8_i8mm: 214.0
put_hevc_qpel_uni_hv12_8_c: 2259.5
put_hevc_qpel_uni_hv12_8_neon: 498.5
put_hevc_qpel_uni_hv12_8_i8mm: 410.7
put_hevc_qpel_uni_hv16_8_c: 3824.7
put_hevc_qpel_uni_hv16_8_neon: 670.0
put_hevc_qpel_uni_hv16_8_i8mm: 603.7
put_hevc_qpel_uni_hv24_8_c: 8113.5
put_hevc_qpel_uni_hv24_8_neon: 1474.7
put_hevc_qpel_uni_hv24_8_i8mm: 1351.5
put_hevc_qpel_uni_hv32_8_c: 14744.5
put_hevc_qpel_uni_hv32_8_neon: 2599.7
put_hevc_qpel_uni_hv32_8_i8mm: 2266.0
put_hevc_qpel_uni_hv48_8_c: 32800.0
put_hevc_qpel_uni_hv48_8_neon: 5650.0
put_hevc_qpel_uni_hv48_8_i8mm: 5011.7
put_hevc_qpel_uni_hv64_8_c: 57856.2
put_hevc_qpel_uni_hv64_8_neon: 9863.5
put_hevc_qpel_uni_hv64_8_i8mm: 8767.7
Signed-off-by: Martin Storsjö <martin@martin.st>
As the plain neon qpel_h functions process two rows at a time,
we need to allocate storage for h+8 rows instead of h+7.
By allocating storage for h+8 rows, incrementing the stack
pointer won't end up at the right spot in the end. Store the
intended final stack pointer value in a register x14 which we
store on the stack.
AWS Graviton 3:
put_hevc_qpel_hv4_8_c: 386.0
put_hevc_qpel_hv4_8_neon: 125.7
put_hevc_qpel_hv4_8_i8mm: 83.2
put_hevc_qpel_hv6_8_c: 749.0
put_hevc_qpel_hv6_8_neon: 207.0
put_hevc_qpel_hv6_8_i8mm: 166.0
put_hevc_qpel_hv8_8_c: 1305.2
put_hevc_qpel_hv8_8_neon: 216.5
put_hevc_qpel_hv8_8_i8mm: 213.0
put_hevc_qpel_hv12_8_c: 2570.5
put_hevc_qpel_hv12_8_neon: 480.0
put_hevc_qpel_hv12_8_i8mm: 398.2
put_hevc_qpel_hv16_8_c: 4158.7
put_hevc_qpel_hv16_8_neon: 659.7
put_hevc_qpel_hv16_8_i8mm: 593.5
put_hevc_qpel_hv24_8_c: 8626.7
put_hevc_qpel_hv24_8_neon: 1653.5
put_hevc_qpel_hv24_8_i8mm: 1398.7
put_hevc_qpel_hv32_8_c: 14646.0
put_hevc_qpel_hv32_8_neon: 2566.2
put_hevc_qpel_hv32_8_i8mm: 2287.5
put_hevc_qpel_hv48_8_c: 31072.5
put_hevc_qpel_hv48_8_neon: 6228.5
put_hevc_qpel_hv48_8_i8mm: 5291.0
put_hevc_qpel_hv64_8_c: 53847.2
put_hevc_qpel_hv64_8_neon: 9856.7
put_hevc_qpel_hv64_8_i8mm: 8831.0
Signed-off-by: Martin Storsjö <martin@martin.st>
The hv32 and hv64 functions were identical - both loop and
process 16 pixels at a time.
The hv16 function was near identical, except for the outer loop
(and using sp instead of a separate register).
Given the size of these functions, the extra cost of the outer
loop is negligible, so use the same function for hv16 as well.
This removes over 200 lines of duplicated assembly, and over 4 KB
of binary size.
Signed-off-by: Martin Storsjö <martin@martin.st>
The first horizontal filter can use either i8mm or plain neon
versions, while the second part is a pure neon implementation.
Signed-off-by: Martin Storsjö <martin@martin.st>
The first horizontal filter can use either i8mm or plain neon
versions, while the second part is a pure neon implementation.
Signed-off-by: Martin Storsjö <martin@martin.st>
For widths of 32 pixels and more, loop first horizontally,
then vertically.
Previously, this function would process a 16 pixel wide slice
of the block, looping vertically. After processing the whole
height, it would backtrack and process the next 16 pixel wide
slice.
When doing 8tap filtering horizontally, the function must load
7 more pixels (in practice, 8) following the actual inputs, and
this was done for each slice.
By iterating first horizontally throughout each line, then
vertically, we access data in a more cache friendly order, and
we don't need to reload data unnecessarily.
Keep the original order in put_hevc_\type\()_h12_8_neon; the
only suboptimal case there is for width=24. But specializing
an optimal variant for that would require more code, which
might not be worth it.
For the h16 case, this implementation would give a slowdown,
as it now loads the first 8 pixels separately from the rest, but
for larger widths, it is a gain. Therefore, keep the h16 case
as it was (but remove the outer loop), and create a new specialized
version for horizontal looping with 16 pixels at a time.
Before: Cortex A53 A72 A73 Graviton 3
put_hevc_qpel_h16_8_neon: 710.5 667.7 692.5 211.0
put_hevc_qpel_h32_8_neon: 2791.5 2643.5 2732.0 883.5
put_hevc_qpel_h64_8_neon: 10954.0 10657.0 10874.2 3241.5
After:
put_hevc_qpel_h16_8_neon: 697.5 663.5 705.7 212.5
put_hevc_qpel_h32_8_neon: 2767.2 2684.5 2791.2 920.5
put_hevc_qpel_h64_8_neon: 10559.2 10471.5 10932.2 3051.7
Signed-off-by: Martin Storsjö <martin@martin.st>
This gets rid of a couple instructions, but the actual performance
is almost identical on Cortex A72/A73. On Cortex A53, it is a
handful of cycles faster.
Signed-off-by: Martin Storsjö <martin@martin.st>
Many of the routines within hevcdsp_epel_neon and hevcdsp_qpel_neon
store temporary buffers on the stack. When consuming it,
many of these functions use the stack pointer as incremental pointer
for reading the data (instead of storing it in another register),
which is rather unusual.
Technically, this is fine as long as the pointer remains properly
aligned.
However in the case of ff_hevc_put_hevc_qpel_uni_w_hv64_8_neon_i8mm,
after incrementing sp when reading data (within each 16 pixel
wide stripe) it would then reset the stack pointer back to a lower
value, for reading the next 16 pixel wide stripe, expecting the
data to remain untouched.
This can't be assumed; data on the stack below the stack pointer
can be clobbered (e.g. by a signal handler). Some OS ABIs
allow for a little margin that won't be touched, aka a red zone,
but not all do. The ones that do, guarantee 16 or 128 bytes, not
9 KB.
Convert this function to use a separate pointer register to
iterate through the data, retaining the stack pointer to point
at the bottom of the data we require to remain untouched.
Signed-off-by: Martin Storsjö <martin@martin.st>
MATCH_PER_STREAM_OPT iterates over all options of a given
OptionDef and tests whether they apply to the current stream;
if so, they are set to ost->apad, otherwise, the code errors
out. If no error happens, ost->apad is av_strdup'ed in order
to take ownership of this pointer.
But this means that setting it originally was premature,
as it leads to double-frees when an error happens lateron.
This can simply be reproduced with
ffmpeg -filter_complex anullsrc -apad bar -apad:n baz -f null -
This is a regression since 83ace80bfd.
Fix this by using a temporary variable instead of directly
setting ost->apad. Also only strdup the string if it actually
is != NULL.
Reviewed-by: Marth64 <marth64@proxyid.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
and rename it to FF_INFMT_INIT_CLEANUP. This flag is demuxer-only,
so this is the more appropriate place for it.
This does not preclude adding internal flags common to both
demuxer and muxer in the future.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The average length of the strings used here does not differ much
from the length of the longest string; therefore it makes sense
to use an array big enough for the longest string and not
a pointer to a string. This also moves this array into .rodata
(from .data.rel.ro).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Most of the data in the temporary storage ends up being
returned to the user as AVPacket.data, so it makes sense
to avoid using the AVBPrint for temporary storage altogether
(in particular in light of the fact that the blocks read here
are too big for the small-string optimization anyway) and
read the data directly into AVPacket.data. This also avoids
another memcpy() from a stack buffer to the AVBPrint in ts_image()
(that could always have been avoided with av_bprint_get_buffer()).
These changes also allow to use av_append_packet(), which
greatly simplifies the code; furthermore, one can avoid cleanup
code on error as the packet is already unreferenced generically
on error.
There are two user-visible changes from this patch:
1. Truncated packets are now marked as corrupt.
2. AVPacket.pos is set (it corresponds to the discarded header
line, 80 bytes before the position corresponding to the
actual packet data).
Furthermore, this patch also removes code that triggered
a -Wtautological-constant-out-of-range-compare warning
from Clang (namely a comparison of an unsigned and INT64_MAX
in an assert).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is an implementation detail of other input formats whether
they use raw_codec_id or not. The HLS demuxer should not rely
on this.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The code relies on their presence and would presumably crash
when retrieving in_fmt->name if an encrypted stream with a codec id
without demuxer were encountered.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Due to the bump it is now certain that all devices
that support flushing have the proper internal flag set.
(Notice that the check for LIBAVFORMAT_VERSION was wrong.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Besides improving readability this also ensures that
a developer who has the android content protocol enabled
and works on the other parts of the file will not
forget to add necessary inclusions just because of
(indirect) inclusions from the files included only
when said protocol is enabled.
Reviewed-by: Matthieu Bouron <matthieu.bouron@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
An alternative would be to limit all time/duration fields to below 64bit
Fixes: signed integer overflow: -93000000 - 9223372036839000000 cannot be represented in type 'long long'
Fixes: 64546/clusterfuzz-testcase-minimized-ffmpeg_dem_CONCAT_fuzzer-5110813828186112
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 178459578 + 2009763270 cannot be represented in type 'int'
Fixes: 62285/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_OSQ_fuzzer-5013423686287360
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Small cleanup for style, indent, switch case lables.
BTW, the preferred way to ease multiple indentation levels in a
switch statement is to align the switch and its subordinate
case labels in the same column
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
Fixes: signed integer overflow: 64 + 9223372036854775803 cannot be represented in type 'long long'
Fixes: 51896/clusterfuzz-testcase-minimized-ffmpeg_dem_CAF_fuzzer-6536881135550464
Fixes: 62276/clusterfuzz-testcase-minimized-ffmpeg_dem_CAF_fuzzer-6536881135550464
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 9223372036854775796 + 12 cannot be represented in type 'long long'
Fixes: 51896/clusterfuzz-testcase-minimized-ffmpeg_dem_IFF_fuzzer-4898373660704768
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 64 + 9223372036854775807 cannot be represented in type 'long long'
Fixes: 51896/clusterfuzz-testcase-minimized-ffmpeg_dem_CAF_fuzzer-6418242730328064
Fixes: 62276/clusterfuzz-testcase-minimized-ffmpeg_dem_CAF_fuzzer-6418242730328064
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 109817402400 * 301990077 cannot be represented in type 'long long'
Fixes: 51896/clusterfuzz-testcase-minimized-ffmpeg_dem_AVI_fuzzer-6706191715139584
Fixes: 62276/clusterfuzz-testcase-minimized-ffmpeg_dem_AVI_fuzzer-6706191715139584
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
For hardware cases where we are forced to have a fixed pool of frames
allocated up-front (such as array textures on decoder output), suggest
a possible workaround to the user if an allocation fails because the
pool is exhausted.
av_ts_make_time_string() used "%.6g" format, but this format was losing
precision even when the timestamp to be printed was not that large. For example
for 3 hours (10800) seconds, only 1 decimal digit was printed, which made this
format inaccurate when it was used in e.g. the silencedetect filter. Other
detection filters printing timestamps had similar issues. Also time base
parameter of the function was *AVRational instead of AVRational.
Resolve these problems by introducing a new function, av_ts_make_time_string2().
We change the used format to "%.*f", use a precision of 6, except when printing
values near 0, in which case we calculate the precision dynamically to aim for
a similar precision in normal form as with %.6g. No longer using scientific
representation can make parsing the timestamp easier for the users, we can
safely do this because the theoretical maximum of INT64_MAX*INT32_MAX still
fits into the string buffer in normal form.
We somewhat imitate %g by trimming ending zeroes and the potential decimal
point characters. In order not to trim "inf" as well, we assume that the
decimal point string does not contain the letter "f". Note that depending on
printf %f implementation, we might trim "infinity" to "inf".
Thanks for Allan Cady for bringing up this issue.
Signed-off-by: Marton Balint <cus@passwd.hu>
Broken in afa471d0ef. It just happened
to work before due to x86inc.asm previously performing XMM spills in
INIT_MMX mode which was more of a bug than an intentional feature.
- Properly initialize all the mappings to -1 by default.
- Make sure every output channel is assigned exactly once
- Autodetect a native layout when only native channels are present
- Always honor the user specified layout, but make sure the mapping is
compatible with it
The last item is a regression from 4af412be71.
Signed-off-by: Marton Balint <cus@passwd.hu>
In this case in_channel_idx was never set and the default 0 was used.
Suprisingly no one noticed that the respective fate test output was wrong.
Signed-off-by: Marton Balint <cus@passwd.hu>
Previously we always assumed that the channels are in native order, even if
they were not. The new channel layout API allows us to signal the proper
channel order, so let's do so.
Fixes ticket #98.
C11 provides static assertions via _Static_assert and
provides static_assert as a convenience define for this
in assert.h. Our codebase uses the latter, as _Static_assert
has actually already been deprecated in C23.
Not all toolchains that declare support for C11 actually
support it; e.g. MSVC 19.27 does not support _Static_assert,
but somehow supports static_assert. MSVC 19.27 admits to be
a "preview implementation of the ISO C11 standard",
so this is not surprising (MSVC 19.28 does not come with
this caveat).
Furthermore some FATE boxes [1] use old GCC toolchains (with
only experimental support for C11) where _Static_assert is
supported, but assert.h does not provide the fallback define.
They are broken since the first usage of static_assert.
This commit therefore checks whether static_assert and
_Static_assert work with assert.h included; if not,
configure errors out.
This intentionally drops support for MSVC 19.27. Users like
the old FATE boxes above can still add -Dstatic_assert=_Static_assert
to cflags as a workaround if desired.
[1]: https://fate.ffmpeg.org/report.cgi?time=20240321123620&slot=sh4-debian-qemu-gcc-4.7
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It has been moved after "st = s->streams[pkt->stream_index]"
in b140b8332c.
Deduplicate ff_read_packet() and ff_buffer_packet()
while fixing this.
This also fixes shadowing in ff_read_packet().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The continue statement will break out of the do/while loop, not the
outer loop as intended. This is one (compound) statement anyway, so we
can remove the do/while entirely.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Based on the AOMedia Film Grain Synthesis 1 (AFGS1) spec:
https://aomediacodec.github.io/afgs1-spec/
The parsing has been changed substantially relative to the AV1 film
grain OBU. In particular:
1. There is the possibility of maintaining multiple independent film
grain parameter sets, and decoders/players are recommended to pick
the one most appropriate for the intended display resolution. This
could also be used to e.g. switch between different grain profiles
without having to re-signal the appropriate coefficients.
2. Supporting this, it's possible to *predict* the grain coefficients
from previously signalled parameter sets, transmitting only the
residual.
3. When not predicting, the parameter sets are now stored as a series of
increments, rather than being directly transmitted.
4. There are several new AFGS1-exclusive fields.
I placed this parser in its own file, rather than h2645_sei.c, since
nothing in the generic AFGS1 film grain payload is specific to T.35, and
to compartmentalize the code base.
Implementation copied wholesale from dav1d, sans SIMD, under permissive
license. This implementation was extensively verified to be bit-exact,
so it serves as a much better starting point than trying to re-engineer
this from scratch for no reason. (I also authored the original
implementation in dav1d, so any "clean room" implementation would end up
looking much the same, anyway)
The notable changes I had to make while adapting this from the dav1d
code-base to the FFmpeg codebase include:
- reordering variable declarations to avoid triggering warnings
- replacing several inline helpers by avutil equivalents
- changing code that accesses frame metadata
- replacing raw plane copying logic by av_image_copy_plane
Apart from this, the implementation is basically unmodified.
Common utility function that can be used by all codecs to select the
right (any valid) film grain parameter set. In particular, this is
useful for AFGS1, which has support for multiple parameters.
However, it also performs parameter validation for H274.
Not directly signalled by AV1, but we should still set this accordingly
so that users will know what the original intended video characteristics
and chroma resolution were.
Not directly signalled by AV1, but we should still set this accordingly
so that users will know what the original intended video characteristics
and chroma resolution were.
H.274 specifies that film grain parameters are signalled as intended for
4:4:4 frames, so we always signal this, regardless of the frame's actual
subsampling.
This is needed for AV1 film grain as well, when using AFGS1 streams.
Also add extra width/height and subsampling information, which AFGS1
cares about, as part of the same API bump. (And in principle, H274
should also expose this information, since it is needed downstream to
correctly adjust the chroma grain frequency to the subsampling ratio)
Deprecate the equivalent H274-exclusive fields. To avoid breaking ABI,
add the new fields after the union; but with enough of a paper trail to
hopefully re-order them on the next bump.
small cleanup for style, redundant semicolons, goto labels,
in FFmpeg, we put goto labels at brace level.
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
avpkt->pos needs to be set for generic indexing or features such as the
stream_loop option will not work.
Co-authored-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Signed-off-by: Leo Izen <leo.izen@gmail.com>
This will allow users to pass the Android ApplicationContext which is mandatory
to retrieve the ContentResolver responsible to resolve/open Android content URIS.
E.g. chromaprint expects to be fed 16bit signed PCM
in native endianness, yet there was no check for this.
Similarly for other muxers. Use the new
FF_OFMT_FLAG_ONLY_DEFAULT_CODECS to enfore this where
appropriate, e.g. for pcm/raw muxers.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
AVOutputFormat has default codecs for audio, video and subtitle
and often these are the only codecs of this type allowed.
So add a flag to AVOutputFormat so that this can be checked generically.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Several muxers (e.g. pcm muxers) did not check the number
of streams even though the individual streams were not
recoverable from the muxed files. This commit changes
this by using the FF_OFMT_MAX_ONE_OF_EACH flag
where appropriate.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
More exactly: Not more than one stream of each type for which
a default codec (i.e. AVOutputFormat.(audio|video|subtitle)_codec)
is set; for those types for which no such codec is set (or for
which no designated default codec in AVOutputFormat exists at all)
no streams are permitted.
Given that with this flag set the default codecs become more important,
they are now set explicitly to AV_CODEC_ID_NONE for "unset";
the earlier code relied on AV_CODEC_ID_NONE being equal to zero,
so that default static initialization set it accordingly;
but this is not how one is supposed to use an enum.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This test muxes two streams into a single pcm file, although
the two streams are of course not recoverable from the output
(unless one has extra information). So use the streamhash muxer
instead (which also provides coverage for it; it was surprisingly
unused in FATE so far). This is in preparation for actually
enforcing a limit of one stream for the PCM muxers.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
If AVOutputFormat.video_codec, audio_codec or subtitle_codec
is AV_CODEC_ID_NONE, it means that there is no default codec
for this format and not that it is supported to mux AV_CODEC_ID_NONE.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Also use sizeof of the proper type, namely sizeof(**sd)
and not sizeof(*sd).
Reviewed-by: Jan Ekström <jeebjp@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
av_frame_side_data_get() has a const AVFrameSideData * const *sd
parameter; so calling it with an AVFramesSideData **sd like
AVCodecContext.decoded_side_data (or with a AVFramesSideData * const
*sd) is safe, but the conversion is not performed automatically
in C. All users of this function therefore resort to a cast.
This commit changes this: av_frame_side_data_get() is renamed
to av_frame_side_data_get_c(); furthermore, a static inline
wrapper for it name av_frame_side_data_get() is added
that accepts an AVFramesSideData * const * and converts this
to const AVFramesSideData * const * in a Wcast-qual safe way.
This also allows to remove the casts from the current users.
Reviewed-by: Jan Ekström <jeebjp@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The latter need not be save, because av_log() expects
to get a pointer to an AVClass-enabled structure
and not only a fake object. If this function were actually
be called in the following way:
const AVClass *avcl = avctx->av_class;
handle_mdcv(&avcl, );
the AVClass's item_name would expect to point to an actual
AVCodecContext, potentially leading to a segfault.
Reviewed-by: Jan Ekström <jeebjp@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This code uses the AVBPrint API for exactly one av_bprintf()
in a scenario in which a good upper bound for the needed
size of the buffer is available (with said upper bound being
much smaller than sizeof(AVBPrint)). So one can simply use
snprintf() instead. This also avoids the (always-false due to
the current size of the internal AVBPrint buffer) check for
whether the AVBPrint is complete.
Furthermore, the old code used AV_BPRINT_SIZE_AUTOMATIC
which implies that the AVBPrint buffer will never be
(re)allocated and yet it used av_bprint_finalize().
This has of course also been removed.
Reviewed-by: Jan Ekström <jeebjp@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
AVCodecContext extradata should be an AVCDecoderConfigurationRecord
when bitstream format is avcc. Simply concatenating the NALUs output
by x264_encoder_headers does not form a standard
AVCDecoderConfigurationRecord. The following cmd generates broken
file before the patch:
ffmpeg -i foo.mp4 -c:v libx264 -x264-params annexb=0 bar.mp4
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Many users mistakenly think that hwaccel is an instance of a decoder,
and cannot find the corresponding decoder name in the logs. Log
hwaccel name so user know hwaccel has taken effect.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
When the compiler chooses to inline put_amf_string(pb, ""),
the avio_write(pb, "", 0) can be avoided. Happens with
Clang-17 with -O1 and higher and GCC 13 with -O2 and higher
here.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These were defined in a way compatible with the Vulkan HEVC acceleration, which
expects bitmasks, yet the fields were being overwritting on each loop with the
latest read value.
Signed-off-by: James Almer <jamrial@gmail.com>
While ensuring it's at least C11, the minimum supported version.
Also, enforce C11 on the host compiler, same as we already do for C11 on the
target compiler.
Tested-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: James Almer <jamrial@gmail.com>
The newer of these two are the separate integers for content light
level, introduced in 3952bf3e98c76c31594529a3fe34e056d3e3e2ea ,
with X265_BUILD 75. As we already require X265_BUILD of at least
89, no further conditions are required.
Both of these two structures were first available with X264_BUILD
163, so make relevant functionality conditional on the version
being at least such.
Keep handle_side_data available in all cases as this way X264_init
does not require additional version based conditions within it.
Finally, add a FATE test which verifies that pass-through of the
MDCV/CLL side data is working during encoding.
These two were added in 28e23d7f348c78d49a726c7469f9d4e38edec341
and 3558c1f2e97455e0b89edef31b9a72ab7fa30550 for version 0.9.0 of
SVT-AV1, which is also our minimum requirement right now.
In other words, no additional version limiting conditions seem
to be required.
Additionally, add a FATE test which verifies that pass-through of
the MDCV/CLL side data is working during encoding.
The second part of this condition is intended to check whether the
current quantisation group is in the first CTU column of the current
tile. The issue is that ctb_to_col_bd gives the x-ordinate of the first
column of the current tile *in CTUs*, while xQg gives the x-ordinate of
the quantisation group *in samples*. Rectify this by shifting xQg by
ctb_log2_size to get xQg in CTUs before comparing.
Fixes FFVVC issues #201 and #203.
This muxer does not have the AVFMT_NOSTREAMS flag; therefore
it is checked generically that there is at least a stream.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Since e0da916b8f the ffmpeg utility has
held multiple frames output by the decoder in internal queues without
telling the decoder that it is going to do so. When the decoder has a
fixed-size pool of frames (common in some hardware APIs where the output
frames must be stored as an array texture) this could lead to the pool
being exhausted and the decoder getting stuck. Fix this by telling the
decoder to allocate additional frames according to the queue size.
In MPEG-2 user data, there can be different types of Closed Captions
formats embedded (A53, SCTE-20, or DVD). The current behavior of the
CC extraction code in the MPEG-2 decoder is to not be aware of
multiple formats if multiple exist, therefore allowing one format
to overwrite the other during the extraction process since the CC
extraction shares one output buffer for the normalized bytes.
This causes sources that have two CC formats to produce flawed output.
There exist real-world samples which contain both A53 and SCTE-20 captions
in the same MPEG-2 stream, and that manifest this problem. Example of symptom:
THANK YOU (expected) --> THTHANANK K YOYOUU (actual)
The solution is to pick only the first CC substream observed with valid bytes,
and ignore the other types. Additionally, provide an option for users
to manually "force" a type in the event that this matters for a particular
source.
Signed-off-by: Marth64 <marth64@proxyid.net>
PyTorch is an open source machine learning framework that accelerates
the path from research prototyping to production deployment. Official
website: https://pytorch.org/. We call the C++ library of PyTorch as
LibTorch, the same below.
To build FFmpeg with LibTorch, please take following steps as
reference:
1. download LibTorch C++ library in
https://pytorch.org/get-started/locally/,
please select C++/Java for language, and other options as your need.
Please download cxx11 ABI version:
(libtorch-cxx11-abi-shared-with-deps-*.zip).
2. unzip the file to your own dir, with command
unzip libtorch-shared-with-deps-latest.zip -d your_dir
3. export libtorch_root/libtorch/include and
libtorch_root/libtorch/include/torch/csrc/api/include to $PATH
export libtorch_root/libtorch/lib/ to $LD_LIBRARY_PATH
4. config FFmpeg with ../configure --enable-libtorch \
--extra-cflag=-I/libtorch_root/libtorch/include \
--extra-cflag=-I/libtorch_root/libtorch/include/torch/csrc/api/include \
--extra-ldflags=-L/libtorch_root/libtorch/lib/
5. make
To run FFmpeg DNN inference with LibTorch backend:
./ffmpeg -i input.jpg -vf \
dnn_processing=dnn_backend=torch:model=LibTorch_model.pt -y output.jpg
The LibTorch_model.pt can be generated by Python with torch.jit.script()
api. https://pytorch.org/tutorials/advanced/cpp_export.html. This is
pytorch official guide about how to convert and load torchscript model.
Please note, torch.jit.trace() is not recommanded, since it does
not support ambiguous input size.
Signed-off-by: Ting Fu <ting.fu@intel.com>
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
Otherwise the derived device and the source device might have different
PCI ID in a multiple-device system.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
At least on latest Win 11 and Visual Studio 2022, that DLL does not
exist anymore and can't be installed via any of the usual means.
However, debugging works just fine regardless, so this check makes
debugging impossible.
D3D11CreateDevice will fail anyway if debugging is not supported, so
let's rely on that instead.
When all cached frames are drained, the output mfxSyncPoint pointer is
NULL and MFX_ERR_MORE_DATA is returned, hence needn't print warning for
this expected behavior, otherwise the user might think the output from
qsv decoders are wrong.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
ff_aac_coder_init_mips() modifies a static const structure of
function pointers. This will crash if the binary uses relro
and is a data race in any case.
Furthermore it points to a maintainability issue: The
AACCoefficientsEncoder structures have been constified
in commit fd9212f2ed,
a Libav commit merged in 318778de9e.
Libav did not have the MIPS-specific AAC code and so this was
fine for them; yet FFmpeg had them, but this was not recognized.
Commit 75a099fc73 points to another
maintainability issue: Contrary to ordinary DSP code, this code
here is way more complex and needs to be constantly kept in sync
with the ordinary code which it mimicks and replaces. Said commit
is the only commit actually changing aaccoder.c in the last few
years and the same change has not been performed for the MIPS
clone; before that, it even happened several times that the mips
code was broken due to changes of the generic code (see commits
97437bd17a and
de262d018d or
860dbe0275 or
933309a6ca or
b65ffa316e). This might even lead
to scenarios where someone changing non-dsp aacenc code would
have to modify mips inline asm in order to keep them in sync.
This is obviously a significant burden (if the AAC encoder were
actively developed).
Finally, the code does not even compile here due to errors like
"Error: float register should be even, was 1".
Reviewed-by: Lynne <dev@lynne.ee>
Reviewed-by: Jean-Baptiste Kempf <jb@videolan.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These strings are so short (longest takes 11B) that using
pointers is wasteful. Avoiding them also moves hashdesc
into .rodata (from .data.rel.ro).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
In particular, test writing tags with odd strlen.
(These tags are zero-padded to even size.)
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Both GCC and Clang create code that inlines the loops in
next_input() and next_output() at high optimization
levels (presumably when there are not too many devices)
and this code leads to the creation of .got entries:
e7: 48 3b 3d 00 00 00 00 cmp 0x0(%rip),%rdi # ee <av_input_video_device_next+0xe>
ea: R_X86_64_REX_GOTPCRELX ff_alsa_demuxer-0x4
ee: 74 43 je 133 <av_input_video_device_next+0x53>
f0: 48 3b 3d 00 00 00 00 cmp 0x0(%rip),%rdi # f7 <av_input_video_device_next+0x17>
f3: R_X86_64_REX_GOTPCRELX ff_fbdev_demuxer-0x4
f7: 74 41 je 13a <av_input_video_device_next+0x5a>
These relocations can't be fixed up lateron when it is known
that the symbols exist in the same DSO.
This commit therefore marks these symbols as hidden, leading
to code like this:
f7: 48 8d 05 00 00 00 00 lea 0x0(%rip),%rax # fe <av_input_video_device_next+0xe>
fa: R_X86_64_PC32 ff_alsa_demuxer-0x4
fe: 48 39 c7 cmp %rax,%rdi
101: 74 55 je 158 <av_input_video_device_next+0x68>
103: 48 8d 05 00 00 00 00 lea 0x0(%rip),%rax # 10a <av_input_video_device_next+0x1a>
106: R_X86_64_PC32 ff_fbdev_demuxer-0x4
10a: 48 39 c7 cmp %rax,%rdi
10d: 74 50 je 15f <av_input_video_device_next+0x6f>
(Note: It is actually strange that the compiler creates code
that tries to read the addresses from the .got given that the
addresses can be read directly from indev_list/outdev_list.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Avoids .got entries for ff_iamf_scalable_ch_layouts and
ff_iamf_sound_system_map (whether they would have been
created otherwise depends upon the compiler and compiler
options).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
OptionDef.u is only an offset (i.e. its off member) iff OPT_FLAG_OFFSET
is true. Otherwise, the pointer arithmetic can be undefined behaviour.
UBSan warns about this (on 32bit arches):
src/fftools/ffmpeg_opt.c:102:15: runtime error: pointer index expression with base 0xffa4db10 overflowed to 0x56059a50
This commit fixes this by checking for OPT_FLAG_OFFSET first.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The documentation correctly states that the rdiv is a multiplier but incorrectly states the default behavior is to multiply by the sum of all matrix elements - it multiplies by 1/sum.
This changes the documentation to match the code.
Address trac #10889
Signed-off-by: Marton Balint <cus@passwd.hu>
Also make initialization/uninitialization behaviour more explicit in the docs,
and make sure we do not leak a channel map on error.
Signed-off-by: Marton Balint <cus@passwd.hu>
Also make use of the av_channel_from_string() function to determine the channel
id. This fixes some parse issues in av_channel_layout_from_string().
Signed-off-by: Marton Balint <cus@passwd.hu>
We lacked tests which supposed to fail, and there are some which should fail
but right now it does not. This will be fixed in a later commit.
Signed-off-by: Marton Balint <cus@passwd.hu>
Deduplicates a lot of code.
Some minor differences (mostly white space and inconsistent use of quotes) are
expected in the fate tests, there was no point aiming for exactly the same
formatting.
Signed-off-by: Marton Balint <cus@passwd.hu>
This makes the wav and pcm demuxer demux bigger packets, which is more
efficient.
As a side effect of the bigger packets, audio durations can become less exact
for command lines such as "ffmpeg -i $INPUT -c:a copy -t 1.0 $OUTPUT".
Signed-off-by: Marton Balint <cus@passwd.hu>
- Remove the 1024 cap on the number of samples, for high sample rate audio it
was suboptimal, calculate the low neighbour power of two for the number of
samples (audio blocks) instead.
- Make the function work correctly also for non-pcm codecs by using the stream
bitrate to estimate the target packet size. A previous version of this patch
used av_get_audio_frame_duration2() the estimate the desired packet size, but
for some codecs that returns the duration of a single audio frame regardless
of frame_bytes.
- Fallback to 4096/block_align*block_align if bitrate is not available.
Signed-off-by: Marton Balint <cus@passwd.hu>
All versions of MSVC that support C11 (namely >= v19.27)
also support the restrict keyword, therefore av_restrict
is no longer necessary since 75697836b1.
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
HAVE_FAST_UNALIGNED being true does not imply that
one can simply read from any pointer via *(long*).
It is undefined behaviour in case the pointer is not
sufficiently aligned; and even if it is, it is (likely)
a violation of the effective-type rules. Fix both
of these by using the appropriate AV_[RW]N macros.
Also, the current code used sizeof(long) as if this
were the CPU's native arithmetic size, but this is
not true on 64bit Windows. This has been fixed, too.
This affected huffyuv FATE-tests.
Tested-by: Sean McGovern <gseanmcg@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Instead store all the strings in one continugous string
(with internal \0) and use offsets to access the actual
substrings. This replaces the pointers to the strings
and therefore avoids relocations (and on x64, it actually
shrinks TiffGeoTagNameType by reusing padding to store
the offset field).
This saves 720B of .data.rel.ro and 1080B of .rela.dyn
(containing the relocation records) here while increasing
.rodata by 384B.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This simplifies the code for checking the output, and can print
the failing output (including a map of matching/mismatching
elements) if checkasm is run with the -v/--verbose option.
Signed-off-by: J. Dekker <jdek@itanimul.li>
Previously it only checked half the output in 8 bit per pixel mode,
as the output actually is 16 bit elements here.
Signed-off-by: J. Dekker <jdek@itanimul.li>
The first 32 elements of each row were correct, while the
last 16 were scrambled.
This hasn't been noticed, because the checkasm test erroneously
only checked half of the output (for 8 bit functions), and
apparently none of the samples as part of "fate-hevc" seem to
trigger this specific function.
Signed-off-by: J. Dekker <jdek@itanimul.li>
Muxing multiple streams to raw files is allowed but the packets are
interleaved, so the output is dependant of packet size.
Signed-off-by: Marton Balint <cus@passwd.hu>
The samples I found all have 2000 sample packets, and by forcing the packet
size with a bsf we could automagically make muxing work for packets containing
more than 3640 samples.
Signed-off-by: Marton Balint <cus@passwd.hu>
This is easily possible with an X macro.
Using an enum for the offsets also allows to remove
two arrays which are not really needed and will typically
be optimized away by the compiler: The first just exists
to count the number of syntax elements*, the second one
exists to get offset[CONSTANT]. These constants were
of type enum SyntaxElement and this enum was only used
in hevc_cabac.c (although it was declared in hevcdec.h);
it is now no longer needed at all and has therefore been
removed.
The first of these arrays led to a warning from Clang
which is fixed by this commit:
warning: variable 'num_bins_in_se' is not needed and will
not be emitted [-Wunneeded-internal-declaration]
*: One could also just added a trailing SYNTAX_ELEMENT_NB
to the SyntaxElement enum for this purpose.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This allows using WRAPPED_AVFRAME encoders with loopback decoders in
order to connect multiple filtergraphs together.
Clear the flag in muxers, since lavf does not need it for anything and
it would change the results of framecrc FATE tests.
Encoder timebase is equal to the frame timebase, so does not need to be
passed separately.
Also, rename in_picture to frame, which is shorter and more accurate -
it always contains a frame, never a field.
These functions used to be passed directly to pthread_create(), which
required them to return void*. This is no longer the case, so they can
return a plain int.
Current callstack looks like this:
* ifilter_bind_ist() (filter) calls ist_filter_add() (demuxer);
* ist_filter_add() opens the decoder, and then calls
dec_add_filter() (decoder);
* dec_add_filter() calls ifilter_parameters_from_dec() (i.e. back into
the filtering code) in order to give post-avcodec_open2() parameters
to the filter.
This is unnecessarily complicated. Pass the parameters as follows
instead:
* dec_init() (which opens the decoder) returns post-avcodec_open2()
parameters to its caller (i.e. the demuxer) in a parameter-only
AVFrame
* the demuxer passes these parameters to the filter in
InputFilterOptions, together with other filter options
The first of these binds inputs of complex filtergraphs to demuxer
streams (with a misleading comment claiming it *creates* complex
filtergraphs).
The second ensures that all filtergraph outputs are connected to an
encoder.
Merge them into a single function, which simplifies the ffmpeg_filter
API, is shorter, and will also be useful in following commits.
Also, rename misleadingly-named init_input_filter() to
fg_complex_bind_input().
Rename dec_open to dec_init(), as it is more descriptive of its new
purpose.
Will be useful in following commits, which will add a new path for
opening decoders.
Treat it analogously to stream parameters like format/dimensions/etc.
This is functionally different from previous code in 2 ways:
* for non-CFR video, the frame timebase (set by the decoder) is used
rather than the demuxer timebase
* for sub2video, AV_TIME_BASE_Q is used, which is hardcoded by the
subtitle decoding API
These changes should avoid unnecessary and potentially lossy timestamp
conversions from decoder timebase into the demuxer one.
Changes the timebases used in sub2video tests.
The 's', ';', and '<' itemtypes are used for audio and video by
Gophernicus and Gopher+.
Signed-off-by: Christian Lee Seibold <christian.seibold32@outlook.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
This muxer solely exists to test the fifo muxer via a dedicated
test tool in libavformat/tests/fifo_muxer.c. It fulfills no
other role and it is only designed with this role in mind.
The latter can be seen in two facts: The muxer uses printf
for logging and it simply presumes the packets' data to contain
a FailingMuxerPacketData (a struct duplicated in fifo_test.c
and tests/fifo_muxer.c.); in particular, it presumes packets
to have data at all, but this need not be true with side-data
only packets and a segfault can easily be triggered by e.g.
encoding flac (our native encoder sends a side-data only packet
with updated extradata at the end of encoding).
This patch fixes this by moving the test muxer into the fifo
test tool, making it inaccessible via the API (and actually
removing it from libavformat.so and libavformat.a).
While this muxer was accessible via e.g. av_guess_format(),
it was not really usable for an API user as FailingMuxerPacketData
was not public. Therefore this is not considered a breaking change.
In order to continue to use the test muxer in the test tool,
the ordinary fifo muxer had to be overridden: fifo_muxer.c
includes lavf/fifo.c but with FIFO_TEST defined which makes
it support the fifo_test muxer. This is possible because
test tools are always linked statically to their respective
library.
Reviewed-by: Stefano Sabatini <stefasab@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Add an AVOption to the libjxl encoder wrapper, which exposes the flag
uses_original_profile in libjxl. For highly unusual ICC profiles where
the target needs to stay in the original space, this can be useful.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
The formal title of the muxer according to the specification
is "RCWT (Raw Captions With Time)", so canonize this
in the long name of the codec and docs.
In the documentation section, point #2 was wrong: ccextractor
extracts the Closed Captions data and stores normalized bits
similarly to this muxer.
Signed-off-by: Marth64 <marth64@proxyid.net>
For unknown geokey values, get_geokey_val() returns
"Unknown-%d" with val being used for %d. This string
is allocated and therefore all the known geokey values
(static strings) are strdup'ed. In case this fails
it is either ignored or treated as "Unknown-%d".
(Furthermore it is possible to call av_strdup(NULL),
although this is not documented to be legal.)
This commit changes this by only returning the static strings
in get_geokey_val(); the unknown handling and strdup'ing
is moved out of it.
Reviewed-by: Stefano Sabatini <stefasab@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Apparently it can happen that avfilter_graph_request_oldest() returns
EAGAIN, yet av_buffersrc_get_nb_failed_requests() returns 0 for every
input.
Works around #10795, though the root issue is most likely in the
scale2ref filter.
This is thankfully passed through verbatim by libdav1d, so we can parse
it in our own code.
In theory, taking the DV profile from the packet-level configuration
struct is redundant since there is currently only one possible DV level
for AV1 (and all others would fail parsing), but this marginally
future-proofs it against possible new AV1-specific profiles being added
in the future.
At present, consume_update evaluates timeline state on all links for
a multi-input filter. This can lead to the filter being incorrectly
en/dis-abled when evaluation on a frame on a secondary link leads to
a different result than the frame on the current main link next in
line for processing.
This currently builds files in the libavcodec/x86/{vvc,h26x}
subdirectories, which is somewhat unexpected when building for
another architecture than x86.
The regular arch subdirectories are handled with
-include $(SRC_PATH)/$(1)/$(ARCH)/Makefile
in the toplevel Makefile. Switch this to a similar optional
inclusion, using $(ARCH).
Signed-off-by: Martin Storsjö <martin@martin.st>
In some builds, the following object files could be left behind
after make clean:
./libavfilter/metal/utils.o
./libavfilter/metal/vf_yadif_videotoolbox.metallib.o
./libavcodec/x86/h26x/h2656dsp.o
./libavcodec/neon/mpegvideo.o
./ffbuild/bin2c_host.o
Fixes: http://trac.ffmpeg.org/ticket/10895
Signed-off-by: Martin Storsjö <martin@martin.st>
Fixes: signed integer overflow: -93000000 - 9223372036839000000 cannot be represented in type 'long'
Fixes: 64546/clusterfuzz-testcase-minimized-ffmpeg_dem_CONCAT_fuzzer-5110813828186112
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
src/libavfilter/internal.h:255:45: note: passing argument to parameter 'filter' here
int ff_filter_config_links(AVFilterContext *filter);
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
If the time code side data is overridden by the packet level, we also
make sure not to update `out->metadata` to a mismatched timecode.
Signed-off-by: Anton Khirnov <anton@khirnov.net>
The signature of these wrappers is more complicated due to a need to
distinguish between "failed allocating side data" and "side data was
already present".
Signed-off-by: Anton Khirnov <anton@khirnov.net>
Native access is from the code that declared the options, foreign access
is from code that is using the options. Forbid foreign access to
AVOption.offset/default_val, for which there is no good reason, and
which should allow us more freedom in extending their semantics in a
compatible way.
Replace the opt_size() function, currently only called from
av_opt_copy(), with
* a constant array of element sizes
* a function that signals whether an option type is POD (i.e.
memcpyable) or not
Will be useful in following commits.
The longest string here takes four bytes, so using an array
of pointers is wasteful even when ignoring the cost of relocations;
the lack of relocations also implies that this array
will now be put into .rodata and not into .data.rel.ro.
Static asserts are used to ensure that all strings are always
properly zero-terminated.
Tested-by: Marth64 <marth64@proxyid.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of these structures coincide.
It has the advantage of allowing to remove AVHWFramesInternal
from the public header; given that AVHWFramesInternal.priv is no more,
most accesses to AVHWFramesInternal are no more; indeed, the only
field accessed of it outside of hwcontext.c is the internal frame pool,
making this commit very simple.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is no longer used by any hwcontext, as they all allocate
their private data together with their public data and access
it via AVHWFramesContext.hwctx.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to D3D12VAFramesContext as one no longer has to
go through AVHWFramesInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of these structures coincide.
It has the advantage of allowing to remove the AVHWDeviceInternal
from the public header; given that AVHWDeviceInternal.priv is no more,
all accesses to it happen in hwcontext.c, so that this commit moves
the joint structure there.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is no longer used by any hwcontext, as they all allocate
their private data together with their public data and access
it via AVHWDeviceContext.hwctx.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to D3D12VADevicePriv as one no longer has to
go through AVHWDeviceInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This commit does for AVInputFormat what commit
59c9dc82f4 did for AVOutputFormat:
It adds a new type FFInputFormat, moves all the internals
of AVInputFormat to it and adds a now reduced AVInputFormat
as first member.
This does not affect/improve extensibility of both public
or private fields for demuxers (it is still a mess due to lavd).
This is possible since 50f34172e0
(which removed the last usage of an internal field of AVInputFormat
in fftools).
(Hint: tools/probetest.c accesses the internals of FFInputFormat
as well, but given that it is a testing tool this is not considered
a problem.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This reverts commits fd5aa93a37
and cf00f60bab
("avcodec/kbdwin: Support arbitrary sized windows").
The change in question has only been made for libavradio.
in anticipation of merging it into the main tree. This has
not happened, so this commit reverts the changes to kbdwin
that are not used for anything else. In particular, these
functions are no longer exported (as avpriv functions);
notice that the fixed-point function has been exported
despite having never been used outside of lavc.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is only used by hq_hqa.c, so said header can simply
be included there.
Also move the code to initialize the VLCs to hq_hqa.c.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These fields are set for all Vulkan decoding hwaccels;
they would be useless if it were different.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Only three of the 226 (== AV_CODEC_ID_AV1) entries
have been used. Unsparsing this table is especially
important given that this array lives in .data.rel.ro.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
All the fields of FFVkCodecMap are either decoder-only
or encoder-only (with the latter being unused and unset for now).
Yet there is already a per-decoder struct containing
static information about these decoders, namely
VkExtensionProperties.
This commit merges the decoder-parts of FFVkCodecMap
with the VkExtensionProperties into a common structure.
Given that FFVkCodecMap is now unused, it is removed.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to VTFramesContext as one no longer has to
go through AVHWFramesInternal.
Tested-by: Jan Ekström <jeebjp@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Components and pieces are side data specific fields and there's a variable
amount of them.
They also need to be identified in some form, so print a type too.
Reviewed-by: Stefano Sabatini <stefasab@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Do not propagate the return value of avformat_write_header(),
as it contains the information whether the output had
already been initialized in avformat_init_output(),
but this is set generically; the return value of
FFOutputFormat.write_header is not documented at all
(and is currently ignored if >= 0), but it is more future-proof
to simply return 0 on success.
Reviewed-by: Liu Steven <lq@chinaffmpeg.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is intended to avoid -Wformat= warnings on systems
where %s might not be supported (and also generally emitted
by GCC with -pedantic).
Reviewed-by: Liu Steven <lq@chinaffmpeg.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is undefined behaviour to use a different type for a call
than the actual type of the function.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This separates the URL-layer adjacent parts of the code
from the parts that are also usable with custom IO.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to QSVFramesContext as one no longer has to
go through AVHWFramesInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to QSVDeviceContext as one no longer has to
go through AVHWDeviceInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to D3D11VAFramesContext as one no longer has to
go through AVHWFramesInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to DXVA2FramesContext as one no longer has to
go through AVHWFramesInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Some encoders, like flac, propagate updated extradata at the end of encoding
as packet side data. Use it to update the relevant codec_config.
Signed-off-by: James Almer <jamrial@gmail.com>
If it's the primary item, then it's expected to be ready for presentation even
outside of the grid it belongs to.
Signed-off-by: James Almer <jamrial@gmail.com>
Doxygen only associates this comment with "pts_correction_num_faulty_pts",
causing it to display incorrectly.
Signed-off-by: Andrew Sayers <ffmpeg-devel@pileofstuff.org>
Doxygen eats the newline in the first comment, making it harder to read.
Join the lines and add a comma, so source and documentation are equally readable.
Doxygen only associates the second comment with cust_dec.
The comments for cust_dec and cust_tab make perfect sense without it,
so downgrade it to a non-doxygen "//" comment.
The third comment was missed by the command in the previous commit,
because it (correctly but uniquely) doesn't have a trailing comma.
Signed-off-by: Andrew Sayers <ffmpeg-devel@pileofstuff.org>
Use AVHWFramesContext.hwctx instead.
This simplifies access to VDPAUFramesContext as one no longer has
to go through AVHWFramesInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to VDPAUDeviceContext as one no longer has to
go through AVHWDeviceInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
libavcodec/vvc/vvc_inter.c:823:18: runtime error: signed integer overflow: 1426128896 + 1426128896 cannot be represented in type 'int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior libavcodec/vvc/vvc_inter.c:823:18
Suggested-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The following cases should set bs to 1:
If the prediction modes are not the same.
If both prediction modes are MODE_IBC, but the motion vector delta is larger than 8 of 1/16 pixels.
see 8.8.3.5
How to reproduce it:
vvencapp -i sintel_trailer_2k_1080p24.y4m --preset fast --additional "IBC=1" -o sintel.266
ffmpeg -i sintel.266 -f md5 -
md5 will mismatch
Found-by: 6ws at https://github.com/ffvvc/FFmpeg/issues/187#issuecomment-1962842135
Frame-level side data attributes are printed with the same key/value
structure as packet-level side data attributes, but this is not
reflected in the XSD.
It is only used by ffprobe (once) and ffplay (twice);
inlining it avoids including it unnecessarily into ffmpeg.
Reviewed-by: Stefano Sabatini <stefasab@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Rebased on top of recently merged fixes (should apply correctly now).
In merged DVD patch, -pgc and -pg options were broken. While these are
rather advanced options, they are the only means to get content for
some strangely authored discs.
Signed-off-by: Marth64 <marth64@proxyid.net>
This makes the code much simpler (especially for adding support
for other instruction set extensions), avoids needing inline
assembly for this feature, and generally is more of the canonical
way to do this.
The CPU feature detection was added in
493fcde50a, using HWCAP_CPUID.
The argument for using that, was that HWCAP_CPUID was added much
earlier in the kernel (in Linux v4.11), while the HWCAP flags for
individual features always come later. This allows detecting support
for new CPU extensions before the kernel exposes information about
them via hwcap flags.
However in practice, there's probably quite little advantage in this.
E.g. HWCAP2_I8MM was added in Linux v5.10 - long after HWCAP_CPUID,
but there's probably very little practical cases where one would
run a kernel older than that on a CPU that supports those instructions.
Additionally, we provide our own definitions of the flag values to
check (as they are fixed constants anyway), with names not conflicting
with the ones from system headers. This reduces the number of ifdefs
needed, and allows detecting those features even if building with
userland headers that are lacking the definitions of those flags.
Also, slightly older versions of QEMU, e.g. 6.2 in Ubuntu 22.04,
do expose support for these features via HWCAP flags, but the
emulated cpuid registers are missing the bits for exposing e.g. I8MM.
(This issue is fixed in later versions of QEMU though.)
Signed-off-by: Martin Storsjö <martin@martin.st>
And move compute_ref_coefs() to its only user: lpc.c
There is no overlap between the users of compute_lpc_coefs()
and lpc proper.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It leads to defines for the AAC decoder being included
outside of the AAC decoder.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The mpegts code historically tries to strip (the first) metadata access unit
header from synchronous KLV metadata, but the detection for such streams was
unreliable causing strips of asynchronous metadata or ID3 as well.
MISB ST 1402 specifies required stream type, stream id and registration
descriptor (which eventually maps to the codec ID) so let's use all of these
for reliable detection.
Fixes a regression caused by 468615f204.
Fixes ticket #10828, #10883.
Signed-off-by: Marton Balint <cus@passwd.hu>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to OpenCLFramesContext as one no longer has to
go through AVHWFramesInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to OpenCLDeviceContext as one no longer has to
go through AVHWDeviceInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
To do so, concatenate all the names together to one big string
name1\0name2\0....lastname\0\0. This avoids the pointer in
the FunctionLoadInfo structure and thereby moves vk_load_info
into .rodata (and makes it smaller by 888B).
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Saves 16B per entry here (four of these 16 bytes are padding);
leads to 1776 B of savings in each file that uses
ff_vk_load_functions().
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
There are three possible names for the functions requested;
they only differ in an extension: "", "EXT" or "KHR".
Yet vk_load_info contained pointers to all these strings.
This is wasteful and this commit changes it to avoid
the latter two strings. This saves 6353B of strings,
1776 B of .data.rel.ro as well as 5328 B due to the removed
relocations (corresponding to 2 * 111 removed pointers)
in lavc/vulkan_decode.o alone (ff_vk_load_functions()
is inlined in lavfi/vulkan_filter.c, lavu/hwcontext_vulkan.c
and lavc_vulkan_decode.c, so the savings are three times
this for shared builds; for static builds, the number may
be smaller depending upon whether strings are deduplicated).
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
ff_vk_codec_map currently is an array indexed by AVCodecID;
it has AV_CODEC_ID_FIRST_AUDIO (= 65536) entries, but uses
only three of them; only 24B of 1MiB were actually used
This commit fixes this by adding an AVCodecID field to the table
and making it non-sparse.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The inner AVInputFormat* of the inner mpegps-demuxer
is only used once (in avformat_open_input()), so
don't even store it. In fact, just use ff_mpegps_demuxer
directly, as this demuxer has a configure dependency
on it.
Reviewed-by: Marth64 <marth64@proxyid.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The wav demuxer by default tried to demux 4096-byte packets which caused
packets with very few number of samples for files with high channel count.
This caused a significant overhead especially since the latest ffmpeg.c
threading changes.
So let's use a similar approach for selecting audio frame size which is already
used in the PCM demuxer, which is to read 25 times per second but at most 1024
samples.
Signed-off-by: Marton Balint <cus@passwd.hu>
Since PTS is changed randomly for every audio frame, it matters. Also add some
forgotten filter dependencies.
Signed-off-by: Marton Balint <cus@passwd.hu>
Depending on input chunk size noticable corrpution was hearable, here is an
example command line:
ffplay -f lavfi -i "sine=440:r=8000:samples_per_frame=32,aresample=24000:filter_size=1:phase_shift=0"
Fix this by rounding the fixed point fractions up instead of down.
Signed-off-by: Marton Balint <cus@passwd.hu>
GEN=1 is used to generate reference files in the source tree, not
auto-generated reference samples.
Without this patch GEN=1 could overwrite the auto generated reference files
in each test where they are used causing failures.
Signed-off-by: Marton Balint <cus@passwd.hu>
We typically are only interesed in a single type of metadata set, so it is
better to keep them separated instead of always filtering for them.
Also use av_dynarray_add for increasing their array.
Signed-off-by: Marton Balint <cus@passwd.hu>
UUIDs do not have to be unique if their type sets them apart, so avoid using
AnyType, since we are only interested in specific types.
Signed-off-by: Marton Balint <cus@passwd.hu>
There is no MMX DSP code for VVC, so one can use the stricter
declare_func which also tests that we are not in MMX mode
at the end of this function.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Otherwise vvc_intra.o gets pulled in by the vvc_mc checkasm
test and it in turn pulls vvc_ctu.o and then the rest of vvcdec
and lavc in. Besides being bad size-wise this also has the downside
that it pulls in avpriv_(cga|vga16)_font from libavutil which are
marked as being imported from another library when building
libavcodec as a DLL and this breaks checkasm because it links
both lavc and lavu statically.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Otherwise svq1enc.o gets pulled in by the svq1encdsp checkasm
test and it in turn pulls the rest of lavc in.
Besides being bad size-wise this also has the downside that
it pulls in avpriv_(cga|vga16)_font from libavutil which are
marked as being imported from another library when building
libavcodec as a DLL and this breaks checkasm because it links
both lavc and lavu statically.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Otherwise aacenc.o gets pulled in by the aacencdsp checkasm
test and it in turn pulls the rest of lavc in.
Besides being bad size-wise this also has the downside that
it pulls in avpriv_(cga|vga16)_font from libavutil which are
marked as being imported from another library when building
libavcodec as a DLL and this breaks checkasm because it links
both lavc and lavu statically.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This BSF is supposed to be used in conjunction with mp3_header_compress,
which has been removed more than ten years ago in commit
c6080d8900. It mangled the headers
by removing the CRC field as well as fields that are supposed
to stay constant for the entirety of a stream (which are put into
extradata). This made these files unplayable; they need to be
decompressed with the BSF first (which does not happen automatically).
Even in this case the CRC does not get restored.
I am not aware that such compressed files exist at all; therefore
this commit removes the BSF completely.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to VulkanFramesPriv as one no longer has to
go through AVHWFramesInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to VulkanDevicePriv as one no longer has to
go through AVHWDeviceInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Use AVHWFramesContext.hwctx instead.
This simplifies accesses to VDPAUFramesContext as one no longer has
to go through AVHWFramesInternal.
Tested-by: Timo Rothenpieler <timo@rothenpieler.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Correct the names of the format-specific headers (not hwframe_*.h)
and clarify that the user shall ignore this field if there is no
public context associated with it.
In particular, this allows to use this field for the private context
alone if there is no public context. This can't break conforming
API users, because they always have to live with the possibility
that a new public context has been introduced.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Without ff_vk_exec_discard_deps which is called by ff_vk_exec_wait,
the reference count of hwframe context cannot reach zero due to
circular reference created by ff_vk_exec_add_dep_frame.
Fix#10873
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
It was always intended that the buffers returned by
RefStruct shall have the same alignment guarantees
as the buffers returned by av_malloc(); said alignment
depends upon the arch and the enabled instruction set
and the code used STRIDE_ALIGN as a proxy for this.
Yet since 7945d30e91
there is a better way to get av_malloc's alignment:
ALIGN_64 in mem_internal.h. So use this.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Cloning a packet whose source is going to be unreferenced
immediately afterwards is wasteful.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This decoder uses av_image_copy() to copy decoded images
to buffers obtained via ff_get_buffer(); ergo it can handle
user-provided buffers.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This decoder implements the receive_frame API; such decoders
always have to set the pkt_dts field themselves and the avcodec
test checks for this.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Fixes errors when opening streams with no extradata (like those from raw OBU
sources). It also calls get_format() on new Sequence Headers when required.
Signed-off-by: James Almer <jamrial@gmail.com>
Commit 900ce6f8c3 replaced
IntraX8Context.ac_vlc by IntraX8Context.ac_vlc_table,
but forgot to update an av_assert2()*.
cf7ed01938 then
replaced this with a check for j_ac_vlc[mode],
but this makes no sense as j_ac_vlc is of type
const VLCElem [2][2][8][].
Worse yet, mode can be up to three and then
j_ac_vlc[mode] is undefined behaviour. This happened
during the wmv8-x8intra FATE test.
*: Since 84f16bb5e6
config.h was no longer auto-included in avassert.h
and this disabled av_assert1() and av_assert2()
in files where config.h has not been included before
the inclusion of avassert.h.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
So that can show OBU info even it doesn't have decomposed content. And
add OBU content status into the message.
Signed-off-by: Fei Wang <fei.w.wang@intel.com>
Benched using single-threaded full decode on an Ampere Altra.
Bpp Before After Speedup
8 73,3s 65,2s 1.124x
10 114,2s 104,0s 1.098x
12 125,8s 115,7s 1.087x
Signed-off-by: J. Dekker <jdek@itanimul.li>
FFmpeg has instances of DECLARE_ALIGNED(32, ...) in a lot of structs,
which then end up heap-allocated.
By declaring any variable in a struct, or tree of structs, to be 32 byte
aligned, it allows the compiler to safely assume the entire struct
itself is also 32 byte aligned.
This might make the compiler emit code which straight up crashes or
misbehaves in other ways, and at least in one instances is now
documented to actually do (see ticket 10549 on trac).
The issue there is that an unrelated variable in SingleChannelElement is
declared to have an alignment of 32 bytes. So if the compiler does a copy
in decode_cpe() with avx instructions, but ffmpeg is built with
--disable-avx, this results in a crash, since the memory is only 16 byte
aligned.
Mind you, even if the compiler does not emit avx instructions, the code
is still invalid and could misbehave. It just happens not to. Declaring
any variable in a struct with a 32 byte alignment promises 32 byte
alignment of the whole struct to the compiler.
This patch limits the maximum alignment to the maximum possible simd
alignment according to configure.
While not perfect, it at the very least gets rid of a lot of UB, by
matching up the maximum DECLARE_ALIGNED value with the alignment of heap
allocations done by lavu.
It is undefined behaviour.
Fixes many failed tests with UBSan and GCC 13 like
"src/libavformat/mov.c:4229:44: runtime error: store to address
0x5572abe20f80 with insufficient space for an object of type 'struct
MOVIndexRange'"
(The line number does not refer to the line where &entry[-1]
is assigned.)
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The AAC decoders share no common code with the AAC encoder,
so they are not restricted to using the same structures.
This implies that one can use different structs for each
component and remove elements not used by the decoders/
the encoder. This leads to quite sizeable savings:
sizeof(ChannelElement) for the encoder went down to 134432B
here from 547552B; for the decoder it went down to 512800B.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is decoder-only; furthermore, there is already
an AACContext in use by libfdk-aacenc.
Also make aacdec.h provide the typedef for AACContext;
up until now, this has been done by sbr.h.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Common symbols are not part of ISO-C and therefore not used
by FFmpeg at all. Yet linker warnings to ensure it stays
that way are nevertheless wrong, because the existence of
common symbols does not imply that there is a bug in our code.
More precisely, Clang's ASAN implementation uses a common symbol
___asan_globals_registered when used on Elf targets with
the -fsanitize-address-globals-dead-stripping option;
said option is the default since Clang 17 [1].
This leads to 1883 warnings about ___asan_globals_registered
when linking here.
(Even without that option there were warnings like
_ZN14__interception10real_vforkE being overridden.)
Said warning is also unnecessary: The proper way to ensure
that our code is free of common symbols is to let the compiler
enforce this. But this is already the default since GCC 10
and Clang 11, so there is no risk of introducing our own
common symbols.
[1]: https://reviews.llvm.org/D152604
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Export each tile as its own stream, and the grid information as a Stream Group
of type TILE_GRID.
This also enables exporting other stream items like thumbnails, which may be
present in non tiled HEIF images too. For those, the primary stream will be
tagged with the default disposition.
Based on a patch by Swaraj Hota
Signed-off-by: James Almer <jamrial@gmail.com>
Previously, the following syntax elements were not read in the case
sps_num_subpics_minus is 0:
* sps_subpic_id_len_minus1
* sps_subpic_id_mapping_explicitly_signalled_flag
* sps_subpic_id_mapping_present_flag
* sps_subpic_id[i]
This was causing failures to decode bitstreams, for example the DVB's
"VVC HDR UHDTV1 OpenGOP 3840x2160 50fps HLG10 PiP" V&V bitstream.
Patch fixes this by moving the reads for these syntax elements out a
scope.
Signed-off-by: Frank Plowman <post@frankplowman.com>
Signed-off-by: James Almer <jamrial@gmail.com>
The spec says "the value of sps_num_subpics_minus1 shall be in the
range of 0 to MaxSlicesPerAu − 1, inclusive."
Signed-off-by: Frank Plowman <post@frankplowman.com>
Signed-off-by: James Almer <jamrial@gmail.com>
The SEI message read/write functions are called
via function pointers where the SEI message-specific
context is passed as void*. But the actual function
definitions use a pointer to their proper context
in place of void*, making the calls undefined behaviour.
Clang UBSan 17 warns about this.
This commit fixes this by adding wrapper functions
(created via macros) that have the right type that
call the actual functions. This reduced the number of failing
FATE tests with UBSan from 164 to 85 here.
Reviewed-by: Mark Thompson <sw@jkqxz.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Previously to support dynamic reconfigurations of the matrix string (e.g. 0m),
the rdiv values would always be cleared to 0.f, causing the rdiv to be
recalculated based on the new filter. This however had the side effect of
always ignoring user specified rdiv values.
Instead float user_rdiv[0] is added to ConvolutionContext which will store the
user specified rdiv values. Then the original rdiv array will store either the
user_rdiv or the automatically calculated 1/sum.
This fixes trac ticket #10294, #10867.
Signed-off-by: Stone Chen <chen.stonechen@gmail.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
av_get_sample/pix_fmt() return their respective enums
and are therefore not of the type int (*)(const char*),
yet they are called as-if they were of this type.
This works in practice, but is actually undefined behaviour.
With Clang 17 UBSan these violations are flagged, affecting lots
of tests. The number of failing tests went down from 3363 to 164
here with this patch.
Reviewed-by: Mark Thompson <sw@jkqxz.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The calls to the sei_decoded_picture_hash read and write functions
are performed with four pointer arguments; just because one
of them is unused by the callees does not mean that they
can be omitted: This is undefined behaviour.
(This was not recognized because the SEI_MESSAGE_RW macro
contains casts.)
Reviewed-by: Mark Thompson <sw@jkqxz.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Jpeg2000 decoder is decoding in native endian, so let's use the same workaround
as in fate-mxf-probe-applehdr10.
Fixes ticket #10868.
Signed-off-by: Marton Balint <cus@passwd.hu>
This is the proper poll mode for waiting for an incoming connection according
to the SRT API docs.
Fixes ticket #9142.
Signed-off-by: Marton Balint <cus@passwd.hu>
For IBC, we'll utilize the check_available function.
However, neither MVP nor merge mode need to check the motion estimation region.
Let's rename it to avoid confusion.
Intra Block Copy relies on reconstructed pixels from the current frame.
We skip IBC during the inter prediction stage and handle it during the reconstruction stage.
An Intra Block Copy clip may use different modes for luma and chroma.
For example, MODE_IBC for luma and MODE_INTRA for chroma.
We have to check luma and chroma CuPredMode (cpm) separately.
AV_CODEC_CAP_VARIABLE_FRAME_SIZE has been set for alac_at encoder,
which means avctx->frame_size should be zero. However, alac_at
encoder also set avctx->frame_size. This leading to assert failure
in ffmpeg_sched.c
av_assert0(enc->sq_idx[0] >= 0);
Actually, the implementation of audiotoolboxenc.c doesn't support
frame_size been zero.
Fix#10720.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
The output of TX is extremely verbose and makes it harder to find other debug
log messages, so print most of it at trace level.
Signed-off-by: James Almer <jamrial@gmail.com>
Using av_bprint_init_for_buffer() avoids copying data
into the internal AVBPrint buffer (or worse: to allocate
a temporary buffer in case the internal buffer does not
suffice).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Using av_bprint_init_for_buffer() avoids copying data
into the internal AVBPrint buffer (or worse: to allocate
a temporary buffer in case the internal buffer does not
suffice).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Using av_bprint_init_for_buffer() avoids copying data
into the internal AVBPrint buffer (or worse: to allocate
a temporary buffer in case the internal buffer does not
suffice). It also ensures that the data is always
0-terminated.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Using av_bprint_init_for_buffer() avoids copying data
into the internal AVBPrint buffer (or worse: to allocate
a temporary buffer in case the internal buffer does not
suffice). It also ensures that the data is always
0-terminated, whereas the current code never does this
and returns success in case the length of the
string to write and the size of the buffer coincide.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Contrary to the existing "fate-checkasm", this always prints the
tool output, and runs all tests at once instead of splitting it up
per target group. This is more useful when the user expects to
look directly at the tool output, instead of being part of a full
fate run.
(On failure with the regular "make fate-checkasm" targets, none of
the tool output is printed, but stored in files. If run with reporting
set up to the FATE website, the individual failures are uploaded there,
but if it is run in some sort of other CI setup, the intermediate files
might not be available afterwards for inspection.)
Signed-off-by: Martin Storsjö <martin@martin.st>
This callback is optional and should therefore only be set
if it actually does something.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Fixes: out of array access.
Earlier code assumes that a unscaled bayer to yuvj420 converter exists
but the later code then skips yuvj420
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The `entries` value is read directly from the stream and used to
allocate memory. This change clamps `entries` to however many are
possible in the remaining atom or file size (whichever is smallest).
Fixes https://crbug.com/1429357
Signed-off-by: Dale Curtis <dalecurtis@chromium.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Do not construct the name manually from input file/stream indices.
This is a step towards avoiding the assumption that filtergraph inputs
are always fed by demuxers.
The computation is based on demuxer properties, so that is the more
appropriate place for it. Filter code just receives the desired
start time/duration.
Inside a function an extra ';' is a null statement;
but outside of it it simply must not happen.
So remove them.
Reviewed-by: Nuo Mi <nuomi2021@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These arrays are currently accessed via uint16_t* pointers
although nothing guarantees their alignment. Furthermore,
this is problematic wrt the effective-type rules.
Fix this by using a union of arrays of uint8_t and uint16_t.
Reviewed-by: Nuo Mi <nuomi2021@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
max_bin_idx can be at most LMCS_MAX_BIN_SIZE - 1 here,
so pivot[LCMS_MAX_BIN_SIZE + 1] may be accessed,
but pivot has only LCMS_MAX_BIN_SIZE + 1 elements
(unless the values of pivot were so that it is always
assured that pivot[LCMS_MAX_BIN_SIZE] is always < sample
(which it is iff it is always < 2^bit_depth - 1)).
So reorder the checks.
Reviewed-by: Nuo Mi <nuomi2021@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It used to be used with preallocated packet buffers with
the old encode API, but said API is no more and therefore
there is no reason for this to be public any more.
So deprecate it and use an internal replacement
for the encoders using it as an upper bound for the
size of their headers.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to VAAPIFramesContext as one no longer has to
go through AVHWFramesInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is possible because the lifetime of both coincide.
Besides reducing the number of allocations this also simplifies
access to VAAPIDeviceContext as one no longer has to
go through AVHWDeviceInternal.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Also fixes a Clang warning:
"overlapping comparisons always evaluate to false
[-Wtautological-overlap-compare]"
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
By default the option "flv_metadata" (internally using the field
name "trust_metadata") is set to 0, meaning that we don't allocate
streams based on information in the metadata, only based on
actual streams we encounter. However the "datastream" metadata field
still would allocate a subtitle stream.
When muxing, the "datastream" field is added if either a data stream
or subtitle stream is present - but the same metadata field is used
to preemtively create a subtitle stream only. Thus, if the field
was added due to a data stream, not a subtitle stream, the demuxer
would create a stream which won't get any actual packets.
If there was such an extra, empty subtitle stream, running
avformat_find_stream_info still used to terminate within reasonable
time before 3749eede66. After that
commit, it no longer would terminate until it reaches the max
analyze duration, which is 90 seconds for flv streams (see
e6a084641a,
24fdf7334d and
f58e011a1f).
Before that commit (which removed the deprecated AVStream.codec), the
"st->codecpar->codec_id = AV_CODEC_ID_TEXT", set within the demuxer,
would get propagated into st->codec->codec_id by numerous
avcodec_parameters_to_context(st->codec, st->codecpar), then further
into st->internal->avctx->codec_id by update_stream_avctx within
read_frame_internal in libavformat/utils.c (demux.c these days).
Signed-off-by: Martin Storsjö <martin@martin.st>
if there's an audio layer with a single stream that can be rendered alone, mark it
as default. Otherwise, mark every stream as dependent.
Signed-off-by: James Almer <jamrial@gmail.com>
The call to ff_exp2fi() here always uses arguments in the normal
range, so that the branches in ff_exp2fi() are unnecessary.
This is so because JPEG2000 itself only supports up to
128 bits per component per pixel (we only support far less);
furthermore, expn is always 0..31 for the decoder and also
sane for the encoder, so that the difference between these
two values is always in the normal range of -126..128.
Reviewed-by: Tomas Härdin <git@haerdin.se>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It allows the compiler to combine two reads and writes of adjacent
32bit memory locations into 64bit read-writes.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Use ff_avg_pixels16_mmxext or ff_avg_pixels16_sse2
(for users with SSE2_FAST) instead.
This also allows to remove ff_avg_pixels16_mmx,
as this was its last remaining user.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
By using AnyType for resolving a strong reference we searched among all types,
not just the ones which can be the target of the reference, which in some cases
caused to find the wrong type, if the metadata set UUIDs were not unique.
UUIDs do not have to be unique if their type sets them apart, SMPTE 377M says:
> StrongRef: 'One to One’ relationship between sets and implemented in MXF
> with UUIDs. Strong References are typed which means that the definition
> identifies the kind of set which is the target of the reference.
Fixes ticket #10865.
Signed-off-by: Marton Balint <cus@passwd.hu>
Surprisingly the return value of add_param_definition()
(a pointer) has only been used to check for success
and not to actually access the pointee; nonsuccess
was equated with ENOMEM, although there is a non-enomem
error path in this function.
Change this by returning an int.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
IAMFAudioElement and IAMFMixPresentation currently contain
pointers to independently allocated objects that are sometimes
owned by said structures and sometimes not.
More precisely, upon success the demuxer transfers ownership
of these other objects newly created AVStreamGroups, but it
keeps its pointers. iamf_read_close() therefore always resets
these pointers (because the cleanup code always treats them
as ownership pointers). This leads to memory leaks in case
iamf_read_header() without having attached all of these
objects to stream groups.
The muxer has a similar issue: It also clears these pointers
(pointing to objects owned by stream groups created by the user)
in its deinit function.
This commit fixes this memleak by explicitly adding non-ownership
pointers; this also allows to remove the code to reset the
ownership pointers.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Checking whether a pointer to an element of an array is NULL
makes no sense, as the pointer addition involved in getting
the address would be undefined behaviour already if the array
were NULL.
In this case the array allocation has already been checked
a few lines before.
Fixes Coverity issue #1559548.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The old layout happened to be a native layout and therefore missed some
recently fixed layout parsing bugs.
Signed-off-by: Marton Balint <cus@passwd.hu>
It's a read only exported option, and not meant to be set by the user.
Also, move it to MPEGTS_OPTIONS while at it to avoid duplication.
Signed-off-by: James Almer <jamrial@gmail.com>
The configure check already had fallback for the previous version
of glslang, which had different requirements for flags.
This commit simply moves the flags needed for glslang 13 to the
fallback, while first trying to use new flags for glslang 14.
This drops support for ~3 year old glslang versions, which
I'm not sure had the complete C API we're using anyway.
The filename is freed with the OptionsContext and therefore
there will be a use-after-free when reporting the filename
in print_stream_maps(). So create a copy of the string.
This is a regression since 8aed3911fc.
fate-lavf-mkv_attachment exhibits it (and reports a random nonsense
filename here), but this does not make the test fail (not even with
valgrind; only with ASAN, as it aborts on use-after-free).
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Avoids ugly casts when uninitializing.
(One could actually avoid allocating this separately if one
were willing to expose FFFramePool to those files including
link_internal.h.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Also make FFFilterGraph.sink_links a FilterLinkInternal**
because sink_links is used to access FilterLinkInternal
fields.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
(These fields were in AVFilterGraph although AVFilterGraphInternal
existed for years.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
To do this, allocate AVFilterGraphInternal jointly with AVFilterGraph
and rename it to FFFilterGraph in the process (similarly to
AVStream/FFStream).
The AVFilterGraphInternal* will be removed on the next major version
bump.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This commit moves the generic-layer stuff (that is not used
by filters) to a new header of its own, similarly to
5e7b5b0090 for libavcodec.
thread.h and link_internal.h are merged into this header.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
To do this, allocate AVFilterInternal jointly with AVFilterContext
and rename it to FFFilterContext in the process (similarly to
AVStream/FFStream).
The AVFilterInternal* will be removed from AVFilterContext
on the next major bump.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These logs use the wrong loglevel and are uninformative;
and it is of course highly unlikely that a buffer of 56B
can't be allocated.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This fixes the previous commit and adds more cases (DCT-I and DST-I).
I am holding off on defining a scale parameter for FFTs as I'd like
to use a complex value for them.
If a custom layout is equivalent to a native one, check if it matches one of the
known layout names and print that instead.
Signed-off-by: James Almer <jamrial@gmail.com>
This together with adjusting the inclusion define allows for the
build to not fail with latest Vulkan-Headers that contain the
stabilized Vulkan AV1 decoding definitions.
Compilation fails currently as the AV1 header is getting included
via hwcontext_vulkan.h -> <vulkan/vulkan.h> -> vulkan_core.h, which
finally includes vk_video/vulkan_video_codec_av1std.h and the decode
header, leading to the bundled header to never defining anything
due to the inclusion define being the same.
This fix is imperfect, as it leads to additional re-definition
warnings for things such as
VK_STD_VULKAN_VIDEO_CODEC_AV1_DECODE_SPEC_VERSION. , but it is
not clear how to otherwise have the bundled version trump the
actually standardized one for a short-term compilation fix.
A lot of changes and fixes to channel layout parsing, notably
- get rid of dynamic allocation of channel positions
- signal unimplemented speaker positions as unknown instead of failure, but
warn the user about it
- native order, and that a single channel only appears once was always assumed
for less than 64 channels, obviously this was incorrect
Signed-off-by: Marton Balint <cus@passwd.hu>
It should be available in all relevant modern compilers and will allow
us to use features like anonymous unions.
Note that stdatomic.h is still emulated on MSVC, as current versions
require the /experimental:c11atomics, and do not support
ATOMIC_VAR_INIT() anyway.
According to ITU-T H.265 7.4.2.1 this byte sequence should not appear in a
NAL unit but in practice in rare cases it seems it does, possibly due to buggy
encoders. Other players like VLC and Quicktime seem to be fine with it.
Currently when this sequence is found it is treated as if the next start code
has been found and the NAL unit gets truncated.
This change limits the code to only look for first start code 0x0000001 or
first escape 0x000003.
Sadly i can't share the original source file with the issue but the first
80 bytes of the NAL unit looks like this:
│00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f│0123456789abcdef│
0x00000│00 00 00 01 02 01 d0 bc 57 a1 b8 44 70 01 00 0b│........W..Dp...│
0x00010│80 2e 00 c2 6c ec 3e b9 e3 03 fb 91 2e d2 43 cb│....l.>.......C.│
0x00020│1d 2c 00 00 02 00 02 00 5c 93 72 6f 31 76 18 00│.,......\.ro1v..│
0x00030│08 38 aa b1 4c 33 3f fd 08 cb 77 9b d4 3c db 02│.8..L3?...w..<..│
0x00040│a2 04 73 15 75 de 3b c4 67 c0 8f ca ad 31 f1 99│..s.u.;.g....1..│
Signed-off-by: Anton Khirnov <anton@khirnov.net>
I generated a DXV2 file with an interesting alpha channel using
Adobe Media Encoder 2015 and compared decoding it using Resolume Alley
and ffmpeg. Similarly to DXV3 files, Alley expects premultiplied alpha
and ffmpeg matches its decoding more closely when it does the same.
Reference file: https://connorworley.com/dxv2-dxt5.mov
Existing FATE tests for DXV2 files do not cover this change.
Signed-off-by: Connor Worley <connorbworley@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
This change decouples the frame dimensions from avctx, which is useful
for DXV decoding, and fixes incorrect behavior in the existing
implementation.
Tested with `make fate THREADS=7` and
`make fate THREADS=7 THREAD_TYPE=slice`.
Signed-off-by: Connor Worley <connorbworley@gmail.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
We currently mostly do not empty the MMX state in our MMX
DSP functions; instead we only do so before code that might
be using x87 code. This is a violation of the System V i386 ABI
(and maybe of other ABIs, too):
"The CPU shall be in x87 mode upon entry to a function. Therefore,
every function that uses the MMX registers is required to issue an
emms or femms instruction after using MMX registers, before returning
or calling another function." (See 2.2.1 in [1])
This patch does not intend to change all these functions to abide
by the ABI; it only does so for ff_simple_idct_mmx, as this
function can by called by external users, because it is exported
via the avdct API. Without this, the following fragment will
assert (on x86_32, as ff_simple_idct_mmx is not used on x64):
int16_t __attribute__ ((aligned (16))) block[64];
AVDCT *dct = avcodec_dct_alloc();
dct->idct_algo = FF_IDCT_AUTO;
avcodec_dct_init(dct);
dct->idct(block);
av_assert0_fpu();
[1]: https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/intel386-psABI-1.1.pdf
Reviewed-by: Kieran Kunhya <kierank@obe.tv>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The PPS should be used instead of the SPS to get the current picture's
dimensions. Using the SPS can cause issues if the resolution changes
mid-sequence. In particular, it was leading to invalid memory accesses
if the resolution decreased.
Patch replaces sps->{width,height} with pps->{width,height}. It also
removes sps->{width,height}, as these are no longer used anywhere.
Fixes crash when decoding DVB V&V test sequence
VVC_HDR_UHDTV1_ClosedGOP_Max3840x2160_50fps_HLG10_res_change_without_RPR
Signed-off-by: Frank Plowman <post@frankplowman.com>
This exposes VP8E_SET_SCREEN_CONTENT_MODE option from libvpx.
Co-authored-by: Erik Språng <sprang@webrtc.org>
Signed-off-by: Dariusz Marcinkiewicz <darekm@google.com>
Signed-off-by: James Zern <jzern@google.com>
Make av_tx_init() agree with documentation:
* Real to complex and complex to real DFTs.
* For the float and int32 variants, the scale type is 'float', while for
* the double variant, it's a 'double'. If scale is NULL, 1.0 will be used
* as a default.
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Peter Ross <pross@xvid.org>
This hack is used to limit the visibility of some AVFilterLink fields to
only certain files. Replace it with the same pattern that is used e.g.
in lavf AVStream/FFStream and avoid exposing these internal fields in a
public header completely.
Makes it robust against adding fields before it, which will be useful in
following commits.
Majority of the patch generated by the following Coccinelle script:
@@
typedef AVOption;
identifier arr_name;
initializer list il;
initializer list[8] il1;
expression tail;
@@
AVOption arr_name[] = { il, { il1,
- tail
+ .unit = tail
}, ... };
with some manual changes, as the script:
* has trouble with options defined inside macros
* sometimes does not handle options under an #else branch
* sometimes swallows whitespace
avfilter_insert_filter() already reports (also with AV_LOG_VERBOSE)
when a filter is auto-inserted.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
fc->ref points to an old VVCFrame, which cannot be used after frame_context_setup.
This prevents crashes in decode_nal_units-->ff_vvc_report_frame_finished.
Signed-off-by: Frank Plowman <post@frankplowman.com>
The avctx passed to avcodec_string() may have unset coded dimensions, as is the
case when called by av_dump_format() where the streams had all the needed
information at the container level, and as such no frames were decoded
internally.
Signed-off-by: James Almer <jamrial@gmail.com>
Besides simplifying address computations (it saves 432B of .text
in hevcdsp.o alone here) it also fixes undefined behaviour that
occurs if mx or my are 0 (happens when the filters are unused)
because they lead to an array index of -1 in the old code.
This happens in the checkasm-hevc_pel FATE-test.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Reviewed-by: Nuo Mi <nuomi2021@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Otherwise, passing an UNSPECIFIED frame to am MPEG-only filter graph
would trigger insertion of an unnecessary vf_scale filter, which would
perform a memcpy to convert between the two.
This is safe to do because unspecified YUV frames are already
universally assumed to be MPEG range, in particular by swscale.
Currently, this only affects untagged RGB/XYZ/Gray, which get forced to
their corresponding metadata before entering the filter graph. The main
justification for this change, however, is the planned ability to add
automatic promotion of unspecified yuv to mpeg range yuv.
Notably, this change will never allow accidentally cross-promoting
unspecified to jpeg or to a specific YUV matrix, since that is still
bound by the constraints of YUV range negotiation as set up by
query_formats.
Since 7bf1b9b357,
the test produces ordinary \n, yet this is not what the reference
file used for the most time, leading to test failures.
Reviewed-by: Martin Storsjö <martin@martin.st>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
When this filter overrides frame properties, the outgoing frames have
a different YUV colorspace than the incoming ones. This requires
signalling the new colorspace on the outlink, and in particular, making
sure it's *not* set to a common ref with the input - otherwise the point
of this filter would be destroyed.
Untouched fields will continue being passed through, so we don't need to
do anything there.
The currently used pointer when unmapping DXVA2 and D3D11
actually points to an OpenCLDeviceContext.
Reviewed-by: Mark Thompson <sw@jkqxz.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The existing (and upcoming) available group types are meant to combine several
streams for presentation, with the result being treated as if it was a stream
itself.
For example, a file could export two stream groups of the same type with one of
them as the "default".
Signed-off-by: James Almer <jamrial@gmail.com>
Previously, we produced output with either \r\n or mixed line endings.
This was undesirable unto itself, but also made working with patches affecting
FATE output particularly challenging, especially via the mailing list.
Everything that consumes the SSA/ASS format is line-ending-agnostic,
so \n is selected to simplify git/ML usage in FATE.
Extra \r characters at the end of a packet are dropped. These are always
ignored by the renderer anyway.
fbw_channels must be > 0 as teh code is only run if cpl_enabled is set and that requires mode >= AC3_CHMODE_STEREO
CID 718138 Uninitialized scalar variable
assumes this assert to be false
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
For a corrupted stream, the value of nalu_len read from the extradata is not reliable.
We need to perform additional checks
Fixes: fuzzer timeout
Fixes: 65253/clusterfuzz-testcase-minimized-ffmpeg_BSF_VVC_MP4TOANNEXB_fuzzer-4972412487467008
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These chunks contain the Content Light Level Information and the
Mastering Display Color Volume information that FFmpeg already supports
as AVFrameSideData. This patch adds support for the png encoder to save
this metadata as the corresponding chunks in the PNG stream.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
These chunks contain the Content Light Level Information and the
Mastering Display Color Volume information that FFmpeg already supports
as AVFrameSideData. This patch adds support for the png decoder to read
these chunks if present and attach the corresponding side data to the
decoded frame.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
The function signature for bytestream2_seek is (gb, offset, whence);
Before this patch, the code passed (gb, SEEK_SET, offset), which is
incorrect.
Siged-off-by: Leo Izen <leo.izen@gmail.com>
vc1_hwaccel_pixfmt_list_420 is referenced even if
!(CONFIG_WMV3IMAGE_DECODER || CONFIG_VC1IMAGE_DECODER) so move it out
of the #if block.
Signed-off-by: Akihiko Odaki <akihiko.odaki@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The channel designation metadata should not override the number of channels.
Let's warn the user if it is inconsistent, and keep the channel layout
unspecified.
Before the conversion to the channel layout API the code only set the mask, but
never overridden the channel count, so this restores the old behaviour.
Signed-off-by: Marton Balint <cus@passwd.hu>
Existing code could have caused wrong channel order signalling or reduced
channel count if a channel designation appeared multiple times. This is
actually an old bug, but the conversion to the new channel layout API made it
visible, because now the code overrides the proper channel count with the one
calculated from the mask.
Signed-off-by: Marton Balint <cus@passwd.hu>
The new API requires an extra array member at the very end,
which old API users did not do.
This disables in-place RDFT transforms and instead
does the transform out of place by copying once, there shouldn't
be a significant loss of speed as our in-place FFT requires a reorder
which is likely more expensive in the majority of cases to do.
These inline implementations of AV_COPY64, AV_SWAP64 and AV_ZERO64
are known to clobber the FPU state - which has to be restored
with the 'emms' instruction afterwards.
This was known and signaled with the FF_COPY_SWAP_ZERO_USES_MMX
define, which calling code seems to have been supposed to check,
in order to call emms_c() after using them. See
0b1972d409,
29c4c0886d and
df215e5758 for history on earlier
fixes in the same area.
However, new code can use these AV_*64() macros without knowing
about the need to call emms_c().
Just get rid of these dangerous inline assembly snippets; this
doesn't make any difference for 64 bit architectures anyway.
Signed-off-by: Martin Storsjö <martin@martin.st>
The previous assumption that DXV needs to be aligned to 16x16 was
erroneous. 4x4 works just as well, and FATE decoder tests pass for all
texture formats.
On the encoder side, we should reject input that isn't 4x4 aligned,
like the HAP encoder does, and stop aligning to 16x16. This both solves
the uninitialized reads causing current FATE tests to fail and produces
smaller encoded outputs.
With regard to correctness, I've checked the decoding path by encoding a
real-world sample with git master, and decoding it with
ffmpeg -i dxt1-master.mov -c:v rawvideo -f framecrc -
The results are exactly the same between master and this patch.
On the encoding side, I've encoded a real-world sample with both master
and this patch, and decoded both versions with
ffmpeg -i dxt1-{master,patch}.mov -c:v rawvideo -f framecrc -
Under this patch, results for both inputs are exactly the same.
In other words, the extra padding gained by 16x16 alignment over 4x4
alignment has no impact on decoded video.
Signed-off-by: Connor Worley <connorbworley@gmail.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
SDL supports only these three matrices. Actually, it only supports these
three combinations: BT.601+JPEG, BT.601+MPEG, BT.709+MPEG, but we have
no way to restrict the specific *combination* of YUV range and YUV
colorspace with the current filter design.
See-Also: https://trac.ffmpeg.org/ticket/10839
Instead of an incorrect conversion result, trying to play a YCgCo file
with ffplay will simply error out with a "No conversion possible" error.
An oversight in my previous series. This omission slipped under the
radar because fftools/ffmpeg_filter.c did not use these options, instead
preferring to insert an explicit format filter.
FFmpeg has instances of DECLARE_ALIGNED(32, ...) in a lot of structs,
which then end up heap-allocated.
By declaring any variable in a struct, or tree of structs, to be 32 byte
aligned, it allows the compiler to safely assume the entire struct
itself is also 32 byte aligned.
This might make the compiler emit code which straight up crashes or
misbehaves in other ways, and at least in one instances is now
documented to actually do (see ticket 10549 on trac).
The issue there is that an unrelated variable in SingleChannelElement is
declared to have an alignment of 32 bytes. So if the compiler does a copy
in decode_cpe() with avx instructions, but ffmpeg is built with
--disable-avx, this results in a crash, since the memory is only 16 byte
aligned.
Mind you, even if the compiler does not emit avx instructions, the code
is still invalid and could misbehave. It just happens not to. Declaring
any variable in a struct with a 32 byte alignment promises 32 byte
alignment of the whole struct to the compiler.
This patch limits the maximum alignment to the maximum possible simd
alignment according to configure.
While not perfect, it at the very least gets rid of a lot of UB, by
matching up the maximum DECLARE_ALIGNED value with the alignment of heap
allocations done by lavu.
Forgot to do this with the previous commit.
Actually makes the assembly being used.
Still the fastest FFT in the world, 15% faster than FFTW on the
largest available size.
The demuxer opens an internal parser instance in read_timestamp(), which
requires a codec context. There is no need for it to access the FFStream
one which is used for other purposes, it can allocate its own internal
one.
This check has survived the transition to AVCodecParameters, but is no
longer relevant after it, since the codec context is no longer updated
or accessed at all from the demuxer.
Resetting the counter of used elements is enough as nothing is
ever read from the currently unused elements.
Reviewed-by: Marth64 <marth64@proxyid.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The rcwt muxer uses several counters for how much data
it has already cached: One byte counter and one counter
for how many complete blocks (of three bytes each).
These counters can become inconsistent when the muxer is
fed incomplete blocks as the muxer presumes that it is
about to write a new block at the start of each write_packet
call. E.g. sending 65535*3+1 1-byte packets (with data[0] e.g. 0x03)
will trigger an out-of-bounds write.
This patch fixes this by processing the data in complete blocks
only. This also allows to simplify the code, e.g. to remove one of
the counters.
Reviewed-by: Marth64 <marth64@proxyid.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
They are not intended for decoders (for which there is the get_format
callback in case the user has a choice).
Also note that the list was wrong for MPEG4, because it did not contain
the high bit depth pixel formats used for studio profiles.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
They are not intended for decoders (for which there is the get_format
callback in case the user has a choice).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is currently called once in the codecs' init function
and once when (re)initializing the VC-1 decode context
(which happens upon frame size changes as well as before
decoding the first frame). The first one is unnecessary
now that vc1_decode_frame() no longer requires avctx->hwaccel
to be already set for hwaccel to work properly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
VC-1 uses a 0x03 escaping scheme like H.26x and our decoder
unescapes data for this purpose, but hardware accelerations
just want the data as-is and therefore get fed the original
data. The pointers to the actual data are only setcorrectly
if avctx->hwaccel is set (after all, they are only used in
this case).
There are two problems with this: The first is that the branch
is pointless; the second is that it is harmful, because
a hardware acceleration may be added after the packet has been
parsed (in case there is a reconfiguration e.g. due to frame
size changes) in which case decoding the first few frames
won't work.
So delete these branches.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It is entirely unnecessary to use it given that all decoders
here share the same set of supported pixel formats. So just
hardcode this list.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
AVCodec.pix_fmts is only intended for encoders (decoders use
the get_format callback to let the user choose a pix fmt).
So remove them for the decoders for which this is possible
without further complications; keep them for now in the codecs
that actually use them (by passing avctx->codec->pix_fmts to
ff_get_formatt()).
Also notice that some of these lists were wrong; e.g.
317b7b06fd added support for YUV444P16
for cuviddec, but forgot to add it to pix_fmts.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This bug causes the DXT5 decoder to produce incorrect block texture data.
After the fix, textures are visually correct and match data decoded by
Resolume Alley (extracted with Nvida Nsight for comparison). Current FATE DXT5
samples did not cover this case.
Signed-off-by: Connor Worley <connorbworley@gmail.com>
DXV files seem to misnomer DXT5 and really encode DXT4 with
premultiplied alpha. At least, this is what Resolume alley does.
To check, encode some input with alpha as "Normal Quality, With Alpha"
in Alley, then decode the output with this change -- results are true
to the original input compared to git-master.
Signed-off-by: Connor Worley <connorbworley@gmail.com>
The runtime doesn't set the frame type to MFX_FRAMETYPE_IDR on the
returned mfx bitstream for a keyframe, it set the frame type to
MFX_FRAMETYPE_I only. This patch added workaround for VP9 keyframe to
make the coded stream seekable.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
The VC1_D2010 profile, also known as VC1_VLD2010, has the same functionality
and specification as the VC1_D profile. Support for this profile serves only
as a positive indication that the accelerator has been designed with awareness
of the modifications specified in the August 2010 version of this specification.
Hardware accelerator drivers that expose support for this profile must not
also expose the previously specified VC1_D GUID, unless the accelerator works
properly with existing software decoders that use VC1_D and that do not incorporate
the corrections added to the August 2010 version of this specification.
As a result, we could give VC1_VLD2010 a higher priority and initialize
it first.
Signed-off-by: Aleksoid <Aleksoid1978@mail.ru>
Signed-off-by: Tong Wu <tong1.wu@intel.com>
Ensure all hwaccels that allocate a buffer use NVDECContext->bitstream_internal
instead. Otherwise, if FFHWAccel->end_frame() isn't called before
FFHWAccel->uninit(), an attempt to free a stale pointer to memory not owned by
the hwaccel could take place.
Reviewed-by: Timo Rothenpieler <timo@rothenpieler.org>
Signed-off-by: James Almer <jamrial@gmail.com>
This can be achieved by using AV_OPT_TYPE_FLAGS instead of
AV_OPT_TYPE_STRING. It also avoids strcmp() when accessing
the option.
Reviewed-by: Stefano Sabatini <stefasab@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It uses the int64_t instead of the double member.
(This code can currently not be reached: av_opt_get() calls
av_opt_find2() with NULL as unit in which case AV_OPT_TYPE_CONST
options are never returned, leading av_opt_get() to always
return AVERROR_OPTION_NOT_FOUND when searching for AV_OPT_TYPE_CONST*.
For the same reason the code read_number() will never be called
from get_number() when searching for an option of type
AV_OPT_TYPE_CONST. The other callers of read_number() also only
call it with types other than AV_OPT_TYPE_CONST.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
We don't need to include fifo.h, because we don't need AVFifo
as a complete type. Also add the other used headers directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Besides being extremly simple this also avoids including
ff_ccfifo_ccdetected() unnecessarily (it is only used by decklink).
This is possible because this is not avpriv, but duplicated into
lavd if necessary.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The muxer's AVCodecContext is currently used for exactly one thing:
To store a time base in it that has been derived via heuristics
in avformat_transfer_internal_stream_timing_info(); said time base
can then be read back via av_stream_get_codec_timebase().
But one does not need a whole AVCodecContext for that, a simple
AVRational is enough.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For muxers, the internal AVCodecContext is basically unused
except in avformat_transfer_internal_stream_timing_info()
(which sets time_base and ticks_per_frame) and
av_stream_get_codec_timebase() (a getter for time_base).
This makes ticks_per_frame write-only, so don't set it.
Also remove an always-false check for the AVCodecContext's
codec_tag.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
info level will be too noisy if several instances of the decoder are fired
at the same time, as will be the case with tiled AVIF.
Signed-off-by: James Almer <jamrial@gmail.com>
info level will be too noisy if several instances of the decoder are fired
at the same time, as will be the case with tiled AVIF.
Signed-off-by: James Almer <jamrial@gmail.com>
If the number of CTUs reduces between one picture and the next, the
slice_idx table is reduced in size in the frame_context_for_each_tl call
on vvcdec.c:321. When initialising the slice_idx table on vvcdec.c:325,
the old code uses fc->tab.sz.ctu_count when calculating the table size.
fc->tab.sz.ctu_count holds the old ctu count at this point however, not
being updated to hold the new ctu count until vvcdec.c:342. This causes
an out-of-bounds write.
Patch fixes the problem by using pps->ctb_count, which was just used
when allocating the table, in place of fc->tab.sz.ctu_count when
initialising the table.
Signed-off-by: Frank Plowman <post@frankplowman.com>
Instead of overriding the frame properties in fill_picture(), advertise
the supported YUV colorspace and range at format negotiation time. (The
correct metadata will now be set automatically by ff_get_video_buffer)
This makes all ff_draw_* based filters aware of YUV colorspaces and
ranges. Needed for YUVJ removal. Also fixes a bug where e.g. vf_pad
would generate a limited range background even after conversion to
full-scale grayscale.
The FATE changes were a consequence of the aforementioned bugfix - the
gray scale files are output as full range (due to conversion by
libswscale, which hard-codes gray = full), and appropriately tagged as
such, but before this change the padded version incorrectly used
a limited range (16) black background for these formats.
libjxl doesn't support negative strides, but JPEG XL has an orientation
flag inside the codestream. We can use this to work around the library
limitation, by taking the absolute value of the negative row stride,
sending the image up-side-down, and telling the library that the image
has a vertical-flip orientation.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Once it became fully non-transparent and service of shady practices
behind closed doors, I can not be here any more.
Signed-off-by: Paul B Mahol <onemda@gmail.com>
Currently, when writing PCMA or PCMU tracks with FLV or RTMP, the
stereo flag and sample rate flag inside RTMP audio messages are
overridden, making impossible to distinguish between mono and stereo
tracks. This patch fixes the issue by restoring the same flag mechanism
of all other codecs, that takes into consideration the right channel
count and sample rate.
Signed-off-by: Alessandro Ros <aler9.dev@gmail.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
Distinguish between a decoder being entirely missing and a decoder which
requires features which are not present in the incomplete implementation
in libavcodec and therefore can't be used.
Yadif filter assumed that the output timebase is always half of the input
timebase. This is not true if halving the input time base is not representable
as an AVRational causing the output timestamps to be invalidly scaled in such a
case.
So let's use av_reduce instead of av_mul_q when calculating the output time
base and if the conversion is inexact then let's fall back to the original
timebase which probably makes more parctical sense than using x/INT_MAX.
Fixes invalidly scaled pts_time values in this command line:
ffmpeg -f lavfi -i testsrc -vf settb=tb=1/2000000000,yadif,showinfo -f null none
Signed-off-by: Marton Balint <cus@passwd.hu>
Fixes parsing small timebases from expressions (where the expression API
converts the result to double), like in this command line:
ffprobe -f lavfi -i testsrc=d=1,settb=1/2000000000 -show_streams -show_entries stream=time_base
Before the patch timebase was parsed as 1/1999999999.
Signed-off-by: Marton Balint <cus@passwd.hu>
The current criterion is to check for the existence of
update_thread_context. Change this to check for whether
we are actually decoding VP8 (and not VP7 or VP8-in-WebP).
This is equivalent to the current criterion, but allows
the WebP decoder to evolve and to get its own update_thread_context.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
VP8-in-WebP only uses key frame encoding (see [1]), yet this
is currently not enforced. This commit does so in order to
make output reproducible with frame-threading as the VP8 decoder's
update_thread_context is not called at all when using decoding
VP8-in-WebP (as this is unnecessary for key frame-only streams).
[1]: https://developers.google.com/speed/webp/docs/riff_container
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
"When aps_params_type is equal to ALF_APS or SCALING_APS, the value of aps_adaptation_parameter_set_id shall be
in the range of 0 to 7, inclusive.
When aps_params_type is equal to LMCS_APS, the value of aps_adaptation_parameter_set_id shall be in the range of 0
to 3, inclusive."
Fixes: out of array accesses
Fixes: 65932/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_VVC_fuzzer-4563412340244480
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
There are 6 deprecated ISO language codes that are still valid for DVDs.
This patch allows avlanguage to recognize them correctly. The codes are:
(1) "in" - legacy code for Indonesian, mapped to the modern code
(2) "iw" - legacy code for Hebrew, mapped to the modern code
(3) "ji" - legacy code for Yiddish, mapped to the modern code
(4) "jw" - legacy code for Javanese, published and used as a typoed version of "jv"
(5) "mo" - legacy code for Moldavian, mapped to the inclusive code
(6) "sh" - legacy code for Serbo-Croatian, no modern inclusive code so it is left alone
All of this can be verified from several sources including:
https://en.wikipedia.org/wiki/List_of_ISO_639_language_codes
Signed-off-by: Marth64 <marth64@proxyid.net>
Use avio_get_dyn_buf()+ffio_free_dyn_buf() instead of
avio_close_dyn_buf()+av_free(). This saves an allocation
(and memcpy) in case all the data fits in the AVIOContext's
write buffer.
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It implements BSD-specific support for very old analog capture cards,
which are highly unlikely to be useful today. After being added in 2005,
there were never any commits to it beyond compilation fixes and generic
maintenance. There have also been zero trac tickets for this device, and
the only related web search result I found concludes that it does not
work.
The code also does some unacceptable things, like messing with signal
handlers and storing its state in global variables.
This commit lets ffplay properly propagate YUV metadata into the filter
graph, avoiding such issues as e.g. accidentally passing YCgCo into a
filter that can't support it. Also fixes an error related to this
missing metadata from buffersrc (since commit 2d555dc82d)
See-Also: https://trac.ffmpeg.org/ticket/10839
Fixes a regression since d9fed9df2a, where the single animated stream would
be exported twice as two independent streams.
Signed-off-by: James Almer <jamrial@gmail.com>
The video param change check will print loglines below debug level
for each frame which is different from the inlink parameters. This
can spam the console. It is now printed at warning level once for
each param change else it is kept at debug level.
Partially addresses #10823
Fixes segfaults when trying to map a group index with a value equal to the
amount of groups in the file.
Signed-off-by: James Almer <jamrial@gmail.com>
Check that vps_each_layer_is_an_ols_flag, which indicates that "at
least one OLS specified by the VPS contains more than one layer," is
set if num_multi_layer_olss is non-zero.
Fixes: 65160/clusterfuzz-testcase-minimized-ffmpeg_BSF_VVC_METADATA_fuzzer-4665241535119360
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Frank Plowman <post@frankplowman.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Previously, the demuxer would register decoder with the scheduler, using
InputStream as opaque, and pass the scheduling index to the decoder.
Now the registration is done by the decoder itself, using DecoderPriv as
opaque, and the scheduling index is returned to demuxer from dec_open().
decoder_thread() then no longer needs to be accessed from outside of
ffmpeg_dec and can be made static.
It is done based on demuxer information, so that is the more appropriate
place for this code.
This is a step towards decoupling Decoder and InputStream.
* as this decision is based on demuxing information, move it from the
decoder to the demuxer
* as the issue being addressed is latency added by frame threading, we
only need to disable frame threading, not all threading
Similar to what is currently done for other components, e.g. (de)muxers.
There is nothing in the public part currently, but that will change in
future commits.
Use 8 packets/frames by default rather than 1, which seems to provide
better throughput.
Allow -thread_queue_size to set the muxer queue size manually again.
It presumably exists because HapContext contains an AVClass*.
Yet AVClass is actually defined in log.h and even this inclusion
can be avoided by struct AVClass*. This avoids opt.h inclusions
in hap.c and hapdec.c.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
ff_texturedspenc_init() doesn't support most of the function types
supported for decoding; add a separate context containing only pointers
for the actually supported types.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The old code set a common struct from each thread;
this only "worked" (but is still UB) because the values
written are the same for each thread.
Fix this by moving the assignments to the main thread.
(This also avoids casting const away from a const AVFrame*.)
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Use class confidence instead of box_score to filt boxes, which is more
accurate. Class confidence is obtained by multiplying class probability
distribution and box_score.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
For detect and classify output, width and height make no sence, so
change width, height to dims to represent the shape of tensor. Use
layout and dims to get width, height and channel.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
Now when using openvino backend, user doesn't need to set input/output
names in command line. Model ports will be automatically detected.
For example:
ffmpeg -i input.png -vf \
dnn_detect=dnn_backend=openvino:model=model.xml:input=image:\
output=detection_out -y output.png
can be simplified to:
ffmpeg -i input.png -vf dnn_detect=dnn_backend=openvino:model=model.xml\
-y output.png
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
nal->skipped_bytes_pos contains the positions of errors relative to the
start of the slice header, whereas the position they were tested against
is relative to the start of the slice data, i.e. one byte after the end
of the slice header.
Patch fixes this by storing the size of the slice header in H266RawSlice
and adding it to the position given by the GetBitContext before
comparing to skipped_bytes_pos. This fixes AVERROR_INVALIDDATAs for
various valid bitstreams, such as the LMCS_B_Dolby_2 conformance
bitstream.
Signed-off-by: Frank Plowman <post@frankplowman.com>
Resolves the following undefined behavior sanitiser error:
runtime error: shift exponent 32 is too large for 32-bit type 'int'
Signed-off-by: Frank Plowman <post@frankplowman.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Fixes#10759.
It can happen in H.264, MPEG2, VC1 that the current frame resource
memory is already in ref_resource. For example, for a interlaced frame,
the same curr memory is passed twice. For the second time it could possibly
reference itself. When this happens the curr is already given an index and
in ref_resources. When the reference frame index is required, we should check
the existance in the ref_resources first before assigning a new index for it.
Signed-off-by: Tong Wu <tong1.wu@intel.com>
This file has been exported since our minimum required version (0.7.0),
but it wasn't documented. Instead it was transitively included by
<jxl/decode.h> (but not jxl/encode.h), which ffmpeg relied on.
libjxl broke its API in libjxl/libjxl@66b9592393 by removing
the transitive include of version.h, and they do not plan on adding
it back. Instead they are choosing to leave the API backwards-
incompatible with downstream callers written for some fairly recent
versions of their API.
As a result, we include <jxl/version.h> to continue to build against
more recent versions of libjxl. The version macros removed are also
present in that file, so we no longer need to redefine them.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Should fix "member access within misaligned address 0xf00 for type 'const union
av_alias64', which requires 8 byte alignment" errors as reported by GCC ubsan.
Signed-off-by: James Almer <jamrial@gmail.com>
Should fix "member access within misaligned address 0xf00 for type 'const union
av_alias64', which requires 8 byte alignment" errors as reported by GCC ubsan.
Signed-off-by: James Almer <jamrial@gmail.com>
Avoids casts all over the place; in this case, it also
replaces the unsafe cast uint8_t**->const uint8_t **
by the safe cast uint8_t**->const uint8_t * const*.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Sort options by name, review formatting, apply consistency fixes and
fill the gaps (e.g. missing value for constants or flags), and review
and extend content.
Should fix "member access within misaligned address 0xf00 for type 'const union
av_alias64', which requires 8 byte alignment" errors as reported by GCC ubsan.
Signed-off-by: James Almer <jamrial@gmail.com>
VVC specifies << as arithmetic left shift, i.e. x << y is equivalent to
x * pow2(y). C's << on the other hand has UB if x is negative. This
patch removes all UB resulting from this, mostly by replacing x << y
with x * (1 << y), but there are also a couple places where the OOP was
changed instead.
Signed-off-by: Frank Plowman <post@frankplowman.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Probably an artifact of a rebase, as this check is done below.
Fixes "Conditional jump or move depends on uninitialised value(s)" errors as
reported by Valgrind.
Signed-off-by: James Almer <jamrial@gmail.com>
The AVIAMFAudioElement and AVIAMFMixPresentation that are ultimately used
are allocated by ff_iamfdec_read_descriptors().
Fixes some memory leaks reported by Valgrind.
Signed-off-by: James Almer <jamrial@gmail.com>
AV_OPT_TYPE_INT64 should not be used for ints.
Should fix warnings about store to misaligned address for type 'int64_t'
Signed-off-by: James Almer <jamrial@gmail.com>
Finishes fixing vp5/potter512-400-partial.avi
The fate-matroska-ms-mode test ref is updated to reflect that the Speex decoder
can now read the stream.
Signed-off-by: James Almer <jamrial@gmail.com>
There could be bogus bytes at the start, as is the case of
vp5/potter512-400-partial.avi from the FATE suite, which could be a case of bad
remuxing from an OGG source.
Partially fixes decoding of vp5/potter512-400-partial.avi
Signed-off-by: James Almer <jamrial@gmail.com>
Covers muxing from raw pcm audio input into FLAC, using several scalable layouts,
and demuxing the result.
Signed-off-by: James Almer <jamrial@gmail.com>
Add it to decoder options instead, to be processed when opening the
decoder. This way it won't be overridden by flags the user might be
setting otherwise.
Fixes "runtime error: member access within misaligned address 0xf00 for type
'struct bar', which requires # byte alignment" errors under GCC ubsan.
Reviewed-by: Nuo Mi <nuomi2021@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Fixes server compatibility issues with rtspclientsink GStreamer plugin.
>From specification:
RFC 7826 "Real-Time Streaming Protocol Version 2.0" (https://datatracker.ietf.org/doc/html/rfc7826), section 18.54:
mode: The mode parameter indicates the methods to be supported for
this session. The currently defined valid value is "PLAY". If
not provided, the default is "PLAY". The "RECORD" value was
defined in RFC 2326; in this specification, it is unspecified
but reserved. RECORD and other values may be specified in the
future.
RFC 2326 "Real Time Streaming Protocol (RTSP)" (https://datatracker.ietf.org/doc/html/rfc2326), section 12.39:
mode:
The mode parameter indicates the methods to be supported for
this session. Valid values are PLAY and RECORD. If not
provided, the default is PLAY.
mode=receive was always like this, from the initial commit 'a8ad6ffa rtsp: Add listen mode'.
For comparison, Wowza was used to push RTSP stream to. Both GStreamer and FFmpeg had no issues.
Here is the capture of Wowza responding to SETUP request:
200 OK
CSeq: 3
Server: Wowza Streaming Engine 4.8.26+4 build20231212155517
Cache-Control: no-cache
Expires: Mon, 15 Jan 2024 19:40:31 GMT
Transport: RTP/AVP/UDP;unicast;client_port=11640-11641;mode=record;source=172.17.0.2;server_port=6976-6977
Date: Mon, 15 Jan 2024 19:40:31 GMT
Session: 1401457689;timeout=60
Test setup:
Server: ffmpeg -loglevel trace -y -rtsp_flags listen -i rtsp://0.0.0.0:30800/live.stream t.mp4
FFmpeg client: ffmpeg -re -i "Big Buck Bunny - FULL HD 30FPS.mp4" -c:v libx264 -f rtsp rtsp://127.0.0.1:30800/live.stream
GStreamer client: gst-launch-1.0 videotestsrc is-live=true pattern=smpte ! queue ! videorate ! videoscale ! video/x-raw,width=640,height=360,framerate=60/1 ! timeoverlay font-desc="Sans, 84" halignment=center valignment=center ! queue ! videoconvert ! tee name=t t. ! x264enc bitrate=9000 pass=cbr speed-preset=ultrafast byte-stream=false key-int-max=15 threads=1 ! video/x-h264,profile=baseline ! queue ! rsink. audiotestsrc ! voaacenc ! queue ! rsink. t. ! queue ! autovideosink rtspclientsink name=rsink location=rtsp://localhost:30800/live.stream
Test results:
modified FFmpeg client -> stock server : ok
stock FFmpeg client -> modified server : ok
modified FFmpeg client -> modified server : ok
GStreamer client -> modified server : ok
Signed-off-by: Paul Orlyk <paul.orlyk@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes server compatibility issues with rtspclientsink GStreamer plugin
Signed-off-by: Paul Orlyk <paul.orlyk@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Previously AV_PIX_FMT_RGB8 was documented as "RGB 3:3:2,
(msb)2R 3G 3B(lsb)". While the RGB 3:3:2 part is correct, the latter
part should be: (msb)3R 3G 2B(lsb). This commit also updates the
format's pixdesc description to be (msb)3R 3G 2B(lsb).
Signed-off-by: Jeffrey Knockel <jeff@jeffreyknockel.com>
Reviewed-by: "Diederick C. Niehorster" <dcnieho@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This is the 64bit version of Chris Doty-Humphreys SFC64
Compared to the LCGs these produce much better quality numbers.
Compared to LFGs this needs less state. (our LFG has 224 byte
state for its 32bit version) this has 32byte state
Also the initialization for our LFG is slower.
This is also much faster than KISS or PCG.
This commit replaces the broken LCG used before.
(broken as it had only a period ~200M due to being put in a double)
This changes the output from random() which is why libswresample.mak
is updated, update was done using the command in libswresample.mak
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
As I understand, support for .STR files is broken for almost 10 years now (since 161442ff2c it seems).
Currently, ffmpeg fails with tons of errors like this on version 1 STRs, e.g. Wipeout 1:
[mdec @ 00000000027c72c0] ac-tex damaged at 1 9
What happens is that only the audio is present in the video file.
Anyway, that one character patch fixes the problem, video is now rendered.
Signed-off-by: aybe <aybe@users.noreply.github.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The C library function double atan2(double y, double x) takes y as the first
parameter and x as the second parameter.
Signed-off-by: Haixia Shi <hshi@meta.com>
On some platforms (in particular, ARM/AArch64), the implementation
of AV_READ_TIME() may use a privileged instruction - in such
cases, benchmarking just fails with a SIGILL.
Instead of crashing, try executing AV_READ_TIME() once within
a region with the signal handler active, to allow gracefully
informing the user about the issue.
This matches the dav1d checkasm commit
95a192549a448b70d9542e840c4e34b60d09b093.
Signed-off-by: Martin Storsjö <martin@martin.st>
Parse iprp and iinf boxes and its child boxes to get the actual codec used
(AV1 for avif, HEVC for heic), and properly export extradata and other
properties in a generic way.
The avif tests reference files are updated as the extradata is now exported.
Based on a patch by Swaraj Hota
Co-authored-by: Swaraj Hota <swarajhota353@gmail.com>
Signed-off-by: James Almer <jamrial@gmail.com>
Signed-off-by: Marth64 <marth64@proxyid.net>
Raw Captions With Time (RCWT) is a format native to ccextractor, a commonly
used open source tool for processing 608/708 closed caption (CC) sources.
It can be used to archive the original, raw CC bitstream and to produce
a source file file for later CC processing or conversion. As a result,
it also allows for interopability with ccextractor for processing CC data
extracted via ffmpeg. The format is simple to parse and can be used
to retain all lines and variants of CC.
A free specification of RCWT can be found here:
https://github.com/CCExtractor/ccextractor/blob/master/docs/BINARY_FILE_FORMAT.TXT
This muxer implements the specification as of 01/05/2024, which has
been stable and unchanged for 10 years as of this writing.
This muxer will have some nuances from the way that ccextractor muxes RCWT.
No compatibility issues when processing the output with ccextractor
have been observed as a result of this so far, but mileage may vary
and outputs will not be a bit-exact match.
Specifically, the differences are:
(1) This muxer will identify as "FF" as the writing program identifier, so
as to be honest about the output's origin.
(2) ffmpeg's MPEG-1/2, H264, HEVC, etc. decoders extract closed captioning
data differently than ccextractor from embedded SEI/user data.
For example, DVD captioning bytes will be translated to ATSC A53 format.
This allows ffmpeg to handle 608/708 in a consistant way downstream.
This is a lossless conversion and the meaningful data is retained.
(3) This muxer will not alter the extracted data except to remove invalid
packets in between valid CC blocks. On the other hand, ccextractor
will by default remove mid-stream padding, and add padding at the end
of the stream (in order to convey the end time of the source video).
tests/checkasm/checkasm:
C LSX LASX
hevc_idct_32x32_8_c: 1243.0 211.7 101.7
Speedup of decoding H265 4K 30FPS 30Mbps on
3A6000 with 8 threads is 1fps(56fps-->57fps).
Reviewed-by: yinshiyou-hf@loongson.cn
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
After this patch, the peformance of decoding H265 4K 30FPS 30Mbps
on 3A6000 with 8 threads improves 2fps (45fps-->47fsp).
Reviewed-by: yinshiyou-hf@loongson.cn
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixed blackstripe on bottom or segmentation fault in case
when patch width and height differ.
Signed-off-by: Vladimir Petrov <vppetrovmms@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The device can be selected by GPU name or index. For example,
ffplay -hwaccel cuda \
-vulkan_params device="NVIDIA GeForce RTX 3060" \
foo.mp4
ffplay -hwaccel cuda -vulkan_params device="0" foo.mp4
Please note that select device by index only supported by hwcontext,
not by libplacebo.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
This second patch fixes the following error at the end of a .STR stream conversion:
[in#0/psxstr @ 0000000000681e80] Error during demuxing: I/O error
It's been a bit of trial and error as I've never used ffmpeg, but returning AVERROR_EOF appears to be the way to go (doesn't complain anymore).
Signed-off-by: aybe <aybe@users.noreply.github.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This third patch fixes warnings that are false positives (still on STRv1).
That's because these sectors are simply empty ones as can be read in "System Description CD-ROM XA, May 1991,
4.3.2.3".
Haven't attempted significant refactoring as it just works, left a comment instead about the situation.
The result is that there are no more false warnings when converting.
Signed-off-by: aybe <aybe@users.noreply.github.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This replaces the riscv specific handling from
7212466e73 (which essentially is
reverted), with a different implementation of the same (plus a bit
more), based on the corresponding feature in dav1d's checkasm,
supporting both Unix and Windows.
See in particular the dav1d commits
0b6ee30eab2400e4f85b735ad29a68a842c34e21,
0421f787ea592fd2cc74c887f20b8dc31393788b,
8501a4b20135f93a4c3b426468e2240e872949c5 and
d23e87f7aee26ddcf5f7a2e185112031477599a7, authored by Henrik Gramner.
The overall approach compared to the existing implementation for
riscv is the same; set up a signal handler, store the state with
sigsetjmp, jump out of the crashing function with siglongjmp.
The main difference is in what happens when the signal handler
is invoked. In the previous implementation, it would resume from
right before calling the crashing function, and then skip that call
based on the setjmp return value.
In the imported implementation from dav1d, we return to right before
the check_func() call, which will skip testing the current function
(as the pointer is the same as it was before).
Other differences are:
- Support for other signal handling mechanisms (Windows
AddVectoredExceptionHandler)
- Using RtlCaptureContext/RtlRestoreContext instead of setjmp/longjmp
on Windows with SEH
- Only catching signals once per function - if more than one
signal is delivered before signal handling is reenabled, any
signal is handled as it would without our handler
- Not using an arch specific signal handler written in assembly
Signed-off-by: Martin Storsjö <martin@martin.st>
This was accidentally removed in a prior revision of the series,
alongside the corresponding (separate) version bump. Instead coalesce it
into the follow-up commit's entry, since that's the lowest version
actually supporting the new fields.
Adding my new gpg key that I will be using from now on.
This key is ed25519, which is more secure than the old rsa4096.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
The layout for the frame flags is as follow:
chroma_format u(2)
reserved u(2)
interlace_mode u(2)
reserved u(2)
chroma_format has 2 allowed values:
0: reserved
1: reserved
2: 4:2:2
3: 4:4:4
interlace_mode has 3 allowed values:
0: progressive
1: tff
2: bff
3: reserved
0x80 is what we expect for "422 not interlaced", and the extra 0x2 from
0x82 is actually writing into the reserved bits.
This byte represents 4 reserved bits followed by 4 alpha_channel_type bits.
alpha_channel_type currently has 3 differents defined values: 0 (no
alpha), 1 (8b alpha), and 2 (16b alpha), all the other values are
reserved. The 4 initial reserved bits are expected to be 0.
This byte represents 4 reserved bits followed by 4 alpha_channel_type bits.
alpha_channel_type currently has 3 differents defined values: 0 (no
alpha), 1 (8b alpha), and 2 (16b alpha), all the other values are
reserved. This part is correctly written (alpha_bits>>3 does the correct
thing), but the 4 initial bits are reserved.
Quoting SMPTE RDD 36:2015:
A decoder shall abort if it encounters a bitstream with an unsupported
bitstream_version value. If 0, the value of the chroma_format syntax
element shall be 2 (4:2:2 sampling) and the value of the
alpha_channel_type element shall be 0 (no encoded alpha); if 1, any
permissible value may be used for those syntax elements.
So if we're not in 4:2:2 or if there is alpha, we are not allowed to use
version 0.
Quoting SMPTE RDD 36:2015:
A decoder shall abort if it encounters a bitstream with an unsupported
bitstream_version value. If 0, the value of the chroma_format syntax
element shall be 2 (4:2:2 sampling) and the value of the
alpha_channel_type element shall be 0 (no encoded alpha); if 1, any
permissible value may be used for those syntax elements.
So if we're not in 4:2:2 or if there is alpha, we are not allowed to use
version 0.
Move section to the top of the file, use table in place of subsection
to list the comprising muxers, and show media type information and
extensions in the item entry names.
option kernel_driver for vaapi device creation can be used to choose the
desired device on Linux, which is more convenient than DRM render node
in a multiple-device system (e.g. Intel iGPU + AMD dGPU or inverse), but
this option requires libdrm works. vaapi is autodetected at
configuration time, let's make libdrm autodetected too.
Reviewed-by: Zhao Zhili <quinkblack@foxmail.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Filter graph reconfigurations can cause nontrivial issues, e.g. PTS
discontinuities, so it is better to warn the user about them.
Signed-off-by: Marton Balint <cus@passwd.hu>
FIx warnings (soon to be errors in GCC 14, already so in Clang 15):
```
src/libavcodec/vulkan_av1.c: In function ‘vk_av1_create_params’:
src/libavcodec/vulkan_av1.c:183:43: error: initialization of ‘long long unsigned int’ from ‘void *’ makes integer from pointer without a cast [-Wint-conversion]
183 | .videoSessionParametersTemplate = NULL,
| ^~~~
src/libavcodec/vulkan_av1.c:183:43: note: (near initialization for ‘(anonymous).videoSessionParametersTemplate’)
```
Use Vulkan's VK_NULL_HANDLE instead of bare NULL.
Fix Trac ticket #10724.
Was reported downstream in Gentoo at https://bugs.gentoo.org/919067.
Signed-off-by: Sam James <sam@gentoo.org>
These functions encode a slice of alpha (1 to 8 macroblocks) which are
expected to be encoded as a repeated sequence of "[diff][run-1]", where
diff is the running difference of the alpha value and run is how many
times that value is expected to be duplicated (within the limit of a
grand total of 2048 unpacked samples, corresponding to a slice of 8 MB).
Even when run==0 (the run variable semantic is actually "run minus 1"),
there is always a diff previously encoded that needs a counter of at
least 1. This means we need to call put_alpha_run() unconditionally at
the end of the bitstream to account for the last running diff.
This commit fixes glitchy playbacks on QuickTime with M2 and M3 hardware
(but not M1 for some mysterious reason) with files generated with
commands such as:
ffmpeg -f lavfi -i testsrc2=d=5:s=912x320,chromakey -c:v prores_aw -profile:v 4 -y aw.mov
ffmpeg -f lavfi -i testsrc2=d=5:s=912x320,chromakey -c:v prores_ks -profile:v 4444 -y ks.mov
The glitch expresses itself deterministically as blinking black
rectangles on random frames (for example on frame 21, 54, 71, 79, ...).
Even with the proresdec from FFmpeg, overreads actually happens while
reading the run-minus-1 value (around val = get_bits(gb, 4) in
unpack_alpha()). This doesn't seem to cause any particular issue because
it simply overreads into the next slice, and because the decoder is
resilient, but it's still a problem.
The investigation leading to this fix was made possible because of paid
work for Jitter (https://jitter.video).
Fixes ticket #10255.
In the mov muxer (in mov_write_video_tag()), bits_per_coded_sample will
be written under certain conditions and is required to be 32 for the
transparency to be honored in QuickTime.
prores_kostya already has this setting but prores_anatoliy and
prores_videotoolbox didn't.
The check for this option does succeed - MSVC accepts the option,
but prints a warning about it being unknown and ignored, for
each compiled object file:
cl : Command line warning D9002 : ignoring unknown option '-mfp16-format=ieee'
The configure script only attempts to add this option on ARM,
therefore this warning isn't seen by the majority of people
building with MSVC.
Making this option into a no-op probably isn't entirely right,
but on the other hand, we don't want to litter the code that
checks for support for the option with compiler specific
conditions either.
Signed-off-by: Martin Storsjö <martin@martin.st>
This commit removes the follow warning and error:
D3D12 WARNING: ID3D12CommandList::ResourceBarrier: Called on the same subresource(s) of
Resource(0x000002236E0E00D0:'Unnamed ID3D12Resource Object') in separate Barrier Descs
which is inefficient and likely unintentional. Desc[0] and Desc[1] on (subresource :
4294967295). [RESOURCE_MANIPULATION WARNING #1008: RESOURCE_BARRIER_DUPLICATE_SUBRESOURCE_TRANSITIONS]
D3D12 ERROR: ID3D12CommandList::ResourceBarrier: Before state (0x0: D3D12_RESOURCE_STATE_[COMMON|PRESENT])
of resource (0x000002236E0E00D0:'Unnamed ID3D12Resource Object') (subresource: 0) specified
by transition barrier does not match with the state (0x20000: D3D12_RESOURCE_STATE_VIDEO_DECODE_WRITE)
specified in the previous call to ResourceBarrier [RESOURCE_MANIPULATION ERROR #527:
RESOURCE_BARRIER_BEFORE_AFTER_MISMATCH]
Tested-by: Tong Wu <tong1.wu@intel.com>
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
Previous assertion was not useful. Now a warning is added to replace it.
For get_surface_index, we should return a zero index in case the index is not found.
But a warning is necessary to notify.
Signed-off-by: Tong Wu <tong1.wu@intel.com>
Previous max_num_refs was based on pp.frame_refs plus 1 and it could possibly
reaches the size limit. Actually it should be the size of pp.ref_frame_map
plus 1.
Signed-off-by: Tong Wu <tong1.wu@intel.com>
hw_frames_ctx on the input link is only set when the input link is
configured, which hasn't happened yet. This temporarily hacks around
the problem (in a way no worse than before the format negotiation
changes) until a proper fix can be applied.
The gcrypt definition of `bn_new` used to use the return statement
on errors, with an AVERROR return value, regardless of the signature
of the function where the macro is used - it is called in
`dh_generate_key` and `ff_dh_init` which return pointers. As a result,
compiling with gcrypt and the ffrtmpcrypt protocol resulted in an
int-conversion warning. GCC 14 may upgrade these to errors [1].
This patch fixes the problem by changing the macro to remove `AVERROR`
and instead set `bn` to null if the allocation fails. This is the
behaviour of all the other `bn_new` implementations and so the result is
already checked at all the callsites. AFAICT, this should be the only
change needed to get ffmpeg off Fedora's naughty list of projects with
warnings which may be upgraded to errors in GCC 14 [2].
[1]: https://gcc.gnu.org/pipermail/gcc/2023-May/241264.html
[2]: https://www.mail-archive.com/devel@lists.fedoraproject.org/msg196024.html
Signed-off-by: Frank Plowman <post@frankplowman.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
There is a warning in XCode:"Declaration shadows a local variable"
Signed-off-by: xufuji456 <839789740@qq.com>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
VideoToolbox use different identifiers for the same pixel format
with different color range, like
kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange
kCVPixelFormatType_420YpCbCr8BiPlanarFullRange.
Before the patch, vt_pool_alloc() always use limited range, and it
will fail for pixel format AV_PIX_FMT_BGRA since there is no limited
range kCVPixelFormatType_32BGRA.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
The start code is matched against 0x000001, zero_byte was treated
as last byte of last frame rather than the beginning of next frame.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
vvc decoder plug-in to avcodec.
split frames into slices/tiles and send them to vvc_thread for further decoding
reorder and wait for the frame decoding to be done and output the frame
Features:
+ Support I, P, B frames
+ Support 8/10/12 bits, chroma 400, 420, 422, and 444 and range extension
+ Support VVC new tools like MIP, CCLM, AFFINE, GPM, DMVR, PROF, BDOF, LMCS, ALF
+ 295 conformace clips passed
- Not support RPR, IBC, PALETTE, and other minor features yet
Performance:
C code FPS on an i7-12700K (x86):
BQTerrace_1920x1080_60_10_420_22_RA.vvc 93.0
Chimera_8bit_1080P_1000_frames.vvc 184.3
NovosobornayaSquare_1920x1080.bin 191.3
RitualDance_1920x1080_60_10_420_32_LD.266 150.7
RitualDance_1920x1080_60_10_420_37_RA.266 170.0
Tango2_3840x2160_60_10_420_27_LD.266 33.7
C code FPS on a M1 Mac Pro (ARM):
BQTerrace_1920x1080_60_10_420_22_RA.vvc 58.7
Chimera_8bit_1080P_1000_frames.vvc 153.3
NovosobornayaSquare_1920x1080.bin 150.3
RitualDance_1920x1080_60_10_420_32_LD.266 105.0
RitualDance_1920x1080_60_10_420_37_RA.266 133.0
Tango2_3840x2160_60_10_420_27_LD.266 21.7
Asm optimizations still working in progress. please check
https://github.com/ffvvc/FFmpeg/wiki#performance-data for the latest
Contributors (based on code merge order):
Nuo Mi <nuomi2021@gmail.com>
Xu Mu <toxumu@outlook.com>
Frank Plowman <post@frankplowman.com>
Shaun Loo <shaunloo10@gmail.com>
Wu Jianhua <toqsxw@outlook.com>
Thank you for reporting issues and providing performance reports:
Łukasz Czech <lukaszcz18@wp.pl>
Xu Fulong <839789740@qq.com>
Thank you for providing review comments:
Ronald S. Bultje <rsbultje@gmail.com>
James Almer <jamrial@gmail.com>
Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Co-authored-by: Xu Mu <toxumu@outlook.com>
Co-authored-by: Frank Plowman <post@frankplowman.com>
Co-authored-by: Shaun Loo <shaunloo10@gmail.com>
Co-authored-by: Wu Jianhua <toqsxw@outlook.com>
This is the main entry point for the CTU (Coding Tree Unit) decoder.
The code will divide the CTU decoder into several stages.
It will check the stage dependencies and run the stage decoder.
Co-authored-by: Xu Mu <toxumu@outlook.com>
Co-authored-by: Frank Plowman <post@frankplowman.com>
Co-authored-by: Shaun Loo <shaunloo10@gmail.com>
Co-authored-by: Wu Jianhua <toqsxw@outlook.com>
Generalize drawtext utilities to make them usable in other filters.
This will be needed to introduce the QR code source and filter without
duplicating functionality.
The function lf_db_new was deprecated in lensfun 09dcd3e7ad in 2017
in favour of lf_db_create. The AVfilter was adjusted in 8b78eb312d
but the configure check wasn't changed.
lensfun removed lf_db_new declaration in lf 35c0017593 so configure
fails to find lensfun.
Rewrite the format parsing code to make it more easily generalizable. In
particular, `invert_formats` does not depend on the type of format list
passed to it, which allows me to re-use this helper in an upcoming
commit.
Slightly shortens the code, at the sole cost of doing several malloc
(ff_add_format) instead of a single malloc.
This is at odds with the YUV matrix negotiation API, in which such
dynamic changes in YUV encoding are no longer easily possible. There is
also no really strong motivating reason to do this, since the choice of
YUV matrix is essentially arbitrary and not actually related to the
Dolby Vision decoding process.
This filter will always accept any input format, even if the user sets
a specific in_range/in_color_matrix. This is to preserve status quo with
current behavior, where passing a specific in_color_matrix merely
overrides the incoming frames' attributes. (Use `vf_format` to force
a specific input range)
Because changing colorspace and color_range now requires reconfiguring
the link, we can lift sws_setColorspaceDetails out of scale_frame and
into config_props. (This will also get re-called if the input frame
properties change)
To allow adding proper negotiation, in particular, to fftools.
These values will simply be negotiated downstream for YUV formats, and
ignored otherwise.
Motivated by YUVJ removal. This change will allow full negotiation
between color ranges and matrices as needed. By default, all ranges and
matrices are marked as supported.
Because grayscale formats are currently handled very inconsistently (and
in particular, assumed as forced full-range by swscale), we exclude them
from negotiation altogether for the time being, to get this API merged.
After filter negotiation is available, we can relax the
grayscale-is-forced-jpeg restriction again, when it will be more
feasible to do so without breaking a million test cases.
Note that this commit updates one FATE test as a consequence of the
sanity fallback for non-YUV formats. In particular, the test case now
writes rgb24(pc, gbr/unspecified/unspecified) to the matroska file,
instead of rgb24(unspecified/unspecified/unspecified) as before.
Currently, the logic inside the FF_FILTER_FORMATS_QUERY_FUNC branch
prevents this code from running in the event that we have a filter with
a single video input and a single audio output, as the resulting audio
output link will not have its channel counts / samplerates correctly
initialized to their default values, possibly triggering a segfault
downstream.
An example of such a filter is vaf_spectrumsynth. Although this
particular filter already sets up the channel counts and samplerates as
part of the query function and therefore avoids triggering this bug, the
bug still exists in principle. (And importantly, sets a wrong precedent)
Even if a query func is set. This is safe to do, because
ff_default_query_formats is documented not to touch any filter lists
that were already set by the query func.
The reason to do this is because it allows us to extend
AVFilterFormatsConfig without having to touch every filter in existence.
An alternative implementation of this commit would be to explicitly add
a `ff_default_query_formats` call at the end of every query_formats
function, but that would end up functionally equivalent to this change
while touching a whole lot more code paths for no reason.
As a bonus, eliminates some code/logic duplication from this function.
Add dynamic outputs support. Some models don't have fixed output size.
Its size changes according to result. Now openvino can run these kinds of
models.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
Fixes: out of array access:
Fixes: tickets/10745/poc12ffmpeg
Found-by: Li Zeyuan and Zeng Yunxiang.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: out of array read
Fixes: tickets/10744/poc11ffmpeg
Found-by: Li Zeyuan and Zeng Yunxiang.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: out of array access
Fixes: tickets/10743/poc10ffmpeg
Found-by: Zeng Yunxiang and Li Zeyuan
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: out of array access
Fixes: tickets/10747/poc14ffmpeg
Found-by: Zeng Yunxiang and Song Jiaxuan
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The code works in steps of 2 lines and lacks support for odd height
Implementing odd height support is better but for now this fixes the
out of array access
Fixes: out of array access
Fixes: tickets/10702/poc6ffmpe
Found-by: Zeng Yunxiang
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 2314885530818453536 - -7412889664301817824 cannot be represented in type 'long'
Fixes: 64296/clusterfuzz-testcase-minimized-ffmpeg_dem_MOV_fuzzer-6304027146846208
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 2147478526 + 33924 cannot be represented in type 'int'
Fixes: shift exponent 32 is too large for 32-bit type 'unsigned int'
Fixes: 64243/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_JPEGLS_fuzzer-5195717848989696
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Forgot to init c->sao_edge_filter[idx] when idx=0/1/2/3.
After this patch, the speedup of decoding H265 4K 30FPS
30Mbps on 3A6000 is about 7% (42fps==>45fps).
Change-Id: I521999b397fa72b931a23c165cf45f276440cdfb
Reviewed-by: yinshiyou-hf@loongson.cn
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Rework the code a bit to speed up the 10-bit bitpacked decoding
routine. This is probably about as fast as I can get it without
switching to assembly language.
Demonstratable with:
./ffmpeg -f lavfi -i "smptehdbars=size=3840x2160" -c bitpacked -f image2 -frames:v 1 source.yuv
./ffmpeg -f bitpacked -pix_fmt yuv422p10le -s 3840x2160 -c:v bitpacked -i source.yuv -pix_fmt yuv422p10le out.yuv
On my development system, it went from 80ms for a 2160p frame
down to 20ms (i.e. a 4X speedup). Good enough for now, I hope...
Comments from Marton:
Originally on my system better performance could be achieved by simply
switching to the cached bitstream reader, but for Devin it was slower than
his direct byte operations.
I changed the order of writing output from u/y/v/y to u/v/y/y, and that made
the code faster than the cached bitstream reader on my system as well.
TIMER measurement of the decode loop on Ryzen 5 3600 with command line:
./ffmpeg -stream_loop 256 -threads 1 -f bitpacked -pix_fmt yuv422p10le -s 3840x2160 -c:v bitpacked -i source.yuv -pix_fmt yuv422p10le -f null none -loglevel error
Before: 823204127 decicycles in YUV, 256 runs, 0 skips
After: 315070524 decicycles in YUV, 256 runs, 0 skips
Signed-off-by: Devin Heitmueller <dheitmueller@ltnglobal.com>
Signed-off-by: Marton Balint <cus@passwd.hu>
Some encoders (e.g., libx264) dump encoder configuration as user
data unregistered SEI message. This option try to print it as
ascii character when possible.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
The specification doesn't mention that clusters cannot have alphabet
sizes greater than 1 << bundle->log_alphabet_size, but the reference
implementation rejects these entropy streams as invalid, so we should
too. Refusing to do so can overflow a stack variable that should be
large enough otherwise.
Fixes#10738.
Found-by: Zeng Yunxiang and Li Zeyuan
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Fixes the build. It's a requirement when utilizing PIE.
Signed-off-by: Brad Smith <brad@comstyle.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
If a sequence of JXL images is encapsulated in a container that has PTS
information, we should use the PTS information from the container. At
this time there is no container that does this, but if JPEG XL support
is ever added to NUT, AVTransport, or some other container, this commit
should allow the PTS information those containers provide to work as
expected.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
These pixel formats have always been supported by libjxl, but at the
time this plugin was written, they were not in FFmpeg yet. Now that
they are in FFmpeg, we should support them.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
These pixel formats have always been supported by libjxl, but at the
time this plugin was written, they were not in FFmpeg yet. Now that
they are in FFmpeg, we should support them.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
FFmpeg doesn't support tv-range RGB throughout most of its pipeline, so
we should keep the warning. However, in case something does support it
we should at least keep it tagged properly. Additionally, the encoder
writes this tag if the space is tagged as such so this makes a round
trip work as it should.
Also, PNG doesn't support nonzero matrices but we only warn and ignore
in that case, so we have no reason to error out for illegal cICP ranges
either (i.e. greater than 1).
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Reported-by: Kacper Michajłow <kasper93@gmail.com>
Unnecessary since acf63d5350adeae551d412db699f8ca03f7e76b9;
also avoids relocations.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For memcpy and memcmp, we need to multiply by the element size,
otherwise we're copying and comparing only a fraction of the buffer.
For decorrelate_sr, the buffer p1 is the one that is mutated;
copy and check p1 instead of p2.
For decorrelate_sm, both buffers are mutated, so copy and check
both of them.
For decorrelate_sm, the memcpy initialization of p1 and p1_2 was
reversed - p1 is filled with randomize, but then memcpy copies from
p1_2 to p1. As p1_2 is uninitialized at this point, clang concluded
that the copy was bogus and omitted it entirely, triggering failures
in this test on x86 (where there was an existing assembly implementation
to test).
Signed-off-by: Martin Storsjö <martin@martin.st>
* filter subtitle/data options out of main, video and audio sections
* add filters that were missing entirely from the subtitle section
* add a missing section for advanced subtitle options
So they don't clutter the standard help output.
-loglevel is marked because there is no need to show two options (-v and
-loglevel) that do the same thing.
Currently it requires every single OPT_SPEC option to be accompanied by
an array of alternate names for this option. The vast majority of
options have no alternate names, resulting in a large numbers of
unnecessary single-element arrays that merely contain the option name.
Extend the option parsing API to allow marking options as having
alternate names, or as being the canonical name for some existing
alternatives. Use this new information to avoid the need for
abovementioned unnecessary single-element arrays.
This option flag only carries nontrivial information for options that
call a function, in all other cases its presence can be inferred from
the option type (bool options do not have arguments, all other types do)
and is thus nothing but useless clutter.
Change the option parsing code to infer its value when it can, and drop
the flag from options where it's not needed.
It causes those options to be parsed as either
* -autofoo 0/1 (with an argument)
* -noautofoo (without an argument)
This is unnecessary, confusing, and against the documentation; these are
also the only two bool options that take an argument.
This should not affect the users, as these options are on by default,
and are supposed to be used as -nofoo per the documentation.
Otherwise an unitialized stack value would be copied to FPSConvContext.
As it's then never used, it tends not to be a problem in practice,
however it is UB and some compilers warn about it.
The check for UWP mode was duplicated from right above, in
d54127c41a.
Also, instead of several lines with "enabled uwp && ...", make one
"if enabled uwp; then" block.
Signed-off-by: Martin Storsjö <martin@martin.st>
The current pack_output function pointer is a property of the decoder,
rather than a constant method provided by the DSP code. Indeed, except
for an unused initialisation, the field is never used in DSP code.
The penultimate loop iteration could pick any vl such that:
vlenb/4 < vl <= vlenb/2
Thus if the total length is not a multiple of vlenb/2, the vfadd.vf
on the penultimate iteration would yield corrupt values for the last
iteration.
To avoid this, force vl = vlen/2 until the last iteration. Unfortunately
this latent bug is not reproducible with either hardware or QEMU as of now.
This skips the round-trip to scalar register for the sliding 'x'
coefficients, improving performance by about 5%. The trick here is that
the vector slide-up instruction preserves elements in destination vector
until the slide offset.
The switch from vfslide1up.vf to vslideup.vi also allows the elimination
of data dependencies on consecutive slides. Since the specifications
recommend sticking to power of two offsets, we could slide as follows:
vslideup.vi v8, v0, 2
vslideup.vi v4, v0, 1
vslideup.vi v12, v8, 1
vslideup.vi v16, v8, 2
However in the device under test, this seems to make performance slightly
worse, so this is left for (in)validation with future better hardware.
Long is 32 bits signed on Windows, and nb_stream{s,_groups} are both unsigned
int. In a realistic scenario this wont make a difference, but it's still
proper.
Also ensure the parsed string is an integer while at it.
Signed-off-by: James Almer <jamrial@gmail.com>
It caused lacking a public declaration build error with
-Werror=missing-prototypes.
Since DXGI_FORMAT is moved to public since patch set V10, this function
is no longer useful. Now remove it.
Signed-off-by: Tong Wu <tong1.wu@intel.com>
Signed-off-by: James Almer <jamrial@gmail.com>
This fixes the following build error:
src/libavcodec/d3d12va_decode.c:49:10: error: no previous prototype for function
'ff_d3d12va_get_surface_index' [-Werror,-Wmissing-prototypes]
49 | unsigned ff_d3d12va_get_surface_index(const AVCodecContext *avctx,
| ^
Signed-off-by: Martin Storsjö <martin@martin.st>
This disables regenerating ffversion.h whenever the checked out
git commit changes, speeding up development rebuilds.
Whenever this option is set, force the version to be printed as
"unknown" rather than showing potentially stale information.
Signed-off-by: Martin Storsjö <martin@martin.st>
Same as d3d11va, this flag enables main still picture profile for
d3d12va. User should add this flag when decoding main still picture
profile.
Signed-off-by: Tong Wu <tong1.wu@intel.com>
The implementation is based on:
https://learn.microsoft.com/en-us/windows/win32/medfound/direct3d-12-video-overview
With the Direct3D 12 video decoding support, we can render or process
the decoded images by the pixel shaders or compute shaders directly
without the extra copy overhead, which is beneficial especially if you
are trying to render or post-process a 4K or 8K video.
The command below is how to enable d3d12va:
ffmpeg -hwaccel d3d12va -i input.mp4 output.mp4
Signed-off-by: Wu Jianhua <toqsxw@outlook.com>
Signed-off-by: Tong Wu <tong1.wu@intel.com>
In close_output(), a dummy frame is created with format NONE passed
to enc_open(), which isn't prepared for it. The NULL pointer
dereference happened at
av_pix_fmt_desc_get(enc_ctx->pix_fmt)->comp[0].depth.
When fgt.graph is NULL, skip fg_output_frame() since there is
nothing to output.
frame #0: 0x0000005555bc34a4 ffmpeg_g`enc_open(opaque=0xb400007efe2db690, frame=0xb400007efe2d9f70) at ffmpeg_enc.c:235:44
frame #1: 0x0000005555bef250 ffmpeg_g`enc_open(sch=0xb400007dde2d4090, enc=0xb400007e4e2daad0, frame=0xb400007efe2d9f70) at ffmpeg_sched.c:1462:11
frame #2: 0x0000005555bee094 ffmpeg_g`send_to_enc(sch=0xb400007dde2d4090, enc=0xb400007e4e2daad0, frame=0xb400007efe2d9f70) at ffmpeg_sched.c:1571:19
frame #3: 0x0000005555bee01c ffmpeg_g`sch_filter_send(sch=0xb400007dde2d4090, fg_idx=0, out_idx=0, frame=0xb400007efe2d9f70) at ffmpeg_sched.c:2154:12
frame #4: 0x0000005555bcf124 ffmpeg_g`close_output(ofp=0xb400007e4e2d85b0, fgt=0x0000007d1790eb08) at ffmpeg_filter.c:2225:15
frame #5: 0x0000005555bcb000 ffmpeg_g`fg_output_frame(ofp=0xb400007e4e2d85b0, fgt=0x0000007d1790eb08, frame=0x0000000000000000) at ffmpeg_filter.c:2317:16
frame #6: 0x0000005555bc7e48 ffmpeg_g`filter_thread(arg=0xb400007eae2ce7a0) at ffmpeg_filter.c:2836:15
frame #7: 0x0000005555bee568 ffmpeg_g`task_wrapper(arg=0xb400007d8e2db478) at ffmpeg_sched.c:2200:21
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
From AOSP doc, these values are device and codec specific, but lower
values generally result in more efficient (smaller-sized) encoding.
For example, global_quality 50 on Pixel 6 results a 1080P 30 FPS
HEVC with 3744 kb/s, while global_quality 80 results 28178 kb/s.
Fix#10689
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
The ffmpeg coding style doesn't usually use const on scalar
parameters (or on the pointer values - as opposed to the type
that is pointed to, where it has a semantic meaning), contrary
to the dav1d coding style (where this was imported from).
This avoids warnings about differences in the type signatures
between declaration and definition of this function, with older
versions of MSVC.
The issue was observed with one version of MSVC 2017,
19.16.27024.1, with warnings like these:
src/tests/checkasm/checkasm.c(969): warning C4028: formal parameter 3 different from declaration
The warning itself is bogus as the const here is harmless, and
newer versions of MSVC no longer warn about this.
Signed-off-by: Martin Storsjö <martin@martin.st>
This can be used to run tests multple times, with e.g. differing
QEMU settings, by adding something like this to the FATE configuration
file:
target_exec="qemu-aarch64-static"
fate_targets="fate-checkasm fate-cpu"
fate_environments="sve128 sve256 sve512"
sve128_env="QEMU_CPU=max,sve128=on"
sve256_env="QEMU_CPU=max,sve256=on"
sve512_env="QEMU_CPU=max,sve512=on"
It's also possible to customize the target_exec command further
by injecting a sufficiently quoted variable into it, which then can
be updated for each run, e.g. target_exec="\$(CUR_EXEC_CMD)".
For each of the environment names in fate_environments, the tests
that are run get the name suffixed on the fate tests in the
test log and fate report, e.g. "fate-checkasm-h264dsp_sve128".
Signed-off-by: Martin Storsjö <martin@martin.st>
This can be useful if doing testing of uncommon CPU extensions by
running tests with QEMU (by configuring with e.g.
"target_exec=qemu-aarch64"), by only running the checkasm tests,
to get a reasonable test coverage without excessive test runtime.
For such a config, setting fate_targets="fate-checkasm fate-cpu"
can be a good tradeoff.
Signed-off-by: Martin Storsjö <martin@martin.st>
Reduce false positives for VVC files by adding additional checks in
`vvc_probe`. Specifically, `nuh_temporal_id_plus1` is tested for valid
values in extra cases depending on the NAL unit type, as per ITU-T H.266
section 7.4.2.2.
Resolves trac #10703.
Signed-off-by: Frank Plowman <post@frankplowman.com>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Converting from an integer to HWND (which is a pointer) requires
an explicit cast, otherwise Clang errors out like this:
src/libavdevice/gdigrab.c:280:14: error: incompatible integer to pointer conversion assigning to 'HWND' (aka 'struct HWND__ *') from 'long' [-Wint-conversion]
280 | hwnd = strtol(name, &p, 0);
| ^ ~~~~~~~~~~~~~~~~~~~
(With GCC and MSVC, this was a mere warning, but with recent Clang,
this is an error.)
After adding a cast, all compilers also warn something like this:
src/libavdevice/gdigrab.c:280:16: warning: cast to 'HWND' (aka 'struct HWND__ *') from smaller integer type 'long' [-Wint-to-pointer-cast]
280 | hwnd = (HWND) strtol(name, &p, 0);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
On Windows, long types are 32 bit, so to get a usable pointer, we
need to use long long. And interpret it as unsigned long long
while at it - i.e. using strtoull.
Finally, right above it, the code triggered the following warning:
src/libavdevice/gdigrab.c:278:15: warning: mixing declarations and code is incompatible with standards before C99 [-Wdeclaration-after-statement]
278 | char *p;
| ^
Signed-off-by: Martin Storsjö <martin@martin.st>
With the way the runtime checks are currently set up, every single
openh264 release, no matter how minor, is considered an ABI break and
requires ffmpeg recompilation. This is unnecessarily strict because it
doesn't allow downstream distributions to ship any openh264 bug fix
version updates without breaking ffmpeg's openh264 support.
Years ago, at the time when ffmpeg's openh264 support was merged,
openh264 releases were done without a versioned soname (the library was
just libopenh264.so, unversioned). Since then, starting with version
1.3.0, openh264 has started using versioned sonames and the intent has
been to bump the soname every time there's a new release with an ABI
change.
This patch drops the exact version check and instead adds a minimum
requirement on 1.3.0 to the configure script.
Signed-off-by: Kalev Lember <klember@redhat.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
During a resampling operation where
1) user has specified first_pts
2) SWR_FLAG_RESAMPLE is not set initially (directly or otherwise)
3) first_pts has been fulfilled (always using hard compensation)
then upon first encountering a delay where a soft compensation is
required, swr_set_compensation will lead to another init of swr which
will reset outpts to the specified firstpts thus leading to an output
frame having its pts = firstpts. When the next input frame is received,
swr will see a large delay and inject silence from firstpts to the
current frame's pts. This can lead to severe desync and in worst case,
loss of audio playback.
Parameter firstpts initialized to AV_NOPTS_VALUE in swr_alloc and then
checked in swr_init to avoid resetting outpts, thus avoiding reapplication
of firstpts.
Fixes#4131.
The {br}/{abr} directives are not limited to post-encoding, they can
also be used pre-muxing. The already-present {packet} tag describes this
more accurately, so just drop the assertions.
They cannot store 1 as signed, only 0 and -1.
Avoids warnings such as:
implicit truncation from 'int' to a one-bit wide bit-field changes value from 1 to -1 [-Wsingle-bit-bitfield-constant-conversion]
The names of the cpu flags, when parsed from a string with
av_parse_cpu_caps, are parsed by the libavutil eval functions. These
interpret dashes as subtractions. Therefore, these previous cpu flag
names haven't been possible to set.
Use the official names for these extensions, as the previous ad-hoc
names wasn't parseable.
libavutil/tests/cpu tests that the cpu flags can be set, and prints
the detected flags.
Acked-by: Rémi Denis-Courmont <remi@remlab.net>
Signed-off-by: Martin Storsjö <martin@martin.st>
x11grab can capture windows by their ID, but gdigrab can only capture
windows by their names, internally calling FindWindowW to lookup its
handle.
This patch simply allows the user to specify a window handle directly.
Signed-off-by: Lena <lena@nihil.gay>
The 8x4 and 4x4 use a needlessly large multiplier (unless/until we care
about embedded 64-bit-vector hardware). This is merely suboptimal.
The 8x4 case also uses an incorrect vector length, which leads to incorrect
behaviour on future/hypothetical hardware with 256-bit or larger vectors.
Pointed-out-by: Martin Storsjö <martin@martin.st>
It's available on macOS since 10.8, but not available on iOS until
16.0.
Reported and tested by xufuji456.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
We can't call ff_get_rv_vlenb() if we don't have RVV available
at all.
Acked-by: Rémi Denis-Courmont <remi@remlab.net>
Signed-off-by: Martin Storsjö <martin@martin.st>
Add input pad to get model input resolution. Detection models always
have fixed input size. And the output coordinators are based on the
input resolution, so we need to get input size to map coordinators to
our real output frames.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
Add multiple output support to openvino backend. You can use '&' to
split different output when you set output name using command line.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
Avoids a -Wstringop-truncation warning by using av_strlcopy instead of
strncpy. Additionally, prints a warning to the log context if this
truncation occurred.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
The loop iterates over the length of the vector, not the order. This is
to avoid reloading the same data for each lag value. However this means
the loop only works if the maximum order is no larger than VLENB.
The loop is roughly equivalent to:
for (size_t j = 0; j < lag; j++)
autoc[j] = 1.;
while (len > lag) {
for (ptrdiff_t j = 0; j < lag; j++)
autoc[j] += data[j] * *data;
data++;
len--;
}
while (len > 0) {
for (ptrdiff_t j = 0; j < len; j++)
autoc[j] += data[j] * *data;
data++;
len--;
}
Since register pressure is only at 50%, it should be possible to implement
the same loop for order up to 2xVLENB. But this is left for future work.
Performance numbers are all over the place from ~1.25x to ~4x speedups,
but at least they are always noticeably better than nothing.
This test verifies the parser's handling of multiframe JXL files that
have an entropy-encoded permuted table of contents for each frame. The
testcase is actually six JXL codestreams concatenated together, and the
parser needs to be able to find the boundaries.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
<OS>_VERSION_MAX_ALLOWED indicates what version is available in
the SDK, while <OS>_VERSION_MIN_REQUIRED is the version we can
assume is available, i.e. similar to what is set with e.g.
-miphoneos-version-min on the command line.
This fixes build errors like these:
src/libavdevice/avfoundation.m:788:37: error: 'AVCaptureDeviceTypeContinuityCamera' is only available on macOS 14.0 or newer [-Werror,-Wunguarded-availability-new]
[deviceTypes addObject: AVCaptureDeviceTypeContinuityCamera];
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/System/Library/Frameworks/AVFoundation.framework/Headers/AVCaptureDevice.h:551:38: note: 'AVCaptureDeviceTypeContinuityCamera' has been marked as being introduced in macOS 14.0 here, but the deployment target is macOS 13.0.0
AVF_EXPORT AVCaptureDeviceType const AVCaptureDeviceTypeContinuityCamera API_AVAILABLE(macos(14.0), ios(17.0), macCatalyst(17.0), tvos(17.0)) API_UNAVAILABLE(visionos) API_UNAVAILABLE(watchos);
^
Alternatively, we could use these more modern APIs, if enclosed
in suitable @available() checks.
- Fixes YA formats, because previous code always assumed alpha as the 4th
component.
- Fixes PAL format (as long as 0 is black, as in a systematic palette), because
previous code assumed it as limited Y.
- Fixes XYZ format because it does not need nonzero chroma components
- Fixes xv30be as the bitstream mode got merged to the non-bitstream mode.
Signed-off-by: Marton Balint <cus@passwd.hu>
Should fix valgrind warnings about Conditional jump or move depends on uninitialised value.
Reviewed-by: Anton Khirnov <anton@khirnov.net>
Signed-off-by: James Almer <jamrial@gmail.com>
Change the main loop and every component (demuxers, decoders, filters,
encoders, muxers) to use the previously added transcode scheduler. Every
instance of every such component was already running in a separate
thread, but now they can actually run in parallel.
Changes the results of ffmpeg-fix_sub_duration_heartbeat - tested by
JEEB to be more correct and deterministic.
See the comment block at the top of fftools/ffmpeg_sched.h for more
details on what this scheduler is for.
This commit adds the scheduling code itself, along with minimal
integration with the rest of the program:
* allocating and freeing the scheduler
* passing it throughout the call stack in order to register the
individual components (demuxers/decoders/filtergraphs/encoders/muxers)
with the scheduler
The scheduler is not actually used as of this commit, so it should not
result in any change in behavior. That will change in future commits.
As for the analogous decoding change, this is only a preparatory step to
a fully threaded architecture and does not yet make encoding truly
parallel. The main thread will currently submit a frame and wait until
it has been fully processed by the encoder before moving on. That will
change in future commits after filters are moved to threads and a
thread-aware scheduler is added.
This code suffers from a known issue - if an encoder with a sync queue
receives EOF it will terminate after processing everything it currently
has, even though the sync queue might still be triggered by other
threads. That will be fixed in following commits.
* the code is made shorter and simpler
* avoids constantly allocating and freeing AVPackets, thanks to
ThreadQueue integration with ObjPool
* is consistent with decoding/filtering/muxing
* reduces the diff in the future switch to thread-aware scheduling
This makes ifile_get_packet() always block. Any potential issues caused
by this will be resolved by the switch to thread-aware scheduling in
future commits.
Otherwise they'd be silently ignored if received by the filtering thread
before the filtergraph can be initialized, which would make the output
dependent on the order in which frames from different inputs arrive.
As previously for decoding, this is merely "scaffolding" for moving to a
fully threaded architecture and does not yet make filtering truly
parallel - the main thread will currently wait for the filtering thread
to finish its work before continuing. That will change in future commits
after encoders are also moved to threads and a thread-aware scheduler is
added.
Avoid making decisions based on current graph input state, which makes
the output dependent on the order in which the frames from different
inputs are interleaved.
Makes the output of fate-filter-overlay-dvdsub-2397 more correct - the
subtitle appears two frames later, which is closer to its PTS as stored
in the file.
Building with macOS platform, the compiler has a warning: 'kAudioObjectPropertyElementMaster' is deprecated in macOS 12.0
Signed-off-by: xufuji456 <839789740@qq.com>
win32 typically doesn't have unistd.h, so always including it will break
MSVC builds. The usage of those POSIX functions are already guarded by
_WIN32, so use that to guard unistd.h include as well.
Building with iOS platform, the compiler has a warning: "'devicesWithMediaType:' is deprecated: first deprecated in iOS 10.0 - Use AVCaptureDeviceDiscoverySession instead"
Signed-off-by: xufuji456 <839789740@qq.com>
The earlier code writes the file and then tries to patch up
the size later. This is avoidable for the common case of
a single image because one can know the complete size
in advance and write it.
Fixes ticket #4609.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The libvmaf filter was doing substring checks in place of string equality
comparisons. This led to a bug when the user specified the pooling method
"harmonic_mean", since "mean" was checked first and the substring comparison
returned true. This patch changes all substring comparisons for string equality
comparisons. This is both correct and more efficient than the existing method.
Signed-off-by: nilfm <nilf@netflix.com>
When multiple hardwares are available, the default one might not be
Intel Hardware. We can use option vendor_id to choose the required
vendor.
Tested-by: Artem Galin <artem.galin@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
User may choose the hardware via option vendor_id when multiple
hardwares are available.
Signed-off-by: Artem Galin <artem.galin@intel.com>
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
The current logic for detecting frames that are too small for the
algorithm does not account for chroma sub-sampling, and so a sample
where the luma plane is large enough, but the chroma planes are not
will not be rejected. In that event, a heap overflow will occur.
This change adjusts the logic to consider the chroma planes and makes
the change to all three bwdif implementations.
Fixes#10688
Signed-off-by: Cosmin Stejerean <cosmin@cosmin.at>
Reviewed-by: Thomas Mundt <tmundt75@gmail.com>
Signed-off-by: Philip Langdale <philipl@overt.org>
This patch allows the JXL parser to parse sequences of extremely small
files concatenated together. (e.g. smaller than the parser buffer)
Signed-off-by: Leo Izen <leo.izen@gmail.com>
According to ISO/IEC 14996-12, size == 1 means a 64-bit extended-size
field occurs *after* the 32-bit box type, not before. This fix should
allow correct parsing of JXL files with extended-size boxes.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Both qsv encoders and decoders use 4 as the default value of
async_depth, let's use 4 as the default value for vpp_qsv filter too.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
mfxExtendedDeviceId mightn't be supported in certain configurations of
oneVPL on Linux, so we can't ensure a property filter for
mfxExtendedDeviceId.DeviceID or mfxExtendedDeviceId.VendorID works as
expected. This fixed the issue mentioned in [1]
[1] http://ffmpeg.org/pipermail/ffmpeg-user/2023-October/056983.html
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
The type of qsv decoders is FF_CODEC_CB_TYPE_DECODE which must not
return AVERROR(EAGAIN). commit 42b20c9 added an assertion to check the
returned value.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
libvpl deprecated some symbols (e.g. MFX_EXTBUFF_VPP_DENOISE2 is used to
replace MFX_EXTBUFF_VPP_DENOISE), however the new symbols aren't support
by MediaSDK runtime. In order to support the combination of libvpl and
MediaSDK runtime on legacy devices, we continue to use the deprecated
symbols in FFmpeg. This patch added -DMFX_DEPRECATED_OFF to CFLAGS to
silence the corresponding compilation warnings.
Signed-off-by: Haihao Xiang <haihao.xiang@intel.com>
Fixes OOM when caller keeps adding frames into filtergraph
that reached EOF by other means, for example EOF is signalled
by other filter in filtergraph or by buffersink.
The len parameter was changed from unsigned int to size_t in
567c67c6c8.
This fixes crashes in the reference C code, when running checkasm
on aarch64.
Signed-off-by: Martin Storsjö <martin@martin.st>
Fixes OOM when caller keeps adding frames into filtergraph
that reached EOF by other means, for example EOF is signalled
by other filter in filtergraph or by buffersink.
Building iOS platform with arm64, the compiler has a warning: "instruction movi.2d with immediate #0 may not function correctly on this CPU, converting to movi.16b"
Signed-off-by: xufuji456 <839789740@qq.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
Declaring the function argument as const fixes a warning down the line
that the const parameter is stripped. We don't modify this argument.
Signed-off-by: Leo Izen <leo.izen@gmail.com>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
3d29724c00 removed the doc entry for the option pool while adding
a parser function for it at the same time!
The option remains available and undeprecated.
Fixes trac #10693
This fixes corrupted audio for applications relying on ch_layout when
codec downmixing is active.
Signed-off-by: Geoffrey McRae <geoff@hostfission.com>
Signed-off-by: James Almer <jamrial@gmail.com>
There are many kinds of detection DNN model and they have different
preprocess and postprocess methods. To support more models,
"model_type" option is added to help to choose preprocess and
postprocess function.
Signed-off-by: Wenbin Chen <wenbin.chen@intel.com>
Reviewed-by: Guo Yejun <yejun.guo@intel.com>
This reverts commit bec6dfcd5c.
The patch is NOP since ffurl_open_whitelist copy options from parent
automatically.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Terminating the whole checkasm process is not very helpful. This will
report if an illegal instruction occurs while executing a tested
function. This is a common occurrence whilst developping RISC-V
assembler, due to the compatibility between vector configuration and
instruction done at run-time.
The input is laid out in 16 segments, of which 13 actually need to be
loaded. There are no really efficient ways to deal with this:
1) If we load 8 segments wit unit stride, then narrow to 16 segments with
right shifts, we can only get one half-size vector per segment, or just 2
elements per vector (EMUL=1/2) - at least with 128-bit vectors.
This ends up unsurprisingly about as fas as the C code.
2) The current approach is to load with strides. We keep that approach,
but improve it using three 4-segmented loads instead of 12 single-segment
loads. This divides the number of distinct loaded addresses by 4.
3) A potential third approach would be to avoid segmentation altogether
and splat the scalar coefficient into vectors. Then we can use a
unit-stride and maximum EMUL. But the downside then is that we have to
multiply the 3 (of 16) unused segments with zero as part of the
multiply-accumulate operations.
In addition, we also reuse vectors mid-loop so as to increase the EMUL
from 1 to 2, which also improves performance a little bit.
Oeverall the gains are quite small with the device under test, as it does
not deal with segmented loads very well. But at least the code is tidier,
and should enjoy bigger speed-ups on better hardware implementation.
Before:
ps_hybrid_analysis_c: 1819.2
ps_hybrid_analysis_rvv_f32: 1037.0 (before)
ps_hybrid_analysis_rvv_f32: 990.0 (after)
This stores the constant coefficients deinterleaved, so that they can be
loaded directly with NF=0. Unfortunately, we cannot optimise loading the
input, due to insufficient memory alignment (not 32-bit).
Before:
g722_apply_qmf_c: 82.5
g722_apply_qmf_rvv_i32: 78.2
After:
g722_apply_qmf_c: 82.5
g722_apply_qmf_rvv_i32: 65.2
Gathers are (unsurprisingly) a notable exception to the rule that R-V V
gets faster with larger group multipliers. So roll the function to speed
it up.
Before:
vector_fmul_reverse_fixed_c: 2840.7
vector_fmul_reverse_fixed_rvv_i32: 2430.2
After:
vector_fmul_reverse_fixed_c: 2841.0
vector_fmul_reverse_fixed_rvv_i32: 962.2
It might be possible to further optimise the function by moving the
reverse-subtract out of the loop and adding ad-hoc tail handling.
When encoders don't support global header like MediaCodec, FLV
muxer needs to add extract_extradata bsf automatically. The codec
list doesn't include VP9 since it's not supported by
extract_extradata.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Validate that a hw_frames_ctx is available before using it for
the AVHWAccel.free_frame_priv callback, and don't require it to
be present when the callback is not in use by the HWAccel.
v2: check for free_frame_priv (Hendrik)
v3: return EINVAL (Christoph Reiter)
v4: better commit message (Hendrik)
v5: fix typo with missed frames_ctx (Lynne)
See[1]: https://github.com/msys2/MINGW-packages/pull/19050
Fixes: be07145109 ("avcodec: add AVHWAccel.free_frame_priv callback")
CC: Lynne <dev@lynne.ee>
CC: Christoph Reiter <reiter.christoph@gmail.com>
Signed-off-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
For fate-h264_mp4toannexb_ticket5927 and
fate-h264_mp4toannexb_ticket5927_2, they work by accident
previously. The sample file has two 'avc1' entries, and video
samples use the second one. It means packets should be decoded with
new extradata in side data. Before this patch, only extradata was
kept in the output, new extradata has been dropped. The output can
be decoded because the two extradata are almost the same, except
level indication. This patch fixed the issue, and add another
fate test.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
If there is a single group of SPS/PPS before an IDR frame, but no
SPS/PPS after that, we will miss the chance to reset
idr_sps_seen/idr_pps_seen. No SPS/PPS are inserted afterwards.
This patch saves in-band SPS/PPS and insert them before IDR frames
when necessary.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
start_code_size depends on whether PS comes from out-of-band or
in-band. Make the code more readable.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
This avoids SEI and IDR recovery flags affecting each other
Also eliminate litteral numbers from recovery handling
This should make the code clearer
Improves: tickets/4738/tickets_cut.ts
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
This is only supported at compilation time. If Zfhmin is supported, then
conversions are fast, which is what the flag is used for. At this time,
run-tiem detection is not possible, as in not supported by Linux. But even
if it were, the current FFmpeg approach seems unable to deal with it (same
problem as on x86, really).
In this case, the inner loop computing the scalar product can be reduced
to just one multiplication and one sum even with 128-bit vectors. The
result is a lot simpler, but also brings more modest performance gains:
flac_lpc_16_13_c: 15241.0
flac_lpc_16_13_rvv_i32: 11230.0
flac_lpc_16_16_c: 17884.0
flac_lpc_16_16_rvv_i32: 12125.7
flac_lpc_16_29_c: 27847.7
flac_lpc_16_29_rvv_i32: 10494.0
flac_lpc_16_32_c: 30051.5
flac_lpc_16_32_rvv_i32: 10355.0
The entire set of 32 coefficients and corresponding past 32 samples can
fit in a single vector (with LMUL=8) exactly, but... since widening
double the needed vector sizes, we still end up too short with 128-bit
vectors. This adds a very simple version for future 256+-bit hardware,
and for pred_orders values up to 16, and a bit more involved loop for
for 128-bit hardware with pred_orders between 17 and 32.
With 128-bit hardware, the benchmarks look like this:
flac_lpc_32_13_c: 30152.0
flac_lpc_32_13_rvv_i32: 10244.7
flac_lpc_32_16_c: 37314.2
flac_lpc_32_16_rvv_i32: 10126.2
flac_lpc_32_29_c: 61910.0
flac_lpc_32_29_rvv_i32: 14495.2
flac_lpc_32_32_c: 68204.0
flac_lpc_32_32_rvv_i32: 13273.7
decorrelate_ls, _rs and _ms are decorrelate[1], [2] and [3] respectively.
The code ended up testing indep ([0]) as twice, ms never, and misnaming
the other two.
Segmented loads are slow, so here we use unit-strided load and narrowing shifts.
c910:
fcmul_add_c: 2179
fcmul_add_rvv_f64: 1652
c908:
fcmul_add_c: 4891.2
fcmul_add_rvv_f64: 2399.5
Signed-off-by: Rémi Denis-Courmont <remi@remlab.net>
This commit adds support for VP8 bitstream read methods to the cbs
codec. This enables the trace_headers bitstream filter to support VP8,
in addition to AV1, H.264, H.265, and VP9. This can be useful for
debugging VP8 stream issues.
The CBS VP8 implements a simple VP8 boolean decoder using GetBitContext
to read the bitstream.
Only the read methods `read_unit` and `split_fragment` are implemented.
The write methods `write_unit` and `assemble_fragment` return the error
code AVERROR_PATCHWELCOME. This is because CBS VP8 write is unlikely to
be used by any applications at the moment. The write methods can be
added later if there is a real need for them.
TESTS: ffmpeg -i fate-suite/vp8/frame_size_change.webm -vcodec copy
-bsf:v trace_headers -f null -
Signed-off-by: Jianhui Dai <jianhui.j.dai@intel.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
This commit exports the `vp8_token_update_probs` variable to internal
library scope to facilitate its reuse within the library.
Signed-off-by: Jianhui Dai <jianhui.j.dai@intel.com>
Signed-off-by: Ronald S. Bultje <rsbultje@gmail.com>
Better performance can probably be achieved with a more intricate
unrolled loop, but this is a start:
add_hfyu_left_pred_bgr32_c: 15084.0
add_hfyu_left_pred_bgr32_rvv_i32: 10280.2
This would actually be cleaner with the RISC-V P extension, but that is
not ratified yet (I think?) and usually not supported if V is supported.
Current code tracks min/max pts for each stream separately; then when
the file ends it combines them with last frame's duration to compute the
total duration of each stream; finally it selects the longest of those
durations as the file duration.
This is incorrect - the total file duration is the largest timestamp
difference between any frames, regardless of the stream.
Also change the way the last frame information is reported from decoders
to the muxer - previously it would be just the last frame's duration,
now the end timestamp is sent, which is simpler.
Changes the result of the fate-ffmpeg-streamloop-transcode-av test,
where the timestamps are shifted slightly forward. Note that the
matroska demuxer does not return the first audio packet after seeking
(due to buggy interaction betwen the generic code and the demuxer), so
there is a gap in audio.
This ensures that tq_receive() will always return EOF after all streams
were receive-finished, even though the sending side might not have
closed them yet. This may allow the receiver to avoid manually tracking
which streams it has already closed.
It is common for subtitle streams to have large gaps between packets.
When the caller is interleaving packets from multiple files, it can
easily happen that two successive subtitle packets trigger this limit,
even though no excessive buffering is happening.
Should fix#7064
With explicit unrolling, we can skip half of the sign bit flips, and
the compiler is then better able to optimise the scalar loop:
predictor_c: 31376.0 (before)
predictor_c: 23703.0 (after)
This is restricted to 128-bit vectors as larger vector sizes could read
past the end of the noise array. Support for future hardware with larger
vector sizes is left for some other time.
hf_apply_noise_0_c: 2319.7
hf_apply_noise_0_rvv_f32: 1229.0
hf_apply_noise_1_c: 2539.0
hf_apply_noise_1_rvv_f32: 1244.7
hf_apply_noise_2_c: 2319.7
hf_apply_noise_2_rvv_f32: 1232.7
hf_apply_noise_3_c: 2541.2
hf_apply_noise_3_rvv_f32: 1244.2
The tested functions treat s_m[i] == 0 as a special case. Other than
that, the functions are slightly complicated vector additions.
This actually makes the zero case happen pseudorandomly.
In my personal opinion, we should not need to support unaligned YUY2
pixel maps. They should always be aligned to at least 32 bits, and the
current code assumes just 16 bits. However checkasm does test for
unaligned input bitmaps. QEMU accepts it, but real hardware dose not.
In this particular case, we can at the same time improve performance and
handle unaligned inputs, so do just that.
uyvytoyuv422_c: 104379.0
uyvytoyuv422_c: 104060.0
uyvytoyuv422_rvv_i32: 25284.0 (before)
uyvytoyuv422_rvv_i32: 19303.2 (after)
This saves three scratch registers and three instructions per line. The
performance gains are mostly negligible. The main point is to free up
registers for further rework.
Fixes OOM when caller keeps adding frames into filtergraph
that reached EOF by other means, for example EOF is signalled
by other filter in filtergraph or by buffersink.
This commit fixes issue with missing SPS/PPS headers in video
encoded by AMF AV1 encoder.
Missing headers leads to broken seek in MPV video player.
Default value for property AV1_HEADER_INSERTION_MODE shouldn't be setup
to NONE (no headers insertion). We need to skip definition of this property,
because default value depends on USAGE property.
Signed-off-by: Dmitrii Ovchinnikov <ovchinnikov.dmitrii@gmail.com>
With 5 accumulator vectors and 6 inputs, this can only use LMUL=2.
Also the number of vector loop iterations is small, just 5 on 128-bit
vector hardware.
The vector loop is somewhat unusual in that it processes data in
descending memory order, in order to save on vector slides:
in descending order, we can extract elements to carry over to the next
iteration from the bottom of the vectors directly. With ascending order
(see in the Opus postfilter function), there are no ways to get the top
elements directly. On the downside, this requires the use of separate
shift and sub (the would-be SH3SUB instruction does not exist), with
a small pipeline stall on the vector load address.
The edge cases in scalar are done in scalar as this saves on loads
and remains significantly faster than C.
autocorrelate_c: 669.2
autocorrelate_rvv_f32: 421.0
Given the size of the data set, strided memory accesses cannot be avoided.
We can still do better than the current code.
ps_hybrid_synthesis_deint_c: 12065.5
ps_hybrid_synthesis_deint_rvv_i32: 13650.2 (before)
ps_hybrid_synthesis_deint_rvv_i64: 8181.0 (after)
Segmented loads may be slower than not. So this advantageously uses a
unit-strided load and narrowing shifts instead.
Before:
ps_add_squares_c: 60757.7
ps_add_squares_rvv_f32: 22242.5
After:
ps_add_squares_c: 60516.0
ps_add_squares_rvv_i64: 17067.7
Fixes: Erroneous HTTP POST instead of HTTP PUT for WebVTT HLS variant playlists.
Signed-off-by: Léon Spaans <leons@gridpoint.nl>
Signed-off-by: Steven Liu <lq@chinaffmpeg.org>
PGMYUV seems to be always limited range. This was a format originally
invented by FFmpeg at a time when YUVJ distinguished limited from full
range YUV, and this codec never appeared to output YUVJ in any
circumstance, so hard-coding limited range preserves the status quo.
The other formats are explicitly documented to be full range RGB/gray
formats. That said, don't tag them yet, due to outstanding bugs w.r.t
grayscale formats and color range handling.
This change in behavior updates a bunch of FATE tests in trivial ways
(added tagging being the only difference).
When using vf_scale to force a specific output color space, also tag
this on the AVFrame. (Mirroring existing logic for output range)
Move the sanity fix for RGB after the new assignment, to avoid leaking
bogus YUV colorspace metadata for RGB spaces.
No need to write a custom string parser when we can just use an integer
option with preset values. The various bits of fallback logic are wholly
redundant with equivalent logic already inside sws_getCoefficients.
Note: I disallowed setting 'out_color_matrix=auto', because this does
not do anything meaningful in the current code (just hard-codes
AVCOL_SPC_BT470BG fallback).
The documentation states that invalid entries default to SWS_CS_DEFAULT.
A value of 0 is not a valid SWS_CS_*, yet the code incorrectly
hard-codes it to BT.709 coefficients instead of SWS_CS_DEFAULT.
This was a complete hack seemingly designed to work around a different
bug, which was fixed in the previous commit. As such, there is no more
reason not to do this, as it simply breaks changing color range in
sws_setColorspaceDetails for no reason.
More commonly, this fixes the case of sws_setColorspaceDetails after
sws_getContext, since the latter implies sws_init_context.
The problem here is that sws_init_context sets up the range conversion
and fast path tables based on the values of srcRange/dstRange at init
time. This may result in locking in a "wrong" path (either using
unscaled fast path when range conversion later required, or using
scaled slow path when range conversion becomes no longer required).
There are two way outs:
1. Always initialize range conversion and unscaled converters, even if
they will be unused, and extend the runtime check.
2. Re-do initialization if the values change after
sws_setColorspaceDetails.
I opted for approach 1 because it was simpler and easier to reason
about.
Reword the av_log message to make it clear that this special converter
is not necessarily used, depending on whether or not there is range
conversion or YUV matrix conversion going on.
The check if drawing needs to be initialized and supported formats
should be drawable ones was flawed, as pad_stop/pad_start is only
populated from stop_duration/start_duration after these checks.
To fix that, check the _duration variants as well and for better
readability and maintainability break the check out into its own
helper.
Fixes a regression from 86b252ea9dFix#10621
Signed-off-by: Anton Khirnov <anton@khirnov.net>
We calculated offsets as pairs, but addressed them in the shader
as single float values, while reading them as ivec2s.
Also removes unused code (was provisionally added if cooperative matrices
could be used, but that turned out to be impossible).
VK_KHR_PORTABILITY_ENUMERATION_EXTENSION_NAME is required on macOS,
and VK_INSTANCE_CREATE_ENUMERATE_PORTABILITY_BIT_KHR flag should
be set.
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
Texinfo 7.0 produces quite different HTML to Texinfo 6.8. Without
this change, enumerated option flags (i.e. Possible values of x
are...) render as white text on a white background with Texinfo 7.0
and are unreadable. This change removes a style for the selector
`.table .table` which causes the background to turn white for these
elements. As far as I can tell, it is not actually used anywhere in
files generated by Texinfo 6.8.
Signed-off-by: Frank Plowman <post@frankplowman.com>
Resolves trac ticket #10636 (http://trac.ffmpeg.org/ticket/10636).
Texinfo 7.0, released in November 2022, changed the names of various
functions. Compiling docs with Texinfo 7.0 resulted in warnings and
improperly formatted documentation. More old names appear to have
been removed in Texinfo 7.1, released October 2023, which causes docs
compilation to fail.
This commit addresses the issue by adding logic to switch between the old
and new function names depending on the Texinfo version. Texinfo 6.8
produces identical documentation before and after the patch.
CC
https://www.mail-archive.com/debian-bugs-dist@lists.debian.org/msg1938238.htmlhttps://bugs.gentoo.org/916104
Signed-off-by: Frank Plowman <post@frankplowman.com>
By intent, and in practice, the "active contributor" criterion applies
to the person authoring the commits, not the one pushing them into the
repository.
When operating on large blocks of data it's common to repeatedly use
an instruction on multiple registers. Using the REPX macro makes it
easy to quickly write dense code to achieve this without having to
explicitly duplicate the same instruction over and over.
For example,
REPX {paddw x, m4}, m0, m1, m2, m3
REPX {mova [r0+16*x], m5}, 0, 1, 2, 3
will expand to
paddw m0, m4
paddw m1, m4
paddw m2, m4
paddw m3, m4
mova [r0+16*0], m5
mova [r0+16*1], m5
mova [r0+16*2], m5
mova [r0+16*3], m5
Commit taken from x264:
https://code.videolan.org/videolan/x264/-/commit/6d10612ab0007f8f60dd2399182efd696da3ffe4
Signed-off-by: Frank Plowman <post@frankplowman.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
This uses a more traditional approach allowing up processing of up to
period minus two elements per iteration. This also allows the algorithm
to work for all and any vector length.
As the T-Head C908 device under test can load 16 elements loop, there is
unsurprisingly a little performance drop when the period is minimal and
the parallelism is capped at 13 elements:
Before:
postfilter_15_c: 21222.2
postfilter_15_rvv_f32: 22007.7
postfilter_512_c: 20189.7
postfilter_512_rvv_f32: 22004.2
postfilter_1022_c: 20189.7
postfilter_1022_rvv_f32: 22004.2
After:
postfilter_15_c: 20189.5
postfilter_15_rvv_f32: 7057.2
postfilter_512_c: 20189.5
postfilter_512_rvv_f32: 5667.2
postfilter_1022_c: 20192.7
postfilter_1022_rvv_f32: 5667.2
As in the aligned case, we can use VLSE64.V, though the way of doing so
gets more convoluted, so the performance gains are more modest:
get_pixels_unaligned_c: 126.7
get_pixels_unaligned_rvv_i32: 145.5 (before)
get_pixels_unaligned_rvv_i64: 62.2 (after)
For the reference, those are the aligned benchmarks (unchanged) on the
same T-Head C908 hardware:
get_pixels_c: 126.7
get_pixels_rvi: 85.7
get_pixels_rvv_i64: 33.2
Without this flag, timestamps were embedded into the final
binary if CUDA was enabled.
Signed-off-by: Rob Hall <robxnanocode@outlook.com>
Signed-off-by: Anton Khirnov <anton@khirnov.net>
We currently mostly do not empty the MMX state in our MMX
DSP functions; instead we only do so before code that might
be using x87 code. This is a violation of the System V i386 ABI
(and maybe of other ABIs, too):
"The CPU shall be in x87 mode upon entry to a function. Therefore,
every function that uses the MMX registers is required to issue an
emms or femms instruction after using MMX registers, before returning
or calling another function." (See 2.2.1 in [1])
This patch does not intend to change all these functions to abide
by the ABI; it only does so for ff_pixelutils_sad_8x8_mmxext, as this
function can by called by external users, because it is exported
via the pixelutils API. Without this, the following fragment will
assert (on x86/x64):
uint8_t src1[8 * 8], src2[8 * 8];
av_pixelutils_sad_fn fn = av_pixelutils_get_sad_fn(3, 3, 0, NULL);
fn(src1, 8, src2, 8);
av_assert0_fpu();
[1]: https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/intel386-psABI-1.1.pdf
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is cleaner, but it is also a workaround for when
the header exists, but cannot be compiled.
This will happen when the compiler has no inline asm
support.
Possibly the configure check should be improved as well.
The test can currently pass when _Pragma is not supported, since
_Pragma might be treated as a implicitly declared function.
This happens e.g. with tinycc.
Extending the check to 2 pragmas both matches the actual use
better and avoids this misdetection.
The `-caption_encoding` option was reported as having a default value of
'ass', whereas it's actually 'auto'.
Signed-off-by: zheng qian <xqq@xqq.im>
Signed-off-by: Gyan Doshi <ffmpeg@gyani.pro>
With 128-bit vectors, this is mostly pointless but also harmless.
Performance gains should be more noticeable with larger vector sizes.
neg_odd_64_c: 76.2
neg_odd_64_rvv_i64: 74.7
Up until now each thread had its own buffer pool for extradata
buffers when using frame-threading. Each thread can have at most
three references to extradata and in the long run, each thread's
bufferpool seems to fill up with three entries. But given
that at any given time there can be at most 2 + number of threads
entries used (the oldest thread can have two references to preceding
frames that are not currently decoded and each thread has its own
current frame, but there can be no references to any other frames),
this is wasteful. This commit therefore uses a single buffer pool
that is synced across threads.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Up until now, the VAAPI encoder uses fake data with the
AVBuffer-API: The data pointer does not point to real memory,
but is instead just a VABufferID converted to a pointer.
This has probably been copied from the VAAPI-hwcontext-API
(which presumably does it to avoid allocations).
This commit changes this without causing additional allocations
by switching to the RefStruct-pool API. This also fixes an
unchecked av_buffer_ref().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It involves less allocations, in particular no allocations
after the entry has been created. Therefore creating a new
reference from an existing one can't fail and therefore
need not be checked. It also avoids indirections and casts.
Also note that nvdec_decoder_frame_init() (the callback
to initialize new entries from the pool) does not use
atomics to read and replace the number of entries
currently used by the pool. This relies on nvdec (like
most other hwaccels) not being run in a truely frame-threaded
way.
Tested-by: Timo Rothenpieler <timo@rothenpieler.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It involves less allocations and therefore has the nice property
that deriving a reference from a reference can't fail,
simplifying hevc_ref_frame().
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It involves less allocations and therefore has the nice property
that deriving a reference from a reference can't fail.
This allows for considerable simplifications in
ff_h264_(ref|replace)_picture().
Switching to the RefStruct API also allows to make H264Picture
smaller, because some AVBufferRef* pointers could be removed
without replacement.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Very similar to the AVBufferPool API, but with some differences:
1. Reusing an already existing entry does not incur an allocation
at all any more (the AVBufferPool API needs to allocate an AVBufferRef).
2. The tasks done while holding the lock are smaller; e.g.
allocating new entries is now performed without holding the lock.
The same goes for freeing.
3. The entries are freed as soon as possible (the AVBufferPool API
frees them in two batches: The first in av_buffer_pool_uninit() and
the second immediately before the pool is freed when the last
outstanding entry is returned to the pool).
4. The API is designed for objects and not naked buffers and
therefore has a reset callback. This is called whenever an object
is returned to the pool.
5. Just like with the RefStruct API, custom allocators are not
supported.
(If desired, the FFRefStructPool struct itself could be made
reference counted via the RefStruct API; an FFRefStructPool
would then be freed via ff_refstruct_unref().)
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This makes the code more testable as uninitialized fields are 0
and not random values from the last call
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
These entries do not correspond to VLC symbols that can be used
they do corrupt various variables like min/max bits
This also no longer assumes that there is a single non subtable
entry
Probably fixes some infinite loops too
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Fixes: signed integer overflow: 1900031961 + 553590817 cannot be represented in type 'int'
Fixes: 63061/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_APE_fuzzer-5166188298371072
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
The check is based on not infinite looping. It is likely
a more strict check can be done
Fixes: Infinite loop
Fixes: 62473/clusterfuzz-testcase-minimized-ffmpeg_BSF_EVC_FRAME_MERGE_fuzzer-5719883750703104
Fixes: 62765/clusterfuzz-testcase-minimized-ffmpeg_dem_EVC_fuzzer-6448531252314112
Fixes: 63378/clusterfuzz-testcase-minimized-ffmpeg_dem_MPEGPS_fuzzer-6504993844494336
Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Reviewed-by: "Dawid Kozinski/Multimedia (PLT) /SRPOL/Staff Engineer/Samsung Electronics" <d.kozinski@samsung.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
Everything besides VLC.table is basically write-only
and even VLC.table can be removed by accessing the
underlying table directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For all VLCs here, the number of bits of the VLC is write-only,
because it is hardcoded at the call site. Therefore one can replace
these VLC structures with the only thing that is actually used:
The pointer to the VLCElem table. And in most cases one can even
avoid this.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Everything besides VLC.table is basically write-only
and even VLC.table can be removed by accessing the
underlying tables directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For some VLCs here, the number of bits of the VLC is
write-only, because it is hardcoded at the call site.
Therefore one can replace these VLC structures with
the only thing that is actually used: The pointer
to the VLCElem table.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This allows to avoid the relocations inherent in an array
to individual tables; it also reduces padding.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For all VLCs here, the number of bits of the VLC is
write-only, because it is hardcoded at the call site.
Therefore one can replace these VLC structures with
the only thing that is actually used: The pointer
to the VLCElem table.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
ff_ps_init() initializes some tables for AAC parametric stereo
and some of them are only valid for the fixed- or floating-point
decoder, whereas others (namely VLCs) are valid for both.
The latter are therefore initialized by ff_ps_init_common()
and because the two versions of ff_ps_init() can be run
concurrently, it is guarded by an AVOnce.
Yet now that there is ff_aacdec_common_init_once() there is
a better way to do this: Call ff_ps_init_common()
from ff_aacdec_common_init_once(). That way there is no need
to guard ff_ps_init_common() by an AVOnce any more.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This allows to avoid the relocations inherent in a table
to individual tables; it also reduces padding.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
It allows to replace code tables of type uint32_t or uint16_t
by symbols of type uint8_t. It is also faster.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For all VLCs here, the number of bits of the VLC is
write-only, because it is hardcoded at the call site.
Therefore one can replace these VLC structures with
the only thing that is actually used: The pointer
to the VLCElem table.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The VLCs, their init code and the tables used for initialization
are currently duplicated for the floating- and fixed-point decoders.
This commit stops doing so and moves this stuff to aacdec_common.c.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For all VLCs here, the number of bits of the VLC is
write-only, because it is hardcoded at the call site.
Therefore one can replace these VLC structures with
the only thing that is actually used: The pointer
to the VLCElem table. And in some cases one can even
avoid this.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Due to making the decode frames context use the coded size, the
filter started to display those artifacts as it reused the input frame's size.
Change it to instead output the real image size for images, not the input.
h->long_ref isn't guaranteed to be contiguously filled. Use the approach
from both vaapi_h264 and vdpau_h264 which goes through the 16 frames in
h->long_ref to find the LTR entries.
Fixes MR2_MW_A.264 from JVT-AVC_V1.
The fixed-point decoder actually does not use the floating-point
tables initialized by ff_aac_tableinit() at all. So don't
initialize them for it; instead merge initializing these tables
into ff_aac_float_common_init() which is already the function
for the common static initializations of the floating-point
AAC decoder and the (also floating-point) AAC encoder.
Doing so saves also one AVOnce.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
They (as well as their init code) are currently duplicated
for the floating- and fixed-point decoders.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For all VLCs here, the number of bits of the VLC is write-only,
because it is hardcoded at the call site. Therefore one can replace
these VLC structures with the only thing that is actually used:
The pointer to the VLCElem table. And in some cases one can even
avoid this.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For all VLCs here, the number of bits of the VLC is
write-only, because it is hardcoded at the call site.
Therefore one can replace these VLC structures with
the only thing that is actually used: The pointer
to the VLCElem table. And in some cases one can even
avoid this.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Everything besides VLC.table is basically write-only
and even VLC.table can be removed by accessing the
underlying table directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Everything besides VLC.table is basically write-only
and even VLC.table can be removed by accessing the
underlying table directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For all VLCs here, the number of bits of the VLC is
write-only, because it is hardcoded at the call site.
Therefore one can replace these VLC structures with
the only thing that is actually used: The pointer
to the VLCElem table. And in some cases one can even
avoid this.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Everything besides VLC.table is basically write-only
and even VLC.table can be removed by accessing the
underlying table directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Everything besides VLC.table is basically write-only
and even VLC.table can be removed by accessing the
underlying table directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Everything besides VLC.table is basically write-only
and even VLC.table can be removed by accessing the
underlying table directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Everything besides VLC.table is basically write-only
and even VLC.table can be removed by accessing the
underlying table directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Everything besides VLC.table is basically write-only
and only VLC.table needs to be retained.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Everything besides VLC.table is basically write-only
and even VLC.table can be removed by accessing the
underlying table directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Everything besides VLC.table is basically write-only
and even VLC.table can be removed by accessing the
underlying table directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Everything besides VLC.table is basically write-only
and even VLC.table can be removed by accessing the
underlying table directly.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Said object is only allowed to be modified during its
initialization and is immutable afterwards.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For most VLCs here, the number of bits of the VLC is
write-only, because it is hardcoded at the call site.
Therefore one can replace these VLC structures with
the only thing that is actually used: The pointer
to the VLCElem table.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These VLCs are very big: The VP3 one have 164382 elements
but due to the overallocation enough memory for 313344 elements
are allocated (1.195 MiB with sizeof(VLCElem) == 4);
for VP4 the numbers are very similar, namely 311296 and 164392
elements. Since 1f4cf92cfb, each
frame thread has its own copy of these VLCs.
This commit fixes this by sharing these VLCs across threads.
The approach used here will also make it easier to support
stream reconfigurations in case of frame-multithreading
in the future.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.
Also combine the ff_msmp4_dc_(luma|chroma)_vlcs as well
as the tables used to generate them to simplify the code.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
Of all these VLCs here, only VLC.table was really used
after init, so use the ff_vlc_init_tables API
to get rid of them.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
These are quite small and therefore force reloads
that can be avoided by modest increases in the number of bits used.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
This is especially important for frame-threaded decoders like
this one, because up until now each thread had an identical
copy of all these VLC tables.
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
For lots of static VLCs, the number of bits is not read from
VLC.bits, but rather a compile-constant that is hardcoded
at the callsite of get_vlc2(). Only VLC.table is ever used
and not using it directly is just an unnecessary indirection.
This commit adds helper functions and macros to avoid the VLC
structure when initializing VLC tables; there are 2x2 functions:
Two choices for init_sparse or from_lengths and two choices
for "overlong" initialization (as used when multiple VLCs are
initialized that share the same underlying table).
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
The code above this does a whitelist on desc->flags, which now includes
the (disallowed) AV_PIX_FMT_FLAG_XYZ for XYZ formats. So there is no
more need for a separate check, here.
There are already several places in the codebase that match desc->name
against "xyz", and many downstream clients replicate this behavior.
I have no idea why this is not just a flag.
Motivated by my desire to add yet another check for XYZ to the codebase,
and I'd rather not keep copy/pasting a string comparison hack.
These are not supported by the drawing functions at all, and were
incorrectly advertised as supported in the past.
Note: This check is added only to separate the logic change from the API
change in the following commit, and will be removed again after it
becomes redundant.
Clang versions before 17 (Xcode versions up to and including 15.0)
had a very annoying bug in its behaviour of the ".arch" directive
in assembly. If the directive only contained a level, such as
".arch armv8.2-a", it did validate the name of the level, but it
didn't apply the level to what instructions are allowed. The level
was applied if the directive contained an extra feature enabled,
such as ".arch armv8.2-a+crc" though. It was also applied on the
next ".arch_extension" directive.
This bug, combined with the fact that the same versions of Clang
didn't support the dotprod/i8mm extension names in either
".arch <level>+<feature>" or in ".arch_extension", could lead to
unexepcted build failures.
As the dotprod/i8mm extensions couldn't be enabled dynamically
via the ".arch_extension" directive, someone building ffmpeg could
try to enable them by configuring their build with
--extra-cflags="-march=armv8.6-a".
During configure, we test for support for the i8mm instructions
like this:
# Built with -march=armv8.6-a
.arch armv8.2-a # Has no visible effect here
#.arch_extension i8mm # Omitted as the extension name isn't known
usdot v0.4s, v0.16b, v0.16b
# Successfully assembled as armv8.6-a is the effective level,
# and i8mm is enabled implicitly in armv8.6-a.
Thus, we would enable assembling those instructions. However if
we later check for another extension, such as sve (which those
versions of Clang actually do support), we can later run into the
following situation when building actual code:
# Built with -march=armv8.6-a
.arch armv8.2-a # Has no visible effect here
#.arch_extension i8mm # Omitted as the extension name isn't known
.arch_extension sve # Included as "sve" is as supported extension name
# .arch_extension effectively activates the previous .arch directive,
# so the effective level is armv8.2-a+sve now.
usdot v0.4s, v0.16b, v0.16b
# Fails to build the instructions that require i8mm. Despite the
# configure check, the unrelated ".arch_extension sve" directive
# breaks the functionality of the i8mm feature.
This patch avoids this situation:
- By adding a dummy feature such as "+crc" on the .arch directive
(if supported), we make sure that it does get applied immediately,
avoiding it taking effect spuriously at a later unrelated
".arch_extension" directive.
- By checking for higher arch levels such as armv8.4-a and armv8.6-a,
we can assemble the dotprod and i8mm extensions without the user
needing to pass -march=armv8.6-a. This allows using the dotprod/i8mm
codepaths via runtime detection while keeping the binary runnable
on older versions. I.e. this enables the i8mm codepaths on Apple M2
machines while built with Xcode's Clang.
TL;DR: Enable the I8MM extensions for Apple M2 without the user needing
to do a custom configuration; avoid potential build breakage if a user
does such a custom configuration.
Once Xcode versions that have these issues fixed are prevalent, we
can consider reverting this change.
Signed-off-by: Martin Storsjö <martin@martin.st>
If the scan lines are aligned, we can load each row as a 64-bit value,
thus avoiding segmentation. And then we can factor the conversion or
subtraction.
In principle, the same optimisation should be possible for high depth,
but would require 128-bit elements, for which no FFmpeg CPU flag
exists.
This should hopefully clarify that the option only affects MSZ
full-width characters, and not all full-width ASCII. Additionally,
this matches the prefix with the upstream option.
Signed-off-by: TADANO Tokumei <aimingoff@pc.nifty.jp>
This patch adds two MSZ (Middle Size; half width) character
related options, mapping against newly added upstream
functionality:
* `replace_msz_japanese`, which was introduced in version 1.0.1
of libaribcaption.
* `replace_msz_glyph`, which was introduced in version 1.1.0
of libaribcaption.
The latter option improves bitmap type rendering if specified
fonts contain half-width glyphs (e.g., BIZ UDGothic), even
if both ASCII and Japanese MSZ replacement options are set
to false.
As these options require newer versions of libaribcaption, the
configure requirement has been bumped accordingly.
Signed-off-by: TADANO Tokumei <aimingoff@pc.nifty.jp>
On some environments, a `bool` variable is of smaller size than `int`.
As AV_OPT_TYPE_BOOL is internally handled as sizeof(int), if a `bool`
option was set on such an environment, the memory of following
variables would be filled. Additionally, set values may be destroyed
by av_opt_copy().
Signed-off-by: TADANO Tokumei <aimingoff@pc.nifty.jp>
@@ -26,7 +26,9 @@ The General Assembly is sovereign and legitimate for all its decisions regarding
The General Assembly is made up of active contributors.
Contributors are considered "active contributors" if they have pushed more than 20 patches in the last 36 months in the main FFmpeg repository, or if they have been voted in by the GA.
Contributors are considered "active contributors" if they have authored more than 20 patches in the last 36 months in the main FFmpeg repository, or if they have been voted in by the GA.
The list of active contributors is updated twice each year, on 1st January and 1st July, 0:00 UTC.
Additional members are added to the General Assembly through a vote after proposal by a member of the General Assembly. They are part of the GA for two years, after which they need a confirmation by the GA.
av_log(NULL, AV_LOG_WARNING, "Multiple %s options specified for stream %d, only the last option '-%s%s%s "SPECIFIER_OPT_FMT_##type"' will be used.\n",\
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.