Compare commits

...

1241 Commits

Author SHA1 Message Date
David Korczynski 444f2cf047 avfilter/boxblur: Fix off by one errors
Fixes: ada-2-poc.mkv

Found-by: Claude and Ada Logics. This issue was found by Anthropic from using agents to study security of open source projects, and I am from Ada Logics helping validate the found issues and report to maintainers.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-16 17:22:45 +00:00
James Almer ef3ff9a73d avformat/iamf_writer: reject unset frame size
The specification states that nb_samples in codec config must not be zero.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-05-16 13:55:23 -03:00
James Almer 412aa48868 fftools/ffmpeg_mux_init: propagate the muxer request for fixed frame size
Signed-off-by: James Almer <jamrial@gmail.com>
2026-05-16 13:55:23 -03:00
James Almer c0bdc3b62a avformat/avformat: add an AVOutputFormat capability flag to signal fixed frame size is needed.
And set it on the IAMF muxer.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-05-16 13:55:22 -03:00
James Almer 8e162daf9a fftools/ffmpeg_enc: honor the user request for fixed size frames
And set it also for non-variable frame size encoders.

FATE changes are the result of passing a frame_size to flac and wavenc
encoders, instead of letting them choose one.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-05-16 13:55:22 -03:00
James Almer 8567345514 tests/fate/lavf-audio: set frame_size on fate-lavf-ogg
This both works around an issue that the following commit reveals (encoding
with 4096 frame_size fails on aarch64 for unknown reasons) and tests setting
frame_size now that it's allowed (ensuring the CLI doesn't overwrite it).

Signed-off-by: James Almer <jamrial@gmail.com>
2026-05-16 13:55:22 -03:00
James Almer 53d46a51fa avcodec/encode: propagate skip samples side data if present
Only for non-delay codecs.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-05-16 13:55:22 -03:00
James Almer 355d05a784 avcodec/encode: report that the padded samples must be discarded
For encoders where we pad the last frame, actually tag the silent samples as
discardable.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-05-16 13:55:22 -03:00
James Almer 7c5df8d34d avformat/matroskaenc: use frame_size to write audio DefaultDuration
Signed-off-by: James Almer <jamrial@gmail.com>
2026-05-16 13:55:22 -03:00
James Almer b1120b1ed8 avcodec: add a flag to force encoders to use fixed size frames
Signed-off-by: James Almer <jamrial@gmail.com>
2026-05-16 13:55:22 -03:00
James Almer bf2695e876 avcodec/pcm-dvdenc: don't allow the user to set frame_size
This is for an upcoming change where the field will become user settable.
Unless a proper check for frame_size is introduced, it's better to just not
allow arbitrary values to be used.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-05-16 13:55:22 -03:00
David Korczynski 08d7646abf avformat/assenc: Add the missing parentheses
Fixes: ada-1-poc.mkv

Found-by: Claude and Ada Logics. This issue was found by Anthropic from using agents to study security of open source projects, and I am from Ada Logics helping validate the found issues and report to maintainers.
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-16 16:53:51 +00:00
Kacper Michajłow 200cbaeb5a avformat/hlsenc: use correct close function for custom io
This is opened by s->io_open().

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-05-16 18:03:26 +02:00
Kacper Michajłow 06ef9a74ea avformat/hlsenc: respect io_open set in AVFormatContext
io_open_default() will call internal impl if needed, don't call it
directly.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-05-16 17:18:51 +02:00
Kacper Michajłow 4cf687b3b1 avformat/dashenc: respect io_open set in AVFormatContext
io_open_default() will call internal impl if needed, don't call it
directly.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-05-16 17:18:51 +02:00
Kacper Michajłow fbc4003642 avformat/dashdec: respect io_open set in AVFormatContext
io_open_default() will call internal impl if needed, don't call it
directly.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-05-16 17:18:51 +02:00
Michael Niedermayer 37c176a2a2 tests/fate/voice: Add fate-g726le-encode
Co-Authored-by: AI
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-16 15:09:52 +00:00
Thai Duong bbdce45fda avcodec/diracdec: Enlarge mctmp to cover the worst-case blheight·ybsep + yblen rows, and break the MC loop when no output rows remain
Fixes: ffmpeg_ANT-2026-02842_dirac-mctmp-heap-overflow

Discovered by Claude (Anthropic). Confirmed and reported by Thai Duong (Calif.io).

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-16 15:03:50 +00:00
Kacper Michajłow dc74fe70b2 avformat/demux: use correct close function for custom io
You may look and think the `AVFMT_FLAG_CUSTOM_IO` check is enough, but it
is not what it seems. This flag means that the user provided a custom
AVIOContext before creating the AVFormatContext, and that it should not be
closed. However, nested sub-demuxers may still open a temporary io, and
those have to be closed using the correct io_close2 function.

You can see 0dcac9c3f0 and
ef01061225 where this flag is cleared for
nested opens to avoid leaking those.

lavf micro version bumped so API users can know if it is safe to use
custom io.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-05-16 11:09:56 +00:00
Stuart Eichert 2aad4fb2e3 Typo: Remove space in 'centiseconds', 'microseconds', and 'nanoseconds'.
According to Chapter 3, Paragraph 2 of the "SI Brochure - 9th ed./version 3.02":

> Prefix symbols are printed in upright typeface, as are unit symbols,
> regardless of the typeface used in the surrounding text and are
> attached to unit symbols without a space between the prefix symbol
> and the unit symbol.

https://www.bipm.org/documents/20126/41483022/SI-Brochure-9-EN.pdf
2026-05-15 18:19:40 -07:00
James Almer 18b83f2d0a tools/zmqsend: close the input FILE
Fixes CVE-2026-30998

Fixes: Resource leak
Found-by: Xinghang Lv
Signed-off-by: James Almer <jamrial@gmail.com>
2026-05-15 20:35:56 -03:00
James Almer 144af8f81a tools/zmqsend: free the AVBprint buffer after using it
Fixes CVE-2026-30999

Fixes: memleak
Found-by: Xinghang Lv
Signed-off-by: James Almer <jamrial@gmail.com>
2026-05-15 20:32:24 -03:00
Link Mauve b5fc215e2d avformat/mods: Return EOF for packets starting at the index offset
Assuming there is no padding between the last packet and the index, this
prevents the index from being parsed as a normal packet, with
nonsensical data.
2026-05-15 19:30:52 +00:00
Link Mauve c4b7a51d35 avformat/mods: Parse the index entries
This lets us seek in the video properly, based on the table at the end
of the files, and has been tested with Suikoden Tierkreis videos.

While at it I’ve also set the duration of the stream, this makes the
progress bar work correctly in mpv.
2026-05-15 19:30:52 +00:00
Andreas Rheinhardt 00ff728512 avfilter/vf_pp7: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-15 20:29:29 +02:00
Andreas Rheinhardt 7971953d29 avfilter/x86/vf_pp7: Port ff_pp7_dctB_mmx to SSE2
Unfortunately a bit slower than the MMX version due to
the impossibility to use memory operands in paddw.
The situation would reverse if ff_dctB_mmx() would have
to issue emms.

dctB_c:                                                  3.7 ( 1.00x)
dctB_mmx:                                                3.3 ( 1.13x)
dctB_sse2:                                               3.6 ( 1.03x)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-15 20:29:29 +02:00
Andreas Rheinhardt fc9e63474f avfilter/vf_pp7dsp: Add restrict
Makes GCC optimize the scalar codepath away.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-15 20:29:29 +02:00
Andreas Rheinhardt 94a49068db tests/checkasm: Add vf_pp7 checkasm test
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-15 20:29:29 +02:00
Andreas Rheinhardt 617a9afeb4 avfilter/vf_pp7: Add proper PP7DSPContext
This is in preparation for checkasm tests for dctB.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-15 20:29:29 +02:00
Andreas Rheinhardt 0a1faa7202 avfilter/vf_pp7: Constify
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-15 20:29:29 +02:00
Niklas Haas 76dc83d9be swscale/ops_dispatch: make ff_sws_ops_compile() output optional
Allows the uops macro generation code to not actually compile any passes.
More generally, this could be used to e.g. test if an op list is supported by
a backend without actually creating the passes.

The `bool first` change is needed because the `input == prev` check no longer
works if we don't actually compile any passes.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-15 18:53:05 +02:00
Niklas Haas 420b1bf368 swscale/ops_dispatch: allow forcing specific ops backend
This will be used eventually when I rewrite checkasm/sw_ops to re-use the
code in ops_dispatch.c instead of hand-rolling the execution layer.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-15 18:53:05 +02:00
Niklas Haas 9021448857 swscale/ops_dispatch: merge ff_sws_ops_compile_backend() and compile()
Passing backend == NULL now loops over the backends as before.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-15 18:53:05 +02:00
Niklas Haas ad17144ce6 swscale/ops_dispatch: move op list print to ff_sws_ops_compile_backend()
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-15 18:53:05 +02:00
Niklas Haas 90669ab52e swscale/ops: move ff_sws_compile_pass() and friends to ops_dispatch.h
This function actually lives in ops_dispatch.c, and doesn't really make
sense in ops.h anymore. We should also move some stuff out of ops_internal.h,
which doesn't depend on any external ops stuff, here.

This allows the backend/compilation-related stuff to co-exist more nicely.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-15 18:53:05 +02:00
Niklas Haas 1d841635a4 swscale/ops: also include scaling ops in ff_sws_enum_op_lists()
Using the configured scaler from the SwsContext implicitly. This does affect
the output of libswscale/tests/sws_ops.c, which now prints about 4x as much
data (taking roughly 4x as long, but still within a second on my machine).

We can make this process a lot faster by forcing SWS_SCALE_POINT as the
scaler, which skips calculating any actual filter weights in favor of
generating a trivial 1-tap filter.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-15 18:53:05 +02:00
Niklas Haas eec9f712f5 swscale/ops: re-use ff_sws_op_list_generate() in ff_sws_enum_op_lists()
The only difference here is an extra ff_sws_add_filters() call, which is
a no-op because src w/h = dst w/h = 16.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-15 18:53:05 +02:00
Niklas Haas cac183f46f swscale/ops: don't silently suppress non-ENOTSUP errors
Matches the behavior to the comment.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-15 18:53:05 +02:00
Niklas Haas dacbf080f3 swscale/ops_chain: simplify ff_sws_op_compile_tables() signature
This no longer accesses prev/next as a result of the `unused` removal, so
the signature can be simplified to just take the op directly.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-15 18:53:05 +02:00
Niklas Haas 064600585e swscale/ops_chain: remove flexible from SWS_OP_MIN/MAX entries
We have other op types that skip checking the data even in non-flexible mode,
so there is a precedent for just leaving away `flexible` for such kernels.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-15 18:53:05 +02:00
Niklas Haas 98c1dbafbe swscale/ops_memcpy: don't depend on ops_backend.h
This is private to the C template based backend.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-15 18:53:05 +02:00
Niklas Haas 62aad4513c swscale/graph: move format conversion logic to formats.c
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-15 18:53:05 +02:00
Niklas Haas 0611abc1bb swscale/graph: move code for adding filters to format.h
Mirroring the precedent established by the other SwsOp-generating functions.
This allows us to re-use it for the uops macro generator.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-15 18:53:05 +02:00
Niklas Haas 9fe0ff3d56 swscale/graph: make _reinit() only call _init(), not _create()
This allows us to preserve the same memory allocation when
reinitializing a graph, which is a nice bonus.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-15 18:53:05 +02:00
Niklas Haas 56305c460c swscale/graph: add ff_sws_graph_alloc() and _init()
As an alternative to the current _create() API.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-15 18:53:05 +02:00
Niklas Haas 5e0dddef80 swscale/graph: move graph uninit logic to helper function
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-15 18:53:05 +02:00
Ramiro Polla d0a84c660a swscale/unscaled: fix rgbToRgbWrapper for non-native-endian formats
The fix from 5fa2a65c11 introduced a regression for non-native-endian
formats (such as rgb565be on a little-endian system).

Reproducible with:
$ ./libswscale/tests/swscale -unscaled 1 -src rgb565be -dst rgb24

Also:
$ ./ffmpeg_g -i /opt/samples/jpegls/128.jls -vf "scale=size=512x512,format=rgb24,scale=flags=neighbor,format=rgb565be" -f rawvideo -vframes 1 -y rgb565be.raw
$ magick -size 512x512 -endian MSB RGB565:rgb565be.raw output.png
$ ./ffplay_g output.png

(note: don't use ffmpeg to convert from rgb565be.raw to output for the
test above since it will hit the same bug and cancel out the error)
2026-05-15 14:21:50 +00:00
Ramiro Polla d812c8b0eb swscale/tests/swscale: log test parameters on loss error
When running with "-v 0", the test parameters were not being printed,
which made it hard to track down which conversion the error referred
to.

Now the test parameters are logged with av_log() when a loss error
happens.
2026-05-15 14:12:48 +00:00
Ramiro Polla 1cc9b15bab swscale/tests/swscale: fix -p option when -flags and/or -unscaled are used
The -p, -flags, and -unscaled options all affected the decision to
select a subsample of the tests to run. When specifying -p 0.1, about
57% of the tests would run instead of the expected 10%.

This commit fixes this by separating -p from -flags and -unscaled.
2026-05-15 14:12:48 +00:00
Ramiro Polla 24d432e227 swscale/tests/swscale: improve help text for -p option 2026-05-15 14:12:48 +00:00
Andreas Rheinhardt b2867481d9 avformat/avformat: Add AVFMT_EXPERIMENTAL to allowed flags
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-14 22:58:04 +02:00
Marton Balint 43bbd6dbf9 avformat/utils: avoid void pointer arithmetic
Fixes compilation on MSVC.

Regression since cb708d8703.

Based on code by James Almer.

Signed-off-by: Marton Balint <cus@passwd.hu>
2026-05-14 20:52:45 +02:00
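The portability issue named in 43bbd6dbf9 can be illustrated with a minimal sketch. Arithmetic on `void *` is a GNU extension rather than ISO C, so the usual remedy is to cast to a byte pointer first. The helper name `byte_offset` is hypothetical and is not taken from the FFmpeg tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: advance a generic pointer by a byte count.
 * Writing `base + offset` directly on a void pointer relies on a GNU
 * extension; casting to uint8_t * first is valid ISO C. */
static void *byte_offset(void *base, size_t offset)
{
    assert(base != NULL);
    return (uint8_t *)base + offset;
}
```

A caller would use `byte_offset(obj, offsetof(SomeStruct, member))` to reach a member inside an allocated object, where `SomeStruct` stands for whatever type is in use.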
Ben Kepner a327bc0561 avformat/hlsenc: fix segment duration with mixed stream time bases
When audio and video streams have different time bases (e.g. video at
1/90000 and audio at 1/48000), vs->start_pts was stored as a raw PTS
from whichever stream's packet arrived first. The segment split
comparison then subtracted this value from the current packet's PTS
without accounting for the time base difference, producing incorrect
elapsed time calculations.

This caused segments to be split at wrong points — either too
frequently (on every keyframe) or not at all, depending on the
relative magnitudes of the time bases.

Fix by normalizing vs->start_pts to AV_TIME_BASE_Q at the point of
assignment and converting pkt->pts to the same base before comparison.
This ensures the segment split decision is always unit-consistent
regardless of which stream's packet is being evaluated.

The bug is most easily triggered by HLS muxing with video passthrough
and audio transcode, where the video retains its container time base
while the audio encoder outputs in its native time base.

Signed-off-by: Ben Kepner <u6bkep@gmail.com>
2026-05-13 23:04:46 +00:00
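The unit mismatch described in a327bc0561 can be sketched as follows. This is a simplified stand-in, not the hlsenc code: `TimeBase`, `to_microseconds` and `elapsed_us` are hypothetical names, and the helper assumes non-negative timestamps small enough that the intermediate product fits in 64 bits. FFmpeg itself provides av_rescale_q() for this conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical rational time base (stand-in for AVRational). */
typedef struct TimeBase { int num, den; } TimeBase;

/* Convert pts from time base tb to microseconds, rounding down.
 * Assumes pts >= 0 and no overflow of the intermediate product. */
static int64_t to_microseconds(int64_t pts, TimeBase tb)
{
    assert(tb.den > 0 && pts >= 0);
    return pts * tb.num * 1000000 / tb.den;
}

/* Elapsed time since the segment start. start_us is already stored in
 * microseconds, so the subtraction is unit-consistent no matter which
 * stream the current packet belongs to. */
static int64_t elapsed_us(int64_t start_us, int64_t pts, TimeBase tb)
{
    return to_microseconds(pts, tb) - start_us;
}
```

With a video packet at pts 90000 in 1/90000 as the segment start and an audio packet at pts 96000 in 1/48000, the normalized difference is one second, whereas subtracting the raw values would give 6000 in no meaningful unit.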
Marton Balint 566ad7869e avformat/hlsenc: remove unused function parameter
Signed-off-by: Marton Balint <cus@passwd.hu>
2026-05-13 22:41:23 +02:00
Marton Balint f20ea3fb22 avformat/hlsenc: dynamically allocate segment uris along with the segment struct
As suggested by Andreas Rheinhardt.

Supersedes: #22536.

Signed-off-by: Marton Balint <cus@passwd.hu>
2026-05-13 22:41:23 +02:00
Marton Balint cb708d8703 avformat/utils: add ff_bprint_finalize_as_fam to put bprint strings to flexible array members
Signed-off-by: Marton Balint <cus@passwd.hu>
2026-05-13 22:41:23 +02:00
Kirill Gavrilov 553321d59e libavcodec/hdrdec: accept "#?RGBE" header in addition to "#?RADIANCE"
Some Radiance HDR image files in the wild have "#?RGBE" header,
which other image readers accept.

Also updated hdr_probe() in libavformat/img2dec.
2026-05-13 19:35:33 +00:00
Vignesh Venkat f69aa0cc64 avcodec/aomenc: Handle Smpte2094App5 metadata
If packets contain Smpte2094App5 metadata, pass it to
the libaom encoder.

Signed-off-by: Vignesh Venkat <vigneshv@google.com>
2026-05-13 19:13:23 +00:00
Lynne 2d826f18fb vulkan/prores_raw: don't load the quantization matrix on every invocation 2026-05-14 02:55:53 +09:00
Lynne 13aabf726b vulkan/prores_raw: specify format on image
Unlike other decoders or encoders, prores_raw only has a single
Vulkan format to worry about.
This is a 20% speedup on AMD, since AMD apparently has optimizations
for this.
2026-05-14 02:55:53 +09:00
Lynne a2737497de vulkan/prores_raw: add skip_bits_unchecked and use it
show_bits(gb, 32) is called immediately above. It guarantees that
the following skip_bits call will not need to reload.
2026-05-14 02:55:53 +09:00
Lynne 5dc567a28e vulkan/prores_raw: remove redundant fast golomb parsing path 2026-05-14 02:55:52 +09:00
Lynne 64f848890c vulkan/prores_raw: use 16-bit/32-bit uints where needed
16-bit ints can overflow.
2026-05-14 02:55:52 +09:00
Lynne 74e3d63fb6 vulkan/prores_raw: use get_bits shared memory cache
50% speedup on AMD.
2026-05-14 02:55:52 +09:00
Lynne 67811c2754 vulkan/common: fix get_bit() with SMEM caching
First of all, it uses the wrong data pointer. Second, gb.bits wouldn't
get set if LOAD64 was called after the start of the stream.
2026-05-14 02:55:48 +09:00
jiangjie 8ffaead836 fftools/ffmpeg_filter: fix frame reference leak in fg_output_step
When clone_side_data() fails in fg_output_step(), the function returns
without calling av_frame_unref(frame), leaking the frame reference.
2026-05-13 14:09:06 +00:00
Marvin Scholz 7e045dfbfc ffprobe: implement printing IAMF frame side data 2026-05-13 15:19:11 +02:00
Marvin Scholz 65635453cb avcodec: map IAMF packet side data to frame side data 2026-05-13 15:19:11 +02:00
Marvin Scholz 99908c6e05 avutil: add IAMF frame side data types
These contain the same data as the packet side data equivalents.
2026-05-13 15:19:11 +02:00
jiangjie 22d06b39ce avfilter/avfilter: fix memory leak of filter name in ff_filter_alloc error path
When ff_filter_alloc fails after the name has been allocated (via
av_strdup), the error handling code frees inputs and input_pads but
misses freeing ret->name, causing a memory leak.

Add av_freep(&ret->name) in the error path before freeing inputs.
2026-05-13 20:11:19 +08:00
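The error-path pattern described in 22d06b39ce can be sketched in isolation. Everything here is hypothetical (`Filter`, `filter_alloc`, the `simulate_failure` switch and the `live_allocs` counter exist only for this illustration); the point is that every allocation made before the failure is released on the shared error label.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int live_allocs; /* outstanding allocations, for demonstration only */

static void *xalloc(size_t n)
{
    void *p = malloc(n);
    if (p)
        live_allocs++;
    return p;
}

static void xfree(void *p)
{
    if (p) {
        free(p);
        live_allocs--;
    }
}

typedef struct Filter { char *name; int *inputs; } Filter;

static void filter_free(Filter *f)
{
    if (!f)
        return;
    xfree(f->inputs);
    xfree(f->name);
    xfree(f);
}

/* simulate_failure forces the second allocation to fail. */
static Filter *filter_alloc(const char *name, int nb_inputs, int simulate_failure)
{
    Filter *ret;

    assert(name && nb_inputs > 0);
    ret = xalloc(sizeof(*ret));
    if (!ret)
        return NULL;
    ret->name   = NULL;
    ret->inputs = NULL;

    ret->name = xalloc(strlen(name) + 1);
    if (!ret->name)
        goto err;
    strcpy(ret->name, name);

    ret->inputs = simulate_failure ? NULL : xalloc(sizeof(int) * nb_inputs);
    if (!ret->inputs)
        goto err;
    return ret;

err:
    /* Frees the name as well; omitting this is the kind of leak the commit fixes. */
    filter_free(ret);
    return NULL;
}
```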
Marvin Scholz 4851060ccd avutil: hdr_dynamic_metadata: fix error code
When s is NULL in av_dynamic_hdr_smpte2094_app5_from_t35, that's not an
allocation error but just invalid API usage. If there is any allocation
failure beforehand that would lead to this, the caller has to check it,
like is already done in all usages of this function in FFmpeg itself.
2026-05-12 17:18:38 +02:00
Marvin Scholz 37ff8fad47 avutil: hdr_dynamic_metadata: handle allocation failure
Handle allocation failure properly in
av_dynamic_hdr_smpte2094_app5_from_t35.

Fixes Coverity issue #1691448
2026-05-12 17:18:38 +02:00
Marvin Scholz 3dbc3c6954 avcodec/libvorbisenc: conditionally set initial_padding
Only set initial_padding when vorbis_analysis_blockout succeeds,
this avoids passing uninitialized data/garbage pointer to
av_vorbis_parse_frame.

Fix Coverity Issue 1681345
2026-05-12 15:17:28 +00:00
Andreas Rheinhardt c29d1b9df5 avformat/id3v2: Fix indentation
Forgotten after e9c372362c.

Reviewed-by: Romain Beauxis <toots@rastageeks.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-12 16:12:54 +02:00
Andreas Rheinhardt 356e427d5c avformat/id3v2: Use proper logcontext
Otherwise one could not associate log messages with inputs.

Reviewed-by: Romain Beauxis <toots@rastageeks.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-12 16:12:51 +02:00
Andreas Rheinhardt e626b02a01 avformat/id3v2: Avoid temporary buffer
Reviewed-by: Romain Beauxis <toots@rastageeks.org>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-12 16:11:56 +02:00
Marvin Scholz 18761f9fb5 avformat/rtpdec_av1: fix buffer overflow due to variable confusion
The pktpos denotes the position in the output packet buffer, while
buf_ptr is the position in the input buffer. As this payload is ignored,
nothing is written to the output packet so increasing the pktpos does
not make sense here, instead the buf_ptr has to be increased to advance
the input buffer to the correct position after this OBU.

This incorrect increment here could result in pktpos exceeding the whole
size of the output packet and the later call to memcpy to write to that
buffer would start its write way past the end of the packet buffer.

Fix #22812

Reported-By: fre3dm4n
2026-05-12 16:02:51 +02:00
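The distinction between the input and output positions in 18761f9fb5 can be shown with a toy record parser. The record layout (one "ignore" flag byte, one length byte, then the payload) and the function name are hypothetical and unrelated to the real AV1 RTP payload format; the sketch only shows that a skipped payload advances the input position and leaves the output position alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies the payloads of non-ignored records from in to out.
 * Returns the number of bytes written, or -1 on malformed input or
 * insufficient output space. */
static long copy_kept_records(const uint8_t *in, size_t in_size,
                              uint8_t *out, size_t out_size)
{
    size_t in_pos = 0, out_pos = 0;

    while (in_pos + 2 <= in_size) {
        int    ignore = in[in_pos];
        size_t len    = in[in_pos + 1];

        in_pos += 2;
        if (len > in_size - in_pos)
            return -1;              /* truncated record */

        if (!ignore) {
            if (len > out_size - out_pos)
                return -1;          /* would overflow the output buffer */
            memcpy(out + out_pos, in + in_pos, len);
            out_pos += len;         /* only written data moves out_pos */
        }
        in_pos += len;              /* consumed input always moves in_pos */
    }
    return (long)out_pos;
}
```

Advancing `out_pos` instead of `in_pos` for an ignored record would both leave the parser stuck on the same input and push later writes past the data actually produced, which matches the failure mode the commit message describes.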
nyanmisaka d01d18ad71 vulkan: fix using encode caps before querying
Fix using enc_caps.supportedEncodeFeedbackFlags before
calling vkGetPhysicalDeviceVideoCapabilitiesKHR().

Otherwise the check will never pass and will fail with ENOTSUP.

Fixes 3f9e04b

Signed-off-by: nyanmisaka <nst799610810@gmail.com>
2026-05-12 11:28:09 +00:00
David Rosca 6b3e0f903e vulkan_encode: Fix description for quality option
From spec:
  Generally, using higher video encode quality levels may produce
  higher quality video streams at the cost of additional processing time.
2026-05-12 09:14:34 +00:00
Vignesh Venkat 2c1af16872 avformat/matroskaenc: Use correct buffer for smpte2094_app5
In the call to mkv_write_blockadditional, use the correct
buffer for smpte2094_app5.

Commit 38df985fba updated the
buffer usage to prevent incorrect buffer reuse, but left this line
unchanged inadvertently.

Signed-off-by: Vignesh Venkat <vigneshv@google.com>
2026-05-11 14:44:25 -07:00
Vignesh Venkat e1797cdd51 avcodec/libvpxenc: Copy Smpte2094App5 metadata
If incoming packets contain Smpte2094App5 metadata, retain them
so that they are passed through to the output.

Signed-off-by: Vignesh Venkat <vigneshv@google.com>
2026-05-11 20:17:11 +00:00
Niklas Haas c1ff2c24b5 swscale/filters: hard-code radius for trivial kernels
box() and triangle() have well-defined, trivially verifiable numerical
inverses.

We could actually pre-compute and hard-code the numerical inverse of all
non-parametric kernels, but I'm a bit reluctant to do this as I have plans to
adjust the value of SWS_MAX_REDUCE_CUTOFF based on the desired bit depth of the
output, which makes a hard-coding approach unfeasible.

(It would also be a brittle solution that may break whenever we extend the
scaler configuration API, as well as making it harder to add new filters)

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-11 19:59:39 +02:00
Nariman-Sayed 837cf8e38f avformat/tls_mbedtls: fix DTLS handshake failure when receiving non-DTLS packets
Some WebRTC servers such as Pion send STUN packets concurrently during
the DTLS handshake. Unlike OpenSSL and GnuTLS which filter non-DTLS
packets internally, mbedtls passes all received UDP packets directly to
its DTLS state machine, causing the handshake to fail.

Fix this by using ff_is_dtls_packet() in mbedtls_recv to discard
non-DTLS packets such as STUN by returning WANT_READ, as specified
by RFC 5764 Section 5.1.2.

Signed-off-by: Nariman-Sayed <narimansayed28@gmail.com>
2026-05-11 12:36:58 +00:00
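The filtering idea in 837cf8e38f can be sketched using the first-byte demultiplexing rule from RFC 5764 Section 5.1.2, where values 20 to 63 indicate DTLS, 0 to 1 indicate STUN, and 128 to 191 indicate RTP. The names below are hypothetical, `RECV_WANT_READ` is a placeholder rather than the mbedtls constant, and the real ff_is_dtls_packet() may check more than the first byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { RECV_WANT_READ = -1 }; /* placeholder "nothing usable yet" code */

/* First-byte test from RFC 5764 Section 5.1.2: 20..63 means DTLS. */
static int looks_like_dtls(const uint8_t *buf, size_t size)
{
    return size > 0 && buf[0] >= 20 && buf[0] <= 63;
}

/* Receive-callback sketch: hand DTLS datagrams to the TLS library and
 * drop everything else by asking it to read again. */
static int recv_filter(const uint8_t *buf, size_t size)
{
    if (!looks_like_dtls(buf, size))
        return RECV_WANT_READ;
    return (int)size;
}
```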
Nariman-Sayed 094f72748d avformat/tls: move DTLS packet detection into ff_is_dtls_packet()
Move the DTLS packet detection logic from whip.c into a shared
ff_is_dtls_packet() function in tls.c, with its declaration and
related macros in tls.h. Update whip.c to use the new shared function.

Signed-off-by: Nariman-Sayed <narimansayed28@gmail.com>
2026-05-11 12:36:58 +00:00
Kacper Michajłow 17bc88e67f avformat/hls: disable http_persistent/http_multiple with custom io_open
Both rely on the AVIOContext being backed by the builtin URLContext.
When the API user overrides io_open, the keepalive path asserts on the
missing URLContext and the http_multiple auto-detect probe fails on
every read. http_multiple=1 still works even with custom IO.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-05-11 09:00:31 +00:00
Lynne fd25b35dd2 vulkan_ffv1: support decoding 32-bit float video
Sponsored-by: Sovereign Tech Fund
2026-05-11 05:32:41 +09:00
jiangjie a67c15dfa5 avutil/hwcontext_vulkan: fix resource leak on alloc_mem failure
Fix by using goto fail so that vulkan_frame_free() properly cleans up
all previously created resources.
2026-05-10 20:01:47 +00:00
Andreas Rheinhardt f16ec8913e avcodec/h264_cavlc: Fix indentation
Forgotten after 8d6947bc7d.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-10 17:16:38 +02:00
Dale Curtis 5bbc00c05d [Wave] Fix issues with unaligned metadata chunks.
Fixes corruption issues with the sample in this PR.

Signed-off-by: Dale Curtis <dalecurtis@chromium.org>
2026-05-10 01:18:09 +00:00
Michael Niedermayer 188461be10 avformat/mpegts: Don't assume fc->priv_data is a MpegTSContext
Fixes: out of array access
Fixes: 508365271/clusterfuzz-testcase-minimized-ffmpeg_dem_WTV_fuzzer-6219535958212608

Regression since: b9cb948ec1

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-09 18:35:08 +00:00
James Almer b2dfc14276 avcodec/vvc_parser: properly split PUs when a Prefix SEI NUT is found
Signed-off-by: James Almer <jamrial@gmail.com>
2026-05-09 11:44:39 -03:00
James Almer 2948acd528 avformat/nal: take into account removed zero bytes when calculating buffer size in nal_parse_units()
Fixes issue #23010

Signed-off-by: James Almer <jamrial@gmail.com>
2026-05-09 11:28:46 -03:00
jiangjie 2f4ad2497e avformat/movenc: fix dynamic buffer leaks on error paths
In mov_write_iacb_tag(), the dynamic buffer dyn_bc was leaked when
ff_iamf_write_descriptors() failed.

In mov_write_track_udta_tag(), the dynamic buffer pb_buf was leaked
when mov_write_track_kinds() failed, as the error path returned
directly instead of going through cleanup.

Fix both by ensuring ffio_free_dyn_buf() is called on all error paths.
2026-05-09 19:27:17 +08:00
Zhao Zhili 180a10647d avformat/tee: clean up local resources on program copy failure
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2026-05-09 10:46:35 +00:00
Vignesh Venkat 8518599cd1 avformat/matroskaenc: Write additional mappings for webm
The elements written in mkv_write_blockadditionmapping
(MaxBlockAdditionID, BlockAddIDType and BlockAddIDValue) are all
allowed in WebM as well. Move them out of the "if (!IS_WEBM)"
block.

Matroska spec:
https://www.matroska.org/technical/elements.html#MaxBlockAdditionID
(See column with title "W" which shows WebM availability).

WebM spec:
https://www.webmproject.org/docs/container/#MaxBlockAdditionID

Signed-off-by: Vignesh Venkat <vigneshv@google.com>
2026-05-08 13:33:31 -07:00
Andreas Rheinhardt c8a4770599 avcodec/vc1dsp: Consistently use ptrdiff_t for stride
Also do the same in the x86 MMX code and its MIPS MMI clone.

Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-08 13:28:34 +02:00
Andreas Rheinhardt ff0ad0278d avcodec/cbs: Move ff_cbs_all_codec_ids to cbs_bsf
Only used as AVBitStreamFilter.codec_ids. This avoids duplicating
it into lavf.

Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-08 09:35:38 +02:00
Andreas Rheinhardt b6d2a0fc66 configure: Add missing apv_metadata->cbs_apv dependency
Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-08 09:35:35 +02:00
Andreas Rheinhardt 6a59c847b5 configure: Redo enabling cbs in lavf
Right now, the cbs_type_table (the table of all CodedBitstreamTypes
supported by CBS) is empty unless cbs_apv and cbs_av1 are enabled.
The latter are only enabled in configure if they are needed in lavc.
This means that the mov muxers (the only users of cbs-in-lavf)
don't work as they should depending upon the availability of
e.g. the av1_metadata BSF. The table being empty is also illegal C
and according to PR #23038 MSVC warns about this (as does GCC
with -pedantic) and it may even lead to an internal compiler error.

This could be fixed by simply adding a mov_muxer->cbs_av1,cbs_apv
dependency in configure, yet this would have the downside that
it would force cbs_av1 and cbs_apv to be built for lavc, too,
even though it may not be needed there. So add new configure
variables cbs_{apv,av1}_lavf and cbs_lavf to track this correctly.

Reported-by: xiaozhuai <798047000@qq.com>
Reviewed-by: James Almer <jamrial@gmail.com>
Reviewed-by: xiaozhuai <798047000@qq.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-08 09:34:52 +02:00
Manuel Lauss 0f45541e04 avcodec/sanm: extend the codec37 mv table to 3x512 entries
The c37_mv table is three 510-entry tables combined. Extend each
with a coordinate pair for index 0xff, which allows eliminating
the index check in the codec37/48 block handlers.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2026-05-08 05:08:22 +00:00
Manuel Lauss b418da28bf avcodec/sanm: fobj: apply the x/y offsets after size determination
Otherwise a wrong size might be determined.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2026-05-08 05:08:22 +00:00
Manuel Lauss 01d895340c avcodec/sanm: accept fixed dimensions for ANIM at decode_init
This undoes 556cef27d9, which I added to fix a fuzzer-crash,
but there's no reason to expect the decoder can only be invoked
via the smush demuxer.  Instead also accept a range of dimensions
from 2x2 up to 640x480.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2026-05-08 05:08:22 +00:00
Manuel Lauss d1577bc018 avcodec/sanm: fobj codec37+: reject too large frames.
For the diff-buffer codecs, return error for frames that are larger
than the currently configured canvas.  This mimics the behaviour
of the DOS smush engines.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2026-05-08 05:08:22 +00:00
Manuel Lauss cc2f9ff14f avcodec/sanm: fobj: do not use codec-id to determine canvas size.
Codec>=37 with smaller dimensions can be embedded onto larger canvasses;
it makes no sense to trust their dimensions explicitly.

Signed-off-by: Manuel Lauss <manuel.lauss@gmail.com>
2026-05-08 05:08:22 +00:00
Yong Yu 9ab345b2d1 fftools/graph: Add missing include "libavutil/mem.h" for fftools/graph/graphprint.c
When HAVE_AV_CONFIG_H is defined, the include of libavutil/mem.h
is skipped in libavutil/common.h, so this header must be included
explicitly for the build to succeed.
2026-05-07 22:00:38 +00:00
Romain Beauxis 0f2e693956 tests/fate/id3v2.mak: add new tests for comm, lyrics, txx and wma
comments.

Signed-off-by: Romain Beauxis <romain.beauxis@gmail.com>
2026-05-07 09:46:53 -05:00
Romain Beauxis 85cc813412 libavformat/tests/id3v2: add test program for raw ID3v2 frame debugging
Signed-off-by: Romain Beauxis <romain.beauxis@gmail.com>
2026-05-07 09:46:37 -05:00
Romain Beauxis 910d796430 libavformat/id3v2: wire FF_FDEBUG_ID3V2 frame debugging
Signed-off-by: Romain Beauxis <romain.beauxis@gmail.com>
2026-05-07 09:46:17 -05:00
Michael Niedermayer b5c7c7d273 avcodec/cbs_h266_syntax_template: tighten sh_num_tiles_in_slice_minus1 upper bound
Fixes: out of array access

Found-by: Vishal Panchani
Fix suggested by: Vishal Panchani
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-07 13:08:34 +00:00
Zhao Zhili a17e96b103 avcodec/hevc: scope missing-ref loop counters locally 2026-05-07 13:01:16 +00:00
Zhao Zhili 3b939ced79 avcodec/hevc: limit missing-ref fill to coded planes
generate_missing_ref walked frame->f->data[] until a NULL slot, which
on alpha-video frames extended to data[3] and read
sps->hshift[3]/vshift[3] out of bounds.

The alpha plane is produced by the alpha layer via
replace_alpha_plane; the base decoder path never reads or writes it.
Bound the fill loop by the SPS coded plane count. This both removes
the out-of-bounds shift access and avoids an unnecessary full-frame
memset of the alpha plane.

Fixes: out of array read
Fixes: 500770604/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_HEVC_fuzzer-6157374833623040

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
2026-05-07 13:01:16 +00:00
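The loop-bound change described in the entry above can be illustrated with a minimal sketch. This is not the decoder's code: `struct frame`, `MAX_PLANES` and `planes_to_fill` are hypothetical names, and the only assumption taken from the message is that the frame may carry more non-NULL planes than the per-plane tables describe.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PLANES 4

/* Hypothetical frame: data[] may hold more planes than the per-plane
 * tables describe (for example an alpha plane in data[3]). */
struct frame {
    uint8_t *data[MAX_PLANES];
};

/* Count the planes to fill, bounded by the coded plane count rather than
 * by the first NULL slot in data[]. */
static int planes_to_fill(const struct frame *f, int nb_coded_planes)
{
    int n = 0;

    for (int i = 0; i < nb_coded_planes && i < MAX_PLANES && f->data[i]; i++)
        n++;
    return n;
}
```

With three coded planes and a fourth alpha plane present, the loop stops at three, so no table indexed by plane is read at index 3.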
牟凡 28ecb07e55 avcodec/cbs_h266: fix wrong condition for chroma MTT depth in PH
In the picture header parser, the chroma branch incorrectly tested
sps_max_mtt_hierarchy_depth_intra_slice_chroma to decide whether to
parse ph_log2_diff_max_{bt,tt}_min_qt_intra_slice_chroma.

Per ITU-T H.266 (V4, 01/2026) section 7.3.2.8 "Picture header
structure syntax", the condition is on the just-parsed
ph_max_mtt_hierarchy_depth_intra_slice_chroma, exactly mirroring the
luma branch a few lines above and the inter-slice branch below.
sps_partition_constraints_override_enabled_flag allows the picture
header to override the SPS values, so testing the SPS field is
incorrect and desynchronises the parser whenever the PH override
changes the chroma MTT depth from/to zero.

Signed-off-by: Mou Fan <moufan17@126.com>
2026-05-07 10:42:44 +01:00
Andreas Rheinhardt f2e5eff3ff avcodec/atsc_a53: Avoid GetBits API to parse A53 CC data
This fixes overreads with libdav1d, because it provides
non-padded data in violation of the requirements of
the GetBits API.

Furthermore, using the GetBits API here is wasteful,
as the offsets here are known and the actual data to be copied
is even byte-aligned, allowing to use memcpy.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-06 15:27:21 +02:00
ldm0 17734f6967 doc/examples/demux_decode: update ffplay command for audio and video output
Signed-off-by: ldm0 <liudingming@bytedance.com>
2026-05-05 23:44:42 +00:00
Dale Curtis 256d93413f avformat/mov: Fix negative index given to can_seek_to_key_sample()
The potentially negative return value of av_index_search_timestamp()
wasn't being handled before passing it to can_seek_to_key_sample().

Found by Wongi Lee (@_qwerty_po) of Theori with Xint Code,
Jungwoo Lee (@physicube).

Signed-off-by: Dale Curtis <dalecurtis@chromium.org>
2026-05-05 21:26:38 +00:00
Andreas Rheinhardt 5c44245878 avutil/hwcontext_vulkan: Fix shadowing
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-05 12:46:26 +02:00
Andreas Rheinhardt 3fbf5faa3f avutil/hwcontext_vulkan: Add av_fallthrough
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-05 12:46:23 +02:00
Andreas Rheinhardt 310cf06a27 avcodec/av1dec: Avoid implicit fallthrough
Fixes a -Wimplicit-fallthrough warning from Clang;
GCC does not warn about this.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-05 12:46:00 +02:00
Gyan Doshi 7fc335cb27 avformat/tee: relay programs to child muxers 2026-05-05 12:54:40 +05:30
Gyan Doshi 1b82e58a3a avformat/segment: relay programs to child muxers 2026-05-05 12:54:40 +05:30
Gyan Doshi 0005b36eb7 avformat/hlsenc: relay programs to child muxers 2026-05-05 12:54:40 +05:30
Gyan Doshi 5c557dd5d5 avformat: add av_program_copy()
Helper to transfer programs from one muxing context to another.
2026-05-05 12:54:36 +05:30
Gyan Doshi 7623379a77 avformat: add av_program_add_stream_index2()
av_program_add_stream_index() added in 526efa1053
may fail to carry out its purpose but the lack of
a return value stops callers from catching any error.

Fixed in new function.
2026-05-05 12:51:54 +05:30
Andreas Rheinhardt 1c522ffdef avcodec/x86/mpegvideoenc{,_template}: Remove remnants of MMX
Reviewed-by: Kieran Kunhya <kieran@kunhya.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-04 17:31:05 +02:00
Andreas Rheinhardt d328a02a9a avcodec/x86/vp6dsp_init: Update obsolete comment
Forgotten in 6cb3ee80b3.

Reviewed-by: Kieran Kunhya <kieran@kunhya.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-04 17:31:01 +02:00
Andreas Rheinhardt 564f610cbf avcodec/x86/vc1dsp_loopfilter: Remove MMXEXT funcs overridden by SSSE3
Reviewed-by: Kieran Kunhya <kieran@kunhya.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-04 17:28:28 +02:00
Andreas Rheinhardt 6a46ea7da2 avcodec/x86/constants, h263_loopfilter: Move pb_FC to h263_loopfilter
Only used there.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-04 16:57:18 +02:00
Thilo Borgmann aa14727cd5 avcodec/webp: export XMP metadata
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-05-04 12:47:30 +02:00
huanghuihui0904 b40d91cad9 avformat: avoid potential tmp_opts leak in ffurl_connect()
When options is NULL, ffurl_connect() creates a temporary dictionary
(tmp_opts). If the protocol_blacklist av_dict_set() fails after the
whitelist entry was inserted, the function returns without freeing
this dictionary.

Ensure tmp_opts is freed on this error path.

Signed-off-by: Huihui_Huang <hhhuang@smu.edu.sg>

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-03 20:32:16 +00:00
João Neves 2c71a28bf0 avcodec/hdrdec: fix pixel count decrement in RLE decompress loop
The w variable counts pixels, not bytes. The non-RLE path correctly
uses w-- (one pixel = 4 bytes), but the RLE path uses w -= 4, causing
the loop to terminate after roughly 1/4 of the expected pixels.

The w -= 4 was introduced in 14e99cb472 which moved the decrement
inside the loop to fix an OOB write (clusterfuzz-5423041009549312).
The move was correct, but the decrement value should have been 1 to
match the non-RLE path.

Signed-off-by: João Neves <joaocns0@protonmail.com>
2026-05-03 20:19:51 +00:00
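The unit mismatch described in the entry above can be shown with a small sketch. It is not the decoder's actual loop: `expand_run` and its parameters are hypothetical, and it assumes only what the message states, namely that `w` counts pixels and that one pixel occupies 4 bytes.

```c
#include <stdint.h>

/* Hypothetical helper: write `run` copies of a 4-byte pixel into dst,
 * never writing more than `w` pixels. Returns the number of pixels written. */
static int expand_run(uint8_t *dst, int w, int run, const uint8_t px[4])
{
    int written = 0;

    while (run-- > 0 && w > 0) {
        for (int i = 0; i < 4; i++)
            dst[4 * written + i] = px[i];
        written++;
        w--;    /* one pixel consumed; `w -= 4` would stop after about w/4 pixels */
    }
    return written;
}
```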
João Neves daedf4012d avcodec/exr: check rle() return value in rle_uncompress()
rle_uncompress() silently discards the return value of rle(). When the
compressed data is malformed and rle() returns AVERROR_INVALIDDATA,
processing continues on a partially filled buffer. Propagate the error
to the caller, which already handles it at line 1420.

Signed-off-by: João Neves <joaocns0@protonmail.com>
2026-05-03 20:15:54 +00:00
Alexander Slobodeniuk 1e9dd2b7e9 avformat/mpegts: handle AC-4 descriptor in DVB extension
as defined in ETSI EN 300 468 (Table 109).

This allows ffprobe to recognize that a .ts
file has an ac4 stream.

Checked on the files downloaded from
https://ott.dolby.com/OnDelKits/AC-4/Download_v15.html
2026-05-03 20:10:26 +00:00
Alexander Slobodeniuk dd020e1025 avformat/mpegts: simplify ac3/eac3 descriptor handling
Those lines are literally the same, so remove the
code duplication.
2026-05-03 20:10:26 +00:00
Alexander Slobodeniuk cda069b092 avformat/mpegts: don't check impossible branches
Quit DVB extension handling once the descriptor
has been processed
2026-05-03 20:10:26 +00:00
Jesper Ek d5a913f99f avformat/gxf: return proper errors when reading header/packet
Returning -1 resulted in an "operation not permitted" error which
was incorrect. This changes the error to a correct "invalid data".
2026-05-03 20:03:00 +00:00
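Why a bare -1 shows up as "operation not permitted" can be seen in a short sketch. It assumes that error codes are negated errno values, as is the case for AVERROR() on POSIX systems, and that EPERM is 1, which holds on Linux; `describe` is a hypothetical helper, not an FFmpeg function.

```c
#include <errno.h>
#include <string.h>

/* Assumption: error codes are negated errno values. Returns the message
 * that a bare error code would be decoded to. */
static const char *describe(int err)
{
    return strerror(-err);
}
```

Under these assumptions `describe(-1)` yields the EPERM message, which is why returning a dedicated "invalid data" code gives a more accurate report.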
James Almer 1a2c16fe51 avcodec/av1dec: check that primary_ref_frame is within range
Fixes CVE-2026-30997

Fixes: Out-of-Bounds Access
Found-by: Xinghang Lv
Signed-off-by: James Almer <jamrial@gmail.com>
2026-05-03 15:55:21 -03:00
Romain Beauxis f80431dc4e .forgejo/CODEOWNERS: fix ogg pattern for @toots 2026-05-03 17:05:25 +00:00
James Almer 3393dc3020 avformat/dashdec: propagate parsing requirement from the underlying demuxer
Signed-off-by: James Almer <jamrial@gmail.com>
2026-05-03 17:00:17 +00:00
James Almer e76bfba1cf avformat/mov: request parsing for LCEVC streams
Given that no standalone decoder will be present, use a parser to get stream
information that's not reported by the container.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-05-03 17:00:17 +00:00
Andreas Rheinhardt da195b1e84 avcodec/qsvenc: Add av_fallthrough
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:23:10 +02:00
Andreas Rheinhardt e1115751dd avcodec/nvenc: Add av_fallthrough
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:23:07 +02:00
Andreas Rheinhardt 095897060a avcodec/libzvbi-teletextdec: Add av_fallthrough
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:23:05 +02:00
Andreas Rheinhardt a9b97d070e avcodec/libxvid: Add av_fallthrough
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:23:03 +02:00
Andreas Rheinhardt dc12dd82a1 avcodec/libxavs2: Add av_fallthrough
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:23:00 +02:00
Andreas Rheinhardt 8881e1a52c avcodec/libvpxenc: Add av_fallthrough
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:22:58 +02:00
Andreas Rheinhardt 64bea20837 avcodec/libopusenc: Add av_fallthrough
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:22:56 +02:00
Andreas Rheinhardt d8b02fdb9f avcodec/libaomenc: Use av_fallthrough to mark fallthrough
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:22:53 +02:00
Andreas Rheinhardt 02391996f8 avfilter/vf_stereo3d: Add av_fallthrough
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:22:51 +02:00
Andreas Rheinhardt 5144b51151 avfilter/vf_super2xsai: Add av_fallthrough
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:22:49 +02:00
Andreas Rheinhardt 21c2d38537 avformat/rmdec: Fix shadowing
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:22:46 +02:00
Andreas Rheinhardt 2fd9d69034 avformat/rmdec: Add av_fallthrough
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:22:44 +02:00
Andreas Rheinhardt 3cf225b5f8 avcodec/aac/aacdec: Fix shadowing
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:22:42 +02:00
Andreas Rheinhardt d29cbb87c3 avcodec/aac/aacdec: Add av_fallthrough
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:22:39 +02:00
Andreas Rheinhardt cf5191fac7 avcodec/hevc/hevcdec: Fix shadowing
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:22:37 +02:00
Andreas Rheinhardt 0cbf77e843 avcodec/hevc/hevcdec: Add av_fallthrough
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:22:34 +02:00
Andreas Rheinhardt e61c940654 avcodec/mpegvideo_enc: Fix shadowing
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:22:31 +02:00
Andreas Rheinhardt 04ba5e7537 avcodec/mpegvideo_enc: Add av_fallthrough
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:22:29 +02:00
Andreas Rheinhardt 7b4b658a87 avcodec/mpegvideo_motion: Add av_unreachable, fix fallthrough warnings
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:22:27 +02:00
Andreas Rheinhardt 4b58570ff7 avcodec/sga: Add av_fallthrough
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:22:24 +02:00
Andreas Rheinhardt 392ce463a5 avcodec/tiff: Fix shadowing
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:22:18 +02:00
Andreas Rheinhardt 25b7166fe3 avcodec/tiff: Add av_fallthrough
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:22:16 +02:00
Andreas Rheinhardt 05a8e89474 avcodec/tta: Fix shadowing
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:22:13 +02:00
Andreas Rheinhardt 5a7558a0a2 avcodec/tta: Add av_fallthrough
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:22:11 +02:00
Andreas Rheinhardt 9eeca76cbe avcodec/vdpau_mpeg12: Use av_fallthrough to mark fallthrough
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:22:07 +02:00
Andreas Rheinhardt 2d0d937ed2 swscale/ops_chain: Use av_fallthrough to mark fallthrough
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:22:05 +02:00
Andreas Rheinhardt a867648555 swscale/x86/swscale: Fix shadowing
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:22:03 +02:00
Andreas Rheinhardt e241a45548 swscale/x86/swscale: Add av_fallthrough
Reviewed-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-05-03 18:21:45 +02:00
Michael Niedermayer 2e32276872 avcodec/aac/aacdec_usac_mps212: fix attach_lsb() OOB after huff_decode
Fixes: VS-FF-2026-0001/poc.wav

Reported-by: Vuln Seeker Cyber Security Team
2026-05-03 15:11:28 +00:00
Michael Niedermayer 118bddf0ce avcodec/dfpwmdec: Check nb_samples
Fixes: integer overflow

Found-by: Dhiraj Mishra <mishra.dhiraj95@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-03 16:56:43 +02:00
Michael Niedermayer 7ae36ceba9 avcodec/alsdec: do not set nbits invalidly
note that the spec actually disallows the 0 case too but we are
a little lenient here so the full 24bit twos-complement range can be handled

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-03 14:54:27 +00:00
Michael Niedermayer 43a0715e30 swscale/swscale_unscaled: adjust last line copy
Fixes: out of array access
Fixes: DFVULN-694

*Reporter: Zhenpeng (Leo) Lin at depthfirst*

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-03 14:52:32 +00:00
Michael Niedermayer 7d0837a742 swscale/swscale: Check srcSliceY and srcSliceH
Obviously no one should pass negative values, as they make no sense, but it is
better to check explicitly

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-03 14:52:32 +00:00
Michael Niedermayer 2678bce860 avformat/avidec: check LIST size in avi_load_index()
This avoids an unsigned integer underflow and passing that large value to ff_read_riff_info()
2026-05-03 14:40:49 +00:00
depthfirst-dev[bot] f1c3f1cae1 avformat/avidec: validate INFO list size before parsing
Reject INFO list chunks that are too small to contain the expected
4-byte list type field before calling ff_read_riff_info().

The parser subtracts 4 from the list size when handing the remaining
payload to ff_read_riff_info(). If the chunk is smaller than 4 bytes,
that underflows the expected structure and should be treated as invalid
input.

Fixes: DFVULN-607

*Vulnerability reported by Zhenpeng (Leo) Lin at depthfirst*
*Patch validated by Zheng Yu at depthfirst*
2026-05-03 14:40:49 +00:00
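The two avidec entries above guard the same subtraction. A minimal sketch of that guard follows; `list_payload_size` is a hypothetical helper, and the only fact taken from the messages is that a list chunk begins with a 4-byte list type that is subtracted from the chunk size.

```c
#include <stdint.h>

/* Hypothetical helper: given the size of a LIST chunk, which starts with a
 * 4-byte list type, store the size of the payload that follows it.
 * Returns 0 on success, -1 if the chunk is too small. */
static int list_payload_size(uint32_t list_size, uint32_t *payload)
{
    if (list_size < 4)      /* list_size - 4 would wrap to a huge unsigned value */
        return -1;
    *payload = list_size - 4;
    return 0;
}
```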
Michael Niedermayer f47ca0a5e6 avformat/matroskadec: Check audio.sub_packet_h * audio.frame_size
Fixes: out of array access
Fixes: poc_matroska.mkv

This issue requires manually increasing the malloc limit
(-max_alloc 4294967296)

Found-by: Guanni Qu <qguanni@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-03 14:39:45 +00:00
Michael Niedermayer 9d9250e5da avformat/pcm: Use 64bit for byte_rate
Fixes: integer overflow

Found-by: Marius Momeu <marius.momeu@berkeley.edu>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-03 13:26:34 +00:00
Michael Niedermayer b45a6d3f76 avcodec/adpcm: signed integer overflow in ADPCM_N64
Fixes: signed integer overflow

Found-by: Marius Momeu <marius.momeu@berkeley.edu>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-03 13:26:34 +00:00
Michael Niedermayer 2d4ec46345 libavformat/xwma: fix overflow in seek position
Fixes: signed integer overflow

Found-by: Marius Momeu <marius.momeu@berkeley.edu>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-03 13:26:34 +00:00
Michael Niedermayer 0f5705959d avcodec/hevc/ps: validate rep_format dimensions in multi-layer SPS
When an SPS uses the multi-layer extension (nuh_layer_id > 0 with
sps_max_sub_layers_minus1 == 7), width and height are taken from the
VPS rep_format without the av_image_check_size() validation that the
direct path performs.  HEVC F.7.4.3.1.1 requires rep_format pic
dimensions to satisfy the constraints in 7.4.3.2.1, including
"pic_width_in_luma_samples shall not be equal to 0".

Run the same av_image_check_size() check in the multi-layer-extension
path so the SPS is rejected before it reaches setup_pps().

Fixes: VS-FF-2026-0003/poc.flv
Fixes: out of array access

Found-by: Vuln Seeker Cyber Security Team
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-03 13:26:06 +00:00
Marius Momeu e32b2c8886 avfilter/vf_kerndeint: Check for minimum height
Fixes: out of array access

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-03 13:25:48 +00:00
Marius Momeu ff3223b5d6 avcodec/ralf: Add the missing return statement after the error log
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-03 13:25:30 +00:00
Michael Niedermayer c568f40597 avfilter/vf_codecview: Clamp block to the visible frame region
Fixes: write into the padding area of the frame

Found-by: Marius Momeu <marius.momeu@berkeley.edu>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-03 13:23:21 +00:00
Michael Niedermayer 2a991a3475 avcodec/zmbv: reject XOR data that overruns the decompression buffer
Add a per-block bounds check at the start of each XOR block so the
read is rejected before src crosses decomp_len, and propagate the
error from decode_frame().

Fixes: out of array read

Found-by: Seung Min Shin
2026-05-03 13:22:37 +00:00
Michael Niedermayer 2f60af465a avcodec/rasc: fix heap use-after-free in decode_move()
Use a separate scratch buffer (s->mv_scratch) for the type-0 pixel
copy so s->delta and mc are not disturbed for the lifetime of
decode_move().  The new buffer is freed in decode_close().

Found-by: Seung Min Shin
Patch based on suggested fix by Seung Min Shin

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-03 13:20:27 +00:00
depthfirst-dev[bot] 8010aa2193 avformat/rtpdec_mpeg4: reject zero-length AU header sections
Reject AU header sections with a signaled length of zero in
rtp_parse_mp4_au().

The AU-headers-length field specifies the length in bits of the AU header
section that immediately follows. A zero-length section is not useful input
for this parser and can lead to invalid downstream state, so reject it
up front together with oversized values.

*Vulnerability reported by Zhenpeng (Leo) Lin at depthfirst*
*Patch validated by Zheng Yu at depthfirst*

Fixes: OOB read
2026-05-03 13:19:55 +00:00
Niels Provos fd5023053a avcodec/hevc/refs: Check multiplication in alloc_frame()
Fixes: integer overflow on 32bit
2026-05-03 13:19:35 +00:00
Michael Niedermayer 89e128224e fftools/ffmpeg_opt: fix mismatching negative maps
Fixes:  -f lavfi -i testsrc2=size=128x128:rate=1:d=1   -filter_complex '[0:v]scale=64:64[vout]'   -map '[vout]'   -map -0:v   -f null -
Previously, -0:v apparently matched [vout].

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-03 13:19:18 +00:00
depthfirst-dev[bot] 25a98586cc fftools/ffmpeg_opt: validate stream index in negative map handling
Negative -map processing iterates previously parsed stream map entries
and dereferences input_files[m->file_index]->ctx->streams[m->stream_index]
without validating that stream_index is in range.

A malformed earlier map can leave m->stream_index negative, which causes
an out-of-bounds read when a later negative map walks existing entries.
Check that stream_index is non-negative and below nb_streams before
calling stream_specifier_match().

*Vulnerability reported by Zhenpeng (Leo) Lin at depthfirst*
*Patch validated by Zheng Yu at depthfirst*

Fixes: DFVULN-695
2026-05-03 13:19:18 +00:00
Flavio Milan a45a91b23b avformat/rtmpproto: prevent integer overflow accumulating FLV buffer size
Fixes: out of array access
2026-05-03 13:18:54 +00:00
depthfirst-dev[bot] 52b78cd3fe avformat/rtmpproto: validate compressed SWF header length
Reject truncated compressed SWF input before attempting to read the
8-byte header in rtmp_calc_swfhash().

Compressed SWF data identified by the "CWS" signature must be at least
8 bytes long to contain the fixed header. Bail out early when the input
is shorter to avoid operating on malformed data.

*Vulnerability reported by Zhenpeng (Leo) Lin at depthfirst*
*Patch validated by Zheng Yu at depthfirst*

Fixes: DFVULN-612
2026-05-03 12:43:21 +00:00
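The length check described in the entry above might look like the sketch below. `can_read_cws_header` is a hypothetical name; the "CWS" signature and the 8-byte fixed header come from the message, and the rest is an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical check: returns 1 only if buf carries the "CWS" signature
 * and is long enough to contain the fixed 8-byte header. */
static int can_read_cws_header(const uint8_t *buf, size_t size)
{
    return size >= 8 && !memcmp(buf, "CWS", 3);
}
```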
depthfirst-dev[bot] 1a00ea51cb avformat/rtsp: Fix out-of-bounds read in SDP parser when control_url is empty
Guard against empty string before reading the last byte in control_url.
When parsing relative a=control: paths, if no base control URL was set,
the code would access control_url[strlen(control_url)-1] which on an
empty string causes a size_t underflow and out-of-bounds read.

Now compute the length first and check for len == 0 before array access.

*Vulnerability reported by Zhenpeng (Leo) Lin at depthfirst*
*Patch validated by Zheng Yu at depthfirst*

Fixes: DFVULN-611
2026-05-03 12:43:05 +00:00
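The guard described in the entry above can be sketched in a few lines. `ends_with_slash` is a hypothetical helper and not the parser's code; it only demonstrates computing the length first and testing for zero before indexing the last byte.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `url` ends with '/', 0 otherwise.
 * strlen() returns size_t, so on an empty string `len - 1` would wrap to
 * SIZE_MAX; the length is therefore checked before it is used as an index. */
static int ends_with_slash(const char *url)
{
    size_t len = strlen(url);

    if (len == 0)
        return 0;
    return url[len - 1] == '/';
}
```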
depthfirst-dev[bot] 664d44a825 avformat/rtpdec_latm: avoid integer overflow in LATM length parsing
latm_parse_packet() accumulated attacker-controlled AU length bytes in
a signed int and later checked data->pos + cur_len against data->len.
That addition could overflow, allowing malformed packets to bypass the
bounds check and drive memcpy() far past the end of the LATM buffer.

Reject length-byte accumulation that would exceed the remaining packet
size, and compare cur_len against the remaining buffer space using
subtraction so the bounds check cannot overflow.

Fixes: DFVULN-610

*Vulnerability reported by Zhenpeng (Leo) Lin at depthfirst*
*Patch validated by Zheng Yu at depthfirst*
2026-05-03 12:42:57 +00:00
Michael Niedermayer 1772386392 avcodec/h264: recompute per-slice direct mode state for every slice
Regression since: 7f05c5cea0
Fixes: poc10
Fixes: null pointer dereference

Reported-by: Dale Curtis <dalecurtis@chromium.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-03 12:42:47 +00:00
Michael Niedermayer 1886c3269d avcodec/h264_refs: Clear stale pointers from ref_list
Testcase: poc10.bin

Reported-by: Dale Curtis <dalecurtis@chromium.org>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-03 12:42:47 +00:00
Michael Niedermayer a780d46d3b avcodec/leaddec: Check input data before allocating buffer
Fixes: Timeout
Fixes: 471636089/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_LEAD_fuzzer-6346348464242688

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-03 12:40:44 +00:00
Michael Niedermayer b801f1fe6d avcodec/pdvdec: Check input space before buffer allocation
this rejects packets whose claimed decompressed frame would require a deflate ratio beyond the format's theoretical 1032:1 limit

Fixes: Timeout
Fixes: 474457186/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_PDV_fuzzer-5366108782919680

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-03 10:25:54 +00:00
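The ratio test described in the entry above can be approximated with a short sketch. `plausible_output` is a hypothetical name and the exact form of the decoder's check is not shown in the message; only the 1032:1 limit is taken from it.

```c
#include <stdint.h>

/* Hypothetical plausibility check: an input of in_size bytes can expand to
 * at most in_size * 1032 bytes. in_size is an int, so the 64-bit product
 * cannot overflow. */
static int plausible_output(int in_size, int64_t out_size)
{
    return in_size >= 0 && out_size <= (int64_t)in_size * 1032;
}
```

Rejecting implausible packets before allocating avoids spending time and memory on frames that the input could never fill.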
Kacper Michajłow 702b0784b7 avformat/concat: guard total_size overflow
Fixes: 466797413/clusterfuzz-testcase-minimized-fuzzer_options_parser-6015183727427584
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-05-03 07:30:46 +00:00
Kacper Michajłow be207a0d66 avformat/concat: change concat_data::total_size to int64_t
It's both initialized as int64_t in concat_open() and returned as
int64_t in concat_seek().
2026-05-03 07:30:46 +00:00
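An overflow guard of the kind the two concat entries above refer to can be sketched as follows. `add_size` is a hypothetical helper, and the sketch assumes the running total is never negative.

```c
#include <stdint.h>

/* Hypothetical accumulator: adds a non-negative `size` to `*total`,
 * returning -1 and leaving `*total` unchanged instead of overflowing
 * int64_t. Assumes *total >= 0. */
static int add_size(int64_t *total, int64_t size)
{
    if (size < 0 || *total > INT64_MAX - size)
        return -1;
    *total += size;
    return 0;
}
```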
Gyan Doshi 4a2b643646 avcodec/mediacodecdec: declare correct class for audio decoders
Until now, the class for video decoders had been assigned.
2026-05-03 05:58:13 +00:00
Michael Niedermayer 016a241102 avformat/iamf_parse.c: Fix potential integer overflow in opus_decoder_config()
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-03 02:36:54 +00:00
Michael Niedermayer 8439e02037 avformat: Fix various extradata padding issues
Reported-by: Kenan Alghythee <kalghy2@uic.edu>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-03 02:36:54 +00:00
Michael Niedermayer 23227a444d avcodec/wmaenc: Fix missing padding in extradata
Reported-by: Kenan Alghythee <kalghy2@uic.edu>
2026-05-03 02:36:54 +00:00
Michael Niedermayer 242ff799c7 avcodec/tdsc: remove double stride adjustment
Fixes: out of array access

Found-by: Seung Min Shin
Patch based on suggested fix by Seung Min Shin
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-02 23:11:24 +00:00
depthfirst-dev[bot] 5408059eb7 avformat/cafdec: fix negative index use in read_seek
av_index_search_timestamp() returns a negative value when a seek target
cannot be resolved from the stream index. Bail out before using that
result as an index into sti->index_entries to avoid out-of-bounds reads.

Fixes: Buffer underflow

Fixes: DFVULN-608

*Vulnerability reported by Zhenpeng (Leo) Lin at depthfirst*
*Patch validated by Zheng Yu at depthfirst*
2026-05-02 21:40:19 +00:00
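The pattern fixed in the entry above, checking a search result before using it as an index, can be sketched like this. `search_index` is a hypothetical stand-in written for this example and is not av_index_search_timestamp(); `seek_pos` and the array layout are likewise assumptions.

```c
#include <stdint.h>

/* Hypothetical stand-in for an index search: returns the position of the
 * last entry whose timestamp is <= ts, or -1 when no entry qualifies.
 * ts_list must be sorted in ascending order. */
static int search_index(const int64_t *ts_list, int nb, int64_t ts)
{
    int found = -1;

    for (int i = 0; i < nb && ts_list[i] <= ts; i++)
        found = i;
    return found;
}

/* Returns the byte position to seek to, or -1 if the target is not indexed. */
static int64_t seek_pos(const int64_t *ts_list, const int64_t *pos_list,
                        int nb, int64_t ts)
{
    int idx = search_index(ts_list, nb, ts);

    if (idx < 0)    /* bail out before idx is used as an array index */
        return -1;
    return pos_list[idx];
}
```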
Michael Niedermayer 05817dc7dd avcodec/notchlc: Check 255 loops
Fixes: integer overflow

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-02 21:39:02 +00:00
Michael Niedermayer 91d29be49a avformat/rtpdec_jpeg: check qtable_len
Fixes: out of array access
Fixes: 605/pc.py

Based-on patch by depthfirst

*Reporter: Zhenpeng (Leo) Lin at depthfirst*
2026-05-02 21:16:51 +00:00
ASTRA 26732641fb avformat/vividas: use-of-uninitialized-value in keybuffer
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-02 21:16:19 +00:00
Michael Niedermayer bf4eb194cf avcodec/tdsc: Better input size check
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-02 21:13:01 +00:00
Michael Niedermayer bb69a090a7 avcodec/tdsc: Check jpeg size
Fixes: out of array read
Fixes: tdsc_tile_dim_mismatch.avi

Found-by: Ante Silovic <asilovic155@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-02 21:13:01 +00:00
Michael Niedermayer af87d77514 avcodec/tdsc: Prettier uncompress() check
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-02 21:13:01 +00:00
Michael Niedermayer e9e6fb8798 avcodec/tdsc: Check tile_size
Fixes: out of array read
Fixes: tdsc_war_groom_far4096.avi

Found by: Ante Silovic <asilovic155@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-02 21:13:01 +00:00
Michael Niedermayer 9572ab7f45 avcodec/decode: Better documentation for ff_set_dimensions()
Clarify what is checked and that it avoids explicit generic overflow checks

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-05-02 21:11:47 +00:00
Nicolas George 571dd04ca2 lavfi/vf_scale: store aspect ratio as a whole 2026-05-02 14:47:00 +02:00
Nicolas George 5f3783005e lavfi/vf_scale: simplify indirections
link is always input[0].
2026-05-02 14:46:59 +02:00
Kacper Michajłow dba0b078c8 avcodec/vaapi_av1: reorder functions to avoid fwd decl 2026-05-01 23:59:06 +00:00
Kacper Michajłow 688f68bffa avcodec/vaapi_av1: fix leak of ref frames on init failure
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-05-01 23:59:06 +00:00
Kacper Michajłow 1bb12370b0 avformat/httpauth: avoid casting callback functions type
Technically it is UB to call a function through a pointer of a different type.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-05-01 23:58:40 +00:00
Kacper Michajłow d768bd564e avformat/hls: avoid casting callback functions type
Technically it is UB to call a function through a pointer of a different type.

Fixes:
src/libavformat/utils.c:531:9: runtime error: call to function handle_variant_args through pointer to incorrect function type 'void (*)(void *, const char *, int, char **, int *)'
src/libavformat/hls.c:379: note: handle_variant_args defined here

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-05-01 23:58:40 +00:00
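One common way to avoid such casts is to give the callback the exact signature of the generic callback type and convert the context pointer inside it. The sketch below illustrates that approach with hypothetical names (`parse_cb`, `struct variant_info`, `handle_variant`, `parse_pair`); it is not the code from the two entries above, and its callback signature is simplified.

```c
#include <string.h>

/* Generic callback type used by a hypothetical parser. */
typedef void (*parse_cb)(void *ctx, const char *key, const char *value);

struct variant_info {
    char bandwidth[16];
};

/* The callback takes void * so that it matches parse_cb exactly and
 * converts the context inside, instead of being declared with
 * struct variant_info * and cast to parse_cb at the call site. */
static void handle_variant(void *ctx, const char *key, const char *value)
{
    struct variant_info *info = ctx;

    if (!strcmp(key, "BANDWIDTH")) {
        strncpy(info->bandwidth, value, sizeof(info->bandwidth) - 1);
        info->bandwidth[sizeof(info->bandwidth) - 1] = '\0';
    }
}

static void parse_pair(const char *key, const char *value,
                       parse_cb cb, void *ctx)
{
    cb(ctx, key, value);    /* called through a pointer of the matching type */
}
```

Because the function type and the pointer type agree, sanitizers that check indirect calls have nothing to report.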
zhanghongyuan 7e5eb2b65c fftools/opt_common: Use enum for encoder/decoder selection
Replace magic numbers (0 and 1) with SHOW_DECODER and SHOW_ENCODER
enum values throughout the opt_common.c file.
2026-05-01 23:37:55 +00:00
Jun Zhao ea3e09bfb1 fate/filter-video: use run $(FFMPEG) for scale-zero-dim test
$(FFMPEG) expands to "ffmpeg.exe" on Windows/MSYS2, and the bare
$(FFMPEG) call falls through to PATH lookup, picking up an externally
installed ffmpeg instead of the freshly built binary in $target_path.
That stale binary lacked the rejection added in a45fe72c9d, causing
msys2-clang64/clangarm64/ucrt64 slots to silently produce 250x2
instead of failing at 500x0.

Wrap with fate-run.sh's run() so $target_exec and $target_path are
resolved correctly on all platforms, matching the convention used by
e.g. fate-id3v2-invalid-tags, and avoiding the ffmpeg() helper's
unrelated default flags.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-05-01 23:45:30 +08:00
Leo Izen 739fc9249c avcodec/libjxlenc: fix frame->linesize raw pointer read
These should say frame->linesize[0] as it does everywhere else this
variable is referenced. Fixes a typo bug.

Signed-off-by: Leo Izen <leo.izen@gmail.com>
2026-05-01 07:40:25 -04:00
Leo Izen 05b5add006 avcodec/libjxlenc: check orientation tag metadata before reading
We need to check that entry->count is nonzero and that entry->type is
AV_TIFF_SHORT before reading from the buffer, in case a maliciously
constructed IFD uses a zero-count or an unusual type (e.g. IFD) for it.

Signed-off-by: Leo Izen <leo.izen@gmail.com>
2026-05-01 07:40:25 -04:00
Leo Izen f1cab2d018 avcodec/exif_internal.h: improve return docs for ff_exif_get_buffer
This commit improves the documentation for the return value of the
function ff_exif_get_buffer.

Signed-off-by: Leo Izen <leo.izen@gmail.com>
2026-05-01 07:40:25 -04:00
Leo Izen 087ec68451 avcodec/exif.c: synthesize EXIF data from frame metadata and matrix
If the displaymatrix is present, we should synthesize EXIF data from
the values there even if there is no EXIF attached to the frame.

Signed-off-by: Leo Izen <leo.izen@gmail.com>
2026-05-01 07:40:25 -04:00
Leo Izen 1d36c4d8ae avcodec/exif.c: reset ifd->size when freeing ifd->entries
If we free ifd->entries then we need to set ifd->size to 0 so another
call to av_fast_realloc doesn't get confused.
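
To illustrate why the size must be reset, here is a simplified stand-in for a capacity-tracking realloc. It is not av_fast_realloc itself (which, among other things, over-allocates); the struct and function names are hypothetical.

```c
#include <stdlib.h>

/* Simplified stand-in: keep the buffer if the recorded capacity suffices. */
static void *sketch_fast_realloc(void *ptr, unsigned int *size, size_t min_size)
{
    if (min_size <= *size)
        return ptr;               /* trusts *size and returns ptr unchanged */
    void *p = realloc(ptr, min_size);
    *size = p ? (unsigned int)min_size : 0;
    return p;
}

typedef struct SketchIFD {
    void *entries;
    unsigned int size;            /* capacity recorded for entries, in bytes */
} SketchIFD;

static void sketch_ifd_free(SketchIFD *ifd)
{
    free(ifd->entries);
    ifd->entries = NULL;
    /* Without this reset, a later sketch_fast_realloc() call could see a
     * stale capacity and hand back the NULL pointer as "big enough". */
    ifd->size = 0;
}
```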

Signed-off-by: Leo Izen <leo.izen@gmail.com>
2026-05-01 07:40:24 -04:00
Leo Izen 326808ad2f avcodec/exif.c: add check for singular displaymatrix data
If av_exif_matrix_to_orientation returns 0, then the display matrix
is singular. In this case we should treat it as 1 and print a warning.

Signed-off-by: Leo Izen <leo.izen@gmail.com>
2026-05-01 07:40:24 -04:00
Leo Izen 317d660281 avcodec/exif.c: account for header_mode difference on rewrite
When determining if we need to rewrite the exif buffer or can pass
through as-is, account for a difference in header_mode requested from
the one that is used internally.

Signed-off-by: Leo Izen <leo.izen@gmail.com>
2026-05-01 07:40:24 -04:00
Leo Izen 4f5dfce5a8 avcodec/exif.c: use less than or equal for max width and height
The max width and height for PIXEL_X_TAG and PIXEL_Y_TAG is 0xFFFFu
because these are unsigned shorts, but we used < instead of <=
erroneously. Fix that.

Signed-off-by: Leo Izen <leo.izen@gmail.com>
2026-05-01 07:40:24 -04:00
Leo Izen 2cddfe7d0c avcodec/exif.c: pop entry off IFD if allocation fails
In av_exif_set_entry, if cloning the entry fails because an allocation
fails, we remove the entry from the IFD. If that entry sits in the
middle of ifd->entries, everything after it must be shifted left by one
slot, which this commit implements.
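
The left shift can be sketched on a plain array. The function name and the use of int elements are assumptions for illustration; the real entries are structs, but the memmove arithmetic has the same shape.

```c
#include <stddef.h>
#include <string.h>

/* Remove element idx from an array holding *count items, closing the gap. */
static void sketch_remove_at(int *entries, size_t *count, size_t idx)
{
    if (idx >= *count)
        return;
    /* Move the tail one slot to the left; a zero-length move is fine
     * when idx is the last element. */
    memmove(&entries[idx], &entries[idx + 1],
            (*count - idx - 1) * sizeof(*entries));
    (*count)--;
}
```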

Signed-off-by: Leo Izen <leo.izen@gmail.com>
2026-05-01 07:40:24 -04:00
Leo Izen 0c39b1bccd avcodec/exif.h: fix documentation on av_exif_get_entry and similar
Add additional documentation to av_exif_get_entry and also to
av_exif_set_entry that was already part of the existing ABI but was
insufficiently documented before this commit. Also clarify that
av_exif_set_entry uses av_fast_realloc rather than av_realloc.

Signed-off-by: Leo Izen <leo.izen@gmail.com>
2026-05-01 07:40:24 -04:00
Leo Izen 28e64cad6f avutil/frame.h: re-align dynamic HDR frame data declaration
This is aligned forward by an extra space, because it inherited the
incorrect alignment from the EXIF declaration above it (fixed in the
previous commit).

Signed-off-by: Leo Izen <leo.izen@gmail.com>
2026-05-01 07:40:24 -04:00
Leo Izen 600a8a5a9c avutil/frame.h: fix AV_EXIF_SIDE_DATA declaration
This commit re-aligns the declaration by removing extra whitespace
and fixes the comment above to have the correct acronym. It also
documents what the four magic bytes indicate.

Signed-off-by: Leo Izen <leo.izen@gmail.com>
2026-05-01 07:40:18 -04:00
Dale Curtis a7d42bfba8 avformat/mov: Limit maximum box size for mov_read_lhvc()
Signed-off-by: Dale Curtis <dalecurtis@chromium.org>
2026-04-30 22:50:51 +00:00
Nil Fons Miret e294b390a0 avfilter/vf_unsharp: fix amount scaling in the high-bit-depth path
The 16-bit kernel is dispatched for every non-8-bit pixel format
(9/10/12/16-bit content, all stored in uint16_t). It's supposed to
undo the Q16 scaling that set_filter_param() applies to `amount`:

    fp->amount = amount * 65536.0;

but the shift written in the kernel is `>> (8+nbits)`, which for the
nbits=16 instantiation of the macro comes out to `>> 24` instead of
`>> 16`. Because of this, on any non-8-bit input, unsharp applies ~1/256
of the user's requested strength and is effectively a no-op. The
8-bit kernel (nbits=8) happens to be correct because 8+8 == 16.

This commit also widens the intermediate product to int64 before the
shift, to avoid a potential overflow. Take a 16-bit pixel at the
edge of a sharp white/black region, with the user-facing `amount`
set to its declared maximum of 5.0.

    *srx       = 65535
    blur       = 32768
    diff       = *srx - blur                  = 32767
    amount_q16 = 5.0 * 65536                  = 327680

Then the kernel computes:

    product    = diff * amount_q16
               = 32767 * 327680               = 10,737,090,560     (~1.07e10)

which overflows INT32_MAX. Widening to int64 keeps the
multiplication in range; the subsequent `>> 16` brings it back to
sample range and the final cast to int32 is then safe. The widening
is a semantic no-op for 8/9/10/12-bit content where the product
always fits in int32 (worst case at 12-bit: 4095 * 327680 ~ 1.34e9).
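
The arithmetic above can be checked with a small sketch. The function names are hypothetical and the real kernel contains more than this (rounding of the blur term, clipping of the result); only the widening and the two shift amounts are modelled here.

```c
#include <stdint.h>

/* Widen before multiplying so the product cannot overflow 32 bits. */
static int64_t sketch_product(int32_t diff, int32_t amount_q16)
{
    return (int64_t)diff * amount_q16;
}

/* Correct: undo the Q16 scaling of amount with >> 16. */
static int32_t sketch_delta_fixed(int32_t diff, int32_t amount_q16)
{
    return (int32_t)(sketch_product(diff, amount_q16) >> 16);
}

/* The old 16-bit instantiation: >> (8 + 16), i.e. 256 times too small. */
static int32_t sketch_delta_old(int32_t diff, int32_t amount_q16)
{
    return (int32_t)(sketch_product(diff, amount_q16) >> 24);
}
```

With diff = 32767 and amount_q16 = 327680, the product is 10,737,090,560, the fixed shift yields 163835 (that is, 5 * 32767), and the old shift yields 639.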

Introduced by ee792ebe08 (2019-11-08, "avfilter/vf_unsharp: add 10bit
support"). The fate-filter-unsharp-yuv420p10 reference added in the
same series was generated from the broken kernel and is regenerated
here. fate-filter-unsharp (8-bit) is unaffected.

Repro:

    python3 -c "import numpy as np; y=np.tile(np.where(np.arange(128)//8 & 1, 512, 256).astype('<u2'), (128,1)); c=np.full((64,64), 512, '<u2'); open('in.yuv','wb').write(y.tobytes()+c.tobytes()*2)"

    ffmpeg -f rawvideo -pix_fmt yuv420p10le -s 128x128 -i in.yuv \
        -lavfi "split=2[a][b];[b]unsharp=la=1[bs];[a][bs]psnr" \
        -f null - 2>&1 | grep PSNR

Before: `PSNR y:66.50 ...` -- the filter is effectively a no-op,
        so the sharpened output matches the input almost exactly.
After:  `PSNR y:28.27 ...` -- the filter actually sharpens, so
        output and input differ as expected.

Signed-off-by: Nil Fons Miret <nilf@netflix.com>
Made-with: Cursor
2026-04-30 21:15:58 +00:00
depthfirst-dev[bot] 68ea660d83 avformat/mov: reject dimg references with zero entries
Reject dimg entries with a zero reference count in mov_read_iref_dimg().
This is the earliest point where the parser learns how many input images
a derived HEIF item references, so it is the right place to enforce the
invariant.

If entries == 0 is accepted here, the value is stored in HEIFGrid.nb_tiles,
later propagated by read_image_iovl() into AVStreamGroupTileGrid.nb_tiles,
and finally consumed in istg_parse_tile_grid(), which assumes at least one
tile and reads tg->offsets[tg->nb_tiles - 1]. With zero tiles, that
assumption breaks and leads to the out-of-bounds access seen in ASan.

Fixing the problem at the parser boundary is preferable to adding a later
workaround because it prevents creation of an invalid derived-image state
and stops that malformed state from reaching downstream consumers.

This is also consistent with the HEIF specification. Both iovl and grid
derived images are formed from one or more input images, and for grid the
dimg reference count must equal rows * columns; since rows and columns are
encoded as *_minus_one + 1, that count cannot be zero. A zero dimg entry
count is therefore invalid input and should be rejected when parsed.
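
A minimal sketch of the invariant, assuming errno-style return codes. Both function names are hypothetical and neither mirrors the actual code in mov.c.

```c
#include <errno.h>
#include <stdint.h>

/* Reject a zero reference count at the point where it is first parsed. */
static int sketch_check_dimg_entries(uint32_t entries)
{
    if (entries == 0)
        return -EINVAL;  /* stand-in for whatever error the parser returns */
    return 0;
}

/* Grid tile count from the *_minus_one fields: always at least 1. */
static uint32_t sketch_grid_tiles(uint8_t rows_minus_one, uint8_t cols_minus_one)
{
    return ((uint32_t)rows_minus_one + 1) * ((uint32_t)cols_minus_one + 1);
}
```

Because sketch_grid_tiles() can never return zero, a dimg count of zero can never match a valid grid, so rejecting it early loses no valid input.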
2026-04-30 19:19:07 +00:00
Romain Beauxis 0f6ba39122 avfilter/vf_frei0r: guard against NULL string fields.
2026-04-30 08:33:31 -05:00
Andreas Rheinhardt cc3ca17127 avcodec/x86/qpeldsp{,_init}: Use proper prefix
E.g. rename ff_put_mpeg4_qpel8_h_lowpass_ssse3 to
ff_mpeg4_put_qpel8_h_lowpass_ssse3.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-30 10:39:33 +02:00
Andreas Rheinhardt ca43bc6202 avcodec/x86/qpeldsp_init: Mark functions as hidden
This allows 32-bit PIC code to call the underlying
assembly functions directly, without loading
the GOT first; this saves 1245B of .text here
(for 32-bit PIC code).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-30 10:39:33 +02:00
Andreas Rheinhardt 23d3116af9 avcodec/x86/qpeldsp: Add combination of h_lowpass + l2
If the subpel part of the horizontal component of
the motion vector is 1/4 or 3/4, the MPEG-4 qpel motion compensation
first computes the mc for the corresponding motion vector
with 1/2 horizontal subpel part and then averages this
with the left (for 1/4) or the right (for 3/4) source pixel.
These two stages are currently performed in two different functions,
involving a stack buffer as intermediate.
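
The two-stage and fused forms can be compared in scalar C. This is a sketch under stated assumptions, not the assembly from this commit: the function names are hypothetical, the 8-tap kernel (-1, 3, -6, 20, 20, -6, 3, -1)/32 is used as the half-pel filter, block edges are not mirrored (the caller must provide 3 readable pixels to the left and 4 to the right), only the rounding average is shown, and `right` selects the left (0) or right (1) source pixel.

```c
#include <stdint.h>

static uint8_t sketch_clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Half-pel sample between src[0] and src[1]; reads src[-3] .. src[4]. */
static uint8_t sketch_halfpel(const uint8_t *src)
{
    int v = (src[0]  + src[1]) * 20 - (src[-1] + src[2]) * 6
          + (src[-2] + src[3]) * 3  - (src[-3] + src[4]);
    return sketch_clip_u8((v + 16) >> 5);
}

/* Two stages: half-pel row into a stack buffer, then average (w <= 16). */
static void sketch_qpel_two_stage(uint8_t *dst, const uint8_t *src, int w, int right)
{
    uint8_t tmp[16];
    for (int x = 0; x < w; x++)
        tmp[x] = sketch_halfpel(src + x);
    for (int x = 0; x < w; x++)
        dst[x] = (uint8_t)((tmp[x] + src[x + right] + 1) >> 1);
}

/* Fused: same result in one pass, no intermediate buffer. */
static void sketch_qpel_fused(uint8_t *dst, const uint8_t *src, int w, int right)
{
    for (int x = 0; x < w; x++)
        dst[x] = (uint8_t)((sketch_halfpel(src + x) + src[x + right] + 1) >> 1);
}
```

Both functions produce identical output; the fused form simply avoids the second call and the temporary row.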

This means that horizontal prediction for every function with
a 1/4 or 3/4 horizontal subpel mv is more expensive code-size wise
(and also performance-wise) as it involves two calls. Given that
the horizontal lowpass functions are not that long, adding combinations
of h_lowpass+l2 actually reduces binary size: An increase of 1136B
in the asm files is more than offset by size reductions in
the wrappers: 1968B here when not using stack protection,
2256B when using stack protection.

Of course it also improves performance. Old benchmarks:
avg_qpel_pixels_tab[0][1]_ssse3:                       106.9 ( 8.69x)
avg_qpel_pixels_tab[0][3]_ssse3:                       105.5 ( 8.84x)
avg_qpel_pixels_tab[0][5]_ssse3:                       226.9 ( 8.57x)
avg_qpel_pixels_tab[0][7]_ssse3:                       231.1 ( 8.38x)
avg_qpel_pixels_tab[0][9]_ssse3:                       217.8 ( 9.04x)
avg_qpel_pixels_tab[0][11]_ssse3:                      214.9 ( 9.32x)
avg_qpel_pixels_tab[0][13]_ssse3:                      227.1 ( 8.48x)
avg_qpel_pixels_tab[0][15]_ssse3:                      236.1 ( 8.02x)

New benchmarks:
avg_qpel_pixels_tab[0][1]_ssse3:                        96.7 ( 9.65x)
avg_qpel_pixels_tab[0][3]_ssse3:                        96.6 ( 9.73x)
avg_qpel_pixels_tab[0][5]_ssse3:                       225.8 ( 8.61x)
avg_qpel_pixels_tab[0][7]_ssse3:                       228.4 ( 8.51x)
avg_qpel_pixels_tab[0][9]_ssse3:                       217.1 ( 9.05x)
avg_qpel_pixels_tab[0][11]_ssse3:                      217.8 ( 9.32x)
avg_qpel_pixels_tab[0][13]_ssse3:                      227.2 ( 8.54x)
avg_qpel_pixels_tab[0][15]_ssse3:                      220.5 ( 8.72x)

Note: The l2 functions are also used for vertical lowpass
functions, yet given that they are much bigger, duplicating
them would lead to massive code size increase.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-30 10:39:33 +02:00
Andreas Rheinhardt f946cac2d9 avcodec/x86/qpeldsp: Remove horizontal mmxext mc functions
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-30 10:39:33 +02:00
Andreas Rheinhardt 1d040c527d avcodec/x86/qpeldsp: Add SSSE3 size 8 horizontal filter
Beats the mmxext version by a lot (in the following,
[1][1-3] refers to horizontal-only size 8 mc;
the _sse2 comparators for the other cases use mmxext
horizontal mc coupled with vertical SSE2 mc):

avg_qpel_pixels_tab[1][1]_c:                           223.9 ( 1.00x)
avg_qpel_pixels_tab[1][1]_mmxext:                       66.2 ( 3.38x)
avg_qpel_pixels_tab[1][1]_ssse3:                        36.8 ( 6.08x)
avg_qpel_pixels_tab[1][2]_c:                           251.0 ( 1.00x)
avg_qpel_pixels_tab[1][2]_mmxext:                       58.5 ( 4.29x)
avg_qpel_pixels_tab[1][2]_ssse3:                        25.5 ( 9.84x)
avg_qpel_pixels_tab[1][3]_c:                           226.9 ( 1.00x)
avg_qpel_pixels_tab[1][3]_mmxext:                       66.3 ( 3.42x)
avg_qpel_pixels_tab[1][3]_ssse3:                        35.8 ( 6.34x)
avg_qpel_pixels_tab[1][5]_c:                           473.9 ( 1.00x)
avg_qpel_pixels_tab[1][5]_sse2:                        110.7 ( 4.28x)
avg_qpel_pixels_tab[1][5]_ssse3:                        76.0 ( 6.24x)
avg_qpel_pixels_tab[1][6]_c:                           440.9 ( 1.00x)
avg_qpel_pixels_tab[1][6]_sse2:                        102.1 ( 4.32x)
avg_qpel_pixels_tab[1][6]_ssse3:                        67.1 ( 6.58x)
avg_qpel_pixels_tab[1][7]_c:                           473.8 ( 1.00x)
avg_qpel_pixels_tab[1][7]_sse2:                        108.0 ( 4.39x)
avg_qpel_pixels_tab[1][7]_ssse3:                        74.6 ( 6.35x)
avg_qpel_pixels_tab[1][9]_c:                           492.9 ( 1.00x)
avg_qpel_pixels_tab[1][9]_sse2:                        102.1 ( 4.83x)
avg_qpel_pixels_tab[1][9]_ssse3:                        67.1 ( 7.35x)
avg_qpel_pixels_tab[1][10]_c:                          465.6 ( 1.00x)
avg_qpel_pixels_tab[1][10]_sse2:                        94.9 ( 4.91x)
avg_qpel_pixels_tab[1][10]_ssse3:                       57.5 ( 8.10x)
avg_qpel_pixels_tab[1][11]_c:                          492.8 ( 1.00x)
avg_qpel_pixels_tab[1][11]_sse2:                       102.4 ( 4.81x)
avg_qpel_pixels_tab[1][11]_ssse3:                       68.7 ( 7.17x)
avg_qpel_pixels_tab[1][13]_c:                          476.6 ( 1.00x)
avg_qpel_pixels_tab[1][13]_sse2:                       108.6 ( 4.39x)
avg_qpel_pixels_tab[1][13]_ssse3:                       74.7 ( 6.38x)
avg_qpel_pixels_tab[1][14]_c:                          434.9 ( 1.00x)
avg_qpel_pixels_tab[1][14]_sse2:                       102.2 ( 4.25x)
avg_qpel_pixels_tab[1][14]_ssse3:                       66.6 ( 6.53x)
avg_qpel_pixels_tab[1][15]_c:                          474.1 ( 1.00x)
avg_qpel_pixels_tab[1][15]_sse2:                       107.9 ( 4.39x)
avg_qpel_pixels_tab[1][15]_ssse3:                       74.3 ( 6.38x)
put_no_rnd_qpel_pixels_tab[1][1]_c:                    222.1 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][1]_mmxext:                66.0 ( 3.37x)
put_no_rnd_qpel_pixels_tab[1][1]_ssse3:                 35.2 ( 6.31x)
put_no_rnd_qpel_pixels_tab[1][2]_c:                    212.2 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][2]_mmxext:                56.8 ( 3.74x)
put_no_rnd_qpel_pixels_tab[1][2]_ssse3:                 25.0 ( 8.48x)
put_no_rnd_qpel_pixels_tab[1][3]_c:                    224.5 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][3]_mmxext:                65.8 ( 3.41x)
put_no_rnd_qpel_pixels_tab[1][3]_ssse3:                 35.8 ( 6.26x)
put_no_rnd_qpel_pixels_tab[1][5]_c:                    460.1 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][5]_sse2:                 114.6 ( 4.01x)
put_no_rnd_qpel_pixels_tab[1][5]_ssse3:                 83.1 ( 5.53x)
put_no_rnd_qpel_pixels_tab[1][6]_c:                    438.6 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][6]_sse2:                 104.2 ( 4.21x)
put_no_rnd_qpel_pixels_tab[1][6]_ssse3:                 67.5 ( 6.50x)
put_no_rnd_qpel_pixels_tab[1][7]_c:                    458.0 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][7]_sse2:                 113.8 ( 4.02x)
put_no_rnd_qpel_pixels_tab[1][7]_ssse3:                 79.9 ( 5.73x)
put_no_rnd_qpel_pixels_tab[1][9]_c:                    439.0 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][9]_sse2:                 103.7 ( 4.23x)
put_no_rnd_qpel_pixels_tab[1][9]_ssse3:                 68.9 ( 6.37x)
put_no_rnd_qpel_pixels_tab[1][10]_c:                   427.0 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][10]_sse2:                 93.2 ( 4.58x)
put_no_rnd_qpel_pixels_tab[1][10]_ssse3:                57.9 ( 7.37x)
put_no_rnd_qpel_pixels_tab[1][11]_c:                   439.9 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][11]_sse2:                104.0 ( 4.23x)
put_no_rnd_qpel_pixels_tab[1][11]_ssse3:                69.2 ( 6.36x)
put_no_rnd_qpel_pixels_tab[1][13]_c:                   459.3 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][13]_sse2:                113.2 ( 4.06x)
put_no_rnd_qpel_pixels_tab[1][13]_ssse3:                83.8 ( 5.48x)
put_no_rnd_qpel_pixels_tab[1][14]_c:                   439.5 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][14]_sse2:                103.3 ( 4.25x)
put_no_rnd_qpel_pixels_tab[1][14]_ssse3:                67.9 ( 6.47x)
put_no_rnd_qpel_pixels_tab[1][15]_c:                   453.6 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][15]_sse2:                113.7 ( 3.99x)
put_no_rnd_qpel_pixels_tab[1][15]_ssse3:                80.0 ( 5.67x)
put_qpel_pixels_tab[1][1]_c:                           229.0 ( 1.00x)
put_qpel_pixels_tab[1][1]_mmxext:                       65.5 ( 3.50x)
put_qpel_pixels_tab[1][1]_ssse3:                        33.8 ( 6.77x)
put_qpel_pixels_tab[1][2]_c:                           212.5 ( 1.00x)
put_qpel_pixels_tab[1][2]_mmxext:                       56.6 ( 3.75x)
put_qpel_pixels_tab[1][2]_ssse3:                        23.4 ( 9.08x)
put_qpel_pixels_tab[1][3]_c:                           227.5 ( 1.00x)
put_qpel_pixels_tab[1][3]_mmxext:                       64.4 ( 3.53x)
put_qpel_pixels_tab[1][3]_ssse3:                        33.5 ( 6.79x)
put_qpel_pixels_tab[1][5]_c:                           466.5 ( 1.00x)
put_qpel_pixels_tab[1][5]_sse2:                        106.8 ( 4.37x)
put_qpel_pixels_tab[1][5]_ssse3:                        71.8 ( 6.50x)
put_qpel_pixels_tab[1][6]_c:                           438.7 ( 1.00x)
put_qpel_pixels_tab[1][6]_sse2:                        102.0 ( 4.30x)
put_qpel_pixels_tab[1][6]_ssse3:                        65.3 ( 6.72x)
put_qpel_pixels_tab[1][7]_c:                           466.0 ( 1.00x)
put_qpel_pixels_tab[1][7]_sse2:                        106.3 ( 4.38x)
put_qpel_pixels_tab[1][7]_ssse3:                        70.9 ( 6.57x)
put_qpel_pixels_tab[1][9]_c:                           456.0 ( 1.00x)
put_qpel_pixels_tab[1][9]_sse2:                        100.1 ( 4.55x)
put_qpel_pixels_tab[1][9]_ssse3:                        64.0 ( 7.13x)
put_qpel_pixels_tab[1][10]_c:                          425.1 ( 1.00x)
put_qpel_pixels_tab[1][10]_sse2:                        92.6 ( 4.59x)
put_qpel_pixels_tab[1][10]_ssse3:                       55.1 ( 7.71x)
put_qpel_pixels_tab[1][11]_c:                          452.7 ( 1.00x)
put_qpel_pixels_tab[1][11]_sse2:                        99.6 ( 4.55x)
put_qpel_pixels_tab[1][11]_ssse3:                       63.8 ( 7.09x)
put_qpel_pixels_tab[1][13]_c:                          471.2 ( 1.00x)
put_qpel_pixels_tab[1][13]_sse2:                       106.4 ( 4.43x)
put_qpel_pixels_tab[1][13]_ssse3:                       71.4 ( 6.60x)
put_qpel_pixels_tab[1][14]_c:                          439.7 ( 1.00x)
put_qpel_pixels_tab[1][14]_sse2:                       101.8 ( 4.32x)
put_qpel_pixels_tab[1][14]_ssse3:                       64.8 ( 6.79x)
put_qpel_pixels_tab[1][15]_c:                          467.8 ( 1.00x)
put_qpel_pixels_tab[1][15]_sse2:                       106.1 ( 4.41x)
put_qpel_pixels_tab[1][15]_ssse3:                       72.6 ( 6.44x)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-30 10:39:33 +02:00
Andreas Rheinhardt c0e1c1d6b3 avcodec/x86/qpeldsp: Add SSSE3 size 16 horizontal filter
Beats the mmxext version by a lot (in the following,
[0][1-3] refers to horizontal-only size 16 mc;
the _sse2 comparators for the other cases use mmxext
horizontal mc coupled with vertical SSE2 mc):

avg_qpel_pixels_tab[0][1]_c:                           945.5 ( 1.00x)
avg_qpel_pixels_tab[0][1]_mmxext:                      262.6 ( 3.60x)
avg_qpel_pixels_tab[0][1]_ssse3:                       110.4 ( 8.57x)
avg_qpel_pixels_tab[0][2]_c:                          1042.1 ( 1.00x)
avg_qpel_pixels_tab[0][2]_mmxext:                      245.1 ( 4.25x)
avg_qpel_pixels_tab[0][2]_ssse3:                        91.7 (11.37x)
avg_qpel_pixels_tab[0][3]_c:                           941.8 ( 1.00x)
avg_qpel_pixels_tab[0][3]_mmxext:                      260.1 ( 3.62x)
avg_qpel_pixels_tab[0][3]_ssse3:                       110.1 ( 8.56x)
avg_qpel_pixels_tab[0][5]_c:                          1939.5 ( 1.00x)
avg_qpel_pixels_tab[0][5]_sse2:                        394.3 ( 4.92x)
avg_qpel_pixels_tab[0][5]_ssse3:                       247.4 ( 7.84x)
avg_qpel_pixels_tab[0][6]_c:                          1785.8 ( 1.00x)
avg_qpel_pixels_tab[0][6]_sse2:                        380.6 ( 4.69x)
avg_qpel_pixels_tab[0][6]_ssse3:                       221.1 ( 8.08x)
avg_qpel_pixels_tab[0][7]_c:                          1932.5 ( 1.00x)
avg_qpel_pixels_tab[0][7]_sse2:                        393.4 ( 4.91x)
avg_qpel_pixels_tab[0][7]_ssse3:                       238.8 ( 8.09x)
avg_qpel_pixels_tab[0][9]_c:                          1976.9 ( 1.00x)
avg_qpel_pixels_tab[0][9]_sse2:                        380.8 ( 5.19x)
avg_qpel_pixels_tab[0][9]_ssse3:                       223.3 ( 8.85x)
avg_qpel_pixels_tab[0][10]_c:                         1911.9 ( 1.00x)
avg_qpel_pixels_tab[0][10]_sse2:                       366.9 ( 5.21x)
avg_qpel_pixels_tab[0][10]_ssse3:                      207.0 ( 9.24x)
avg_qpel_pixels_tab[0][11]_c:                         2046.9 ( 1.00x)
avg_qpel_pixels_tab[0][11]_sse2:                       385.5 ( 5.31x)
avg_qpel_pixels_tab[0][11]_ssse3:                      227.9 ( 8.98x)
avg_qpel_pixels_tab[0][13]_c:                         1940.8 ( 1.00x)
avg_qpel_pixels_tab[0][13]_sse2:                       389.7 ( 4.98x)
avg_qpel_pixels_tab[0][13]_ssse3:                      244.2 ( 7.95x)
avg_qpel_pixels_tab[0][14]_c:                         1778.4 ( 1.00x)
avg_qpel_pixels_tab[0][14]_sse2:                       379.2 ( 4.69x)
avg_qpel_pixels_tab[0][14]_ssse3:                      223.5 ( 7.96x)
avg_qpel_pixels_tab[0][15]_c:                         1905.9 ( 1.00x)
avg_qpel_pixels_tab[0][15]_sse2:                       398.9 ( 4.78x)
avg_qpel_pixels_tab[0][15]_ssse3:                      238.3 ( 8.00x)
put_no_rnd_qpel_pixels_tab[0][1]_c:                    922.5 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][1]_mmxext:               275.0 ( 3.35x)
put_no_rnd_qpel_pixels_tab[0][1]_ssse3:                108.4 ( 8.51x)
put_no_rnd_qpel_pixels_tab[0][2]_c:                    889.7 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][2]_mmxext:               236.7 ( 3.76x)
put_no_rnd_qpel_pixels_tab[0][2]_ssse3:                 86.8 (10.25x)
put_no_rnd_qpel_pixels_tab[0][3]_c:                    915.5 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][3]_mmxext:               274.3 ( 3.34x)
put_no_rnd_qpel_pixels_tab[0][3]_ssse3:                108.2 ( 8.46x)
put_no_rnd_qpel_pixels_tab[0][5]_sse2:                 400.0 ( 4.63x)
put_no_rnd_qpel_pixels_tab[0][5]_ssse3:                246.0 ( 7.53x)
put_no_rnd_qpel_pixels_tab[0][6]_c:                   1753.9 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][6]_sse2:                 382.5 ( 4.59x)
put_no_rnd_qpel_pixels_tab[0][6]_ssse3:                226.4 ( 7.75x)
put_no_rnd_qpel_pixels_tab[0][7]_c:                   1854.6 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][7]_sse2:                 393.5 ( 4.71x)
put_no_rnd_qpel_pixels_tab[0][7]_ssse3:                248.6 ( 7.46x)
put_no_rnd_qpel_pixels_tab[0][9]_c:                   1794.3 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][9]_sse2:                 382.2 ( 4.70x)
put_no_rnd_qpel_pixels_tab[0][9]_ssse3:                228.0 ( 7.87x)
put_no_rnd_qpel_pixels_tab[0][10]_c:                  1724.7 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][10]_sse2:                353.8 ( 4.88x)
put_no_rnd_qpel_pixels_tab[0][10]_ssse3:               206.5 ( 8.35x)
put_no_rnd_qpel_pixels_tab[0][11]_c:                  1796.3 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][11]_sse2:                378.1 ( 4.75x)
put_no_rnd_qpel_pixels_tab[0][11]_ssse3:               227.1 ( 7.91x)
put_no_rnd_qpel_pixels_tab[0][13]_c:                  1834.4 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][13]_sse2:                400.7 ( 4.58x)
put_no_rnd_qpel_pixels_tab[0][13]_ssse3:               244.2 ( 7.51x)
put_no_rnd_qpel_pixels_tab[0][14]_c:                  1755.7 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][14]_sse2:                387.2 ( 4.53x)
put_no_rnd_qpel_pixels_tab[0][14]_ssse3:               226.8 ( 7.74x)
put_no_rnd_qpel_pixels_tab[0][15]_c:                  1847.3 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][15]_sse2:                400.6 ( 4.61x)
put_no_rnd_qpel_pixels_tab[0][15]_ssse3:               246.1 ( 7.51x)
put_qpel_pixels_tab[0][1]_c:                           919.6 ( 1.00x)
put_qpel_pixels_tab[0][1]_mmxext:                      255.5 ( 3.60x)
put_qpel_pixels_tab[0][1]_ssse3:                       108.3 ( 8.49x)
put_qpel_pixels_tab[0][2]_c:                           883.9 ( 1.00x)
put_qpel_pixels_tab[0][2]_mmxext:                      238.1 ( 3.71x)
put_qpel_pixels_tab[0][2]_ssse3:                        86.7 (10.19x)
put_qpel_pixels_tab[0][3]_c:                           921.9 ( 1.00x)
put_qpel_pixels_tab[0][3]_mmxext:                      258.9 ( 3.56x)
put_qpel_pixels_tab[0][3]_ssse3:                       108.1 ( 8.53x)
put_qpel_pixels_tab[0][5]_c:                          1907.5 ( 1.00x)
put_qpel_pixels_tab[0][5]_sse2:                        384.2 ( 4.96x)
put_qpel_pixels_tab[0][5]_ssse3:                       234.8 ( 8.13x)
put_qpel_pixels_tab[0][6]_c:                          1757.4 ( 1.00x)
put_qpel_pixels_tab[0][6]_sse2:                        382.8 ( 4.59x)
put_qpel_pixels_tab[0][6]_ssse3:                       217.6 ( 8.08x)
put_qpel_pixels_tab[0][7]_c:                          1927.5 ( 1.00x)
put_qpel_pixels_tab[0][7]_sse2:                        384.6 ( 5.01x)
put_qpel_pixels_tab[0][7]_ssse3:                       231.2 ( 8.34x)
put_qpel_pixels_tab[0][9]_c:                          1832.1 ( 1.00x)
put_qpel_pixels_tab[0][9]_sse2:                        374.8 ( 4.89x)
put_qpel_pixels_tab[0][9]_ssse3:                       219.4 ( 8.35x)
put_qpel_pixels_tab[0][10]_c:                         1710.3 ( 1.00x)
put_qpel_pixels_tab[0][10]_sse2:                       384.5 ( 4.45x)
put_qpel_pixels_tab[0][10]_ssse3:                      202.9 ( 8.43x)
put_qpel_pixels_tab[0][11]_c:                         1825.0 ( 1.00x)
put_qpel_pixels_tab[0][11]_sse2:                       369.6 ( 4.94x)
put_qpel_pixels_tab[0][11]_ssse3:                      216.8 ( 8.42x)
put_qpel_pixels_tab[0][13]_c:                         1898.4 ( 1.00x)
put_qpel_pixels_tab[0][13]_sse2:                       384.9 ( 4.93x)
put_qpel_pixels_tab[0][13]_ssse3:                      238.6 ( 7.96x)
put_qpel_pixels_tab[0][14]_c:                         1779.1 ( 1.00x)
put_qpel_pixels_tab[0][14]_sse2:                       373.3 ( 4.77x)
put_qpel_pixels_tab[0][14]_ssse3:                      218.1 ( 8.16x)
put_qpel_pixels_tab[0][15]_c:                         1918.2 ( 1.00x)
put_qpel_pixels_tab[0][15]_sse2:                       385.3 ( 4.98x)
put_qpel_pixels_tab[0][15]_ssse3:                      236.8 ( 8.10x)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-30 10:39:33 +02:00
Andreas Rheinhardt a3d747f344 avcodec/x86/qpeldsp{,_init}: Use SSE2 pixels16x16_l2 functions
put and avg versions have been added and used in H264
in b91081274f. This commit
adds the size 16 version of put_no_rnd and uses all three
of them in the SSE2 size 16 qpel functions (i.e. it uses
them in the ones that have a vertical component); it also
removes the 16x17 MMXEXT versions (which are no longer used).

This is particularly beneficial for put_no_rnd:
avg_qpel_pixels_tab[0][5]_c:                          1910.9 ( 1.00x)
avg_qpel_pixels_tab[0][5]_sse2 (old):                  405.1 ( 4.72x)
avg_qpel_pixels_tab[0][5]_sse2:                        392.9 ( 4.86x)
avg_qpel_pixels_tab[0][6]_c:                          1778.9 ( 1.00x)
avg_qpel_pixels_tab[0][6]_sse2 (old):                  385.5 ( 4.61x)
avg_qpel_pixels_tab[0][6]_sse2:                        374.9 ( 4.75x)
avg_qpel_pixels_tab[0][7]_c:                          1935.3 ( 1.00x)
avg_qpel_pixels_tab[0][7]_sse2 (old):                  403.1 ( 4.80x)
avg_qpel_pixels_tab[0][7]_sse2:                        391.6 ( 4.94x)
avg_qpel_pixels_tab[0][9]_c:                          1969.0 ( 1.00x)
avg_qpel_pixels_tab[0][9]_sse2 (old):                  384.1 ( 5.13x)
avg_qpel_pixels_tab[0][9]_sse2:                        380.3 ( 5.18x)
avg_qpel_pixels_tab[0][11]_c:                         2014.9 ( 1.00x)
avg_qpel_pixels_tab[0][11]_sse2 (old):                 385.6 ( 5.23x)
avg_qpel_pixels_tab[0][11]_sse2:                       380.2 ( 5.30x)
avg_qpel_pixels_tab[0][13]_c:                         1925.7 ( 1.00x)
avg_qpel_pixels_tab[0][13]_sse2 (old):                 406.1 ( 4.74x)
avg_qpel_pixels_tab[0][13]_sse2:                       390.4 ( 4.93x)
avg_qpel_pixels_tab[0][14]_c:                         1793.0 ( 1.00x)
avg_qpel_pixels_tab[0][14]_sse2 (old):                 389.6 ( 4.60x)
avg_qpel_pixels_tab[0][14]_sse2:                       377.1 ( 4.75x)
avg_qpel_pixels_tab[0][15]_c:                         1913.0 ( 1.00x)
avg_qpel_pixels_tab[0][15]_sse2 (old):                 404.2 ( 4.73x)
avg_qpel_pixels_tab[0][15]_sse2:                       390.8 ( 4.89x)
put_no_rnd_qpel_pixels_tab[0][5]_c:                   1864.1 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][5]_sse2 (old):           425.6 ( 4.38x)
put_no_rnd_qpel_pixels_tab[0][5]_sse2:                 396.2 ( 4.71x)
put_no_rnd_qpel_pixels_tab[0][6]_c:                   1767.1 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][6]_sse2 (old):           388.4 ( 4.55x)
put_no_rnd_qpel_pixels_tab[0][6]_sse2:                 377.7 ( 4.68x)
put_no_rnd_qpel_pixels_tab[0][7]_c:                   1874.9 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][7]_sse2 (old):           427.6 ( 4.38x)
put_no_rnd_qpel_pixels_tab[0][7]_sse2:                 400.0 ( 4.69x)
put_no_rnd_qpel_pixels_tab[0][9]_c:                   1759.7 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][9]_sse2 (old):           393.0 ( 4.48x)
put_no_rnd_qpel_pixels_tab[0][9]_sse2:                 379.7 ( 4.63x)
put_no_rnd_qpel_pixels_tab[0][11]_c:                  1820.9 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][11]_sse2 (old):          392.7 ( 4.64x)
put_no_rnd_qpel_pixels_tab[0][11]_sse2:                377.4 ( 4.82x)
put_no_rnd_qpel_pixels_tab[0][13]_c:                  1841.2 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][13]_sse2 (old):          427.1 ( 4.31x)
put_no_rnd_qpel_pixels_tab[0][13]_sse2:                395.9 ( 4.65x)
put_no_rnd_qpel_pixels_tab[0][14]_c:                  1761.3 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][14]_sse2 (old):          392.3 ( 4.49x)
put_no_rnd_qpel_pixels_tab[0][14]_sse2:                375.9 ( 4.69x)
put_no_rnd_qpel_pixels_tab[0][15]_c:                  1869.1 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][15]_sse2 (old):          425.6 ( 4.39x)
put_no_rnd_qpel_pixels_tab[0][15]_sse2:                397.3 ( 4.70x)
put_qpel_pixels_tab[0][5]_c:                          1888.2 ( 1.00x)
put_qpel_pixels_tab[0][5]_sse2 (old):                  396.5 ( 4.76x)
put_qpel_pixels_tab[0][5]_sse2:                        382.5 ( 4.94x)
put_qpel_pixels_tab[0][6]_c:                          1760.4 ( 1.00x)
put_qpel_pixels_tab[0][6]_sse2 (old):                  377.0 ( 4.67x)
put_qpel_pixels_tab[0][6]_sse2:                        372.1 ( 4.73x)
put_qpel_pixels_tab[0][7]_c:                          1927.6 ( 1.00x)
put_qpel_pixels_tab[0][7]_sse2 (old):                  396.5 ( 4.86x)
put_qpel_pixels_tab[0][7]_sse2:                        383.4 ( 5.03x)
put_qpel_pixels_tab[0][9]_c:                          1775.9 ( 1.00x)
put_qpel_pixels_tab[0][9]_sse2 (old):                  377.9 ( 4.70x)
put_qpel_pixels_tab[0][9]_sse2:                        372.3 ( 4.77x)
put_qpel_pixels_tab[0][11]_c:                         1809.0 ( 1.00x)
put_qpel_pixels_tab[0][11]_sse2 (old):                 374.6 ( 4.83x)
put_qpel_pixels_tab[0][11]_sse2:                       380.3 ( 4.76x)
put_qpel_pixels_tab[0][13]_c:                         1893.2 ( 1.00x)
put_qpel_pixels_tab[0][13]_sse2 (old):                 399.2 ( 4.74x)
put_qpel_pixels_tab[0][13]_sse2:                       384.7 ( 4.92x)
put_qpel_pixels_tab[0][14]_c:                         1756.2 ( 1.00x)
put_qpel_pixels_tab[0][14]_sse2 (old):                 377.9 ( 4.65x)
put_qpel_pixels_tab[0][14]_sse2:                       374.4 ( 4.69x)
put_qpel_pixels_tab[0][15]_c:                         1922.8 ( 1.00x)
put_qpel_pixels_tab[0][15]_sse2 (old):                 399.0 ( 4.82x)
put_qpel_pixels_tab[0][15]_sse2:                       387.8 ( 4.96x)

The purely vertical size 16 mc functions now no longer use any MMX.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-30 10:39:33 +02:00
Andreas Rheinhardt dad0c01076 avcodec/x86/qpeldsp: Remove vertical MMXEXT mc functions
Superseded by SSE2.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-30 10:39:33 +02:00
Andreas Rheinhardt 9beecb2670 avcodec/x86/qpeldsp: Add SSE2 vertical lowpass functions
Benchmarks ([4], [8] and [12] are pure vertical functions
and therefore show the biggest improvements):

avg_qpel_pixels_tab[0][4]_c:                           844.5 ( 1.00x)
avg_qpel_pixels_tab[0][4]_mmxext:                      225.5 ( 3.74x)
avg_qpel_pixels_tab[0][4]_sse2:                        146.6 ( 5.76x)
avg_qpel_pixels_tab[0][5]_c:                          1915.9 ( 1.00x)
avg_qpel_pixels_tab[0][5]_mmxext:                      499.6 ( 3.83x)
avg_qpel_pixels_tab[0][5]_sse2:                        405.5 ( 4.72x)
avg_qpel_pixels_tab[0][6]_c:                          1775.9 ( 1.00x)
avg_qpel_pixels_tab[0][6]_mmxext:                      484.9 ( 3.66x)
avg_qpel_pixels_tab[0][6]_sse2:                        385.4 ( 4.61x)
avg_qpel_pixels_tab[0][7]_c:                          1937.0 ( 1.00x)
avg_qpel_pixels_tab[0][7]_mmxext:                      501.3 ( 3.86x)
avg_qpel_pixels_tab[0][7]_sse2:                        403.6 ( 4.80x)
avg_qpel_pixels_tab[0][8]_c:                           976.7 ( 1.00x)
avg_qpel_pixels_tab[0][8]_mmxext:                      216.9 ( 4.50x)
avg_qpel_pixels_tab[0][8]_sse2:                        113.1 ( 8.64x)
avg_qpel_pixels_tab[0][9]_c:                          1971.8 ( 1.00x)
avg_qpel_pixels_tab[0][9]_mmxext:                      494.9 ( 3.98x)
avg_qpel_pixels_tab[0][9]_sse2:                        388.3 ( 5.08x)
avg_qpel_pixels_tab[0][10]_c:                         1900.8 ( 1.00x)
avg_qpel_pixels_tab[0][10]_mmxext:                     476.4 ( 3.99x)
avg_qpel_pixels_tab[0][10]_sse2:                       362.4 ( 5.24x)
avg_qpel_pixels_tab[0][11]_c:                         2003.3 ( 1.00x)
avg_qpel_pixels_tab[0][11]_mmxext:                     496.5 ( 4.04x)
avg_qpel_pixels_tab[0][11]_sse2:                       385.9 ( 5.19x)
avg_qpel_pixels_tab[0][12]_c:                          841.8 ( 1.00x)
avg_qpel_pixels_tab[0][12]_mmxext:                     226.7 ( 3.71x)
avg_qpel_pixels_tab[0][12]_sse2:                       143.3 ( 5.87x)
avg_qpel_pixels_tab[0][13]_c:                         1929.0 ( 1.00x)
avg_qpel_pixels_tab[0][13]_mmxext:                     499.6 ( 3.86x)
avg_qpel_pixels_tab[0][13]_sse2:                       412.1 ( 4.68x)
avg_qpel_pixels_tab[0][14]_c:                         1777.9 ( 1.00x)
avg_qpel_pixels_tab[0][14]_mmxext:                     484.8 ( 3.67x)
avg_qpel_pixels_tab[0][14]_sse2:                       385.9 ( 4.61x)
avg_qpel_pixels_tab[0][15]_c:                         1914.8 ( 1.00x)
avg_qpel_pixels_tab[0][15]_mmxext:                     501.8 ( 3.82x)
avg_qpel_pixels_tab[0][15]_sse2:                       405.0 ( 4.73x)
avg_qpel_pixels_tab[1][4]_c:                           203.4 ( 1.00x)
avg_qpel_pixels_tab[1][4]_mmxext:                       64.7 ( 3.14x)
avg_qpel_pixels_tab[1][4]_sse2:                         40.3 ( 5.05x)
avg_qpel_pixels_tab[1][5]_c:                           488.8 ( 1.00x)
avg_qpel_pixels_tab[1][5]_mmxext:                      134.6 ( 3.63x)
avg_qpel_pixels_tab[1][5]_sse2:                        108.5 ( 4.50x)
avg_qpel_pixels_tab[1][6]_c:                           448.2 ( 1.00x)
avg_qpel_pixels_tab[1][6]_mmxext:                      128.8 ( 3.48x)
avg_qpel_pixels_tab[1][6]_sse2:                        102.5 ( 4.37x)
avg_qpel_pixels_tab[1][7]_c:                           489.6 ( 1.00x)
avg_qpel_pixels_tab[1][7]_mmxext:                      134.5 ( 3.64x)
avg_qpel_pixels_tab[1][7]_sse2:                        108.8 ( 4.50x)
avg_qpel_pixels_tab[1][8]_c:                           223.8 ( 1.00x)
avg_qpel_pixels_tab[1][8]_mmxext:                       57.5 ( 3.89x)
avg_qpel_pixels_tab[1][8]_sse2:                         36.3 ( 6.16x)
avg_qpel_pixels_tab[1][9]_c:                           496.6 ( 1.00x)
avg_qpel_pixels_tab[1][9]_mmxext:                      129.8 ( 3.82x)
avg_qpel_pixels_tab[1][9]_sse2:                        105.1 ( 4.72x)
avg_qpel_pixels_tab[1][10]_c:                          466.1 ( 1.00x)
avg_qpel_pixels_tab[1][10]_mmxext:                     123.2 ( 3.78x)
avg_qpel_pixels_tab[1][10]_sse2:                        99.1 ( 4.70x)
avg_qpel_pixels_tab[1][11]_c:                          497.9 ( 1.00x)
avg_qpel_pixels_tab[1][11]_mmxext:                     129.9 ( 3.83x)
avg_qpel_pixels_tab[1][11]_sse2:                       105.4 ( 4.72x)
avg_qpel_pixels_tab[1][12]_c:                          203.5 ( 1.00x)
avg_qpel_pixels_tab[1][12]_mmxext:                      63.8 ( 3.19x)
avg_qpel_pixels_tab[1][12]_sse2:                        38.8 ( 5.25x)
avg_qpel_pixels_tab[1][13]_c:                          487.9 ( 1.00x)
avg_qpel_pixels_tab[1][13]_mmxext:                     134.7 ( 3.62x)
avg_qpel_pixels_tab[1][13]_sse2:                       108.4 ( 4.50x)
avg_qpel_pixels_tab[1][14]_c:                          447.4 ( 1.00x)
avg_qpel_pixels_tab[1][14]_mmxext:                     128.2 ( 3.49x)
avg_qpel_pixels_tab[1][14]_sse2:                       102.4 ( 4.37x)
avg_qpel_pixels_tab[1][15]_c:                          487.5 ( 1.00x)
avg_qpel_pixels_tab[1][15]_mmxext:                     134.0 ( 3.64x)
avg_qpel_pixels_tab[1][15]_sse2:                       109.9 ( 4.44x)

put_no_rnd_qpel_pixels_tab[0][4]_c:                    825.5 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][4]_mmxext:               242.5 ( 3.40x)
put_no_rnd_qpel_pixels_tab[0][4]_sse2:                 136.0 ( 6.07x)
put_no_rnd_qpel_pixels_tab[0][5]_c:                   1837.4 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][5]_mmxext:               542.5 ( 3.39x)
put_no_rnd_qpel_pixels_tab[0][5]_sse2:                 446.5 ( 4.11x)
put_no_rnd_qpel_pixels_tab[0][6]_c:                   1766.3 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][6]_mmxext:               493.6 ( 3.58x)
put_no_rnd_qpel_pixels_tab[0][6]_sse2:                 394.6 ( 4.48x)
put_no_rnd_qpel_pixels_tab[0][7]_c:                   1877.4 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][7]_mmxext:               541.9 ( 3.46x)
put_no_rnd_qpel_pixels_tab[0][7]_sse2:                 447.6 ( 4.19x)
put_no_rnd_qpel_pixels_tab[0][8]_c:                    785.1 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][8]_mmxext:               206.2 ( 3.81x)
put_no_rnd_qpel_pixels_tab[0][8]_sse2:                 101.6 ( 7.73x)
put_no_rnd_qpel_pixels_tab[0][9]_c:                   1772.2 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][9]_mmxext:               489.5 ( 3.62x)
put_no_rnd_qpel_pixels_tab[0][9]_sse2:                 394.8 ( 4.49x)
put_no_rnd_qpel_pixels_tab[0][10]_c:                  1711.5 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][10]_mmxext:              461.2 ( 3.71x)
put_no_rnd_qpel_pixels_tab[0][10]_sse2:                357.9 ( 4.78x)
put_no_rnd_qpel_pixels_tab[0][11]_c:                  1815.9 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][11]_mmxext:              490.8 ( 3.70x)
put_no_rnd_qpel_pixels_tab[0][11]_sse2:                394.0 ( 4.61x)
put_no_rnd_qpel_pixels_tab[0][12]_c:                   824.8 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][12]_mmxext:              242.9 ( 3.40x)
put_no_rnd_qpel_pixels_tab[0][12]_sse2:                135.3 ( 6.10x)
put_no_rnd_qpel_pixels_tab[0][13]_c:                  1843.5 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][13]_mmxext:              545.4 ( 3.38x)
put_no_rnd_qpel_pixels_tab[0][13]_sse2:                444.9 ( 4.14x)
put_no_rnd_qpel_pixels_tab[0][14]_c:                  1758.1 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][14]_mmxext:              497.7 ( 3.53x)
put_no_rnd_qpel_pixels_tab[0][14]_sse2:                393.5 ( 4.47x)
put_no_rnd_qpel_pixels_tab[0][15]_c:                  1861.3 ( 1.00x)
put_no_rnd_qpel_pixels_tab[0][15]_mmxext:              545.0 ( 3.42x)
put_no_rnd_qpel_pixels_tab[0][15]_sse2:                445.7 ( 4.18x)
put_no_rnd_qpel_pixels_tab[1][4]_c:                    198.3 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][4]_mmxext:                64.3 ( 3.08x)
put_no_rnd_qpel_pixels_tab[1][4]_sse2:                  39.8 ( 4.98x)
put_no_rnd_qpel_pixels_tab[1][5]_c:                    460.7 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][5]_mmxext:               137.2 ( 3.36x)
put_no_rnd_qpel_pixels_tab[1][5]_sse2:                 113.5 ( 4.06x)
put_no_rnd_qpel_pixels_tab[1][6]_c:                    441.4 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][6]_mmxext:               126.7 ( 3.49x)
put_no_rnd_qpel_pixels_tab[1][6]_sse2:                 103.7 ( 4.26x)
put_no_rnd_qpel_pixels_tab[1][7]_c:                    465.9 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][7]_mmxext:               137.7 ( 3.38x)
put_no_rnd_qpel_pixels_tab[1][7]_sse2:                 114.0 ( 4.09x)
put_no_rnd_qpel_pixels_tab[1][8]_c:                    193.8 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][8]_mmxext:                52.1 ( 3.72x)
put_no_rnd_qpel_pixels_tab[1][8]_sse2:                  27.8 ( 6.97x)
put_no_rnd_qpel_pixels_tab[1][9]_c:                    450.9 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][9]_mmxext:               126.2 ( 3.57x)
put_no_rnd_qpel_pixels_tab[1][9]_sse2:                 104.3 ( 4.32x)
put_no_rnd_qpel_pixels_tab[1][10]_c:                   436.5 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][10]_mmxext:              118.1 ( 3.69x)
put_no_rnd_qpel_pixels_tab[1][10]_sse2:                 92.4 ( 4.73x)
put_no_rnd_qpel_pixels_tab[1][11]_c:                   453.6 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][11]_mmxext:              128.7 ( 3.52x)
put_no_rnd_qpel_pixels_tab[1][11]_sse2:                103.6 ( 4.38x)
put_no_rnd_qpel_pixels_tab[1][12]_c:                   201.2 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][12]_mmxext:               64.2 ( 3.13x)
put_no_rnd_qpel_pixels_tab[1][12]_sse2:                 39.6 ( 5.08x)
put_no_rnd_qpel_pixels_tab[1][13]_c:                   461.9 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][13]_mmxext:              137.6 ( 3.36x)
put_no_rnd_qpel_pixels_tab[1][13]_sse2:                113.4 ( 4.07x)
put_no_rnd_qpel_pixels_tab[1][14]_c:                   442.6 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][14]_mmxext:              127.0 ( 3.49x)
put_no_rnd_qpel_pixels_tab[1][14]_sse2:                102.2 ( 4.33x)
put_no_rnd_qpel_pixels_tab[1][15]_c:                   462.9 ( 1.00x)
put_no_rnd_qpel_pixels_tab[1][15]_mmxext:              139.5 ( 3.32x)
put_no_rnd_qpel_pixels_tab[1][15]_sse2:                113.3 ( 4.09x)

put_qpel_pixels_tab[0][4]_c:                           824.6 ( 1.00x)
put_qpel_pixels_tab[0][4]_mmxext:                      220.1 ( 3.75x)
put_qpel_pixels_tab[0][4]_sse2:                        137.8 ( 5.98x)
put_qpel_pixels_tab[0][5]_c:                          1892.0 ( 1.00x)
put_qpel_pixels_tab[0][5]_mmxext:                      508.0 ( 3.72x)
put_qpel_pixels_tab[0][5]_sse2:                        408.6 ( 4.63x)
put_qpel_pixels_tab[0][6]_c:                          1758.0 ( 1.00x)
put_qpel_pixels_tab[0][6]_mmxext:                      476.7 ( 3.69x)
put_qpel_pixels_tab[0][6]_sse2:                        381.4 ( 4.61x)
put_qpel_pixels_tab[0][7]_c:                          1924.3 ( 1.00x)
put_qpel_pixels_tab[0][7]_mmxext:                      495.1 ( 3.89x)
put_qpel_pixels_tab[0][7]_sse2:                        417.2 ( 4.61x)
put_qpel_pixels_tab[0][8]_c:                           772.1 ( 1.00x)
put_qpel_pixels_tab[0][8]_mmxext:                      197.5 ( 3.91x)
put_qpel_pixels_tab[0][8]_sse2:                        118.4 ( 6.52x)
put_qpel_pixels_tab[0][9]_c:                          1778.2 ( 1.00x)
put_qpel_pixels_tab[0][9]_mmxext:                      476.7 ( 3.73x)
put_qpel_pixels_tab[0][9]_sse2:                        379.6 ( 4.68x)
put_qpel_pixels_tab[0][10]_c:                         1714.6 ( 1.00x)
put_qpel_pixels_tab[0][10]_mmxext:                     460.7 ( 3.72x)
put_qpel_pixels_tab[0][10]_sse2:                       386.8 ( 4.43x)
put_qpel_pixels_tab[0][11]_c:                         1819.1 ( 1.00x)
put_qpel_pixels_tab[0][11]_mmxext:                     474.9 ( 3.83x)
put_qpel_pixels_tab[0][11]_sse2:                       404.5 ( 4.50x)
put_qpel_pixels_tab[0][12]_c:                          829.7 ( 1.00x)
put_qpel_pixels_tab[0][12]_mmxext:                     221.5 ( 3.75x)
put_qpel_pixels_tab[0][12]_sse2:                       138.7 ( 5.98x)
put_qpel_pixels_tab[0][13]_c:                         1892.8 ( 1.00x)
put_qpel_pixels_tab[0][13]_mmxext:                     494.4 ( 3.83x)
put_qpel_pixels_tab[0][13]_sse2:                       413.9 ( 4.57x)
put_qpel_pixels_tab[0][14]_c:                         1763.1 ( 1.00x)
put_qpel_pixels_tab[0][14]_mmxext:                     473.4 ( 3.72x)
put_qpel_pixels_tab[0][14]_sse2:                       377.8 ( 4.67x)
put_qpel_pixels_tab[0][15]_c:                         1896.4 ( 1.00x)
put_qpel_pixels_tab[0][15]_mmxext:                     492.5 ( 3.85x)
put_qpel_pixels_tab[0][15]_sse2:                       399.0 ( 4.75x)
put_qpel_pixels_tab[1][4]_c:                           198.6 ( 1.00x)
put_qpel_pixels_tab[1][4]_mmxext:                       60.9 ( 3.26x)
put_qpel_pixels_tab[1][4]_sse2:                         40.1 ( 4.95x)
put_qpel_pixels_tab[1][5]_c:                           471.4 ( 1.00x)
put_qpel_pixels_tab[1][5]_mmxext:                      131.8 ( 3.58x)
put_qpel_pixels_tab[1][5]_sse2:                        107.2 ( 4.40x)
put_qpel_pixels_tab[1][6]_c:                           440.3 ( 1.00x)
put_qpel_pixels_tab[1][6]_mmxext:                      126.3 ( 3.49x)
put_qpel_pixels_tab[1][6]_sse2:                        100.6 ( 4.38x)
put_qpel_pixels_tab[1][7]_c:                           469.2 ( 1.00x)
put_qpel_pixels_tab[1][7]_mmxext:                      131.7 ( 3.56x)
put_qpel_pixels_tab[1][7]_sse2:                        106.9 ( 4.39x)
put_qpel_pixels_tab[1][8]_c:                           194.2 ( 1.00x)
put_qpel_pixels_tab[1][8]_mmxext:                       52.9 ( 3.67x)
put_qpel_pixels_tab[1][8]_sse2:                         28.0 ( 6.95x)
put_qpel_pixels_tab[1][9]_c:                           464.6 ( 1.00x)
put_qpel_pixels_tab[1][9]_mmxext:                      125.1 ( 3.71x)
put_qpel_pixels_tab[1][9]_sse2:                        100.9 ( 4.60x)
put_qpel_pixels_tab[1][10]_c:                          433.8 ( 1.00x)
put_qpel_pixels_tab[1][10]_mmxext:                     118.2 ( 3.67x)
put_qpel_pixels_tab[1][10]_sse2:                        94.5 ( 4.59x)
put_qpel_pixels_tab[1][11]_c:                          463.9 ( 1.00x)
put_qpel_pixels_tab[1][11]_mmxext:                     125.5 ( 3.70x)
put_qpel_pixels_tab[1][11]_sse2:                       102.6 ( 4.52x)
put_qpel_pixels_tab[1][12]_c:                          199.2 ( 1.00x)
put_qpel_pixels_tab[1][12]_mmxext:                      63.7 ( 3.12x)
put_qpel_pixels_tab[1][12]_sse2:                        36.2 ( 5.50x)
put_qpel_pixels_tab[1][13]_c:                          475.6 ( 1.00x)
put_qpel_pixels_tab[1][13]_mmxext:                     139.5 ( 3.41x)
put_qpel_pixels_tab[1][13]_sse2:                       107.3 ( 4.43x)
put_qpel_pixels_tab[1][14]_c:                          441.9 ( 1.00x)
put_qpel_pixels_tab[1][14]_mmxext:                     126.9 ( 3.48x)
put_qpel_pixels_tab[1][14]_sse2:                       101.3 ( 4.36x)
put_qpel_pixels_tab[1][15]_c:                          475.9 ( 1.00x)
put_qpel_pixels_tab[1][15]_mmxext:                     131.9 ( 3.61x)
put_qpel_pixels_tab[1][15]_sse2:                       107.0 ( 4.45x)

The new functions (in qpeldsp.asm) occupy 8244B (the MMXEXT functions
which they will replace occupy only 6720B).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-30 10:39:33 +02:00
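As a reading aid for the tables above: the factor in parentheses appears to be the C reference time divided by the time of the variant on that row. The following is a minimal C sketch of that arithmetic and of the code-size difference quoted in the message. The helper names are hypothetical and are not part of FFmpeg or its benchmarking tools.

```c
/* Hypothetical helper: speedup of a variant relative to the C reference,
 * computed as reference time divided by variant time. */
static double speedup(double c_time, double variant_time)
{
    return c_time / variant_time;
}

/* Hypothetical helper: code size difference in bytes (new minus old). */
static int size_delta(int new_bytes, int old_bytes)
{
    return new_bytes - old_bytes;
}
```

For example, `speedup(844.5, 146.6)` corresponds to the `avg_qpel_pixels_tab[0][4]_sse2` row, and `size_delta(8244, 6720)` gives the growth of the new functions over the MMXEXT ones.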
Andreas Rheinhardt 405465700c avcodec/x86/qpeldsp: Don't allocate stack unnecessarily
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-30 10:39:33 +02:00
Andreas Rheinhardt 188df9549c avcodec/x86/qpeldsp: Don't use too much stack
We only need (SIZE+1)*SIZE words.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-30 10:39:33 +02:00
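The bound stated in the message can be worked out for the two block sizes handled by this code. This is a minimal sketch only: the function names are hypothetical, and the assumption that a "word" is 16 bits follows x86 SIMD terminology rather than anything stated in the commit.

```c
#include <stddef.h>
#include <stdint.h>

/* Temporary buffer size in words, per the commit message:
 * (SIZE + 1) * SIZE. */
static size_t qpel_tmp_words(size_t size)
{
    return (size + 1) * size;
}

/* Assumption: one word is 16 bits, as in x86 SIMD terminology. */
static size_t qpel_tmp_bytes(size_t size)
{
    return qpel_tmp_words(size) * sizeof(uint16_t);
}
```

Under that assumption, SIZE 8 needs 72 words (144 bytes) and SIZE 16 needs 272 words (544 bytes).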
Andreas Rheinhardt bcf7293a21 avcodec/x86/qpeldsp: Remove unused declaration
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-30 10:39:33 +02:00
Andreas Rheinhardt 7b56259dd5 avcodec/x86/constants: Move ff_pw_{15,20} to qpeldsp.asm
Only used there.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-30 10:39:32 +02:00
Andreas Rheinhardt c2685234a6 avcodec/x86/qpeldsp_init: Deduplicate 8x8 and 16x16 code
Also split the big macro into smaller ones for the pure horizontal vs
the pure vertical and the mixed directions.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-30 10:39:32 +02:00
Andreas Rheinhardt cf79d8052d avcodec/x86/qpeldsp_init: Specify alignment properly
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-30 10:39:32 +02:00
Andreas Rheinhardt 69906d31c5 avcodec/x86/qpeldsp_init: Don't use unnecessarily big stack buffer
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-30 10:39:32 +02:00
Andreas Rheinhardt d3bd1318b3 avcodec/x86/qpeldsp: Don't zero unnecessarily
This value is write-only.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-30 10:39:32 +02:00
Andreas Rheinhardt d46414b46b avcodec/x86/qpeldsp: Simplify resetting output pointer
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-30 10:39:32 +02:00
Stefan Breunig 9172ab1245 fate/filter-video: add frei0r_src test
An installation of frei0r-plugins is required to run the tests,
which is usually separate from the build headers. Some systems
have it packaged (e.g. apt install frei0r-plugins). An upstream
release extracted to FREI0R_PATH also works.

Signed-off-by: Stefan Breunig <stefan-ffmpeg-devel@breunig.xyz>
2026-04-30 03:46:18 +00:00
Nicolas Dato 3aa5d957d1 avformat/dashdec: fix previous commit where I inadvertently removed the case when calc_next_seg_no_from_timelines returned -1 and move_timelines wasn't called
Signed-off-by: Nicolas Dato <nicolas.dato@gmail.com>
2026-04-29 23:54:37 +00:00
Nicolas Dato 8a8bde6a54 avformat/dashdec: fix calculation and usage of cur_seq_no, fixing issue 22335
Functions like calc_cur_seg_no, calc_min_seg_no, and calc_max_seg_no calculated
the segment number taking into account the first_seq_no.
However, functions like get_segment_start_time_based_on_timeline and
calc_cur_seg_no didn't take first_seq_no into account.
This made dashdec believe that the cur_seq_no was always less than min_seq_no,
logging 'old fragment' and calling calc_cur_seq_no.

In live dash streams with some startNumber, that call to calc_cur_seq_no after
the 'old fragment' log made ffmpeg reposition itself 60 seconds before the
current time whenever the manifest reloaded.
This made ffmpeg skip segments, especially when the manifest reloaded more slowly
than the segment duration, resulting in a new manifest with more than one new
segment.

Signed-off-by: Nicolas Dato <nicolas.dato@gmail.com>
2026-04-29 23:54:37 +00:00
Michael Niedermayer c25673fe70 avformat/mpegts: Fix memleak of pes_filter.opaque
Fixes: 490257166/clusterfuzz-testcase-minimized-ffmpeg_dem_MPEGTS_fuzzer-4815675538604032

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-04-29 20:50:21 +00:00
James Almer 2e6af10481 avformat/dashdec: copy stream groups from input representations
Signed-off-by: James Almer <jamrial@gmail.com>
2026-04-29 14:00:03 +00:00
James Almer 8fad6dcfd9 avformat/dashdec: support more than one underlying stream per Representation
Some Dash manifests contain Representations within an Adaptation Set that
reference an underlying mp4 context that contains more than the stream it
describes, as is the case of LCEVC enhancements.

Despite the fact open_demux_for_component() loops through all streams in the
underlying context, the rest of the demuxer is written assuming only the
stream described by the corresponding representation will be present, which
results in completely wrong stream index assignments.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-04-29 14:00:03 +00:00
Martin Storsjö 397c7c7524 tools/check_arm_indent: Run formatting on arm, in addition to aarch64
Add exceptions for files that aren't handled well (or that would
require more manual cleanups to make the output look good).
2026-04-29 13:53:07 +03:00
Martin Storsjö f6b21eca5e tools/check_arm_indent: Add missing ;; in switch case, fix indentation 2026-04-29 13:53:07 +03:00
Martin Storsjö 963ea707e3 arm/rv40dsp: Add * on comment continuation lines in prototypes
This avoids that the assembly indenter script tries to indent these
lines as assembly code.
2026-04-29 13:53:07 +03:00
Martin Storsjö 0a86aead82 arm/vc1dsp: Fix a few cases of inconsistent indentation
The function ff_vc1_unescape_buffer_helper_neon intentionally
uses unusual indentation, to indicate different levels of
unrolling in the function.
2026-04-29 13:53:07 +03:00
Martin Storsjö 10a45072fc arm/jrevdct: Indent previously unindented assembly
The comments have been manually tweaked to line up properly.
2026-04-29 13:53:07 +03:00
Martin Storsjö 5e0f1b1eda arm/hevcdsp_qpel: Reindent code that seems to lack consistent indentation 2026-04-29 13:53:07 +03:00
Martin Storsjö 65d4c5bbe2 arm: Reindent asm that used consistent but differing styles
The qpel_filter macros in hevcdsp_qpel_neon.S have been
manually tweaked to keep reasonable indentation of the
comments.
2026-04-29 13:53:07 +03:00
Martin Storsjö 2325421904 arm/synth_filter_vfp: Fix indentation
This was done with manual adjustments; the reindentation
script doesn't handle the VFP/NOVFP macros at the start of
lines.
2026-04-29 13:53:07 +03:00
Ramiro Polla 8d9c1db95d arm/simple_idct_arm: Reindent previously unindented code 2026-04-29 13:53:07 +03:00
Martin Storsjö a65ed248fd arm/simple_idct_armv6: Reindent previously consistent assembly to shared style
This has manual fixups, as the indenting script wants to
lowercase constants like W46 to w46, which breaks things.
2026-04-29 13:49:27 +03:00
Martin Storsjö b27fd61020 arm/simple_idct_armv5te: Reindent previously consistent code to common style
This has manual fixups, as the indenting script wants to
lowercase constants like W26 to w26, which breaks things.
2026-04-29 13:49:27 +03:00
Martin Storsjö 8e199a2a9f arm/rv34dsp: Adjust macro argument indentation slightly
The previous form did neatly align with the lines above, but doesn't
match general indentation rules from our indentation script.
2026-04-29 13:49:27 +03:00
Martin Storsjö 9653588441 libswscale/arm: Switch consistent indentation to common style
Some of these files aligned instructions to 4/24 columns, while
we commonly indent arm/aarch64 assembly to 8/24 columns.
Some of these files also used a different alignment for the
operands.
2026-04-29 13:49:27 +03:00
Martin Storsjö c5a3cb00b7 libswresample/arm: Change to the common indentation size
These files consistently aligned instructions to 4/24 columns,
while we commonly indent arm/aarch64 assembly to 8/24 columns.
2026-04-29 13:49:27 +03:00
Martin Storsjö 25d703dd2a libavutil/arm: Fix indentation in asm.S 2026-04-29 13:49:27 +03:00
Martin Storsjö d94e2b0f7c arm/hevcdsp: Fix misindented instructions in some macros 2026-04-29 13:49:27 +03:00
Martin Storsjö 7eaeb5ab4a arm: Fix indentation of stray individual misaligned instructions 2026-04-29 13:49:27 +03:00
Martin Storsjö 17765fe831 arm: Reindent assembly where it was off by one char 2026-04-29 13:49:27 +03:00
Martin Storsjö 946e80fde7 libswscale/arm: Lowercase the "LSL" keyword 2026-04-29 13:49:27 +03:00
Martin Storsjö ea7079074c tools/indent_arm_assembly: Don't indent "foo .req bar" lines like an instruction
These are used a bit in our arm assembly, while they're used much
less in our aarch64 assembly.
2026-04-29 13:49:27 +03:00
Martin Storsjö cd7a3cd799 tools/indent_arm_assembly: Recognize more comment forms, for skipping lowercasing
When we try to lowercase register names (e.g. Q0 -> q0) we avoid
doing that for parts of the code that are comments, as comments
occasionally contain pseudocode that contains such mentions that
aren't register names, but pseudocode/reference code variables.
See 7ebb6c54eb for more details about that.

In addition to recognizing comments starting with //, also
recognize /* and @ (which is a comment char in arm assembly, but
not in aarch64).
2026-04-29 13:49:27 +03:00
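The comment handling described above can be sketched as follows. This is an illustration only, written in C rather than in the language of the actual indentation tool. The function names are hypothetical, string literals are ignored, and the sketch lowercases all code text instead of only register names.

```c
#include <ctype.h>
#include <stddef.h>

/* Return the offset at which a comment starts in `line`, or the length of
 * the line if no comment is found. "//" and a slash followed by '*' always
 * start a comment; '@' only counts when `is_arm` is nonzero (it is not a
 * comment character on aarch64). */
static size_t comment_start(const char *line, int is_arm)
{
    size_t i;

    for (i = 0; line[i]; i++) {
        if (line[i] == '/' && (line[i + 1] == '/' || line[i + 1] == '*'))
            return i;
        if (is_arm && line[i] == '@')
            return i;
    }
    return i;
}

/* Lowercase the code part of the line and leave the comment untouched.
 * Simplification: everything before the comment is lowercased. */
static void lowercase_code_part(char *line, int is_arm)
{
    size_t end = comment_start(line, is_arm);

    for (size_t i = 0; i < end; i++)
        line[i] = (char)tolower((unsigned char)line[i]);
}
```

With `is_arm` set, a line such as `VADD Q0, Q1 @ use Q2` becomes `vadd q0, q1 @ use Q2`, so the `Q2` mentioned in the comment keeps its case.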
Michael Niedermayer 7c67748537 avformat/mov: check extradata in mov_read_dops()
We do want to limit an attacker's ability to change structures once they have been parsed.
So once extradata (or another array) is finished and possibly has been used, we do not
want to allow an attacker to change it.

This reduces the attack surface.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-04-29 00:46:47 +00:00
Ted Meyer 53cd2c9f2a avformat/mov: Check read size for opus extradata
In mov_read_dops, `size` bytes are allocated for
`st->codecpar->extradata`, but ff_alloc_extradata doesn't memset, so the
contents of that buffer are just old heap data. If `avio_read` reads
fewer bytes than were requested, uninitialized data can still be left in
the extradata buffer, which is operated on by AV_WL16A and AV_WL32A.

I think the best solution here is to just check the read size and ensure
it's filling the extradata buffer in its entirety, or erroring out if
there isn't enough data left.
2026-04-28 23:46:56 +00:00
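The pattern described in this message, treating a short read as an error so that no uninitialized heap bytes are ever parsed, can be sketched in plain C. This is not the actual patch: `MemReader`, `mem_read` and `read_exact` are hypothetical stand-ins for the I/O context, `avio_read` and the extradata allocation.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory stand-in for an I/O context. */
typedef struct MemReader {
    const unsigned char *data;
    size_t size;
    size_t pos;
} MemReader;

/* Copy up to `want` bytes into `dst` and return the number of bytes
 * actually read, which may be smaller than `want` near the end of input. */
static size_t mem_read(MemReader *r, unsigned char *dst, size_t want)
{
    size_t left = r->size - r->pos;
    size_t n    = want < left ? want : left;

    memcpy(dst, r->data + r->pos, n);
    r->pos += n;
    return n;
}

/* Allocate `size` bytes (malloc does not clear them) and fill them
 * completely. A short read frees the buffer and returns NULL, so the
 * caller never sees leftover heap contents. */
static unsigned char *read_exact(MemReader *r, size_t size)
{
    unsigned char *buf = malloc(size ? size : 1);

    if (!buf)
        return NULL;
    if (mem_read(r, buf, size) != size) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

The caller owns the returned buffer and must free it. A NULL return covers both allocation failure and truncated input in this sketch.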
Andreas Rheinhardt bd1587037f avutil/tests/.gitignore: Add recently added test tools
Reviewed-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-28 20:08:35 +02:00
marcos ashton fa3d20072b tests/fate/libavutil: add FATE test for timestamp
Test av_ts_make_string with NOPTS, zero, positive, negative, and
INT64 boundary values, av_ts2str macro, av_ts_make_time_string2
with various timebases, and av_ts_make_time_string pointer
variant.

Coverage for libavutil/timestamp.c: 0.00% -> 100.00%
2026-04-28 16:17:47 +00:00
marcos ashton 9b47495dee tests/fate/libavutil: add FATE test for tdrdi
Test av_tdrdi_alloc with 1 and 3 displays, and the inline
av_tdrdi_get_display accessor. Verifies that the returned
pointer matches entries_offset + idx * entry_size, tests
write/read-back of display width exponent/mantissa and view ID
fields, and OOM paths via av_max_alloc.

Coverage for libavutil/tdrdi.c: 0.00% -> 100.00%
2026-04-28 16:17:47 +00:00
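The pointer relation that the test verifies, `entries_offset + idx * entry_size`, is a common layout for a header followed by variable-size entries. A minimal sketch with a hypothetical `Header` type follows. It borrows only the two field names from the message and is not the real libavutil structure.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header; only the two layout fields named in the commit
 * message are modelled here. */
typedef struct Header {
    size_t entries_offset; /* byte offset from the header to entry 0 */
    size_t entry_size;     /* size in bytes of one entry */
} Header;

/* Return a pointer to entry `idx`, computed from the header address. */
static void *get_entry(Header *h, size_t idx)
{
    return (uint8_t *)h + h->entries_offset + idx * h->entry_size;
}
```

Storing the offset and entry size in the header lets callers locate entries without knowing the entry layout at compile time.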
marcos ashton 215799e369 tests/fate/libavutil: add FATE test for hdr_dynamic_vivid_metadata
Test av_dynamic_hdr_vivid_alloc and
av_dynamic_hdr_vivid_create_side_data. Verifies zero defaults,
write/read-back of system_start_code, num_windows, and
color transform params (min/avg/var/max RGB), frame side
data attachment, and OOM paths via av_max_alloc.

Coverage for libavutil/hdr_dynamic_vivid_metadata.c: 0.00% -> 100.00%
2026-04-28 16:17:47 +00:00
marcos ashton 2d9c8a9382 tests/fate/libavutil: add FATE test for buffer
Test av_buffer_alloc, av_buffer_allocz, av_buffer_create with
custom free callback, AV_BUFFER_FLAG_READONLY, av_buffer_ref,
av_buffer_is_writable, av_buffer_get_ref_count,
av_buffer_make_writable, av_buffer_realloc (including from NULL),
av_buffer_replace (including with NULL), av_buffer_pool
init/get/uninit cycle, av_buffer_pool_init2 with custom alloc
and pool_free callbacks, av_buffer_pool_buffer_get_opaque, and
OOM paths via av_max_alloc.

Coverage for libavutil/buffer.c: 0.00% -> 90.19%

Remaining uncovered lines are mutex init failures and
secondary allocation failure paths.
2026-04-28 16:17:47 +00:00
jiangjie 03931e8865 libavfilter/vf_amf_common: free the frame allocated by av_frame_alloc on error 2026-04-28 14:57:34 +00:00
Zhao Zhili 603234f945 avdevice/v4l2: fix mmap_free() skipping first buffer
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2026-04-28 13:45:20 +00:00
Zhao Zhili beb315ca31 avformat/wavdec: fix unchecked avio_read in w64_read_header
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2026-04-28 13:44:57 +00:00
Marvin Scholz f044c5e627 doc: remove unclear description
There is no caller when presuming that the user will use lavc for
decoding.
2026-04-28 14:31:19 +02:00
Marvin Scholz c9937ff139 doc: mark functions related to AVCodecParameters
This makes these functions appear in the AVCodecParameters
documentation page, so they are easier to find.
2026-04-28 14:31:19 +02:00
Marvin Scholz ab1a970bc0 doc: style changes for the AVCodecParameters
Mostly adding references and making the video/audio only
annotations not be the brief description.
2026-04-28 14:31:19 +02:00
Marvin Scholz 0e51f7abbd configure: add implicit-fallthrough warning flags 2026-04-28 12:29:37 +00:00
Marvin Scholz e24882912f swscale/yuv2rgb: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz e4f6aa8611 avcodec/wmadec: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz dc7692b831 avcodec/aac: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 97ff804e21 avcodec/ac3dec: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz a384a4ff3a avcodec/ansi: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 5cee00b85f avcodec/argo: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 0f3fe9e2bf avcodec/avs: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz e5e12328bf avcodec/bethsoftvideo: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 0f81f78829 avcodec/bink: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 49c62c3337 avcodec/bintext: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz d578926366 avcodec/c39: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 7b94360e0e avcodec/cavs: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz c772decdd0 avcodec/dca: add break 2026-04-28 12:29:37 +00:00
Marvin Scholz 5cdbd0337f avcodec/dds: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 9a765c453a avcodec/dpxenc: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz b70d6b4f58 avcodec/dv: add break 2026-04-28 12:29:37 +00:00
Marvin Scholz 5a5742498b avcodec/dxa: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz de011d5893 avcodec/dxtory: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 87926346e7 avcodec/eatgq: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 5f574e6416 avcodec/ffv1enc: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 4a805cfa53 avcodec/flacdsp: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz d4d5ac3bb2 avcodec/gdv: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 735e670334 avcodec/h264dec: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 16e944c8e4 avcodec/imx: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 4fe52a2484 avcodec/jpeg2000dec: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz c49390ee87 avcodec/jpeglsdec: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz f4a05e3528 avcodec/lagarith: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz b976075088 avcodec/lcldec: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 0d7ea1bb55 avcodec/microdvddec: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 417da4d71c avcodec/mpegaudio: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz abc3b65ccb avcodec/pafvideo: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 41dbb4412a avcodec/psd: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 3ad3315342 avcodec/rpza: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 7eff0307ff avcodec/rv34: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 120cc26594 avcodec/sgienc: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 965a4b6ae1 avcodec/sheervideo: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 31ea9122e5 avcodec/svq1dec: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 8a4abaaa4d avcodec/takdec: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 092d223b7b avcodec/tiff: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz e6c7fd4106 avcodec/tiffenc: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 10742fdc65 avcodec/wavpackdec: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 55f224afeb avfilter/af_biquads: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 435a617cc8 avfilter/vf_negate: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 89870d404c avformat/aiffdec: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz f0e9854f79 avformat/avidec: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 85c88d748f avformat/avienc: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 2ea8e764e2 avformat/bethsoftvid: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 749f01e3ea avformat/cafdec: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz c25c83abf5 avformat/concat: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 720d5c3c51 avformat/electronicarts: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 337a3fba9d avformat/epafdec: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 3f815180e8 avformat/flvdec: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz c26334f750 avformat/flvenc: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 223939e6be avformat/id3v2: add break 2026-04-28 12:29:37 +00:00
Marvin Scholz e2c36fbb7f avformat/idroqdec: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 44271c2dde avformat/jacobsubdec: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz f715db05fa avformat/lmlm4: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 346d7f63cb avformat/lvfdec: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 6d3392fd60 avformat/matroskadec: add break 2026-04-28 12:29:37 +00:00
Marvin Scholz bcf0b71d8c avformat/mov: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 9f22a4d363 avformat/mxfdec: add fall-through annotation and break 2026-04-28 12:29:37 +00:00
Marvin Scholz 50b1da33e4 avformat/mxfenc: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 452d0239ca avformat/network: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz f819d3452c avformat/nutdec: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz a6b8525f6e avformat/nuv: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 49fc8ddf65 avformat/oggparsetheora: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 70fed6fd33 avformat/rtmppkt: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 2777e4d389 avformat/takdec: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 125dd9ee2a avformat/txd: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz d607b2249f avformat/ty: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 5988639f39 avformat/yuv4mpegdec: return proper error
The header is not invalid in this case, but ffmpeg still doesn't
support it.
2026-04-28 12:29:37 +00:00
Marvin Scholz 3e48505dda swscale: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz e2a8b73688 avcodec/txd: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 20d6759f8e avcodec/vb: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz c790f4284f avcodec/vc1: add break 2026-04-28 12:29:37 +00:00
Marvin Scholz e92c4076d6 avcodec/vmnc: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 8f50eeee02 avcodec/zmbv: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz ce740510aa avutil: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz b50a586583 avutil/avsscanf: add break 2026-04-28 12:29:37 +00:00
Marvin Scholz 7e3e88d28d avutil/hwcontext_videotoolbox: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz 3c5bb10a87 avcodec/videotoolbox: add fall-through annotations 2026-04-28 12:29:37 +00:00
Marvin Scholz c023f91eeb avfilter/vf_v360: add break
Also replace an av_assert0 with av_unreachable.
2026-04-28 12:29:37 +00:00
Marvin Scholz 121a81d586 avfilter/avf_showcqt: add fall-through annotation 2026-04-28 12:29:37 +00:00
Marvin Scholz 62a93f0148 avutil: replace fall-through comments 2026-04-28 12:29:37 +00:00
Marvin Scholz 752cf875d8 swscale: replace fall-through comments 2026-04-28 12:29:37 +00:00
Marvin Scholz d5ae10e6d4 avformat: replace fall-through comments 2026-04-28 12:29:37 +00:00
Marvin Scholz 49f3620119 avfilter: replace fall-through comments 2026-04-28 12:29:37 +00:00
Marvin Scholz 938fa8b14c avcodec: replace fall-through comments 2026-04-28 12:29:37 +00:00
Marvin Scholz afd3f01501 fftools: replace fall-through comments 2026-04-28 12:29:37 +00:00
Marvin Scholz cc863d68d7 avutil: add av_fallthrough 2026-04-28 12:29:37 +00:00
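Note: the series above annotates intentional switch fall-through. The following is only a rough sketch of how such an annotation macro can be built; SKETCH_FALLTHROUGH and sum_down_to_one() are hypothetical names used for illustration, and the actual definition of av_fallthrough in libavutil is not shown in this log and may differ.

```c
/* Hypothetical annotation macro (assumption, not the libavutil definition):
 * use the GCC statement attribute where it is known to exist, otherwise
 * expand to a no-op statement. */
#if defined(__GNUC__) && __GNUC__ >= 7
#define SKETCH_FALLTHROUGH __attribute__((fallthrough))
#else
#define SKETCH_FALLTHROUGH do {} while (0)
#endif

/* Illustrative function: each case deliberately continues into the next,
 * and the annotation documents that this is intended. */
static int sum_down_to_one(int n)
{
    int total = 0;

    switch (n) {
    case 3:
        total += 3;
        SKETCH_FALLTHROUGH;
    case 2:
        total += 2;
        SKETCH_FALLTHROUGH;
    case 1:
        total += 1;
        break;
    default:
        break;
    }
    return total;
}
```

With the annotation in place, a compiler warning such as -Wimplicit-fallthrough can flag the unannotated cases, which is where a missing break is most likely a bug.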
Marvin Scholz 3daf664a5a avcodec: cbs_lcevc: remove dead code
The error is already handled before the loop, so this can never be true.

Fix Coverity issue 1683139
2026-04-28 12:24:54 +00:00
depthfirst-dev[bot] eec78bdac1 avformat/rtspdec: reject non-positive ANNOUNCE Content-Length
rtsp_read_announce() treated any non-zero Content-Length as valid,
including negative values parsed via strtol(). This could send invalid
sizes into allocation, body reads and trailing NUL writes.

Accept only strictly positive SDP body lengths and reject invalid
Content-Length values with AVERROR_INVALIDDATA.

Found-by: Seung Min Shin (was reported to us on 10th April)
CC: 신승민 <guncraft2000@naver.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-04-28 12:07:16 +00:00
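Note: a minimal sketch of the validation described in the commit above, assuming a hypothetical helper parse_content_length() and a stand-in error constant. The real rtsp_read_announce() code is not shown in this log and may be structured differently.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Stand-in for AVERROR_INVALIDDATA (assumption; the real value comes from libavutil). */
#define SKETCH_INVALIDDATA (-1)

/* Hypothetical helper: parse a Content-Length value and accept only
 * strictly positive lengths that fit in an int. */
static int parse_content_length(const char *value, int *out_len)
{
    char *end;
    long len;

    errno = 0;
    len = strtol(value, &end, 10);
    if (end == value || errno == ERANGE)
        return SKETCH_INVALIDDATA;      /* no digits, or out of range for long */
    if (len <= 0 || len > INT_MAX)
        return SKETCH_INVALIDDATA;      /* zero, negative, or too large */

    *out_len = (int)len;
    return 0;
}
```

Rejecting the value before it reaches the allocation keeps a negative length from being used as a size for the body read or as the index of the trailing NUL write.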
Zhao Zhili 9eaa559847 avformat/matroskadec: fix invalid check and uninitialized memory access
size is uninitialized when av_dynamic_hdr_smpte2094_app5_alloc fails.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2026-04-28 11:59:59 +00:00
Zhao Zhili 1b98286131 swscale: unref on allocation failure in frame_alloc_buffers()
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2026-04-28 11:58:33 +00:00
Jun Zhao 3cdd76ba96 avcodec/libsvtav1: reject tiny inputs on SVT-AV1 < 3.0.0
SVT-AV1 < 3.0.0 requires input dimensions of at least 64x64.
Older versions may otherwise silently accept smaller inputs without
producing output and cause the caller to hang. Reject such inputs
explicitly in config_enc_params() to produce a clear error.
v3.0.0+ supports sub-64px dimensions and validates the
input itself, so the check is gated with SVT_AV1_CHECK_VERSION.

Fix #22817

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-04-28 06:14:38 +00:00
Jun Zhao d247110148 fate/filter-video: add regression test for scale zero-dim rejection
Add a regression test covering issue #22817: cascaded scale=...:-2
filters on extreme aspect ratios previously produced zero output
dimensions silently. The test expects ffmpeg to fail fast.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-04-28 06:14:38 +00:00
Jun Zhao bfbc5632f1 avfilter/vf_libplacebo: propagate ff_scale_adjust_dimensions() error
ff_scale_adjust_dimensions() can now return a negative error code when
the evaluated output dimensions are non-positive.  Check the return
value and fail fast instead of continuing with the unadjusted result.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-04-28 06:14:38 +00:00
Jun Zhao fd51dc5d20 avfilter/vf_amf_common: propagate ff_scale_adjust_dimensions() error
ff_scale_adjust_dimensions() can now return a negative error code when
the evaluated output dimensions are non-positive.  Check the return
value and fail fast instead of continuing with the unadjusted result.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-04-28 06:14:38 +00:00
Jun Zhao 4fc0b36bac avfilter/vf_scale_d3d12: propagate ff_scale_adjust_dimensions() error
ff_scale_adjust_dimensions() can now return a negative error code when
the evaluated output dimensions are non-positive.  Check the return
value and fail fast instead of continuing with the unadjusted result.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-04-28 06:14:38 +00:00
Jun Zhao 0929b6038c avfilter/vf_scale_vulkan: propagate ff_scale_adjust_dimensions() error
ff_scale_adjust_dimensions() can now return a negative error code when
the evaluated output dimensions are non-positive.  Check the return
value and fail fast instead of continuing with the unadjusted result.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-04-28 06:14:38 +00:00
Jun Zhao 3f9b92be42 avfilter/vf_scale_vt: propagate ff_scale_adjust_dimensions() error
ff_scale_adjust_dimensions() can now return a negative error code when
the evaluated output dimensions are non-positive.  Check the return
value and fail fast instead of continuing with the unadjusted result.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-04-28 06:14:38 +00:00
Jun Zhao 8fc4cc982e avfilter/vf_scale_vaapi: propagate ff_scale_adjust_dimensions() error
ff_scale_adjust_dimensions() can now return a negative error code when
the evaluated output dimensions are non-positive.  Check the return
value and fail fast instead of continuing with the unadjusted result.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-04-28 06:14:38 +00:00
Jun Zhao 6a37a60726 avfilter/vf_scale_npp: propagate ff_scale_adjust_dimensions() error
ff_scale_adjust_dimensions() can now return a negative error code when
the evaluated output dimensions are non-positive.  Check the return
value and fail fast instead of continuing with the unadjusted result.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-04-28 06:14:38 +00:00
Jun Zhao 33d657c7d0 avfilter/vf_scale_cuda: propagate ff_scale_adjust_dimensions() error
ff_scale_adjust_dimensions() can now return a negative error code when
the evaluated output dimensions are non-positive.  Check the return
value and fail fast instead of continuing with the unadjusted result.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-04-28 06:14:38 +00:00
Jun Zhao a45fe72c9d avfilter/scale_eval: reject non-positive output dimensions
When scale filter expressions evaluate to zero or negative output
dimensions (e.g. cascaded scale=...:-2 on extreme aspect ratios),
ff_scale_adjust_dimensions() only checked for int32 overflow and
passed them through, potentially hanging downstream components.

Reject them explicitly so the pipeline fails fast.

Callers that currently ignore the return value will be updated in
the following patches to propagate the error.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-04-28 06:14:38 +00:00
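Note: a minimal sketch of the behaviour described in the commit above and propagated by the caller patches that follow it. check_output_dimensions(), configure_output() and SKETCH_EINVAL are hypothetical names for illustration; the actual ff_scale_adjust_dimensions() signature and error code are not shown in this log.

```c
#include <stdint.h>

/* Stand-in error code (assumption; the real code would use an AVERROR value). */
#define SKETCH_EINVAL (-22)

/* Hypothetical check: reject dimensions that overflow int32 as well as
 * dimensions that evaluated to zero or a negative value. */
static int check_output_dimensions(int64_t w, int64_t h)
{
    if (w > INT32_MAX || h > INT32_MAX)
        return SKETCH_EINVAL;           /* overflow */
    if (w <= 0 || h <= 0)
        return SKETCH_EINVAL;           /* non-positive output size */
    return 0;
}

/* Hypothetical caller: propagate the error instead of continuing with
 * unusable dimensions. */
static int configure_output(int64_t w, int64_t h, int *out_w, int *out_h)
{
    int ret = check_output_dimensions(w, h);
    if (ret < 0)
        return ret;                     /* fail fast */

    *out_w = (int)w;
    *out_h = (int)h;
    return 0;
}
```

The caller side matters as much as the check itself: a filter that ignores the return value would still pass a zero-sized frame downstream.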
Zuxy Meng c55ab93eef avcodec/x86/h264_intrapred: Replace pred8x8_horizontal_8_mmxext with SSE2
Deprecating MMX. Instruction count unchanged.

Signed-off-by: Zuxy Meng <zuxy.meng@gmail.com>
2026-04-27 20:31:20 -07:00
Jeongkeun Kim 4ea59d5665 avcodec/aarch64: add NEON DCA LFE FIR filter functions
Port lfe_fir0_float and lfe_fir1_float to AArch64 NEON. These polyphase
FIR interpolation filters have an x86 SSE/AVX path but no AArch64
equivalent, falling back to scalar C.

The inner loop computes two dot products per output pair. Precomputing a
reversed LFE sample vector before the inner loop avoids per-iteration
shuffle overhead.

Benchmarks on AWS Graviton3 (Neoverse V1, c7g.xlarge):
  lfe_fir0_float: C 5902.0 cycles -> NEON 2135.0 cycles (2.77x)
  lfe_fir1_float: C 2836.3 cycles -> NEON 1527.8 cycles (1.86x)
Measured with: taskset -c 0 ./tests/checkasm/checkasm --test=dcadsp --bench,
3-run average, Ubuntu 22.04 (kernel 6.8.0-1052-aws), perf_event_paranoid=0.

Signed-off-by: Jeongkeun Kim <variety0724@gmail.com>
2026-04-27 20:13:23 +00:00
Georgii Zagoruiko 1ced59326a aarch64/vvc: Optimisations of put_chroma_hv() functions for 10/12-bit
Apple M4:
put_chroma_hv_10_2x2_c:                                  9.1 ( 1.00x)
put_chroma_hv_10_4x4_c:                                 20.1 ( 1.00x)
put_chroma_hv_10_8x8_c:                                 35.6 ( 1.00x)
put_chroma_hv_10_8x8_neon:                              15.4 ( 2.31x)
put_chroma_hv_10_16x16_c:                              113.7 ( 1.00x)
put_chroma_hv_10_16x16_neon:                            57.0 ( 1.99x)
put_chroma_hv_10_32x32_c:                              406.9 ( 1.00x)
put_chroma_hv_10_32x32_neon:                           225.7 ( 1.80x)
put_chroma_hv_10_64x64_c:                             1498.8 ( 1.00x)
put_chroma_hv_10_64x64_neon:                           876.2 ( 1.71x)
put_chroma_hv_10_128x128_c:                           5757.0 ( 1.00x)
put_chroma_hv_10_128x128_neon:                        3446.6 ( 1.67x)
put_chroma_hv_12_2x2_c:                                  9.9 ( 1.00x)
put_chroma_hv_12_4x4_c:                                 19.2 ( 1.00x)
put_chroma_hv_12_8x8_c:                                 36.1 ( 1.00x)
put_chroma_hv_12_8x8_neon:                              17.9 ( 2.02x)
put_chroma_hv_12_16x16_c:                              112.2 ( 1.00x)
put_chroma_hv_12_16x16_neon:                            55.6 ( 2.02x)
put_chroma_hv_12_32x32_c:                              416.6 ( 1.00x)
put_chroma_hv_12_32x32_neon:                           224.3 ( 1.86x)
put_chroma_hv_12_64x64_c:                             1464.8 ( 1.00x)
put_chroma_hv_12_64x64_neon:                           860.1 ( 1.70x)
put_chroma_hv_12_128x128_c:                           5776.8 ( 1.00x)
put_chroma_hv_12_128x128_neon:                        3445.2 ( 1.68x)

RPi5:
put_chroma_hv_10_2x2_c:                                118.5 ( 1.00x)
put_chroma_hv_10_4x4_c:                                190.6 ( 1.00x)
put_chroma_hv_10_8x8_c:                                303.1 ( 1.00x)
put_chroma_hv_10_8x8_neon:                             172.6 ( 1.76x)
put_chroma_hv_10_16x16_c:                             1036.1 ( 1.00x)
put_chroma_hv_10_16x16_neon:                           626.7 ( 1.65x)
put_chroma_hv_10_32x32_c:                             3624.4 ( 1.00x)
put_chroma_hv_10_32x32_neon:                          2386.9 ( 1.52x)
put_chroma_hv_10_64x64_c:                            13612.1 ( 1.00x)
put_chroma_hv_10_64x64_neon:                          9314.8 ( 1.46x)
put_chroma_hv_10_128x128_c:                          52975.4 ( 1.00x)
put_chroma_hv_10_128x128_neon:                       37083.5 ( 1.43x)
put_chroma_hv_12_2x2_c:                                118.6 ( 1.00x)
put_chroma_hv_12_4x4_c:                                188.1 ( 1.00x)
put_chroma_hv_12_8x8_c:                                303.4 ( 1.00x)
put_chroma_hv_12_8x8_neon:                             176.7 ( 1.72x)
put_chroma_hv_12_16x16_c:                             1037.9 ( 1.00x)
put_chroma_hv_12_16x16_neon:                           626.5 ( 1.66x)
put_chroma_hv_12_32x32_c:                             3629.0 ( 1.00x)
put_chroma_hv_12_32x32_neon:                          2386.6 ( 1.52x)
put_chroma_hv_12_64x64_c:                            13649.0 ( 1.00x)
put_chroma_hv_12_64x64_neon:                          9313.6 ( 1.47x)
put_chroma_hv_12_128x128_c:                          52978.0 ( 1.00x)
put_chroma_hv_12_128x128_neon:                       37101.2 ( 1.43x)
2026-04-27 20:10:57 +00:00
Marvin Scholz 9e90fa505e fftools: ffprobe: fix type mismatch in assert
The enum is unsigned, so instead compare to -1 before assigning to
the unsigned type.
2026-04-27 14:31:02 +02:00
Marvin Scholz 0396831b04 fftools: ffprobe: use unsigned in print_list_fmt
Unsigned makes more sense in this context.
2026-04-27 14:31:02 +02:00
Marvin Scholz 92cbe0454f fftools: ffprobe: adjust type of nb_streams
There is no reason for this to be signed, it is never negative.
2026-04-27 14:31:02 +02:00
Marvin Scholz 7c254feb0a fftools: ffprobe: narrow variable scopes and adjust types
Prevents several warnings about comparisons between integers of different signedness.
2026-04-27 14:31:02 +02:00
Marvin Scholz 0fc1183a60 swscale: ops_dispatch: fix leak on error
Assign to `exec_base.in_offset_x` before the error handling,
to ensure the error cleanup path properly frees the already
allocated memory.

Fixes Coverity issue #1691725
2026-04-27 12:29:48 +00:00
Andreas Rheinhardt 4867d251ad swscale/x86/yuv2yuvX: Simplify rotating
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-26 23:48:21 +02:00
Andreas Rheinhardt f5ed254528 swscale/x86/yuv2yuvX: Port ff_yuv2yuvX_mmxext to SSE2
The mmx function processes two registers in parallel;
given the larger register size of SSE2, the same amount
of data can be processed in one register with some speedups.
(Given that this function is used for tail-processing,
not processing more data is important.)

Switching to SSE2 also fixes a bug introduced in
554c2bc708: Since said
commit, only half the dither values were used. This
seems not to matter in practice, as the functions here
use dither only in the following form:
((filtersize-1)*8+dither)>>4. The dither values used
here come from ff_dither_8x8_128 which has the property
that ff_dither_8x8_128[i][j] and ff_dither_8x8_128[i][j+4]
always lead to the same result in the above formula.

Old benchmarks:
yuv2yuvX_8_2_0_512_approximate_c:                     2309.9 ( 1.00x)
yuv2yuvX_8_2_0_512_approximate_mmxext:                 250.2 ( 9.23x)
yuv2yuvX_8_2_0_512_approximate_sse3:                    98.8 (23.39x)
yuv2yuvX_8_2_0_512_approximate_avx2:                    52.9 (43.63x)
yuv2yuvX_8_2_16_512_approximate_c:                    2263.0 ( 1.00x)
yuv2yuvX_8_2_16_512_approximate_mmxext:                245.3 ( 9.22x)
yuv2yuvX_8_2_16_512_approximate_sse3:                  114.3 (19.80x)
yuv2yuvX_8_2_16_512_approximate_avx2:                   85.6 (26.45x)
yuv2yuvX_8_2_32_512_approximate_c:                    2155.8 ( 1.00x)
yuv2yuvX_8_2_32_512_approximate_mmxext:                235.6 ( 9.15x)
yuv2yuvX_8_2_32_512_approximate_sse3:                   93.6 (23.04x)
yuv2yuvX_8_2_32_512_approximate_avx2:                   78.1 (27.60x)
yuv2yuvX_8_2_48_512_approximate_c:                    2084.8 ( 1.00x)
yuv2yuvX_8_2_48_512_approximate_mmxext:                230.2 ( 9.05x)
yuv2yuvX_8_2_48_512_approximate_sse3:                  105.0 (19.85x)
yuv2yuvX_8_2_48_512_approximate_avx2:                   71.9 (29.00x)
yuv2yuvX_8_4_0_512_approximate_c:                     3496.3 ( 1.00x)
yuv2yuvX_8_4_0_512_approximate_mmxext:                 455.0 ( 7.68x)
yuv2yuvX_8_4_0_512_approximate_sse3:                   157.5 (22.20x)
yuv2yuvX_8_4_0_512_approximate_avx2:                    88.4 (39.53x)
yuv2yuvX_8_4_16_512_approximate_c:                    3380.9 ( 1.00x)
yuv2yuvX_8_4_16_512_approximate_mmxext:                440.0 ( 7.68x)
yuv2yuvX_8_4_16_512_approximate_sse3:                  175.0 (19.32x)
yuv2yuvX_8_4_16_512_approximate_avx2:                  134.1 (25.22x)
yuv2yuvX_8_4_32_512_approximate_c:                    3277.6 ( 1.00x)
yuv2yuvX_8_4_32_512_approximate_mmxext:                427.2 ( 7.67x)
yuv2yuvX_8_4_32_512_approximate_sse3:                  149.7 (21.89x)
yuv2yuvX_8_4_32_512_approximate_avx2:                  115.5 (28.37x)
yuv2yuvX_8_4_48_512_approximate_c:                    3167.8 ( 1.00x)
yuv2yuvX_8_4_48_512_approximate_mmxext:                414.9 ( 7.63x)
yuv2yuvX_8_4_48_512_approximate_sse3:                  164.1 (19.31x)
yuv2yuvX_8_4_48_512_approximate_avx2:                  101.2 (31.30x)
yuv2yuvX_8_8_0_512_approximate_c:                     5987.5 ( 1.00x)
yuv2yuvX_8_8_0_512_approximate_mmxext:                 854.1 ( 7.01x)
yuv2yuvX_8_8_0_512_approximate_sse3:                   294.6 (20.32x)
yuv2yuvX_8_8_0_512_approximate_avx2:                   144.1 (41.56x)
yuv2yuvX_8_8_16_512_approximate_c:                    5848.9 ( 1.00x)
yuv2yuvX_8_8_16_512_approximate_mmxext:                834.4 ( 7.01x)
yuv2yuvX_8_8_16_512_approximate_sse3:                  312.1 (18.74x)
yuv2yuvX_8_8_16_512_approximate_avx2:                  214.9 (27.22x)
yuv2yuvX_8_8_32_512_approximate_c:                    5610.1 ( 1.00x)
yuv2yuvX_8_8_32_512_approximate_mmxext:                811.6 ( 6.91x)
yuv2yuvX_8_8_32_512_approximate_sse3:                  277.5 (20.21x)
yuv2yuvX_8_8_32_512_approximate_avx2:                  189.8 (29.55x)
yuv2yuvX_8_8_48_512_approximate_c:                    5415.8 ( 1.00x)
yuv2yuvX_8_8_48_512_approximate_mmxext:                782.3 ( 6.92x)
yuv2yuvX_8_8_48_512_approximate_sse3:                  289.4 (18.72x)
yuv2yuvX_8_8_48_512_approximate_avx2:                  165.3 (32.76x)
yuv2yuvX_8_16_0_512_approximate_c:                   11100.7 ( 1.00x)
yuv2yuvX_8_16_0_512_approximate_mmxext:               1682.1 ( 6.60x)
yuv2yuvX_8_16_0_512_approximate_sse3:                  558.8 (19.86x)
yuv2yuvX_8_16_0_512_approximate_avx2:                  280.1 (39.63x)
yuv2yuvX_8_16_16_512_approximate_c:                  10772.1 ( 1.00x)
yuv2yuvX_8_16_16_512_approximate_mmxext:              1611.0 ( 6.69x)
yuv2yuvX_8_16_16_512_approximate_sse3:                 578.1 (18.63x)
yuv2yuvX_8_16_16_512_approximate_avx2:                 418.8 (25.72x)
yuv2yuvX_8_16_32_512_approximate_c:                  10381.5 ( 1.00x)
yuv2yuvX_8_16_32_512_approximate_mmxext:              1560.4 ( 6.65x)
yuv2yuvX_8_16_32_512_approximate_sse3:                 525.8 (19.74x)
yuv2yuvX_8_16_32_512_approximate_avx2:                 370.7 (28.01x)
yuv2yuvX_8_16_48_512_approximate_c:                  10046.1 ( 1.00x)
yuv2yuvX_8_16_48_512_approximate_mmxext:              1512.4 ( 6.64x)
yuv2yuvX_8_16_48_512_approximate_sse3:                 546.0 (18.40x)
yuv2yuvX_8_16_48_512_approximate_avx2:                 315.0 (31.89x)

New benchmarks:
yuv2yuvX_8_2_0_512_approximate_c:                     2302.5 ( 1.00x)
yuv2yuvX_8_2_0_512_approximate_sse2:                   184.4 (12.49x)
yuv2yuvX_8_2_0_512_approximate_sse3:                   100.1 (23.01x)
yuv2yuvX_8_2_0_512_approximate_avx2:                    54.9 (41.98x)
yuv2yuvX_8_2_16_512_approximate_c:                    2224.6 ( 1.00x)
yuv2yuvX_8_2_16_512_approximate_sse2:                  180.0 (12.36x)
yuv2yuvX_8_2_16_512_approximate_sse3:                  109.5 (20.31x)
yuv2yuvX_8_2_16_512_approximate_avx2:                   81.3 (27.35x)
yuv2yuvX_8_2_32_512_approximate_c:                    2165.3 ( 1.00x)
yuv2yuvX_8_2_32_512_approximate_sse2:                  176.6 (12.26x)
yuv2yuvX_8_2_32_512_approximate_sse3:                   93.7 (23.11x)
yuv2yuvX_8_2_32_512_approximate_avx2:                   73.1 (29.61x)
yuv2yuvX_8_2_48_512_approximate_c:                    2088.0 ( 1.00x)
yuv2yuvX_8_2_48_512_approximate_sse2:                  170.7 (12.23x)
yuv2yuvX_8_2_48_512_approximate_sse3:                  103.4 (20.20x)
yuv2yuvX_8_2_48_512_approximate_avx2:                   69.4 (30.10x)
yuv2yuvX_8_4_0_512_approximate_c:                     3496.8 ( 1.00x)
yuv2yuvX_8_4_0_512_approximate_sse2:                   320.3 (10.92x)
yuv2yuvX_8_4_0_512_approximate_sse3:                   158.8 (22.02x)
yuv2yuvX_8_4_0_512_approximate_avx2:                    86.4 (40.49x)
yuv2yuvX_8_4_16_512_approximate_c:                    3443.5 ( 1.00x)
yuv2yuvX_8_4_16_512_approximate_sse2:                  325.3 (10.59x)
yuv2yuvX_8_4_16_512_approximate_sse3:                  171.9 (20.03x)
yuv2yuvX_8_4_16_512_approximate_avx2:                  123.6 (27.85x)
yuv2yuvX_8_4_32_512_approximate_c:                    3272.2 ( 1.00x)
yuv2yuvX_8_4_32_512_approximate_sse2:                  302.7 (10.81x)
yuv2yuvX_8_4_32_512_approximate_sse3:                  148.9 (21.98x)
yuv2yuvX_8_4_32_512_approximate_avx2:                  110.6 (29.58x)
yuv2yuvX_8_4_48_512_approximate_c:                    3166.3 ( 1.00x)
yuv2yuvX_8_4_48_512_approximate_sse2:                  291.0 (10.88x)
yuv2yuvX_8_4_48_512_approximate_sse3:                  162.9 (19.44x)
yuv2yuvX_8_4_48_512_approximate_avx2:                  102.3 (30.95x)
yuv2yuvX_8_8_0_512_approximate_c:                     5967.6 ( 1.00x)
yuv2yuvX_8_8_0_512_approximate_sse2:                   691.2 ( 8.63x)
yuv2yuvX_8_8_0_512_approximate_sse3:                   294.2 (20.28x)
yuv2yuvX_8_8_0_512_approximate_avx2:                   154.9 (38.52x)
yuv2yuvX_8_8_16_512_approximate_c:                    5780.2 ( 1.00x)
yuv2yuvX_8_8_16_512_approximate_sse2:                  606.2 ( 9.53x)
yuv2yuvX_8_8_16_512_approximate_sse3:                  309.3 (18.69x)
yuv2yuvX_8_8_16_512_approximate_avx2:                  208.7 (27.69x)
yuv2yuvX_8_8_32_512_approximate_c:                    5604.3 ( 1.00x)
yuv2yuvX_8_8_32_512_approximate_sse2:                  592.3 ( 9.46x)
yuv2yuvX_8_8_32_512_approximate_sse3:                  281.1 (19.94x)
yuv2yuvX_8_8_32_512_approximate_avx2:                  185.4 (30.23x)
yuv2yuvX_8_8_48_512_approximate_c:                    5413.7 ( 1.00x)
yuv2yuvX_8_8_48_512_approximate_sse2:                  570.4 ( 9.49x)
yuv2yuvX_8_8_48_512_approximate_sse3:                  294.9 (18.36x)
yuv2yuvX_8_8_48_512_approximate_avx2:                  166.5 (32.51x)
yuv2yuvX_8_16_0_512_approximate_c:                   11099.4 ( 1.00x)
yuv2yuvX_8_16_0_512_approximate_sse2:                 1213.6 ( 9.15x)
yuv2yuvX_8_16_0_512_approximate_sse3:                  563.0 (19.72x)
yuv2yuvX_8_16_0_512_approximate_avx2:                  294.8 (37.65x)
yuv2yuvX_8_16_16_512_approximate_c:                  10718.1 ( 1.00x)
yuv2yuvX_8_16_16_512_approximate_sse2:                1121.2 ( 9.56x)
yuv2yuvX_8_16_16_512_approximate_sse3:                 563.7 (19.01x)
yuv2yuvX_8_16_16_512_approximate_avx2:                 389.5 (27.51x)
yuv2yuvX_8_16_32_512_approximate_c:                  10373.3 ( 1.00x)
yuv2yuvX_8_16_32_512_approximate_sse2:                1096.2 ( 9.46x)
yuv2yuvX_8_16_32_512_approximate_sse3:                 526.7 (19.70x)
yuv2yuvX_8_16_32_512_approximate_avx2:                 354.7 (29.24x)
yuv2yuvX_8_16_48_512_approximate_c:                  10066.9 ( 1.00x)
yuv2yuvX_8_16_48_512_approximate_sse2:                1055.8 ( 9.53x)
yuv2yuvX_8_16_48_512_approximate_sse3:                 527.9 (19.07x)
yuv2yuvX_8_16_48_512_approximate_avx2:                 313.7 (32.09x)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-26 23:48:21 +02:00
Andreas Rheinhardt 62285be009 swscale/x86/swscale_template: Don't set use_mmx_vfilter when disabled
Commit 554c2bc708
ported the yuv2planeX functions that are set iff
use_mmx_vfilter is set to external assembly
and did it in a way that resulted in linking failures
when inline assembly is enabled, but external assembly
is disabled. This was later fixed in commit
c00567647e, but in such
a manner that use_mmx_vfilter can be set without any
of the accompanying yuv2planeX functions being set;
and in case inline assembly was unavailable,
these external assembly functions would never be selected.

This makes the filter-fps and filter-fps-cfr tests fail when inline
assembly is enabled but external assembly is disabled (--disable-x86asm),
as reported in issue #21113. Fix this by moving sws_init_swscale_mmxext
directly into ff_sws_init_swscale_x86() and setting
use_mmx_vfilter right next to the yuv2planeX function pointer.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-26 23:48:21 +02:00
Andreas Rheinhardt 3dd03e4d4c avcodec/libvorbisenc: Cleanup on error
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-26 23:08:46 +02:00
Andreas Rheinhardt df70e8297b avcodec/libvorbisenc: Return error upon error
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-26 23:08:46 +02:00
Andreas Rheinhardt 51c5cc0ca3 avcodec/libvorbisenc: Fix leak of vorbis_comment upon error
Just free it immediately and unconditionally after it is no longer
needed.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-26 23:08:46 +02:00
Semih Baskan bdc0a29204 avcodec/nvenc: gate MVHEVC capability check on codec_id
NV_ENC_H264_PROFILE_HIGH_10 and NV_ENC_HEVC_PROFILE_MULTIVIEW_MAIN
both equal 3 when their respective NVENC_HAVE_* flags are defined.
The MVHEVC check in nvenc_check_capabilities() matches against
ctx->profile alone, so an H.264 encode with profile=high10 is
rejected as if it were an HEVC multiview request on hardware
without MVHEVC support.

Signed-off-by: Semih Baskan <strst.gs@gmail.com>
2026-04-26 19:25:12 +00:00
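Note: a minimal sketch of the collision described in the commit above. The SKETCH_ names are hypothetical stand-ins; only the fact that both profile constants equal 3 is taken from the commit message, and the real nvenc_check_capabilities() code is not shown in this log.

```c
#include <stdbool.h>

/* Hypothetical codec identifiers for illustration. */
enum sketch_codec { SKETCH_CODEC_H264, SKETCH_CODEC_HEVC };

/* Per the commit message, both profile constants have the value 3. */
#define SKETCH_H264_PROFILE_HIGH_10        3
#define SKETCH_HEVC_PROFILE_MULTIVIEW_MAIN 3

/* Ungated check: matches on the profile value alone, so an H.264
 * high10 request is mistaken for an HEVC multiview request. */
static bool needs_mvhevc_ungated(int profile)
{
    return profile == SKETCH_HEVC_PROFILE_MULTIVIEW_MAIN;
}

/* Gated check: the profile is only interpreted as HEVC multiview
 * when the codec actually is HEVC. */
static bool needs_mvhevc(enum sketch_codec codec, int profile)
{
    return codec == SKETCH_CODEC_HEVC &&
           profile == SKETCH_HEVC_PROFILE_MULTIVIEW_MAIN;
}
```

Profile numbers are only meaningful within one codec's namespace, so any check on them has to be qualified by the codec first.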
Timo Rothenpieler 1351c2c019 forgejo/workflows: make labeler also remove non-applicable labels again 2026-04-26 15:28:38 +00:00
Zhao Zhili e717604a29 avfilter/mpdecimate: fix kept-frame forwarding and error handling
When duplicate frames are forced to be kept, forward the input frame
without cloning instead of creating an unnecessary extra reference.
This removes the leak path introduced when clone allocation fails.

For frames that become the new reference, keep using a clone for
forwarding.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2026-04-26 16:05:11 +08:00
Macdu 78da965b58 avcodec/atrac9tab: correct base curve value 2026-04-25 22:44:39 +02:00
Nicolas George fc4960b155 MAINTAINERS: add myself for libavfilter
I have been maintaining the framework and orphaned filters
for ages but apparently it was never recorded in this file.
2026-04-25 19:18:28 +02:00
Nicolas George 0b08d82425 configure: use ./src instead of src
Fix “do "src/doc/t2h.pm" failed, '.' is no longer in @INC; did you mean do "./src/doc/t2h.pm"?”
when we have such a symlink.
2026-04-25 17:25:28 +02:00
James Almer 45fe315cf0 tests/fate/mpegts: add tests for LCEVC samples
Both single track (Payloads inside SEI messages) and dual track.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-04-24 16:04:48 -03:00
llyyr 5134b0aceb libavutil/hwcontext_vulkan: fix internal queue sync device creation
6f811ad never set VK_DEVICE_QUEUE_CREATE_INTERNALLY_SYNCHRONIZED_BIT_KHR
on the queues at vkCreateDevice. Without the flag, the driver doesn't
actually synchronize queues.

Fixes: 6f811ad751 ("hwcontext_vulkan: implement internal queue synchronization")
2026-04-24 16:07:28 +05:30
llyyr 51660ad523 libavutil/vulkan: replace GetDeviceQueue with GetDeviceQueue2
vkGetDeviceQueue2 with flags = 0 is equivalent to vkGetDeviceQueue and
is available since Vulkan 1.1. Needed to support queues created with
non-zero VkDeviceQueueCreateFlags.

Fixes VUID-vkGetDeviceQueue-flags-01841 VVL error.
2026-04-24 16:07:25 +05:30
ASTRA 163ba704b7 avformat/wavdec: Fix use-of-uninitialized-value in find_guid()
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-04-24 06:36:41 +00:00
AdityaTeltia 71d5fa8d4d avformat/hls_sample_encryption: add missing padding for audio setup buffer
Fixes ticket #22890.

The ff_hls_senc_parse_audio_setup_info function passes setup_data to
parsers like avpriv_ac3_parse_header and init_get_bits8 which require
the buffer to be padded with AV_INPUT_BUFFER_PADDING_SIZE bytes at the end.
2026-04-24 03:23:06 +00:00
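Note: a minimal sketch of the padding requirement described in the commit above. copy_with_padding() and SKETCH_PADDING_SIZE are hypothetical; the value 64 is an assumed stand-in for AV_INPUT_BUFFER_PADDING_SIZE, and the actual change to ff_hls_senc_parse_audio_setup_info() is not shown in this log.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed stand-in for AV_INPUT_BUFFER_PADDING_SIZE. */
#define SKETCH_PADDING_SIZE 64

/* Hypothetical helper: copy `size` bytes into a new buffer that has
 * SKETCH_PADDING_SIZE zeroed bytes after the payload, so bitstream
 * readers that over-read slightly stay inside the allocation. */
static uint8_t *copy_with_padding(const uint8_t *src, size_t size)
{
    uint8_t *buf;

    if (size > SIZE_MAX - SKETCH_PADDING_SIZE)
        return NULL;                    /* size + padding would overflow */

    buf = malloc(size + SKETCH_PADDING_SIZE);
    if (!buf)
        return NULL;

    if (size)
        memcpy(buf, src, size);
    memset(buf + size, 0, SKETCH_PADDING_SIZE);  /* zero the padding */
    return buf;
}
```

The caller owns the returned buffer and releases it with free().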
Ramiro Polla 08f56d4898 avcodec/webp: remove write-only lossless field from WebPContext 2026-04-23 16:46:42 +00:00
Ramiro Polla 9e8e82308b avcodec/webp: use av_fourcc2str() to print fourccs 2026-04-23 16:46:42 +00:00
Ramiro Polla 0cd2bbe4f4 avformat/apngdec: fix playback of piped apng files
The check for avio_size() made apng_read_header() return an error
instead of just disabling looping.
2026-04-23 16:46:21 +00:00
Ramiro Polla 2e92764b86 avformat/apngdec: remove unused function argument 2026-04-23 16:46:21 +00:00
Lynne 162ad61486 vulkan/ffv1: fix second-line linecache initialization for Golomb
This was a difficult problem to find.

Sponsored-by: Sovereign Tech Fund
2026-04-22 23:24:04 +02:00
Marvin Scholz 0c6b4ad5fc doc: add narrow variable scope in loops to style examples 2026-04-22 13:29:18 +00:00
jade 5242bdae82 avformat/id3v2: add image/jxl for JPEG XL image attachments
This allows JPEG XL images to be recognized as valid attachments.
Since JPEG is already widely used for cover art, JXL's support for
lossless JPEG transcodes can decrease the total size of music collections.
This fixes JXL cover art rendering in applications like mpv which rely
on FFmpeg for demuxing.

Signed-off-by: jade <heartstopp1ng@proton.me>
2026-04-22 13:28:17 +00:00
Marvin Scholz fc8d975b0b .forgejo/CODEOWNERS: fix incorrect username
The username does not exist; replace it with the user
who authored the commit that added it.
2026-04-22 12:55:43 +00:00
Marvin Scholz d4cf7cf1cf lavfi: vf_drawtext: properly propagate errors
Do not ignore errors from draw_text().
2026-04-22 12:33:26 +00:00
Marvin Scholz 69072fe8d8 lavfi: vf_drawtext: check memory allocation
Switch to av_calloc and check the allocation.

Fix #22867
2026-04-22 12:33:26 +00:00
Lynne 117807510a vf_overlay_vulkan: port to compile-time SPIR-V generation 2026-04-22 12:45:45 +02:00
Lynne c7d3d3ac55 vf_blend_vulkan: port to compile-time SPIR-V generation 2026-04-22 12:45:45 +02:00
Lynne 4d6cd9f983 vf_scdet_vulkan: port to compile-time SPIR-V generation 2026-04-22 12:45:40 +02:00
Ashrit Shetty 9acd820732 avcodec/mfenc: populate video input type with size, rate, interlace
mf_encv_input_adjust() currently only validates the pixel format and
otherwise leaves the input IMFMediaType unchanged. The Microsoft
H.264, H.265 and AV1 encoder MFTs tolerate this and internally infer
the missing attributes from the previously-set output type. Other
MediaFoundation encoder MFTs that follow the specification more
strictly reject the input type with MF_E_INVALIDMEDIATYPE (due to
MF_E_ATTRIBUTENOTFOUND on MF_MT_FRAME_SIZE / MF_MT_FRAME_RATE) when
those attributes are absent, which causes IMFTransform::SetInputType
to fail and aborts encoding.

Set MF_MT_FRAME_SIZE, MF_MT_FRAME_RATE and MF_MT_INTERLACE_MODE on
the input media type, mirroring what mf_encv_output_adjust() already
writes to the output type. Behaviour on the Microsoft MFTs is
unchanged (they were already using these values) and encoding now
works with stricter third-party MFTs.

The MF_MT_FRAME_SIZE assignment has been present but commented out
since the original MediaFoundation wrapper was added in 050b72ab5e.

Signed-off-by: Ashrit Shetty <ashritshetty@microsoft.com>
Signed-off-by: Martin Storsjö <martin@martin.st>
2026-04-22 10:48:24 +03:00
Dmitrii Gershenkop d1d873c003 avfilter/vf_frc_amf: Add AMF Frame Rate Converter filter 2026-04-21 16:47:05 +00:00
Diego de Souza 0bba4603e2 avcodec/cuviddec: fix monochrome formats misclassified as YUV444
Monochrome formats (gray, gray10le) have log2_chroma_w == 0 and
log2_chroma_h == 0 because they have no chroma planes — the same
values as YUV444. This caused them to be misclassified as YUV444 by
the is_yuv444 detection introduced in bcea693f75.

After fed6612415 changed cuvid_test_capabilities to use is_yuv444
instead of hardcoding cudaVideoChromaFormat_420, monochrome AV1
streams were rejected with "Codec av1_cuvid is not supported with
this chroma format".

Add an nb_components > 1 guard to exclude single-component formats
from the YUV444 path.

Patch by: Aniket Dhok <adhok@nvidia.com>
Signed-off-by: Diego de Souza <ddesouza@nvidia.com>
2026-04-21 16:51:54 +02:00
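The guard described in the cuviddec entry above can be illustrated with a small stand-alone check. This is a sketch under assumptions: `PixDesc` is a hypothetical, trimmed-down stand-in for the pixel format descriptor that models only the three fields named in the commit message, and `is_yuv444()` here is an illustrative function, not the decoder's code.

```c
/* Hypothetical stand-in for a pixel format descriptor; only the fields
 * mentioned in the commit message are modelled. */
typedef struct PixDesc {
    int nb_components;   /* 1 for gray, 3 for YUV without alpha */
    int log2_chroma_w;   /* horizontal chroma subsampling shift */
    int log2_chroma_h;   /* vertical chroma subsampling shift */
} PixDesc;

/* A format counts as 4:4:4 only if it has chroma planes at all
 * (more than one component) and those planes are not subsampled. */
static int is_yuv444(const PixDesc *d)
{
    return d->nb_components > 1 &&
           d->log2_chroma_w == 0 &&
           d->log2_chroma_h == 0;
}
```

Without the `nb_components > 1` term, a single-component gray format with both shifts equal to zero would satisfy the remaining two conditions and be treated as 4:4:4.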
Diego de Souza afc8556a6a avcodec/cuviddec: handle 4-byte AV1CodecConfigurationRecord
AV1CodecConfigurationRecord may contain only the 4-byte header and no
configOBUs. Still skip the header in that case so only configOBUs are
passed to cuvidParseVideoData().

Otherwise the av1C header itself is treated as sequence header data
and AV1 decoding can fail with an unknown error.

Suggested-by: Aniket Dhok <adhok@nvidia.com>
Signed-off-by: Diego de Souza <ddesouza@nvidia.com>
2026-04-21 14:03:58 +00:00
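The header handling described in the entry above might look roughly like the following. This is a sketch, not the cuviddec code: `av1c_config_obus_offset()` is a hypothetical helper, and detecting the record by the marker bit in its first byte is an assumption made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Size of the fixed AV1CodecConfigurationRecord header. */
#define AV1C_HEADER_SIZE 4

/* Hypothetical helper: return how many leading bytes of `extradata` to
 * skip before handing the rest (the configOBUs, possibly zero bytes) to
 * the parser. Assumption: a record is recognised by the marker bit
 * (0x80) in its first byte. Note the ">=": a record that consists of
 * the header alone must still have its header skipped. */
static size_t av1c_config_obus_offset(const uint8_t *extradata, size_t size)
{
    if (size >= AV1C_HEADER_SIZE && (extradata[0] & 0x80))
        return AV1C_HEADER_SIZE;
    return 0;
}
```

With a strict `>` comparison, a 4-byte record would yield an offset of 0 and the header itself would be passed on as if it were OBU data.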
Paul Adenot 99d8d3891f avcodec: Allow enabling DTX in libopusenc 2026-04-21 13:38:44 +00:00
Víctor Manuel Jáquez Leal f4ac7dee87 avcodec/vulkan_encode_h265: fix capabilities flags
Replace the H264 ones.
2026-04-21 13:08:59 +00:00
Jun Zhao 75838b9c89 lavc/hevc: add aarch64 NEON for reference sample filtering
3-tap [1,2,1]>>2: shared implementation body across size-specialized
entry points (8x8/16x16/32x32) to reduce code size. Fold the 3-tap
kernel into uhadd + urhadd: uhadd gives floor((prev+next)/2), then
urhadd rounds with curr to produce (prev + 2*curr + next + 2) >> 2
on 16 bytes in-place (no widen/narrow needed). Overlap-last technique
for tail avoids partial stores. Caller pads input arrays by 16 bytes
to guarantee safe over-read.

Strong smoothing (32x32): preloaded weight tables, interleaved
umull/umlal pairs (two 16-byte blocks at a time) to hide
rshrn-to-store latency, with paired st1 for 32-byte writes.

checkasm --bench --runs=15 (Apple M4, average of 3 trials):
  ref_filter_3tap_8x8_8_neon:    4.1x
  ref_filter_3tap_16x16_8_neon:  3.3x
  ref_filter_3tap_32x32_8_neon:  2.5x
  ref_filter_strong_8_neon:      1.9x

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-04-21 07:50:49 +00:00
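The uhadd + urhadd folding described in the NEON entry above can be checked with a scalar model. The sketch below emulates the two instructions on single bytes (the helper names are made up for this example) and compares the folded form against the reference formula for every 8-bit input triple.

```c
#include <stdint.h>

/* Scalar model of UHADD on one byte lane: floor((a + b) / 2). */
static uint8_t uhadd8(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b) >> 1);
}

/* Scalar model of URHADD on one byte lane: floor((a + b + 1) / 2). */
static uint8_t urhadd8(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b + 1) >> 1);
}

/* Reference 3-tap [1,2,1] filter with rounding. */
static uint8_t filter3_ref(uint8_t prev, uint8_t curr, uint8_t next)
{
    return (uint8_t)((prev + 2 * curr + next + 2) >> 2);
}

/* Folded form: halve the two neighbours first, then round with curr. */
static uint8_t filter3_folded(uint8_t prev, uint8_t curr, uint8_t next)
{
    return urhadd8(uhadd8(prev, next), curr);
}

/* Exhaustively count the input triples where the two forms disagree. */
static unsigned long filter3_count_mismatches(void)
{
    unsigned long bad = 0;

    for (int p = 0; p < 256; p++)
        for (int c = 0; c < 256; c++)
            for (int n = 0; n < 256; n++)
                if (filter3_ref((uint8_t)p, (uint8_t)c, (uint8_t)n) !=
                    filter3_folded((uint8_t)p, (uint8_t)c, (uint8_t)n))
                    bad++;
    return bad;
}
```

The two forms agree because truncating the neighbour sum loses at most the lowest bit, and that bit cannot change the result of the final rounding shift; this is why no widening to 16 bits is needed.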
Jun Zhao 188757d43d tests/checkasm: add hevc_pred ref_filter_3tap and ref_filter_strong tests
Test 3-tap for 8x8/16x16/32x32 (both filtered_left and
filtered_top outputs). Test strong smoothing for filtered_top
and in-place left modification.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-04-21 07:50:49 +00:00
Jun Zhao a3d8e417c0 lavc/hevc: extract reference sample filter into function pointers
Extract 3-tap [1,2,1]>>2 and strong intra smoothing from
intra_pred() into HEVCPredContext function pointers, preparing
for arch-specific overrides.

ref_filter_3tap[3] indexed by log2_size - 3 (sizes 8/16/32).
ref_filter_strong for 32x32 luma only.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-04-21 07:50:49 +00:00
Lynne d3e915d6d1 vf_xfade_vulkan: remove unused includes 2026-04-21 09:39:54 +02:00
Lynne 6f811ad751 hwcontext_vulkan: implement internal queue synchronization 2026-04-21 08:34:47 +02:00
Lynne 8483de2858 chromaber_vulkan: switch to compile-time SPIR-V generation 2026-04-21 08:28:50 +02:00
Lynne 8001b19dc8 vf_gblur_vulkan: port to compile-time SPIR-V generation 2026-04-21 08:28:50 +02:00
Lynne ada9716172 vsrc_testsrc_vulkan: convert to compile-time SPIR-V generation 2026-04-21 08:28:50 +02:00
Lynne 4061e3351f vf_transpose_vulkan: convert to compile-time SPIR-V generation 2026-04-21 08:28:50 +02:00
Lynne d0ee5d0556 vf_flip_vulkan: convert to compile-time SPIR-V generation 2026-04-21 08:28:50 +02:00
Lynne 2f7d3290c0 vf_xfade_vulkan: convert to compile-time SPIR-V generation 2026-04-21 08:28:49 +02:00
Lynne f8f485fb3c vf_interlace_vulkan: convert to compile-time SPIR-V generation 2026-04-21 08:28:49 +02:00
Lynne d381151ae3 vulkan_filter: add an argument for setting the Z workgroup count 2026-04-21 08:28:45 +02:00
Zuxy Meng dc23adde9b avcodec/x86/h264_intrapred: Replace pred8x8_dc_8_mmxext with SSE2
Deprecate MMX without a performance regression; performance numbers
are nearly identical on my Zen 4 (1.99x vs. C).

Signed-off-by: Zuxy Meng <zuxy.meng@gmail.com>
2026-04-20 19:38:56 -07:00
nyanmisaka c92304f8c7 avfilter: add transpose_cuda video filter
This patch adds the transpose_cuda video filter.
It's similar to the existing transpose filter but accelerated by CUDA.

It supports the same pixel formats as the scale_cuda filter.
This also supersedes the deprecated transpose_npp filter.

Example usage:
ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i <INPUT> -vf "transpose_cuda=dir=clock" <OUTPUT>

Signed-off-by: nyanmisaka <nst799610810@gmail.com>
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2026-04-20 21:08:21 +02:00
Arien Shibani c67a4554d1 CONTRIBUTING.md: add blank line after top heading
Insert spacing after the first heading (MD022-style).
2026-04-20 12:32:50 +00:00
Arien Shibani 849e8307ce doc/transforms.md: add document title and fix heading structure
Add a top-level title and demote former section headings (MD041-style hierarchy).

Add blank lines around headings and fenced code blocks where appropriate (MD022 and MD031-style). Some Markdown parsers, including kramdown, only recognize headings that are preceded by a blank line.
2026-04-20 12:32:50 +00:00
Arien Shibani 6e3366e9bc INSTALL.md: add title heading and normalize section levels
Use a top-level heading on the first line (MD041-style) and adjust section levels for clearer document structure. Improves navigation for assistive technologies that rely on heading outlines.
2026-04-20 12:32:50 +00:00
Arien Shibani 519c80b626 README.md: use consistent ATX heading style
Align heading markers with markdownlint MD003 suggestions.
2026-04-20 12:32:50 +00:00
Andreas Rheinhardt 5e69e6d49c avformat/pdvenc: Don't silently truncate value
This muxer seems to intend to support output that does
not begin at zero (instead of e.g. just hardcoding
nb_frames_pos to 16). But then it is possible
that avio_seek() returns values > INT_MAX even
though the part of the file written by us can not
exceed this value. So the return value of avio_seek()
needs to be checked as 64bit integer and not silently
truncated to int.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-20 12:54:31 +02:00
Andreas Rheinhardt 2b8438a495 avformat/pdvenc: Remove always-false checks
The number of streams is always one (namely one video stream
with codec id AV_CODEC_ID_PDV) due to the MAX_ONE_OF_EACH,
ONLY_DEFAULT_CODECS flags. Also, the generic code (init_muxer()
in mux.c) checks that video streams have proper dimensions set.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-20 12:54:31 +02:00
Andreas Rheinhardt 6135ccbf80 avcodec/pdvenc: Return directly upon error
This encoder has the FF_CODEC_CAP_INIT_CLEANUP cap set.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-20 12:54:31 +02:00
Andreas Rheinhardt 87a6be19f8 avcodec/pdvenc: Remove always false check
av_image_check_size() already checks that width*height
fits into an int.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-20 12:54:31 +02:00
Andreas Rheinhardt c94cb9c04f avcodec/pdvenc: Remove always-false pixel format check
Already checked via CODEC_PIXFMTS.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-20 12:54:31 +02:00
Andreas Rheinhardt e908c92f5a avcodec/cavs: Don't allocate block separately
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-20 12:21:41 +02:00
Jeongkeun Kim de18feb0f0 avutil/aarch64: add pixelutils 32x32 SAD NEON implementation
This adds a NEON-optimized function for computing 32x32 Sum of Absolute
Differences (SAD) on AArch64, addressing a gap where x86 had SSE2/AVX2
implementations but AArch64 lacked equivalent coverage.

The implementation mirrors the existing sad8 and sad16 NEON functions,
employing a 4-row unrolled loop with UABAL and UABAL2 instructions for
efficient load-compute interleaving, and four 8x16-bit accumulators to
handle the wider 32-byte rows.

Benchmarks on AWS Graviton3 (Neoverse V1, c7g.xlarge) using checkasm:
  sad_32x32_0: C 146.4 cycles -> NEON  98.1 cycles (1.49x speedup)
  sad_32x32_1: C 141.4 cycles -> NEON  98.9 cycles (1.43x speedup)
  sad_32x32_2: C 140.7 cycles -> NEON  95.0 cycles (1.48x speedup)

Signed-off-by: Jeongkeun Kim <variety0724@gmail.com>
2026-04-19 19:27:55 +00:00
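For reference, the quantity computed by the 32x32 SAD function in the entry above is straightforward to write in scalar C. This is an illustrative sketch; `sad_32x32_ref()` and its signature are chosen for this example and are not taken from libavutil.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Scalar sketch: sum of absolute differences over a 32x32 block.
 * Strides are in bytes and may differ between the two sources. */
static int sad_32x32_ref(const uint8_t *src1, ptrdiff_t stride1,
                         const uint8_t *src2, ptrdiff_t stride2)
{
    int sum = 0;

    for (int y = 0; y < 32; y++) {
        for (int x = 0; x < 32; x++)
            sum += abs(src1[x] - src2[x]);
        src1 += stride1;
        src2 += stride2;
    }
    return sum;
}
```

The 16-bit accumulators mentioned in the commit message are wide enough for this block size: four 8x16-bit accumulators give 32 lanes, so each lane sums at most 32 differences of at most 255, i.e. 8160, well below 65535.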
llyyr 4af27ba4ca doc/APIchanges: fix date and version in latest entry
The entry incorrectly lists the libavcodec major version as 60 instead of
62. Also fix the date and commit hash while at it.

Fixes: 7faa6ee2aa ("libavformat/matroska: Support smpte 2094-50 metadata")

Signed-off-by: llyyr <llyyr.public@gmail.com>
2026-04-19 15:37:33 +00:00
Romain Beauxis 82d7e375f1 libavdevice/alsa.c: fix NULL pointer dereference 2026-04-19 15:00:08 +00:00
Andreas Rheinhardt 415b466d41 avcodec/x86/vp3dsp: Port ff_vp3_idct_dc_add_mmxext to SSE2
This change should improve performance on Skylake and later
Intel CPUs (which have only half as many ports for saturated adds/subs
on mmx registers as on xmm registers): llvm-mca predicts
a 25% performance improvement on Skylake.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-19 08:21:17 +02:00
Andreas Rheinhardt e7a613b274 avcodec/x86/vp3dsp: Avoid loads and stores
Instead reuse values in registers.

Old benchmarks:
idct_add_c:                                             74.2 ( 1.00x)
idct_add_sse2:                                          60.4 ( 1.23x)
idct_put_c:                                            100.8 ( 1.00x)
idct_put_sse2:                                          58.7 ( 1.72x)

New benchmarks:
idct_add_c:                                             74.2 ( 1.00x)
idct_add_sse2:                                          55.2 ( 1.34x)
idct_put_c:                                            107.5 ( 1.00x)
idct_put_sse2:                                          54.1 ( 1.99x)

Hint: For x64, all the intermediate stores could be avoided.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-19 08:21:17 +02:00
Andreas Rheinhardt ed59fc77e8 avcodec/x86/vp3dsp: Use named args in idct functions
Also avoid REX prefixes while just at it.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-19 08:21:17 +02:00
Andreas Rheinhardt c1af56357b avcodec/x86/vp3dsp: Avoid unnecessary macro, repetition
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-19 08:21:17 +02:00
Andreas Rheinhardt 88879f2eff tests/checkasm/vp3dsp: Add test for idct_add, idct_put, idct_dc_add
Due to a discrepancy between the SSE2 and C versions, the coefficients
for idct_put and idct_add are restricted to a range that does not
cause overflows.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-19 08:21:08 +02:00
Andreas Rheinhardt 84b9de0633 avcodec/x86/vp3dsp: Port ff_put_vp_no_rnd_pixels8_l2_mmx to SSE2
This allows using pavgb to reduce the number of instructions used
to calculate the average; processing two rows via movhps reduces
the number of pxor and pavgb instructions even further and turned
out to be beneficial.
This patch also avoids a load, as the constant used here can easily
be generated at runtime.

Old benchmarks:
put_no_rnd_pixels_l2_c:                                 13.3 ( 1.00x)
put_no_rnd_pixels_l2_mmx:                               11.6 ( 1.15x)

New benchmarks:
put_no_rnd_pixels_l2_c:                                 13.4 ( 1.00x)
put_no_rnd_pixels_l2_sse2:                               7.5 ( 1.77x)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-19 08:15:54 +02:00
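One common way to obtain a non-rounding average from pavgb, which always rounds up, is to complement both inputs and the result. Whether the patch above uses exactly this sequence is an assumption; the scalar sketch below (with made-up helper names) only demonstrates that the identity holds for all byte pairs.

```c
#include <stdint.h>

/* Scalar model of PAVGB on one byte: floor((a + b + 1) / 2). */
static uint8_t pavgb8(uint8_t a, uint8_t b)
{
    return (uint8_t)((a + b + 1) >> 1);
}

/* Non-rounding average built from the rounding one:
 * ~avg(~a, ~b) == floor((a + b) / 2). In SIMD code the complements are
 * XORs with an all-ones register, which needs no load from memory. */
static uint8_t avg_no_rnd8(uint8_t a, uint8_t b)
{
    return (uint8_t)~pavgb8((uint8_t)~a, (uint8_t)~b);
}

/* Exhaustively count the byte pairs where the identity fails. */
static unsigned avg_no_rnd_count_mismatches(void)
{
    unsigned bad = 0;

    for (int a = 0; a < 256; a++)
        for (int b = 0; b < 256; b++)
            if (avg_no_rnd8((uint8_t)a, (uint8_t)b) != ((a + b) >> 1))
                bad++;
    return bad;
}
```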
Andreas Rheinhardt 37bc3a237b tests/checkasm/vp3dsp: Add test for put_no_rnd_pixels_l2
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-19 08:14:50 +02:00
Andreas Rheinhardt 79ce4432e0 avcodec/simple_idct10_template: Reduce amount of registers used
This avoids using the stack for the 8 bit simple IDCT;
for the other IDCTs, it avoids storing and restoring two
xmm registers on Win64.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-19 07:36:37 +02:00
Andreas Rheinhardt e1782fb016 avutil/x86/pixelutils: Don't use mmx in 8x8 SAD
This function is exported, so has to abide by the ABI
and therefore issues emms since commit
5b85ca5317. Yet this is
expensive and using SSE2 instead improves performance.
Also avoid the initial zeroing and the last pointer
increment while just at it.
This removes the last usage of mmx from libavutil*.

Old benchmarks:
sad_8x8_0_c:                                            13.2 ( 1.00x)
sad_8x8_0_mmxext:                                       27.8 ( 0.48x)
sad_8x8_1_c:                                            13.2 ( 1.00x)
sad_8x8_1_mmxext:                                       27.6 ( 0.48x)
sad_8x8_2_c:                                            13.3 ( 1.00x)
sad_8x8_2_mmxext:                                       27.6 ( 0.48x)

New benchmarks:
sad_8x8_0_c:                                            13.3 ( 1.00x)
sad_8x8_0_sse2:                                         11.7 ( 1.13x)
sad_8x8_1_c:                                            13.8 ( 1.00x)
sad_8x8_1_sse2:                                         11.6 ( 1.20x)
sad_8x8_2_c:                                            13.2 ( 1.00x)
sad_8x8_2_sse2:                                         11.8 ( 1.12x)

Hint: Using two psadbw or one psadbw and movhps made no difference
in the benchmarks, so I chose the latter due to smaller codesize.

*: except if lavu provides avpriv_emms for other libraries

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-18 21:21:11 +02:00
Michael Niedermayer d538a71ad5 avcodec/svq1dec: Check input space for minimum
We reject inputs that are significantly smaller than the smallest frame.
This check raises the minimum input needed before time-consuming computations are performed.
It thus improves the computation per input byte and reduces the potential DoS impact.

Fixes: Timeout
Fixes: 472769364/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_SVQ1_DEC_fuzzer-5519737145851904

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-04-18 18:32:50 +00:00
Andreas Rheinhardt fcffc0e1c5 avformat/matroskaenc: Remove pointless side-data size checks
Just presume that any side data present is actually valid.

Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-17 23:53:12 +02:00
Andreas Rheinhardt 38df985fba avformat/matroskaenc: Use separate buffer for SMPTE 2094 blockadditional
Otherwise the buffer for the hdr10+ blockadditional would
be clobbered if both are present (the buffers can only be
reused after the ebml_writer_write() call).

Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-17 23:53:07 +02:00
Andreas Rheinhardt 25ce544d4b avformat/matroskaenc: Increase size of EBML_WRITER array
7faa6ee2aa added support
for writing AV_PKT_DATA_DYNAMIC_HDR_SMPTE_2094_APP5,
yet forgot to update the size of the EBML element buffer.

Reviewed-by: James Almer <jamrial@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-17 23:52:53 +02:00
Vignesh Venkat 7faa6ee2aa libavformat/matroska: Support smpte 2094-50 metadata
Add support for parsing and muxing SMPTE 2094-50 metadata. It will
be stored as an ITU-T T.35 message in the BlockAdditional element with
an AddId type of 4 (which is reserved for ITU-T T.35 in the Matroska
spec).

https://www.matroska.org/technical/codec_specs.html#itu-t35-metadata

Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com>
2026-04-17 18:51:25 +00:00
Lynne c1b19ee69f aacdec: add support for 960-frame HE-AAC (DAB+) decoding
Finally, after so many years. I'm sure there's good DAB+ content
out there being broadcast. Go and listen to it.
2026-04-17 16:46:52 +02:00
Hassan Hany 9e4041d5ea avcodec/opus: use precomputed NLSF weights for Silk decoder
Precompute the SILK NLSF residual weights from the stage-1 codebooks and use the table during LPC decode. This removes the per-coefficient mandated fixed-point weight calculation in silk_decode_lpc() while preserving the same decoded values.
2026-04-17 14:39:20 +00:00
Niklas Haas 96f82f4fbb swscale/x86/ops: simplify SWS_OP_CLEAR patterns
Mark the components to be cleared, not the components to be preserved.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 23:25:17 +02:00
Niklas Haas 08707934cc swscale/ops_backend: simplify SWS_OP_CLEAR declarations
Mark the components to be cleared, not the components to be preserved.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 23:25:17 +02:00
Niklas Haas 7a71a01a1b swscale/ops: nuke SwsComps.unused
Finally, remove the last relic of this accursed design mistake.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 23:25:17 +02:00
Niklas Haas a797e30f71 swscale/aarch64/ops: compute SWS_OP_PACK mask directly
Instead of implicitly relying on SwsComps.unused, which contains the exact
same information. (cf. ff_sws_op_list_update_comps)

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 23:25:17 +02:00
Niklas Haas 6d1e549195 swscale/aarch64/ops: use SWS_OP_NEEDED() instead of next->comps.unused
These are basically identical, but the latter is being phased out.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 23:25:17 +02:00
Niklas Haas 18cc71fc8e swscale/aarch64/ops: fix SWS_OP_LINEAR mask check
The implementation of AARCH64_SWS_OP_LINEAR loops over elements of this mask
to determine which *output* rows to compute. However, it is being set by this
loop to `op->comps.unused`, which is a mask of unused *input* rows. As such,
it should be looking at `next->comps.unused` instead.

This did not result in problems in practice, because none of the linear
matrices happened to trigger this case (more input columns than output rows).

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 23:25:17 +02:00
Niklas Haas df4fe85ae3 swscale/ops_chain: replace SwsOpEntry.unused by SwsCompMask
Needed to allow us to phase out SwsComps.unused altogether.

It's worth pointing out the change in semantics; while unused tracks the
unused *input* components, the mask is defined as representing the
computed *output* components.

This is 90% the same, except for read/write, pack/unpack, and clear; which
are the only operations that can be used to change the number of components.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 23:25:10 +02:00
Niklas Haas 215cd90201 swscale/x86/ops: simplify DECL_DITHER definition
This extra indirection boilerplate just for the 0-size fast path really isn't
doing us any favors.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 23:24:55 +02:00
Niklas Haas 9f0dded48d swscale/ops_chain: check for exact linear mask match
Makes this logic a lot simpler and less brittle. We can trivially adjust the
list of linear masks that are required, whenever it changes as a result of any
future modifications.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 23:24:55 +02:00
Niklas Haas e20a32d730 swscale/x86/ops: align linear kernels with reference backend
See previous commit.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 23:24:55 +02:00
Niklas Haas 9b1c1fe95f swscale/ops_backend: align linear kernels with actually needed masks
Using the power of libswscale/tests/sws_ops -summarize lets us see which
kernels are actually needed by real op lists.

Note: I'm working on a separate series which will obsolete this implementation
whack-a-mole game altogether, by generating a list of all possible op kernels
at compile time.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 23:24:55 +02:00
Niklas Haas af2674645f swscale/ops: drop offset from SWS_MASK_ALPHA
This is far more commonly used without an offset than with; so having it there
prevents these special cases from actually doing much good.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 23:24:55 +02:00
Niklas Haas 526195e0a3 swscale/x86/ops_float: fix typo in linear_row
First vector is %2, not %3. This was never triggered before because all of
the existing masks never hit this exact case.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 23:24:55 +02:00
Niklas Haas 6a83e15392 swscale/ops_chain: simplify SwsClearOp checking
Since this now has an explicit mask, we can just check that directly, instead
of relying on the unused comps hack/trick.

Additionally, this also allows us to distinguish between fixed value and
arbitrary value clears by just having the SwsOpEntry contain NAN values iff
they support any clear value.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 23:24:22 +02:00
Niklas Haas 80bd6c0cd5 swscale/ops: don't strip range metadata for unused components
As alluded to by the previous commit, this is now no longer necessary to
prevent their print-out.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 23:23:36 +02:00
Niklas Haas 3680642e1b swscale/ops: simplify min/max range print check
This does come with a slight change in behavior, as we now don't print the
range information in the case that the range is only known for *unused*
components. However, in practice, that's already guaranteed by update_comps()
stripping the range info explicitly in this case.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 23:23:36 +02:00
Niklas Haas 9bb2b11d5b swscale/ops: add SwsCompMask parameter to print_q4()
Instead of implicitly excluding NAN values if ignore_den0 is set. This
gives callers more explicit control over which values to print, and in
doing so, makes sure "unintended" NaN values are properly printed as such.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 23:23:36 +02:00
Niklas Haas cf2d40f65d swscale/ops: add explicit clear mask to SwsClearOp
Instead of implicitly testing for NaN values. This is mostly a straightforward
translation, but we need some slight extra boilerplate to ensure the mask
is correctly updated when e.g. commuting past a swizzle.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 23:23:36 +02:00
Niklas Haas 4020607f0a swscale/ops: add SwsCompMask and related helpers
This new type will be used over the following commits to simplify the
codebase.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 23:23:36 +02:00
Niklas Haas ce2ca1a186 swscale/ops_optimizer: fix commutation of U32 clear + swap_bytes
This accidentally unconditionally overwrote the entire clear mask, since
Q(n) always set the denominator to 1, resulting in all channels being
cleared instead of just the ones with nonzero denominators.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 23:23:36 +02:00
Niklas Haas 953d278a01 tests/swscale: fix input pattern generation for very small sizes
This currently completely fails for images smaller than 12x12; and even in that
case, the limited resolution makes these tests a bit useless.

At the risk of triggering a lot of spurious SSIM regressions for very
small sizes (due to insufficiently modelling the effects of low resolution on
the expected noise), this patch allows us to at least *run* such tests.

Incidentally, 8x8 is the smallest size that passes the SSIM check.
2026-04-16 20:59:39 +00:00
Niklas Haas 0da2bbab68 swscale/ops_dispatch: re-indent (cosmetic) 2026-04-16 20:59:39 +00:00
Niklas Haas 4c19f82cc0 swscale/ops_dispatch: compute minimum needed tail size
Not only does this take into account extreme edge cases where the plane
padding can significantly exceed the actual width/stride, but it also
correctly takes into account the filter offsets when scaling; which the
previous code completely ignored.

Simpler, more robust, and more correct. Now valgrind passes for 100% of format
conversions for me, with and without scaling.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 20:59:39 +00:00
Niklas Haas cd8ece4114 swscale/ops_dispatch: generalize the number of tail blocks
This is a mostly straightforward internal mechanical change that I wanted
to isolate from the following commit to make bisection easier in the case of
regressions.

While the number of tail blocks could theoretically be different for input
vs output memcpy, the extra complexity of handling that mismatch (and
adjusting all of the tail offsets, strides etc.) seems not worth it.

I tested this commit by manually setting `p->tail_blocks` to higher values
and seeing if that still passed the self-check under valgrind.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 20:59:39 +00:00
Niklas Haas dba7b81b38 swscale/ops_dispatch: avoid calling comp->func with w=0
The x86 kernel e.g. assumes that at least one block is processed; so avoid
calling this with an empty width. This is currently only possible if e.g.
operating on an unpadded, very small image whose total linesize is less than
a single block.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 20:59:39 +00:00
Niklas Haas 35174913ac swscale/ops_dispatch: fix and generalize tail buffer size calculation
This code had two issues:

1. It was over-allocating bytes for the input offset map case, and
2. It was hard-coding the assumption that there is only a single tail block

We can fix both of these issues by rewriting the way the tail size is derived.

In the non-offset case, and assuming only 1 tail block:
    aligned_w - safe_width
  = num_blocks * block_size - (num_blocks - 1) * block_size
  = block_size

Additionally, the FFMAX(tail_size_in/out) is unnecessary, because:
    tail_size = pass->width - safe_width <= aligned_w - safe_width

In the input offset case, we instead realize that the input kernel already
never over-reads the input due to the filter size adjustment/clamping, so
the only thing we need to ensure is that we allocate extra bytes for the
input over-read.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 20:59:39 +00:00
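The arithmetic in the tail buffer entry above can be replayed with a few lines of C. The sketch assumes a single tail block and no input offset map, as in the derivation; `TailLayout` and `tail_layout()` are hypothetical names for this example, while the field names follow the quantities in the commit message.

```c
#include <stddef.h>

/* Quantities from the derivation, for one row of `width` pixels that is
 * processed in blocks of `block_size` pixels. */
typedef struct TailLayout {
    size_t num_blocks;  /* blocks needed to cover the row, rounded up */
    size_t aligned_w;   /* num_blocks * block_size */
    size_t safe_width;  /* (num_blocks - 1) * block_size */
    size_t tail_size;   /* width - safe_width */
} TailLayout;

/* Hypothetical helper; assumes width >= 1 and block_size >= 1. */
static TailLayout tail_layout(size_t width, size_t block_size)
{
    TailLayout t;

    t.num_blocks = (width + block_size - 1) / block_size;
    t.aligned_w  = t.num_blocks * block_size;
    t.safe_width = (t.num_blocks - 1) * block_size;
    t.tail_size  = width - t.safe_width;
    return t;
}
```

For a width of 100 and a block size of 32 this gives four blocks, `aligned_w = 128`, `safe_width = 96` and `tail_size = 4`; the difference `aligned_w - safe_width` is exactly one block, and the tail never exceeds it.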
Niklas Haas f604add8c1 swscale/ops_dispatch: remove pointless AV_CEIL_RSHIFT()
The over_read/write fields are not documented as depending on the subsampling
factor. Actually, they are not documented as depending on the plane at all.

If and when we do actually add support for horizontal subsampling to this
code, it will most likely be by turning all of these key variables into
arrays, which will be an upgrade we get basically for free.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 20:59:39 +00:00
Niklas Haas dd8ff89adf swscale/ops_dispatch: add helper to explicitly control pixel->bytes rounding
This makes it far less likely to accidentally add or remove a +7 bias when
repeating this often-used expression.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 20:59:39 +00:00
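A helper of the kind described in the entry above could look like the following. This is a guess at the shape, not the swscale code: the name `pixels_to_bytes()` and its parameters are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: convert a pixel count to a byte count for a
 * format with `bits_per_pixel` bits per pixel. The caller states the
 * rounding direction explicitly instead of repeating "+ 7" by hand:
 * round up for sizes that must cover all pixels, round down for
 * offsets that must not overshoot. */
static size_t pixels_to_bytes(size_t pixels, size_t bits_per_pixel,
                              int round_up)
{
    size_t bits = pixels * bits_per_pixel;

    return (bits + (round_up ? 7 : 0)) >> 3;
}
```

For byte-aligned formats both directions agree; they only differ for sub-byte formats such as 1 bit per pixel, where 10 pixels occupy 2 bytes when rounded up and 1 byte when rounded down.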
Niklas Haas 16a57b2985 swscale/ops_dispatch: ensure block size is multiple of pixel size
This could trigger if e.g. a backend tries to operate on monow formats with
a block size that is not a multiple of 1. In this case, `block_size_in`
would previously be miscomputed (to e.g. 0), which is obviously wrong.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 20:59:39 +00:00
Niklas Haas 86307dad4a swscale/ops_dispatch: make offset calculation code robust against overflow
As well as weird edge cases like trying to filter `monow` and pixels landing
in the middle of a byte. Realistically, this will never happen - we'd instead
pre-process it into something byte-aligned, and then dispatch a byte-aligned
filter on it.

However, I need to add a check for overflow in any case, so we might as well
add the alignment check at the same time. It's basically free.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 20:59:39 +00:00
Niklas Haas 95e4f7cac5 swscale/ops_dispatch: fix rounding direction of plane_size
This is an upper bound, so it should be rounded up.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 20:59:39 +00:00
Niklas Haas c6e47b293d swscale/ops_dispatch: pre-emptively guard against int overflow
By using size_t whenever we compute derived figures.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 20:59:39 +00:00
Niklas Haas 0524e66aec swscale/ops_dispatch: drop pointless const (cosmetic)
These are clearly not mutated within their constrained scope, and it just
wastes valuable horizontal space.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 20:59:39 +00:00
Niklas Haas c98810ac78 swscale/ops_dispatch: zero-init tail buffer
Prevents valgrind from complaining about operating on uninitialized bytes.
This should be cheap as it's only done once during setup().

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 20:59:39 +00:00
Niklas Haas ba516a34cd swscale/x86/ops_int: use sized mov for packed_shuffle output
This code made the input read conditional on the byte count, but not the
output, leading to a lot of over-write for cases like 15, 5.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 20:59:39 +00:00
Niklas Haas 4264045137 swscale/x86/ops: set missing over_read metadata on filter ops
These align the filter size to a multiple of the internal tap grouping
(either 1/2/4 for vpgatherdd, or the XMM size for the 4x4 transposed kernel).
This may over-read past the natural end of the input buffer, if the aligned
size exceeds the true size.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 20:59:39 +00:00
Peter von Kaenel d013863f00 avcodec/lcevcdec: poll on LCEVC_Again from LCEVC_ReceiveDecoderPicture
The V-Nova LCEVC pipeline processes frames on internal background
worker threads. LCEVC_ReceiveDecoderPicture returns LCEVC_Again (-1)
when the worker has not yet completed the frame, which is the
documented "not ready, try again" response. The original code treated
any non-zero return as a fatal error (AVERROR_EXTERNAL), causing decode
to abort mid-stream.

Poll until LCEVC_Success or a genuine error is returned.

Signed-off-by: Peter von Kaenel <Peter.vonKaenel@harmonicinc.com>
Signed-off-by: James Almer <jamrial@gmail.com>
2026-04-16 17:19:28 -03:00
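The polling behaviour described above can be sketched as follows. This is only an illustration under stated assumptions: `DEMO_SUCCESS`, `DEMO_AGAIN`, `poll_receive` and `demo_receive` are hypothetical stand-ins, not the V-Nova API or the lcevcdec code, and a real implementation may want to yield or sleep between attempts.

```c
/* Hypothetical return codes mirroring the "success / try again / error"
 * convention described in the commit message. */
enum { DEMO_SUCCESS = 0, DEMO_AGAIN = -1, DEMO_ERROR = -2 };

typedef int (*receive_fn)(void *opaque);

/* Keep calling `receive` while it reports "not ready yet".
 * Returns 0 on success and -1 on any genuine error. */
static int poll_receive(receive_fn receive, void *opaque)
{
    int res;
    do {
        res = receive(opaque);
    } while (res == DEMO_AGAIN);
    return res == DEMO_SUCCESS ? 0 : -1;
}

/* Demo worker: reports DEMO_AGAIN until the counter reaches zero. */
static int demo_receive(void *opaque)
{
    int *remaining = opaque;
    if (*remaining > 0) {
        (*remaining)--;
        return DEMO_AGAIN;
    }
    return DEMO_SUCCESS;
}
```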
Andreas Rheinhardt 9ab37ef918 avcodec/packet: Remove always-true check
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-16 19:27:03 +00:00
Andreas Rheinhardt 4b5e1d25c3 avcodec/decode: Short-circuit side-data processing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-16 19:27:03 +00:00
Andreas Rheinhardt a85709537e avcodec/decode: Avoid temporary frame in ff_reget_buffer()
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-16 19:27:03 +00:00
Andreas Rheinhardt b595b3075e avcodec/decode: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-16 19:27:03 +00:00
Andreas Rheinhardt f99e4a0f23 avcodec/decode: Optimize call away if possible
post_process_opaque is only used by LCEVC, so it is unused
on most builds.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-16 19:27:03 +00:00
Andreas Rheinhardt 312bfd512d avcodec/decode: Remove always-true checks
dc->lcevc.ctx is only != NULL for video.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-16 19:27:03 +00:00
Andreas Rheinhardt 2d062dd0c6 avcodec/decode: Make post_process_opaque a RefStruct reference
Avoids the post_process_opaque_free callback; the only user of
this is already a RefStruct reference and presumably other users
would want to use a pool for this, too, so they would use
RefStruct-objects, too.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-16 19:27:03 +00:00
Andreas Rheinhardt 0ee1947d9b avcodec/lcevcdec: Use pool to avoid allocations of FFLCEVCFrame
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-16 19:27:03 +00:00
Kacper Michajłow 03967fcff4 tests/checkasm/sw_ops: fix too large shift for int
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-04-16 18:56:22 +00:00
Kacper Michajłow 369dbbe488 swscale/ops_memcpy: guard exec->in_stride[-1] access
When use_loop == true and idx < 0, we would incorrectly check
in_stride[idx], which is an OOB read. Reorder conditions to avoid that.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-04-16 18:56:22 +00:00
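The reordering relies on the short-circuit evaluation of `&&`. A minimal sketch, assuming a hypothetical `plane_has_stride` helper (the real condition in ops_memcpy may look different):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the stride check; idx < 0 means "no plane".
 * The bounds test comes before the array access, so in_stride[idx] is
 * never evaluated for a negative index. */
static bool plane_has_stride(const ptrdiff_t *in_stride, int idx, bool use_loop)
{
    return use_loop && idx >= 0 && in_stride[idx] != 0;
}
```

Writing the same tests in the order `in_stride[idx] != 0 && idx >= 0` would read `in_stride[-1]` before the guard gets a chance to reject it.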
Niklas Haas 1764683668 swscale/ops_backend: disable FP contraction where possible
In particular, Clang defaults to FP contraction enabled. GCC defaults to
off in standard C mode (-std=c11), but the C standard does not actually
require any particular default.

The #pragma STDC pragma, despite its name, warns on anything except Clang.

Fixes: https://code.ffmpeg.org/FFmpeg/FFmpeg/issues/22796
See-also: https://discourse.llvm.org/t/fp-contraction-fma-on-by-default/64975
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-16 17:19:51 +00:00
Hassan Hany 3b19a61837 avcodec/vorbisdec: validate windowtype and transformtype 2026-04-16 10:24:41 +00:00
Gyan Doshi 5abc240a27 avcodec/videotoolboxenc: add missing field and rectify cap flags 2026-04-16 09:58:06 +00:00
Daniel Verkamp 8eae5de5af avformat/wavenc: Keep fmt chunk first for -rf64 auto
When the WAV muxer's `-rf64 auto` option is used, the output is intended
to be a normal WAV file if possible, only extended to RF64 format when
the file size grows too large. This was accomplished by reserving space
for the extra RF64-specific data using a standard JUNK chunk (ignored by
readers), then overwriting the reserved space later with a ds64 chunk if
needed.

In the original rf64 auto implementation, the JUNK chunk was placed
right after the RIFF/WAVE file header, before the fmt chunk; this is the
design suggested by the "Achieving compatibility between BWF and RF64"
section of the RF64 spec:

  RIFF 'WAVE' <JUNK chunk> <fmt-ck> ...

However, this approach means that the fmt chunk is no longer in its
conventional location at the beginning of the file, and some WAV-reading
tools are confused by this layout. For example, the `file` tool is not
able to show the format information for a file with the extra JUNK chunk
before fmt.

This change shuffles the order of the chunks for `-rf64 auto` mode so
that the reserved space follows fmt instead of preceding it:

  RIFF 'WAVE' <fmt-ck> <JUNK chunk> ...

With this small modification, tools expecting the fmt chunk to be the
first chunk in the file work with files produced by `-rf64 auto`.

This means the fmt chunk won't be in the location required by RF64, so
if the automatic RF64 conversion is triggered, the fmt chunk needs to be
relocated by rewriting it following the ds64 chunk during the conversion:

  RF64 'WAVE' <ds64 chunk> <fmt-ck> ...
2026-04-16 09:12:45 +00:00
Andreas Rheinhardt 39f34ee019 tests/checkasm/h264chroma: Use more realistic block sizes
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-16 07:36:01 +02:00
Andreas Rheinhardt 3de38c6b6e avcodec/h264chroma: Fix incorrect alignment documentation
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-16 07:36:01 +02:00
Andreas Rheinhardt 53a26f7d41 avcodec/mpegvideo_dec: Use C version of h264chroma mc2 functions
H.264 only uses these functions with height 2 or 4 and
the aarch64, arm and mips versions of them optimize based
on this. Yet this is not true when these functions are used
by the lowres code in mpegvideo_dec.c. So revert back to
the C versions of these functions for mpegvideo_dec so that
the H.264 decoder can still use fully optimized functions.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-16 07:36:01 +02:00
James Almer 2feb213287 avcodec/lcevc: make CBS reallocate the LCEVC payload
Frame side data unfortunately lacks padding, which CBS needs, so we can't reuse
the existing AVBufferRef.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-04-15 16:13:44 -03:00
Niklas Haas dcfd8ebe86 tests/checkasm/sw_ops: remove random value clears
These can randomly trigger the alpha/zero fast paths, resulting in spurious
tests or randomly diverging performance if the backend happens to implement
that particular fast path.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-15 14:51:16 +00:00
Niklas Haas 80b86f0807 tests/checkasm/sw_ops: fix check_scale()
This was not actually testing the integer path. Additionally, for integer
scales, there is a special fast path for expansion from bits to full range,
which we should separate from the random value test.
2026-04-15 14:51:16 +00:00
Niklas Haas e199d6b375 swscale/x86/ops: add missing component annotation on expand_bits
This only does a single component; so it should be marked as such.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-15 14:51:16 +00:00
Niklas Haas b6755b0158 swscale/ops_memcpy: always use loop on buffers with large padding
The overhead of the loop and memcpy call is less than the overhead of
possibly spilling into one extra unnecessary cache line. 64 is still a
good rule of thumb for L1 cache line size in 2026.

I leave it to future code archeologists to find and tweak this constant if
it ever becomes unnecessary.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-15 14:51:16 +00:00
Niklas Haas 026a6a3101 tests/checkasm/sw_ops: remove redundant filter tests
Most of these filters don't test anything meaningfully different relative to
each other; the only filters that really have special significance are POINT
(for now) and maybe BILINEAR down the line.

Apart from that, SINC, combined with the src size loop, already tests both
extreme cases (large and small filters), with large, oscillating unwindowed
weights.

The other filters are not adding anything of substance to this, while massively
slowing down the runtime of this test. We can, of course, change this if the
backends ever get more nuanced handling.

checkasm: all 855 tests passed (down from 1575)

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-15 14:51:16 +00:00
Niklas Haas 91582f7287 tests/checkasm/sw_ops: explicitly test all backends
The current code was a bit clumsy in that it always picked the first
available backend when choosing the new function. This meant that some x86
paths were not being tested at all, whenever the memcpy backend (which has
higher priority) could serve the request.

This change makes it so that each backend is explicitly tested against only
implementations provided by that same backend.

checkasm: all 1575 tests passed (up from 1305)

As an aside, it also lets us benchmark the memcpy backend directly against
the C reference backend.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-15 14:51:16 +00:00
Niklas Haas d5089a1c62 tests/checkasm/sw_ops: don't shadow 'report'
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-15 14:51:16 +00:00
Niklas Haas 3c1781f931 tests/checkasm/sw_ops: separate op compilation from testing
This commit is purely moving around code; there is no functional change.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-15 14:51:16 +00:00
Niklas Haas e83de76f08 tests/checkasm/sw_ops: check all planes in CHECK_COMMON()
This can help e.g. properly test that the masked/excluded components are
left unmodified.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-15 14:51:16 +00:00
Niklas Haas eac90ce6ce tests/checkasm/sw_ops: set correct plane index order
All four components were accidentally being read/written to/from the same
plane.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-15 14:51:16 +00:00
Niklas Haas 590eb4b70d tests/checkasm/sw_ops: remove some unnecessary checks
These don't actually exist at runtime, and will soon be removed from the
backends as well.

This commit is intentionally a bit incomplete; as I will rewrite this
based on the auto-generated macros in the upcoming ops_micro series.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-15 14:51:16 +00:00
Stéphane Cerveau 3f9e04b489 vulkan: fix encode feedback query handling
Check that the driver supports both BUFFER_OFFSET and BYTES_WRITTEN
encode feedback flags before creating the query pool, failing with
EINVAL if either is missing.

Set these flags explicitly instead of masking off HAS_OVERRIDES with a
bitwise NOT, which could pass unrecognized bits from newer drivers to
vkCreateQueryPool causing validation errors and
crashes.
2026-04-14 21:31:45 +00:00
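The difference between masking with a bitwise NOT and setting the flags explicitly can be sketched as below. The `DEMO_*` constants and both functions are hypothetical and only stand in for the Vulkan encode feedback flags; this is not the code from the commit.

```c
#include <stdint.h>

/* Hypothetical flag bits standing in for the encode feedback flags. */
#define DEMO_FEEDBACK_BUFFER_OFFSET (1u << 0)
#define DEMO_FEEDBACK_BYTES_WRITTEN (1u << 1)
#define DEMO_FEEDBACK_HAS_OVERRIDES (1u << 2)

/* Old approach: forward everything the driver advertises except
 * HAS_OVERRIDES. Bits this code does not know about leak through. */
static uint32_t flags_by_masking(uint32_t supported)
{
    return supported & ~DEMO_FEEDBACK_HAS_OVERRIDES;
}

/* New approach: require the two known flags and request only those.
 * Returns 0 on success, -1 if a required flag is missing. */
static int flags_explicit(uint32_t supported, uint32_t *out)
{
    const uint32_t required = DEMO_FEEDBACK_BUFFER_OFFSET |
                              DEMO_FEEDBACK_BYTES_WRITTEN;
    if ((supported & required) != required)
        return -1;
    *out = required;
    return 0;
}
```

If a newer driver advertises an extra bit (bit 3 in this sketch), the masking variant passes it on unchanged, while the explicit variant requests only the two flags it understands.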
Vignesh Venkat c8dd769217 ffprobe: Support printing SMPTE 2094 APP5 side data
Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com>
2026-04-14 20:41:14 +00:00
Vignesh Venkat 37aefb6e40 avcodec/dav1d: Support parsing smpte 2094-50 metadata
Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com>
2026-04-14 20:35:57 +00:00
Andreas Rheinhardt d5fc732359 avcodec/codec_internal: Include avcodec.h for enum AVCodecConfig
Forward-declaring an enum is not legal C (the underlying type of
the enum may depend upon the enum constants, so this may cause
ABI issues with -fshort-enums); compilers warn about this
with -pedantic.

This essentially reverts 7e84865cff.
Notice that almost* all files that include codec_internal.h also
need to include avcodec.h, so this does not lead to unnecessary
rebuilds.

This addresses part of #22684.

*: The only file I am aware of that defines an FFCodec and does not
need AVCodecContext as complete type is null.c (but even it already
includes it implicitly); the avcodec.c test tool seems to be the only
file where this commit actually leads to an unnecessary avcodec.h
inclusion.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-14 16:04:47 +02:00
Andreas Rheinhardt fc8c6d4665 swscale/swscale: Remove ineffective check
If any of the dstStrides is not aligned mod 16, the warning
above this one will be triggered, setting stride_unaligned_warned,
so that the following check for stride_unaligned_warned will
be always false.

Reviewed-by: Niklas Haas <ffmpeg@haasn.dev>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-14 15:22:49 +02:00
Paul Adenot 6c114bd6fa avcodec/vp9: Rollback dimensions when format is rejected
Fixes: BMO#2029296

Found-by: Mozilla Security Team, Paul Adenot for the write variant
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-04-14 01:42:53 +00:00
Kacper Michajłow 1092852406 swscale/ops: remove type from continuation functions
The glue code doesn't care about types, as long as the functions are
chained correctly. Let's not pretend there is any type safety there, as
the function pointers were cast anyway from unrelated types.
Particularly some f32 and u32 are shared.

This fixes errors like so:
src/libswscale/ops_tmpl_int.c:471:1: runtime error: call to function linear_diagoff3_f32 through pointer to incorrect function type 'void (*)(struct SwsOpIter *, const struct SwsOpImpl *, unsigned int *, unsigned int *, unsigned int *, unsigned int *)'
libswscale/ops_tmpl_float.c:208: note: linear_diagoff3_f32 defined here

Fixes: #22332
2026-04-13 23:28:30 +00:00
Kacper Michajłow 9a2a0557ad swscale/ops: remove optimize attribute from op functions
It was added to force auto vectorization on GCC builds. Since then, auto
vectorization has been enabled for the whole code base in 1464930696.

According to the GCC documentation, the optimize attribute should be used
for debugging purposes only. It is not suitable in production code.

In particular, it's unclear whether the attribute is applied, as it is
actually lost when the function is inlined, so using it is quite fragile.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-04-13 23:28:30 +00:00
Michael Niedermayer 29a0973855 avformat/rtpdec_qdm2: Check block_size
Fixes: out of array access
no testcase

Found-by: Joshua Rogers <joshua@joshua.hu> with ZeroPath
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-04-13 20:19:37 +00:00
Andreas Rheinhardt 660d6ece1b avutil/tests/.gitignore: Add recently added test tools
Reviewed-by: Marvin Scholz <epirat07@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-13 16:12:31 +02:00
Ramiro Polla 49a2d426e9 configure: collapse else + if into elif 2026-04-13 12:46:49 +00:00
Ramiro Polla 31f280e1e6 configure: add missing quotes around user-specified tool paths
Found-by: Luke Jolliffe <luke.jolliffe@bbc.co.uk>
2026-04-13 12:46:49 +00:00
Ramiro Polla c1d4fe8b44 configure: fix html docs generation when makeinfo is disabled
The makeinfo_html variable wasn't being disabled when the makeinfo test
failed, which prevented texi2html from being probed.

Fixes 589da160b2.

Found-by: Luke Jolliffe <luke.jolliffe@bbc.co.uk>
2026-04-13 12:46:49 +00:00
Zhao Zhili b62ae766c1 avfilter/vf_ssim360: fix integer overflow in tape_length allocation
tape_length * 8 overflows 32-bit int for large input widths. Then
av_malloc_array() allocates a tiny buffer while the subsequent
loop writes tape_length*8 BilinearMap entries, causing
heap-buffer-overflow.

Validate the value in float before converting to int and left
shifting, to avoid both float-to-int and signed left shift
overflow UB. Also split av_malloc_array() arguments to avoid
the multiplication overflow.

Fixes: #21511

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2026-04-13 19:49:32 +08:00
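The two parts of this fix (validating in float before converting, and splitting the allocation arguments) can be sketched roughly as follows. This is an illustration only: `DemoMap` and `alloc_tape` are hypothetical, plain `calloc()` stands in for av_malloc_array(), and the sketch multiplies where the filter shifts.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical map entry; the real BilinearMap layout is not shown here. */
typedef struct DemoMap { int x, y; float w[4]; } DemoMap;

/* Allocate tape_length * 8 entries without overflowing int.
 * Returns NULL if the requested length is out of range (or NaN). */
static DemoMap *alloc_tape(float tape_length_f, int *tape_length)
{
    /* Validate while still in float: the value must be at least 1 and
     * small enough that tape_length * 8 still fits in an int. */
    if (!(tape_length_f >= 1.0f) || tape_length_f >= (float)(INT_MAX / 8))
        return NULL;
    *tape_length = (int)tape_length_f;

    /* Split the arguments so the multiplication happens in size_t inside
     * the allocator, which checks nmemb * size for overflow itself. */
    return calloc((size_t)*tape_length, 8 * sizeof(DemoMap));
}
```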
Zhao Zhili b796d72eb2 fftools/ffmpeg_filter: skip autoscale for hardware format
This fixes the following failure:
ffmpeg -hwaccel cuda -hwaccel_output_format cuda \
  -i The_Beauty_of_Earth-1.mp4 \
  -vf scale_cuda=2880:1440 \
  -c:v hevc_nvenc \
  -pix_fmt cuda \
  -b:v 8M -c:a copy \
  -y test_scale.mp4

> Reconfiguring filter graph because hwaccel changed
> Impossible to convert between the formats supported by the filter
> 'Parsed_scale_cuda_0' and the filter 'auto_scale_0'.
> Error reinitializing filters!

Signed-off-by: Zhao Zhili <quinkblack@foxmail.com>
2026-04-13 19:46:54 +08:00
Zhao Zhili a85a8e6757 configure: fix VSX remaining enabled when -mvsx is unsupported
When check_cflags -mvsx fails, the && short-circuit prevents
check_cc from running. Since check_cc is responsible for
disabling vsx on failure, skipping it leaves vsx incorrectly
enabled.

Fix by removing the && so check_cc always executes.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2026-04-13 11:45:36 +00:00
Andreas Rheinhardt 32678dcc88 avcodec/x86/snowdsp_init: Remove disabled SSE2 functions
Disabled in 3e0f7126b5
(almost 20 years ago) and no one fixed them, so remove them.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-13 12:56:35 +02:00
Andreas Rheinhardt bd2964e611 avcodec/x86/snowdsp_init: Use standard init pattern
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-13 12:56:01 +02:00
Andreas Rheinhardt 338dc25642 avcodec/x86/snowdsp_init: Remove MMXEXT, SSE2 inner_add_yblock versions
They have been superseded by SSSE3; the SSE2 version was even disabled
(and segfaults if enabled).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-13 12:53:17 +02:00
Andreas Rheinhardt 5c830fccf4 avcodec/x86/snowdsp: Add SSSE3 inner_add_yblock
Compared to the MMX version, this version benefits from wider
registers and pmaddubsw. It also has fewer unnecessary loads
and stores: On x64, the MMX version has 12 unnecessary GPR loads
and 6 stores in each line when width is eight; for width 16,
there are 17 unnecessary GPR loads and six stores per line.
Even the 32bit SSSE3 version only has six loads and zero stores
per line more than the x64 version. Furthermore, in contrast
to the MMX version, the SSSE3 version also does not clobber
the array of block pointers given to it.

Benchmarks:
inner_add_yblock_2_c:                                   29.2 ( 1.00x)
inner_add_yblock_2_mmx:                                 32.5 ( 0.90x)
inner_add_yblock_2_ssse3:                               28.6 ( 1.02x)
inner_add_yblock_4_c:                                   85.2 ( 1.00x)
inner_add_yblock_4_mmx:                                 89.2 ( 0.96x)
inner_add_yblock_4_ssse3:                               84.5 ( 1.01x)
inner_add_yblock_8_c:                                  302.0 ( 1.00x)
inner_add_yblock_8_mmx:                                 77.0 ( 3.92x)
inner_add_yblock_8_ssse3:                               30.6 ( 9.85x)
inner_add_yblock_16_c:                                1164.7 ( 1.00x)
inner_add_yblock_16_mmx:                               260.4 ( 4.47x)
inner_add_yblock_16_ssse3:                              82.3 (14.15x)

Both the MMX and SSSE3 versions leave the size 2 and 4 cases
to ff_snow_inner_add_yblock_c() (but the MMX version has
a prologue at the beginning that it needs to undo before
the call, leading to the higher overhead for these sizes).
I don't know why the SSSE3 version is marginally faster than
the C version in these cases.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-13 12:51:35 +02:00
Andreas Rheinhardt 2fdccaf7d6 tests/checkasm/mpegvideo_unquantize: Fix precedence problem
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-13 12:51:35 +02:00
Andreas Rheinhardt 4f30bd6fba tests/checkasm/llvidencdsp: Fix nonsense randomization
The first loop was never entered due to a precedence problem;
the second loop initialized everything, although it was not intended
that way.
This has been added in 56b8769a1c.
Sorry for this.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-13 12:51:34 +02:00
Andreas Rheinhardt e0ed3fa834 tests/checkasm: Add snowdsp test
Only inner_add_yblock for now.
Hint: Said function uses a pointer to an array of pointers as parameter.
The MMX version clobbers the array in such a way that calling the
function repeatedly with the same arguments (as happens inside bench_new())
leads to buffer overflows and segfaults. Therefore CALL4 had to be
overridden to restore the original pointers. This workaround will be
removed soon when the MMX version is removed.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-13 12:46:24 +02:00
Andreas Rheinhardt 764e021946 avcodec/snowdata: Add explicit alignment for obmc tables
This is in preparation for adding SSSE3 assembly.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-13 12:46:24 +02:00
Andreas Rheinhardt 28d0a5091a avcodec/snow_dwt: Remove pointless forward declaration
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-13 12:46:24 +02:00
Andreas Rheinhardt 5f373872c0 avcodec/x86/snow_dwt: Avoid slice_buffer in inner_add_yblock
It is unnecessary and avoids the src_y parameter;
it also makes this function more ASM-friendly.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-13 12:46:24 +02:00
Andreas Rheinhardt fd77f00a8f avcodec/snow: Avoid always-true branch
The input lines used in ff_snow_inner_add_yblock()
must always be set (because their values are used).
The MMX assembly always relied on this.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-13 12:46:24 +02:00
Andreas Rheinhardt 13d621cc7c avcodec/snow: Disable dead code in ff_snow_inner_add_yblock()
It is only used with add != 0 (and the assembly functions
only support this case).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-13 12:46:24 +02:00
Andreas Rheinhardt eed0830a0c avcodec/snowdata: Don't use 8 bits for six bits data
This has been done in 561a18d3ba
in order to avoid shifts, yet this rationale no longer applies
since d593e32983. So shift them back;
this is in preparation for using these coefficients together with
pmaddubsw.

Hint: 561a18d3ba also added a block
guarded by "if(LOG2_OBMC_MAX == 8". I changed the condition to remove
this check (i.e. kept the block) which should not change the output
at all. Yet all FATE tests pass if the block is completely
removed. I don't know if this block is necessary at all.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-13 12:46:24 +02:00
Andreas Rheinhardt 761b6f2359 swscale/x86/output: Remove obsolete MMXEXT function
Possible now that the SSE2 function is available
even when the stack is not aligned.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-13 08:46:44 +02:00
Andreas Rheinhardt 8a7c1f7fb8 swscale/x86/output: Make xmm functions usable even without aligned stack
x86-32 lacks one GPR, so it needs to be read from the stack.
If the stack needs to be realigned, we can no longer access
the original location of one argument, so just request a bit
more stack size and copy said argument at a fixed offset from
the new stack.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-13 08:46:44 +02:00
Andreas Rheinhardt 0bb161fd09 swscale/x86/output: Simplify creating dither register
Only the lower quadword needs to be rotated, because
the register is zero-extended immediately afterwards anyway.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-13 08:46:44 +02:00
Andreas Rheinhardt f5c5bca803 swscale/x86/scale: Remove always-false mmsize checks
Forgotten in a05f22eaf3.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-13 08:46:44 +02:00
Andreas Rheinhardt 999ccf6495 swresample/x86/{audio_convert,rematrix}: Remove remnants of MMX
Forgotten in 2b94f23b06,
4e51e48ebd and
374b3ab03c.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-13 01:16:46 +02:00
Andreas Rheinhardt e29c7089d2 avcodec/x86/vp8dsp_loopfilter: Remove always-true mmsize checks
Forgotten in 6a551f1405.
Also fix the comment claiming that there are MMXEXT functions
in this file.

Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-13 00:41:22 +02:00
Andreas Rheinhardt 9f560c8c1a avcodec/x86/vp3dsp: Remove unused macros
Forgotten in a677b38298.

Reviewed-by: Ronald S. Bultje <rsbultje@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-13 00:41:22 +02:00
Jun Zhao 411484e8c9 lavc/videotoolbox_vp9: fix vpcC flags offset
Write the 24-bit vpcC flags field at the current cursor position after
the version byte. The previous code wrote to p+1 instead of p, leaving
one byte uninitialized between version and flags and shifting all
subsequent fields (profile, level, bitdepth, etc.) by one byte.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-04-12 22:15:51 +00:00
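A cursor-style writer makes this kind of off-by-one harder to reintroduce. The sketch below is not the videotoolbox_vp9 code; `put_byte`, `put_be24` and `write_vpcc_prefix` are hypothetical helpers, and only the first few vpcC fields are written.

```c
#include <stdint.h>

/* Hypothetical helpers: write a value at p and return the advanced cursor. */
static uint8_t *put_byte(uint8_t *p, uint8_t v)
{
    *p++ = v;
    return p;
}

static uint8_t *put_be24(uint8_t *p, uint32_t v)
{
    p[0] = (v >> 16) & 0xff;
    p[1] = (v >>  8) & 0xff;
    p[2] =  v        & 0xff;
    return p + 3;
}

/* Write version (1 byte), flags (3 bytes), profile and level.
 * Returns the number of bytes written. */
static int write_vpcc_prefix(uint8_t *buf, uint8_t version, uint32_t flags,
                             uint8_t profile, uint8_t level)
{
    uint8_t *p = buf;
    p = put_byte(p, version);
    p = put_be24(p, flags);   /* at the cursor itself, not at p + 1 */
    p = put_byte(p, profile);
    p = put_byte(p, level);
    return (int)(p - buf);
}
```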
Jun Zhao 57397a683d lavc/videotoolboxenc: return SEI parse errors
Return the actual find_sei_end() error when SEI appending fails instead of
reusing the previous status code. This preserves the real parse failure for
callers instead of reporting malformed SEI handling as success.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-04-12 22:15:51 +00:00
Niklas Haas b09d57c41d avfilter/buffersrc: re-add missing overflow warning
This was originally introduced by commit 05d6cc116e. During the FFmpeg-libav
split, this function was refactored by commit 7e350379f8 into
av_buffersrc_add_frame(), replacing av_buffersrc_add_ref(). The new function
did not include the overflow warning, despite the same being done for
buffersink.

Then, when commit a05a44e205 merged the two functions back together, the
libav implementation was favored over the FFmpeg implementation, silently
removing the overflow warning in the process.

This commit re-adds that missing warning.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-12 20:02:18 +00:00
nyanmisaka ab7b6ef0a2 hwcontext_vulkan: fix double free when vulkan_map_to_drm fails
The multiplanar image with storage_bit enabled fails to be exported
to DMA-BUF on the QCOM turnip driver, thus triggering this double-free issue.

```
[Parsed_hwmap_2 @ 0xffff5c002a70] Configure hwmap vulkan -> drm_prime.
[hwmap @ 0xffff5c001180] Filter input: vulkan, 1920x1080 (0).
[AVHWFramesContext @ 0xffff5c004e00] Unable to export the image as a FD!
free(): double free detected in tcache 2
Aborted
```

Additionally, add back an av_unused attribute. Otherwise, the compiler
will complain about unused variables when CUDA is not enabled.

Signed-off-by: nyanmisaka <nst799610810@gmail.com>
2026-04-12 20:50:38 +08:00
zuxy 56b97c03d4 avcodec/x86/h264_intrapred: Replace pred8x8_top_dc_8_mmxext with SSE2
More about deprecating MMX than any performance gain; nearly identical
performance numbers on my Zen 4 (1.36x vs c), but llvm-mca predicts
>60% perf gain on Intel CPUs newer than Skylake.

Signed-off-by: Zuxy Meng <zuxy.meng@gmail.com>
2026-04-11 19:11:46 -07:00
Niklas Haas c29465bcb6 swscale/x86/ops: use plain ret instruction
The original intent here was probably to make the ops code agnostic to
which operation is actually last in the list, but the existence of a
divergence between CONTINUE and FINISH already implies that we hard-code
the assumption that the final operation is a write op.

So we can just massively simplify this with a call/ret pair instead of
awkwardly exporting and then jumping back to the return label. This actually
collapses FINISH down into just a plain RET, since the op kernels already
don't set up any extra stack frame.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-11 16:30:15 +00:00
Tymur Boiko f7ca6f7481 vulkan: fix -Wdiscarded-qualifiers warning and misleading DRM modifier log
ff_vk_find_struct returns const void *, so storing it in const void *drm_create_pnext
fixes the initialization warning but then dpb_hwfc->create_pnext = drm_create_pnext
assigns const void * to void *, triggering the same warning at that line. The right
fix is a (void *) cast at the call site, same as done for buf_pnext.

Also restrict the GetPhysicalDeviceImageFormatProperties2 verbose log in
try_export_flags to the DRM modifier path only: when has_mods is false the log
always printed mod[0]=0x0, which is misleading since no DRM modifier is involved.

Signed-off-by: Tymur Boiko <tboiko@nvidia.com>
2026-04-11 12:50:07 +00:00
Kacper Michajłow eaadd05232 .forgejo/CODEOWNERS: add myself for hls.*
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-04-11 01:58:35 +02:00
Kacper Michajłow 721545a3c2 MAINTAINERS: add myself as HLS demuxer maintainer
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-04-11 01:58:35 +02:00
Kacper Michajłow cc41e6a462 tests/fate/hlsenc: add hls-event-no-endlist test
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-04-11 01:58:34 +02:00
Kacper Michajłow 6d98a9a2e8 avformat/hls: fix seeking in EVENT playlists that start mid-stream
HLS EVENT playlists (e.g. Twitch VODs) are seekable but not finished,
so live_start_index causes playback to begin near the end. The first
packet's DTS then becomes first_timestamp, creating a wrong mapping
between timestamps and segments.

Fix this by subtracting the cumulative duration of skipped segments from
first_timestamp so it reflects the true start of the playlist.

Also set per-stream start_time from first_timestamp so the correct time is
reported, and reset pts_wrap_reference on seek to prevent bogus
wraparounds.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-04-11 01:58:34 +02:00
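The timestamp correction amounts to a simple subtraction. A minimal sketch, assuming all values share one time base; `playlist_start_ts` and its parameters are hypothetical names, not the ones used in hls.c:

```c
#include <stdint.h>

/* Estimate the timestamp of the playlist's true start when playback began
 * at segment `first_seg` instead of segment 0: subtract the cumulative
 * duration of the skipped segments from the first packet's DTS. */
static int64_t playlist_start_ts(int64_t first_pkt_dts,
                                 const int64_t *seg_duration, int first_seg)
{
    int64_t skipped = 0;
    for (int i = 0; i < first_seg; i++)
        skipped += seg_duration[i];
    return first_pkt_dts - skipped;
}
```

For example, if three 10-unit segments were skipped and the first packet has DTS 1030, the playlist start maps to 1000.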
Niklas Haas ef13a29d08 avfilter/framepool: fix frame pool uninit check
Fixes a memory leak caused by AV_MEDIA_TYPE_VIDEO == 0 being excluded by
the !pool->type check. We can just remove the entire check because
av_buffer_pool_uninit() is already safe on NULL.

Fixes: fe2691b3bb
Reported-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-10 22:02:00 +02:00
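The pitfall here is using an enum value of 0 as an "unset" marker. A minimal sketch with hypothetical types (`DemoPool`, `DEMO_TYPE_VIDEO`); it does not reproduce the real FFFramePool:

```c
/* Hypothetical media type enum where the video type has the value 0. */
enum DemoMediaType { DEMO_TYPE_VIDEO = 0, DEMO_TYPE_AUDIO = 1 };

typedef struct DemoPool {
    enum DemoMediaType type;
    int live_buffers;          /* stands in for resources to release */
} DemoPool;

/* Buggy: "!pool->type" is true for video pools, so they are never freed. */
static void uninit_buggy(DemoPool *pool)
{
    if (!pool || !pool->type)
        return;
    pool->live_buffers = 0;
}

/* Fixed: no type check; releasing is safe for every pool. */
static void uninit_fixed(DemoPool *pool)
{
    if (!pool)
        return;
    pool->live_buffers = 0;
}
```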
Alexandru Ardelean e43aab67ed avdevice/v4l2: rename 'buff_data' -> 'buf_desc'
Since we've added a 'buf_data' struct, rename this to avoid any confusion
about this one.

Signed-off-by: Alexandru Ardelean <aardelean@deviqon.com>
2026-04-10 16:02:28 +00:00
Alexandru Ardelean 1011e4d647 avdevice/v4l2: wrap buf_start and buf_len into a struct
This reduces the number of malloc() & free() calls, and structures the
data for the buffers a bit more neatly.
In case more per-buffer data needs to be added, having a separate struct
is useful.

Signed-off-by: Alexandru Ardelean <aardelean@deviqon.com>
2026-04-10 16:02:28 +00:00
Alexandru Ardelean 24adcf3a72 avdevice/v4l2: fix potential memleak when allocating device buffers
In the loop which allocates the buffers for a V4L2 device, if failure
occurs for a certain buffer (e.g. 3rd of 4 buffers), then the previously
allocated buffers (and the buffer array) would not be freed in
mmap_init(). This would cause a leak.

This change handles the error cases of that loop to free all allocated
resources, so that when mmap_init() fails nothing is leaked.

Signed-off-by: Alexandru Ardelean <aardelean@deviqon.com>
2026-04-10 16:02:28 +00:00
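The unwind-on-failure pattern described above can be sketched as follows. This is an assumption-laden illustration: `DemoBuf`, `alloc_bufs` and `free_bufs` are hypothetical, `malloc()` stands in for the mmap() calls of the real device code, and `fail_at` exists only to simulate a failure.

```c
#include <stdlib.h>

typedef struct DemoBuf { void *start; size_t len; } DemoBuf;

/* Allocate `count` buffers of `len` bytes. If buffer `fail_at` fails
 * (simulated), release everything allocated so far and return NULL. */
static DemoBuf *alloc_bufs(int count, size_t len, int fail_at)
{
    DemoBuf *bufs = calloc((size_t)count, sizeof(*bufs));
    if (!bufs)
        return NULL;
    for (int i = 0; i < count; i++) {
        bufs[i].start = (i == fail_at) ? NULL : malloc(len);
        if (!bufs[i].start) {
            while (--i >= 0)          /* free buffers 0 .. i-1 */
                free(bufs[i].start);
            free(bufs);               /* and the array itself */
            return NULL;
        }
        bufs[i].len = len;
    }
    return bufs;
}

/* Release a fully allocated buffer array. */
static void free_bufs(DemoBuf *bufs, int count)
{
    for (int i = 0; i < count; i++)
        free(bufs[i].start);
    free(bufs);
}
```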
Niklas Haas 0e983a0604 swscale: align allocated frame buffers to SwsPass hints
This avoids hitting the slow memcpy fallback paths altogether, whenever
swscale.c is handling plane allocation.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-10 15:12:18 +02:00
Niklas Haas b5573a8683 swscale/ops_dispatch: cosmetic
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-10 15:12:18 +02:00
Niklas Haas 3a15990368 swscale/ops_dispatch: forward correct pass alignment
As a consequence of the fact that the frame pool API doesn't let us directly
access the linesize, we have to "un-translate" the over_read/write back to
the nearest multiple of the pixel size.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-10 15:12:18 +02:00
Niklas Haas 5441395a48 swscale/graph: add optimal alignment/padding hints
Allows the pass buffer allocator to make smarter decisions based on the actual
alignment requirements of the specific pass.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-10 15:12:18 +02:00
Niklas Haas 2deca0ec19 swscale: clean up allocated frames on error
Matches the semantics of sws_frame_begin(), which also cleans up any
allocated buffers on error.

This is an issue introduced by the commit that allowed ff_sws_graph_run()
to fail in the first place.

Fixes: 563cc8216b
2026-04-10 15:12:18 +02:00
Niklas Haas 6c89a30ecd swscale: add FFFramePool and use it for allocating planes
The major consequence of this is that we start allocating buffers per plane,
instead of allocating one contiguous buffer. This makes the no-op/refcopy
case slightly slower, but doesn't meaningfully affect the rest:

yuva444p -> yuva444p, time=157/1000 us (ref=78/1000 us), speedup=0.497x slower
Overall speedup=1.016x faster, min=0.983x max=1.092x

However, this is a necessary consequence of the desire to allow partial plane
allocations / single plane refcopies. This slowdown also does not affect
vf_scale, which already uses avfilter/framepool.c (via ff_get_video_buffer).

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-10 15:12:18 +02:00
Niklas Haas fe2691b3bb avfilter/framepool: stack-allocate FFFramePool
Saves a pointless free/alloc cycle on reinit. For the vast majority of filter
links, this is going to be allocated anyway; and on the occasions that it's not,
the waste is marginal.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-10 15:12:18 +02:00
Niklas Haas a2ca55c563 avfilter/framepool: remove unnecessary braces (style)
As per the FFmpeg coding style guidelines, braces should be avoided on
isolated single-line statement bodies.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-10 15:12:18 +02:00
Niklas Haas 5c4490a0a6 avfilter/framepool: fix whitespace (cosmetic)
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-10 15:12:18 +02:00
Niklas Haas 38543781cc avfilter/framepool: move variable declarations to site of definition
This is not C89 anymore.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-10 15:12:18 +02:00
Niklas Haas 6efbd99e48 avfilter/framepool: remove check for impossible condition
FFALIGN(..., pool->align) = (...) & ~(pool->align - 1), so this condition
equates to: ((...) & ~(align - 1) & (align - 1)), which is trivially 0.

(Note that all expressions are of type `int`)
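
The argument can be checked with a small sketch. ALIGN_UP below mirrors the
usual definition of FFALIGN as I understand it (round up, then mask), and
misaligned_after_align() is a hypothetical helper written only for this
illustration; the alignment is assumed to be a power of two.

```c
/* Round x up to a multiple of a; a must be a power of two.
 * Assumed to match FFALIGN from libavutil. */
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Returns 1 if any aligned value in [0, limit) still has low bits set.
 * Since the result was just masked with ~(align - 1), masking it again
 * with (align - 1) can only ever give 0. */
static int misaligned_after_align(int align, int limit)
{
    for (int x = 0; x < limit; x++)
        if (ALIGN_UP(x, align) & (align - 1))
            return 1;
    return 0;
}
```

For any power-of-two alignment the helper returns 0, so a check of that form
is dead code.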

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-10 15:12:18 +02:00
Niklas Haas 0b43b8ef31 avfilter/framepool: make FFFramePool public
This struct is overall pretty trivial and there is little to no internal
state or invariants that need to be protected.

Making it public allows e.g. libswscale to allocate buffers for individual
planes directly.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-10 15:12:18 +02:00
Niklas Haas 3e99631873 avfilter/framepool: remove pointless ternary (cosmetic)
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-10 15:12:18 +02:00
Niklas Haas 53ce7265ab avfilter/framepool: use strongly typed union of pixel/sample format
Replacing the generic `int format` field. This aids in debugging, as
e.g. gdb will tend to translate the strongly typed enums back into human
readable names automatically.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-10 15:12:18 +02:00
Niklas Haas 9034587d10 avfilter/framepool: make ff_sws_frame_pool_{audio,video}_init static
Not used outside of the (strictly more flexible) _reinit() anymore.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-10 15:12:18 +02:00
Niklas Haas 39ecc89988 avfilter/framepool: nuke ff_frame_pool_get_*_config
This helper is of dubious utility - it was only used to reinitialize the
frame pools, which is better handled by `ff_frame_pool_reinit()`, and at
present only serves to make extending the API harder.

Users who really need to randomly query the state of the frame pool can
already keep track of the values they set.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-10 15:12:18 +02:00
Niklas Haas a0510f7f72 avfilter/framepool: update frame dimensions on reinit
The previous logic (ported from libavfilter/video.c) would leave the frame
pool intact if the linesize did not change as a result of changing the frame
dimensions. However, this caused ff_default_get_video_buffer2() to return
frames with the old width/height.

I think this bug was avoided in practice because the only filters to actually
support changing the resolution at runtime already always explicitly overrode
the width/height of allocated output buffers by the link properties.
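
A heavily simplified sketch of the stale-dimension problem, assuming a
single 8-bit plane. struct pool, pool_init(), reinit_old() and reinit_new()
are hypothetical names for this illustration and do not reflect the real
FFFramePool layout or the exact condition used by the old code.

```c
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Hypothetical pool; the real FFFramePool has more fields. */
struct pool {
    int width, height;
    int linesize;   /* bytes per row */
    int generation; /* bumped whenever buffers would be reallocated */
};

static void pool_init(struct pool *p, int w, int h, int align)
{
    p->width    = w;
    p->height   = h;
    p->linesize = ALIGN_UP(w, align);
    p->generation++;
}

/* Old behaviour as described: leave the pool untouched if the buffer
 * layout is unchanged, which also leaves the old width in place. */
static void reinit_old(struct pool *p, int w, int h, int align)
{
    if (ALIGN_UP(w, align) == p->linesize && h == p->height)
        return;
    pool_init(p, w, h, align);
}

/* New behaviour: always record the new dimensions, and only reallocate
 * when the buffer layout actually changes. */
static void reinit_new(struct pool *p, int w, int h, int align)
{
    int linesize = ALIGN_UP(w, align);
    int changed  = linesize != p->linesize || h != p->height;

    p->width  = w;
    p->height = h;
    if (changed) {
        p->linesize = linesize;
        p->generation++;
    }
}
```

Going from width 1920 to 1900 with 32-byte alignment keeps the linesize at
1920, so reinit_old() would keep reporting 1920 while reinit_new() reports
1900 without reallocating.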

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-10 15:12:18 +02:00
Niklas Haas ad7956d5bb avfilter/{audio,video}: switch to ff_frame_pool_*_reinit()
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-10 15:12:17 +02:00
Niklas Haas 083a014746 avfilter/framepool: add ff_frame_pool_*_reinit() helpers
This moves the check-uninit-reinit logic out of audio.c/video.c and into
framepool.c, where it can be more conveniently re-used by future users.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-10 15:12:17 +02:00
Niklas Haas 143b810e75 avfilter/framepool: remove alloc argument
Not really needed by anything and makes this API a bit clunkier to extend.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-10 15:12:17 +02:00
Niklas Haas 82871857eb avfilter/framepool: actually use specified allocator for audio frames
This bug meant audio buffers were never correctly zeroed as intended by
libavfilter/audio.c.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-10 15:12:17 +02:00
Tymur Boiko 25e187f849 vulkan: fix DRM map, decode barriers, and video frame setup for modifier output
When mapping Vulkan Video frames to DMA-BUF, synchronize using an exportable
binary semaphore and sync_fd where supported. Submit a lightweight exec that
waits on each plane's timeline semaphore at the current value, signals a
SYNC_FD-exportable binary semaphore, then export with vkGetSemaphoreFdKHR.
Store that binary semaphore in AVVkFrameInternal and reuse it across maps
instead of creating and destroying each time: for
VK_EXTERNAL_SEMAPHORE_HANDLE_TYPE_SYNC_FD_BIT, copy transference means a
successful vkGetSemaphoreFdKHR unsignals the semaphore like a wait, so it can
be signaled again on the next map submit. If export is unavailable, fall back
to vkWaitSemaphores.

Moved drm_sync_sem destroy to vulkan_free_internal

Export dma-buf fds with GetMemoryFdKHR for each populated f->mem[i], iterating
up to the sw_format plane count instead of stopping at the image count, so
multi-memory bindings are not skipped. Describe DRM layers using
max(sw planes, image count) and query subresource layout with the correct
aspect and image index when one VkImage backs multiple planes. Reference the
source hw_frames_ctx on the mapped frame and close dma-buf fds on failure paths.

For DMA-BUF-capable pools, honor VK_EXTERNAL_MEMORY_FEATURE_DEDICATED_ONLY_BIT
from format export queries when binding memory. With DRM modifiers and a
video profile in create_pnext, preserve caller usage and image flags instead of
overwriting them from generic supported_usage probing; use the modifier list
create info when probing export flags for modifier tiling.

Include VK_IMAGE_USAGE_VIDEO_DECODE_DPB_BIT_KHR from the output frames
context's usage together with DST (fixes
VUID-VkVideoBeginCodingInfoKHR-slotIndex-07245) instead of adding DPB usage
only when !is_current.

In ff_vk_decode_add_slice, pass VkVideoProfileListInfoKHR (from the output
frames context's create_pnext) as the pNext argument to
ff_vk_get_pooled_buffer instead of the full create_pnext chain. In
ff_vk_frame_params, set tiling to OPTIMAL only when it is not already
DRM_FORMAT_MODIFIER_EXT. In ff_vk_decode_init, when the output pool's
create_pnext includes VkImageDrmFormatModifierListCreateInfoEXT, initialize the
DPB pool with that modifier-list pNext and DRM_FORMAT_MODIFIER_EXT tiling;
otherwise use VkVideoProfileListInfoKHR and OPTIMAL as before. When
VK_VIDEO_DECODE_CAPABILITY_DPB_AND_OUTPUT_DISTINCT_BIT_KHR is unset, the output
and DPB pools cannot use different layouts or tiling, so the DPB pool must
match the output pool.

Also fix av_hwframe_map ioctl sync_fd export, multi-planar semaphore handling,
and related failure-path cleanup.

Signed-off-by: Tymur Boiko <tboiko@nvidia.com>
2026-04-10 11:39:40 +00:00
James Almer 492e6e68dc doc/APIchanges: fix date and version in the latest entry
Signed-off-by: James Almer <jamrial@gmail.com>
2026-04-09 17:21:28 -03:00
Vignesh Venkat 6ba6db4f19 libavutil: Add functions for SMPTE-2094-50 HDR metadata
SMPTE-2094-50 is an upcoming standard that is close to being
finalized.

Define a side data type for carrying this metadata. And add
functions for parsing and writing it. This is very similar to
the handling of HDR10+ metadata.

The spec is available here: https://github.com/SMPTE/st2094-50

Signed-off-by: Vignesh Venkatasubramanian <vigneshv@google.com>
2026-04-09 20:01:00 +00:00
Lynne 283faf55f8 Changelog: add v360_vulkan to the changelog 2026-04-09 16:51:30 +02:00
Lynne d3d0b7a5ee lavfi/v360: add a Vulkan-compute based filter
This just adds a Vulkan compute-based 360-degree video conversion.
It implements a sufficient subset of the most popular 360-degree video formats.

Options such as rotation are dynamic and can be adjusted during runtime.

Some of the work was based on Paul B. Mahol's patch from 2020. There
were spots where the arithmetic conversion was incorrect.
2026-04-09 12:31:24 +02:00
Priyanshu Thapliyal 056562a5ff fate/lavf: add PDV round-trip and seek coverage
Add FATE coverage for PDV encoding and decoding via lavf, including
intra and inter frame cases, skip-nokey decoding, and container-level
seek coverage.

Use -strict experimental in the encode commands because the encoder
is marked experimental.
2026-04-09 03:01:43 +00:00
Priyanshu Thapliyal 4c0d563f85 avformat/pdvenc: add Playdate video muxer
Add a muxer for the Playdate PDV container format.

The muxer writes the frame table and packet layout required by the
Playdate runtime. It requires seekable output and a predeclared
maximum number of frames (-max_frames).

Includes validation for single video stream input, dimension and
framerate checks, and bounded payload/table offset checks. The frame
entry table is allocated once in write_header() using max_frames + 1.

Document the muxer in doc/muxers.texi and add a Changelog entry.
2026-04-09 03:01:43 +00:00
Priyanshu Thapliyal 43e5b26c00 avcodec/pdvenc: add Playdate video encoder
Add a native encoder for the Playdate PDV format.

Supports monob (1-bit) video, producing zlib-compressed intra frames
and XOR-based delta frames.
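
As a rough illustration of the XOR delta idea only (not the encoder's actual
code or bitstream layout), a delta frame can be formed by XORing the packed
1-bit rows of the current frame against the previous one. xor_delta() is a
hypothetical helper; the zlib compression step is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* XOR each byte of the current frame against the previous frame.
 * With 1-bit video packed 8 pixels per byte, unchanged pixels become
 * zero bits, which tends to compress well. Applying the same function
 * to (delta, prev) reconstructs the current frame. */
static void xor_delta(uint8_t *dst, const uint8_t *cur,
                      const uint8_t *prev, size_t size)
{
    for (size_t i = 0; i < size; i++)
        dst[i] = cur[i] ^ prev[i];
}
```

Because XOR is its own inverse, a decoder can call
xor_delta(out, delta, prev, size) to recover the current frame.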

Includes bounds checking, overflow guards, correct linesize handling
using ptrdiff_t, and proper buffer allocation ordering.

Mark the encoder as experimental by setting AV_CODEC_CAP_EXPERIMENTAL,
since it has not been validated against Panic's official Playdate
player or SDK.
2026-04-09 03:01:43 +00:00
Michael Niedermayer d0761626cf avcodec/escape130: Initialize old_y_avg
Fixes: use of uninitialized memory

Found-by: Carl Sampson <carl.sampson@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-04-09 01:14:39 +02:00
bird 5c3602abaa avformat/sctp: add size check in sctp_read() matching sctp_write()
Commit 5b98cea4 added a size < 2 guard to sctp_write() to prevent
out-of-bounds access when max_streams is enabled, but the identical
pattern in sctp_read() was not addressed.

When max_streams is non-zero, sctp_read() passes (buf + 2, size - 2)
to ff_sctp_recvmsg(). If size < 2, size - 2 wraps to a large value
on the implicit cast to size_t in the callee.

Add the same guard.
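
The wraparound can be reproduced in isolation. The sketch below is not the
sctp.c code: callee_len() stands in for a callee that takes its length as
size_t, and both payload_len_*() helpers are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a callee that takes the length as size_t. */
static size_t callee_len(size_t len)
{
    return len;
}

/* Unguarded: an int size below 2 becomes a huge size_t after the
 * implicit conversion of (size - 2). */
static size_t payload_len_unguarded(int size)
{
    return callee_len(size - 2);
}

/* Guarded: reject sizes that cannot hold the 2-byte prefix. The real
 * code would return an AVERROR code instead of -1. */
static int payload_len_guarded(int size, size_t *len)
{
    if (size < 2)
        return -1;
    *len = callee_len(size - 2);
    return 0;
}
```

With size == 1 the unguarded variant hands SIZE_MAX to the callee, while the
guarded one fails early.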

Signed-off-by: bird <6666242+bird@users.noreply.github.com>
2026-04-08 20:52:52 +00:00
David Hampton db53fe1ac2 avdevice/alsa.c: Conditionally compile out ESTRPIPE
NetBSD doesn't have this error code, so put in a test to prevent
compiling related code on NetBSD systems.
2026-04-08 20:44:16 +00:00
Andreas Rheinhardt 87f74e4f39 avcodec/g726: Remove dead sample rate check
Checked generically since 39206c5e58.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-08 21:33:43 +02:00
Andreas Rheinhardt 8463ca8dc9 avcodec/g726: Don't return value from g726_reset()
It always returns zero which none of the callers check,
so just return nothing instead.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-08 21:33:40 +02:00
Andreas Rheinhardt e97f52e557 avcodec/g726: Fix indentation
Forgotten in e344c1ea36.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-08 21:33:17 +02:00
Andreas Rheinhardt 12b58b86cc avcodec/x86/rv40dsp: Fix wrong comment
Forgotten in d25b3497f2
and 9abf906800.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-08 21:00:04 +02:00
Andreas Rheinhardt ae5314a6bf avcodec/x86/rv34dsp: Port ff_rv34_idct_add_mmxext to SSSE3
With this commit, the RV30 and RV40 decoders no longer clobber
the fpu state for normal decoding (only error resilience can
still do so).

rv34_idct_add_c:                                        58.1 ( 1.00x)
rv34_idct_add_mmxext:                                   16.5 ( 3.52x)
rv34_idct_add_ssse3:                                    12.2 ( 4.76x)

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-08 21:00:00 +02:00
Andreas Rheinhardt d728f3c808 tests/checkasm/rv34dsp: Add test for rv34_idct_add
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-08 20:59:56 +02:00
Andreas Rheinhardt 4c64a8a986 avcodec/rv34: Remove pointless has_ac variable
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-08 20:59:52 +02:00
Andreas Rheinhardt c48f21f778 avcodec/rv34: Use VLC symbol table to avoid LUTs
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-08 20:54:21 +02:00
James Almer 4c69c5f156 configure: only warn about spirv-headers if vulkan was explicitly requested
Given it's autodetected by default, its checks should neither print warnings nor abort the process.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-04-08 12:29:28 -03:00
James Almer 58752ffcdd avfilter/af_whisper: don't set an AVOption accessible field to read only memory
It should also not be set to an av_malloc'd one given it's not an exported option.

Fixes issue #22741.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-04-08 11:52:44 -03:00
Timo Rothenpieler 5c35d0b880 avformat/tls_gnutls: actually send client cert if one is provided
Without setting this flag, apparently gnutls will only send the client
certificate according to some logic based on what it thinks the server
accepts.
This is not the case a lot of times.
Just force it to send the client cert the user supplied, if one was
supplied, no matter what.

Fixes #22707
2026-04-08 12:26:29 +00:00
Zhao Zhili 9917308cc2 swscale/vulkan: fix dither buffer leak on mapping failure
A failure while preparing a dither buffer leaves the newly allocated
buffer outside the cleanup range, leaking Vulkan resources. Make the
failure path cover the current buffer as well.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2026-04-07 17:53:14 +00:00
Niklas Haas cd1126a3cd swscale/ops: add assertion that comps_src is sane
This assertion guards against bugs like that fixed in the previous
commit.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-07 15:10:04 +00:00
Niklas Haas 420019a313 swscale/ops_optimizer: properly swizzle comps_src when splitting
Fixes a pre-existing latent bug in the subpass splitting, that was
made worse / exposed by 048ca3b367.

Fixes: cba54e9e3b
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-07 15:10:04 +00:00
nyanmisaka dcbfa11c96 avcodec/amfenc: remove the AMF_VIDEO_ENCODER_AV1_CAP_{WIDTH,HEIGHT}_ALIGNMENT_FACTOR_LOCAL
These have been defined in AMF 1.4.35+ but we are on 1.5.0.

Signed-off-by: nyanmisaka <nst799610810@gmail.com>
2026-04-07 14:54:21 +00:00
nyanmisaka 113c1c0624 avcodec/amfenc: let the HEVC encoder profile follow the target bit depth
Previously, you could even set the Main profile for the P010 input and 10-bit output.

Signed-off-by: nyanmisaka <nst799610810@gmail.com>
2026-04-07 14:54:21 +00:00
nyanmisaka 2f592f0699 avcodec/amfenc: use pixel desc to determine YUV and bit depth
Therefore, YUV420P and X2BGR10 are now being taken into consideration.

Signed-off-by: nyanmisaka <nst799610810@gmail.com>
2026-04-07 14:54:21 +00:00
nyanmisaka 0acc73b7fb avcodec/amfenc: support full range in AV1 and update deprecated AMF range flags for H264/HEVC
Furthermore, the flags for H264/HEVC have been updated to those renamed in AMF 1.5.0+,
instead of using the old ones that were already marked as deprecated:

AMF_VIDEO_ENCODER_FULL_RANGE_COLOR -> AMF_VIDEO_ENCODER_OUTPUT_FULL_RANGE_COLOR
AMF_VIDEO_ENCODER_HEVC_NOMINAL_RANGE -> AMF_VIDEO_ENCODER_HEVC_OUTPUT_FULL_RANGE_COLOR

The macro content remains the same, therefore it will not cause regressions.

Signed-off-by: nyanmisaka <nst799610810@gmail.com>
2026-04-07 14:54:21 +00:00
James Almer e7696357de avformat/dashdec: export LCEVC Stream Groups when the manifest reports the relevant dependency
Signed-off-by: James Almer <jamrial@gmail.com>
2026-04-07 10:04:55 -03:00
Jack Lau 0510aff11b avformat/hlsenc: fix compile error when mp4 is disabled
Regression since dc4c798970

Handle the case where mp4 is disabled, since mp4 is
an optional dependency of hls_muxer.

Signed-off-by: Jack Lau <jacklau1222gm@gmail.com>
2026-04-07 02:20:34 +00:00
Ruikai Peng e90c2ff4b5 avcodec/libdav1d: fix heap overflow in US ITU-T T.35 metadata parsing
The US country_code path in parse_itut_t35_metadata() reads the
provider_code with bytestream2_get_be16u(), which is an
unchecked version that does not validate the remaining
length before reading. When an AV1 stream contains ITU-T T.35
metadata with country_code set to 0xB5 (which is US) and a
payload shorter than 2 bytes, this results in a heap overflow
reading 2 bytes past the allocation.

The UK country code path already guards against this issue by
checking the remaining length before the unchecked read. Apply
the same check to the US country code path.
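
The "check before the unchecked read" pattern can be sketched as follows.
This is a standalone illustration loosely modelled on a byte reader; struct
reader, bytes_left(), get_be16u() and parse_provider_code() are hypothetical
names, not the libavcodec bytestream API.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal reader; names are hypothetical. */
struct reader {
    const uint8_t *p, *end;
};

static size_t bytes_left(const struct reader *r)
{
    return (size_t)(r->end - r->p);
}

/* Unchecked read: the caller must guarantee at least 2 bytes remain. */
static unsigned get_be16u(struct reader *r)
{
    unsigned v = (unsigned)r->p[0] << 8 | r->p[1];
    r->p += 2;
    return v;
}

/* Parse a 16-bit provider code, refusing payloads that are too short
 * to contain one instead of reading past the end. */
static int parse_provider_code(struct reader *r, unsigned *code)
{
    if (bytes_left(r) < 2)
        return -1;
    *code = get_be16u(r);
    return 0;
}
```

A 1-byte payload is rejected without touching memory past the buffer, while
a 2-byte payload parses normally.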

Pwno crafted an AV1 IVF with a metadata OBU containing ITU-T T.35
with country_code=0xB5 and a 1-byte payload. Decoding with libdav1d
triggers the overflow. ASan says:

ERROR: AddressSanitizer: heap-buffer-overflow
READ of size 2 at 0x5020000003f0 thread T0
  #0 bytestream_get_be16 src/libavcodec/bytestream.h:98
  #1 bytestream2_get_be16u src/libavcodec/bytestream.h:98
  #2 parse_itut_t35_metadata src/libavcodec/libdav1d.c:376

0x5020000003f1 is located 0 bytes after 1-byte region

Found-by: Pwno
2026-04-06 23:39:40 +00:00
James Almer 757cc97790 avcodec/lcevcdec: support differing base and enhancement bitdepths
Signed-off-by: James Almer <jamrial@gmail.com>
2026-04-06 14:07:59 -03:00
James Almer 3a2eae155d avcodec/lcevcdec: add 14bit pixel formats
Signed-off-by: James Almer <jamrial@gmail.com>
2026-04-06 14:07:59 -03:00
James Almer 01b0b86225 avcodec/lcevc_parser: move pixel format table to a shared file
Signed-off-by: James Almer <jamrial@gmail.com>
2026-04-06 14:07:59 -03:00
Patrice Dumas 589da160b2 configure: add makeinfo option
Rename makeinfo enabled variable to makeinfo_command. Do not put
makeinfo_command in HAVE_LIST, it is not used.
2026-04-06 15:07:17 +00:00
Andreas Rheinhardt 7fd2be97b9 avcodec/x86/h264_chromamc: Avoid mmx in chroma_mc8_ssse3 functions
No impact on performance here.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-06 11:28:49 +02:00
Andreas Rheinhardt 0c4c9c66bd avfilter/x86/vf_atadenoise: Don't load args unnecessarily
These args will be read directly from the stack into an xmm register,
so loading them into GPRs is unnecessary.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-06 11:28:49 +02:00
Andreas Rheinhardt e1297f3080 avcodec/x86/h264_idct: Use tmp reg in SUMSUB_BA if possible
This allows replacing a paddw with a movdqa.

Old benchmarks:
idct8_add4_8bpp_c:                                     664.6 ( 1.00x)
idct8_add4_8bpp_sse2:                                  142.2 ( 4.67x)
idct8_add_8bpp_c:                                      215.5 ( 1.00x)
idct8_add_8bpp_sse2:                                    35.1 ( 6.14x)

New benchmarks:
idct8_add4_8bpp_c:                                     666.9 ( 1.00x)
idct8_add4_8bpp_sse2:                                  135.3 ( 4.93x)
idct8_add_8bpp_c:                                      217.7 ( 1.00x)
idct8_add_8bpp_sse2:                                    34.0 ( 6.41x)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-06 11:28:49 +02:00
Andreas Rheinhardt ed116bab02 avcodec/x86/me_cmp: Use tmp reg in SUMSUB_BA if possible
This allows replacing a paddw with a movdqa.

Old benchmarks:
hadamard8_diff_0_c:                                    366.1 ( 1.00x)
hadamard8_diff_0_sse2:                                  56.4 ( 6.49x)
hadamard8_diff_0_ssse3:                                 53.0 ( 6.90x)
hadamard8_diff_1_c:                                    183.0 ( 1.00x)
hadamard8_diff_1_sse2:                                  28.0 ( 6.53x)
hadamard8_diff_1_ssse3:                                 26.0 ( 7.03x)

New benchmarks:
hadamard8_diff_0_c:                                    371.4 ( 1.00x)
hadamard8_diff_0_sse2:                                  55.0 ( 6.76x)
hadamard8_diff_0_ssse3:                                 49.5 ( 7.50x)
hadamard8_diff_1_c:                                    183.4 ( 1.00x)
hadamard8_diff_1_sse2:                                  26.8 ( 6.85x)
hadamard8_diff_1_ssse3:                                 23.1 ( 7.92x)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-06 11:28:49 +02:00
Andreas Rheinhardt da59f288c6 avcodec/hevc/dsp_template: Add restrict to add_residual functions
Allows the compiler to optimize the aliasing checks away
and saves 5376B here (GCC 15, -O3).
Also, avoid converting the stride to uint16_t for >8bpp:
stride /= sizeof(pixel) will use an unsigned division
(i.e. a logical right shift)*, which is not what is intended here.

*: If size_t is the corresponding unsigned type to ptrdiff_t
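
The signedness pitfall mentioned above can be shown in isolation. This is a
generic sketch, not the dsp_template code, and it assumes size_t is the
unsigned counterpart of ptrdiff_t (as on common 64-bit platforms); both
helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Pitfall: sizeof yields size_t, so stride is converted to unsigned
 * before the division and a negative stride becomes a huge value. */
static ptrdiff_t stride_in_pixels_unsigned(ptrdiff_t stride)
{
    stride /= sizeof(uint16_t);
    return stride;
}

/* Signed variant: cast the divisor so the division stays signed. */
static ptrdiff_t stride_in_pixels_signed(ptrdiff_t stride)
{
    return stride / (ptrdiff_t)sizeof(uint16_t);
}
```

For a negative byte stride such as -64, the signed variant yields -32, while
the unsigned variant yields a large positive number.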

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-06 11:28:49 +02:00
Andreas Rheinhardt 72058ccdf8 tests/checkasm/sw_scale: Don't use declare_func_emms in yuv2nv12cX check
There are no implementations of yuv2nv12cX clobbering the fpu state,
so make the test stricter to ensure that it stays that way.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-06 11:28:49 +02:00
Andreas Rheinhardt 135cc04c3b tests/checkasm/sw_yuv2yuv: Don't use declare_func_emms
It is not needed (there are no MMX functions here) and
given that there is no emms_c() cleaning up after convert_unscaled,
convert_unscaled must not clobber the fpu state.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-06 11:28:49 +02:00
Andreas Rheinhardt 14c30b9d19 tests/checkasm/png: Don't use declare_func_emms for add_paeth_pred
There is an x86 implementation using MMX registers, but it actually
issues emms on its own (since 57a29f2e7d).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-06 11:28:49 +02:00
Andreas Rheinhardt fcea2aa75d tests/checkasm/vf_fspp: Don't use declare_func_emms for store_slice
Forgotten in ff85a20b7d.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-06 11:28:49 +02:00
Andreas Rheinhardt 96f0e6e927 tests/checkasm/sw_yuv2rgb: Don't use declare_func_emms unnecessarily
The last MMX(EXT) convert_unscaled functions have been removed
in 61e851381f. And anyway, there
is no emms_c cleaning up after these functions, so they must not
clobber the fpu state; that they did it at the time this checkasm
test has been added was a bug introduced by
e934194b6a and fixed by the removal
of said MMX(EXT) functions.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-06 11:28:49 +02:00
Andreas Rheinhardt 759512d36a avcodec/x86/cavsidct: Use tmp reg in SUMSUB_BA if possible
This allows replacing a paddw with a movdqa.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-06 11:28:49 +02:00
Andreas Rheinhardt 8b700fad94 avcodec/mpegvideoencdsp: Add restrict to shrink
Makes GCC avoid creating the aliasing fallback path
and saves 1280B of .text here.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-06 10:39:17 +02:00
Andreas Rheinhardt 6e95052ac2 avcodec/x86/mpegvideoenc_template: Avoid indirect call
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-06 10:39:17 +02:00
Sankalpa Sarkar 7b49a69f43 fate: add unit tests for libavutil/timecode functions 2026-04-05 22:23:08 +02:00
Sankalpa Sarkar b462674645 fate/hlsenc: Add tests for untested features 2026-04-05 14:02:48 +00:00
Dana Feng 235d5fd30a .forgejo/codeowners: Add @danaf as reviewer for mpdecimate filter
This will provide notifications when there are pull requests that
touch the mpdecimate filter code.

Signed-off-by: Dana Feng <danaf@twosigma.com>
2026-04-05 00:26:55 +00:00
Dana Feng 31a711aa68 vf_mpdecimate: Add comprehensive tests for keep and max options
Add tests for the mpdecimate filter to verify correct behavior:
- fate-filter-mpdecimate-keep: tests keep=3 option
- fate-filter-mpdecimate-keep1: tests keep=1 option
- fate-filter-mpdecimate-maxdrop-pos: tests max=3 (positive) option
- fate-filter-mpdecimate-maxdrop-neg: tests max=-3 (negative) option

Signed-off-by: Dana Feng <danaf@twosigma.com>
2026-04-05 00:26:55 +00:00
Dana Feng 63822ae21f vf_mpdecimate: Fix keep option logic for keep > 0
Fix the following issues with the keep option:

- Add similarity check during keep period. Previously, the code
  returned early during the keep period without checking if the
  frame is actually similar to the reference.

- Reset keep_count on different frames. Previously, the counter
  could accumulate across non-consecutive similar frames, causing
  frames to be dropped earlier than expected.

- Keep the same frame reference if appropriate. Previously, the
  code made similar frames the new reference, causing reference
  drift and gradual scene changes.

Signed-off-by: Dana Feng <danaf@twosigma.com>
2026-04-05 00:26:55 +00:00
Michael Niedermayer b11729f154 avutil/samplefmt: Don't claim that av_get_sample_fmt_string checks sample_fmt
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-04-05 00:19:09 +00:00
marcos ashton e18c8c533d tests/fate/libavutil: add FATE test for mathematics
Test the integer math utility functions: av_gcd, av_rescale,
av_rescale_rnd (all rounding modes including PASS_MINMAX),
av_rescale_q, av_compare_ts, av_compare_mod, av_rescale_delta,
and av_add_stable. Includes large-value tests that exercise the
128-bit multiply path in av_rescale_rnd.

av_bessel_i0 is not tested since it uses floating point math
that is not bitexact across platforms.

Coverage for libavutil/mathematics.c: 0.00% -> 82.03%

Remaining uncovered lines are av_bessel_i0 (float, 23 lines)
and one edge case fallback in av_rescale_delta.
2026-04-05 00:12:29 +00:00
marcos ashton 66b1dbfb98 tests/fate/libavutil: add FATE test for samplefmt
Test all public API functions: name/format round-trip lookups,
bytes_per_sample, is_planar, packed/planar conversions,
alt_sample_fmt, get_sample_fmt_string, samples_get_buffer_size,
samples_alloc, samples_alloc_array_and_samples, samples_copy,
and samples_set_silence. OOM error paths are exercised via
av_max_alloc().

Coverage for libavutil/samplefmt.c: 0.00% -> 95.28%

Remaining uncovered lines are the fill_arrays failure path
and the overlapping memmove branch in samples_copy.
2026-04-05 00:12:29 +00:00
marcos ashton 117897bcd0 tests/fate/libavutil: add FATE test for rc4
Test the three public API functions: av_rc4_alloc, av_rc4_init,
and av_rc4_crypt. Verifies keystream output against RFC 6229
test vectors for 40, 56, 64, and 128-bit keys, encrypt/decrypt
round-trip, inplace operation, and the invalid key_bits error path.

Coverage for libavutil/rc4.c: 0.00% -> 100.00%
2026-04-05 00:12:29 +00:00
Lynne 2b6fbcad6d swscale/vulkan: compile SPIR-V backend only if SPIR-V headers are found
Instead of making Vulkan depend on the headers, make the compilation of
the SPIR-V backend depend on the headers.

Sponsored-by: Sovereign Tech Fund
2026-04-04 19:02:27 +00:00
nyanmisaka 69fc910777 avfilter/scale_cuda: fix color bleeding in lanczos scaling
Prior to this, the results were not saturated into the uchar/ushort range before
being written. The characteristics of the Lanczos filter exposed this issue.

In addition, the results were truncated rather than rounded, which resulted
in checkerboard artifacts in solid color areas and were noticeable when
using Lanczos with 8-bit input.
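
The two effects described above (missing saturation and truncation instead
of rounding) can be illustrated in plain C. This is not the CUDA kernel from
the filter; store_truncate() and store_saturate_round() are hypothetical
helpers for an 8-bit output.

```c
#include <stdint.h>

/* Truncating store without saturation: filter overshoot wraps around
 * when narrowed to 8 bits, and fractional parts are simply dropped. */
static uint8_t store_truncate(float v)
{
    return (uint8_t)(int)v;
}

/* Round to nearest, then clamp into [0, 255] before narrowing. */
static uint8_t store_saturate_round(float v)
{
    float r = v + 0.5f;
    if (r < 0.0f)
        return 0;
    if (r > 255.0f)
        return 255;
    return (uint8_t)r;
}
```

Lanczos kernels have negative lobes, so intermediate results can overshoot
the valid range; an overshoot of 260.2 wraps to 4 in the first helper and
clamps to 255 in the second.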

Example:
ffmpeg -init_hw_device cuda -f lavfi -i testsrc2=s=960x540,format=yuv420p \
-vf hwupload,scale_cuda=format=yuv420p:w=-2:h=720:interp_algo=lanczos \
-c:v h264_nvenc -qp:v 20 -t 1 <OUTPUT>

Fix #20784

Signed-off-by: nyanmisaka <nst799610810@gmail.com>
2026-04-04 11:31:16 +00:00
Patrice Dumas 2dff0156ba doc/t2h.pm: Never use node nor empty @top heading in ffmpeg_heading_command 2026-04-04 12:52:53 +05:30
Hankang Li e33b3962e5 swscale: fix signed integer overflow in color conversion arithmetic
Fixes: #22331

Signed-off-by: Hankang Li <hankang201222@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-04-04 02:43:59 +02:00
James Almer b47a459867 configure: don't abort if spirv-headers are not present
Vulkan was soft enabled, so this check has no reason to call die()

Signed-off-by: James Almer <jamrial@gmail.com>
2026-04-03 16:54:21 -03:00
Ramiro Polla 1f6699ef26 ffbuild/common: remove DBG=1 to preprocess external asm
It had been added in bc8e044d (2015), and broken in 3ddae9ee (2017).

Nobody has complained since, so it's safe to assume that it is not
being used.
2026-04-03 16:15:33 +02:00
Lynne 554dcc2885 vf_scale_vulkan: make sure that pixfmts are different when using swscale
The swscale internals currently have a quirk which causes the memcpy
backend to be called when the pixfmts match. Obviously, this doesn't do
what is expected, as hardware frames cannot just be copied.
Check for this.

Sponsored-by: Sovereign Tech Fund
2026-04-02 21:15:06 +02:00
Lynne 990768080e tests/swscale: add support for testing Vulkan hardware acceleration
Sponsored-by: Sovereign Tech Fund

Co-authored-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-04-02 21:15:06 +02:00
Lynne 47e4e95173 swscale/vulkan: add a native SPIR-V assembler backend
swscale gets runtime-defined assembly once again!

This commit splits the Vulkan backend into two, SPIR-V and GLSL,
enabling falling back onto the GLSL implementation if an instruction
is unavailable, or simply for testing.

Sponsored-by: Sovereign Tech Fund
2026-04-02 21:15:06 +02:00
Lynne 6a723420dc swscale/vulkan: add a SPIR-V assembler header file
This commit adds a SPIR-V assembler header file. It was partially generated
from the SPIR-V header file JSON definition, then edited by hand to template
and reduce its size as much as possible.
It only implements the essentials required for SPIR-V assembly that swscale
requires.

Sponsored-by: Sovereign Tech Fund
2026-04-02 21:15:06 +02:00
Lynne 5fa4085774 swscale/vulkan: use uniform buffers for dither matrix
Uniform buffers are much simpler to index, and require no work from
the driver compiler to optimize.
In SPIR-V, large 2D shader constants can be spilled into scratch memory,
since you need to create a function variable to index them during runtime.

Sponsored-by: Sovereign Tech Fund
2026-04-02 21:15:06 +02:00
Lynne d4bcd3340e swscale/vulkan: add a check for BGRA features
The issue is that very often, hardware has limited support for BGRA
formats.

As this is a limitation of Vulkan itself, we cannot work around this
in a compatible way.

Sponsored-by: Sovereign Tech Fund
2026-04-02 21:15:06 +02:00
Lynne 72a0b20e42 configure: enable Vulkan only if the SPIR-V headers are installed
FFmpeg has had an issue with GLSL compilation libraries since they
were first merged 6 years ago. The libraries don't have a stable ABI,
are very difficult for packagers to compile and integrate, are slow,
not threadsafe, and uncomfortable to use. The decision to switch all
Vulkan code to either compile-time GLSL or SPIR-V assembly was taken
in January. Since then, including in the FFmpeg 8.1 release, work has
steadily eliminated all remaining runtime GLSL compilation.

Sponsored-by: Sovereign Tech Fund
2026-04-02 21:15:05 +02:00
Lynne ebfd18ad03 hwcontext_vulkan: temporarily disable formats which require shader+framework processing
The main issue is that BGR formats only semi-exist in Vulkan. Unlike all
other formats, they require the user to manually remap the pixel order, and
are also forbidden from being written to without a format in shaders. The main
reason for this is conservatism: Vulkan is supposed to work everywhere, including
platforms where there is no write-time remapping/swizzling support.

Sponsored-by: Sovereign Tech Fund
2026-04-02 21:15:05 +02:00
Lynne 5a6480af0c hwcontext_vulkan: do not indicate support for rgb565
It requires special handling.

Sponsored-by: Sovereign Tech Fund
2026-04-02 21:15:05 +02:00
Lynne 2bea947027 hwcontext_vulkan: don't indicate support for AV_PIX_FMT_UYVA
Uploading and downloading is broken.

Sponsored-by: Sovereign Tech Fund
2026-04-02 21:15:05 +02:00
Lynne 4b1e79f062 hwcontext_vulkan: return ENOTSUP in vulkan_frames_init
If a format is not supported, EINVAL is not the appropriate
return code.

Sponsored-by: Sovereign Tech Fund
2026-04-02 21:15:05 +02:00
Lynne 0a543441be swscale/vulkan: always reserve 4 image descriptors
The issue is that with multiplane images, or packed images,
there may be some mismatching between what .elems has, and what
we need.
Descriptors are cheap, so just always reserve 4.

Sponsored-by: Sovereign Tech Fund
2026-04-02 21:15:05 +02:00
Lynne eb71c6c9a4 swscale/vulkan: move execution context to be a part of a shader
The issue is that the main Vulkan context is shared between possibly
multiple shaders, and registering a new shader requires allocating
descriptors.

Sponsored-by: Sovereign Tech Fund
2026-04-02 21:15:05 +02:00
Lynne 7c33948b29 swscale/vulkan: fix potential memory issues
The issue is that updating descriptors relies on the pointers of
the structures remaining the same since creation.

Sponsored-by: Sovereign Tech Fund
2026-04-02 21:15:01 +02:00
Sankalpa Sarkar 65eed0732c avformat: check avio_read() return values in dss/dtshd/mlv
Multiple demuxers call avio_read() without checking its return
value. When input is truncated, destination buffers remain
uninitialized but are still used for offset calculations, memcmp,
and metadata handling. This results in undefined behavior
(detectable with Valgrind/MSan).

Fix this by checking the return value of avio_read() in:
- dss.c: dss_read_seek() — check before using header buffer
- dtshddec.c: FILEINFO chunk — check before using value buffer
- mlvdec.c: check_file_header() — check before memcmp on version

Fixes: #21520
2026-04-02 19:06:59 +00:00
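The pattern behind the avio_read() fix above can be sketched in plain C. This is only an illustration: MemReader, mem_read() and check_magic() are hypothetical stand-ins, not FFmpeg API. The point is that the destination buffer is inspected only after the read reports the full length.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an I/O context: a cursor over a memory buffer. */
typedef struct MemReader {
    const unsigned char *data;
    size_t size;
    size_t pos;
} MemReader;

/* Reads up to `len` bytes (len >= 0); returns the number of bytes actually
 * read, which is smaller than `len` when the input is truncated. */
static int mem_read(MemReader *r, unsigned char *dst, int len)
{
    size_t left = r->size - r->pos;
    size_t n = (size_t)len < left ? (size_t)len : left;
    memcpy(dst, r->data + r->pos, n);
    r->pos += n;
    return (int)n;
}

/* Returns 1 if the next 4 bytes equal `magic`, 0 if they differ, and -1 if
 * the input ended early. The buffer is compared only after a full read, so
 * memcmp never sees uninitialized bytes. */
static int check_magic(MemReader *r, const char *magic)
{
    unsigned char hdr[4];
    int ret = mem_read(r, hdr, (int)sizeof(hdr));
    if (ret != (int)sizeof(hdr))
        return -1;
    return memcmp(hdr, magic, sizeof(hdr)) == 0;
}
```

With a truncated two-byte input, check_magic() reports -1 instead of comparing stale stack contents.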
Kacper Michajłow 1e031d4af7 configure: treat unrecognized option warnings as errors in test_ld
This fixes dummy warnings when link/lld-link is called by clang:
lld-link: warning: ignoring unknown argument '--as-needed'
lld-link: warning: ignoring unknown argument '-rpath-link=:libswresample:libswscale:libavfilter:libavdevice:libavformat:libavcodec:libavutil'

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-04-02 16:37:55 +00:00
Kacper Michajłow 740dc9e027 configure: test if -lm is available on host compiler
Fixes host binaries compilation on platforms without math lib.

Fixes clang host compilation, which replaces `-lm` with `m.lib` that
does not exist:
LINK : fatal error LNK1181: cannot open input file 'm.lib'
clang: error: linker command failed with exit code 1181 (use -v to see invocation)

Fixes MSVC (cl) host warning:
cl : Command line warning D9002 : ignoring unknown option '-lm'

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-04-02 16:37:55 +00:00
Kacper Michajłow 43be5cccd8 configure: add llvm toolchain option
This uses LLVM tools. The `clang-*` toolchain is kept mostly for backward
compatibility; it uses only clang, not the LLVM tools, and exists mainly
to enable sanitizers. The `llvm` toolchain, by contrast, can be used
without a sanitizer suffix.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-04-02 16:37:55 +00:00
Ruikai Peng 7466d8a850 avformat/whip: check RTP history packet size before RTX retransmission
handle_rtx_packet() constructs an RTX packet by shifting the payload
of a history entry to insert the original sequence number. It uses
memmove with length (ori_size - 12), but never checks that ori_size
is at least 12 bytes (the minimum RTP header size).

Zero-initialized history slots have seq == 0 and size == 0.
rtp_history_find() only compares sequence numbers, so an RTCP NACK
requesting seq 0 early in a session matches such a slot. The
subtraction then wraps to a huge value when converted to size_t,
causing a stack buffer overflow in memmove().

Add a small size check to reject history entries smaller than a
valid RTP header before any arithmetic on their size.

Found-by: Pwno
2026-04-02 12:19:09 +00:00
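The wraparound described in the avformat/whip commit above can be reproduced in isolation. The sketch below is not the code from whip.c; rtx_payload_len() and unchecked_len() are hypothetical helpers that only show why the comparison has to come before the subtraction.

```c
#include <stddef.h>
#include <stdint.h>

#define RTP_HEADER_SIZE 12  /* minimum RTP header size, as stated in the commit */

/* Returns the payload length to move, or -1 if the stored packet is too
 * small to contain an RTP header. Checking first keeps `size - 12` from
 * going negative and wrapping when it is later converted to size_t. */
static int64_t rtx_payload_len(int size)
{
    if (size < RTP_HEADER_SIZE)
        return -1;
    return size - RTP_HEADER_SIZE;
}

/* What the unchecked arithmetic yields as a memmove length. */
static size_t unchecked_len(int size)
{
    return (size_t)(size - RTP_HEADER_SIZE);
}
```

For a zero-initialized slot (size == 0), unchecked_len() returns a value close to SIZE_MAX, while rtx_payload_len() rejects the entry.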
Niklas Haas 85bef2c2bc swscale/ops: split SwsConst up into op-specific structs
It was a bit clunky, lacked semantic contextual information, and made it
harder to reason about the effects of extending this struct. There should be
zero runtime overhead as a result of the fact that this is already a big
union.

I made the changes in this commit by hand, but due to the length and noise
level of the commit, I used Opus 4.6 to verify that I did not accidentally
introduce any bugs or typos.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-02 11:48:15 +00:00
Niklas Haas 75b7e8904b swscale/ops: move comment (cosmetic)
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-02 11:48:15 +00:00
Niklas Haas dc705268c7 swscale/ops: define flags_identity as an enum constant
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-02 11:48:15 +00:00
Niklas Haas 32ba5c13de swscale/ops_chain: split generic setup helpers into op-specific helpers
This has the side benefit of not relying on the q2pixel macro to avoid division
by zero, since we can now explicitly avoid operating on undefined clear values.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-02 11:48:15 +00:00
Niklas Haas 50793bc9bd swscale/ops_chain: remove unused helper function
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-02 11:48:15 +00:00
Niklas Haas c24d67a0ff swscale/vulkan/ops: use QSTR/QTYPE to print all rationals
Now this helper is a bit more useful.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-02 11:48:15 +00:00
Niklas Haas 7a4cffa25d swscale/vulkan/ops: simplify QTYPE macro
There's no reason for this macro to hard-code op->c.q4[i].

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-04-02 11:48:15 +00:00
Zhao Zhili eedf8f0165 avcodec/hevc: workaround hevc-alpha videos generated by VideoToolbox
Apple VideoToolbox is the dominant producer of hevc-alpha videos, but
early versions generate non-standard VPS extensions that fail to
parse and return AVERROR_INVALIDDATA. Fix this by returning
AVERROR_PATCHWELCOME instead of AVERROR_INVALIDDATA for unsupported
VPS extension configurations. Also set poc_lsb_not_present for the
alpha layer in the fallback path when it has no direct dependency
on the base layer, so that IDR slices on the alpha layer won't
incorrectly read pic_order_cnt_lsb.

Fix #22384

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2026-04-01 22:54:36 +08:00
Zhao Zhili 28ab24b717 avformat/matroskadec: avoid calling get_bytes_left() three times with the same state
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2026-04-01 14:19:35 +00:00
Zhao Zhili bba9bf7e7e avcodec/libdav1d: fix null pointer dereference in LCEVC side data handling
ff_frame_new_side_data() may set sd to NULL and return 0 when
side_data_pref() determines that existing side data should be
preferred.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2026-04-01 14:17:27 +00:00
Zhao Zhili f9d289020d avcodec/av1dec: fix null pointer dereference in LCEVC side data handling
ff_frame_new_side_data() may set sd to NULL and return 0 when
side_data_pref() determines that existing side data should be
preferred.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2026-04-01 14:17:27 +00:00
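The two LCEVC side data fixes above hinge on a contract where a function can succeed and still hand back a NULL pointer. A minimal sketch of a caller that respects such a contract follows; new_side_data() and attach_payload() are hypothetical and only mimic the behaviour the commit messages describe for ff_frame_new_side_data().

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct SideData { unsigned char data[16]; } SideData;

/* Hypothetical allocator mimicking the described contract: returns 0 on
 * success but may leave *psd NULL when existing side data is preferred.
 * Returns -1 on allocation failure. */
static int new_side_data(int prefer_existing, SideData **psd)
{
    *psd = NULL;
    if (prefer_existing)
        return 0;
    *psd = calloc(1, sizeof(**psd));
    return *psd ? 0 : -1;
}

/* Checks both the return code and the pointer before writing.
 * Returns bytes written, 0 if the side data was skipped, -1 on error. */
static int attach_payload(int prefer_existing, const unsigned char *src, size_t len)
{
    SideData *sd;
    int ret = new_side_data(prefer_existing, &sd);
    if (ret < 0)
        return ret;
    if (!sd)                      /* success, but nothing to fill in */
        return 0;
    if (len > sizeof(sd->data))
        len = sizeof(sd->data);
    memcpy(sd->data, src, len);
    free(sd);                     /* the sketch owns the buffer; a real frame would keep it */
    return (int)len;
}
```

Testing only `ret < 0` would let the NULL pointer reach memcpy() in the "prefer existing" case.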
Andreas Rheinhardt f6bbd63557 avutil/tests/.gitignore: Add recently added test tools
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-04-01 14:04:16 +00:00
Zhao Zhili 316531e61c avfilter/vidstabtransform: always use in-place transform path
libvidstab's vsTransformPrepare() takes different internal code paths
for in-place (src == dest) vs. separate-buffer operation. The
separate-buffer path stores a shallow copy of the source frame pointer
in td->src without allocating internal memory (srcMalloced stays 0).
When a subsequent frame takes the in-place path, vsFrameIsNull(&td->src)
is false so vsFrameAllocate() is skipped, and vsFrameCopy() writes into
the stale pointer left over from the previous frame, corrupting memory
that the caller no longer owns.

Whether a given frame is writable depends on pipeline scheduling and
frame reference management, which can change between FFmpeg versions.
Since FFmpeg 8.1, changes in the scheduler caused some frames to arrive
as non-writable, leading to alternation between in-place and
separate-buffer paths that triggered the bug.

Fix this by marking the input pad with AVFILTERPAD_FLAG_NEEDS_WRITABLE.

Fix #22595
2026-04-01 21:56:37 +08:00
Zhao Zhili c695ad1197 avfilter/vidstabtransform: use existing ctx variable for outlink
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2026-04-01 21:56:37 +08:00
Brad Smith e64a1d2953 libavutil/ppc: Remove mfspr-based AltiVec detection code for Linux
The getauxval() and auxv methods cover the last 25+ years of Linux.

Signed-off-by: Brad Smith <brad@comstyle.com>
2026-04-01 04:33:44 +00:00
Michael Niedermayer ddcb9dd3b5 avcodec/aac/aacdec_usac: Implement missing bits of otts_bands_phase and residual_bands computation
Fixes: out of array access
Fixes: matejsmycka/poc.mp4

Introducing commit: `baad75cafa6bac298b72c177f657a2eb8e31cff1` — "aacdec_usac: add support for parsing Mpsp212 (MPEG surround)", 2025-11-17.

Found-by: Matěj Smyčka <matejsmycka@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-31 22:29:18 +00:00
Lynne 9c04a40136 vulkan/ffv1: implement floating-point decoding
Sponsored-by: Sovereign Tech Fund
2026-03-31 23:47:45 +02:00
Lynne f5054f726d ffv1enc_vulkan: implement floating-point encoding
Sponsored-by: Sovereign Tech Fund
2026-03-31 23:47:45 +02:00
Lynne 29b8614e62 vulkan/ffv1: fix bitstream initialization for Golomb
Was broken when we switched to descriptors.

Sponsored-by: Sovereign Tech Fund
2026-03-31 23:47:45 +02:00
Lynne 35c6cdb191 hwcontext_vulkan: add support for GBRPF16/GBRAPF16
Sponsored-by: Sovereign Tech Fund
2026-03-31 23:47:39 +02:00
Martin Storsjö 77ff3bcb90 aarch64: Add AARCH64_VALID_JUMP_CALL_TARGET
We currently don't have any cases where this is needed, but include
it for completeness and clarity.

These macros for BTI were added in
08b4716a9e.

A later comment in this file, added in
248986a0db, referenced the macro
AARCH64_VALID_JUMP_CALL_TARGET which never was added here before.
2026-03-31 19:57:46 +00:00
Martin Storsjö 8ed8e221bd aarch64: Fix a URL typo
This was added in 248986a0db.
2026-03-31 19:57:46 +00:00
marcos ashton 878eabdfef tests/fate/libavutil: add FATE test for video_enc_params
Unit test covering av_video_enc_params_alloc,
av_video_enc_params_block, and
av_video_enc_params_create_side_data.

Tests allocation for all three codec types (VP9, H264, MPEG2) and
the NONE type, with 0 and 4 blocks, with and without size output.
Verifies block getter indexing by writing and reading back
coordinates, dimensions, and delta_qp values. Tests frame-level qp
and delta_qp fields, and side data creation with frame attachment.

Coverage for libavutil/video_enc_params.c: 0.00% -> 86.21%
(remaining uncovered lines are OOM error paths)

Signed-off-by: marcos ashton <marcosashiglesias@gmail.com>
2026-03-31 18:05:51 +01:00
marcos ashton c8ec660d78 tests/fate/libavutil: add FATE test for detection_bbox
Unit test covering av_detection_bbox_alloc, av_get_detection_bbox,
and av_detection_bbox_create_side_data.

Tests allocation with 0, 1, and 4 bounding boxes, with and without
size output. Verifies bbox getter indexing by writing and reading
back coordinates, labels, and confidence values. Tests classify
fields (labels and confidences), the header source field, and
side data creation with frame attachment.

Coverage for libavutil/detection_bbox.c: 0.00% -> 86.67%
(remaining uncovered lines are OOM error paths)

Signed-off-by: marcos ashton <marcosashiglesias@gmail.com>
2026-03-31 18:05:51 +01:00
marcos ashton be2fa77344 tests/fate/libavutil: add FATE test for spherical
Unit test covering all 4 public API functions in libavutil/spherical.c:
av_spherical_alloc, av_spherical_projection_name, av_spherical_from_name,
and av_spherical_tile_bounds.

Tests allocation with and without size output, all 7 projection type
name lookups, projection name round-trip verification, out-of-range
handling, and tile bounds computation for full-frame, quarter-tile,
and centered-tile configurations.

Coverage for libavutil/spherical.c: 0.00% -> 100.00%

Signed-off-by: marcos ashton <marcosashiglesias@gmail.com>
2026-03-31 18:05:51 +01:00
Andreas Rheinhardt c1aed85491 avcodec/x86/h264_idct: Avoid spilling register unnecessarily
It is only needed in the unlikely codepath. The ordinary one
only uses six xmm registers.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-31 17:31:58 +02:00
Andreas Rheinhardt 9fdd7e23e3 avfilter/x86/vf_atadenoise: Avoid load
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-31 16:49:51 +02:00
Araz Iusubov 3b55818764 avcodec/amfdec: set context dimensions from decoder size
2026-03-31 14:07:31 +00:00
Ramiro Polla 53537f6cf5 swscale/aarch64: mark CPS kernel functions as indirect branch targets
Only the process functions are entered via an indirect _call_ from C.
The kernel functions and process_return are dispatched to by indirect
_branches_ instead (continuation-passing style design).

Make use of the recently added "jumpable" parameter to the function
macro in libavutil/aarch64/asm.S to fix these functions when BTI is
enabled.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-31 11:48:52 +00:00
Ramiro Polla af443abe99 aarch64: Add support for indirect branch targets in the function macro
The function macro emits AARCH64_VALID_CALL_TARGET for exported symbols,
marking them as valid destinations for indirect _calls_. Functions that
are reached by indirect _branches_ (i.e. tail-call dispatch chains
where the link register is not set) require AARCH64_VALID_JUMP_TARGET
instead.

This commit adds a "jumpable" parameter to the function macro that, when
set, emits AARCH64_VALID_JUMP_TARGET instead of AARCH64_VALID_CALL_TARGET.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-31 11:48:52 +00:00
Dmitrii Gershenkop 8b93c94f47 avutil/hwcontext_amf: Add AMF_IFACE_CALL macro
Using AMF interfaces in C can be cumbersome and visually difficult to process in some cases, e.g. object->function(object, args). To improve code readability, a new macro is added. This commit is instrumental for future AMF integration refactoring.
2026-03-31 11:33:00 +00:00
Dmitrii Gershenkop 6f75e879b6 avfilter/vf_vpp_amf: Minor clean up.
-vf_vpp_amf.c: Remove unused variables.
-vf_amf_common.c: Fix hdrmeta_buffer memory leak.
-hwcontext_amf.c: Fix av_amf_extract_hdr_metadata not picking up light metadata if display mastering metadata is not set.
-doc/filters.texi: Remove irrelevant example with HDR metadata for vpp_amf.
2026-03-31 11:17:51 +00:00
Kacper Michajłow 7d57621b83 avutil/x86/x86util: tone down NASM workaround and use info section
The use of a code section (.text) was forced by the unreleased NASM
3.02rc3, which made the issue worse by preventing assembling anything
without a code section, including when only data was present.

This works fine for the most part, but using code (.text) section with
IMAGE_COMDAT_SELECT_ANY causes issues with lib.exe after stripping such
object:
fatal error LNK1143: invalid or corrupt file: no symbol for COMDAT section 0x2

Essentially it makes our workaround not work in all cases, and while
stripping could be disabled like it already is for MSVC/ICL builds, it used
to work so let's preserve that state.

This makes it incompatible with NASM 3.02rc3 when CV debug info is
generated, but hopefully the upstream fix will be merged before release,
to avoid this regression:
https://github.com/netwide-assembler/nasm/pull/221

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-03-30 19:46:53 +02:00
Jun Zhao 89c21b5ab7 lavc/hevc: add aarch64 NEON for Planar prediction
Add NEON-optimized implementation for HEVC intra Planar prediction at
8-bit depth, supporting all block sizes (4x4 to 32x32).

Planar prediction implements bilinear interpolation using an incremental
base update: base_{y+1}[x] = base_y[x] - (top[x] - left[N]), reducing
per-row computation from 4 multiply-adds to 1 subtract + 1 multiply.
Uses rshrn for rounded narrowing shifts, eliminating manual rounding
bias. All left[y] values are broadcast in the NEON domain, avoiding
GP-to-NEON transfers.

4x4 interleaves row computations across 4 rows to break dependencies.
16x16 uses v19-v22 for persistent base/decrement vectors, avoiding
callee-saved register spills. 32x32 processes 8 rows per loop iteration
(4 iterations total) to reduce code size while maintaining full NEON
utilization.

Speedup over C on Apple M4 (checkasm --bench):

    4x4: 2.25x    8x8: 6.40x    16x16: 9.72x    32x32: 3.21x

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-03-30 14:32:10 +00:00
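The incremental base update described in the Planar prediction commit above can be checked against the direct formula in scalar C. This is a sketch of the arithmetic only, not the NEON code; the function names are made up, and top[] and left[] are assumed to hold N + 1 reference samples each. Here the rounding term N is folded into the base, where the commit uses rshrn for rounding instead.

```c
#include <stdint.h>

/* Direct form: four multiply-adds per pixel, N = 1 << log2_size. */
static void planar_direct(uint8_t *dst, int stride, const uint8_t *top,
                          const uint8_t *left, int log2_size)
{
    int n = 1 << log2_size;
    for (int y = 0; y < n; y++)
        for (int x = 0; x < n; x++)
            dst[y * stride + x] = (uint8_t)(((n - 1 - x) * left[y] +
                                             (x + 1) * top[n] +
                                             (n - 1 - y) * top[x] +
                                             (y + 1) * left[n] + n)
                                            >> (log2_size + 1));
}

/* Incremental form: base[x] collects every term that does not depend on
 * left[y]. Moving down one row changes it by -(top[x] - left[N]), so each
 * pixel needs one multiply and one subtract. */
static void planar_incremental(uint8_t *dst, int stride, const uint8_t *top,
                               const uint8_t *left, int log2_size)
{
    int n = 1 << log2_size;
    int base[32];                 /* enough for blocks up to 32x32 */
    for (int x = 0; x < n; x++)
        base[x] = (x + 1) * top[n] + (n - 1) * top[x] + left[n] + n;
    for (int y = 0; y < n; y++) {
        for (int x = 0; x < n; x++) {
            dst[y * stride + x] = (uint8_t)((base[x] + (n - 1 - x) * left[y])
                                            >> (log2_size + 1));
            base[x] -= top[x] - left[n];
        }
    }
}
```

Both functions should produce identical blocks for any reference samples, since the second is an algebraic rearrangement of the first.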
Jun Zhao 60b372c934 lavc/hevc: add aarch64 NEON for DC prediction
Add NEON-optimized implementation for HEVC intra DC prediction at 8-bit
depth, supporting all block sizes (4x4 to 32x32).

DC prediction computes the average of top and left reference samples
using uaddlv, with urshr for rounded division. For luma blocks smaller
than 32x32, edge smoothing is applied: the first row and column are
blended toward the reference using (ref[i] + 3*dc + 2) >> 2 computed
entirely in the NEON domain. Fill stores use pre-computed address
patterns to break dependency chains.

Also adds the aarch64 initialization framework (Makefile, pred.c/pred.h
hooks, hevcpred_init_aarch64.c).

Speedup over C on Apple M4 (checkasm --bench):

    4x4: 2.28x    8x8: 3.14x    16x16: 3.29x    32x32: 3.02x

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-03-30 14:32:10 +00:00
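A scalar C sketch of the DC prediction described in the commit above may help when reading the NEON version. The function name and the `smooth` flag are hypothetical. The edge formula (ref[i] + 3*dc + 2) >> 2 comes from the commit message, while the corner pixel formula is an assumption based on my reading of the HEVC rule and is not stated in the commit.

```c
#include <stdint.h>

/* DC prediction for an N x N block, N = 1 << log2_size.
 * `smooth` selects the luma edge filter (only applied when N < 32). */
static void dc_pred(uint8_t *dst, int stride, const uint8_t *top,
                    const uint8_t *left, int log2_size, int smooth)
{
    int n = 1 << log2_size;
    int sum = n;                          /* rounding term for the average */
    int dc;

    for (int i = 0; i < n; i++)
        sum += top[i] + left[i];
    dc = sum >> (log2_size + 1);          /* average of 2N samples */

    for (int y = 0; y < n; y++)
        for (int x = 0; x < n; x++)
            dst[y * stride + x] = (uint8_t)dc;

    if (smooth && n < 32) {
        /* Corner: assumed formula, blends both neighbours with 2*dc. */
        dst[0] = (uint8_t)((left[0] + 2 * dc + top[0] + 2) >> 2);
        for (int x = 1; x < n; x++)       /* first row */
            dst[x] = (uint8_t)((top[x] + 3 * dc + 2) >> 2);
        for (int y = 1; y < n; y++)       /* first column */
            dst[y * stride] = (uint8_t)((left[y] + 3 * dc + 2) >> 2);
    }
}
```

With constant reference samples the smoothing is a no-op, which makes a convenient sanity check.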
Jun Zhao 514f57f85d tests/checkasm: add HEVC intra prediction test
Add checkasm test for HEVC intra prediction covering DC, planar, and
angular modes at all block sizes (4x4 to 32x32) for 8-bit and 10-bit
depth.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-03-30 14:32:10 +00:00
nyanmisaka 87b7e578ec avcodec/amfenc: add encoder average QP stats
This allows for real-time monitoring of the encoder's average QP in ffmpeg CLI.

Signed-off-by: nyanmisaka <nst799610810@gmail.com>
2026-03-30 13:23:56 +00:00
Andreas Rheinhardt f56d073d7e swscale/tests/.gitignore: Add sws_ops_aarch64
Forgotten in a1bfaa0e78.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-30 14:31:25 +02:00
Andreas Rheinhardt 3a1e63e007 avcodec/x86/vvc/alf: Avoid zeroing unnecessarily
In case of >8bpp, there is already a zero register available
(for clipping); in case of Unix64, one can simply use an
unused register. Doing so reduces codesize.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-30 13:51:53 +02:00
Andreas Rheinhardt 8901f858eb avcodec/x86/vvc/alf: Hoist creating shift register out of loop
Possible now that this function no longer uses unnecessarily many
registers.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-30 13:51:53 +02:00
Andreas Rheinhardt a3dfc511a5 avcodec/x86/vvc/alf: Don't push+pop unused register
This function only uses 14 GPRs.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-30 13:51:53 +02:00
Andreas Rheinhardt 5de2c4c89e avcodec/x86/vvc/alf: Avoid reload
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-30 13:51:53 +02:00
Andreas Rheinhardt d727b7a64e avcodec/x86/vvc/alf: Avoid modifying nonvolatile registers
Avoids push+pop on Win64; in any case, using registers m0-m7
more often saves codesize.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-30 13:51:53 +02:00
Andreas Rheinhardt b1d6f31d65 avcodec/x86/vvc/alf: Use correct shift amount
Fixes a bug in 94f9ad8061.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-30 13:51:53 +02:00
Andreas Rheinhardt 2cce9a8279 avcodec/x86/vvc/alf: Avoid modifying nonvolatile registers
Avoids push+pop on Win64; in any case, using registers m0-m7
more often saves codesize.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-30 13:51:53 +02:00
Andreas Rheinhardt cb1ffc58ca avcodec/x86/vvc/of: Don't use ymm regs where xmm are sufficient
Also use a register in the 0-7 range as clobber reg,
as this reduces codesize (by 51B).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-30 13:51:53 +02:00
Andreas Rheinhardt 1785542a80 avcodec/x86/vvc/of: Don't add to zero
Instead rewrite the code to use assignment. Saves zeroing and
additions.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-30 13:51:53 +02:00
Andreas Rheinhardt 06fa26d2e8 avcodec/x86/vvc/of: Deduplicate common code
The height 8 and 16 cases differ from the second BDOF mini block onwards,
but even the beginning of said mini block is the same and can therefore
be deduplicated. This saves 821B here.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-30 13:51:53 +02:00
Andreas Rheinhardt 002b3bc1b3 avcodec/x86/vvc/of: Avoid punpckldq
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-30 13:51:53 +02:00
Andreas Rheinhardt ada58bd0e2 avcodec/x86/vvc/of: Use xmm registers where sufficient
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-30 13:51:53 +02:00
Andreas Rheinhardt ad34eb2ae6 avcodec/x86/vvc/of: Correct comment
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-30 13:51:53 +02:00
Andreas Rheinhardt 2570f5d307 avcodec/x86/vvc/of: Avoid scalar log2
Instead convert the integers to floats and inspect the exponent.

Old benchmarks:
apply_bdof_8_8x16_c:                                  3295.2 ( 1.00x)
apply_bdof_8_8x16_avx2:                                312.7 (10.54x)
apply_bdof_8_16x8_c:                                  3269.1 ( 1.00x)
apply_bdof_8_16x8_avx2:                                203.6 (16.05x)
apply_bdof_8_16x16_c:                                 6584.8 ( 1.00x)
apply_bdof_8_16x16_avx2:                               413.6 (15.92x)
apply_bdof_10_8x16_c:                                 3313.9 ( 1.00x)
apply_bdof_10_8x16_avx2:                               321.5 (10.31x)
apply_bdof_10_16x8_c:                                 3306.5 ( 1.00x)
apply_bdof_10_16x8_avx2:                               200.4 (16.50x)
apply_bdof_10_16x16_c:                                6659.7 ( 1.00x)
apply_bdof_10_16x16_avx2:                              402.4 (16.55x)
apply_bdof_12_8x16_c:                                 3305.7 ( 1.00x)
apply_bdof_12_8x16_avx2:                               321.8 (10.27x)
apply_bdof_12_16x8_c:                                 3258.1 ( 1.00x)
apply_bdof_12_16x8_avx2:                               198.6 (16.41x)
apply_bdof_12_16x16_c:                                6600.2 ( 1.00x)
apply_bdof_12_16x16_avx2:                              392.6 (16.81x)

New benchmarks:
apply_bdof_8_8x16_c:                                  3269.9 ( 1.00x)
apply_bdof_8_8x16_avx2:                                266.5 (12.27x)
apply_bdof_8_16x8_c:                                  3252.9 ( 1.00x)
apply_bdof_8_16x8_avx2:                                182.6 (17.81x)
apply_bdof_8_16x16_c:                                 6596.7 ( 1.00x)
apply_bdof_8_16x16_avx2:                               362.7 (18.19x)
apply_bdof_10_8x16_c:                                 3351.3 ( 1.00x)
apply_bdof_10_8x16_avx2:                               269.0 (12.46x)
apply_bdof_10_16x8_c:                                 3329.1 ( 1.00x)
apply_bdof_10_16x8_avx2:                               174.5 (19.08x)
apply_bdof_10_16x16_c:                                6654.3 ( 1.00x)
apply_bdof_10_16x16_avx2:                              357.8 (18.60x)
apply_bdof_12_8x16_c:                                 3274.1 ( 1.00x)
apply_bdof_12_8x16_avx2:                               276.0 (11.86x)
apply_bdof_12_16x8_c:                                 3263.5 ( 1.00x)
apply_bdof_12_16x8_avx2:                               176.8 (18.46x)
apply_bdof_12_16x16_c:                                6576.4 ( 1.00x)
apply_bdof_12_16x16_avx2:                              357.8 (18.38x)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-30 13:51:53 +02:00
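The "convert to float and inspect the exponent" idea from the commit above has a simple scalar analogue. The helper below is a hypothetical illustration, not the AVX2 code; it assumes IEEE-754 single precision and an input in the range 0 < x < 2^24, where the int-to-float conversion is exact.

```c
#include <stdint.h>
#include <string.h>

/* floor(log2(x)) for 0 < x < 2^24, read from the float exponent field.
 * Assumes `float` is IEEE-754 binary32. Larger inputs may round up to
 * the next power of two during conversion and give a result one too high. */
static int log2_via_float(uint32_t x)
{
    float f = (float)x;
    uint32_t bits;
    memcpy(&bits, &f, sizeof(bits));      /* well-defined type punning */
    return (int)(bits >> 23) - 127;       /* sign bit is 0 for positive x */
}
```

The vectorised version can do the same conversion and shift on several lanes at once, which is what removes the need for a scalar log2.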
Andreas Rheinhardt 03b83f8feb avcodec/x86/vvc/of: Remove redundant instructions
m8 here (corresponding to a mix of sgx2 and sgy2 in derive_bdof_vx_vy
in the C version) is always nonnegative, so the psignd boils down to
a check for m8 being zero. But if an entry of m8 is zero, then
the corresponding entry of m9 is automatically zero, too, as sgx2
being zero implies sgxdi being zero and sgy2 implies sgxgy, sgydi
being zero.* So just remove these redundant instructions.

*: In other words, one could remove the sgx2,sgy2>0 checks from
the end of derive_bdof_vx_vy() as long as av_log2(0) is defined.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-30 13:51:53 +02:00
Ramiro Polla 2517c328fc swscale/aarch64: add NEON sws_ops backend
This commit pieces together the previous few commits to implement the
NEON backend for sws_ops.

In essence, a tool which runs on the target (sws_ops_aarch64) is used
to enumerate all the functions that the backend needs to implement. The
list it generates is stored in the repository (ops_entries.c).

The list from above is used at build time by a code generator tool
(ops_asmgen) to implement all the sws_ops functions the NEON backend
supports, and generate a lookup function in C to retrieve the assembly
function pointers.

At runtime, the NEON backend fetches the function pointers to the
assembly functions and chains them together in a continuation-passing
style design, similar to the x86 backend.

The following speedup is observed from legacy swscale to NEON:
A520: Overall speedup=3.780x faster, min=0.137x max=91.928x
A720: Overall speedup=4.129x faster, min=0.234x max=92.424x

And the following from the C sws_ops implementation to NEON:
A520: Overall speedup=5.513x faster, min=0.927x max=14.169x
A720: Overall speedup=4.786x faster, min=0.585x max=20.157x

The slowdowns from legacy to NEON are the same for C/x86. Mostly low
bit-depth conversions that did not perform dithering in legacy.

The 0.585x outlier from C to NEON is gbrpf32le -> gbrapf32le, which is
mostly memcpy with the C implementation. All other conversions are
better.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-30 11:38:35 +00:00
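The continuation-passing style chaining mentioned in the NEON backend commit above can be pictured with a small C sketch. The names (Step, k_add, k_shift_left, k_return, run_chain) are invented for illustration and do not correspond to swscale symbols. Each kernel does its work and then hands control straight to the next entry instead of returning to a dispatch loop.

```c
/* One entry in the chain: the kernel to run plus its private constant. */
typedef struct Step Step;
typedef void (*KernelFn)(const Step *step, int *value);

struct Step {
    KernelFn fn;
    int      arg;
};

/* Terminator: ends the chain by simply returning. */
static void k_return(const Step *step, int *value)
{
    (void)step;
    (void)value;
}

/* Each kernel finishes by calling the next entry in tail position. */
static void k_add(const Step *step, int *value)
{
    *value += step->arg;
    step[1].fn(&step[1], value);
}

static void k_shift_left(const Step *step, int *value)
{
    *value <<= step->arg;
    step[1].fn(&step[1], value);
}

/* Entry point: the only place where a chain is entered from outside. */
static int run_chain(const Step *chain, int input)
{
    chain[0].fn(&chain[0], &input);
    return input;
}
```

A chain such as { k_add 3, k_shift_left 2, k_return } applied to 5 computes (5 + 3) << 2. In assembly the tail calls become plain indirect branches, which is why a related commit in this log marks such kernels as BTI jump targets.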
Ramiro Polla 534757926f swscale/aarch64: introduce ops_asmgen for NEON backend
The NEON sws_ops backend follows the same continuation-passing style
design as the x86 backend.

Unlike the C and x86 backends, which implement the various operation
functions through the use of templates and preprocessor macros, the
NEON backend uses a build-time code generator, which is introduced by
this commit.

This code generator has two modes of operation:
 -ops:
  Generates an assembly file in GNU assembler syntax targeting AArch64,
  which implements all the sws_ops functions the NEON backend supports.
 -lookup:
  Generates a C function with a hierarchical condition chain that
  returns the pointer to one of the functions generated above, based on
  a given set of parameters derived from SwsOp.

This is the core of the NEON sws_ops backend.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-30 11:38:35 +00:00
Ramiro Polla 991611536c swscale/aarch64: introduce a runtime aarch64 assembler interface
The runtime assembler interface provides an instruction-level IR and
builder API tailored to the needs of the swscale dynamic pipeline.
It is not meant to be a general purpose assembler interface.

Currently only a static file backend, which emits GNU assembler text,
has been implemented. In the future, this interface will be used to
write functions dynamically at runtime.

This code will be compiled both for runtime usage to generate optimized
functions and for build-time usage to generate static assembly files.
Therefore, it must not depend on internal FFmpeg libraries.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-30 11:38:35 +00:00
Ramiro Polla a1bfaa0e78 swscale/aarch64: introduce tool to enumerate sws_ops for NEON backend
The NEON sws_ops backend will use a build-time code generator for the
various operation functions it needs to implement. This build time code
generator (ops_asmgen) will need a list of the operations that must be
implemented. This commit adds a tool (sws_ops_aarch64) that generates
such a list (ops_entries.c).

The list is generated by iterating over all possible conversion
combinations and collecting the parameters for each NEON assembly
function that has to be implemented, defined by a unique set of
parameters derived from SwsOp. Whenever swscale evolves, with improved
optimization passes, new pixel formats, or improvements to the backend
itself, this file (ops_entries.c) should be regenerated by running:
    $ make sws_ops_entries_aarch64

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-30 11:38:35 +00:00
Kacper Michajłow e54e117998 avutil/x86/x86util: define .text section additionally to COMDAT one
This is needed to cover the case when the assembled source doesn't have
a .text section. The NASM documentation suggests adding a $ suffix to the
section name for COMDAT in .text, but this actually requires the main .text
section to exist as well. Also use a less generic suffix for our dummy
sub-section.

Third time's the charm.

Fixes: 80cd067715
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-03-30 01:08:45 +02:00
Soham Kute e3bcb9ac76 avformat/tests: add FATE tests for yuv4mpegpipe pixel formats
The existing fate-lavf-yuv420p.y4m covers only the default format.
Add four entries that pass -pix_fmt explicitly to the lavf_video
macro: yuv422p, yuv444p, yuv411p, and gray.

These exercise the branches in yuv4mpegpipe_write_header() that write
the "C422", "C444", "C411", and "Cmono" chroma descriptor strings in
the stream header.  All four are gated on ENCDEC(RAWVIDEO,YUV4MPEGPIPE)
and added to FATE_LAVF_VIDEO_SCALE so they inherit the requirement for
CONFIG_SCALE_FILTER that lavf_video's -auto_conversion_filters needs.

Reference files were generated from the actual encoder output and
follow the md5+size+CRC format used by the other lavf references.

Signed-off-by: Soham Kute <officialsohamkute@gmail.com>
2026-03-29 23:01:39 +00:00
Soham Kute 9bf999c24f avcodec/tests: add encoder-parser API test for H.261
Add tests/api/api-enc-parser-test.c, a generic encoder+parser round-trip
test that takes codec_name, width, and height on the command line
(defaults: h261 176 144).

Three cases are tested:

garbage - a single av_parser_parse2() call on 8 bytes with no Picture
Start Code; verifies out_size == 0 so the parser emits no spurious data.

bulk - encodes 2 frames, concatenates the raw packets, feeds the whole
buffer to a fresh parser in one call, then flushes.  Verifies that
exactly 2 non-empty frames come out and that the parser found the PSC
boundary between them.

split - the same buffer fed in two halves (chunk boundary falls inside
frame 0).  Verifies the parser still emits exactly 2 frames when input
arrives incrementally, and that the collected bytes are identical to
the bulk output (checked with memcmp).

Implementation notes: avcodec_get_supported_config() selects the pixel
format; chroma height uses AV_CEIL_RSHIFT with log2_chroma_h from
AVPixFmtDescriptor; data[1] and data[2] are checked independently so
semi-planar formats work; the encoded buffer is given
AV_INPUT_BUFFER_PADDING_SIZE zero bytes at the end; parse_stream()
skips the fed chunk if consumed==0 to prevent an infinite loop.

Two FATE entries in tests/fate/api.mak: QCIF (176x144) and CIF
(352x288), both standard H.261 resolutions.

Signed-off-by: Soham Kute <officialsohamkute@gmail.com>
2026-03-29 23:01:39 +00:00
Soham Kute dc8183377c avutil/tests/file: replace trivial test with error-path coverage
The original test only mapped the source file and printed its content,
exercising none of the error branches in av_file_map().

Replace it with a test that maps a real file (path via argv[1] for
out-of-tree builds) and verifies it is non-empty, then calls
av_file_map() on a nonexistent file twice: once with log_offset=0 to
confirm the error is logged at AV_LOG_ERROR, and once with log_offset=1
to confirm the level is raised by one, covering the
log_level_offset_offset path in av_vlog().  A custom av_log callback
captures the emitted level independently of the global log level.
The two error cases share a single for() loop to avoid duplication.

Add a FATE entry in tests/fate/libavutil.mak with CMP=null since
there is no fixed stdout to compare.

Signed-off-by: Soham Kute <officialsohamkute@gmail.com>
2026-03-29 23:01:39 +00:00
Kacper Michajłow 80cd067715 avutil/x86util: don't produce empty object files on win{32,64}
In cases where the preprocessor would remove all code, nasm would produce
empty object files. This is technically not wrong, but often causes
issues with various tooling:

* NASM fails to emit CodeView debug info when there is no code [1]
* Older VS2022 builds hang on empty files [2]
* GNU binutils `strip` errors when there are no sections [3]
error: the input file '.o' has no sections

Work around those issues by adding a dummy byte in a COMDAT section,
which is then dropped by the linker, as the `__x86util_notref` symbol is not
referenced from C. [4] IMAGE_COMDAT_SELECT_ANY (2) is used to allow
multiple symbol definitions.

This is limited to win{32,64} as this is the target where issues were
observed.

[1] https://github.com/netwide-assembler/nasm/issues/216
[2] https://developercommunity.visualstudio.com/t/MSVC-Hangs-when-compiling-ffmpeg-When-l/10233953
[3] https://trac.ffmpeg.org/ticket/6711
[4] https://www.nasm.us/docs/3.01/nasm09.html#section-9.6.1

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-03-29 23:00:06 +02:00
Kacper Michajłow 2b1d8ba3ec avfilter/x86/vf_atadenoise: move %if ARCH_X86_64 after x86util include
This is consistent with the pattern used in other files. It is also needed
for the next commit to always include x86util.asm.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-03-29 22:22:29 +02:00
Kacper Michajłow 2b8ca0f3c5 avfilter/x86/avf_showcqt: add missing section declaration
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-03-29 22:22:29 +02:00
Jun Zhao 368f58109e doc/muxers: fix mpegts muxer documentation
Fix the default value of mpegts_original_network_id from 0x0001 to
0xff01 to match the actual code (DVB_PRIVATE_NETWORK_START).

Add the missing hevc_digital_hdtv service type to the
mpegts_service_type option list.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-03-29 11:06:36 +00:00
Jun Zhao 4c0dff0878 lavf/mpegtsenc: Add parentheses to clarify operator precedence in CC update
While "cc + 1 & 0xf" is technically correct because addition has
higher precedence than bitwise AND in C, the intent of "(cc + 1) & 0xf"
is not immediately obvious without recalling the precedence table.

Add explicit parentheses to make the intended evaluation order clear
and improve readability.
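
As a quick check of the precedence claim, here is a minimal sketch in Python, where `+` binds tighter than `&` just as it does in C. The name `cc` mirrors the commit message; the helper names are made up for illustration and do not come from mpegtsenc.

```python
# Continuity counter update: a 4-bit counter that wraps from 15 back to 0.
def next_cc_implicit(cc):
    return cc + 1 & 0xf      # parsed as (cc + 1) & 0xf

def next_cc_explicit(cc):
    return (cc + 1) & 0xf    # same result, intent visible at a glance

def next_cc_misread(cc):
    return cc + (1 & 0xf)    # what a reader might wrongly assume; never wraps

# Both spellings agree for every possible counter value.
for cc in range(16):
    assert next_cc_implicit(cc) == next_cc_explicit(cc)

print(next_cc_explicit(15), next_cc_misread(15))  # prints "0 16"
```

The misread variant is only there to show what the parentheses rule out; the behaviour of the code is unchanged by the commit.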

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-03-29 11:06:36 +00:00
Niklas Haas 3f39783337 swscale/ops_chain: simplify ff_sws_compile_op_tables() with int index
Instead of this needlessly complicated dance of allocating on-stack copies
of SwsOpList only to iterate with AVERROR(EAGAIN).

This was originally thought to be useful for compiling multiple ops at once,
but even that can be solved in easier ways.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-29 12:13:40 +02:00
Niklas Haas f6a2d41fe2 swscale/ops: keep track of correct dither min/max
Mostly, this just affects the metadata in benign ways, e.g.:

 rgb24 -> yuv444p:
   [ u8 +++X] SWS_OP_READ         : 3 elem(s) packed >> 0
     min: {0, 0, 0, _}, max: {255, 255, 255, _}
   [ u8 +++X] SWS_OP_CONVERT      : u8 -> f32
     min: {0, 0, 0, _}, max: {255, 255, 255, _}
   [f32 ...X] SWS_OP_LINEAR       : matrix3+off3 [...]
     min: {16, 16, 16, _}, max: {235, 240, 240, _}
   [f32 ...X] SWS_OP_DITHER       : 16x16 matrix + {0 3 2 -1}
-    min: {33/2, 33/2, 33/2, _}, max: {471/2, 481/2, 481/2, _}
+    min: {16.001953, 16.001953, 16.001953, _}, max: {235.998047, 240.998047, 240.998047, _}
   [f32 +++X] SWS_OP_CONVERT      : f32 -> u8
     min: {16, 16, 16, _}, max: {235, 240, 240, _}
   [ u8 XXXX] SWS_OP_WRITE        : 3 elem(s) planar >> 0
     (X = unused, z = byteswapped, + = exact, 0 = zero)

However, it surprisingly actually includes a semantic change, whenever
converting from limited range to monob or monow:

 yuv444p -> monow:
   [ u8 +XXX] SWS_OP_READ         : 1 elem(s) planar >> 0
     min: {0, _, _, _}, max: {255, _, _, _}
   [ u8 +XXX] SWS_OP_CONVERT      : u8 -> f32
     min: {0, _, _, _}, max: {255, _, _, _}
   [f32 .XXX] SWS_OP_LINEAR       : luma [...]
     min: {-20/219, _, _, _}, max: {235/219, _, _, _}
   [f32 .XXX] SWS_OP_DITHER       : 16x16 matrix + {0 -1 -1 -1}
-    min: {179/438, _, _, _}, max: {689/438, _, _, _}
+    min: {-0.089371, _, _, _}, max: {2.071106, _, _, _}
+  [f32 .XXX] SWS_OP_MAX          : {0 0 0 0} <= x
+    min: {0, _, _, _}, max: {2.071106, _, _, _}
   [f32 .XXX] SWS_OP_MIN          : x <= {1 _ _ _}
-    min: {179/438, _, _, _}, max: {1, _, _, _}
+    min: {0, _, _, _}, max: {1, _, _, _}
   [f32 +XXX] SWS_OP_CONVERT      : f32 -> u8
     min: {0, _, _, _}, max: {1, _, _, _}
   [ u8 XXXX] SWS_OP_WRITE        : 1 elem(s) planar >> 3
     (X = unused, z = byteswapped, + = exact, 0 = zero)

Note the presence of an extra SWS_OP_MAX, to correctly clamp sub-blacks
(values below 16) to 0.0, rather than underflowing. This was previously
undetected because the dither was modelled as adding 0.5 to every pixel value,
but that's only true on average - not always.
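
The numbers in the listings above can be reproduced with a small range-tracking sketch. It assumes the 16x16 dither matrix holds the 256 values (2k + 1) / 512; that is inferred from the printed bounds (16 + 1/512 = 16.001953) and is not taken from the swscale source. The function names are hypothetical.

```python
from fractions import Fraction

# Assumption: a 16x16 dither matrix holding the 256 values (2k + 1) / 512.
DITHER = [Fraction(2 * k + 1, 512) for k in range(256)]

def dither_range_old(lo, hi):
    """Old model: the dither adds exactly 1/2 to every pixel."""
    return lo + Fraction(1, 2), hi + Fraction(1, 2)

def dither_range_new(lo, hi):
    """New model: track the smallest and largest dither value separately."""
    return lo + min(DITHER), hi + max(DITHER)

# Luma range after SWS_OP_LINEAR in the yuv444p -> monow listing.
lo, hi = Fraction(-20, 219), Fraction(235, 219)

old_lo, old_hi = dither_range_old(lo, hi)
new_lo, new_hi = dither_range_new(lo, hi)

print(old_lo, old_hi)                  # prints "179/438 689/438"
print(float(new_lo), float(new_hi))    # roughly -0.089371 and 2.071106

# Only the new bounds reveal that the value can drop below zero,
# which is why the extra SWS_OP_MAX clamp appears.
needs_max_clamp = new_lo < 0
```

Under the old model the lower bound stays positive, so nothing suggests that a clamp at zero is required.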

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-29 12:13:11 +02:00
Niklas Haas 7989fd973a swscale/ops: add min/max to SwsDitherOp
This gives more accurate information to the range tracker.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-29 12:10:38 +02:00
Niklas Haas 915523e136 swscale/ops: add missing check on SwsDitherOp.y_offset
Doesn't actually affect anything in the currently generated ops lists.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-29 12:10:38 +02:00
Niklas Haas 7af7b8664b swscale/ops_chain: check SWS_COMP_GARBAGE instead of next->comps.unused
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-29 09:39:09 +00:00
Niklas Haas 048ca3b367 swscale/ops_optimizer: check COMP_GARBAGE instead of next->comps.unused
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-29 09:39:09 +00:00
Niklas Haas e8f6c9dbf2 swscale/ops: only print SWS_OP_SCALE denom if not 1
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-29 09:39:09 +00:00
Niklas Haas ff397a327a swscale/ops: remove unneeded conditional on describe_comp_flags
next->comps.unused[] is redundant with SWS_COMP_GARBAGE now.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-29 09:39:09 +00:00
Niklas Haas 804041045e swscale/ops: remove redundant unused mask from ops printout
This is now fully redundant with the previous op's output, because unused
components are always marked as garbage on the input side.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-29 09:39:09 +00:00
Niklas Haas 13388c0cac swscale/ops: test for SWS_COMP_GARBAGE instead of next->comps.unused
When printing/describing operations.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-29 09:39:09 +00:00
Niklas Haas 6fb0efb35c swscale/ops: strip value range from garbage components
Just removes the unnecessary value range after the WRITE, as a result of
the previous change and the fact that we already skipped printing these for
unused components.

 rgb24 -> bgr24:
   [ u8 XXXX -> +++X] SWS_OP_READ         : 3 elem(s) packed >> 0
     min: {0 0 0 _}, max: {255 255 255 _}
   [ u8 ...X -> +++X] SWS_OP_SWIZZLE      : 2103
     min: {0 0 0 _}, max: {255 255 255 _}
   [ u8 ...X -> XXXX] SWS_OP_WRITE        : 3 elem(s) packed >> 0
-    min: {0 0 0 _}, max: {255 255 255 _}
     (X = unused, z = byteswapped, + = exact, 0 = zero)

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-29 09:39:09 +00:00
Niklas Haas 7d94d9fc52 swscale/ops: mark all unused components as GARBAGE
This only affects the print-out of the SWS_OP_WRITE at the end of every op
list, because the ops list print-out was otherwise already checking the unused
mask.

 rgb24 -> bgr24:
   [ u8 XXXX -> +++X] SWS_OP_READ         : 3 elem(s) packed >> 0
     min: {0 0 0 _}, max: {255 255 255 _}
   [ u8 ...X -> +++X] SWS_OP_SWIZZLE      : 2103
     min: {0 0 0 _}, max: {255 255 255 _}
-  [ u8 ...X -> +++X] SWS_OP_WRITE        : 3 elem(s) packed >> 0
+  [ u8 ...X -> XXXX] SWS_OP_WRITE        : 3 elem(s) packed >> 0
     min: {0 0 0 _}, max: {255 255 255 _}
     (X = unused, z = byteswapped, + = exact, 0 = zero)

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-29 09:39:09 +00:00
Niklas Haas 672c0ad69a swscale/ops: slightly refactor unused[] computation
Needed for the upcoming removal of op->comps.unused[]. This keeps the
dependency array entirely within the ff_sws_op_list_update_comps() function,
apart from being arguably simpler and easier to follow.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-29 09:39:09 +00:00
Niklas Haas c0cc7f341a swscale/ops: simplify SwsOpList.order_src/dst
Just define these directly as integer arrays; there's really no point in
having them re-use SwsSwizzleOp; the only place this was ever even remotely
relevant was in the no-op check, which any decent compiler should already
be capable of optimizing into a single 32-bit comparison.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-29 09:39:09 +00:00
Niklas Haas d33403ba50 avfilter/buffersrc: use 1 << n for flags (cosmetic)
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-29 09:38:01 +00:00
WyattBlue 33f837a9e9 avfilter/af_whisper.c: Set split_on_word
This prevents `max_len` splitting via tokens, which splits words
like "don't" and proper nouns inappropriately.
2026-03-29 09:37:41 +00:00
nyanmisaka 107a309f3c fftools/ffmpeg_filter: fix the incomplete printing of reason for video filter graph reconfiguration
"Reconfiguring filter graph because video parameters changed to yuv420p10le(pc, bt709), 1920x1080, unspecified alph"

Fixup f07573f

Adding a missing space fixed this.
2026-03-29 09:34:23 +00:00
James Almer 482e7a1696 avformat/matroskadec: remove unnecessary log
Added by mistake in ec86dade2f

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-29 00:08:26 -03:00
James Almer ad7d270935 avcodec/libdav1d: call ff_attach_decode_data() on output frames
This will allow the injection of LCEVC side data.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-28 22:07:54 -03:00
James Almer 823c6fc0b8 avcodec/decode: make LCEVC injection available to decoders that don't call ff_get_buffer()
Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-28 22:07:54 -03:00
James Almer 8528c697c7 avcodec/av1dec: add support for LCEVC ITU-T35 payloads
Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-28 22:07:54 -03:00
James Almer 4c7a8df34d avcodec/av1dec: refactor parsing ITU-T35 metadata
Use a switch case. Will be useful in the following commit.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-28 22:07:54 -03:00
James Almer 29d8c2af4d avcodec/libdav1d: add support for LCEVC ITU-T35 payloads
Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-28 22:07:54 -03:00
James Almer fe1ffd63fb avcodec/libdav1d: refactor parsing ITU-T35 metadata
Use a switch case. Will be useful in the following commit.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-28 22:07:54 -03:00
James Almer ec86dade2f avformat/matroskadec: add support for LCEVC ITU-T35 payloads
Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-28 22:07:54 -03:00
James Almer 47dc4e3429 avformat/matroskadec: refactor parsing Block Additional
Use a switch case. Will be useful in the following commit.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-28 22:07:50 -03:00
Andreas Rheinhardt 1a7979a2f8 avcodec/x86/h26x/h2656_inter: Simplify splatting coefficients
For pre-AVX2, vpbroadcastw is emulated via a load, followed
by two shuffles. Yet given that one always wants to splat
multiple pairs of coefficients which are adjacent in memory,
one can do better than that: Load all of them at once, perform
a punpcklwd with itself and use one pshufd per register.
In case one has to sign-extend the coefficients, too,
one can replace the punpcklwd with one pmovsxbw (instead of one
per register) and use pshufd directly afterwards.
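
The shuffle sequence can be modelled on plain lists. This is a sketch of the instruction semantics only, with one list element standing for one 16-bit word of an XMM register; it is not the actual macro from h2656_inter, and the sample values are arbitrary.

```python
def punpcklwd(a, b):
    """Interleave the low four 16-bit words of two 8-word registers."""
    out = []
    for i in range(4):
        out += [a[i], b[i]]
    return out

def pshufd(reg, imm8):
    """Pick each output dword (a pair of words) using 2 bits of imm8."""
    dwords = [reg[2 * i:2 * i + 2] for i in range(4)]
    out = []
    for i in range(4):
        out += dwords[(imm8 >> (2 * i)) & 3]
    return out

# One load fetches all adjacent words at once.
coeffs = [10, 11, 12, 13, 14, 15, 16, 17]

# punpcklwd with itself doubles every word, so each dword holds one value twice;
# a single pshufd per destination register then broadcasts that dword.
doubled = punpcklwd(coeffs, coeffs)
splats = [pshufd(doubled, 0x55 * i) for i in range(4)]

# Sign-extended case: after pmovsxbw there is one word per coefficient,
# so pshufd alone splats a pair of adjacent coefficients.
pairs = [pshufd(coeffs, 0x55 * i) for i in range(4)]
```

Here `splats[2]` is eight copies of 12, and `pairs[1]` is the pair (12, 13) repeated four times, which is the layout a pmaddwd-style multiply consumes.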

This saved 4816B of .text here.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-29 01:05:23 +01:00
Andreas Rheinhardt a72b00675c avcodec/x86/h26x/h2656_inter: Don't prepare unused coeffs for hv funcs
8 tap motion compensation functions with both vertical and horizontal
components are under severe register pressure, so that the filter
coefficients have to be put on the stack. Before this commit,
this meant that coefficients for use with pmaddubsw and pmaddwd
were always created. Yet this is completely unnecessary, as
every such register is only used for exactly one purpose and
it is known at compile time which one it is (only 8bit horizontal
filters are used with pmaddubsw), so only prepare that one.
This also allows halving the amount of stack used.

This saves 2432B of .text here.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-29 01:05:23 +01:00
Andreas Rheinhardt 88870f33ab avcodec/x86/h26x/h2656_inter: Remove always-true checks
It has already been checked before that we are only dealing
with high bitdepth here.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-29 01:05:23 +01:00
Andreas Rheinhardt eb5ac9fee7 avfilter/x86/vf_idetdsp: Avoid (v)movdqa
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-29 01:05:23 +01:00
Andreas Rheinhardt c00721310f avcodec/x86/hevc/deblock: Avoid vmovdqa
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-29 01:05:23 +01:00
Andreas Rheinhardt 5c88f46c92 avutil/x86/aes: Only assemble iff HAVE_AESNI_EXTERNAL
This avoids relying on DCE and works around a NASM bug [1].

[1]: https://github.com/netwide-assembler/nasm/issues/216

Reviewed-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 23:25:54 +01:00
Andreas Rheinhardt 4c179adeaf avcodec/Makefile: Add avformat->h2645_parse.o lcevctab.o dependencies
Fixes static --disable-everything builds.
Forgotten in 053822d9ce
and 49c449b33a.

Reviewed-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 23:25:31 +01:00
Andreas Rheinhardt e91727e7ef avcodec/x86/mpeg4videodsp: Fix build failure without x86asm
Since ba793127c4,
the x86 mpeg4videodsp code uses ff_emulated_edge_mc_sse2()
instead of ff_emulated_edge_mc_8. This leads to linker errors
when x86asm is disabled. Fix this by also falling back to ff_gmc_c()
in case edge emulation is needed with external SSE2 being unavailable.

An alternative is to go back to ff_emulated_edge_mc_8(), but this
would re-add the ugliness to videodsp for a niche case.

Reported-by: James Almer <jamrial@gmail.com>
Reviewed-by: Hendrik Leppkes <h.leppkes@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 22:39:05 +01:00
James Almer a5c10346fc avcodec/lcevcdec: do nothing with unsupported pixel formats
Instead of failing and stopping the decoding process.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-28 18:33:12 -03:00
James Almer 5a75d905cb avformat/mpegts: create stream groups after having parsed the entire PMT
Some faulty files have an LCEVC descriptor with a single stream, resulting in
a group being created but never fully populated with the current
implementation.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-28 18:13:36 -03:00
James Almer d069ba22ff avcodec/decode: don't try to apply LCEVC enhancements if some other kind of post processing is active
Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-28 20:14:13 +00:00
James Almer d6a22cda38 avcodec/decode: add a hwaccel specific post_process callback to FrameDecodeData
Leave the existing one for non decoder-specific, post processing usage.
With this, scenarios like nvdec decoding can work alongside lcevc enhancement application.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-28 20:14:13 +00:00
Lynne 5482deeb66 lavfi/scale_vulkan: fix width/height match check
Sponsored-by: Sovereign Tech Fund
2026-03-28 19:36:58 +01:00
Lynne 0e077f2dc1 swscale/vulkan: do not apply order_src/dst for packed r/w
> packed = load all components from a single plane (the index given by order_src[0])
> planar = load one component each from separate planes (the index given by order_src[i])

Sponsored-by: Sovereign Tech Fund
2026-03-28 19:36:04 +01:00
Lynne 69c9cfbddf swscale/vulkan: fix redundant check for packed data
This is always in the branch where packed == false.

Sponsored-by: Sovereign Tech Fund
2026-03-28 19:36:04 +01:00
Niklas Haas 814f862832 swscale/graph: add scaling ops when required
The question of whether to do vertical or horizontal scaling first is a tricky
one. There are several valid philosophies:

1. Prefer horizontal scaling on the smaller pixel size, since this lowers the
   cost of gather-based kernels.
2. Prefer minimizing the number of total filter taps, i.e. minimizing the size
   of the intermediate image.
3. Prefer minimizing the number of rows horizontal scaling is applied to.

Empirically, I'm still not sure which approach is best overall, and it probably
depends at least a bit on the exact filter kernels in use. But for now, I
opted to implement approach 3, which seems to work well. I will re-evaluate
this once the filter kernels are actually finalized.
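
One possible reading of approach 3, as a sketch: the horizontal pass touches `src_h` rows when it runs first and `dst_h` rows when it runs after vertical scaling, so the order follows from comparing the two. The function names and the tie-breaking rule are assumptions, not the logic in swscale/graph.

```python
def rows_for_horizontal(src_h, dst_h, vertical_first):
    """Number of rows the horizontal filter has to process."""
    return dst_h if vertical_first else src_h

def pick_order(src_h, dst_h):
    """Approach 3: run the horizontal pass on as few rows as possible."""
    v_first = rows_for_horizontal(src_h, dst_h, vertical_first=True)
    h_first = rows_for_horizontal(src_h, dst_h, vertical_first=False)
    # Ties go to horizontal-first here; that choice is arbitrary.
    return "vertical_first" if v_first < h_first else "horizontal_first"

print(pick_order(1080, 720))   # prints "vertical_first"
print(pick_order(720, 1080))   # prints "horizontal_first"
```

In other words, vertical downscaling goes first and vertical upscaling goes last under this rule.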

The 'scale' in 'libswscale' can now stand for 'scaling'.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas 2ef01689c4 swscale/x86/ops: add 4x4 transposed kernel for large filters
Above a certain filter size, we can load the offsets as scalars and loop
over filter taps instead. To avoid having to assemble the output register
in memory (or use some horrific sequence of blends and insertions), we process
4 adjacent pixels at a time and do a 4x4 transpose before accumulating the
weights.

Significantly faster than the existing kernels after 2-3 iterations.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas 4bf51d6615 swscale/x86/ops: add reference SWS_OP_FILTER_H implementation
This uses a naive gather-based loop, similar to the existing legacy hscale
SIMD. This has provably correct semantics (and avoids overflow as long as
the filter scale is 1 << 14 or so), though it's not particularly fast for
larger filter sizes.

We can specialize this to more efficient implementations in a subset of cases,
but for now, this guarantees a match to the C code.
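
For orientation, a gather-based horizontal filter can be sketched as below. The per-pixel offset table, the weight layout and the 1 << 14 fixed-point scale are assumptions modelled on the description above; the real kernel is x86 assembly and differs in every detail.

```python
FILTER_SCALE = 1 << 14  # assumed fixed-point scale; each pixel's weights sum to this

def filter_h(row, offsets, weights):
    """out[x] = sum_k weights[x][k] * row[offsets[x] + k], rescaled to float."""
    out = []
    for x, taps in enumerate(weights):
        acc = 0
        for k, w in enumerate(taps):
            acc += w * row[offsets[x] + k]   # the "gather": per-pixel start offset
        out.append(acc / FILTER_SCALE)
    return out

# Upscale 4 pixels to 6 with a two-tap filter.
row = [0, 64, 128, 255]
offsets = [0, 0, 1, 1, 2, 2]
weights = [[16384, 0], [8192, 8192]] * 3

result = filter_h(row, offsets, weights)
print(result)   # prints "[0.0, 32.0, 64.0, 96.0, 128.0, 191.5]"
```

The inner loop runs once per tap, which is why this shape becomes slow for large filter sizes.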

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas 568cdca9cc swscale/x86/ops: implement support for SWS_OP_FILTER_V
Ideally, we would like to be able to specialize these to fixed kernel
sizes as well (e.g. 2 taps), but that only saves a tiny bit of loop overhead
and at the moment I have more pressing things to focus on.

I found that using FMA instead of straight mulps/addps gains about 15%, so
I defined a separate FMA path that can be used when BITEXACT is not specified
(or when we can statically guarantee that the final sum fits into the floating
point range).

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas 7966de1ce6 swscale/x86/ops: add support for applying y line bump
A single `imul` per line here is completely irrelevant in terms of
overhead, and definitely not worth whatever precomputation would be
required to avoid it.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas 77588898e2 swscale/x86/ops: add some missing packed shuffle instances
Missing ayuv64le -> gray and vyu444 -> gray; these conversions can arise
transiently during scaling.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas 98f2aba45a swscale/x86/ops: add bxq/yq variants of bxd/yd
Sometimes, bxd/yd need to be passed directly to a 64-bit memory operand,
which requires the use of the 64-bit variants. Since we can't guarantee that
the high bits are correctly zero'd on function entry, add an explicit
movsxd instruction to cover the first loop iteration.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas 48369f6cf2 swscale/x86/ops: reserve one more temporary register
Slightly more convenient for the calculations inside the filter kernel, and
ultimately not significant due to the fact that the extra register only needs
to be saved on the loop entrypoint.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas 4ff32b6e86 swscale/ops_chain: add optional check() call to SwsOpEntry
Allows implementations to implement more advanced logic to determine if an
operation is compatible or not.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas 7b6170a9a5 tests/swscale: don't hard-error on low bit depth SSIM loss
This is an expected consequence of the fact that the new ops code does not
yet do error diffusion, which only really affects formats like rgb4 and monow.

Specifically, this avoids erroring out with the following error:

 loss 0.214988 is WORSE by 0.0111071, ref loss 0.203881
 SSIM {Y=0.745148 U=1.000000 V=1.000000 A=1.000000}

When scaling monow -> monow from 96x96 to 128x96.

We can remove this hack again in the future when error diffusion is implemented,
but for now, this check prevents me from easily testing the scaling code.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas d8b82c1097 tests/checkasm/sw_ops: add tests for SWS_OP_FILTER_H/V
These tests check that the (fused) read+filter ops work.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas 0402ecc270 tests/checkasm/sw_ops: set value range on op list input
May allow more efficient implementations that rely on the value range being
constrained.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas 43242e8a88 tests/checkasm/sw_ops: increase line count
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas 1a8c3d522e swscale/ops_backend: add support for SWS_OP_FILTER_H
Naive scalar loop to serve mainly as a reference for the asm backends.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas e787f75ec8 swscale/ops_backend: add support for SWS_OP_FILTER_V
These could be implemented as a special case of DECL_READ(), but the
amount of extra noise that entails is not worth it; especially due to the
extra setup/free code that needs to be used here.

I've decided that, for now, the canonical implementation shall convert the
weights to floating point before doing the actual scaling. This is not a huge
efficiency loss (since the result will be 32-bit anyways, and mulps/addps are
1-cycle ops); so the main downside comes from the single extra float conversion
on the input pixels.

In theory, we may revisit this later if it turns out that using e.g. pmaddwd
is a win even for vertical scaling, but for now, this works and is a simple
starting point. Vertical scaling also tends to happen after horizontal scaling,
at which point the input will be F32 already to begin with.

For smaller types/kernels (e.g. U8 input with a reasonably sized kernel),
the result here is exact either way, since the resulting 8+14 bit sum fits
exactly into float.
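
The exactness argument can be checked numerically. This sketch assumes non-negative weights that sum to 1 << 14, so the largest possible sum for 8-bit input is 255 * (1 << 14); with negative filter lobes, partial sums could be larger than this bound.

```python
import struct

def to_f32(x):
    """Round-trip a value through IEEE-754 single precision."""
    return struct.unpack("<f", struct.pack("<f", float(x)))[0]

max_u8_sum = 255 * (1 << 14)      # largest 8-bit pixel times the full weight sum

# 8 + 14 = 22 bits, which fits in the 24-bit significand of a float.
assert max_u8_sum.bit_length() == 22
assert to_f32(max_u8_sum) == max_u8_sum

# For contrast: beyond 24 bits, odd integers are no longer representable.
assert to_f32((1 << 24) + 1) != (1 << 24) + 1
```

Every integer below 1 << 24 survives the round trip, so any sum within that bound is exact.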

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas 542557ba47 swscale/ops_backend: implement support for y_bump map
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas fce3deaa3b swscale/ops_backend: add SwsOpExec to SwsOpIter
Needed for the scaling kernel, which accesses line strides.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas 0b91b5a5e4 swscale/ops_backend: remove unused/wrong #define
PIXEL_MIN is either useless (int) or wrong (float); should be -FLT_MAX
rather than FLT_MIN, if the intent is to capture the most negative possible
value.

Just remove it since we don't actually need it for anything.
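
The FLT_MIN pitfall has a direct analogue in Python, whose floats are C doubles. The sketch below therefore shows the double-precision counterparts (DBL_MIN and DBL_MAX); the single-precision constants behave the same way.

```python
import sys

# sys.float_info.min is the smallest positive normal value, not the most
# negative one; this mirrors FLT_MIN / DBL_MIN in C.
smallest_positive_normal = sys.float_info.min
most_negative_finite = -sys.float_info.max

assert smallest_positive_normal > 0
assert most_negative_finite < -1e300

# Using the "MIN" constant as a lower clamp bound would destroy negative values.
assert max(-5.0, smallest_positive_normal) != -5.0
assert max(-5.0, most_negative_finite) == -5.0
```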

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas e6e9c45892 swscale/ops_dispatch: try again with split subpasses if compile() fails
First, we try compiling the filter pass as-is; in case any backends decide to
handle the filter as a single pass. (e.g. Vulkan, which will want to compile
such using internal temporary buffers and barriers)

If that fails, retry with a chained list of split passes.
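
The control flow described above amounts to a try-then-fall-back pattern. The sketch uses exceptions and string op names purely for illustration; the backend, the splitter and all names are hypothetical, and the real code works with error codes and SwsOpList structures.

```python
def compile_with_fallback(compile_fn, split_fn, op_list):
    """Try the whole list first; on failure, compile the split subpasses."""
    try:
        return [compile_fn(op_list)]
    except NotImplementedError:
        return [compile_fn(sub) for sub in split_fn(op_list)]

def cpu_compile(ops):
    # Hypothetical CPU backend: a filter is only accepted as the first op.
    if any(op.startswith("filter") for op in ops[1:]):
        raise NotImplementedError(ops)
    return tuple(ops)

def split_at_filters(ops):
    # Hypothetical splitter: start a new subpass at every filter op.
    passes, cur = [], []
    for op in ops:
        if op.startswith("filter") and cur:
            passes.append(cur)
            cur = []
        cur.append(op)
    passes.append(cur)
    return passes

passes = compile_with_fallback(cpu_compile, split_at_filters,
                               ["read", "filter_h", "filter_v", "write"])
print(passes)   # prints "[('read',), ('filter_h',), ('filter_v', 'write')]"
```

A backend that can handle the whole list in one go never reaches the splitter, which matches the Vulkan case mentioned above.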

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas e3daeff965 swscale/ops_dispatch: compute input x offset map for SwsOpExec
This is cheap to precompute and can be used as-is for gather-style horizontal
filter implementations.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas dc88946d7b swscale/ops_dispatch: fix plane width calculation
This was wrong if sub_x > 1.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas 78878b9daa swscale/ops_dispatch: refactor tail handling
Rather than dispatching the compiled function for each line of the tail
individually, with a memcpy to a shared buffer in between, this instead copies
the entire tail region into a temporary intermediate buffer, processes it with
a single dispatch call, and then copies the entire result back to the
destination.

The main benefit of this is that it enables scaling, subsampling or other
quirky layouts to continue working, which may require accessing lines adjacent
to the main input.

It also arguably makes the code a bit simpler and easier to follow, but YMMV.

One minor consequence of the change in logic is that we also no longer handle
the last row of an unpadded input buffer separately - instead, if *any* row
needs to be padded, *all* rows in the current slice will be padded. This is
a bit less efficient but much more predictable, and as discussed, basically
required for scaling/filtering anyways.

While we could implement some sort of hybrid regime where we only use the new
logic when scaling is needed, I really don't think this would gain us anything
concrete enough to be worth the effort, especially since the performance is
roughly the same across the board:

16 threads:
  yuv444p 1920x1080 -> ayuv 1920x1080: speedup=1.000x slower (input memcpy)
  rgb24   1920x1080 -> argb 1920x1080: speedup=1.012x faster (output memcpy)

1 thread:
  yuv444p 1920x1080 -> ayuv 1920x1080: speedup=1.062x faster (input memcpy)
  rgb24   1920x1080 -> argb 1920x1080: speedup=0.959x slower (output memcpy)

Overall speedup is +/- 1% across the board, well within margin of error.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas 015abfab38 swscale/ops_dispatch: precompute relative y bump map
This is more useful for tight loops inside CPU backends, which can implement
this by having a shared path for incrementing to the next line (as normal),
and then a separate path for adding an extra position-dependent, stride
multiplied line offset after each completed line.

As a free upside, this encoding does not require any separate/special handling
for the exec tail.
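
A relative bump map can be derived from an absolute line map as sketched below. The encoding (extra lines to add after the normal increment) is an assumption based on the description above, and the real code works with stride-multiplied offsets; here a "line" is just an index.

```python
def relative_bumps(src_line):
    """bump[y] = extra lines to skip after finishing output line y."""
    bumps = [src_line[y + 1] - src_line[y] - 1 for y in range(len(src_line) - 1)]
    return bumps + [0]   # nothing follows the last line

def walk(first_line, bumps):
    """Replay the bumps the way a tight per-line loop would."""
    line, visited = first_line, []
    for bump in bumps:
        visited.append(line)
        line += 1          # shared path: advance to the next line as normal
        line += bump       # separate path: position-dependent extra offset
    return visited

# Vertical upscale 4 -> 6: some output lines reuse the same input line.
src_line = [0, 0, 1, 2, 2, 3]
bumps = relative_bumps(src_line)

print(bumps)   # prints "[-1, 0, 0, -1, 0, 0]"
assert walk(src_line[0], bumps) == src_line
```

When no scaling is involved, every bump is zero and the loop degenerates to a plain line-by-line walk.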

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas 2583d7ad9b swscale/ops_dispatch: add line offsets map to SwsOpPass
And use it to look up the correct source plane line for each destination
line. Needed for vertical scaling, in which case multiple output lines can
reference the same input line.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:14 +01:00
Niklas Haas 9f0353a5b7 swscale/ops_optimizer: implement filter optimizations
We have to move the filters out of the way very early to avoid blocking
SWS_OP_LINEAR fusion, since filters tend to be nested in between all the
decode and encode linear ops.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:13 +01:00
Niklas Haas a41bc1dea3 swscale/ops_optimizer: merge duplicate SWS_OP_SCALE
(As long as the constant doesn't overflow)
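
A sketch of such a merge, assuming the scale factors are rationals whose numerator and denominator must each fit in a signed 32-bit int (as in AVRational). The helper name and the choice to return `None` on overflow are illustrative only.

```python
from fractions import Fraction

INT_MAX = 2**31 - 1

def merge_scales(a, b):
    """Return the combined factor, or None if it does not fit in int/int."""
    merged = a * b   # Fraction reduces to lowest terms automatically
    if abs(merged.numerator) > INT_MAX or merged.denominator > INT_MAX:
        return None  # keep the two ops separate instead
    return merged

print(merge_scales(Fraction(1, 255), Fraction(255, 65535)))    # prints "1/65535"
print(merge_scales(Fraction(1, 65535), Fraction(1, 65537)))    # prints "None"
```

The second call fails because 65535 * 65537 equals 2**32 - 1, which is larger than INT_MAX.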

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:13 +01:00
Niklas Haas cba54e9e3b swscale/ops: add helper function to split filter subpasses
An operation list containing multiple filter passes, or containing nontrivial
operations before a filter pass, needs to be split up into multiple execution
steps with temporary buffers in between; at least for CPU backends.

This helper function introduces the necessary subpass splitting logic.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:13 +01:00
Niklas Haas bf09910292 swscale/ops: add filter kernel to SwsReadWriteOp
This allows reads to directly embed filter kernels. This is because, in
practice, a filter needs to be combined with a read anyways. To accomplish
this, we define filter ops as their semantic high-level operation types, and
then have the optimizer fuse them with the corresponding read/write ops
(where possible).

Ultimately, something like this will be needed anyways for subsampled formats,
and doing it here is just incredibly clean and beneficial compared to each
of the several alternative designs I explored.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:13 +01:00
Niklas Haas 63140bff5e swscale/ops: define SWS_OP_FILTER_H/V
This commit merely adds the definitions. The implementations will follow.

It may seem a bit impractical to have these filter ops given that they
break the usual 1:1 association between operation inputs and outputs, but
the design path I chose will have these filter "pseudo-ops" end up migrating
towards the read/write for CPU implementations. (Which don't benefit from
any ability to hide the intermediate memory internally the way e.g. a fused
Vulkan compute shader might).

What we gain from this design, on the other hand, is considerably cleaner
high-level code, which doesn't need to concern itself with low-level
execution details at all, and can just freely insert these ops wherever
it needs to. The dispatch layer will take care of actually executing these
by implicitly splitting apart subpasses.

To handle out-of-range values and so on, the filters by necessity have to
also convert the pixel range. I have settled on using floating point types
as the canonical intermediate format - not only does this save us from having
to define e.g. I32 as a new intermediate format, but it also allows these
operations to chain naturally into SWS_OP_DITHER, which will basically
always be needed after a filter pass anyways.

The one exception here is for point sampling, which would rather preserve
the input type. I'll worry about this optimization at a later point in time.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:13 +01:00
Niklas Haas 53ee892035 swscale/graph: add way to roll back passes
When an op list needs to be decomposed into a more complicated sequence
of passes, the compile() code may need to roll back passes that have already
been partially compiled, if a later pass fails to compile.

This matters for subpass splitting (e.g. for filtering), as well as for
plane splitting.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:13 +01:00
Niklas Haas 475b11b2e0 swscale/filters: write new filter LUT generation code
This is a complete rewrite of the math in swscale/utils.c initFilter(), using
floating point math and with a bit more polished UI and internals. I have
also included a substantial number of improvements, including a method to
numerically compute the true filter support size from the parameters, and a
more robust handling of the edge conditions. The upshot of these changes is
that the filter weight computation is now much simpler and faster, and with
fewer edge cases.

I copy/pasted the actual underlying kernel functions from libplacebo, so this
math is already quite battle-tested. I made some adjustments to the defaults
to align with the existing defaults in libswscale, for backwards compatibility.

Note that this commit introduces a lot more filter kernels than what we
actually expose; but they are cheap to carry around, don't take up binary
space, and will probably save some poor soul from incorrectly reimplementing
them in the future. Plus, I have plans to expand the list of functions down
the line, so it makes sense to just define them all, even if we don't
necessarily use them yet.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 18:50:13 +01:00
Niklas Haas f76aa4e408 swscale/tests/sws_ops: add option for summarizing all operation patterns
This can be used to either manually verify, or perhaps programmatically
generate, the list of operation patterns that need to be supported by a
backend to be feature-complete.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 16:48:13 +00:00
Niklas Haas d7a079279f swscale/tests/sws_ops: refactor argument parsing
To allow for argumentless options in the future.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 16:48:13 +00:00
Ramiro Polla b6e470467e swscale/tests/sws_ops: add -v option to set log verbosity
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-28 16:48:13 +00:00
Niklas Haas d3db2dc518 swscale/tests/sws_ops: simplify using ff_sws_enum_op_lists()
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 16:48:13 +00:00
Niklas Haas 4395e8f3a2 swscale/ops: add helper function to enumerate over all op lists
This moves the logic from tests/sws_ops into the library itself, where it
can be reused by e.g. the aarch64 asmgen backend to iterate over all possible
operation types it can expect to see.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 16:48:13 +00:00
Niklas Haas f62c837eb6 swscale/ops: move op-formatting code to helper function
Annoyingly, access to order_src/dst requires access to the SwsOpList, so
we have to append that data after the fact.

Maybe this is another incremental tick in favor of `SwsReadWriteOp` in the
ever-present question in my head of whether the plane order should go there
or into SwsOpList.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 16:48:13 +00:00
Niklas Haas 8fae195395 swscale/ops: avoid printing values for ignored components
Makes the list output a tiny bit tidier. This is cheap to support now thanks
to the print_q4() helper.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 16:48:13 +00:00
Niklas Haas 1caa548caf swscale/ops: refactor PRINTQ() macro
Instead of allocating a billion tiny temporary buffers, these helpers now
directly append to an AVBPrint. I decided to explicitly control whether or not
a value with denom 0 should be printed as "inf/nan" or as "_", because a lot
of ops have the implicit semantic of "den == 0 -> ignored". At the same time,
we don't want to obscure legitimate NAN/INF values when they do occur
unintentionally.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 16:48:13 +00:00
Niklas Haas 0d54a1b53a swscale/ops: remove , from comp min/max print-out for consistency
Interferes with an upcoming simplification, otherwise.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 16:48:13 +00:00
Niklas Haas a0bb0c2772 swscale/ops: use AVBPrint for assembling op descriptions
This commit does not yet touch the PRINTQ macro, but it gets rid of at least
one unnecessary hand-managed buffer.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 16:48:13 +00:00
Niklas Haas 95e6c68707 swscale/ops: print exact constant on SWS_OP_SCALE
More informative.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 16:48:13 +00:00
Niklas Haas f6d963553b swscale/ops: correctly uninit all ops in ff_sws_op_list_remove_at()
This only ever removed a single op, even with count > 1.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 16:48:13 +00:00
Niklas Haas 6f1664382d swscale/format: add helper function to get "default" SwsFormat
But still apply the sanitization/defaulting logic from ff_fmt_from_frame().

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 16:48:13 +00:00
Niklas Haas 08a7b714f2 swscale/format: move SwsFormat sanitization to helper function
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-28 16:48:13 +00:00
Priyanshu Thapliyal d1bcaab230 avcodec/alsdec: preserve full float value in zero-truncated samples
Signed-off-by: Priyanshu Thapliyal <priyanshuthapliyal2005@gmail.com>
2026-03-28 12:18:37 +00:00
Priyanshu Thapliyal febc82690d avcodec/alsdec: propagate read_diff_float_data() errors in read_frame_data()
The return value of read_diff_float_data() was previously ignored,
allowing decode to continue silently with partially transformed samples
on malformed floating ALS input. Check and propagate the error.

All failure paths in read_diff_float_data() already return
AVERROR_INVALIDDATA, so the caller fix is sufficient without
any normalization inside the function.

Signed-off-by: Priyanshu Thapliyal <priyanshuthapliyal2005@gmail.com>
2026-03-28 11:53:38 +00:00
Andreas Rheinhardt bb65b54f2f avcodec/x86/sbcdsp: Port MMX sbc_calc_scalefactors to SSE4
Besides giving a nice speedup over the MMX version,
it also avoids processing more input and touching more
output than necessary in the 2ch-4subbands case.

calc_scalefactors_1ch_4subbands_c:                     106.9 ( 1.00x)
calc_scalefactors_1ch_4subbands_mmx:                    46.7 ( 2.29x)
calc_scalefactors_1ch_4subbands_sse4:                   11.8 ( 9.05x)
calc_scalefactors_1ch_8subbands_c:                     220.5 ( 1.00x)
calc_scalefactors_1ch_8subbands_mmx:                    92.3 ( 2.39x)
calc_scalefactors_1ch_8subbands_sse4:                   23.8 ( 9.28x)
calc_scalefactors_2ch_4subbands_c:                     222.5 ( 1.00x)
calc_scalefactors_2ch_4subbands_mmx:                   139.3 ( 1.60x)
calc_scalefactors_2ch_4subbands_sse4:                   23.6 ( 9.41x)
calc_scalefactors_2ch_8subbands_c:                     440.3 ( 1.00x)
calc_scalefactors_2ch_8subbands_mmx:                   196.8 ( 2.24x)
calc_scalefactors_2ch_8subbands_sse4:                   46.5 ( 9.48x)

The MMX version has been removed.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 11:25:38 +01:00
Andreas Rheinhardt e4e5beb394 tests/checkasm/sbcdsp: Add test for calc_scalefactors
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 11:25:38 +01:00
Andreas Rheinhardt cd886bf0a5 avcodec/x86/sbcdsp: Port ff_sbc_analyze_[48]_mmx to SSE2
Halves the number of pmaddwd instructions and improves performance a lot:
sbc_analyze_4_c:                                        55.7 ( 1.00x)
sbc_analyze_4_mmx:                                       7.0 ( 7.94x)
sbc_analyze_4_sse2:                                      4.3 (12.93x)
sbc_analyze_8_c:                                       131.1 ( 1.00x)
sbc_analyze_8_mmx:                                      22.4 ( 5.84x)
sbc_analyze_8_sse2:                                     10.7 (12.25x)

It also saves 224B of .text and allows removing the emms_c()
from sbcenc.c (notice that ff_sbc_calc_scalefactors_mmx()
issues emms on its own, so it already abides by the ABI).

Hint: A pshufd could be avoided per function if the constants
were reordered.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 11:25:38 +01:00
Andreas Rheinhardt 7cf5e90586 tests/checkasm: Add sbcdsp tests
Only sbc_analyze_4 and sbc_analyze_8 for now.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 11:25:38 +01:00
Andreas Rheinhardt 90215634f1 avcodec/sbcenc: Remove redundant memset()
A codec's private context is zero-allocated.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 11:25:38 +01:00
Andreas Rheinhardt f670006960 avcodec/sbcenc: Use correct size for PutBitContext
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 11:25:38 +01:00
Andreas Rheinhardt 3540a6a308 avcodec/sbcenc: Don't output uninitialized data
Check in init whether the parameters are valid.
This can be triggered with
ffmpeg -i tests/data/asynth-44100-2.wav -c sbc -sbc_delay 0.001 \
-b:a 100k -f null -

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 11:25:38 +01:00
Andreas Rheinhardt b5ce98b3ff avcodec/sbcdsp: Constify
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 11:25:38 +01:00
Andreas Rheinhardt af45345f7e tests/fate: Add SBC tests
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 11:25:38 +01:00
Andreas Rheinhardt 0a81a1ce66 avcodec/x86/sbcdsp: Fix calculating four-subbands stereo scalefactors
sbc_calc_scalefactors uses an int32_t [16/*max blocks*/][2/*max
channels*/][8/*max subbands*/] array. The MMX version of this code
treats the two inner arrays as one [2*8] array to process
and it processes subbands*channels of them. But when subbands
is < 8 and channels is two, the entries to process are not
contiguous: One has to process 0..subbands-1 and 8..7+subbands,
yet the code processed 0..2*subbands-1.
This commit fixes this by processing entries 0..7+subbands
if there are two channels.

Before this commit, the following command line triggered an
av_assert2() in put_bits():
ffmpeg_g -i tests/data/asynth-44100-2.wav -c sbc -b:a 200k \
-sbc_delay 0.003 -f null -

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 11:25:38 +01:00
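
The indexing problem described in the four-subbands stereo commit above can be checked with a small sketch. It only models the flattened `[2][8]` layout and the entry counts quoted in the commit message; the helper names are hypothetical and none of this is the actual SIMD code.

```c
#include <stdbool.h>

#define MAX_SUBBANDS 8 /* inner array dimension quoted in the commit message */

/* Flattened index of (channel, subband) inside one [2][8] block row. */
static int flat_index(int ch, int sb)
{
    return ch * MAX_SUBBANDS + sb;
}

/* Number of flattened entries processed before the fix. */
static int entries_before(int channels, int subbands)
{
    return channels * subbands;
}

/* Number of flattened entries processed after the fix:
 * 0..7+subbands when there are two channels. */
static int entries_after(int channels, int subbands)
{
    return channels == 2 ? MAX_SUBBANDS + subbands : subbands;
}

/* True if processing entries 0..count-1 reaches every entry that is in use. */
static bool covers_all(int count, int channels, int subbands)
{
    for (int ch = 0; ch < channels; ch++)
        for (int sb = 0; sb < subbands; sb++)
            if (flat_index(ch, sb) >= count)
                return false;
    return true;
}
```

For two channels and four subbands, the used entries are 0..3 and 8..11, so a count of 8 misses the second channel while a count of 12 covers it (at the cost of also touching the unused entries 4..7).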
Andreas Rheinhardt 1c9f56f969 avcodec/sbc: Use union to save space
One buffer is encoder-only, the other decoder-only.
Also move crc_ctx before the buffers (into padding).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 11:25:38 +01:00
Andreas Rheinhardt 7e032d6963 avcodec/sbcdec: Remove AVClass* from context
This decoder has no private class.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 11:25:38 +01:00
Andreas Rheinhardt e249dfce72 doc/optimization: Don't refer to non-existing subdirectory
Removed in cdd139d760.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-28 04:43:28 +00:00
James Almer 62f944d594 avfilter/vf_lcevc: add missing pixel formats
Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-27 21:00:53 -03:00
James Almer 1b7483dddd avfilter/vf_lcevc: workaround for unknown initial dimensions
This is not enough, as filters down the chain may get wrong dimensions.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-27 21:00:19 -03:00
James Almer eb40d70081 avcodec/lcevcdec: add missing pixel formats
Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-27 21:00:14 -03:00
James Almer 96b1b0bf67 avcodec/lcevcdec: also decompose NON_IDR NALUs
The first Global Config process block may be in one of them.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-27 20:56:45 -03:00
Anton Khirnov 3befae81f1 lavc/decode: change sw format selection logic in avcodec_default_get_format()
Choose the first non-hwaccel format rather than the last one. This
matches the logic in ffmpeg CLI and selects YUVA rather than YUV for
HEVC with alpha.
2026-03-27 19:42:08 -03:00
Anton Khirnov dba8c62400 lavfi/vf_tiltandshift: stop (ab)using AVFrame.opaque
This filter uses AVFrame.opaque to build a linked list of AVFrames. This
is very wrong, as AVFrame.opaque is intended to store caller's private
data and may not be touched by filters. What's worse, the filter leaks
the opaque values to the outside.

Use an AVFifo instead of a linked list to implement the same logic.
2026-03-27 19:42:08 -03:00
Andreas Rheinhardt 6ed6815b46 avcodec/tests/motion: Remove test tool
It only tests MMX (me_cmp does not have pure MMX functions any more)
and MMXEXT and is therefore x86-only. Furthermore, checkasm is superior
in every regard.

Removing it also fixes a build failure (there is no dependency of this
tool on me_cmp).

Reviewed-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-27 18:48:48 +01:00
osamu620 edab091ac2 avcodec/jpeg2000: Remove trailing whitespace
Remove trailing whitespace
2026-03-27 13:56:00 +00:00
Osamu Watanabe 8490363634 avcodec/jpeg2000: Fix undefined behavior on ROI shift-up 2026-03-27 13:56:00 +00:00
Georgii Zagoruiko 1c385023aa aarch64/vvc: Optimisations of put_chroma_v() functions for 10/12-bit
Apple M4:
put_chroma_v_10_2x2_c:                                   5.8 ( 1.00x)
put_chroma_v_10_4x4_c:                                   9.0 ( 1.00x)
put_chroma_v_10_4x4_neon:                                1.7 ( 5.29x)
put_chroma_v_10_8x8_c:                                  22.1 ( 1.00x)
put_chroma_v_10_8x8_neon:                                5.8 ( 3.79x)
put_chroma_v_10_16x16_c:                                56.3 ( 1.00x)
put_chroma_v_10_16x16_neon:                             21.2 ( 2.66x)
put_chroma_v_10_32x32_c:                               181.6 ( 1.00x)
put_chroma_v_10_32x32_neon:                             86.9 ( 2.09x)
put_chroma_v_10_64x64_c:                               680.3 ( 1.00x)
put_chroma_v_10_64x64_neon:                            337.4 ( 2.02x)
put_chroma_v_10_128x128_c:                            2567.3 ( 1.00x)
put_chroma_v_10_128x128_neon:                         1374.8 ( 1.87x)
put_chroma_v_12_2x2_c:                                   6.4 ( 1.00x)
put_chroma_v_12_4x4_c:                                   8.2 ( 1.00x)
put_chroma_v_12_4x4_neon:                                1.5 ( 5.56x)
put_chroma_v_12_8x8_c:                                  18.9 ( 1.00x)
put_chroma_v_12_8x8_neon:                                5.7 ( 3.29x)
put_chroma_v_12_16x16_c:                                52.6 ( 1.00x)
put_chroma_v_12_16x16_neon:                             19.9 ( 2.65x)
put_chroma_v_12_32x32_c:                               185.7 ( 1.00x)
put_chroma_v_12_32x32_neon:                             81.9 ( 2.27x)
put_chroma_v_12_64x64_c:                               661.8 ( 1.00x)
put_chroma_v_12_64x64_neon:                            342.1 ( 1.93x)
put_chroma_v_12_128x128_c:                            2547.8 ( 1.00x)
put_chroma_v_12_128x128_neon:                         1368.0 ( 1.86x)

RPi4:
put_chroma_v_10_2x2_c:                                  64.8 ( 1.00x)
put_chroma_v_10_4x4_c:                                 157.2 ( 1.00x)
put_chroma_v_10_4x4_neon:                               39.7 ( 3.96x)
put_chroma_v_10_8x8_c:                                 562.1 ( 1.00x)
put_chroma_v_10_8x8_neon:                               98.8 ( 5.69x)
put_chroma_v_10_16x16_c:                              1170.7 ( 1.00x)
put_chroma_v_10_16x16_neon:                            380.7 ( 3.07x)
put_chroma_v_10_32x32_c:                              3696.6 ( 1.00x)
put_chroma_v_10_32x32_neon:                           1723.8 ( 2.14x)
put_chroma_v_10_64x64_c:                             13170.9 ( 1.00x)
put_chroma_v_10_64x64_neon:                           7284.1 ( 1.81x)
put_chroma_v_10_128x128_c:                           46068.3 ( 1.00x)
put_chroma_v_10_128x128_neon:                        27219.5 ( 1.69x)
put_chroma_v_12_2x2_c:                                  63.8 ( 1.00x)
put_chroma_v_12_4x4_c:                                 156.5 ( 1.00x)
put_chroma_v_12_4x4_neon:                               39.3 ( 3.98x)
put_chroma_v_12_8x8_c:                                 560.9 ( 1.00x)
put_chroma_v_12_8x8_neon:                               98.7 ( 5.68x)
put_chroma_v_12_16x16_c:                              1169.9 ( 1.00x)
put_chroma_v_12_16x16_neon:                            380.8 ( 3.07x)
put_chroma_v_12_32x32_c:                              3693.9 ( 1.00x)
put_chroma_v_12_32x32_neon:                           1728.4 ( 2.14x)
put_chroma_v_12_64x64_c:                             13170.9 ( 1.00x)
put_chroma_v_12_64x64_neon:                           7284.9 ( 1.81x)
put_chroma_v_12_128x128_c:                           46068.0 ( 1.00x)
put_chroma_v_12_128x128_neon:                        27224.6 ( 1.69x)
2026-03-27 13:42:50 +00:00
Ingo Oppermann 4bb2989cce fftools/ffmpeg_filter: remove duplicate assignment
Signed-off-by: Ingo Oppermann <ingo@datarhei.com>
2026-03-27 06:37:27 +00:00
James Almer 5dfe661f03 avformat/mov: ignore duplicate streams referenced with an sbas tref entry
Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-26 22:17:07 -03:00
Priyanshu Thapliyal ae6f233988 avcodec/alsdec: fix mantissa unpacking in compressed Part A path
Signed-off-by: Priyanshu Thapliyal <priyanshuthapliyal2005@gmail.com>
2026-03-26 16:25:09 +00:00
Zhao Zhili fd9f1e9c52 avfilter/vf_drawtext: fix newline rendered as .notdef glyph
GET_UTF8 advances the pointer past the newline byte before the
newline check, so shape_text_hb receives text that includes the
newline character. Since HarfBuzz does not treat U+000A as
default-ignorable, it gets shaped into a .notdef glyph.

Fixes #21565

Reported-by: scriptituk <info@scriptit.uk>
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2026-03-26 07:24:15 +00:00
Kacper Michajłow 0f600cbc16 configure: enable nasm debug information also for non-ELF targets
The default NASM selection of debug information formats should cover all
cases nicely. See `nasm -h -F` for the default and supported formats.

This commit allows emitting debug information for macho{32,64} (DWARF)
and win{32,64} (CodeView), where previously only ELF targets would
get debug information.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-03-26 00:25:29 +00:00
Niklas Haas 238df21a4f avfilter/vf_libplacebo: early-free unused resources
Otherwise, this will indefinitely persist the last couple of mapped frames
(including any extra decoded frames) in memory, even though they will never be
used again, causing a gradual memory leak until filter uninit.

Signed-off-by: Niklas Haas <git@haasn.dev>
Sponsored-by: nxtedition AB
2026-03-25 17:47:09 +00:00
Priyanshu Thapliyal e7b4ddc9d6 avcodec/pngdec: fix dead overflow check in decode_text_to_exif()
The expression (exif_len & ~SIZE_MAX) is always 0 for size_t,
making the overflow guard permanently dead code.

Reported-by: Guanni Qu <qguanni@gmail.com>
Signed-off-by: Priyanshu Thapliyal <priyanshuthapliyal2005@gmail.com>
2026-03-25 16:48:12 +00:00
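
Why the guard in the pngdec commit above can never fire is easy to show in isolation. In the sketch below, `dead_check()` reproduces the expression quoted in the commit message, while `add_would_overflow()` is one common working alternative; it is an assumption for illustration and not necessarily the fix that was applied.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The expression from the commit message: ~SIZE_MAX is 0 for size_t,
 * so the mask clears every bit and the result is always false. */
static bool dead_check(size_t len)
{
    return (len & ~SIZE_MAX) != 0;
}

/* Hypothetical working guard: true if len + extra would wrap around. */
static bool add_would_overflow(size_t len, size_t extra)
{
    return len > SIZE_MAX - extra;
}
```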
Aleksoid e84b3c7e98 avcodec/vp9: Fixed memory leak when vp9_frame_alloc() function fails. 2026-03-25 14:31:34 +00:00
Kacper Michajłow e17d84ac8a avcodec/vp9: fix cbs fragment leak on error
Fixes: c0bf1382a7
Fixes: 490257166/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_VP9_fuzzer-6185031050788864
Fixes: 490131106/clusterfuzz-testcase-minimized-fuzzer_loadfile-5438205762797568
Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-03-25 14:02:19 +00:00
Zhao Zhili 44ad73031d avcodec/bsf/lcevc_metadata: fix copy-paste typo in chroma loc setup
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2026-03-25 12:19:46 +00:00
Zhao Zhili 3cfbf56192 doc/developer: allow whitespace changes mixed with functional changes
The cosmetic-changes policy in developer.texi was written during the SVN
era, when reviewing indentation changes mixed with functional changes
was genuinely difficult.

Since FFmpeg has moved to Git, reviewers now have simple built-in tools
to ignore whitespace changes:

  git diff -w
  git log -p --ignore-all-space

Forgejo's pull request UI also offers a 'Hide whitespace changes'
toggle, making it trivial to focus on the functional diff.

For those who prefer reviewing patches in their mail client, the same
result can be achieved by saving the patch and running:

  git apply --ignore-whitespace <patch> && git diff -w

Relax the policy so that indentation changes which are invisible to
git diff --ignore-all-space may accompany functional changes, while
still requiring non-whitespace cosmetic changes to be in separate
commits.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2026-03-25 04:14:23 +00:00
marcos ashton 5d70f0844c libavutil/stereo3d: fix prefix matching in *_from_name() functions
The three *_from_name() functions used av_strstart() for prefix matching,
which returns incorrect results when one name is a prefix of another.

av_stereo3d_from_name("side by side (quincunx subsampling)") matched
"side by side" at index 1 and returned AV_STEREO3D_SIDEBYSIDE instead of
AV_STEREO3D_SIDEBYSIDE_QUINCUNX. Similarly,
av_stereo3d_primary_eye_from_name("nonexistent") matched "none" and
returned AV_PRIMARY_EYE_NONE instead of -1.

Switch all three functions from av_strstart() to strcmp() for exact
matching. No in-tree callers rely on prefix matching.

Signed-off-by: marcos ashton <marcosashiglesias@gmail.com>
2026-03-25 01:32:20 +00:00
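
The difference between prefix and exact matching in the stereo3d commit above can be reproduced with a short sketch. The table below is a shortened, illustrative list and its indices are not the real enum values; `from_name_prefix()` uses `strncmp()` as a stand-in for the prefix test rather than calling `av_strstart()` itself.

```c
#include <string.h>

/* Illustrative table; indices are hypothetical, not the libavutil enum values. */
static const char *const names[] = {
    "2D",
    "side by side",
    "side by side (quincunx subsampling)",
};
#define NB_NAMES (sizeof(names) / sizeof(names[0]))

/* Prefix matching: returns the first table entry that is a prefix of `name`. */
static int from_name_prefix(const char *name)
{
    for (size_t i = 0; i < NB_NAMES; i++)
        if (!strncmp(name, names[i], strlen(names[i])))
            return (int)i;
    return -1;
}

/* Exact matching: returns the entry equal to `name`, or -1 if there is none. */
static int from_name_exact(const char *name)
{
    for (size_t i = 0; i < NB_NAMES; i++)
        if (!strcmp(name, names[i]))
            return (int)i;
    return -1;
}
```

With prefix matching, the longer name resolves to the shorter entry that happens to come first in the table; exact matching returns the intended entry and rejects unknown names.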
marcos ashton 9559a6036d libavfilter/vf_v360: fix operator precedence in stereo loop condition
The loop condition in the DEFINE_REMAP macro:

  stereo < 1 + s->out_stereo > STEREO_2D

is parsed by C as:

  (stereo < (1 + s->out_stereo)) > STEREO_2D

Since STEREO_2D is 0 and relational operators return 0 or 1, the
outer comparison against 0 is a no-op for STEREO_2D and STEREO_SBS.
But for STEREO_TB (value 2) the loop runs 3 iterations instead of 2,
producing an out-of-bounds stereo pass.

Add parentheses so the comparison is evaluated first:

  stereo < 1 + (s->out_stereo > STEREO_2D)

This gives 1 iteration for 2D and 2 for any stereo format (SBS or TB),
matching the actual number of stereo views.

Signed-off-by: marcos ashton <marcosashiglesias@gmail.com>
2026-03-25 01:19:08 +00:00
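
The iteration counts stated in the vf_v360 commit above can be verified with a standalone sketch. The enum values 0, 1, 2 follow the commit message; the two helper functions are hypothetical and only count loop iterations. The first loop is written with the parentheses that C's precedence rules imply for the original expression.

```c
/* Values as described in the commit message. */
enum { STEREO_2D = 0, STEREO_SBS = 1, STEREO_TB = 2 };

/* Loop condition as the compiler parses the original expression. */
static int passes_before(int out_stereo)
{
    int n = 0;
    for (int stereo = 0; (stereo < (1 + out_stereo)) > STEREO_2D; stereo++)
        n++;
    return n;
}

/* Loop condition with the parentheses added by the fix. */
static int passes_after(int out_stereo)
{
    int n = 0;
    for (int stereo = 0; stereo < 1 + (out_stereo > STEREO_2D); stereo++)
        n++;
    return n;
}
```

Both versions agree for STEREO_2D and STEREO_SBS; they differ only for STEREO_TB, where the original condition yields three passes and the fixed one yields two.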
Priyanshu Thapliyal 1853c80e20 avcodec/alsdec: fix abs(INT_MIN) UB in read_diff_float_data()
Replace abs() with FFABSU() to avoid undefined behavior when
raw_samples[c][i] == INT_MIN. Per libavutil/common.h, FFABS()
has the same INT_MIN UB as abs(); FFABSU() is the correct
helper as it casts to unsigned before negation.

Reported-by: Guanni Qu <qguanni@gmail.com>
Signed-off-by: Priyanshu Thapliyal <priyanshuthapliyal2005@gmail.com>
2026-03-25 00:16:41 +00:00
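
The technique named in the alsdec commit above, converting to unsigned before negating, can be shown without any FFmpeg headers. `abs_unsigned()` is a hypothetical function that mirrors the idea behind FFABSU(); it is not the macro itself.

```c
#include <limits.h>

/* Absolute value returned as unsigned. The conversion to unsigned happens
 * before the negation, so the arithmetic is modular and INT_MIN is handled
 * without signed overflow. */
static unsigned abs_unsigned(int a)
{
    return a <= 0 ? -(unsigned)a : (unsigned)a;
}
```

The result for INT_MIN is `(unsigned)INT_MAX + 1`, a value that has no representation as a positive `int`, which is why `abs()` cannot return it.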
Ted Meyer fc7cab6be3 avformat/mov: Handle integer overflow in MOV parser
A chromium UBSAN fuzzer caught this instance.
2026-03-24 23:48:18 +00:00
wangbin 49c449b33a avformat/codecstring: fix undefined lcevc symbols if muxers are disabled 2026-03-24 23:14:41 +00:00
Andreas Rheinhardt 76a5d7c545 avfilter/framepool: Mark init, uninit functions as av_cold
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-24 23:13:00 +00:00
Andreas Rheinhardt c3486e96dd avfilter/framepool: Reindent after the previous commit
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-24 23:13:00 +00:00
Andreas Rheinhardt e1e2c85537 avfilter/framepool: Use av_unreachable() for unreachable code
Instead of av_assert0(0).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-24 23:13:00 +00:00
Andreas Rheinhardt 1c101330d6 avfilter/framepool: Remove impossible branches
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-24 23:13:00 +00:00
Andreas Rheinhardt 92d06a8027 avcodec/vvc/ctu: Put scratchbufs into union to save space
This reduces sizeof(VVCLocalContext) from 4580576B to
3408032B here.

Reviewed-by: Frank Plowman <post@frankplowman.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-24 18:12:00 +01:00
Andreas Rheinhardt a10c731723 avcodec/vvc/ctu: Move often accessed fields to the start of structs
And move the big buffers to the end. This reduces codesize
as offset+displacement addressing modes are either unavailable
or require more bytes if the displacement is too large. E.g. this
saves 5952B on x64 here and 3008B on AArch64. This change should
also improve data locality.

Reviewed-by: Frank Plowman <post@frankplowman.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-24 18:10:55 +01:00
Andreas Rheinhardt e41799d6ec avcodec/vvc: Use static_assert where appropriate
Reviewed-by: Frank Plowman <post@frankplowman.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-24 18:09:43 +01:00
Lynne f40fcf8024 libavfilter/scale_vulkan: do not unnecessarily set s->qf on every frame
The initialization function already does this.

Sponsored-by: Sovereign Tech Fund
2026-03-24 15:21:23 +01:00
Lynne 940e407c64 libavfilter/scale_vulkan: use swscale's Vulkan code
This commit enables using swscale's newly added Vulkan code.

Sponsored-by: Sovereign Tech Fund
2026-03-24 15:21:17 +01:00
Lynne e88d4ef718 swscale/vulkan: take order_src/order_dst into account
This fixes rgba/gbrap/bgra conversions.

Sponsored-by: Sovereign Tech Fund
2026-03-24 15:21:17 +01:00
Lynne e01d19aad6 swscale/vulkan: implement SWS_OP_LINEAR
Sponsored-by: Sovereign Tech Fund
2026-03-24 15:21:17 +01:00
Lynne 4805f317a6 swscale/vulkan: implement SWS_OP_DITHER
Sponsored-by: Sovereign Tech Fund
2026-03-24 15:21:17 +01:00
Lynne d212ff08e0 swscale/vulkan: implement SWS_OP_CONVERT
Sponsored-by: Sovereign Tech Fund
2026-03-24 15:21:17 +01:00
Lynne bea41f1f90 swscale/vulkan: implement SWS_OP_LSHIFT/SWS_OP_RSHIFT
Sponsored-by: Sovereign Tech Fund
2026-03-24 15:21:17 +01:00
Lynne d35a77879c swscale/vulkan: implement SWS_OP_MIN/SWS_OP_MAX
Sponsored-by: Sovereign Tech Fund
2026-03-24 15:21:16 +01:00
Lynne bf93a67733 swscale/vulkan: implement SWS_OP_SCALE
Sponsored-by: Sovereign Tech Fund
2026-03-24 15:21:16 +01:00
Lynne 41f748d2ee swscale/vulkan: add precise qualifier to f32
Sponsored-by: Sovereign Tech Fund
2026-03-24 15:21:09 +01:00
James Almer d61d724905 avcodec/bsf/lcevc_metadata: write Additional Info blocks after the Global Config block
Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-24 11:14:17 -03:00
James Almer 35a1e43a6a avcodec/cbs_lcevc: fix writing process blocks with size 6
6 is an undefined value for payload_size_type. For those, 7 is used to signal
a custom_byte_size syntax element.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-24 11:12:25 -03:00
Niklas Haas 00d1f41b2e swscale/ops_backend: avoid UB (null pointer arithmetic)
Just use uintptr_t, it accomplishes the exact same thing while being defined
behavior.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-24 13:20:59 +00:00
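
The general idea behind the swscale/ops_backend commit above, doing address arithmetic on `uintptr_t` instead of on a possibly-null pointer, can be sketched as follows. `addr_offset()` is a hypothetical helper; the actual code changed by the commit is not shown in the log and may look different.

```c
#include <stddef.h>
#include <stdint.h>

/* Compute "base + offset" as an integer address. Adding a nonzero offset to
 * a null pointer is undefined behavior in C, whereas unsigned integer
 * arithmetic on uintptr_t is always defined. The numeric value obtained from
 * converting a pointer is implementation-defined. */
static uintptr_t addr_offset(const void *base, size_t offset)
{
    return (uintptr_t)base + offset;
}
```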
Scott Theisen 7516bf24db libavformat/mpegts.c: pat_cb(): Ensure all PIDs are valid
but just ignore invalid PAT entries so subsequent valid
entries are parsed.

ISO/IEC 13818-1:2021 specifies a valid range of [0x0010, 0x1FFE] in
§ 2.4.4.6 Semantic definition of fields in program association section
and Table 2-3 – PID table

ts->current_pid is always 0 since that is the PID for the PAT.
2026-03-23 19:50:13 +00:00
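
The validity range quoted in the mpegts commit above, together with the "skip invalid entries, keep parsing" behavior, can be sketched in a few lines. The function names are hypothetical and the sketch works on a plain array of PIDs rather than on a real PAT section.

```c
#include <stdbool.h>

/* Valid range quoted in the commit message. */
#define PID_MIN 0x0010
#define PID_MAX 0x1FFE

static bool pat_pid_is_valid(unsigned pid)
{
    return pid >= PID_MIN && pid <= PID_MAX;
}

/* Count usable entries; invalid ones are skipped so that later valid
 * entries are still seen, instead of aborting at the first bad PID. */
static int count_valid_pids(const unsigned *pids, int n)
{
    int valid = 0;
    for (int i = 0; i < n; i++) {
        if (!pat_pid_is_valid(pids[i]))
            continue;
        valid++;
    }
    return valid;
}
```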
Nariman-Sayed 2501954d49 avformat/rtpdec: fix RTCP RR cumulative packet loss clamping
Per RFC 3550 Appendix A.3, the cumulative number of packets lost is a
signed 24-bit field. Clamp to signed 24-bit range using av_clip_intp2
and av_zero_extend to handle duplicate packets correctly.
2026-03-23 19:49:25 +00:00
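
The signed 24-bit handling described in the rtpdec commit above can be sketched with two self-contained helpers. `clip_s24()` and `pack_s24()` are hypothetical stand-ins for the clamping and zero-extension steps; they are not the libavutil functions named in the commit message.

```c
#include <stdint.h>

/* Clamp a loss count to the signed 24-bit range [-2^23, 2^23 - 1]. */
static int32_t clip_s24(int64_t v)
{
    const int32_t lo = -(1 << 23);
    const int32_t hi = (1 << 23) - 1;

    if (v < lo)
        return lo;
    if (v > hi)
        return hi;
    return (int32_t)v;
}

/* Keep only the low 24 bits, i.e. the two's-complement field value. */
static uint32_t pack_s24(int32_t v)
{
    return (uint32_t)v & 0xFFFFFF;
}
```

A negative count, which can arise when duplicate packets are received, survives the round trip as a two's-complement 24-bit value instead of being clamped to zero or wrapping into a large positive number.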
Romain Beauxis 053fb462d8 tests/fate/ogg-*.mak: Make sure that copy tests do not run when
$(FFMPEG) is not compiled.
2026-03-23 10:53:33 -05:00
James Almer e1158301f0 avformat/mov: don't try to create an LCEVC group if there's a single track
In this scenario, as it's the case with DASH segments, the lcevc track will be
alone but potentially have a sbas tref entry referencing itself, which will
make avformat_stream_group_add_stream() fail.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-23 10:50:13 -03:00
Kacper Michajłow 9c63742425 avutil/attributes: enable av_flatten when available
This enables av_flatten on Clang in particular.

It was disabled because at the time this attribute was not supported.
It was implemented in Clang/LLVM 3.5 [1].

Use `__has_attribute` to check for availability. This has been added in
Clang 2.9 [2].

This reverts change 5858a67f13.

[1] https://github.com/llvm/llvm-project/commit/41af7c2fdc8cc2ef186669dcb21cac58d5bd69ee
[2] https://github.com/llvm/llvm-project/commit/274a70ed7f4315c83273173fce4c3b0e097958d6

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-03-22 15:55:54 +00:00
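
The `__has_attribute` pattern mentioned in the av_flatten commit above generally looks like the sketch below. The macro name `my_flatten` and the two functions are hypothetical; this is not the definition from libavutil/attributes.h, only the usual shape of such a feature check.

```c
/* Compilers without __has_attribute are treated as not supporting it. */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif

#if __has_attribute(flatten)
#define my_flatten __attribute__((flatten))
#else
#define my_flatten
#endif

static inline int square(int x)
{
    return x * x;
}

/* With flatten available, the compiler is asked to inline the calls made
 * inside this function; without it, the macro expands to nothing. */
static my_flatten int sum_of_squares(int a, int b)
{
    return square(a) + square(b);
}
```

Checking for the attribute itself, rather than for a compiler version, lets any compiler that implements it pick it up automatically.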
Romain Beauxis 878fb73135 libavfilter/vf_frei0r: use .so suffix for plugins on macOS 2026-03-22 14:27:36 +00:00
James Almer 711b1a52bd avformat/movenc: check if a packet is to be discarded when calculating edit list durations
Demuxers like mov will export packets not meant for presentation (e.g. because
an edit list doesn't include them) by flagging them as discard, but the mov
muxer completely ignored this, resulting in output edit lists considering every
packet.

Fixes issue #22552

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-21 23:35:39 -03:00
Michael Niedermayer 1bde76da89 avcodec/dvdsub_parser: Fix buf_size check
Fixes: signed integer overflow
Fixes: out of array access
Fixes: dvdsub_int_overflow_mixed_ps.mpg

Found-by: Quang Luong of Calif.io in collaboration with OpenAI Codex
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-22 00:33:26 +00:00
Andreas Rheinhardt 9d97771bc6 avcodec/bsf/extract_extradata: Remove pointless checks
It doesn't hurt to keep track of filtered_size:
The end result will be ignored if extradata is not removed
from the bitstream.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-21 15:05:17 +00:00
Andreas Rheinhardt 1dd853010a avcodec/bsf/extract_extradata: Redo extracting LCEVC extradata
Changes compared to the current version include:
1. We no longer use a dummy PutByteContext on the first pass
for checking whether there is extradata in the NALU. Instead
the first pass no longer writes anything to any PutByteContext
at all; the size information is passed via additional int*
parameters. (This no longer discards const when initializing
the dummy PutByteContext, fixing a compiler warning.)
2. We actually error out on invalid data in the first pass,
ensuring that the second pass never fails.
3. The first pass is used to get the exact sizes of both
the extradata and the filtered data. This obviates the need
for reallocating the buffers later on. (It also means
that the extradata side data will have been allocated with
av_malloc (ensuring proper alignment) instead of av_realloc().)
4. The second pass now writes both extradata and (if written)
the filtered data instead of parsing the NALUs twice.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-21 15:05:17 +00:00
Andreas Rheinhardt 548b9f5ca7 avcodec/bsf/extract_extradata: Inline constants
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-21 15:05:17 +00:00
stevxiao a077da895b avfilter/dnn_backend_torch: add CUDA/ROCm device support
Add support for CUDA and ROCm (AMD GPU) devices in the LibTorch DNN
backend.

This works for both NVIDIA CUDA and AMD ROCm, as PyTorch exposes ROCm
through the CUDA-compatible API.

Usage:

./ffmpeg -i input.mp4 -vf scale=224:224,format=rgb24,dnn_processing=dnn_backend=torch:model=sr_model_torch.pt:device=cuda output.mp4

Reviewed-by: Guo Yejun <yejun.guo@intel.com>
Signed-off-by: younengxiao <steven.xiao@amd.com>
2026-03-21 16:25:40 +08:00
marcos ashton 924cc51ffe tests/fate/pcm: add FATE tests for pcm_bluray encoder and decoder
Add enc_dec_pcm roundtrip tests for the pcm_bluray codec covering
mono, stereo, 5.1, 7.0, and 7.1 channel layouts in s16. The 5.1
and 7.0 tests use an explicit pan filter for channel layout
conversion so the PAN_FILTER dependency is declared only where
needed. An additional s32 test uses a FATE sample file with real
>16-bit content (divertimenti_2ch_96kHz_s24.wav) and decodes to
s32le to verify the full 32-bit round-trip.

enc_dec_pcm is used instead of transcode because the MPEGTS muxer
produces different binary output on 32-bit and 64-bit platforms,
causing the intermediate file checksum to fail on 32-bit CI.

Coverage for libavcodec/pcm-bluray.c: 0.00% -> 93.75%
Coverage for libavcodec/pcm-blurayenc.c: 0.00% -> 91.71%

Signed-off-by: marcos ashton <marcosashiglesias@gmail.com>
2026-03-21 01:04:20 +00:00
marcos ashton 345071f747 tests/fate/libavutil: add FATE test for stereo3d
Add a unit test covering av_stereo3d_alloc, av_stereo3d_alloc_size,
av_stereo3d_create_side_data, av_stereo3d_type_name,
av_stereo3d_from_name, av_stereo3d_view_name,
av_stereo3d_view_from_name, and av_stereo3d_primary_eye_name.
The from_name calls are driven by a static name table so each
string appears exactly once. Round-trip inverse checks verify
that type_name/from_name and view_name/view_from_name are
consistent with each other.

Coverage for libavutil/stereo3d.c: 0.00% -> 100.00%

Signed-off-by: marcos ashton <marcosashiglesias@gmail.com>
2026-03-21 01:04:20 +00:00
marcos ashton ed19c181c2 tests/fate/libavutil: add FATE test for film_grain_params
Add a unit test covering alloc, create_side_data, and select
for AV1 and H.274 film grain parameter types (22 cases).

Coverage for libavutil/film_grain_params.c: 0.00% -> 97.73%

Signed-off-by: marcos ashton <marcosashiglesias@gmail.com>
2026-03-21 01:04:20 +00:00
marcos ashton a43ea8bff7 avfilter/af_pan: fix sscanf() return value checks in parse_channel_name
sscanf() returns EOF (-1) on input failure, which is non-zero and
passes a bare truthy check. When this happens, the %n directive is
never processed, so len stays uninitialized. Using that value to
advance the arg pointer causes an out-of-bounds read and crash.

Check for >= 1 instead, matching the fix applied to the other
sscanf() call in init() by commit b5b6391d64.
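
The pattern can be sketched as follows. parse_channel_index() and its format
string are hypothetical and are not the code from af_pan; only the return
value check is the point.

```c
#include <stdio.h>

/* Hypothetical helper: parses "c<N>" and reports the consumed length.
 * Returns 0 on success, -1 if nothing could be parsed. */
static int parse_channel_index(const char *arg, int *index, int *consumed)
{
    int len = 0;

    /* sscanf() returns the number of converted fields, or EOF (-1) on
     * input failure. %n is not counted and is only stored once the
     * preceding conversion has succeeded, so require >= 1. */
    if (sscanf(arg, "c%d%n", index, &len) >= 1) {
        *consumed = len;
        return 0;
    }
    return -1;
}
```

A bare `if (sscanf(...))` would treat the EOF result for an empty string as
success; the `>= 1` comparison rejects it.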

Fixes: https://code.ffmpeg.org/FFmpeg/FFmpeg/issues/22451
Signed-off-by: marcos ashton <marcosashiglesias@gmail.com>
2026-03-21 00:44:30 +00:00
tark1998 442e6c80bf avformat/mov: add m4v to allowed extensions
M4V is a standard extension for MPEG-4 video files, commonly used by
Apple devices and software. While it is functionally similar to MP4,
it was missing from the list of recognized extensions for the
MOV/MP4 demuxer.
2026-03-21 00:40:39 +00:00
marcos ashton dfa53aae5f avutil/bswap: fix implicit conversion warning in av_bswap64
Explicitly cast uint64_t arguments to uint32_t before passing them
to av_bswap32(). The truncation is intentional (extracting low and
high halves), but clang on macOS 26 warns about it.
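
A sketch of the idea using stand-in functions (example_bswap32 and
example_bswap64 are hypothetical names, not the libavutil implementations):

```c
#include <stdint.h>

static uint32_t example_bswap32(uint32_t x)
{
    return (x >> 24) | ((x >> 8) & 0x0000ff00u) |
           ((x << 8) & 0x00ff0000u) | (x << 24);
}

static uint64_t example_bswap64(uint64_t x)
{
    /* The casts make the intended truncation to each 32-bit half explicit. */
    return ((uint64_t)example_bswap32((uint32_t)x) << 32) |
           example_bswap32((uint32_t)(x >> 32));
}
```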

Fixes: https://code.ffmpeg.org/FFmpeg/FFmpeg/issues/22453
Signed-off-by: marcos ashton <marcosashiglesias@gmail.com>
2026-03-21 00:34:50 +00:00
Weidong Wang 06d19d000d avformat/rsd: reject short ADPCM_THP extradata reads
Use ffio_read_size() to enforce exact-length reads of the per-channel
ADPCM_THP coefficient tables. Previously the return value of
avio_read() was unchecked, silently accepting truncated extradata.
2026-03-21 00:29:04 +00:00
Michael Niedermayer e9c6d411c4 doc/CVSS
A simple (FFmpeg-specific) guide on how to choose a CVSS score
2026-03-20 22:01:43 +01:00
Michael Niedermayer 313e776ba7 avcodec/ffv1dec: Allocate the minimum size for fltmap and fltmap32 with the current implementation
Found-by: Lynne
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-20 15:50:09 +00:00
Jeremy James 3d4461e16d fftools/ffprobe: Show stream group side data
Specifically output side data from tile groups with -show_stream_groups
which includes rotation information in HEIC images.

Signed-off-by: Jeremy James <jeremy.james@gmail.com>
2026-03-20 12:45:44 +00:00
Martin Storsjö f72f692afa aarch64: Add PAC sign/validation of the link register
Whenever the link register is stored on the stack, sign it
before storing it and validate at a symmetrical point (with the
stack at the same level as when it was signed).

These macros only have an effect if built with PAC enabled (e.g.
through -mbranch-protection=standard), otherwise they don't
generate any extra instructions.

None of these cases were present when PAC support was added
in 248986a0db in 2022.

Without these changes, PAC still had an effect in the compiler
generated code and in the existing cases where these macros were
used; this change makes it apply to the remaining cases where the
link register is stored on the stack.
2026-03-20 13:16:06 +02:00
Martin Storsjö dbf7354d98 aarch64/inter_sme2: Remove needless backup/restore of x29/x30
The sme_entry/sme_exit macros already take care of backing up/restoring
these registers. Additionally, as long as no function calls are
made within the function, x30 doesn't need to be backed up at all.
2026-03-20 13:16:06 +02:00
Peter Bennett 42029a8836 libavcodec/mediacodec: MythTV Fix for incorrect stride with amazon fire stick
With 1080i MPEG2 video, amazon fire stick uses a different stride from what
is returned.
2026-03-20 04:40:06 +00:00
Zhao Zhili 163b9b6c7e avformat/lcevc: return error when no valid NAL units are found
ff_lcvec_parse_config_record() returns success before this patch
when no IDR or NON_IDR NAL units are found.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2026-03-20 10:47:13 +08:00
Zhao Zhili eadce30402 avformat/lcevc: merge duplicate IDR and NON_IDR branches
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2026-03-20 10:45:31 +08:00
stevxiao 7b534c2bdc avfilter/dnn_backend_tf: fix ctx async field access
ctx->options.async does not exist on DnnContext; the correct
field is ctx->async directly on the context struct.

Signed-off-by: younengxiao <steven.xiao@amd.com>
2026-03-20 02:22:06 +00:00
James Almer 053822d9ce avformat/codecstring: add support for LCEVC streams
Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-19 11:59:59 -03:00
James Almer cbfd280f77 avformat/lcevc: add a function to parse sequence and global config blocks
This exposes parsing already being done to write lvcC boxes, for the purpose
of having these values available elsewhere.
Will be useful for the following change.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-19 11:59:58 -03:00
Andreas Rheinhardt ba793127c4 avcodec/x86/mpeg4videodsp: Use SSE2 emulated_edge_mc
Possible now that this function is no longer MMX.

Old benchmarks:
gmc_edge_emulation_c:                                  782.3 ( 1.00x)
gmc_edge_emulation_ssse3:                              220.3 ( 3.55x)

New benchmarks:
gmc_edge_emulation_c:                                  770.9 ( 1.00x)
gmc_edge_emulation_ssse3:                              111.0 ( 6.94x)

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-19 14:44:40 +01:00
Andreas Rheinhardt b33d1d1ba2 avcodec/x86/mpeg4videodsp: Add gmc_ssse3
It beats MMX by a lot, because it has to process eight words.
Also notice that the MMX code expects registers to be preserved
between separate inline assembly blocks which is not guaranteed;
the new code meanwhile does not presume this.

Benchmarks:
gmc_c:                                                 817.8 ( 1.00x)
gmc_mmx:                                               210.7 ( 3.88x)
gmc_ssse3:                                              80.7 (10.14x)

The MMX version has been removed.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-19 14:44:37 +01:00
Andreas Rheinhardt e922923fd8 avcodec/x86/mpeg4videodsp: Use smaller edge_emu buffer
edge_emu_mc allows using different src and dst strides,
so one can replace the outsized edge emu buffer with
one that is much smaller and nevertheless big enough
for all our needs; it also avoids having to check
whether the buffer is actually big enough.

This also improves performance (if the compiler uses
stack probing). Old benchmarks:
gmc_c:                                                 814.5 ( 1.00x)
gmc_mmx:                                               243.7 ( 3.34x)

New benchmarks:
gmc_c:                                                 813.8 ( 1.00x)
gmc_mmx:                                               213.5 ( 3.81x)

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-19 14:44:33 +01:00
Andreas Rheinhardt 338316f0a3 tests/checkasm: Add test for mpeg4videodsp
It already uncovered a bug in the MMX version of gmc.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-19 14:44:30 +01:00
Andreas Rheinhardt e33260809c avcodec/x86/mpeg4videodsp: Fix sprite_warping_accuracy 0-2
MPEG-4 GMC uses the following motion prediction scheme:
For output pixel (x,y), the reference pixel at fractional
coordinates (ox+dxx*x+dxy*y,oy+dyx*x+dyy*y) is used as prediction;
the latter is calculated via bilinear interpolation. The coefficients
here are fixed-point values with 16+shift fractional bits
where shift is sprite_warping_accuracy+1. For the weights,
only the shift most significant fractional bits are used.
shift can be at most four*.

The x86 MMX gmc implementation performs these calculations
using 16-bit words. To do so, it restricts itself to the case
in which the four least significant bits of dxx,dxy,dyx,dyy
are zero and shifts these bits away. Yet in case shift is
less than four, the 16 bits retained also contain at least
one bit that actually belongs to the fpel component
(which is already taken into account by using the correct
pixels for interpolation).

(This has been uncovered by a to-be-added checkasm test.
I don't know whether there are actual files in the wild
using sprite_warping_accuracy 0-2.)

*: It is always four when encoding with xvid and GMC.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-19 14:44:08 +01:00
James Almer 106616f13d avformat/mov: tighten sample count value in mov_read_sdtp
sc->sample_count and sc->sdtp_count are both unsigned ints.

Fixes Coverity issue CID 168634.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-18 20:34:33 -03:00
Niklas Haas 292056ec03 forgejo/pre-commit/ignored-words: add re-use
We allow both readd and re-add, so it makes sense to allow both reuse and
re-use. They are both listed in my dictionary.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-18 23:32:35 +00:00
Andreas Rheinhardt 30a811cc7d avcodec/rv34dsp: Reduce size of chroma pixels tabs
Only two sizes exist.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-18 18:05:20 +01:00
Andreas Rheinhardt 42ebefbd98 tests/checkasm/rv34dsp: Don't use unnecessarily large buffers
RV34 uses 4x4 blocks.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-18 18:05:06 +01:00
Andreas Rheinhardt c90cf2aa1f avcodec/x86/rv34dsp: Port ff_rv34_idct_dc_noround_mmxext to sse2
No change in benchmarks here.

Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-18 18:04:44 +01:00
James Almer b50cbdc04f fftools/ffmpeg_demux: properly uninitialize the side_data_prefer_packet AVBPrint buffer
Fixes Coverity issue CID 1689616.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-18 13:29:23 -03:00
James Almer e22a1ed712 avcodec/h2645_sei: don't use provider_code uninitialized
Regression since 8172be423e.
Fixes Coverity issue CID 1689618.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-18 13:25:08 -03:00
Zhao Zhili 9800032722 checkasm/aarch64: fix operator precedence bug in ARG_STACK
The expression ((8*(MAX_ARGS - 8) + 15) & ~15 + 16)
evaluates to zero on Apple platforms due to assembler operator
precedence differences. LLVM's integrated assembler uses different
precedence rules depending on the target:

unsigned AsmParser::getBinOpPrecedence(AsmToken::TokenKind K,
				   MCBinaryExpr::Opcode &Kind) {
    bool ShouldUseLogicalShr = MAI.shouldUseLogicalShr();
    return IsDarwin ? getDarwinBinOpPrecedence(K, Kind, ShouldUseLogicalShr)
	      : getGNUBinOpPrecedence(MAI, K, Kind, ShouldUseLogicalShr);
}

In Darwin mode (Apple targets), arithmetic operators (+, -) have
higher precedence than bitwise operators (&, |, ^), similar to C.
In GNU mode (ELF targets), bitwise operators have higher precedence
than arithmetic operators.
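
C follows the same rule as Darwin mode here, so the two readings can be
compared in plain C. This is only an illustration: EXAMPLE_MAX_ARGS is a
hypothetical stand-in value, and the parenthesized form is one way to make
the expression unambiguous, not necessarily the exact change applied.

```c
/* Hypothetical stand-in; the real MAX_ARGS is defined by checkasm. */
#define EXAMPLE_MAX_ARGS 15

/* C-like (Darwin) reading: '+' binds tighter, so the mask is ~15 + 16 == 0. */
static int arg_stack_c_like(void)
{
    return (8 * (EXAMPLE_MAX_ARGS - 8) + 15) & (~15 + 16);
}

/* GNU-mode reading, spelled out with parentheses so it is unambiguous. */
static int arg_stack_explicit(void)
{
    return ((8 * (EXAMPLE_MAX_ARGS - 8) + 15) & ~15) + 16;
}
```

With the stand-in value, the first function yields 0 and the second yields
80 (56 bytes rounded up to 64, plus 16).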
2026-03-18 13:48:18 +00:00
Niklas Haas 70537ec8e6 swscale/x86/ops: cosmetic
And remove a pointless assertion.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-18 09:09:44 +00:00
Niklas Haas a8606aba8e swscale/x86/ops: move over_read/write determination to setup func
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-18 09:09:44 +00:00
Niklas Haas 8e3eacd7ad swscale/ops_chain: allow implementations to expose over_read/write
And plumb it all the way through to the SwsCompiledOp. This is cleaner than
setting up this metadata up-front in x86/ops.c; and more importantly, it
allows us to determine the amount of over-read programmatically during ops
setup.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-18 09:09:44 +00:00
Niklas Haas 3aef0213e7 swscale/ops_chain: add SwsContext to SwsImplParams
Mainly so that implementations can consult sws->flags, to e.g. decide
whether the kernel needs to be bit-exact.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-18 09:09:44 +00:00
Niklas Haas 9c6638b179 swscale/ops_chain: add SwsOpTable to SwsImplParams
Mainly so setup functions can look at table->block_size, and perhaps
the table flags, as well as anything else we may add in the future.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-18 09:09:44 +00:00
Niklas Haas ef114cedef swscale/ops_chain: refactor setup() signature
This is basically a cosmetic commit that groups all of the parameters to
setup() into a single struct, as well as the return type. This gives the
immediate benefit of freeing up 8 bytes per op table entry, though the
main motivation will come in the following commits.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-18 09:09:44 +00:00
Niklas Haas 3310fe95ae swscale/ops_dispatch: also print ops list after optimizing
Will make more sense in light of the fact that this may not correspond
to the op list actually sent to the backends, due to subpass splitting.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-18 09:09:44 +00:00
Niklas Haas 039b492019 swscale/ops_dispatch: correctly round tail size
If the block size is somehow less than 8, this may round down, leading to
one byte too few being copied (e.g. for monow/rgb4).

This was never an issue for current backends because they all have block sizes
of 8 or larger, but a future platform may have different requirements.
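
The rounding difference can be sketched with two hypothetical helpers (not
the swscale code), where bpp is the number of bits per pixel:

```c
#include <stddef.h>

/* Bytes needed for `pixels` pixels of `bpp` bits each, rounded down. */
static size_t tail_bytes_floor(size_t pixels, size_t bpp)
{
    return pixels * bpp / 8;        /* may drop a partial trailing byte */
}

/* Same quantity rounded up, so a partial trailing byte is still copied. */
static size_t tail_bytes_ceil(size_t pixels, size_t bpp)
{
    return (pixels * bpp + 7) / 8;
}
```

For three 4-bit pixels (12 bits), rounding down gives 1 byte while 2 bytes
are actually needed.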

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-18 09:09:44 +00:00
Niklas Haas 800c3a71e9 swscale/ops_dispatch: properly handle negative strides
The `memcpy_in` condition is reversed for negative strides, which require a
memcpy() on the *first* line, not the last line. Additionally, the check
just completely didn't work for negative linesizes, due to comparing against
a negative stride.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-18 09:09:44 +00:00
Niklas Haas 91e76dc766 swscale/ops_dispatch: remove slice_align hack
Added in commit 00907e1244 to hack around a problem that was caused by
the Vulkan backend's incorrect use of the ops dispatch code, which was fixed
properly in commit 143cb56501.

This logic never made sense to begin with; it was only meant to disable the
memcpy logic for Vulkan specifically.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-18 09:09:44 +00:00
Niklas Haas 18962a3764 swscale/ops: loop over copied list instead of original (cosmetic)
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-18 09:09:44 +00:00
Niklas Haas 5230624619 swscale/graph: allocate output buffers after adding all passes
There's no reason to immediately allocate all of these; we can do it at the
end when we know for sure which passes we have.

This will matter especially if we ever add a way to remove passes again after
adding them (spoiler: we will).

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-18 09:09:44 +00:00
Jun Zhao c51a420b59 doc/examples/hw_decode: check fopen return value for output file
The output file fopen() result is not checked. If it fails (e.g.
permission denied or invalid path), output_file is NULL and the
subsequent fwrite() call will crash.

Add a NULL check with an error message, consistent with the
existing error handling pattern in this example.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-03-18 02:08:09 +00:00
Jun Zhao c29206e456 doc/examples/transcode: query encoder for supported configs
avcodec_get_supported_config() is called with dec_ctx (the decoder
context) to query supported pixel formats and sample formats, but
the intent is to configure the encoder. The decoder supported
format list may differ from the encoder, leading to format
negotiation failures or incorrect output.

Pass enc_ctx instead so the actual encoder capabilities are
queried.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-03-18 02:08:09 +00:00
Jun Zhao d25d6b991d doc/examples/vaapi_encode: return raw error codes from encode_write
encode_write() mapped all return values from avcodec_receive_packet()
into 0 or -1, which destroyed the AVERROR_EOF signal needed by the
caller. The flush call in main() could never see AVERROR_EOF, so a
successful encode always exited with a non-zero status.

Let encode_write() return the original error code and have each
call site handle the expected status:

  - Encoding loop: ignore AVERROR(EAGAIN) (need more input)
  - Flush path:    ignore AVERROR_EOF (normal end-of-stream)

This makes the control flow explicit and easier to follow for
anyone reading the example.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-03-18 02:08:09 +00:00
Jun Zhao 8d80b13cbe doc/examples/remux: fix NULL pointer dereference in cleanup
The cleanup path uses `ofmt->flags` to check AVFMT_NOFILE, but
`ofmt` is only assigned after avformat_alloc_output_context2
succeeds. If a failure occurs between output context allocation
and the `ofmt` assignment (e.g. stream_mapping allocation fails),
ofmt_ctx is non-NULL while ofmt is still NULL, causing a crash.

Use ofmt_ctx->oformat->flags instead, which is always valid when
ofmt_ctx is non-NULL.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-03-18 02:08:09 +00:00
Jun Zhao 68e18d3a1c doc/examples/encode_audio: fix hardcoded stereo sample stride
The sample generation loop hardcodes a stride of 2 (stereo) with
samples[2*j], but the channel count is dynamically selected by
select_channel_layout() which picks the layout with the highest
channel count. If the encoder supports more than 2 channels,
samples will be written at wrong offsets.

Use c->ch_layout.nb_channels as the stride instead.
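
A sketch of interleaved sample generation with a variable stride.
fill_interleaved() and its parameters are hypothetical; the example itself
works on the encoder context rather than on a helper like this.

```c
#include <stdint.h>

/* Hypothetical helper: sample j gets the value base + j on every channel.
 * The channel count is the stride between consecutive samples. */
static void fill_interleaved(int16_t *samples, int nb_samples,
                             int nb_channels, int16_t base)
{
    for (int j = 0; j < nb_samples; j++)
        for (int k = 0; k < nb_channels; k++)
            samples[nb_channels * j + k] = (int16_t)(base + j);
}
```

With a hardcoded stride of 2 and three channels, the second sample would
start at index 2 instead of index 3.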

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-03-18 02:08:09 +00:00
Jun Zhao 042ff80562 doc/examples/qsv_decode: fix raw frame dump for chroma planes
The output loop used sw_frame->width as the write size for all
planes. This is only correct for NV12 where the interleaved UV
plane happens to have the same byte width as the Y plane. For
other pixel formats (e.g. YUV420P where U/V planes are half
width, or P010 where samples are 2 bytes), the output would be
corrupted.

Use av_image_get_linesize() to compute the correct byte width
for each plane based on the actual pixel format.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-03-18 02:08:09 +00:00
Jun Zhao 2c47383d74 doc/examples/hw_decode: fix fwrite error check
fwrite() returns size_t (unsigned), so comparing its return value
with < 0 is always false and write errors are silently ignored.
Check against the expected byte count instead.
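
A minimal sketch of the corrected check. dump_buffer() is a hypothetical
helper, not the code from hw_decode.c.

```c
#include <stdio.h>

/* Hypothetical helper: returns 0 on success, -1 on open or write failure. */
static int dump_buffer(const char *path, const unsigned char *buf, size_t size)
{
    FILE *f = fopen(path, "wb");
    int ret = 0;

    if (!f)
        return -1;

    /* fwrite() returns a size_t item count, so a "< 0" test can never fire;
     * compare against the number of items requested instead. */
    if (fwrite(buf, 1, size, f) != size)
        ret = -1;

    if (fclose(f) != 0)
        ret = -1;
    return ret;
}
```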

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-03-18 02:08:09 +00:00
Jun Zhao 1cf02df122 doc/examples/decode_video: check fopen return value in pgm_save
pgm_save() passes the FILE pointer from fopen() directly to
fprintf() and fwrite() without a NULL check. If fopen() fails
(e.g. permission denied or disk full), this causes a NULL pointer
dereference and crash.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-03-18 02:08:09 +00:00
Jun Zhao 752faebaa6 doc/examples/vaapi_encode: open raw YUV input in binary mode
fopen() with "r" opens the file in text mode, which on Windows
translates \r\n to \n, corrupting raw NV12 pixel data. Use "rb"
to open in binary mode, matching the output file which already
uses "w+b".

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-03-18 02:08:09 +00:00
Jun Zhao 40f085ac3d doc/examples: fix output context cleanup in transcode examples
avformat_close_input() is designed for input format contexts only.
Using it on output contexts is API misuse — it accesses iformat
(which is NULL for output contexts) and does not follow the correct
output cleanup path.

Replace with the proper pattern already used in remux.c and
transcode.c: avio_closep() to close the IO handle, followed by
avformat_free_context() to free the format context.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-03-18 02:08:09 +00:00
Guangyu Sun 34bff02984 avcodec/libvpxenc: fix sRGB colorspace for non-RGB pixel formats
When encoding VP9 with a YUV pixel format (e.g. yuv420p) and
AVCOL_SPC_RGB colorspace metadata, libvpxenc unconditionally set
VPX_CS_SRGB. This produced a spec-violating bitstream: Profile 0
(4:2:0) with sRGB colorspace, which is only valid for Profile 1/3
(4:4:4). The resulting file is undecodable.

Fix this by setting ctx->vpx_cs to VPX_CS_SRGB in set_pix_fmt()
for 4:4:4 YUV formats when AVCOL_SPC_RGB is set, matching the
existing GBRP path. This covers the legitimate case of RGB data in
YUV444 containers (e.g. H.264 High 4:4:4 with identity matrix).

With this change, any AVCOL_SPC_RGB that reaches the switch in
set_colorspace() is guaranteed to be a subsampled format where
sRGB is invalid. Return an error so the user can fix their
pipeline rather than silently producing incorrect output.

To reproduce:

  ffmpeg -f lavfi -i testsrc=s=64x64:d=1:r=1 \
    -c:v libvpx-vp9 -pix_fmt yuv420p -colorspace rgb bad.webm
  ffprobe bad.webm
  # -> "vp9 (Profile 0), none(pc, gbr/...), 64x64"
  ffmpeg -i bad.webm -f null -
  # -> 0 frames decoded, error

See also:
  https://issues.webmproject.org/487307225

Signed-off-by: Guangyu Sun <gsun@roblox.com>
Signed-off-by: James Zern <jzern@google.com>
2026-03-17 13:39:59 -07:00
Martin Storsjö 846746be4b aarch64: Add Armv9.3-A GCS (Guarded Control Stack) support
Signal that our assembly is compliant with the GCS feature, if
the GCS feature is enabled in the compiler (available since Clang
18 and GCC 15) - this is enabled by -mbranch-protection=standard
with a new enough compiler.

GCS doesn't require any specific modifications to the assembly
code, but requires that all functions return to the expected call
address (checked through a shadow stack).
2026-03-17 20:37:53 +00:00
Martin Storsjö 1f7ed8a78d aarch64: hevcdsp: Make returns match the call site
For cases when returning early without updating any pixels, we
previously returned to the return address in the caller's scope,
bypassing one function entirely. While this may seem like a neat
optimization, it makes the return stack predictor mispredict
the returns - which potentially can cost more performance than
it gains.

Secondly, if the armv9.3 feature GCS (Guarded Control Stack) is
enabled, then returns _must_ match the expected value; this feature
is being enabled across Linux distributions, and by fixing the
hevc assembly, we can enable the security feature on ffmpeg as well.
2026-03-17 20:37:53 +00:00
Sun Yuechi 1c7b72cd6b lavc/riscv: remove unused fixed_vtype.S
Signed-off-by: Sun Yuechi <sunyuechi@iscas.ac.cn>
2026-03-17 16:40:05 +00:00
Huihui_Huang 6e37545d6b swscale: fix possible lut leak in adapt_colors()
adapt_colors() allocates a SwsLut3D before calling add_convert_pass(). If add_convert_pass() fails, the function returns without freeing the previously allocated lut. Free lut on that error path.

Signed-off-by: Huihui_Huang <hhhuang@smu.edu.sg>
2026-03-17 15:56:40 +08:00
Martin Storsjö ac4d50cb26 swscale/ops_chain: Make ff_op_priv_free clear the freed pointer 2026-03-16 20:31:12 +00:00
Martin Storsjö e07daf85a4 swscale/ops_chain: Don't pass an aligned union as parameter by value
Passing a struct/union by value can generally be inefficient.
Additionally, when the struct/union is declared to be aligned,
whether it really stays aligned when passed as a parameter by
value is unclear.

This fixes build errors like this, with MSVC targeting 32 bit ARM:

    libswscale/ops_chain.h(91): error C2719: 'unnamed-parameter': formal parameter with requested alignment of 16 won't be aligned
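
The alternative, passing a pointer, can be sketched like this. ExamplePriv
and sum_u32() are hypothetical stand-ins for the swscale types and
functions, and the sketch assumes a C11 compiler for _Alignas.

```c
#include <stdint.h>

/* Hypothetical stand-in for an over-aligned private-data union. */
typedef union ExamplePriv {
    _Alignas(16) uint8_t u8[16];
    uint32_t u32[4];
} ExamplePriv;

/* Passing a pointer avoids the copy and sidesteps the question of whether
 * a by-value parameter keeps its requested alignment. */
static uint32_t sum_u32(const ExamplePriv *priv)
{
    return priv->u32[0] + priv->u32[1] + priv->u32[2] + priv->u32[3];
}
```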
2026-03-16 20:31:12 +00:00
Diego de Souza 6ef0ef51dc avcodec/nvdec: fix surface pool limits and unsafe_output lifetime
Cap ulNumDecodeSurfaces to 32 and ulNumOutputSurfaces to 64 to prevent
cuvidCreateDecoder from failing with CUDA_ERROR_INVALID_VALUE when
initial_pool_size exceeds the hardware limits.

Also cap the decoder index pool (dpb_size) to 32 so that indices
handed out via av_refstruct_pool_get stay within the valid range
for cuvidDecodePicture's CurrPicIdx.

When unsafe_output is enabled, stop holding idx_ref in the unmap
callback. Since cuvidMapVideoFrame copies decoded data into an
independent output mapping slot, the decode surface index can safely
be reused as soon as the DPB releases it, without waiting for the
downstream consumer to release the mapped frame. This decouples the
decode surface index lifetime (max 32) from the output mapping slot
lifetime (max 64), eliminating the "No decoder surfaces left" error
that occurred when downstream components like nvenc held too many
frames.

Signed-off-by: Diego de Souza <ddesouza@nvidia.com>
2026-03-16 18:18:12 +00:00
Marvin Scholz cce545a74b avutil: attributes: fix AV_HAS_STD_ATTRIBUTE checks
Attributes with the language-supported [[attr]] style are only supported
since C++11 and C23 respectively, so this needs to be accounted for in
these checks.

This solves a huge amount of warning spam of:
  warning: [[]] attributes are a C23 extension [-Wc23-extensions]
when using --enable-extra-warnings.
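
One way such a check can be written, shown with hypothetical EXAMPLE_* macro
names (the real AV_HAS_STD_ATTRIBUTE definition may differ):

```c
/* Only report [[attr]] support when the language standard in use has it. */
#if defined(__cplusplus) && __cplusplus >= 201103L && \
    defined(__has_cpp_attribute)
#  define EXAMPLE_HAS_STD_ATTRIBUTE(x) __has_cpp_attribute(x)
#elif !defined(__cplusplus) && defined(__STDC_VERSION__) && \
    __STDC_VERSION__ >= 202311L && defined(__has_c_attribute)
#  define EXAMPLE_HAS_STD_ATTRIBUTE(x) __has_c_attribute(x)
#else
#  define EXAMPLE_HAS_STD_ATTRIBUTE(x) 0
#endif

#if EXAMPLE_HAS_STD_ATTRIBUTE(nodiscard)
#  define EXAMPLE_NODISCARD [[nodiscard]]
#elif defined(__GNUC__)
#  define EXAMPLE_NODISCARD __attribute__((warn_unused_result))
#else
#  define EXAMPLE_NODISCARD
#endif

EXAMPLE_NODISCARD static int answer(void) { return 42; }
```

When building as C17 or older, the [[nodiscard]] spelling is never emitted,
so the extension warning does not appear.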
2026-03-16 18:32:20 +01:00
Jun Zhao c49f6bec20 lavf/vvcdec: fix false-positive VVC detection of MP3 files
The VVC probe only checked forbidden_zero_bit but not
nuh_layer_id range in the NAL unit header. This allowed
certain MP3 files to be misdetected as VVC streams because
their frame data coincidentally contained 00 00 01 start
code patterns that looked like valid NAL units.

Add a check for nuh_layer_id (must be <= 55). The existing
check_temporal_id() already validates nuh_temporal_id_plus1
is in [1, 7]. Together these two checks reject the bogus
NAL units produced by MP3 frame data.

Note: nuh_reserved_zero_bit is intentionally not checked
here, as it is reserved for future use by the spec and may
become non-zero in a later revision.
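
The combined header checks can be sketched as below. The function name is
hypothetical and this is not the probe code itself: the existing
check_temporal_id() may apply further conditions beyond the range check
shown.

```c
#include <stdint.h>

/* Returns 1 if the two-byte NAL unit header passes the checks, 0 otherwise.
 * Layout: forbidden_zero_bit(1) nuh_reserved_zero_bit(1) nuh_layer_id(6)
 *         nal_unit_type(5) nuh_temporal_id_plus1(3) */
static int example_vvc_header_plausible(const uint8_t hdr[2])
{
    int forbidden_zero_bit = hdr[0] >> 7;
    int nuh_layer_id       = hdr[0] & 0x3f;
    int temporal_id_plus1  = hdr[1] & 0x07;

    if (forbidden_zero_bit)
        return 0;
    if (nuh_layer_id > 55)
        return 0;
    /* A 3-bit field is at most 7, so only zero needs rejecting. */
    if (temporal_id_plus1 == 0)
        return 0;
    /* nuh_reserved_zero_bit is deliberately left unchecked. */
    return 1;
}
```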

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-03-16 16:21:12 +00:00
Michael Niedermayer 4b83833087 avformat/wsddec: Use ffio_read_size() in get_metadata()
Fixes: use of uninitialized memory
Fixes: 492587173/clusterfuzz-testcase-minimized-ffmpeg_dem_WSD_fuzzer-6596163492184064

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-16 15:51:53 +01:00
Nariman-Sayed b20f42b156 avformat/tls_openssl: fix DTLS retransmission when packet lost in blocking mode
OpenSSL DTLS can't retransmit lost packets in blocking mode.
Switch to non-blocking mode and use DTLSv1_handle_timeout()
to properly handle DTLS handshake retransmissions.
2026-03-16 14:49:36 +00:00
Philip Tang 261960392e avformat/whip: add timeout option for HTTP
WHIP now accepts a timeout option, so that connection attempts which
would otherwise hang when the remote server does not reply can be
dropped.
2026-03-16 14:46:13 +00:00
Zhao Zhili dbd783f389 avformat/lcevc: fix wrong NAL count written for NON IDR
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2026-03-16 13:26:52 +00:00
Zhao Zhili 82b39de805 avformat/lcevc: fix memleak on write_nalu() failure
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2026-03-16 13:26:52 +00:00
Zhao Zhili cc866fb5e9 avformat/movenc: fix loop variable shadowing in LCEVC stream group init
Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2026-03-16 13:25:59 +00:00
Nicholas Carlini 3e8bec7871 avformat/mpegts: fix descriptor accounting across multiple IOD descriptors
pmt_cb() passes mp4_descr + mp4_descr_count as the output base but
MAX_MP4_DESCR_COUNT (16) as the capacity, not the remaining capacity.
init_MP4DescrParseContext() resets d->descr_count to 0 on every call,
so the bounds check at parse_MP4ESDescrTag compares a fresh 0 against
16 regardless of the shifted base.

A PMT with two IOD descriptors of 16 ESDescrs each will crash. The first
fills the buffer mp4_descr[0..15], and then the second writes
mp4_descr[16..31] -- 1152 bytes past the end of the stack.

This change passes the remaining capacity instead of always passing 16.
The writeback in mp4_read_iods is incremented so the caller's running
count is preserved.
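
The accounting can be sketched with a simplified model. All names below are
hypothetical, and the descriptors are reduced to plain ints; only the
"shifted base plus remaining capacity" pairing mirrors the description
above.

```c
#define EXAMPLE_MAX_DESCR 16

/* Hypothetical parser: writes up to `capacity` entries starting at `out`
 * and returns how many it wrote. */
static int parse_descriptors(int *out, int capacity, int wanted)
{
    int n = 0;

    while (n < wanted && n < capacity)
        out[n++] = 1;
    return n;
}

/* Each call receives the shifted base together with the remaining room,
 * and the running count is carried across calls. */
static int parse_all(int *descr, const int *wanted, int nb_calls)
{
    int count = 0;

    for (int i = 0; i < nb_calls; i++)
        count += parse_descriptors(descr + count,
                                   EXAMPLE_MAX_DESCR - count, wanted[i]);
    return count;
}
```

Passing EXAMPLE_MAX_DESCR as the capacity on every call would let the second
call write past the end of the array once the first call has filled it.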

Fixes: stack-buffer-overflow

Found-by: Nicholas Carlini <nicholas@carlini.com>
2026-03-16 11:51:27 +00:00
Anton Khirnov 5b112b17c0 opus/dec_celt: avoid emph_coeff becoming a subnormal
This happens for silence frames, which on many CPUs massively slows down
processing the decoded output.
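
How the decoder keeps the value out of the subnormal range is not spelled
out here. One common approach, shown purely as an assumption with a
hypothetical helper, is to snap tiny magnitudes to zero:

```c
#include <float.h>

/* Hypothetical state update: decay the state, then flush anything smaller
 * in magnitude than the smallest normal float to zero. */
static float decay_state(float state, float coeff)
{
    float next = state * coeff;

    if (next > -FLT_MIN && next < FLT_MIN)
        next = 0.0f;
    return next;
}
```

During long runs of silence the state then settles at exactly zero instead
of lingering in the slow subnormal range.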

Cf. https://github.com/Genymobile/scrcpy/issues/6715
2026-03-16 11:51:49 +01:00
Weidong Wang 236dbc9f82 avcodec/xxan: zero-initialize y_buffer
Fixes ticket #22420.

When the first decoded frame is type 1, xan_decode_frame_type1() reads y_buffer as prior-frame state before any data has been written to it.
Since y_buffer is allocated with av_malloc(), this may propagate uninitialized heap data into the decoded luma output.

Allocate y_buffer with av_mallocz() instead.
2026-03-16 10:24:33 +00:00
James Almer 6ba0b59d8b avcodec/bytestream2: don't allow using NULL pointers
This is UB.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-15 19:27:07 -03:00
James Almer 2556db6173 avcodec/bsf/extract_extradata: don't use a NULL pointer to initialize an empty PutByteContext
Fixes UB in the form of adding a 0 offset to a NULL pointer, and subtracting a
NULL pointer from another.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-15 19:27:07 -03:00
James Almer 5ebd50415f avcodec/bsf/extract_extradata: reallocate buffers with the final used size
The buffers are allocated using the worst case scenario of the entire NALU
being written, which in many cases does not happen.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-15 19:27:01 -03:00
James Almer 1434d99b19 avcodec/bsf/extract_extradata: write correct length start codes for LCEVC
The specification for LCEVC states that start codes may be three or four bytes
long except for the first NALU in an AU, which must be four bytes long.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-15 19:20:06 -03:00
James Almer d1431d3f50 avcodec/bsf/extract_extradata: write correct length start codes for h26x
The specification for H.26{4,5,6} states that start codes may be three or four
bytes long except for the first NALU in an AU, and for NALUs of parameter
set types, which must be four bytes long.
This is checked by ff_cbs_h2645_unit_requires_zero_byte(), which is made
available outside of CBS for this change.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-15 19:20:06 -03:00
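The start code rule stated in d1431d3f50 and 1434d99b19 can be sketched as a small helper. This is an illustration only: `start_code_len()` and `write_start_code()` are hypothetical names, not the bitstream filter's functions, and the caller is assumed to already know whether a NALU is first in its AU or carries a parameter set.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Four bytes for the first NALU in an access unit and for parameter
 * sets, three bytes otherwise. */
static size_t start_code_len(bool first_in_au, bool is_parameter_set)
{
    return (first_in_au || is_parameter_set) ? 4 : 3;
}

/* Writes 00 00 01 or 00 00 00 01 to dst (which must hold at least
 * 4 bytes) and returns the number of bytes written. */
static size_t write_start_code(uint8_t *dst, bool first_in_au,
                               bool is_parameter_set)
{
    size_t len = start_code_len(first_in_au, is_parameter_set);
    size_t i = 0;

    if (len == 4)
        dst[i++] = 0; /* the extra leading zero byte */
    dst[i++] = 0;
    dst[i++] = 0;
    dst[i++] = 1;
    return i;
}
```

For LCEVC only the first-in-AU condition applies according to the commit message, so a caller would pass `false` for `is_parameter_set`.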
James Almer 6bc257e292 avformat/nal: remove trailing zeroes from NALUs
Based on the behaviour from cbs_h2645, which removes actual
trailing_zero_8bits bytes and possibly also works around issues in
ff_h2645_extract_rbsp(). In this case, the same issue could be
present in ff_nal_find_startcode().

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-15 19:20:06 -03:00
James Almer 1d65e985b3 fftools/ffmpeg_demux: add options to override mastering display and content light level metadata
Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-15 17:52:05 -03:00
James Almer 8172be423e avcodec/h2645_sei: fix parsing payloads for UK country_code
The correct syntax after country_code is:

t35_uk_country_code_second_octet      b(8)
t35_uk_manufacturer_code_first_octet  b(8)
t35_uk_manufacturer_code_second_octet b(8)

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-15 17:25:12 -03:00
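The field layout listed in 8172be423e can be sketched as a small parser. Everything here is illustrative: the struct and function names are hypothetical, the value 0xB4 for the UK country_code is an assumption based on ITU-T T.35, and combining the two manufacturer octets into one big-endian 16-bit value is a convenience chosen for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed ITU-T T.35 country code for the United Kingdom. */
#define T35_COUNTRY_CODE_UK 0xB4

typedef struct T35UKHeader {
    uint8_t  country_code_second_octet;
    uint16_t manufacturer_code; /* first octet in the high byte */
} T35UKHeader;

/* Parses country_code followed by the three UK-specific octets.
 * Returns the number of bytes consumed, or -1 if the payload is too
 * short or does not start with the UK country code. */
static int parse_t35_uk_header(const uint8_t *p, size_t size,
                               T35UKHeader *out)
{
    if (size < 4 || p[0] != T35_COUNTRY_CODE_UK)
        return -1;
    out->country_code_second_octet = p[1];
    out->manufacturer_code = (uint16_t)((p[2] << 8) | p[3]);
    return 4;
}
```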
James Almer 3af824a540 avcodec/h2645_sei: reindent after the previous change
Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-15 17:25:12 -03:00
James Almer 64edbb37f1 avcodec/h2645_sei: refactor decode_registered_user_data()
Switch statements are cleaner and will be useful for an upcoming change.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-15 17:25:12 -03:00
WyattBlue 482395f830 avfilter/af_whisper: Add translate parameter
2026-03-15 06:53:19 +00:00
James Almer 539fc854e7 fftools/ffmpeg_mux_init: add support for LCEVC Stream Group muxing
Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-14 20:50:27 -03:00
James Almer 9f9db1f673 avformat/options: add missing AVOption for AVStreamGroupLCEVC
Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-14 20:50:27 -03:00
James Almer 0878ae59f9 avformat/movenc: add support for LCEVC track muxing
Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-14 20:50:21 -03:00
James Almer 77ddfcfeb1 Changelog: move an entry wrongly put in the 8.1 section to next
Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-14 20:43:11 -03:00
Michael Niedermayer 70286d59f1 avcodec/exr: Check input space before reverse_lut()
Fixes: use of uninitialized memory
Fixes: 490707906/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_EXR_DEC_fuzzer-6310933506097152

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-14 23:24:11 +01:00
Pierre-Anthony Lemieux dfc5d176c9 fuzzer: improve documentation
2026-03-14 21:36:58 +00:00
Nicholas Carlini 55bf0e6cd5 avformat/mpegts: remove JPEG-XS early return on invalid header_size
new_pes_packet() moves a buffer with pkt->buf = pes->buffer before
JPEG-XS validation. If header_size > pkt->size, an early return leaves
pes->buffer as a stale alias of pkt->buf with refcount 1. Later,
mpegts_read_packet() calls av_packet_unref(), freeing the buffer
through pkt->buf. The flush loop then re-enters new_pes_packet() and
dereferences the dangling pes->buffer; a second path hits it via
av_buffer_unref() in handle_packets() after a seek.

Drop the early return. The packet is delivered with AV_PKT_FLAG_CORRUPT
set, matching the PES-size-mismatch case above, and the function falls
through to the normal cleanup path. The else guards the header trim so
pkt->data/pkt->size stay valid for the memset.

Fixes: use after free
Fixes regression since 16f89d342e.

Found-by: Nicholas Carlini <nicholas@carlini.com>
2026-03-14 21:01:41 +00:00
Michael Niedermayer 770bc1c23a avcodec/aac/aacdec_usac_mps212: Introduce a temporary array for ff_aac_ec_data_dec()
This also reverts: c2364e9222

Fixes: out of array access (testcase exists but did not replicate for me)

Found-by: Gil Portnoy <dddhkts1@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-14 21:23:53 +01:00
Michael Niedermayer 12303cd922 avcodec/cbs_h266_syntax_template: Check tile_y
Fixes: invalid state leading to out of array access
Fixes: 490615782/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_VVC_fuzzer-4711353817563136

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-14 21:23:17 +01:00
Andreas Rheinhardt e33573813d avcodec/x86/apv_dsp: Don't clip unnecessarily
It is redundant due to packusdw.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-14 19:31:45 +01:00
Andreas Rheinhardt 691f9cd428 avcodec/apv_dsp: Reindent after previous commit
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-14 19:31:45 +01:00
Andreas Rheinhardt 59b119023f avcodec/apv_dsp: Remove dead 8 bit code
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-14 19:31:45 +01:00
Andreas Rheinhardt 506ea84c1c avcodec/apv_decode: Don't rely on AV_PIX_FMT_YUV420 == 0
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-14 19:31:44 +01:00
Andreas Rheinhardt 99339f7b2b avcodec/apv_decode: Remove unused array entries
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-14 19:31:44 +01:00
Andreas Rheinhardt 6b5b0d6a50 avcodec/apv_decode: Remove always-false branches
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-14 19:31:44 +01:00
Andreas Rheinhardt 4300931e23 avcodec/apv_decode: Fix pixel format selection
The current code just happens to work for 10 and 12.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-14 19:31:44 +01:00
Lynne c102e89448 hwcontext_vulkan: deprecate AVVulkanDeviceContext.lock/unlock_queue
Without replacement, as VK_KHR_internally_synchronized_queues will be required.
2026-03-14 17:05:06 +00:00
Nicholas Carlini 39e1969303 avcodec/h264_slice: reject slice_num >= 0xFFFF
An H.264 picture with 65536 slices makes slice_num collide with the
slice_table sentinel. slice_table is uint16_t, initialized via
memset(..., -1, ...) so spare entries (one per row, mb_stride =
mb_width + 1) stay 0xFFFF. slice_num is an uncapped ++h->current_slice.
At slice 65535 the collision makes slice_table[spare] == slice_num
pass, defeating the deblock_topleft check in xchg_mb_border and the
top_type zeroing in fill_decode_caches.

With both guards bypassed at mb_x = 0, top_borders[top_idx][-1]
underflows 96 bytes and XCHG writes at -88 below the allocation
(plus -72 and -56 for chroma in the non-444 path).

Fixes: heap-buffer-overflow

Found-by: Nicholas Carlini <nicholas@carlini.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-14 16:52:58 +00:00
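The sentinel collision described in 39e1969303 can be reproduced in isolation. The sketch below is not the h264_slice code; the helper names are hypothetical, and only the uint16_t table type, the memset(..., -1, ...) initialization and the 0xFFFF limit come from the commit message.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SLICE_SENTINEL 0xFFFF /* value left in unused table entries */

/* Fills the table the way the commit message describes: every byte set
 * to 0xFF, so each uint16_t entry reads back as 0xFFFF. */
static void init_slice_table(uint16_t *table, size_t entries)
{
    memset(table, -1, entries * sizeof(*table));
}

/* The comparison that the guards rely on: "does this entry belong to
 * the current slice?" */
static int entry_matches(const uint16_t *table, size_t idx,
                         unsigned slice_num)
{
    return table[idx] == (uint16_t)slice_num;
}

/* The guard from the commit subject: reject slice numbers that would
 * collide with the sentinel. */
static int slice_num_is_valid(unsigned slice_num)
{
    return slice_num < SLICE_SENTINEL;
}
```

Without the guard, a slice number of 0xFFFF compares equal to an entry that was never written, which is the false match the commit message describes.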
Jun Zhao 795bccdaf5 lavfi/bwdif: fix heap-buffer-overflow with small height videos
Reproduce:
  ffmpeg -i /tmp/bwdif_test_input_160x4_gray16.jpg -vf "bwdif" -f null -

filter_intra accesses rows 3 lines away via cur[mrefs3] and cur[prefs3].
For small height videos (h <= 4), this causes heap-buffer-overflow.

Add boundary check for filter_intra when YADIF_FIELD_END is set.
The boundary condition (y < 3) or (y + 3 >= td->h) precisely matches
filter_intra's 3-line context requirement.

Test file: 160x4 gray16 JPEG
https://code.ffmpeg.org/attachments/db2ace24-bc00-4af6-a53a-5df6b0d51b15

fix #21570

Reviewed-by: Thomas Mundt <tmundt75@gmail.com>
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-03-14 23:26:51 +08:00
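The boundary condition quoted in 795bccdaf5 can be checked with a tiny predicate. This is a sketch of the condition only, with hypothetical function names; it does not reproduce the filter or its dispatch logic.

```c
/* True when row y is too close to the top or bottom edge for a filter
 * that reads rows y - 3 and y + 3. */
static int row_needs_edge_path(int y, int h)
{
    return y < 3 || y + 3 >= h;
}

/* Counts how many rows of an h-line picture can safely use the
 * 3-line-context filter. */
static int count_interior_rows(int h)
{
    int n = 0;
    for (int y = 0; y < h; y++)
        if (!row_needs_edge_path(y, h))
            n++;
    return n;
}
```

For the 160x4 test file every row takes the edge path, which matches the commit's observation that pictures with h <= 4 have no row with a full 3-line context.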
Zhao Zhili 9a1e9f9368 MAINTAINERS: add myself to maintained modules and update CODEOWNERS
2026-03-14 15:18:22 +00:00
Ramiro Polla 5640bd3a4f swscale/tests/swscale: require reference file to perform comparisons
The legacy scaler is no longer implicitly used to generate a reference
to perform comparisons for every conversion. It is now up to the user
to generate a reference file and use it as input for a separate run to
perform comparisons.

It is now possible to compare against previous runs of the graph-based
scaler, for example to test for newer optimizations.

This reduces the overall time necessary to obtain speedup numbers from
the legacy scaler to the graph-based scaler (or any other comparison,
for that matter) since the reference must only be run once.

For example, to check the speedup between the legacy scaler and the
graph-based scaler:
  ./libswscale/tests/swscale [...] -bench 50 -legacy 1 > legacy_ref.txt
  ./libswscale/tests/swscale [...] -bench 50 -ref legacy_ref.txt

If no -ref file is specified, we are assuming that we are generating a
reference file, and therefore all information is printed (including
ssim/loss, and benchmarks if -bench is used).

If a -ref file is specified, the output printed depends on whether we
are testing for correctness (ssim/loss only) or benchmarking (time/
speedup only, along with overall speedup).

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-14 06:13:19 +00:00
Ramiro Polla 5146519131 swscale/tests/swscale: add -pretty option to align fields in output
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-14 06:13:19 +00:00
Ramiro Polla d1779ece67 swscale/tests/swscale: print number of iterations in benchmark output
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-14 06:13:19 +00:00
Ramiro Polla 1568cae66f swscale/tests/swscale: some tweaks to the output format
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-14 06:13:19 +00:00
Ramiro Polla 6ed173f238 swscale/tests/swscale: always print loss in output
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-14 06:13:19 +00:00
Ramiro Polla 822d592575 swscale/tests/swscale: make loss printing code more consistent
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-14 06:13:19 +00:00
Ramiro Polla 1980d8ba8a swscale/tests/swscale: remove duplicate printing of parameters on error
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-14 06:13:19 +00:00
Ramiro Polla b03ff92567 swscale/tests/swscale: print losses using scientific notation
This emphasizes the order of magnitude of the loss, which is what is
important for us.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-14 06:13:19 +00:00
Ramiro Polla 739d9eca59 swscale/tests/swscale: be more strict with reference file format
The format of the reference file is the output which is printed to
stdout from this tool itself.

Malformed reference files cause an error, with a more descriptive error
message. Running a subset of the reference conversions is still
supported through -src and/or -dst.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-14 06:13:19 +00:00
Ramiro Polla 041b6f330f swscale/tests/swscale: indent after previous commit
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-14 06:13:19 +00:00
Ramiro Polla 6780353460 swscale/tests/swscale: restore reference file functionality
The test results (along with SSIM) are printed to stdout again so that
the output can be parsed by -ref.

Benchmark results have also been added to the output.

We still need to re-run the reference tests to perform benchmarks, but
this will be simplified in the next few commits.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-14 06:13:19 +00:00
Ramiro Polla 0b427ea47e swscale/tests/swscale: reorder test results output
The conversion parameters, ssim/loss, and benchmark results will
eventually be merged into the same output line.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-14 06:13:19 +00:00
Ramiro Polla 58cdd89789 swscale/tests/swscale: allow passing -bench 0 to disable benchmarks
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-14 06:13:19 +00:00
Ramiro Polla 101a2f6fc6 swscale/tests/swscale: add option to run main conversion with legacy scaler
The low bit depth workaround code is duplicated in this commit, but the
other occurrence will be removed in a few commits, so I see no reason
to factor it out.

The legacy scaler still has some conversions that give results much
worse than the expected loss, but we still want them as reference, so
we don't trigger expected loss errors on conversions with the legacy
scaler.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-14 06:13:19 +00:00
Ramiro Polla 3eb178a197 swscale/tests/swscale: split scale_new() out of run_test()
We will eventually be able to select between running the new graph-based
scaler or the legacy scaler.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-14 06:13:19 +00:00
Ramiro Polla c27df6ccc9 swscale/tests/swscale: add helper function to log sws_scale_frame() failures
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-14 06:13:19 +00:00
Ramiro Polla c6d78efa9b swscale/tests/swscale: split init_frame() out of run_test()
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-14 06:13:19 +00:00
Ramiro Polla 7e635337cf swscale/tests/swscale: introduce struct test_results to collect results
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-14 06:13:19 +00:00
Ramiro Polla 8d35328e54 swscale/tests/swscale: split print_results() out of run_test()
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-14 06:13:19 +00:00
Ramiro Polla 2b0e1920e5 swscale/tests/swscale: propagate error out of run_test()
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-14 06:13:19 +00:00
Ramiro Polla 2222b523f2 swscale/tests/swscale: cosmetics (avoid assignments in conditions)
2026-03-14 06:13:19 +00:00
Ramiro Polla 713979919d Revert "tests/swscale: check supported inputs for legacy swscale separately"
Support for input and output formats is already checked in run_self_tests().

This reverts commit a22faeb992.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-14 06:13:19 +00:00
Niklas Haas 80e48f2e78 swscale/tests/swscale: fix typos
2026-03-14 06:13:19 +00:00
Michael Niedermayer f73849887c avcodec/wmv2dec: More Checks about reading skip bits
Fixes: out of array read with --disable-safe-bitstream-reader
Fixes: poc_wmv2.avi

Note, this requires the safe bitstream reader to be turned off by the user and the user disregarding the security warning

Change suggested by: Guanni Qu <qguanni@gmail.com>
Found-by: Guanni Qu <qguanni@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-13 23:22:54 +01:00
Gil Portnoy 26dd9f9b56 avcodec/cbs_h266_syntax_template: Fix w/h typo
Fixes: out of array access
Fixes: vvc_poc_subpic_wh_bug.h266

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-13 23:21:31 +01:00
Gil Portnoy e1d9080e6a avcodec/aac/aacdec_usac_mps212: Fix wrong end_band parameter to coarse_to_fine()
Note: all call sites set start_band=0, so this is a cosmetic fix.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-13 23:03:36 +01:00
Gil Portnoy d75b7c2252 avcodec/aac/aacdec_usac_mps212: Fix typo in huff_data_2d()
This is not a security issue

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-13 23:03:36 +01:00
Gil Portnoy 8b9851b005 avcodec/aac/aacdec_usac_mps212: Off-by-one bounds check in ff_aac_ec_data_deci()
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>

No testcase, the check seems redundant
2026-03-13 23:03:35 +01:00
Oliver Chang d519ab8993 aacdec_usac: skip FD-specific decoding for LPD channels
`spectrum_decode` currently executes Frequency Domain (FD) decoding steps
for all channels, regardless of their `core_mode`. When a channel is in
Linear Prediction Domain (LPD) mode (`core_mode == 1`), FD-specific
parameters such as scalefactor offsets (`sfo`) and individual channel
stream (`ics`) information are not parsed.

This causes a global-buffer-overflow in `dequant_scalefactors`. Because
`spectrum_scale` is called on LPD channels, it uses stale or
uninitialized `sfo` values to index `ff_aac_pow2sf_tab`. In the reported
crash, a stale `sfo` value of 240 resulted in an index of 440
(240 + POW_SF2_ZERO), exceeding the table's size of 428.

Fix this by ensuring `spectrum_scale` and `imdct_and_windowing` are only
called for channels where `core_mode == 0` (FD).

Co-authored-by: CodeMender <codemender-patching@google.com>
Fixes: https://issues.oss-fuzz.com/486160985
2026-03-13 22:57:25 +01:00
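The arithmetic in d519ab8993 can be verified with a short sketch. The constants are derived from the numbers in the commit message (240 + POW_SF2_ZERO = 440 implies POW_SF2_ZERO = 200; the table has 428 entries); the function names are hypothetical and the real decoder is structured differently.

```c
#define POW_SF2_ZERO     200 /* derived from 440 - 240 in the commit message */
#define POW2SF_TAB_SIZE  428 /* table size quoted in the commit message */

/* True when a scalefactor offset maps to a valid table index. */
static int sfo_index_in_range(int sfo)
{
    int idx = sfo + POW_SF2_ZERO;
    return idx >= 0 && idx < POW2SF_TAB_SIZE;
}

/* The guard described by the fix: FD-only stages run only for channels
 * whose core_mode is 0 (FD); LPD channels (core_mode == 1) skip them. */
static int should_run_fd_stages(int core_mode)
{
    return core_mode == 0;
}
```

The stale value 240 from the reported crash fails the range check, and an LPD channel never reaches the table lookup once the `core_mode` guard is in place.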
Michael Niedermayer c5d5fb2309 avformat/dhav: Fix handling of slightly larger files
Fixes: integer overflow
Fixes: 490241718/clusterfuzz-testcase-minimized-ffmpeg_dem_DHAV_fuzzer-4902512932225024

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-13 22:48:55 +01:00
Michael Niedermayer eb5d607861 avutil/timecode: Check for integer overflow in av_timecode_init_from_components()
Fixes: integer overflow
Fixes: testcase that calls av_timecode_init_from_components() with hh set explicitly to INT_MAX

Found-by: Youngjae Choi, Mingyoung Ban, Seunghoon Woo
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-13 22:48:32 +01:00
Jun Zhao 254b92ec8a lavc/hevc: reorder aarch64 NEON pel function assignments
Group assignments by filter family (qpel, epel), variant
(base, uni, bi, uni_w, bi_w) and direction (pixels, h, v, hv).
Add NEON8_FNASSIGN_QPEL_H macro to replace repeated manual
qpel horizontal assignments.

No functional change.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-03-13 21:43:37 +00:00
Jun Zhao 489d36b5e1 lavc/hevc: add aarch64 NEON for epel uni horizontal filter
Add NEON-optimized implementations for HEVC EPEL uni-directional
horizontal interpolation (put_hevc_epel_uni_h) at 8-bit depth.

These functions perform horizontal 4-tap EPEL filtering with
output directly to uint8_t pixels (no weighting):
- 4-tap horizontal EPEL filter
- Output: (filter_result + 32) >> 6, clipped to [0, 255]

Supports all block widths: 4, 6, 8, 12, 16, 24, 32, 48, 64.

Performance results on Apple M4:
./tests/checkasm/checkasm --test=hevc_pel --bench

put_hevc_epel_uni_h4_8_neon:   2.26x
put_hevc_epel_uni_h6_8_neon:   2.71x
put_hevc_epel_uni_h8_8_neon:   4.40x
put_hevc_epel_uni_h12_8_neon:  3.60x
put_hevc_epel_uni_h16_8_neon:  3.00x
put_hevc_epel_uni_h24_8_neon:  3.72x
put_hevc_epel_uni_h32_8_neon:  3.14x
put_hevc_epel_uni_h48_8_neon:  3.16x
put_hevc_epel_uni_h64_8_neon:  3.15x

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-03-13 21:43:37 +00:00
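For reference, the operation accelerated by 489d36b5e1 can be written as a scalar loop. This is a simplified single-row sketch based only on the formula in the commit message, with hypothetical names; it is not the C reference from the HEVC DSP code, and the filter taps are passed in by the caller rather than taken from any table.

```c
#include <stdint.h>

static uint8_t clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Horizontal 4-tap filter over one row of 8-bit pixels.
 * src must have 1 readable pixel to the left of src[0] and 2 readable
 * pixels to the right of src[width - 1].
 * Output: (filter_result + 32) >> 6, clipped to [0, 255]. */
static void epel_uni_h_row(uint8_t *dst, const uint8_t *src, int width,
                           const int8_t filter[4])
{
    for (int x = 0; x < width; x++) {
        int sum = filter[0] * src[x - 1] + filter[1] * src[x] +
                  filter[2] * src[x + 1] + filter[3] * src[x + 2];
        dst[x] = clip_u8((sum + 32) >> 6);
    }
}
```

Any 4-tap filter whose coefficients sum to 64 leaves a flat input unchanged, which makes a convenient sanity check when comparing an optimized version against a scalar one.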
Jun Zhao f5e6cca935 lavc/hevc: add aarch64 NEON for qpel uni-weighted HV filter
Add NEON-optimized implementations for HEVC QPEL uni-directional
weighted HV interpolation (put_hevc_qpel_uni_w_hv) at 8-bit depth,
for block widths 6, 12, 24, and 48.

These functions perform horizontal then vertical 8-tap QPEL filtering
with weighting (wx, ox, denom) and output to uint8_t. Previously
only widths 4, 8, 16, 32, 64 were implemented; this completes
coverage for all standard HEVC block widths.

Performance results on Apple M4:
./tests/checkasm/checkasm --test=hevc_pel --bench

put_hevc_qpel_uni_w_hv6_8_neon:   3.11x
put_hevc_qpel_uni_w_hv12_8_neon:  3.19x
put_hevc_qpel_uni_w_hv24_8_neon:  2.26x
put_hevc_qpel_uni_w_hv48_8_neon:  1.80x

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-03-13 21:43:37 +00:00
Jun Zhao fe41ff7413 lavc/hevc: add aarch64 NEON for qpel uni-weighted vertical filter
Add NEON-optimized implementations for HEVC QPEL uni-weighted
vertical interpolation (put_hevc_qpel_uni_w_v) at 8-bit depth.

These functions perform weighted uni-directional prediction with
vertical QPEL filtering:
- 8-tap vertical QPEL filter
- Weighted prediction: (filter_result * wx + offset) >> shift

Previously only sizes 4, 8, 16, 64 were optimized. This patch adds
optimized implementations for all remaining sizes: 6, 12, 24, 32, 48.

Performance results on Apple M4:
./tests/checkasm/checkasm --test=hevc_pel --bench

put_hevc_qpel_uni_w_v6_8_neon:   3.40x
put_hevc_qpel_uni_w_v12_8_neon:  3.24x
put_hevc_qpel_uni_w_v24_8_neon:  3.06x
put_hevc_qpel_uni_w_v32_8_neon:  2.66x
put_hevc_qpel_uni_w_v48_8_neon:  2.67x

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-03-13 21:43:37 +00:00
Jun Zhao 32df0352b7 lavc/hevc: move subs earlier in qpel uni-weighted NEON loops
Move the subs instruction before the store macro in the 8x-unrolled
loops of qpel_uni_w_v4/v8/v16/v64 and qpel_uni_w_hv4/hv8/hv16, so
that many NEON instructions from the store macro separate it from the
conditional branch. This gives the CPU pipeline time to resolve the
condition flags before the branch decision.

Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
2026-03-13 21:43:37 +00:00
Linke e44d76f61f avformat/av1: fix uvlc loop past end of bitstream
When get_bits_left() returns a negative value (bitstream reader already past the end of the buffer), the while condition while (get_bits_left(gb)) evaluates to true since any non-zero int is truthy.

With the safe bitstream reader enabled, get_bits1() returns 0 past the buffer end, so the break never triggers and leading_zeros increments toward INT_MAX.

Change the condition to > 0, consistent with skip_1stop_8data_bits() which already uses <= 0 for the same pattern.

Signed-off-by: Linke <1102336121@qq.com>
2026-03-13 21:29:14 +00:00
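The loop condition problem from e44d76f61f can be reproduced with a toy bit reader. The `BitReader` type and its functions are hypothetical stand-ins written for this sketch; they only imitate the two properties the commit message relies on: the remaining-bits count can go negative, and reads past the end return 0.

```c
#include <stdint.h>

typedef struct BitReader {
    const uint8_t *buf;
    int size_bits; /* total number of valid bits */
    int pos;       /* current bit position, may exceed size_bits */
} BitReader;

/* Remaining bits; negative once the reader has moved past the end. */
static int bits_left(const BitReader *br)
{
    return br->size_bits - br->pos;
}

/* Reads one bit, MSB first. Past the end it returns 0, like a "safe"
 * reader would. */
static int read_bit(BitReader *br)
{
    int bit = 0;
    if (br->pos < br->size_bits)
        bit = (br->buf[br->pos >> 3] >> (7 - (br->pos & 7))) & 1;
    br->pos++;
    return bit;
}

/* Counts leading zero bits. The "> 0" test ends the loop both at the
 * end of the buffer and when the reader is already past it; a plain
 * truthiness test would keep looping on a negative count. */
static int count_leading_zeros(BitReader *br)
{
    int zeros = 0;
    while (bits_left(br) > 0) {
        if (read_bit(br))
            break;
        zeros++;
    }
    return zeros;
}
```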
Gil Portnoy 51606de0e9 avcodec/cbs_h266_syntax_template: Fix rows vs columns
Fixes: out of array access
Fixes: vvc_poc_cbs_divergence_max.h266

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-13 21:59:22 +01:00
Ted Meyer 86f53f9ffb avformat/mov: do not allocate out-of-range buffers
There's a possibility here with a well-crafted MP4 file containing only
the nested boxes in order: MOOV.TRAK.MDIA.MINF.STBL.SDTP where the
header size uses the 64 bit large size, and the ending sdtp box has some
size value >= 0x100000014.

On a 32 bit build of ffmpeg, av_malloc's size parameter drops the high
order bits of `entries`, and the allocation is now a controlled size
that is significantly smaller than `entries`. The following loop will
then write off the end of the allocated memory with data that follows the
box fourcc.
2026-03-13 21:53:12 +01:00
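The truncation described in 86f53f9ffb can be shown without a 32 bit build by making the size limit explicit. This is a generic sketch with hypothetical names, not the check added to the mov demuxer; `max_size` stands in for the platform's largest allocatable size.

```c
#include <stdint.h>
#include <stdlib.h>

/* What a 32 bit size_t would keep of a 64 bit entry count. */
static uint32_t truncate_to_32(uint64_t n)
{
    return (uint32_t)n;
}

/* Allocates `entries` bytes only if the count fits the given limit and
 * the platform's size_t; returns NULL otherwise. The caller frees the
 * result with free(). */
static void *alloc_checked(uint64_t entries, uint64_t max_size)
{
    if (entries == 0 || entries > max_size || entries > SIZE_MAX)
        return NULL;
    return malloc((size_t)entries);
}
```

An entry count of 0x100000010 truncates to 0x10, so an unchecked 32 bit allocation would be 16 bytes while the following loop still iterates over the full 64 bit count. Rejecting the count before allocating avoids that mismatch.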
Karl Mogensen fa281d1394 avfilter/af_lv2: call lilv_instance_activate before lilv_instance_run
Why: the change is done to comply with lilv expectations of hosts.

Added call lilv_instance_activate in the config_output function to abide by lilv documentation that states it must be called before lilv_instance_run:
"This MUST be called before calling lilv_instance_run()" - documentation source (https://github.com/lv2/lilv/blob/main/include/lilv/lilv.h)

Added call lilv_instance_deactivate in the uninit function to abide by lv2 documentation:
"If a host calls activate(), it MUST call deactivate() at some point in the future" - documentation source (https://gitlab.com/lv2/lv2/-/blob/main/include/lv2/core/lv2.h)

Added instance_activated integer to LV2Context struct to track whether the instance was activated, and only call lilv_instance_deactivate if it was, to abide by lv2 documentation:
"Hosts MUST NOT call deactivate() unless activate() was previously called." - documentation source (https://gitlab.com/lv2/lv2/-/blob/main/include/lv2/core/lv2.h)

Regarding the patcheck warning (possibly constant :instance_activated):
This is a false positive since the struct member is zero-initialized.

Fixes: trac issue #11661 (https://trac.ffmpeg.org/ticket/11661)
Reported-by: Dave Flater
Signed-off-by: Karl Mogensen <karlmogensen0@gmail.com>
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-13 21:31:54 +01:00
Andreas Rheinhardt b3996ee578 avcodec/lcevctab: Use smaller types
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-13 16:09:40 +00:00
Andreas Rheinhardt 464f440773 avcodec/lcevctab: Properly deduplicate ff_lcevc_resolution_type
(Currently lcevctab.o does not export anything.)

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-13 16:09:40 +00:00
Andreas Rheinhardt 1b70aab908 configure: Add lcevc->cbs_lcevc dependency
Forgotten in 49d75d81f6.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-13 16:09:40 +00:00
James Almer aa70cd0d19 avfilter/vf_vpp_amf: look for HDR metadata in link side data
This is the correct way to use and propagate this kind of metadata.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-13 12:52:57 -03:00
James Almer 49d75d81f6 avcodec/lcevcdec: don't try to derive final dimensions from SAR
Not only do some sources not provide an aspect ratio, as is the case of
MPEG-TS, but also some enhanced streams have no change in dimensions, and this
heuristic would generate bogus values.
Instead, we need to parse the LCEVC bitstream for a Global Config process block
in order to get the actual dimensions. This adds a little overhead, but it can't
be avoided.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-13 09:18:42 -03:00
James Almer c5aa31d252 avcodec/lcevc_parser: move the resolution type table to a header
Will be useful in the following commit.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-13 09:18:42 -03:00
James Almer ce7375fc17 avcodec/cbs_lcevc: don't look for process blocks if the unit was not decomposed
Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-13 09:18:42 -03:00
Zhao Zhili f189657ec6 avformat/rtmpproto: fix listen_timeout conversion for special negative values
rtmpproto converts listen_timeout to milliseconds by multiplying it
by 1000 before passing it to TCP. However, negative values are special
sentinels (e.g., -1 for infinite wait) and should not be multiplied.

This worked prior to commit 49c6e6cc44 because there was no range
validation. Since that commit, ff_parse_opts_from_query_string
validates option values against their declared ranges, causing these
multiplied negative values to fail.

Fixes ticket #22469.

Signed-off-by: Zhao Zhili <zhilizhao@tencent.com>
2026-03-13 11:38:39 +00:00
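The conversion rule from f189657ec6 can be written as a one-line helper. This is a sketch with a hypothetical function name; it assumes the timeout in seconds is small enough that multiplying by 1000 does not overflow an int.

```c
/* Converts a listen timeout from seconds to milliseconds. Negative
 * values are treated as sentinels (for example -1 for "wait forever")
 * and are passed through unchanged instead of being scaled. */
static int listen_timeout_to_ms(int seconds)
{
    return seconds < 0 ? seconds : seconds * 1000;
}
```

Scaling -1 would produce -1000, which falls outside a declared option range that only admits -1 as a negative value; passing the sentinel through keeps it valid.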
Michael Niedermayer b4b569f922 avcodec/aom_film_grain: Remove impossible check
fgp is freshly allocated so it cannot be equal to ref

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-13 04:39:52 +01:00
Michael Niedermayer ebb6ac1bc7 avcodec/aom_film_grain: avoid duplicate indexes in ff_aom_parse_film_grain_sets()
Fixes: use after free
Fixes: 478301106/clusterfuzz-testcase-minimized-ffmpeg_dem_HEVC_fuzzer-6155792247226368

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-13 04:39:52 +01:00
Michael Niedermayer 4ccad70d57 avformat/hxvs: Do not allow backward steps in hxvs_probe()
Fixes: infinite loop
Fixes: 487632033/clusterfuzz-testcase-minimized-ffmpeg_dem_IMAGE2_fuzzer-4565877872984064

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-13 04:39:35 +01:00
Michael Niedermayer f84c859ec5 avcodec/bsf/extract_extradata: Replace incorrect size accounting
Fixes: out of array writes
Fixes: 492054712/clusterfuzz-testcase-minimized-ffmpeg_BSF_EXTRACT_EXTRADATA_fuzzer-5705993148497920

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-13 02:03:59 +00:00
Michael Niedermayer 3b98e29da8 swscale/output: fix integer overflows in chroma in yuv2rgba64_X_c_template()
Fixes: signed integer overflow: 130489 * 16525 cannot be represented in type 'int'
Fixes: 488950053/clusterfuzz-testcase-minimized-ffmpeg_SWS_fuzzer-4627272670969856

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-13 02:51:19 +01:00
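The overflow reported in 3b98e29da8 can be checked numerically. The sketch below widens the multiplication to 64 bits to show that the product does not fit in a 32 bit int; this is a generic technique chosen for illustration and not necessarily how the commit fixes yuv2rgba64_X_c_template(). The function names are hypothetical.

```c
#include <limits.h>
#include <stdint.h>

/* Multiplies two ints in 64 bit arithmetic so the product cannot
 * overflow. */
static int64_t mul_wide(int a, int b)
{
    return (int64_t)a * b;
}

/* True when a 64 bit value is representable as an int. */
static int fits_in_int(int64_t v)
{
    return v >= INT_MIN && v <= INT_MAX;
}
```

With the operands from the sanitizer report, `mul_wide(130489, 16525)` is 2156330725, which exceeds the largest 32 bit int (2147483647) by about 8.8 million.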
Michael Niedermayer 7241b80422 avcodec/lcldec: Fixes uqvq overflow
Fixes: integer overflow
Fixes: 490241717/clusterfuzz-testcase-minimized-ffmpeg_AV_CODEC_ID_ZLIB_DEC_fuzzer-4560518961758208

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-13 02:49:59 +01:00
Olivier Laflamme 10d36e5d3d fftools/ffprobe: Initialize data_dump_format_id
This was used uninitialized previously

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-13 01:19:39 +00:00
Niklas Haas 86eb07154d doc/scaler: document new sws scaler flags
And label the old ones as deprecated.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-12 22:09:13 +01:00
Niklas Haas 803ac77187 swscale: mark scale-related SwsFlags as deprecated
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-12 22:09:13 +01:00
Niklas Haas 3503b19711 swscale: add enum SwsScaler, SwsContext.scaler to replace legacy flags
Another step towards a cleaner API, with a cleaner separation of purposes.
Also avoids wasting a whopping one third of the flag space on what really
shouldn't have been a flag to begin with.

I pre-emptively decided to separate the scaler selection between "scaler"
and "scaler_sub", the latter defining what's used for things like 4:2:0
subsampling.

This allows us to get rid of the awkwardly defined SWS_BICUBLIN flag, in favor
of that just being the natural consequence of using a different scaler_sub.

Lastly, I also decided to pre-emptively axe the poorly defined and
questionable SWS_X scaler, which I doubt ever saw much use. The old flag
is still available as a deprecated flag, anyhow.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-12 22:09:04 +01:00
Niklas Haas a1b8cbb8bc swscale/utils: separate luma and chroma scaler selection
Pre-requisite for adding support for configuring these independently.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-12 22:08:07 +01:00
Niklas Haas 36c31fd5ba swscale: don't hard code number of scaler params
In case we ever need to increase this number in the future.
I won't bother bumping the ABI version for this new #define, since it doesn't
affect ABI, and I'm about to bump the ABI version in a following commit.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-12 22:08:07 +01:00
Niklas Haas 8115a05aa5 swscale: fix SWS_SPLINE documentation
This was incorrectly inferred to be a Keys spline when the documentation
was first added; but it's actually an "unwindowed" (in theory) natural
cubic spline with C2 continuity everywhere, which is a completely different
thing.

(SWS_BICUBIC is closer to being a Keys spline)

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-12 22:08:07 +01:00
Niklas Haas 55dd4a18bb tests/checkasm/sw_ops: declare temporary arrays static
These are quite large; GCC on my end warns about big stack allocations.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-12 21:02:48 +00:00
Niklas Haas baac4a1174 swscale/x86/ops: add section comments (cosmetic)
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-12 21:02:48 +00:00
Niklas Haas 7fb1e0832c swscale/ops_dispatch: move ENOTSUP error to ff_sws_compile_pass()
Or else this might false-positive when we retry compilation after subpass
splitting.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-12 21:02:48 +00:00
Niklas Haas e7c84a8e6a swscale/ops_dispatch: infer destination format from SwsOpList
This is now redundant.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-12 21:02:48 +00:00
Niklas Haas b5db7c7354 swscale/ops_dispatch: have ff_sws_compile_pass() take ownership of ops
More useful than just allowing it to "modify" the ops; in practice this means
the contents will be undefined anyways - might as well have this function
take care of freeing it afterwards as well.

Will make things simpler with regards to subpass splitting.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-12 21:02:48 +00:00
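The ownership transfer described in the commit above can be sketched as follows. This is a minimal illustration under stated assumptions, not the real ff_sws_compile_pass(): `DemoOps` and `demo_compile()` are hypothetical names, and the double-pointer signature is an assumption chosen so that the caller cannot reuse a freed list.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an op list; not the real SwsOpList. */
typedef struct DemoOps {
    int num_ops;
} DemoOps;

/* Takes ownership of *pops: the list is freed and *pops is cleared on every
 * path, so the caller needs no cleanup of its own after the call. */
int demo_compile(DemoOps **pops)
{
    DemoOps *ops;
    int ret;

    assert(pops && *pops);
    ops   = *pops;
    *pops = NULL;                       /* caller no longer owns the list */

    ret = ops->num_ops > 0 ? 0 : -EINVAL;
    free(ops);                          /* freed on success and on failure */
    return ret;
}
```

A caller allocates the list, passes its address, and afterwards only inspects the return code; the pointer it held is NULL in both the success and the failure case.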
Niklas Haas adf2d4e90f swscale/ops_dispatch: add helper function to clean up SwsCompiledOp
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-12 21:02:48 +00:00
Niklas Haas 8227a21c27 swscale/ops_optimizer: always clear unused dither components
Makes the op list a bit more stable.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-12 21:02:48 +00:00
Niklas Haas 563cc8216b swscale/graph: allow setup() to return an error code
Useful for a handful of reasons, including Vulkan (which depends on external
device resources), but also a change I want to make to the tail handling.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-12 21:02:48 +00:00
Niklas Haas 6c92ab6a4e swscale/graph: remove redundant check
Such formats are already rejected by ff_sws_decode/encode_pixfmt().

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-12 21:02:48 +00:00
Niklas Haas 4e63dbeb6d swscale/ops_chain: add more integer types to SwsOpPriv
In particular, I need i32, but the others are also reasonable additions.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-12 21:02:48 +00:00
Niklas Haas f535212a2c swscale/ops_chain: allow free callback to take SwsOpPriv
I mainly want to be able to store two pointers side-by-side.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-12 21:02:48 +00:00
Niklas Haas 66f3a62b46 swscale/ops_backend: use in/out_bump[] in process()
Instead of recomputing the input/output address on each iteration, we
can use the in_bump/out_bump arrays the way the x86 backend does.

I initially avoided this in order to ensure the reference backend always does
the correct thing, even if some future bug causes the bump values to be
computed incorrectly, but doing it this way makes an upcoming change easier.

(And besides, it would be easier to just add an av_assert2() to catch those
cases)

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-12 21:02:48 +00:00
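The bump-pointer approach mentioned in the commit above can be sketched as below. This is a hypothetical row copy, not the reference backend's process(); it assumes that a bump is the difference between the stride and the number of bytes processed per line, and `copy_rows_bump()` is an illustrative name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy `lines` rows of `bytes` bytes each without recomputing
 * base + y * stride on every iteration. */
void copy_rows_bump(uint8_t *dst, ptrdiff_t dst_stride,
                    const uint8_t *src, ptrdiff_t src_stride,
                    size_t bytes, int lines)
{
    /* Distance from the end of one processed row to the start of the next. */
    const ptrdiff_t in_bump  = src_stride - (ptrdiff_t)bytes;
    const ptrdiff_t out_bump = dst_stride - (ptrdiff_t)bytes;

    /* Cheap sanity check on the bump values. */
    assert(in_bump >= 0 && out_bump >= 0);

    for (int y = 0; y < lines; y++) {
        memcpy(dst, src, bytes);
        src += bytes;                   /* advance past the processed bytes */
        dst += bytes;
        src += in_bump;                 /* then skip the row padding */
        dst += out_bump;
    }
}
```

The assert plays the role that the commit message assigns to av_assert2(): it catches incorrectly computed bump values instead of relying on per-row address recomputation.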
James Almer 927c81b569 avutil/version: bump after recent additions
Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-12 17:18:00 -03:00
Andreas Rheinhardt 835781af23 configure,swscale/x86/Makefile: Remove special red-zone handling
ff_h[yc]scale_fast_mmxext() call other functions from inline assembly;
these functions look like leaf functions to GCC, so it may use the
red zone to avoid modifying the stack. But this makes the call
instructions in the inline asm corrupt the stack.

In order to fix this 424bcc46b5
made libswscale/x86/swscale_mmx.o be compiled with -mno-red-zone.
Later Libav fixed it in their version in commit
b14fa5572c by saving and restoring
the memory clobbered by the call (as is still done now). This was
merged into FFmpeg in 0e7fc3cafe,
without touching the -mno-red-zone hack.

Libav later renamed swscale_mmx.c to just swscale.c in
16d2a1a51c which was merged into FFmpeg
in commit 2cb4d51654, without
removing the -mno-red-zone hack, although the file it applies to
no longer existed.

This commit removes the special red-zone handling given that it is
inactive anyway.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-12 18:26:45 +01:00
Andreas Rheinhardt 9bbe1ec86f avutil/opt: Remove obsolete LIBAVUTIL_VERSION_MAJOR checks
Their removal was forgotten during the lavu 59->60 bump.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-12 18:26:42 +01:00
Romain Beauxis 87bf42899b Add myself as maintainer to the various ogg files. 2026-03-12 15:22:56 +00:00
Nicolas Gaullier fbb3c99032 fate/gapless: remove duplicate ffprobe dependencies
Signed-off-by: Nicolas Gaullier <nicolas.gaullier@cji.paris>
2026-03-12 13:54:35 +00:00
Nicolas Gaullier 2cf8d64f3c fate/probe: simplify for consistency
Signed-off-by: Nicolas Gaullier <nicolas.gaullier@cji.paris>
2026-03-12 13:54:35 +00:00
Nicolas Gaullier db336e1c51 fate/scale2ref_keep_aspect: fix dependency
Regression since 5b5e692da6.

Signed-off-by: Nicolas Gaullier <nicolas.gaullier@cji.paris>
2026-03-12 13:54:35 +00:00
Ramiro Polla 72167e5150 avcodec/mjpegdec: deprecate extern_huff option 2026-03-12 14:47:01 +01:00
Nicolas Gaullier afcde6551c avformat/mov: fix skip_samples when sample_rate and time_base do not match
Fixes #21076.
2026-03-12 12:42:06 +00:00
Nicolas Gaullier b66c314c4b fftools/ffprobe: keep decoder buffers unflushed for show_streams()
When a decoder buffer is flushed, parts of the private context are reset,
which may affect show_streams().

Example:
ffprobe -of flat fate-suite/ac3/mp3ac325-4864-small.ts \
    -analyze_frames -show_entries stream=ltrt_cmixlev
Before: ltrt_cmixlev="0.000000"
After:  ltrt_cmixlev="0.707107"

Currently, it seems that only ac3 downmix info is affected.
(ac3 downmix options have been exported since 376bb8481a.)

Fix regression since 045a8b15b1.

Signed-off-by: Nicolas Gaullier <nicolas.gaullier@cji.paris>
2026-03-12 12:18:58 +00:00
Nicolas Gaullier 8a0ae6b344 fftools/ffmpeg_sched: report progress using max dts instead of trailing_dts()
This is to reapply 18217bb0f5.
Its commit msg is still meaningful:
"Using the max instead of the min avoids the progress stopping
with gaps in sparse streams (subtitles)."

A closely related issue: currently, a single stream with
no data makes ffmpeg report N/A for both time and speed.
Fix this by ignoring missing dts values.

Fix regressions since d119ae2fd8.

Signed-off-by: Nicolas Gaullier <nicolas.gaullier@cji.paris>
2026-03-12 12:15:15 +00:00
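The selection rule described in the commit above (take the maximum dts and ignore streams that have none) can be sketched as follows. This is not the ffmpeg_sched code; `progress_dts()` and `NO_DTS` are hypothetical names, with `NO_DTS` standing in for AV_NOPTS_VALUE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for AV_NOPTS_VALUE in this sketch. */
#define NO_DTS INT64_MIN

/* Return the largest known dts, or NO_DTS if no stream has one yet. */
int64_t progress_dts(const int64_t *dts, size_t nb_streams)
{
    int64_t max = NO_DTS;

    assert(dts || !nb_streams);
    for (size_t i = 0; i < nb_streams; i++) {
        if (dts[i] == NO_DTS)
            continue;                   /* a stream without data is skipped */
        if (max == NO_DTS || dts[i] > max)
            max = dts[i];               /* sparse streams cannot hold it back */
    }
    return max;
}
```

With a minimum instead of a maximum, a sparse subtitle stream that lags behind would pin the reported time; with the maximum, the result is only NO_DTS when every stream lacks a dts.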
Romain Beauxis 9dc44b43b2 fftools/ffplay.c: Also print demuxer-level metadata updates. 2026-03-12 02:45:13 +00:00
Michael Niedermayer ba0f8083fd avformat/aiffdec: Check for partial read
Fixes: read of uninitialized memory
Fixes: 490305404/clusterfuzz-testcase-minimized-ffmpeg_dem_AIFF_fuzzer-6406386140643328

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-11 20:06:26 +00:00
Kacper Michajłow 5074d9f06e hwcontext_amf: fix version variable type and remove cast
Fixes compilation errors on newer Clang/GCC, which error out on
incompatible pointer types.

error: incompatible pointer types passing 'unsigned long long *' to
parameter of type 'amf_uint64 *' (aka 'unsigned long *')
[-Wincompatible-pointer-types]

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-03-11 18:41:10 +00:00
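The kind of fix described in the commit above can be sketched as below: declare the variable with the same typedef as the callee's parameter, so that no cast is needed. The names here (`demo_uint64`, `demo_query_version()`, `demo_get_version()`) and the returned value are hypothetical; the real AMF typedef and query function are not reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 64-bit typedef playing the role of the SDK's own type. */
typedef uint64_t demo_uint64;

/* Hypothetical query function that writes a version through a pointer. */
void demo_query_version(demo_uint64 *out)
{
    assert(out != NULL);
    *out = 42;                          /* arbitrary value for the sketch */
}

demo_uint64 demo_get_version(void)
{
    /* Declared with the parameter's typedef. Declaring it as
     * `unsigned long long` and passing its address would be an
     * incompatible pointer type wherever the typedef is `unsigned long`. */
    demo_uint64 version = 0;

    demo_query_version(&version);
    return version;
}
```

Both types may be 64 bits wide on a given platform, but pointers to them are still distinct types, which is what the quoted compiler error reports.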
Lynne 7c79c79a50 aacdec_usac_mps212: reject reserved freq_res value 2026-03-11 17:43:09 +00:00
Kacper Michajłow b028dac149 configure: bump AMF requirement to 1.5.0
6972b127de requires at least version
1.5.0, as earlier versions are not compatible with C due to unguarded
`extern "C"`.

Signed-off-by: Kacper Michajłow <kasper93@gmail.com>
2026-03-11 17:32:20 +00:00
Shreesh Adiga 5085432f8b avutil/crc: add aarch64 NEON PMULL+EOR3 SIMD implementation for av_crc
Implement the clmul algorithm for aarch64 using PMULL and EOR3 instructions.
The logic and structure are the same as in the x86 clmul implementation, with a
slight rearrangement of constants to suit the PMULL and PMULL2 instructions.

Benchmarking on Android (Termux) on a MediaTek Dimensity 9400 SoC:

./tests/checkasm/checkasm --test=crc --bench --runs=12
benchmarking with native FFmpeg timers
nop: 0.2
checkasm: SVE 128 bits, using random seed 2502847808
checkasm: bench runs 4096 (1 << 12)
CRC:
 - crc.crc [OK]
PMULL:
 - crc.crc [OK]
checkasm: all 10 tests passed
crc_8_ATM_c:                                            26.0 ( 1.00x)
crc_8_ATM_pmull_eor3:                                    0.7 (37.17x)
crc_8_EBU_c:                                            46.4 ( 1.00x)
crc_8_EBU_pmull_eor3:                                    1.5 (31.47x)
crc_16_ANSI_c:                                          36.3 ( 1.00x)
crc_16_ANSI_pmull_eor3:                                  1.1 (31.70x)
crc_16_ANSI_LE_c:                                       90.9 ( 1.00x)
crc_16_ANSI_LE_pmull_eor3:                               2.8 (32.30x)
crc_16_CCITT_c:                                        118.0 ( 1.00x)
crc_16_CCITT_pmull_eor3:                                 3.7 (32.00x)
crc_24_IEEE_c:                                           1.6 ( 1.00x)
crc_24_IEEE_pmull_eor3:                                  0.1 (12.19x)
crc_32_IEEE_c:                                          45.2 ( 1.00x)
crc_32_IEEE_pmull_eor3:                                  1.4 (31.39x)
crc_32_IEEE_LE_c:                                       49.1 ( 1.00x)
crc_32_IEEE_LE_crc:                                      2.5 (19.51x)
crc_32_IEEE_LE_pmull_eor3:                               1.5 (32.84x)
crc_custom_polynomial_c:                                45.3 ( 1.00x)
crc_custom_polynomial_pmull_eor3:                        1.3 (35.16x)
2026-03-11 14:03:36 +00:00
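For orientation, the result that any accelerated CRC must reproduce can be computed with a plain bit-at-a-time loop. The sketch below is a standalone CRC-32 (reflected polynomial 0xEDB88320 with the usual initial and final XOR); it is not FFmpeg's av_crc() and does not use carry-less multiplication, and `crc32_bitwise()` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit-at-a-time CRC-32, least significant bit first. */
uint32_t crc32_bitwise(const uint8_t *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;         /* conventional initial value */

    assert(buf || !len);
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int b = 0; b < 8; b++) {
            /* Shift out one bit; XOR in the polynomial if that bit was set. */
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0xEDB88320u & mask);
        }
    }
    return crc ^ 0xFFFFFFFFu;           /* conventional final XOR */
}
```

The nine ASCII bytes "123456789" give 0xCBF43926, the standard check value for this CRC, which makes the function usable as a slow reference when comparing faster variants.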
Shreesh Adiga 952e588600 avutil/crc: refactor helper functions to separate header file
Move the reverse and xnmodp functions to a separate header
so that they can be reused by the aarch64 implementation of av_crc.
2026-03-11 14:03:36 +00:00
Shreesh Adiga b19bd0de6c avutil/cpu: add aarch64 CPU feature flag for PMULL and EOR3 2026-03-11 14:03:36 +00:00
Timo Rothenpieler fb088f224b avfilter/vf_vpp_amf: fix build on non-windows
sscanf and sscanf_s are identical for pure number parsing anyway.
2026-03-11 14:12:26 +01:00
Dmitrii Gershenkop 910000fe59 avfilter/vf_vpp_amf: Extend AMF Color Converter HDR capabilities 2026-03-11 10:23:35 +01:00
Ramiro Polla e3ee346749 swscale/tests/swscale: add -s option to set frame size
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-11 08:05:08 +00:00
Ramiro Polla 5c5444db59 swscale/tests/swscale: avoid redundant ref->src conversions
The ref->src conversion only needs to be performed once per source
pixel format.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-11 08:05:08 +00:00
Ramiro Polla a09cddc803 swscale/tests/swscale: make auxiliary conversions bitexact and accurate_rnd
This prevents the propagation of dither_error across frames, and should
also improve reproducibility across platforms.

Also remove setting of flags for sws_src_dst early on, since it will
inevitably be overwritten during the tests.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-11 08:05:08 +00:00
Ramiro Polla d935000f09 swscale/tests/swscale: give names to SwsContext variables 2026-03-11 08:05:08 +00:00
Ramiro Polla 49b1e214cf swscale/tests/swscale: pass opts and mode arguments as const pointers 2026-03-11 08:05:08 +00:00
Ramiro Polla e34071e7d5 swscale/tests/swscale: split init_ref() out of main()
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-11 08:05:08 +00:00
Ramiro Polla f83c9718ec swscale/tests/swscale: split parse_options() out of main()
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-11 08:05:08 +00:00
Ramiro Polla 953efc9f56 swscale/tests/swscale: remove hardcoded dimension checks
Remove dimension checks originally added to please static analysis
tools. There is little reason to have arbitrary limits in this
developer test tool. The reference files are under the user's control.

This reverts f70a651b3f and c0f0bec2f2.
2026-03-11 08:05:08 +00:00
Ramiro Polla 955cf563c8 swscale/tests/swscale: always allocate frame in scale_legacy()
Legacy swscale may overwrite the pixel formats in the context (see
handle_formats() in libswscale/utils.c). This may lead to an issue
where, when sws_frame_start() allocates a new frame, it uses the wrong
pixel format.

Instead of fixing the issue in swscale, just make sure dst is always
allocated prior to calling the legacy scaler.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-03-11 08:05:08 +00:00
Niklas Haas 2589ce4a2c tests/swscale: unref buffers before each iteration
Otherwise, we always pass frames that already have buffers allocated, which
breaks the no-op refcopy optimizations.

Testing with -p 0.1 -threads 16 -bench 10, on an AMD Ryzen 9 9950X3D:

 Before:
  Overall speedup=2.776x faster, min=0.133x max=629.496x
  yuv444p 1920x1080 -> yuv444p 1920x1080, flags=0x100000 dither=1
    time=9 us, ref=9 us, speedup=1.043x faster

 After:
  Overall speedup=2.721x faster, min=0.140x max=574.034x
  yuv444p 1920x1080 -> yuv444p 1920x1080, flags=0x100000 dither=1
    time=0 us, ref=28 us, speedup=516.504x faster

(The slowdown in the legacy swscale case is from swscale's lack of a no-op
refcopy optimization, plus the fact that it's now actually doing memory
work instead of a no-op / redundant memset)

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-11 08:05:08 +00:00
Niklas Haas 271bacffec tests/swscale: exclude init time from benchmark
This was originally intended to also include performance gains/losses
due to complicated setup logic, but in practice it just means that changing
the number of iterations dramatically affects the measured speedup, which
makes it harder to do quick bench runs during development.
2026-03-11 08:05:08 +00:00
James Almer a9984fec81 avcodec/lcevc_parser: check return value of init_get_bits8()
Fixes coverity issue CID 1684198.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-10 15:50:20 -03:00
Lynne 215e22d1f1 ffv1enc_vulkan: fix typo
Fixes a segfault when host mapping is unsupported.
2026-03-10 19:31:00 +01:00
Diego de Souza 63e0a2add2 avcodec/nvenc: change default H.264 profile from main to high
The NVENC H.264 high profile provides up to 16% bitrate savings
(BD-Rate measured with VMAF) compared to the main profile.

Since most users do not explicitly set a profile, changing the
default benefits the common case. Users requiring the main profile
for legacy decoder compatibility can still set it explicitly.

The change is gated behind a versioned define so it only takes
effect on the next major version bump (libavcodec 63).

Signed-off-by: Diego de Souza <ddesouza@nvidia.com>
2026-03-10 15:08:16 +00:00
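Gating a default behind a versioned define, as the commit above describes, can be sketched like this. All names are hypothetical (`DEMO_VERSION_MAJOR`, `DEMO_OLD_DEFAULT_PROFILE`, `demo_default_profile()`); the only detail taken from the commit message is that the switch happens at major version 63.

```c
#include <assert.h>

/* Hypothetical current major version of the library. */
#define DEMO_VERSION_MAJOR 62

/* True until the next major bump; flips on its own once the major is 63. */
#define DEMO_OLD_DEFAULT_PROFILE (DEMO_VERSION_MAJOR < 63)

enum DemoProfile {
    DEMO_PROFILE_MAIN,
    DEMO_PROFILE_HIGH,
};

int demo_default_profile(void)
{
#if DEMO_OLD_DEFAULT_PROFILE
    return DEMO_PROFILE_MAIN;           /* behaviour kept until the bump */
#else
    return DEMO_PROFILE_HIGH;           /* new default after the bump */
#endif
}

/* Small usage check: with major 62 the old default is still in effect. */
void demo_default_check(void)
{
    assert(demo_default_profile() == DEMO_PROFILE_MAIN);
}
```

The point of the pattern is that the new default ships in the source tree now, but existing users only see the change at an ABI/API break, and the dead branch can be deleted afterwards.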
Andreas Rheinhardt 0afa879a69 avcodec/aac/aacdec_usac: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:19 +01:00
Andreas Rheinhardt 7e70503ed4 avcodec/vp5: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:19 +01:00
Andreas Rheinhardt 67217549c8 avcodec/get_bits: Rename macro variables to avoid shadowing
Especially 'n' often leads to shadowing, e.g. in mpeg12dec.c.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:19 +01:00
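Why a macro-local variable called `n` tends to shadow can be seen in a small sketch. The macro below is hypothetical and unrelated to the actual get_bits.h macros; it only shows the renaming idea.

```c
#include <assert.h>

/* If the temporary were declared as `int n = (count);`, then expanding the
 * macro in a function that already has a variable `n` would shadow it, and
 * passing that `n` as `count` would even initialize the temporary from
 * itself. A macro-specific name avoids both problems. */
#define SUM_TO(result, count)                           \
    do {                                                \
        int sum_to_n_ = (count);                        \
        (result) = sum_to_n_ * (sum_to_n_ + 1) / 2;     \
    } while (0)

/* Sum of 1..n, using the macro from a function whose parameter is `n`. */
int sum_first(int n)
{
    int total;

    assert(n >= 0);
    SUM_TO(total, n);                   /* no clash with the caller's n */
    return total;
}
```

With the temporary named `sum_to_n_`, the expansion inside sum_first() neither shadows the parameter nor triggers -Wshadow.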
Andreas Rheinhardt 3343567482 avcodec/motion_est: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:19 +01:00
Andreas Rheinhardt ac25aed6b3 avcodec/eatgq: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:19 +01:00
Andreas Rheinhardt 2d9cf94283 avfilter/vf_chromanr: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:19 +01:00
Andreas Rheinhardt 380e1cdb0c avfilter/af_afftfilt: Don't get max align multiple times
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:19 +01:00
Andreas Rheinhardt a4efdcaa53 avfilter/af_apsyclip: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:19 +01:00
Andreas Rheinhardt b0839506d7 avfilter/vf_blurdetect: Fix shadowing
Also use smaller scope in general.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:19 +01:00
Andreas Rheinhardt 1bbac3643b avfilter/avf_showspectrum: Avoid allocation
Also fixes an instance of shadowing.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:19 +01:00
Andreas Rheinhardt 7950e5d1a5 avcodec/rka: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:19 +01:00
Andreas Rheinhardt 0a1606f86c avcodec/hpeldsp: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:19 +01:00
Andreas Rheinhardt 92046bcd7b avcodec/cbs: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 06ea751c51 avcodec/mpegaudioenc: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt bffaa6aaab avcodec/utvideodec: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 78f8ef341e avcodec/wmaenc: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt c0ba18c527 avcodec/dv_tablegen: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt ba57a33351 avformat/id3v2: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 90dae166b5 avformat/http: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 8ddb82fd75 avformat/lafdec: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 1617feef50 avformat/asfdec_f: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 2ed4660960 avformat/rtpenc_mpegts: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 6de2565b8e avformat/rtpdec_xiph: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 422ad600cd avformat/oggparseopus: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 51ae4f443d avcodec/cbs_av1_syntax_template: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 615d5c2715 avformat/dsfdec: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 1e440f2745 avformat/dovi_isom: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 9c0f942293 avformat/aviobuf: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 1778991846 avformat/avio: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 5962ca0c20 avformat/avidec: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 8f9239a869 avformat/mpc8: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 378928e79f avformat/mpegtsenc: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 968511ad03 avformat/dhav: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt a49eed2fb1 avformat/oggenc: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 4014d35dda avformat/bonk: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 5198d8802c avformat/matroskadec: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt fd88a52be0 avformat/matroskaenc: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 7a0ae45bcf avformat/rmenc: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 6aa367b9b3 avformat/smacker: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 54672d194c avformat/srtpproto: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt b53752dc4c avformat/tcp: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 619839ac16 avformat/tee: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 034b37a51d avformat/vividas: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt a2a42aa404 avformat/vorbiscomment: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 0037c5abdd avformat/webpenc: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 61c22c71c2 avformat/yuv4mpegdec: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 33971e6c4e avformat/apetag: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 6a78db80f0 avformat/hlsenc: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 999c04db35 avutil/channel_layout: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 3fbe80d17e avutil/aes: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt df7b8cae7b avutil/slicethread: Fix shadowing
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:18 +01:00
Andreas Rheinhardt 0992c19c30 avfilter/avf_showspectrum: Fix allocation check
If s->stop is set, the return value would be overwritten
before being checked. This bug was introduced in the switch
to AV_TX in 014ace8f98.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-10 13:52:17 +01:00
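The bug pattern named in the commit above (a return value overwritten before it is checked) can be sketched in isolation. The functions are hypothetical stand-ins, not the avf_showspectrum code: `step_a()` always fails and `step_b()` always succeeds.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical init steps for the sketch. */
static int step_a(void) { return -ENOMEM; }
static int step_b(void) { return 0; }

/* Buggy: when run_b is set, the result of step_a() is lost. */
int init_buggy(int run_b)
{
    int ret = step_a();

    if (run_b)
        ret = step_b();                 /* overwrites the unchecked error */
    if (ret < 0)
        return ret;
    return 0;
}

/* Fixed: every result is checked before ret is reused. */
int init_fixed(int run_b)
{
    int ret = step_a();

    if (ret < 0)
        return ret;
    if (run_b) {
        ret = step_b();
        if (ret < 0)
            return ret;
    }
    return 0;
}

/* Usage check: the buggy variant reports success despite the failure. */
void demo_init_check(void)
{
    assert(init_buggy(1) == 0);
    assert(init_fixed(1) == -ENOMEM);
}
```

Note that the buggy variant still reports the error when the optional step is skipped, which is why such a bug only shows up under one configuration.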
Georgii Zagoruiko c1be2107c9 aarch64/vvc: Optimisations of put_luma_h() functions for 10/12-bit
RPi4:
put_chroma_h_10_2x2_c:                                  63.4 ( 1.00x)
put_chroma_h_10_4x4_c:                                 151.4 ( 1.00x)
put_chroma_h_10_8x8_c:                                 555.1 ( 1.00x)
put_chroma_h_10_8x8_neon:                              113.9 ( 4.88x)
put_chroma_h_10_16x16_c:                              1068.5 ( 1.00x)
put_chroma_h_10_16x16_neon:                            439.4 ( 2.43x)
put_chroma_h_10_32x32_c:                              3432.6 ( 1.00x)
put_chroma_h_10_32x32_neon:                           1878.3 ( 1.83x)
put_chroma_h_10_64x64_c:                             12872.2 ( 1.00x)
put_chroma_h_10_64x64_neon:                           7868.2 ( 1.64x)
put_chroma_h_10_128x128_c:                           45612.2 ( 1.00x)
put_chroma_h_10_128x128_neon:                        28742.1 ( 1.59x)
put_chroma_h_12_2x2_c:                                  63.7 ( 1.00x)
put_chroma_h_12_4x4_c:                                 151.5 ( 1.00x)
put_chroma_h_12_8x8_c:                                 555.2 ( 1.00x)
put_chroma_h_12_8x8_neon:                              114.2 ( 4.86x)
put_chroma_h_12_16x16_c:                              1068.1 ( 1.00x)
put_chroma_h_12_16x16_neon:                            438.8 ( 2.43x)
put_chroma_h_12_32x32_c:                              3419.7 ( 1.00x)
put_chroma_h_12_32x32_neon:                           1878.7 ( 1.82x)
put_chroma_h_12_64x64_c:                             12862.2 ( 1.00x)
put_chroma_h_12_64x64_neon:                           7868.2 ( 1.63x)
put_chroma_h_12_128x128_c:                           45613.5 ( 1.00x)
put_chroma_h_12_128x128_neon:                        28743.3 ( 1.59x)

Apple M4:
put_chroma_h_10_2x2_c:                                   2.5 ( 1.00x)
put_chroma_h_10_4x4_c:                                   6.5 ( 1.00x)
put_chroma_h_10_8x8_c:                                  17.8 ( 1.00x)
put_chroma_h_10_8x8_neon:                                6.8 ( 2.60x)
put_chroma_h_10_16x16_c:                                53.3 ( 1.00x)
put_chroma_h_10_16x16_neon:                             30.4 ( 1.75x)
put_chroma_h_10_32x32_c:                               181.8 ( 1.00x)
put_chroma_h_10_32x32_neon:                            116.2 ( 1.56x)
put_chroma_h_10_64x64_c:                               684.2 ( 1.00x)
put_chroma_h_10_64x64_neon:                            470.3 ( 1.45x)
put_chroma_h_10_128x128_c:                            2567.6 ( 1.00x)
put_chroma_h_10_128x128_neon:                         1879.3 ( 1.37x)
put_chroma_h_12_2x2_c:                                   1.9 ( 1.00x)
put_chroma_h_12_4x4_c:                                   7.0 ( 1.00x)
put_chroma_h_12_8x8_c:                                  16.8 ( 1.00x)
put_chroma_h_12_8x8_neon:                                7.9 ( 2.12x)
put_chroma_h_12_16x16_c:                                55.0 ( 1.00x)
put_chroma_h_12_16x16_neon:                             29.0 ( 1.90x)
put_chroma_h_12_32x32_c:                               182.5 ( 1.00x)
put_chroma_h_12_32x32_neon:                            116.9 ( 1.56x)
put_chroma_h_12_64x64_c:                               666.8 ( 1.00x)
put_chroma_h_12_64x64_neon:                            474.5 ( 1.41x)
put_chroma_h_12_128x128_c:                            2588.1 ( 1.00x)
put_chroma_h_12_128x128_neon:                         1912.2 ( 1.35x)
2026-03-10 12:48:54 +00:00
James Almer 125bb2e045 avcodec/lcevc_parser: Check that block_size is not negative
Based on 248b481c33

Signed-off-by: James Almer <jamrial@gmail.com>
2026-03-09 18:39:33 -03:00
Andreas Rheinhardt f6894debc0 avfilter/vf_hqdn3d: Remove unnecessary emms_c()
Added in e995cf1bcc,
yet this filter does not have any dsp function using MMX:
it only has generic x86 assembly, no SIMD at all,
so this emms_c() was always unnecessary.

Reviewed-by: Kacper Michajłow <kasper93@gmail.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-09 19:07:34 +01:00
nyanmisaka 3f10a054dc fftools/ffmpeg: fix read_key() always returning 255 when there was no input
fixup 08d327e

When an unsigned char is set to -1, it becomes 255 when read as an int.
Duplicating the variable for the two terminal types also avoids unused-variable warnings.

Signed-off-by: nyanmisaka <nst799610810@gmail.com>
2026-03-09 16:13:18 +00:00
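The conversion behind the bug in the commit above can be reproduced on its own. The helpers below are hypothetical and are not the fftools read_key() code; they only contrast keeping a "no input" sentinel in an unsigned char with keeping it in an int.

```c
#include <assert.h>
#include <limits.h>

/* The sentinel -1 stored in an unsigned char wraps to UCHAR_MAX. */
int sentinel_via_uchar(void)
{
    unsigned char ch = (unsigned char)-1;

    return ch;                          /* promoted to int as UCHAR_MAX */
}

/* The same sentinel kept in an int survives unchanged. */
int sentinel_via_int(void)
{
    int ch = -1;

    return ch;
}

/* Usage check: only the int variant still compares equal to -1. */
void demo_sentinel_check(void)
{
    assert(sentinel_via_uchar() == UCHAR_MAX);  /* 255 when CHAR_BIT is 8 */
    assert(sentinel_via_int() == -1);
}
```

A caller that tests the result against -1 to detect "no key pressed" therefore never sees the sentinel in the unsigned char case.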
Niklas Haas 68046d0b33 Revert "swscale/vulkan/ops: move buffer desc setting to helper function"
This reverts commit 32554fc107.

Accidentally pushed this commit twice, with the wrong location.
Correct version is 97682155e6.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 12:16:42 +01:00
Niklas Haas bd9590db70 swscale/ops_dispatch: remove unnecessary SwsOpExec fields
These were abstraction-violating in the first place. Good riddance.

This partially reverts commit c911295f09.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 12:01:51 +01:00
Niklas Haas 143cb56501 swscale/vulkan/ops: use opaque run function
Avoids some unnecessary round-trips through the execution harness, as well
as removing one unnecessary layer of abstraction (SwsOpExec).

It's a bit unfortunate that we have to cast away the const on the AVFrame,
since the Vulkan functions take non-const everywhere, even though all they're
doing is modifying frame internal metadata, but alas.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 12:01:51 +01:00
Niklas Haas 911176c880 swscale/ops_dispatch: add SwsCompiledFunc.opaque
Allows compiled functions to opt out of the ops_dispatch execution harness
altogether and just get dispatched directly as the pass run() function.

Useful in particular for Vulkan.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 12:01:51 +01:00
Niklas Haas 9571f5cf15 swscale/graph: simplify ff_sws_graph_add_pass() usages
Now that this function returns a status code and takes care of cleanup on
failure, many call-sites can just return the function directly.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 12:01:51 +01:00
Niklas Haas 2e29833832 swscale/graph: have ff_sws_graph_add_pass() free priv on failure
This is arguably more convenient for most downstream users, as will be
more prominently seen in the next commit.

Also allows this code to re-use a pass_free() helper with the graph uninit.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 12:01:51 +01:00
Niklas Haas 42a47838ea swscale/graph: add setup()/free() to ff_sws_graph_add_pass() signature
This pattern is just common enough that it IMO makes sense to do so.
This will also make more sense after the following commits.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 12:01:51 +01:00
Niklas Haas 5b8889f4e8 swscale/graph: add typedef for SwsPassSetup
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 12:01:51 +01:00
Niklas Haas 254c07bf60 swscale/graph: rename sws_filter_run_t to SwsPassFunc
This name is weirdly out-of-place in the libswscale naming convention.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 12:01:51 +01:00
Niklas Haas fdc0a66cbd swscale/graph: skip threading for single-slice passes
This condition was weaker than necessary.

In particular, graph->num_thread == 1 guarantees pass->num_slices == 1.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 12:01:51 +01:00
Niklas Haas 97682155e6 swscale/vulkan/ops: move buffer desc setting to helper function
And call it on the read/write ops directly, rather than this awkward loop.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 12:01:51 +01:00
Niklas Haas 9b7439c31b swscale: don't pointlessly loop over NULL buffers
This array is defined as contiguous.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 11:25:58 +01:00
Niklas Haas dd75b6b57c swscale: add sanity clear on AVFrame *dst
Before allocating/referencing buffers, make sure these fields are in a
defined state.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 11:25:58 +01:00
Niklas Haas 76c60b192d swscale: restructure sws_scale_frame() slightly
Results in IMHO slightly more readable code flow, and will be useful in an
upcoming commit (that adds logic to ref individual planes).

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 11:25:58 +01:00
Niklas Haas b52df46585 tests/checkasm/sw_ops: fix exec.slice_h assignment
This should match the number of lines. As an aside, align these declarations.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 11:25:58 +01:00
Niklas Haas a534156083 swscale/graph: pass SWS_OP_FLAG_OPTIMIZE
Instead of optimizing it with an explicit call. May enable more optimizations
in the future.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 11:25:58 +01:00
Niklas Haas f77ab892f5 swscale/ops_dispatch: print op list on successful compile
Instead of once at the start of add_convert_pass(). This makes much
more sense in light of the fact that we want to start e.g. splitting
passes apart.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 11:25:58 +01:00
Niklas Haas a31973e99c swscale/ops_dispatch: avoid redundant ff_sws_op_list_update_comps()
This is already called by compile_backend(), and nothing else in this file
depends on accurate values.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 11:25:58 +01:00
Niklas Haas 32554fc107 swscale/vulkan/ops: move buffer desc setting to helper function
And call it on the read/write ops directly, rather than this awkward loop.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 11:25:58 +01:00
Niklas Haas b6ebee038f swscale/vulkan/ops: move fractional read/write rejection to implementation
Rather than testing for it separately for some reason.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 11:25:57 +01:00
Niklas Haas 104475ecb9 swscale/vulkan/ops: fix undefined behavior on SWS_OP_CLEAR
op->rw.frac dereferences nonsense on clear ops.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 11:25:57 +01:00
Niklas Haas b8cd331305 swscale/vulkan/ops: log op name in generated shader
I think this just makes for a marginally nicer debugging experience.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 11:25:57 +01:00
Niklas Haas eebc07aba7 swscale/ops: simplify ff_sws_op_list_print
Using the new ff_sws_op_type_name() helper.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 11:25:57 +01:00
Niklas Haas 1addde59f9 swscale/ops: add ff_sws_op_type_name
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 11:25:57 +01:00
Niklas Haas 1d16161a8b swscale/ops: use SwsCompFlags typedef instead of plain int
This improves the debugging experience. These are all internal structs so
there is no need to worry about ABI stability as a result of adding flags.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-03-09 11:25:57 +01:00
Andreas Rheinhardt 1a9c345ee8 avutil/mips: Add msa optimizations for pixelutils
Adapted from the corresponding me_cmp code. Only the width 16 function
has been adapted, because it seems that the width 8 function actually
reads 16 bytes per line.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-09 10:17:26 +01:00
Andreas Rheinhardt 471db1d323 avutil/arm: Add armv6 optimizations for pixelutils
Adapted from the corresponding me_cmp code.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-09 10:17:26 +01:00
Andreas Rheinhardt 9b84b8682f avutil/riscv: Add rvv optimizations for pixelutils
Adapted from the corresponding me_cmp code.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-09 10:17:26 +01:00
Andreas Rheinhardt 022c42649c avutil/aarch64: Add neon optimizations for pixelutils
Adapted from the corresponding me_cmp code.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-09 10:17:26 +01:00
Andreas Rheinhardt e114c63234 avutil/x86/pixelutils: Avoid near-empty header
lavu/x86/pixelutils.h only declares exactly one function,
namely the arch-specific init function. Such declarations
are usually contained in the ordinary header providing
the generic init function, yet the latter is public in this case.

Given that said function is called from exactly one callsite,
the header can be made more useful by moving the actual x86-init
function to it (as a static inline function) and removing
pixelutils_init.c.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-09 10:17:26 +01:00
Andreas Rheinhardt 085f06a13f avutil/pixelutils: Don't unconditionally include arch-specific header
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-09 10:17:26 +01:00
Andreas Rheinhardt c9e056bc85 avutil/x86/pixelutils: Remove pointless AVX2 sad32x32 functions
Memory operands of VEX encoded instructions generally have
no alignment requirement and so can be used in the case where
both inputs are unaligned, too. Furthermore, unaligned load
instructions are as fast as aligned loads (from aligned addresses)
for modern cpus, in particular those with AVX2.

Therefore it makes no sense to have three different AVX2 sad32x32
functions. So remove two of them (the remaining one is the same
as the old one where src1 was aligned and src2 was not).

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-09 10:17:26 +01:00
Andreas Rheinhardt 2862fa37e1 tests/checkasm: Add pixelutils test
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-09 10:17:26 +01:00
Ramiro Polla 4792440ab8 swscale/unscaled: fix planarCopyWrapper for float formats with same endianness
2026-03-09 08:22:58 +00:00
Andreas Rheinhardt 32b42dfd9b avcodec/x86/pngdsp: Don't use 64bit unnecessarily
The automatic zero-extensions when assigning a 32bit register
make using 64bits unnecessary.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-09 07:28:29 +01:00
Andreas Rheinhardt a8679f456f avcodec/x86/pngdsp: Avoid jump
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-09 07:28:29 +01:00
Andreas Rheinhardt f0a28cf9ce avcodec/x86/pngdsp: Don't use mmx register in ff_add_bytes_l2_sse2()
This change has no measurable impact on performance here;
it is intended to avoid unpredictable behavior with floating
point operation like the one that led to commit
57a29f2e7d.

Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@outlook.com>
2026-03-09 07:28:29 +01:00
Aditya Banavi 31c2f814ca avformat/tls_gnutls: fix DTLS handshake failure in some WebRTC cases
The previous code may encounter a handshake failure when publishing
WHIP to some servers.

See RFC 8827 section 6.5:
All implementations MUST support DTLS 1.2 with the
TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 cipher suite
and the P-256 curve.

So this patch uses the specific curve to avoid incompatibility.

Signed-off-by: Aditya Banavi <adityabanavi@gmail.com>
2026-03-09 03:11:04 +00:00
Michael Niedermayer 248b481c33 avcodec/bsf/extract_extradata: Check that block_size is not negative
Fixes: out of array access
Fixes: 490576036/clusterfuzz-testcase-minimized-ffmpeg_BSF_EXTRACT_EXTRADATA_fuzzer-4605696279904256

Found-by: continuous fuzzing process https://github.com/google/oss-fuzz/tree/master/projects/ffmpeg
Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-03-08 23:34:14 +00:00
Romain Beauxis e5e8efae5c libavformat/mov.c: Fix seek in fragmented mp4 files where the audio and video streams are written to separate fragments
2026-03-08 16:22:24 -05:00
973 changed files with 42125 additions and 15037 deletions
+14
@@ -57,6 +57,7 @@ libavcodec/.*png.* @Traneptora
libavcodec/.*prores.* @lynne
libavcodec/rangecoder.* @michaelni
libavcodec/ratecontrol.* @michaelni
libavcodec/rkmpp* @quink
libavcodec/rv60.* @pross
libavcodec/sgirle.* @pross
libavcodec/.*siren.* @lynne
@@ -109,6 +110,7 @@ libavfilter/vf_find_rect.* @michaelni
libavfilter/vf_icc.* @haasn
libavfilter/vf_libplacebo.* @haasn
libavfilter/vf_libvmaf.* @kylophone
libavfilter/vf_mpdecimate.* @dana-feng
libavfilter/vf_premultiply.* @haasn
libavfilter/vf_scale.* @haasn
libavfilter/vf_scale_vt.* @quink
@@ -139,6 +141,8 @@ libavformat/electronicarts.* @pross
libavformat/.*exif.* @Traneptora
libavformat/filmstrip.* @pross
libavformat/frm.* @pross
libavformat/hls.* @kasper93
libavformat/hxvs.* @quink
libavformat/iamf.* @jamrial
libavformat/icecast.c @ePirat
libavformat/ico.* @pross
@@ -153,6 +157,7 @@ libavformat/mlv.* @pross
libavformat/mm.* @pross
libavformat/msp.* @pross
libavformat/mv.* @pross
libavformat/ogg.* @toots
libavformat/pp_bnk.* @zane
libavformat/rm.* @pross
libavformat/sauce.* @pross
@@ -193,6 +198,7 @@ libavutil/aarch64/.* @lynne @mstorsjo
libavutil/arm/.* @mstorsjo
libavutil/ppc/.* @sean_mcg
libavutil/riscv/.* @Courmisch
libavutil/wasm/.* @quink
libavutil/x86/.* @lynne
# swresample
@@ -226,8 +232,16 @@ doc/.* @GyanD
# tests
# =====
tests/checkasm/riscv/.* @Courmisch
libavutil/tests/buffer.* @MarcosAsh
libavutil/tests/hdr_dynamic_vivid_metadata.* @MarcosAsh
libavutil/tests/tdrdi.* @MarcosAsh
libavutil/tests/timestamp.* @MarcosAsh
tests/ref/.*drawvg.* @ayosec
tests/ref/fate/buffer @MarcosAsh
tests/ref/fate/hdr_dynamic_vivid_metadata @MarcosAsh
tests/ref/fate/sub-mcc.* @programmerjake
tests/ref/fate/tdrdi @MarcosAsh
tests/ref/fate/timestamp @MarcosAsh
# Forgejo
# =======
+3 -3
@@ -20,9 +20,9 @@ repos:
- id: trailing-whitespace
- repo: local
hooks:
- id: aarch64-asm-indent
name: fix aarch64 assembly indentation
files: ^.*/aarch64/.*\.S$
- id: arm-asm-indent
name: fix arm/aarch64 assembly indentation
files: ^.*/(arm|aarch64)/.*\.S$
language: script
entry: ./tools/check_arm_indent.sh --apply
pass_filenames: false
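
The widened `files:` pattern in this hunk can be sanity-checked outside pre-commit. The following is a minimal sketch, assuming `grep -E` interprets this particular pattern the same way pre-commit's regex engine does (the pattern uses no engine-specific syntax). The helper name and the sample paths are hypothetical and only illustrate the match.

```shell
#!/bin/sh
# Pattern copied from the new hook definition.
pattern='^.*/(arm|aarch64)/.*\.S$'

# Hypothetical helper: succeeds when a path would be picked up by the hook.
hook_matches() {
    printf '%s\n' "$1" | grep -Eq "$pattern"
}

# Illustrative paths, not taken from the tree.
for f in libavcodec/arm/example.S libavutil/aarch64/example.S libavcodec/x86/example.asm; do
    if hook_matches "$f"; then
        echo "checked: $f"
    else
        echo "skipped: $f"
    fi
done
```

Both the `arm/` and `aarch64/` paths are reported as checked, while the x86 path is skipped.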
+1
@@ -69,6 +69,7 @@ pEvents
PixelX
Psot
quater
re-use
readd
recuse
redY
+1
@@ -22,6 +22,7 @@ jobs:
with:
configuration-path: .forgejo/labeler/labeler.yml
repo-token: ${{ secrets.AUTOLABELER_TOKEN }}
sync-labels: true
- name: Label by title-match
uses: actions/github-script@v8
with:
+2
@@ -26,6 +26,8 @@
*.spv
*.spv.c
*.spv.gz
*.gen.c
*.gen.S
*.ptx
*.ptx.c
*.ptx.gz
+1
@@ -1,4 +1,5 @@
# Note to Github users
Patches should be submitted to [Forgejo](https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls) or the [ffmpeg-devel mailing list](https://ffmpeg.org/mailman/listinfo/ffmpeg-devel) using `git format-patch` or `git send-email`. Github pull requests should be avoided because they are not part of our review process and **will be ignored**.
See [https://ffmpeg.org/developer.html#Contributing](https://ffmpeg.org/developer.html#Contributing) for more information.
+8
@@ -2,6 +2,14 @@ Entries are sorted chronologically from oldest to youngest within each release,
releases are sorted from youngest to oldest.
version <next>:
- Extend AMF Color Converter (vf_vpp_amf) HDR capabilities
- LCEVC track muxing support in MP4 muxer
- Playdate video encoder and muxer
- Add v360_vulkan filter
- HE-AAC 960 decoding (DAB+)
- transpose_cuda filter
- Add AMF Frame Rate Converter (vf_frc_amf) filter
- SMPTE 2094-50 metadata support and passthrough
version 8.1:
+6 -10
@@ -1,4 +1,4 @@
## Installing FFmpeg
# Installing FFmpeg
0. If you like to include source plugins, merge them before configure
for example run tools/merge-all-source-plugins
@@ -14,15 +14,11 @@ path when launching `configure`, e.g. `/ffmpegdir/ffmpeg/configure`.
3. Type `make install` to install all binaries and libraries you built.
NOTICE
------
## NOTICE
- Non system dependencies (e.g. libx264, libvpx) are disabled by default.
- Non system dependencies (e.g. libx264, libvpx) are disabled by default.
NOTICE for Package Maintainers
------------------------------
## NOTICE for Package Maintainers
- It is recommended to build FFmpeg twice, first with minimal external dependencies so
that 3rd party packages, which depend on FFmpegs libavutil/libavfilter/libavcodec/libavformat
can then be built. And last build FFmpeg with full dependencies (which may in turn depend on
some of these 3rd party packages). This avoids circular dependencies during build.
- It is recommended to build FFmpeg twice, first with minimal external dependencies so that 3rd party packages, which depend on FFmpegs libavutil/libavfilter/libavcodec/libavformat
can then be built. And last build FFmpeg with full dependencies (which may in turn depend on some of these 3rd party packages). This avoids circular dependencies during build.
+15 -4
@@ -93,6 +93,7 @@ Other:
hash Reimar Doeffinger
hwcontext_cuda* Timo Rothenpieler
hwcontext_d3d12va* Wu Jianhua
hwcontext_oh* Zhao Zhili
hwcontext_vulkan* [2] Lynne
intfloat* Michael Niedermayer
integer.c, integer.h Michael Niedermayer
@@ -158,7 +159,7 @@ Codecs:
asv* Michael Niedermayer
atrac3plus* Maxim Poliakovski
audiotoolbox* rcombs
avs2* Huiwen Ren
avs2* Huiwen Ren, Zhao Zhili
bgmc.c, bgmc.h Thilo Borgmann
binkaudio.c Peter Ross
cavs* Stefan Gehrer
@@ -232,6 +233,7 @@ Codecs:
msvideo1.c Mike Melanson
nuv.c Reimar Doeffinger
nvdec*, nvenc* Timo Rothenpieler
oh* Zhao Zhili
omx.c Martin Storsjo, Aman Gupta
opus* Rostislav Pehlivanov
pcx.c Ivo van Poorten
@@ -243,6 +245,7 @@ Codecs:
qtrle.c Mike Melanson
ra144.c, ra144.h, ra288.c, ra288.h Roberto Togni
resample2.c Michael Niedermayer
rkmppenc* Zhao Zhili
rl2.c Sascha Sommer
rpza.c Roberto Togni
rtjpeg.c, rtjpeg.h Reimar Doeffinger
@@ -317,6 +320,10 @@ libavfilter
===========
Generic parts:
Framework and orphaned filters Nicolas George
(except hardware acceleration)
graphdump.c Nicolas George
motion_estimation.c Davinder Singh
@@ -348,7 +355,9 @@ Filters:
vf_minterpolate.c Davinder Singh
vf_readvitc.c Tobias Rapp (CC t.rapp at noa-archive dot com)
vf_scale.c [2] Michael Niedermayer
vf_scale_vt.c Zhao Zhili
vf_tonemap_opencl.c Ruiling Song
vf_transpose_vt.c Zhao Zhili
vf_yadif.c [2] Michael Niedermayer
vf_xfade_vulkan.c [2] Marvin Scholz (CC <epirat07@gmail.com>)
@@ -406,7 +415,9 @@ Muxers/Demuxers:
flvenc.c Michael Niedermayer, Steven Liu
gxf.c Reimar Doeffinger
gxfenc.c Baptiste Coudurier
hls.c Kacper Michajłow
hlsenc.c Christian Suloway, Steven Liu
hxvs.c Zhao Zhili
iamf* [2] James Almer
idcin.c Mike Melanson
idroqdec.c Mike Melanson
@@ -442,9 +453,9 @@ Muxers/Demuxers:
nsvdec.c Francois Revol
nut* Michael Niedermayer
nuv.c Reimar Doeffinger
oggdec.c, oggdec.h David Conrad
oggenc.c Baptiste Coudurier
oggparse*.c David Conrad
oggdec.c, oggdec.h David Conrad, Romain Beauxis
oggenc.c Baptiste Coudurier, Romain Beauxis
oggparse*.c David Conrad, Romain Beauxis
oma.c Maxim Poliakovski
pp_bnk.c Zane van Iperen
psxstr.c Mike Melanson
+1 -2
@@ -1,5 +1,4 @@
FFmpeg README
=============
# FFmpeg README
FFmpeg is a collection of libraries and tools to process multimedia content
such as audio, video, subtitles and related metadata.
+1
@@ -182,6 +182,7 @@ static inline __device__ float fabsf(float a) { return __builtin_fabsf(a); }
static inline __device__ float fabs(float a) { return __builtin_fabsf(a); }
static inline __device__ double fabs(double a) { return __builtin_fabs(a); }
static inline __device__ float sqrtf(float a) { return __builtin_sqrtf(a); }
static inline __device__ float rintf(float a) { return __builtin_rintf(a); }
static inline __device__ float __saturatef(float a) { return __nvvm_saturate_f(a); }
static inline __device__ float __sinf(float a) { return __nvvm_sin_approx_f(a); }
+128 -39
@@ -390,7 +390,7 @@ Toolchain options:
--tempprefix=PATH force fixed dir/prefix instead of mktemp for checks
--toolchain=NAME set tool defaults according to NAME
(<tool>[-sanitizer[-...]], e.g. clang-asan-ubsan
tools: gcc, clang, msvc, icl, gcov, llvm-cov,
tools: gcc, clang, llvm, msvc, icl, gcov, llvm-cov,
valgrind-memcheck, valgrind-massif, hardened
sanitizers: asan, fuzz, lsan, msan, tsan, ubsan)
--nm=NM use nm tool NM [$nm_default]
@@ -415,6 +415,7 @@ Toolchain options:
--pkg-config-flags=FLAGS pass additional flags to pkgconf []
--ranlib=RANLIB use ranlib RANLIB [$ranlib_default]
--doxygen=DOXYGEN use DOXYGEN to generate API doc [$doxygen_default]
--makeinfo=MAKEINFO use MAKEINFO to generate documentation [$makeinfo_default]
--host-cc=HOSTCC use host C compiler HOSTCC
--host-cflags=HCFLAGS use HCFLAGS when compiling for host
--host-cppflags=HCPPFLAGS use HCPPFLAGS when compiling for host
@@ -482,6 +483,8 @@ Optimization options (experts only):
--disable-arm-crc disable ARM/AArch64 CRC optimizations
--disable-dotprod disable DOTPROD optimizations
--disable-i8mm disable I8MM optimizations
--disable-pmull disable PMULL optimizations
--disable-eor3 disable EOR3 optimizations
--disable-sve disable SVE optimizations
--disable-sve2 disable SVE2 optimizations
--disable-sme disable SME optimizations
@@ -1079,6 +1082,10 @@ hostcc_o(){
eval printf '%s\\n' $HOSTCC_O
}
hostld_o(){
eval printf '%s\\n' $HOSTLD_O
}
glslc_o(){
eval printf '%s\\n' $GLSLC_O
}
@@ -1294,7 +1301,14 @@ test_ld(){
test_$type $($cflags_filter $flags) || return
flags=$($ldflags_filter $flags)
libs=$($ldflags_filter $libs)
test_cmd $ld $LDFLAGS $LDEXEFLAGS $flags $(ld_o $TMPE) $TMPO $libs $extralibs
log $ld $LDFLAGS $LDEXEFLAGS $flags $(ld_o $TMPE) $TMPO $libs $extralibs
output=$($ld $LDFLAGS $LDEXEFLAGS $flags $(ld_o $TMPE) $TMPO $libs $extralibs 2>&1)
ret=$?
echo "$output" >> $logfile
# link.exe and lld-link exit 0 even for unrecognized options, emitting
# only a warning (LNK4044 / "ignoring unknown argument"). Treat such
# output as failure so check_ldflags rejects those flags correctly.
test $ret -eq 0 && ! echo "$output" | grep -qE 'LNK4044|lld-link: warning: ignoring unknown argument'
}
check_ld(){
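
The decision rule in the new `test_ld` body (exit status zero and no "unknown option" warning in the captured output) can be tried in isolation. This is a sketch under stated assumptions: `fake_ld_ok`, `fake_ld_warn` and `link_accepted` are hypothetical stand-ins, and the warning text is only an illustrative LNK4044-style message, not captured linker output.

```shell
#!/bin/sh
# Hypothetical stand-ins for a linker: both exit 0, one emits the warning.
fake_ld_ok()   { return 0; }
fake_ld_warn() { echo "LINK : warning LNK4044: unrecognized option '/foo'; ignored"; return 0; }

# Same rule as the new test_ld body: the exit status must be zero AND the
# captured output must not contain an "unknown option" warning.
link_accepted() {
    output=$("$@" 2>&1)
    ret=$?
    test $ret -eq 0 && ! echo "$output" | grep -qE 'LNK4044|lld-link: warning: ignoring unknown argument'
}

link_accepted fake_ld_ok   && echo "flag accepted"
link_accepted fake_ld_warn || echo "flag rejected"
```

Capturing the output first is what allows both the status and the text to be inspected; a plain `test_cmd` call only sees the status.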
@@ -1833,6 +1847,16 @@ test_host_cc(){
test_cmd $host_cc $host_cflags "$@" $HOSTCC_C $(hostcc_o $TMPO) $TMPC
}
test_host_ld(){
log test_host_ld "$@"
flags=$(filter_out '-l*|*.so' $@)
libs=$(filter '-l*|*.so' $@)
test_host_cc $($host_cflags_filter $flags) || return
flags=$($host_ldflags_filter $flags)
libs=$($host_ldflags_filter $libs)
test_cmd $host_ld $host_ldflags $flags $(hostld_o $TMPE) $TMPO $libs $host_extralibs
}
test_host_cpp(){
log test_host_cpp "$@"
cat > $TMPC
@@ -1897,6 +1921,27 @@ check_host_cpp_condition(){
test_host_cpp_condition "$@" && enable $name
}
check_host_lib(){
log check_host_lib "$@"
headers="$1"
funcs="$2"
shift 2
{
for hdr in $headers; do
print_include $hdr
done
echo "#include <stdint.h>"
for func in $funcs; do
echo "long check_$func(void) { return (long) $func; }"
done
echo "int main(void) { int ret = 0;"
for func in $funcs; do
echo " ret |= ((intptr_t)check_$func) & 0xFFFF;"
done
echo "return ret; }"
} | test_host_ld "$@" && append host_extralibs "$@"
}
cp_if_changed(){
cmp -s "$1" "$2" && { test "$quiet" != "yes" && echo "$2 is unchanged"; } && return
mkdir -p "$(dirname $2)"
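
To see what `check_host_lib` pipes into `test_host_ld`, the generated probe can be printed on its own. This is a rough sketch: `gen_host_probe` is a hypothetical helper, and it emits a plain `#include <...>` line where the real function calls configure's `print_include`, which is not shown in this hunk.

```shell
#!/bin/sh
# Hypothetical helper mirroring the program generated by check_host_lib:
# $1 is a space-separated header list, $2 a space-separated function list.
gen_host_probe() {
    for hdr in $1; do
        echo "#include <$hdr>"    # assumption: stands in for print_include
    done
    echo "#include <stdint.h>"
    for func in $2; do
        echo "long check_$func(void) { return (long) $func; }"
    done
    echo "int main(void) { int ret = 0;"
    for func in $2; do
        echo "    ret |= ((intptr_t)check_$func) & 0xFFFF;"
    done
    echo "return ret; }"
}

# Probe equivalent to "check_host_lib math.h sin -lm".
gen_host_probe "math.h" "sin"
```

Because the probe takes the address of each function, the link step can only succeed if the symbol resolves, which is what makes the check fail when the library is missing.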
@@ -2299,6 +2344,8 @@ ARCH_EXT_LIST_ARM="
arm_crc
dotprod
i8mm
pmull
eor3
neon
vfp
vfpv3
@@ -2450,6 +2497,8 @@ HEADERS_LIST="
valgrind_valgrind_h
windows_h
winsock2_h
spirv_headers_spirv_h
spirv_unified1_spirv_h
"
INTRINSICS_LIST="
@@ -2575,6 +2624,8 @@ TOOLCHAIN_FEATURES="
as_archext_crc_directive
as_archext_dotprod_directive
as_archext_i8mm_directive
as_archext_sha3_directive
as_archext_aes_directive
as_archext_sve_directive
as_archext_sve2_directive
as_archext_sme_directive
@@ -2666,7 +2717,6 @@ HAVE_LIST="
gzip
ioctl_posix
libdrm_getfb2
makeinfo
makeinfo_html
opencl_d3d11
opencl_drm_arm
@@ -2696,7 +2746,9 @@ CONFIG_EXTRA="
cabac
cbs
cbs_apv
cbs_apv_lavf
cbs_av1
cbs_av1_lavf
cbs_h264
cbs_h265
cbs_h266
@@ -2848,6 +2900,7 @@ CMDLINE_SET="
cxx
dep_cc
doxygen
makeinfo
env
extra_version
gas
@@ -2918,6 +2971,8 @@ setend_deps="arm"
arm_crc_deps="aarch64"
dotprod_deps="aarch64 neon"
i8mm_deps="aarch64 neon"
pmull_deps="aarch64 neon"
eor3_deps="aarch64 neon"
sve_deps="aarch64 neon"
sve2_deps="aarch64 neon sve"
sme_deps="aarch64 neon sve sve2"
@@ -3233,6 +3288,7 @@ nuv_decoder_select="idctdsp"
opus_decoder_deps="swresample"
opus_encoder_select="audio_frame_queue"
pdv_decoder_select="inflate_wrapper"
pdv_encoder_select="deflate_wrapper"
png_decoder_select="inflate_wrapper"
png_encoder_select="deflate_wrapper llvidencdsp"
prores_decoder_select="blockdsp idctdsp"
@@ -3522,6 +3578,8 @@ scale_cuda_filter_deps="ffnvcodec"
scale_cuda_filter_deps_any="cuda_nvcc cuda_llvm"
thumbnail_cuda_filter_deps="ffnvcodec"
thumbnail_cuda_filter_deps_any="cuda_nvcc cuda_llvm"
transpose_cuda_filter_deps="ffnvcodec"
transpose_cuda_filter_deps_any="cuda_nvcc cuda_llvm"
transpose_npp_filter_deps="ffnvcodec libnpp"
overlay_cuda_filter_deps="ffnvcodec"
overlay_cuda_filter_deps_any="cuda_nvcc cuda_llvm"
@@ -3693,6 +3751,7 @@ vvc_parser_select="cbs_h266"
# bitstream_filters
aac_adtstoasc_bsf_select="adts_header mpeg4audio"
ahx_to_mp2_bsf_deps="lgpl_gpl"
apv_metadata_bsf_select="cbs_apv"
av1_frame_merge_bsf_select="cbs_av1"
av1_frame_split_bsf_select="cbs_av1"
av1_metadata_bsf_select="cbs_av1"
@@ -3786,6 +3845,7 @@ libkvazaar_encoder_deps="libkvazaar"
liblc3_decoder_deps="liblc3"
liblc3_encoder_deps="liblc3"
liblc3_encoder_select="audio_frame_queue"
liblcevc_dec_select="cbs_lcevc"
libmodplug_demuxer_deps="libmodplug"
libmp3lame_encoder_deps="libmp3lame"
libmp3lame_encoder_select="audio_frame_queue mpegaudioheader"
@@ -3906,7 +3966,7 @@ mlp_demuxer_select="mlp_parser"
mmf_muxer_select="riffenc"
mov_demuxer_select="iso_media riffdec"
mov_demuxer_suggest="iamfdec zlib"
mov_muxer_select="iso_media iso_writer riffenc rtpenc_chain vp9_superframe_bsf aac_adtstoasc_bsf ac3_parser"
mov_muxer_select="cbs_apv_lavf cbs_av1_lavf iso_media iso_writer riffenc rtpenc_chain vp9_superframe_bsf aac_adtstoasc_bsf ac3_parser"
mov_muxer_suggest="iamfenc"
mp3_demuxer_select="mpegaudio_parser"
mp3_muxer_select="mpegaudioheader"
@@ -4091,15 +4151,15 @@ avgblur_vulkan_filter_deps="vulkan spirv_compiler"
azmq_filter_deps="libzmq"
blackdetect_vulkan_filter_deps="vulkan spirv_library"
blackframe_filter_deps="gpl"
blend_vulkan_filter_deps="vulkan spirv_library"
blend_vulkan_filter_deps="vulkan spirv_compiler"
boxblur_filter_deps="gpl"
boxblur_opencl_filter_deps="opencl gpl"
bs2b_filter_deps="libbs2b"
bwdif_cuda_filter_deps="ffnvcodec"
bwdif_cuda_filter_deps_any="cuda_nvcc cuda_llvm"
bwdif_vulkan_filter_deps="vulkan spirv_compiler"
chromaber_vulkan_filter_deps="vulkan spirv_library"
color_vulkan_filter_deps="vulkan spirv_library"
chromaber_vulkan_filter_deps="vulkan spirv_compiler"
color_vulkan_filter_deps="vulkan spirv_compiler"
colorkey_opencl_filter_deps="opencl"
colormatrix_filter_deps="gpl"
convolution_opencl_filter_deps="opencl"
@@ -4128,7 +4188,7 @@ elbg_filter_deps="avcodec"
eq_filter_deps="gpl"
erosion_opencl_filter_deps="opencl"
find_rect_filter_deps="avcodec avformat gpl"
flip_vulkan_filter_deps="vulkan spirv_library"
flip_vulkan_filter_deps="vulkan spirv_compiler"
flite_filter_deps="libflite threads"
framerate_filter_select="scene_sad"
freezedetect_filter_select="scene_sad"
@@ -4137,15 +4197,15 @@ frei0r_filter_deps="frei0r"
frei0r_src_filter_deps="frei0r"
fspp_filter_deps="gpl"
fsync_filter_deps="avformat"
gblur_vulkan_filter_deps="vulkan spirv_library"
hflip_vulkan_filter_deps="vulkan spirv_library"
gblur_vulkan_filter_deps="vulkan spirv_compiler"
hflip_vulkan_filter_deps="vulkan spirv_compiler"
histeq_filter_deps="gpl"
hqdn3d_filter_deps="gpl"
iccdetect_filter_deps="lcms2"
iccgen_filter_deps="lcms2"
identity_filter_select="scene_sad"
interlace_filter_deps="gpl"
interlace_vulkan_filter_deps="vulkan spirv_library"
interlace_vulkan_filter_deps="vulkan spirv_compiler"
kerndeint_filter_deps="gpl"
ladspa_filter_deps="ladspa libdl"
lcevc_filter_deps="liblcevc_dec"
@@ -4176,7 +4236,7 @@ overlay_opencl_filter_deps="opencl"
overlay_qsv_filter_deps="libmfx"
overlay_qsv_filter_select="qsvvpp"
overlay_vaapi_filter_deps="vaapi VAProcPipelineCaps_blend_flags"
overlay_vulkan_filter_deps="vulkan spirv_library"
overlay_vulkan_filter_deps="vulkan spirv_compiler"
owdenoise_filter_deps="gpl"
pad_opencl_filter_deps="opencl"
pan_filter_deps="swresample"
@@ -4197,10 +4257,11 @@ scale2ref_filter_deps="swscale"
scale_filter_deps="swscale"
sr_amf_filter_deps="amf"
vpp_amf_filter_deps="amf"
frc_amf_filter_deps="amf windows_h"
scale_qsv_filter_deps="libmfx"
scale_qsv_filter_select="qsvvpp"
scdet_filter_select="scene_sad"
scdet_vulkan_filter_deps="vulkan spirv_library"
scdet_vulkan_filter_deps="vulkan spirv_compiler"
select_filter_select="scene_sad"
sharpness_vaapi_filter_deps="vaapi"
showcqt_filter_deps="avformat swscale"
@@ -4226,11 +4287,12 @@ tonemap_opencl_filter_deps="opencl const_nan"
transpose_opencl_filter_deps="opencl"
transpose_vaapi_filter_deps="vaapi VAProcPipelineCaps_rotation_flags"
transpose_vt_filter_deps="videotoolbox VTPixelRotationSessionCreate"
transpose_vulkan_filter_deps="vulkan spirv_library"
transpose_vulkan_filter_deps="vulkan spirv_compiler"
unsharp_opencl_filter_deps="opencl"
uspp_filter_deps="gpl avcodec"
v360_vulkan_filter_deps="vulkan spirv_compiler"
vaguedenoiser_filter_deps="gpl"
vflip_vulkan_filter_deps="vulkan spirv_library"
vflip_vulkan_filter_deps="vulkan spirv_compiler"
vidstabdetect_filter_deps="libvidstab"
vidstabtransform_filter_deps="libvidstab"
libvmaf_filter_deps="libvmaf"
@@ -4244,7 +4306,7 @@ scale_vulkan_filter_deps="vulkan spirv_compiler spirv_library"
vpp_qsv_filter_deps="libmfx"
vpp_qsv_filter_select="qsvvpp"
xfade_opencl_filter_deps="opencl"
xfade_vulkan_filter_deps="vulkan spirv_library"
xfade_vulkan_filter_deps="vulkan spirv_compiler"
yadif_cuda_filter_deps="ffnvcodec"
yadif_cuda_filter_deps_any="cuda_nvcc cuda_llvm"
yadif_videotoolbox_filter_deps="metal corevideo videotoolbox"
@@ -4328,7 +4390,7 @@ podpages_deps="perl"
manpages_deps="perl pod2man"
htmlpages_deps="perl"
htmlpages_deps_any="makeinfo_html texi2html"
txtpages_deps="perl makeinfo"
txtpages_deps="perl makeinfo_command"
doc_deps_any="manpages htmlpages podpages txtpages"
# default parameters
@@ -4352,6 +4414,7 @@ stdcxx_default="c++17"
cxx_default="g++"
host_cc_default="gcc"
doxygen_default="doxygen"
makeinfo_default="makeinfo"
install="install"
ln_s_default="ln -s -f"
glslc_default="glslc"
@@ -4461,7 +4524,7 @@ GLSLC_O='-o $@'
NVCC_C='-c'
NVCC_O='-o $@'
host_extralibs='-lm'
host_extralibs=
host_cflags_filter=echo
host_ldflags_filter=echo
@@ -4478,7 +4541,7 @@ mkdir -p ffbuild
if test -f configure; then
source_path=.
elif test -f src/configure; then
source_path=src
source_path=./src
else
source_path=$(cd $(dirname "$0"); pwd)
case "$source_path" in
@@ -4866,6 +4929,16 @@ case "$toolchain" in
cc_default="clang"
cxx_default="clang++"
;;
llvm|llvm-*)
cc_default="clang"
cxx_default="clang++"
ar_default="llvm-ar"
nm_default="llvm-nm -g"
ranlib_default="llvm-ranlib"
strip_default="llvm-strip"
windres_default="llvm-windres"
test "$toolchain" != "llvm" && add_sanitizers "${toolchain#llvm-}"
;;
gcc-*)
add_sanitizers "${toolchain#gcc-}"
cc_default="gcc"
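
The new `llvm|llvm-*` branch relies on `${toolchain#llvm-}` to obtain the sanitizer suffix. A minimal sketch of just that expansion follows; `llvm_sanitizer_suffix` is a hypothetical helper, and how `add_sanitizers` goes on to split the suffix is not shown in this hunk and is not modelled here.

```shell
#!/bin/sh
# Hypothetical helper: print the sanitizer suffix of an llvm toolchain name,
# or nothing for plain "llvm".
llvm_sanitizer_suffix() {
    toolchain=$1
    # "${toolchain#llvm-}" removes a leading "llvm-" from the value.
    test "$toolchain" != "llvm" && echo "${toolchain#llvm-}"
    return 0
}

llvm_sanitizer_suffix llvm-asan-ubsan   # prints asan-ubsan
llvm_sanitizer_suffix llvm              # prints nothing
```

The explicit `!= "llvm"` guard matters because the prefix removal leaves the bare name `llvm` unchanged, which would otherwise be handed on as if it were a sanitizer name.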
@@ -4990,7 +5063,7 @@ if enabled cuda_nvcc; then
fi
set_default arch cc cxx doxygen pkg_config ranlib strip sysinclude \
target_exec x86asmexe glslc metalcc metallib stdc stdcxx
target_exec x86asmexe glslc metalcc metallib stdc stdcxx makeinfo
enabled cross_compile || host_cc_default=$cc
set_default host_cc
@@ -5007,7 +5080,7 @@ elif is_in -static $cc $LDFLAGS && ! is_in --static $pkg_config $pkg_config_flag
Note: When building a static binary, add --pkg-config-flags=\"--static\"."
fi
if test $doxygen != $doxygen_default && \
if test "$doxygen" != "$doxygen_default" && \
! $doxygen --version >/dev/null 2>&1; then
warn "Specified doxygen \"$doxygen\" not found, API documentation will fail to build."
fi
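
The quoting change in this hunk matters when `$doxygen` is empty or contains a space: unquoted, such a value changes the number of words passed to `test`, so the comparison typically ends in a usage error instead of a result. A small sketch of the quoted form, using a hypothetical helper name and illustrative values:

```shell
#!/bin/sh
# Hypothetical helper using the quoted comparison from the new line.
differs_from_default() {
    test "$1" != "$2"
}

differs_from_default ""           "doxygen" && echo "empty value: differs"
differs_from_default "my doxygen" "doxygen" && echo "value with space: differs"
differs_from_default "doxygen"    "doxygen" || echo "default value: same"
```

With quotes, each value reaches `test` as exactly one argument, so all three cases are ordinary string comparisons.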
@@ -6307,7 +6380,7 @@ link_name=$(mktemp -u $TMPDIR/name_XXXXXXXX)
mkdir "$link_dest"
$ln_s "$link_dest" "$link_name"
touch "$link_dest/test_file"
if [ "$source_path" != "." ] && [ "$source_path" != "src" ] && ([ ! -d src ] || [ -L src ]) && [ -e "$link_name/test_file" ]; then
if [ "$source_path" != "." ] && [ "$source_path" != "./src" ] && ([ ! -d src ] || [ -L src ]) && [ -e "$link_name/test_file" ]; then
# create link to source path
[ -e src ] && rm src
$ln_s "$source_path" src
@@ -6561,8 +6634,10 @@ if enabled aarch64; then
# internal assembler in clang 3.3 does not support this instruction
enabled neon && check_insn neon 'ext v0.8B, v0.8B, v1.8B, #1'
archext_list="arm_crc dotprod i8mm sve sve2 sme sme_i16i64 sme2"
archext_list="arm_crc dotprod i8mm pmull eor3 sve sve2 sme sme_i16i64 sme2"
enabled arm_crc && check_archext_name_insn arm_crc crc 'crc32x w0, w0, x0'
enabled pmull && check_archext_name_insn pmull aes 'pmull v0.1q, v0.1d, v0.1d'
enabled eor3 && check_archext_name_insn eor3 sha3 'eor3 v0.16b, v1.16b, v2.16b, v3.16b'
enabled dotprod && check_archext_insn dotprod 'udot v0.4s, v0.16b, v0.16b'
enabled i8mm && check_archext_insn i8mm 'usdot v0.4s, v0.16b, v0.16b'
enabled sve && check_archext_insn sve 'whilelt p0.s, x0, x1'
@@ -6711,7 +6786,7 @@ elif enabled ppc; then
fi
if enabled vsx; then
check_cflags -mvsx &&
check_cflags -mvsx
check_cc vsx altivec.h "int v[4] = { 0 };
vector signed int v1 = vec_vsx_ld(0, v);"
fi
@@ -6769,7 +6844,6 @@ EOF
x86asmexe_probe=$1
if test_cmd $x86asmexe_probe -v; then
x86asmexe=$x86asmexe_probe
x86asm_debug="-g -F dwarf"
X86ASM_DEPFLAGS='-MD $(@:.o=.d)'
fi
check_x86asm x86asm "movbe ecx, [5]"
@@ -6783,9 +6857,7 @@ EOF
disabled x86asm && die "nasm not found or too old. Please install/update nasm or use --disable-x86asm for a build without hand-optimized assembly."
X86ASMFLAGS="-f $objformat"
test -n "$extern_prefix" && append X86ASMFLAGS "-DPREFIX"
case "$objformat" in
elf*) enabled debug && append X86ASMFLAGS $x86asm_debug ;;
esac
enabled debug && append X86ASMFLAGS "-g"
enabled avx512 && check_x86asm avx512_external "vmovdqa32 [eax]{k1}{z}, zmm0"
enabled avx512icl && check_x86asm avx512icl_external "vpdpwssds zmm31{k1}{z}, zmm29, zmm28"
@@ -7205,6 +7277,7 @@ enabled zlib_gzip && enabled gzip || disable resource_compression
check_lib libdl dlfcn.h "dlopen dlsym" || check_lib libdl dlfcn.h "dlopen dlsym" -ldl
check_lib libm math.h sin -lm
check_host_lib math.h sin -lm
atan2f_args=2
copysign_args=2
@@ -7559,10 +7632,14 @@ enabled schannel &&
enabled schannel && check_cc dtls_protocol "windows.h security.h schnlsp.h" "int i = SECPKG_ATTR_DTLS_MTU;" -DSECURITY_WIN32
makeinfo --version > /dev/null 2>&1 && enable makeinfo || disable makeinfo
enabled makeinfo \
&& [ 0$(makeinfo --version | grep "texinfo" | sed 's/.*texinfo[^0-9]*\([0-9]*\)\..*/\1/') -ge 5 ] \
$makeinfo --version > /dev/null 2>&1 && enable makeinfo_command || disable makeinfo_command makeinfo_html
if enabled makeinfo_command; then
[ 0$($makeinfo --version | grep "texinfo" | sed 's/.*texinfo[^0-9]*\([0-9]*\)\..*/\1/') -ge 5 ] \
&& enable makeinfo_html || disable makeinfo_html
elif test "$makeinfo" != "$makeinfo_default" ; then
warn "Specified makeinfo \"$makeinfo\" not found."
fi
disabled makeinfo_html && texi2html --help 2> /dev/null | grep -q 'init-file' && enable texi2html || disable texi2html
perl -v > /dev/null 2>&1 && enable perl || disable perl
pod2man --help > /dev/null 2>&1 && enable pod2man || disable pod2man
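
The version test in the makeinfo hunk combines a `sed` extraction with a numeric comparison that has a leading `0`. The sketch below replays that logic on fixed strings instead of running `$makeinfo`. The helper names are hypothetical and the banner strings are illustrative, not captured tool output.

```shell
#!/bin/sh
# Hypothetical helper: pull the texinfo major version out of a
# "makeinfo --version" style banner, using the sed expression from the hunk.
texinfo_major() {
    echo "$1" | grep "texinfo" | sed 's/.*texinfo[^0-9]*\([0-9]*\)\..*/\1/'
}

# The leading 0 keeps the comparison well-formed when nothing was extracted.
supports_html() {
    [ 0$(texinfo_major "$1") -ge 5 ]
}

supports_html "makeinfo (GNU texinfo) 7.1"  && echo "7.1: makeinfo_html"
supports_html "makeinfo (GNU texinfo) 4.13" || echo "4.13: no makeinfo_html"
supports_html "not the right tool"          || echo "no banner: no makeinfo_html"
```

When `grep` finds no `texinfo` line, the substitution is empty and the test degrades to `[ 0 -ge 5 ]`, which is simply false rather than a syntax error.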
@@ -7792,6 +7869,12 @@ else
disable libglslang libshaderc spirv_library spirv_compiler
fi
if enabled vulkan; then
check_headers spirv-headers/spirv.h ||
check_headers spirv/unified1/spirv.h ||
{ requested vulkan && warn "spirv-headers not found, swscale SPIR-V backend unavailable"; }
fi
if enabled x86; then
case $target_os in
freebsd|mingw32*|mingw64*|win32|win64|linux|cygwin*)
@@ -7832,7 +7915,7 @@ fi
enabled amf &&
check_cpp_condition amf "AMF/core/Version.h" \
"(AMF_VERSION_MAJOR << 48 | AMF_VERSION_MINOR << 32 | AMF_VERSION_RELEASE << 16 | AMF_VERSION_BUILD_NUM) >= 0x0001000400240000"
"(AMF_VERSION_MAJOR << 48 | AMF_VERSION_MINOR << 32 | AMF_VERSION_RELEASE << 16 | AMF_VERSION_BUILD_NUM) >= 0x1000500000000"
# Funny iconv installations are not unusual, so check it after all flags have been set
if enabled libc_iconv; then
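The AMF check above compares a packed 64-bit version number against a threshold. A minimal C sketch of that packing, assuming each of the minor, release and build fields fits in 16 bits (the helper names `amf_pack_version` and `amf_version_ok` are hypothetical and not part of AMF or configure):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: packs the four version fields with the same shifts
 * that the configure expression uses. */
uint64_t amf_pack_version(uint64_t major, uint64_t minor,
                          uint64_t release, uint64_t build)
{
    /* Assumption: the lower three fields each fit in 16 bits. */
    assert(minor <= 0xFFFF && release <= 0xFFFF && build <= 0xFFFF);
    return major << 48 | minor << 32 | release << 16 | build;
}

/* Returns 1 if the packed version meets the new 0x1000500000000 threshold. */
int amf_version_ok(uint64_t packed)
{
    return packed >= UINT64_C(0x1000500000000);
}
```

Read this way, the old threshold 0x0001000400240000 decodes to 1.4.36.0 and the new one to 1.5.0.0.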
@@ -7881,6 +7964,15 @@ check_warning -Wempty-body
# This roughly matches the default thread stack size on Musl, which is 128 KiB,
# leaving some headroom for caller frames.
check_warning -Wstack-usage=122880
# GCC accepts warning option level 5 here to warn about all fallthroughs
# that are not explicitly marked with the appropriate attribute
if enabled gcc; then
check_warning -Wimplicit-fallthrough=5
else
check_warning -Wimplicit-fallthrough
fi
check_c_warning -Wmissing-prototypes
check_c_warning -Wstrict-prototypes
check_c_warning -Wunterminated-string-initialization
@@ -8004,11 +8096,6 @@ fi
enabled ftrapv && check_cflags -ftrapv
test_cc -mno-red-zone <<EOF && noredzone_flags="-mno-red-zone"
int x;
EOF
if enabled icc; then
# Just warnings, no remarks
check_allcflags -w1
@@ -8400,6 +8487,8 @@ if enabled aarch64; then
echo "NEON enabled ${neon-no}"
echo "DOTPROD enabled ${dotprod-no}"
echo "I8MM enabled ${i8mm-no}"
echo "PMULL enabled ${pmull-no}"
echo "EOR3 enabled ${eor3-no}"
echo "SVE enabled ${sve-no}"
echo "SVE2 enabled ${sve2-no}"
echo "SME enabled ${sme-no}"
@@ -8449,7 +8538,7 @@ echo "safe bitstream reader ${safe_bitstream_reader-no}"
echo "texi2html enabled ${texi2html-no}"
echo "perl enabled ${perl-no}"
echo "pod2man enabled ${pod2man-no}"
echo "makeinfo enabled ${makeinfo-no}"
echo "makeinfo enabled ${makeinfo_command-no}"
echo "makeinfo supports HTML ${makeinfo_html-no}"
echo "experimental features ${unstable-no}"
echo "xmllint enabled ${xmllint-no}"
@@ -8598,6 +8687,7 @@ LD_PATH=$LD_PATH
DLLTOOL=$dlltool
WINDRES=$windres
DOXYGEN=$doxygen
MAKEINFO=$makeinfo
LDFLAGS=$LDFLAGS
LDEXEFLAGS=$LDEXEFLAGS
LDSOFLAGS=$LDSOFLAGS
@@ -8666,7 +8756,6 @@ SLIB_INSTALL_EXTRA_LIB=${SLIB_INSTALL_EXTRA_LIB}
SLIB_INSTALL_EXTRA_SHLIB=${SLIB_INSTALL_EXTRA_SHLIB}
VERSION_SCRIPT_POSTPROCESS_CMD=${VERSION_SCRIPT_POSTPROCESS_CMD}
SAMPLES:=${samples:-\$(FATE_SAMPLES)}
NOREDZONE_FLAGS=$noredzone_flags
LIBFUZZER_PATH=$libfuzzer_path
IGNORE_TESTS=$ignore_tests
VERSION_TRACKING=$version_tracking
+37
View File
@@ -2,6 +2,43 @@ The last version increases of all libraries were on 2025-03-28
API changes, most recent first:
2026-05-16 - xxxxxxxxxxx - lavf 62.16.100 - avformat.h
Add AVFMT_FIXED_FRAMESIZE.
2026-05-16 - xxxxxxxxxxx - lavc 62.33.100 - avcodec.h
Add AV_CODEC_FLAG2_FIXED_FRAME_SIZE.
2026-05-12 - xxxxxxxxxx - lavu 60.31.100 - frame.h
Add IAMF frame side data types to enum AVFrameSideDataType:
- AV_FRAME_DATA_IAMF_MIX_GAIN_PARAM
- AV_FRAME_DATA_IAMF_DEMIXING_INFO_PARAM
- AV_FRAME_DATA_IAMF_RECON_GAIN_INFO_PARAM
2026-05-05 - xxxxxxxxxxx - lavf 62.15.100 - avformat.h
Add av_program_copy().
2026-05-05 - xxxxxxxxxxx - lavf 62.14.100 - avformat.h
Add av_program_add_stream_index2().
2026-04-14 - 7faa6ee2aa - lavc 62.30.100 - packet.h
Add AV_PKT_DATA_DYNAMIC_HDR_SMPTE_2094_APP5 side data type.
2026-04-09 - 6ba6db4f19 - lavu 60.30.100 - hdr_dynamic_metadata.h frame.h
Add AVDynamicHDRSmpte2094App5 struct and functions.
Add AV_FRAME_DATA_DYNAMIC_HDR_SMPTE_2094_APP5 side data type.
2026-03-14 - xxxxxxxxxx - lavu 60.29.100 - hwcontext_vulkan.h
Deprecate AVVulkanDeviceContext.lock_queue and
AVVulkanDeviceContext.unlock_queue without replacement.
2026-03-12 - xxxxxxxxxx - lsws 9.7.100 - swscale.h
Add enum SwsScaler, and SwsContext.scaler/scaler_sub.
2026-03-11 - 910000fe59d - lavu 60.28.100 - hwcontext_amf.h
Add av_amf_display_mastering_meta_to_hdrmeta(), av_amf_light_metadata_to_hdrmeta().
Add av_amf_extract_hdr_metadata(), av_amf_attach_hdr_metadata().
Add av_amf_get_color_profile().
2026-03-07 - c23d56b173a - lavc 62.26.100 - codec_desc.h
Add AV_CODEC_PROP_ENHANCEMENT.
+40
View File
@@ -0,0 +1,40 @@
This document is a work in progress.
*What is CVSS*
The Common Vulnerability Scoring System (CVSS) is an open, industry-standard framework used to measure and communicate the severity of software vulnerabilities, ranging from 0.0 to 10.0.
*Why we need this Document*
It is important that FFmpeg CVEs have consistent and correct CVSS scores, not only so that one can recognize the severity of an issue at first glance,
but also because these numbers form the basis of rewards paid in bug bounty systems. Inconsistent CVSS scores could lead to unfair payouts.
*What is this Document*
Prior to 2026, FFmpeg had no guideline about CVSS.
This document describes how to select the CVSS score for an FFmpeg-related CVE. It currently only covers the Base Score.
*What is the CVSS Base Score*
AV Attack Vector (Network, Adjacent, Local, Physical)
AC Attack Complexity (Low, High)
PR Privileges Required (None, Low, High)
UI User Interaction (None, Required)
S Scope (Unchanged, Changed)
C Confidentiality (None, Low, High)
I Integrity (None, Low, High)
A Availability (None, Low, High)
*Things people have set incorrectly*
Below are general guidelines and in specific cases other things may apply.
Attack Vector.
Quote from https://www.first.org/cvss/v3.1/user-guide
"Specifically, analysts should only score for Network or Adjacent when a vulnerability is bound to the network stack.
Vulnerabilities which require user interaction to download or receive malicious content (which could also be delivered locally, e.g., via USB drives) should be scored as Local."
Availability.
FFmpeg Crashes -> AVAILABILITY IMPACT: Low
FFmpeg is frequently used as a short-lived, single-run process instead of a continuously running service that handles ongoing streams of user input. In that usage model, a crash usually causes only limited disruption.
User Interaction
Please consider whether an attacker can actually set the parameters required for an attack.
In general, arbitrary filter parameters cannot be set by an attacker and require the user/account owner/admin to set them.
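To make the effect of these choices concrete, here is a minimal sketch of the Base Score arithmetic. The formula and the metric weights are taken from the CVSS v3.1 specification, not from this document, and only a subset of the weights is listed. The function and macro names are illustrative assumptions. Note that the Privileges Required weight depends on Scope; the caller is expected to pass the right one.

```c
#include <assert.h>

/* Metric weights from the CVSS v3.1 specification (subset only). */
#define CVSS_AV_NETWORK  0.85
#define CVSS_AV_LOCAL    0.55
#define CVSS_AC_LOW      0.77
#define CVSS_PR_NONE     0.85
#define CVSS_UI_NONE     0.85
#define CVSS_UI_REQUIRED 0.62
#define CVSS_CIA_NONE    0.0
#define CVSS_CIA_LOW     0.22
#define CVSS_CIA_HIGH    0.56

/* "Roundup" as defined by the specification: round up to one decimal,
 * working on integers to avoid floating point artifacts. */
double cvss_roundup(double x)
{
    long i = (long)(x * 100000.0 + 0.5);
    if (i % 10000 == 0)
        return i / 100000.0;
    return (i / 10000 + 1) / 10.0;
}

/* Illustrative helper: computes the Base Score from numeric weights. */
double cvss31_base(double av, double ac, double pr, double ui,
                   int scope_changed, double c, double i, double a)
{
    assert(c >= 0 && c <= 1 && i >= 0 && i <= 1 && a >= 0 && a <= 1);

    double iss = 1.0 - (1.0 - c) * (1.0 - i) * (1.0 - a);
    double impact;
    if (scope_changed) {
        double t = iss - 0.02, p = 1.0;
        for (int n = 0; n < 15; n++)   /* (ISS - 0.02)^15 */
            p *= t;
        impact = 7.52 * (iss - 0.029) - 3.25 * p;
    } else {
        impact = 6.42 * iss;
    }
    if (impact <= 0)
        return 0.0;

    double sum = impact + 8.22 * av * ac * pr * ui;
    if (scope_changed)
        sum *= 1.08;
    return cvss_roundup(sum > 10.0 ? 10.0 : sum);
}
```

Under these assumptions, a crash triggered by a locally opened file (AV:L/AC:L/PR:N/UI:R/S:U/C:N/I:N/A:L) scores 3.3. A vector with Network, no user interaction and High impact on all three of C, I and A scores 9.8, which shows how much the choices discussed above matter.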
+3 -3
View File
@@ -54,7 +54,7 @@ TEXIDEP = perl $(SRC_PATH)/doc/texidep.pl $(SRC_PATH) $< $@ >$(@:%=%.d)
doc/%.txt: TAG = TXT
doc/%.txt: doc/%.texi
$(Q)$(TEXIDEP)
$(M)makeinfo --force --no-headers -o $@ $< 2>/dev/null
$(M)$(MAKEINFO) --force --no-headers -o $@ $< 2>/dev/null
GENTEXI = format codec
GENTEXI := $(GENTEXI:%=doc/avoptions_%.texi)
@@ -69,11 +69,11 @@ doc/%-all.html: TAG = HTML
ifdef HAVE_MAKEINFO_HTML
doc/%.html: doc/%.texi $(SRC_PATH)/doc/t2h.pm $(GENTEXI)
$(Q)$(TEXIDEP)
$(M)makeinfo --html -I doc --no-split -D config-not-all --init-file=$(SRC_PATH)/doc/t2h.pm --output $@ $<
$(M)$(MAKEINFO) --html -I doc --no-split -D config-not-all --init-file=$(SRC_PATH)/doc/t2h.pm --output $@ $<
doc/%-all.html: doc/%.texi $(SRC_PATH)/doc/t2h.pm $(GENTEXI)
$(Q)$(TEXIDEP)
$(M)makeinfo --html -I doc --no-split -D config-all --init-file=$(SRC_PATH)/doc/t2h.pm --output $@ $<
$(M)$(MAKEINFO) --html -I doc --no-split -D config-all --init-file=$(SRC_PATH)/doc/t2h.pm --output $@ $<
else
doc/%.html: doc/%.texi $(SRC_PATH)/doc/t2h.init $(GENTEXI)
$(Q)$(TEXIDEP)
-5
View File
@@ -7,11 +7,6 @@ V
Disable the default terse mode, the full command issued by make and its
output will be shown on the screen.
DBG
Preprocess x86 external assembler files to a .dbg.asm file in the object
directory, which then gets compiled. Helps in developing those assembler
files.
DESTDIR
Destination directory for the install targets, useful to prepare packages
or install FFmpeg in cross-environments.
+2
View File
@@ -646,6 +646,8 @@ Do not skip samples and export skip information as frame side data.
Do not reset ASS ReadOrder field on flush.
@item icc_profiles
Generate/parse embedded ICC profiles from/to colorimetry tags.
@item fixed_frame_size
Force audio encoders to use a fixed frame size.
@end table
@item export_side_data @var{flags} (@emph{decoding/encoding,audio,video,subtitles})
+28 -4
View File
@@ -225,6 +225,25 @@ AVStream *stream;
AVStream* stream;
@end example
@item
When sensible, prefer a narrow variable scope, especially in for loops:
@example c, good
// Good
for (unsigned i = 0; i < submix->nb_elements; i++) @{
// Do something...
@}
@end example
@example c, bad
// Bad style
unsigned i;
//...
for (i = 0; i < submix->nb_elements; i++) @{
// Do something...
@}
@end example
@end itemize
If you work on a file that does not follow these guidelines consistently,
@@ -368,7 +387,7 @@ symbols. If in doubt, just avoid names starting with @code{_} altogether.
Casts should be used only when necessary. Unneeded parentheses
should also be avoided if they don't make the code easier to understand.
@item
Where applicable, SI units shall be used. For example timeouts should use seconds as the fundamental unit not micro seconds.
Where applicable, SI units shall be used. For example timeouts should use seconds as the fundamental unit not microseconds.
That means a bare value like @samp{1.0} must mean 1 second, @samp{50m} means 50 milliseconds. For weight, gram shall be used.
@end itemize
@@ -471,13 +490,18 @@ ask/discuss it on the developer mailing list.
@subheading Cosmetic changes should be kept in separate patches.
We refuse source indentation and other cosmetic changes if they are mixed
with functional changes, such commits will be rejected and removed. Every
with functional changes, such commits will be rejected and removed. However,
indentation changes that can be ignored by @code{git diff --ignore-all-space}
(e.g. changes in whitespace amount, leading/trailing spaces) may be mixed with
functional changes, since reviewers can use @code{git diff -w} or
@code{git log -p --ignore-all-space} to review only the functional parts of
the change. Forgejo's pull request interface also provides a
``Hide whitespace changes'' option for this purpose. Every
developer has his own indentation style, you should not change it. Of course
if you (re)write something, you can use your own style, even though we would
prefer if the indentation throughout FFmpeg was consistent (Many projects
force a given indentation style - we do not.). If you really need to make
indentation changes (try to avoid this), separate them strictly from real
changes.
non-whitespace cosmetic changes, separate them strictly from real changes.
NOTE: If you had to put if()@{ .. @} over a large (> 5 lines) chunk of code,
then either do NOT change the indentation of the inner part within (do not
+4
View File
@@ -1045,6 +1045,10 @@ Other values include 0 for mono and stereo, 1 for surround sound with masking
and LFE bandwidth optimizations, and 255 for independent streams with an
unspecified channel layout.
@item dtx (N.A.)
Allow discontinuous transmission when set to 1. The default value is 0
(disabled).
@item apply_phase_inv (N.A.) (requires libopus >= 1.2)
If set to 0, disables the use of phase inversion for intensity stereo,
improving the quality of mono downmixes, but slightly reducing normal stereo
+5 -1
View File
@@ -42,7 +42,11 @@ static void pgm_save(unsigned char *buf, int wrap, int xsize, int ysize,
FILE *f;
int i;
f = fopen(filename,"wb");
f = fopen(filename, "wb");
if (!f) {
fprintf(stderr, "Could not open %s\n", filename);
return;
}
fprintf(f, "P5\n%d %d\n%d\n", xsize, ysize, 255);
for (i = 0; i < ysize; i++)
fwrite(buf + i * wrap, 1, xsize, f);
+10 -5
View File
@@ -336,15 +336,17 @@ int main (int argc, char **argv)
if (video_stream) {
printf("Play the output video file with the command:\n"
"ffplay -f rawvideo -pix_fmt %s -video_size %dx%d %s\n",
"ffplay -f rawvideo -pixel_format %s -video_size %dx%d %s\n",
av_get_pix_fmt_name(pix_fmt), width, height,
video_dst_filename);
}
if (audio_stream) {
enum AVSampleFormat sfmt = audio_dec_ctx->sample_fmt;
int n_channels = audio_dec_ctx->ch_layout.nb_channels;
AVChannelLayout mono = AV_CHANNEL_LAYOUT_MONO;
AVChannelLayout *ch_layout = &audio_dec_ctx->ch_layout;
const char *fmt;
char buf[64];
if (av_sample_fmt_is_planar(sfmt)) {
const char *packed = av_get_sample_fmt_name(sfmt);
@@ -352,15 +354,18 @@ int main (int argc, char **argv)
"(%s). This example will output the first channel only.\n",
packed ? packed : "?");
sfmt = av_get_packed_sample_fmt(sfmt);
n_channels = 1;
ch_layout = &mono;
}
if ((ret = get_format_from_sample_fmt(&fmt, sfmt)) < 0)
goto end;
if ((ret = av_channel_layout_describe(ch_layout, buf, sizeof(buf))) < 0)
goto end;
printf("Play the output audio file with the command:\n"
"ffplay -f %s -ac %d -ar %d %s\n",
fmt, n_channels, audio_dec_ctx->sample_rate,
"ffplay -f %s -ch_layout %s -sample_rate %d %s\n",
fmt, buf, audio_dec_ctx->sample_rate,
audio_dst_filename);
}
+2 -2
View File
@@ -218,10 +218,10 @@ int main(int argc, char **argv)
samples = (uint16_t*)frame->data[0];
for (j = 0; j < c->frame_size; j++) {
samples[2*j] = (int)(sin(t) * 10000);
samples[c->ch_layout.nb_channels*j] = (int)(sin(t) * 10000);
for (k = 1; k < c->ch_layout.nb_channels; k++)
samples[2*j + k] = samples[2*j];
samples[c->ch_layout.nb_channels*j + k] = samples[c->ch_layout.nb_channels*j];
t += tincr;
}
encode(c, frame, pkt, f);
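The change above replaces the hard-coded stride of 2 with the actual channel count. A minimal sketch of the same indexing for interleaved 16-bit samples, independent of FFmpeg (the function name `fill_interleaved` and the `tone` input array are assumptions made for illustration):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes nb_samples values into an interleaved buffer, copying the value of
 * the first channel to every other channel of the same sample. */
void fill_interleaved(int16_t *samples, int nb_samples, int nb_channels,
                      const int16_t *tone)
{
    assert(nb_channels > 0);
    for (int j = 0; j < nb_samples; j++) {
        /* Sample j starts at nb_channels * j, not at 2 * j. */
        int16_t *frame = samples + (size_t)nb_channels * j;
        frame[0] = tone[j];
        for (int k = 1; k < nb_channels; k++)
            frame[k] = frame[0];
    }
}
```

With a fixed factor of 2, any layout other than stereo would either leave gaps (mono) or overwrite neighbouring samples (more than two channels).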
+6 -1
View File
@@ -132,8 +132,9 @@ static int decode_write(AVCodecContext *avctx, AVPacket *packet)
goto fail;
}
if ((ret = fwrite(buffer, 1, size, output_file)) < 0) {
if (fwrite(buffer, 1, size, output_file) != size) {
fprintf(stderr, "Failed to dump raw data.\n");
ret = -1;
goto fail;
}
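The old condition could never detect a failure, because fwrite() returns an unsigned item count rather than a negative error code. A minimal sketch of the corrected pattern as a standalone helper (the name `write_all` is hypothetical):

```c
#include <stdio.h>

/* Returns 0 when all `size` bytes were written, -1 on a short write.
 * fwrite() returns the number of items written as a size_t, so the only
 * reliable check is a comparison against the requested count. */
int write_all(const void *buf, size_t size, FILE *out)
{
    return fwrite(buf, 1, size, out) == size ? 0 : -1;
}
```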
@@ -232,6 +233,10 @@ int main(int argc, char *argv[])
/* open the file to dump raw data */
output_file = fopen(argv[3], "w+b");
if (!output_file) {
fprintf(stderr, "Cannot open output file '%s'\n", argv[3]);
return -1;
}
/* actual decoding and dump the raw data */
while (ret >= 0) {
+11 -3
View File
@@ -39,6 +39,7 @@
#include <libavutil/error.h>
#include <libavutil/hwcontext.h>
#include <libavutil/hwcontext_qsv.h>
#include <libavutil/imgutils.h>
#include <libavutil/mem.h>
static int get_format(AVCodecContext *avctx, const enum AVPixelFormat *pix_fmts)
@@ -88,9 +89,16 @@ static int decode_packet(AVCodecContext *decoder_ctx,
goto fail;
}
for (i = 0; i < FF_ARRAY_ELEMS(sw_frame->data) && sw_frame->data[i]; i++)
for (j = 0; j < (sw_frame->height >> (i > 0)); j++)
avio_write(output_ctx, sw_frame->data[i] + j * sw_frame->linesize[i], sw_frame->width);
for (i = 0; i < FF_ARRAY_ELEMS(sw_frame->data) && sw_frame->data[i]; i++) {
int h = sw_frame->height >> (i > 0);
int linesize = av_image_get_linesize(sw_frame->format, sw_frame->width, i);
if (linesize < 0) {
ret = linesize;
goto fail;
}
for (j = 0; j < h; j++)
avio_write(output_ctx, sw_frame->data[i] + j * sw_frame->linesize[i], linesize);
}
fail:
av_frame_unref(sw_frame);
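The rewritten loop distinguishes between the stride of a plane in memory (`linesize[i]`, which may include padding) and the number of meaningful bytes per row (now obtained from av_image_get_linesize() instead of assuming it equals the pixel width). The following FFmpeg-free sketch illustrates that distinction; `pack_plane` and its parameters are assumptions made for illustration, not part of the example program:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies `rows` rows of `row_bytes` meaningful bytes each from a plane whose
 * rows are `stride` bytes apart. Returns the number of bytes written. */
size_t pack_plane(uint8_t *dst, const uint8_t *src,
                  int stride, int row_bytes, int rows)
{
    assert(row_bytes >= 0 && row_bytes <= stride);
    size_t n = 0;
    for (int j = 0; j < rows; j++) {
        memcpy(dst + n, src + (size_t)j * stride, (size_t)row_bytes);
        n += (size_t)row_bytes;
    }
    return n;
}
```

The number of bytes per row only equals the width in pixels for 8-bit planes with one component per pixel, which is presumably why the example no longer passes `sw_frame->width` directly.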
+3 -1
View File
@@ -430,7 +430,9 @@ int main(int argc, char **argv)
end:
avformat_close_input(&ifmt_ctx);
avformat_close_input(&ofmt_ctx);
if (ofmt_ctx && !(ofmt_ctx->oformat->flags & AVFMT_NOFILE))
avio_closep(&ofmt_ctx->pb);
avformat_free_context(ofmt_ctx);
avcodec_free_context(&decoder_ctx);
avcodec_free_context(&encoder_ctx);
av_buffer_unref(&hw_device_ctx);
+1 -1
View File
@@ -184,7 +184,7 @@ end:
avformat_close_input(&ifmt_ctx);
/* close output */
if (ofmt_ctx && !(ofmt->flags & AVFMT_NOFILE))
if (ofmt_ctx && !(ofmt_ctx->oformat->flags & AVFMT_NOFILE))
avio_closep(&ofmt_ctx->pb);
avformat_free_context(ofmt_ctx);
+2 -2
View File
@@ -177,7 +177,7 @@ static int open_output_file(const char *filename)
enc_ctx->width = dec_ctx->width;
enc_ctx->sample_aspect_ratio = dec_ctx->sample_aspect_ratio;
ret = avcodec_get_supported_config(dec_ctx, NULL,
ret = avcodec_get_supported_config(enc_ctx, NULL,
AV_CODEC_CONFIG_PIX_FORMAT, 0,
(const void**)&pix_fmts, NULL);
@@ -195,7 +195,7 @@ static int open_output_file(const char *filename)
if (ret < 0)
return ret;
ret = avcodec_get_supported_config(dec_ctx, NULL,
ret = avcodec_get_supported_config(enc_ctx, NULL,
AV_CODEC_CONFIG_SAMPLE_FORMAT, 0,
(const void**)&sample_fmts, NULL);
+3 -3
View File
@@ -96,7 +96,6 @@ static int encode_write(AVCodecContext *avctx, AVFrame *frame, FILE *fout)
end:
av_packet_free(&enc_pkt);
ret = ((ret == AVERROR(EAGAIN)) ? 0 : -1);
return ret;
}
@@ -118,7 +117,7 @@ int main(int argc, char *argv[])
height = atoi(argv[2]);
size = width * height;
if (!(fin = fopen(argv[3], "r"))) {
if (!(fin = fopen(argv[3], "rb"))) {
fprintf(stderr, "Fail to open input file : %s\n", strerror(errno));
return -1;
}
@@ -198,7 +197,8 @@ int main(int argc, char *argv[])
goto close;
}
if ((err = (encode_write(avctx, hw_frame, fout))) < 0) {
err = encode_write(avctx, hw_frame, fout);
if (err != AVERROR(EAGAIN) && err < 0) {
fprintf(stderr, "Failed to encode.\n");
goto close;
}
+3 -1
View File
@@ -294,7 +294,9 @@ int main(int argc, char **argv)
end:
avformat_close_input(&ifmt_ctx);
avformat_close_input(&ofmt_ctx);
if (ofmt_ctx && !(ofmt_ctx->oformat->flags & AVFMT_NOFILE))
avio_closep(&ofmt_ctx->pb);
avformat_free_context(ofmt_ctx);
avcodec_free_context(&decoder_ctx);
avcodec_free_context(&encoder_ctx);
av_buffer_unref(&hw_device_ctx);
+21
View File
@@ -1569,6 +1569,27 @@ Set whether on display the image should be vertically flipped.
See the @code{-display_rotation} option for more details.
@item -mastering_display[:@var{stream_specifier}] @var{G(%u,%u)B(%u,%u)R(%u,%u)WP(%u,%u)L(%u,%u)} (@emph{input,per-stream})
Set video mastering display metadata.
@var{G(%u,%u)B(%u,%u)R(%u,%u)WP(%u,%u)L(%u,%u)} is a string specifying
X,Y display primaries for GBR channels and white point (WP) in units of
0.00002, and max-min luminance (L) values in units of 0.0001 candela per
square meter. The values are unsigned integers representing the numerator
of a rational with an implicit denominator of 50000 for GBR and (WP), and
implicit denominator 10000 for (L).
This option overrides the mastering display metadata stored in the file,
if any.
@item -content_light[:@var{stream_specifier}] @var{%u,%u} (@emph{input,per-stream})
Set video content light metadata.
@var{%u,%u} is a string specifying max content light level and maximum picture
average light level.
This option overrides the content light metadata stored in the file, if any.
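As a rough illustration of the units described for @code{-mastering_display}, the following C sketch parses such a string and converts the numerators using the implicit denominators 50000 and 10000. It is not the ffmpeg CLI parser; the struct and function names are hypothetical, and the L pair is assumed to be ordered max then min, following the "max-min" wording above.

```c
#include <stdio.h>

/* Hypothetical container for the parsed numerators, not an FFmpeg type. */
struct mastering_display {
    unsigned g[2], b[2], r[2], wp[2]; /* x,y in units of 0.00002 */
    unsigned l[2];                    /* max,min in units of 0.0001 cd/m2 */
};

/* Returns 0 if all ten values were parsed, -1 otherwise. */
int parse_mastering_display(const char *s, struct mastering_display *m)
{
    int n = sscanf(s, "G(%u,%u)B(%u,%u)R(%u,%u)WP(%u,%u)L(%u,%u)",
                   &m->g[0], &m->g[1], &m->b[0], &m->b[1],
                   &m->r[0], &m->r[1], &m->wp[0], &m->wp[1],
                   &m->l[0], &m->l[1]);
    return n == 10 ? 0 : -1;
}

/* Implicit denominator of 50000 for primaries and white point. */
double chroma_value(unsigned v)    { return v / 50000.0; }

/* Implicit denominator of 10000 for luminance. */
double luminance_value(unsigned v) { return v / 10000.0; }
```

For example, a green x value of 13250 corresponds to 0.265, and a max luminance of 10000000 corresponds to 1000 candela per square meter.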
@item -vn (@emph{input/output})
As an input option, blocks all video streams of a file from being filtered or
being automatically selected or mapped for any output. See @code{-discard}
+1
View File
@@ -357,6 +357,7 @@
<xsd:complexType name="streamGroupComponentType">
<xsd:sequence>
<xsd:element name="subcomponents" type="ffprobe:streamGroupSubComponentList" minOccurs="0" maxOccurs="1"/>
<xsd:element name="side_data_list" type="ffprobe:packetSideDataListType" minOccurs="0" maxOccurs="1"/>
<xsd:element name="component_entry" type="ffprobe:streamGroupEntryType" minOccurs="0" maxOccurs="unbounded"/>
</xsd:sequence>
</xsd:complexType>
+157 -6
View File
@@ -7743,6 +7743,11 @@ The file path of the downloaded whisper.cpp model (mandatory).
The language to use for transcription ('auto' for auto-detect).
Default value: @code{"auto"}
@item translate
If enabled, translate the transcription from the source language to English. A
multilingual model is required to enable this option.
Default value: @code{"false"}
@item queue
The maximum size that will be queued into the filter before processing the audio
with whisper. Using a small value the audio stream will be processed more often,
@@ -7775,8 +7780,8 @@ Default value: @code{"text"}
@item max_len
Maximum segment length in characters. When set to a value greater than 0,
transcription segments will be split to not exceed this length. This is useful
for generating subtitles with shorter lines.
transcription segments will be split by word to not exceed this length. This is
useful for generating subtitles with shorter lines.
Default value: @code{"0"}
@item vad_model
@@ -14744,6 +14749,73 @@ This flag is enabled by default.
@end table
@end table
@section frc_amf
Double the frame rate, using the Frame Rate Converter (FRC) provided by
the AMD Advanced Media Framework library for hardware acceleration.
The filter accepts the following options:
@table @option
@item engine_type
Specify the engine used to run shaders.
@table @samp
@item dx11
DirectX 11.
@item dx12
DirectX 12 (Default value).
@end table
@item enable
Boolean value: enable/disable FRC. Dynamic value, can be altered at runtime
without re-initializing the filter. (Default value: enabled).
@item fallback_mode
Fallback behavior in case of low interpolation confidence.
@table @samp
@item duplicate
Duplicate frame.
@item blend
Blend two frames together (Default value).
@end table
@item indicator
Boolean value: show FRC indicator square in the top left corner of the video
(Default value: disabled).
@item profile
Level of hierarchical motion search.
@table @samp
@item low
Fewer levels of hierarchical motion search.
Only recommended for extremely low resolutions.
@item high
Recommended for any resolution up to 1440p. (Default value)
@item super
More levels of hierarchical motion search. Recommended for resolutions 1440p
or higher.
@end table
@item mv_search_mode
Performance mode of the motion search.
@table @samp
@item native
Conduct motion search on the full resolution of source images.
@item performance
Conduct motion search on the downscaled source images.
Recommended for APUs or low-end GPUs for better performance.
@end table
@item use_future_frame
Boolean value: enable dependency on future frame, improves quality at the cost
of latency (Default value: enabled).
@end table
@section framestep
Select one frame every N-th frame.
@@ -25648,6 +25720,22 @@ Work the same as the identical @ref{scale} filter options.
@item reset_sar
Works the same as the identical @ref{scale} filter option.
@item in_color_range
Override input color range.
@item out_color_range
Specify output color range.
The accepted values for in_color_range and out_color_range are:
@table @samp
@item studio
Studio (or restricted, or MPEG) color range.
@item full
Full (or JPEG) color range.
@end table
@anchor{color_profile}
@item color_profile
Specify all color properties at once.
@@ -25665,10 +25753,13 @@ BT.2020
@end table
@item trc
@item in_trc
Override input transfer characteristics.
@item out_trc
Specify output transfer characteristics.
The accepted values are:
The accepted values for in_trc and out_trc are:
@table @samp
@item bt709
BT.709
@@ -25720,10 +25811,13 @@ ARIB_STD_B67
@end table
@item primaries
@item in_primaries
Override input color primaries.
@item out_primaries
Specify output color primaries.
The accepted values are:
The accepted values for in_primaries and out_primaries are:
@table @samp
@item bt709
BT.709
@@ -25775,6 +25869,13 @@ Upscale to 4K and change color profile to bt2020.
@example
vpp_amf=4096:2160:color_profile=bt2020
@end example
@item
Override input primaries and input transfer characteristics, change both to bt709.
@example
vpp_amf=color_profile=bt2020:in_trc=smpte2084:in_primaries=bt2020:out_trc=bt709:out_primaries=bt709
@end example
@end itemize
@anchor{vstack}
@@ -27390,6 +27491,56 @@ Thumbnails are extracted from every @var{n}=150-frame batch, selecting one per b
@end itemize
@subsection transpose_cuda
Transpose rows with columns in the input video and optionally flip it.
For more in-depth examples see the @ref{transpose} video filter, which shares mostly the same options.
It accepts the following parameters:
@table @option
@item dir
Specify the transposition direction.
Can assume the following values:
@table @samp
@item cclock_flip
Rotate by 90 degrees counterclockwise and vertically flip. (default)
@item clock
Rotate by 90 degrees clockwise.
@item cclock
Rotate by 90 degrees counterclockwise.
@item clock_flip
Rotate by 90 degrees clockwise and vertically flip.
@item reversal
Rotate by 180 degrees.
@item hflip
Flip horizontally.
@item vflip
Flip vertically.
@end table
@item passthrough
Do not apply the transposition if the input geometry matches the one
specified by this option. It accepts the following values:
@table @samp
@item none
Always apply transposition. (default)
@item portrait
Preserve portrait geometry (when @var{height} >= @var{width}).
@item landscape
Preserve landscape geometry (when @var{width} >= @var{height}).
@end table
@end table
@subsection yadif_cuda
Deinterlace the input video using the @ref{yadif} algorithm, but implemented
+21 -2
View File
@@ -1759,7 +1759,7 @@ See also the @ref{framehash} and @ref{md5} muxers.
Animated GIF muxer.
Note that the GIF format has a very large time base: the delay between two frames can
therefore not be smaller than one centi second.
therefore not be smaller than one centisecond.
@subsection Options
@table @option
@@ -3069,7 +3069,7 @@ Default is @code{0x0001}.
Set the @samp{original_network_id}. This is the unique identifier of a
network in DVB. Its main use is in the unique identification of a service
through the path @samp{Original_Network_ID, Transport_Stream_ID}. Default
is @code{0x0001}.
is @code{0xff01}.
@item mpegts_service_id @var{integer}
Set the @samp{service_id}, also known as program in DVB. Default is
@@ -3096,6 +3096,8 @@ MPEG2 Digital HDTV service.
Advanced Codec Digital SDTV service.
@item advanced_codec_digital_hdtv
Advanced Codec Digital HDTV service.
@item hevc_digital_hdtv
HEVC Digital Television service.
@end table
@item mpegts_pmt_start_pid @var{integer}
@@ -3272,6 +3274,19 @@ ogg files can be safely chained.
@end table
@section pdv
Playdate Video muxer.
This muxer writes the Playdate video container used by Panic's Playdate SDK.
It requires a seekable output and a single PDV video stream.
@table @option
@item max_frames @var{frames}
Reserve space for at most @var{frames} video frames in the file header. This
option is mandatory.
@end table
@anchor{rcwtenc}
@section rcwt
@@ -3951,6 +3966,10 @@ This muxer supports the following options:
Set the timeout in milliseconds for ICE and DTLS handshake.
Default value is 5000.
@item timeout @var{integer}
Set timeout in seconds for socket I/O operations. Applicable only for HTTP output.
Default value is -1.
@item pkt_size @var{integer}
Set the maximum size, in bytes, of RTP packets that are sent out.
Default value is 1200.
+1 -6
View File
@@ -20,7 +20,7 @@ architecture-specific versions. It is recommended to look at older
revisions of the interesting files (web frontends for the various FFmpeg
branches are listed at http://ffmpeg.org/download.html).
Alternatively, look into the other architecture-specific versions in
the x86/, ppc/, alpha/ subdirectories. Even if you don't exactly
the x86/, ppc/, aarch64/ subdirectories. Even if you don't exactly
comprehend the instructions, it could help understanding the functions
and how they can be optimized.
@@ -191,11 +191,6 @@ __asm__() block.
Use external asm (nasm) or inline asm (__asm__()), do not use intrinsics.
The latter requires a good optimizing compiler which gcc is not.
When debugging a x86 external asm compilation issue, if lost in the macro
expansions, add DBG=1 to your make command-line: the input file will be
preprocessed, stripped of the debug/empty lines, then compiled, showing the
actual lines causing issues.
Inline asm vs. external asm
---------------------------
Both inline asm (__asm__("..") in a .c file, handled by a compiler such as gcc)
+53 -13
View File
@@ -11,48 +11,88 @@ For programmatic use, they can be set explicitly in the
@table @option
@anchor{scaler}
@item scaler, scaler_sub
Choose the scaling algorithm to use. Default value is @samp{auto} for both.
It accepts the following values:
@table @samp
@item auto
Automatic choice. For @samp{scaler_sub}, this means the same algorithm as
@samp{scaler}. For @samp{scaler}, this defaults to the scaler flag selected
by @samp{sws_flags}.
@item bilinear
Bilinear filter. (AKA triangle filter)
@item bicubic
2-tap cubic BC-spline (AKA Mitchell-Netravali spline). The B and C parameters
can be configured by setting @code{param0} and @code{param1}, defaulting to
0.0 and 0.6 respectively.
@item point, neighbor
Point sampling (AKA nearest neighbor).
@item area
Area averaging. Equivalent to @samp{bilinear} for upscaling.
@item gaussian
2-tap Gaussian filter approximation. The sharpness parameter can be configured
by setting @code{param0}, defaulting to 3.0.
@item sinc
Unwindowed sinc filter.
@item lanczos
Lanczos resampling (sinc windowed sinc). The number of filter taps can
be configured by setting @code{param0}, defaulting to 3.
@item spline
Unwindowed natural bicubic spline.
@end table
@anchor{sws_flags}
@item sws_flags
Set the scaler flags. This is also used to set the scaling
algorithm. Only a single algorithm should be selected. Default
value is @samp{bicubic}.
algorithm, though this usage is deprecated in favor of setting @samp{scaler}.
Only a single algorithm may be selected. Default value is @samp{bicubic}.
It accepts the following values:
@table @samp
@item fast_bilinear
Select fast bilinear scaling algorithm.
Select fast bilinear scaling algorithm. (Deprecated)
@item bilinear
Select bilinear scaling algorithm.
Select bilinear scaling algorithm. (Deprecated)
@item bicubic
Select bicubic scaling algorithm.
Select bicubic scaling algorithm. (Deprecated)
@item experimental
Select experimental scaling algorithm.
Select experimental scaling algorithm. (Deprecated)
@item neighbor
Select nearest neighbor rescaling algorithm.
Select nearest neighbor rescaling algorithm. (Deprecated)
@item area
Select averaging area rescaling algorithm.
Select averaging area rescaling algorithm. (Deprecated)
@item bicublin
Select bicubic scaling algorithm for the luma component, bilinear for
chroma components.
chroma components. (Deprecated)
@item gauss
Select Gaussian rescaling algorithm.
Select Gaussian rescaling algorithm. (Deprecated)
@item sinc
Select sinc rescaling algorithm.
Select sinc rescaling algorithm. (Deprecated)
@item lanczos
Select Lanczos rescaling algorithm. The default width (alpha) is 3 and can be
changed by setting @code{param0}.
changed by setting @code{param0}. (Deprecated)
@item spline
Select natural bicubic spline rescaling algorithm.
Select natural bicubic spline rescaling algorithm. (Deprecated)
@item print_info
Enable printing/debug logging.
+53 -73
View File
@@ -55,7 +55,8 @@ sub get_formatting_function($$) {
# determine texinfo version
my $package_version = ff_get_conf('PACKAGE_VERSION');
$package_version =~ s/\+dev$//;
$package_version =~ s/\+nc$//;
$package_version =~ s/\+?dev$//;
my $program_version_num = version->declare($package_version)->numify;
my $program_version_6_8 = $program_version_num >= 6.008000;
@@ -119,29 +120,8 @@ sub ffmpeg_heading_command($$$$$)
}
my $heading_level;
# node is used as heading if there is nothing else.
if ($cmdname eq 'node') {
if (!$output_unit or
(((!$output_unit->{'extra'}->{'section'}
and $output_unit->{'extra'}->{'node'}
and $output_unit->{'extra'}->{'node'} eq $command)
or
((($output_unit->{'extra'}->{'unit_command'}
and $output_unit->{'extra'}->{'unit_command'} eq $command)
or
($output_unit->{'unit_command'}
and $output_unit->{'unit_command'} eq $command))
and $command->{'extra'}
and not $command->{'extra'}->{'associated_section'}))
# bogus node may not have been normalized
and defined($command->{'extra'}->{'normalized'}))) {
if ($command->{'extra'}->{'normalized'} eq 'Top') {
$heading_level = 0;
} else {
$heading_level = 3;
}
}
} else {
# Never use node for heading
if ($cmdname ne 'node') {
if (defined($command->{'extra'})
and defined($command->{'extra'}->{'section_level'})) {
$heading_level = $command->{'extra'}->{'section_level'};
@@ -153,58 +133,58 @@ sub ffmpeg_heading_command($$$$$)
}
}
my $heading = $self->command_text($command);
# $heading not defined may happen if the command is a @node, for example
# if there is an error in the node.
if (defined($heading) and $heading ne '' and defined($heading_level)) {
if ($root_commands{$cmdname}
and $sectioning_commands{$cmdname}) {
my $content_href = $self->command_contents_href($command, 'contents',
$self->{'current_filename'});
if ($content_href) {
my $this_href = $content_href =~ s/^\#toc-/\#/r;
$heading .= '<span class="pull-right">'.
'<a class="anchor hidden-xs" '.
"href=\"$this_href\" aria-hidden=\"true\">".
($ENV{"FA_ICONS"} ? '<i class="fa fa-link"></i>'
: '#').
'</a> '.
'<a class="anchor hidden-xs"'.
"href=\"$content_href\" aria-hidden=\"true\">".
($ENV{"FA_ICONS"} ? '<i class="fa fa-navicon"></i>'
: 'TOC').
'</a>'.
'</span>';
if (defined($heading_level)) {
my $heading = $self->command_text($command);
# empty heading corresponds to an empty @top
if ($heading ne '') {
if ($root_commands{$cmdname}
and $sectioning_commands{$cmdname}) {
my $content_href = $self->command_contents_href($command, 'contents',
$self->{'current_filename'});
if ($content_href) {
my $this_href = $content_href =~ s/^\#toc-/\#/r;
$heading .= '<span class="pull-right">'.
'<a class="anchor hidden-xs" '.
"href=\"$this_href\" aria-hidden=\"true\">".
($ENV{"FA_ICONS"} ? '<i class="fa fa-link"></i>'
: '#').
'</a> '.
'<a class="anchor hidden-xs"'.
"href=\"$content_href\" aria-hidden=\"true\">".
($ENV{"FA_ICONS"} ? '<i class="fa fa-navicon"></i>'
: 'TOC').
'</a>'.
'</span>';
}
}
}
my $in_preformatted;
if ($program_version_num >= 7.001090) {
$in_preformatted = $self->in_preformatted_context();
} else {
$in_preformatted = $self->in_preformatted();
}
if ($in_preformatted) {
$result .= $heading."\n";
} else {
# if the level was changed, set the command name right
if ($cmdname ne 'node'
and $heading_level ne $Texinfo::Common::command_structuring_level{$cmdname}) {
$cmdname
= $Texinfo::Common::level_to_structuring_command{$cmdname}->[$heading_level];
}
if ($program_version_num >= 7.000000) {
$result .= &{get_formatting_function($self,'format_heading_text')}($self,
$cmdname, [$cmdname], $heading,
$heading_level +$self->get_conf('CHAPTER_HEADER_LEVEL') -1,
$heading_id, $command);
my $in_preformatted;
if ($program_version_num >= 7.001090) {
$in_preformatted = $self->in_preformatted_context();
} else {
$result .= &{get_formatting_function($self,'format_heading_text')}(
$self, $cmdname, $heading,
$heading_level +
$self->get_conf('CHAPTER_HEADER_LEVEL') - 1, $command);
$in_preformatted = $self->in_preformatted();
}
if ($in_preformatted) {
$result .= $heading."\n";
} else {
# if the level was changed, set the command name right
if ($cmdname ne 'node'
and $heading_level ne $Texinfo::Common::command_structuring_level{$cmdname}) {
$cmdname
= $Texinfo::Common::level_to_structuring_command{$cmdname}->[$heading_level];
}
if ($program_version_num >= 7.000000) {
$result .= &{get_formatting_function($self,'format_heading_text')}($self,
$cmdname, [$cmdname], $heading,
$heading_level +$self->get_conf('CHAPTER_HEADER_LEVEL') -1,
$heading_id, $command);
} else {
$result .= &{get_formatting_function($self,'format_heading_text')}(
$self, $cmdname, $heading,
$heading_level +
$self->get_conf('CHAPTER_HEADER_LEVEL') - 1, $command);
}
}
}
}
+17 -9
@@ -1,14 +1,17 @@
The basis transforms used for FFT and various other derived functions are based
on the following unrollings.
# Transforms
The basis transforms used for FFT and various other derived functions are based on the following unrollings.
The functions can be easily adapted to double precision floats as well.
# Parity permutation
## Parity permutation
The basis transforms described here all use the following permutation:
``` C
void ff_tx_gen_split_radix_parity_revtab(int *revtab, int len, int inv,
int basis, int dual_stride);
```
Parity means even and odd complex numbers will be split, e.g. the even
coefficients will come first, after which the odd coefficients will be
placed. For example, a 4-point transform's coefficients after reordering:
@@ -33,7 +36,8 @@ register or 0. This allows to reuse SSE functions as dual-transform
functions in AVX mode.
If length is smaller than basis/2 this function will not do anything.
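Review note: the "parity" ordering described in this document (even coefficients first, then odd) can be illustrated with a single-level index map. This is only a sketch of the idea, not the behaviour of `ff_tx_gen_split_radix_parity_revtab`, which also takes `inv`, `basis` and `dual_stride` into account; the helper name `parity_split_index` is hypothetical.

```c
#include <assert.h>

/*
 * Hypothetical helper: for output position i of a len-point sequence,
 * return the input index that lands there when all even-indexed inputs
 * are placed first and all odd-indexed inputs follow.
 * Only one level of splitting is modelled here.
 */
static int parity_split_index(int i, int len)
{
    assert(len > 0 && len % 2 == 0);
    assert(i >= 0 && i < len);

    if (i < len / 2)
        return 2 * i;                 /* first half: even inputs */
    return 2 * (i - len / 2) + 1;     /* second half: odd inputs */
}
```

For a 4-point sequence this maps output positions 0..3 to inputs 0, 2, 1, 3.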
# 4-point FFT transform
## 4-point FFT transform
The only permutation this transform needs is to swap the `z[1]` and `z[2]`
elements when performing an inverse transform, which in the assembly code is
hardcoded with the function itself being templated and duplicated for each
@@ -80,7 +84,8 @@ static void fft4(FFTComplex *z)
}
```
# 8-point AVX FFT transform
## 8-point AVX FFT transform
Input must be pre-permuted using the parity lookup table, generated via
`ff_tx_gen_split_radix_parity_revtab`.
@@ -193,7 +198,8 @@ This theme continues throughout the document. Note that in the actual assembly c
the paths are interleaved to improve unit saturation and CPU dependency tracking, so
to more clearly see them, you'll need to deinterleave the instructions.
# 8-point SSE/ARM64 FFT transform
## 8-point SSE/ARM64 FFT transform
Input must be pre-permuted using the parity lookup table, generated via
`ff_tx_gen_split_radix_parity_revtab`.
@@ -305,7 +311,8 @@ static void fft8(FFTComplex *z)
Most functions here are highly tuned to use x86's addsub instruction to save on
external sign mask loading.
# 16-point AVX FFT transform
## 16-point AVX FFT transform
This version expects the output of the 8 and 4-point transforms to follow the
even/odd convention established above.
@@ -445,7 +452,8 @@ static void fft16(FFTComplex *z)
}
```
# AVX split-radix synthesis
## AVX split-radix synthesis
To create larger transforms, the following unrolling of the C split-radix
function is used.
@@ -705,8 +713,8 @@ beginning to overlap, particularly `[o1]` with `[0]` after the second iteration.
To iterate further, set `z = &z[16]` via `z += 8` for the second iteration. After
the 4th iteration, the layout resets, so repeat the same.
## 15-point AVX FFT transform
# 15-point AVX FFT transform
The 15-point transform is based on the following unrolling. The input
must be permuted via the following loop:
+3 -15
@@ -2,14 +2,6 @@
# common bits used by all libraries
#
DEFAULT_X86ASMD=.dbg
ifeq ($(DBG),1)
X86ASMD=$(DEFAULT_X86ASMD)
else
X86ASMD=
endif
ifndef SUBDIR
LINK = $(LD) $(1)
@@ -105,10 +97,6 @@ COMPILE_LASX = $(call COMPILE,CC,LASXFLAGS)
%_host.o: %.c
$(COMPILE_HOSTC)
%$(DEFAULT_X86ASMD).asm: %.asm
$(DEPX86ASM) $(X86ASMFLAGS) -M -o $@ $< > $(@:.asm=.d)
$(X86ASM) $(X86ASMFLAGS) -e $< | sed '/^%/d;/^$$/d;' > $@
%.o: %.asm
$(COMPILE_X86ASM)
-$(if $(ASMSTRIPFLAGS), $(STRIP) $(ASMSTRIPFLAGS) $@)
@@ -197,7 +185,7 @@ endif
clean::
$(RM) $(BIN2CEXE) $(CLEANSUFFIXES:%=ffbuild/%)
%.c %.h %.pc %.ver %.version: TAG = GEN
%.c %.h %.S %.pc %.ver %.version: TAG = GEN
# Dummy rule to stop make trying to rebuild removed or renamed headers
%.h %_template.c:
@@ -266,7 +254,7 @@ $(TOOLOBJS): | tools
OUTDIRS := $(OUTDIRS) $(dir $(OBJS) $(HOBJS) $(HOSTOBJS) $(SHLIBOBJS) $(STLIBOBJS) $(TESTOBJS))
CLEANSUFFIXES = *.d *.gcda *.gcno *.h.c *.ho *.map *.o *.objs *.pc *.ptx *.ptx.gz *.ptx.c *.spv *.spv.gz *.spv.c *.ver *.version *.html.gz *.html.c *.css.min.gz *.css.min *.css.c *$(DEFAULT_X86ASMD).asm *~ *.ilk *.pdb
CLEANSUFFIXES = *.d *.gcda *.gcno *.h.c *.ho *.map *.o *.objs *.pc *.ptx *.ptx.gz *.ptx.c *.spv *.spv.gz *.spv.c *.gen.c *.gen.S *.ver *.version *.html.gz *.html.c *.css.min.gz *.css.min *.css.c *~ *.ilk *.pdb
LIBSUFFIXES = *.a *.lib *.so *.so.* *.dylib *.dll *.def *.dll.a
define RULES
@@ -276,4 +264,4 @@ endef
$(eval $(RULES))
-include $(wildcard $(OBJS:.o=.d) $(HOSTOBJS:.o=.d) $(TESTOBJS:.o=.d) $(HOBJS:.o=.d) $(SHLIBOBJS:.o=.d) $(STLIBOBJS:.o=.d) $(SPVOBJS:.spv.o=.d)) $(OBJS:.o=$(DEFAULT_X86ASMD).d)
-include $(wildcard $(OBJS:.o=.d) $(HOSTOBJS:.o=.d) $(TESTOBJS:.o=.d) $(HOBJS:.o=.d) $(SHLIBOBJS:.o=.d) $(STLIBOBJS:.o=.d) $(SPVOBJS:.spv.o=.d))
+1 -1
View File
@@ -1267,7 +1267,7 @@ unsigned stream_specifier_match(const StreamSpecifier *ss,
break;
}
}
// fall-through
av_fallthrough;
case STREAM_LIST_GROUP_IDX:
if (ss->stream_list == STREAM_LIST_GROUP_IDX &&
ss->list_id >= 0 && ss->list_id < s->nb_stream_groups)
+3 -2
@@ -253,7 +253,6 @@ void term_init(void)
/* read a key without blocking */
static int read_key(void)
{
unsigned char ch = -1;
#if HAVE_TERMIOS_H
int n = 1;
struct timeval tv;
@@ -265,6 +264,7 @@ static int read_key(void)
tv.tv_usec = 0;
n = select(1, &rfds, NULL, NULL, &tv);
if (n > 0) {
unsigned char ch;
n = read(0, &ch, 1);
if (n == 1)
return ch;
@@ -289,6 +289,7 @@ static int read_key(void)
}
//Read it
if(nchars != 0) {
unsigned char ch;
if (read(0, &ch, 1) == 1)
return ch;
return 0;
@@ -300,7 +301,7 @@ static int read_key(void)
if(kbhit())
return(getch());
#endif
return ch;
return -1;
}
static int decode_interrupt_cb(void *ctx)
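Review note: one likely motivation for the `read_key()` change above is that a `-1` sentinel stored in an `unsigned char` does not survive the trip back to `int`. The sketch below reproduces just that conversion in isolation; the function names are hypothetical, and the `key < 0` check is an assumed caller convention rather than something shown in this diff.

```c
#include <assert.h>
#include <limits.h>

/* Pre-patch shape: the sentinel passes through an unsigned char. */
static int sentinel_via_unsigned_char(void)
{
    unsigned char ch = -1; /* converts to UCHAR_MAX */
    return ch;             /* promoted to int: UCHAR_MAX, not -1 */
}

/* Post-patch shape: the sentinel is returned as an int directly. */
static int sentinel_direct(void)
{
    return -1;
}

/* Assumed caller convention: a negative value means "no key read". */
static int is_no_key(int key)
{
    return key < 0;
}

/* Small self-check tying the three helpers together. */
static void check_sentinels(void)
{
    assert(!is_no_key(sentinel_via_unsigned_char()));
    assert(is_no_key(sentinel_direct()));
}
```

With an 8-bit `char`, the first helper returns 255, which a caller cannot tell apart from a real byte with that value.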
+2
@@ -218,6 +218,8 @@ typedef struct OptionsContext {
SpecifierOptList display_rotations;
SpecifierOptList display_hflips;
SpecifierOptList display_vflips;
SpecifierOptList mastering_displays;
SpecifierOptList content_lights;
SpecifierOptList rc_overrides;
SpecifierOptList intra_matrices;
SpecifierOptList inter_matrices;
+148 -6
@@ -28,6 +28,7 @@
#include "libavutil/display.h"
#include "libavutil/error.h"
#include "libavutil/intreadwrite.h"
#include "libavutil/mastering_display_metadata.h"
#include "libavutil/mem.h"
#include "libavutil/opt.h"
#include "libavutil/parseutils.h"
@@ -68,6 +69,8 @@ typedef struct DemuxStream {
int autorotate;
int apply_cropping;
int force_display_matrix;
int force_mastering_display;
int force_content_light;
int drop_changed;
@@ -1248,6 +1251,125 @@ static int add_display_matrix_to_stream(const OptionsContext *o,
return 0;
}
static int add_mastering_display_to_stream(const OptionsContext *o,
AVFormatContext *ctx, InputStream *ist)
{
AVStream *st = ist->st;
DemuxStream *ds = ds_from_ist(ist);
AVMasteringDisplayMetadata *master_display;
AVPacketSideData *sd;
const char *p = NULL;
const int chroma_den = 50000;
const int luma_den = 10000;
size_t size;
int ret;
opt_match_per_stream_str(ist, &o->mastering_displays, ctx, st, &p);
if (!p)
return 0;
master_display = av_mastering_display_metadata_alloc_size(&size);
if (!master_display)
return AVERROR(ENOMEM);
ret = sscanf(p,
"G(%u,%u)B(%u,%u)R(%u,%u)WP(%u,%u)L(%u,%u)",
(unsigned*)&master_display->display_primaries[1][0].num,
(unsigned*)&master_display->display_primaries[1][1].num,
(unsigned*)&master_display->display_primaries[2][0].num,
(unsigned*)&master_display->display_primaries[2][1].num,
(unsigned*)&master_display->display_primaries[0][0].num,
(unsigned*)&master_display->display_primaries[0][1].num,
(unsigned*)&master_display->white_point[0].num,
(unsigned*)&master_display->white_point[1].num,
(unsigned*)&master_display->max_luminance.num,
(unsigned*)&master_display->min_luminance.num);
if (ret != 10 ||
(unsigned)(master_display->display_primaries[1][0].num | master_display->display_primaries[1][1].num |
master_display->display_primaries[2][0].num | master_display->display_primaries[2][1].num |
master_display->display_primaries[0][0].num | master_display->display_primaries[0][1].num |
master_display->white_point[0].num | master_display->white_point[1].num) > UINT16_MAX ||
(unsigned)(master_display->max_luminance.num | master_display->min_luminance.num) > INT_MAX ||
master_display->min_luminance.num > master_display->max_luminance.num) {
av_freep(&master_display);
av_log(ist, AV_LOG_ERROR, "Failed to parse mastering display option\n");
return AVERROR(EINVAL);
}
master_display->display_primaries[1][0].den = chroma_den;
master_display->display_primaries[1][1].den = chroma_den;
master_display->display_primaries[2][0].den = chroma_den;
master_display->display_primaries[2][1].den = chroma_den;
master_display->display_primaries[0][0].den = chroma_den;
master_display->display_primaries[0][1].den = chroma_den;
master_display->white_point[0].den = chroma_den;
master_display->white_point[1].den = chroma_den;
master_display->max_luminance.den = luma_den;
master_display->min_luminance.den = luma_den;
master_display->has_primaries = 1;
master_display->has_luminance = 1;
sd = av_packet_side_data_add(&st->codecpar->coded_side_data,
&st->codecpar->nb_coded_side_data,
AV_PKT_DATA_MASTERING_DISPLAY_METADATA,
(uint8_t *)master_display, size, 0);
if (!sd) {
av_freep(&master_display);
return AVERROR(ENOMEM);
}
ds->force_mastering_display = 1;
return 0;
}
static int add_content_light_to_stream(const OptionsContext *o,
AVFormatContext *ctx, InputStream *ist)
{
AVStream *st = ist->st;
DemuxStream *ds = ds_from_ist(ist);
AVContentLightMetadata *cll;
AVPacketSideData *sd;
const char *p = NULL;
size_t size;
int ret;
opt_match_per_stream_str(ist, &o->content_lights, ctx, st, &p);
if (!p)
return 0;
cll = av_content_light_metadata_alloc(&size);
if (!cll)
return AVERROR(ENOMEM);
ret = sscanf(p, "%u,%u",
(unsigned*)&cll->MaxCLL,
(unsigned*)&cll->MaxFALL);
if (ret != 2 || (unsigned)(cll->MaxCLL | cll->MaxFALL) > UINT16_MAX) {
av_freep(&cll);
av_log(ist, AV_LOG_ERROR, "Failed to parse content light option\n");
return AVERROR(EINVAL);
}
sd = av_packet_side_data_add(&st->codecpar->coded_side_data,
&st->codecpar->nb_coded_side_data,
AV_PKT_DATA_CONTENT_LIGHT_LEVEL,
(uint8_t *)cll, size, 0);
if (!sd) {
av_freep(&cll);
return AVERROR(ENOMEM);
}
ds->force_content_light = 1;
return 0;
}
static const char *input_stream_item_name(void *obj)
{
const DemuxStream *ds = obj;
@@ -1301,6 +1423,7 @@ static int ist_add(const OptionsContext *o, Demuxer *d, AVStream *st, AVDictiona
const char *bsfs = NULL;
char *next;
const char *discard_str = NULL;
AVBPrint bp;
int ret;
ds = demux_stream_alloc(d, st);
@@ -1366,6 +1489,14 @@ static int ist_add(const OptionsContext *o, Demuxer *d, AVStream *st, AVDictiona
if (ret < 0)
return ret;
ret = add_mastering_display_to_stream(o, ic, ist);
if (ret < 0)
return ret;
ret = add_content_light_to_stream(o, ic, ist);
if (ret < 0)
return ret;
opt_match_per_stream_str(ist, &o->hwaccels, ic, st, &hwaccel);
opt_match_per_stream_str(ist, &o->hwaccel_output_formats, ic, st,
&hwaccel_output_format);
@@ -1483,15 +1614,26 @@ static int ist_add(const OptionsContext *o, Demuxer *d, AVStream *st, AVDictiona
av_dict_set_int(&ds->decoder_opts, "apply_cropping",
ds->apply_cropping && ds->apply_cropping != CROP_CONTAINER, 0);
av_bprint_init(&bp, 0, AV_BPRINT_SIZE_AUTOMATIC);
if (ds->force_display_matrix) {
char buf[32];
if (av_dict_get(ds->decoder_opts, "side_data_prefer_packet", NULL, 0))
buf[0] = ',';
else
buf[0] = '\0';
av_strlcat(buf, "displaymatrix", sizeof(buf));
av_dict_set(&ds->decoder_opts, "side_data_prefer_packet", buf, AV_DICT_APPEND);
av_bprintf(&bp, ",");
av_bprintf(&bp, "displaymatrix");
}
if (ds->force_mastering_display) {
if (bp.len || av_dict_get(ds->decoder_opts, "side_data_prefer_packet", NULL, 0))
av_bprintf(&bp, ",");
av_bprintf(&bp, "mastering_display_metadata");
}
if (ds->force_content_light) {
if (bp.len || av_dict_get(ds->decoder_opts, "side_data_prefer_packet", NULL, 0))
av_bprintf(&bp, ",");
av_bprintf(&bp, "content_light_level");
}
if (bp.len)
av_dict_set(&ds->decoder_opts, "side_data_prefer_packet", bp.str, AV_DICT_APPEND);
av_bprint_finalize(&bp, NULL);
/* Attached pics are sparse, therefore we would not want to delay their decoding
* till EOF. */
if (ist->st->disposition & AV_DISPOSITION_ATTACHED_PIC)
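Review note: the string format accepted by `add_mastering_display_to_stream()` above can be tried out without FFmpeg. The sketch below uses the same `sscanf` pattern and the same denominators (50000 for chromaticities, 10000 for luminance) and converts the result to floating point. `MasteringDisplaySketch` and `parse_mastering_display` are hypothetical names, and the range checks are written per value rather than with the OR-ed comparison used in the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical plain-C container, not an FFmpeg type. */
typedef struct MasteringDisplaySketch {
    int    valid;                         /* 1 if parsing succeeded */
    double green[2], blue[2], red[2];     /* x,y chromaticities */
    double white[2];                      /* white point x,y */
    double max_luminance, min_luminance;  /* cd/m^2 */
} MasteringDisplaySketch;

static MasteringDisplaySketch parse_mastering_display(const char *s)
{
    const double chroma_den = 50000.0;    /* same denominators as the patch */
    const double luma_den   = 10000.0;
    MasteringDisplaySketch md = { 0 };
    unsigned v[10];

    assert(s);

    /* As in the patch, a missing final ')' is not detected: sscanf only
     * counts successful conversions. */
    if (sscanf(s, "G(%u,%u)B(%u,%u)R(%u,%u)WP(%u,%u)L(%u,%u)",
               &v[0], &v[1], &v[2], &v[3], &v[4],
               &v[5], &v[6], &v[7], &v[8], &v[9]) != 10)
        return md;

    for (int i = 0; i < 8; i++)           /* chromaticities fit in 16 bits */
        if (v[i] > UINT16_MAX)
            return md;
    if (v[8] > INT_MAX || v[9] > v[8])    /* min must not exceed max */
        return md;

    md.green[0] = v[0] / chroma_den;  md.green[1] = v[1] / chroma_den;
    md.blue[0]  = v[2] / chroma_den;  md.blue[1]  = v[3] / chroma_den;
    md.red[0]   = v[4] / chroma_den;  md.red[1]   = v[5] / chroma_den;
    md.white[0] = v[6] / chroma_den;  md.white[1] = v[7] / chroma_den;
    md.max_luminance = v[8] / luma_den;
    md.min_luminance = v[9] / luma_den;
    md.valid = 1;
    return md;
}
```

Under these assumptions, an `L(10000000,1)` suffix corresponds to a maximum of 1000 cd/m^2 and a minimum of 0.0001 cd/m^2.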
+3
@@ -227,6 +227,9 @@ int enc_open(void *opaque, const AVFrame *frame)
frame->ch_layout.nb_channels > 0);
enc_ctx->sample_fmt = frame->format;
enc_ctx->sample_rate = frame->sample_rate;
if (!enc_ctx->frame_size && (!(enc->capabilities & AV_CODEC_CAP_VARIABLE_FRAME_SIZE) ||
(enc_ctx->flags2 & AV_CODEC_FLAG2_FIXED_FRAME_SIZE)))
enc_ctx->frame_size = frame->nb_samples;
ret = av_channel_layout_copy(&enc_ctx->ch_layout, &frame->ch_layout);
if (ret < 0)
return ret;
+10 -5
@@ -27,6 +27,7 @@
#include "libavfilter/buffersink.h"
#include "libavfilter/buffersrc.h"
#include "libavutil/attributes.h"
#include "libavutil/avassert.h"
#include "libavutil/avstring.h"
#include "libavutil/bprint.h"
@@ -1681,7 +1682,10 @@ static int configure_output_video_filter(FilterGraphPriv *fgp, AVFilterGraph *gr
av_frame_side_data_remove(&ofp->side_data, &ofp->nb_side_data, AV_FRAME_DATA_DISPLAYMATRIX);
}
if ((ofp->width || ofp->height) && (ofp->flags & OFILTER_FLAG_AUTOSCALE)) {
if ((ofp->width || ofp->height) && (ofp->flags & OFILTER_FLAG_AUTOSCALE) &&
// skip add scale for hardware format
!(ofp->format != AV_PIX_FMT_NONE &&
av_pix_fmt_desc_get(ofp->format)->flags & AV_PIX_FMT_FLAG_HWACCEL)) {
char args[255];
AVFilterContext *filter;
const AVDictionaryEntry *e = NULL;
@@ -2569,6 +2573,7 @@ static void video_sync_process(OutputFilterPriv *ofp, AVFrame *frame,
delta0 = 0;
ofp->next_pts = llrint(sync_ipts);
}
av_fallthrough;
case VSYNC_CFR:
// FIXME set to 0.5 after we fix some dts/pts bugs like in avidec.c
if (frame_drop_threshold && delta < frame_drop_threshold && fps->frame_number) {
@@ -2836,8 +2841,10 @@ static int fg_output_step(OutputFilterPriv *ofp, FilterGraphThread *fgt,
if (!fgt->got_frame) {
ret = clone_side_data(&fd->side_data, &fd->nb_side_data,
ofp->side_data, ofp->nb_side_data, 0);
if (ret < 0)
if (ret < 0) {
av_frame_unref(frame);
return ret;
}
}
fd->wallclock[LATENCY_PROBE_FILTER_POST] = av_gettime_relative();
@@ -3181,7 +3188,7 @@ static int send_frame(FilterGraph *fg, FilterGraphThread *fgt,
const char *color_space_name = av_color_space_name(frame->colorspace);
const char *color_range_name = av_color_range_name(frame->color_range);
const char *alpha_mode = av_alpha_mode_name(frame->alpha_mode);
av_bprintf(&reason, "video parameters changed to %s(%s, %s), %dx%d, %s alpha,",
av_bprintf(&reason, "video parameters changed to %s(%s, %s), %dx%d, %s alpha, ",
unknown_if_null(pixel_format_name), unknown_if_null(color_range_name),
unknown_if_null(color_space_name), frame->width, frame->height,
unknown_if_null(alpha_mode));
@@ -3337,8 +3344,6 @@ static int filter_thread(void *arg)
o = (intptr_t)fgt.frame->opaque;
o = (intptr_t)fgt.frame->opaque;
// message on the control stream
if (input_idx == fg->nb_inputs) {
FilterCommand *fc;
+12 -3
@@ -1531,6 +1531,8 @@ static int ost_add(Muxer *mux, const OptionsContext *o, enum AVMediaType type,
if (oc->oformat->flags & AVFMT_GLOBALHEADER && ost->enc)
ost->enc->enc_ctx->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
if (oc->oformat->flags & AVFMT_FIXED_FRAMESIZE && ost->enc)
ost->enc->enc_ctx->flags2 |= AV_CODEC_FLAG2_FIXED_FRAME_SIZE;
opt_match_per_stream_int(ost, &o->copy_initial_nonkeyframes,
oc, st, &ms->copy_initial_nonkeyframes);
@@ -2127,7 +2129,8 @@ static int setup_sync_queues(Muxer *mux, AVFormatContext *oc,
nb_interleaved += IS_INTERLEAVED(type);
nb_av_enc += IS_AV_ENC(ost, type);
nb_audio_fs += (ost->enc && type == AVMEDIA_TYPE_AUDIO &&
!(ost->enc->enc_ctx->codec->capabilities & AV_CODEC_CAP_VARIABLE_FRAME_SIZE));
(!(ost->enc->enc_ctx->codec->capabilities & AV_CODEC_CAP_VARIABLE_FRAME_SIZE) ||
(ost->enc->enc_ctx->flags2 & AV_CODEC_FLAG2_FIXED_FRAME_SIZE)));
limit_frames |= ms->max_frames < INT64_MAX;
limit_frames_av_enc |= (ms->max_frames < INT64_MAX) && IS_AV_ENC(ost, type);
@@ -2552,6 +2555,8 @@ static int of_map_group(Muxer *mux, AVDictionary **dict, AVBPrint *bp, const cha
}
break;
}
case AV_STREAM_GROUP_PARAMS_LCEVC:
break;
default:
av_log(mux, AV_LOG_ERROR, "Unsupported mapped group type %d.\n", stg->type);
ret = AVERROR(EINVAL);
@@ -2574,6 +2579,8 @@ static int of_parse_group_token(Muxer *mux, const char *token, char *ptr)
{ .i64 = AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT }, .unit = "type" },
{ "iamf_mix_presentation", NULL, 0, AV_OPT_TYPE_CONST,
{ .i64 = AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION }, .unit = "type" },
{ "lcevc", NULL, 0, AV_OPT_TYPE_CONST,
{ .i64 = AV_STREAM_GROUP_PARAMS_LCEVC }, .unit = "type" },
{ NULL },
};
const AVClass class = {
@@ -2648,6 +2655,10 @@ static int of_parse_group_token(Muxer *mux, const char *token, char *ptr)
ret = avformat_stream_group_add_stream(stg, oc->streams[idx]);
if (ret < 0)
goto end;
OutputStream *ost = mux->of.streams[idx];
if (ost->enc && (type == AV_STREAM_GROUP_PARAMS_IAMF_AUDIO_ELEMENT ||
type == AV_STREAM_GROUP_PARAMS_IAMF_MIX_PRESENTATION))
ost->enc->enc_ctx->flags2 |= AV_CODEC_FLAG2_FIXED_FRAME_SIZE;
}
while (e = av_dict_get(dict, "stg", e, 0)) {
char *endptr;
@@ -2672,8 +2683,6 @@ static int of_parse_group_token(Muxer *mux, const char *token, char *ptr)
ret = of_parse_iamf_submixes(mux, stg, ptr);
break;
default:
av_log(mux, AV_LOG_FATAL, "Unknown group type %d.\n", type);
ret = AVERROR(EINVAL);
break;
}
+9
@@ -645,6 +645,9 @@ static int opt_map(void *optctx, const char *opt, const char *arg)
for (i = 0; i < o->nb_stream_maps; i++) {
m = &o->stream_maps[i];
if (file_idx == m->file_index &&
!m->linklabel &&
m->stream_index >= 0 &&
m->stream_index < input_files[m->file_index]->nb_streams &&
stream_specifier_match(&ss,
input_files[m->file_index]->ctx,
input_files[m->file_index]->ctx->streams[m->stream_index],
@@ -1940,6 +1943,12 @@ const OptionDef options[] = {
{ .off = OFFSET(display_vflips) },
"set display vertical flip for stream(s) "
"(overrides any display rotation if it is not set)"},
{ "mastering_display", OPT_TYPE_STRING, OPT_VIDEO | OPT_PERSTREAM | OPT_INPUT | OPT_EXPERT,
{ .off = OFFSET(mastering_displays) },
"set SMPTE2084 mastering display color volume info" },
{ "content_light", OPT_TYPE_STRING, OPT_VIDEO | OPT_PERSTREAM | OPT_INPUT | OPT_EXPERT,
{ .off = OFFSET(content_lights) },
"set SMPTE2084 Max CLL and Max FALL values" },
{ "vn", OPT_TYPE_BOOL, OPT_VIDEO | OPT_OFFSET | OPT_INPUT | OPT_OUTPUT,
{ .off = OFFSET(video_disable) },
"disable video" },
+25 -5
@@ -435,7 +435,7 @@ static void task_init(Scheduler *sch, SchTask *task, enum SchedulerNodeType type
task->func_arg = func_arg;
}
static int64_t trailing_dts(const Scheduler *sch, int count_finished)
static int64_t trailing_dts(const Scheduler *sch)
{
int64_t min_dts = INT64_MAX;
@@ -445,7 +445,7 @@ static int64_t trailing_dts(const Scheduler *sch, int count_finished)
for (unsigned j = 0; j < mux->nb_streams; j++) {
const SchMuxStream *ms = &mux->streams[j];
if (ms->source_finished && !count_finished)
if (ms->source_finished)
continue;
if (ms->last_dts == AV_NOPTS_VALUE)
return AV_NOPTS_VALUE;
@@ -457,6 +457,26 @@ static int64_t trailing_dts(const Scheduler *sch, int count_finished)
return min_dts == INT64_MAX ? AV_NOPTS_VALUE : min_dts;
}
static int64_t progressing_dts(const Scheduler *sch, int count_finished)
{
int64_t max_dts = INT64_MIN;
for (unsigned i = 0; i < sch->nb_mux; i++) {
const SchMux *mux = &sch->mux[i];
for (unsigned j = 0; j < mux->nb_streams; j++) {
const SchMuxStream *ms = &mux->streams[j];
if (ms->source_finished && !count_finished)
continue;
if (ms->last_dts != AV_NOPTS_VALUE)
max_dts = FFMAX(max_dts, ms->last_dts);
}
}
return max_dts == INT64_MIN ? AV_NOPTS_VALUE : max_dts;
}
void sch_remove_filtergraph(Scheduler *sch, int idx)
{
SchFilterGraph *fg = &sch->filters[idx];
@@ -1399,9 +1419,9 @@ static void schedule_update_locked(Scheduler *sch)
if (atomic_load(&sch->terminate))
return;
dts = trailing_dts(sch, 0);
dts = trailing_dts(sch);
atomic_store(&sch->last_dts, dts);
atomic_store(&sch->last_dts, progressing_dts(sch, 0));
// initialize our internal state
for (unsigned type = 0; type < 2; type++)
@@ -2768,7 +2788,7 @@ int sch_stop(Scheduler *sch, int64_t *finish_ts)
}
if (finish_ts)
*finish_ts = trailing_dts(sch, 1);
*finish_ts = progressing_dts(sch, 1);
sch->state = SCH_STATE_STOPPED;
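Review note: the scheduler change above splits one helper into two, a minimum over unfinished streams (`trailing_dts`) and a maximum that can optionally include finished streams (`progressing_dts`). The sketch below restates both over a flat array so the difference is easy to see. `StreamSketch`, `SKETCH_NOPTS` and the `_sketch` function names are hypothetical, and the per-muxer nesting of the real code is flattened.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NOPTS INT64_MIN /* stand-in for AV_NOPTS_VALUE */

/* Hypothetical flattened view of one muxed stream. */
typedef struct StreamSketch {
    int64_t last_dts;
    int     source_finished;
} StreamSketch;

/* Smallest DTS among unfinished streams; unknown if any of them has none. */
static int64_t trailing_dts_sketch(const StreamSketch *st, size_t n)
{
    int64_t min_dts = INT64_MAX;

    assert(st || !n);
    for (size_t i = 0; i < n; i++) {
        if (st[i].source_finished)
            continue;
        if (st[i].last_dts == SKETCH_NOPTS)
            return SKETCH_NOPTS;
        if (st[i].last_dts < min_dts)
            min_dts = st[i].last_dts;
    }
    return min_dts == INT64_MAX ? SKETCH_NOPTS : min_dts;
}

/* Largest known DTS; streams without a DTS are ignored rather than fatal. */
static int64_t progressing_dts_sketch(const StreamSketch *st, size_t n,
                                      int count_finished)
{
    int64_t max_dts = INT64_MIN;

    assert(st || !n);
    for (size_t i = 0; i < n; i++) {
        if (st[i].source_finished && !count_finished)
            continue;
        if (st[i].last_dts != SKETCH_NOPTS && st[i].last_dts > max_dts)
            max_dts = st[i].last_dts;
    }
    return max_dts == INT64_MIN ? SKETCH_NOPTS : max_dts;
}

/* Two running streams and one finished stream that got further. */
static const StreamSketch demo_streams[] = {
    { 10, 0 }, { 30, 0 }, { 50, 1 },
};

/* Same, but one running stream has not produced a DTS yet. */
static const StreamSketch demo_streams_unknown[] = {
    { SKETCH_NOPTS, 0 }, { 30, 0 }, { 50, 1 },
};
```

On `demo_streams` the trailing value is 10, while the progressing value is 30 without finished streams and 50 with them. On `demo_streams_unknown` the trailing value is unknown but the progressing value is still 30.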
+18 -9
@@ -3007,6 +3007,7 @@ static int read_thread(void *arg)
// initial metadata as update.
st->event_flags &= ~AVSTREAM_EVENT_FLAG_METADATA_UPDATED;
}
ic->event_flags &= ~AVFMT_EVENT_FLAG_METADATA_UPDATED;
for (i = 0; i < AVMEDIA_TYPE_NB; i++) {
if (wanted_stream_spec[i] && st_index[i] == -1) {
av_log(NULL, AV_LOG_ERROR, "Stream specifier %s does not match any %s stream\n", wanted_stream_spec[i], av_get_media_type_string(i));
@@ -3175,16 +3176,24 @@ static int read_thread(void *arg)
is->eof = 0;
}
if (show_status && ic->streams[pkt->stream_index]->event_flags &
AVSTREAM_EVENT_FLAG_METADATA_UPDATED) {
fprintf(stderr, "\x1b[2K\r");
snprintf(metadata_description,
sizeof(metadata_description),
"\r New metadata for stream %d",
pkt->stream_index);
dump_dictionary(NULL, ic->streams[pkt->stream_index]->metadata,
metadata_description, " ", AV_LOG_INFO);
if (show_status) {
if (ic->event_flags & AVFMT_EVENT_FLAG_METADATA_UPDATED) {
fprintf(stderr, "\x1b[2K\r");
dump_dictionary(NULL, ic->metadata,
"\r New metadata", " ", AV_LOG_INFO);
}
if (ic->streams[pkt->stream_index]->event_flags &
AVSTREAM_EVENT_FLAG_METADATA_UPDATED) {
fprintf(stderr, "\x1b[2K\r");
snprintf(metadata_description,
sizeof(metadata_description),
"\r New metadata for stream %d",
pkt->stream_index);
dump_dictionary(NULL, ic->streams[pkt->stream_index]->metadata,
metadata_description, " ", AV_LOG_INFO);
}
}
ic->event_flags &= ~AVFMT_EVENT_FLAG_METADATA_UPDATED;
ic->streams[pkt->stream_index]->event_flags &= ~AVSTREAM_EVENT_FLAG_METADATA_UPDATED;
/* check if packet is in play range specified by user, then queue, otherwise discard */
+21
@@ -43,6 +43,7 @@
#include "libavutil/bprint.h"
#include "libavutil/mem.h"
#include "libavutil/internal.h"
#endif
@@ -115,14 +116,22 @@ static void hwctx_lock_queue(void *priv, uint32_t qf, uint32_t qidx)
{
AVHWDeviceContext *avhwctx = priv;
const AVVulkanDeviceContext *hwctx = avhwctx->hwctx;
#if FF_API_VULKAN_SYNC_QUEUES
FF_DISABLE_DEPRECATION_WARNINGS
hwctx->lock_queue(avhwctx, qf, qidx);
FF_ENABLE_DEPRECATION_WARNINGS
#endif
}
static void hwctx_unlock_queue(void *priv, uint32_t qf, uint32_t qidx)
{
AVHWDeviceContext *avhwctx = priv;
const AVVulkanDeviceContext *hwctx = avhwctx->hwctx;
#if FF_API_VULKAN_SYNC_QUEUES
FF_DISABLE_DEPRECATION_WARNINGS
hwctx->unlock_queue(avhwctx, qf, qidx);
FF_ENABLE_DEPRECATION_WARNINGS
#endif
}
static int add_instance_extension(const char **ext, unsigned num_ext,
@@ -283,7 +292,11 @@ static void placebo_lock_queue(struct AVHWDeviceContext *dev_ctx,
{
RendererContext *ctx = dev_ctx->user_opaque;
pl_vulkan vk = ctx->placebo_vulkan;
#if FF_API_VULKAN_SYNC_QUEUES
FF_DISABLE_DEPRECATION_WARNINGS
vk->lock_queue(vk, queue_family, index);
FF_ENABLE_DEPRECATION_WARNINGS
#endif
}
static void placebo_unlock_queue(struct AVHWDeviceContext *dev_ctx,
@@ -292,7 +305,11 @@ static void placebo_unlock_queue(struct AVHWDeviceContext *dev_ctx,
{
RendererContext *ctx = dev_ctx->user_opaque;
pl_vulkan vk = ctx->placebo_vulkan;
#if FF_API_VULKAN_SYNC_QUEUES
FF_DISABLE_DEPRECATION_WARNINGS
vk->unlock_queue(vk, queue_family, index);
FF_ENABLE_DEPRECATION_WARNINGS
#endif
}
static int get_decode_queue(VkRenderer *renderer, int *index, int *count)
@@ -386,8 +403,12 @@ static int create_vk_by_placebo(VkRenderer *renderer,
device_ctx->user_opaque = ctx;
vk_dev_ctx = device_ctx->hwctx;
#if FF_API_VULKAN_SYNC_QUEUES
FF_DISABLE_DEPRECATION_WARNINGS
vk_dev_ctx->lock_queue = placebo_lock_queue;
vk_dev_ctx->unlock_queue = placebo_unlock_queue;
FF_ENABLE_DEPRECATION_WARNINGS
#endif
vk_dev_ctx->get_proc_addr = ctx->placebo_instance->get_proc_addr;
+154 -49
@@ -215,6 +215,8 @@ typedef enum {
SECTION_ID_STREAM_GROUP_SUBPIECE,
SECTION_ID_STREAM_GROUP_BLOCKS,
SECTION_ID_STREAM_GROUP_BLOCK,
SECTION_ID_STREAM_GROUP_SIDE_DATA_LIST,
SECTION_ID_STREAM_GROUP_SIDE_DATA,
SECTION_ID_STREAM_GROUP_STREAMS,
SECTION_ID_STREAM_GROUP_STREAM,
SECTION_ID_STREAM_GROUP_DISPOSITION,
@@ -298,7 +300,7 @@ static const AVTextFormatSection sections[] = {
[SECTION_ID_STREAM_GROUP_STREAM_TAGS] = { SECTION_ID_STREAM_GROUP_STREAM_TAGS, "tags", AV_TEXTFORMAT_SECTION_FLAG_HAS_VARIABLE_FIELDS, { -1 }, .element_name = "tag", .unique_name = "stream_group_stream_tags" },
[SECTION_ID_STREAM_GROUP] = { SECTION_ID_STREAM_GROUP, "stream_group", 0, { SECTION_ID_STREAM_GROUP_TAGS, SECTION_ID_STREAM_GROUP_DISPOSITION, SECTION_ID_STREAM_GROUP_COMPONENTS, SECTION_ID_STREAM_GROUP_STREAMS, -1 } },
[SECTION_ID_STREAM_GROUP_COMPONENTS] = { SECTION_ID_STREAM_GROUP_COMPONENTS, "components", AV_TEXTFORMAT_SECTION_FLAG_IS_ARRAY, { SECTION_ID_STREAM_GROUP_COMPONENT, -1 }, .element_name = "component", .unique_name = "stream_group_components" },
[SECTION_ID_STREAM_GROUP_COMPONENT] = { SECTION_ID_STREAM_GROUP_COMPONENT, "component", AV_TEXTFORMAT_SECTION_FLAG_HAS_VARIABLE_FIELDS|AV_TEXTFORMAT_SECTION_FLAG_HAS_TYPE, { SECTION_ID_STREAM_GROUP_SUBCOMPONENTS, -1 }, .unique_name = "stream_group_component", .element_name = "component_entry", .get_type = get_stream_group_type },
[SECTION_ID_STREAM_GROUP_COMPONENT] = { SECTION_ID_STREAM_GROUP_COMPONENT, "component", AV_TEXTFORMAT_SECTION_FLAG_HAS_VARIABLE_FIELDS|AV_TEXTFORMAT_SECTION_FLAG_HAS_TYPE, { SECTION_ID_STREAM_GROUP_SIDE_DATA_LIST, SECTION_ID_STREAM_GROUP_SUBCOMPONENTS, -1 }, .unique_name = "stream_group_component", .element_name = "component_entry", .get_type = get_stream_group_type },
[SECTION_ID_STREAM_GROUP_SUBCOMPONENTS] = { SECTION_ID_STREAM_GROUP_SUBCOMPONENTS, "subcomponents", AV_TEXTFORMAT_SECTION_FLAG_IS_ARRAY, { SECTION_ID_STREAM_GROUP_SUBCOMPONENT, -1 }, .element_name = "component" },
[SECTION_ID_STREAM_GROUP_SUBCOMPONENT] = { SECTION_ID_STREAM_GROUP_SUBCOMPONENT, "subcomponent", AV_TEXTFORMAT_SECTION_FLAG_HAS_VARIABLE_FIELDS|AV_TEXTFORMAT_SECTION_FLAG_HAS_TYPE, { SECTION_ID_STREAM_GROUP_PIECES, -1 }, .element_name = "subcomponent_entry", .get_type = get_raw_string_type },
[SECTION_ID_STREAM_GROUP_PIECES] = { SECTION_ID_STREAM_GROUP_PIECES, "pieces", AV_TEXTFORMAT_SECTION_FLAG_IS_ARRAY, { SECTION_ID_STREAM_GROUP_PIECE, -1 }, .element_name = "piece", .unique_name = "stream_group_pieces" },
@@ -307,6 +309,8 @@ static const AVTextFormatSection sections[] = {
[SECTION_ID_STREAM_GROUP_SUBPIECE] = { SECTION_ID_STREAM_GROUP_SUBPIECE, "subpiece", AV_TEXTFORMAT_SECTION_FLAG_HAS_VARIABLE_FIELDS|AV_TEXTFORMAT_SECTION_FLAG_HAS_TYPE, { SECTION_ID_STREAM_GROUP_BLOCKS, -1 }, .element_name = "subpiece_entry", .get_type = get_raw_string_type },
[SECTION_ID_STREAM_GROUP_BLOCKS] = { SECTION_ID_STREAM_GROUP_BLOCKS, "blocks", AV_TEXTFORMAT_SECTION_FLAG_IS_ARRAY, { SECTION_ID_STREAM_GROUP_BLOCK, -1 }, .element_name = "block" },
[SECTION_ID_STREAM_GROUP_BLOCK] = { SECTION_ID_STREAM_GROUP_BLOCK, "block", AV_TEXTFORMAT_SECTION_FLAG_HAS_VARIABLE_FIELDS|AV_TEXTFORMAT_SECTION_FLAG_HAS_TYPE, { -1 }, .element_name = "block_entry", .get_type = get_raw_string_type },
[SECTION_ID_STREAM_GROUP_SIDE_DATA_LIST] = { SECTION_ID_STREAM_GROUP_SIDE_DATA_LIST, "side_data_list", AV_TEXTFORMAT_SECTION_FLAG_IS_ARRAY, { SECTION_ID_STREAM_GROUP_SIDE_DATA, -1 }, .element_name = "side_data", .unique_name = "stream_group_side_data_list" },
[SECTION_ID_STREAM_GROUP_SIDE_DATA] = { SECTION_ID_STREAM_GROUP_SIDE_DATA, "side_data", AV_TEXTFORMAT_SECTION_FLAG_HAS_TYPE|AV_TEXTFORMAT_SECTION_FLAG_HAS_VARIABLE_FIELDS, { -1 }, .unique_name = "stream_group_side_data", .element_name = "side_datum", .get_type = get_packet_side_data_type },
[SECTION_ID_STREAM_GROUP_STREAMS] = { SECTION_ID_STREAM_GROUP_STREAMS, "streams", AV_TEXTFORMAT_SECTION_FLAG_IS_ARRAY, { SECTION_ID_STREAM_GROUP_STREAM, -1 }, .unique_name = "stream_group_streams" },
[SECTION_ID_STREAM_GROUP_STREAM] = { SECTION_ID_STREAM_GROUP_STREAM, "stream", 0, { SECTION_ID_STREAM_GROUP_STREAM_DISPOSITION, SECTION_ID_STREAM_GROUP_STREAM_TAGS, -1 }, .unique_name = "stream_group_stream" },
[SECTION_ID_STREAM_GROUP_DISPOSITION] = { SECTION_ID_STREAM_GROUP_DISPOSITION, "disposition", 0, { -1 }, .unique_name = "stream_group_disposition" },
@@ -345,7 +349,7 @@ static const char unit_hertz_str[] = "Hz" ;
static const char unit_byte_str[] = "byte" ;
static const char unit_bit_per_second_str[] = "bit/s";
static int nb_streams;
static unsigned int nb_streams;
static uint64_t *nb_streams_packets;
static uint64_t *nb_streams_frames;
static int *selected_streams;
@@ -432,8 +436,8 @@ static void log_callback(void *ptr, int level, const char *fmt, va_list vl)
#define print_list_fmt(k, f, n, m, ...) do { \
av_bprint_clear(&pbuf); \
for (int idx = 0; idx < n; idx++) { \
for (int idx2 = 0; idx2 < m; idx2++) { \
for (unsigned int idx = 0; idx < n; idx++) { \
for (unsigned int idx2 = 0; idx2 < m; idx2++) { \
if (idx > 0 || idx2 > 0) \
av_bprint_chars(&pbuf, ' ', 1); \
av_bprintf(&pbuf, f, __VA_ARGS__); \
@@ -801,6 +805,62 @@ static void print_dynamic_hdr10_plus(AVTextFormatContext *tfc, const AVDynamicHD
}
}
static void print_dynamic_hdr_smpte2094_app5(AVTextFormatContext *tfc, const AVDynamicHDRSmpte2094App5 *metadata)
{
if (!metadata)
return;
print_int("application_version", metadata->application_version);
print_int("minimum_application_version", metadata->minimum_application_version);
print_int("has_custom_hdr_reference_white_flag", metadata->has_custom_hdr_reference_white_flag);
print_int("has_adaptive_tone_map_flag", metadata->has_adaptive_tone_map_flag);
if (metadata->has_custom_hdr_reference_white_flag)
print_int("hdr_reference_white", metadata->hdr_reference_white);
if (!metadata->has_adaptive_tone_map_flag)
return;
print_int("baseline_hdr_headroom", metadata->baseline_hdr_headroom);
print_int("use_reference_white_tone_mapping_flag", metadata->use_reference_white_tone_mapping_flag);
if (metadata->use_reference_white_tone_mapping_flag)
return;
print_int("num_alternate_images", metadata->num_alternate_images);
print_int("gain_application_space_chromaticities_flag", metadata->gain_application_space_chromaticities_flag);
print_int("has_common_component_mix_params_flag", metadata->has_common_component_mix_params_flag);
print_int("has_common_curve_params_flag", metadata->has_common_curve_params_flag);
if (metadata->gain_application_space_chromaticities_flag == 3) {
for (int i = 0; i < 8; i++)
print_int("gain_application_space_chromaticities", metadata->gain_application_space_chromaticities[i]);
}
for (int a = 0; a < metadata->num_alternate_images; a++) {
print_int("alternate_hdr_headroom", metadata->alternate_hdr_headrooms[a]);
print_int("component_mixing_type", metadata->component_mixing_type[a]);
if (metadata->component_mixing_type[a] == 3) {
for (int k = 0; k < 6; k++) {
print_int("has_component_mixing_coefficient_flag", metadata->has_component_mixing_coefficient_flag[a][k]);
if (metadata->has_component_mixing_coefficient_flag[a][k])
print_int("component_mixing_coefficient", metadata->component_mixing_coefficient[a][k]);
}
}
print_int("gain_curve_num_control_points_minus_1", metadata->gain_curve_num_control_points_minus_1[a]);
print_int("gain_curve_use_pchip_slope_flag", metadata->gain_curve_use_pchip_slope_flag[a]);
for (int c = 0; c <= metadata->gain_curve_num_control_points_minus_1[a]; c++)
print_int("gain_curve_control_point_x", metadata->gain_curve_control_points_x[a][c]);
for (int c = 0; c <= metadata->gain_curve_num_control_points_minus_1[a]; c++)
print_int("gain_curve_control_point_y", metadata->gain_curve_control_points_y[a][c]);
if (!metadata->gain_curve_use_pchip_slope_flag[a]) {
for (int c = 0; c <= metadata->gain_curve_num_control_points_minus_1[a]; c++)
print_int("gain_curve_control_point_theta", metadata->gain_curve_control_points_theta[a][c]);
}
}
}
static void print_dynamic_hdr_vivid(AVTextFormatContext *tfc, const AVDynamicHDRVivid *metadata)
{
if (!metadata)
@@ -942,7 +1002,7 @@ static void print_film_grain_params(AVTextFormatContext *tfc,
avtext_print_section_footer(tfc);
}
for (int uv = 0; uv < 2; uv++) {
for (unsigned uv = 0; uv < 2; uv++) {
if (!aom->num_uv_points[uv] && !aom->chroma_scaling_from_luma)
continue;
@@ -1008,7 +1068,8 @@ static void print_film_grain_params(AVTextFormatContext *tfc,
}
static void print_pkt_side_data(AVTextFormatContext *tfc,
AVCodecParameters *par,
int width,
int height,
const AVPacketSideData *sd,
SectionID id_data)
{
@@ -1034,7 +1095,7 @@ static void print_pkt_side_data(AVTextFormatContext *tfc,
print_int("padding", spherical->padding);
} else if (spherical->projection == AV_SPHERICAL_EQUIRECTANGULAR_TILE) {
size_t l, t, r, b;
av_spherical_tile_bounds(spherical, par->width, par->height,
av_spherical_tile_bounds(spherical, width, height,
&l, &t, &r, &b);
print_int("bound_left", l);
print_int("bound_top", t);
@@ -1305,7 +1366,7 @@ static void show_packet(AVTextFormatContext *tfc, InputFile *ifile, AVPacket *pk
avtext_print_section_header(tfc, NULL, SECTION_ID_PACKET_SIDE_DATA_LIST);
for (int i = 0; i < pkt->side_data_elems; i++) {
print_pkt_side_data(tfc, st->codecpar, &pkt->side_data[i],
print_pkt_side_data(tfc, st->codecpar->width, st->codecpar->height, &pkt->side_data[i],
SECTION_ID_PACKET_SIDE_DATA);
avtext_print_section_footer(tfc);
}
@@ -1341,6 +1402,9 @@ static void show_subtitle(AVTextFormatContext *tfc, AVSubtitle *sub, AVStream *s
fflush(stdout);
}
static void print_iamf_param_definition(AVTextFormatContext *tfc, const char *name,
const AVIAMFParamDefinition *param, SectionID section_id);
static void print_frame_side_data(AVTextFormatContext *tfc,
const AVFrame *frame,
const AVStream *stream)
@@ -1379,6 +1443,9 @@ static void print_frame_side_data(AVTextFormatContext *tfc,
} else if (sd->type == AV_FRAME_DATA_DYNAMIC_HDR_PLUS) {
AVDynamicHDRPlus *metadata = (AVDynamicHDRPlus *)sd->data;
print_dynamic_hdr10_plus(tfc, metadata);
} else if (sd->type == AV_FRAME_DATA_DYNAMIC_HDR_SMPTE_2094_APP5) {
AVDynamicHDRSmpte2094App5 *metadata = (AVDynamicHDRSmpte2094App5 *)sd->data;
print_dynamic_hdr_smpte2094_app5(tfc, metadata);
} else if (sd->type == AV_FRAME_DATA_CONTENT_LIGHT_LEVEL) {
print_context_light_level(tfc, (AVContentLightMetadata *)sd->data);
} else if (sd->type == AV_FRAME_DATA_ICC_PROFILE) {
@@ -1400,6 +1467,11 @@ static void print_frame_side_data(AVTextFormatContext *tfc,
print_int("view_id", *(int*)sd->data);
} else if (sd->type == AV_FRAME_DATA_EXIF) {
print_int("size", sd->size);
} else if (sd->type == AV_FRAME_DATA_IAMF_MIX_GAIN_PARAM ||
sd->type == AV_FRAME_DATA_IAMF_DEMIXING_INFO_PARAM ||
sd->type == AV_FRAME_DATA_IAMF_RECON_GAIN_INFO_PARAM) {
const AVIAMFParamDefinition *param = (AVIAMFParamDefinition *)sd->data;
print_iamf_param_definition(tfc, NULL, param, SECTION_ID_FRAME_SIDE_DATA);
}
avtext_print_section_footer(tfc);
}
@@ -1696,12 +1768,10 @@ static int read_interval_packets(AVTextFormatContext *tfc, InputFile *ifile,
}
av_packet_unref(pkt);
//Flush remaining frames that are cached in the decoder
for (i = 0; i < ifile->nb_streams; i++) {
for (int i = 0; i < ifile->nb_streams; i++) {
pkt->stream_index = i;
if (do_read_frames) {
while (process_frame(tfc, ifile, frame, pkt, &(int){1}) > 0);
if (ifile->streams[i].dec_ctx)
avcodec_flush_buffers(ifile->streams[i].dec_ctx);
}
}
@@ -1715,17 +1785,33 @@ end:
return ret;
}
static void flush_buffers(InputFile *ifile)
{
int i;
if (!do_read_frames)
return;
for (i = 0; i < ifile->nb_streams; i++) {
if (ifile->streams[i].dec_ctx)
avcodec_flush_buffers(ifile->streams[i].dec_ctx);
}
}
static int read_packets(AVTextFormatContext *tfc, InputFile *ifile)
{
AVFormatContext *fmt_ctx = ifile->fmt_ctx;
int i, ret = 0;
int ret = 0;
int64_t cur_ts = fmt_ctx->start_time;
if (read_intervals_nb == 0) {
ReadInterval interval = (ReadInterval) { .has_start = 0, .has_end = 0 };
ret = read_interval_packets(tfc, ifile, &interval, &cur_ts);
} else {
for (i = 0; i < read_intervals_nb; i++) {
for (int i = 0; i < read_intervals_nb; i++) {
/* flushing buffers can reset parts of the private context which may be
* read by show_streams(), so only flush between each read_interval */
if (i)
flush_buffers(ifile);
ret = read_interval_packets(tfc, ifile, &read_intervals[i], &cur_ts);
if (ret < 0)
break;
@@ -1738,7 +1824,7 @@ static int read_packets(AVTextFormatContext *tfc, InputFile *ifile)
static void print_dispositions(AVTextFormatContext *tfc, uint32_t disposition, SectionID section_id)
{
avtext_print_section_header(tfc, NULL, section_id);
for (int i = 0; i < sizeof(disposition) * CHAR_BIT; i++) {
for (unsigned i = 0; i < sizeof(disposition) * CHAR_BIT; i++) {
const char *disposition_str = av_disposition_to_string(1U << i);
if (disposition_str)
@@ -1961,7 +2047,7 @@ static int show_stream(AVTextFormatContext *tfc, AVFormatContext *fmt_ctx, int s
if (stream->codecpar->nb_coded_side_data) {
avtext_print_section_header(tfc, NULL, SECTION_ID_STREAM_SIDE_DATA_LIST);
for (int i = 0; i < stream->codecpar->nb_coded_side_data; i++) {
print_pkt_side_data(tfc, stream->codecpar, &stream->codecpar->coded_side_data[i],
print_pkt_side_data(tfc, stream->codecpar->width, stream->codecpar->height, &stream->codecpar->coded_side_data[i],
SECTION_ID_STREAM_SIDE_DATA);
avtext_print_section_footer(tfc);
}
@@ -1978,10 +2064,10 @@ static int show_stream(AVTextFormatContext *tfc, AVFormatContext *fmt_ctx, int s
static int show_streams(AVTextFormatContext *tfc, InputFile *ifile)
{
AVFormatContext *fmt_ctx = ifile->fmt_ctx;
int i, ret = 0;
int ret = 0;
avtext_print_section_header(tfc, NULL, SECTION_ID_STREAMS);
for (i = 0; i < ifile->nb_streams; i++)
for (int i = 0; i < ifile->nb_streams; i++)
if (selected_streams[i]) {
ret = show_stream(tfc, fmt_ctx, i, &ifile->streams[i], 0);
if (ret < 0)
@@ -1995,7 +2081,7 @@ static int show_streams(AVTextFormatContext *tfc, InputFile *ifile)
static int show_program(AVTextFormatContext *tfc, InputFile *ifile, AVProgram *program)
{
AVFormatContext *fmt_ctx = ifile->fmt_ctx;
int i, ret = 0;
int ret = 0;
avtext_print_section_header(tfc, NULL, SECTION_ID_PROGRAM);
print_int("program_id", program->id);
@@ -2009,7 +2095,7 @@ static int show_program(AVTextFormatContext *tfc, InputFile *ifile, AVProgram *p
goto end;
avtext_print_section_header(tfc, NULL, SECTION_ID_PROGRAM_STREAMS);
for (i = 0; i < program->nb_stream_indexes; i++) {
for (unsigned i = 0; i < program->nb_stream_indexes; i++) {
if (selected_streams[program->stream_index[i]]) {
ret = show_stream(tfc, fmt_ctx, program->stream_index[i], &ifile->streams[program->stream_index[i]], IN_PROGRAM);
if (ret < 0)
@@ -2026,10 +2112,10 @@ end:
static int show_programs(AVTextFormatContext *tfc, InputFile *ifile)
{
AVFormatContext *fmt_ctx = ifile->fmt_ctx;
int i, ret = 0;
int ret = 0;
avtext_print_section_header(tfc, NULL, SECTION_ID_PROGRAMS);
for (i = 0; i < fmt_ctx->nb_programs; i++) {
for (unsigned i = 0; i < fmt_ctx->nb_programs; i++) {
AVProgram *program = fmt_ctx->programs[i];
if (!program)
continue;
@@ -2053,7 +2139,7 @@ static void print_tile_grid_params(AVTextFormatContext *tfc, const AVStreamGroup
print_int("width", tile_grid->width);
print_int("height", tile_grid->height);
avtext_print_section_header(tfc, NULL, SECTION_ID_STREAM_GROUP_SUBCOMPONENTS);
for (int i = 0; i < tile_grid->nb_tiles; i++) {
for (unsigned i = 0; i < tile_grid->nb_tiles; i++) {
avtext_print_section_header(tfc, "tile_offset", SECTION_ID_STREAM_GROUP_SUBCOMPONENT);
print_int("stream_index", tile_grid->offsets[i].idx);
print_int("tile_horizontal_offset", tile_grid->offsets[i].horizontal);
@@ -2061,6 +2147,15 @@ static void print_tile_grid_params(AVTextFormatContext *tfc, const AVStreamGroup
avtext_print_section_footer(tfc);
}
avtext_print_section_footer(tfc);
if (tile_grid->nb_coded_side_data) {
avtext_print_section_header(tfc, NULL, SECTION_ID_STREAM_GROUP_SIDE_DATA_LIST);
for (int i = 0; i < tile_grid->nb_coded_side_data; i++) {
print_pkt_side_data(tfc, tile_grid->width, tile_grid->height, &tile_grid->coded_side_data[i],
SECTION_ID_STREAM_GROUP_SIDE_DATA);
avtext_print_section_footer(tfc);
}
avtext_print_section_footer(tfc);
}
avtext_print_section_footer(tfc);
}
@@ -2068,12 +2163,21 @@ static void print_iamf_param_definition(AVTextFormatContext *tfc, const char *na
const AVIAMFParamDefinition *param, SectionID section_id)
{
SectionID subsection_id, parameter_section_id;
subsection_id = sections[section_id].children_ids[0];
av_assert0(subsection_id != -1);
if (section_id == SECTION_ID_FRAME_SIDE_DATA)
subsection_id = SECTION_ID_FRAME_SIDE_DATA_COMPONENT_LIST;
else {
av_assert0(sections[section_id].children_ids[0] != -1);
subsection_id = sections[section_id].children_ids[0];
}
av_assert0(sections[subsection_id].children_ids[0] != -1);
parameter_section_id = sections[subsection_id].children_ids[0];
av_assert0(parameter_section_id != -1);
avtext_print_section_header(tfc, "IAMF Param Definition", section_id);
print_str("name", name);
// When printing as part of side-data, skip opening a section
if (section_id != SECTION_ID_FRAME_SIDE_DATA)
avtext_print_section_header(tfc, "IAMF Param Definition", section_id);
if (name)
print_str("name", name);
print_int("nb_subblocks", param->nb_subblocks);
print_int("type", param->type);
print_int("parameter_id", param->parameter_id);
@@ -2082,7 +2186,7 @@ static void print_iamf_param_definition(AVTextFormatContext *tfc, const char *na
print_int("constant_subblock_duration", param->constant_subblock_duration);
if (param->nb_subblocks > 0)
avtext_print_section_header(tfc, NULL, subsection_id);
for (int i = 0; i < param->nb_subblocks; i++) {
for (unsigned i = 0; i < param->nb_subblocks; i++) {
const void *subblock = av_iamf_param_definition_get_subblock(param, i);
switch(param->type) {
case AV_IAMF_PARAMETER_DEFINITION_MIX_GAIN: {
@@ -2116,7 +2220,9 @@ static void print_iamf_param_definition(AVTextFormatContext *tfc, const char *na
}
if (param->nb_subblocks > 0)
avtext_print_section_footer(tfc); // subsection_id
avtext_print_section_footer(tfc); // section_id
if (section_id != SECTION_ID_FRAME_SIDE_DATA)
avtext_print_section_footer(tfc); // section_id
}
static void print_iamf_audio_element_params(AVTextFormatContext *tfc, const AVStreamGroup *stg,
@@ -2131,7 +2237,7 @@ static void print_iamf_audio_element_params(AVTextFormatContext *tfc, const AVSt
print_int("audio_element_type", audio_element->audio_element_type);
print_int("default_w", audio_element->default_w);
avtext_print_section_header(tfc, NULL, SECTION_ID_STREAM_GROUP_SUBCOMPONENTS);
for (int i = 0; i < audio_element->nb_layers; i++) {
for (unsigned i = 0; i < audio_element->nb_layers; i++) {
const AVIAMFLayer *layer = audio_element->layers[i];
char val_str[128];
avtext_print_section_header(tfc, "IAMF Audio Layer", SECTION_ID_STREAM_GROUP_SUBCOMPONENT);
@@ -2167,7 +2273,7 @@ static void print_iamf_submix_params(AVTextFormatContext *tfc, const AVIAMFSubmi
print_int("nb_layouts", submix->nb_layouts);
print_q("default_mix_gain", submix->default_mix_gain, '/');
avtext_print_section_header(tfc, NULL, SECTION_ID_STREAM_GROUP_PIECES);
for (int i = 0; i < submix->nb_elements; i++) {
for (unsigned i = 0; i < submix->nb_elements; i++) {
const AVIAMFSubmixElement *element = submix->elements[i];
avtext_print_section_header(tfc, "IAMF Submix Element", SECTION_ID_STREAM_GROUP_PIECE);
print_int("stream_id", element->audio_element_id);
@@ -2190,7 +2296,7 @@ static void print_iamf_submix_params(AVTextFormatContext *tfc, const AVIAMFSubmi
if (submix->output_mix_config)
print_iamf_param_definition(tfc, "output_mix_config", submix->output_mix_config,
SECTION_ID_STREAM_GROUP_PIECE);
for (int i = 0; i < submix->nb_layouts; i++) {
for (unsigned i = 0; i < submix->nb_layouts; i++) {
const AVIAMFSubmixLayout *layout = submix->layouts[i];
char val_str[128];
avtext_print_section_header(tfc, "IAMF Submix Layout", SECTION_ID_STREAM_GROUP_PIECE);
@@ -2220,7 +2326,7 @@ static void print_iamf_mix_presentation_params(AVTextFormatContext *tfc, const A
print_str(annotation->key, annotation->value);
avtext_print_section_footer(tfc); // SECTION_ID_STREAM_GROUP_SUBCOMPONENT
}
for (int i = 0; i < mix_presentation->nb_submixes; i++)
for (unsigned i = 0; i < mix_presentation->nb_submixes; i++)
print_iamf_submix_params(tfc, mix_presentation->submixes[i]);
avtext_print_section_footer(tfc); // SECTION_ID_STREAM_GROUP_SUBCOMPONENTS
avtext_print_section_footer(tfc); // SECTION_ID_STREAM_GROUP_COMPONENT
@@ -2242,7 +2348,7 @@ static int show_stream_group(AVTextFormatContext *tfc, InputFile *ifile, AVStrea
{
AVFormatContext *fmt_ctx = ifile->fmt_ctx;
AVBPrint pbuf;
int i, ret = 0;
int ret = 0;
av_bprint_init(&pbuf, 1, AV_BPRINT_SIZE_UNLIMITED);
avtext_print_section_header(tfc, NULL, SECTION_ID_STREAM_GROUP);
@@ -2267,7 +2373,7 @@ static int show_stream_group(AVTextFormatContext *tfc, InputFile *ifile, AVStrea
goto end;
avtext_print_section_header(tfc, NULL, SECTION_ID_STREAM_GROUP_STREAMS);
for (i = 0; i < stg->nb_streams; i++) {
for (unsigned i = 0; i < stg->nb_streams; i++) {
if (selected_streams[stg->streams[i]->index]) {
ret = show_stream(tfc, fmt_ctx, stg->streams[i]->index, &ifile->streams[stg->streams[i]->index], IN_STREAM_GROUP);
if (ret < 0)
@@ -2285,10 +2391,10 @@ end:
static int show_stream_groups(AVTextFormatContext *tfc, InputFile *ifile)
{
AVFormatContext *fmt_ctx = ifile->fmt_ctx;
int i, ret = 0;
int ret = 0;
avtext_print_section_header(tfc, NULL, SECTION_ID_STREAM_GROUPS);
for (i = 0; i < fmt_ctx->nb_stream_groups; i++) {
for (unsigned i = 0; i < fmt_ctx->nb_stream_groups; i++) {
AVStreamGroup *stg = fmt_ctx->stream_groups[i];
ret = show_stream_group(tfc, ifile, stg);
@@ -2302,10 +2408,10 @@ static int show_stream_groups(AVTextFormatContext *tfc, InputFile *ifile)
static int show_chapters(AVTextFormatContext *tfc, InputFile *ifile)
{
AVFormatContext *fmt_ctx = ifile->fmt_ctx;
int i, ret = 0;
int ret = 0;
avtext_print_section_header(tfc, NULL, SECTION_ID_CHAPTERS);
for (i = 0; i < fmt_ctx->nb_chapters; i++) {
for (unsigned i = 0; i < fmt_ctx->nb_chapters; i++) {
AVChapter *chapter = fmt_ctx->chapters[i];
avtext_print_section_header(tfc, NULL, SECTION_ID_CHAPTER);
@@ -2425,7 +2531,7 @@ static const AVCodec *get_decoder_for_stream(AVFormatContext *fmt_ctx, AVStream
static int open_input_file(InputFile *ifile, const char *filename,
const char *print_filename)
{
int err, i;
int err;
AVFormatContext *fmt_ctx = NULL;
const AVDictionaryEntry *t = NULL;
int scan_all_pmts_set = 0;
@@ -2466,7 +2572,7 @@ static int open_input_file(InputFile *ifile, const char *filename,
err = avformat_find_stream_info(fmt_ctx, opts);
for (i = 0; i < orig_nb_streams; i++)
for (int i = 0; i < orig_nb_streams; i++)
av_dict_free(&opts[i]);
av_freep(&opts);
@@ -2484,7 +2590,7 @@ static int open_input_file(InputFile *ifile, const char *filename,
ifile->nb_streams = fmt_ctx->nb_streams;
/* bind a decoder to each input stream */
for (i = 0; i < fmt_ctx->nb_streams; i++) {
for (unsigned i = 0; i < fmt_ctx->nb_streams; i++) {
InputStream *ist = &ifile->streams[i];
AVStream *stream = fmt_ctx->streams[i];
const AVCodec *codec;
@@ -2542,10 +2648,9 @@ static int open_input_file(InputFile *ifile, const char *filename,
static void close_input_file(InputFile *ifile)
{
int i;
/* close decoder for each stream */
for (i = 0; i < ifile->nb_streams; i++)
for (int i = 0; i < ifile->nb_streams; i++)
avcodec_free_context(&ifile->streams[i].dec_ctx);
av_freep(&ifile->streams);
@@ -2558,7 +2663,7 @@ static int probe_file(AVTextFormatContext *tfc, const char *filename,
const char *print_filename)
{
InputFile ifile = { 0 };
int ret, i;
int ret;
int section_id;
do_analyze_frames = do_analyze_frames && do_show_streams;
@@ -2578,7 +2683,7 @@ static int probe_file(AVTextFormatContext *tfc, const char *filename,
REALLOCZ_ARRAY_STREAM(streams_with_closed_captions,0,ifile.fmt_ctx->nb_streams);
REALLOCZ_ARRAY_STREAM(streams_with_film_grain,0,ifile.fmt_ctx->nb_streams);
for (i = 0; i < ifile.fmt_ctx->nb_streams; i++) {
for (unsigned i = 0; i < ifile.fmt_ctx->nb_streams; i++) {
if (stream_specifier) {
ret = avformat_match_stream_specifier(ifile.fmt_ctx,
ifile.fmt_ctx->streams[i],
@@ -2796,9 +2901,9 @@ static inline void mark_section_show_entries(SectionID section_id,
static int match_section(const char *section_name,
int show_all_entries, AVDictionary *entries)
{
int i, ret = 0;
int ret = 0;
for (i = 0; i < FF_ARRAY_ELEMS(sections); i++) {
for (unsigned i = 0; i < FF_ARRAY_ELEMS(sections); i++) {
const struct AVTextFormatSection *section = &sections[i];
if (!strcmp(section_name, section->name) ||
(section->unique_name && !strcmp(section_name, section->unique_name))) {
@@ -3238,7 +3343,7 @@ int main(int argc, char **argv)
char *buf;
char *f_name = NULL, *f_args = NULL;
int ret, input_ret;
AVTextFormatDataDump data_dump_format_id;
AVTextFormatDataDump data_dump_format_id = AV_TEXTFORMAT_DATADUMP_XXD;
init_dynload();
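Note on the loop-counter changes in the hunks above: several counters move from `int` to `unsigned` where the loop bound is an unsigned count. The commit message for that change is not shown here, so the motivation is an assumption; a likely reason is that C converts the `int` operand to `unsigned` when the two are compared, which compilers flag as a sign-compare warning. A minimal sketch of that conversion, using hypothetical helper names that do not appear in FFmpeg:

```c
/* Hypothetical helpers, for illustration only. */

/* Mirrors what C's usual arithmetic conversions do for `i < n` when
 * i is int and n is unsigned int: i is converted to unsigned first,
 * so a negative i becomes a very large value. */
static int mixed_less_than(int i, unsigned int n)
{
    return (unsigned int)i < n;
}

/* Compares both values in a wider signed type, so a negative i
 * stays negative and compares as smaller than any n. */
static int widened_less_than(int i, unsigned int n)
{
    return (long long)i < (long long)n;
}
```

With a counter that starts at 0 and only increments, both forms agree; declaring the counter `unsigned` simply makes the comparison type explicit.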
+1
@@ -33,6 +33,7 @@
#include "libavutil/avassert.h"
#include "libavutil/avstring.h"
#include "libavutil/mem.h"
#include "libavutil/pixdesc.h"
#include "libavutil/dict.h"
#include "libavutil/common.h"
+14 -9
@@ -68,6 +68,11 @@ enum show_muxdemuxers {
SHOW_MUXERS,
};
enum show_codec {
SHOW_DECODER,
SHOW_ENCODER,
};
static FILE *report_file;
static int report_file_level = AV_LOG_DEBUG;
@@ -591,9 +596,9 @@ int show_help(void *optctx, const char *opt, const char *arg)
if (!*topic) {
show_help_default(topic, par);
} else if (!strcmp(topic, "decoder")) {
show_help_codec(par, 0);
show_help_codec(par, SHOW_DECODER);
} else if (!strcmp(topic, "encoder")) {
show_help_codec(par, 1);
show_help_codec(par, SHOW_ENCODER);
} else if (!strcmp(topic, "demuxer")) {
show_help_demuxer(par);
} else if (!strcmp(topic, "muxer")) {
@@ -708,16 +713,16 @@ int show_codecs(void *optctx, const char *opt, const char *arg)
/* print decoders/encoders when there's more than one or their
* names are different from codec name */
while ((codec = next_codec_for_id(desc->id, &iter, 0))) {
while ((codec = next_codec_for_id(desc->id, &iter, SHOW_DECODER))) {
if (strcmp(codec->name, desc->name)) {
print_codecs_for_id(desc->id, 0);
print_codecs_for_id(desc->id, SHOW_DECODER);
break;
}
}
iter = NULL;
while ((codec = next_codec_for_id(desc->id, &iter, 1))) {
while ((codec = next_codec_for_id(desc->id, &iter, SHOW_ENCODER))) {
if (strcmp(codec->name, desc->name)) {
print_codecs_for_id(desc->id, 1);
print_codecs_for_id(desc->id, SHOW_ENCODER);
break;
}
}
@@ -774,12 +779,12 @@ static int print_codecs(int encoder)
int show_decoders(void *optctx, const char *opt, const char *arg)
{
return print_codecs(0);
return print_codecs(SHOW_DECODER);
}
int show_encoders(void *optctx, const char *opt, const char *arg)
{
return print_codecs(1);
return print_codecs(SHOW_ENCODER);
}
int show_bsfs(void *optctx, const char *opt, const char *arg)
@@ -876,7 +881,7 @@ static int show_formats_devices(void *optctx, const char *opt, const char *arg,
const char *name = NULL;
const char *long_name = NULL;
if (muxdemuxers !=SHOW_DEMUXERS) {
if (muxdemuxers != SHOW_DEMUXERS) {
ofmt_opaque = NULL;
while ((ofmt = av_muxer_iterate(&ofmt_opaque))) {
is_dev = is_device(ofmt->priv_class);
+2 -1
@@ -26,6 +26,7 @@
#include "avtextformat.h"
#include "libavutil/attributes.h"
#include "libavutil/bprint.h"
#include "libavutil/opt.h"
#include "tf_internal.h"
@@ -74,7 +75,7 @@ static char *ini_escape_str(AVBPrint *dst, const char *src)
case '=':
case ':':
av_bprint_chars(dst, '\\', 1);
/* fallthrough */
av_fallthrough;
default:
if ((unsigned char)c < 32)
av_bprintf(dst, "\\x00%02x", (unsigned char)c);
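Note on the hunk above: it replaces a `/* fallthrough */` comment with the statement `av_fallthrough;`. A statement-level marker lets the compiler check intentional fall-through between `case` labels instead of relying on a comment. The sketch below is not FFmpeg's definition of that macro; `SKETCH_FALLTHROUGH` and `escape_char` are hypothetical names, and the escaping is reduced to the two characters visible in the hunk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a fall-through marker: use the GNU
 * statement attribute when the compiler advertises it, otherwise
 * expand to a no-op expression. */
#if defined(__has_attribute)
#  if __has_attribute(fallthrough)
#    define SKETCH_FALLTHROUGH __attribute__((fallthrough))
#  endif
#endif
#ifndef SKETCH_FALLTHROUGH
#  define SKETCH_FALLTHROUGH ((void)0)
#endif

/* Writes c into dst, prefixed with a backslash when c is '=' or ':'.
 * dst must have room for two characters plus the terminating NUL.
 * Returns the number of characters written, excluding the NUL. */
static size_t escape_char(char *dst, size_t size, char c)
{
    size_t n = 0;

    assert(size >= 3);
    switch (c) {
    case '=':
    case ':':
        dst[n++] = '\\';
        SKETCH_FALLTHROUGH;   /* intentional: also emit the character */
    default:
        dst[n++] = c;
    }
    dst[n] = '\0';
    return n;
}
```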
+4 -4
@@ -130,7 +130,7 @@ OBJS-$(CONFIG_IVIDSP) += ivi_dsp.o
OBJS-$(CONFIG_JNI) += ffjni.o jni.o
OBJS-$(CONFIG_JPEGTABLES) += jpegtables.o
OBJS-$(CONFIG_LCMS2) += fflcms2.o
OBJS-$(CONFIG_LIBLCEVC_DEC) += lcevcdec.o
OBJS-$(CONFIG_LIBLCEVC_DEC) += lcevcdec.o lcevctab.o
OBJS-$(CONFIG_LLAUDDSP) += lossless_audiodsp.o
OBJS-$(CONFIG_LLVIDDSP) += lossless_videodsp.o
OBJS-$(CONFIG_LLVIDENCDSP) += lossless_videoencdsp.o
@@ -626,6 +626,7 @@ OBJS-$(CONFIG_PBM_ENCODER) += pnmenc.o
OBJS-$(CONFIG_PCX_DECODER) += pcx.o
OBJS-$(CONFIG_PCX_ENCODER) += pcxenc.o
OBJS-$(CONFIG_PDV_DECODER) += pdvdec.o
OBJS-$(CONFIG_PDV_ENCODER) += pdvenc.o
OBJS-$(CONFIG_PFM_DECODER) += pnmdec.o pnm.o
OBJS-$(CONFIG_PFM_ENCODER) += pnmenc.o
OBJS-$(CONFIG_PGM_DECODER) += pnmdec.o pnm.o
@@ -1127,7 +1128,7 @@ OBJS-$(CONFIG_FITS_DEMUXER) += fits.o
OBJS-$(CONFIG_TAK_DEMUXER) += tak.o
# libavformat dependencies for static builds
STLIBOBJS-$(CONFIG_AVFORMAT) += to_upper4.o
STLIBOBJS-$(CONFIG_AVFORMAT) += h2645_parse.o lcevctab.o to_upper4.o
STLIBOBJS-$(CONFIG_ISO_MEDIA) += mpegaudiotabs.o
STLIBOBJS-$(CONFIG_FLV_MUXER) += mpeg4audio_sample_rates.o
STLIBOBJS-$(CONFIG_HLS_DEMUXER) += ac3_channel_layout_tab.o
@@ -1277,7 +1278,7 @@ OBJS-$(CONFIG_IPU_PARSER) += ipu_parser.o
OBJS-$(CONFIG_JPEG2000_PARSER) += jpeg2000_parser.o
OBJS-$(CONFIG_JPEGXL_PARSER) += jpegxl_parser.o jpegxl_parse.o
OBJS-$(CONFIG_JPEGXS_PARSER) += jpegxs_parser.o
OBJS-$(CONFIG_LCEVC_PARSER) += lcevc_parser.o
OBJS-$(CONFIG_LCEVC_PARSER) += lcevc_parser.o lcevctab.o
OBJS-$(CONFIG_MISC4_PARSER) += misc4_parser.o
OBJS-$(CONFIG_MJPEG_PARSER) += mjpeg_parser.o
OBJS-$(CONFIG_MLP_PARSER) += mlp_parse.o mlp_parser.o mlp.o
@@ -1379,7 +1380,6 @@ TESTPROGS-$(CONFIG_GOLOMB) += golomb
TESTPROGS-$(CONFIG_IDCTDSP) += dct
TESTPROGS-$(CONFIG_DXV_ENCODER) += hashtable
TESTPROGS-$(CONFIG_MJPEG_ENCODER) += mjpegenc_huffman
TESTPROGS-$(HAVE_MMX) += motion
TESTPROGS-$(CONFIG_MPEGVIDEO) += mpeg12framerate
TESTPROGS-$(CONFIG_H264_METADATA_BSF) += h264_levels
TESTPROGS-$(CONFIG_HEVC_METADATA_BSF) += h265_levels
+14 -15
@@ -676,6 +676,7 @@ ChannelElement *ff_aac_get_che(AACDecContext *ac, int type, int elem_id)
ac->tags_mapped++;
return ac->tag_che_map[type][elem_id] = ac->che[type][elem_id];
}
av_fallthrough;
case 13:
if (ac->tags_mapped > 3 && ((type == TYPE_CPE && elem_id < 8) ||
(type == TYPE_SCE && elem_id < 6) ||
@@ -683,17 +684,20 @@ ChannelElement *ff_aac_get_che(AACDecContext *ac, int type, int elem_id)
ac->tags_mapped++;
return ac->tag_che_map[type][elem_id] = ac->che[type][elem_id];
}
av_fallthrough;
case 12:
case 7:
if (ac->tags_mapped == 3 && type == TYPE_CPE) {
ac->tags_mapped++;
return ac->tag_che_map[TYPE_CPE][elem_id] = ac->che[TYPE_CPE][2];
}
av_fallthrough;
case 11:
if (ac->tags_mapped == 3 && type == TYPE_SCE) {
ac->tags_mapped++;
return ac->tag_che_map[TYPE_SCE][elem_id] = ac->che[TYPE_SCE][1];
}
av_fallthrough;
case 6:
/* Some streams incorrectly code 5.1 audio as
* SCE[0] CPE[0] CPE[1] SCE[1]
@@ -711,11 +715,13 @@ ChannelElement *ff_aac_get_che(AACDecContext *ac, int type, int elem_id)
ac->tags_mapped++;
return ac->tag_che_map[type][elem_id] = ac->che[TYPE_LFE][0];
}
av_fallthrough;
case 5:
if (ac->tags_mapped == 2 && type == TYPE_CPE) {
ac->tags_mapped++;
return ac->tag_che_map[TYPE_CPE][elem_id] = ac->che[TYPE_CPE][1];
}
av_fallthrough;
case 4:
/* Some streams incorrectly code 4.0 audio as
* SCE[0] CPE[0] LFE[0]
@@ -739,6 +745,7 @@ ChannelElement *ff_aac_get_che(AACDecContext *ac, int type, int elem_id)
ac->tags_mapped++;
return ac->tag_che_map[TYPE_SCE][elem_id] = ac->che[TYPE_SCE][1];
}
av_fallthrough;
case 3:
case 2:
if (ac->tags_mapped == (ac->oc[1].m4ac.chan_config != 2) &&
@@ -750,11 +757,13 @@ ChannelElement *ff_aac_get_che(AACDecContext *ac, int type, int elem_id)
ac->tags_mapped++;
return ac->tag_che_map[TYPE_SCE][elem_id] = ac->che[TYPE_SCE][1];
}
av_fallthrough;
case 1:
if (!ac->tags_mapped && type == TYPE_SCE) {
ac->tags_mapped++;
return ac->tag_che_map[TYPE_SCE][elem_id] = ac->che[TYPE_SCE][0];
}
av_fallthrough;
default:
return NULL;
}
@@ -889,12 +898,6 @@ static int decode_ga_specific_config(AACDecContext *ac, AVCodecContext *avctx,
int tags = 0;
m4ac->frame_length_short = get_bits1(gb);
if (m4ac->frame_length_short && m4ac->sbr == 1) {
avpriv_report_missing_feature(avctx, "SBR with 960 frame length");
if (ac) ac->warned_960_sbr = 1;
m4ac->sbr = 0;
m4ac->ps = 0;
}
if (get_bits1(gb)) // dependsOnCoreCoder
skip_bits(gb, 14); // coreCoderDelay
@@ -1246,7 +1249,7 @@ av_cold int ff_aac_decode_init(AVCodecContext *avctx)
ac->oc[1].m4ac.chan_config = i;
if (ac->oc[1].m4ac.chan_config) {
int ret = ff_aac_set_default_channel_config(ac, avctx, layout_map,
ret = ff_aac_set_default_channel_config(ac, avctx, layout_map,
&layout_map_tags,
ac->oc[1].m4ac.chan_config);
if (!ret)
@@ -1946,17 +1949,11 @@ static int decode_extension_payload(AACDecContext *ac, GetBitContext *gb, int cn
switch (type) { // extension type
case EXT_SBR_DATA_CRC:
crc_flag++;
av_fallthrough;
case EXT_SBR_DATA:
if (!che) {
av_log(ac->avctx, AV_LOG_ERROR, "SBR was found before the first channel element.\n");
return res;
} else if (ac->oc[1].m4ac.frame_length_short) {
if (!ac->warned_960_sbr)
avpriv_report_missing_feature(ac->avctx,
"SBR with 960 frame length");
ac->warned_960_sbr = 1;
skip_bits_long(gb, 8 * cnt - 4);
return res;
} else if (!ac->oc[1].m4ac.sbr) {
av_log(ac->avctx, AV_LOG_ERROR, "SBR signaled to be not-present but was found in the bitstream.\n");
skip_bits_long(gb, 8 * cnt - 4);
@@ -1977,7 +1974,8 @@ static int decode_extension_payload(AACDecContext *ac, GetBitContext *gb, int cn
ac->avctx->profile = AV_PROFILE_AAC_HE;
}
ac->proc.sbr_decode_extension(ac, che, gb, crc_flag, cnt, elem_type);
ac->proc.sbr_decode_extension(ac, che, gb, crc_flag, cnt, elem_type,
ac->oc[1].m4ac.frame_length_short);
if (ac->oc[1].m4ac.ps == 1 && !ac->warned_he_aac_mono) {
av_log(ac->avctx, AV_LOG_VERBOSE, "Treating HE-AAC mono as stereo.\n");
@@ -2087,6 +2085,7 @@ static void spectral_to_sample(AACDecContext *ac, int samples)
}
if (ac->oc[1].m4ac.sbr > 0) {
ac->proc.sbr_apply(ac, che, type,
ac->oc[1].m4ac.frame_length_short,
che->ch[0].output,
che->ch[1].output);
}
+3 -4
@@ -433,9 +433,9 @@ typedef struct AACDecProc {
int (*sbr_ctx_alloc_init)(AACDecContext *ac, ChannelElement **che, int id_aac);
int (*sbr_decode_extension)(AACDecContext *ac, ChannelElement *che,
GetBitContext *gb, int crc, int cnt, int id_aac);
void (*sbr_apply)(AACDecContext *ac, ChannelElement *che,
int id_aac, void /* INTFLOAT */ *L, void /* INTFLOAT */ *R);
GetBitContext *gb, int crc, int cnt, int id_aac, int fl960);
void (*sbr_apply)(AACDecContext *ac, ChannelElement *che, int id_aac, int fl960,
void /* INTFLOAT */ *L, void /* INTFLOAT */ *R);
void (*sbr_ctx_close)(ChannelElement *che);
} AACDecProc;
@@ -557,7 +557,6 @@ struct AACDecContext {
OutputConfiguration oc[2];
int warned_num_aac_frames;
int warned_960_sbr;
unsigned warned_71_wide;
int warned_gain_control;
int warned_he_aac_mono;
+28 -12
@@ -215,6 +215,11 @@ static int decode_usac_element_pair(AACDecContext *ac,
if (e->stereo_config_index) {
e->mps.freq_res = get_bits(gb, 3); /* bsFreqRes */
if (!e->mps.freq_res)
return AVERROR_INVALIDDATA; /* value 0 is reserved */
int numBands = ((int[]){0,28,20,14,10,7,5,4})[e->mps.freq_res]; // ISO/IEC 23003-1:2007, 5.2, Table 39
e->mps.fixed_gain = get_bits(gb, 3); /* bsFixedGainDMX */
e->mps.temp_shape_config = get_bits(gb, 2); /* bsTempShapeConfig */
e->mps.decorr_config = get_bits(gb, 2); /* bsDecorrConfig */
@@ -222,12 +227,21 @@ static int decode_usac_element_pair(AACDecContext *ac,
e->mps.phase_coding = get_bits1(gb); /* bsPhaseCoding */
e->mps.otts_bands_phase_present = get_bits1(gb);
if (e->mps.otts_bands_phase_present) /* bsOttBandsPhasePresent */
e->mps.otts_bands_phase = get_bits(gb, 5); /* bsOttBandsPhase */
int otts_bands_phase = ((int[]){0,10,10,7,5,3,2,2})[e->mps.freq_res]; // Table 109 — Default value of bsOttBandsPhase
if (e->mps.otts_bands_phase_present) { /* bsOttBandsPhasePresent */
otts_bands_phase = get_bits(gb, 5); /* bsOttBandsPhase */
if (otts_bands_phase > numBands)
return AVERROR_INVALIDDATA;
}
e->mps.otts_bands_phase = otts_bands_phase;
e->mps.residual_coding = e->stereo_config_index >= 2; /* bsResidualCoding */
if (e->mps.residual_coding) {
e->mps.residual_bands = get_bits(gb, 5); /* bsResidualBands */
int residual_bands = get_bits(gb, 5); /* bsResidualBands */
if (residual_bands > numBands)
return AVERROR_INVALIDDATA;
e->mps.residual_bands = residual_bands;
e->mps.otts_bands_phase = FFMAX(e->mps.otts_bands_phase,
e->mps.residual_bands);
e->mps.pseudo_lr = get_bits1(gb); /* bsPseudoLr */
@@ -1293,7 +1307,8 @@ static void spectrum_decode(AACDecContext *ac, AACUSACConfig *usac,
SingleChannelElement *sce = &cpe->ch[ch];
AACUsacElemData *ue = &sce->ue;
spectrum_scale(ac, sce, ue);
if (!ue->core_mode)
spectrum_scale(ac, sce, ue);
}
if (nb_channels > 1 && us->common_window) {
@@ -1327,13 +1342,13 @@ static void spectrum_decode(AACDecContext *ac, AACUSACConfig *usac,
/* Save coefficients and alpha values for prediction reasons */
if (nb_channels > 1) {
AACUsacStereo *us = &cpe->us;
AACUsacStereo *us2 = &cpe->us;
for (int ch = 0; ch < nb_channels; ch++) {
SingleChannelElement *sce = &cpe->ch[ch];
memcpy(sce->prev_coeffs, sce->coeffs, sizeof(sce->coeffs));
}
memcpy(us->prev_alpha_q_re, us->alpha_q_re, sizeof(us->alpha_q_re));
memcpy(us->prev_alpha_q_im, us->alpha_q_im, sizeof(us->alpha_q_im));
memcpy(us2->prev_alpha_q_re, us2->alpha_q_re, sizeof(us2->alpha_q_re));
memcpy(us2->prev_alpha_q_im, us2->alpha_q_im, sizeof(us2->alpha_q_im));
}
for (int ch = 0; ch < nb_channels; ch++) {
@@ -1343,8 +1358,9 @@ static void spectrum_decode(AACDecContext *ac, AACUSACConfig *usac,
if (sce->tns.present && ((nb_channels == 1) || (us->tns_on_lr)))
ac->dsp.apply_tns(sce->coeffs, &sce->tns, &sce->ics, 1);
ac->oc[1].m4ac.frame_length_short ? ac->dsp.imdct_and_windowing_768(ac, sce) :
ac->dsp.imdct_and_windowing(ac, sce);
if (!sce->ue.core_mode)
ac->oc[1].m4ac.frame_length_short ? ac->dsp.imdct_and_windowing_768(ac, sce) :
ac->dsp.imdct_and_windowing(ac, sce);
}
}
@@ -1655,7 +1671,7 @@ static int decode_usac_core_coder(AACDecContext *ac, AACUSACConfig *usac,
spectrum_decode(ac, usac, che, core_nb_channels);
if (ac->oc[1].m4ac.sbr > 0) {
ac->proc.sbr_apply(ac, che, nb_channels == 2 ? TYPE_CPE : TYPE_SCE,
ac->proc.sbr_apply(ac, che, nb_channels == 2 ? TYPE_CPE : TYPE_SCE, 0,
che->ch[0].output,
che->ch[1].output);
}
@@ -1719,8 +1735,8 @@ static int parse_audio_preroll(AACDecContext *ac, GetBitContext *gb)
}
/* Byte alignment is not guaranteed. */
for (int i = 0; i < au_len; i++)
tmp_buf[i] = get_bits(gb, 8);
for (int j = 0; j < au_len; j++)
tmp_buf[j] = get_bits(gb, 8);
ret = init_get_bits8(&gbc, tmp_buf, au_len);
if (ret < 0)
+16 -18
@@ -240,7 +240,7 @@ static void huff_data_2d(GetBitContext *gb, int16_t *part0_data[2], int16_t (*da
0, 2*esc_cnt, 0, (2*lav + 1));
for (i = 0; i < esc_cnt; i++) {
data[esc_idx[i]][0] = esc_data[0][i] - lav;
data[esc_idx[i]][0] = esc_data[0][i] - lav;
data[esc_idx[i]][1] = esc_data[1][i] - lav;
}
}
}
@@ -464,10 +464,10 @@ static int ec_pair_dec(GetBitContext *gb,
}
if (pair) {
p_data[0] = data_pair[0];
p_data[1] = data_pair[1];
p_data[0] = data_diff[0];
p_data[1] = data_diff[1];
} else {
p_data[0] = data_pair[0];
p_data[0] = data_diff[0];
p_data[1] = NULL;
}
@@ -480,7 +480,7 @@ static int ec_pair_dec(GetBitContext *gb,
if (pair && (diff_freq[0] || diff_time_back))
diff_freq[1] = !get_bits1(gb);
int time_pair;
int time_pair = 0;
huff_decode(gb, p_data, data_type, diff_freq,
nb_bands, &time_pair);
@@ -534,11 +534,11 @@ static int ec_pair_dec(GetBitContext *gb,
}
/* Decode LSBs */
attach_lsb(gb, p_data[0], quant_offset, attach_lsb_flag,
nb_bands, p_data[0]);
attach_lsb(gb, data_pair[0], quant_offset, attach_lsb_flag,
nb_bands, data_pair[0]);
if (pair)
attach_lsb(gb, p_data[1], quant_offset, attach_lsb_flag,
nb_bands, p_data[1]);
attach_lsb(gb, data_pair[1], quant_offset, attach_lsb_flag,
nb_bands, data_pair[1]);
memcpy(&set1[start_band], data_pair[0], 2*nb_bands);
if (pair)
@@ -591,9 +591,6 @@ static int get_freq_strides(int16_t *freq_strides, int band_stride,
}
}
for (int i = 0; i <= data_bands; i++)
freq_strides[i] = av_clip_uintp2(freq_strides[i], 2);
return data_bands;
}
@@ -643,15 +640,16 @@ int ff_aac_ec_data_dec(GetBitContext *gb, AACMPSLosslessData *ld,
fine_to_coarse(ld->last_data, data_type, start_band, end_band);
}
int data_bands = get_freq_strides(ld->freq_res,
int16_t freq_stride_map[MPS_MAX_PARAM_BANDS + 1];
int data_bands = get_freq_strides(freq_stride_map,
stride_table[ld->freq_res[set_idx]],
start_band, end_band);
if (set_idx + data_pair > MPS_MAX_PARAM_SETS)
if (set_idx + data_pair >= MPS_MAX_PARAM_SETS)
return AVERROR(EINVAL);
for (int j = 0; j < data_bands; j++)
ld->last_data[start_band + j] = ld->last_data[ld->freq_res[j]];
ld->last_data[start_band + j] = ld->last_data[freq_stride_map[j]];
int err = ec_pair_dec(gb,
ld->data[set_idx + 0], ld->data[set_idx + 1],
@@ -664,11 +662,11 @@ int ff_aac_ec_data_dec(GetBitContext *gb, AACMPSLosslessData *ld,
if (data_type == MPS_IPD) {
const int mask = ld->coarse_quant[set_idx] ? 0x7 : 0xF;
for (int j = 0; j < data_bands; j++)
for (int k = ld->freq_res[j + 0]; k < ld->freq_res[j + 1]; k++)
for (int k = freq_stride_map[j + 0]; k < freq_stride_map[j + 1]; k++)
ld->last_data[k] = ld->data[set_idx + data_pair][start_band + j] & mask;
} else {
for (int j = 0; j < data_bands; j++)
for (int k = ld->freq_res[j + 0]; k < ld->freq_res[j + 1]; k++)
for (int k = freq_stride_map[j + 0]; k < freq_stride_map[j + 1]; k++)
ld->last_data[k] = ld->data[set_idx + data_pair][start_band + j];
}
@@ -860,7 +858,7 @@ int ff_aac_map_index_data(AACMPSLosslessData *ld,
for (int i = 0; i < nb_param_sets; i++) {
if (ld->coarse_quant_no[i] == 1) {
coarse_to_fine(tmp_idx_data[i], data_type, start_band,
stop_band - start_band);
stop_band);
ld->coarse_quant_no[i] = 0;
}
}
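The change from `>` to `>=` in `ff_aac_ec_data_dec` looks like an off-by-one fix. The sketch below illustrates why, assuming `ld->data` has exactly `MPS_MAX_PARAM_SETS` rows and that `data_pair` is 0 or 1; neither is visible in the hunks. The constant value and the function name are illustrative only.

```c
#include <assert.h>

/* Illustrative value only; the real constant is MPS_MAX_PARAM_SETS. */
#define SKETCH_MAX_PARAM_SETS 8

/* Hypothetical helper: returns 1 when rows set_idx .. set_idx + data_pair
 * all lie inside an array of SKETCH_MAX_PARAM_SETS rows. */
static int sketch_pair_fits(int set_idx, int data_pair)
{
    /* The highest row touched is set_idx + data_pair, so it must be
     * strictly below the row count. Rejecting only '>' would let
     * set_idx + data_pair == SKETCH_MAX_PARAM_SETS through. */
    return set_idx >= 0 && set_idx + data_pair < SKETCH_MAX_PARAM_SETS;
}
```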
+8 -4
@@ -84,9 +84,11 @@ void ff_aac_sbr_ctx_close_fixed(ChannelElement *che);
/** Decode one SBR element. */
int ff_aac_sbr_decode_extension(AACDecContext *ac, ChannelElement *che,
GetBitContext *gb, int crc, int cnt, int id_aac);
GetBitContext *gb, int crc, int cnt, int id_aac,
int fl960);
int ff_aac_sbr_decode_extension_fixed(AACDecContext *ac, ChannelElement *che,
GetBitContext *gb, int crc, int cnt, int id_aac);
GetBitContext *gb, int crc, int cnt, int id_aac,
int fl960);
/** Due to channel allocation not being known upon SBR parameter transmission,
* supply the parameters separately.
@@ -101,9 +103,11 @@ int ff_aac_sbr_decode_usac_data(AACDecContext *ac, ChannelElement *che,
/** Apply one SBR element to one AAC element. */
void ff_aac_sbr_apply(AACDecContext *ac, ChannelElement *che,
int id_aac, void /* float */ *L, void /* float */ *R);
int id_aac, int fl960,
void /* float */ *L, void /* float */ *R);
void ff_aac_sbr_apply_fixed(AACDecContext *ac, ChannelElement *che,
int id_aac, void /* int */ *L, void /* int */ *R);
int id_aac, int fl960,
void /* int */ *L, void /* int */ *R);
FF_VISIBILITY_POP_HIDDEN
+42 -39
@@ -636,12 +636,11 @@ static const int8_t ceil_log2[] = {
};
static int read_sbr_grid(AACDecContext *ac, SpectralBandReplication *sbr,
GetBitContext *gb, SBRData *ch_data)
GetBitContext *gb, SBRData *ch_data, int numTimeSlots)
{
int i;
int bs_pointer = 0;
// frameLengthFlag ? 15 : 16; 960 sample length frames unsupported; this value is numTimeSlots
int abs_bord_trail = 16;
int abs_bord_trail = numTimeSlots;
int num_rel_lead, num_rel_trail;
unsigned bs_num_env_old = ch_data->bs_num_env;
int bs_frame_class, bs_num_env;
@@ -991,15 +990,15 @@ static void read_sbr_extension(AACDecContext *ac, SpectralBandReplication *sbr,
}
static int read_sbr_single_channel_element(AACDecContext *ac,
SpectralBandReplication *sbr,
GetBitContext *gb)
SpectralBandReplication *sbr,
GetBitContext *gb, int numTimeSlots)
{
int ret;
if (get_bits1(gb)) // bs_data_extra
skip_bits(gb, 4); // bs_reserved
if (read_sbr_grid(ac, sbr, gb, &sbr->data[0]))
if (read_sbr_grid(ac, sbr, gb, &sbr->data[0], numTimeSlots))
return -1;
read_sbr_dtdf(sbr, gb, &sbr->data[0], 0);
read_sbr_invf(sbr, gb, &sbr->data[0]);
@@ -1015,8 +1014,8 @@ static int read_sbr_single_channel_element(AACDecContext *ac,
}
static int read_sbr_channel_pair_element(AACDecContext *ac,
SpectralBandReplication *sbr,
GetBitContext *gb)
SpectralBandReplication *sbr,
GetBitContext *gb, int numTimeSlots)
{
int ret;
@@ -1024,7 +1023,7 @@ static int read_sbr_channel_pair_element(AACDecContext *ac,
skip_bits(gb, 8); // bs_reserved
if ((sbr->bs_coupling = get_bits1(gb))) {
if (read_sbr_grid(ac, sbr, gb, &sbr->data[0]))
if (read_sbr_grid(ac, sbr, gb, &sbr->data[0], numTimeSlots))
return -1;
copy_sbr_grid(&sbr->data[1], &sbr->data[0]);
read_sbr_dtdf(sbr, gb, &sbr->data[0], 0);
@@ -1041,8 +1040,8 @@ static int read_sbr_channel_pair_element(AACDecContext *ac,
if((ret = read_sbr_noise(ac, sbr, gb, &sbr->data[1], 1)) < 0)
return ret;
} else {
if (read_sbr_grid(ac, sbr, gb, &sbr->data[0]) ||
read_sbr_grid(ac, sbr, gb, &sbr->data[1]))
if (read_sbr_grid(ac, sbr, gb, &sbr->data[0], numTimeSlots) ||
read_sbr_grid(ac, sbr, gb, &sbr->data[1], numTimeSlots))
return -1;
read_sbr_dtdf(sbr, gb, &sbr->data[0], 0);
read_sbr_dtdf(sbr, gb, &sbr->data[1], 0);
@@ -1067,7 +1066,7 @@ static int read_sbr_channel_pair_element(AACDecContext *ac,
}
static unsigned int read_sbr_data(AACDecContext *ac, SpectralBandReplication *sbr,
GetBitContext *gb, int id_aac)
GetBitContext *gb, int id_aac, int numTimeSlots)
{
unsigned int cnt = get_bits_count(gb);
@@ -1075,12 +1074,12 @@ static unsigned int read_sbr_data(AACDecContext *ac, SpectralBandReplication *sb
sbr->ready_for_dequant = 1;
if (id_aac == TYPE_SCE || id_aac == TYPE_CCE) {
if (read_sbr_single_channel_element(ac, sbr, gb)) {
if (read_sbr_single_channel_element(ac, sbr, gb, numTimeSlots)) {
sbr_turnoff(sbr);
return get_bits_count(gb) - cnt;
}
} else if (id_aac == TYPE_CPE) {
if (read_sbr_channel_pair_element(ac, sbr, gb)) {
if (read_sbr_channel_pair_element(ac, sbr, gb, numTimeSlots)) {
sbr_turnoff(sbr);
return get_bits_count(gb) - cnt;
}
@@ -1133,12 +1132,13 @@ static void sbr_reset(AACDecContext *ac, SpectralBandReplication *sbr)
*/
int AAC_RENAME(ff_aac_sbr_decode_extension)(AACDecContext *ac, ChannelElement *che,
GetBitContext *gb_host, int crc,
int cnt, int id_aac)
int cnt, int id_aac, int fl960)
{
SpectralBandReplication *sbr = get_sbr(che);
unsigned int num_sbr_bits = 0, num_align_bits;
unsigned bytes_read;
GetBitContext gbc = *gb_host, *gb = &gbc;
int numTimeSlots = fl960 ? 15 : 16;
skip_bits_long(gb_host, cnt*8 - 4);
sbr->reset = 0;
@@ -1166,7 +1166,7 @@ int AAC_RENAME(ff_aac_sbr_decode_extension)(AACDecContext *ac, ChannelElement *c
sbr_reset(ac, sbr);
if (sbr->start)
num_sbr_bits += read_sbr_data(ac, sbr, gb, id_aac);
num_sbr_bits += read_sbr_data(ac, sbr, gb, id_aac, numTimeSlots);
num_align_bits = ((cnt << 3) - 4 - num_sbr_bits) & 7;
bytes_read = ((num_sbr_bits + num_align_bits + 4) >> 3);
@@ -1272,7 +1272,7 @@ int ff_aac_sbr_decode_usac_data(AACDecContext *ac, ChannelElement *che,
if (sbr_ch == 1) { /* sbr_single_channel_element */
/* if (harmonicSBR) ... */
if (read_sbr_grid(ac, sbr, gb, &sbr->data[0]))
if (read_sbr_grid(ac, sbr, gb, &sbr->data[0], 16))
return -1;
read_sbr_dtdf(sbr, gb, &sbr->data[0], indep_flag);
@@ -1291,7 +1291,7 @@ int ff_aac_sbr_decode_usac_data(AACDecContext *ac, ChannelElement *che,
/* if (harmonicSBR) ... */
if (read_sbr_grid(ac, sbr, gb, &sbr->data[0]))
if (read_sbr_grid(ac, sbr, gb, &sbr->data[0], 16))
return -1;
copy_sbr_grid(&sbr->data[1], &sbr->data[0]);
@@ -1323,9 +1323,9 @@ int ff_aac_sbr_decode_usac_data(AACDecContext *ac, ChannelElement *che,
/* if (harmonicSBR) ... */
if (read_sbr_grid(ac, sbr, gb, &sbr->data[0]))
if (read_sbr_grid(ac, sbr, gb, &sbr->data[0], 16))
return -1;
if (read_sbr_grid(ac, sbr, gb, &sbr->data[1]))
if (read_sbr_grid(ac, sbr, gb, &sbr->data[1], 16))
return -1;
read_sbr_dtdf(sbr, gb, &sbr->data[0], indep_flag);
@@ -1369,16 +1369,17 @@ static void sbr_qmf_analysis(AVFloatDSPContext *dsp, AVTXContext *mdct,
av_tx_fn mdct_fn,
#endif /* USE_FIXED */
SBRDSPContext *sbrdsp, const INTFLOAT *in, INTFLOAT *x,
INTFLOAT z[320], INTFLOAT W[2][32][32][2], int buf_idx)
INTFLOAT z[320], INTFLOAT W[2][32][32][2], int buf_idx,
int numTimeSlots)
{
int i;
#if USE_FIXED
int j;
#endif
memcpy(x , x+1024, (320-32)*sizeof(x[0]));
memcpy(x+288, in, 1024*sizeof(x[0]));
for (i = 0; i < 32; i++) { // numTimeSlots*RATE = 16*2 as 960 sample frames
// are not supported
int nb = numTimeSlots * 64;
memcpy(x , x+nb, (320-32)*sizeof(x[0]));
memcpy(x+288, in, nb*sizeof(x[0]));
for (i = 0; i < numTimeSlots*2; i++) { // RATE*numTimeSlots = 2* 16 or 15
dsp->vector_fmul_reverse(z, sbr_qmf_window_ds, x, 320);
sbrdsp->sum64x5(z);
sbrdsp->qmf_pre_shuffle(z);
@@ -1417,13 +1418,14 @@ static void sbr_qmf_synthesis(AVTXContext *mdct, av_tx_fn mdct_fn,
#endif /* USE_FIXED */
INTFLOAT *out, INTFLOAT X[2][38][64],
INTFLOAT mdct_buf[2][64],
INTFLOAT *v0, int *v_off, const unsigned int div)
INTFLOAT *v0, int *v_off, int numTimeSlots,
const unsigned int div)
{
int i, n;
const INTFLOAT *sbr_qmf_window = div ? sbr_qmf_window_ds : sbr_qmf_window_us;
const int step = 128 >> div;
INTFLOAT *v;
for (i = 0; i < 32; i++) {
for (i = 0; i < numTimeSlots*2; i++) {
if (*v_off < step) {
int saved_samples = (1280 - 128) >> div;
memcpy(&v0[SBR_SYNTHESIS_BUF_SIZE - saved_samples], v0, saved_samples * sizeof(INTFLOAT));
@@ -1463,11 +1465,11 @@ static void sbr_qmf_synthesis(AVTXContext *mdct, av_tx_fn mdct_fn,
/// Generate the subband filtered lowband
static int sbr_lf_gen(SpectralBandReplication *sbr,
INTFLOAT X_low[32][40][2], const INTFLOAT W[2][32][32][2],
int buf_idx)
int buf_idx, int numTimeSlots)
{
int i, k;
const int t_HFGen = 8;
const int i_f = 32;
const int i_f = numTimeSlots*2;
memset(X_low, 0, 32*sizeof(*X_low));
for (k = 0; k < sbr->kx[1]; k++) {
for (i = t_HFGen; i < i_f + t_HFGen; i++) {
@@ -1523,10 +1525,10 @@ static int sbr_hf_gen(AACDecContext *ac, SpectralBandReplication *sbr,
/// Generate the subband filtered lowband
static int sbr_x_gen(SpectralBandReplication *sbr, INTFLOAT X[2][38][64],
const INTFLOAT Y0[38][64][2], const INTFLOAT Y1[38][64][2],
const INTFLOAT X_low[32][40][2], int ch)
const INTFLOAT X_low[32][40][2], int ch, int numTimeSlots)
{
int k, i;
const int i_f = 32;
const int i_f = numTimeSlots*2;
const int i_Temp = FFMAX(2*sbr->data[ch].t_env_num_env_old - i_f, 0);
memset(X, 0, 2*sizeof(*X));
for (k = 0; k < sbr->kx[0]; k++) {
@@ -1681,7 +1683,7 @@ static void sbr_env_estimate(AAC_FLOAT (*e_curr)[48], INTFLOAT X_high[64][40][2]
}
void AAC_RENAME(ff_aac_sbr_apply)(AACDecContext *ac, ChannelElement *che,
int id_aac, void *L_, void *R_)
int id_aac, int fl960, void *L_, void *R_)
{
INTFLOAT *L = L_, *R = R_;
SpectralBandReplication *sbr = get_sbr(che);
@@ -1689,6 +1691,7 @@ void AAC_RENAME(ff_aac_sbr_apply)(AACDecContext *ac, ChannelElement *che,
int ch;
int nch = (id_aac == TYPE_CPE) ? 2 : 1;
int err;
int numTimeSlots = fl960 ? 15 : 16;
if (id_aac != sbr->id_aac) {
av_log(ac->avctx, id_aac == TYPE_LFE ? AV_LOG_VERBOSE : AV_LOG_WARNING,
@@ -1718,10 +1721,10 @@ void AAC_RENAME(ff_aac_sbr_apply)(AACDecContext *ac, ChannelElement *che,
sbr_qmf_analysis(ac->fdsp, sbr->mdct_ana, sbr->mdct_ana_fn, &sbr->dsp,
ch ? R : L, sbr->data[ch].analysis_filterbank_samples,
(INTFLOAT*)sbr->qmf_filter_scratch,
sbr->data[ch].W, sbr->data[ch].Ypos);
sbr->data[ch].W, sbr->data[ch].Ypos, numTimeSlots);
sbr->c.sbr_lf_gen(sbr, sbr->X_low,
(const INTFLOAT (*)[32][32][2]) sbr->data[ch].W,
sbr->data[ch].Ypos);
sbr->data[ch].Ypos, numTimeSlots);
sbr->data[ch].Ypos ^= 1;
if (sbr->start) {
sbr->c.sbr_hf_inverse_filter(&sbr->dsp, sbr->alpha0, sbr->alpha1,
@@ -1749,9 +1752,9 @@ void AAC_RENAME(ff_aac_sbr_apply)(AACDecContext *ac, ChannelElement *che,
/* synthesis */
sbr->c.sbr_x_gen(sbr, sbr->X[ch],
(const INTFLOAT (*)[64][2]) sbr->data[ch].Y[1-sbr->data[ch].Ypos],
(const INTFLOAT (*)[64][2]) sbr->data[ch].Y[ sbr->data[ch].Ypos],
(const INTFLOAT (*)[40][2]) sbr->X_low, ch);
(const INTFLOAT (*)[64][2]) sbr->data[ch].Y[1-sbr->data[ch].Ypos],
(const INTFLOAT (*)[64][2]) sbr->data[ch].Y[ sbr->data[ch].Ypos],
(const INTFLOAT (*)[40][2]) sbr->X_low, ch, numTimeSlots);
}
if (ac->oc[1].m4ac.ps == 1) {
@@ -1767,13 +1770,13 @@ void AAC_RENAME(ff_aac_sbr_apply)(AACDecContext *ac, ChannelElement *che,
L, sbr->X[0], sbr->qmf_filter_scratch,
sbr->data[0].synthesis_filterbank_samples,
&sbr->data[0].synthesis_filterbank_samples_offset,
downsampled);
numTimeSlots, downsampled);
if (nch == 2)
sbr_qmf_synthesis(sbr->mdct, sbr->mdct_fn, &sbr->dsp, ac->fdsp,
R, sbr->X[1], sbr->qmf_filter_scratch,
sbr->data[1].synthesis_filterbank_samples,
&sbr->data[1].synthesis_filterbank_samples_offset,
downsampled);
numTimeSlots, downsampled);
}
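The hunks above thread `numTimeSlots` through the SBR code in place of the hard-coded 16 and 32. The arithmetic can be sketched as follows. All names are hypothetical. The sketch also assumes that each QMF analysis iteration advances by 32 input samples, which is not visible in the hunk.

```c
#include <assert.h>

/* Hypothetical names; the factor 2 is the RATE mentioned in the source comment. */
enum { SKETCH_SBR_RATE = 2, SKETCH_ANALYSIS_STEP = 32 };

/* 15 time slots for 960-sample frames, 16 otherwise (as in the patch). */
static int sketch_num_time_slots(int fl960)
{
    return fl960 ? 15 : 16;
}

/* Core samples copied per frame: 'nb' in sbr_qmf_analysis. */
static int sketch_core_samples(int num_time_slots)
{
    return num_time_slots * 64;
}

/* Loop count used by the analysis, synthesis, lf_gen and x_gen stages. */
static int sketch_qmf_iterations(int num_time_slots)
{
    return num_time_slots * SKETCH_SBR_RATE;
}
```

Under that assumption the numbers line up: 32 iterations of 32 samples cover 1024 samples, and 30 iterations cover 960.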
static void aacsbr_func_ptr_init(AACSBRContext *c)
+6 -2
@@ -22,7 +22,8 @@ OBJS-$(CONFIG_VP8DSP) += aarch64/vp8dsp_init_aarch64.o
OBJS-$(CONFIG_AAC_DECODER) += aarch64/aacpsdsp_init_aarch64.o \
aarch64/sbrdsp_init_aarch64.o
OBJS-$(CONFIG_AAC_ENCODER) += aarch64/aacencdsp_init.o
OBJS-$(CONFIG_DCA_DECODER) += aarch64/synth_filter_init.o
OBJS-$(CONFIG_DCA_DECODER) += aarch64/dcadsp_init_aarch64.o \
aarch64/synth_filter_init.o
OBJS-$(CONFIG_OPUS_DECODER) += aarch64/opusdsp_init.o
OBJS-$(CONFIG_RV40_DECODER) += aarch64/rv40dsp_init_aarch64.o
OBJS-$(CONFIG_VC1DSP) += aarch64/vc1dsp_init_aarch64.o
@@ -65,7 +66,8 @@ NEON-OBJS-$(CONFIG_VP8DSP) += aarch64/vp8dsp_neon.o
# decoders/encoders
NEON-OBJS-$(CONFIG_AAC_DECODER) += aarch64/aacpsdsp_neon.o
NEON-OBJS-$(CONFIG_DCA_DECODER) += aarch64/synth_filter_neon.o
NEON-OBJS-$(CONFIG_DCA_DECODER) += aarch64/dcadsp_neon.o \
aarch64/synth_filter_neon.o
NEON-OBJS-$(CONFIG_OPUS_DECODER) += aarch64/opusdsp_neon.o
NEON-OBJS-$(CONFIG_VORBIS_DECODER) += aarch64/vorbisdsp_neon.o
NEON-OBJS-$(CONFIG_VP9_DECODER) += aarch64/vp9itxfm_16bpp_neon.o \
@@ -78,6 +80,8 @@ NEON-OBJS-$(CONFIG_HEVC_DECODER) += aarch64/hevcdsp_deblock_neon.o \
aarch64/hevcdsp_dequant_neon.o \
aarch64/hevcdsp_idct_neon.o \
aarch64/hevcdsp_init_aarch64.o \
aarch64/hevcpred_neon.o \
aarch64/hevcpred_init_aarch64.o \
aarch64/h26x/epel_neon.o \
aarch64/h26x/qpel_neon.o \
aarch64/h26x/sao_neon.o
+42
@@ -0,0 +1,42 @@
/*
* AArch64 NEON optimised DCA DSP functions
* Copyright (c) 2026 Jeongkeun Kim <variety0724@gmail.com>
*
* This file is part of FFmpeg.
*
* FFmpeg is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* FFmpeg is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with FFmpeg; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
*/
#include "config.h"
#include "libavutil/attributes.h"
#include "libavutil/cpu.h"
#include "libavutil/aarch64/cpu.h"
#include "libavcodec/dcadsp.h"
void ff_lfe_fir0_float_neon(float *pcm_samples, const int32_t *lfe_samples,
const float *filter_coeff, ptrdiff_t npcmblocks);
void ff_lfe_fir1_float_neon(float *pcm_samples, const int32_t *lfe_samples,
const float *filter_coeff, ptrdiff_t npcmblocks);
av_cold void ff_dcadsp_init_aarch64(DCADSPContext *s)
{
int cpu_flags = av_get_cpu_flags();
if (have_neon(cpu_flags)) {
s->lfe_fir_float[0] = ff_lfe_fir0_float_neon;
s->lfe_fir_float[1] = ff_lfe_fir1_float_neon;
}
}
+101
@@ -0,0 +1,101 @@
/*
* AArch64 NEON optimised DCA LFE FIR filter functions
* Copyright (c) 2026 Jeongkeun Kim <variety0724@gmail.com>
*
* This file is part of FFmpeg.
*
* FFmpeg is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* FFmpeg is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with FFmpeg; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
*/
#include "libavutil/aarch64/asm.S"
function ff_lfe_fir0_float_neon, export=1
lsr x3, x3, #1
sub x1, x1, #(7*4)
.Louter0:
ld1 {v4.4s, v5.4s}, [x1]
scvtf v4.4s, v4.4s
scvtf v5.4s, v5.4s
ext v6.16b, v5.16b, v5.16b, #8
rev64 v6.4s, v6.4s
ext v7.16b, v4.16b, v4.16b, #8
rev64 v7.4s, v7.4s
mov x4, x2
add x5, x2, #(248*4)
mov x6, x0
add x7, x0, #(32*4)
mov w8, #32
.Linner0:
ld1 {v0.4s, v1.4s}, [x4], #32
ld1 {v16.4s, v17.4s}, [x5]
sub x5, x5, #32
subs w8, w8, #1
fmul v2.4s, v0.4s, v6.4s
fmul v3.4s, v16.4s, v4.4s
fmla v2.4s, v1.4s, v7.4s
fmla v3.4s, v17.4s, v5.4s
faddp v2.4s, v2.4s, v2.4s
faddp v3.4s, v3.4s, v3.4s
faddp s2, v2.2s
faddp s3, v3.2s
str s2, [x6], #4
str s3, [x7], #4
b.gt .Linner0
subs x3, x3, #1
add x1, x1, #4
add x0, x0, #(64*4)
b.gt .Louter0
ret
endfunc
function ff_lfe_fir1_float_neon, export=1
lsr x3, x3, #2
sub x1, x1, #(3*4)
.Louter1:
ld1 {v4.4s}, [x1]
scvtf v4.4s, v4.4s
ext v5.16b, v4.16b, v4.16b, #8
rev64 v5.4s, v5.4s
mov x4, x2
add x5, x2, #(252*4)
mov x6, x0
add x7, x0, #(64*4)
mov w8, #64
.Linner1:
ld1 {v0.4s}, [x4], #16
ld1 {v16.4s}, [x5]
sub x5, x5, #16
subs w8, w8, #1
fmul v2.4s, v0.4s, v5.4s
fmul v3.4s, v16.4s, v4.4s
faddp v2.4s, v2.4s, v2.4s
faddp v3.4s, v3.4s, v3.4s
faddp s2, v2.2s
faddp s3, v3.2s
str s2, [x6], #4
str s3, [x7], #4
b.gt .Linner1
subs x3, x3, #1
add x1, x1, #4
add x0, x0, #(128*4)
b.gt .Louter1
ret
endfunc
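For readers who do not follow the NEON code, here is a scalar sketch of what the two functions above appear to compute, derived from reading the assembly. `dec_select == 0` corresponds to `ff_lfe_fir0_float_neon` (8 taps, 64 outputs per LFE sample) and `dec_select == 1` to `ff_lfe_fir1_float_neon` (4 taps, 128 outputs). The function name is hypothetical, and this is not claimed to be the project's C reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar equivalent of the NEON LFE FIR loops. */
static void sketch_lfe_fir_float(float *pcm_samples, const int32_t *lfe_samples,
                                 const float *filter_coeff, ptrdiff_t npcmblocks,
                                 int dec_select)
{
    assert(dec_select == 0 || dec_select == 1);

    const int factor  = 64 << dec_select;  /* outputs per LFE sample  */
    const int ncoeffs = 8 >> dec_select;   /* taps per output sample  */
    const ptrdiff_t nlfesamples = npcmblocks >> (dec_select + 1);

    for (ptrdiff_t i = 0; i < nlfesamples; i++) {
        for (int j = 0; j < factor / 2; j++) {
            float a = 0.0f, b = 0.0f;

            /* lfe_samples[-k] reads history before the current sample;
             * 'b' walks the 256-entry coefficient table from the end. */
            for (int k = 0; k < ncoeffs; k++) {
                a += filter_coeff[j * ncoeffs + k]       * (float)lfe_samples[-k];
                b += filter_coeff[255 - j * ncoeffs - k] * (float)lfe_samples[-k];
            }

            pcm_samples[j]              = a;  /* first half of the block  */
            pcm_samples[factor / 2 + j] = b;  /* second half of the block */
        }
        lfe_samples++;
        pcm_samples += factor;
    }
}
```

Two differences from the assembly are worth noting. The NEON loops test their counters at the bottom, so they always run at least once, whereas this sketch does nothing when `nlfesamples` is 0. The NEON code also sums with pairwise adds, so results may differ from this sketch in the last bits.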
+7 -3
@@ -130,6 +130,10 @@ NEON8_FNPROTO(epel_uni_v, (uint8_t *dst, ptrdiff_t dststride,
const uint8_t *src, ptrdiff_t srcstride,
int height, intptr_t mx, intptr_t my, int width),);
NEON8_FNPROTO(epel_uni_h, (uint8_t *dst, ptrdiff_t dststride,
const uint8_t *src, ptrdiff_t srcstride,
int height, intptr_t mx, intptr_t my, int width),);
NEON8_FNPROTO(epel_uni_hv, (uint8_t *dst, ptrdiff_t _dststride,
const uint8_t *src, ptrdiff_t srcstride,
int height, intptr_t mx, intptr_t my, int width),);
@@ -143,7 +147,7 @@ NEON8_FNPROTO(epel_uni_w_v, (uint8_t *_dst, ptrdiff_t _dststride,
int height, int denom, int wx, int ox,
intptr_t mx, intptr_t my, int width),);
NEON8_FNPROTO_PARTIAL_4(qpel_uni_w_v, (uint8_t *_dst, ptrdiff_t _dststride,
NEON8_FNPROTO(qpel_uni_w_v, (uint8_t *_dst, ptrdiff_t _dststride,
const uint8_t *_src, ptrdiff_t _srcstride,
int height, int denom, int wx, int ox,
intptr_t mx, intptr_t my, int width),);
@@ -222,12 +226,12 @@ NEON8_FNPROTO(epel_uni_w_hv, (uint8_t *_dst, ptrdiff_t _dststride,
int height, int denom, int wx, int ox,
intptr_t mx, intptr_t my, int width), _i8mm);
NEON8_FNPROTO_PARTIAL_5(qpel_uni_w_hv, (uint8_t *_dst, ptrdiff_t _dststride,
NEON8_FNPROTO(qpel_uni_w_hv, (uint8_t *_dst, ptrdiff_t _dststride,
const uint8_t *_src, ptrdiff_t _srcstride,
int height, int denom, int wx, int ox,
intptr_t mx, intptr_t my, int width),);
NEON8_FNPROTO_PARTIAL_5(qpel_uni_w_hv, (uint8_t *_dst, ptrdiff_t _dststride,
NEON8_FNPROTO(qpel_uni_w_hv, (uint8_t *_dst, ptrdiff_t _dststride,
const uint8_t *_src, ptrdiff_t _srcstride,
int height, int denom, int wx, int ox,
intptr_t mx, intptr_t my, int width), _i8mm);
+480
@@ -1276,6 +1276,7 @@ function ff_hevc_put_hevc_epel_bi_v32_8_neon, export=1
endfunc
function ff_hevc_put_hevc_epel_bi_v48_8_neon, export=1
AARCH64_SIGN_LINK_REGISTER
stp x4, x5, [sp, #-64]!
stp x2, x3, [sp, #16]
stp x0, x1, [sp, #32]
@@ -1292,10 +1293,12 @@ function ff_hevc_put_hevc_epel_bi_v48_8_neon, export=1
bl X(ff_hevc_put_hevc_epel_bi_v24_8_neon)
ldr x30, [sp, #8]
add sp, sp, #16
AARCH64_VALIDATE_LINK_REGISTER
ret
endfunc
function ff_hevc_put_hevc_epel_bi_v64_8_neon, export=1
AARCH64_SIGN_LINK_REGISTER
stp x4, x5, [sp, #-64]!
stp x2, x3, [sp, #16]
stp x0, x1, [sp, #32]
@@ -1312,6 +1315,7 @@ function ff_hevc_put_hevc_epel_bi_v64_8_neon, export=1
bl X(ff_hevc_put_hevc_epel_bi_v32_8_neon)
ldr x30, [sp, #8]
add sp, sp, #16
AARCH64_VALIDATE_LINK_REGISTER
ret
endfunc
@@ -1744,6 +1748,398 @@ function ff_hevc_put_hevc_epel_uni_v64_8_neon, export=1
ret
endfunc
// epel_uni_h: horizontal EPEL filter with output to uint8_t
// void put_hevc_epel_uni_h(uint8_t *dst, ptrdiff_t dststride,
// const uint8_t *src, ptrdiff_t srcstride,
// int height, intptr_t mx, intptr_t my, int width)
// x0: dst, x1: dststride, x2: src, x3: srcstride, w4: height, x5: mx
.macro EPEL_UNI_H_HEADER
movrel x7, epel_filters
add x7, x7, x5, lsl #2
ld1r {v30.4s}, [x7]
sxtl v0.8h, v30.8b
sub x2, x2, #1
.endm
function ff_hevc_put_hevc_epel_uni_h4_8_neon, export=1
EPEL_UNI_H_HEADER
1: ld1 {v4.8b}, [x2], x3
subs w4, w4, #1
uxtl v4.8h, v4.8b
ext v5.16b, v4.16b, v4.16b, #2
ext v6.16b, v4.16b, v4.16b, #4
ext v7.16b, v4.16b, v4.16b, #6
mul v16.4h, v4.4h, v0.h[0]
mla v16.4h, v5.4h, v0.h[1]
mla v16.4h, v6.4h, v0.h[2]
mla v16.4h, v7.4h, v0.h[3]
sqrshrun v16.8b, v16.8h, #6
st1 {v16.s}[0], [x0], x1
b.ne 1b
ret
endfunc
function ff_hevc_put_hevc_epel_uni_h6_8_neon, export=1
EPEL_UNI_H_HEADER
1: ld1 {v3.16b}, [x2], x3
subs w4, w4, #1
uxtl2 v4.8h, v3.16b
uxtl v3.8h, v3.8b
ext v5.16b, v3.16b, v4.16b, #2
ext v6.16b, v3.16b, v4.16b, #4
ext v7.16b, v3.16b, v4.16b, #6
mul v16.8h, v3.8h, v0.h[0]
mla v16.8h, v5.8h, v0.h[1]
mla v16.8h, v6.8h, v0.h[2]
mla v16.8h, v7.8h, v0.h[3]
sqrshrun v16.8b, v16.8h, #6
add x7, x0, #4
st1 {v16.s}[0], [x0], x1
st1 {v16.h}[2], [x7]
b.ne 1b
ret
endfunc
function ff_hevc_put_hevc_epel_uni_h8_8_neon, export=1
EPEL_UNI_H_HEADER
1: ld1 {v3.16b}, [x2], x3
subs w4, w4, #1
uxtl2 v4.8h, v3.16b
uxtl v3.8h, v3.8b
ext v5.16b, v3.16b, v4.16b, #2
ext v6.16b, v3.16b, v4.16b, #4
ext v7.16b, v3.16b, v4.16b, #6
mul v16.8h, v3.8h, v0.h[0]
mla v16.8h, v5.8h, v0.h[1]
mla v16.8h, v6.8h, v0.h[2]
mla v16.8h, v7.8h, v0.h[3]
sqrshrun v16.8b, v16.8h, #6
st1 {v16.8b}, [x0], x1
b.ne 1b
ret
endfunc
function ff_hevc_put_hevc_epel_uni_h12_8_neon, export=1
EPEL_UNI_H_HEADER
1: ld1 {v3.16b}, [x2], x3
subs w4, w4, #1
uxtl2 v4.8h, v3.16b
uxtl v3.8h, v3.8b
ext v5.16b, v3.16b, v4.16b, #2
ext v6.16b, v3.16b, v4.16b, #4
ext v7.16b, v3.16b, v4.16b, #6
ext v20.16b, v4.16b, v4.16b, #2
ext v21.16b, v4.16b, v4.16b, #4
ext v22.16b, v4.16b, v4.16b, #6
mul v16.8h, v3.8h, v0.h[0]
mla v16.8h, v5.8h, v0.h[1]
mla v16.8h, v6.8h, v0.h[2]
mla v16.8h, v7.8h, v0.h[3]
mul v17.4h, v4.4h, v0.h[0]
mla v17.4h, v20.4h, v0.h[1]
mla v17.4h, v21.4h, v0.h[2]
mla v17.4h, v22.4h, v0.h[3]
sqrshrun v16.8b, v16.8h, #6
sqrshrun v17.8b, v17.8h, #6
add x7, x0, #8
st1 {v16.8b}, [x0], x1
st1 {v17.s}[0], [x7]
b.ne 1b
ret
endfunc
function ff_hevc_put_hevc_epel_uni_h16_8_neon, export=1
EPEL_UNI_H_HEADER
1: ld1 {v2.16b, v3.16b}, [x2], x3
subs w4, w4, #1
uxtl v4.8h, v2.8b
uxtl2 v5.8h, v2.16b
uxtl v6.8h, v3.8b
ext v16.16b, v4.16b, v5.16b, #2
ext v17.16b, v4.16b, v5.16b, #4
ext v18.16b, v4.16b, v5.16b, #6
ext v19.16b, v5.16b, v6.16b, #2
ext v20.16b, v5.16b, v6.16b, #4
ext v21.16b, v5.16b, v6.16b, #6
mul v22.8h, v4.8h, v0.h[0]
mla v22.8h, v16.8h, v0.h[1]
mla v22.8h, v17.8h, v0.h[2]
mla v22.8h, v18.8h, v0.h[3]
mul v23.8h, v5.8h, v0.h[0]
mla v23.8h, v19.8h, v0.h[1]
mla v23.8h, v20.8h, v0.h[2]
mla v23.8h, v21.8h, v0.h[3]
sqrshrun v22.8b, v22.8h, #6
sqrshrun2 v22.16b, v23.8h, #6
st1 {v22.16b}, [x0], x1
b.ne 1b
ret
endfunc
function ff_hevc_put_hevc_epel_uni_h24_8_neon, export=1
EPEL_UNI_H_HEADER
1: ld1 {v1.16b, v2.16b}, [x2], x3
subs w4, w4, #1
uxtl v3.8h, v1.8b
uxtl2 v4.8h, v1.16b
uxtl v5.8h, v2.8b
uxtl2 v6.8h, v2.16b
// First 8 pixels
ext v16.16b, v3.16b, v4.16b, #2
ext v17.16b, v3.16b, v4.16b, #4
ext v18.16b, v3.16b, v4.16b, #6
mul v22.8h, v3.8h, v0.h[0]
mla v22.8h, v16.8h, v0.h[1]
mla v22.8h, v17.8h, v0.h[2]
mla v22.8h, v18.8h, v0.h[3]
// Second 8 pixels
ext v16.16b, v4.16b, v5.16b, #2
ext v17.16b, v4.16b, v5.16b, #4
ext v18.16b, v4.16b, v5.16b, #6
mul v23.8h, v4.8h, v0.h[0]
mla v23.8h, v16.8h, v0.h[1]
mla v23.8h, v17.8h, v0.h[2]
mla v23.8h, v18.8h, v0.h[3]
// Third 8 pixels
ext v16.16b, v5.16b, v6.16b, #2
ext v17.16b, v5.16b, v6.16b, #4
ext v18.16b, v5.16b, v6.16b, #6
mul v24.8h, v5.8h, v0.h[0]
mla v24.8h, v16.8h, v0.h[1]
mla v24.8h, v17.8h, v0.h[2]
mla v24.8h, v18.8h, v0.h[3]
sqrshrun v22.8b, v22.8h, #6
sqrshrun2 v22.16b, v23.8h, #6
sqrshrun v23.8b, v24.8h, #6
add x7, x0, #16
st1 {v22.16b}, [x0], x1
st1 {v23.8b}, [x7]
b.ne 1b
ret
endfunc
function ff_hevc_put_hevc_epel_uni_h32_8_neon, export=1
EPEL_UNI_H_HEADER
1: ld1 {v1.16b, v2.16b, v3.16b}, [x2], x3
subs w4, w4, #1
uxtl v4.8h, v1.8b
uxtl2 v5.8h, v1.16b
uxtl v6.8h, v2.8b
uxtl2 v7.8h, v2.16b
uxtl v26.8h, v3.8b
// First 8 pixels
ext v16.16b, v4.16b, v5.16b, #2
ext v17.16b, v4.16b, v5.16b, #4
ext v18.16b, v4.16b, v5.16b, #6
mul v22.8h, v4.8h, v0.h[0]
mla v22.8h, v16.8h, v0.h[1]
mla v22.8h, v17.8h, v0.h[2]
mla v22.8h, v18.8h, v0.h[3]
// Second 8 pixels
ext v16.16b, v5.16b, v6.16b, #2
ext v17.16b, v5.16b, v6.16b, #4
ext v18.16b, v5.16b, v6.16b, #6
mul v23.8h, v5.8h, v0.h[0]
mla v23.8h, v16.8h, v0.h[1]
mla v23.8h, v17.8h, v0.h[2]
mla v23.8h, v18.8h, v0.h[3]
// Third 8 pixels
ext v16.16b, v6.16b, v7.16b, #2
ext v17.16b, v6.16b, v7.16b, #4
ext v18.16b, v6.16b, v7.16b, #6
mul v24.8h, v6.8h, v0.h[0]
mla v24.8h, v16.8h, v0.h[1]
mla v24.8h, v17.8h, v0.h[2]
mla v24.8h, v18.8h, v0.h[3]
// Fourth 8 pixels
ext v16.16b, v7.16b, v26.16b, #2
ext v17.16b, v7.16b, v26.16b, #4
ext v18.16b, v7.16b, v26.16b, #6
mul v25.8h, v7.8h, v0.h[0]
mla v25.8h, v16.8h, v0.h[1]
mla v25.8h, v17.8h, v0.h[2]
mla v25.8h, v18.8h, v0.h[3]
sqrshrun v22.8b, v22.8h, #6
sqrshrun2 v22.16b, v23.8h, #6
sqrshrun v23.8b, v24.8h, #6
sqrshrun2 v23.16b, v25.8h, #6
st1 {v22.16b, v23.16b}, [x0], x1
b.ne 1b
ret
endfunc
function ff_hevc_put_hevc_epel_uni_h48_8_neon, export=1
EPEL_UNI_H_HEADER
sub sp, sp, #32
st1 {v8.16b, v9.16b}, [sp]
1: ld1 {v1.16b, v2.16b, v3.16b}, [x2]
add x7, x2, #48
ld1 {v26.8b}, [x7]
add x2, x2, x3
subs w4, w4, #1
uxtl v4.8h, v1.8b
uxtl2 v5.8h, v1.16b
uxtl v6.8h, v2.8b
uxtl2 v7.8h, v2.16b
uxtl v8.8h, v3.8b
uxtl2 v9.8h, v3.16b
uxtl v27.8h, v26.8b
// First 8 pixels
ext v16.16b, v4.16b, v5.16b, #2
ext v17.16b, v4.16b, v5.16b, #4
ext v18.16b, v4.16b, v5.16b, #6
mul v22.8h, v4.8h, v0.h[0]
mla v22.8h, v16.8h, v0.h[1]
mla v22.8h, v17.8h, v0.h[2]
mla v22.8h, v18.8h, v0.h[3]
// Second 8 pixels
ext v16.16b, v5.16b, v6.16b, #2
ext v17.16b, v5.16b, v6.16b, #4
ext v18.16b, v5.16b, v6.16b, #6
mul v23.8h, v5.8h, v0.h[0]
mla v23.8h, v16.8h, v0.h[1]
mla v23.8h, v17.8h, v0.h[2]
mla v23.8h, v18.8h, v0.h[3]
// Third 8 pixels
ext v16.16b, v6.16b, v7.16b, #2
ext v17.16b, v6.16b, v7.16b, #4
ext v18.16b, v6.16b, v7.16b, #6
mul v24.8h, v6.8h, v0.h[0]
mla v24.8h, v16.8h, v0.h[1]
mla v24.8h, v17.8h, v0.h[2]
mla v24.8h, v18.8h, v0.h[3]
// Fourth 8 pixels
ext v16.16b, v7.16b, v8.16b, #2
ext v17.16b, v7.16b, v8.16b, #4
ext v18.16b, v7.16b, v8.16b, #6
mul v25.8h, v7.8h, v0.h[0]
mla v25.8h, v16.8h, v0.h[1]
mla v25.8h, v17.8h, v0.h[2]
mla v25.8h, v18.8h, v0.h[3]
// Fifth 8 pixels
ext v16.16b, v8.16b, v9.16b, #2
ext v17.16b, v8.16b, v9.16b, #4
ext v18.16b, v8.16b, v9.16b, #6
mul v28.8h, v8.8h, v0.h[0]
mla v28.8h, v16.8h, v0.h[1]
mla v28.8h, v17.8h, v0.h[2]
mla v28.8h, v18.8h, v0.h[3]
// Sixth 8 pixels
ext v16.16b, v9.16b, v27.16b, #2
ext v17.16b, v9.16b, v27.16b, #4
ext v18.16b, v9.16b, v27.16b, #6
mul v29.8h, v9.8h, v0.h[0]
mla v29.8h, v16.8h, v0.h[1]
mla v29.8h, v17.8h, v0.h[2]
mla v29.8h, v18.8h, v0.h[3]
sqrshrun v22.8b, v22.8h, #6
sqrshrun2 v22.16b, v23.8h, #6
sqrshrun v23.8b, v24.8h, #6
sqrshrun2 v23.16b, v25.8h, #6
sqrshrun v24.8b, v28.8h, #6
sqrshrun2 v24.16b, v29.8h, #6
st1 {v22.16b, v23.16b, v24.16b}, [x0], x1
b.ne 1b
ld1 {v8.16b, v9.16b}, [sp], #32
ret
endfunc
function ff_hevc_put_hevc_epel_uni_h64_8_neon, export=1
EPEL_UNI_H_HEADER
sub sp, sp, #64
st1 {v8.16b, v9.16b, v10.16b, v11.16b}, [sp]
1: add x7, x2, #48
ld1 {v1.16b, v2.16b, v3.16b}, [x2]
ld1 {v26.16b, v27.16b}, [x7]
add x2, x2, x3
subs w4, w4, #1
uxtl v4.8h, v1.8b
uxtl2 v5.8h, v1.16b
uxtl v6.8h, v2.8b
uxtl2 v7.8h, v2.16b
uxtl v8.8h, v3.8b
uxtl2 v9.8h, v3.16b
uxtl v10.8h, v26.8b
uxtl2 v11.8h, v26.16b
uxtl v28.8h, v27.8b
// First 8 pixels
ext v16.16b, v4.16b, v5.16b, #2
ext v17.16b, v4.16b, v5.16b, #4
ext v18.16b, v4.16b, v5.16b, #6
mul v22.8h, v4.8h, v0.h[0]
mla v22.8h, v16.8h, v0.h[1]
mla v22.8h, v17.8h, v0.h[2]
mla v22.8h, v18.8h, v0.h[3]
// Second 8 pixels
ext v16.16b, v5.16b, v6.16b, #2
ext v17.16b, v5.16b, v6.16b, #4
ext v18.16b, v5.16b, v6.16b, #6
mul v23.8h, v5.8h, v0.h[0]
mla v23.8h, v16.8h, v0.h[1]
mla v23.8h, v17.8h, v0.h[2]
mla v23.8h, v18.8h, v0.h[3]
// Third 8 pixels
ext v16.16b, v6.16b, v7.16b, #2
ext v17.16b, v6.16b, v7.16b, #4
ext v18.16b, v6.16b, v7.16b, #6
mul v24.8h, v6.8h, v0.h[0]
mla v24.8h, v16.8h, v0.h[1]
mla v24.8h, v17.8h, v0.h[2]
mla v24.8h, v18.8h, v0.h[3]
// Fourth 8 pixels
ext v16.16b, v7.16b, v8.16b, #2
ext v17.16b, v7.16b, v8.16b, #4
ext v18.16b, v7.16b, v8.16b, #6
mul v25.8h, v7.8h, v0.h[0]
mla v25.8h, v16.8h, v0.h[1]
mla v25.8h, v17.8h, v0.h[2]
mla v25.8h, v18.8h, v0.h[3]
sqrshrun v22.8b, v22.8h, #6
sqrshrun2 v22.16b, v23.8h, #6
sqrshrun v23.8b, v24.8h, #6
sqrshrun2 v23.16b, v25.8h, #6
// Fifth 8 pixels
ext v16.16b, v8.16b, v9.16b, #2
ext v17.16b, v8.16b, v9.16b, #4
ext v18.16b, v8.16b, v9.16b, #6
mul v24.8h, v8.8h, v0.h[0]
mla v24.8h, v16.8h, v0.h[1]
mla v24.8h, v17.8h, v0.h[2]
mla v24.8h, v18.8h, v0.h[3]
// Sixth 8 pixels
ext v16.16b, v9.16b, v10.16b, #2
ext v17.16b, v9.16b, v10.16b, #4
ext v18.16b, v9.16b, v10.16b, #6
mul v25.8h, v9.8h, v0.h[0]
mla v25.8h, v16.8h, v0.h[1]
mla v25.8h, v17.8h, v0.h[2]
mla v25.8h, v18.8h, v0.h[3]
// Seventh 8 pixels
ext v16.16b, v10.16b, v11.16b, #2
ext v17.16b, v10.16b, v11.16b, #4
ext v18.16b, v10.16b, v11.16b, #6
mul v26.8h, v10.8h, v0.h[0]
mla v26.8h, v16.8h, v0.h[1]
mla v26.8h, v17.8h, v0.h[2]
mla v26.8h, v18.8h, v0.h[3]
// Eighth 8 pixels
ext v16.16b, v11.16b, v28.16b, #2
ext v17.16b, v11.16b, v28.16b, #4
ext v18.16b, v11.16b, v28.16b, #6
mul v27.8h, v11.8h, v0.h[0]
mla v27.8h, v16.8h, v0.h[1]
mla v27.8h, v17.8h, v0.h[2]
mla v27.8h, v18.8h, v0.h[3]
sqrshrun v24.8b, v24.8h, #6
sqrshrun2 v24.16b, v25.8h, #6
sqrshrun v25.8b, v26.8h, #6
sqrshrun2 v25.16b, v27.8h, #6
st1 {v22.16b, v23.16b, v24.16b, v25.16b}, [x0], x1
b.ne 1b
ld1 {v8.16b, v9.16b, v10.16b, v11.16b}, [sp], #64
ret
endfunc
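For readers who do not read NEON fluently, the per-sample arithmetic of the `epel_uni_h` kernels above (`mul`/`mla` by the four coefficients in `v0.h[0..3]`, then `sqrshrun ... #6`) can be sketched in scalar C. This is a minimal sketch and not FFmpeg code: the helper name `epel_uni_h_sample` is hypothetical, and it assumes `src` already points at the first of the four taps, since the pointer adjustment done by `EPEL_UNI_H_HEADER` is not visible in this excerpt.

```c
#include <stdint.h>

/* Hypothetical scalar model of one output sample: 4-tap FIR, rounding
 * right shift by 6, then saturation to an unsigned 8-bit value (the
 * effect of `sqrshrun #6` on a 16-bit accumulator). */
static uint8_t epel_uni_h_sample(const uint8_t *src, const int8_t taps[4])
{
    int sum = 0;

    for (int i = 0; i < 4; i++)
        sum += taps[i] * src[i];

    sum += 32;            /* rounding constant: 1 << (6 - 1) */
    if (sum < 0)          /* negative results saturate to 0 */
        return 0;
    sum >>= 6;
    return sum > 255 ? 255 : (uint8_t)sum;
}
```

With the coefficient row `{ -4, 54, 16, -2 }` (which sums to 64), a flat input of 100 yields 100, while overshoot and undershoot clamp to 255 and 0 respectively.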
.macro EPEL_H_HEADER
movrel x5, epel_filters
@@ -2824,6 +3220,7 @@ function ff_hevc_put_hevc_epel_hv4_8_\suffix, export=1
add w10, w3, #3
lsl x10, x10, #7
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
stp x5, x30, [sp, #-32]!
stp x0, x3, [sp, #16]
add x0, sp, #32
@@ -2832,6 +3229,7 @@ function ff_hevc_put_hevc_epel_hv4_8_\suffix, export=1
bl X(ff_hevc_put_hevc_epel_h4_8_\suffix)
ldp x0, x3, [sp, #16]
ldp x5, x30, [sp], #32
AARCH64_VALIDATE_LINK_REGISTER
b hevc_put_hevc_epel_hv4_8_end_neon
endfunc
@@ -2839,6 +3237,7 @@ function ff_vvc_put_epel_hv4_8_\suffix, export=1
add w10, w3, #3
lsl x10, x10, #8
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
stp x5, x30, [sp, #-32]!
stp x0, x3, [sp, #16]
add x0, sp, #32
@@ -2847,6 +3246,7 @@ function ff_vvc_put_epel_hv4_8_\suffix, export=1
bl X(ff_vvc_put_epel_h4_8_\suffix)
ldp x0, x3, [sp, #16]
ldp x5, x30, [sp], #32
AARCH64_VALIDATE_LINK_REGISTER
b vvc_put_epel_hv4_8_end_neon
endfunc
@@ -2854,6 +3254,7 @@ function ff_hevc_put_hevc_epel_hv6_8_\suffix, export=1
add w10, w3, #3
lsl x10, x10, #7
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
stp x5, x30, [sp, #-32]!
stp x0, x3, [sp, #16]
add x0, sp, #32
@@ -2862,6 +3263,7 @@ function ff_hevc_put_hevc_epel_hv6_8_\suffix, export=1
bl X(ff_hevc_put_hevc_epel_h6_8_\suffix)
ldp x0, x3, [sp, #16]
ldp x5, x30, [sp], #32
AARCH64_VALIDATE_LINK_REGISTER
b hevc_put_hevc_epel_hv6_8_end_neon
endfunc
@@ -2869,6 +3271,7 @@ function ff_hevc_put_hevc_epel_hv8_8_\suffix, export=1
add w10, w3, #3
lsl x10, x10, #7
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
stp x5, x30, [sp, #-32]!
stp x0, x3, [sp, #16]
add x0, sp, #32
@@ -2877,6 +3280,7 @@ function ff_hevc_put_hevc_epel_hv8_8_\suffix, export=1
bl X(ff_hevc_put_hevc_epel_h8_8_\suffix)
ldp x0, x3, [sp, #16]
ldp x5, x30, [sp], #32
AARCH64_VALIDATE_LINK_REGISTER
b hevc_put_hevc_epel_hv8_8_end_neon
endfunc
@@ -2884,6 +3288,7 @@ function ff_vvc_put_epel_hv8_8_\suffix, export=1
add w10, w3, #3
lsl x10, x10, #8
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
stp x5, x30, [sp, #-32]!
stp x0, x3, [sp, #16]
add x0, sp, #32
@@ -2892,6 +3297,7 @@ function ff_vvc_put_epel_hv8_8_\suffix, export=1
bl X(ff_vvc_put_epel_h8_8_\suffix)
ldp x0, x3, [sp, #16]
ldp x5, x30, [sp], #32
AARCH64_VALIDATE_LINK_REGISTER
b vvc_put_epel_hv8_8_end_neon
endfunc
@@ -2899,6 +3305,7 @@ function ff_hevc_put_hevc_epel_hv12_8_\suffix, export=1
add w10, w3, #3
lsl x10, x10, #7
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
stp x5, x30, [sp, #-32]!
stp x0, x3, [sp, #16]
add x0, sp, #32
@@ -2907,6 +3314,7 @@ function ff_hevc_put_hevc_epel_hv12_8_\suffix, export=1
bl X(ff_hevc_put_hevc_epel_h12_8_\suffix)
ldp x0, x3, [sp, #16]
ldp x5, x30, [sp], #32
AARCH64_VALIDATE_LINK_REGISTER
b hevc_put_hevc_epel_hv12_8_end_neon
endfunc
@@ -2914,6 +3322,7 @@ function ff_hevc_put_hevc_epel_hv16_8_\suffix, export=1
add w10, w3, #3
lsl x10, x10, #7
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
stp x5, x30, [sp, #-32]!
stp x0, x3, [sp, #16]
add x0, sp, #32
@@ -2922,6 +3331,7 @@ function ff_hevc_put_hevc_epel_hv16_8_\suffix, export=1
bl X(ff_hevc_put_hevc_epel_h16_8_\suffix)
ldp x0, x3, [sp, #16]
ldp x5, x30, [sp], #32
AARCH64_VALIDATE_LINK_REGISTER
b hevc_put_hevc_epel_hv16_8_end_neon
endfunc
@@ -2929,6 +3339,7 @@ function ff_vvc_put_epel_hv16_8_\suffix, export=1
add w10, w3, #3
lsl x10, x10, #8
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
stp x5, x30, [sp, #-32]!
stp x0, x3, [sp, #16]
add x0, sp, #32
@@ -2937,6 +3348,7 @@ function ff_vvc_put_epel_hv16_8_\suffix, export=1
bl X(ff_vvc_put_epel_h16_8_\suffix)
ldp x0, x3, [sp, #16]
ldp x5, x30, [sp], #32
AARCH64_VALIDATE_LINK_REGISTER
b vvc_put_epel_hv16_8_end_neon
endfunc
@@ -2944,6 +3356,7 @@ function ff_hevc_put_hevc_epel_hv24_8_\suffix, export=1
add w10, w3, #3
lsl x10, x10, #7
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
stp x5, x30, [sp, #-32]!
stp x0, x3, [sp, #16]
add x0, sp, #32
@@ -2952,10 +3365,12 @@ function ff_hevc_put_hevc_epel_hv24_8_\suffix, export=1
bl X(ff_hevc_put_hevc_epel_h24_8_\suffix)
ldp x0, x3, [sp, #16]
ldp x5, x30, [sp], #32
AARCH64_VALIDATE_LINK_REGISTER
b hevc_put_hevc_epel_hv24_8_end_neon
endfunc
function ff_hevc_put_hevc_epel_hv32_8_\suffix, export=1
AARCH64_SIGN_LINK_REGISTER
stp x4, x5, [sp, #-64]!
stp x2, x3, [sp, #16]
stp x0, x1, [sp, #32]
@@ -2970,10 +3385,12 @@ function ff_hevc_put_hevc_epel_hv32_8_\suffix, export=1
mov x6, #16
bl X(ff_hevc_put_hevc_epel_hv16_8_\suffix)
ldr x30, [sp], #16
AARCH64_VALIDATE_LINK_REGISTER
ret
endfunc
function ff_vvc_put_epel_hv32_8_\suffix, export=1
AARCH64_SIGN_LINK_REGISTER
stp x4, x5, [sp, #-64]!
stp x2, x3, [sp, #16]
stp x0, x1, [sp, #32]
@@ -2988,10 +3405,12 @@ function ff_vvc_put_epel_hv32_8_\suffix, export=1
mov x6, #16
bl X(ff_vvc_put_epel_hv16_8_\suffix)
ldr x30, [sp], #16
AARCH64_VALIDATE_LINK_REGISTER
ret
endfunc
function ff_hevc_put_hevc_epel_hv48_8_\suffix, export=1
AARCH64_SIGN_LINK_REGISTER
stp x4, x5, [sp, #-64]!
stp x2, x3, [sp, #16]
stp x0, x1, [sp, #32]
@@ -3006,10 +3425,12 @@ function ff_hevc_put_hevc_epel_hv48_8_\suffix, export=1
mov x6, #24
bl X(ff_hevc_put_hevc_epel_hv24_8_\suffix)
ldr x30, [sp], #16
AARCH64_VALIDATE_LINK_REGISTER
ret
endfunc
function ff_hevc_put_hevc_epel_hv64_8_\suffix, export=1
AARCH64_SIGN_LINK_REGISTER
stp x4, x5, [sp, #-64]!
stp x2, x3, [sp, #16]
stp x0, x1, [sp, #32]
@@ -3038,10 +3459,12 @@ function ff_hevc_put_hevc_epel_hv64_8_\suffix, export=1
mov x6, #16
bl X(ff_hevc_put_hevc_epel_hv16_8_\suffix)
ldr x30, [sp], #16
AARCH64_VALIDATE_LINK_REGISTER
ret
endfunc
function ff_vvc_put_epel_hv64_8_\suffix, export=1
AARCH64_SIGN_LINK_REGISTER
stp x4, x5, [sp, #-64]!
stp x2, x3, [sp, #16]
stp x0, x1, [sp, #32]
@@ -3056,10 +3479,12 @@ function ff_vvc_put_epel_hv64_8_\suffix, export=1
mov x6, #32
bl X(ff_vvc_put_epel_hv32_8_\suffix)
ldr x30, [sp], #16
AARCH64_VALIDATE_LINK_REGISTER
ret
endfunc
function ff_vvc_put_epel_hv128_8_\suffix, export=1
AARCH64_SIGN_LINK_REGISTER
stp x4, x5, [sp, #-64]!
stp x2, x3, [sp, #16]
stp x0, x1, [sp, #32]
@@ -3074,6 +3499,7 @@ function ff_vvc_put_epel_hv128_8_\suffix, export=1
mov x6, #64
bl X(ff_vvc_put_epel_hv64_8_\suffix)
ldr x30, [sp], #16
AARCH64_VALIDATE_LINK_REGISTER
ret
endfunc
@@ -3214,6 +3640,7 @@ function ff_hevc_put_hevc_epel_uni_hv4_8_\suffix, export=1
add w10, w4, #3
lsl x10, x10, #7
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
str x30, [sp, #-48]!
stp x4, x6, [sp, #16]
stp x0, x1, [sp, #32]
@@ -3226,6 +3653,7 @@ function ff_hevc_put_hevc_epel_uni_hv4_8_\suffix, export=1
ldp x4, x6, [sp, #16]
ldp x0, x1, [sp, #32]
ldr x30, [sp], #48
AARCH64_VALIDATE_LINK_REGISTER
b hevc_put_hevc_epel_uni_hv4_8_end_neon
endfunc
@@ -3233,6 +3661,7 @@ function ff_hevc_put_hevc_epel_uni_hv6_8_\suffix, export=1
add w10, w4, #3
lsl x10, x10, #7
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
str x30, [sp, #-48]!
stp x4, x6, [sp, #16]
stp x0, x1, [sp, #32]
@@ -3245,6 +3674,7 @@ function ff_hevc_put_hevc_epel_uni_hv6_8_\suffix, export=1
ldp x4, x6, [sp, #16]
ldp x0, x1, [sp, #32]
ldr x30, [sp], #48
AARCH64_VALIDATE_LINK_REGISTER
b hevc_put_hevc_epel_uni_hv6_8_end_neon
endfunc
@@ -3252,6 +3682,7 @@ function ff_hevc_put_hevc_epel_uni_hv8_8_\suffix, export=1
add w10, w4, #3
lsl x10, x10, #7
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
str x30, [sp, #-48]!
stp x4, x6, [sp, #16]
stp x0, x1, [sp, #32]
@@ -3264,6 +3695,7 @@ function ff_hevc_put_hevc_epel_uni_hv8_8_\suffix, export=1
ldp x4, x6, [sp, #16]
ldp x0, x1, [sp, #32]
ldr x30, [sp], #48
AARCH64_VALIDATE_LINK_REGISTER
b hevc_put_hevc_epel_uni_hv8_8_end_neon
endfunc
@@ -3271,6 +3703,7 @@ function ff_hevc_put_hevc_epel_uni_hv12_8_\suffix, export=1
add w10, w4, #3
lsl x10, x10, #7
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
str x30, [sp, #-48]!
stp x4, x6, [sp, #16]
stp x0, x1, [sp, #32]
@@ -3283,6 +3716,7 @@ function ff_hevc_put_hevc_epel_uni_hv12_8_\suffix, export=1
ldp x4, x6, [sp, #16]
ldp x0, x1, [sp, #32]
ldr x30, [sp], #48
AARCH64_VALIDATE_LINK_REGISTER
b hevc_put_hevc_epel_uni_hv12_8_end_neon
endfunc
@@ -3290,6 +3724,7 @@ function ff_hevc_put_hevc_epel_uni_hv16_8_\suffix, export=1
add w10, w4, #3
lsl x10, x10, #7
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
str x30, [sp, #-48]!
stp x4, x6, [sp, #16]
stp x0, x1, [sp, #32]
@@ -3302,6 +3737,7 @@ function ff_hevc_put_hevc_epel_uni_hv16_8_\suffix, export=1
ldp x4, x6, [sp, #16]
ldp x0, x1, [sp, #32]
ldr x30, [sp], #48
AARCH64_VALIDATE_LINK_REGISTER
b hevc_put_hevc_epel_uni_hv16_8_end_neon
endfunc
@@ -3309,6 +3745,7 @@ function ff_hevc_put_hevc_epel_uni_hv24_8_\suffix, export=1
add w10, w4, #3
lsl x10, x10, #7
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
str x30, [sp, #-48]!
stp x4, x6, [sp, #16]
stp x0, x1, [sp, #32]
@@ -3321,10 +3758,12 @@ function ff_hevc_put_hevc_epel_uni_hv24_8_\suffix, export=1
ldp x4, x6, [sp, #16]
ldp x0, x1, [sp, #32]
ldr x30, [sp], #48
AARCH64_VALIDATE_LINK_REGISTER
b hevc_put_hevc_epel_uni_hv24_8_end_neon
endfunc
function ff_hevc_put_hevc_epel_uni_hv32_8_\suffix, export=1
AARCH64_SIGN_LINK_REGISTER
stp x5, x6, [sp, #-64]!
stp x3, x4, [sp, #16]
stp x1, x2, [sp, #32]
@@ -3341,10 +3780,12 @@ function ff_hevc_put_hevc_epel_uni_hv32_8_\suffix, export=1
bl X(ff_hevc_put_hevc_epel_uni_hv16_8_\suffix)
ldr x30, [sp, #56]
add sp, sp, #64
AARCH64_VALIDATE_LINK_REGISTER
ret
endfunc
function ff_hevc_put_hevc_epel_uni_hv48_8_\suffix, export=1
AARCH64_SIGN_LINK_REGISTER
stp x5, x6, [sp, #-64]!
stp x3, x4, [sp, #16]
stp x1, x2, [sp, #32]
@@ -3361,10 +3802,12 @@ function ff_hevc_put_hevc_epel_uni_hv48_8_\suffix, export=1
bl X(ff_hevc_put_hevc_epel_uni_hv24_8_\suffix)
ldr x30, [sp, #56]
add sp, sp, #64
AARCH64_VALIDATE_LINK_REGISTER
ret
endfunc
function ff_hevc_put_hevc_epel_uni_hv64_8_\suffix, export=1
AARCH64_SIGN_LINK_REGISTER
stp x5, x6, [sp, #-64]!
stp x3, x4, [sp, #16]
stp x1, x2, [sp, #32]
@@ -3397,6 +3840,7 @@ function ff_hevc_put_hevc_epel_uni_hv64_8_\suffix, export=1
bl X(ff_hevc_put_hevc_epel_uni_hv16_8_\suffix)
ldr x30, [sp, #56]
add sp, sp, #64
AARCH64_VALIDATE_LINK_REGISTER
ret
endfunc
.endm
@@ -4202,6 +4646,7 @@ function ff_hevc_put_hevc_epel_uni_w_hv4_8_\suffix, export=1
add x10, x4, #3
lsl x10, x10, #7
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
str x30, [sp, #-48]!
stp x4, x6, [sp, #16]
stp x0, x1, [sp, #32]
@@ -4214,6 +4659,7 @@ function ff_hevc_put_hevc_epel_uni_w_hv4_8_\suffix, export=1
ldp x4, x6, [sp, #16]
ldp x0, x1, [sp, #32]
ldr x30, [sp], #48
AARCH64_VALIDATE_LINK_REGISTER
b hevc_put_hevc_epel_uni_w_hv4_8_end_neon
endfunc
@@ -4224,6 +4670,7 @@ function ff_hevc_put_hevc_epel_uni_w_hv6_8_\suffix, export=1
add x10, x4, #3
lsl x10, x10, #7
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
str x30, [sp, #-48]!
stp x4, x6, [sp, #16]
stp x0, x1, [sp, #32]
@@ -4236,6 +4683,7 @@ function ff_hevc_put_hevc_epel_uni_w_hv6_8_\suffix, export=1
ldp x4, x6, [sp, #16]
ldp x0, x1, [sp, #32]
ldr x30, [sp], #48
AARCH64_VALIDATE_LINK_REGISTER
b hevc_put_hevc_epel_uni_w_hv6_8_end_neon
endfunc
@@ -4246,6 +4694,7 @@ function ff_hevc_put_hevc_epel_uni_w_hv8_8_\suffix, export=1
add x10, x4, #3
lsl x10, x10, #7
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
str x30, [sp, #-48]!
stp x4, x6, [sp, #16]
stp x0, x1, [sp, #32]
@@ -4258,6 +4707,7 @@ function ff_hevc_put_hevc_epel_uni_w_hv8_8_\suffix, export=1
ldp x4, x6, [sp, #16]
ldp x0, x1, [sp, #32]
ldr x30, [sp], #48
AARCH64_VALIDATE_LINK_REGISTER
b hevc_put_hevc_epel_uni_w_hv8_8_end_neon
endfunc
@@ -4268,6 +4718,7 @@ function ff_hevc_put_hevc_epel_uni_w_hv12_8_\suffix, export=1
add x10, x4, #3
lsl x10, x10, #7
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
str x30, [sp, #-48]!
stp x4, x6, [sp, #16]
stp x0, x1, [sp, #32]
@@ -4280,6 +4731,7 @@ function ff_hevc_put_hevc_epel_uni_w_hv12_8_\suffix, export=1
ldp x4, x6, [sp, #16]
ldp x0, x1, [sp, #32]
ldr x30, [sp], #48
AARCH64_VALIDATE_LINK_REGISTER
b hevc_put_hevc_epel_uni_w_hv12_8_end_neon
endfunc
@@ -4290,6 +4742,7 @@ function ff_hevc_put_hevc_epel_uni_w_hv16_8_\suffix, export=1
add x10, x4, #3
lsl x10, x10, #7
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
str x30, [sp, #-48]!
stp x4, x6, [sp, #16]
stp x0, x1, [sp, #32]
@@ -4302,6 +4755,7 @@ function ff_hevc_put_hevc_epel_uni_w_hv16_8_\suffix, export=1
ldp x4, x6, [sp, #16]
ldp x0, x1, [sp, #32]
ldr x30, [sp], #48
AARCH64_VALIDATE_LINK_REGISTER
b hevc_put_hevc_epel_uni_w_hv16_8_end_neon
endfunc
@@ -4312,6 +4766,7 @@ function ff_hevc_put_hevc_epel_uni_w_hv24_8_\suffix, export=1
add x10, x4, #3
lsl x10, x10, #7
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
str x30, [sp, #-48]!
stp x4, x6, [sp, #16]
stp x0, x1, [sp, #32]
@@ -4324,10 +4779,12 @@ function ff_hevc_put_hevc_epel_uni_w_hv24_8_\suffix, export=1
ldp x4, x6, [sp, #16]
ldp x0, x1, [sp, #32]
ldr x30, [sp], #48
AARCH64_VALIDATE_LINK_REGISTER
b hevc_put_hevc_epel_uni_w_hv24_8_end_neon
endfunc
function ff_hevc_put_hevc_epel_uni_w_hv32_8_\suffix, export=1
AARCH64_SIGN_LINK_REGISTER
ldp x15, x16, [sp]
mov x17, #16
stp x15, x16, [sp, #-96]!
@@ -4352,10 +4809,12 @@ function ff_hevc_put_hevc_epel_uni_w_hv32_8_\suffix, export=1
bl X(ff_hevc_put_hevc_epel_uni_w_hv16_8_\suffix)
ldp x17, x30, [sp, #16]
ldp x15, x16, [sp], #32
AARCH64_VALIDATE_LINK_REGISTER
ret
endfunc
function ff_hevc_put_hevc_epel_uni_w_hv48_8_\suffix, export=1
AARCH64_SIGN_LINK_REGISTER
ldp x15, x16, [sp]
mov x17, #24
stp x15, x16, [sp, #-96]!
@@ -4379,10 +4838,12 @@ function ff_hevc_put_hevc_epel_uni_w_hv48_8_\suffix, export=1
bl X(ff_hevc_put_hevc_epel_uni_w_hv24_8_\suffix)
ldp x17, x30, [sp, #16]
ldp x15, x16, [sp], #32
AARCH64_VALIDATE_LINK_REGISTER
ret
endfunc
function ff_hevc_put_hevc_epel_uni_w_hv64_8_\suffix, export=1
AARCH64_SIGN_LINK_REGISTER
ldp x15, x16, [sp]
mov x17, #32
stp x15, x16, [sp, #-96]!
@@ -4407,6 +4868,7 @@ function ff_hevc_put_hevc_epel_uni_w_hv64_8_\suffix, export=1
bl X(ff_hevc_put_hevc_epel_uni_w_hv32_8_\suffix)
ldp x17, x30, [sp, #16]
ldp x15, x16, [sp], #32
AARCH64_VALIDATE_LINK_REGISTER
ret
endfunc
.endm
@@ -4597,6 +5059,7 @@ function ff_hevc_put_hevc_epel_bi_hv4_8_\suffix, export=1
add w10, w5, #3
lsl x10, x10, #7
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
stp x7, x30, [sp, #-48]!
stp x4, x5, [sp, #16]
stp x0, x1, [sp, #32]
@@ -4610,6 +5073,7 @@ function ff_hevc_put_hevc_epel_bi_hv4_8_\suffix, export=1
ldp x4, x5, [sp, #16]
ldp x0, x1, [sp, #32]
ldp x7, x30, [sp], #48
AARCH64_VALIDATE_LINK_REGISTER
b hevc_put_hevc_epel_bi_hv4_8_end_neon
endfunc
@@ -4617,6 +5081,7 @@ function ff_hevc_put_hevc_epel_bi_hv6_8_\suffix, export=1
add w10, w5, #3
lsl x10, x10, #7
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
stp x7, x30, [sp, #-48]!
stp x4, x5, [sp, #16]
stp x0, x1, [sp, #32]
@@ -4630,6 +5095,7 @@ function ff_hevc_put_hevc_epel_bi_hv6_8_\suffix, export=1
ldp x4, x5, [sp, #16]
ldp x0, x1, [sp, #32]
ldp x7, x30, [sp], #48
AARCH64_VALIDATE_LINK_REGISTER
b hevc_put_hevc_epel_bi_hv6_8_end_neon
endfunc
@@ -4637,6 +5103,7 @@ function ff_hevc_put_hevc_epel_bi_hv8_8_\suffix, export=1
add w10, w5, #3
lsl x10, x10, #7
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
stp x7, x30, [sp, #-48]!
stp x4, x5, [sp, #16]
stp x0, x1, [sp, #32]
@@ -4650,6 +5117,7 @@ function ff_hevc_put_hevc_epel_bi_hv8_8_\suffix, export=1
ldp x4, x5, [sp, #16]
ldp x0, x1, [sp, #32]
ldp x7, x30, [sp], #48
AARCH64_VALIDATE_LINK_REGISTER
b hevc_put_hevc_epel_bi_hv8_8_end_neon
endfunc
@@ -4657,6 +5125,7 @@ function ff_hevc_put_hevc_epel_bi_hv12_8_\suffix, export=1
add w10, w5, #3
lsl x10, x10, #7
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
stp x7, x30, [sp, #-48]!
stp x4, x5, [sp, #16]
stp x0, x1, [sp, #32]
@@ -4670,6 +5139,7 @@ function ff_hevc_put_hevc_epel_bi_hv12_8_\suffix, export=1
ldp x4, x5, [sp, #16]
ldp x0, x1, [sp, #32]
ldp x7, x30, [sp], #48
AARCH64_VALIDATE_LINK_REGISTER
b hevc_put_hevc_epel_bi_hv12_8_end_neon
endfunc
@@ -4677,6 +5147,7 @@ function ff_hevc_put_hevc_epel_bi_hv16_8_\suffix, export=1
add w10, w5, #3
lsl x10, x10, #7
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
stp x7, x30, [sp, #-48]!
stp x4, x5, [sp, #16]
stp x0, x1, [sp, #32]
@@ -4690,6 +5161,7 @@ function ff_hevc_put_hevc_epel_bi_hv16_8_\suffix, export=1
ldp x4, x5, [sp, #16]
ldp x0, x1, [sp, #32]
ldp x7, x30, [sp], #48
AARCH64_VALIDATE_LINK_REGISTER
b hevc_put_hevc_epel_bi_hv16_8_end_neon
endfunc
@@ -4697,6 +5169,7 @@ function ff_hevc_put_hevc_epel_bi_hv24_8_\suffix, export=1
add w10, w5, #3
lsl x10, x10, #7
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
stp x7, x30, [sp, #-48]!
stp x4, x5, [sp, #16]
stp x0, x1, [sp, #32]
@@ -4710,6 +5183,7 @@ function ff_hevc_put_hevc_epel_bi_hv24_8_\suffix, export=1
ldp x4, x5, [sp, #16]
ldp x0, x1, [sp, #32]
ldp x7, x30, [sp], #48
AARCH64_VALIDATE_LINK_REGISTER
b hevc_put_hevc_epel_bi_hv24_8_end_neon
endfunc
@@ -4718,6 +5192,7 @@ function ff_hevc_put_hevc_epel_bi_hv32_8_\suffix, export=1
add w10, w5, #3
lsl x10, x10, #7
sub sp, sp, x10 // tmp_array
AARCH64_SIGN_LINK_REGISTER
stp x7, x30, [sp, #-48]!
stp x4, x5, [sp, #16]
stp x0, x1, [sp, #32]
@@ -4732,10 +5207,12 @@ function ff_hevc_put_hevc_epel_bi_hv32_8_\suffix, export=1
ldp x4, x5, [sp, #16]
ldp x0, x1, [sp, #32]
ldp x7, x30, [sp], #48
AARCH64_VALIDATE_LINK_REGISTER
b hevc_put_hevc_epel_bi_hv32_8_end_neon
endfunc
function ff_hevc_put_hevc_epel_bi_hv48_8_\suffix, export=1
AARCH64_SIGN_LINK_REGISTER
stp x6, x7, [sp, #-80]!
stp x4, x5, [sp, #16]
stp x2, x3, [sp, #32]
@@ -4751,10 +5228,12 @@ function ff_hevc_put_hevc_epel_bi_hv48_8_\suffix, export=1
add x4, x4, #48
bl X(ff_hevc_put_hevc_epel_bi_hv24_8_\suffix)
ldr x30, [sp], #16
AARCH64_VALIDATE_LINK_REGISTER
ret
endfunc
function ff_hevc_put_hevc_epel_bi_hv64_8_\suffix, export=1
AARCH64_SIGN_LINK_REGISTER
stp x6, x7, [sp, #-80]!
stp x4, x5, [sp, #16]
stp x2, x3, [sp, #32]
@@ -4770,6 +5249,7 @@ function ff_hevc_put_hevc_epel_bi_hv64_8_\suffix, export=1
add x4, x4, #64
bl X(ff_hevc_put_hevc_epel_bi_hv32_8_\suffix)
ldr x30, [sp], #16
AARCH64_VALIDATE_LINK_REGISTER
ret
endfunc
.endm
File diff suppressed because it is too large
+6 -1
@@ -511,8 +511,11 @@ function hevc_loop_filter_luma_body_\bitdepth\()_neon, export=0
sqxtun v6.8b, v6.8h
sqxtun v7.8b, v7.8h
.endif
// Use x15 to signal whether any pixels should be updated or not.
mov x15, #1
ret
3: mov x15, #0
ret
3: ret x6
endfunc
.endm
@@ -562,6 +565,7 @@ function ff_hevc_\dir\()_loop_filter_luma_\bitdepth\()_neon, export=1
.endif
.endif
bl hevc_loop_filter_luma_body_\bitdepth\()_neon
cbz x15, 9f
.if \bitdepth > 8
.ifc \dir, v
transpose_8x8H v0, v1, v2, v3, v4, v5, v6, v7, v16, v17
@@ -587,6 +591,7 @@ function ff_hevc_\dir\()_loop_filter_luma_\bitdepth\()_neon, export=1
st1 {v6.8b}, [x10], x1
st1 {v7.8b}, [x10]
.endif
9:
ret x6
endfunc
.endm
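The loop-filter hunks above make the shared body report through `x15` whether any pixels need updating, so the caller can branch over the transpose and store sequence with `cbz x15, 9f`. A rough C analogy of that control flow is sketched below. Everything in it is hypothetical (`filter_body`, `filter_wrapper`, and the toy clamping "filter"); it only illustrates the flag-and-skip pattern, not the HEVC deblocking math.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the filter body: clamps samples above `limit` in a
 * work buffer. Returns 1 if any sample changed, 0 otherwise (the role
 * x15 plays in the assembly). */
static int filter_body(unsigned char *work, size_t n, unsigned char limit)
{
    int changed = 0;

    for (size_t i = 0; i < n; i++) {
        if (work[i] > limit) {
            work[i] = limit;
            changed = 1;
        }
    }
    return changed;
}

/* Toy stand-in for the exported wrapper: load, run the body, and write
 * back only when the body signalled a change. Returns bytes written. */
static size_t filter_wrapper(unsigned char *pix, size_t n, unsigned char limit)
{
    unsigned char work[64];

    if (n > sizeof(work))
        return 0;
    memcpy(work, pix, n);
    if (!filter_body(work, n, limit))
        return 0;               /* analogous to `cbz x15, 9f` */
    memcpy(pix, work, n);
    return n;
}
```

The point of the pattern is that the early-out path returns to the wrapper normally instead of returning past it, so the wrapper keeps control of its own epilogue.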
+73 -60
@@ -194,6 +194,24 @@ static void hevc_dequant_12_neon(int16_t *coeffs, int16_t log2_size)
member[8][v][h] = ff_hevc_put_hevc_##fn##24_8_neon##ext; \
member[9][v][h] = ff_hevc_put_hevc_##fn##32_8_neon##ext;
/*
* qpel horizontal (non-i8mm): no dedicated w24/w48/w64 NEON functions,
* w12 and w24 share h12 (loop x2), w32/w48/w64 share h32 (loop).
*
* Index-to-width: [1]=4 [2]=6 [3]=8 [4]=12 [5]=16
* [6]=24 [7]=32 [8]=48 [9]=64
*/
#define NEON8_FNASSIGN_QPEL_H(member, fn) \
member[1][0][1] = ff_hevc_put_hevc_##fn##_h4_8_neon; \
member[2][0][1] = ff_hevc_put_hevc_##fn##_h6_8_neon; \
member[3][0][1] = ff_hevc_put_hevc_##fn##_h8_8_neon; \
member[4][0][1] = \
member[6][0][1] = ff_hevc_put_hevc_##fn##_h12_8_neon; \
member[5][0][1] = ff_hevc_put_hevc_##fn##_h16_8_neon; \
member[7][0][1] = \
member[8][0][1] = \
member[9][0][1] = ff_hevc_put_hevc_##fn##_h32_8_neon;
av_cold void ff_hevc_dsp_init_aarch64(HEVCDSPContext *c, const int bit_depth)
{
int cpu_flags = av_get_cpu_flags();
@@ -228,82 +246,77 @@ av_cold void ff_hevc_dsp_init_aarch64(HEVCDSPContext *c, const int bit_depth)
c->sao_edge_filter[2] =
c->sao_edge_filter[3] =
c->sao_edge_filter[4] = ff_hevc_sao_edge_filter_16x16_8_neon;
c->put_hevc_qpel[1][0][1] = ff_hevc_put_hevc_qpel_h4_8_neon;
c->put_hevc_qpel[2][0][1] = ff_hevc_put_hevc_qpel_h6_8_neon;
c->put_hevc_qpel[3][0][1] = ff_hevc_put_hevc_qpel_h8_8_neon;
c->put_hevc_qpel[4][0][1] =
c->put_hevc_qpel[6][0][1] = ff_hevc_put_hevc_qpel_h12_8_neon;
c->put_hevc_qpel[5][0][1] = ff_hevc_put_hevc_qpel_h16_8_neon;
c->put_hevc_qpel[7][0][1] =
c->put_hevc_qpel[8][0][1] =
c->put_hevc_qpel[9][0][1] = ff_hevc_put_hevc_qpel_h32_8_neon;
c->put_hevc_qpel_uni[1][0][1] = ff_hevc_put_hevc_qpel_uni_h4_8_neon;
c->put_hevc_qpel_uni[2][0][1] = ff_hevc_put_hevc_qpel_uni_h6_8_neon;
c->put_hevc_qpel_uni[3][0][1] = ff_hevc_put_hevc_qpel_uni_h8_8_neon;
c->put_hevc_qpel_uni[4][0][1] =
c->put_hevc_qpel_uni[6][0][1] = ff_hevc_put_hevc_qpel_uni_h12_8_neon;
c->put_hevc_qpel_uni[5][0][1] = ff_hevc_put_hevc_qpel_uni_h16_8_neon;
c->put_hevc_qpel_uni[7][0][1] =
c->put_hevc_qpel_uni[8][0][1] =
c->put_hevc_qpel_uni[9][0][1] = ff_hevc_put_hevc_qpel_uni_h32_8_neon;
c->put_hevc_qpel_bi[1][0][1] = ff_hevc_put_hevc_qpel_bi_h4_8_neon;
c->put_hevc_qpel_bi[2][0][1] = ff_hevc_put_hevc_qpel_bi_h6_8_neon;
c->put_hevc_qpel_bi[3][0][1] = ff_hevc_put_hevc_qpel_bi_h8_8_neon;
c->put_hevc_qpel_bi[4][0][1] =
c->put_hevc_qpel_bi[6][0][1] = ff_hevc_put_hevc_qpel_bi_h12_8_neon;
c->put_hevc_qpel_bi[5][0][1] = ff_hevc_put_hevc_qpel_bi_h16_8_neon;
c->put_hevc_qpel_bi[7][0][1] =
c->put_hevc_qpel_bi[8][0][1] =
c->put_hevc_qpel_bi[9][0][1] = ff_hevc_put_hevc_qpel_bi_h32_8_neon;
NEON8_FNASSIGN(c->put_hevc_epel, 0, 0, pel_pixels,);
NEON8_FNASSIGN(c->put_hevc_epel, 1, 0, epel_v,);
/* ============ qpel ============ */
NEON8_FNASSIGN(c->put_hevc_qpel, 0, 0, pel_pixels,);
NEON8_FNASSIGN_QPEL_H(c->put_hevc_qpel, qpel);
NEON8_FNASSIGN(c->put_hevc_qpel, 1, 0, qpel_v,);
NEON8_FNASSIGN(c->put_hevc_qpel, 1, 1, qpel_hv,);
/* qpel_uni: pixels, h, v, hv */
NEON8_FNASSIGN(c->put_hevc_qpel_uni, 0, 0, pel_uni_pixels,);
NEON8_FNASSIGN_QPEL_H(c->put_hevc_qpel_uni, qpel_uni);
NEON8_FNASSIGN(c->put_hevc_qpel_uni, 1, 0, qpel_uni_v,);
NEON8_FNASSIGN(c->put_hevc_qpel_uni, 1, 1, qpel_uni_hv,);
/* qpel_bi: pixels, h, v, hv */
NEON8_FNASSIGN(c->put_hevc_qpel_bi, 0, 0, pel_bi_pixels,);
NEON8_FNASSIGN_QPEL_H(c->put_hevc_qpel_bi, qpel_bi);
NEON8_FNASSIGN(c->put_hevc_qpel_bi, 1, 0, qpel_bi_v,);
NEON8_FNASSIGN(c->put_hevc_qpel_bi, 1, 1, qpel_bi_hv,);
/* qpel_uni_w: pixels, h, v, hv */
NEON8_FNASSIGN(c->put_hevc_qpel_uni_w, 0, 0, pel_uni_w_pixels,);
NEON8_FNASSIGN_SHARED_32(c->put_hevc_qpel_uni_w, 0, 1, qpel_uni_w_h,);
NEON8_FNASSIGN(c->put_hevc_qpel_uni_w, 1, 0, qpel_uni_w_v,);
NEON8_FNASSIGN(c->put_hevc_qpel_uni_w, 1, 1, qpel_uni_w_hv,);
/* qpel_bi_w: pixels only */
NEON8_FNASSIGN_PARTIAL_6(c->put_hevc_qpel_bi_w, 0, 0, pel_bi_w_pixels,);
/* ============ epel ============ */
NEON8_FNASSIGN(c->put_hevc_epel, 0, 0, pel_pixels,);
NEON8_FNASSIGN_SHARED_32(c->put_hevc_epel, 0, 1, epel_h,);
NEON8_FNASSIGN(c->put_hevc_epel, 1, 0, epel_v,);
NEON8_FNASSIGN(c->put_hevc_epel, 1, 1, epel_hv,);
/* epel_uni: pixels, h, v, hv */
NEON8_FNASSIGN(c->put_hevc_epel_uni, 0, 0, pel_uni_pixels,);
NEON8_FNASSIGN(c->put_hevc_epel_uni, 0, 1, epel_uni_h,);
NEON8_FNASSIGN(c->put_hevc_epel_uni, 1, 0, epel_uni_v,);
NEON8_FNASSIGN(c->put_hevc_epel_uni, 1, 1, epel_uni_hv,);
/* epel_bi: pixels, h, v, hv */
NEON8_FNASSIGN(c->put_hevc_epel_bi, 0, 0, pel_bi_pixels,);
NEON8_FNASSIGN(c->put_hevc_epel_bi, 0, 1, epel_bi_h,);
NEON8_FNASSIGN(c->put_hevc_epel_bi, 1, 0, epel_bi_v,);
NEON8_FNASSIGN(c->put_hevc_qpel_bi, 0, 0, pel_bi_pixels,);
NEON8_FNASSIGN(c->put_hevc_qpel_bi, 1, 0, qpel_bi_v,);
NEON8_FNASSIGN_PARTIAL_6(c->put_hevc_qpel_bi_w, 0, 0, pel_bi_w_pixels,);
NEON8_FNASSIGN_PARTIAL_6(c->put_hevc_epel_bi_w, 0, 0, pel_bi_w_pixels,);
NEON8_FNASSIGN(c->put_hevc_epel_uni, 0, 0, pel_uni_pixels,);
NEON8_FNASSIGN(c->put_hevc_epel_uni, 1, 0, epel_uni_v,);
NEON8_FNASSIGN(c->put_hevc_qpel_uni, 0, 0, pel_uni_pixels,);
NEON8_FNASSIGN(c->put_hevc_qpel_uni, 1, 0, qpel_uni_v,);
NEON8_FNASSIGN(c->put_hevc_epel_uni_w, 0, 0, pel_uni_w_pixels,);
NEON8_FNASSIGN(c->put_hevc_qpel_uni_w, 0, 0, pel_uni_w_pixels,);
NEON8_FNASSIGN(c->put_hevc_epel_uni_w, 1, 0, epel_uni_w_v,);
NEON8_FNASSIGN_PARTIAL_4(c->put_hevc_qpel_uni_w, 1, 0, qpel_uni_w_v,);
NEON8_FNASSIGN_SHARED_32(c->put_hevc_epel, 0, 1, epel_h,);
NEON8_FNASSIGN_SHARED_32(c->put_hevc_epel_uni_w, 0, 1, epel_uni_w_h,);
NEON8_FNASSIGN(c->put_hevc_epel, 1, 1, epel_hv,);
NEON8_FNASSIGN(c->put_hevc_epel_uni, 1, 1, epel_uni_hv,);
NEON8_FNASSIGN(c->put_hevc_epel_uni_w, 1, 1, epel_uni_w_hv,);
NEON8_FNASSIGN(c->put_hevc_epel_bi, 1, 1, epel_bi_hv,);
NEON8_FNASSIGN_SHARED_32(c->put_hevc_qpel_uni_w, 0, 1, qpel_uni_w_h,);
/* epel_uni_w: pixels, h, v, hv */
NEON8_FNASSIGN(c->put_hevc_epel_uni_w, 0, 0, pel_uni_w_pixels,);
NEON8_FNASSIGN_SHARED_32(c->put_hevc_epel_uni_w, 0, 1, epel_uni_w_h,);
NEON8_FNASSIGN(c->put_hevc_epel_uni_w, 1, 0, epel_uni_w_v,);
NEON8_FNASSIGN(c->put_hevc_epel_uni_w, 1, 1, epel_uni_w_hv,);
NEON8_FNASSIGN(c->put_hevc_qpel, 1, 1, qpel_hv,);
NEON8_FNASSIGN(c->put_hevc_qpel_uni, 1, 1, qpel_uni_hv,);
NEON8_FNASSIGN_PARTIAL_5(c->put_hevc_qpel_uni_w, 1, 1, qpel_uni_w_hv,);
NEON8_FNASSIGN(c->put_hevc_qpel_bi, 1, 1, qpel_bi_hv,);
/* epel_bi_w: pixels only */
NEON8_FNASSIGN_PARTIAL_6(c->put_hevc_epel_bi_w, 0, 0, pel_bi_w_pixels,);
if (have_i8mm(cpu_flags)) {
NEON8_FNASSIGN(c->put_hevc_epel, 0, 1, epel_h, _i8mm);
NEON8_FNASSIGN(c->put_hevc_epel, 1, 1, epel_hv, _i8mm);
NEON8_FNASSIGN(c->put_hevc_epel_uni, 1, 1, epel_uni_hv, _i8mm);
NEON8_FNASSIGN(c->put_hevc_epel_uni_w, 0, 1, epel_uni_w_h ,_i8mm);
NEON8_FNASSIGN(c->put_hevc_epel_uni_w, 1, 1, epel_uni_w_hv, _i8mm);
NEON8_FNASSIGN(c->put_hevc_epel_bi, 1, 1, epel_bi_hv, _i8mm);
/* i8mm overrides: qpel */
NEON8_FNASSIGN(c->put_hevc_qpel, 0, 1, qpel_h, _i8mm);
NEON8_FNASSIGN(c->put_hevc_qpel, 1, 1, qpel_hv, _i8mm);
NEON8_FNASSIGN(c->put_hevc_qpel_uni, 1, 1, qpel_uni_hv, _i8mm);
NEON8_FNASSIGN(c->put_hevc_qpel_uni_w, 0, 1, qpel_uni_w_h, _i8mm);
NEON8_FNASSIGN_PARTIAL_5(c->put_hevc_qpel_uni_w, 1, 1, qpel_uni_w_hv, _i8mm);
NEON8_FNASSIGN(c->put_hevc_qpel_uni_w, 1, 1, qpel_uni_w_hv, _i8mm);
NEON8_FNASSIGN(c->put_hevc_qpel_bi, 1, 1, qpel_bi_hv, _i8mm);
/* i8mm overrides: epel */
NEON8_FNASSIGN(c->put_hevc_epel, 0, 1, epel_h, _i8mm);
NEON8_FNASSIGN(c->put_hevc_epel, 1, 1, epel_hv, _i8mm);
NEON8_FNASSIGN(c->put_hevc_epel_uni, 1, 1, epel_uni_hv, _i8mm);
NEON8_FNASSIGN(c->put_hevc_epel_uni_w, 0, 1, epel_uni_w_h, _i8mm);
NEON8_FNASSIGN(c->put_hevc_epel_uni_w, 1, 1, epel_uni_w_hv, _i8mm);
NEON8_FNASSIGN(c->put_hevc_epel_bi, 1, 1, epel_bi_hv, _i8mm);
}
}
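The index-to-width comment on `NEON8_FNASSIGN_QPEL_H` above can be read as a small lookup. The sketch below restates it in C for reference; `block_width_for_idx` and `qpel_h_kernel_width` are hypothetical names, not FFmpeg symbols, and index 0 is left at 0 because the comment in the diff does not list it.

```c
/* Block width for each table index, copied from the "Index-to-width"
 * comment in the diff. Index 0 is not covered there. */
static const int block_width_for_idx[10] = {
    0, 4, 6, 8, 12, 16, 24, 32, 48, 64
};

/* Width of the non-i8mm qpel horizontal kernel assigned to a table
 * index by NEON8_FNASSIGN_QPEL_H: indices 4 and 6 share h12, and
 * indices 7, 8 and 9 share h32. Returns 0 for unassigned indices. */
static int qpel_h_kernel_width(int idx)
{
    switch (idx) {
    case 1:                 return 4;
    case 2:                 return 6;
    case 3:                 return 8;
    case 4: case 6:         return 12;
    case 5:                 return 16;
    case 7: case 8: case 9: return 32;
    default:                return 0;
    }
}
```

For example, index 6 is a 24-wide block served by the h12 kernel, and index 8 is a 48-wide block served by the h32 kernel.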
+111
@@ -0,0 +1,111 @@
/*
* HEVC Intra Prediction NEON initialization
*
* Copyright (c) 2026 Jun Zhao <barryjzhao@tencent.com>
*
* This file is part of FFmpeg.
*
* FFmpeg is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* FFmpeg is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with FFmpeg; if not, write to the Free Software
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
*/
#include "libavutil/attributes.h"
#include "libavutil/avassert.h"
#include "libavutil/aarch64/cpu.h"
#include "libavcodec/hevc/pred.h"
// DC prediction
void ff_hevc_pred_dc_4x4_8_neon(uint8_t *src, const uint8_t *top,
const uint8_t *left, ptrdiff_t stride,
int c_idx);
void ff_hevc_pred_dc_8x8_8_neon(uint8_t *src, const uint8_t *top,
const uint8_t *left, ptrdiff_t stride,
int c_idx);
void ff_hevc_pred_dc_16x16_8_neon(uint8_t *src, const uint8_t *top,
const uint8_t *left, ptrdiff_t stride,
int c_idx);
void ff_hevc_pred_dc_32x32_8_neon(uint8_t *src, const uint8_t *top,
const uint8_t *left, ptrdiff_t stride,
int c_idx);
// Planar prediction
void ff_hevc_pred_planar_4x4_8_neon(uint8_t *src, const uint8_t *top,
const uint8_t *left, ptrdiff_t stride);
void ff_hevc_pred_planar_8x8_8_neon(uint8_t *src, const uint8_t *top,
const uint8_t *left, ptrdiff_t stride);
void ff_hevc_pred_planar_16x16_8_neon(uint8_t *src, const uint8_t *top,
const uint8_t *left, ptrdiff_t stride);
void ff_hevc_pred_planar_32x32_8_neon(uint8_t *src, const uint8_t *top,
const uint8_t *left, ptrdiff_t stride);
// 3-tap reference sample filter
void ff_hevc_ref_filter_3tap_8x8_8_neon(uint8_t *filtered_left,
uint8_t *filtered_top,
const uint8_t *left,
const uint8_t *top, int size);
void ff_hevc_ref_filter_3tap_16x16_8_neon(uint8_t *filtered_left,
uint8_t *filtered_top,
const uint8_t *left,
const uint8_t *top, int size);
void ff_hevc_ref_filter_3tap_32x32_8_neon(uint8_t *filtered_left,
uint8_t *filtered_top,
const uint8_t *left,
const uint8_t *top, int size);
// Strong intra smoothing
void ff_hevc_ref_filter_strong_8_neon(uint8_t *filtered_top, uint8_t *left,
const uint8_t *top);
static void pred_dc_neon(uint8_t *src, const uint8_t *top,
const uint8_t *left, ptrdiff_t stride,
int log2_size, int c_idx)
{
switch (log2_size) {
case 2:
ff_hevc_pred_dc_4x4_8_neon(src, top, left, stride, c_idx);
break;
case 3:
ff_hevc_pred_dc_8x8_8_neon(src, top, left, stride, c_idx);
break;
case 4:
ff_hevc_pred_dc_16x16_8_neon(src, top, left, stride, c_idx);
break;
case 5:
ff_hevc_pred_dc_32x32_8_neon(src, top, left, stride, c_idx);
break;
default:
av_unreachable("log2_size must be 2, 3, 4 or 5");
}
}
av_cold void ff_hevc_pred_init_aarch64(HEVCPredContext *hpc, int bit_depth)
{
int cpu_flags = av_get_cpu_flags();
if (!have_neon(cpu_flags))
return;
if (bit_depth == 8) {
hpc->pred_dc = pred_dc_neon;
hpc->pred_planar[0] = ff_hevc_pred_planar_4x4_8_neon;
hpc->pred_planar[1] = ff_hevc_pred_planar_8x8_8_neon;
hpc->pred_planar[2] = ff_hevc_pred_planar_16x16_8_neon;
hpc->pred_planar[3] = ff_hevc_pred_planar_32x32_8_neon;
hpc->ref_filter_3tap[0] = ff_hevc_ref_filter_3tap_8x8_8_neon;
hpc->ref_filter_3tap[1] = ff_hevc_ref_filter_3tap_16x16_8_neon;
hpc->ref_filter_3tap[2] = ff_hevc_ref_filter_3tap_32x32_8_neon;
hpc->ref_filter_strong = ff_hevc_ref_filter_strong_8_neon;
}
}
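The init function above fills `pred_planar[0..3]` with the 4x4 through 32x32 kernels and `ref_filter_3tap[0..2]` with the 8x8 through 32x32 kernels, while `pred_dc_neon` switches directly on `log2_size`. A minimal sketch of the slot arithmetic this ordering implies follows. The helper names are hypothetical, and the `log2_size - 3` mapping for `ref_filter_3tap` is an assumption inferred from the assignment order in this diff, not something the excerpt states.

```c
/* Slot in pred_planar[] for a block of side (1 << log2_size),
 * or -1 if the size has no entry. Sizes 4..32 map to slots 0..3. */
static int planar_slot(int log2_size)
{
    return (log2_size >= 2 && log2_size <= 5) ? log2_size - 2 : -1;
}

/* Assumed slot in ref_filter_3tap[] for a block of side
 * (1 << log2_size). Sizes 8..32 map to slots 0..2; 4x4 has no entry. */
static int ref_filter_3tap_slot(int log2_size)
{
    return (log2_size >= 3 && log2_size <= 5) ? log2_size - 3 : -1;
}
```

Under these assumptions a 16x16 block (`log2_size == 4`) uses `pred_planar[2]` and `ref_filter_3tap[1]`.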
File diff suppressed because it is too large
+4
@@ -1169,9 +1169,11 @@ function nsse16_neon, export=1
str x0, [sp, #-0x40]!
stp x1, x2, [sp, #0x10]
stp x3, x4, [sp, #0x20]
AARCH64_SIGN_LINK_REGISTER
str x30, [sp, #0x30]
bl X(sse16_neon)
ldr x30, [sp, #0x30]
AARCH64_VALIDATE_LINK_REGISTER
mov w9, w0 // here we store score1
ldp x1, x2, [sp, #0x10]
ldp x3, x4, [sp, #0x20]
@@ -1290,9 +1292,11 @@ function nsse8_neon, export=1
str x0, [sp, #-0x40]!
stp x1, x2, [sp, #0x10]
stp x3, x4, [sp, #0x20]
AARCH64_SIGN_LINK_REGISTER
str x30, [sp, #0x30]
bl X(sse8_neon)
ldr x30, [sp, #0x30]
AARCH64_VALIDATE_LINK_REGISTER
mov w9, w0 // here we store score1
ldp x1, x2, [sp, #0x10]
ldp x3, x4, [sp, #0x20]
+83
@@ -43,6 +43,36 @@ void ff_vvc_put_luma_h16_12_neon(int16_t *dst, const uint8_t *_src, const ptrdif
void ff_vvc_put_luma_h_x16_12_neon(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
const int height, const int8_t *hf, const int8_t *vf, const int width);
void ff_vvc_put_chroma_h8_10_neon(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
const int height, const int8_t *hf, const int8_t *vf, const int width);
void ff_vvc_put_chroma_h16_10_neon(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
const int height, const int8_t *hf, const int8_t *vf, const int width);
void ff_vvc_put_chroma_h_x16_10_neon(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
const int height, const int8_t *hf, const int8_t *vf, const int width);
void ff_vvc_put_chroma_h8_12_neon(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
const int height, const int8_t *hf, const int8_t *vf, const int width);
void ff_vvc_put_chroma_h16_12_neon(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
const int height, const int8_t *hf, const int8_t *vf, const int width);
void ff_vvc_put_chroma_h_x16_12_neon(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
const int height, const int8_t *hf, const int8_t *vf, const int width);
void ff_vvc_put_chroma_v4_10_neon(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
const int height, const int8_t *hf, const int8_t *vf, const int width);
void ff_vvc_put_chroma_v8_10_neon(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
const int height, const int8_t *hf, const int8_t *vf, const int width);
void ff_vvc_put_chroma_v16_10_neon(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
const int height, const int8_t *hf, const int8_t *vf, const int width);
void ff_vvc_put_chroma_v_x16_10_neon(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
const int height, const int8_t *hf, const int8_t *vf, const int width);
void ff_vvc_put_chroma_v4_12_neon(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
const int height, const int8_t *hf, const int8_t *vf, const int width);
void ff_vvc_put_chroma_v8_12_neon(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
const int height, const int8_t *hf, const int8_t *vf, const int width);
void ff_vvc_put_chroma_v16_12_neon(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
const int height, const int8_t *hf, const int8_t *vf, const int width);
void ff_vvc_put_chroma_v_x16_12_neon(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
const int height, const int8_t *hf, const int8_t *vf, const int width);
void ff_vvc_put_luma_v4_10_neon(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
const int height, const int8_t *hf, const int8_t *vf, const int width);
void ff_vvc_put_luma_v8_10_neon(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
@@ -73,6 +103,19 @@ void ff_vvc_put_luma_hv16_12_neon(int16_t *dst, const uint8_t *_src, const ptrdi
void ff_vvc_put_luma_hv_x16_12_neon(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
const int height, const int8_t *hf, const int8_t *vf, const int width);
void ff_vvc_put_chroma_hv8_10_neon(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
const int height, const int8_t *hf, const int8_t *vf, const int width);
void ff_vvc_put_chroma_hv16_10_neon(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
const int height, const int8_t *hf, const int8_t *vf, const int width);
void ff_vvc_put_chroma_hv_x16_10_neon(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
const int height, const int8_t *hf, const int8_t *vf, const int width);
void ff_vvc_put_chroma_hv8_12_neon(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
const int height, const int8_t *hf, const int8_t *vf, const int width);
void ff_vvc_put_chroma_hv16_12_neon(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
const int height, const int8_t *hf, const int8_t *vf, const int width);
void ff_vvc_put_chroma_hv_x16_12_neon(int16_t *dst, const uint8_t *_src, const ptrdiff_t _src_stride,
const int height, const int8_t *hf, const int8_t *vf, const int width);
void ff_alf_classify_sum_neon(int *sum0, int *sum1, int16_t *grad, uint32_t gshift, uint32_t steps);
#define BIT_DEPTH 8
@@ -290,12 +333,26 @@ void ff_vvc_dsp_init_aarch64(VVCDSPContext *const c, const int bd)
c->inter.dmvr[0][1] = ff_vvc_dmvr_h_10_neon;
c->inter.dmvr[1][1] = ff_vvc_dmvr_hv_10_neon;
c->inter.apply_bdof = ff_vvc_apply_bdof_10_neon;
c->inter.put[1][2][0][1] = ff_vvc_put_chroma_h8_10_neon;
c->inter.put[1][3][0][1] = ff_vvc_put_chroma_h16_10_neon;
c->inter.put[1][4][0][1] =
c->inter.put[1][5][0][1] =
c->inter.put[1][6][0][1] = ff_vvc_put_chroma_h_x16_10_neon;
c->inter.put[0][2][0][1] = ff_vvc_put_luma_h8_10_neon;
c->inter.put[0][3][0][1] = ff_vvc_put_luma_h16_10_neon;
c->inter.put[0][4][0][1] =
c->inter.put[0][5][0][1] =
c->inter.put[0][6][0][1] = ff_vvc_put_luma_h_x16_10_neon;
c->inter.put[1][1][1][0] = ff_vvc_put_chroma_v4_10_neon;
c->inter.put[1][2][1][0] = ff_vvc_put_chroma_v8_10_neon;
c->inter.put[1][3][1][0] = ff_vvc_put_chroma_v16_10_neon;
c->inter.put[1][4][1][0] =
c->inter.put[1][5][1][0] =
c->inter.put[1][6][1][0] = ff_vvc_put_chroma_v_x16_10_neon;
c->inter.put[0][1][1][0] = ff_vvc_put_luma_v4_10_neon;
c->inter.put[0][2][1][0] = ff_vvc_put_luma_v8_10_neon;
c->inter.put[0][3][1][0] = ff_vvc_put_luma_v16_10_neon;
@@ -309,6 +366,12 @@ void ff_vvc_dsp_init_aarch64(VVCDSPContext *const c, const int bd)
c->inter.put[0][5][1][1] =
c->inter.put[0][6][1][1] = ff_vvc_put_luma_hv_x16_10_neon;
c->inter.put[1][2][1][1] = ff_vvc_put_chroma_hv8_10_neon;
c->inter.put[1][3][1][1] = ff_vvc_put_chroma_hv16_10_neon;
c->inter.put[1][4][1][1] =
c->inter.put[1][5][1][1] =
c->inter.put[1][6][1][1] = ff_vvc_put_chroma_hv_x16_10_neon;
c->alf.filter[LUMA] = alf_filter_luma_10_neon;
c->alf.filter[CHROMA] = alf_filter_chroma_10_neon;
c->alf.classify = alf_classify_10_neon;
@@ -322,6 +385,13 @@ void ff_vvc_dsp_init_aarch64(VVCDSPContext *const c, const int bd)
c->inter.dmvr[0][1] = ff_vvc_dmvr_h_12_neon;
c->inter.dmvr[1][1] = ff_vvc_dmvr_hv_12_neon;
c->inter.apply_bdof = ff_vvc_apply_bdof_12_neon;
c->inter.put[1][2][0][1] = ff_vvc_put_chroma_h8_12_neon;
c->inter.put[1][3][0][1] = ff_vvc_put_chroma_h16_12_neon;
c->inter.put[1][4][0][1] =
c->inter.put[1][5][0][1] =
c->inter.put[1][6][0][1] = ff_vvc_put_chroma_h_x16_12_neon;
c->inter.put[0][2][0][1] = ff_vvc_put_luma_h8_12_neon;
c->inter.put[0][3][0][1] = ff_vvc_put_luma_h16_12_neon;
c->inter.put[0][4][0][1] =
@@ -341,6 +411,19 @@ void ff_vvc_dsp_init_aarch64(VVCDSPContext *const c, const int bd)
c->inter.put[0][5][1][0] =
c->inter.put[0][6][1][0] = ff_vvc_put_luma_v_x16_12_neon;
c->inter.put[1][1][1][0] = ff_vvc_put_chroma_v4_12_neon;
c->inter.put[1][2][1][0] = ff_vvc_put_chroma_v8_12_neon;
c->inter.put[1][3][1][0] = ff_vvc_put_chroma_v16_12_neon;
c->inter.put[1][4][1][0] =
c->inter.put[1][5][1][0] =
c->inter.put[1][6][1][0] = ff_vvc_put_chroma_v_x16_12_neon;
c->inter.put[1][2][1][1] = ff_vvc_put_chroma_hv8_12_neon;
c->inter.put[1][3][1][1] = ff_vvc_put_chroma_hv16_12_neon;
c->inter.put[1][4][1][1] =
c->inter.put[1][5][1][1] =
c->inter.put[1][6][1][1] = ff_vvc_put_chroma_hv_x16_12_neon;
c->alf.filter[LUMA] = alf_filter_luma_12_neon;
c->alf.filter[CHROMA] = alf_filter_chroma_12_neon;
c->alf.classify = alf_classify_12_neon;
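Reviewer sketch (not part of the patch): the chained assignments above let the three largest width slots share one generic kernel that takes the width as an argument. The C below illustrates that table layout under the assumption that the width index is log2(width) - 1, which matches the slots used in this hunk (2 for width 8, 3 for width 16, 4..6 for the x16 variant). All names are hypothetical and the "kernels" only report how many columns they would cover.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*put_fn)(int width);

/* Fixed-width stand-ins: ignore the width argument. */
static int put_w8(int width)  { (void)width; return 8;  }
static int put_w16(int width) { (void)width; return 16; }

/* Generic stand-in: walks the row in 16-column steps. */
static int put_wx16(int width)
{
    int done = 0;
    while (done < width)
        done += 16;
    return done;
}

static put_fn put_table[7];   /* zero-initialised: unset slots stay NULL */

static void init_put_table(void)
{
    put_table[2] = put_w8;
    put_table[3] = put_w16;
    put_table[4] =
    put_table[5] =
    put_table[6] = put_wx16;  /* widths 32, 64 and 128 share one kernel */
}

/* Assumed index mapping: log2(width) - 1. */
static int width_index(int width)
{
    int idx = -1;
    assert(width > 0);
    while (width > 1) {
        width >>= 1;
        idx++;
    }
    return idx;
}

/* Returns the number of columns covered, or 0 if no kernel is installed. */
static int run_put(int width)
{
    int idx = width_index(width);
    init_put_table();
    if (idx < 0 || idx >= 7 || !put_table[idx])
        return 0;
    return put_table[idx](width);
}
```

A slot left NULL (index 1 here) corresponds to a size for which the C fallback would remain in place.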
+552
@@ -1611,6 +1611,7 @@ endfunc
function ff_vvc_apply_bdof_8_neon, export=1
mov w6, #8
0:
AARCH64_SIGN_LINK_REGISTER
stp x19, x20, [sp, #-0x40]!
stp x21, x22, [sp, #0x10]
stp x23, x24, [sp, #0x20]
@@ -1703,6 +1704,7 @@ function ff_vvc_apply_bdof_8_neon, export=1
ldp x23, x24, [sp, #0x20]
ldp x21, x22, [sp, #0x10]
ldp x19, x20, [sp], #0x40
AARCH64_VALIDATE_LINK_REGISTER
ret
endfunc
@@ -1833,6 +1835,137 @@ function ff_vvc_put_luma_h_x16_12_neon, export=1
put_luma_h_x16_xx_neon 4
endfunc
.macro put_chroma_h_x8_horizontal_filter shift
// 4 bytes from hf loaded to v0.4h
// 24 bytes from _src loaded to v20.8h & v21.4h where v21.4h is loaded for shift to v1.8h,v2.8h,v3.8h
// v24.4h & v25.4h are output vectors to store
ext v1.16b, v20.16b, v21.16b, #2
ext v2.16b, v20.16b, v21.16b, #4
ext v3.16b, v20.16b, v21.16b, #6
smull v24.4s, v20.4h, v0.h[0]
smull2 v25.4s, v20.8h, v0.h[0]
smlal v24.4s, v1.4h, v0.h[1]
smlal2 v25.4s, v1.8h, v0.h[1]
smlal v24.4s, v2.4h, v0.h[2]
smlal2 v25.4s, v2.8h, v0.h[2]
smlal v24.4s, v3.4h, v0.h[3]
smlal2 v25.4s, v3.8h, v0.h[3]
sqshrn v24.4h, v24.4s, #(\shift)
sqshrn v25.4h, v25.4s, #(\shift)
.endm
.macro put_chroma_h8_xx_neon shift
// dst .req x0
// _src .req x1
// _src_stride .req x2
// height .req x3
// hf .req x4
// vf .req x5
// width .req x6
mov x9, #(VVC_MAX_PB_SIZE * 2)
ldr s0, [x4]
sub x1, x1, #2
sub x2, x2, #16
sxtl v0.8h, v0.8b
1:
ld1 {v20.8h}, [x1], #16
ld1 {v21.4h}, [x1], x2
put_chroma_h_x8_horizontal_filter \shift
subs w3, w3, #1
st1 {v24.4h, v25.4h}, [x0], x9
b.gt 1b
ret
.endm
.macro put_chroma_h16_xx_neon shift
// dst .req x0
// _src .req x1
// _src_stride .req x2
// height .req x3
// hf .req x4
// vf .req x5
// width .req x6
mov x9, #(VVC_MAX_PB_SIZE * 2)
ldr s0, [x4]
sub x9, x9, #16
sub x1, x1, #2
sub x2, x2, #32
sxtl v0.8h, v0.8b
1:
ld1 {v20.8h, v21.8h}, [x1], #32
ld1 {v22.4h}, [x1], x2
put_chroma_h_x8_horizontal_filter \shift
mov v20.16b, v21.16b
mov v21.16b, v22.16b
st1 {v24.4h, v25.4h}, [x0], #16
put_chroma_h_x8_horizontal_filter \shift
subs w3, w3, #1
st1 {v24.4h, v25.4h}, [x0], x9
b.gt 1b
ret
.endm
.macro put_chroma_h_x16_xx_neon shift
// dst .req x0
// _src .req x1
// _src_stride .req x2
// height .req x3
// hf .req x4
// vf .req x5
// width .req x6
mov x9, #(VVC_MAX_PB_SIZE * 2)
ldr s0, [x4]
sub x9, x9, w6, uxtw #1
sub x2, x2, w6, uxtw #1
sxtl v0.8h, v0.8b
sub x1, x1, #2
sub x2, x2, #16
1:
ld1 {v20.8h}, [x1], #16
mov w8, w6
2:
ld1 {v21.8h, v22.8h}, [x1], #32
put_chroma_h_x8_horizontal_filter \shift
mov v20.16b, v21.16b
mov v21.16b, v22.16b
st1 {v24.4h, v25.4h}, [x0], #16
put_chroma_h_x8_horizontal_filter \shift
mov v20.16b, v21.16b
subs w8, w8, #16
st1 {v24.4h, v25.4h}, [x0], #16
b.gt 2b
subs w3, w3, #1
add x0, x0, x9
add x1, x1, x2
b.gt 1b
ret
.endm
function ff_vvc_put_chroma_h8_10_neon, export=1
put_chroma_h8_xx_neon 2
endfunc
function ff_vvc_put_chroma_h8_12_neon, export=1
put_chroma_h8_xx_neon 4
endfunc
function ff_vvc_put_chroma_h16_10_neon, export=1
put_chroma_h16_xx_neon 2
endfunc
function ff_vvc_put_chroma_h16_12_neon, export=1
put_chroma_h16_xx_neon 4
endfunc
function ff_vvc_put_chroma_h_x16_10_neon, export=1
put_chroma_h_x16_xx_neon 2
endfunc
function ff_vvc_put_chroma_h_x16_12_neon, export=1
put_chroma_h_x16_xx_neon 4
endfunc
.macro put_luma_v4_xx_neon shift
mov x9, #(VVC_MAX_PB_SIZE * 2)
sub x1, x1, x2, lsl #1
@@ -2225,6 +2358,229 @@ function ff_vvc_put_luma_v_x16_12_neon, export=1
put_luma_v_x16_xx_neon 4
endfunc
.macro put_chroma_v4_xx_neon shift
// dst .req x0
// _src .req x1
// _src_stride .req x2
// height .req x3
// hf .req x4
// vf .req x5
// width .req x6
mov x9, #(VVC_MAX_PB_SIZE * 2)
ldr s0, [x5]
sub x1, x1, x2
sxtl v0.8h, v0.8b
ld1 {v20.4h}, [x1], x2
ld1 {v21.4h}, [x1], x2
ld1 {v22.4h}, [x1], x2
1:
ld1 {v23.4h}, [x1], x2
smull v1.4s, v20.4h, v0.h[0]
smull v2.4s, v21.4h, v0.h[1]
smlal v1.4s, v22.4h, v0.h[2]
smlal v2.4s, v23.4h, v0.h[3]
ld1 {v24.4h}, [x1], x2
smull v3.4s, v21.4h, v0.h[0]
smull v4.4s, v22.4h, v0.h[1]
smlal v3.4s, v23.4h, v0.h[2]
smlal v4.4s, v24.4h, v0.h[3]
add v1.4s, v1.4s, v2.4s
add v3.4s, v3.4s, v4.4s
sqshrn v1.4h, v1.4s, #(\shift)
sqshrn v3.4h, v3.4s, #(\shift)
st1 {v1.4h}, [x0], x9
mov v20.16b, v22.16b
mov v21.16b, v23.16b
mov v22.16b, v24.16b
subs w3, w3, #2
st1 {v3.4h}, [x0], x9
b.gt 1b
ret
.endm
function ff_vvc_put_chroma_v4_10_neon, export=1
put_chroma_v4_xx_neon 2
endfunc
function ff_vvc_put_chroma_v4_12_neon, export=1
put_chroma_v4_xx_neon 4
endfunc
.macro put_chroma_v8_xx_neon shift
// dst .req x0
// _src .req x1
// _src_stride .req x2
// height .req x3
// hf .req x4
// vf .req x5
// width .req x6
mov x9, #(VVC_MAX_PB_SIZE * 2)
ldr s0, [x5]
sub x1, x1, x2
sxtl v0.8h, v0.8b
ld1 {v20.8h}, [x1], x2
ld1 {v21.8h}, [x1], x2
ld1 {v22.8h}, [x1], x2
1:
ld1 {v23.8h}, [x1], x2
smull v1.4s, v20.4h, v0.h[0]
smull2 v2.4s, v20.8h, v0.h[0]
smlal v1.4s, v21.4h, v0.h[1]
smlal2 v2.4s, v21.8h, v0.h[1]
smlal v1.4s, v22.4h, v0.h[2]
smlal2 v2.4s, v22.8h, v0.h[2]
smlal v1.4s, v23.4h, v0.h[3]
smlal2 v2.4s, v23.8h, v0.h[3]
sqshrn v1.4h, v1.4s, #(\shift)
sqshrn v2.4h, v2.4s, #(\shift)
ld1 {v24.8h}, [x1], x2
st1 {v1.4h-v2.4h}, [x0], x9
smull v3.4s, v21.4h, v0.h[0]
smull2 v4.4s, v21.8h, v0.h[0]
smlal v3.4s, v22.4h, v0.h[1]
smlal2 v4.4s, v22.8h, v0.h[1]
smlal v3.4s, v23.4h, v0.h[2]
smlal2 v4.4s, v23.8h, v0.h[2]
smlal v3.4s, v24.4h, v0.h[3]
smlal2 v4.4s, v24.8h, v0.h[3]
sqshrn v3.4h, v3.4s, #(\shift)
sqshrn v4.4h, v4.4s, #(\shift)
mov v20.16b, v22.16b
mov v21.16b, v23.16b
mov v22.16b, v24.16b
subs w3, w3, #2
st1 {v3.4h-v4.4h}, [x0], x9
b.gt 1b
ret
.endm
function ff_vvc_put_chroma_v8_10_neon, export=1
put_chroma_v8_xx_neon 2
endfunc
function ff_vvc_put_chroma_v8_12_neon, export=1
put_chroma_v8_xx_neon 4
endfunc
.macro put_chroma_v_x16_horizontal_filter shift, src0, src1, src2, src3, src4, src5, src6, src7
smull v2.4s, \src0\().4h, v0.h[0]
smull2 v3.4s, \src0\().8h, v0.h[0]
smlal v2.4s, \src2\().4h, v0.h[1]
smlal2 v3.4s, \src2\().8h, v0.h[1]
smlal v2.4s, \src4\().4h, v0.h[2]
smlal2 v3.4s, \src4\().8h, v0.h[2]
smlal v2.4s, \src6\().4h, v0.h[3]
smlal2 v3.4s, \src6\().8h, v0.h[3]
smull v4.4s, \src1\().4h, v0.h[0]
smull2 v5.4s, \src1\().8h, v0.h[0]
smlal v4.4s, \src3\().4h, v0.h[1]
smlal2 v5.4s, \src3\().8h, v0.h[1]
smlal v4.4s, \src5\().4h, v0.h[2]
smlal2 v5.4s, \src5\().8h, v0.h[2]
smlal v4.4s, \src7\().4h, v0.h[3]
smlal2 v5.4s, \src7\().8h, v0.h[3]
sqshrn v6.4h, v2.4s, #(\shift)
sqshrn v7.4h, v4.4s, #(\shift)
sqshrn2 v6.8h, v3.4s, #(\shift)
sqshrn2 v7.8h, v5.4s, #(\shift)
.endm
.macro put_chroma_v16_xx_neon shift
// dst .req x0
// _src .req x1
// _src_stride .req x2
// height .req x3
// hf .req x4
// vf .req x5
// width .req x6
mov x9, #(VVC_MAX_PB_SIZE * 2)
ldr s0, [x5]
sub x1, x1, x2
sxtl v0.8h, v0.8b
ld1 {v16.8h-v17.8h}, [x1], x2
ld1 {v18.8h-v19.8h}, [x1], x2
ld1 {v20.8h-v21.8h}, [x1], x2
1:
ld1 {v22.8h-v23.8h}, [x1], x2
put_chroma_v_x16_horizontal_filter \shift, v16, v17, v18, v19, v20, v21, v22, v23
ld1 {v24.8h-v25.8h}, [x1], x2
st1 {v6.8h-v7.8h}, [x0], x9
put_chroma_v_x16_horizontal_filter \shift, v18, v19, v20, v21, v22, v23, v24, v25
subs w3, w3, #2
st1 {v6.8h-v7.8h}, [x0], x9
mov v16.16b, v20.16b
mov v17.16b, v21.16b
mov v18.16b, v22.16b
mov v19.16b, v23.16b
mov v20.16b, v24.16b
mov v21.16b, v25.16b
b.gt 1b
ret
.endm
function ff_vvc_put_chroma_v16_10_neon, export=1
put_chroma_v16_xx_neon 2
endfunc
function ff_vvc_put_chroma_v16_12_neon, export=1
put_chroma_v16_xx_neon 4
endfunc
.macro put_chroma_v_x16_xx_neon shift
// dst .req x0
// _src .req x1
// _src_stride .req x2
// height .req x3
// hf .req x4
// vf .req x5
// width .req x6
mov x9, #(VVC_MAX_PB_SIZE * 2)
ldr s0, [x5]
sub x1, x1, x2
sxtl v0.8h, v0.8b
1:
mov w8, #0
2:
add x11, x1, x8, lsl #1
add x10, x0, x8, lsl #1
ld1 {v16.8h-v17.8h}, [x11], x2
add x8, x8, #16
ld1 {v18.8h-v19.8h}, [x11], x2
cmp w8, w6
ld1 {v20.8h-v21.8h}, [x11], x2
ld1 {v22.8h-v23.8h}, [x11], x2
ld1 {v24.8h-v25.8h}, [x11], x2
put_chroma_v_x16_horizontal_filter \shift, v16, v17, v18, v19, v20, v21, v22, v23
st1 {v6.8h-v7.8h}, [x10], x9
put_chroma_v_x16_horizontal_filter \shift, v18, v19, v20, v21, v22, v23, v24, v25
st1 {v6.8h-v7.8h}, [x10], x9
b.lt 2b
add x0, x0, x9, lsl #1
subs w3, w3, #2
add x1, x1, x2, lsl #1
b.gt 1b
ret
.endm
function ff_vvc_put_chroma_v_x16_10_neon, export=1
put_chroma_v_x16_xx_neon 2
endfunc
function ff_vvc_put_chroma_v_x16_12_neon, export=1
put_chroma_v_x16_xx_neon 4
endfunc
.macro put_luma_hv_x8_horizontal_filter shift, dst, src0, src1
ext v2.16b, \src0\().16b, \src1\().16b, #2
@@ -2575,3 +2931,199 @@ endfunc
function ff_vvc_put_luma_hv_x16_12_neon, export=1
put_luma_hv_x16_xx_neon 4
endfunc
.macro put_chroma_hv_x8_horizontal_filter shift, dst, src0, src1
ext v2.16b, \src0\().16b, \src1\().16b, #2
ext v3.16b, \src0\().16b, \src1\().16b, #4
ext v4.16b, \src0\().16b, \src1\().16b, #6
smull v6.4s, \src0\().4h, v0.h[0]
smull2 v7.4s, \src0\().8h, v0.h[0]
smlal v6.4s, v2.4h, v0.h[1]
smlal2 v7.4s, v2.8h, v0.h[1]
smlal v6.4s, v3.4h, v0.h[2]
smlal2 v7.4s, v3.8h, v0.h[2]
smlal v6.4s, v4.4h, v0.h[3]
smlal2 v7.4s, v4.8h, v0.h[3]
sqshrn \dst\().4h, v6.4s, #(\shift)
sqshrn2 \dst\().8h, v7.4s, #(\shift)
.endm
.macro put_chroma_hv_x8_vertical_filter dst0, dst1, src0, src1, src2, src3
smull \dst0\().4s, \src0\().4h, v1.h[0]
smull2 \dst1\().4s, \src0\().8h, v1.h[0]
smlal \dst0\().4s, \src1\().4h, v1.h[1]
smlal2 \dst1\().4s, \src1\().8h, v1.h[1]
smlal \dst0\().4s, \src2\().4h, v1.h[2]
smlal2 \dst1\().4s, \src2\().8h, v1.h[2]
smlal \dst0\().4s, \src3\().4h, v1.h[3]
smlal2 \dst1\().4s, \src3\().8h, v1.h[3]
sqshrn \dst0\().4h, \dst0\().4s, #6
sqshrn \dst1\().4h, \dst1\().4s, #6
.endm
.macro put_chroma_hv8_xx_neon shift
// dst .req x0
// _src .req x1
// _src_stride .req x2
// height .req x3
// hf .req x4
// vf .req x5
// width .req x6
mov x9, #(VVC_MAX_PB_SIZE * 2)
sub x1, x1, #2
ldr s0, [x4]
ldr s1, [x5]
sxtl v0.8h, v0.8b
sub x1, x1, x2
sxtl v1.8h, v1.8b
ld1 {v16.8h, v17.8h}, [x1], x2
ld1 {v18.8h, v19.8h}, [x1], x2
ld1 {v20.8h, v21.8h}, [x1], x2
put_chroma_hv_x8_horizontal_filter \shift, v16, v16, v17
put_chroma_hv_x8_horizontal_filter \shift, v18, v18, v19
put_chroma_hv_x8_horizontal_filter \shift, v20, v20, v21
1:
ld1 {v22.8h, v23.8h}, [x1], x2
put_chroma_hv_x8_horizontal_filter \shift, v22, v22, v23
put_chroma_hv_x8_vertical_filter v2, v3, v16, v18, v20, v22
ld1 {v24.8h, v25.8h}, [x1], x2
st1 {v2.4h-v3.4h}, [x0], x9
put_chroma_hv_x8_horizontal_filter \shift, v24, v24, v25
put_chroma_hv_x8_vertical_filter v2, v3, v18, v20, v22, v24
st1 {v2.4h-v3.4h}, [x0], x9
mov v16.16b, v20.16b
mov v18.16b, v22.16b
subs w3, w3, #2
mov v20.16b, v24.16b
b.gt 1b
ret
.endm
function ff_vvc_put_chroma_hv8_10_neon, export=1
put_chroma_hv8_xx_neon 2
endfunc
function ff_vvc_put_chroma_hv8_12_neon, export=1
put_chroma_hv8_xx_neon 4
endfunc
.macro put_chroma_hv16_xx_neon shift
// dst .req x0
// _src .req x1
// _src_stride .req x2
// height .req x3
// hf .req x4
// vf .req x5
// width .req x6
mov x9, #(VVC_MAX_PB_SIZE * 2)
sub x1, x1, #2
ldr s0, [x4]
ldr s1, [x5]
sxtl v0.8h, v0.8b
sub x1, x1, x2
sxtl v1.8h, v1.8b
ld1 {v16.8h-v18.8h}, [x1], x2
ld1 {v19.8h-v21.8h}, [x1], x2
ld1 {v22.8h-v24.8h}, [x1], x2
put_chroma_hv_x8_horizontal_filter \shift, v16, v16, v17
put_chroma_hv_x8_horizontal_filter \shift, v17, v17, v18
put_chroma_hv_x8_horizontal_filter \shift, v19, v19, v20
put_chroma_hv_x8_horizontal_filter \shift, v20, v20, v21
put_chroma_hv_x8_horizontal_filter \shift, v22, v22, v23
put_chroma_hv_x8_horizontal_filter \shift, v23, v23, v24
1:
ld1 {v25.8h-v27.8h}, [x1], x2
put_chroma_hv_x8_horizontal_filter \shift, v25, v25, v26
put_chroma_hv_x8_horizontal_filter \shift, v26, v26, v27
put_chroma_hv_x8_vertical_filter v2, v3, v16, v19, v22, v25
put_chroma_hv_x8_vertical_filter v4, v5, v17, v20, v23, v26
ld1 {v28.8h-v30.8h}, [x1], x2
st1 {v2.4h-v5.4h}, [x0], x9
put_chroma_hv_x8_horizontal_filter \shift, v28, v28, v29
put_chroma_hv_x8_horizontal_filter \shift, v29, v29, v30
put_chroma_hv_x8_vertical_filter v2, v3, v19, v22, v25, v28
put_chroma_hv_x8_vertical_filter v4, v5, v20, v23, v26, v29
st1 {v2.4h-v5.4h}, [x0], x9
mov v16.16b, v22.16b
mov v17.16b, v23.16b
mov v19.16b, v25.16b
mov v20.16b, v26.16b
subs w3, w3, #2
mov v22.16b, v28.16b
mov v23.16b, v29.16b
b.gt 1b
ret
.endm
function ff_vvc_put_chroma_hv16_10_neon, export=1
put_chroma_hv16_xx_neon 2
endfunc
function ff_vvc_put_chroma_hv16_12_neon, export=1
put_chroma_hv16_xx_neon 4
endfunc
.macro put_chroma_hv_x16_xx_neon shift
// dst .req x0
// _src .req x1
// _src_stride .req x2
// height .req x3
// hf .req x4
// vf .req x5
// width .req x6
mov x9, #(VVC_MAX_PB_SIZE * 2)
sub x1, x1, #2
ldr s0, [x4]
ldr s1, [x5]
sxtl v0.8h, v0.8b
sub x1, x1, x2
sxtl v1.8h, v1.8b
1:
mov w13, w3
mov x11, x1
mov x10, x0
ld1 {v16.8h-v18.8h}, [x11], x2
ld1 {v19.8h-v21.8h}, [x11], x2
ld1 {v22.8h-v24.8h}, [x11], x2
put_chroma_hv_x8_horizontal_filter \shift, v16, v16, v17
put_chroma_hv_x8_horizontal_filter \shift, v17, v17, v18
put_chroma_hv_x8_horizontal_filter \shift, v19, v19, v20
put_chroma_hv_x8_horizontal_filter \shift, v20, v20, v21
put_chroma_hv_x8_horizontal_filter \shift, v22, v22, v23
put_chroma_hv_x8_horizontal_filter \shift, v23, v23, v24
2:
ld1 {v25.8h-v27.8h}, [x11], x2
put_chroma_hv_x8_horizontal_filter \shift, v25, v25, v26
put_chroma_hv_x8_horizontal_filter \shift, v26, v26, v27
put_chroma_hv_x8_vertical_filter v2, v3, v16, v19, v22, v25
put_chroma_hv_x8_vertical_filter v4, v5, v17, v20, v23, v26
ld1 {v28.8h-v30.8h}, [x11], x2
st1 {v2.4h-v5.4h}, [x10], x9
put_chroma_hv_x8_horizontal_filter \shift, v28, v28, v29
put_chroma_hv_x8_horizontal_filter \shift, v29, v29, v30
put_chroma_hv_x8_vertical_filter v2, v3, v19, v22, v25, v28
put_chroma_hv_x8_vertical_filter v4, v5, v20, v23, v26, v29
st1 {v2.4h-v5.4h}, [x10], x9
mov v16.16b, v22.16b
mov v17.16b, v23.16b
mov v19.16b, v25.16b
mov v20.16b, v26.16b
subs w13, w13, #2
mov v22.16b, v28.16b
mov v23.16b, v29.16b
b.gt 2b
subs w6, w6, #16
add x0, x0, #32
add x1, x1, #32
b.gt 1b
ret
.endm
function ff_vvc_put_chroma_hv_x16_10_neon, export=1
put_chroma_hv_x16_xx_neon 2
endfunc
function ff_vvc_put_chroma_hv_x16_12_neon, export=1
put_chroma_hv_x16_xx_neon 4
endfunc
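Reviewer sketch (not part of the patch): a scalar C reading of what the chroma macros above compute per output sample, useful as a mental reference when checking the NEON code. It assumes a 4-tap signed filter, a truncating right shift (as sqshrn does) and saturation to int16_t; the hv path filters four rows horizontally with the bit-depth shift and then combines them vertically with a shift of 6. The function names and the example taps are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Saturate a 32-bit value to the int16_t range. */
static int16_t sat16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/*
 * 4-tap dot product over p[0..3], then truncating shift and saturation.
 * Right-shifting a negative sum is assumed to be arithmetic, as on the
 * usual compilers.
 */
static int16_t filter4(const int16_t p[4], const int8_t f[4], int shift)
{
    int32_t sum = 0;
    assert(shift >= 0 && shift < 31);
    for (int t = 0; t < 4; t++)
        sum += (int32_t)p[t] * f[t];
    return sat16(sum >> shift);
}

/*
 * Two-stage hv filter for one output sample. rows[r] points at the first
 * of the four horizontal taps in row r (rows run from one line above the
 * output position to two lines below it).
 */
static int16_t filter4_hv(const int16_t *const rows[4], const int8_t hf[4],
                          const int8_t vf[4], int shift)
{
    int16_t tmp[4];
    for (int r = 0; r < 4; r++)
        tmp[r] = filter4(rows[r], hf, shift);   /* horizontal pass */
    return filter4(tmp, vf, 6);                 /* vertical pass */
}
```

With taps that sum to 64 and a flat input of 100, the 10-bit horizontal pass (shift 2) yields 1600 and the vertical pass leaves that value unchanged.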
+20 -26
@@ -122,13 +122,11 @@ function ff_vvc_alf_filter_luma_8_sme2, export=1
// clip .req x5
// vb .req x6
sme_entry
stp x29, x30, [sp, #-96]!
mov x29, sp
stp x19, x20, [sp, #16]
stp x21, x22, [sp, #32]
stp x23, x24, [sp, #48]
stp x25, x26, [sp, #64]
stp x27, x28, [sp, #80]
stp x19, x20, [sp, #-80]!
stp x21, x22, [sp, #16]
stp x23, x24, [sp, #32]
stp x25, x26, [sp, #48]
stp x27, x28, [sp, #64]
lsr x7, x3, #32
cnth x11
@@ -356,12 +354,11 @@ function ff_vvc_alf_filter_luma_8_sme2, export=1
add x0, x0, x2, lsl #2
b.gt 1b
ldp x19, x20, [sp, #16]
ldp x21, x22, [sp, #32]
ldp x23, x24, [sp, #48]
ldp x25, x26, [sp, #64]
ldp x27, x28, [sp, #80]
ldp x29, x30, [sp], #96
ldp x21, x22, [sp, #16]
ldp x23, x24, [sp, #32]
ldp x25, x26, [sp, #48]
ldp x27, x28, [sp, #64]
ldp x19, x20, [sp], #80
sme_exit
ret
endfunc
@@ -410,13 +407,11 @@ function ff_vvc_alf_filter_luma_10_sme2, export=1
mov w12, #1023
0:
sme_entry
stp x29, x30, [sp, #-96]!
mov x29, sp
stp x19, x20, [sp, #16]
stp x21, x22, [sp, #32]
stp x23, x24, [sp, #48]
stp x25, x26, [sp, #64]
stp x27, x28, [sp, #80]
stp x19, x20, [sp, #-80]!
stp x21, x22, [sp, #16]
stp x23, x24, [sp, #32]
stp x25, x26, [sp, #48]
stp x27, x28, [sp, #64]
lsr x7, x3, #32
cnth x11
@@ -644,12 +639,11 @@ function ff_vvc_alf_filter_luma_10_sme2, export=1
add x0, x0, x2, lsl #3
b.gt 1b
ldp x19, x20, [sp, #16]
ldp x21, x22, [sp, #32]
ldp x23, x24, [sp, #48]
ldp x25, x26, [sp, #64]
ldp x27, x28, [sp, #80]
ldp x29, x30, [sp], #96
ldp x21, x22, [sp, #16]
ldp x23, x24, [sp, #32]
ldp x25, x26, [sp, #48]
ldp x27, x28, [sp, #64]
ldp x19, x20, [sp], #80
sme_exit
ret
endfunc
+6
@@ -31,6 +31,7 @@
#include <math.h>
#include <string.h>
#include "libavutil/attributes.h"
#include "libavutil/channel_layout.h"
#include "libavutil/crc.h"
#include "libavutil/downmix_info.h"
@@ -338,7 +339,9 @@ static int decode_exponents(AC3DecodeContext *s,
switch (group_size) {
case 4: dexps[j++] = prevexp;
dexps[j++] = prevexp;
av_fallthrough;
case 2: dexps[j++] = prevexp;
av_fallthrough;
case 1: dexps[j++] = prevexp;
}
}
@@ -614,13 +617,16 @@ static void ac3_upmix_delay(AC3DecodeContext *s)
break;
case AC3_CHMODE_2F2R:
memset(s->delay[3], 0, channel_data_size);
av_fallthrough;
case AC3_CHMODE_2F1R:
memset(s->delay[2], 0, channel_data_size);
break;
case AC3_CHMODE_3F2R:
memset(s->delay[4], 0, channel_data_size);
av_fallthrough;
case AC3_CHMODE_3F1R:
memset(s->delay[3], 0, channel_data_size);
av_fallthrough;
case AC3_CHMODE_3F:
memcpy(s->delay[2], s->delay[1], channel_data_size);
memset(s->delay[1], 0, channel_data_size);
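Reviewer sketch (not part of the patch): the av_fallthrough additions mark case labels that are entered on purpose from the case above, so compilers that warn about implicit fallthrough stay quiet. The standalone C below mirrors the decode_exponents switch with a hypothetical SKETCH_FALLTHROUGH macro; it assumes a compiler that either supports the fallthrough attribute or accepts an empty statement in its place. It is not the definition FFmpeg uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical annotation macro; expands to the attribute when available. */
#if defined(__has_attribute)
#  if __has_attribute(fallthrough)
#    define SKETCH_FALLTHROUGH __attribute__((fallthrough))
#  endif
#endif
#ifndef SKETCH_FALLTHROUGH
#  define SKETCH_FALLTHROUGH do {} while (0)
#endif

/*
 * Write `value` group_size times (1, 2 or 4) by entering the switch at the
 * matching label and falling through the remaining cases.
 * Returns the number of entries written; other group sizes write nothing.
 */
static int repeat_exponent(uint8_t *dst, int group_size, uint8_t value)
{
    int j = 0;
    assert(dst != NULL);
    switch (group_size) {
    case 4:
        dst[j++] = value;
        dst[j++] = value;
        SKETCH_FALLTHROUGH;
    case 2:
        dst[j++] = value;
        SKETCH_FALLTHROUGH;
    case 1:
        dst[j++] = value;
    }
    return j;
}
```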
+1 -1
@@ -2704,7 +2704,7 @@ static int adpcm_decode_frame(AVCodecContext *avctx, AVFrame *frame,
for (int k = i-1; k > -1; k--) {
for (int o = 1; o < order; o++)
delta += sf_codes[(i-1) - k] * coefs[(o*8) + k];
delta += sf_codes[(i-1) - k] * (unsigned)coefs[(o*8) + k];
}
sample = sf_codes[i] * 2048;
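Reviewer sketch (not part of the patch): casting one operand to unsigned, as the changed line does, moves the multiplication into unsigned arithmetic, where overflow wraps instead of being undefined. The C below shows the same idea on a small dot product. The operand types in the decoder are not visible in this hunk, so int32_t inputs and a uint32_t accumulator are assumptions, and wrapping_dot is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Multiply-accumulate with well-defined wraparound: every product and the
 * running sum are computed modulo 2^32 in unsigned arithmetic.
 */
static uint32_t wrapping_dot(const int32_t *a, const int32_t *b, int n)
{
    uint32_t acc = 0;
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        acc += (uint32_t)a[i] * (uint32_t)b[i];
    return acc;
}
```

The low 32 bits of the result match what two's complement signed arithmetic would give when no overflow occurs, so well-formed input decodes the same while hostile input can no longer trigger undefined behaviour.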
+1
@@ -252,6 +252,7 @@ extern const FFCodec ff_pbm_encoder;
extern const FFCodec ff_pbm_decoder;
extern const FFCodec ff_pcx_encoder;
extern const FFCodec ff_pcx_decoder;
extern const FFCodec ff_pdv_encoder;
extern const FFCodec ff_pdv_decoder;
extern const FFCodec ff_pfm_encoder;
extern const FFCodec ff_pfm_decoder;
+13 -4
@@ -1548,8 +1548,12 @@ static int read_diff_float_data(ALSDecContext *ctx, unsigned int ra_frame) {
return AVERROR_INVALIDDATA;
}
j = 0;
for (i = 0; i < frame_length; ++i) {
ctx->raw_mantissa[c][i] = AV_RB32(larray);
if (ctx->raw_samples[c][i] == 0) {
ctx->raw_mantissa[c][i] = AV_RB32(larray + j);
j += 4;
}
}
}
}
@@ -1560,7 +1564,10 @@ static int read_diff_float_data(ALSDecContext *ctx, unsigned int ra_frame) {
if (ctx->raw_samples[c][i] != 0) {
//The following logic is taken from Table 14.45 and 14.46 from the ISO spec
if (av_cmp_sf_ieee754(acf[c], FLOAT_1)) {
nbits[i] = 23 - av_log2(abs(ctx->raw_samples[c][i]));
int nbit = av_log2(FFABSU(ctx->raw_samples[c][i]));
if (nbit > 23)
return AVERROR_INVALIDDATA;
nbits[i] = 23 - nbit;
} else {
nbits[i] = 23;
}
@@ -1634,7 +1641,7 @@ static int read_diff_float_data(ALSDecContext *ctx, unsigned int ra_frame) {
tmp_32 = (sign << 31) | ((e + EXP_BIAS) << 23) | (mantissa);
ctx->raw_samples[c][i] = tmp_32;
} else {
ctx->raw_samples[c][i] = raw_mantissa[c][i] & 0x007fffffUL;
ctx->raw_samples[c][i] = raw_mantissa[c][i];
}
}
align_get_bits(gb);
@@ -1790,7 +1797,9 @@ static int read_frame_data(ALSDecContext *ctx, unsigned int ra_frame)
}
if (sconf->floating) {
read_diff_float_data(ctx, ra_frame);
ret = read_diff_float_data(ctx, ra_frame);
if (ret < 0)
return ret;
}
if (get_bits_left(gb) < 0) {
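Reviewer sketch (not part of the patch): the nbits change takes the absolute value without overflowing on INT_MIN and rejects samples whose magnitude needs more than 23 bits, so 23 - nbit can no longer go negative. A standalone C version of that check follows; abs_unsigned and ilog2_u are hypothetical stand-ins for FFABSU and av_log2, and -1 stands in for AVERROR_INVALIDDATA.

```c
#include <assert.h>
#include <limits.h>

/* Absolute value as unsigned; well defined even for INT_MIN. */
static unsigned abs_unsigned(int v)
{
    return v < 0 ? 0u - (unsigned)v : (unsigned)v;
}

/* Floor of log2(v); returns 0 for v == 0. */
static int ilog2_u(unsigned v)
{
    int n = 0;
    while (v >>= 1)
        n++;
    return n;
}

/*
 * Number of mantissa bits left for the difference signal, or -1 when the
 * sample magnitude does not fit the 23-bit mantissa.
 */
static int mantissa_bits(int sample)
{
    int nbit;
    assert(sizeof(unsigned) * CHAR_BIT >= 32);  /* assumption of this sketch */
    nbit = ilog2_u(abs_unsigned(sample));
    if (nbit > 23)
        return -1;
    return 23 - nbit;
}
```

The last hunk of this file complements the check by propagating the error from read_diff_float_data() instead of discarding it.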
+48 -74
@@ -21,8 +21,6 @@
#include "amfdec.h"
#include "codec_internal.h"
#include "hwconfig.h"
#include "libavutil/imgutils.h"
#include "libavutil/mem.h"
#include "libavutil/time.h"
#include "decode.h"
#include "decode_bsf.h"
@@ -125,31 +123,7 @@ static int amf_init_decoder(AVCodecContext *avctx)
} else if (avctx->color_range != AVCOL_RANGE_UNSPECIFIED) {
AMF_ASSIGN_PROPERTY_BOOL(res, ctx->decoder, AMF_VIDEO_DECODER_FULL_RANGE_COLOR, 0);
}
color_profile = AMF_VIDEO_CONVERTER_COLOR_PROFILE_UNKNOWN;
switch (avctx->colorspace) {
case AVCOL_SPC_SMPTE170M:
if (avctx->color_range == AVCOL_RANGE_JPEG) {
color_profile = AMF_VIDEO_CONVERTER_COLOR_PROFILE_FULL_601;
} else {
color_profile = AMF_VIDEO_CONVERTER_COLOR_PROFILE_601;
}
break;
case AVCOL_SPC_BT709:
if (avctx->color_range == AVCOL_RANGE_JPEG) {
color_profile = AMF_VIDEO_CONVERTER_COLOR_PROFILE_FULL_709;
} else {
color_profile = AMF_VIDEO_CONVERTER_COLOR_PROFILE_709;
}
break;
case AVCOL_SPC_BT2020_NCL:
case AVCOL_SPC_BT2020_CL:
if (avctx->color_range == AVCOL_RANGE_JPEG) {
color_profile = AMF_VIDEO_CONVERTER_COLOR_PROFILE_FULL_2020;
} else {
color_profile = AMF_VIDEO_CONVERTER_COLOR_PROFILE_2020;
}
break;
}
color_profile = av_amf_get_color_profile(avctx->color_range, avctx->colorspace);
if (color_profile != AMF_VIDEO_CONVERTER_COLOR_PROFILE_UNKNOWN)
AMF_ASSIGN_PROPERTY_INT64(res, ctx->decoder, AMF_VIDEO_DECODER_COLOR_PROFILE, color_profile);
if (avctx->color_trc != AVCOL_TRC_UNSPECIFIED)
@@ -269,12 +243,13 @@ static int amf_init_frames_context(AVCodecContext *avctx, int sw_format, int new
static int amf_decode_init(AVCodecContext *avctx)
{
AMFDecoderContext *ctx = avctx->priv_data;
ctx->dimensions_initialized = 0;
int ret;
ctx->in_pkt = av_packet_alloc();
if (!ctx->in_pkt)
return AVERROR(ENOMEM);
if (avctx->hw_device_ctx) {
if (avctx->hw_device_ctx) {
AVHWDeviceContext *hwdev_ctx;
hwdev_ctx = (AVHWDeviceContext*)avctx->hw_device_ctx->data;
if (hwdev_ctx->type == AV_HWDEVICE_TYPE_AMF)
@@ -297,7 +272,7 @@ static int amf_decode_init(AVCodecContext *avctx)
AVAMFDeviceContext *amf_device_ctx = (AVAMFDeviceContext*)hw_device_ctx->hwctx;
enum AVPixelFormat surf_pix_fmt = AV_PIX_FMT_NONE;
if(amf_legacy_driver_no_bitness_detect(amf_device_ctx)){
if (amf_legacy_driver_no_bitness_detect(amf_device_ctx)) {
// if bitness detection is not supported in legacy driver use format from container
switch (avctx->pix_fmt) {
case AV_PIX_FMT_YUV420P:
@@ -306,7 +281,7 @@ static int amf_decode_init(AVCodecContext *avctx)
case AV_PIX_FMT_YUV420P10:
surf_pix_fmt = AV_PIX_FMT_P010; break;
}
}else{
} else {
AMFVariantStruct format_var = {0};
ret = ctx->decoder->pVtbl->GetProperty(ctx->decoder, AMF_VIDEO_DECODER_OUTPUT_FORMAT, &format_var);
@@ -314,17 +289,26 @@ static int amf_decode_init(AVCodecContext *avctx)
surf_pix_fmt = av_amf_to_av_format(format_var.int64Value);
}
if(avctx->hw_frames_ctx)
if (avctx->hw_frames_ctx)
{
// this values should be set for avcodec_open2
// will be updated after header decoded if not true.
if(surf_pix_fmt == AV_PIX_FMT_NONE)
if (surf_pix_fmt == AV_PIX_FMT_NONE)
surf_pix_fmt = AV_PIX_FMT_NV12; // for older drivers
if (!avctx->coded_width)
avctx->coded_width = 1280;
if (!avctx->coded_height)
avctx->coded_height = 720;
ret = amf_init_frames_context(avctx, surf_pix_fmt, avctx->coded_width, avctx->coded_height);
int frames_w = 0;
int frames_h = 0;
if (avctx->coded_width > 0 && avctx->coded_height > 0) {
frames_w = avctx->coded_width;
frames_h = avctx->coded_height;
} else if (avctx->width > 0 && avctx->height > 0) {
frames_w = avctx->width;
frames_h = avctx->height;
} else {
frames_w = 1280;
frames_h = 720;
}
ret = amf_init_frames_context(avctx, surf_pix_fmt, frames_w, frames_h);
AMF_GOTO_FAIL_IF_FALSE(avctx, ret == 0, ret, "Failed to init frames context (AMF) : %s\n", av_err2str(ret));
}
else
@@ -375,7 +359,7 @@ static int amf_amfsurface_to_avframe(AVCodecContext *avctx, AMFSurface* surface,
avctx->sw_pix_fmt = avctx->pix_fmt;
ret = ff_attach_decode_data(frame);
ret = ff_attach_decode_data(avctx, frame);
if (ret < 0)
return ret;
frame->width = avctx->width;
@@ -435,41 +419,10 @@ static int amf_amfsurface_to_avframe(AVCodecContext *avctx, AMFSurface* surface,
AMFHDRMetadata * hdrmeta = (AMFHDRMetadata*)hdrmeta_buffer->pVtbl->GetNative(hdrmeta_buffer);
if (ret != AMF_OK)
return ret;
if (hdrmeta != NULL) {
AVMasteringDisplayMetadata *mastering = av_mastering_display_metadata_create_side_data(frame);
const int chroma_den = 50000;
const int luma_den = 10000;
if (!mastering)
return AVERROR(ENOMEM);
mastering->display_primaries[0][0] = av_make_q(hdrmeta->redPrimary[0], chroma_den);
mastering->display_primaries[0][1] = av_make_q(hdrmeta->redPrimary[1], chroma_den);
mastering->display_primaries[1][0] = av_make_q(hdrmeta->greenPrimary[0], chroma_den);
mastering->display_primaries[1][1] = av_make_q(hdrmeta->greenPrimary[1], chroma_den);
mastering->display_primaries[2][0] = av_make_q(hdrmeta->bluePrimary[0], chroma_den);
mastering->display_primaries[2][1] = av_make_q(hdrmeta->bluePrimary[1], chroma_den);
mastering->white_point[0] = av_make_q(hdrmeta->whitePoint[0], chroma_den);
mastering->white_point[1] = av_make_q(hdrmeta->whitePoint[1], chroma_den);
mastering->max_luminance = av_make_q(hdrmeta->maxMasteringLuminance, luma_den);
mastering->min_luminance = av_make_q(hdrmeta->maxMasteringLuminance, luma_den);
mastering->has_luminance = 1;
mastering->has_primaries = 1;
if (hdrmeta->maxContentLightLevel) {
AVContentLightMetadata *light = av_content_light_metadata_create_side_data(frame);
if (!light)
return AVERROR(ENOMEM);
light->MaxCLL = hdrmeta->maxContentLightLevel;
light->MaxFALL = hdrmeta->maxFrameAverageLightLevel;
}
}
ret = av_amf_attach_hdr_metadata(frame, hdrmeta);
if (ret < 0)
return ret;
}
}
return 0;
@@ -552,6 +505,25 @@ static AMF_RESULT amf_buffer_from_packet(AVCodecContext *avctx, const AVPacket*
return amf_update_buffer_properties(avctx, buf, pkt);
}
static void amf_init_dimensions(AVCodecContext *avctx)
{
AMFDecoderContext *ctx = avctx->priv_data;
AMFVariantStruct size_var = {0};
AMF_RESULT res = AMF_OK;
res = ctx->decoder->pVtbl->GetProperty(ctx->decoder, AMF_VIDEO_DECODER_CURRENT_SIZE, &size_var);
if (res == AMF_OK && size_var.sizeValue.width > 0 && size_var.sizeValue.height > 0) {
avctx->width = size_var.sizeValue.width;
avctx->height = size_var.sizeValue.height;
avctx->coded_width = size_var.sizeValue.width;
avctx->coded_height = size_var.sizeValue.height;
ctx->dimensions_initialized = 1;
av_log(avctx, AV_LOG_DEBUG, "AMF: detected initial decoder size %dx%d\n", avctx->width, avctx->height);
}
}
static int amf_decode_frame(AVCodecContext *avctx, struct AVFrame *frame)
{
AMFDecoderContext *ctx = avctx->priv_data;
@@ -613,9 +585,11 @@ static int amf_decode_frame(AVCodecContext *avctx, struct AVFrame *frame)
}
res = amf_receive_frame(avctx, frame);
if (res == AMF_OK)
if (res == AMF_OK) {
got_frame = 1;
else if (res == AMF_REPEAT)
if (!ctx->dimensions_initialized)
amf_init_dimensions(avctx);
} else if (res == AMF_REPEAT)
// decoder has no output yet
res = AMF_OK;
else if (res == AMF_EOF) {
+1
@@ -55,6 +55,7 @@ typedef struct AMFDecoderContext {
int drain;
int resolution_changed;
int copy_output;
int dimensions_initialized;
AVPacket* in_pkt;
enum AMF_SURFACE_FORMAT output_format;
+51 -108
@@ -17,13 +17,11 @@
*/
#include "config.h"
#include "config_components.h"
#include "libavutil/avassert.h"
#include "libavutil/imgutils.h"
#include "libavutil/hwcontext.h"
#include "libavutil/hwcontext_amf.h"
#include "libavutil/hwcontext_amf_internal.h"
#if CONFIG_D3D11VA
#include "libavutil/hwcontext_d3d11va.h"
#endif
@@ -37,62 +35,10 @@
#include "amfenc.h"
#include "encode.h"
#include "internal.h"
#include "libavutil/mastering_display_metadata.h"
#define AMF_AV_FRAME_REF L"av_frame_ref"
#define PTS_PROP L"PtsProp"
static int amf_save_hdr_metadata(AVCodecContext *avctx, const AVFrame *frame, AMFHDRMetadata *hdrmeta)
{
AVFrameSideData *sd_display;
AVFrameSideData *sd_light;
AVMasteringDisplayMetadata *display_meta;
AVContentLightMetadata *light_meta;
sd_display = av_frame_get_side_data(frame, AV_FRAME_DATA_MASTERING_DISPLAY_METADATA);
if (sd_display) {
display_meta = (AVMasteringDisplayMetadata *)sd_display->data;
if (display_meta->has_luminance) {
const unsigned int luma_den = 10000;
hdrmeta->maxMasteringLuminance =
(amf_uint32)(luma_den * av_q2d(display_meta->max_luminance));
hdrmeta->minMasteringLuminance =
FFMIN((amf_uint32)(luma_den * av_q2d(display_meta->min_luminance)), hdrmeta->maxMasteringLuminance);
}
if (display_meta->has_primaries) {
const unsigned int chroma_den = 50000;
hdrmeta->redPrimary[0] =
FFMIN((amf_uint16)(chroma_den * av_q2d(display_meta->display_primaries[0][0])), chroma_den);
hdrmeta->redPrimary[1] =
FFMIN((amf_uint16)(chroma_den * av_q2d(display_meta->display_primaries[0][1])), chroma_den);
hdrmeta->greenPrimary[0] =
FFMIN((amf_uint16)(chroma_den * av_q2d(display_meta->display_primaries[1][0])), chroma_den);
hdrmeta->greenPrimary[1] =
FFMIN((amf_uint16)(chroma_den * av_q2d(display_meta->display_primaries[1][1])), chroma_den);
hdrmeta->bluePrimary[0] =
FFMIN((amf_uint16)(chroma_den * av_q2d(display_meta->display_primaries[2][0])), chroma_den);
hdrmeta->bluePrimary[1] =
FFMIN((amf_uint16)(chroma_den * av_q2d(display_meta->display_primaries[2][1])), chroma_den);
hdrmeta->whitePoint[0] =
FFMIN((amf_uint16)(chroma_den * av_q2d(display_meta->white_point[0])), chroma_den);
hdrmeta->whitePoint[1] =
FFMIN((amf_uint16)(chroma_den * av_q2d(display_meta->white_point[1])), chroma_den);
}
sd_light = av_frame_get_side_data(frame, AV_FRAME_DATA_CONTENT_LIGHT_LEVEL);
if (sd_light) {
light_meta = (AVContentLightMetadata *)sd_light->data;
if (light_meta) {
hdrmeta->maxContentLightLevel = (amf_uint16)light_meta->MaxCLL;
hdrmeta->maxFrameAverageLightLevel = (amf_uint16)light_meta->MaxFALL;
}
}
return 0;
}
return 1;
}
#if CONFIG_D3D11VA
#include <d3d11.h>
#endif
@@ -251,6 +197,8 @@ static int amf_copy_buffer(AVCodecContext *avctx, AVPacket *pkt, AMFBuffer *buff
AMFVariantStruct var = {0};
int64_t timestamp = AV_NOPTS_VALUE;
int64_t size = buffer->pVtbl->GetSize(buffer);
enum AVPictureType pict_type = 0;
int average_qp = -1;
if ((ret = ff_get_encode_buffer(avctx, pkt, size, 0)) < 0) {
return ret;
@@ -258,25 +206,52 @@ static int amf_copy_buffer(AVCodecContext *avctx, AVPacket *pkt, AMFBuffer *buff
memcpy(pkt->data, buffer->pVtbl->GetNative(buffer), size);
switch (avctx->codec->id) {
case AV_CODEC_ID_H264:
buffer->pVtbl->GetProperty(buffer, AMF_VIDEO_ENCODER_OUTPUT_DATA_TYPE, &var);
if(var.int64Value == AMF_VIDEO_ENCODER_OUTPUT_DATA_TYPE_IDR) {
pkt->flags = AV_PKT_FLAG_KEY;
case AV_CODEC_ID_H264:
buffer->pVtbl->GetProperty(buffer, AMF_VIDEO_ENCODER_OUTPUT_DATA_TYPE, &var);
pkt->flags |= AV_PKT_FLAG_KEY * (var.int64Value == AMF_VIDEO_ENCODER_OUTPUT_DATA_TYPE_IDR);
pict_type = var.int64Value == AMF_VIDEO_ENCODER_OUTPUT_DATA_TYPE_IDR ? AV_PICTURE_TYPE_I :
var.int64Value == AMF_VIDEO_ENCODER_OUTPUT_DATA_TYPE_I ? AV_PICTURE_TYPE_I :
var.int64Value == AMF_VIDEO_ENCODER_OUTPUT_DATA_TYPE_P ? AV_PICTURE_TYPE_P :
var.int64Value == AMF_VIDEO_ENCODER_OUTPUT_DATA_TYPE_B ? AV_PICTURE_TYPE_B : 0;
var.int64Value = -1;
if ((buffer->pVtbl->GetProperty(buffer, AMF_VIDEO_ENCODER_STATISTIC_AVERAGE_QP, &var)) == AMF_OK) {
average_qp = FFMAX((int)var.int64Value, -1);
}
break;
case AV_CODEC_ID_HEVC:
buffer->pVtbl->GetProperty(buffer, AMF_VIDEO_ENCODER_HEVC_OUTPUT_DATA_TYPE, &var);
pkt->flags |= AV_PKT_FLAG_KEY * (var.int64Value == AMF_VIDEO_ENCODER_HEVC_OUTPUT_DATA_TYPE_IDR);
pict_type = var.int64Value == AMF_VIDEO_ENCODER_HEVC_OUTPUT_DATA_TYPE_IDR ? AV_PICTURE_TYPE_I :
var.int64Value == AMF_VIDEO_ENCODER_HEVC_OUTPUT_DATA_TYPE_I ? AV_PICTURE_TYPE_I :
var.int64Value == AMF_VIDEO_ENCODER_HEVC_OUTPUT_DATA_TYPE_P ? AV_PICTURE_TYPE_P : 0;
var.int64Value = -1;
if ((buffer->pVtbl->GetProperty(buffer, AMF_VIDEO_ENCODER_HEVC_STATISTIC_AVERAGE_QP, &var)) == AMF_OK) {
average_qp = FFMAX((int)var.int64Value, -1);
}
break;
case AV_CODEC_ID_AV1:
buffer->pVtbl->GetProperty(buffer, AMF_VIDEO_ENCODER_AV1_OUTPUT_FRAME_TYPE, &var);
pkt->flags |= AV_PKT_FLAG_KEY * (var.int64Value == AMF_VIDEO_ENCODER_AV1_OUTPUT_FRAME_TYPE_KEY);
pict_type = var.int64Value == AMF_VIDEO_ENCODER_AV1_OUTPUT_FRAME_TYPE_KEY ? AV_PICTURE_TYPE_I :
var.int64Value == AMF_VIDEO_ENCODER_AV1_OUTPUT_FRAME_TYPE_INTRA_ONLY ? AV_PICTURE_TYPE_I :
var.int64Value == AMF_VIDEO_ENCODER_AV1_OUTPUT_FRAME_TYPE_INTER ? AV_PICTURE_TYPE_P : 0;
var.int64Value = -1;
if ((buffer->pVtbl->GetProperty(buffer, AMF_VIDEO_ENCODER_AV1_STATISTIC_AVERAGE_Q_INDEX, &var)) == AMF_OK) {
average_qp = FFMAX((int)var.int64Value, -1); // av1 qindex
if (average_qp >= 0) {
average_qp = (average_qp > 244) ? (average_qp <= 249 ? 62 : 63) : (average_qp + 3) >> 2; // av1 quantizer
}
break;
case AV_CODEC_ID_HEVC:
buffer->pVtbl->GetProperty(buffer, AMF_VIDEO_ENCODER_HEVC_OUTPUT_DATA_TYPE, &var);
if (var.int64Value == AMF_VIDEO_ENCODER_HEVC_OUTPUT_DATA_TYPE_IDR) {
pkt->flags = AV_PKT_FLAG_KEY;
}
break;
case AV_CODEC_ID_AV1:
buffer->pVtbl->GetProperty(buffer, AMF_VIDEO_ENCODER_AV1_OUTPUT_FRAME_TYPE, &var);
if (var.int64Value == AMF_VIDEO_ENCODER_AV1_OUTPUT_FRAME_TYPE_KEY) {
pkt->flags = AV_PKT_FLAG_KEY;
}
default:
break;
}
break;
default:
break;
}
if (average_qp >= 0) {
ff_encode_add_stats_side_data(pkt, average_qp * FF_QP2LAMBDA, NULL, 0, pict_type);
}
buffer->pVtbl->GetProperty(buffer, ctx->pts_property_name, &var);
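The AV1 case in the hunk above folds the 0..255 q-index reported by AMF into a 0..63 quantizer before the value is scaled by FF_QP2LAMBDA. A minimal standalone sketch of that arithmetic follows. The function name `qindex_to_quantizer` is hypothetical and not part of the patch, and passing a negative input through as -1 is an assumption that mirrors the `average_qp >= 0` guard.

```c
#include <assert.h>

/* Hypothetical helper (not in the patch): the same expression as the AV1
 * case in amf_copy_buffer(), applied to a q-index in 0..255. */
static int qindex_to_quantizer(int qindex)
{
    if (qindex < 0)
        return -1; /* assumption: "no statistic available" stays negative */
    if (qindex > 244)
        return qindex <= 249 ? 62 : 63; /* top of the range is special-cased */
    return (qindex + 3) >> 2;           /* divide by 4, rounding up */
}
```

With this mapping, q-index 244 gives 61, 245 through 249 give 62, and 250 through 255 give 63.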
@@ -479,7 +454,7 @@ static int amf_submit_frame(AVCodecContext *avctx, AVFrame *frame, AMFSurface
res = amf_device_ctx->context->pVtbl->AllocBuffer(amf_device_ctx->context, AMF_MEMORY_HOST, sizeof(AMFHDRMetadata), &hdrmeta_buffer);
if (res == AMF_OK) {
AMFHDRMetadata * hdrmeta = (AMFHDRMetadata*)hdrmeta_buffer->pVtbl->GetNative(hdrmeta_buffer);
if (amf_save_hdr_metadata(avctx, frame, hdrmeta) == 0) {
if (av_amf_extract_hdr_metadata(frame, hdrmeta) == 0) {
switch (avctx->codec->id) {
case AV_CODEC_ID_H264:
AMF_ASSIGN_PROPERTY_INTERFACE(res, ctx->encoder, AMF_VIDEO_ENCODER_INPUT_HDR_METADATA, hdrmeta_buffer); break;
@@ -500,6 +475,7 @@ static int amf_submit_frame(AVCodecContext *avctx, AVFrame *frame, AMFSurface
switch (avctx->codec->id) {
case AV_CODEC_ID_H264:
AMF_ASSIGN_PROPERTY_BOOL(res, surface, AMF_VIDEO_ENCODER_STATISTICS_FEEDBACK, 1);
AMF_ASSIGN_PROPERTY_INT64(res, surface, AMF_VIDEO_ENCODER_INSERT_AUD, !!ctx->aud);
switch (frame->pict_type) {
case AV_PICTURE_TYPE_I:
@@ -520,6 +496,7 @@ static int amf_submit_frame(AVCodecContext *avctx, AVFrame *frame, AMFSurface
}
break;
case AV_CODEC_ID_HEVC:
AMF_ASSIGN_PROPERTY_BOOL(res, surface, AMF_VIDEO_ENCODER_HEVC_STATISTICS_FEEDBACK, 1);
AMF_ASSIGN_PROPERTY_INT64(res, surface, AMF_VIDEO_ENCODER_HEVC_INSERT_AUD, !!ctx->aud);
switch (frame->pict_type) {
case AV_PICTURE_TYPE_I:
@@ -536,6 +513,7 @@ static int amf_submit_frame(AVCodecContext *avctx, AVFrame *frame, AMFSurface
}
break;
case AV_CODEC_ID_AV1:
AMF_ASSIGN_PROPERTY_BOOL(res, surface, AMF_VIDEO_ENCODER_AV1_STATISTICS_FEEDBACK, 1);
if (frame->pict_type == AV_PICTURE_TYPE_I) {
if (ctx->forced_idr) {
AMF_ASSIGN_PROPERTY_INT64(res, surface, AMF_VIDEO_ENCODER_AV1_FORCE_INSERT_SEQUENCE_HEADER, 1);
@@ -733,41 +711,6 @@ int ff_amf_receive_packet(AVCodecContext *avctx, AVPacket *avpkt)
return ret;
}
int ff_amf_get_color_profile(AVCodecContext *avctx)
{
amf_int64 color_profile = AMF_VIDEO_CONVERTER_COLOR_PROFILE_UNKNOWN;
if (avctx->color_range == AVCOL_RANGE_JPEG) {
/// Color Space for Full (JPEG) Range
switch (avctx->colorspace) {
case AVCOL_SPC_SMPTE170M:
color_profile = AMF_VIDEO_CONVERTER_COLOR_PROFILE_FULL_601;
break;
case AVCOL_SPC_BT709:
color_profile = AMF_VIDEO_CONVERTER_COLOR_PROFILE_FULL_709;
break;
case AVCOL_SPC_BT2020_NCL:
case AVCOL_SPC_BT2020_CL:
color_profile = AMF_VIDEO_CONVERTER_COLOR_PROFILE_FULL_2020;
break;
}
} else {
/// Color Space for Limited (MPEG) range
switch (avctx->colorspace) {
case AVCOL_SPC_SMPTE170M:
color_profile = AMF_VIDEO_CONVERTER_COLOR_PROFILE_601;
break;
case AVCOL_SPC_BT709:
color_profile = AMF_VIDEO_CONVERTER_COLOR_PROFILE_709;
break;
case AVCOL_SPC_BT2020_NCL:
case AVCOL_SPC_BT2020_CL:
color_profile = AMF_VIDEO_CONVERTER_COLOR_PROFILE_2020;
break;
}
}
return color_profile;
}
const AVCodecHWConfigInternal *const ff_amfenc_hw_configs[] = {
#if CONFIG_D3D11VA
HW_CONFIG_ENCODER_FRAMES(D3D11, D3D11VA),
-2
@@ -163,8 +163,6 @@ int ff_amf_receive_packet(AVCodecContext *avctx, AVPacket *avpkt);
*/
extern const enum AVPixelFormat ff_amf_pix_fmts[];
int ff_amf_get_color_profile(AVCodecContext *avctx);
/**
* Error handling helper
*/
+30 -24
@@ -16,16 +16,16 @@
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
*/
#include "libavutil/avassert.h"
#include "libavutil/hwcontext_amf.h"
#include "libavutil/internal.h"
#include "libavutil/intreadwrite.h"
#include "libavutil/mem.h"
#include "libavutil/opt.h"
#include "libavutil/pixdesc.h"
#include "amfenc.h"
#include "codec_internal.h"
#define AMF_VIDEO_ENCODER_AV1_CAP_WIDTH_ALIGNMENT_FACTOR_LOCAL L"Av1WidthAlignmentFactor" // amf_int64; default = 1
#define AMF_VIDEO_ENCODER_AV1_CAP_HEIGHT_ALIGNMENT_FACTOR_LOCAL L"Av1HeightAlignmentFactor" // amf_int64; default = 1
#define OFFSET(x) offsetof(AMFEncoderContext, x)
#define VE AV_OPT_FLAG_VIDEO_PARAM | AV_OPT_FLAG_ENCODING_PARAM
static const AVOption options[] = {
@@ -205,6 +205,7 @@ static av_cold int amf_encode_init_av1(AVCodecContext* avctx)
amf_int64 bit_depth;
amf_int64 color_profile;
enum AVPixelFormat pix_fmt;
const AVPixFmtDescriptor *pix_desc;
//for av1 alignment and crop
uint32_t crop_right = 0;
@@ -250,35 +251,40 @@ static av_cold int amf_encode_init_av1(AVCodecContext* avctx)
// Color bit depth
pix_fmt = avctx->hw_frames_ctx ? ((AVHWFramesContext*)avctx->hw_frames_ctx->data)->sw_format
: avctx->pix_fmt;
: avctx->pix_fmt;
pix_desc = av_pix_fmt_desc_get(pix_fmt);
av_assert0(pix_desc);
bit_depth = ctx->bit_depth;
if(bit_depth == AMF_COLOR_BIT_DEPTH_UNDEFINED){
bit_depth = pix_fmt == AV_PIX_FMT_P010 ? AMF_COLOR_BIT_DEPTH_10 : AMF_COLOR_BIT_DEPTH_8;
if (bit_depth == AMF_COLOR_BIT_DEPTH_UNDEFINED) {
bit_depth = pix_desc->comp[0].depth >= 10 ? AMF_COLOR_BIT_DEPTH_10 : AMF_COLOR_BIT_DEPTH_8;
}
AMF_ASSIGN_PROPERTY_INT64(res, ctx->encoder, AMF_VIDEO_ENCODER_AV1_COLOR_BIT_DEPTH, bit_depth);
// Color profile
color_profile = ff_amf_get_color_profile(avctx);
color_profile = av_amf_get_color_profile(avctx->color_range, avctx->colorspace);
AMF_ASSIGN_PROPERTY_INT64(res, ctx->encoder, AMF_VIDEO_ENCODER_AV1_OUTPUT_COLOR_PROFILE, color_profile);
// Color Range
// TODO
AMF_ASSIGN_PROPERTY_BOOL(res, ctx->encoder, AMF_VIDEO_ENCODER_AV1_OUTPUT_FULL_RANGE_COLOR, (avctx->color_range == AVCOL_RANGE_JPEG));
// Color Transfer Characteristics (AMF matches ISO/IEC)
if(avctx->color_primaries != AVCOL_PRI_UNSPECIFIED && (pix_fmt == AV_PIX_FMT_NV12 || pix_fmt == AV_PIX_FMT_P010)){
// if input is YUV, color_primaries are for VUI only
// AMF VCN color conversion supports only specific output primaries BT2020 for 10-bit and BT709 for 8-bit
// vpp_amf supports more
AMF_ASSIGN_PROPERTY_INT64(res, ctx->encoder, AMF_VIDEO_ENCODER_AV1_OUTPUT_TRANSFER_CHARACTERISTIC, avctx->color_trc);
if (!(pix_desc->flags & AV_PIX_FMT_FLAG_RGB)) {
// Color Transfer Characteristics (AMF matches ISO/IEC)
if (avctx->color_trc != AVCOL_TRC_UNSPECIFIED) {
// if input is YUV, color_trc is for VUI only - any value
// AMF VCN color conversion supports only specific output transfer characteristic SMPTE2084 for 10-bit and BT709 for 8-bit
// vpp_amf supports more
AMF_ASSIGN_PROPERTY_INT64(res, ctx->encoder, AMF_VIDEO_ENCODER_AV1_OUTPUT_TRANSFER_CHARACTERISTIC, avctx->color_trc);
}
// Color Primaries (AMF matches ISO/IEC)
if (avctx->color_primaries != AVCOL_PRI_UNSPECIFIED) {
// if input is YUV, color_primaries are for VUI only
// AMF VCN color conversion supports only specific primaries BT2020 for 10-bit and BT709 for 8-bit
// vpp_amf supports more
AMF_ASSIGN_PROPERTY_INT64(res, ctx->encoder, AMF_VIDEO_ENCODER_AV1_OUTPUT_COLOR_PRIMARIES, avctx->color_primaries);
}
}
// Color Primaries (AMF matches ISO/IEC)
if(avctx->color_primaries != AVCOL_PRI_UNSPECIFIED || pix_fmt == AV_PIX_FMT_NV12 || pix_fmt == AV_PIX_FMT_P010 )
{
// AMF VCN color conversion supports only specific primaries BT2020 for 10-bit and BT709 for 8-bit
// vpp_amf supports more
AMF_ASSIGN_PROPERTY_INT64(res, ctx->encoder, AMF_VIDEO_ENCODER_AV1_OUTPUT_COLOR_PRIMARIES, avctx->color_primaries);
}
profile_level = avctx->level;
if (profile_level == AV_LEVEL_UNKNOWN) {
profile_level = ctx->level;
@@ -658,13 +664,13 @@ static av_cold int amf_encode_init_av1(AVCodecContext* avctx)
var.pInterface->pVtbl->Release(var.pInterface);
//processing crop information according to alignment
if (ctx->encoder->pVtbl->GetProperty(ctx->encoder, AMF_VIDEO_ENCODER_AV1_CAP_WIDTH_ALIGNMENT_FACTOR_LOCAL, &var) != AMF_OK)
if (ctx->encoder->pVtbl->GetProperty(ctx->encoder, AMF_VIDEO_ENCODER_AV1_CAP_WIDTH_ALIGNMENT_FACTOR, &var) != AMF_OK)
// assume older driver and Navi3x
width_alignment_factor = 64;
else
width_alignment_factor = (int)var.int64Value;
if (ctx->encoder->pVtbl->GetProperty(ctx->encoder, AMF_VIDEO_ENCODER_AV1_CAP_HEIGHT_ALIGNMENT_FACTOR_LOCAL, &var) != AMF_OK)
if (ctx->encoder->pVtbl->GetProperty(ctx->encoder, AMF_VIDEO_ENCODER_AV1_CAP_HEIGHT_ALIGNMENT_FACTOR, &var) != AMF_OK)
// assume older driver and Navi3x
height_alignment_factor = 16;
else
@@ -746,7 +752,7 @@ const FFCodec ff_av1_amf_encoder = {
AV_CODEC_CAP_DR1,
.caps_internal = FF_CODEC_CAP_INIT_CLEANUP,
CODEC_PIXFMTS_ARRAY(ff_amf_pix_fmts),
.color_ranges = AVCOL_RANGE_MPEG, /* FIXME: implement tagging */
.color_ranges = AVCOL_RANGE_MPEG | AVCOL_RANGE_JPEG,
.p.wrapper_name = "amf",
.hw_configs = ff_amfenc_hw_configs,
};
+10 -5
@@ -16,10 +16,12 @@
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
*/
#include "libavutil/avassert.h"
#include "libavutil/hwcontext_amf.h"
#include "libavutil/internal.h"
#include "libavutil/mem.h"
#include "libavutil/opt.h"
#include "libavutil/pixdesc.h"
#include "amfenc.h"
#include "codec_internal.h"
#include <AMF/components/PreAnalysis.h>
@@ -206,6 +208,7 @@ static av_cold int amf_encode_init_h264(AVCodecContext *avctx)
int deblocking_filter = (avctx->flags & AV_CODEC_FLAG_LOOP_FILTER) ? 1 : 0;
amf_int64 color_profile;
enum AVPixelFormat pix_fmt;
const AVPixFmtDescriptor *pix_desc;
if (avctx->framerate.num > 0 && avctx->framerate.den > 0) {
framerate = AMFConstructRate(avctx->framerate.num, avctx->framerate.den);
@@ -270,18 +273,20 @@ static av_cold int amf_encode_init_h264(AVCodecContext *avctx)
AMF_ASSIGN_PROPERTY_RATIO(res, ctx->encoder, AMF_VIDEO_ENCODER_ASPECT_RATIO, ratio);
}
color_profile = ff_amf_get_color_profile(avctx);
color_profile = av_amf_get_color_profile(avctx->color_range, avctx->colorspace);
AMF_ASSIGN_PROPERTY_INT64(res, ctx->encoder, AMF_VIDEO_ENCODER_OUTPUT_COLOR_PROFILE, color_profile);
/// Color Range (Support for older Drivers)
AMF_ASSIGN_PROPERTY_BOOL(res, ctx->encoder, AMF_VIDEO_ENCODER_FULL_RANGE_COLOR, !!(avctx->color_range == AVCOL_RANGE_JPEG));
AMF_ASSIGN_PROPERTY_BOOL(res, ctx->encoder, AMF_VIDEO_ENCODER_OUTPUT_FULL_RANGE_COLOR, (avctx->color_range == AVCOL_RANGE_JPEG));
/// Color Depth
pix_fmt = avctx->hw_frames_ctx ? ((AVHWFramesContext*)avctx->hw_frames_ctx->data)->sw_format
: avctx->pix_fmt;
: avctx->pix_fmt;
pix_desc = av_pix_fmt_desc_get(pix_fmt);
av_assert0(pix_desc);
// 10 bit input video is not supported by AMF H264 encoder
AMF_RETURN_IF_FALSE(ctx, pix_fmt != AV_PIX_FMT_P010, AVERROR_INVALIDDATA, "10-bit input video is not supported by AMF H264 encoder\n");
AMF_RETURN_IF_FALSE(ctx, pix_desc->comp[0].depth == 8, AVERROR_INVALIDDATA, "10-bit input video is not supported by AMF H264 encoder\n");
AMF_ASSIGN_PROPERTY_INT64(res, ctx->encoder, AMF_VIDEO_ENCODER_COLOR_BIT_DEPTH, AMF_COLOR_BIT_DEPTH_8);
/// Color Transfer Characteristics (AMF matches ISO/IEC)
+38 -19
@@ -16,9 +16,12 @@
* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
*/
#include "libavutil/avassert.h"
#include "libavutil/hwcontext_amf.h"
#include "libavutil/internal.h"
#include "libavutil/mem.h"
#include "libavutil/opt.h"
#include "libavutil/pixdesc.h"
#include "amfenc.h"
#include "codec_internal.h"
#include <AMF/components/PreAnalysis.h>
@@ -166,6 +169,7 @@ static av_cold int amf_encode_init_hevc(AVCodecContext *avctx)
AMFEncoderContext *ctx = avctx->priv_data;
AMFVariantStruct var = {0};
amf_int64 profile = 0;
amf_int64 profile_from_bitdepth = 0;
amf_int64 profile_level = 0;
AMFBuffer *buffer;
AMFGuid guid;
@@ -175,6 +179,7 @@ static av_cold int amf_encode_init_hevc(AVCodecContext *avctx)
amf_int64 bit_depth;
amf_int64 color_profile;
enum AVPixelFormat pix_fmt;
const AVPixFmtDescriptor *pix_desc;
if (avctx->framerate.num > 0 && avctx->framerate.den > 0) {
framerate = AMFConstructRate(avctx->framerate.num, avctx->framerate.den);
@@ -243,35 +248,49 @@ static av_cold int amf_encode_init_hevc(AVCodecContext *avctx)
// Color bit depth
pix_fmt = avctx->hw_frames_ctx ? ((AVHWFramesContext*)avctx->hw_frames_ctx->data)->sw_format
: avctx->pix_fmt;
: avctx->pix_fmt;
pix_desc = av_pix_fmt_desc_get(pix_fmt);
av_assert0(pix_desc);
bit_depth = ctx->bit_depth;
if(bit_depth == AMF_COLOR_BIT_DEPTH_UNDEFINED){
bit_depth = pix_fmt == AV_PIX_FMT_P010 ? AMF_COLOR_BIT_DEPTH_10 : AMF_COLOR_BIT_DEPTH_8;
if (bit_depth == AMF_COLOR_BIT_DEPTH_UNDEFINED) {
bit_depth = pix_desc->comp[0].depth >= 10 ? AMF_COLOR_BIT_DEPTH_10 : AMF_COLOR_BIT_DEPTH_8;
}
AMF_ASSIGN_PROPERTY_INT64(res, ctx->encoder, AMF_VIDEO_ENCODER_HEVC_COLOR_BIT_DEPTH, bit_depth);
// HEVC profile follows target bit depth
profile_from_bitdepth = bit_depth == AMF_COLOR_BIT_DEPTH_10 ? AMF_VIDEO_ENCODER_HEVC_PROFILE_MAIN_10
: AMF_VIDEO_ENCODER_HEVC_PROFILE_MAIN;
if (profile != profile_from_bitdepth) {
AMF_ASSIGN_PROPERTY_INT64(res, ctx->encoder, AMF_VIDEO_ENCODER_HEVC_PROFILE, profile_from_bitdepth);
if (profile != 0) {
av_log(avctx, AV_LOG_WARNING, "The video profile and bit depth did not match, but this has been corrected\n");
}
}
avctx->profile = bit_depth == AMF_COLOR_BIT_DEPTH_10 ? AV_PROFILE_HEVC_MAIN_10 : AV_PROFILE_HEVC_MAIN;
// Color profile
color_profile = ff_amf_get_color_profile(avctx);
color_profile = av_amf_get_color_profile(avctx->color_range, avctx->colorspace);
AMF_ASSIGN_PROPERTY_INT64(res, ctx->encoder, AMF_VIDEO_ENCODER_HEVC_OUTPUT_COLOR_PROFILE, color_profile);
// Color Range (Support for older Drivers)
AMF_ASSIGN_PROPERTY_BOOL(res, ctx->encoder, AMF_VIDEO_ENCODER_HEVC_NOMINAL_RANGE, !!(avctx->color_range == AVCOL_RANGE_JPEG));
AMF_ASSIGN_PROPERTY_BOOL(res, ctx->encoder, AMF_VIDEO_ENCODER_HEVC_OUTPUT_FULL_RANGE_COLOR, (avctx->color_range == AVCOL_RANGE_JPEG));
// Color Transfer Characteristics (AMF matches ISO/IEC)
if(avctx->color_trc != AVCOL_TRC_UNSPECIFIED && (pix_fmt == AV_PIX_FMT_NV12 || pix_fmt == AV_PIX_FMT_P010)){
// if input is YUV, color_trc is for VUI only - any value
// AMF VCN color conversion supports only specific output transfer characteristic SMPTE2084 for 10-bit and BT709 for 8-bit
// vpp_amf supports more
AMF_ASSIGN_PROPERTY_INT64(res, ctx->encoder, AMF_VIDEO_ENCODER_HEVC_OUTPUT_TRANSFER_CHARACTERISTIC, avctx->color_trc);
}
if (!(pix_desc->flags & AV_PIX_FMT_FLAG_RGB)) {
// Color Transfer Characteristics (AMF matches ISO/IEC)
if (avctx->color_trc != AVCOL_TRC_UNSPECIFIED) {
// if input is YUV, color_trc is for VUI only - any value
// AMF VCN color conversion supports only specific output transfer characteristic SMPTE2084 for 10-bit and BT709 for 8-bit
// vpp_amf supports more
AMF_ASSIGN_PROPERTY_INT64(res, ctx->encoder, AMF_VIDEO_ENCODER_HEVC_OUTPUT_TRANSFER_CHARACTERISTIC, avctx->color_trc);
}
// Color Primaries (AMF matches ISO/IEC)
if(avctx->color_primaries != AVCOL_PRI_UNSPECIFIED && (pix_fmt == AV_PIX_FMT_NV12 || pix_fmt == AV_PIX_FMT_P010)){
// if input is YUV, color_primaries are for VUI only
// AMF VCN color conversion supports only specific output primaries BT2020 for 10-bit and BT709 for 8-bit
// vpp_amf supports more
AMF_ASSIGN_PROPERTY_INT64(res, ctx->encoder, AMF_VIDEO_ENCODER_HEVC_OUTPUT_COLOR_PRIMARIES, avctx->color_primaries);
// Color Primaries (AMF matches ISO/IEC)
if (avctx->color_primaries != AVCOL_PRI_UNSPECIFIED) {
// if input is YUV, color_primaries are for VUI only
// AMF VCN color conversion supports only specific output primaries BT2020 for 10-bit and BT709 for 8-bit
// vpp_amf supports more
AMF_ASSIGN_PROPERTY_INT64(res, ctx->encoder, AMF_VIDEO_ENCODER_HEVC_OUTPUT_COLOR_PRIMARIES, avctx->color_primaries);
}
}
// Picture control properties
+2
@@ -24,6 +24,7 @@
* ASCII/ANSI art decoder
*/
#include "libavutil/attributes.h"
#include "libavutil/common.h"
#include "libavutil/frame.h"
#include "libavutil/xga_font_data.h"
@@ -396,6 +397,7 @@ static int decode_frame(AVCodecContext *avctx, AVFrame *rframe,
break;
case 0x0A: //LF
hscroll(avctx);
av_fallthrough;
case 0x0D: //CR
s->x = 0;
break;
+3 -2
@@ -152,8 +152,9 @@ int ff_aom_parse_film_grain_sets(AVFilmGrainAFGS1Params *s,
payload_4byte = get_bits1(gb);
payload_size = get_bits(gb, payload_4byte ? 2 : 8);
set_idx = get_bits(gb, 3);
fgp = av_film_grain_params_alloc(&fgp_size);
if (!fgp)
if (!fgp || s->sets[set_idx])
goto error;
aom = &fgp->codec.aom;
@@ -212,7 +213,7 @@ int ff_aom_parse_film_grain_sets(AVFilmGrainAFGS1Params *s,
}
predict_scaling = get_bits1(gb);
if (predict_scaling && (!ref || ref == fgp))
if (predict_scaling && !ref)
goto error; // prediction must be from valid, different set
predict_y_scaling = predict_scaling ? get_bits1(gb) : 0;
+11 -9
@@ -19,6 +19,7 @@
#include <stdatomic.h>
#include "libavutil/attributes.h"
#include "libavutil/avassert.h"
#include "libavutil/mastering_display_metadata.h"
#include "libavutil/mem_internal.h"
#include "libavutil/pixdesc.h"
@@ -63,12 +64,12 @@ typedef struct APVDecodeContext {
uint8_t warned_unknown_pbu_types;
} APVDecodeContext;
static const enum AVPixelFormat apv_format_table[5][5] = {
{ AV_PIX_FMT_GRAY8, AV_PIX_FMT_GRAY10, AV_PIX_FMT_GRAY12, AV_PIX_FMT_GRAY14, AV_PIX_FMT_GRAY16 },
{ 0 }, // 4:2:0 is not valid.
{ AV_PIX_FMT_YUV422P, AV_PIX_FMT_YUV422P10, AV_PIX_FMT_YUV422P12, AV_PIX_FMT_YUV422P14, AV_PIX_FMT_YUV422P16 },
{ AV_PIX_FMT_YUV444P, AV_PIX_FMT_YUV444P10, AV_PIX_FMT_YUV444P12, AV_PIX_FMT_YUV444P14, AV_PIX_FMT_YUV444P16 },
{ AV_PIX_FMT_YUVA444P, AV_PIX_FMT_YUVA444P10, AV_PIX_FMT_YUVA444P12, 0 ,AV_PIX_FMT_YUVA444P16 },
static const enum AVPixelFormat apv_format_table[5][4] = {
{ AV_PIX_FMT_GRAY10, AV_PIX_FMT_GRAY12, AV_PIX_FMT_GRAY14, AV_PIX_FMT_GRAY16 },
{ AV_PIX_FMT_NONE, AV_PIX_FMT_NONE, AV_PIX_FMT_NONE, AV_PIX_FMT_NONE }, // 4:2:0 is not valid.
{ AV_PIX_FMT_YUV422P10, AV_PIX_FMT_YUV422P12, AV_PIX_FMT_YUV422P14, AV_PIX_FMT_YUV422P16 },
{ AV_PIX_FMT_YUV444P10, AV_PIX_FMT_YUV444P12, AV_PIX_FMT_YUV444P14, AV_PIX_FMT_YUV444P16 },
{ AV_PIX_FMT_YUVA444P10, AV_PIX_FMT_YUVA444P12, AV_PIX_FMT_NONE, AV_PIX_FMT_YUVA444P16 },
};
static APVVLCLUT decode_lut;
@@ -82,14 +83,15 @@ static int apv_decode_check_format(AVCodecContext *avctx,
avctx->level = header->frame_info.level_idc;
bit_depth = header->frame_info.bit_depth_minus8 + 8;
if (bit_depth < 8 || bit_depth > 16 || bit_depth % 2) {
av_assert1(bit_depth >= 10 && bit_depth <= 16); // checked by CBS
if (bit_depth % 2) {
avpriv_request_sample(avctx, "Bit depth %d", bit_depth);
return AVERROR_PATCHWELCOME;
}
avctx->pix_fmt =
apv_format_table[header->frame_info.chroma_format_idc][bit_depth - 4 >> 2];
apv_format_table[header->frame_info.chroma_format_idc][(bit_depth - 10) >> 1];
if (!avctx->pix_fmt) {
if (avctx->pix_fmt == AV_PIX_FMT_NONE) {
avpriv_request_sample(avctx, "YUVA444P14");
return AVERROR_PATCHWELCOME;
}
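The resized format table in the hunk above has four columns, one each for bit depths 10, 12, 14 and 16, selected by `(bit_depth - 10) >> 1`. The sketch below shows only that index computation. The helper name `apv_depth_column` is hypothetical, and returning -1 for rejected depths is an assumption made so the example stands alone; the patch itself asserts the range and returns AVERROR_PATCHWELCOME for odd depths.

```c
#include <assert.h>

/* Hypothetical helper (not in the patch): column of the 4-wide table for
 * a given bit depth. Returns -1 for depths outside 10..16 or odd depths. */
static int apv_depth_column(int bit_depth)
{
    if (bit_depth < 10 || bit_depth > 16 || bit_depth % 2)
        return -1;
    return (bit_depth - 10) >> 1; /* 10 -> 0, 12 -> 1, 14 -> 2, 16 -> 3 */
}
```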
+12 -25
@@ -20,6 +20,7 @@
#include "config.h"
#include "libavutil/attributes.h"
#include "libavutil/avassert.h"
#include "libavutil/common.h"
#include "apv.h"
@@ -100,33 +101,19 @@ static void apv_decode_transquant_c(void *output,
}
// Output.
if (bit_depth == 8) {
uint8_t *ptr = output;
int bd_shift = 20 - bit_depth;
av_assert2(bit_depth > 8 && bit_depth <= 16);
uint16_t *ptr = output;
int bd_shift = 20 - bit_depth;
pitch /= 2; // Pitch was in bytes, 2 bytes per sample.
for (int y = 0; y < 8; y++) {
for (int x = 0; x < 8; x++) {
int sample = ((recon_sample[y][x] +
(1 << (bd_shift - 1))) >> bd_shift) +
(1 << (bit_depth - 1));
ptr[x] = av_clip_uintp2(sample, bit_depth);
}
ptr += pitch;
}
} else {
uint16_t *ptr = output;
int bd_shift = 20 - bit_depth;
pitch /= 2; // Pitch was in bytes, 2 bytes per sample.
for (int y = 0; y < 8; y++) {
for (int x = 0; x < 8; x++) {
int sample = ((recon_sample[y][x] +
(1 << (bd_shift - 1))) >> bd_shift) +
(1 << (bit_depth - 1));
ptr[x] = av_clip_uintp2(sample, bit_depth);
}
ptr += pitch;
for (int y = 0; y < 8; y++) {
for (int x = 0; x < 8; x++) {
int sample = ((recon_sample[y][x] +
(1 << (bd_shift - 1))) >> bd_shift) +
(1 << (bit_depth - 1));
ptr[x] = av_clip_uintp2(sample, bit_depth);
}
ptr += pitch;
}
}
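The output loop kept by this hunk rounds each reconstructed value down to the target bit depth, re-centres it around mid-grey and clips it. The sketch below applies the same expression to a single sample. `clip_uintp2` is a local stand-in for `av_clip_uintp2`, and `apv_output_sample` is a hypothetical name; neither is part of the patch. The code assumes the compiler uses an arithmetic right shift for negative values, as GCC and Clang do.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for av_clip_uintp2(): clamp a to [0, 2^p - 1]. */
static unsigned clip_uintp2(int a, int p)
{
    int max = (1 << p) - 1;
    if (a < 0)
        return 0;
    if (a > max)
        return (unsigned)max;
    return (unsigned)a;
}

/* Hypothetical helper: one iteration of the inner output loop. */
static uint16_t apv_output_sample(int recon, int bit_depth)
{
    int bd_shift = 20 - bit_depth;                      /* same as the patch */
    int sample   = ((recon + (1 << (bd_shift - 1))) >> bd_shift) /* round */
                 + (1 << (bit_depth - 1));              /* add mid-grey offset */
    return (uint16_t)clip_uintp2(sample, bit_depth);
}
```

For 10-bit output a reconstructed value of 0 lands on 512, the mid-grey level, and values beyond the representable range saturate at 0 or 1023.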
+4
@@ -634,24 +634,28 @@ static int decode_frame(AVCodecContext *avctx, AVFrame *rframe,
ret = decode_avcf(avctx, frame);
break;
}
av_fallthrough;
case MKBETAG('A', 'L', 'C', 'D'):
if (avctx->pix_fmt == AV_PIX_FMT_PAL8) {
s->key = 0;
ret = decode_alcd(avctx, frame);
break;
}
av_fallthrough;
case MKBETAG('R', 'L', 'E', 'F'):
if (avctx->pix_fmt == AV_PIX_FMT_PAL8) {
s->key = 1;
ret = decode_rle(avctx, frame);
break;
}
av_fallthrough;
case MKBETAG('R', 'L', 'E', 'D'):
if (avctx->pix_fmt == AV_PIX_FMT_PAL8) {
s->key = 0;
ret = decode_rle(avctx, frame);
break;
}
av_fallthrough;
default:
av_log(avctx, AV_LOG_DEBUG, "unknown chunk 0x%X\n", chunk);
break;
+29 -29
@@ -28,25 +28,25 @@
*/
@ void ff_int32_to_float_fmul_array8_vfp(FmtConvertContext *c, float *dst, const int32_t *src, const float *mul, int len)
function ff_int32_to_float_fmul_array8_vfp, export=1
push {lr}
ldr a1, [sp, #4]
subs lr, a1, #3*8
bcc 50f @ too short to pipeline
push {lr}
ldr a1, [sp, #4]
subs lr, a1, #3*8
bcc 50f @ too short to pipeline
@ Now need to find (len / 8) % 3. The approximation
@ x / 24 = (x * 0xAB) >> 12
@ is good for x < 4096, which is true for both AC3 and DCA.
mov a1, #0xAB
ldr ip, =0x03070000 @ RunFast mode, short vectors of length 8, stride 1
mul a1, lr, a1
vpush {s16-s31}
mov a1, a1, lsr #12
add a1, a1, a1, lsl #1
rsb a1, a1, lr, lsr #3
cmp a1, #1
fmrx a1, FPSCR
fmxr FPSCR, ip
beq 11f
blo 10f
mov a1, #0xAB
ldr ip, =0x03070000 @ RunFast mode, short vectors of length 8, stride 1
mul a1, lr, a1
vpush {s16-s31}
mov a1, a1, lsr #12
add a1, a1, a1, lsl #1
rsb a1, a1, lr, lsr #3
cmp a1, #1
fmrx a1, FPSCR
fmxr FPSCR, ip
beq 11f
blo 10f
@ Array is (2 + multiple of 3) x 8 floats long
@ drop through...
vldmia a3!, {s16-s23}
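The comment in the hunk above relies on the reciprocal-multiply approximation x / 24 = (x * 0xAB) >> 12 to obtain (len / 8) % 3 without a divide. The C sketch below reproduces that arithmetic so the claim can be checked. All function names are hypothetical, and it assumes len is a multiple of 8 and at least 24, which is what the pipelined path appears to require. For such inputs the approximation is exact below 4096; it is not exact for arbitrary x (527 is one counterexample).

```c
#include <assert.h>
#include <stdint.h>

/* Approximation from the assembly comment: x / 24 ~= (x * 0xAB) >> 12. */
static uint32_t div24_approx(uint32_t x)
{
    return (x * 0xABu) >> 12;
}

/* (len / 8) % 3 as the assembly computes it:
 * x = len - 24, q = x / 24 (approximated), result = (x >> 3) - 3 * q. */
static uint32_t blocks_mod3(uint32_t len)
{
    uint32_t x = len - 24;
    uint32_t q = div24_approx(x);
    return (x >> 3) - 3 * q;
}

/* Returns 1 if the approximation matches x / 24 for every multiple of 8
 * in [0, limit), and 0 otherwise. */
static int div24_exact_for_multiples_of_8(uint32_t limit)
{
    for (uint32_t x = 0; x < limit; x += 8)
        if (div24_approx(x) != x / 24)
            return 0;
    return 1;
}
```

Calling `div24_exact_for_multiples_of_8(4096)` confirms the bound stated in the comment; the first multiple of 8 where the approximation overshoots is 4096 itself.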
@@ -122,9 +122,9 @@ function ff_int32_to_float_fmul_array8_vfp, export=1
vstmia a2!, {s24-s27}
vstmia a2!, {s28-s31}
fmxr FPSCR, a1
vpop {s16-s31}
pop {pc}
fmxr FPSCR, a1
vpop {s16-s31}
pop {pc}
10: @ Array is (multiple of 3) x 8 floats long
vldmia a3!, {s8-s15}
@@ -158,9 +158,9 @@ function ff_int32_to_float_fmul_array8_vfp, export=1
b 2b
50:
ldr lr, =0x03070000 @ RunFast mode, short vectors of length 8, stride 1
fmrx ip, FPSCR
fmxr FPSCR, lr
ldr lr, =0x03070000 @ RunFast mode, short vectors of length 8, stride 1
fmrx ip, FPSCR
fmxr FPSCR, lr
51:
vldmia a3!, {s8-s15}
vldmia a4!, {s0}
@@ -178,8 +178,8 @@ function ff_int32_to_float_fmul_array8_vfp, export=1
vstmia a2!, {s12-s15}
bne 51b
fmxr FPSCR, ip
pop {pc}
fmxr FPSCR, ip
pop {pc}
endfunc
/**
@@ -195,9 +195,9 @@ VFP len .req a3
NOVFP tmp .req a3
NOVFP len .req a4
NOVFP vmov s0, a3
ldr tmp, =0x03070000 @ RunFast mode, short vectors of length 8, stride 1
fmrx ip, FPSCR
fmxr FPSCR, tmp
ldr tmp, =0x03070000 @ RunFast mode, short vectors of length 8, stride 1
fmrx ip, FPSCR
fmxr FPSCR, tmp
1:
vldmia a2!, {s8-s15}
vcvt.f32.s32 s8, s8
@@ -214,8 +214,8 @@ NOVFP vmov s0, a3
vstmia a1!, {s12-s15}
bne 1b
fmxr FPSCR, ip
bx lr
fmxr FPSCR, ip
bx lr
endfunc
.unreq tmp
.unreq len
+6 -6
@@ -571,8 +571,8 @@ function \type\()_h264_qpel16_hv_lowpass_l2_neon
endfunc
.endm
h264_qpel16_hv put
h264_qpel16_hv avg
h264_qpel16_hv put
h264_qpel16_hv avg
.macro h264_qpel8 type
function ff_\type\()_h264_qpel8_mc10_neon, export=1
@@ -760,8 +760,8 @@ function ff_\type\()_h264_qpel8_mc33_neon, export=1
endfunc
.endm
h264_qpel8 put
h264_qpel8 avg
h264_qpel8 put
h264_qpel8 avg
.macro h264_qpel16 type
function ff_\type\()_h264_qpel16_mc10_neon, export=1
@@ -942,5 +942,5 @@ function ff_\type\()_h264_qpel16_mc33_neon, export=1
endfunc
.endm
h264_qpel16 put
h264_qpel16 avg
h264_qpel16 put
h264_qpel16 avg
+305 -305
@@ -23,363 +23,363 @@
#include "neon.S"
.macro hevc_loop_filter_chroma_start
ldr r12, [r2]
ldr r3, [r2, #4]
add r2, r3, r12
cmp r2, #0
it eq
bxeq lr
ldr r12, [r2]
ldr r3, [r2, #4]
add r2, r3, r12
cmp r2, #0
it eq
bxeq lr
.endm
.macro hevc_loop_filter_chroma_body
vsubl.u8 q3, d4, d2
vsubl.u8 q11, d18, d19
vshl.i16 q3, #2
vadd.i16 q11, q3
vdup.16 d0, r12
vdup.16 d1, r3
vrshr.s16 q11, q11, #3
vneg.s16 q12, q0
vmovl.u8 q2, d4
vmin.s16 q11, q11, q0
vmax.s16 q11, q11, q12
vaddw.u8 q1, q11, d2
vsub.i16 q2, q11
vqmovun.s16 d2, q1
vqmovun.s16 d4, q2
vsubl.u8 q3, d4, d2
vsubl.u8 q11, d18, d19
vshl.i16 q3, #2
vadd.i16 q11, q3
vdup.16 d0, r12
vdup.16 d1, r3
vrshr.s16 q11, q11, #3
vneg.s16 q12, q0
vmovl.u8 q2, d4
vmin.s16 q11, q11, q0
vmax.s16 q11, q11, q12
vaddw.u8 q1, q11, d2
vsub.i16 q2, q11
vqmovun.s16 d2, q1
vqmovun.s16 d4, q2
.endm
.macro hevc_loop_filter_luma_start
ldr r12, [r3]
ldr r3, [r3, #4]
lsl r3, #16
orr r3, r12
cmp r3, #0
it eq
bxeq lr
lsr r3, #16
ldr r12, [r3]
ldr r3, [r3, #4]
lsl r3, #16
orr r3, r12
cmp r3, #0
it eq
bxeq lr
lsr r3, #16
.endm
.macro hevc_loop_filter_luma_body
vmovl.u8 q8, d16
vmovl.u8 q9, d18
vmovl.u8 q10, d20
vmovl.u8 q11, d22
vmovl.u8 q12, d24
vmovl.u8 q13, d26
vmovl.u8 q14, d28
vmovl.u8 q15, d30
vmovl.u8 q8, d16
vmovl.u8 q9, d18
vmovl.u8 q10, d20
vmovl.u8 q11, d22
vmovl.u8 q12, d24
vmovl.u8 q13, d26
vmovl.u8 q14, d28
vmovl.u8 q15, d30
vadd.i16 q7, q9, q11
vadd.i16 q6, q14, q12
vsub.i16 q7, q10
vsub.i16 q6, q13
vabd.s16 q7, q7, q10
vabd.s16 q6, q6, q13
vadd.i16 q7, q9, q11
vadd.i16 q6, q14, q12
vsub.i16 q7, q10
vsub.i16 q6, q13
vabd.s16 q7, q7, q10
vabd.s16 q6, q6, q13
vdup.16 q0, r2
vmov q4, q7
vmov q5, q6
vdup.16 d4, r12
vtrn.16 q7, q4
vtrn.16 q6, q5
vdup.16 q0, r2
vmov q4, q7
vmov q5, q6
vdup.16 d4, r12
vtrn.16 q7, q4
vtrn.16 q6, q5
vshl.u64 q7, #32
vshr.u64 q4, #32
vshl.u64 q6, #32
vshr.u64 q5, #32
vshr.u64 q7, #32
vshr.u64 q6, #32
vshl.u64 q5, #32
vshl.u64 q4, #32
vorr q6, q5
vorr q7, q4
vdup.16 d5, r3
vadd.i16 q5, q7, q6
vshl.u64 q7, #32
vshr.u64 q4, #32
vshl.u64 q6, #32
vshr.u64 q5, #32
vshr.u64 q7, #32
vshr.u64 q6, #32
vshl.u64 q5, #32
vshl.u64 q4, #32
vorr q6, q5
vorr q7, q4
vdup.16 d5, r3
vadd.i16 q5, q7, q6
vmov q4, q5
vmov q3, q5
vtrn.32 q3, q4
vmov q4, q5
vmov q3, q5
vtrn.32 q3, q4
vadd.i16 q4, q3
vadd.i16 q4, q3
vshl.s16 q5, q5, #1
vcgt.s16 q3, q0, q4
vshl.s16 q5, q5, #1
vcgt.s16 q3, q0, q4
vmovn.i16 d6, q3
vshr.s16 q1, q0, #2
vmovn.i16 d6, q3
vcgt.s16 q5, q1, q5
vmov r7, s12
cmp r7, #0
beq bypasswrite
vmovn.i16 d6, q3
vshr.s16 q1, q0, #2
vmovn.i16 d6, q3
vcgt.s16 q5, q1, q5
vmov r7, s12
cmp r7, #0
beq bypasswrite
vpadd.i32 d0, d14, d12
vpadd.i32 d1, d15, d13
vmov q4, q2
vshl.s16 q2, #2
vshr.s16 q1, q1, #1
vrhadd.s16 q2, q4
vpadd.i32 d0, d14, d12
vpadd.i32 d1, d15, d13
vmov q4, q2
vshl.s16 q2, #2
vshr.s16 q1, q1, #1
vrhadd.s16 q2, q4
vabd.s16 q7, q8, q11
vaba.s16 q7, q15, q12
vabd.s16 q7, q8, q11
vaba.s16 q7, q15, q12
vmovn.i32 d0, q0
vmov r5, r6, s0, s1
vcgt.s16 q6, q1, q7
vand q5, q5, q6
vabd.s16 q7, q11, q12
vcgt.s16 q6, q2, q7
vand q5, q5, q6
vmovn.i32 d0, q0
vmov r5, r6, s0, s1
vcgt.s16 q6, q1, q7
vand q5, q5, q6
vabd.s16 q7, q11, q12
vcgt.s16 q6, q2, q7
vand q5, q5, q6
vmov q2, q5
vtrn.s16 q5, q2
vshr.u64 q2, #32
vshl.u64 q5, #32
vshl.u64 q2, #32
vshr.u64 q5, #32
vorr q5, q2
vmov q2, q5
vtrn.s16 q5, q2
vshr.u64 q2, #32
vshl.u64 q5, #32
vshl.u64 q2, #32
vshr.u64 q5, #32
vorr q5, q2
vmov q2, q5
vshl.i16 q7, q4, #1
vtrn.32 q2, q5
vand q5, q2
vneg.s16 q6, q7
vmovn.i16 d4, q5
vmovn.i16 d4, q2
vmov r8, s8
vmov q2, q5
vshl.i16 q7, q4, #1
vtrn.32 q2, q5
vand q5, q2
vneg.s16 q6, q7
vmovn.i16 d4, q5
vmovn.i16 d4, q2
vmov r8, s8
and r9, r8, r7
cmp r9, #0
beq 1f
and r9, r8, r7
cmp r9, #0
beq 1f
vadd.i16 q2, q11, q12
vadd.i16 q4, q9, q8
vadd.i16 q1, q2, q10
vdup.16 d10, r9
vadd.i16 q0, q1, q9
vshl.i16 q4, #1
lsr r9, #16
vadd.i16 q1, q0
vrshr.s16 q3, q0, #2
vadd.i16 q1, q13
vadd.i16 q4, q0
vsub.i16 q3, q10
vrshr.s16 q1, #3
vrshr.s16 q4, #3
vmax.s16 q3, q6
vsub.i16 q1, q11
vsub.i16 q4, q9
vmin.s16 q3, q7
vmax.s16 q4, q6
vmax.s16 q1, q6
vadd.i16 q3, q10
vmin.s16 q4, q7
vmin.s16 q1, q7
vdup.16 d11, r9
vadd.i16 q4, q9
vadd.i16 q1, q11
vbit q9, q4, q5
vadd.i16 q4, q2, q13
vbit q11, q1, q5
vadd.i16 q0, q4, q14
vadd.i16 q2, q15, q14
vadd.i16 q4, q0
vadd.i16 q2, q11, q12
vadd.i16 q4, q9, q8
vadd.i16 q1, q2, q10
vdup.16 d10, r9
vadd.i16 q0, q1, q9
vshl.i16 q4, #1
lsr r9, #16
vadd.i16 q1, q0
vrshr.s16 q3, q0, #2
vadd.i16 q1, q13
vadd.i16 q4, q0
vsub.i16 q3, q10
vrshr.s16 q1, #3
vrshr.s16 q4, #3
vmax.s16 q3, q6
vsub.i16 q1, q11
vsub.i16 q4, q9
vmin.s16 q3, q7
vmax.s16 q4, q6
vmax.s16 q1, q6
vadd.i16 q3, q10
vmin.s16 q4, q7
vmin.s16 q1, q7
vdup.16 d11, r9
vadd.i16 q4, q9
vadd.i16 q1, q11
vbit q9, q4, q5
vadd.i16 q4, q2, q13
vbit q11, q1, q5
vadd.i16 q0, q4, q14
vadd.i16 q2, q15, q14
vadd.i16 q4, q0
vshl.i16 q2, #1
vadd.i16 q4, q10
vbit q10, q3, q5
vrshr.s16 q4, #3
vadd.i16 q2, q0
vrshr.s16 q3, q0, #2
vsub.i16 q4, q12
vrshr.s16 q2, #3
vsub.i16 q3, q13
vmax.s16 q4, q6
vsub.i16 q2, q14
vmax.s16 q3, q6
vmin.s16 q4, q7
vmax.s16 q2, q6
vmin.s16 q3, q7
vadd.i16 q4, q12
vmin.s16 q2, q7
vadd.i16 q3, q13
vbit q12, q4, q5
vadd.i16 q2, q14
vbit q13, q3, q5
vbit q14, q2, q5
vshl.i16 q2, #1
vadd.i16 q4, q10
vbit q10, q3, q5
vrshr.s16 q4, #3
vadd.i16 q2, q0
vrshr.s16 q3, q0, #2
vsub.i16 q4, q12
vrshr.s16 q2, #3
vsub.i16 q3, q13
vmax.s16 q4, q6
vsub.i16 q2, q14
vmax.s16 q3, q6
vmin.s16 q4, q7
vmax.s16 q2, q6
vmin.s16 q3, q7
vadd.i16 q4, q12
vmin.s16 q2, q7
vadd.i16 q3, q13
vbit q12, q4, q5
vadd.i16 q2, q14
vbit q13, q3, q5
vbit q14, q2, q5
1:
mvn r8, r8
and r9, r8, r7
cmp r9, #0
beq 2f
mvn r8, r8
and r9, r8, r7
cmp r9, #0
beq 2f
vdup.16 q4, r2
vdup.16 q4, r2
vdup.16 d10, r9
lsr r9, #16
vmov q1, q4
vdup.16 d11, r9
vshr.s16 q1, #1
vsub.i16 q2, q12, q11
vadd.i16 q4, q1
vshl.s16 q0, q2, #3
vshr.s16 q4, #3
vadd.i16 q2, q0
vsub.i16 q0, q13, q10
vsub.i16 q2, q0
vshl.i16 q0, q0, #1
vsub.i16 q2, q0
vshl.s16 q1, q7, 2
vrshr.s16 q2, q2, #4
vadd.i16 q1, q7
vabs.s16 q3, q2
vshr.s16 q6, q6, #1
vcgt.s16 q1, q1, q3
vand q5, q1
vshr.s16 q7, q7, #1
vmax.s16 q2, q2, q6
vmin.s16 q2, q2, q7
vdup.16 d10, r9
lsr r9, #16
vmov q1, q4
vdup.16 d11, r9
vshr.s16 q1, #1
vsub.i16 q2, q12, q11
vadd.i16 q4, q1
vshl.s16 q0, q2, #3
vshr.s16 q4, #3
vadd.i16 q2, q0
vsub.i16 q0, q13, q10
vsub.i16 q2, q0
vshl.i16 q0, q0, #1
vsub.i16 q2, q0
vshl.s16 q1, q7, 2
vrshr.s16 q2, q2, #4
vadd.i16 q1, q7
vabs.s16 q3, q2
vshr.s16 q6, q6, #1
vcgt.s16 q1, q1, q3
vand q5, q1
vshr.s16 q7, q7, #1
vmax.s16 q2, q2, q6
vmin.s16 q2, q2, q7
vshr.s16 q7, q7, #1
vrhadd.s16 q3, q9, q11
vneg.s16 q6, q7
vsub.s16 q3, q10
vdup.16 d2, r5
vhadd.s16 q3, q2
vdup.16 d3, r6
vmax.s16 q3, q3, q6
vcgt.s16 q1, q4, q1
vmin.s16 q3, q3, q7
vand q1, q5
vadd.i16 q3, q10
lsr r5, #16
lsr r6, #16
vbit q10, q3, q1
vshr.s16 q7, q7, #1
vrhadd.s16 q3, q9, q11
vneg.s16 q6, q7
vsub.s16 q3, q10
vdup.16 d2, r5
vhadd.s16 q3, q2
vdup.16 d3, r6
vmax.s16 q3, q3, q6
vcgt.s16 q1, q4, q1
vmin.s16 q3, q3, q7
vand q1, q5
vadd.i16 q3, q10
lsr r5, #16
lsr r6, #16
vbit q10, q3, q1
vrhadd.s16 q3, q14, q12
vdup.16 d2, r5
vsub.s16 q3, q13
vdup.16 d3, r6
vhsub.s16 q3, q2
vcgt.s16 q1, q4, q1
vmax.s16 q3, q3, q6
vand q1, q5
vmin.s16 q3, q3, q7
vadd.i16 q3, q13
vbit q13, q3, q1
vadd.i16 q0, q11, q2
vsub.i16 q4, q12, q2
vbit q11, q0, q5
vbit q12, q4, q5
vrhadd.s16 q3, q14, q12
vdup.16 d2, r5
vsub.s16 q3, q13
vdup.16 d3, r6
vhsub.s16 q3, q2
vcgt.s16 q1, q4, q1
vmax.s16 q3, q3, q6
vand q1, q5
vmin.s16 q3, q3, q7
vadd.i16 q3, q13
vbit q13, q3, q1
vadd.i16 q0, q11, q2
vsub.i16 q4, q12, q2
vbit q11, q0, q5
vbit q12, q4, q5
2:
vqmovun.s16 d16, q8
vqmovun.s16 d18, q9
vqmovun.s16 d20, q10
vqmovun.s16 d22, q11
vqmovun.s16 d24, q12
vqmovun.s16 d26, q13
vqmovun.s16 d28, q14
vqmovun.s16 d30, q15
vqmovun.s16 d16, q8
vqmovun.s16 d18, q9
vqmovun.s16 d20, q10
vqmovun.s16 d22, q11
vqmovun.s16 d24, q12
vqmovun.s16 d26, q13
vqmovun.s16 d28, q14
vqmovun.s16 d30, q15
.endm
function ff_hevc_v_loop_filter_luma_neon, export=1
hevc_loop_filter_luma_start
push {r5-r11}
vpush {d8-d15}
sub r0, #4
vld1.8 {d16}, [r0], r1
vld1.8 {d18}, [r0], r1
vld1.8 {d20}, [r0], r1
vld1.8 {d22}, [r0], r1
vld1.8 {d24}, [r0], r1
vld1.8 {d26}, [r0], r1
vld1.8 {d28}, [r0], r1
vld1.8 {d30}, [r0], r1
sub r0, r0, r1, lsl #3
transpose_8x8 d16, d18, d20, d22, d24, d26, d28, d30
push {r5-r11}
vpush {d8-d15}
sub r0, #4
vld1.8 {d16}, [r0], r1
vld1.8 {d18}, [r0], r1
vld1.8 {d20}, [r0], r1
vld1.8 {d22}, [r0], r1
vld1.8 {d24}, [r0], r1
vld1.8 {d26}, [r0], r1
vld1.8 {d28}, [r0], r1
vld1.8 {d30}, [r0], r1
sub r0, r0, r1, lsl #3
transpose_8x8 d16, d18, d20, d22, d24, d26, d28, d30
hevc_loop_filter_luma_body
transpose_8x8 d16, d18, d20, d22, d24, d26, d28, d30
vst1.8 {d16}, [r0], r1
vst1.8 {d18}, [r0], r1
vst1.8 {d20}, [r0], r1
vst1.8 {d22}, [r0], r1
vst1.8 {d24}, [r0], r1
vst1.8 {d26}, [r0], r1
vst1.8 {d28}, [r0], r1
vst1.8 {d30}, [r0]
vpop {d8-d15}
pop {r5-r11}
bx lr
transpose_8x8 d16, d18, d20, d22, d24, d26, d28, d30
vst1.8 {d16}, [r0], r1
vst1.8 {d18}, [r0], r1
vst1.8 {d20}, [r0], r1
vst1.8 {d22}, [r0], r1
vst1.8 {d24}, [r0], r1
vst1.8 {d26}, [r0], r1
vst1.8 {d28}, [r0], r1
vst1.8 {d30}, [r0]
vpop {d8-d15}
pop {r5-r11}
bx lr
endfunc
function ff_hevc_h_loop_filter_luma_neon, export=1
hevc_loop_filter_luma_start
push {r5-r11}
vpush {d8-d15}
sub r0, r0, r1, lsl #2
vld1.8 {d16}, [r0], r1
vld1.8 {d18}, [r0], r1
vld1.8 {d20}, [r0], r1
vld1.8 {d22}, [r0], r1
vld1.8 {d24}, [r0], r1
vld1.8 {d26}, [r0], r1
vld1.8 {d28}, [r0], r1
vld1.8 {d30}, [r0], r1
sub r0, r0, r1, lsl #3
add r0, r1
push {r5-r11}
vpush {d8-d15}
sub r0, r0, r1, lsl #2
vld1.8 {d16}, [r0], r1
vld1.8 {d18}, [r0], r1
vld1.8 {d20}, [r0], r1
vld1.8 {d22}, [r0], r1
vld1.8 {d24}, [r0], r1
vld1.8 {d26}, [r0], r1
vld1.8 {d28}, [r0], r1
vld1.8 {d30}, [r0], r1
sub r0, r0, r1, lsl #3
add r0, r1
hevc_loop_filter_luma_body
vst1.8 {d18}, [r0], r1
vst1.8 {d20}, [r0], r1
vst1.8 {d22}, [r0], r1
vst1.8 {d24}, [r0], r1
vst1.8 {d26}, [r0], r1
vst1.8 {d28}, [r0]
vst1.8 {d18}, [r0], r1
vst1.8 {d20}, [r0], r1
vst1.8 {d22}, [r0], r1
vst1.8 {d24}, [r0], r1
vst1.8 {d26}, [r0], r1
vst1.8 {d28}, [r0]
bypasswrite:
vpop {d8-d15}
pop {r5-r11}
bx lr
vpop {d8-d15}
pop {r5-r11}
bx lr
endfunc
function ff_hevc_v_loop_filter_chroma_neon, export=1
hevc_loop_filter_chroma_start
sub r0, #4
vld1.8 {d16}, [r0], r1
vld1.8 {d17}, [r0], r1
vld1.8 {d18}, [r0], r1
vld1.8 {d2}, [r0], r1
vld1.8 {d4}, [r0], r1
vld1.8 {d19}, [r0], r1
vld1.8 {d20}, [r0], r1
vld1.8 {d21}, [r0], r1
sub r0, r0, r1, lsl #3
transpose_8x8 d16, d17, d18, d2, d4, d19, d20, d21
sub r0, #4
vld1.8 {d16}, [r0], r1
vld1.8 {d17}, [r0], r1
vld1.8 {d18}, [r0], r1
vld1.8 {d2}, [r0], r1
vld1.8 {d4}, [r0], r1
vld1.8 {d19}, [r0], r1
vld1.8 {d20}, [r0], r1
vld1.8 {d21}, [r0], r1
sub r0, r0, r1, lsl #3
transpose_8x8 d16, d17, d18, d2, d4, d19, d20, d21
hevc_loop_filter_chroma_body
transpose_8x8 d16, d17, d18, d2, d4, d19, d20, d21
vst1.8 {d16}, [r0], r1
vst1.8 {d17}, [r0], r1
vst1.8 {d18}, [r0], r1
vst1.8 {d2}, [r0], r1
vst1.8 {d4}, [r0], r1
vst1.8 {d19}, [r0], r1
vst1.8 {d20}, [r0], r1
vst1.8 {d21}, [r0]
bx lr
transpose_8x8 d16, d17, d18, d2, d4, d19, d20, d21
vst1.8 {d16}, [r0], r1
vst1.8 {d17}, [r0], r1
vst1.8 {d18}, [r0], r1
vst1.8 {d2}, [r0], r1
vst1.8 {d4}, [r0], r1
vst1.8 {d19}, [r0], r1
vst1.8 {d20}, [r0], r1
vst1.8 {d21}, [r0]
bx lr
endfunc
function ff_hevc_h_loop_filter_chroma_neon, export=1
hevc_loop_filter_chroma_start
sub r0, r0, r1, lsl #1
vld1.8 {d18}, [r0], r1
vld1.8 {d2}, [r0], r1
vld1.8 {d4}, [r0], r1
vld1.8 {d19}, [r0]
sub r0, r0, r1, lsl #1
sub r0, r0, r1, lsl #1
vld1.8 {d18}, [r0], r1
vld1.8 {d2}, [r0], r1
vld1.8 {d4}, [r0], r1
vld1.8 {d19}, [r0]
sub r0, r0, r1, lsl #1
hevc_loop_filter_chroma_body
vst1.8 {d2}, [r0], r1
vst1.8 {d4}, [r0]
bx lr
vst1.8 {d2}, [r0], r1
vst1.8 {d4}, [r0]
bx lr
endfunc
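For readers who do not follow NEON closely, the arithmetic in `hevc_loop_filter_chroma_body` can be modelled per pixel in scalar C. This is only a sketch read off the instruction sequence above, not code from the tree: the names `clip3` and `chroma_filter_px` are hypothetical, and the `p1/p0/q0/q1` naming (mapped to `d18`, `d2`, `d4`, `d19` from the load order in `ff_hevc_h_loop_filter_chroma_neon`) is an assumption. It also assumes `>>` on a negative `int` is an arithmetic shift, as `vrshr.s16` is.

```c
#include <stdint.h>

/* Clamp v into [lo, hi]. */
static int clip3(int lo, int hi, int v)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Hypothetical scalar model of one pixel position of the chroma filter body.
 * p1, p0 are the two samples before the edge; q0, q1 the two after it.
 * tc is the clipping threshold the macro broadcasts with vdup.16. */
static void chroma_filter_px(uint8_t p1, uint8_t *p0,
                             uint8_t *q0, uint8_t q1, int tc)
{
    /* vsubl.u8 x2, vshl.i16 #2, vadd.i16, vrshr.s16 #3 (rounding shift) */
    int delta = (((*q0 - *p0) * 4) + p1 - q1 + 4) >> 3;

    /* vmin.s16 / vmax.s16 against +tc and -tc */
    delta = clip3(-tc, tc, delta);

    /* vaddw.u8 / vsub.i16 followed by vqmovun.s16 (saturate to 0..255) */
    *p0 = (uint8_t)clip3(0, 255, *p0 + delta);
    *q0 = (uint8_t)clip3(0, 255, *q0 - delta);
}
```

The macro loads two threshold values (`r12` and `r3`) and broadcasts one into each half of `q0`, so in the vector code each threshold covers four of the eight lanes; the scalar model takes whichever applies to the pixel in question.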
+88 -88
@@ -322,44 +322,44 @@ endfunc
.endm
.macro tr_4x4 in0, in1, in2, in3, out0, out1, out2, out3, shift, tmp0, tmp1, tmp2, tmp3, tmp4
vshll.s16 \tmp0, \in0, #6
vmull.s16 \tmp2, \in1, d4[1]
vmov \tmp1, \tmp0
vmull.s16 \tmp3, \in1, d4[3]
vmlal.s16 \tmp0, \in2, d4[0] @e0
vmlsl.s16 \tmp1, \in2, d4[0] @e1
vmlal.s16 \tmp2, \in3, d4[3] @o0
vmlsl.s16 \tmp3, \in3, d4[1] @o1
vshll.s16 \tmp0, \in0, #6
vmull.s16 \tmp2, \in1, d4[1]
vmov \tmp1, \tmp0
vmull.s16 \tmp3, \in1, d4[3]
vmlal.s16 \tmp0, \in2, d4[0] @e0
vmlsl.s16 \tmp1, \in2, d4[0] @e1
vmlal.s16 \tmp2, \in3, d4[3] @o0
vmlsl.s16 \tmp3, \in3, d4[1] @o1
vadd.s32 \tmp4, \tmp0, \tmp2
vsub.s32 \tmp0, \tmp0, \tmp2
vadd.s32 \tmp2, \tmp1, \tmp3
vsub.s32 \tmp1, \tmp1, \tmp3
vqrshrn.s32 \out0, \tmp4, #\shift
vqrshrn.s32 \out3, \tmp0, #\shift
vqrshrn.s32 \out1, \tmp2, #\shift
vqrshrn.s32 \out2, \tmp1, #\shift
vadd.s32 \tmp4, \tmp0, \tmp2
vsub.s32 \tmp0, \tmp0, \tmp2
vadd.s32 \tmp2, \tmp1, \tmp3
vsub.s32 \tmp1, \tmp1, \tmp3
vqrshrn.s32 \out0, \tmp4, #\shift
vqrshrn.s32 \out3, \tmp0, #\shift
vqrshrn.s32 \out1, \tmp2, #\shift
vqrshrn.s32 \out2, \tmp1, #\shift
.endm
.macro tr_4x4_8 in0, in1, in2, in3, out0, out1, out2, out3, tmp0, tmp1, tmp2, tmp3
vshll.s16 \tmp0, \in0, #6
vld1.s16 {\in0}, [r1, :64]!
vmov \tmp1, \tmp0
vmull.s16 \tmp2, \in1, \in0[1]
vmull.s16 \tmp3, \in1, \in0[3]
vmlal.s16 \tmp0, \in2, \in0[0] @e0
vmlsl.s16 \tmp1, \in2, \in0[0] @e1
vmlal.s16 \tmp2, \in3, \in0[3] @o0
vmlsl.s16 \tmp3, \in3, \in0[1] @o1
vshll.s16 \tmp0, \in0, #6
vld1.s16 {\in0}, [r1, :64]!
vmov \tmp1, \tmp0
vmull.s16 \tmp2, \in1, \in0[1]
vmull.s16 \tmp3, \in1, \in0[3]
vmlal.s16 \tmp0, \in2, \in0[0] @e0
vmlsl.s16 \tmp1, \in2, \in0[0] @e1
vmlal.s16 \tmp2, \in3, \in0[3] @o0
vmlsl.s16 \tmp3, \in3, \in0[1] @o1
vld1.s16 {\in0}, [r1, :64]
vld1.s16 {\in0}, [r1, :64]
vadd.s32 \out0, \tmp0, \tmp2
vadd.s32 \out1, \tmp1, \tmp3
vsub.s32 \out2, \tmp1, \tmp3
vsub.s32 \out3, \tmp0, \tmp2
vadd.s32 \out0, \tmp0, \tmp2
vadd.s32 \out1, \tmp1, \tmp3
vsub.s32 \out2, \tmp1, \tmp3
vsub.s32 \out3, \tmp0, \tmp2
sub r1, r1, #8
sub r1, r1, #8
.endm
@ Do a 4x4 transpose, using q registers for the subtransposes that don't
@@ -385,7 +385,7 @@ function ff_hevc_idct_4x4_\bitdepth\()_neon, export=1
tr_4x4 d16, d17, d18, d19, d0, d1, d2, d3, 20 - \bitdepth, q10, q11, q12, q13, q0
transpose_4x4 q0, q1, d0, d1, d2, d3
vst1.s16 {d0-d3}, [r0, :128]
bx lr
bx lr
endfunc
.endm
@@ -557,14 +557,14 @@ endfunc
.endm
.macro add_member in, t0, t1, t2, t3, t4, t5, t6, t7, op0, op1, op2, op3, op4, op5, op6, op7
sum_sub q5, \in, \t0, \op0
sum_sub q6, \in, \t1, \op1
sum_sub q7, \in, \t2, \op2
sum_sub q8, \in, \t3, \op3
sum_sub q9, \in, \t4, \op4
sum_sub q10, \in, \t5, \op5
sum_sub q11, \in, \t6, \op6
sum_sub q12, \in, \t7, \op7
sum_sub q5, \in, \t0, \op0
sum_sub q6, \in, \t1, \op1
sum_sub q7, \in, \t2, \op2
sum_sub q8, \in, \t3, \op3
sum_sub q9, \in, \t4, \op4
sum_sub q10, \in, \t5, \op5
sum_sub q11, \in, \t6, \op6
sum_sub q12, \in, \t7, \op7
.endm
.macro butterfly16 in0, in1, in2, in3, in4, in5, in6, in7
@@ -682,7 +682,7 @@ function func_tr_16x4_\name
mov r4, #-32
store16 d26, d27, d28, d29, d30, d31, d8, d9, r4
.else
store_to_stack (\offset + 64), (\offset + 176), q4, q9, q10, q11, q3, q2, q1, q0
store_to_stack (\offset + 64), (\offset + 176), q4, q9, q10, q11, q3, q2, q1, q0
.endif
bx lr
@@ -744,10 +744,10 @@ endfunc
.endm
.macro add_member32 in, t0, t1, t2, t3, op0, op1, op2, op3
sum_sub q10, \in, \t0, \op0
sum_sub q11, \in, \t1, \op1
sum_sub q12, \in, \t2, \op2
sum_sub q13, \in, \t3, \op3
sum_sub q10, \in, \t0, \op0
sum_sub q11, \in, \t1, \op1
sum_sub q12, \in, \t2, \op2
sum_sub q13, \in, \t3, \op3
.endm
.macro butterfly32 in0, in1, in2, in3
@@ -900,7 +900,7 @@ function func_tr_32x4_\name
add r3, r11, #(32 + 3 * 64)
scale_store \shift
bx r10
bx r10
endfunc
.endm
@@ -965,59 +965,59 @@ idct_32x32_dc 10
/* uses registers q2 - q9 for temp values */
/* TODO: reorder */
.macro tr4_luma_shift r0, r1, r2, r3, shift
vaddl.s16 q5, \r0, \r2 // c0 = src0 + src2
vaddl.s16 q2, \r2, \r3 // c1 = src2 + src3
vsubl.s16 q4, \r0, \r3 // c2 = src0 - src3
vmull.s16 q6, \r1, d0[0] // c3 = 74 * src1
vaddl.s16 q5, \r0, \r2 // c0 = src0 + src2
vaddl.s16 q2, \r2, \r3 // c1 = src2 + src3
vsubl.s16 q4, \r0, \r3 // c2 = src0 - src3
vmull.s16 q6, \r1, d0[0] // c3 = 74 * src1
vaddl.s16 q7, \r0, \r3 // src0 + src3
vsubw.s16 q7, q7, \r2 // src0 - src2 + src3
vmul.s32 q7, q7, d0[0] // dst2 = 74 * (src0 - src2 + src3)
vaddl.s16 q7, \r0, \r3 // src0 + src3
vsubw.s16 q7, q7, \r2 // src0 - src2 + src3
vmul.s32 q7, q7, d0[0] // dst2 = 74 * (src0 - src2 + src3)
vmul.s32 q8, q5, d0[1] // 29 * c0
vmul.s32 q9, q2, d1[0] // 55 * c1
vadd.s32 q8, q9 // 29 * c0 + 55 * c1
vadd.s32 q8, q6 // dst0 = 29 * c0 + 55 * c1 + c3
vmul.s32 q8, q5, d0[1] // 29 * c0
vmul.s32 q9, q2, d1[0] // 55 * c1
vadd.s32 q8, q9 // 29 * c0 + 55 * c1
vadd.s32 q8, q6 // dst0 = 29 * c0 + 55 * c1 + c3
vmul.s32 q2, q2, d0[1] // 29 * c1
vmul.s32 q9, q4, d1[0] // 55 * c2
vsub.s32 q9, q2 // 55 * c2 - 29 * c1
vadd.s32 q9, q6 // dst1 = 55 * c2 - 29 * c1 + c3
vmul.s32 q2, q2, d0[1] // 29 * c1
vmul.s32 q9, q4, d1[0] // 55 * c2
vsub.s32 q9, q2 // 55 * c2 - 29 * c1
vadd.s32 q9, q6 // dst1 = 55 * c2 - 29 * c1 + c3
vmul.s32 q5, q5, d1[0] // 55 * c0
vmul.s32 q4, q4, d0[1] // 29 * c2
vadd.s32 q5, q4 // 55 * c0 + 29 * c2
vsub.s32 q5, q6 // dst3 = 55 * c0 + 29 * c2 - c3
vmul.s32 q5, q5, d1[0] // 55 * c0
vmul.s32 q4, q4, d0[1] // 29 * c2
vadd.s32 q5, q4 // 55 * c0 + 29 * c2
vsub.s32 q5, q6 // dst3 = 55 * c0 + 29 * c2 - c3
vqrshrn.s32 \r0, q8, \shift
vqrshrn.s32 \r1, q9, \shift
vqrshrn.s32 \r2, q7, \shift
vqrshrn.s32 \r3, q5, \shift
vqrshrn.s32 \r0, q8, \shift
vqrshrn.s32 \r1, q9, \shift
vqrshrn.s32 \r2, q7, \shift
vqrshrn.s32 \r3, q5, \shift
.endm
.ltorg
function ff_hevc_transform_luma_4x4_neon_8, export=1
vpush {d8-d15}
vld1.16 {q14, q15}, [r0] // coeffs
ldr r3, =0x4a // 74
vmov.32 d0[0], r3
ldr r3, =0x1d // 29
vmov.32 d0[1], r3
ldr r3, =0x37 // 55
vmov.32 d1[0], r3
vpush {d8-d15}
vld1.16 {q14, q15}, [r0] // coeffs
ldr r3, =0x4a // 74
vmov.32 d0[0], r3
ldr r3, =0x1d // 29
vmov.32 d0[1], r3
ldr r3, =0x37 // 55
vmov.32 d1[0], r3
tr4_luma_shift d28, d29, d30, d31, #7
tr4_luma_shift d28, d29, d30, d31, #7
vtrn.16 d28, d29
vtrn.16 d30, d31
vtrn.32 q14, q15
vtrn.16 d28, d29
vtrn.16 d30, d31
vtrn.32 q14, q15
tr4_luma_shift d28, d29, d30, d31, #12
tr4_luma_shift d28, d29, d30, d31, #12
vtrn.16 d28, d29
vtrn.16 d30, d31
vtrn.32 q14, q15
vst1.16 {q14, q15}, [r0]
vpop {d8-d15}
bx lr
vtrn.16 d28, d29
vtrn.16 d30, d31
vtrn.32 q14, q15
vst1.16 {q14, q15}, [r0]
vpop {d8-d15}
bx lr
endfunc
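The inline comments in `tr4_luma_shift` spell out a 4-point transform with the constants 29, 55 and 74. A scalar sketch of one pass, following those comments term by term, might look like the following. The helper names `sat16` and `tr4_luma_1d` are hypothetical, and the rounding is written out to mimic `vqrshrn.s32`, assuming an arithmetic right shift for negative values.

```c
#include <stdint.h>

/* Saturate a 32-bit value to the int16_t range, as vqrshrn.s32 does. */
static int16_t sat16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Hypothetical scalar model of one tr4_luma_shift pass over four
 * coefficients, transforming s[0..3] in place. */
static void tr4_luma_1d(int16_t s[4], int shift)
{
    int32_t c0 = s[0] + s[2];            /* c0 = src0 + src2 */
    int32_t c1 = s[2] + s[3];            /* c1 = src2 + src3 */
    int32_t c2 = s[0] - s[3];            /* c2 = src0 - src3 */
    int32_t c3 = 74 * s[1];              /* c3 = 74 * src1   */

    int32_t d0 = 29 * c0 + 55 * c1 + c3;
    int32_t d1 = 55 * c2 - 29 * c1 + c3;
    int32_t d2 = 74 * (s[0] - s[2] + s[3]);
    int32_t d3 = 55 * c0 + 29 * c2 - c3;

    int32_t rnd = 1 << (shift - 1);      /* rounding term of vqrshrn */
    s[0] = sat16((d0 + rnd) >> shift);
    s[1] = sat16((d1 + rnd) >> shift);
    s[2] = sat16((d2 + rnd) >> shift);
    s[3] = sat16((d3 + rnd) >> shift);
}
```

In `ff_hevc_transform_luma_4x4_neon_8` this pass runs twice, with shift 7 and then shift 12, and the 4x4 block is transposed with `vtrn` after each pass.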
File diff suppressed because it is too large
+141 -141
@@ -23,155 +23,155 @@
#include "neon.S"
function ff_hevc_sao_band_filter_neon_8, export=1
push {r4-r10}
ldr r5, [sp, #28] // width
ldr r4, [sp, #32] // height
ldr r8, [sp, #36] // offset_table
vpush {d8-d15}
mov r12, r4 // r12 = height
mov r6, r0 // r6 = r0 = dst
mov r7, r1 // r7 = r1 = src
vldm r8, {q0-q3}
vmov.u16 q15, #1
vmov.u8 q14, #32
0: pld [r1]
cmp r5, #4
beq 4f
8: subs r4, #1
vld1.8 {d16}, [r1], r3
vshr.u8 d17, d16, #3 // index = [src>>3]
vshll.u8 q9, d17, #1 // lowIndex = 2*index
vadd.u16 q11, q9, q15 // highIndex = (2*index+1) << 8
vshl.u16 q10, q11, #8 // q10: highIndex; q9: lowIndex;
vadd.u16 q10, q9 // combine high and low index;
push {r4-r10}
ldr r5, [sp, #28] // width
ldr r4, [sp, #32] // height
ldr r8, [sp, #36] // offset_table
vpush {d8-d15}
mov r12, r4 // r12 = height
mov r6, r0 // r6 = r0 = dst
mov r7, r1 // r7 = r1 = src
vldm r8, {q0-q3}
vmov.u16 q15, #1
vmov.u8 q14, #32
0: pld [r1]
cmp r5, #4
beq 4f
8: subs r4, #1
vld1.8 {d16}, [r1], r3
vshr.u8 d17, d16, #3 // index = [src>>3]
vshll.u8 q9, d17, #1 // lowIndex = 2*index
vadd.u16 q11, q9, q15 // highIndex = (2*index+1) << 8
vshl.u16 q10, q11, #8 // q10: highIndex; q9: lowIndex;
vadd.u16 q10, q9 // combine high and low index;
// Look-up Table Round 1; index range: 0-15
vtbx.8 d24, {q0-q1}, d20
vtbx.8 d25, {q0-q1}, d21
vtbx.8 d24, {q0-q1}, d20
vtbx.8 d25, {q0-q1}, d21
// Look-up Table Round 2; index range: 16-31
vsub.u8 q10, q14 // Look-up with 8bit
vtbx.8 d24, {q2-q3}, d20
vtbx.8 d25, {q2-q3}, d21
vaddw.u8 q13, q12, d16
vqmovun.s16 d8, q13
vst1.8 d8, [r0], r2
bne 8b
subs r5, #8
beq 99f
mov r4, r12
add r6, #8
mov r0, r6
add r7, #8
mov r1, r7
b 0b
4: subs r4, #1
vld1.32 {d16[0]}, [r1], r3
vshr.u8 d17, d16, #3 // src>>3
vshll.u8 q9, d17, #1 // lowIndex = 2*index
vadd.u16 q11, q9, q15 // highIndex = (2*index+1) << 8
vshl.u16 q10, q11, #8 // q10: highIndex; q9: lowIndex;
vadd.u16 q10, q9 // combine high and low index;
vsub.u8 q10, q14 // Look-up with 8bit
vtbx.8 d24, {q2-q3}, d20
vtbx.8 d25, {q2-q3}, d21
vaddw.u8 q13, q12, d16
vqmovun.s16 d8, q13
vst1.8 d8, [r0], r2
bne 8b
subs r5, #8
beq 99f
mov r4, r12
add r6, #8
mov r0, r6
add r7, #8
mov r1, r7
b 0b
4: subs r4, #1
vld1.32 {d16[0]}, [r1], r3
vshr.u8 d17, d16, #3 // src>>3
vshll.u8 q9, d17, #1 // lowIndex = 2*index
vadd.u16 q11, q9, q15 // highIndex = (2*index+1) << 8
vshl.u16 q10, q11, #8 // q10: highIndex; q9: lowIndex;
vadd.u16 q10, q9 // combine high and low index;
// Look-up Table Round 1; index range: 0-15
vtbx.8 d24, {q0-q1}, d20
vtbx.8 d25, {q0-q1}, d21
vtbx.8 d24, {q0-q1}, d20
vtbx.8 d25, {q0-q1}, d21
// Look-up Table Round 2; index range: 16-31
vsub.u8 q10, q14 // Look-up with 8bit
vtbx.8 d24, {q2-q3}, d20
vtbx.8 d25, {q2-q3}, d21
vaddw.u8 q13, q12, d16
vsub.u8 q10, q14 // Look-up with 8bit
vtbx.8 d24, {q2-q3}, d20
vtbx.8 d25, {q2-q3}, d21
vaddw.u8 q13, q12, d16
vqmovun.s16 d14, q13
vst1.32 d14[0], [r0], r2
bne 4b
b 99f
vst1.32 d14[0], [r0], r2
bne 4b
b 99f
99:
vpop {d8-d15}
pop {r4-r10}
bx lr
vpop {d8-d15}
pop {r4-r10}
bx lr
endfunc
function ff_hevc_sao_edge_filter_neon_8, export=1
push {r4-r11}
ldr r5, [sp, #32] // width
ldr r4, [sp, #36] // height
ldr r8, [sp, #40] // a_stride
ldr r9, [sp, #44] // b_stride
ldr r10, [sp, #48] // sao_offset_val
ldr r11, [sp, #52] // edge_idx
vpush {d8-d15}
mov r12, r4 // r12 = height
mov r6, r0 // r6 = r0 = dst
mov r7, r1 // r7 = r1 = src
vld1.8 {d0}, [r11] // edge_idx table load in d0 5x8bit
vld1.16 {q1}, [r10] // sao_offset_val table load in q1, 5x16bit
vmov.u8 d1, #2
vmov.u16 q2, #1
0: mov r10, r1
add r10, r8 // src[x + a_stride]
mov r11, r1
add r11, r9 // src[x + b_stride]
pld [r1]
cmp r5, #4
beq 4f
8: subs r4, #1
vld1.8 {d16}, [r1], r3 // src[x] 8x8bit
vld1.8 {d17}, [r10], r3 // src[x + a_stride]
vld1.8 {d18}, [r11], r3 // src[x + b_stride]
vcgt.u8 d8, d16, d17
vshr.u8 d9, d8, #7
vclt.u8 d8, d16, d17
vadd.u8 d8, d9 // diff0
vcgt.u8 d10, d16, d18
vshr.u8 d11, d10, #7
vclt.u8 d10, d16, d18
vadd.u8 d10, d11 // diff1
vadd.s8 d8, d10
vadd.s8 d8, d1
vtbx.8 d9, {d0}, d8 // offset_val
vshll.u8 q6, d9, #1 // lowIndex
vadd.u16 q7, q6, q2
vshl.u16 q10, q7, #8 // highIndex
vadd.u16 q10, q6 // combine lowIndex and highIndex, offset_val
vtbx.8 d22, {q1}, d20
vtbx.8 d23, {q1}, d21
vaddw.u8 q12, q11, d16
vqmovun.s16 d26, q12
vst1.8 d26, [r0], r2
bne 8b
subs r5, #8
beq 99f
mov r4, r12
add r6, #8
mov r0, r6
add r7, #8
mov r1, r7
b 0b
4: subs r4, #1
vld1.32 {d16[0]}, [r1], r3
vld1.32 {d17[0]}, [r10], r3 // src[x + a_stride]
vld1.32 {d18[0]}, [r11], r3 // src[x + b_stride]
vcgt.u8 d8, d16, d17
vshr.u8 d9, d8, #7
vclt.u8 d8, d16, d17
vadd.u8 d8, d9 // diff0
vcgt.u8 d10, d16, d18
vshr.u8 d11, d10, #7
vclt.u8 d10, d16, d18
vadd.u8 d10, d11 // diff1
vadd.s8 d8, d10
vadd.s8 d8, d1
vtbx.8 d9, {d0}, d8 // offset_val
vshll.u8 q6, d9, #1 // lowIndex
vadd.u16 q7, q6, q2
vshl.u16 q10, q7, #8 // highIndex
vadd.u16 q10, q6 // combine lowIndex and highIndex, offset_val
vtbx.8 d22, {q1}, d20
vtbx.8 d23, {q1}, d21
vaddw.u8 q12, q11, d16
vqmovun.s16 d26, q12
vst1.32 d26[0], [r0], r2
bne 4b
b 99f
push {r4-r11}
ldr r5, [sp, #32] // width
ldr r4, [sp, #36] // height
ldr r8, [sp, #40] // a_stride
ldr r9, [sp, #44] // b_stride
ldr r10, [sp, #48] // sao_offset_val
ldr r11, [sp, #52] // edge_idx
vpush {d8-d15}
mov r12, r4 // r12 = height
mov r6, r0 // r6 = r0 = dst
mov r7, r1 // r7 = r1 = src
vld1.8 {d0}, [r11] // edge_idx table load in d0 5x8bit
vld1.16 {q1}, [r10] // sao_offset_val table load in q1, 5x16bit
vmov.u8 d1, #2
vmov.u16 q2, #1
0: mov r10, r1
add r10, r8 // src[x + a_stride]
mov r11, r1
add r11, r9 // src[x + b_stride]
pld [r1]
cmp r5, #4
beq 4f
8: subs r4, #1
vld1.8 {d16}, [r1], r3 // src[x] 8x8bit
vld1.8 {d17}, [r10], r3 // src[x + a_stride]
vld1.8 {d18}, [r11], r3 // src[x + b_stride]
vcgt.u8 d8, d16, d17
vshr.u8 d9, d8, #7
vclt.u8 d8, d16, d17
vadd.u8 d8, d9 // diff0
vcgt.u8 d10, d16, d18
vshr.u8 d11, d10, #7
vclt.u8 d10, d16, d18
vadd.u8 d10, d11 // diff1
vadd.s8 d8, d10
vadd.s8 d8, d1
vtbx.8 d9, {d0}, d8 // offset_val
vshll.u8 q6, d9, #1 // lowIndex
vadd.u16 q7, q6, q2
vshl.u16 q10, q7, #8 // highIndex
vadd.u16 q10, q6 // combine lowIndex and highIndex, offset_val
vtbx.8 d22, {q1}, d20
vtbx.8 d23, {q1}, d21
vaddw.u8 q12, q11, d16
vqmovun.s16 d26, q12
vst1.8 d26, [r0], r2
bne 8b
subs r5, #8
beq 99f
mov r4, r12
add r6, #8
mov r0, r6
add r7, #8
mov r1, r7
b 0b
4: subs r4, #1
vld1.32 {d16[0]}, [r1], r3
vld1.32 {d17[0]}, [r10], r3 // src[x + a_stride]
vld1.32 {d18[0]}, [r11], r3 // src[x + b_stride]
vcgt.u8 d8, d16, d17
vshr.u8 d9, d8, #7
vclt.u8 d8, d16, d17
vadd.u8 d8, d9 // diff0
vcgt.u8 d10, d16, d18
vshr.u8 d11, d10, #7
vclt.u8 d10, d16, d18
vadd.u8 d10, d11 // diff1
vadd.s8 d8, d10
vadd.s8 d8, d1
vtbx.8 d9, {d0}, d8 // offset_val
vshll.u8 q6, d9, #1 // lowIndex
vadd.u16 q7, q6, q2
vshl.u16 q10, q7, #8 // highIndex
vadd.u16 q10, q6 // combine lowIndex and highIndex, offset_val
vtbx.8 d22, {q1}, d20
vtbx.8 d23, {q1}, d21
vaddw.u8 q12, q11, d16
vqmovun.s16 d26, q12
vst1.32 d26[0], [r0], r2
bne 4b
b 99f
99:
vpop {d8-d15}
pop {r4-r11}
bx lr
vpop {d8-d15}
pop {r4-r11}
bx lr
endfunc
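Both SAO routines above reduce to a table lookup plus a saturating add per pixel. The scalar sketch below is inferred from the comments and instruction sequence in this hunk, not taken from the tree. The function names are hypothetical, and it deliberately skips the low/high byte-index trick that the NEON code needs to read 16-bit entries through the byte-wise `vtbx.8`.

```c
#include <stdint.h>

/* Saturate to 0..255, as vqmovun.s16 does. */
static uint8_t clip_u8(int v)
{
    return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Band offset for 8-bit samples: the top five bits of the sample
 * (src >> 3) select one of 32 offsets. */
static uint8_t sao_band_px(uint8_t src, const int16_t offset_table[32])
{
    return clip_u8(src + offset_table[src >> 3]);
}

/* -1, 0 or +1; the vcgt/vshr/vclt/vadd sequence yields the same value. */
static int sign_cmp(int a, int b)
{
    return (a > b) - (a < b);
}

/* Edge offset: compare the sample with its two neighbours along the
 * edge direction (src[x + a_stride] and src[x + b_stride]). */
static uint8_t sao_edge_px(uint8_t cur, uint8_t a, uint8_t b,
                           const uint8_t edge_idx[5],
                           const int16_t sao_offset_val[5])
{
    int diff0 = sign_cmp(cur, a);
    int diff1 = sign_cmp(cur, b);
    int idx   = edge_idx[2 + diff0 + diff1];   /* vadd.s8 with #2, vtbx.8 */
    return clip_u8(cur + sao_offset_val[idx]);
}
```

The assembly processes eight pixels per iteration, or four when the width is 4, and walks the block column strip by column strip; the per-pixel result should match this model under the assumptions above.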
+1 -1
@@ -85,7 +85,7 @@
beq 2f
subs \tmp, \tmp, #1
beq 3f
b 4f
b 4f
.endm
@ ----------------------------------------------------------------
+225 -225
@@ -56,314 +56,314 @@
#define FIX_0xFFFF_ID 48
function ff_j_rev_dct_arm, export=1
push {r0, r4 - r11, lr}
push {r0, r4 - r11, lr}
mov lr, r0 @ lr = pointer to the current row
mov r12, #8 @ r12 = row-counter
movrel r11, const_array @ r11 = base pointer to the constants array
mov lr, r0 @ lr = pointer to the current row
mov r12, #8 @ r12 = row-counter
movrel r11, const_array @ r11 = base pointer to the constants array
row_loop:
ldrsh r0, [lr, # 0] @ r0 = 'd0'
ldrsh r2, [lr, # 2] @ r2 = 'd2'
ldrsh r0, [lr, # 0] @ r0 = 'd0'
ldrsh r2, [lr, # 2] @ r2 = 'd2'
@ Optimization for rows that have all items except the first set to 0
@ (this works as the int16_t rows are always 4-byte aligned)
ldr r5, [lr, # 0]
ldr r6, [lr, # 4]
ldr r3, [lr, # 8]
ldr r4, [lr, #12]
orr r3, r3, r4
orr r3, r3, r6
orrs r5, r3, r5
beq end_of_row_loop @ nothing to be done as ALL of them are '0'
orrs r3, r3, r2
beq empty_row
ldr r5, [lr, # 0]
ldr r6, [lr, # 4]
ldr r3, [lr, # 8]
ldr r4, [lr, #12]
orr r3, r3, r4
orr r3, r3, r6
orrs r5, r3, r5
beq end_of_row_loop @ nothing to be done as ALL of them are '0'
orrs r3, r3, r2
beq empty_row
ldrsh r1, [lr, # 8] @ r1 = 'd1'
ldrsh r4, [lr, # 4] @ r4 = 'd4'
ldrsh r6, [lr, # 6] @ r6 = 'd6'
ldrsh r1, [lr, # 8] @ r1 = 'd1'
ldrsh r4, [lr, # 4] @ r4 = 'd4'
ldrsh r6, [lr, # 6] @ r6 = 'd6'
ldr r3, [r11, #FIX_0_541196100_ID]
add r7, r2, r6
ldr r5, [r11, #FIX_M_1_847759065_ID]
mul r7, r3, r7 @ r7 = z1
ldr r3, [r11, #FIX_0_765366865_ID]
mla r6, r5, r6, r7 @ r6 = tmp2
add r5, r0, r4 @ r5 = tmp0
mla r2, r3, r2, r7 @ r2 = tmp3
sub r3, r0, r4 @ r3 = tmp1
ldr r3, [r11, #FIX_0_541196100_ID]
add r7, r2, r6
ldr r5, [r11, #FIX_M_1_847759065_ID]
mul r7, r3, r7 @ r7 = z1
ldr r3, [r11, #FIX_0_765366865_ID]
mla r6, r5, r6, r7 @ r6 = tmp2
add r5, r0, r4 @ r5 = tmp0
mla r2, r3, r2, r7 @ r2 = tmp3
sub r3, r0, r4 @ r3 = tmp1
add r0, r2, r5, lsl #13 @ r0 = tmp10
rsb r2, r2, r5, lsl #13 @ r2 = tmp13
add r4, r6, r3, lsl #13 @ r4 = tmp11
rsb r3, r6, r3, lsl #13 @ r3 = tmp12
add r0, r2, r5, lsl #13 @ r0 = tmp10
rsb r2, r2, r5, lsl #13 @ r2 = tmp13
add r4, r6, r3, lsl #13 @ r4 = tmp11
rsb r3, r6, r3, lsl #13 @ r3 = tmp12
push {r0, r2, r3, r4} @ save on the stack tmp10, tmp13, tmp12, tmp11
push {r0, r2, r3, r4} @ save on the stack tmp10, tmp13, tmp12, tmp11
ldrsh r3, [lr, #10] @ r3 = 'd3'
ldrsh r5, [lr, #12] @ r5 = 'd5'
ldrsh r7, [lr, #14] @ r7 = 'd7'
ldrsh r3, [lr, #10] @ r3 = 'd3'
ldrsh r5, [lr, #12] @ r5 = 'd5'
ldrsh r7, [lr, #14] @ r7 = 'd7'
add r0, r3, r5 @ r0 = 'z2'
add r2, r1, r7 @ r2 = 'z1'
add r4, r3, r7 @ r4 = 'z3'
add r6, r1, r5 @ r6 = 'z4'
ldr r9, [r11, #FIX_1_175875602_ID]
add r8, r4, r6 @ r8 = z3 + z4
ldr r10, [r11, #FIX_M_0_899976223_ID]
mul r8, r9, r8 @ r8 = 'z5'
ldr r9, [r11, #FIX_M_2_562915447_ID]
mul r2, r10, r2 @ r2 = 'z1'
ldr r10, [r11, #FIX_M_1_961570560_ID]
mul r0, r9, r0 @ r0 = 'z2'
ldr r9, [r11, #FIX_M_0_390180644_ID]
mla r4, r10, r4, r8 @ r4 = 'z3'
ldr r10, [r11, #FIX_0_298631336_ID]
mla r6, r9, r6, r8 @ r6 = 'z4'
ldr r9, [r11, #FIX_2_053119869_ID]
mla r7, r10, r7, r2 @ r7 = tmp0 + z1
ldr r10, [r11, #FIX_3_072711026_ID]
mla r5, r9, r5, r0 @ r5 = tmp1 + z2
ldr r9, [r11, #FIX_1_501321110_ID]
mla r3, r10, r3, r0 @ r3 = tmp2 + z2
add r7, r7, r4 @ r7 = tmp0
mla r1, r9, r1, r2 @ r1 = tmp3 + z1
add r5, r5, r6 @ r5 = tmp1
add r3, r3, r4 @ r3 = tmp2
add r1, r1, r6 @ r1 = tmp3
add r0, r3, r5 @ r0 = 'z2'
add r2, r1, r7 @ r2 = 'z1'
add r4, r3, r7 @ r4 = 'z3'
add r6, r1, r5 @ r6 = 'z4'
ldr r9, [r11, #FIX_1_175875602_ID]
add r8, r4, r6 @ r8 = z3 + z4
ldr r10, [r11, #FIX_M_0_899976223_ID]
mul r8, r9, r8 @ r8 = 'z5'
ldr r9, [r11, #FIX_M_2_562915447_ID]
mul r2, r10, r2 @ r2 = 'z1'
ldr r10, [r11, #FIX_M_1_961570560_ID]
mul r0, r9, r0 @ r0 = 'z2'
ldr r9, [r11, #FIX_M_0_390180644_ID]
mla r4, r10, r4, r8 @ r4 = 'z3'
ldr r10, [r11, #FIX_0_298631336_ID]
mla r6, r9, r6, r8 @ r6 = 'z4'
ldr r9, [r11, #FIX_2_053119869_ID]
mla r7, r10, r7, r2 @ r7 = tmp0 + z1
ldr r10, [r11, #FIX_3_072711026_ID]
mla r5, r9, r5, r0 @ r5 = tmp1 + z2
ldr r9, [r11, #FIX_1_501321110_ID]
mla r3, r10, r3, r0 @ r3 = tmp2 + z2
add r7, r7, r4 @ r7 = tmp0
mla r1, r9, r1, r2 @ r1 = tmp3 + z1
add r5, r5, r6 @ r5 = tmp1
add r3, r3, r4 @ r3 = tmp2
add r1, r1, r6 @ r1 = tmp3
pop {r0, r2, r4, r6} @ r0 = tmp10 / r2 = tmp13 / r4 = tmp12 / r6 = tmp11
@ r1 = tmp3 / r3 = tmp2 / r5 = tmp1 / r7 = tmp0
pop {r0, r2, r4, r6} @ r0 = tmp10 / r2 = tmp13 / r4 = tmp12 / r6 = tmp11
@ r1 = tmp3 / r3 = tmp2 / r5 = tmp1 / r7 = tmp0
@ Compute DESCALE(tmp10 + tmp3, CONST_BITS-PASS1_BITS)
add r8, r0, r1
add r8, r8, #(1<<10)
mov r8, r8, asr #11
strh r8, [lr, # 0]
add r8, r0, r1
add r8, r8, #(1<<10)
mov r8, r8, asr #11
strh r8, [lr, # 0]
@ Compute DESCALE(tmp10 - tmp3, CONST_BITS-PASS1_BITS)
sub r8, r0, r1
add r8, r8, #(1<<10)
mov r8, r8, asr #11
strh r8, [lr, #14]
sub r8, r0, r1
add r8, r8, #(1<<10)
mov r8, r8, asr #11
strh r8, [lr, #14]
@ Compute DESCALE(tmp11 + tmp2, CONST_BITS-PASS1_BITS)
add r8, r6, r3
add r8, r8, #(1<<10)
mov r8, r8, asr #11
strh r8, [lr, # 2]
add r8, r6, r3
add r8, r8, #(1<<10)
mov r8, r8, asr #11
strh r8, [lr, # 2]
@ Compute DESCALE(tmp11 - tmp2, CONST_BITS-PASS1_BITS)
sub r8, r6, r3
add r8, r8, #(1<<10)
mov r8, r8, asr #11
strh r8, [lr, #12]
sub r8, r6, r3
add r8, r8, #(1<<10)
mov r8, r8, asr #11
strh r8, [lr, #12]
@ Compute DESCALE(tmp12 + tmp1, CONST_BITS-PASS1_BITS)
add r8, r4, r5
add r8, r8, #(1<<10)
mov r8, r8, asr #11
strh r8, [lr, # 4]
add r8, r4, r5
add r8, r8, #(1<<10)
mov r8, r8, asr #11
strh r8, [lr, # 4]
@ Compute DESCALE(tmp12 - tmp1, CONST_BITS-PASS1_BITS)
sub r8, r4, r5
add r8, r8, #(1<<10)
mov r8, r8, asr #11
strh r8, [lr, #10]
sub r8, r4, r5
add r8, r8, #(1<<10)
mov r8, r8, asr #11
strh r8, [lr, #10]
@ Compute DESCALE(tmp13 + tmp0, CONST_BITS-PASS1_BITS)
add r8, r2, r7
add r8, r8, #(1<<10)
mov r8, r8, asr #11
strh r8, [lr, # 6]
add r8, r2, r7
add r8, r8, #(1<<10)
mov r8, r8, asr #11
strh r8, [lr, # 6]
@ Compute DESCALE(tmp13 - tmp0, CONST_BITS-PASS1_BITS)
sub r8, r2, r7
add r8, r8, #(1<<10)
mov r8, r8, asr #11
strh r8, [lr, # 8]
sub r8, r2, r7
add r8, r8, #(1<<10)
mov r8, r8, asr #11
strh r8, [lr, # 8]
@ End of row loop
add lr, lr, #16
subs r12, r12, #1
bne row_loop
beq start_column_loop
add lr, lr, #16
subs r12, r12, #1
bne row_loop
beq start_column_loop
empty_row:
ldr r1, [r11, #FIX_0xFFFF_ID]
mov r0, r0, lsl #2
and r0, r0, r1
add r0, r0, r0, lsl #16
str r0, [lr, # 0]
str r0, [lr, # 4]
str r0, [lr, # 8]
str r0, [lr, #12]
ldr r1, [r11, #FIX_0xFFFF_ID]
mov r0, r0, lsl #2
and r0, r0, r1
add r0, r0, r0, lsl #16
str r0, [lr, # 0]
str r0, [lr, # 4]
str r0, [lr, # 8]
str r0, [lr, #12]
end_of_row_loop:
@ End of loop
add lr, lr, #16
subs r12, r12, #1
bne row_loop
add lr, lr, #16
subs r12, r12, #1
bne row_loop
start_column_loop:
@ Start of column loop
pop {lr}
mov r12, #8
pop {lr}
mov r12, #8
column_loop:
ldrsh r0, [lr, #( 0*8)] @ r0 = 'd0'
ldrsh r2, [lr, #( 4*8)] @ r2 = 'd2'
ldrsh r4, [lr, #( 8*8)] @ r4 = 'd4'
ldrsh r6, [lr, #(12*8)] @ r6 = 'd6'
ldrsh r0, [lr, #( 0*8)] @ r0 = 'd0'
ldrsh r2, [lr, #( 4*8)] @ r2 = 'd2'
ldrsh r4, [lr, #( 8*8)] @ r4 = 'd4'
ldrsh r6, [lr, #(12*8)] @ r6 = 'd6'
ldr r3, [r11, #FIX_0_541196100_ID]
add r1, r2, r6
ldr r5, [r11, #FIX_M_1_847759065_ID]
mul r1, r3, r1 @ r1 = z1
ldr r3, [r11, #FIX_0_765366865_ID]
mla r6, r5, r6, r1 @ r6 = tmp2
add r5, r0, r4 @ r5 = tmp0
mla r2, r3, r2, r1 @ r2 = tmp3
sub r3, r0, r4 @ r3 = tmp1
ldr r3, [r11, #FIX_0_541196100_ID]
add r1, r2, r6
ldr r5, [r11, #FIX_M_1_847759065_ID]
mul r1, r3, r1 @ r1 = z1
ldr r3, [r11, #FIX_0_765366865_ID]
mla r6, r5, r6, r1 @ r6 = tmp2
add r5, r0, r4 @ r5 = tmp0
mla r2, r3, r2, r1 @ r2 = tmp3
sub r3, r0, r4 @ r3 = tmp1
add r0, r2, r5, lsl #13 @ r0 = tmp10
rsb r2, r2, r5, lsl #13 @ r2 = tmp13
add r4, r6, r3, lsl #13 @ r4 = tmp11
rsb r6, r6, r3, lsl #13 @ r6 = tmp12
add r0, r2, r5, lsl #13 @ r0 = tmp10
rsb r2, r2, r5, lsl #13 @ r2 = tmp13
add r4, r6, r3, lsl #13 @ r4 = tmp11
rsb r6, r6, r3, lsl #13 @ r6 = tmp12
ldrsh r1, [lr, #( 2*8)] @ r1 = 'd1'
ldrsh r3, [lr, #( 6*8)] @ r3 = 'd3'
ldrsh r5, [lr, #(10*8)] @ r5 = 'd5'
ldrsh r7, [lr, #(14*8)] @ r7 = 'd7'
ldrsh r1, [lr, #( 2*8)] @ r1 = 'd1'
ldrsh r3, [lr, #( 6*8)] @ r3 = 'd3'
ldrsh r5, [lr, #(10*8)] @ r5 = 'd5'
ldrsh r7, [lr, #(14*8)] @ r7 = 'd7'
@ Check for empty odd column (happens about 20 to 25 % of the time according to my stats)
orr r9, r1, r3
orr r10, r5, r7
orrs r10, r9, r10
beq empty_odd_column
orr r9, r1, r3
orr r10, r5, r7
orrs r10, r9, r10
beq empty_odd_column
push {r0, r2, r4, r6} @ save on the stack tmp10, tmp13, tmp12, tmp11
push {r0, r2, r4, r6} @ save on the stack tmp10, tmp13, tmp12, tmp11
add r0, r3, r5 @ r0 = 'z2'
add r2, r1, r7 @ r2 = 'z1'
add r4, r3, r7 @ r4 = 'z3'
add r6, r1, r5 @ r6 = 'z4'
ldr r9, [r11, #FIX_1_175875602_ID]
add r8, r4, r6
ldr r10, [r11, #FIX_M_0_899976223_ID]
mul r8, r9, r8 @ r8 = 'z5'
ldr r9, [r11, #FIX_M_2_562915447_ID]
mul r2, r10, r2 @ r2 = 'z1'
ldr r10, [r11, #FIX_M_1_961570560_ID]
mul r0, r9, r0 @ r0 = 'z2'
ldr r9, [r11, #FIX_M_0_390180644_ID]
mla r4, r10, r4, r8 @ r4 = 'z3'
ldr r10, [r11, #FIX_0_298631336_ID]
mla r6, r9, r6, r8 @ r6 = 'z4'
ldr r9, [r11, #FIX_2_053119869_ID]
mla r7, r10, r7, r2 @ r7 = tmp0 + z1
ldr r10, [r11, #FIX_3_072711026_ID]
mla r5, r9, r5, r0 @ r5 = tmp1 + z2
ldr r9, [r11, #FIX_1_501321110_ID]
mla r3, r10, r3, r0 @ r3 = tmp2 + z2
add r7, r7, r4 @ r7 = tmp0
mla r1, r9, r1, r2 @ r1 = tmp3 + z1
add r5, r5, r6 @ r5 = tmp1
add r3, r3, r4 @ r3 = tmp2
add r1, r1, r6 @ r1 = tmp3
add r0, r3, r5 @ r0 = 'z2'
add r2, r1, r7 @ r2 = 'z1'
add r4, r3, r7 @ r4 = 'z3'
add r6, r1, r5 @ r6 = 'z4'
ldr r9, [r11, #FIX_1_175875602_ID]
add r8, r4, r6
ldr r10, [r11, #FIX_M_0_899976223_ID]
mul r8, r9, r8 @ r8 = 'z5'
ldr r9, [r11, #FIX_M_2_562915447_ID]
mul r2, r10, r2 @ r2 = 'z1'
ldr r10, [r11, #FIX_M_1_961570560_ID]
mul r0, r9, r0 @ r0 = 'z2'
ldr r9, [r11, #FIX_M_0_390180644_ID]
mla r4, r10, r4, r8 @ r4 = 'z3'
ldr r10, [r11, #FIX_0_298631336_ID]
mla r6, r9, r6, r8 @ r6 = 'z4'
ldr r9, [r11, #FIX_2_053119869_ID]
mla r7, r10, r7, r2 @ r7 = tmp0 + z1
ldr r10, [r11, #FIX_3_072711026_ID]
mla r5, r9, r5, r0 @ r5 = tmp1 + z2
ldr r9, [r11, #FIX_1_501321110_ID]
mla r3, r10, r3, r0 @ r3 = tmp2 + z2
add r7, r7, r4 @ r7 = tmp0
mla r1, r9, r1, r2 @ r1 = tmp3 + z1
add r5, r5, r6 @ r5 = tmp1
add r3, r3, r4 @ r3 = tmp2
add r1, r1, r6 @ r1 = tmp3
pop {r0, r2, r4, r6} @ r0 = tmp10 / r2 = tmp13 / r4 = tmp11 / r6 = tmp12
@ r1 = tmp3 / r3 = tmp2 / r5 = tmp1 / r7 = tmp0
pop {r0, r2, r4, r6} @ r0 = tmp10 / r2 = tmp13 / r4 = tmp11 / r6 = tmp12
@ r1 = tmp3 / r3 = tmp2 / r5 = tmp1 / r7 = tmp0
@ Compute DESCALE(tmp10 + tmp3, CONST_BITS+PASS1_BITS+3)
add r8, r0, r1
add r8, r8, #(1<<17)
mov r8, r8, asr #18
strh r8, [lr, #( 0*8)]
add r8, r0, r1
add r8, r8, #(1<<17)
mov r8, r8, asr #18
strh r8, [lr, #( 0*8)]
@ Compute DESCALE(tmp10 - tmp3, CONST_BITS+PASS1_BITS+3)
sub r8, r0, r1
add r8, r8, #(1<<17)
mov r8, r8, asr #18
strh r8, [lr, #(14*8)]
sub r8, r0, r1
add r8, r8, #(1<<17)
mov r8, r8, asr #18
strh r8, [lr, #(14*8)]
@ Compute DESCALE(tmp11 + tmp2, CONST_BITS+PASS1_BITS+3)
add r8, r4, r3
add r8, r8, #(1<<17)
mov r8, r8, asr #18
strh r8, [lr, #( 2*8)]
add r8, r4, r3
add r8, r8, #(1<<17)
mov r8, r8, asr #18
strh r8, [lr, #( 2*8)]
@ Compute DESCALE(tmp11 - tmp2, CONST_BITS+PASS1_BITS+3)
sub r8, r4, r3
add r8, r8, #(1<<17)
mov r8, r8, asr #18
strh r8, [lr, #(12*8)]
sub r8, r4, r3
add r8, r8, #(1<<17)
mov r8, r8, asr #18
strh r8, [lr, #(12*8)]
@ Compute DESCALE(tmp12 + tmp1, CONST_BITS+PASS1_BITS+3)
add r8, r6, r5
add r8, r8, #(1<<17)
mov r8, r8, asr #18
strh r8, [lr, #( 4*8)]
add r8, r6, r5
add r8, r8, #(1<<17)
mov r8, r8, asr #18
strh r8, [lr, #( 4*8)]
@ Compute DESCALE(tmp12 - tmp1, CONST_BITS+PASS1_BITS+3)
sub r8, r6, r5
add r8, r8, #(1<<17)
mov r8, r8, asr #18
strh r8, [lr, #(10*8)]
sub r8, r6, r5
add r8, r8, #(1<<17)
mov r8, r8, asr #18
strh r8, [lr, #(10*8)]
@ Compute DESCALE(tmp13 + tmp0, CONST_BITS+PASS1_BITS+3)
add r8, r2, r7
add r8, r8, #(1<<17)
mov r8, r8, asr #18
strh r8, [lr, #( 6*8)]
add r8, r2, r7
add r8, r8, #(1<<17)
mov r8, r8, asr #18
strh r8, [lr, #( 6*8)]
@ Compute DESCALE(tmp13 - tmp0, CONST_BITS+PASS1_BITS+3)
sub r8, r2, r7
add r8, r8, #(1<<17)
mov r8, r8, asr #18
strh r8, [lr, #( 8*8)]
sub r8, r2, r7
add r8, r8, #(1<<17)
mov r8, r8, asr #18
strh r8, [lr, #( 8*8)]
@ End of row loop
add lr, lr, #2
subs r12, r12, #1
bne column_loop
beq the_end
add lr, lr, #2
subs r12, r12, #1
bne column_loop
beq the_end
empty_odd_column:
@ Compute DESCALE(tmp10 + tmp3, CONST_BITS+PASS1_BITS+3)
@ Compute DESCALE(tmp10 - tmp3, CONST_BITS+PASS1_BITS+3)
add r0, r0, #(1<<17)
mov r0, r0, asr #18
strh r0, [lr, #( 0*8)]
strh r0, [lr, #(14*8)]
add r0, r0, #(1<<17)
mov r0, r0, asr #18
strh r0, [lr, #( 0*8)]
strh r0, [lr, #(14*8)]
@ Compute DESCALE(tmp11 + tmp2, CONST_BITS+PASS1_BITS+3)
@ Compute DESCALE(tmp11 - tmp2, CONST_BITS+PASS1_BITS+3)
add r4, r4, #(1<<17)
mov r4, r4, asr #18
strh r4, [lr, #( 2*8)]
strh r4, [lr, #(12*8)]
add r4, r4, #(1<<17)
mov r4, r4, asr #18
strh r4, [lr, #( 2*8)]
strh r4, [lr, #(12*8)]
@ Compute DESCALE(tmp12 + tmp1, CONST_BITS+PASS1_BITS+3)
@ Compute DESCALE(tmp12 - tmp1, CONST_BITS+PASS1_BITS+3)
add r6, r6, #(1<<17)
mov r6, r6, asr #18
strh r6, [lr, #( 4*8)]
strh r6, [lr, #(10*8)]
add r6, r6, #(1<<17)
mov r6, r6, asr #18
strh r6, [lr, #( 4*8)]
strh r6, [lr, #(10*8)]
@ Compute DESCALE(tmp13 + tmp0, CONST_BITS+PASS1_BITS+3)
@ Compute DESCALE(tmp13 - tmp0, CONST_BITS+PASS1_BITS+3)
add r2, r2, #(1<<17)
mov r2, r2, asr #18
strh r2, [lr, #( 6*8)]
strh r2, [lr, #( 8*8)]
add r2, r2, #(1<<17)
mov r2, r2, asr #18
strh r2, [lr, #( 6*8)]
strh r2, [lr, #( 8*8)]
@ End of row loop
add lr, lr, #2
subs r12, r12, #1
bne column_loop
add lr, lr, #2
subs r12, r12, #1
bne column_loop
the_end:
@ The end....
pop {r4 - r11, pc}
pop {r4 - r11, pc}
endfunc
const const_array
+2 -2
@@ -70,7 +70,7 @@
/* void rv34_idct_add_c(uint8_t *dst, int stride, int16_t *block) */
function ff_rv34_idct_add_neon, export=1
mov r3, r0
- rv34_inv_transform r2
+ rv34_inv_transform r2
vmov.i16 q12, #0
vrshrn.s32 d16, q1, #10 @ (z0 + z3) >> 10
vrshrn.s32 d17, q2, #10 @ (z1 + z2) >> 10
@@ -99,7 +99,7 @@ endfunc
/* void rv34_inv_transform_noround_neon(int16_t *block); */
function ff_rv34_inv_transform_noround_neon, export=1
- rv34_inv_transform r0
+ rv34_inv_transform r0
vshl.s32 q11, q2, #1
vshl.s32 q10, q1, #1
vshl.s32 q12, q3, #1
+2 -2
@@ -687,7 +687,7 @@ endfunc
.endm
/* void ff_rv40_weight_func_16_neon(uint8_t *dst, uint8_t *src1, uint8_t *src2,
- int w1, int w2, int stride) */
+ * int w1, int w2, int stride) */
function ff_rv40_weight_func_16_neon, export=1
ldr r12, [sp]
vmov d0, r3, r12
@@ -704,7 +704,7 @@ function ff_rv40_weight_func_16_neon, export=1
endfunc
/* void ff_rv40_weight_func_8_neon(uint8_t *dst, uint8_t *src1, uint8_t *src2,
- int w1, int w2, int stride) */
+ * int w1, int w2, int stride) */
function ff_rv40_weight_func_8_neon, export=1
ldr r12, [sp]
vmov d0, r3, r12

Some files were not shown because too many files have changed in this diff.