diff --git a/_posts/2019/2019-03-10-more-random-satisfactory-stuff.html b/_posts/2019/2019-03-10-more-random-satisfactory-stuff.html index 27bfd43..1b4e5f7 100644 --- a/_posts/2019/2019-03-10-more-random-satisfactory-stuff.html +++ b/_posts/2019/2019-03-10-more-random-satisfactory-stuff.html @@ -14,7 +14,7 @@ tags: [Satisfactory, 'WordPress Archive']

In my entire play time, I've found a total of three Purple Power Slugs, and only two of them are within an Area that I can actually reach.

-
+

Host and Client have different Mechanics

@@ -26,13 +26,13 @@ tags: [Satisfactory, 'WordPress Archive']

You may not initially want to do it, but once you reach Stackable Poles, vertical building becomes your friend. You can stuff a lot of things into a vertical block that wouldn't fit as a flat plane anywhere, and it makes things a ton easier if you actually do move things into a vertical layout. Just make sure you give yourself enough room to work.

-
One of the layers of my Iron production line
+
One of the layers of my Iron production line

Dropped Items can be a Temporary Platform

If you just need to get there quickly, Leaves and other Items can be dropped in small stacks to create a platform that you can collide with and stand on. Use this to quickly get up somewhere without wasting a ton of cement in the process. This only works if there is no ground in range vertically.

-
Leaves used as a temporary wasteless bridge.
+
Leaves used as a temporary wasteless bridge.

Work-In-Progress Mercer Spheres

diff --git a/_posts/2019/2019-03-10-random-stuff-about-satisfactory.html b/_posts/2019/2019-03-10-random-stuff-about-satisfactory.html index 7065d33..225ed11 100644 --- a/_posts/2019/2019-03-10-random-stuff-about-satisfactory.html +++ b/_posts/2019/2019-03-10-random-stuff-about-satisfactory.html @@ -18,13 +18,13 @@ tags: [Satisfactory, 'WordPress Archive']

Not exactly as groundbreaking as most things, but this allows lining up things to the actual resource node better.

-
Just mining my Foundation.
+
Just mining my Foundation.

Balance your Conveyor Lines

If you have more than a single input conveyor, such as when you're mining ore and two, three or even four miners are required, you should consider building a balancer. Balancers distribute the input lines evenly across all output lines, for example balancing three inputs so that all three outputs are fed equally.

-
3 Input, 3 Output Ore Balancer
+
3 Input, 3 Output Ore Balancer

Balancers can be built in any shape, size or configuration. I've personally only needed 3-In-3-Out balancers so far, but configurations like 2-In-6-Out, 6-In-2-Out, 6-In-6-Out, 5-In-5-Out and so on should all be possible, as the Splitters and Mergers are much more intelligent than the ones in Factorio.
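
As a rough sketch of what an ideal balancer does to throughput: the code below assumes the balancer merges all inputs and splits the total evenly, which is a simplification and not a model of how Satisfactory's Splitters and Mergers actually behave. The function name and the item rates are made up for illustration.

```python
from fractions import Fraction


def balance(input_rates, output_count):
    """Return per-output rates for an idealized balancer.

    Assumption: the balancer is lossless, merges every input and splits
    the total evenly. Belt speed limits are ignored.
    """
    if output_count <= 0:
        raise ValueError("output_count must be positive")
    total = sum(Fraction(rate) for rate in input_rates)
    return [total / output_count] * output_count


# Three miners with uneven yields in items per minute (made-up numbers).
rates = balance([30, 60, 120], 3)
print([float(rate) for rate in rates])
```

Without the balancer, the three lines would carry 30, 60 and 120 items per minute; with it, each output line carries the same share of the 210 total.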

@@ -36,4 +36,4 @@ tags: [Satisfactory, 'WordPress Archive']

It's never too early to optimize your production lines, and it only gets easier the more materials you've saved up. Build Balancers, build huge production facilities for even more inputs, just really go all out and build for 10 times what you actually need right now.

-
+
diff --git a/_posts/2019/2019-04-22-i-have-a-garden-and-a-camera.html b/_posts/2019/2019-04-22-i-have-a-garden-and-a-camera.html index 764bdfc..e9411b8 100644 --- a/_posts/2019/2019-04-22-i-have-a-garden-and-a-camera.html +++ b/_posts/2019/2019-04-22-i-have-a-garden-and-a-camera.html @@ -6,4 +6,4 @@ tags: ['WordPress Archive']

Started cleaning up my garden again as the ground isn't frozen anymore and I can actually do work on everything again. So I took my Camera with me, and made some close (really still 80cm away) shots of things. I'm looking forward to owning an actual Macro lens for this purpose.

-
+
diff --git a/_posts/2019/2019-09-04-adventures-in-modding-relegend.html b/_posts/2019/2019-09-04-adventures-in-modding-relegend.html index 987b4f1..c592efe 100644 --- a/_posts/2019/2019-09-04-adventures-in-modding-relegend.html +++ b/_posts/2019/2019-09-04-adventures-in-modding-relegend.html @@ -11,7 +11,7 @@ tags: ['Re:Legend','Modding', 'WordPress Archive']

As the game had no visible modding API, the only thing I could do was to figure out which tools I needed for injecting new code. And as the game is made with Unity, the same engine that Risk of Rain 2 runs on, I figured I could reuse the toolset from there.

-
Knowing which .Net SDK to use is half the battle.
+
Knowing which .Net SDK to use is half the battle.

All I needed to figure out was which .Net version the game was compiled against, and there was a handy tool for it: Detect It Easy. Loading the Assembly-CSharp.dll into it revealed that the game was built against .Net 4.0, which means that I needed a BepInEx compatible with Unity 2017 and newer.

@@ -22,7 +22,7 @@ tags: ['Re:Legend','Modding', 'WordPress Archive']

If you aren't familiar with Unity, the binary Assembly-CSharp.dll holds the compiled and usually optimized game code written in C#, and usually also contains most dependencies that don't have their own .dll file. So what we need is a disassembly tool that can deal with optimized compiled code and give back a reasonable representation of the actual code.

-
ILSpy in action.
+
ILSpy in action.

After a bit of searching for a C# disassembler I decided to use ILSpy, which performed the task better than most other tools and did not crash no matter what I threw at it. Loading Assembly-CSharp.dll into it gave me a near instant disassembled version of the game code, although mostly lacking comments since those get removed.

@@ -35,7 +35,7 @@ tags: ['Re:Legend','Modding', 'WordPress Archive']

A little while later after delving into the disassembled code, I found the place that I needed to modify: global::TutorialManager::StartDisplayTutorial. The functionality that it currently had was to enable the tutorial panel, disable quitting the game, stop all player interaction, search and initialize the tutorial, unlock the tutorial and then finally set up the UI and display it. We don't want it to do anything but unlock the tutorial.

-
Skipping Tutorials made easy - they don't even show up now!
+
Skipping Tutorials made easy - they don't even show up now!

As TutorialManager's UnlockTutorial was private, but the unlockTutorials List was not, the override was as simple as calling SearchTutorial, then adding the tutorial to the unlockTutorials List, and finally calling EndTutorial().

diff --git a/_posts/2019/2019-12-22-the-path-to-streamfx-0-8-0.html b/_posts/2019/2019-12-22-the-path-to-streamfx-0-8-0.html index 52a65a2..364c0c2 100644 --- a/_posts/2019/2019-12-22-the-path-to-streamfx-0-8-0.html +++ b/_posts/2019/2019-12-22-the-path-to-streamfx-0-8-0.html @@ -65,13 +65,13 @@ tags: ['StreamFX', 'WordPress Archive']
-
With Color Grading
+
With Color Grading
-
Without Color Grading
+
Without Color Grading
diff --git a/_posts/2020/2020-01-23-stepping-down-as-the-maintainer-for-the-amd-encoder-plugin.html b/_posts/2020/2020-01-23-stepping-down-as-the-maintainer-for-the-amd-encoder-plugin.html index 3a62639..cddc262 100644 --- a/_posts/2020/2020-01-23-stepping-down-as-the-maintainer-for-the-amd-encoder-plugin.html +++ b/_posts/2020/2020-01-23-stepping-down-as-the-maintainer-for-the-amd-encoder-plugin.html @@ -1,36 +1,18 @@ --- title: 'Stepping Down as the Maintainer for the AMD Encoder Plugin' category: Blog -tags: [AMD, AMF, OBS] -unpublished: true +tags: [AMD, AMF, OBS, OBS Studio] +published: false --- -
-
-

I'm stepping down from the maintainer position for the AMD Encoder, and am removing myself from the OBS Project team permanently.

- +

I'm stepping down from the maintainer position for the AMD Encoder, and am removing myself from the OBS Project team permanently.

- - - +

This is my official notice that I am stepping down from the maintainer position for the AMD Encoder, as I have had several disagreements with the (only?) maintainer Jim. I would also request to be removed from the OBS Project team immediately. The integrated AMD Encoder plugin is now lacking a maintainer, and good luck to whoever picks it up again.

- -

This is my official notice that I am stepping down from the maintainer position for the AMD Encoder, as I have had several disagreements with the (only?) maintainer Jim. I would also request to be removed from the OBS Project team immediately. The integrated AMD Encoder plugin is now lacking a maintainer, and good luck to whoever picks it up again.

- +

There are various reasons for this, but the biggest one is the repeated disagreements. In every disagreement I was not treated as an equal, but as someone who didn't know any better and should just be an obedient underling. This way of interacting with one another is simply not acceptable, and therefore I have no further interest in working together with the OBS Project team.

- -

There are various reasons for this, but the biggest one is the repeated disagreements. In every disagreement I was not treated as an equal, but as someone who didn't know any better and should just be an obedient underling. This way of interacting with one another is simply not acceptable, and therefore I have no further interest in working together with the OBS Project team.

- +

Furthermore there is the lack of clarity and transparency for the OBS project. As there is no roadmap, there is no way to tell what is necessary for a new version, and decisions feel arbitrary due to that. What gets in and what doesn't seems to be up to the preference of someone instead of being planned out from the start. And despite repeated claims to improve this, nothing has happened over the span of two years. Not only that, but several much requested features are just pushed away as niche instantly instead of actually considering them seriously and putting them in a roadmap.

- -

Furthermore there is the lack of clarity and transparency for the OBS project. As there is no roadmap, there is no way to tell what is necessary for a new version, and decisions feel arbitrary due to that. What gets in and what doesn't seems to be up to the preference of someone instead of being planned out from the start. And despite repeated claims to improve this, nothing has happened over the span of two years. Not only that, but several much requested features are just pushed away as niche instantly instead of actually considering them seriously and putting them in a roadmap.

- +

Overall, the state of OBS Project does not sit well with me and I no longer wish to be associated with it. This does not mean that I will no longer develop things for OBS Studio/libobs, just that I no longer want to be associated with the OBS Project team at all. obs-StreamFX and other secret plugins I've developed for clients will stay around until I find another thing to work on.

- -

Overall, the state of OBS Project does not sit well with me and I no longer wish to be associated with it. This does not mean that I will no longer develop things for OBS Studio/libobs, just that I no longer want to be associated with the OBS Project team at all. obs-StreamFX and other secret plugins I've developed for clients will stay around until I find another thing to work on.

- - - -

Read the original on GitHub.

- -
+

Read the original on GitHub.

diff --git a/_posts/2020/2020-02-04-a-now-playing-overlay-using-last-fm.html b/_posts/2020/2020-02-04-a-now-playing-overlay-using-last-fm.html index 52f5244..f94492c 100644 --- a/_posts/2020/2020-02-04-a-now-playing-overlay-using-last-fm.html +++ b/_posts/2020/2020-02-04-a-now-playing-overlay-using-last-fm.html @@ -4,19 +4,18 @@ category: Blog tags: [OBS, Last.FM, AIMP, Spotify, 'WordPress Archive'] --- -

I wanted to upgrade my streaming setup slightly, and while watching other streamers, I noticed that some have added a "Now Playing" overlay. For the most part it's either embedded in a static overlay as text, or just free floating text. But that isn't enough for me.

- - -
Last.FM Now Playing Overlay
Last.FM Now Playing Overlay
- - - +

+

+ Last.FM Now Playing Overlay +
Last.FM Now Playing Overlay
+
I wanted to upgrade my streaming setup slightly, and while watching other streamers, I noticed that some have added a "Now Playing" overlay. For the most part it's either embedded in a static overlay as text, or just free floating text. But that isn't enough for me.

The biggest difference to the usual "Now Playing" overlays is that it is animated. Instead of simply changing text, it slides out when no song is playing, slides in when a song starts playing, and flips to reveal a track change. Not only that, but it also shows track art if there is any. Just take a look at it in action to see what it can do:

- -
The "Now Playing" overlay in action.
- +
+ +
The "Now Playing" overlay in action.
+

To support most streaming software (and drastically reduce the work I had to do), I compressed it into a single HTML file that contains the necessary CSS and JavaScript. This also enables you to just download it, but keep in mind that you will not receive any future updates if you do so. You can also customize the style completely by either using the CSS override function that your streaming software provides, or by editing the HTML file.

@@ -36,7 +35,16 @@ tags: [OBS, Last.FM, AIMP, Spotify, 'WordPress Archive']

As the entire thing is an HTML page, you can easily change the design of the overlay. Here are the CSS classes you can modify to customize the style:

- +

Note that the CSS modifying hooks that OBS Studio and Streamlabs OBS provide run after all other JavaScript code has run, so you can't override the zoom with that. In the case that you want to change the sizes of elements beyond what is already there, consider downloading it and modifying the source files.

diff --git a/_posts/2020/2020-04-02-whats-coming-to-streamfx-0-8-0.html b/_posts/2020/2020-04-02-whats-coming-to-streamfx-0-8-0.html index 736dd04..df66ae1 100644 --- a/_posts/2020/2020-04-02-whats-coming-to-streamfx-0-8-0.html +++ b/_posts/2020/2020-04-02-whats-coming-to-streamfx-0-8-0.html @@ -18,9 +18,20 @@ tags: [StreamFX, 'WordPress Archive']

For example, in my livestream on 28th March 2020 I created various example shaders for filters and transitions, such as a retro pixelate transition, a CRT curvature and scanline filter, or a simple luma based transition instead of the classic fade. With HLSL any of these effects are no longer out of reach and easily done, if you have the skills and patience to pull it off.

- - - + + +
+ +
CRT Shaders without Bleeding
+
+
+ +
+ +
CRT Shaders with Bleeding
+
+
+

The current implementation allows you to do any effect that does not require extra textures, as these are not yet supported. A future update will hopefully be able to add support for textures, including FFTs for source audio, so that you can create even more impressive shaders. What exactly is possible still remains to be seen, as I've hit various limitations of the OBS Studio effect parser, including undiscovered crashes in OBS.

diff --git a/_posts/2020/2020-04-04-how-to-use-the-new-nvidia-face-tracking-filter-in-streamfx.html b/_posts/2020/2020-04-04-how-to-use-the-new-nvidia-face-tracking-filter-in-streamfx.html index ff99918..1d2d0e0 100644 --- a/_posts/2020/2020-04-04-how-to-use-the-new-nvidia-face-tracking-filter-in-streamfx.html +++ b/_posts/2020/2020-04-04-how-to-use-the-new-nvidia-face-tracking-filter-in-streamfx.html @@ -10,7 +10,18 @@ tags: [Tutorial, StreamFX, 'WordPress Archive']

In simple terms it keeps the Region of Interest, which is usually the face of a content creator, always close to the center of the frame. Cropping and zoom are handled automatically and adjusted live while you move around. Get closer to the camera and it zooms out, go further away and it zooms in. Just compare the images below to see what it can do automatically:

-
Without Nvidia Face Tracking Filter
With Nvidia Face Tracking Filter
+ + +
+
Without Nvidia Face Tracking Filter
+
+
+ +
+
With Nvidia Face Tracking Filter
+
+
+

Effectively this filter simulates what it's like to have a professional camera operator following you on stage, always following where the action currently is. And the best thing is: it's free, and available now!

@@ -18,7 +29,9 @@ tags: [Tutorial, StreamFX, 'WordPress Archive']

If you like moving around a lot, a manually cropped area only gets you so far. It's very restrictive, as you have a predetermined region of interest, and adjusting it takes a while to do. And that's exactly where this filter comes in.

-
Nvidia Face Tracking Filter at work on a Logitech Brio 4K
+
+
Nvidia Face Tracking Filter at work on a Logitech Brio 4K
+

In this video I've moved through the entire horizontal field of view, which I could not have done with a manually cropped region. Thanks to the new Filter the region of interest was almost always in the center. And of course, like with everything that I do, it can be fully customized to your personal needs. Whether you want it fully zoomed in to something or just slightly further away is entirely up to you. If you wanted you could even create this meme on the press of a button.

@@ -32,7 +45,10 @@ tags: [Tutorial, StreamFX, 'WordPress Archive']

The options are very simple: Stability controls how fast it should respond to changes (lower values respond faster but are more unstable), Zoom controls the maximum zoom relative to the "full frame" zoom level, and Offset controls the relative offset to the detected region of interest. The default values are targeted at the Logitech Brio 4k with the Field of View set to 90°, so you should adjust the values for your needs and camera of choice.
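
To get a feel for the trade-off behind Stability, here is a minimal sketch using plain exponential smoothing. This is not StreamFX's actual tracking code; the function names, the 0 to 1 value range and the pixel values are assumptions made purely for illustration.

```python
def smooth(previous, target, stability):
    """Move `previous` toward `target`; higher `stability` damps the step.

    Assumption: `stability` lies in [0, 1). This is a hypothetical
    stand-in, not the filter's real parameter range.
    """
    if not 0.0 <= stability < 1.0:
        raise ValueError("stability must be in [0, 1)")
    return previous * stability + target * (1.0 - stability)


def track(centers, stability):
    """Smooth a sequence of detected face centers (one axis, in pixels)."""
    position = centers[0]
    path = []
    for detected in centers:
        position = smooth(position, detected, stability)
        path.append(position)
    return path


# A face jumping from x=100 to x=200, with a few pixels of detector jitter.
detections = [100, 100, 200, 203, 198, 201, 200]
print(track(detections, 0.2))  # low stability: follows quickly, keeps the jitter
print(track(detections, 0.9))  # high stability: follows slowly, hides the jitter
```

The low value reaches the new position within a frame or two but wobbles with every noisy detection, while the high value glides there slowly and stays calm, which matches the "faster but more unstable" description above.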

-
Settings for the Nvidia Face Tracking Filter
+ +
+
Settings for the Nvidia Face Tracking Filter
+

Anything else?

diff --git a/_posts/2020/2020-04-26-optimizing-streamfx-in-your-obs-scenes.html b/_posts/2020/2020-04-26-optimizing-streamfx-in-your-obs-scenes.html index 1c58a4e..2657f2e 100644 --- a/_posts/2020/2020-04-26-optimizing-streamfx-in-your-obs-scenes.html +++ b/_posts/2020/2020-04-26-optimizing-streamfx-in-your-obs-scenes.html @@ -30,7 +30,35 @@ tags: [Tutorial, StreamFX, 'WordPress Archive']

For example you can completely replace a Gaussian Area Blur with a Dual-Filtering Blur and the latter will be significantly faster. Just look at this table to see replacements that work up to 5 times faster on any hardware:

-
Original Blur | Replacement Blur | Up to x% faster
Box Area Blur, Box Directional Blur | Box Linear Area Blur, Box Linear Directional Blur | ~200%
Gaussian Area Blur | Dual-Filtering Blur | ~500%
Gaussian Area Blur, Gaussian Directional Blur | Gaussian Linear Area Blur, Gaussian Linear Directional Blur (not identical to Gaussian Blur) | ~200%
Possible replacement blurs that take significantly less CPU and GPU time.
+
+ + + + + + + + + + + + + + + + + + + + + + + + + +
Original Blur | Replacement Blur | Up to x% faster
Box Area Blur, Box Directional Blur | Box Linear Area Blur, Box Linear Directional Blur | ~200%
Gaussian Area Blur | Dual-Filtering Blur | ~500%
Gaussian Area Blur, Gaussian Directional Blur | Gaussian Linear Area Blur, Gaussian Linear Directional Blur (not identical to Gaussian Blur) | ~200%
+
Possible replacement blurs that take significantly less CPU and GPU time.
+

Full Resolution Blurring is Wasteful

@@ -44,7 +72,15 @@ tags: [Tutorial, StreamFX, 'WordPress Archive']

Shaders are one of the features in StreamFX that allow you to do so much cool stuff - and at the same time mess everything up. The following is a list of things you should do:

- +

Other Improvements

diff --git a/_posts/2020/2020-06-13-is-a-48khz-sample-rate-truly-enough-for-audio.html b/_posts/2020/2020-06-13-is-a-48khz-sample-rate-truly-enough-for-audio.html index 97c6681..6111576 100644 --- a/_posts/2020/2020-06-13-is-a-48khz-sample-rate-truly-enough-for-audio.html +++ b/_posts/2020/2020-06-13-is-a-48khz-sample-rate-truly-enough-for-audio.html @@ -2,185 +2,203 @@ title: Is a 48kHz sample rate truly enough for Audio? category: Blog tags: ["Audio", "DAC", "ADC", 'WordPress Archive'] -unpublished: true +published: false --- -

Ever since the day that we've been able to push sample rate higher than 44.1kHz, this question has appeared: What is the best sample rate for Audio, and can you actually hear the difference between 48kHz and 96kHz (or higher) sample rates?

- - -

Before we get into this, note that I am not an audio engineer, or a scientist. I am a software developer, who is often too curious for his own good, resulting in weird new projects - like StreamFX. So take this with a grain of salt, and if you know better, do feel free to contact me!

- +

Before we get into this, note that I am not an audio engineer, or a scientist. I am a software developer, who is often too curious for his own good, resulting in weird new projects - like StreamFX. So take this with a grain of salt, and if you know better, do feel free to contact me!

- - - - -

This test is based on my own frequency visualizer tool available here. All tests are based on the humanly audible range of 20 Hz to 20kHz (for a healthy young adult). These tests do not take into account inaccuracies caused by the physical properties of the D/A or A/D converters, speakers and microphones.

- -

Update: Information for supersampling D/A and A/D converters has been added to the entry. Please see the conclusion page for more information.

- -

When does 48kHz run into problems?

- -

The most common case is the conversion from Digital to Analog - without it we would not be able to hear any audio at all. Let's take a look at a few common frequencies: 32 Hz, 64 Hz, 128 Hz, 256 Hz, 512 Hz, 1024 Hz, 2048 Hz, 4096 Hz, 8192 Hz, and finally 16384 Hz.

- - - - + + +
+ +
+
+ +
+ +
+
+ +
+ +
+
+ +
+ +
+
+
-

Looking at the generated graphs, we can immediately tell that anything equal to or below 2048 Hz will be perfectly fine on 48kHz. We can also tell that somewhere between 2048 and 4096 Hz we will start seeing slight artifacts, and that everything above that unknown value will have ever stronger artifacts - In fact we can see the strong artifacts appear on 8192 Hz already.

- -

And at 16384 Hz we might as well just throw in the towel as there is basically no way to create the original wave with current hardware. Even the most accurate DAC will struggle to recreate the wave properly, and overshoot and undershoot constantly, corrupting the wave past recovery. While it is possible to work around the issue, it won't be gone.
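
One way to put numbers on these graphs is to count how many samples are available for each cycle of the tone. The sketch below only does that arithmetic; it says nothing about what a converter's filtering makes of those points, and the function name is made up for this example.

```python
SAMPLE_RATE_HZ = 48_000


def samples_per_cycle(frequency_hz, sample_rate_hz=SAMPLE_RATE_HZ):
    """Number of samples that fall into one full cycle of the tone."""
    return sample_rate_hz / frequency_hz


# The same 2^n test frequencies as above: 32 Hz up to 16384 Hz.
for exponent in range(5, 15):
    frequency = 2 ** exponent
    count = samples_per_cycle(frequency)
    print(f"{frequency:>6} Hz: {count:8.2f} samples per cycle")
```

At 2048 Hz there are still more than 23 samples per cycle, while at 16384 Hz fewer than three remain.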

- -

Knowing this we can tell that for the majority of audible frequencies, we'll be safe with 48kHz. But so far we've only looked at 2^n frequencies - what about other frequencies that end up in a pretty bad shape at this sample rate?

- -

The Broken Frequencies in 48kHz

- -

Since we can safely say that any frequency up to 4096 Hz works "fine", let's take a look at the frequencies above that. For example, how about we look at integer divisions of 48 kHz and variations of them, such as 19.2 kHz, 16 kHz, 12 kHz, 9.6 kHz, 8 kHz, 6 kHz and 4.8 kHz.

- - - - + + +
+ +
+
+ +
+ +
+
+ +
+ +
+
+ +
+ +
+
+
-

At 19.2 kHz and 16 kHz, we have by far the worst artifacts. It's not even possible to call these waves anymore, they are just random noise now. Not much of the original wave is left, but we can still guess that it used to be a wave of some type. In the second sample, which is slightly offset in time, we can see even worse effects for both frequencies.
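
The effect of the time offset can be reproduced with a few lines of arithmetic. The sketch below samples a 16 kHz sine at 48 kHz with different starting phases; the helper name and the chosen phases are assumptions for illustration, and it only looks at the stored sample values, not at any filtering done afterwards.

```python
import math

SAMPLE_RATE_HZ = 48_000


def sample_sine(frequency_hz, phase_rad, count, sample_rate_hz=SAMPLE_RATE_HZ):
    """Return `count` samples of a unit-amplitude sine with a start phase."""
    step = 2.0 * math.pi * frequency_hz / sample_rate_hz
    return [math.sin(step * n + phase_rad) for n in range(count)]


# 16 kHz is exactly one third of 48 kHz, so only three samples cover a cycle.
for phase_deg in (0, 30, 90):
    samples = sample_sine(16_000, math.radians(phase_deg), 6)
    peak = max(abs(value) for value in samples)
    print(phase_deg, [round(value, 3) for value in samples], round(peak, 3))
```

The same three values repeat over and over, and which three values they are depends entirely on the offset: with no offset the largest stored value is about 0.866, with a 30° offset it is 1.0.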

- - - - + + +
+ +
+
+ +
+ +
+
+ +
+ +
+
+ +
+ +
+
+
-

Continuing on with 12kHz and 9.6kHz, we can see similar results depending on just how the time offset is adjusted. However, good filtering algorithms might still be able to make out that these used to be waves - the velocity of the waveform could be used to recreate a proper wave for the frequency that we are trying to reproduce.

- - - - + + +
+ +
+
+ +
+ +
+
+ +
+ +
+
+ +
+ +
+
+
-

With 8kHz and all frequencies below that, we've approached the area where the artifacts become so small that we can filter them out at minimal loss. Knowing this, we can infer that all frequencies smaller than this will perform fine, given good filtering.

- -

So the question then is, what sample rate is enough to fix the majority of artifacts?

- -

Which Sample Rate avoids the artifacts?

- -

In the best case possible, we would want to accurately reproduce every frequency between 20 Hz and 20 kHz. This is however just not feasible with current technology at a reasonable price point. That means we'll have to make do with what we already have: 96 kHz and 192 kHz. Let's look at both of them.

- - - - + + +
+ +
+
+ +
+ +
+
+
-

In the graphs for 96 kHz we can clearly see an improvement compared to 48 kHz, as it almost eliminates all the artifacts for these frequencies. At 96 kHz we are safe when it comes to human speech and most instruments. Some artifacts are still left, but for the most common use cases, 96 kHz is enough.

- - - - + + +
+ +
+
+ +
+ +
+
+
-

At 192 kHz we can see all the remaining artifacts effectively disappear completely. Even 19.2 kHz looks like a proper wave and likely will not need any complex filtering to be detected correctly. This would be the ideal sample rate for instruments such as cymbals and bells, but would be massively overkill for vocals.

- -

Solving the Question(s)

- -

Is 48 kHz enough?

- -

This depends, but the short answer is no. There is a significant audio processing overhead required to make 48 kHz be able to sound like what you would achieve with 96 kHz or higher. If you can confidently say that everything in your audio production pipeline is doing the necessary processing for 48 kHz playback, then you can set your playback frequency to 48 kHz.

- -

Will switching to 96 kHz (or higher) fix the problems?

- -

Yes, absolutely. While they won't be gone completely, they will be reduced to the point that they won't matter anymore, which is especially important for audio recording from real world instruments and vocals. A studio performance captured at 192 kHz sample rate will sound much different compared to one captured at 48 kHz.

- -

What sample rate should I pick?

- -

This depends on what you actually want to do:

- - - - + -

However, there is a problem with this. If your pipeline involves a naive downsampler, which is common in many popular media production tools such as streaming apps, you actually gain none of the benefits of the higher sampling rate. In the worst case this can even cause new artifacts to appear.
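
To illustrate how a naive downsampler can create new artifacts, here is a small sketch that drops every second sample of a 96 kHz signal without any filtering. The 30 kHz test tone and the helper names are assumptions chosen for this example, not taken from any particular streaming app.

```python
import math


def sine(frequency_hz, sample_rate_hz, count):
    """Return `count` samples of a unit-amplitude sine."""
    step = 2.0 * math.pi * frequency_hz / sample_rate_hz
    return [math.sin(step * n) for n in range(count)]


def naive_downsample(samples, factor):
    """Keep every `factor`-th sample without any low-pass filtering."""
    return samples[::factor]


# A 30 kHz tone recorded at 96 kHz, then naively reduced to 48 kHz.
recorded = sine(30_000, 96_000, 64)
decimated = naive_downsample(recorded, 2)

# An inverted 18 kHz tone sampled directly at 48 kHz, for comparison.
alias = [-value for value in sine(18_000, 48_000, 32)]

worst = max(abs(a - b) for a, b in zip(decimated, alias))
print(f"largest difference: {worst:.2e}")
```

The difference is nothing but floating point rounding: after dropping samples, the inaudible 30 kHz tone has turned into an 18 kHz tone that was never in the recording.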

- -

What is the correct way to downsample?

- -

This is the hard part, and I have no real answer for it. A reduced sample rate simply cannot cover all the frequencies that higher sample rates can, and even the best downsampling and filtering can only do so much and will struggle with certain frequencies where artifacts are simply unavoidable.

- -

The majority of the frequencies above 9.6 kHz are problematic at 48 kHz, and simply can't be represented correctly. For example, the 19.2 kHz frequency is nearly impossible to represent accurately at 48 kHz, but is fine at 96 kHz.

- -

What about supersampling D/A and A/D converters?

- -

Higher priced audio devices have started using supersampling D/A and A/D converters, which usually have a data resolution of 48, 96 or 192 kHz, and an internal resolution in the MHz range. Since these are usually not listed in the spec sheet, it is impossible to tell if you have one or not without an oscilloscope.

- -

Their quality is defined by their resampling algorithm, and high quality resampling algorithms can make 48 kHz sound nearly indistinguishable compared to 96 kHz, at least for the majority of frequencies. If you can confidently say that you have one of these, then you will be "fine" at 48 kHz sampling rate - the majority of audio frequencies will be reproduced with only minor artifacts.

- -

The Conclusion

- -

So there you have it, the answer to the age old question: "Is 48 kHz enough?" - and the answer to it is "No". The minimum necessary to accurately reproduce most real world audio is 96 kHz, and some things even need 192 kHz or higher to be correctly reproduced.

- -

And thanks to technological advances, we might in the future see 96 kHz become the new "X is enough". Chips have gotten smaller and more efficient, audio capture/playback devices have gotten better at audio, and even our mobile phones are starting to jump onto higher sample rates.

- - -

With all that said, there isn't anything left to talk about. If you think I made a mistake, or just know better, do feel free to contact me.

- +

With all that said, there isn't anything left to talk about. If you think I made a mistake, or just know better, do feel free to contact me.

diff --git a/_posts/2020/2020-06-23-the-art-of-encoding-with-nvidia-turing-nvenc.html b/_posts/2020/2020-06-23-the-art-of-encoding-with-nvidia-turing-nvenc.html index 69390ea..b11e146 100644 --- a/_posts/2020/2020-06-23-the-art-of-encoding-with-nvidia-turing-nvenc.html +++ b/_posts/2020/2020-06-23-the-art-of-encoding-with-nvidia-turing-nvenc.html @@ -1,49 +1,59 @@ --- title: The Art of encoding with NVIDIA NVENC category: Blog -tags: [Tutorial, NVENC, NVIDIA, OBS Studio] +tags: [Tutorial, NVENC, NVIDIA, OBS Studio, "WordPress Archive"] +samples: + resolutions: + - caption: "1920x1080 at 8.5mbit" + tag: "8500-1080" + - caption: "1920x1080 at 6.0mbit" + tag: "6000-1080" + - caption: "1280x720 at 6.0mbit" + tag: "6000-0750" + - caption: "960x540 at 6.0mbit" + tag: "6000-0540" + - caption: "640x360 at 3.5mbit" + tag: "3500-0360" + encoders: + - name: "x264 slow" + tag: "x264" + - name: "OBS Studio 27.x" + tag: "obsnvenc" + - name: "StreamFX v0.10" + tag: "sfxnvenc" + - name: "FFmpeg 4.4" + tag: "newnvenc" + videos: + - name: "ARMA 3 \"Walk through the Jungle\" #002" + url: "https://cdn.xaymar.com/blog/2021/04/arma_3-002.:encoder-:resolution.mp4" + - name: "Black Mesa #001" + url: "https://cdn.xaymar.com/blog/2021/04/black_mesa-001.:encoder-:resolution.mp4" + - name: "Call of Duty Modern Warframe TDM Broadcast (by EposVox)" + url: "https://cdn.xaymar.com/blog/2021/04/call_of_duty_modern_warfare-tdm_broadcast-001.:encoder-:resolution.mp4" + - name: "Forza Horizon 4 #002" + url: "https://cdn.xaymar.com/blog/2021/04/forza_4_horizon-002.:encoder-:resolution.mp4" + - name: "GRIP Combat Racing #007" + url: "https://cdn.xaymar.com/blog/2021/04/grip_combat_racing-007.:encoder-:resolution.mp4" --- -

Streaming with more than one PC has been the leader in H.264 encoding for years, but NVIDIAs Turing and - Ampere generation has put a significant dent into that lead. The new generation of GPUs with the brand new encoder - brought comparable quality x264 medium - if you can find a GPU that is. Let's take a look at what's needed to set up - your stream for massively improved quality.

+

Streaming with more than one PC has been the leader in H.264 encoding quality for years, but NVIDIA's Turing and Ampere generations have put a significant dent into that lead. The new generation of GPUs with the brand new encoder brought quality comparable to x264 medium - if you can find a GPU, that is. Let's take a look at what's needed to set up your stream for massively improved quality.


The guide has been updated for:
StreamFX v0.11 and OBS Studio 27.0

Setting up NVENC (for Streaming)

-

Modern OBS Studio has two ways to achieve the expected quality: the built-in NVENC H.264 (new) - and the addition from StreamFX called NVIDIA NVENC H.264/AVC (via FFmpeg). Both Options can achieve similar quality to - x264 medium, but the latter is able to exceed that and rival x264 medium/slow in various situations. Whichever you - pick, both of them support zero-copy encoding, and they're both valid options for streaming.

+

Modern OBS Studio has two ways to achieve the expected quality: the built-in NVENC H.264 (new) and the addition from StreamFX called NVIDIA NVENC H.264/AVC (via FFmpeg). Both options can achieve similar quality to x264 medium, but the latter is able to exceed that and rival x264 medium/slow in various situations. Whichever you pick, both of them support zero-copy encoding, and they're both valid options for streaming.

Built-In: OBS Studio NVENC H.264 (new)

- - -
-
-
Image reference for Turing/Ampere
-
-
- - -

The built-in NVENC option in OBS Studio is by far the simplest option and will give you almost - identical quality on Maxwell, Pascal, Turing and Ampere, though Turing and Ampere will make use of the new - improvements of the NVENC chip. Maxwell and Pascal users can expect to reach x264 veryfast/faster-like quality, - while Turing and Ampere users can expect to hit fast/medium-like quality. Below are the settings you need to set: +

+

+
Image reference for Turing/Ampere
+
The built-in NVENC option in OBS Studio is by far the simplest option and will give you almost identical quality on Maxwell, Pascal, Turing and Ampere, though Turing and Ampere will make use of the new improvements of the NVENC chip. Maxwell and Pascal users can expect to reach x264 veryfast/faster-like quality, while Turing and Ampere users can expect to hit fast/medium-like quality. Below are the settings you need to set:

-
+
@@ -85,23 +95,13 @@ tags: [Tutorial, NVENC, NVIDIA, OBS Studio]

StreamFX: NVIDIA NVENC H.264/AVC (via FFmpeg)

- -
-
-
Image reference for Turing/Ampere
-
-
- +

+

+
Image reference for Turing/Ampere
+
+If you're new to StreamFX's NVENC integration, it will most likely overwhelm you with the settings it offers. But thanks to all those settings, you can actually go above the default quality by quite a significant amount. Note that I will only cover critical settings, as other settings like Bitrate, Buffer Size and Key Frame Interval are explained elsewhere.

-

If you're new to StreamFX's NVENC integration, it will most likely overwhelm you with the settings it - offers. But thanks to all those settings, you can actually go above the default quality by quite a significant - amount. Note that I will only cover critical settings, as other settings like Bitrate, Buffer Size and Key Frame - Interval are explained elsewhere.

- -
+
@@ -231,1018 +231,108 @@ tags: [Tutorial, NVENC, NVIDIA, OBS Studio]
Ideal NVENC Settings for StreamFX's NVENC
-

For certain platforms it may be necessary to turn off Adaptive I-Frames due to how their internal - processing works, such as Twitch. This has a drastic quality impact and should only be done if you rely on that - platform alone to reach your audience.

+

For certain platforms, such as Twitch, it may be necessary to turn off Adaptive I-Frames due to how their internal processing works. This has a drastic quality impact and should only be done if you rely on that platform alone to reach your audience.

Setting up Resolution and Framerate to match the Bitrate

-

It is no secret than H.264/AVC is an outdated codec and that platforms should have paved the way for - better codecs a long time ago, but it is the solution that we are stuck with until AV1 is adopted by the masses. So - in order to get the best quality out of our stream, we should aim to also set up our stream according to the bitrate - and codec we use. Below is the average result of a few hundred thousand tests at various resolutions and bitrates, - according to PSNR and VMAF (weighted PSNR 30:70 VMAF):

+

It is no secret that H.264/AVC is an outdated codec and that platforms should have paved the way for better codecs a long time ago, but it is the solution that we are stuck with until AV1 is adopted by the masses. So in order to get the best quality out of our stream, we should aim to also set up our stream according to the bitrate and codec we use. Below is the average result of a few hundred thousand tests at various resolutions and bitrates, according to PSNR and VMAF (weighted PSNR 30:70 VMAF):

-
+
- - - - - - - + + + + + + + - - - - - - - + + + + + + + - - - - - - - + + + + + + + - - - - - - - + + + + + + + - - - - - - - + + + + + + + - - - - - - - + + + + + + + - - - - - - - + + + + + + +
Resolution   3.5mbit   3.5mbit   6.0mbit   6.0mbit   8.5mbit   8.5mbit
             30 FPS    60 FPS    30 FPS    60 FPS    30 FPS    60 FPS
640x360      6         6         8         7         9         8
960x540      5         4         6         5         7         7
1280x720     4         3         4         3         5         4
1536x864     3         2         4         3         4         4
1600x900     3         2         3         2         4         3
1920x1080    2         1         2         2         3         3
-
Rating from 1 to 10 based on VMAF and PSNR, weighted towards producing useful ranges. Tests performed - with x264 slow.
A 10 is perfect, 9 is near lossless, 8 is indistinguishable, 7 is high quality, 5 is - acceptable quality and 3 is watchable.
+
Rating from 1 to 10 based on VMAF and PSNR, weighted towards producing useful ranges. Tests performed with x264 slow.
A 10 is perfect, 9 is near lossless, 8 is indistinguishable, 7 is high quality, 5 is acceptable quality and 3 is watchable.
-

Please note that watchable in video encoding means that you can decode information within it with - reasonable accuracy, instead of it having turned to full garbage. Higher resolutions than 1920x1080 were omitted - from the table as the rows would be filled with values between 0 and 1, which just are not very useful to us.

+

Please note that watchable in video encoding means that you can decode information within it with reasonable accuracy, instead of it having turned to full garbage. Higher resolutions than 1920x1080 were omitted from the table as the rows would be filled with values between 0 and 1, which just are not very useful to us.

-

This means that at 3.5mbit, the highest resolution and framerate for a variety streamer is 1280x720 at - 30 FPS, or 960x540 at 60 FPS. The equation shifts slightly for 6.0mbit, where you can either go for 1536x864 at 30 - FPS or 1280x720 at 60 FPS. Finally at 8.5mbit you are looking at a maximum resolution and framerate of 1920x1080 at - 30 FPS or 1536x864 at 60 FPS.

+

This means that at 3.5mbit, the highest resolution and framerate for a variety streamer is 1280x720 at 30 FPS, or 960x540 at 60 FPS. The equation shifts slightly for 6.0mbit, where you can either go for 1536x864 at 30 FPS or 1280x720 at 60 FPS. Finally at 8.5mbit you are looking at a maximum resolution and framerate of 1920x1080 at 30 FPS or 1536x864 at 60 FPS.
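A rough way to sanity-check these pairings is to look at bits per pixel, meaning how many bits the encoder gets to spend on each pixel of each frame. This is a common rule of thumb and not the metric behind the ratings (those come from PSNR and VMAF), so treat the following as a minimal sketch. The helper name `bits_per_pixel` is hypothetical, and 1 mbit is assumed to mean 1,000,000 bits per second.

```cpp
#include <cstdint>

// Hypothetical helper (not part of the original tests): average number of
// bits available per pixel per frame.
// Assumption: 1 mbit = 1,000,000 bits per second.
constexpr double bits_per_pixel(double mbit, std::uint32_t width, std::uint32_t height, std::uint32_t fps)
{
    return (mbit * 1000000.0) / (static_cast<double>(width) * height * fps);
}

// The pairings recommended in the text.
constexpr double bpp_3500_720p30  = bits_per_pixel(3.5, 1280, 720, 30);
constexpr double bpp_3500_540p60  = bits_per_pixel(3.5, 960, 540, 60);
constexpr double bpp_6000_864p30  = bits_per_pixel(6.0, 1536, 864, 30);
constexpr double bpp_6000_720p60  = bits_per_pixel(6.0, 1280, 720, 60);
constexpr double bpp_8500_1080p30 = bits_per_pixel(8.5, 1920, 1080, 30);
constexpr double bpp_8500_864p60  = bits_per_pixel(8.5, 1536, 864, 60);

// For comparison: 1920x1080 at 60 FPS on 6.0mbit.
constexpr double bpp_6000_1080p60 = bits_per_pixel(6.0, 1920, 1080, 60);
```

All six recommended pairings come out at roughly 0.11 to 0.15 bits per pixel, while 1920x1080 at 60 FPS on 6.0mbit drops below 0.05, which fits the low rating that combination gets in the table.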

Final Words

-

In the past few years NVIDIA has made massive improvements to their encoder, which has evened the - playing field far beyond what was expected. With no need to transfer frames from the GPU to the CPU, and quality - comparable to x264 medium (or better), NVIDIAs Turing NVENC is pushing the boundaries of what is possible in a - single consumer PC.

+

In the past few years NVIDIA has made massive improvements to their encoder, which has evened the playing field far beyond what was expected. With no need to transfer frames from the GPU to the CPU, and quality comparable to x264 medium (or better), NVIDIA's Turing NVENC is pushing the boundaries of what is possible in a single consumer PC.

-

Whether you use it or not is entirely up to you however. If you already have a working Dual-PC setup - that can achieve x264 medium (or better) quality, then you don't gain much from moving to Turing NVENC. But if - you're currently stuck on anything below x264 medium, or have a Turing GPU ready to test it out - why not give it a - shot?

+

Whether you use it or not is entirely up to you, however. If you already have a working Dual-PC setup that can achieve x264 medium (or better) quality, then you don't gain much from moving to Turing NVENC. But if you're currently stuck on anything below x264 medium, or have a Turing GPU ready to test it out - why not give it a shot?

Video Examples

- -
-

ARMA 3 "Walk through the Jungle" #002

- - - -
-
-

x264 slow

- - - -
-
1920x1080x60 at 8.5mbit
-
- - - -
-
1920x1080x60 at 6.0mbit
-
- - - -
-
1280x720x60 at 6.0mbit
-
- - - -
-
960x540 at 6.0mbit
-
- - - -
-
640x360 at 3.5mbit
-
- -
- - - -
-

OBS Studio 27.x

- - - -
-
1920x1080x60 at 8.5mbit
-
- - - -
-
1920x1080x60 at 6.0mbit
-
- - - -
-
1280x720x60 at 6.0mbit
-
- - - -
-
960x540 at 6.0mbit
-
- - - -
-
640x360 at 3.5mbit
-
- -
- - - -
-

StreamFX v0.10

- - - -
-
1920x1080x60 at 8.5mbit
-
- - - -
-
1920x1080x60 at 6.0mbit
-
- - - -
-
1280x720x60 at 6.0mbit
-
- - - -
-
960x540 at 6.0mbit
-
- - - -
-
640x360 at 3.5mbit
-
- -
- - - -
-

FFmpeg 4.4

- - - -
-
1920x1080x60 at 8.5mbit
-
- - - -
-
1920x1080x60 at 6.0mbit
-
- - - -
-
1280x720x60 at 6.0mbit
-
- - - -
-
960x540 at 6.0mbit
-
- - - -
-
640x360 at 3.5mbit
-
- -
- -
- -
- - - - - - - -
-

Black Mesa #001

- - - -
-
-

x264 slow

- - - -
-
1920x1080x60 at 8.5mbit
-
- - - -
-
1920x1080x60 at 6.0mbit
-
- - - -
-
1280x720x60 at 6.0mbit
-
- - - -
-
960x540 at 6.0mbit
-
- - - -
-
640x360 at 3.5mbit
-
- -
- - - -
-

OBS Studio 27.x

- - - -
-
1920x1080x60 at 8.5mbit
-
- - - -
-
1920x1080x60 at 6.0mbit
-
- - - -
-
1280x720x60 at 6.0mbit
-
- - - -
-
960x540 at 6.0mbit
-
- - - -
-
640x360 at 3.5mbit
-
- -
- - - -
-

StreamFX v0.10

- - - -
-
1920x1080x60 at 8.5mbit
-
- - - -
-
1920x1080x60 at 6.0mbit
-
- - - -
-
1280x720x60 at 6.0mbit
-
- - - -
-
960x540 at 6.0mbit
-
- - - -
-
640x360 at 3.5mbit
-
- -
- - - -
-

FFmpeg 4.4

- - - -
-
1920x1080x60 at 8.5mbit
-
- - - -
-
1920x1080x60 at 6.0mbit
-
- - - -
-
1280x720x60 at 6.0mbit
-
- - - -
-
960x540 at 6.0mbit
-
- - - -
-
640x360 at 3.5mbit
-
- -
- -
- -
- - - - - - - -
-

Call of Duty Modern Warframe TDM Broadcast

- - - -
-
-

x264 slow

- - - -
-
1920x1080x60 at 8.5mbit
-
- - - -
-
1920x1080x60 at 6.0mbit
-
- - - -
-
1280x720x60 at 6.0mbit
-
- - - -
-
960x540 at 6.0mbit
-
- - - -
-
640x360 at 3.5mbit
-
- -
- - - -
-

OBS Studio 27.x

- - - -
-
1920x1080x60 at 8.5mbit
-
- - - -
-
1920x1080x60 at 6.0mbit
-
- - - -
-
1280x720x60 at 6.0mbit
-
- - - -
-
960x540 at 6.0mbit
-
- - - -
-
640x360 at 3.5mbit
-
- -
- - - -
-

StreamFX v0.10

- - - -
-
1920x1080x60 at 8.5mbit
-
- - - -
-
1920x1080x60 at 6.0mbit
-
- - - -
-
1280x720x60 at 6.0mbit
-
- - - -
-
960x540 at 6.0mbit
-
- - - -
-
640x360 at 3.5mbit
-
- -
- - - -
-

FFmpeg 4.4

- - - -
-
1920x1080x60 at 8.5mbit
-
- - - -
-
1920x1080x60 at 6.0mbit
-
- - - -
-
1280x720x60 at 6.0mbit
-
- - - -
-
960x540 at 6.0mbit
-
- - - -
-
640x360 at 3.5mbit
-
- -
- -
- -
- - - - - - - -
-

Forza Horizon 4 #002

- - - -
-
-

x264 slow

- - - -
-
1920x1080x60 at 8.5mbit
-
- - - -
-
1920x1080x60 at 6.0mbit
-
- - - -
-
1280x720x60 at 6.0mbit
-
- - - -
-
960x540 at 6.0mbit
-
- - - -
-
640x360 at 3.5mbit
-
- -
- - - -
-

OBS Studio 27.x

- - - -
-
1920x1080x60 at 8.5mbit
-
- - - -
-
1920x1080x60 at 6.0mbit
-
- - - -
-
1280x720x60 at 6.0mbit
-
- - - -
-
960x540 at 6.0mbit
-
- - - -
-
640x360 at 3.5mbit
-
- -
- - - -
-

StreamFX v0.10

- - - -
-
1920x1080x60 at 8.5mbit
-
- - - -
-
1920x1080x60 at 6.0mbit
-
- - - -
-
1280x720x60 at 6.0mbit
-
- - - -
-
960x540 at 6.0mbit
-
- - - -
-
640x360 at 3.5mbit
-
- -
- - - -
-

FFmpeg 4.4

- - - -
-
1920x1080x60 at 8.5mbit
-
- - - -
-
1920x1080x60 at 6.0mbit
-
- - - -
-
1280x720x60 at 6.0mbit
-
- - - -
-
960x540 at 6.0mbit
-
- - - -
-
640x360 at 3.5mbit
-
- -
- -
- -
- - - - - - - -
-

GRIP Combat Racing #007 (at 60 FPS)

- - - -
-
-

x264 slow

- - - -
-
1920x1080x60 at 8.5mbit
-
- - - -
-
1920x1080x60 at 6.0mbit
-
- - - -
-
1280x720x60 at 6.0mbit
-
- - - -
-
960x540 at 6.0mbit
-
- - - -
-
640x360 at 3.5mbit
-
- -
- - - -
-

OBS Studio 27.x

- - - -
-
1920x1080x60 at 8.5mbit
-
- - - -
-
1920x1080x60 at 6.0mbit
-
- - - -
-
1280x720x60 at 6.0mbit
-
- - - -
-
960x540 at 6.0mbit
-
- - - -
-
640x360 at 3.5mbit
-
- -
- - - -
-

StreamFX v0.10

- - - -
-
1920x1080x60 at 8.5mbit
-
- - - -
-
1920x1080x60 at 6.0mbit
-
- - - -
-
1280x720x60 at 6.0mbit
-
- - - -
-
960x540 at 6.0mbit
-
- - - -
-
640x360 at 3.5mbit
-
- -
- - - -
-

FFmpeg 4.4

- - - -
-
1920x1080x60 at 8.5mbit
-
- - - -
-
1920x1080x60 at 6.0mbit
-
- - - -
-
1280x720x60 at 6.0mbit
-
- - - -
-
960x540 at 6.0mbit
-
- - - -
-
640x360 at 3.5mbit
-
- -
- -
- -
- +{% for video in page.samples.videos %} +

{{ video.name }}

+ +
+ +
+ +{% endfor %} diff --git a/_posts/2020/2020-07-14-funding-streamfx-where-to-go-from-here.html b/_posts/2020/2020-07-14-funding-streamfx-where-to-go-from-here.html new file mode 100644 index 0000000..5a123c3 --- /dev/null +++ b/_posts/2020/2020-07-14-funding-streamfx-where-to-go-from-here.html @@ -0,0 +1,39 @@ +--- +title: "Funding StreamFX: Where to go from here?" +category: Blog +tags: [StreamFX, "WordPress Archive"] +--- + +

StreamFX has grown into one of the most used plugins for OBS Studio, often being called essential for big and small creators alike. And yet, there is a massive problem facing StreamFX: A lack of funding. Like any project, StreamFX can't survive without it, so where do we go from here?

+ +

Currently the funding comes from GitHub Sponsors, Twitch Subscriptions, Patreon, and my own job. The first three make up around $110 in total (give or take), which I'm really thankful for. While $110 is not a lot, it does help a bit, and reduces my time spent at work ever so slightly.

+ +

The way Forward

+ +

If we look at the monthly download numbers for just the major stable releases, we can see close to 22k downloads. Assuming that around 75% of those downloads are redownloads from the same user, we arrive at around 5k unique users.

+ +

If even half of these users were to support both OBS Studio and StreamFX with just $1 per month, both projects would be in a much better place. I could hire a 2nd developer for macOS or Vulkan, which would open up many new possibilities for StreamFX.

+ +

However that's not what we're seeing at all, despite my attempts to get people to support the project more. That leaves just one question then: How can I make supporting StreamFX more attractive?

+ +

Option 1: Removable Ads in the App

+ +

In the modern world, ads are the primary way to get paid for work without requiring a subscription. This option would add an ad to the OBS Studio app window that can only be removed by supporting the project in some way. I'm not a particular fan of this option, so this is my least favourite one - and very unlikely to happen.

+ +

Option 2: Bundled Software in the Installer

+ +

This approach is used by software like Cheat Engine, FileZilla and others to pay for development costs, but it is often associated with bad press: the bundled software gets installed despite clicking no, or the installer tries to trick users into accepting by moving the accept button to where the decline button would be. More likely than option 1, but still not my go-to option.

+ +

Option 3: Locking new features behind a Paywall

+ +

This is the option I've thought about the most. Every currently existing feature would stay free, while new features would require supporting the project. Unlocking the features would simply be signing in with your account and things would appear on the next OBS Studio launch. Even the $1/mo tiers would be enough to unlock the features. It is the least intrusive way I can think of, and the most likely to happen.

+ +

Option 4: Removable Watermark

+ +

Watermarks are the easiest way to get people to support something. Those that don't care about the project will simply ignore them (or crop them off like art thieves), and those actually wanting to help the project will look for a way to remove them legally. Similar to Option 3, removing them requires actively supporting the project. Probably going to happen in combination with option 3 (for new features).

+ +

The future

+ +

These four were the only options I could come up with. My current approach is not working out, and I see the plugin used in streams of bigger creators all the time with rarely even any mention of it - or any support from those creators. This saddens me as I see people benefiting from my work and choosing to ignore the struggles of the people actually making the work benefit them.

+ +

What do you think about all this? Do you think I'm in the wrong and should just live on exposure bux, or should I implement one of the options mentioned above? Leave your thoughts in the comments below, or as a reply to the social media message you used to get here.

diff --git a/_posts/2020/2020-07-25-streamfx-now-has-its-own-discord.html b/_posts/2020/2020-07-25-streamfx-now-has-its-own-discord.html new file mode 100644 index 0000000..67d25a3 --- /dev/null +++ b/_posts/2020/2020-07-25-streamfx-now-has-its-own-discord.html @@ -0,0 +1,11 @@ +--- +title: StreamFX now has its own Discord! +category: News +tags: [StreamFX, "WordPress Archive"] +--- + +

Due to an excessive amount of channels required for StreamFX, I've decided to split it off into its own Discord server. You can join it using this link, and enjoy all the new features in it. Make sure to read the rules and select your roles according to what you want to do!

+ +

The Server features dedicated roles for each category of tasks, in order to better help users do things. Each role also has a dedicated releases channel for their own content in order to spread the content to other creators that are less skilled at the task.

+ +

You can also now advertise your content in the dedicated channels for it, such as your stream or your YouTube channel.

diff --git a/_posts/2020/2020-07-27-streamfx-v0-8-1-is-now-available.html b/_posts/2020/2020-07-27-streamfx-v0-8-1-is-now-available.html new file mode 100644 index 0000000..b38731e --- /dev/null +++ b/_posts/2020/2020-07-27-streamfx-v0-8-1-is-now-available.html @@ -0,0 +1,119 @@ +--- +title: StreamFX v0.8.1 is now available! +category: News +tags: [StreamFX, "WordPress Archive"] +--- + +

In the two months since the release of Version 0.8.0, a lot of bugs have been discovered - which now have been fixed with Version 0.8.1! Let's take a closer look at the things that have been fixed.

+ +

Update: Update 0.8.2 has been released fixing the newly discovered issues in 0.8.1. The links in the post have been updated.
Update: 0.8.3 is out, and the links have been updated.

+ +

Improving the Installer experience on Windows

+ +

This had been on the table for a while, and finally made it in. Due to the excessive flood of people not reading the installation instructions and asking the same question - usually within seconds of the same question being answered - the installation process had to become a bit more automatic. With that in mind, I went to town on the installer.

+ +

The first thing I did was get rid of the Win98 feel by enabling the modern UI built into the setup tool. Next was hiding the upgrade process that happens automatically in the background, as it sometimes confused users. And finally, it received the ability to automatically install any missing MSVC Redistributable - a much needed feature as almost all support requests could be fixed with just that.

+ +

With that all done, it was time to move on to other more important things.

+ +

Invisible Source Mirrors

+ +

In version 0.8 a fix was applied to no longer freeze OBS Studio when opening the filter dialog on a Source Mirror, and this in turn caused Source Mirror to disappear as the source size was queried in the tick function. The tick function however is not called if the source has no size, which meant it was never set. Now the source size is acquired as soon as the source is acquired, which fixes it for most types of sources.

+ +

A different fix has to be implemented for asynchronous sources, which usually get their size at a later point in time. This second fix will likely be released with 0.9, and not be backported to 0.8.

+ +

Weird Shader filter Rendering

+ +
+
Shader filters used for glow effect.
+
+ +

Shader filters sometimes turned invisible on load, which resulted in very weird graphics glitches. Fixing this was actually super easy, as the reason for the invisible glitch was actually that the size of the output was set to 0x0, and thus exceptions were being thrown. Catching those exceptions and then skipping the filter fixed the issue.
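The catch-and-skip approach described above can be sketched roughly like this. This is not the StreamFX code: `render_target` and `render_filter` are hypothetical stand-ins, and the assumption is simply that creating a 0x0 output throws.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for a GPU render target. The real StreamFX and
// libOBS types are not shown here; this one only refuses a 0x0 size.
struct render_target {
    render_target(std::uint32_t width, std::uint32_t height)
    {
        if (width == 0 || height == 0)
            throw std::invalid_argument("render target must not be 0x0");
    }
};

// Returns true if the filter was rendered, false if it was skipped.
bool render_filter(std::uint32_t width, std::uint32_t height)
{
    try {
        render_target target(width, height);
        // ... the shader would be drawn into 'target' here ...
        return true;
    } catch (const std::exception&) {
        // The output size is not usable (yet), so skip the filter for this
        // frame instead of letting the exception escape.
        return false;
    }
}
```

Skipping keeps the rest of the filter chain intact for that frame, and the filter can render normally once a real size is known.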

+ +

But something else was still causing problems. Shader filters would look weird depending on how their output resolution was scaled. Turns out I had accidentally coded it to capture the input at the output resolution - fixing that made my effects crystal clear again. Now all shader filters should look crystal clear for you as well!

+ +

New Memory Leaks?!

+ +

Through testing with one of my complex scenes, I discovered an abnormally high number of memory leaks. As it turns out, they actually came from three distinct sources, so fixing them wasn't easy, and I have no explanation for one of them. I've managed to fix all of them, but let's actually look at each one individually:

+ +

Source Tracker

+ +

For a long time now, StreamFX has used its own source tracker implementation, which keeps track of existing sources and scenes, instead of obs_enum_sources and obs_enum_scenes. And this implementation was the source of not just one, but 83 memory leaks - and due to the nature of the code, all of them are critical bugs.

+ +

So what caused them? A quick glance over the code didn't reveal any obvious causes or anything unusual - even hours later I still had no idea what caused them. Even now, nothing jumps out at me that could have caused the leaks. My only guess is that we somehow missed destruction events, which would imply that libOBS is broken, or that we corrupted the map in a thread somewhere.

+ +

I could only gamble at a solution by replacing all the direct pointers with std::shared_ptr, and my gamble paid off. Replacing all the pointers to obs_source_t and obs_weak_source_t with std::shared_ptrs fixed them, as long as a custom deleter was being used. Hooray, on to the next memory leaks!
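For readers unfamiliar with the pattern, here is a minimal sketch of a `std::shared_ptr` with a custom deleter. It is not the StreamFX implementation: `fake_source`, `fake_source_create` and `fake_source_release` are stand-ins so the example compiles on its own. In real code the handle would be an `obs_source_t*` and the deleter would call `obs_source_release`.

```cpp
#include <memory>

// Stand-ins for a reference-counted libOBS handle and its release function
// (assumption: real code would use obs_source_t and obs_source_release).
struct fake_source {};

static int live_sources = 0; // Counts handles that have not been released.

fake_source* fake_source_create()
{
    ++live_sources;
    return new fake_source();
}

void fake_source_release(fake_source* source)
{
    if (source) {
        --live_sources;
        delete source;
    }
}

// Wrap the raw handle so that the release call runs exactly once, when the
// last shared_ptr copy goes away.
std::shared_ptr<fake_source> make_tracked_source()
{
    return std::shared_ptr<fake_source>(fake_source_create(),
                                        [](fake_source* source) { fake_source_release(source); });
}
```

The custom deleter matters because the default deleter would call `delete` on the handle instead of the library's release function, which is wrong for objects that the library itself owns and reference counts.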

+ +

Vertex Buffers

+ +

For those not familiar with the StreamFX code base, most of the internal libOBS objects have fitting C++ wrappers that handle all the reference counting things. The same is true for gs::vertex_buffer, yet it was the source of not just one memory leak, but 13. Looking at the source code didn't reveal much on the surface; everything looked like it should have worked as intended.

+ + + + +

+ Closer inspection revealed the first bug: The gs_vbdata_t object was only deleted if the gs_vertbuffer_t object did not exist. This was obviously wrong, so I replaced the entire thing with an std::shared_ptr<gs_vbdata_t> and a custom deleter. One down, 12 to go - but where could they even be? +

+
if (_data) {
    memset(_data, 0, sizeof(gs_vb_data));
    if (!_buffer) { // The problem.
        gs_vbdata_destroy(_data);
        _data = nullptr;
    }
}
+ +

Several hours of reading the code didn't reveal a single thing. There should have been no way for these memory leaks to occur after this, and yet they did - in total 12 memory leaks were unaccounted for. So I started debugging, and eventually found the culprit: libOBS sets the global obs object to NULL right before unloading all modules, which meant that all the GPU related code would not run.

+ +

This is effectively unfixable from my end, so I've only been able to apply a workaround which reduced the memory leaks to 5. Once the OBS Project releases a new OBS Studio version, they should instantly disappear as a fix for this behavior has already been applied to the main code branch of OBS Studio.

+ +

Configuration

+ +

And with that, we are left with two more fixable memory leaks - and the largest difficulty curve I have ever encountered in my entire career in programming. One of the bugs was simply forgetting to free a memory block, but the other was straight up unexplainable behavior. No matter which documentation I looked at, it should not have occurred - and yet it did. Try to spot the problem yourself:

+ + + +
bool streamfx::ui::handler::have_shown_about_streamfx(bool shown)
{
    if (shown) {
        obs_data_set_bool(streamfx::configuration::instance()->get().get(), _cfg_have_shown_about.data(), true);
    }
    if (streamfx::configuration::instance()->is_different_version()) {
        return false;
    } else {
        return obs_data_get_bool(streamfx::configuration::instance()->get().get(), _cfg_have_shown_about.data());
    }
}
Before
+
+
+ +
bool streamfx::ui::handler::have_shown_about_streamfx(bool shown)
{
    auto config = streamfx::configuration::instance();
    auto data = config->get();
    if (shown) {
        obs_data_set_bool(data.get(), _cfg_have_shown_about.data(), true);
    }
    if (config->is_different_version()) {
        return false;
    } else {
        return obs_data_get_bool(data.get(), _cfg_have_shown_about.data());
    }
}
After
+
+
+
+ + +

Looking at this, the only things that are different are how the std::shared_ptr<streamfx::configuration> and std::shared_ptr<obs_data_t> are stored. Both should have identical results, but they don't. If anyone knows what the actual reason behind this weird behavior is, please leave a comment below, because I couldn't figure it out with the C++ documentation that is available to me.

+ +

Closing Words

+ +

And that is all that was included in the update for version 0.8.1, at least on the user facing side. There are additional changes to the developer facing side, but aside from that, there isn't much to talk about for this version. Go and grab the latest stable version from Github if you haven't already! You can find it here on Github.

+ +

- Xaymar out.

diff --git a/_posts/2020/2020-08-17-a-preview-of-whats-coming-with-streamfx-v0-9.html b/_posts/2020/2020-08-17-a-preview-of-whats-coming-with-streamfx-v0-9.html new file mode 100644 index 0000000..435ce6e --- /dev/null +++ b/_posts/2020/2020-08-17-a-preview-of-whats-coming-with-streamfx-v0-9.html @@ -0,0 +1,41 @@ +--- +title: A preview of what's coming with StreamFX v0.9 +category: Blog +tags: [StreamFX, "WordPress Archive"] +--- + +

A lot of time has passed since the 0.8 release of StreamFX, and since then a lot of code has been submitted and tested. A ton of issues have been fixed internally, making everything work better, and a lot of new features are being worked on. Let's take a quick look at the already confirmed additions!

+ +

The FFmpeg Encoders are now available on Linux!

+ +

You can now use the fancy NVENC UI/UX from StreamFX on your Linux machine! While zero-copy is not supported due to a limitation in OBS Studio itself, all the encoders should be available to you as long as you have the necessary system drivers. This limitation is not something I can work around, so if you need zero-copy you will have to stick with Windows, or find an alternative solution - or just learn coding and write the necessary code in OBS Studio.

+ +

Cleaner UI and code improvements for NVENC!

+ +

NVENC has received a number of changes, mainly for UI/UX, but also in terms of bug fixes. All the rate limitation options are now under the same group, which slightly cleans up the UI and makes it easier to understand. Furthermore the "Each" mode that was listed for "B-Frame Reference Mode" has been removed, as well as Level 5.2 which wasn't actually supported by the FFmpeg version that OBS Studio uses. Lastly the log file now shows the actual configuration, instead of making one up out of thin air.

+ +

Support for AMD AMF H.264 and H.265!

+ +

Even though I personally no longer own any AMD GPUs and have no plans to get one in the near future, StreamFX will support encoding via AMD AMF in the upcoming release. Note that this uses FFmpeg to do the work, so any bugs in the encoding result stem from AMD's work and should be reported to AMD. I can't fix the things AMD broke in their own driver, so your contact for bugs there is AMD.

+ +

New Shader Parameters: Random and RandomSeed

+ +

Custom shaders now have access to 16 randomly generated values, of which 4 are generated at creation, another 4 are generated for each activation, and the remaining 8 are generated each frame. These are all based on the user specified seed (accessible as RandomSeed), and as such won't differ from run to run unless the seed is changed. Shader developers can make use of these to implement transitions that look different each time they are used, for example the "Sliding Bars" shader has already been improved with this - it now has slightly different speeds and rotations each time the transition is invoked.
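To illustrate the three update rates, here is a small sketch of a seeded set of 16 values. How StreamFX derives its values internally is not described here, so everything in this example is an assumption: the class name, the method names, the use of `std::mt19937`, and the order of the values in the array are all made up for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <random>

// Hypothetical sketch, not the StreamFX implementation.
class shader_random {
    std::mt19937          _rng;
    std::array<float, 16> _values{};

    // Scale the raw 32-bit generator output into the 0..1 range.
    float next()
    {
        return static_cast<float>(_rng() / 4294967296.0);
    }

    void fill(std::size_t first, std::size_t count)
    {
        for (std::size_t i = first; i < first + count; ++i)
            _values[i] = next();
    }

public:
    // Values 0..3 are generated once, at creation, from the user's seed.
    explicit shader_random(std::uint32_t seed) : _rng(seed)
    {
        fill(0, 4);
    }

    // Values 4..7 are regenerated on every activation.
    void on_activate() { fill(4, 4); }

    // Values 8..15 are regenerated every frame.
    void on_frame() { fill(8, 8); }

    const std::array<float, 16>& values() const { return _values; }
};
```

Because everything is driven by one generator seeded with the user's seed, two runs that see the same sequence of activations and frames end up with the same values, which matches the "won't differ from run to run" behavior described above.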

+ +

Minor Changes

+ + + +

Conclusion

+ +

While there is still a lot to do and improve, the current changes already make a huge difference to before. I want to get features into a good state before teaming up with companies and integrating StreamFX officially into other software such as Streamlabs OBS, but it is slowly getting there.

diff --git a/_posts/2020/2020-08-28-what-happened-to-the-video-encoding-samples-project.html b/_posts/2020/2020-08-28-what-happened-to-the-video-encoding-samples-project.html new file mode 100644 index 0000000..f4f81d0 --- /dev/null +++ b/_posts/2020/2020-08-28-what-happened-to-the-video-encoding-samples-project.html @@ -0,0 +1,11 @@ +--- +title: What happened to the Video Encoding Samples project? +category: News +tags: ["Video Encoding Samples", "VES", "WordPress Archive"] +--- + +

Due to the new GPU generations being released by the two major vendors (and soon three major vendors), I've currently put the project on indefinite hold. The current discoveries still hold for all existing encoders, which makes newer tests unnecessary for the time being. Even the early runs have not resulted in different settings compared to before.

+ +

For the time being, I've left the old data online, while I quietly work on making a new, more user friendly version possible. Perhaps I will even allow user submissions in order to increase the number of tested GPUs, but that requires a lot of hosting storage.

+ +

For those that have not been around for a long time, the Video Encoding Samples project is a simple database of encoding results compared to the original footage with PSNR, SSIM and VMAF. It is the project that has resulted in the ultimate NVENC settings, and also resulted in a lot of yelling at a certain vendor to finally stop dawdling around.

diff --git a/_posts/2020/2020-09-03-nvidia-rtx-30xx-how-to-make-everything-else-obsolete-in-one-generation.html b/_posts/2020/2020-09-03-nvidia-rtx-30xx-how-to-make-everything-else-obsolete-in-one-generation.html new file mode 100644 index 0000000..4ef92e3 --- /dev/null +++ b/_posts/2020/2020-09-03-nvidia-rtx-30xx-how-to-make-everything-else-obsolete-in-one-generation.html @@ -0,0 +1,35 @@ +--- +title: "NVIDIA RTX 30xx: How to make everything else obsolete in one generation" +category: Blog +tags: ["NVIDIA", "WordPress Archive"] +--- + +

NVIDIA certainly wasn't idle in the last two years, that much is clear. Their jump from 12nm to 8nm should set the average standard for what we should expect from moving nodes while also improving on the generation. This generational leap is what we should have seen from the 20xx series, which now seems like overpriced junk - so sorry for anyone who bought them in the last 6 months and can't return them. Let's go into a bit of history and detail.

+ +

The AMD side: Shrinking 14nm to 7nm

+ +

Three years ago in 2017, AMD RTG tried to even the playing field by moving from 14nm to 7nm, and succeeded. Their new RX Vega generation, while extremely power hungry, did improve performance across the board by roughly 30-75%, depending on what you looked at. And in 2019 they improved on that, with the RX 5000 series - except this time we saw practically no (<5%) performance increase, but they did cut down on heat generation and power draw quite a lot.

+ +

Unfortunately AMD RTG forgot to improve all the other areas: drivers are still pretty bad, video encoding is even worse, and the feature set is still lacking at the price point. Their top of the line RX 5700 XT at ~350,- € is on par with an RTX 2070 at ~350,- €, while providing no raytracing (RTX), machine learning acceleration (Tensor), or good video encoding and decoding support. AMD RTG bet on the competition not moving forward much, which is their signature move at this point.

+ +

The NVIDIA side: Shrinking 12nm to 8nm

+ +

And in comes NVIDIA and completely shatters that bet with their RTX 30xx launch, showing a nearly 100% improved performance across the board compared to the 20xx series - without actually increasing the TDP by much. Everything was upgraded, except for NVENC which is already pretty damn good, so not only do you get double the performance for less money, you also get new features on top of that.

+ +

One of the new features is AV1 decoding support for up to 8K60 HDR content, which paves the way for future content production and consumption. The other is RTX IO, a way to do on-GPU decoding and decompression of content, which offloads the decoding and decompression of textures from the CPU to the GPU - eliminating a transfer between nodes in the ideal case. We can only wait and see what the future has to offer, and if there's going to be Ti or Super models once again (though I highly doubt it this time).

+ +

The Weird Parts: PAM4 memory signaling instead of NRZ

+ +
The new GDDR6X signaling.
+ +

This slide actually doesn't make a lot of sense to the generic end-user, but it is one of the most important parts of everything. Basically NVIDIA doubled the data rate of their memory interface, without actually increasing the frequency. Instead of transmitting 1 bit per cycle, it's now transmitting 2 bits per cycle.
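As a rough sketch of the arithmetic behind that slide: the number of bits carried per symbol is the base-2 logarithm of the number of signal levels, so NRZ (2 levels) carries 1 bit and PAM4 (4 levels) carries 2. The function names and the example symbol rate are made up for illustration and are not taken from NVIDIA's material.

```javascript
// Hypothetical helpers, only meant to illustrate the NRZ vs. PAM4 arithmetic.

// Bits carried by one symbol that can take `levels` distinct signal levels.
function bitsPerSymbol(levels) {
  return Math.log2(levels);
}

// Raw data rate in bits per second for a given symbol rate.
function dataRate(symbolsPerSecond, levels) {
  return symbolsPerSecond * bitsPerSymbol(levels);
}

// Assumed example symbol rate; not a real GDDR6X figure.
const exampleSymbolRate = 10e9;

console.log('NRZ :', dataRate(exampleSymbolRate, 2), 'bit/s');
console.log('PAM4:', dataRate(exampleSymbolRate, 4), 'bit/s');
```

At the same symbol rate, the PAM4 figure comes out at exactly twice the NRZ one, which is the whole point of the change.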

+ +

In order to do this, the 30xx series needs good enough power isolation and must be unaffected by sudden power drops caused by other devices or even itself - a massive downside that increases the price and has its own limitations. It also likely means that the 30xx series will not overclock as high as the previous GPUs would, due to the memory being much more sensitive to voltage differences than before.

+ +

To put it in terms everyone understands, the new way of accessing memory allows NVIDIA to literally do more with less. Increasing the amount of bits transferred per cycle from 1 to 2 is quite literally doubling the effective data rate, so the same amount of data takes half the time to transfer. A game that is limited purely by memory bandwidth, for example one with very large textures, could in the best case run up to twice as fast from this change alone.

+ +

My thoughts on the RTX 30xx Series

+ +

Raytracing has been really fun to play with, and even my 2080 Ti manages around 20-21 fps at 2560x1440, with all effects turned on. So I'm really excited to see a GPU that can do twice that and push the raytraced view to an actual playable amount of frames, without the need of DLSS - DLSS is nice and all, but it does have very obvious issues, even in DLSS 2.0.

+ +

I'll be waiting for AIB partners to come up with their own designs, and then see real world performance from reviewers such as Linus Tech Tips, Gamers Nexus and similar. So until I see any of that, I'm sticking with my tuned RTX 2080 Ti.

diff --git a/_posts/2020/2020-11-21-a-review-gainward-rtx-3090-phantom.html b/_posts/2020/2020-11-21-a-review-gainward-rtx-3090-phantom.html new file mode 100644 index 0000000..315cec8 --- /dev/null +++ b/_posts/2020/2020-11-21-a-review-gainward-rtx-3090-phantom.html @@ -0,0 +1,283 @@ +--- +title: "A Review: Gainward RTX 3090 Phantom" +category: Review +tags: ["NVIDIA", "WordPress Archive"] +--- + + +
+
Melted PCB
+
+ +

Around the end of last week, my Alphacool waterblock decided that it was time to kill the NVIDIA RTX 2080 Ti Founders Edition it was placed on. That was the day I learned that burning PCB and plastic smells the same as coal - and that I should probably replace my smoke detectors since they didn't go off at all.

+ +

That meant I needed a new GPU, and after a bit of searching for actually available GPUs, I ended up going for the 3090 cards - nobody apparently has 3080s, only 3070s and 3090s. The card I ended up with is the Gainward RTX 3090 Phantom, which has some limitations but otherwise works well. Let's get into the hard stuff.

+ +

The GPU

+ +
+
Resting against the reservoir
+
+ +

The first thing that came to mind when unboxing this card is "What a monster, will that even fit?". And it barely does, if it were 1mm longer I would have had to return this card, and wait for another one to be available. This card is so long it blocks about 60% of my cable management, and sits directly against the EKWB reservoir+pump combination - longer than the previous 2080 Ti, which was already pretty long to begin with.

+ +

The card requires three separate 8-pin rails, ships with a dual-IDE to 8-pin adapter and even has a laser-cut acrylic anti-sag part. This anti-sag part seems to have been made for a completely different GPU, as its default position - or any of the possible ones - would be inside of a fan. This also means that it needs 6 entire slots to keep itself in a reasonable position if mounted like in the picture to the right.

+ +

The default BIOS on it disallows undervolting (but allows overvolting, surprisingly enough), comes with a power limit of 100/370/420 W (Min/Std/Max), and supports a weird number of memory chips that just seem to make no sense at all. At the time of purchase the card was sold for 1749,00€, which puts its Price per Watt at 17.49/4.72/4.37 €. I've uploaded the BIOS to TechPowerUp for verification.

+ +
+
The Gainward RTX 3090 Phantom, in all its massive glory. To the left is the air cooler of the RTX 2080 Ti FE.
+
+ +

Testing

+ +

As usual, testing was performed on my main system, with the following configuration: AMD Ryzen 9 3950X (16/32, Stock, PBO Enabled, Watercooled), 4x G.Skill TridentZ Neo 16GB 3.6GHz (3.2GHz, 15-15-15-15-30-45), Gigabyte Aorus X570 Master, 2x1TB SAMSUNG 960 Pro, 2x2TB SanDisk Ultra 3D and a 1000W be Quiet! Pure Power 11. The Driver version used is 457.30, and the average room temperature was measured at 22°C ±1.5°C, which slightly varies from run to run.

+ +

Tests were performed 10 times in sequence, with only the worst 3 results taken and averaged together to create a realistic experience. All games were run at the highest possible settings, which in some games means to actually deviate from the default profile. VSync, GSync and FreeSync were turned off during testing. FrameView was used where possible to measure performance accurately, and the power limit was left at 100%.
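A minimal sketch of that aggregation step, assuming "worst" means the lowest FPS value and that each run yields a single number. The function name and the sample values are hypothetical and are not measurements from this review.

```javascript
// Average the `keep` worst (lowest) results out of a series of runs.
function worstRunsAverage(results, keep = 3) {
  // Copy before sorting so the caller's array is left untouched.
  const worst = [...results].sort((a, b) => a - b).slice(0, keep);
  return worst.reduce((sum, value) => sum + value, 0) / worst.length;
}

// Made-up average FPS values for 10 runs of one game.
const exampleRuns = [100, 90, 110, 95, 105, 120, 98, 102, 99, 101];
console.log(worstRunsAverage(exampleRuns).toFixed(2));
```

Taking the worst runs instead of the best ones biases the result towards what a bad day with the card looks like, which fits the "realistic experience" goal.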

+ +

Results at 1920x1080

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Game | Min FPS | Avg FPS | 99%ile FPS | €/Frame
Anno 1800 (DirectX 12) | 0.57 [2] | 153.63 | 33.55 | 11.38
Assassin's Creed Odyssey | 7.27 | 98.37 | 66.94 | 17.78
Borderlands 3 (DirectX 11) | 13.36 | 116.81 | 89.54 | 14.97
Borderlands 3 (DirectX 12) | 23.16 | 150.97 | 130.62 | 11.59
Far Cry 5 (DirectX 11) | 29.28 | 117.93 | 87.65 | 14.83
Forza Horizon 4 | 160.40 | 186.30 | N/A [1] | 9.39
Tom Clancy's Ghost Recon Breakpoint (Vulkan) | 52.11 | 160.67 | 105.27 | 10.89
Tom Clancy's Rainbow Six (Vulkan) | 34.30 | 193.07 | 164.20 | 9.06
Tom Clancy's The Division 2 (DirectX 12) | 59.96 | 119.97 | 99.94 | 14.58
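The €/Frame column is not defined in the text, but its values are consistent with dividing the purchase price of 1749 € by the average FPS. A minimal sketch under that assumption (the function name is hypothetical):

```javascript
// Assumption: €/Frame = purchase price divided by average FPS.
function euroPerFrame(priceEuro, averageFps) {
  return priceEuro / averageFps;
}

const price = 1749; // Purchase price in €, as stated in the review.

// Average FPS values taken from the 1920x1080 results.
console.log(euroPerFrame(price, 153.63).toFixed(2)); // Anno 1800
console.log(euroPerFrame(price, 98.37).toFixed(2));  // Assassin's Creed Odyssey
```

Lower is better here: a card that delivers more frames for the same money ends up with a smaller number.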
+ +

Results at 2560x1440

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Game | Min FPS | Avg FPS | 99%ile FPS | €/Frame
Anno 1800 (DirectX 12) | 24.47 | 122.61 | 757.98 | 14.26
Assassin's Creed Odyssey | 7.54 | 88.79 | 62.48 | 19.70
Borderlands 3 (DirectX 11) | 35.42 | 110.84 | 80.04 | 15.78
Borderlands 3 (DirectX 12) | 26.33 | 122.19 | 101.67 | 14.31
Far Cry 5 (DirectX 11) | 48.97 | 112.80 | 86.63 | 15.51
Forza Horizon 4 | 130.80 | 153.30 | N/A [1] | 11.41
Tom Clancy's Ghost Recon Breakpoint (Vulkan) | 48.33 | 132.71 | 99.35 | 13.18
Tom Clancy's Rainbow Six (Vulkan) | 32.90 | 115.80 | 101.49 | 15.10
Tom Clancy's The Division 2 (DirectX 12) | 46.39 | 119.60 | 93.56 | 14.62
+ +

Synthetic Benchmarks

+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Game | Score
3D Mark Fire Strike Ultra (DirectX 11) - Graphics Score | 11910
3D Mark Port Royal (DirectX 12, DXR) - Graphics Score | 12694
3D Mark Time Spy (DirectX 12) - Graphics Score | 19596
3D Mark Time Spy Extreme (DirectX 12) - Graphics Score | 9960
Final Fantasy XIV (DirectX 11, 1920x1080, Maximum) | 20290
Final Fantasy XIV (DirectX 11, 2560x1440, Maximum) | 18807
Unigine Superposition (DirectX 11, 1080p Extreme) | 12296
Unigine Superposition (DirectX 11, 4K Optimized) | 16055
Unigine Superposition (DirectX 11, 8K Optimized) | 7174
+ + + +
    +
  1. Unable to hook with third-party tools, as the game crashes immediately.
  2. Identical behavior across 10 runs; the game runs extremely poorly.
+ + + +

Other Notes

+ + + + +

Conclusion

+ +

Don't waste your money on the Gainward RTX 3090 Phantom. It performs poorly - often being beaten by an MSI/ASUS/EVGA/Gigabyte RTX 3080 - and constantly overheats, causing it to throttle to performance below an RTX 3080. Wait for one of the well-known and reputable brands to have stock available, as you will get a far better deal out of them.

+ +

In short

+ + + + + + diff --git a/_posts/2020/2020-12-08-fastest-uint8array-to-hex-string-conversion-in-javascript.html b/_posts/2020/2020-12-08-fastest-uint8array-to-hex-string-conversion-in-javascript.html new file mode 100644 index 0000000..6b80880 --- /dev/null +++ b/_posts/2020/2020-12-08-fastest-uint8array-to-hex-string-conversion-in-javascript.html @@ -0,0 +1,260 @@ +--- +title: "Convert Uint8Array to Hex quickly in JS" +category: Blog +tags: ["JavaScript", "TypedArray", "WordPress Archive"] +--- + +

As a Programmer I have to deal with a number of programming languages to write code, and one language that repeatedly appears is JavaScript. JavaScript is one of the weirder languages - similar to PHP in weirdness - which makes it an interesting experience to say the least. Most of the time you're at the whim of a grey box compiler, due to the massive variance of Browsers and Devices that the users use.

+ +

So in order to best approach reality, I have to figure out which APIs are available at any point in time, and also run performance benchmarks in the current major browsers available to me. And that's what today's post is about: finding which of the various methods is fast enough for high performance use.

+ +

The different Methods

+ + + +

Like any other programming language, there are infinite ways to reach the same solution. Some slower, some unreadable and some look like magic. Here are all the unique ones that I could find or come up with, excluding those which did not even manage to convert more than 1000 buffers per second on current generation hardware:

+ + + +

All code below is under the BSD 3-Clause license.
Copyright 2020 Michael Fabian 'Xaymar' Dirks <info@xaymar.com>

+ + + +

Method #1: Array.map() with String.slice()

function toHex(buffer) {
  return Array.prototype.map.call(buffer, x => ('00' + x.toString(16)).slice(-2)).join('');
}

While this one looks complex at first, it's actually just calling the map method of a different class on a different object, which just so happens to work. The rest is simple string modification and then joining the entire array to a string.

+ +

Method #2: Array.map() with String.padStart()

function toHex(buffer) {
  return Array.prototype.map.call(buffer, x => x.toString(16).padStart(2, '0')).join('');
}

Same idea as #1, just optimizing the string operations slightly.

+ +

Method #3: Array.map() with 4-bit LUT and StringBuilder

// Pre-Init
const LUT_HEX_4b = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'];
// End Pre-Init
function toHex(buffer) {
  return Array.prototype.map.call(buffer, x => `${LUT_HEX_4b[(x >>> 4) & 0xF]}${LUT_HEX_4b[x & 0xF]}`).join('');
}

This approach uses a precomputed look-up-table (LUT) to convert any 4-bit value to a hexadecimal symbol.

+ +

Method #3.1: Array.map() with 4-bit LUT and String Concat

// Pre-Init
const LUT_HEX_4b = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'];
// End Pre-Init
function toHex(buffer) {
  return Array.prototype.map.call(buffer, x => (LUT_HEX_4b[(x >>> 4) & 0xF] + LUT_HEX_4b[x & 0xF])).join('');
}

Method #4: Array.map() with 8-bit LUT

// Pre-Init
const LUT_HEX_4b = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'];
const LUT_HEX_8b = new Array(0x100);
for (let n = 0; n < 0x100; n++) {
  LUT_HEX_8b[n] = `${LUT_HEX_4b[(n >>> 4) & 0xF]}${LUT_HEX_4b[n & 0xF]}`;
}
// End Pre-Init
function toHex(buffer) {
  return Array.prototype.map.call(buffer, x => LUT_HEX_8b[x]).join('');
}

Same idea as #3, but with a LUT to convert any 8-bit value to a hexadecimal symbol group.

+ +

Method #5: Array.push() with 8-bit LUT

// Pre-Init
const LUT_HEX_4b = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'];
const LUT_HEX_8b = new Array(0x100);
for (let n = 0; n < 0x100; n++) {
  LUT_HEX_8b[n] = `${LUT_HEX_4b[(n >>> 4) & 0xF]}${LUT_HEX_4b[n & 0xF]}`;
}
// End Pre-Init
function toHex(buffer) {
  const out = new Array();
  for (let idx = 0; idx < buffer.length; idx++) {
    out.push(LUT_HEX_8b[buffer[idx]]);
  }
  return out.join('');
}

Breaking out from the same idea is #5, which builds an array manually instead of letting the JavaScript runtime handle it for us. This also uses the LUT approach.

+ +

Method #5.1: Array.set() with 8-bit LUT

// Pre-Init
const LUT_HEX_4b = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'];
const LUT_HEX_8b = new Array(0x100);
for (let n = 0; n < 0x100; n++) {
  LUT_HEX_8b[n] = `${LUT_HEX_4b[(n >>> 4) & 0xF]}${LUT_HEX_4b[n & 0xF]}`;
}
// End Pre-Init
function toHex(buffer) {
  const out = new Array(buffer.length);
  for (let idx = 0; idx < buffer.length; ++idx) {
    out[idx] = (LUT_HEX_8b[buffer[idx]]);
  }
  return out.join('');
}

Method #6: String Concat with 4-bit LUT and StringBuilder

// Pre-Init
const LUT_HEX_4b = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'];
// End Pre-Init
function toHex(buffer) {
  let out = '';
  for (let idx = 0; idx < buffer.length; idx++) {
    let n = buffer[idx];
    out += `${LUT_HEX_4b[(n >>> 4) & 0xF]}${LUT_HEX_4b[n & 0xF]}`;
  }
  return out;
}

Similar to #5, but this time we directly build a string instead of building an array first.

+ +

Method #6.1: String Concat with 4-bit LUT (String += String + String)

// Pre-Init
const LUT_HEX_4b = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'];
// End Pre-Init
function toHex(buffer) {
  let out = '';
  for (let idx = 0; idx < buffer.length; idx++) {
    let n = buffer[idx];
    out += LUT_HEX_4b[(n >>> 4) & 0xF] + LUT_HEX_4b[n & 0xF];
  }
  return out;
}

Method #6.2: String Concat with 4-bit LUT (2x String += String)

// Pre-Init
const LUT_HEX_4b = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'];
// End Pre-Init
function toHex(buffer) {
  let out = '';
  for (let idx = 0; idx < buffer.length; idx++) {
    let n = buffer[idx];
    out += LUT_HEX_4b[(n >>> 4) & 0xF];
    out += LUT_HEX_4b[n & 0xF];
  }
  return out;
}

Method #7: String Concat with 8-bit LUT

+ +

Similar to #6, but with an 8-bit LUT. This is effectively this StackOverflow answer, just much cleaner.

// Pre-Init
const LUT_HEX_4b = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'];
const LUT_HEX_8b = new Array(0x100);
for (let n = 0; n < 0x100; n++) {
  LUT_HEX_8b[n] = `${LUT_HEX_4b[(n >>> 4) & 0xF]}${LUT_HEX_4b[n & 0xF]}`;
}
// End Pre-Init
function toHex(buffer) {
  let out = '';
  for (let idx = 0, edx = buffer.length; idx < edx; idx++) {
    out += LUT_HEX_8b[buffer[idx]];
  }
  return out;
}
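One detail worth checking before swapping one method for another: #1 and #2 produce lowercase digits because they rely on toString(16), while the LUT-based methods produce uppercase digits. The sketch below is a small sanity check for that, not part of the benchmark. The function names are hypothetical, and the table is built with a shortcut (Array.from) instead of the Pre-Init loop.

```javascript
// 8-bit LUT with uppercase digits, built via a shortcut for brevity.
const HEX_LUT = Array.from({ length: 0x100 }, (_, n) =>
  n.toString(16).toUpperCase().padStart(2, '0'));

// Same idea as method #7: string concatenation with an 8-bit LUT.
function toHexLut(buffer) {
  let out = '';
  for (let idx = 0, edx = buffer.length; idx < edx; idx++) {
    out += HEX_LUT[buffer[idx]];
  }
  return out;
}

// Same idea as method #2: Array.map() with String.padStart().
function toHexPadStart(buffer) {
  return Array.prototype.map.call(buffer, x => x.toString(16).padStart(2, '0')).join('');
}

const sample = new Uint8Array([0x00, 0x0f, 0xa5, 0xff]);
console.log(toHexLut(sample), toHexPadStart(sample));
console.log(toHexLut(sample) === toHexPadStart(sample).toUpperCase());
```

If the consumer of the hex string is case-sensitive, either lowercase the LUT entries or normalize the output once at the end.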

Omitted Methods

+ + + +

The Results

+ +

As usual, tests were run on my daily available machines, mainly the 3950X gaming/development PC. These tests were run using JSBench.me, as JSBen.ch had wildly fluctuating results in both Chrome and Firefox. Without any further needless text, here are the results:

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Method | Ops/s | % Slower
#1: Array.map() with String.slice() | 15589.45 ± 1.03% | 94.16 %
#2: Array.map() with String.padStart() | 17072.61 ± 0.81% | 93.60 %
#3: Array.map() with 4-bit LUT and StringBuilder | 34887.21 ± 0.31% | 86.93 %
#3.1: Array.map() with 4-bit LUT and String Concat | 35465.37 ± 0.37% | 86.71 %
#4: Array.map() with 8-bit LUT | 48936.74 ± 0.70% | 81.66 %
#5: Array.push() with 8-bit LUT | 46378.04 ± 0.55% | 82.62 %
#5.1: Array.set() with 8-bit LUT | 59356.56 ± 0.59% | 77.76 %
#6: String Concat with 4-bit LUT and StringBuilder | 71194.39 ± 0.44% | 73.32 %
#6.1: String Concat with 4-bit LUT (String += String + String) | 106905.18 ± 0.62% | 59.94 %
#6.2: String Concat with 4-bit LUT (2x String += String) | 135382.25 ± 0.58% | 49.27 %
#7: String Concat with 8-bit LUT | 266856.91 ± 0.54% | 0.00 %
+
Tests performed in Mozilla Firefox 83.0 (64-bit) on an AMD Ryzen 3950X with 64GB memory. Similar results were observed in Google Chrome 87.0.4280.88.
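For anyone re-running the benchmark, the "% Slower" column appears to be each method's Ops/s relative to the fastest entry. This is inferred from the numbers rather than from JSBench.me's documentation, and the function name is hypothetical:

```javascript
// Assumption: % slower = (1 - ops / fastestOps) * 100.
function percentSlower(opsPerSecond, fastestOpsPerSecond) {
  return (1 - opsPerSecond / fastestOpsPerSecond) * 100;
}

const fastest = 266856.91; // Method #7 in the table.

console.log(percentSlower(15589.45, fastest).toFixed(2));  // Method #1
console.log(percentSlower(135382.25, fastest).toFixed(2)); // Method #6.2
```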
+
+ +

Much to my surprise, the String concatenation ones came out on top. Both ended up being roughly 400% faster than their Array based counterparts, which is totally unexpected in this situation. This seems to point at the Array.join() function being poorly implemented in every JavaScript engine, resulting in massive slow downs where barely any should be.

+ +

The results slightly differed between Chrome and Firefox on Desktop, where Chrome performed much worse in tests #1, #2, #3, #3.1 and #4, and better in #5 and #5.1. The same relative performance numbers were observed on mobile in both browsers, which most likely also extend to the Apple platforms. Anyway, with all that text out of the way, it's safe to say that method #7 won the contest, by a large margin - even on mobile.

+ +

- Xaymar

diff --git a/_posts/2020/2020-12-23-accidental-nvenc-discoveries.html b/_posts/2020/2020-12-23-accidental-nvenc-discoveries.html new file mode 100644 index 0000000..753e756 --- /dev/null +++ b/_posts/2020/2020-12-23-accidental-nvenc-discoveries.html @@ -0,0 +1,59 @@ +--- +title: Accidental NVENC Discoveries +category: Blog +tags: ["NVIDIA", "NVENC", "H264", "H265", 'WordPress Archive'] +--- + +

While testing new updates to the VES testing suite, I discovered some weird behavior in NVENC. Here's a list of them, maybe NVIDIA can shed some light on it:

+ +

G-Sync affects NVENC Encoding Speed

+ +

For unknown reasons G-Sync affects the rate of encoding provided by NVENC, no matter how you submit frames to it. So in case you're hitting "Encoding Overloaded" in OBS Studio for no actual reason, try disabling G-Sync globally.

+ +

Constant Quality is a lie?

+ +

The "Constant Quality" encoding method, often used for archival and also often mistaken for an alternative to x264 CRF, has a default bitrate limit. For H.264/AVC the maximum bitrate it will pick is 135mbit/s, while for H.265/HEVC it is 25mbit/s. You can affect these limits by explicitly setting the maxBitRate ("-maxrate" in FFmpeg) and vbvBufferSize ("-bufsize" in FFmpeg) to a higher value.

+ + + +

Please note that doing this means that you will need a more modern decoder to view or edit the footage. I still recommend to go with the highest possible maximum bitrate in order to get the best out of your footage.
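A sketch of what raising these limits could look like when assembling an FFmpeg command line from a script. The -maxrate and -bufsize options are the ones named above; -rc vbr, -cq and -b:v 0 are a common way to request constant quality from FFmpeg's NVENC encoders. The helper name, file names and all concrete values are placeholders, not recommendations.

```javascript
// Hypothetical helper that builds the encoder-related FFmpeg arguments.
function nvencConstantQualityArgs({ codec, cq, maxrate, bufsize }) {
  return [
    '-c:v', codec,       // e.g. 'h264_nvenc' or 'hevc_nvenc'
    '-rc', 'vbr',        // variable bitrate rate control
    '-cq', String(cq),   // target quality level
    '-b:v', '0',         // no average bitrate target
    '-maxrate', maxrate, // raises the maximum bitrate limit
    '-bufsize', bufsize, // raises the VBV buffer size
  ];
}

// Placeholder values, chosen only to show the shape of the command.
const encoderArgs = nvencConstantQualityArgs({
  codec: 'hevc_nvenc',
  cq: 20,
  maxrate: '200M',
  bufsize: '400M',
});

console.log(['ffmpeg', '-i', 'input.mkv', ...encoderArgs, 'output.mkv'].join(' '));
```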

+ +

H.265/HEVC struggles representing Foliage

+ +

Even with the above maximum bitrate fix applied, the HEVC NVENC encoder has a weird affinity to just murder foliage for no actual reason. It seems that there is a noise pattern detection method that freaks out once the foliage gets too large, and fails to adapt to the fact that the foliage now has even more detail than before.

diff --git a/_posts/2020/2020-12-23-achieving-ideal-recording-quality.html b/_posts/2020/2020-12-23-achieving-ideal-recording-quality.html new file mode 100644 index 0000000..0b872f7 --- /dev/null +++ b/_posts/2020/2020-12-23-achieving-ideal-recording-quality.html @@ -0,0 +1,7 @@ +--- +title: A guide to achieving high quality Recordings +category: Blog +tags: ["Tutorial", "NVENC", "OBS Studio", 'WordPress Archive'] +--- + +

Ever since publishing the guide on how to achieve the best possible NVIDIA NVENC quality with FFmpeg 4.3.x and below, people have repeatedly asked me what the best possible recording settings are. So today, as a Christmas present, let me answer this question to the best of my knowledge and help all of you achieve a quality you've never seen before. Read the full guide here.

diff --git a/_posts/2020/2020-12-28-av1-the-current-future-of-video.html b/_posts/2020/2020-12-28-av1-the-current-future-of-video.html new file mode 100644 index 0000000..06af890 --- /dev/null +++ b/_posts/2020/2020-12-28-av1-the-current-future-of-video.html @@ -0,0 +1,927 @@ +--- +title: "AV1: The current future of Video" +category: Blog +tags: ["AV1", "H264", "H265", "x265", "NVENC", "WordPress Archive"] +--- + +

It has been a while since I last checked out AV1, but even then AV1 was still dominating in quality and compression. Now it's time to revisit the tests I've done back then, so grab a coffee, take a seat, cause this is going to be one long ride.

+ +

Setup

+ +

Today we'll test H264 via x264 veryslow as well as NVENC, HEVC via NVENC, and AV1 via libsvtav1. x265 was excluded due to unclear licensing and patent situation, and VP9 was excluded because none of the current encoders reach any reasonable encoding performance. The settings used for each encoder are:

+ + + +

Settings were intentionally set up for VBR encodes instead of CBR, as CBR streaming should no longer be used in the modern day - VBR is superior with a proper streaming protocol.

+ +

Note: If your browser does not use hardware decoders and the developers have refused to do so, you might have to opt for the classic right click and download method. If you don't know how to do that, perhaps now is the time to experiment a little - or use Google.

+ +

Videos

+ +

If you're not yet familiar with my Video Encoding Samples, it is my attempt at making a reproducible video encoding and testing environment, producing results in a few hours with as many tests performed as possible. For this test, I'm using footage from the sample videos provided here.

+ +

Bitrates

+ +

Of interest to me is how well each codec handles extremes, whether it is the content being extremely difficult to encode, or the bitrate being extremely low. Multiple examples will be provided at different bitrates, each annotated to show which bitrate it is. The bitrates used are: 1.0mbit, 2.0mbit, 3.5mbit and 6.0mbit.

+ +

To no one's surprise, AV1 is on top in every single test. The distance between H.264, HEVC and AV1 only increases with reduced bitrates, but don't let me tell you what to think, instead look at the results themselves:

+ +

ARMA 3 002: "Short walking section through heavy foliage"

+ +

6.0mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

H.264 is fighting with all it's got, but it's not enough. HEVC and AV1 dominate here in quality, with AV1 having the advantage of being a software encoder.

+ + + +

3.5mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

H.264 is already waving the white flag, but HEVC is still holding up. AV1 is ready for us to actually begin the fight, as it has yet to actually give it 10% of what it's capable of.

+ + + +

2.0mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

H264 has been reduced to nothing more than its bones, while HEVC is still fighting but struggling to maintain composure. AV1 however is now sweating, regretting its earlier decision to ask us to turn up the heat.

+ + + +

1.0mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

Every encoder is now waving the white flag. I wouldn't want to watch this.

+ + + + + + + +

Black Mesa 001: "Fighting Section in Labs"

+ + + +

6.0mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

H.264 manages to be watchable here, only lacking in detail and sharpness, and occasionally becoming noisy/blocky. HEVC is much sharper and detailed, but has issues in combat. AV1 is apparently unfazed by what's going on.

+ + + +

3.5mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

It is clear that this is taxing on H.264 encodes already, but they still remain watchable despite the obvious lack of quality. HEVC is also now struggling to maintain overall quality, having dropped in detail and sharpness. AV1 is laughing at us.

+ + + +

2.0mbit

+ + + +
+
+
+ + + +
+
+ + + +
+
+ + + +
+
+
+ + + +

Both H.264 encoders are now unable to maintain watchable quality, and even HEVC is barely making ends meet. AV1 still has stopped laughing.

+ + + +

1.0mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

HEVC and H.264 are giving up, waving the white flag. AV1 is now taunting us with the classic "Is that all you got? Pathetic." line, while visibly sweating.

+ + + + + + + +

Final Fantasy XIV 002: "The Singularity Reactor - King Thordan Phase Transition"

+ + + +

6.0mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

Both H.264 and HEVC struggle to maintain quality, while AV1 is still really watchable.

+ + + +

3.5mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

The trend continues, with AV1 being the only really watchable one.

+ + + +

2.0mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

NVENC H.264 has given up, and is encoding at 3.1mbit, x264 is unwatchable and HEVC is getting close to its limit. AV1 is still watchable, but definitely would not meet my quality standards.

+ + + +

1.0mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

Every encode is practically trash now, AV1 is watchable, but not by a lot.

+ + + + + + + +

Forza 4 Horizon 001: "Racing at Dusk"

+ + + +

6.0mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

Straight from the start, HEVC and AV1 set themselves apart from H.264 with sharpness and detail. AV1 even one-ups HEVC here.

+ + + +

3.5mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

The ancient H.264 codec is now bordering on unwatchable, with much blocking and noise. HEVC is starting to come apart at the seams, but still maintaining reasonable quality. AV1 is still flexing on the other codecs here.

+ + + +

2.0mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

The heap of bones you can see here is called H.264, which has reverted to its most basic form - the bitstream. H265 is also struggling hard with this bitrate, with nearly all of the foliage having lost any detail and visible blocking all over the place. AV1 maintains smoothness, but also starts to have issues with foliage.

+ + + +

1.0mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

There's still a pile of bones called H264, and H265 is currently burning at the stake. AV1 is avoiding most of the fire, but is still hurting and not in perfect condition anymore.

+ + + + + + + +

GRIP 001: "Racing on Space Station"

+ + + +

6.0mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

Only HEVC and AV1 manage to encode this at a reasonable quality, due to the post-processing grain this game adds by default. HEVC is definitely struggling with motion however, and AV1 is visibly better than HEVC.

+ + + +

3.5mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

The white flag has been raised by both H264 encoders, and HEVC is fighting with all it's got. AV1 is not impressed so far.

+ + + +

2.0mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

Again both H264 encoders are waving the white flag (🏳), and HEVC is no longer watchable. AV1 is still strong, but definitely hitting its limits as well.

+ + + +

1.0mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

Once again, NVENC H264 is cheating and using more bitrate than allowed, but even then it's still not enough for anything watchable. HEVC has given up and even AV1 is having issues - but they might be resolved at higher presets, as this preset doesn't enable grain synthesis.

+ + + + + + + +

GRIP 002: "Race through orange desert"

+ + + +

6.0mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

HEVC is visibly better than H264 already, with increased detail and clarity. AV1 doesn't differ much from HEVC at this bitrate.

+ + + +

3.5mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

HEVC and AV1 are the only codecs I'd consider watchable here, with AV1 keeping slightly more clarity in the image than HEVC.

+ + + +

2.0mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

You already know the deal, H264 is not capable of anything at this bitrate. AV1 and HEVC are definitely struggling here, but AV1 wins in pure clarity again.

+ + + +

1.0mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

Everything is pretty much just a mess of blocks at this point. HEVC and AV1 maintain some level of clarity, enough to see what's going on at least. AV1 is better than HEVC only in some areas, falling behind or equalling it in others.

+ + + + + + + +

Satisfactory 001: "Exploration at Night"

+ + + +

6.0mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

Pretty much all encoders are having issues, with HEVC and AV1 still maintaining watchable quality.

+ + + +

3.5mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

Still a similar situation with only HEVC and AV1 maintaining watchable quality. AV1 seems to be slightly faster in recovery, but that's most likely due to software encoding being used.

+ + + +

2.0mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

AV1 is starting to show its strengths again, with significantly reduced blocking in motion. H264 and H265 are not that great here anymore.

+ + + +

1.0mbit

+ + + +
+
+
x264
+ + + +
h264_nvenc
+
+ + + +
+
libsvtav1
+ + + +
hevc_nvenc
+
+
+ + + +

At 1mbit, none of the codecs deliver reasonable quality, though I'd still call AV1 much better than any of the other ones.

+ + + + + + + +

Ending Words

+ +

From the results here, it seems like AV1 will take over as the one codec to rule them all. While we do not yet have real-time capable encoding at high resolutions and framerates on consumer-grade computers, this will most likely change within one GPU generation. Hardware decoding for AV1 is already confirmed to be available in the NVIDIA 30xx Series, AMD 6xxx Series, and seemingly the new Intel iGPUs.

+ +

That's all for now.

diff --git a/_posts/2021/2021-03-17-a-look-back-creating-a-vst-2-x-plug-in-from-nothing-part-1.html b/_posts/2021/2021-03-17-a-look-back-creating-a-vst-2-x-plug-in-from-nothing-part-1.html new file mode 100644 index 0000000..0908e7a --- /dev/null +++ b/_posts/2021/2021-03-17-a-look-back-creating-a-vst-2-x-plug-in-from-nothing-part-1.html @@ -0,0 +1,70 @@ +--- +title: "A Look Back: Creating a VST 2.x Plug-In from Nothing (Part 1)" +category: Blog +tags: ["VoiceFX", "VST", "VST 2.x", "WordPress Archive"] +--- + + +

When I started with VoiceFX, my original goal was to only support VST 3.x, as it was the most modern version of the SDK, and surely by now every important software had moved to it. Unfortunately I didn't account for the occasional big shot releasing a modern product with a relatively ancient version of the SDK - an SDK that no longer officially exists. So what do you do in this situation?

+ +

You do what every other totally sane developer does and start a clean-room reverse engineering project for the now abandoned VST 2.x SDK, staying faithful to the law. And finally, after roughly 5 months of development, I managed to make it work. So how did I get there?

+ +

Establishing the Boundaries

+ +

When trying to stay within the law, you have to establish clear boundaries. Especially for clean-room reverse engineering, I had to read into quite a few lawsuits about it to figure out what is and is not allowed. In the end, I had to actually ask a lawyer for advice, which came with some costs, and the advice I got boiled down to this:

+ +
    +
  1. Any work performed must be for the purpose of interoperability.
  2. Don't use any original material or third-party source material that is not clearly set apart from the original.
  3. Avoid the use of reverse engineering tools where possible.
+ +

These rules sound simple, but they ended up making the project a living hell. I had to rely only on information that was clearly detached from the applications used, or not rely on these applications at all. A single misstep and you end up with a huge amount of legal issues. But what other option do I have when the actual SDK is now intentionally hidden by the creators, but still in use by commercial applications released today?

+ +

Where is the Entry?

+ +

I have to begin somewhere, and for VST 2.x Plug-Ins and Hosts, it is the interface to the Plug-In itself. I needed to figure out the "entry point" from which everything starts, and what that entry point actually does, which meant I had to look into what existing VST 2.x Plug-Ins export. On Windows, I could hook into GetProcAddress, while on Linux, I could hook into dlsym.

+ +

I ended up using the Windows function for the initial steps and after filtering out several thousand unrelated strings, I finally found something that looked relevant: a string with the content "VSTPluginMain". This string was often accompanied by "MAIN" and "main_macho", which I assume are for Windows and MacOS exclusive VST 2.x plugins only.

+ +

I had the name of the entry point, but none of its details - in common terms, I knew where the door was, but not where the knob was, what the key looked like, and how it would even open. But I had a start, which is more than nothing, and it gave me the necessary push to move on after almost giving up from the mass of data.

+ +

Lockpicking the Entry

+ +

While having an entry point is great, it still being locked is a problem - it needs to be unlocked for us to actually do anything. I could not rely on anything but the most basic tools at this point, as any information other tools may give me could be wrong. The only tools that I had available to unlock the entry point were the CPU registers, so the work began.

+ +

On AMD64, you have the registers (R)AX, (R)BX, (R)CX, (R)DX, (R)SI, (R)DI, (R)BP, (R)SP, R8, R9, R10, R11, R12, R13, R14, R15, (R)IP, and (R)FLAGS. All of these have the potential to hold critical information, but only some of them will actually contain useful information. I won't bore you with the details about x86/amd64 Assembly here, there are plenty of other resources out there that are easily found if you're actually into that.

+ +
+
Register debugging is not fun.
+
+ +

The short form of testing is that I found two clear patterns that repeated every single time: (R)CX would point into some kind of read-only executable memory, while the value in (R)AX when returning would result in different crashes - except if the value is zero. So most likely I was looking at two pointers of some kind, one as the argument, one as the return value.

+ +

That left the calling convention, which is where it got a bit difficult. I've never had a clear resource on what each calling convention actually does, but it seems that 64-bit has somewhat unified the world to stop creating additional standards for something so critical. I have no idea what the 32-bit calling convention would be, but my assumption is that it is either stdcall or cdecl, with the latter sounding more sane - time will tell.

+ +

Magic Space

+ +

With the entry point unlocked, but without a clear idea of what it does, questions flooded into my mind. Clearly the value I'm returning has some sort of meaning, so can I affect the behavior in other ways than just returning 0? What if I have a bunch of memory which is all filled with 0, but I return the pointer to that memory? How large does that memory have to be if that works?

+ +

While it seemed simple to test, of course the reality turned out to be much harder. My initial idea of simply returning memory that shrinks every time it succeeds resulted in a size of 4 bytes - nowhere near enough to store anything. There was clearly something going on with the first four bytes, but what? It had to be a magic number of some kind to clearly identify the structure.

+ +

And so I wrote a script which would repeatedly launch a VST 2.x host application with a crafted plug-in, which on every iteration would try a new number to insert into the first four bytes. And lo and behold, after 1450406992 iterations, I found the exact string: VstP.

+ +

I assume that this is short for Vst Plug-In, and it should have occurred to me - before I wasted several kWh on this problem - to attempt to limit the possible values of each byte. But I now had the magic number, and thanks to it I had a rough estimate of the size of the memory to return: roughly 128 bytes or more.
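To make the numbers above concrete, here is a small sketch of the arithmetic. It assumes the four bytes are read as a big-endian 32-bit integer, which is what makes the iteration count line up with the string; the letters-only search space is just one possible way of limiting the values of each byte.

```python
import string

# The four bytes the host expected at the start of the returned memory.
MAGIC = b"VstP"

# Assumption: the brute-force counter maps to the bytes in big-endian order.
magic_value = int.from_bytes(MAGIC, "big")
print(magic_value, hex(magic_value))

# Search space for all 32-bit values versus four ASCII letters (a-z, A-Z).
full_space = 2 ** 32
letter_space = len(string.ascii_letters) ** 4
print(full_space, letter_space, full_space // letter_space)
```

Restricting each byte to letters would have cut the worst case from billions of launches to a few million.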

+ +

Structural Inspection

+ +

With the magic number in place, it was time to delve into what the structure actually contains aside from the first four bytes clearly being a magic number. The easiest way to check data integrity is by poisoning it intentionally, and that's exactly what I did: I started inserting 1 bit changes at random locations to figure out what does and does not cause problems.
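A minimal sketch of that kind of 1 bit poisoning, not the actual tooling I used back then. The 128 byte size and the `flip_random_bit` helper are assumptions for illustration; the fixed seed is only there so that a crash can be reproduced.

```python
import random

def flip_random_bit(buffer: bytearray, rng: random.Random) -> int:
    """Flip one randomly chosen bit in place and return its bit index."""
    bit_index = rng.randrange(len(buffer) * 8)
    buffer[bit_index // 8] ^= 1 << (bit_index % 8)
    return bit_index

# Hypothetical zero-filled structure with the magic number in the first four bytes.
structure = bytearray(128)
structure[0:4] = b"VstP"

rng = random.Random(0)  # fixed seed, so the same bit is hit on every run
poisoned = bytearray(structure)
bit = flip_random_bit(poisoned, rng)
print(f"flipped bit {bit} (byte {bit // 8})")
```

Logging the flipped bit before handing the structure to the host is what lets you map a crash back to a byte range.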

+ +

I didn't have to wait long to hit something. Precisely when I chose the bytes 8 through 15 I was hitting something critical that seems to be used right after loading the VST 2.x Plug-In. The length of this looked suspiciously like a pointer, so my first attempt was to place a function at this point, and my guess was right. I was now hitting a breakpoint in my newly created function.

+ +

After more register investigation, I figured out that the function in total has 6 arguments, of which two are pointers (one pointing at the memory I returned from the entry point), one is at least 8 bits, another is at least 24 bits (most likely 32), another is either 16, 32 or 64 bits, and one is a 32 bit float. Going by the MSVC x64 calling convention, I guessed at the rough order of parameters, which seems to be correct so far. Time will tell if I messed up.

+ +

Difficulty Spike

+ +

At this point, the difficulty went from Easy to Dark Souls but you only have the X button. It was time to start logging the calls to the function which I dubbed "control", every single one that would happen. It did not take long for patterns to emerge on most VST 2.x Hosts, with the most common calls having the second parameter be 0x2D, 0x2F, 0x30, 0x31, 0x3A, 0x23, 0x0A, 0x0B, and finally 0x00 - I called this parameter the "opcode".
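The logging itself boils down to counting which opcodes show up and how often. Below is a rough sketch of that idea; `observed_calls` is a made-up sample that only reuses the opcode values listed above, not a real capture from any host.

```python
from collections import Counter

# Hypothetical log: one opcode per call of the "control" function.
observed_calls = [0x00, 0x0A, 0x0B, 0x23, 0x2D, 0x2F, 0x30, 0x31, 0x3A, 0x0A, 0x00]

def tally_opcodes(calls):
    """Count how often each opcode was seen."""
    return Counter(calls)

tally = tally_opcodes(observed_calls)
for opcode, count in sorted(tally.items()):
    print(f"0x{opcode:02X}: {count}")
```

Comparing such tallies between hosts is what makes the common opcodes stand out from the host-specific ones.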

+ +

Unfortunately after this much time I can't remember the exact details anymore. Some of them are extremely obvious, like 0x0A which sets the sample rate. Others were more complex and I only just recently figured out what they mean and how to use them correctly through the development of VoiceFX.

+ +

However this part is already long enough, so I'll continue with this in the 2nd part in the future.

diff --git a/_posts/2021/2021-04-24-high-quality-streaming-with-obs-studio.html b/_posts/2021/2021-04-24-high-quality-streaming-with-obs-studio.html new file mode 100644 index 0000000..b19e3ef --- /dev/null +++ b/_posts/2021/2021-04-24-high-quality-streaming-with-obs-studio.html @@ -0,0 +1,60 @@ +--- +title: "High Quality Streaming (with OBS Studio)" +published: false +--- + + +

Creating entertaining, educational or otherwise useful content while live is a tough job, but it is a rewarding job if you make it. And I'm here to help you get started producing high quality content the moment you set foot into the world of streaming, at least in terms of Audio and Video - I can't help you if your content itself is unwatchable.

+ + + + + + + +

Setting up the Basics

+ + + +

Everyone has to go back to the basics sometimes, some more than others, some less. It is important to sometimes take a step back and get a proper look at what you are actually doing, and if it is actually working as it should. Sometimes it doesn't, sometimes it does, and sometimes it's based on luck entirely. So let's try to strip out the "doesn't work" and "luck" part entirely, and ensure that you get high quality right from the basics.

+ + + +

Good Audio on a Budget

+ + + +

Audio is what you and your viewers hear, or what is converted into words on screen if you happen to have a subtitling software running for disabled viewers. It is one of three critical parts of streaming, and messing it up can drive away interested viewers the moment things go out of control. As a Creator you want to sound better than the masses that don't know how to do it right, not the same.

+ + + +

One thing many people get wrong is that they assume that every piece of Hardware is the same, which has resulted in the annoying "48 kHz is enough!" argument that went way out of control. Not only does it not apply universally, it also requires good Hardware or Software resampling - and the former is not very common in cheap or integrated Hardware. So the next best thing that we can use is Software resampling, which will often sound better.

+ + + +
96 kHz Sample Rate
+ + + +

If your Hardware doesn't have a good resampling chip, you should set it to its native Sample Rate. For many devices this is either 96 kHz or 192 kHz, but often 96 kHz should already sound better than 48 kHz to a trained ear. On the flip side, stacking a good hardware resampling chip with a mediocre software resampling implementation actually degrades audio quality - watch out for that when you have good audio hardware.

+ + + +

Finally, in order to complete the basic Audio setup, all that's left is setting up OBS Studio accordingly. That means setting the Sample Rate to 48 kHz or 44.1 kHz (depending on the streaming platform), and setting the correct number of channels - which is almost always Stereo now. If you used a higher Hardware Sample Rate, you will now have OBS Studio resampling your Audio signal to the correct Sample Rate with a mediocre but not awful resampling algorithm.
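To get a feeling for what the resampler has to do, it helps to look at the ratio between the two Sample Rates. This is only the arithmetic, not how OBS Studio actually resamples; `resample_ratio` is a helper made up for this example.

```python
from fractions import Fraction

def resample_ratio(source_hz: int, target_hz: int) -> Fraction:
    """Ratio of output samples to input samples, reduced to lowest terms."""
    return Fraction(target_hz, source_hz)

for source_hz, target_hz in [(96000, 48000), (192000, 48000), (96000, 44100)]:
    ratio = resample_ratio(source_hz, target_hz)
    print(f"{source_hz} Hz -> {target_hz} Hz: {ratio.numerator}/{ratio.denominator}")
```

Going from 96 kHz or 192 kHz to 48 kHz is a clean integer division, while anything going to 44.1 kHz ends up with an awkward fraction that is harder to resample well.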

+ + + +
OBS Studio should be set to 48 kHz Stereo
+ + + +

Video for Everyone

+ + + +

+ + + +

+ diff --git a/_posts/2021/2021-04-24-low-latency-streaming-to-twitch.html b/_posts/2021/2021-04-24-low-latency-streaming-to-twitch.html new file mode 100644 index 0000000..99f3c33 --- /dev/null +++ b/_posts/2021/2021-04-24-low-latency-streaming-to-twitch.html @@ -0,0 +1,28 @@ +--- +title: "Low Latency Streaming to Twitch" +category: Blog +tags: ["Tutorial", "Twitch", "OBS Studio", "StreamFX", "x264", "NVENC", "WordPress Archive"] +--- + +

If you've been following my social media for the past few years, or have read my Recording or Streaming on NVIDIA Turing/Ampere guides, I'm always chasing the next higher level of "perfection". And this time I was chasing the lowest possible latency on Twitch - and it appears that I have finally found it, after days of trying.

+ + +

My search began as usual, with a manual investigation into the problem - I didn't have infinite resources to throw at the problem, and I didn't want to increase Twitch's hosting costs either. Therefore I was stuck with manual testing, which worked out well enough in the end for me, and didn't take much time either. I started my testing with the Key-Frame Interval, going down from 5 Seconds to 1 Second in 1 Second steps.

+ +

Oddly enough, I noticed that certain Intervals would end up "grouped together" into a higher latency. For example, 1 Second and 2 Seconds would end up being 3 Seconds of latency, while 3 Seconds and 4 Seconds would end up being 5 Seconds of latency, and 5 Seconds ended up being 7 Seconds of latency. Unable to make sense of it, I moved on to other settings.

+ +

Look-Ahead was the first one to receive the axe, and it made no difference - 30 Frames or 0 Frames, both had the same latency. The same went for Zero-Latency and B-Frames, neither made a difference. But one option did, and it is what every RTMP+HLS based service recommends against: Adaptive I-Frames, or Scene Cut in x264 terms.

+ +

Both x264 and NVENC use this option to insert complete Key-Frames if they happen to be a better option, for example with a fade to black, or a complete cut to new content. Unfortunately this option results in a measurable and visible increase in latency of between 250 and 500ms - half a Second gone to unfortunate Key-Frame placement.

+ +

This brought the latency ladder down to ~2.4 Seconds for a Key-Frame Interval of 1 and 2 Seconds, ~4.2 Seconds for an Interval of 3 and 4, and ~5.6 Seconds for an Interval of 5. Careful observers may have just noticed the same thing I did, which is that the latency appears to climb in ~1.5 Second intervals - and that's what I tested next.
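The step size is easy to check from the measurements above. The snippet only redoes that arithmetic with the rounded values from this post, so treat the result as a ballpark figure.

```python
# Key-Frame Interval (Seconds) -> measured latency (Seconds), as listed above.
measured = {1: 2.4, 2: 2.4, 3: 4.2, 4: 4.2, 5: 5.6}

# Distinct latency levels and the distance between neighbouring levels.
plateaus = sorted(set(measured.values()))
steps = [round(b - a, 1) for a, b in zip(plateaus, plateaus[1:])]
average_step = round(sum(steps) / len(steps), 2)

print(plateaus)
print(steps, average_step)
```

The two steps are not identical, but they average out close enough to 1.5 Seconds to make that Interval worth testing.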

+ +

I set up the Key-Frame Interval to 1.5 Seconds, and then I watched the Video Stats...
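For x264, a fractional Interval has to be expressed in frames. Here is a small sketch of that conversion, assuming a 60 FPS stream (the frame rate is not something I measured for this post). `keyint` and `scenecut` are regular x264 options, with `scenecut=0` turning off the Scene Cut detection mentioned earlier; the separator between options depends on where you enter them.

```python
def x264_keyframe_options(fps: float, interval_seconds: float) -> dict:
    """Key-Frame Interval in frames, with Scene Cut detection disabled."""
    return {"keyint": round(fps * interval_seconds), "scenecut": 0}

def format_options(options: dict, separator: str = " ") -> str:
    """Join options as key=value pairs with the separator the frontend expects."""
    return separator.join(f"{key}={value}" for key, value in options.items())

options = x264_keyframe_options(fps=60, interval_seconds=1.5)
print(format_options(options))       # space separated
print(format_options(options, ":"))  # colon separated, as used by ffmpeg's -x264-params
```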

+ +
+
One Second Transport Latency
+
+ +

I finally had it. Almost a perfectly flat 1 Second latency for the stream itself, though with all the Rendering, Encoding and Muxing buffering, it still ended up being roughly 2 Seconds. Which is still less than the lowest total latency I managed with Adaptive I-Frames being Enabled, and any other Key-Frame Interval. I repeated this test several times, and was able to achieve the same latency over several hours in Firefox, Chrome and Edge. I am a happy Twitch streamer now.

+ +

Disclaimer: This discovery is not guaranteed to work everywhere, and may be unique to the Ingest server I used. It may even be different due to changing which GPU is used, which CPU is used, which RAM is used, what your network looks like, etc. It is provided without any guarantee or warranty.

diff --git a/_posts/2021/2021-05-17-amd-and-the-curse-of-conflicting-information.html b/_posts/2021/2021-05-17-amd-and-the-curse-of-conflicting-information.html new file mode 100644 index 0000000..16e0f54 --- /dev/null +++ b/_posts/2021/2021-05-17-amd-and-the-curse-of-conflicting-information.html @@ -0,0 +1,100 @@ +--- +title: "AMD and the Curse of Conflicting Information" +category: Blog +tags: ["AMD", "Ryzen", "Overclocking", "Undervolting", "Curve Optimizer", "WordPress Archive"] +--- + +

Against my better judgement to just wait, back in December 2020, I ordered an AMD Ryzen 9 5950X - and received possibly one of the worst chips on the market. In Cinebench R23, it achieved a Single Core score of ca. 1550, with a Multi Core score of ca. 24040. This by itself doesn't look too bad, until you open Cinebench R20 and get ca. 580 in Single Core, with Multi Core just barely hitting the 9800 barrier.

+ +

So I did what any person with this hardware would do, and searched for overclocking options.

+ +

Overclocking on AMD Ryzen

+ +

With Zen3, AMD has granted us overclockers more than enough options to edge out every possible point of performance we can possibly need at the current time. Some of them are exclusive to any other method, while other methods can stack to some degree. I'll only look at the main three here:

+ +
    +
  1. VCore: This is your classic undervolting option which can be used to offset the Voltage the CPU Cores actually get.
  2. PBO: Automatically increases the Boost clock speed and voltage according to a built in Voltage/Multiplier curve, up to the limits specified by you or the Motherboard. Not exclusive with VCore.
  3. Curve Optimizer: An offset applied to the indexing of the Voltage/Multiplier curve, with each step being 3-5mV according to AMD. Not exclusive with VCore, but appears to do the same.
+ +

VCore vs Curve Optimizer

+ +

A fair number of resources out there state that VCore offsets and Curve Optimizer are exclusive. In a perfect ideal CPU sample, they are, but in reality, it's not that simple. In order to understand why, it is necessary to understand the basics of how PBO figures out what Multiplier it can actually run at. Most likely, AMD uses a Look-Up-Table (LUT) to figure out what Multiplier to use. Below is an example of such a LUT:

+ +
+ + + + + + + + + + + + + + + + + + + + + +
mV         | 300    | 305    | 310    | 315    | 320
Multiplier | 37.500 | 37.625 | 37.750 | 37.875 | 38.000
+
Example PBO Look-Up-Table
+
+ +

For now, let's ignore the additional calculations for the allowed maximum mV and Multiplier, and just assume it is a perfect world. If the user has Curve Optimizer disabled, a lookup for which Multiplier would apply at +310mV would return 37.750. But if the Curve Optimizer is set to -10mV (-2 in the options), the Multiplier returned would be 38.000 - we offset the Multiplier lookup up by 2 in the table.

+ +

And then there are VCore offsets, which sound like they should do the same, but in reality they don't. Unlike PBO and Curve Optimizer, a VCore offset does not modify which Multiplier we end up with, it only modifies the effective Voltage. So in the above example with PBO and Curve Optimizer at -2 plus a VCore offset of -25mV, we'd still look up the +320mV entry and get a Multiplier of 38.000, but our effective Voltage is now only +295mV.
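The difference between the two options is easier to see in code. The sketch below only mirrors the arithmetic of the example table and the numbers in the text; it is my mental model, not AMD's actual algorithm, and `lookup` and its parameters are names made up for illustration.

```python
from typing import NamedTuple

# Example PBO Look-Up-Table from above (not real AMD data).
LUT_MV = [300, 305, 310, 315, 320]
LUT_MULTIPLIER = [37.500, 37.625, 37.750, 37.875, 38.000]

class Result(NamedTuple):
    multiplier: float
    entry_mv: int        # table entry the lookup ended up on
    effective_mv: float  # entry voltage plus the VCore offset

def lookup(requested_mv: int, curve_optimizer: int = 0, vcore_offset_mv: float = 0.0) -> Result:
    """Curve Optimizer shifts the table index, the VCore offset only shifts the Voltage."""
    index = LUT_MV.index(requested_mv) - curve_optimizer  # -2 moves two entries up
    index = max(0, min(len(LUT_MV) - 1, index))           # stay inside the table
    return Result(LUT_MULTIPLIER[index], LUT_MV[index], LUT_MV[index] + vcore_offset_mv)

print(lookup(310))                                            # Curve Optimizer disabled
print(lookup(310, curve_optimizer=-2))                        # Curve Optimizer at -2
print(lookup(310, curve_optimizer=-2, vcore_offset_mv=-25))   # plus a -25mV VCore offset
```

Curve Optimizer changes which row you land on, while the VCore offset leaves the row alone and only lowers the Voltage afterwards.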

+ +

So now I hope you understand the difference between the two options and why they are not exclusive to each other. It is true that a fixed VCore will break PBO and Curve Optimizer, but offsets are perfectly fine. In fact, I wrote this blog post with PBO On, Curve Optimizer at -10, and a VCore offset of -37.5mV.

+ +

Benchmarks

+ +

As usual, to verify my findings I ran a number of benchmarks. Tests were run for 20 minutes each, so plenty of heat was available to reduce the score.

+ +
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
Settings                     | CBR23 MC | CBR23 SC
PBO Off, CO 0, VCoreD 0      | ~24040   | ~1550
PBO On, CO 0, VCoreD 0       | ~26530   | ~1542
PBO MB, CO 0, VCoreD 0       | ~26670   | ~1550
PBO MB, CO -10, VCoreD 0     | ~27211   | ~1555
PBO MB, CO -10, VCoreD -25mV | ~27400   | ~1560
+
+ +

While I'm still far away from what other people manage, this is already very promising. Maybe I'll find a golden egg while searching for settings and can bring this bad bin to a decent state.
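To put the table into perspective, here are the gains relative to the stock run. This is plain arithmetic on the approximate scores above, so don't read too much into the decimals.

```python
# (Multi Core, Single Core) Cinebench R23 scores from the table above.
results = {
    "PBO Off, CO 0, VCoreD 0": (24040, 1550),
    "PBO On, CO 0, VCoreD 0": (26530, 1542),
    "PBO MB, CO 0, VCoreD 0": (26670, 1550),
    "PBO MB, CO -10, VCoreD 0": (27211, 1555),
    "PBO MB, CO -10, VCoreD -25mV": (27400, 1560),
}
baseline_mc, baseline_sc = results["PBO Off, CO 0, VCoreD 0"]

def gain_percent(score: float, reference: float) -> float:
    """Relative change against the reference score, in percent."""
    return (score - reference) / reference * 100

gains = {
    name: (gain_percent(mc, baseline_mc), gain_percent(sc, baseline_sc))
    for name, (mc, sc) in results.items()
}
for name, (mc_gain, sc_gain) in gains.items():
    print(f"{name}: MC {mc_gain:+.1f}%, SC {sc_gain:+.1f}%")
```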

diff --git a/_posts/2021/2021-05-22-whats-coming-in-voicefx-v0-3-0.html b/_posts/2021/2021-05-22-whats-coming-in-voicefx-v0-3-0.html new file mode 100644 index 0000000..4bcaba9 --- /dev/null +++ b/_posts/2021/2021-05-22-whats-coming-in-voicefx-v0-3-0.html @@ -0,0 +1,43 @@ +--- +title: "What's coming in VoiceFX v0.3.0?" +category: Blog +tags: ["VoiceFX", "WordPress Archive"] +--- + +

With what is most likely the final beta version of VoiceFX 0.3.0 being released, it's time to talk about all the improvements and changes that VoiceFX had to go through to get here. Let's take a look at all the new and upgraded things in it.

+ +

The Plug-In has been renamed to VoiceFX

+ +
+
Adobe Audition detecting VoiceFX
+
+ +

This may come as a surprise to some, but the VST 3.x version was never actually called VoiceFX. It has now been renamed - along with an identifier change - to match the actual project name. This will break existing configurations/setups, so you will have to update accordingly, and no further renames are planned. This was mainly done to integrate other effects later down the line, and no longer potentially infringe on trademarks and branding guidelines.

+ +

Drastically reduced the Load/Reset Times

+ +

If you have used VoiceFX before, you've probably noticed and complained about the extremely long time required to load and/or reset the plugin. I wasn't able to remove it entirely - still waiting on NVIDIA to answer my questions - but I was able to significantly reduce it. As an example, instead of having to wait close to 18 seconds for Stereo playback with VoiceFX on in Adobe Audition, it is now only 3.8 seconds - almost 5 times faster!

+ +

Improved Buffering and Latency Detection

+ +

Due to the mixed bag that VST hosts are, I had to invest quite some time into a solution that was not just efficient, but also covered most of the possible bad situations. This led to a much more stable plugin with far better buffering (fixing Adobe Premiere Pro exporting), better latency detection and various other things. Many VST hosts are also now 100% compatible with this.

+ +

Text File Logs for improved end-user Support

+ +
+
Generated Log Files
+
+ +

With the mixed bag of VST Hosts that we have, I could no longer rely on the VST 3 and VST 2 standard as my only resource for problems users experienced. So I did what any Software does, and now write log files to a pre-defined location. You can find these at %LOCALAPPDATA%\VoiceFX\logs, and sending the last 2-3 files along with any request for support will drastically improve the chances of fixing the problem.
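If you'd rather not dig through the folder by hand, something like the following sketch can pick out the most recent files. It is not part of VoiceFX; it simply takes the newest files in the directory by modification time and makes no assumption about how the log files are named. The `%LOCALAPPDATA%` expansion only works on Windows.

```python
import os
from pathlib import Path

def default_log_dir() -> Path:
    """Log location mentioned above; the variable is only expanded on Windows."""
    return Path(os.path.expandvars(r"%LOCALAPPDATA%\VoiceFX\logs"))

def newest_logs(log_dir: Path, count: int = 3) -> list[Path]:
    """Return up to `count` files from log_dir, newest first."""
    files = [path for path in log_dir.iterdir() if path.is_file()]
    files.sort(key=lambda path: path.stat().st_mtime, reverse=True)
    return files[:count]

if __name__ == "__main__":
    log_dir = default_log_dir()
    if log_dir.is_dir():
        for path in newest_logs(log_dir):
            print(path)
    else:
        print(f"No log directory found at {log_dir}")
```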

+ +

Support for VST 2.x Hosts

+ +

With 0.3.0 comes the promised integration into many VST 2.x only hosts, available to any Tier 2 or higher Supporter of mine on Github and Patreon. This integration took quite a while as the VST 2.x API is practically undefined if you want to stay on the legal side, so quite a bit of time was spent reverse engineering an API I had no actual information about, other than it exists.

+ +

Improved support for non-standard-compliant VST Hosts

+ +

As with any standard, there are always pseudo-implementations of it in Software and Hardware - and VST is no different. I've applied many safe-guards now to either make these VST Hosts work fine, or prevent them from crashing entirely, but there may still be a VST Host out there that does yet another thing differently from the actual standard.

+ +

With every update comes...

+ +

... a lot of testing. I'm currently running a public beta test of the new VST 3.x version, as well as a Supporter-exclusive beta test of the VST 2.x version - both are available in Discord. If you're less chatty you can still try out the new version linked on the official website (automatically updated short link) for it. If you have an NVIDIA Tensor capable GPU (RTX 2060 or better), why not try it out?

diff --git a/_posts/2021/2021-06-10-streamfx-0-11-whats-going-to-be-in-it.html b/_posts/2021/2021-06-10-streamfx-0-11-whats-going-to-be-in-it.html new file mode 100644 index 0000000..1eb576a --- /dev/null +++ b/_posts/2021/2021-06-10-streamfx-0-11-whats-going-to-be-in-it.html @@ -0,0 +1,30 @@ +--- +title: "StreamFX 0.11: What's (going to be) in it?" +category: "News" +tags: [ "StreamFX", "WordPress Archive" ] +published: false +--- + + +

+ + + +

Video Super Resolution

+ + + +

+ + + +

Video Denoising

+ + + +

Custom Kernel Blur

+ + + +

Reference-accurate Gaussian Blur

+ diff --git a/_posts/2021/2021-06-17-testing-nvidia-maxines-super-resolution-beta.html b/_posts/2021/2021-06-17-testing-nvidia-maxines-super-resolution-beta.html new file mode 100644 index 0000000..39f8143 --- /dev/null +++ b/_posts/2021/2021-06-17-testing-nvidia-maxines-super-resolution-beta.html @@ -0,0 +1,24 @@ +--- +title: "Testing NVIDIA Maxine's Super Resolution Beta" +category: Blog +tags: ["NVIDIA", "Maxine", "Super Resolution", "Upscaling", "WordPress Archive"] +--- + +

With the release of NVIDIA Maxine, a number of exciting new Video, Audio and AR effects were shown. I wanted to try these out much earlier, but an unexpected problem prevented me from doing so. However, thanks to NVIDIA's help, I've managed to get some of the examples running most of the time. Please note that the effect I'm testing is still considered Beta by NVIDIA, so final quality will likely differ from what I show here.

+ +

Since I only had gaming footage at hand for testing, and the Super Resolution effect seems to have been designed for real world content, I went to my balcony to record a quick video. The video was downscaled to 720p as well as 540p, then fed into the effect to upscale to two different resolution targets. Below is the result of this (may need an external player):

+ +
+ +
+ +

The video was taken with a Logitech Brio 4K on a long USB extension, set up for 1920x1080x60 capture as close to real world as possible. The results are quite clear, and show both the strengths and weaknesses of the effect clearly. At the current level, it seems to be missing any and all temporal capabilities, relying completely on spatial upscaling. This works well as long as you don't exceed an upscale of ~1.5x, above that you end up with significant artifacts.

+ +

I'm excited to see what the future will bring for this.

diff --git a/_posts/2021/2021-09-12-a-look-into-voicefx-v0-3-0.html b/_posts/2021/2021-09-12-a-look-into-voicefx-v0-3-0.html new file mode 100644 index 0000000..7a3334b --- /dev/null +++ b/_posts/2021/2021-09-12-a-look-into-voicefx-v0-3-0.html @@ -0,0 +1,61 @@ +--- +title: "A look into VoiceFX v0.3.0" +category: "News" +tags: [ "VoiceFX", "WordPress Archive" ] +--- + +

This release was plagued with odd bugs, reappearing problems, and delays from real life events happening to me. But now, after several months of nothing, VoiceFX v0.3.0 is available! Why not delve into what exactly was changed, with some visual examples?

+ +

Loading the effect should now be much faster!

+ +

Earlier VoiceFX versions had a bug that would cause VoiceFX to re-initialize everything every time the VST host sent any command to it, until the moment it was supposed to start processing. On some VST hosts, this was so bad that users reported VoiceFX freezing their system for minutes as the VST host was spamming a lot of commands. This is now a thing of the past, and initialization is only done once, resulting in a load time reduction of up to 75%!

+ +

Significant reduction in Reset times!

+ +

If you've used VoiceFX, you may already have experienced how slow VoiceFX can be when restarting playback, or seeking within a file. In the worst cases, you could end up waiting more than a minute for multi-channel audio, or much longer with multiple instances of VoiceFX. In rare cases, it might have even led to a frozen system that needed to be restarted.

+ +

This is now a thing of the past! Thanks to help from NVIDIA, resetting VoiceFX by seeking or restarting playback is now super fast. So fast that it was difficult to measure how much faster it was on an NVIDIA RTX 3090 unless I put some stressful load on the GPU. And the best of it all? This has no impact on quality!

+ + + +
+ +
v0.3.0b3 and earlier
+
+
+ +
+ +
v0.3.0 and beyond
+
+
+
+ +

Split VST3.x and 2.x Installers

+ +

Since users often had a mixed bag of VST hosts, some 3.x and 2.x, I originally created a binary that had both APIs supported. This turned out to be extremely user unfriendly, as users had to choose which VST host could see it, and which couldn't. With the new separate installers for either VST 3.x or 2.x version, users can now choose which VST host should be able to see it, or if it should just be available for both!

+ +

Other Changes

+ +
+ +
Log files!
+
+ +

There are of course some smaller changes, which don't really need a whole section, but still need to be mentioned:

+ + + +

Some final Words

+ +

I've released VoiceFX on itch.io where you can now buy the paid version, without going through Patreon or GitHub. While I can't promise an actual release cycle - I'm just a single person working on many things plus my actual job - I can promise that you won't miss out on actual updates. Some testing versions may however not be uploaded to itch.io if they are immediately nuked due to a critical bug.

+ +

Anyway, that was all I had to say. Go grab the latest versions from my Website, itch.io or from my Discord!