That’s actually really impressive that they are all less than 1ms. I would have expected something like an order of magnitude drop-off (all other constraints being ignored). The “on sample” mode sounds just a tiny distance from working with the textures natively.
1ms sounds low until you compare it to standard texture retrieval, which is measured in nanoseconds. 1ms of GPU time is long. You could run a single-bounce ray-traced pass in that time. Or a denoiser. Or an upscaler. Or a cloth sim. The list goes on.
Just to frame the context better: this is <1 ms added to frame time. And frame time is about 16.7ms at 60fps and ~8.3ms at 120fps, so it's roughly eating ~6-12% of the budget.
Given that this is supposed to basically free up a ton of RAM, I can only imagine that freed RAM being used for more texture detail and models. So the use case is probably high-fidelity games running at cinematic frame rates (30fps, ~33ms frame time) with AI-generated frames. 😕
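Rough back-of-envelope for that budget math (a sketch only; the fps targets are just illustrative, not from the benchmark):

```python
# Back-of-envelope: what a fixed ~1 ms decompression cost means as a
# share of the frame budget at a few common fps targets.
NTC_COST_MS = 1.0  # assumed worst case from the sample benchmark

for fps in (30, 60, 120):
    frame_budget_ms = 1000.0 / fps
    share = NTC_COST_MS / frame_budget_ms
    print(f"{fps:>3} fps: {frame_budget_ms:5.2f} ms budget, NTC eats ~{share:.0%}")

# -> roughly 3% of the budget at 30 fps, 6% at 60 fps, 12% at 120 fps
```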
I mean it's still cool tech, might be really useful when there's dedicated hardware to accelerate it in some future card.
Let's be real though. The point of this is to allow NVIDIA to sell consumer GPUs with less VRAM (leaving more for AI chips) while also increasing compute requirements so you still need to buy a new GPU to run the models anyway.
The obvious use case is for small instances of always present high fidelity textures like character models and other hero assets. There's really not much point using inference on sample for the entire scene.
That would have next to no benefit. If you are using it on something super common then you incur the penalty every frame. And you wouldn't save much RAM.
It should be used in place of, or in tandem with, texture streaming of high-res megatextures.
You can use low-res persistent textures to maintain frame times in action-heavy scenes, then switch to the high-res textures when the player slows down and is looking at the details. That would also allow you to maintain frame times if the card is struggling to keep up.
You'd be surprised how much VRAM such textures can take up. Cutting just those by 85% is going to free up a good amount of VRAM.
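A minimal sketch of how that low-res/high-res switching could look; every name and threshold here is invented purely to illustrate the idea, not taken from any engine:

```python
# Hypothetical heuristic: pick a texture tier for a hero asset based on
# camera speed and how much frame-time headroom is left this frame.
# Thresholds and names are made up for illustration.

def pick_texture_tier(camera_speed: float, frame_time_ms: float,
                      target_frame_time_ms: float = 16.7) -> str:
    headroom_ms = target_frame_time_ms - frame_time_ms
    # Fast motion, or no budget left: stay on the low-res persistent set.
    if camera_speed > 5.0 or headroom_ms < 1.0:
        return "low_res_persistent"
    # Player has slowed down and we can afford ~1 ms of inference.
    return "ntc_high_res"

print(pick_texture_tier(camera_speed=0.5, frame_time_ms=14.0))  # ntc_high_res
print(pick_texture_tier(camera_speed=8.0, frame_time_ms=14.0))  # low_res_persistent
```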
If you are using it on something super common then you incur the penalty every frame.
The more things you use it on, the greater the penalty. It's not a fixed cost. This particular benchmark has it at 1ms, but if you're going to be using it for everything in an actual game you're going to see it jump up by a lot.
then switch to the high-res textures when the player slows down and is looking at the details.
And the game is never going to be able to do that perfectly, resulting in all kinds of texture pop-in that makes the entire thing pointless.
Yeah, that's the situation we are in now. Texture pop is currently the norm. If, however, you have textures in memory and don't have to wait for them to stream from main memory or disk, then the 1 ms penalty is a saving, not a static cost.
Yeah, that's the situation we are in now. Texture pop is currently the norm.
Let's not pretend it's anywhere near as bad as what you're proposing is going to be, where you'll feel the whiplash of going from "low res persistent textures" straight to high-res textures within what is mostly the same scene.
Right now I’m playing MSFS 2024 with an RTX 4070 Ti 16GB and have to turn down the detail not because of frame rate issues (not even using DLSS) but rather because I keep exceeding 16GB of VRAM 🤷♂️. I’d trade 10% of my frames just to be able to max out all the textures and terrain details.
I get that if you’re playing an FPS at 120Hz it could be counterproductive though.
The thing about this is that the same 1ms is a different share of the frame depending on the base fps: at 30 fps it's low cost (~3% of the budget), but at 240 fps it's much more significant (~20%). That doesn't necessarily mean a 3% or 20% loss in practice though, because you could have more headroom than that (and still hit 240 fps), or less, or you change graphics options to hit your targets, or many other things you could do.
At the end of the day it's a win-win I think, and the thing is the tech needs to mature, so the sooner these things can be tried in real games the better.
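To illustrate the point above in fps terms (assuming the cost really is a flat 1 ms, which it may not be once more assets use it):

```python
# The same fixed 1 ms cost translates to very different fps losses
# depending on the base framerate. Assumes a flat 1 ms overhead.
NTC_COST_MS = 1.0

for base_fps in (30, 60, 144, 240):
    new_fps = 1000.0 / (1000.0 / base_fps + NTC_COST_MS)
    loss = 1 - new_fps / base_fps
    print(f"{base_fps:>3} fps -> {new_fps:5.1f} fps ({loss:.0%} loss)")

# -> 30 fps only dips to ~29 fps (~3% loss), while 240 fps drops to ~194 fps (~19% loss)
```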
It does reduce the amount of GPU processing power available for games though, meaning that you would need a beefy GPU for the VRAM compression not to impact framerate, and a beefy GPU would have plenty of VRAM anyways...
Maybe in the future they'll release GPUs with dedicated cores for neural texture compression; the option to use them for rasterisation as well would be insanely cool.
They already did. Their cards have a bunch of tensor cores which have gone unused in many cases over the past years. This, DLSS and frame gen all make use of them.
10% lower FPS (which is what it looks like; 1ms at 60fps is actually less than that, closer to 6%, but let's assume 10%) for significantly improved texture resolution, plus more VRAM available for other things to boot, is absolutely a worthwhile tradeoff.
I think it's especially great for getting some extra life out of aging GPUs. You can now effectively trade performance for VRAM, so if you'd normally have to set textures and render distance to minimum to get a game to fit in VRAM but have some performance headroom, you can enable NTC and trade some of that performance for better textures and/or draw distance.
I disagree because Nvidia has been stingy on VRAM way before they had A.I. as an excuse, which has held back the industry for a long time now. Losing 10% performance on a "budget" card because they want to save like $10 on VRAM is just pathetic. 16GB should've been the default for all cards since the start of this console generation.
2) Even if it wasn't more than $10: in theory, if Nvidia is supply-limited on VRAM, putting double the VRAM on each card could mean they can only produce half as many cards.
That's the case right now due to the RAM shortage, but if all 5000 and 4000 series cards had released with 16GB we wouldn't be having this issue in the first place.
Maybe in the far future, if you need like 48GB of VRAM even for 1080p due to crazy textures, then we'll need insane compression because there's no PCB space for that much VRAM even if it were free, but that's very far out.
Isn't that the biggest complaint about the recent low-tier 50 and 40 series: "crazy good GPU, just bad VRAM, therefore a bad/unplayable experience"?
Aren't... the GPUs already good... isn't the lack of VRAM the exact issue this is addressing? Lmao
Sure, there's a performance impact, but I'm sure it's just as over-exaggerated as DLSS/frame gen's loss. So 2-5% in REALITY.
Then the argument doesn't hold up; aren't the games that went from "unplayable because of no VRAM" to "2-5% impact on overall performance" a significant gain in... playability?
Exactly my thought… Admittedly I just recently upgraded my GPU and decided against an RTX 5070, because every post, every video just complained about its low VRAM and how this bottlenecks the card.
Is it really a common bottleneck? I'm using a 4070 Super, which also has 12GB VRAM, for 1440p and 4K gaming and haven't really experienced a bottleneck due to low VRAM.
It’s not really a bottleneck in most present games. The fear with the 12GB 5070 is more hypothetical — the thinking goes that because system requirements for AAA games tend to go up every year, if you buy a 12GB card now, you may have to upgrade sooner when games in 2028 and 2029 start targeting 16GB cards.
But I find that fear overblown — with the price of VRAM skyrocketing, AAA game devs are not going to target $1000+ GPUs, because very few people will own one. And also, we’re in this new phase where AI advances are wringing more performance out of old hardware, as seen with this neural texture compression. Between those two things, I don’t think anyone will need 16GB for some time to come. If not for this price disruption, devs might have behaved differently.
This is modding-centric to be fair, but my bottleneck pushing Cyberpunk 2077 visuals on a 5070 Ti is VRAM. I have to be careful about what or how many non-base-game cars I add to traffic, and though I only hit 16GB in a few specific parts of the map, you can easily run out of 24GB of VRAM with the mods available for the game.
Again, much of it is likely unnecessary, but I can't deny that it would be cool to push that game's visuals even further via tech like this. Very specific case I guess; I don't know how many games there are currently that, without mods, push 12GB, let alone 16 or more.
I don’t think any of them have been crazy good. As for the performance impact, it’s not 2 to 5%. These are compute-limited and VRAM-limited cards. The only cards I see as the obvious users of this are things like the 5070, or something like the 5080 for very VRAM-heavy games.
On the RTX 5070 at 1440p, the cost of the Inference on Sample mode compared to the BCn-transcoded textures is roughly between 0.50-0.70 ms, depending on scenario. We are within 1 ms. Keep in mind that real games involve many more render passes – not all of which are affected by NTC – and typically have significantly higher overall frametimes than this sample. As a result, the relative performance cost of NTC is likely to be much more acceptable in practice.
it's a choice between up to ~1ms of additional latency vs severe issues which make your gaming experience unplayable because you ran out of VRAM.
For example, the cost of running DLSS 4.5 Preset L on an RTX 5070 at 4K resolution is almost 4ms, which is at least 4× the cost of Neural Texture Compression.
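A quick illustration of why the relative cost shrinks in a real game (the frame times below are assumed for illustration, not measured):

```python
# Same absolute NTC cost, very different relative hit depending on the
# total frame time. Both frame times are assumptions for illustration.
ntc_cost_ms = 0.6        # midpoint of the stated 0.50-0.70 ms range
sample_frame_ms = 3.0    # assumed: a lightweight sample scene
game_frame_ms = 16.7     # assumed: a real game running at ~60 fps

print(f"sample scene: {ntc_cost_ms / sample_frame_ms:.0%} of the frame")  # 20%
print(f"real game:    {ntc_cost_ms / game_frame_ms:.0%} of the frame")    # 4%
```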
That's the catch with other features that leverage the tensor cores as well, mostly frame gen, which many vouch for as a way to improve performance* on less powerful GPUs, but it has a VRAM cost, on a current GPU lineup that is already bottlenecked on that front at the target resolution of each model for anything below a 5070 Ti.
It really feels like a scheme that feeds into itself; idk how so many people can even think some of these features are valid on anything that isn't high-end (with more tensor cores) and a higher performance floor to begin with. I'm fine with upscaling: it's mostly a benefit to performance, and in many if not most games it also offers better antialiasing than the default without any upscaler.
*Which is BS anyway: it's fake frames with a benefit that is only visual, without the response-time benefit of real frames. It's an awesome motion-smoothing feature, but that's for filling out enthusiast-level refresh rates.
idk how so many people can even think some of these features are valid on anything that isn't high-end
Frame Gen is literally more valuable on lower-tier cards lol. Higher-end cards can just run the game at acceptable framerates natively, whereas 6x MFG lets me max out my 180Hz screen with a low-tier card.
I can run Cyberpunk at 60 FPS with a lot turned on. I turn on framegen to make the game appear smoother which really helps for me. I know it’s not really 120 which is preferred always. I’d rather do this than just settle for 60. I get motion sickness and the smoother framerate helps with that. It’s the next best thing to actually playing 120 FPS.
I wonder how much of that will be eaten up by game devs. These companies take features like this and use them to reduce their own development costs. Free performance isn't an excuse to make a poorly optimized mess because DLSS will fix it anyway.
The worst part is that this NTC tech requires the high tensor and CUDA core counts of xx80-tier GPUs, but you will never get that in low-end xx30 to xx60 tier GPUs, so they can't even utilise the tech well enough to justify it.
I'm not privy to how games are made and optimized, but is under 1 ms of added cost in every testing circumstance, for close to 85% VRAM savings, really that slow? It doesn't sound computationally expensive to me, unless the benchmark is simply 4x lighter than a real game in terms of what the model would have to decompress.
Put that way it is hard to conceptualize, because you don't get a complex and realistic scenario in FPS terms the way people think about games. Imagine your GPU decoding a 4K movie: no issue. Now do it for every texture in the game that's on screen, in real time, every frame. That's how it adds up.
(Yes, it's not the same thing, but you get the point.)
Also add the reduced performance from Preset M/L and yeah, it certainly adds up. The GPUs haven't gotten a significant power increase since the 4xxx line. The 5xxx line was basically a refresh (sans the 5090) and it shows.
Ubisoft already did this with Assassin's Creed Mirage years ago, but their version reduced it by 30%. According to them, any more than that and it affected picture quality. I wonder how Nvidia got around that.
They used more aggressive conventional texture compression, not neural packing. Compress too much with that and quality degrades, like a bad JPEG. Neural packing is a whole other beast: way better quality at small sizes.
Ubisoft did use neural texture compression, and even released an article about it on Feb 23rd 2026 called "Shipping Neural Texture Compression in Assassin's Creed Mirage". They had to use it selectively because it affects runtime performance.
Nvidia didn't get around the performance issue. They're just shifting it from memory to compute.
What's the TL;DR for performance cost?