Product Review: Nvidia RTX 4000 Ada GPU

Anthony Frausto-Robledo, AIA, NCARB, LEED AP

6 months ago

WHEN WE LAST REVIEWED NVIDIA’s Ampere-generation RTX A2000 GPU for workstations, we were impressed with how much performance that entry-level GPU brought to the user. This time, we are reviewing the most powerful single-slot professional GPU on the market, the mid-tier Ada generation RTX 4000.

General Summary

This graphics card is a professional application-oriented GPU board designed for use in professional workstations. It has been on the market for quite some time, but recently, Nvidia sent us a version to review, and we have done the most exhaustive review of a GPU to date. We should note that the last mid-tier professional GPU we reviewed was from AMD with its Radeon Pro W6600, which was designed to compete with this GPU’s immediate predecessor, the Ampere-generation RTX A4000. Because we never reviewed this GPU’s predecessor, we have relied on some published benchmarks below.

Nvidia’s mid-tier 4000-class GPU models are generally ideal for AEC professionals who are involved in a fair amount of real-time and photo-realistic rendering workflows, whether in supplement to BIM or not. This is also the case for product design and MCAD professionals; those doing more rendering workflows will see greater value benefits from this GPU (see Conclusions on the last page for more info).

Image 1: Nvidia’s RTX 4000 Ada generation GPU for workstations is the most powerful professional workstation-class single-slot GPU in the market. (Image: Nvidia)

The RTX 4000 Ada GPU is the successor to the RTX A4000 Ampere generation GPU. For the purposes of our GPU economics charts and calculations, we are using MSRP pricing for the 4000 Ada GPU at USD 1,250, while the initial price for the Ampere generation predecessor GPU was USD 1,000 when first released in April 2021.

The Nvidia Ampere generation GPU we tested and reviewed was the RTX A2000 SFF GPU, and we have carried over benchmarks for non-direct comparison. The Ampere generation’s rival AMD Radeon Pro GPU was the W6600, which we also reviewed a few years ago. Again, we have carried over some comparison benchmarks.

Image 2: The Nvidia RTX 4000 Ada GPU inside our BOXX Technologies testing workstation. The GPU cannot draw all its power from the PCIe slot, so it includes a standard 16-pin 12VHPWR connector for direct connection to the power supply. (Image: Architosh)

In-house benchmarking is a tough business, especially if working on different systems. That is not our case. Our review of the RTX 4000 Ada GPU ran on the same BOXX workstation as our previously reviewed GPUs under the same Windows 10 Professional operating system.

While we will summarize our conclusions at the end of the review, some highlights about the RTX 4000 Ada GPU to bring up now include the following:

Most powerful single-slot GPU for professional workstation computing
Substantial performance gains in apps like V-Ray, Arnold, and Blender
DLSS is supported in Enscape and upscales with AI
Ada generation brings dual encode/decode engines (2 encode / 2 decode engines)
AV1 encode and decode support
Excellent ventilation with blower style thermal design versus gaming card thermals

The RTX 4000 Ada GPU is powered by Nvidia’s AD104 graphics processor, built on TSMC’s 4nm process node. The 4N node is a custom node designed for Nvidia and differs from TSMC’s regular N4 node. The custom node emphasizes power efficiency.

Image 3: The Nvidia AD104 chip powers Nvidia’s RTX 4000 Ada GPU. The chip is built on TSMC’s 4nm process node. (Image: Nvidia)

To put the chip into perspective, the die area is around 294 mm² and contains 35.8 billion transistors. The predecessor chip in the A4000 Ampere was the GA104 chip with 17.4 billion transistors, similar in size to Apple’s M1 chip at 16 billion transistors. We bring up the M1 because we will make some comparisons to SoCs in our review. The important takeaway is that there are roughly twice as many transistors in this Ada generation RTX 4000 GPU (35.8 billion versus 17.4 billion).

GPU Economics

As part of our review work, we like to compare GPUs on a performance-per-dollar basis, but we calculate this a bit differently than the common formula. We divide the cost of the chip or GPU by the benchmark score to obtain the cost of one compute unit (i.e., the cost in dollars to obtain one unit of measure on the benchmark). In our charts, our notation for “compute unit” is “CU.”

Cost in USD per Compute Unit (CU) = cost of GPU / benchmark score
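
The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the tooling used for the charts; the function name and the example price and score are hypothetical placeholders.

```python
def cost_per_compute_unit(gpu_cost_usd: float, benchmark_score: float) -> float:
    """Return the cost in USD of one benchmark point (one "CU")."""
    if benchmark_score <= 0:
        raise ValueError("benchmark score must be positive")
    return gpu_cost_usd / benchmark_score


# Hypothetical example values, not taken from the review's charts.
example_cost = cost_per_compute_unit(gpu_cost_usd=1000.0, benchmark_score=250.0)
print(f"{example_cost:.2f} USD per CU")  # lower is better
```

Because cost sits in the numerator, a lower result means better value, which is why shorter bars are better in the economics charts.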

You can see this kind of metric in our review of a GPU workstation here. We began this type of benchmark because we have begun research into methods of economically optimizing workstation configurations across both GPU and CPU for specific workflows. Such work is in an early phase and explained in a feature article titled, “BIM Manager: The Economic Value of Workstation Performance,” inside Xpresso newsletter #44.

Our GPU economics charts are partial and selective, but hopefully informative. This kind of metric allows you to compare the value proposition of different chips, especially in isolation, to a specific application or type of workflow. However, one must keep in mind that benchmark scores alone do not tell the whole story of “value” or “performance,” as overall system performance impacts GPU performance. Furthermore, workstation-class GPUs offer other tangible and intangible benefits, including the certification of professional applications.

Since we have so many benchmarks to test, which ones do we select for performance-per-dollar calculation and comparison? We have selected three:

(1) Creo and Solidworks (OpenGL-based) composite tests
(2) Creo (OpenGL shaded and edge performance) tests
(3) Cinebench GPU (Redshift Renderer) tests

The first two are SPECviewperf tests and are widely regarded for OpenGL-based 3D CAD applications. Redshift is widely deployed across DCC apps plus all of Nemetschek Group’s AEC BIM solutions as a render option. We review these three performance per dollar metrics at the end of our Benchmarking section below.

GPU Details: RTX 4000 Ada

Nvidia’s RTX 4000 Ada generation GPU is a full-size single-slot GPU that boasts 20 GB of video memory, which is more than the 16 GB that shipped on its predecessor Ampere generation GPU. Details include:

6,144 CUDA Cores
48 3rd-gen RT (raytracing) cores
192 4th-gen Tensor cores
26.7 peak TeraFlops single-precision (FP32)
Nvidia AD104 chip

PCI Express 4.0 x16 for advanced data transfer
OpenGL 4.6 / Vulkan 1.3 / DirectX 12 Shader Model 6.7
20 GB GDDR6 with ECC memory
360 GB/s peak memory bandwidth
4x DisplayPort 1.4a
2x 7680 x 4320 @ 60 Hz max resolution
130W max power consumption
Single-Slot, full height, 4.4″ H x 9.5″ L
VR Ready

Supported compute APIs include CUDA 12.2, OpenCL 3.0, and DirectCompute. NVLink is not supported.

Those are the specs at a glance. This is Nvidia’s most powerful single-slot GPU ever and the ideal GPU for demanding professionals across CAD industries like AEC, DCC, MCAD, in addition to science, medicine, energy, financial and software development.

Image 3.1 — Nvidia RTX 4000 Ada GPU, showing its blower-style thermal design. (Image: Nvidia)

As generative AI technologies begin to dominate the focus of professional application workflows, the Nvidia RTX 4000 Ada GPU unlocks accelerated AI compute workloads. We have also noted the new support in Enscape for DLSS (Deep Learning Super Sampling), which is AI-powered upscaling. This game-inspired technology renders at a lower resolution and then uses deep learning to upscale the image to a higher resolution. This technology is inside the popular AEC rendering software Enscape, starting with version 3.1.

Benchmarking and Performance Results

Each time we review a graphics card it seems there are new changes in the computer performance benchmark application scene. As we explained to NVIDIA’s Sean Kilbride, we would like to have some tests that can go cross-platform so we can help readers understand the trade-offs and benefits of Windows and Mac computers. This is even more important as more of the Windows market moves to ARM system-on-chip (SoC) based systems in the years ahead.

That said, Kilbride has also emphasized that some benchmarking tools, particularly some that run cross-platform, do not allow for the specifics of a GPU to show off its true strength. The benchmarks we have below cover a wide gamut and accomplish both needs: broad cross-platform comparative data plus benchmarks that highlight the RTX 4000 Ada GPU’s strengths. The benchmarks include:

Geekbench 6 – GPU Benchmark
Maxon Cinebench 2024 – GPU Benchmark
BlenderMark — best for ray trace testing
CompuBench — Catmull-Clark and SSS tests
SPECviewperf 2020 v3.0 — MCAD oriented OpenGL
V-Ray Benchmark — best for ray trace testing
VRMark (Orange Room) — for testing VR headset capabilities
SketchUp TTD FPS Test — Run three SU models

Our only real-world test is our script-based SketchUp test, which we run on multiple models. More on that later.

Benchmark Framing

Like our other GPU hardware reviews, one thing we like to do is carry over results from earlier reviews to give the reader comparative scores that we can vouch for since we did the testing on the same hardware used for this review. This way we can see general industry progress with graphics and relate the subject RTX 4000 Ada GPU to some relevant predecessors, competitors, and generation siblings. This is our first Ada generation review so we do not have any Ada siblings test results.

Since we didn’t review the RTX A4000 Ampere predecessor, we have used published results in a few cases plus a proxy result from the Nvidia RTX 3070 Ti (an Ampere-based close comparable) in a Blendermark GPU test.

Our direct predecessor comparison is in the Vray Bench 6 scores, where we have a published score for the Nvidia RTX A4000 to set against the RTX 4000 Ada GPU.

Finally, in addition to comparable discrete GPUs from previous generations (AMD’s and Nvidia’s), we have shown how much faster a discrete GPU can be compared to some SoCs from Apple, Qualcomm, and Intel. This is useful to demonstrate to Mac users, in particular, since they currently do not have any discrete GPUs available to them in modern Mac computers.

Let us proceed then with this last issue first.

Geekbench 6 GPU

As much as the industry continues to tout the advancements of ARM SoCs like Apple’s M-series and Qualcomm’s Snapdragon X Elite (or, for that matter, Intel’s new Intel Core Ultra 9-285K), when it comes to raw GPU power you simply cannot beat a powerful dedicated discrete GPU. The RTX 4000 Ada is 7x faster than the GPUs inside some of the latest System-on-Chips.

Chart 1: Geekbench 6 GPU test results. In-house testing on 4000 Ada and M3 only. (Image: Architosh)

Geekbench 6 is one of the world’s most trusted benchmark applications and was highly touted by the Nuvia team, which was acquired by Qualcomm and developed the Snapdragon X Elite SoC. As a GPU-focused benchmark, Geekbench 6 focuses on GPU compute performance using workloads more applicable to generalized computing rather than CAD, BIM, and 3D applications. These are still meaningful because the average architect, for example, has meaningful workloads that involve working with photography and image editors (not just Adobe products), applying blur effects in conference apps like Zoom, and utilizing algorithms used in real-time renders.

Chart 2: This subtest in the Geekbench 6 GPU tests is relevant, as noted in the paragraph below. In-house testing on 4000 Ada and M3 only. (Image: Architosh)

In particular, the Image Synthesis workloads in this test are meaningful to content creation tasks like image rendering and image processing. Particle Physics workloads use techniques commonly used in both games and pro animation apps used for special effects and film editing.

Cinebench 2024

As noted earlier, the Maxon Cinebench 2024 test is particularly relevant. The GPU benchmark returned to the Maxon tool after an absence, now using the Redshift render engine rather than OpenGL as in the past. Redshift is a well-regarded, fully GPU-accelerated, biased renderer with wide deployment inside DCC tools like Maya, Houdini, Cinema 4D, 3ds Max, Blender, and others. It is available in all three Nemetschek BIM brands: Allplan, Archicad, and Vectorworks. As such, Cinebench 2024 has become a pivotal benchmark for both DCC and AEC industries.

Chart 3: Cinebench 2024 GPU test uses the popular Redshift Renderer. In-house testing on 4000 Ada and M3 only. (Image: Architosh)

Since we do not have a Cinebench 2024 score for the Ampere-generation RTX A4000, we have included a range of published scores (sources include www.render4you.com and www.cgdirector.com) on some GPUs we have tested using other benchmarks (see upcoming benchmarks). Since the Nvidia RTX 3070 Ti—a gaming GPU—is the closest matching Ampere generation GPU, we have included a published score for that GPU. Published scores must be taken with a grain of salt.

Blendermark GPU

We want to turn our attention to Blendermark GPU, another excellent test for ray tracing performance. This benchmark is also a cross-platform test so we will be including it as part of our stable of benchmarks for GPUs of all kinds in the future.

Chart 4: Blendermark is another excellent raytracing benchmark. In-house testing on 4000 Ada and M3 only. (Image: Architosh)

Again, we have carried over the same set of reference chips, using published scores except as noted in the image description. As we can see, the Ada 4000 is 1.19x faster (which is essentially 20% faster) than the Ampere generation consumer comparable GPU. In comparison to our SoC M3, it is 4.59x faster, which is tremendously more powerful.

It should be noted that using the Nvidia RTX 3070 Ti as an Ampere-generation stand-in for the RTX A4000, the immediate predecessor of the Ada generation RTX 4000, only goes so far. While the 3070 Ti and A4000 were identical chip-wise, each with 6144 CUDA cores, 48 RT cores, and the same transistor count (17.4 billion), the RTX 3070 Ti had a higher boost clock speed (1770 MHz versus 1560 MHz) and higher power consumption (290W versus 140W) than the RTX A4000 GPU. Sources cite about 21% higher performance for the 3070 Ti.

Therefore, in the test scores above, the approximate performance delta in both Blendermark and Cinebench amounts to a 20% improvement over a gamer GPU that itself had a 21% improvement over the RTX A4000. Compounding the two, the RTX 4000 Ada relative to the RTX A4000 = 1.21 x 1.20 ≈ 1.45, or about 45% faster.
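
The chained estimate above multiplies two relative-performance ratios. A minimal Python sketch of that compounding follows, using the 20% and 21% figures quoted in the text; the function and variable names are illustrative only.

```python
def compound_speedup(*ratios: float) -> float:
    """Multiply a chain of relative-performance ratios into one ratio."""
    result = 1.0
    for ratio in ratios:
        result *= ratio
    return result


ada_vs_3070ti = 1.20       # RTX 4000 Ada over RTX 3070 Ti (about 20% faster)
rtx3070ti_vs_a4000 = 1.21  # RTX 3070 Ti over RTX A4000 (about 21% faster, per cited sources)

ada_vs_a4000 = compound_speedup(ada_vs_3070ti, rtx3070ti_vs_a4000)
percent_faster = (ada_vs_a4000 - 1.0) * 100

print(f"Estimated ratio: {ada_vs_a4000:.2f}x, about {percent_faster:.0f}% faster")
```

Note that the ratios multiply rather than add, and that the result is only as reliable as the proxy figure for the 3070 Ti.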

CompuBench

In this particular benchmark we focus on Catmull-Clark and Subsurface Scattering tests. This time we have all in-house testing results—summarizing the last four GPUs we have tested for both AMD and Nvidia. These tests are really for our own edification but they demonstrate industry progress as the Ada generation RTX 4000 demolishes previous-generation (Ampere and AMD’s direct competitor generation) GPUs.

Chart 5: Catmull-Clark subtests from the CompuBench suite are useful for rendering. In-house testing on all GPUs shown above. (Image: Architosh)

We run the Catmull-Clark SubDivision Surfaces Level 5 test and the Subsurface Scattering rendering tests in this GPU-compute oriented, OpenCL benchmark suite. The algorithms in these tests are very applicable to real-world professional 3D computer graphics. Software using the Catmull-Clark algorithm includes most leading CAD and 3D software tools from AutoCAD to Maya to Rhino. The algorithm tested here recursively breaks down surfaces into further surfaces to achieve “curved surfaces.” (see results above).

Chart 6: The Subsurface Scattering subtests show substantial performance advantages for the RTX 4000 Ada GPU. In-house testing on all GPUs shown above. (Image: Architosh)

As for the Subsurface Scattering results shown above, the Ada generation RTX 4000 is very impressive. (see above).

Image 5: Progressing through CompuBench’s Subsurface Scatter rendering algorithm test.

So what is Subsurface Scattering? SSS describes how light that penetrates the surface of a translucent object scatters by interacting with the material before it exits on the other side. Shining a light behind and through your fingers is a real-life example of subsurface scattering. The rendering test itself looks like Image 5 above.

SPECviewperf – 2020 v3

SolidWorks Composite Tests

This benchmark is a gold standard for OpenGL-based 3D CAD applications and we focus on two particular non-AEC apps in Creo and Solidworks. The benchmark composites run various OpenGL render mode scenes using diverse models and report on frame-rates (FPS) performance.

Chart 7: SPECviewperf 2020 v3 is a top OpenGL benchmark suite. These are our selected Solidworks Composite test scores. In-house testing on all GPUs shown above. (Image: Architosh)

As we can see from the chart, the RTX 4000 Ada GPU is much more than twice as fast as the Radeon Pro W6600–a professional GPU that was meant to compete with the RTX 4000 Ada’s predecessor, the RTX A4000. We also look at this specific benchmark in the economic metrics section below.

Creo Composite Tests

The PTC Creo composite benchmarks offer similar tests using car and submarine models of various complexity and rendering them in OpenGL 4.5. Some tests move models around in simple shaded mode with no AA (anti-aliasing), while others turn on reflections, SSAO, bump maps, transparency with color, and 8x AA.

Chart 8: The SPECviewperf – Creo Composite subtest is another OpenGL set of benchmarks. In-house testing on all GPUs shown above. (Image: Architosh)

On this test, we see that the RTX 4000 Ada GPU is still slightly more than twice as fast as its predecessor’s arch-rival GPU from AMD (W6600).

Creo Shaded Edges Sub-Tests

The subtest with the highest FPS scores is the Scorpion, Shaded no AA test. Comparable to SketchUp’s default settings of AA set to a 4x setting, the Scorpion no AA test helps us see how a GPU will move a 3D model around viewports in this basic OpenGL render mode with no anti-aliasing. The other subtest we put into this score is the World Car Shaded with Edges, 4xAA.

Chart 9: SPECviewperf “Shaded w/ Edges” Creo subtests show two scores combined and averaged with and without anti-aliasing (AA). In-house testing on all GPUs shown above. (Image: Architosh)

We can see from the results that the combined FPS rate for the two shaded subtests is high: (167.14 FPS + 181.49 FPS) / 2 = 348.63 / 2 ≈ 174.3 FPS. So there is a massive 2.3x difference between the NVIDIA RTX 4000 Ada GPU and AMD’s last-generation competitor to the RTX A4000.
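
The combined score is a plain arithmetic mean of the two subtest frame rates. As a small sketch in Python, using the two FPS values quoted above (the variable names are illustrative):

```python
from statistics import mean

# The two subtest frame rates quoted in the text, in FPS.
subtest_fps = [167.14, 181.49]

combined_fps = mean(subtest_fps)
print(f"Combined score: {combined_fps:.3f} FPS")
```

An arithmetic mean weights both subtests equally, so a very high no-AA score can mask a weaker result with anti-aliasing turned on; it is worth reading the individual subtest scores alongside the combined figure.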

One thing to point out is that on these more basic OpenGL render modes, sometimes a more powerful graphics chip doesn’t help you, as you can see between the two different AMD Pro GPUs in the chart, each powered by a different AMD Navi chip. This raises the question of what we might see on this test if we tested the RTX 2000 Ada GPU, which uses the Nvidia AD107 chip, not the AD104 chip inside the RTX 4000.

V-Ray 5 GPU RTX Benchmark

In this first chart we had a previously unpublished score for Vray Bench 5 for the Ampere generation RTX A2000 GPU and compared that to our Ada generation RTX 4000.

Chart 10: Vray Bench 5 GPU. In-house testing on the Ada generation GPU only. (Image: Architosh)

We can see the Ada generation GPU scores are dramatically better than the previous-generation GPU, but they are not comparable models. We feel the next test is far more relevant as it compares generation to generation at the model level.

V-Ray 6 GPU RTX Benchmark

For the Vray 6 benchmark, we have a published score for the RTX A4000 and compare that to our in-house tested RTX 4000 Ada GPU. Despite the two GPUs having the same number of RT cores (something clearly different between the 2000 series and 4000 series GPUs above), we can see how next-generation RT cores boost raytracing performance in the Ada GPU. The Vray bench improvement from Ampere to Ada for the 4000 series GPUs is about 77 percent (5,331 versus 3,012 vpaths).

Chart 11: Vray Bench 6 GPU. In-house testing on the Ada generation GPU only. (Image: Architosh)

While we did not do an economic value chart on Vray bench like we did on Cinebench (below), if we did the values on the scores reported for the Vray bench 6 results above they work out like this:

To achieve one compute unit (CU) for the Ampere generation GPU the formula is:

1,000 USD / 3,012 Vpaths ≈ 0.332 USD per Vpath (CU)

To achieve one compute unit (CU) for the Ada generation GPU the formula is:

1,200 USD / 5,331 Vpaths ≈ 0.225 USD per Vpath (CU)

On an economic basis this is roughly a 47 percent value improvement per dollar spent (0.332 / 0.225 ≈ 1.47), which is excellent and greater than the Cinebench economic comparisons below. This also compounds with time savings (the value of a professional’s time, which we talk about at the end).
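
The two cost-per-CU figures and their ratio can be reproduced with a short Python sketch. It uses the prices and vpath scores quoted above and keeps full precision throughout, so the ratio it prints may differ slightly from one computed from two-decimal figures. The variable names are illustrative.

```python
# Prices (USD) and V-Ray 6 scores (vpaths) as quoted in the text.
ampere_price_usd, ampere_vpaths = 1000, 3012
ada_price_usd, ada_vpaths = 1200, 5331

ampere_cost_per_vpath = ampere_price_usd / ampere_vpaths
ada_cost_per_vpath = ada_price_usd / ada_vpaths

# Ratio above 1.0 means the Ada GPU buys more benchmark score per dollar.
value_ratio = ampere_cost_per_vpath / ada_cost_per_vpath

print(f"Ampere: {ampere_cost_per_vpath:.3f} USD per vpath")
print(f"Ada:    {ada_cost_per_vpath:.3f} USD per vpath")
print(f"Value ratio: {value_ratio:.2f}x")
```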

VRMark Bench

The Ada card is 3x faster than the baseline requirement for the older Oculus Rift or HTC Vive.

Chart 12: VRMark Orange Room Test. In-house testing on all GPUs shown above. (Image: Architosh)

On this test we had our previously reviewed RTX A2000 GPU and Radeon Pro W6400 scores. The Ada generation RTX 4000 performed twice as fast as the Ampere generation RTX A2000.

Real-World SketchUp 2019 Tests

As we noted in our last GPU review, SketchUp is quite symbolic of the prevalence of “CPU, frequency-bound” CAD industry applications. Whether in Rhino, SketchUp, or BIM tools like Revit, the performance of “design and modeling” work is tied to having the fastest single-core frequency in your CPU. Thus, the Geekbench 6 single-core benchmark is the ultimate mark folks should be looking for.

However, the GPU does play a key role in SketchUp as it essentially draws the screen, using OpenGL 3.1 in the version 2019 builds we test with. We used the built-in “Time.test_display” Ruby script to test three models of 1.3 MB, 13 MB, and 200 MB in size. You can learn more about our SU test files from previous reviews here and here.

It should be noted that SketchUp version 2024 introduced a new graphics engine and left OpenGL behind in favor of Metal (on Mac) and DirectX (on Windows). Regardless of the graphics engine, the GPU manages what you see in the viewport, from panning and zooming to, importantly, orbiting. The GPU impacts how smoothly these functions work with models of all sizes, and this applies to BIM tools like Revit as well.

That being said, our tests continue to show that entry-level workstation-class GPUs deliver nearly the same performance as mid-level GPUs like the RTX 4000 Ada.

Chart 13: SketchUp v. 2019 TTD FPS Test for Combined Architosh SU Models. Reader beware, the numbers on the charts matter more than the lengths of the bars. In-house testing on all GPUs shown above. (Image: Architosh)

As we can see, the RTX 4000 Ada GPU was faster than GPUs we have tested in the past, but the deltas are rather insignificant. The reason to buy the Nvidia RTX 4000 GPU if you are a SketchUp user is that it can power popular SketchUp renderers like V-Ray and Octane Render. Aside from panning, zooming, and orbiting, the power of a GPU also impacts anti-aliasing in SketchUp. The default is 4x AA but the preference settings allow up to 64x AA.

On the next page we will look at value economics and conclusions.

Economic Metrics

As earlier noted in our GPU Economics section on page one, we like to compare GPUs on a performance-per-dollar basis and we look at the cost in USD per Compute Unit (CU) of a particular benchmark score. The formula is:

Cost in USD per Compute Unit (CU) = cost of GPU / benchmark score

Let’s now review the three benchmarks where we run these types of numbers.

Creo and Solidworks Composite Tests

In this test, we selected particular viewsets for the PTC Creo and SolidWorks parts of the SPECviewperf 2020 V3 benchmark and compared them on an economic basis. The Composite tests look broadly across both Creo and Solidworks at mixed types of OpenGL renderer setting levels, from simple shaded with edges to more complex OpenGL rendering settings including transparency, reflections, bumps, textures and anti-aliasing (AA).

Chart 14: SPECviewperf – Composite Performance (CU) per Dollar. Shorter bars are better in these blue charts. In this economics-value-oriented chart, the cost of one compute unit (CU) score in our selected Creo + Solidworks viewsets is shown above in USD. The Nvidia RTX 4000 Ada delivers solid value compared to our previously reviewed GPUs and reference scores. In-house testing on all GPUs shown above. (Image: Architosh)

This Composite across Creo and SolidWorks gives a wide gamut of OpenGL performance across MCAD and other OpenGL-based applications. We can see on a Cost per CU basis, the Nvidia RTX 4000 Ada GPU ranks better than a previous-generation competitor in the AMD Radeon Pro W6600 GPU, but not quite matching W6600’s little sibling.

Creo & SolidWorks Shaded & Edge Performance Tests

On the Shaded & Edge Performance subtests, we are looking at addressing the fact that a large share of users still work in viewports rendered with “shaded with edge mode” with or without AA (anti-aliasing) turned on. While this basic OpenGL rendering type is still quite common, it is waning in the face of more sophisticated real-time rendering capabilities powered by modern GPUs.

Chart 15: SPECviewperf – Shaded & Edge Performance per Dollar. Shorter bars are better in these blue charts. In-house testing on all GPUs shown above. (Image: Architosh)

As such, given the progress and transition to more modern graphics APIs like Vulkan and DirectX, this particular rendering mode is losing value in the grand scheme of things. If this more basic rendering mode is all your tools require, then a lower-tier GPU is all you really need. That being said, the RTX 4000 Ada GPU delivered astonishingly high scores in this benchmark subset (see Charts 8 and 9).

Cinebench GPU (Redshift) Tests

In this set of tests we had Cinebench GPU benchmark scores for the previous generation Nvidia workstation GPU—the Ampere generation RTX A4000. We also had our in-house testing of the Ampere generation RTX A2000.

Chart 16: Cinebench GPU Performance per Dollar. Shorter bars are better in these blue charts. In-house testing on all GPUs shown above. (Image: Architosh)

We can see that compared to the previous Ampere generation GPU, the RTX 4000 Ada GPU delivers a 20 percent value improvement. This is an important point because the latest GPU is more expensive than its predecessor.
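
A value gain that survives a price increase implies an even larger raw performance gain. The Python sketch below backs that figure out of the cost-per-CU definition. It assumes the 1,000 USD launch price for the RTX A4000 and the 1,200 USD figure this review uses in its V-Ray calculation; the function name is hypothetical, and a different price input changes the result.

```python
def implied_performance_ratio(value_gain: float, old_price: float, new_price: float) -> float:
    """Back out new/old benchmark-score ratio from a cost-per-CU improvement.

    value_gain is defined as (old cost per CU / new cost per CU) - 1.
    Since cost per CU = price / score, the score ratio equals
    (1 + value_gain) * new_price / old_price.
    """
    return (1.0 + value_gain) * new_price / old_price


# Assumed inputs: 20% value gain, 1,000 USD predecessor, 1,200 USD Ada GPU.
ratio = implied_performance_ratio(value_gain=0.20, old_price=1000.0, new_price=1200.0)
print(f"Implied performance ratio: {ratio:.2f}x")
```

With these inputs the implied ratio lands close to the proxy-based generational estimate derived in the Blendermark section.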

Performance Conclusions

OpenGL Workflows

We can see a massive 2.3x speed difference on our SPECviewperf Creo Shaded w/ Edges subtests with the RTX 4000 Ada GPU. Watching this test in action was quite the thrill as the models were whipped around at blinding speeds. For those doing basic OpenGL workflows in MCAD tools or AEC tools with less complex rendering modes, the RTX 4000 Ada GPU will supercharge your workloads. The only question is, would the smaller sibling RTX 2000 Ada GPU produce a similar score under these tests? We honestly would love to know, so we may push to review the RTX 2000 Ada GPU next.

The Value Proposition

Whether in AEC or product design, the real value obtained in this GPU is in better rendering workflows, including ray tracing and real-time ray tracing. These may come about via new APIs like DirectX Ray Tracing (DXR) or the ray tracing components of the Vulkan API. As noted in our Blendermark section above, the Ada generation GPU is about 45 percent faster than (roughly 1.45x the performance of) the previous Ampere generation predecessor GPU at rendering workloads such as Blendermark GPU or Cinebench (Redshift GPU). We also noted the impressive Subsurface Scattering test results, another aspect of rendering. And the V-Ray benchmark tests show remarkable improvements over the Ampere generation predecessor.

In terms of OpenGL workflows in AEC and MCAD applications, the value proposition changes depending on how complex your viewport rendering settings are. We can see that in our subtests like the Shaded with Edges tests with and without AA. While many MCAD products leverage OpenGL 4 and above, our Creo and Solidworks subtests inside SPECviewperf v3 leverage and require OpenGL 4.5. This version of OpenGL featured GPU-based hardware tessellation for dynamic level-of-detail mesh refinement. The 4000 Ada GPU was incredibly quick in all the SPECviewperf tests, posting 328 FPS in our Solidworks Composite tests.

In general, for MCAD or AEC workflows without a lot of rendering workloads thrown in, the RTX 4000 Ada GPU will deliver incredible performance but on a pure economic basis one might get very similar results from the Ada generation RTX 2000 GPU or a discounted RTX A4000, which is still available on the market. However, if you have a decent amount of rendering workloads (say 20 percent or above) the value proposition changes instantly.

Recommendations

For visualization professionals in AEC or product design, the Nvidia RTX 4000 Ada GPU is a fantastic value. Even though the GPU is priced USD 200 more than its predecessor A4000 Ampere GPU, it delivers approximately 20 percent greater compute performance per dollar. That value savings compounds with the value of a professional’s time, which this publication has explained in quite some detail in an article titled, “BIM Manager: The Economic Value of Workstation Performance.”

While we won’t rehash that article’s economic conclusions, or those of a preceding article titled, “BIM Manager: The Economic Value of Rapid Response Time,” we know already that the savings or recaptured value of configuring workstations correctly can be as high as USD 6,480 or more, annually. Thus, the cost of a GPU, when added to workstation costs, pales in comparison to the opportunity cost of failing to optimize your CPU and GPU configurations for your workflows.

Other things we like about this GPU are the DLSS (Deep Learning Super Sampling) support for Enscape and much better video encode/decode capabilities. And then, of course, there is the 20 GB of memory on this GPU. As applications and models increase in size, so too does the need for more memory. While there are a few more Ada generation pro GPUs to select from, unless you do visualization all day long, the RTX 4000 is likely the high point for the average AEC and product design user doing a fair amount of visualization. And again, if you don’t do much visualization in your daily work, then the Ada generation RTX 2000 is likely a great choice, though we don’t know if a better value would be obtaining the older Ampere generation A4000 versus the RTX 2000 Ada. That is a question for another review.

In the meantime, we see AEC applications evolving away from OpenGL to low-level graphics APIs like DirectX, Vulkan and Metal. As real-time raytracing continues to penetrate more and more professional applications, benchmarks like Blender, Vray and Redshift continue to be highly meaningful, while SPECviewperf remains mostly OpenGL based and more meaningful to MCAD tools than AEC tools.

To learn more about the Nvidia RTX 4000 visit here.

Pros: Industry-leading performance per watt and the most powerful single-slot professional GPU in the market for CAD industry users. The GPU offers stunning performance gains over the Ampere predecessor GPU (RTX A4000) in raytracing rendering, benefitting from third-generation RT cores (48 cores) (see Charts 3-4 and 10-11). V-Ray, Blender, and Redshift GPU rendering performance is excellent compared to the predecessor GPU, with a 20% economic value boost in Redshift as an example (see Chart 16). New DLSS support in Enscape and AI technologies are another great benefit; new encode/decode engines; incredibly high FPS scores in OpenGL MCAD workflows with a mixture of rendering modes; plus 20 GB GDDR6 ECC memory.

Cons: Perhaps given AMD’s exit from the professional GPU market, the entire Ada Lovelace generation GPUs are a bit more expensive. On a performance per dollar basis, the Ada generation has been criticized for being less impressive than the prior generation jump, and the RTX 4000 model, in particular, is the least impressive. In our Cinebench Cost per CU chart, we note only a 20% improvement over the previous generation GPU. This is really the only negative we can reasonably come up with.

Advice: For CAD industry professionals with visualization workloads, the RTX 4000 Ada GPU is an excellent solution for your workstation build or upgrade. For dedicated visualization professionals, the Ada generation GPUs above this solution will possibly provide even better value for your dollar.

Cost: 1,200 USD MSRP

Volume of New Content = 4.5 — The RTX 4000 Ada GPU delivers similar technologies to its predecessor in similar amounts, but built on updated underlying technologies. However, notable new additions include new DLSS (Deep Learning Super Sampling) AI technologies, which are active in tools like Enscape, plus an excellent new underlying chip (AD104) built on a very efficient custom node from TSMC. There is also a notable increase in onboard memory (20 GB, up from 16 GB).

Quality of Execution = 4 — The new GPU delivers industry-leading performance for a single-slot GPU, based largely on its new chip. Ironically, the memory bandwidth is 360 GB/s on a 160-bit-wide memory interface, which is less than the predecessor GPU’s 448 GB/s on a 256-bit-wide memory interface. This felt like a step backward.

Underlying Technologies = 5 — The new AD104 chip is built on a custom TSMC 4nm “4N” node which differs from the TSMC N4 node used for other chipmakers. This new Ada generation chip packs twice as many transistors as the previous Ampere generation chip into an efficient design, with a great balance of performance versus power efficiency. It also features improved RT cores for raytracing and 4th gen Tensor cores for AI features like DLSS 3 noted above.

Future Proofing = 5 — The GPU is on the leading-edge of graphics technologies, including new AI features (DLSS 3) driven by its Tensor cores. It also offers the leading-edge memory (GDDR6), PCIe Gen 4 and the AV1 encode and decode support. These capabilities mean the Ada GPU will be capable of delivering advanced performance for years to come.
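
As a footnote to the memory bandwidth comparison in the Quality of Execution rating, the quoted figures can be unpacked with a short Python sketch. It assumes the standard relation that peak bandwidth equals bus width in bytes times the per-pin data rate; the function name is hypothetical and the data rates it prints are derived from the quoted numbers, not read from a spec sheet.

```python
def implied_data_rate_gbps(bandwidth_gb_s: float, bus_width_bits: int) -> float:
    """Per-pin data rate (Gbps) implied by peak bandwidth and bus width.

    Assumes: bandwidth (GB/s) = bus_width_bits / 8 * data_rate (Gbps).
    """
    return bandwidth_gb_s * 8 / bus_width_bits


ada_rate = implied_data_rate_gbps(bandwidth_gb_s=360, bus_width_bits=160)
ampere_rate = implied_data_rate_gbps(bandwidth_gb_s=448, bus_width_bits=256)

print(f"RTX 4000 Ada: {ada_rate:.1f} Gbps per pin")
print(f"RTX A4000:    {ampere_rate:.1f} Gbps per pin")
```

Under that assumption, the newer card runs faster memory per pin but over a narrower bus, which is why its total bandwidth comes out lower.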