V-Ray GPU benchmarks on top-of-the-line NVIDIA GPUs

Introduction

V-Ray’s GPU rendering and NVIDIA’s hardware are constantly improving. Recently, there have been major advances in both, so we thought now would be the perfect time to run new benchmarks and find out how much faster everything might be.

The hardware

With 40 logical CPU cores and 128GB RAM, the Lenovo P900 is powerful. It’s great for GPU tests, since there’s space for three double-slot GPUs and one single-slot GPU. Plus, the toolless chassis makes it quick to pop cards in and out. The tests felt like an F1 pit stop for GPUs.

The GPUs we decided to test are as follows:

GPU	Architecture	Cores	RAM type	RAM	Power	Slots	Street Price*
GP100	Pascal	3584	HBM2	16GB	235W	2	N/A
P6000	Pascal	3840	GDDR5X	24GB	250W	2	$4,699
P5000	Pascal	2560	GDDR5X	16GB	180W	2	$2,499
P4000	Pascal	1792	GDDR5X	8GB	105W	1	N/A
M6000	Maxwell	3072	GDDR5	24GB	250W	2	$4,539
Titan X (Pascal)	Pascal	3584	GDDR5X	12GB	250W	2	$1,599

*Street prices approximate, based on a quick search at Newegg and Amazon. The GP100 and P4000 are not public yet, so no pricing is available.

The benchmark test

Even before the benchmarks started, we were very interested to see NVIDIA’s new NVLink tech in action. Because NVLink allows cards to share memory, we were curious to see what sort of performance we could get using two new GP100s. More on this later.

Our lead GPU developer, Blago Taskov, and I set up the benchmarks. To get better data, we decided to test multiple scenes instead of just one. We batch rendered nine different scenes and recorded the time to complete each one. Then, we added up the total time for all nine.

Here are the results:

	Test 1	Test 2	Test 3	Test 4	Test 5	Test 6	Test 7	Test 8	Test 9	Total time
GP100 x 2	46.49	130.36	156.69	29.43	112.99	39.88	40.21	107.75	19.94	683.74
GP100	90.72	251.81	295.52	50.84	220.51	77.72	76.94	202.28	38.02	1304.36
P6000	127.21	363.18	410.72	72.17	348.99	131.39	109.64	264.82	61.83	1889.95
P5000	188.18	536.69
P4000	212.54	636.84	724.22	131.86	565.83	207.79	178.6	455.61	104.62	3217.91
M6000	140.13	483.71	538.86	97.59	423.11	159.04	134.79	351.91	73.15	2402.29
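As a quick sanity check on the table, the sketch below re-adds the nine per-scene times for each card with a complete row and compares one GP100 against two, scene by scene. The names `times`, `total_time` and `two_gpu_speedups` are made up for this illustration, and the P5000 and Titan X (Pascal) rows are left out because their per-scene values are not fully listed above. The article does not state the time unit, so the values are treated as plain numbers.

```python
# Per-scene render times copied from the results table above.
# The unit is not stated in the article, so these are treated as plain numbers.
times = {
    "GP100 x 2": [46.49, 130.36, 156.69, 29.43, 112.99, 39.88, 40.21, 107.75, 19.94],
    "GP100": [90.72, 251.81, 295.52, 50.84, 220.51, 77.72, 76.94, 202.28, 38.02],
    "P6000": [127.21, 363.18, 410.72, 72.17, 348.99, 131.39, 109.64, 264.82, 61.83],
    "P4000": [212.54, 636.84, 724.22, 131.86, 565.83, 207.79, 178.6, 455.61, 104.62],
    "M6000": [140.13, 483.71, 538.86, 97.59, 423.11, 159.04, 134.79, 351.91, 73.15],
}


def total_time(card):
    """Sum of the nine scene times for one card, rounded like the table."""
    return round(sum(times[card]), 2)


def two_gpu_speedups():
    """Per-scene ratio of single-GP100 time to dual-GP100 time."""
    return [round(one / two, 2)
            for one, two in zip(times["GP100"], times["GP100 x 2"])]


for card in times:
    print(f"{card:10s} total = {total_time(card):.2f}")
print("GP100 x 2 speedup per scene:", two_gpu_speedups())
```

A per-scene ratio close to 2 means the second GP100 is almost fully used on that scene. Ratios well below 2 suggest some fixed cost that does not shrink when a card is added.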

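The table below compares the total times as ratios. Each cell is the column card’s total time divided by the row card’s total time, so a value above 1 means the row card finished faster than the column card: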

	GP100 x 2	GP100	P6000	P5000	P4000	M6000	Titan X (Pascal)
GP100 x 2	1	1.907684	2.764135	4.059496	4.706336	3.513455	2.7042882
GP100	0.524196	1	1.448948	2.127971	2.467041	1.841738	1.4175764
P6000	0.361777	0.690156	1	1.468631	1.702643	1.271087	0.9783486
P5000	0.246336	0.469931	0.680906	1	1.15934	0.86549	0.6661635
P4000	0.21248	0.405344	0.587322	0.86256	1	0.746537	0.5746059
M6000	0.28462	0.542965	0.786728	1.155414	1.339518	1	0.7696947
Titan X (Pascal)	0.369783	0.705429	1.022131	1.501133	1.740323	1.299216	1
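The ratios appear to be the column card’s total time divided by the row card’s total time. The sketch below, with names invented for this illustration, reproduces a few cells from the totals in the results table. It also multiplies the dual-GP100 total by the ratios in the first row to estimate totals for the P5000 and Titan X (Pascal), whose totals are not listed in the results table. Those two figures are implied estimates, not measured values from the article.

```python
# Total times taken from the results table (unit not stated in the article).
totals = {
    "GP100 x 2": 683.74,
    "GP100": 1304.36,
    "P6000": 1889.95,
    "P4000": 3217.91,
    "M6000": 2402.29,
}


def relative_speed(row, col):
    """Column card's total time divided by the row card's total time.

    A result above 1 means the row card finished faster.
    """
    return totals[col] / totals[row]


# Ratios copied from the "GP100 x 2" row of the comparison table, for the
# two cards whose totals do not appear in the results table.
ratio_vs_dual_gp100 = {"P5000": 4.059496, "Titan X (Pascal)": 2.7042882}


def implied_total(card):
    """Estimate a missing total from its published ratio (an assumption)."""
    return totals["GP100 x 2"] * ratio_vs_dual_gp100[card]


print(round(relative_speed("GP100 x 2", "GP100"), 6))
print(round(relative_speed("P6000", "M6000"), 6))
for card in ratio_vs_dual_gp100:
    print(card, "implied total:", round(implied_total(card), 2))
```

Read this way, the M6000 row’s GP100 column (about 0.54) says the M6000 runs at roughly 54% of a single GP100’s speed.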

A note about RAM

RAM plays a big part in the value of these cards. For example, the Titan X (Pascal) and P6000 showed similar times across all the tests. On some the Titan X was faster, and on others the P6000 beat it outright. In overall time, the Titan X narrowly edged out the P6000. But that’s not the whole story. While both cards were neck and neck in speed, the choice (and cost) comes down to RAM. The Titan X is significantly less expensive with 12GB of RAM, but the P6000 can fit much more data with its 24GB. You might be able to give yourself a little more breathing room on that 12GB card with V-Ray 3.5’s On-demand Mip-mapping, which can dramatically reduce the RAM required for loading textures. Ultimately, it comes down to your budget and how much memory you really need.
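To make the budget-versus-memory trade-off concrete, here is a minimal sketch that picks the least expensive card from the hardware table whose RAM can hold a scene of a given size. The function name `cheapest_card_that_fits` and the scene sizes are hypothetical. Cards without a listed street price are skipped, and render speed is deliberately ignored.

```python
from typing import Optional

# (name, RAM in GB, approximate street price in USD) from the hardware table.
# None marks cards for which the article lists no price.
cards = [
    ("GP100", 16, None),
    ("P6000", 24, 4699),
    ("P5000", 16, 2499),
    ("P4000", 8, None),
    ("M6000", 24, 4539),
    ("Titan X (Pascal)", 12, 1599),
]


def cheapest_card_that_fits(scene_gb: float) -> Optional[str]:
    """Return the lowest-priced card whose RAM is at least scene_gb.

    Only cards with a known price are considered. Returns None when no
    priced card has enough memory.
    """
    candidates = [
        (price, name)
        for name, ram_gb, price in cards
        if price is not None and ram_gb >= scene_gb
    ]
    return min(candidates)[1] if candidates else None


for scene_gb in (10, 14, 20, 30):
    print(scene_gb, "GB ->", cheapest_card_that_fits(scene_gb))
```

Price alone is a crude filter. For a 20GB scene it picks the M6000, while the benchmark shows the slightly more expensive P6000 is noticeably faster, so a real decision would weigh render time as well.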

Let’s say you want to render a huge scene with lots of geometry and textures. If you need more than 24GB, that’s where NVLink comes in.

What is NVLink?

Of the cards we tested, only the GP100 supports NVLink. Combined with the card’s fast HBM2 memory, the link is quick enough for memory to be shared across cards. It may look similar to SLI, but it’s not the same. In our setup we connected two GP100s. In theory, with specialized hardware, it’s possible to link more. For example, NVIDIA’s DGX-1 does this with eight P100 GPUs. At $129,000 it’s a little out of our price range, but we’d love to test one. If we do, we’ll be sure to share the results.

V-Ray and NVLink

We’ve enabled NVLink in the latest V-Ray nightly builds. To test it, we enlisted the help of our friends at Dabarti Studio, and they created this torture test.

Model and assets courtesy of Dabarti with 169 million polygons and 150+ 6k textures

This scene contains 169 million polygons and over 150 6K images. The geometry alone won’t fit on a single card, not to mention all those high-res textures.

Time to render. First, we set all objects to Dynamic Geometry in the V-Ray Properties. This made it possible for the geometry to be shared across the cards. Then, we disabled On-demand Mip-mapping to force the full-resolution textures to load. Once the cards were fully loaded, each one used 13GB of its 16GB RAM. That’s a total of 26GB across both cards, more than the 24GB a P6000 can hold.
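The memory arithmetic is simple enough to spell out. This small sketch uses only the figures quoted in this article, and the variable names are made up for the illustration. It is not a model of how V-Ray or NVLink actually allocates memory.

```python
# Figures quoted in the article.
gp100_ram_gb = 16       # RAM on each GP100
used_per_card_gb = 13   # memory in use on each GP100 during the test
linked_cards = 2        # GP100s connected with NVLink
p6000_ram_gb = 24       # largest single-card RAM among the tested GPUs

# Combined footprint of the scene across the linked cards.
total_used_gb = used_per_card_gb * linked_cards

# Headroom left on each GP100 while rendering.
headroom_per_card_gb = gp100_ram_gb - used_per_card_gb

# True when the scene would not fit on a single 24GB card.
exceeds_largest_single_card = total_used_gb > p6000_ram_gb

print("Total used across cards:", total_used_gb, "GB")
print("Headroom per GP100:", headroom_per_card_gb, "GB")
print("Exceeds a single P6000:", exceeds_largest_single_card)
```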

It worked, and we noticed little or no performance loss with NVLink. It’s still early, but the initial results are positive. Maybe with a few driver updates and V-Ray tweaks, NVLink will perform even better in the future.

Conclusion

Moore’s Law is alive and well. The M6000 arrived about two years ago and today the GP100 is almost twice as fast, right on schedule. The combination of NVIDIA’s latest tech and V-Ray’s most recent advances in GPU rendering seems to remove some of the early memory limitations. And that paints a bright future for GPU rendering. We will continue to run benchmarks and share the results as we get new hardware to test.

Special thanks

Thanks to NVIDIA for loaning us their latest and greatest hardware for stress testing. Also, thanks to Lenovo for supplying Chaos Group Labs with a workstation that can handle some serious computing. And thanks to Tomasz Wyszolmirski at Dabarti Studio for helping us continue to push GPU rendering to its limits.
