The Nvidia RTX 3060 12GB brings a new level of performance to the mainstream market – sort of. Officially, the RTX 3060 launches today with prices starting at just $329. Realistically? You’re as likely to find one at that price as a $399 RTX 3060 Ti, a $499 RTX 3070, or a $699 RTX 3080 – not entirely impossible, perhaps, but highly unlikely. Nvidia’s Ampere architecture now powers many of the best graphics cards, and all of them are in huge demand from gamers and cryptocurrency miners alike. Nvidia added firmware and driver code to detect Ethereum mining, which should help a bit, but when people are willing to pay extreme prices on eBay even for cards like the GTX 1660 Super and the RTX 2060, pretty much everything in our GPU benchmark hierarchy is sold out right now. Nvidia is even working with partners to bring back previous-generation Turing and Pascal cards.
None of this makes it a bad GPU, but we expect the RTX 3060 to be just as difficult to acquire as any other modern GPU. Eventually, Ethereum’s current mining boom will fade away, but it could take a year or more before the chip shortages end. It shouldn’t surprise anyone at this point, but if you’re hoping for a reasonably priced gaming PC upgrade, it’s a depressing state of affairs.
Unlike previous Ampere GPUs, Nvidia won’t be offering an RTX 3060 Founders Edition, so we’re looking at a third-party card. Nvidia sent us the EVGA GeForce RTX 3060 XC for this launch review, a reasonably compact and relatively modest card. There’s no metal (or even plastic) backplate and no RGB lighting, just two custom-sized 87mm fans for cooling in a 2.0-slot form factor. The board measures 202 x 110 x 38mm and weighs 653g, which is quite the change of pace from the other third-party Ampere boards we’ve reviewed so far.
There are reasons for this, of course. Building a consumer card and decking it out with all the bells and whistles costs money, and we believe most gamers shopping for value are much better served by modest designs with good performance. There will certainly be extreme variants of the RTX 3060, and some of them will cost more than budget RTX 3060 Ti models. Let’s be clear: even the fastest RTX 3060 won’t beat a 3060 Ti in most situations – yes, even with 12GB of VRAM. That’s because memory capacity isn’t a huge factor once you go beyond 8GB, and the 3060 Ti’s wider memory bus gives it more memory bandwidth, which is a big advantage. The 3060 Ti also has 35% more GPU cores.
|Graphics card||RTX 3060 Ti||RTX 3060||RTX 2060 Super||RTX 2060|
|Process technology||Samsung 8N||Samsung 8N||TSMC 12FFN||TSMC 12FFN|
|Die size (mm^2)||392.5||276||445||445|
|Base clock (MHz)||1410||1320||1470||1410|
|Boost clock (MHz)||1665||1777||1650||1680|
|VRAM speed (Gbit/s)||14||15||14||14|
|VRAM bus width (bits)||256||192||256||192|
|TFLOPS FP32 (Boost)||16.2||12.7||7.2||6.5|
|TFLOPS FP16 (Tensor)||65 (130)||51 (102)||57||52|
|Release date||Dec-20||Feb-21||Jul-19||Jan-19|
|Introductory price||$399||$329||$399||$349|
Here’s how it breaks down, comparing the RTX 3060 with its closest Ampere sibling and its Turing predecessors. The RTX 2060 and 2060 Super show how much things have changed for 60-class cards between Turing and Ampere. Ampere gives you a lot more shader cores, which means potentially much higher compute performance, plus a minor improvement in memory bandwidth for the 12GB card. It also doubles the VRAM capacity (at least until the expected RTX 3060 6GB shows up, although maybe Nvidia will leave that for the RTX 3050 line) and offers improvements in the RT and Tensor cores, as well as in the memory subsystem, all of which lead to better performance. Power consumption remains similar, with a TGP (Total Graphics Power) of 170W, a decent drop from the 200W TGP of the RTX 3060 Ti.
One interesting detail is that this is the first time Nvidia has used 15 Gbps GDDR6 memory. The RTX 20-series cards all used 14 Gbps memory, except for the RTX 2080 Super, which came with 15.5 Gbps VRAM. This narrows the bandwidth gap between the 3060 and 3060 Ti a bit, although the extra 64 bits of interface width still give GA104 cards a distinct advantage. GA106 has no advantage in ROPs (render outputs) either, as it only has 48 – the same as the RTX 2060.
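To put numbers on that bandwidth gap, here is a minimal sketch that derives peak memory bandwidth from the bus width and memory speed listed in the spec table. The function name `memory_bandwidth_gbs` is ours, chosen for illustration, and the results are theoretical peaks rather than measured throughput.

```python
def memory_bandwidth_gbs(bus_width_bits: int, speed_gbps: float) -> float:
    """Peak bandwidth in GB/s: per-pin data rate (Gbit/s) times the
    number of data pins, divided by 8 bits per byte."""
    return bus_width_bits * speed_gbps / 8


# (bus width in bits, memory speed in Gbit/s), taken from the spec table.
cards = {
    "RTX 3060 Ti": (256, 14),
    "RTX 3060": (192, 15),
    "RTX 2060 Super": (256, 14),
    "RTX 2060": (192, 14),
}

bandwidth = {name: memory_bandwidth_gbs(*spec) for name, spec in cards.items()}

for name, gbs in bandwidth.items():
    print(f"{name}: {gbs:.0f} GB/s")
```

By this math the RTX 3060 lands at 360 GB/s, roughly 7% ahead of the RTX 2060's 336 GB/s but still about 20% short of the 448 GB/s available to the RTX 3060 Ti.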
However, the differences between Turing and Ampere GPUs are not always reflected in specification tables like the one above. Theoretically, the RTX 3060 has up to 95% more FP32 performance and 97% more base FP16 Tensor performance than the RTX 2060. In practice, the actual performance difference is much smaller, as half of the FP32 pipelines share processing resources with the INT32 pipelines. The 3060 should never be slower for gaming, but most of the time it will only be 20-25% faster.
This is the first desktop card to use Nvidia’s GA106 processor. At a high level, there are three Graphics Processing Clusters (GPCs), each with up to 10 SMs and 16 ROPs (the two blocks of eight blue rectangles each at the bottom of the GPC). The full chip has 30 SMs, while the 3060 disables two and ends up with 28 SMs, but everything else is left alone. (Note that the mobile RTX 3060 has all 30 SMs enabled, although it only comes with 6GB of memory, which is also clocked lower than on the desktop card.)
Each SM contains 64 dedicated FP32 CUDA cores, plus another 64 CUDA cores that can handle either FP32 or INT32 – only one of the two types in any given cycle. Each SM also contains one second-generation RT core and four third-generation Tensor cores, each of which offers up to twice the performance of the previous-generation cores, and with sparsity, the Tensor cores are potentially four times faster than on Turing. Finally, there are six 32-bit memory interfaces, each connected to a single 8Gb or 16Gb GDDR6 module – the latter is currently reserved for desktops, with 8Gb modules being used on laptops.
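The SM layout described here is enough to reconstruct the headline specs. The sketch below is a back-of-the-envelope calculation rather than anything from Nvidia: it assumes the usual convention of counting a fused multiply-add as two floating-point operations per lane per clock, and it assumes the memory module densities are in gigabits (16Gb per module on the desktop card, 8Gb on laptops). All names are hypothetical.

```python
SMS_ENABLED = 28          # RTX 3060 desktop: 28 of GA106's 30 SMs
FP32_LANES_PER_SM = 128   # 64 dedicated FP32 + 64 shared FP32/INT32 cores
BOOST_CLOCK_MHZ = 1777    # boost clock from the spec table
MEM_CHANNELS = 6          # six 32-bit memory interfaces
CHANNEL_WIDTH_BITS = 32


def fp32_tflops(sms: int, lanes_per_sm: int, clock_mhz: float) -> float:
    """Peak FP32 rate, assuming one FMA (2 FLOPs) per lane per clock."""
    return sms * lanes_per_sm * 2 * clock_mhz * 1e6 / 1e12


def vram_gb(channels: int, module_gbit: int) -> float:
    """Capacity in GB with one module per channel (8 bits per byte)."""
    return channels * module_gbit / 8


cuda_cores = SMS_ENABLED * FP32_LANES_PER_SM
tflops = fp32_tflops(SMS_ENABLED, FP32_LANES_PER_SM, BOOST_CLOCK_MHZ)
bus_width_bits = MEM_CHANNELS * CHANNEL_WIDTH_BITS
desktop_vram_gb = vram_gb(MEM_CHANNELS, 16)  # assumed 16Gb modules
laptop_vram_gb = vram_gb(MEM_CHANNELS, 8)    # assumed 8Gb modules

print(f"CUDA cores: {cuda_cores}")
print(f"Peak FP32: {tflops:.1f} TFLOPS")
print(f"Bus width: {bus_width_bits} bits")
print(f"VRAM: {desktop_vram_gb:.0f}GB desktop, {laptop_vram_gb:.0f}GB laptop")
```

The peak figure only applies when the shared cores are all doing FP32 work. Any INT32 instructions in the mix eat into it, which is part of why real-world gains fall well short of the theoretical number.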
The complete GA106 chip has 12 billion transistors, compared to 17.4 billion in GA104. This reduces the die size from 393mm² to just 276mm², which not only helps lower the cost of the chip, but also increases the number of chips Nvidia can get from a single wafer. And if you’re wondering, GA106 is less than half the size of GA102, which measures 628.4mm² and has 28.3 billion transistors. By one estimate, Nvidia can get around 130 dies per wafer with GA104 (some of which are defective, most of which end up as partially disabled chips), while GA106’s smaller size allows for around 200 dies per wafer. More dies mean better yields and more graphics cards to go around. Here’s hoping.
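For a rough sense of how die size translates into dies per wafer, here is a sketch using a common first-order approximation: wafer area divided by die area, minus a correction for partial dies lost around the edge. It assumes a 300mm wafer and ignores scribe lines, edge exclusion zones, and die aspect ratio, so it is not the method behind the estimate quoted above and it comes out somewhat higher than those figures.

```python
import math


def gross_dies_per_wafer(die_area_mm2: float,
                         wafer_diameter_mm: float = 300.0) -> float:
    """First-order estimate of candidate dies on a round wafer.

    Wafer area over die area, minus an edge-loss term that grows with
    the wafer circumference. Defects are not considered here.
    """
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return wafer_area / die_area_mm2 - edge_loss


ga104 = gross_dies_per_wafer(392.5)  # GA104 die area in mm^2
ga106 = gross_dies_per_wafer(276.0)  # GA106 die area in mm^2

print(f"GA104: ~{ga104:.0f} dies per wafer")
print(f"GA106: ~{ga106:.0f} dies per wafer")
print(f"GA106 / GA104: {ga106 / ga104:.2f}x")
```

The absolute counts depend heavily on the assumptions, but the ratio is the useful part: the smaller die yields roughly one and a half times as many candidates per wafer, in line with the jump from about 130 to about 200.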