Original Link: https://www.anandtech.com/show/1005



When NVIDIA "launched" the original nForce chipset at Computex 2001 in Taiwan, no one expected it to take almost half a year to reach widespread availability. By the time the chipset was actually available, VIA's KT266A was entrenched in the market and NVIDIA's nForce saw little initial success.

Earlier this year NVIDIA finally enjoyed a handful of OEM design wins for the nForce platform and followed it up with a much more mature launch of nForce2 in July. NVIDIA had supposedly learned from its mistakes, and we were led to believe that nForce2 would not have any of the issues that we found with the original nForce launch.

Looking at today's date, it's now almost three months since NVIDIA "launched" the nForce2 chipset and we're finally able to bring you a review of pre-production hardware. What was this about learning from their mistakes? No manufacturer is perfect, though; even Intel has its own set of screwups when it comes to chipsets (motherboard manufacturers aren't happy that they've had to go from 845 to 845DDR to 845E to 845PE in a year).

With that one gripe out of the way, we actually have been eagerly awaiting the release of NVIDIA's second Athlon XP chipset. Our expectations for the chipset weren't unreasonably high this time, as they were with the original nForce, and there are many more launch partners this time around. While you still shouldn't expect retail availability for another few weeks, boards are either ready or in the final stages of preparation for mass production.

This article will not focus on the architecture behind the nForce2 chipset nor will it detail the graphics performance of nForce2. For chipset details read our Technology Overview of the nForce2 and for integrated graphics performance, we've got another article in the works that you'll come across later this week.



The Motherboards

In looking out for their best interests, AMD picked the most stable, complete and highest performing nForce2 board that was ready in time and sent it out to all reviewers as a part of the Athlon XP 2800+ launch kit (see our review here). The board they chose was ASUS' A7N8X; the board that was sent out is still not final production, although it does use the final production stepping of the nForce2 chipset itself.



The A7N8X is based on the nForce2 SPP (meaning it has no integrated graphics) and uses the MCP-T (dual LAN + Firewire support).

We'll have a review of the board itself shortly but we'll highlight a couple of its features here today.

ASUS implemented 6-channel analog and digital outputs without relying on NVIDIA's SoundStorm reference design for an add-in card.

The BIOS was modified by AMD to provide support for the Athlon XP 2800+ but it's still not final; there are a number of options missing as well as a few glitches (e.g. the system will not bypass extended memory tests even if quickboot is enabled).



NVIDIA also provided their reference board, which boasts a combination of the nForce2 IGP (with integrated graphics) and the MCP-T as well.

The most distinctive feature of this board is NVIDIA's decision to get rid of all serial ports on the board and replace the headers with two VGA connectors to enable on-board multimonitor support through the IGP.



The Test

We started out by comparing nForce2 to VIA's KT400; however, we quickly realized that motherboards based on VIA's KT400 chipset aren't as mature as some of the better KT333 boards (a good KT333 board will outperform a KT400 board at this point). So instead we used the EPoX 8K3A+ (which ended up outperforming the KT400s we originally tried) and didn't bother with DDR400 tests on the VIA platform since we've already shown that the higher bandwidth memory does nothing for the Athlon XP.

Windows XP Professional Test Bed
Hardware Configuration
CPU: AMD Athlon XP 2800+ (2.25GHz)
Motherboard: ASUS A7N8X - NVIDIA nForce2 Chipset
RAM: 1 x 256MB DDR400 CAS2 Corsair XMS3200 DIMM
Sound: None
Hard Drive: 80GB Western Digital Special Edition 8MB Cache ATA/100 HDD
Video Cards (Drivers): NVIDIA GeForce4 Ti 4600 (30.82)



Memory Controller Performance

The key to a high performing chipset lies in two areas: the FSB interface and the memory interface. Implementing the FSB link well is not too difficult; it is the memory controller that takes a great deal of expertise to develop and a lot of experience to perfect. It is the memory controller of these two Athlon chipsets that we'll look at first, hopefully to get an idea of what sort of performance we should expect. We ran all tests using an Athlon XP 2800+ with the FSB set to 333MHz:

Memory Bandwidth Comparison - STREAM

| Test | NVIDIA nForce2 64-bit DDR333 | NVIDIA nForce2 128-bit DDR333 | NVIDIA nForce2 64-bit DDR400 | NVIDIA nForce2 128-bit DDR400 | VIA KT333 64-bit DDR333 |
|---|---|---|---|---|---|
| Copy | 945.9 MB/s | 1296.2 MB/s | 1071.5 MB/s | 1303.9 MB/s | 1069.3 MB/s |
| Scale | 911.3 MB/s | 1250.0 MB/s | 1047.8 MB/s | 1244.5 MB/s | 947.9 MB/s |
| Add | 1005.8 MB/s | 1422.5 MB/s | 1112.1 MB/s | 1451.9 MB/s | 1062.3 MB/s |
| Triad | 1009.6 MB/s | 1377.2 MB/s | 1109.6 MB/s | 1412.1 MB/s | 1035.7 MB/s |
| Average | 968.1 MB/s | 1336.5 MB/s | 1085.2 MB/s | 1353.1 MB/s | 1028.8 MB/s |

Remember that the nForce2 chipset can run in either 64-bit (single channel) mode with only one memory stick installed or 128-bit (dual channel) mode with two or three sticks of memory installed. To test the effects of going to a dual channel architecture we simply plugged in another stick of memory to enable the full 128-bit wide memory bus.
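The DIMM-count rule described above can be sketched in a few lines of Python. The function name `nforce2_bus_width` is hypothetical and exists only for illustration; the mapping itself (one module gives 64-bit, two or three give 128-bit) comes from the text, while rejecting any other module count is an assumption about a three-slot board.

```python
def nforce2_bus_width(dimm_count: int) -> int:
    """Return the memory bus width in bits for a given number of installed DIMMs.

    Hypothetical helper: 1 DIMM -> 64-bit (single channel),
    2 or 3 DIMMs -> 128-bit (dual channel), as described in the article.
    """
    if dimm_count == 1:
        return 64
    if dimm_count in (2, 3):
        return 128
    # Assumption: the board exposes three DIMM slots, so other counts are invalid.
    raise ValueError("expected 1 to 3 DIMMs")


for count in (1, 2, 3):
    width = nforce2_bus_width(count)
    mode = "single channel" if width == 64 else "dual channel"
    print(f"{count} DIMM(s): {width}-bit ({mode})")
```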

With that said, there's a pretty healthy boost in memory bandwidth when going from a single to dual channel setup with the nForce2 (25 - 40% boost in bandwidth). The move to DDR400 yields a 12% increase in memory bandwidth; the unseen penalty is an increase in latency since today's unofficial DDR400 modules don't run at the same aggressive timings as the fastest DDR333 you can buy today.

VIA's KT333 provides an impressive showing, outperforming the nForce2 by 6% in 64-bit DDR333 mode. Once you realize that a 6% advantage in memory bandwidth will not translate into any sort of real-world performance advantage, it's clear that just by looking at the bandwidth figures the nForce2 and KT333 will be very close performers.
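For readers who want to check the percentages quoted above, here is a minimal sketch that recomputes them from the STREAM figures in the table. The helper names (`average`, `percent_gain`) are illustrative and not part of any benchmark tool.

```python
# STREAM results in MB/s, in the order Copy, Scale, Add, Triad (from the table above).
stream = {
    "nForce2 64-bit DDR333": [945.9, 911.3, 1005.8, 1009.6],
    "nForce2 128-bit DDR333": [1296.2, 1250.0, 1422.5, 1377.2],
    "nForce2 64-bit DDR400": [1071.5, 1047.8, 1112.1, 1109.6],
    "nForce2 128-bit DDR400": [1303.9, 1244.5, 1451.9, 1412.1],
    "KT333 64-bit DDR333": [1069.3, 947.9, 1062.3, 1035.7],
}


def average(values):
    """Arithmetic mean of the four STREAM kernels."""
    return sum(values) / len(values)


def percent_gain(new, old):
    """Relative gain of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0


avg = {name: average(values) for name, values in stream.items()}
base = avg["nForce2 64-bit DDR333"]

dual_ddr333_gain = percent_gain(avg["nForce2 128-bit DDR333"], base)
dual_ddr400_gain = percent_gain(avg["nForce2 128-bit DDR400"], avg["nForce2 64-bit DDR400"])
ddr400_gain = percent_gain(avg["nForce2 64-bit DDR400"], base)
kt333_lead = percent_gain(avg["KT333 64-bit DDR333"], base)

print(f"Dual channel gain at DDR333: {dual_ddr333_gain:.1f}%")
print(f"Dual channel gain at DDR400: {dual_ddr400_gain:.1f}%")
print(f"DDR400 over DDR333 (64-bit): {ddr400_gain:.1f}%")
print(f"KT333 lead over nForce2 (64-bit DDR333): {kt333_lead:.1f}%")
```

The results line up with the 25 - 40% dual channel range, the 12% DDR400 gain and the 6% KT333 lead discussed in the text.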

Memory Latency Comparison - Cachemem

| | NVIDIA nForce2 64-bit DDR333 | NVIDIA nForce2 128-bit DDR333 | NVIDIA nForce2 64-bit DDR400 | NVIDIA nForce2 128-bit DDR400 | VIA KT333 64-bit DDR333 |
|---|---|---|---|---|---|
| Latency (clocks - lower is better) | 244 | 244 | 298 | 305 | 349 |

Here's where we complete the picture; by looking at the latency in addition to the bandwidth we can get an idea of what sort of performance to expect out of the various configurations.

Remember the 12% boost in bandwidth we saw on the nForce2 by going to DDR400? That 12% increase in bandwidth comes at the cost of a 22% increase in latency! The increase in latency is not only due to the slower memory timings DDR400 modules run at but also because the memory bus is no longer synchronous with the FSB when running in DDR400 mode whereas DDR333 matches up perfectly with the new 333MHz Athlon XP FSB.

There's no increase in latency when going from a single channel DDR333 to a dual channel DDR333 setup on the nForce2 platform. There is a slight increase when making the same transition with DDR400 because we had to increase some of the timing delays in order to run two channels of DDR400 with the nForce2 while maintaining stability.

Also take note of the extremely poor latency of VIA's KT333 memory controller; accesses take almost 43% longer on the KT333 than they do on the nForce2 when running in 64-bit DDR333 mode.
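The latency percentages in this section follow directly from the Cachemem figures. A short sketch of the arithmetic, with `percent_increase` as an illustrative helper name:

```python
# Cachemem latency in clocks (lower is better), from the table above.
latency_clocks = {
    "nForce2 64-bit DDR333": 244,
    "nForce2 128-bit DDR333": 244,
    "nForce2 64-bit DDR400": 298,
    "nForce2 128-bit DDR400": 305,
    "KT333 64-bit DDR333": 349,
}


def percent_increase(new, old):
    """How much larger `new` is than `old`, in percent."""
    return (new / old - 1.0) * 100.0


base = latency_clocks["nForce2 64-bit DDR333"]
ddr400_penalty = percent_increase(latency_clocks["nForce2 64-bit DDR400"], base)
kt333_penalty = percent_increase(latency_clocks["KT333 64-bit DDR333"], base)
dual_ddr400_penalty = percent_increase(
    latency_clocks["nForce2 128-bit DDR400"], latency_clocks["nForce2 64-bit DDR400"]
)

print(f"DDR400 vs DDR333 on nForce2 (64-bit): +{ddr400_penalty:.1f}% latency")
print(f"KT333 vs nForce2 (64-bit DDR333): +{kt333_penalty:.1f}% latency")
print(f"Dual vs single channel DDR400 on nForce2: +{dual_ddr400_penalty:.1f}% latency")
```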

As we proved in our original review of the nForce chipset, the bandwidth gained from going to dual channel DDR doesn't help unless you're sharing main memory bandwidth with an integrated GPU. In this case we're not and we'll be focusing on IGP performance in a later article, so we can disregard the two 128-bit nForce2 solutions for the rest of this comparison. We also have a balanced FSB/memory bus setup, meaning we have as much bandwidth going to our CPU as we do to main memory, so increasing memory bandwidth without similarly increasing FSB bandwidth would inherently yield poor returns as we're FSB limited at that point.
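The "balanced" FSB/memory argument can be illustrated with theoretical peak numbers. This is only a sketch under two assumptions that the article does not spell out: the Athlon XP front-side bus is taken to be 64 bits wide, and the 333 and 400 labels are treated as nominal mega-transfers per second. The function name is hypothetical.

```python
def peak_bandwidth_mb_s(bus_width_bits: int, megatransfers_per_s: float) -> float:
    """Theoretical peak bandwidth in decimal MB/s: bytes per transfer times transfer rate."""
    return bus_width_bits / 8 * megatransfers_per_s


# Assumption: 64-bit wide FSB running at a nominal 333 MT/s.
fsb = peak_bandwidth_mb_s(64, 333)

memory_configs = {
    "64-bit DDR333": peak_bandwidth_mb_s(64, 333),
    "128-bit DDR333": peak_bandwidth_mb_s(128, 333),
    "64-bit DDR400": peak_bandwidth_mb_s(64, 400),
    "128-bit DDR400": peak_bandwidth_mb_s(128, 400),
}

for name, memory in memory_configs.items():
    # The CPU can never pull more than the slower of the two links delivers.
    cpu_visible = min(fsb, memory)
    limiter = "FSB" if memory > fsb else "balanced"
    print(f"{name}: memory peak {memory:.0f} MB/s, CPU-visible peak {cpu_visible:.0f} MB/s ({limiter})")
```

Under these assumptions, only the 64-bit DDR333 configuration matches the FSB exactly; every faster memory setup is capped by the FSB from the CPU's point of view, which is the point the paragraph above makes.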

It's also clear that the bandwidth increase DDR400 provides doesn't offset the incredible increase in latency, leaving the nForce2's 64-bit DDR333 configuration as the ideal comparison to the VIA KT333. Let's look at some real world tests:



Content Creation Performance

Internet Content Creation Performance
Internet Content Creation SYSMark 2002
NVIDIA nForce2: 281
VIA KT333: 281

Content Creation Performance
Content Creation Winstone 2002
NVIDIA nForce2: 42.3
VIA KT333: 42.3

In Content Creation applications we see that the nForce2 and KT333 platforms are virtually identical in performance. Even current motherboards based on the KT400 chipset provide equal performance to these two solutions.



General Usage & Business Application Performance

For general/business usage performance we turn to Office Productivity SYSMark 2002 and Business Winstone 2001.

General Usage Performance
Office Productivity SYSMark 2002
NVIDIA nForce2: 185
VIA KT333: 185

General Usage Performance
Business Winstone 2001
NVIDIA nForce2: 72.3
VIA KT333: 71.7

Once again we see a very close race with less than a 1% performance difference between the two.



3D Rendering Performance - 3ds max 5

For our 3ds max 5 benchmarks we chose two benchmark scenes that ship with the product - SinglePipe2.max and Underwater_Environment_Finished.max

3D Rendering Performance - 3ds max 5
SinglePipe2.max - Render Time in Seconds (lower is better)
NVIDIA nForce2: 226
VIA KT333: 226

3D Rendering Performance - 3ds max 5
Underwater_Environment_Finished.max - Render Time in Seconds (lower is better)
NVIDIA nForce2: 301
VIA KT333: 301

We've noticed in the past that rendering in 3ds max isn't incredibly memory bandwidth dependent and relies more on the CPU's cache with most models and scenes being rendered; thus we see no performance difference between the two chipsets.

3D Rendering Performance - Maya 4.0.1
Rendertest.ma - Render Time in Seconds (lower is better)
NVIDIA nForce2: 70
VIA KT333: 70

The same trend holds under Maya as well.



Media Encoding Performance

We'll start off with a "quick" conversion of a DVD rip (more specifically, Chapter 40 from the Star Wars Episode I DVD) to a DiVX MPEG-4 file. We used the latest DiVX codec (5.02) in conjunction with Xmpeg 4.5 to perform the encoding at 720 x 480.

We set the encoding speed to Fastest, disabled audio processing and left all of the remaining settings on their defaults. We recorded the last frame rate given during the encoding process as the progress bar hit 100%.

MPEG-4 Encoding Performance - Xmpeg 4.5
DiVX 5.02 Conversion Frame Rate (higher is better)
NVIDIA nForce2: 66.3
VIA KT333: 64.3

The nForce2 gains a 3% advantage over the KT333, but that is clearly nothing tangible.

MP3 Encoding Performance - LAME 3.91
Time in Seconds to Encode 170MB .wav File
NVIDIA nForce2: 90
VIA KT333: 90

There's no difference in MP3 encoding either.



Gaming Performance

Gaming Performance - Unreal Tournament 2003
DM-Asbestos - Frames per Second (higher is better)
NVIDIA nForce2: 186.3
VIA KT333: 177.8

Gaming Performance - Jedi Knight 2 1.03
JK2FFA - Frames per Second (higher is better)
NVIDIA nForce2: 152.6
VIA KT333: 149.9

Gaming Performance - Comanche 4
Benchmark - Frames per Second (higher is better)
NVIDIA nForce2: 51.7
VIA KT333: 51

Some of the largest performance differences we've seen surface in the gaming tests, but even then the biggest differential is still only around 5%.
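To put every application result on the same footing, the sketch below expresses each one as an nForce2 advantage in percent. It is an illustration only: the tuple layout and the `nforce2_advantage` helper are made up for this example, and for time-based tests (lower is better) the ratio is inverted so that a positive number always favors the nForce2.

```python
# (benchmark, nForce2 result, KT333 result, higher_is_better), taken from the charts above.
results = [
    ("Internet Content Creation SYSMark 2002", 281, 281, True),
    ("Content Creation Winstone 2002", 42.3, 42.3, True),
    ("Office Productivity SYSMark 2002", 185, 185, True),
    ("Business Winstone 2001", 72.3, 71.7, True),
    ("3ds max 5 SinglePipe2.max (seconds)", 226, 226, False),
    ("3ds max 5 Underwater_Environment_Finished.max (seconds)", 301, 301, False),
    ("Maya 4.0.1 Rendertest.ma (seconds)", 70, 70, False),
    ("DiVX 5.02 encoding (fps)", 66.3, 64.3, True),
    ("LAME 3.91 MP3 encoding (seconds)", 90, 90, False),
    ("Unreal Tournament 2003 DM-Asbestos (fps)", 186.3, 177.8, True),
    ("Jedi Knight 2 JK2FFA (fps)", 152.6, 149.9, True),
    ("Comanche 4 (fps)", 51.7, 51, True),
]


def nforce2_advantage(nforce2, kt333, higher_is_better):
    """Percent advantage of the nForce2; positive means the nForce2 is faster."""
    if higher_is_better:
        return (nforce2 / kt333 - 1.0) * 100.0
    # For render/encode times, a smaller number is the better result.
    return (kt333 / nforce2 - 1.0) * 100.0


advantages = {name: nforce2_advantage(nf2, kt, hib) for name, nf2, kt, hib in results}
for name, pct in advantages.items():
    print(f"{name}: {pct:+.1f}%")

largest = max(advantages, key=advantages.get)
print(f"Largest gap: {largest} ({advantages[largest]:+.1f}%)")
```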



Ethernet Controller Performance

Now that it's clear that the nForce2 performs just as well as the fastest Socket-A chipset out there, let's look at some of its features in greater depth. One of the biggest selling points of the new nForce2 MCP-T is its "router on a chip" capability, made possible by having two independent Ethernet controllers on die. You'll also remember that a major advantage of the nForce2 architecture is that the integrated Ethernet controller(s) gets an isochronous path to the IGP, guaranteeing that your Ethernet controller always receives the bandwidth it needs.


Two Ethernet MACs lie beyond the packaging of the MCP-T

In order to test the performance of the nForce2's integrated network controllers we turned to NetIQ's Chariot benchmark. The way Chariot works is simple: you run the Chariot client on a handful of PCs, including the one you wish to test, and you install the controller software on another PC. The Chariot controller then instructs all of the client PCs to generate traffic to/from the PC you wish to test and collects data during the test. We chose to look at average bandwidth through the network controllers as well as their average CPU utilization during the tests. In order to provide some good reference points, we not only compared the two nForce2 Ethernet controllers but also added the following:

1) AMD PCNet based 10/100 card
2) Intel server class 10/100 PRO+ adapter
3) Netgear FA311 10/100, and
4) VIA's VT6103

With the exception of the VIA chip, we installed all of the cards on our ASUS nForce2 testbed in order to limit the number of variables introduced into the comparison. The VIA chip was on our KT400 test platform and thus we used a different motherboard for those tests, although the results should be comparable with the others. We also used the best driver we could find for each card: we tested both Windows XP's integrated drivers and those available from the manufacturers' websites and chose the higher performing of the two.

All of our tests simulated file transfers between 1 or 2 PCs and our nForce2 test bed. We ran a total of six different tests and we'll describe each one as we encounter it:

Ethernet Controller Performance - NetIQ Chariot
Test: Dual Client Bi-directional Transfer

| NIC | Average Bandwidth | CPU Utilization |
|---|---|---|
| AMD PCNet Family | 150 Mbps | 59% |
| Intel 10/100 PRO+ | 155 Mbps | 13% |
| Netgear FA311 | 163 Mbps | 34% |
| nForce2 (3Com MAC) | 91 Mbps | 12% |
| nForce2 (NVIDIA MAC) | 153 Mbps | 12% |
| VIA VT6103 | 125 Mbps | 27% |

This first test involves two client PCs sending and receiving data from our nForce2 testbed. Remember that although we're testing a 100Mbit connection, it is a full-duplex connection and thus the theoretical maximum is 200Mbps (100Mbps each way).

Note the relatively low bandwidth throughput of the 3Com MAC in the MCP-T, but also note that both nForce2 solutions have the lowest CPU utilization scores out of the bunch.
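One way to read the dual client bi-directional results is to relate each NIC's throughput to the 200Mbps full-duplex ceiling and to the CPU time it costs. The sketch below does that; the "Mbps per CPU percent" figure is an illustrative metric introduced here, not something Chariot reports, and the helper names are hypothetical.

```python
LINK_MBPS = 100  # 10/100 Ethernet link speed per direction

# (average bandwidth in Mbps, CPU utilization in percent) from the table above.
bidirectional = {
    "AMD PCNet Family": (150, 59),
    "Intel 10/100 PRO+": (155, 13),
    "Netgear FA311": (163, 34),
    "nForce2 (3Com MAC)": (91, 12),
    "nForce2 (NVIDIA MAC)": (153, 12),
    "VIA VT6103": (125, 27),
}


def link_utilization(avg_mbps, full_duplex=True):
    """Share of the theoretical maximum actually achieved, in percent."""
    ceiling = LINK_MBPS * (2 if full_duplex else 1)
    return avg_mbps / ceiling * 100.0


def mbps_per_cpu_percent(avg_mbps, cpu_percent):
    """Illustrative efficiency metric: throughput delivered per point of CPU load."""
    return avg_mbps / cpu_percent


efficiency = {
    nic: mbps_per_cpu_percent(mbps, cpu) for nic, (mbps, cpu) in bidirectional.items()
}
for nic, (mbps, cpu) in bidirectional.items():
    print(
        f"{nic}: {link_utilization(mbps):.1f}% of 200Mbps, "
        f"{efficiency[nic]:.1f} Mbps per CPU percent"
    )

most_efficient = max(efficiency, key=efficiency.get)
print(f"Most efficient: {most_efficient}")
```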

Ethernet Controller Performance - NetIQ Chariot
Test: Dual Client Inbound Transfer

| NIC | Average Bandwidth | CPU Utilization |
|---|---|---|
| AMD PCNet Family | 95 Mbps | 37% |
| Intel 10/100 PRO+ | 95 Mbps | 9% |
| Netgear FA311 | 95 Mbps | 23% |
| nForce2 (3Com MAC) | 76 Mbps | 20% |
| nForce2 (NVIDIA MAC) | 95 Mbps | 7% |
| VIA VT6103 | 95 Mbps | 20% |

Our next test involves two clients both sending data to our nForce2 testbed. Since the traffic is only occurring in one direction, the theoretical maximum transfer rate here is 100Mbps.

Once again we see that the 3Com MAC is performing well below the rest of the group and this time it ends up eating a good 20% of our Athlon XP 2800+. The NVIDIA MAC in the nForce2 MCP-T not only provides high bandwidth but it also does so at a very low CPU utilization.

Ethernet Controller Performance - NetIQ Chariot
Test: Dual Client Outbound Transfer

| NIC | Average Bandwidth | CPU Utilization |
|---|---|---|
| AMD PCNet Family | 94 Mbps | 61% |
| Intel 10/100 PRO+ | 94 Mbps | 9% |
| Netgear FA311 | 94 Mbps | 27% |
| nForce2 (3Com MAC) | 90 Mbps | 11% |
| nForce2 (NVIDIA MAC) | 93 Mbps | 9% |
| VIA VT6103 | 94 Mbps | 26% |

Our final dual client test has our nForce2 testbed sending data to two clients. Since the traffic is only occurring in one direction, the theoretical maximum transfer rate here is 100Mbps.

Interestingly enough, the numbers all look to be on par with one another here, including the 3Com MAC.

Ethernet Controller Performance - NetIQ Chariot
Test: Single Client Bi-directional Transfer

| NIC | Average Bandwidth | CPU Utilization |
|---|---|---|
| AMD PCNet Family | 165 Mbps | 71% |
| Intel 10/100 PRO+ | 157 Mbps | 13% |
| Netgear FA311 | 155 Mbps | 35% |
| nForce2 (3Com MAC) | 90 Mbps | 13% |
| nForce2 (NVIDIA MAC) | 155 Mbps | 12% |
| VIA VT6103 | 135 Mbps | 27% |

The single client tests take place between our nForce2 test bed and one other PC. This particular test has data flowing both to and from the testbed, meaning that the theoretical maximum transfer rate is 200Mbps.

There seems to be an issue with getting the 3Com MAC to work in full duplex mode as it will not break the 100Mbps barrier while the NVIDIA MAC had no problem reaching 155 Mbps. Once again, CPU utilization is lowest on the nForce2 controllers (they are even as low as the Intel NIC).

Ethernet Controller Performance - NetIQ Chariot
Test: Single Client Inbound Transfer

| NIC | Average Bandwidth | CPU Utilization |
|---|---|---|
| AMD PCNet Family | 92 Mbps | 37% |
| Intel 10/100 PRO+ | 92 Mbps | 8% |
| Netgear FA311 | 95 Mbps | 23% |
| nForce2 (3Com MAC) | 93 Mbps | 6% |
| nForce2 (NVIDIA MAC) | 93 Mbps | 6% |
| VIA VT6103 | 91 Mbps | 19% |

Here we have one client sending data to the nForce2 testbed. The two nForce2 controllers perform identically and once again yield the lowest CPU utilization out of the bunch.

Ethernet Controller Performance - NetIQ Chariot
Test: Single Client Outbound Transfer

| NIC | Average Bandwidth | CPU Utilization |
|---|---|---|
| AMD PCNet Family | 93 Mbps | 60% |
| Intel 10/100 PRO+ | 93 Mbps | 7% |
| Netgear FA311 | 94 Mbps | 26% |
| nForce2 (3Com MAC) | 67 Mbps | 13% |
| nForce2 (NVIDIA MAC) | 92 Mbps | 9% |
| VIA VT6103 | 92 Mbps | 27% |

For our final test we have our nForce2 testbed sending data to one client and once again we see the 3Com MAC deliver sub-par performance while the NVIDIA MAC works wonderfully.

In the end, although the integrated nForce2 Ethernet doesn't really provide any more bandwidth than the competition, it does provide some of the highest performance at the lowest CPU utilization possible. There is one caveat: there seems to be an issue with the integrated 3Com controller. We're not sure whether this is a driver problem or whether the integrated 3Com controller is just a cheap add-in that isn't really meant to perform all that well and is there primarily for brand recognition.
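The 3Com caveat is easiest to see by lining up the two integrated MACs across all six Chariot runs. The following is a small sketch using the figures from the tables above; the "shortfall" number is simply the percentage by which the 3Com MAC trails the NVIDIA MAC in each test, and the variable names are illustrative.

```python
# Average bandwidth in Mbps for the two integrated MACs: (3Com, NVIDIA).
nforce2_macs = {
    "Dual client bi-directional": (91, 153),
    "Dual client inbound": (76, 95),
    "Dual client outbound": (90, 93),
    "Single client bi-directional": (90, 155),
    "Single client inbound": (93, 93),
    "Single client outbound": (67, 92),
}

# Percent by which the 3Com MAC falls short of the NVIDIA MAC in each test.
shortfall = {
    test: (1.0 - threecom / nvidia) * 100.0
    for test, (threecom, nvidia) in nforce2_macs.items()
}

for test, pct in shortfall.items():
    print(f"{test}: 3Com trails by {pct:.1f}%")

worst = max(shortfall, key=shortfall.get)
print(f"Largest gap: {worst}")
```

The two bi-directional tests show the largest gaps, which fits the observation that the 3Com MAC never breaks the 100Mbps barrier.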

Audio & I/O Performance

Since the APU remains unchanged from the original nForce, the audio performance of the nForce2 chipset is identical to the original nForce and thus we won't go over it again here.

We ran disk tests on the nForce2 platform to ensure it was at least on par with the KT333 and throughout our tests we could not get the two platforms to ever differ in performance. It seems like NVIDIA has really made sure that the nForce2 won't fall behind on performance, thus finally allowing it to stand on its own and attract buyers based on its features.



Final Words

Despite our distaste for NVIDIA's launch of the nForce2 chipset (and it's not just NVIDIA; other manufacturers such as SiS and VIA do this too), we must say that the platform is pretty impressive. Not only does the chipset perform just as well as VIA's matured KT333 (and obviously the KT400 as well) but it manages to do so while offering a number of very compelling features.

We will focus on the integrated graphics performance in a forthcoming article but the IGP in combination with the integrated networking, firewire and powerful audio features make the nForce2 platform a very tempting base for a media center PC.

We were also very impressed with the low CPU utilization of the integrated NVIDIA Ethernet controller, especially since it rivaled the performance of one of Intel's server NICs. The integrated 3Com MAC was clearly a disappointment, as it was the only contender to actually perform poorly in our NetIQ Chariot tests.

In the end, the nForce2 platform is just as fast as anything VIA currently offers while providing features that no other chipset manufacturer can boast. Unfortunately, we're left with the same question we had when we reviewed our first nForce board - when and how much? Those questions will be answered in time (hopefully shorter than last time) and if NVIDIA truly has learned from their mistakes then the answers to those questions won't turn you off of nForce2.

This could very well be the Athlon XP platform to have as we move into 2003...
