Mohtalim

Posted: **Thu Jun 30, 2005 2:16 pm**

Question du jour: If you had to build a high performance computing machine (can be dual or quad processor) for your desktop, what would you make sure you got?

Posted: **Thu Jun 30, 2005 2:48 pm**

What are you going to use it for?

Posted: **Thu Jun 30, 2005 4:59 pm**

Is the money coming out of your own wallet? I would never pay for a dual socket system myself, but if my organization were going to, that'd be fine. Dual core, single socket is going to be much cheaper. I would definitely only consider dual core, because the price advantages over single core are considerable.

Be careful with your terminology. HPC is a loaded term that implies a certain system configuration, generally with an expensive interconnect, and a certain type of workload. You really want a workstation, particularly if you are looking at 2 Core, 1 Socket or 4 Core, 2 Socket (2C1S, 4C2S) topologies.

Since you want a workstation, you have exactly three choices: Dell, HP, or IBM. I guess you might also consider Sun. You ought to check out their pricing and service plans.

I would pick a system based on how many threads you can keep going at once. If you can keep four threads active, then you'll want a 4C2S machine. Also, all the above vendors publish CPU2000 stats at http://www.spec.org/. You can drill down to the subtests to find the workloads that match your expected load. Broadly, you'd look at gcc or gzip/bzip2/perlbmk for compilation or general usage performance, and the FP suite for floating point performance.

I wouldn't go with RAID unless you knew you had large (>1GB) files to move around. I wouldn't use SCSI at all. Use network backup and a single workstation-class SATA drive. That means something with more than a 3 year warranty. Pick 10,000 RPM for random seek workloads like compilation or go with large (>250 GB) capacity for streaming workloads.

64 bit is largely a wash unless you need more address space per process or more physical memory. I'd get it, I guess, because it is free.

Intel's dual core Xeon processor comes out before the end of the year. The launch date hasn't been announced yet.

If you answer Peijen's question I could be more specific.

I work for Intel, but I do not speak for Intel. My opinions are not necessarily the opinions of Intel Corporation.

Posted: **Thu Jun 30, 2005 5:20 pm**

Most RAIDs provide significant speed advantages. So the size of the file isn't relevant; what matters is the bandwidth and type of disk I/O you need. Most applications won't be bottlenecked by disk I/O, but if they are data intensive, or if their disk I/O is bursty and not well cached, you might need the RAID for that.

Posted: **Thu Jun 30, 2005 6:28 pm**

RAID is good for striping large datasets across multiple disks or for maintaining high availability. I am not aware of any mechanism for RAID disks to increase random seek performance. I'll ask our engineering computing guys, but I'm pretty sure we don't use RAID for any of our enterprise or workstation machines.

Posted: **Thu Jun 30, 2005 6:34 pm**

Yeah, all our machines are single disk.

Posted: **Thu Jun 30, 2005 6:52 pm**

1067 FSB Xeons are not scheduled to launch until Q1 2006.

Posted: **Thu Jun 30, 2005 9:46 pm**

I thought part of the advantage of most RAID flavors is that they distribute the transfers across several drives, allowing at least some concurrent transfers. Or are the seek times dominant for anything other than purely sequential accesses? If so, I bet a clever implementation of NCQ (Native Command Queuing) would really boost performance.

Posted: **Thu Jun 30, 2005 10:11 pm**

Yes, seek times are dominant for anything other than streaming stuff off disk.

NCQ can help if your accesses *could* be well-ordered, but *aren't*. If you have sequential accesses that aren't sequential for whatever reason (generally multiple threads of execution), you get a big perf boost from NCQ. Otherwise, it doesn't help much, if any.
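To put rough numbers on "seek times are dominant", here is a back-of-envelope sketch. The drive figures (8 ms per access, 60 MB/s sustained) are assumptions picked for illustration, not measurements of any particular drive, and the function name is made up for this example.

```python
def effective_throughput_mb_s(request_kb, seek_ms, transfer_mb_s):
    """Average MB/s when every request pays one seek plus its transfer time."""
    request_mb = request_kb / 1024.0
    transfer_s = request_mb / transfer_mb_s
    total_s = seek_ms / 1000.0 + transfer_s
    return request_mb / total_s

# Assumed, illustrative drive figures (not taken from the thread):
SEEK_MS = 8.0          # average seek plus rotational latency per request
TRANSFER_MB_S = 60.0   # sustained sequential transfer rate

for size_kb in (4, 64, 1024, 65536):
    rate = effective_throughput_mb_s(size_kb, SEEK_MS, TRANSFER_MB_S)
    print(f"{size_kb:>6} KB requests: {rate:6.2f} MB/s")
```

Under these assumed numbers, scattered 4 KB requests come out well under 1 MB/s, while 64 MB requests get close to the sequential rate. That is the gap NCQ can only close when the requests could have been ordered sequentially in the first place.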

Posted: **Thu Jun 30, 2005 10:11 pm**

George wrote: I thought part of the advantage of most RAID flavors is that they distribute the transfers across several drives, allowing at least some concurrent transfers. Or are the seek times dominant for anything other than purely sequential accesses? If so, I bet a clever implementation of NCQ (Native Command Queuing) would really boost performance.

Good RAID Guide:
http://www.storagereview.com/guide2000/ ... index.html

As for transfer rates, most RAID setups excel at it (RAID1 and JBOD (not really RAID) come to mind as exceptions), since they take advantage of multiple read/write heads.

However, additional heads have no positive impact on access times.

Posted: **Fri Jul 01, 2005 4:26 am**

I guess desktop might have been a misnomer. I'm looking for something powerful that isn't in a rack configuration.

What am I going to use it for? Good question. Let's say extensive data processing (probably statistics based) on fairly large datasets. I can't think of a good example, but I think I'm looking mostly for floating point performance. After just taking five seconds, I was thinking something like this might be comparable to what I'm looking for. Jonathan, what would be comparable to this in a dual core design? Can you point/link me in the right direction? What are the considerations about buying this now versus waiting a few months for the dual core Xeon?

As for storage, the datasets will probably average about a few gig, but there needs to be enough room to grow up to a tera.

If you hadn't guessed, the money is not coming out of my wallet.

Posted: **Fri Jul 01, 2005 2:56 pm**

VLSmooth wrote: However, additional heads have no positive impact on access times.

So with a non-striped, mirrored array, you can't have two concurrent seeks? Obviously this doesn't help for writing, but it seems like if you had two non-local read requests, you could overlap them.

Posted: **Fri Jul 01, 2005 4:39 pm**

Well, that is a server. You might be able to get a better price for a workstation or performance desktop without the availability features. Certainly the 800 FSB dual-core desktop processors (Pentium D) are going to be higher performance than the 667 FSB single-core Xeons for a bus-bound application like calculating gigs of FP. I'm not sure if you can find a system with ECC memory on the desktop side, though. The dual-core Xeons launching this year will also be 667 FSB, so that will not help. The dual-core Xeons on 1067 FSB will be substantially faster. I don't know that that will fit your time table, however.

If those few gigs per dataset are accreted slowly, you don't need RAID for perf reasons. Your availability requirements are your business. If you stream through those few gigs sequentially at high speed, striping (or some higher availability variation thereof) would help a lot. Try figuring out what precisely is "a few gigs." Perhaps you can keep the results memory resident and write out to disk in a noncritical time. That'd be your biggest speedup right there.
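A minimal sketch of the "keep the results memory resident and write out later" idea, assuming the results actually fit in RAM. The function name `run_then_write` and the squaring step are placeholders for illustration only.

```python
import io

def run_then_write(units, compute, out):
    """Compute everything first, then write the results in one pass."""
    # Critical phase: results stay in memory, no disk writes in the loop.
    results = [compute(u) for u in units]
    # Noncritical phase: a single sequential write at the end.
    out.write("".join(f"{r}\n" for r in results))
    return len(results)

sink = io.StringIO()   # stand-in for a real output file
count = run_then_write([1.0, 2.0, 3.0], lambda x: x * x, sink)
```

With a real file you would pass an open file object instead of `sink`. Whether this is practical depends on how big "a few gigs" of results turns out to be relative to installed memory.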

I'll have a look at the Dell system when I'm not using my Hiptop.

George: you are correct. In practice, the only workloads with lots of concurrent disk activity are heavily parallel ones like transaction processing, web serving, or databases.

Posted: **Fri Jul 01, 2005 4:56 pm**

Jason wrote: As for storage, the datasets will probably average about a few gig, but there needs to be enough room to grow up to a tera.

Does it do a lot of I/O? Or is it more of a data dump at the end of the run? If it's doing a lot of I/O while the calculation is going on, I would suggest you put more emphasis on the disk: at least SATA II, if not SCSI. Our multiprocessor project spends most of its time writing to disk rather than doing calculation; if we just throw away the results and run the calculation only, it's at least 20x faster. This was on a supercomputer (I forget where).

Posted: **Fri Jul 01, 2005 5:30 pm**

Yeah, scientific supercomputing algorithms operating on large amounts of data have to be designed carefully to prevent disk I/O from bottlenecking them.

Whether you need to worry about the disk will depend on how long it takes to handle a unit of data. If your processing is very complex and time consuming, you can just have a background task DMA the next unit of data while you're processing. On the other hand, if you can complete processing of a set of data in less time than the DMA of the next data set would take, you need a faster disk.
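The "fetch the next unit while processing the current one" pattern can be sketched in a few lines. This is only an illustration using a background thread rather than anything DMA-specific; `load` and `process` are hypothetical callables you would supply, and `process_with_prefetch` is a name made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor

def process_with_prefetch(load, process, unit_ids):
    """Process units in order while a background thread loads the next one."""
    results = []
    unit_ids = list(unit_ids)
    if not unit_ids:
        return results
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(load, unit_ids[0])
        for next_id in unit_ids[1:] + [None]:
            data = pending.result()                    # wait for the current unit
            if next_id is not None:
                pending = pool.submit(load, next_id)   # start the next read now
            results.append(process(data))              # compute while that read runs
    return results
```

With this overlap, the time per unit is roughly the larger of the load time and the processing time. If processing is the larger one, the disk is hidden; if loading is the larger one, you are disk-bound and a faster disk (or striping) is what helps.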

Posted: **Fri Jul 01, 2005 7:42 pm**

George wrote: So with a non-striped, mirrored array, you can't have two concurrent seeks? Obviously this doesn't help for writing, but it seems like if you had two non-local read requests, you could overlap them.

You're correct about overlapping.

I wasn't referring to multiple simultaneous reads, just that a RAID array's lowest access time for one read is that of the fastest drive and no faster.
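A toy model of the distinction, under simplifying assumptions: every read is independent, each mirror serves one read at a time, and every access costs the same. The function names and the 8 ms figure are illustrative, not from any real array.

```python
import math

def single_read_latency_ms(drive_access_ms):
    """One read is served by one drive, so the best case is the fastest drive."""
    return min(drive_access_ms)

def batch_time_ms(num_reads, num_mirrors, access_ms):
    """Time to finish independent reads when each mirror serves one at a time."""
    rounds = math.ceil(num_reads / num_mirrors)
    return rounds * access_ms

# Assumed 8 ms per access; ten independent reads.
one_drive = batch_time_ms(10, 1, 8.0)
two_mirrors = batch_time_ms(10, 2, 8.0)
```

In this model a second mirror halves the time for the batch of ten reads, but the latency of any single read stays at one drive's access time.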


High-end system
