Well, I now owe my friend 10 dollars; I made a bet that Big Mac would come in at number two... oh well.

Maybe someone should get VT to join the MacRumors Folding team...

aethier
 
Re: Re: Re: Data

Originally posted by theRebel
Yeah, I had noticed that too.

However the stats posted here are the same as what top500.org has posted in their chart. It is possible that they may just have a typo though.

It is also possible that the November stats for the NCSA cluster were achieved with only 1250 systems, but that the NCSA is planning to add another 200. Their press release does not make it clear whether 1450 is how many they have now or how many they plan to have.

How can we find out which is right? The Top500.org chart or the NCSA press release?

That very well could be true... but if that were the case, I'd make sure to let everyone know the results were based on an incomplete cluster! Regardless, if it was only tested with 2500 procs, that still doesn't change anything!! The 970 is clearly the faster processor in these tests! ;)
 
Re: Re: Re: Re: Re: The Next Supercomputer?

Originally posted by greenstork
I would argue that distributed computing (e.g. processing one unit on one machine) is not what the cluster computer is trying to achieve, hence the Infiniband connection between the nodes. So while your analogy to distributed computing holds true on paper, it's not in the nature of the tasks that a cluster supercomputer is designed for, IMO. I could be wrong, but that is my impression.

You can rest assured that nearly all super computer tasks are designed to distribute well over a large cluster of nodes. Adding in more nodes should get you basically a linear speed-up until you start clogging up the network.

People are not building 2000 node clusters to shave a few seconds off of the time it took the 1000 node cluster. They are doing it because nearly all linear algebra, discretized differential equations, and particle simulations can be nicely partitioned and effectively distributed over a gazillion [sp?] odd nodes.

The main tricky thing is to get the system to properly use the network, partitioning nodes and communication up so that folks aren't tromping all over each other. Properly laid out, the nodes spend very little time waiting for one another, and most of the time computing and sending/receiving.
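To make that partitioning idea concrete, here is a minimal Python sketch of the kind of block decomposition being described (illustrative only, not VT's actual scheduling code; the grid size and node count are just example numbers):

[code]
# Minimal sketch (not VT's actual scheduler): split a 1-D grid of N cells
# across P nodes so each node computes a contiguous block and only has to
# exchange a one-cell "halo" with its left/right neighbours each step.

def partition(n_cells, n_nodes):
    """Return (start, end) index pairs, balanced to within one cell."""
    base, extra = divmod(n_cells, n_nodes)
    blocks, start = [], 0
    for rank in range(n_nodes):
        size = base + (1 if rank < extra else 0)
        blocks.append((start, start + size))
        start += size
    return blocks

if __name__ == "__main__":
    blocks = partition(n_cells=1_000_000, n_nodes=1100)  # example sizes only
    for rank, (lo, hi) in enumerate(blocks[:3]):
        left = rank - 1 if rank > 0 else None
        right = rank + 1 if rank < len(blocks) - 1 else None
        print(f"node {rank}: cells [{lo}, {hi}), halo partners: {left}, {right}")
[/code]

Each node only ever talks to its two neighbours, which is why a well-partitioned code spends almost all of its time computing rather than waiting on the network.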
 
Originally posted by greenstork
You know what I'd love to see is the cost per teraflop. I bet Apple is by far the cheapest of the whole top500. Now that would be an interesting stat.

I doubt it. Some of the lower ones on the list are also "home built" but from much cheaper components.
 
Re: Re: Re: Re: Re: The Next Supercomputer?

Originally posted by greenstork
I would argue that distributed computing (e.g. processing one unit on one machine) is not what the cluster computer is trying to achieve, hence the Infiniband connection between the nodes. So while your analogy to distributed computing holds true on paper, it's not in the nature of the tasks that a cluster supercomputer is designed for, IMO. I could be wrong, but that is my impression.

It is a gradient. The algorithm space that will scale on a cluster is generally a function of interconnect latency rather than bandwidth. Most supercomputing clusters are only good for "embarrassingly parallel" algorithm spaces. By reducing the latency they can get the clusters to scale to "somewhat less embarrassingly parallel" algorithm spaces. Infiniband still does not have the latency specs required for a great many codes to run effectively in parallel.

One of the rather huge grains of salt that you have to take with any supercomputing cluster number is that the de facto standard benchmarks for these things are in fact embarrassingly parallel codes. For most real-world applications, you won't get anything remotely representing the efficiency implied by those benchmarks. For algorithm spaces that have many fine-grained dependencies (i.e. latency bound), you will find that huge supercomputing clusters are SLOWER than a single (or few) much smaller SMP or ccNUMA boxen.

For example, I happen to run huge information theoretic codes with fine-grained dependencies. For me, clusters like the VT cluster would actually run slower than a smaller lower latency box even taking Infiniband interconnects into account. The latency between any two processors in my systems needs to be in the ballpark of 1-us or less, or efficiency plummets and the processors aren't actually doing anything most of the time. Performance of my codes on the VT cluster would suck big time, and I could run rings around it with a much smaller box. Which is why you can't use a cluster to solve every problem and why they still make monolithic ultra-low-latency supercomputers.

So to give it proper perspective, clusters like the VT system are useless and not cost-effective for 90+% of the algorithms you could run. Its average interconnect latency is still pretty far on the "embarrassingly parallel" side of things, though much better than something like GigE, and the better interconnect will allow them to add more nodes for a given code before the fabric saturates. For the <10% of algorithms that run very well on them, it is well worth the money spent. Supercomputers in the more traditional sense will run most codes very efficiently, but are also more expensive than the more narrowly useful clusters.
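For anyone who wants to see why that ~1 µs figure above matters, here is a toy Python model of parallel efficiency for a fine-grained code (all numbers are rough, assumed figures, not measurements of the VT cluster):

[code]
# Toy model (illustrative numbers only, not VT measurements): a fine-grained
# code does a small chunk of work per step, then a latency-dominated exchange
# with its neighbours.  Efficiency ~ compute / (compute + communication).

def efficiency(compute_us, latency_us, msgs_per_step=2):
    comm_us = msgs_per_step * latency_us
    return compute_us / (compute_us + comm_us)

COMPUTE_PER_STEP_US = 5.0          # assumed work between dependencies
INTERCONNECTS = {                  # rough, order-of-magnitude latencies (us)
    "shared-memory / ccNUMA": 1.0,
    "InfiniBand": 7.0,
    "Gigabit Ethernet": 60.0,
}

for name, lat in INTERCONNECTS.items():
    print(f"{name:24s} {lat:5.1f} us latency -> "
          f"{efficiency(COMPUTE_PER_STEP_US, lat):.0%} efficiency")
[/code]

With the same amount of work per step, the low-latency box keeps its processors busy while the GigE cluster spends most of its time waiting, which is exactly the point about latency-bound codes.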
 
Someone posted a reminder that the VT cluster is composed of complete, consumer-friendly CPUs with ALL associated accessories installed, and that this fact will result in great resale value in 2-3 years when they "sidegrade" to 2.8 GHz G5 boxes :)

I would further point out that, just like calculating the cost of a lease, the life-cycle cost of the hardware (including resale value) and support must be considered. I suspect that if all of these factors are properly accounted for at the "point of sale", we will see the VT cluster being by far the lowest total-cost cluster ever assembled on earth, and it may maintain that record for quite some time.
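For what it's worth, that lease-style accounting is easy to sketch; every number below is a made-up placeholder, not an actual VT figure:

[code]
# Rough life-cycle cost sketch.  All inputs are hypothetical placeholders,
# just to show the "purchase + support - resale" accounting being described.

def life_cycle_cost(purchase, annual_support, years, resale_fraction):
    """Net cost of ownership, crediting resale value at end of life."""
    return purchase + annual_support * years - purchase * resale_fraction

# Hypothetical inputs for illustration only.
print(life_cycle_cost(purchase=5_000_000, annual_support=250_000,
                      years=3, resale_fraction=0.35))
[/code]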

As for black program computers, not only do they "not exist" :), but the security measures dramatically increase the costs of operation.

And finally, my breaking comment about a 3U, 7-blade Apple cluster node: it will improve latency and density, and it will shock the "home built" compute world by being available on the Apple Store to anyone, anywhere. It will be a race. He who chooses next-day air and deep-dish pizza wins :)

Rocketman
 
Advertising

I wonder why Apple isn't advertising on the top500.org site. You'd think that they could have a banner ad that points out that their computer is third and cost much less than its competition....
 
Re: Re: Re: Re: Re: Re: The Next Supercomputer?

Originally posted by Wombatronic
You can rest assured that nearly all super computer tasks are designed to distribute well over a large cluster of nodes. Adding in more nodes should get you basically a linear speed-up until you start clogging up the network.


Very few supercomputer-type codes scale beyond 64 processors, even with something like an Infiniband interconnect fabric. Not surprisingly, people build these clusters to run codes that are already known to be in the narrow set of codes that WILL scale well to a large number of processors. Only a very tiny number of codes can scale linearly, the vast majority scale sub-linearly, and an infinitesimal number scale super-linearly (why super-linear is possible is left up as a technical exercise for the reader).
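As a rough illustration of why most codes stop scaling, here is a simple Amdahl-plus-communication model in Python (the serial and communication fractions are assumed, illustrative values, not benchmark data):

[code]
# Back-of-envelope scaling sketch (assumed parameters, not benchmark data):
#   T(P) = serial + parallel/P + comm * log2(P)
# Most codes look sub-linear; past some point adding processors actually hurts.

import math

def speedup(p, serial=0.02, comm=0.005):
    t1 = 1.0                                       # normalized single-proc time
    tp = serial + (1.0 - serial) / p + comm * math.log2(p)
    return t1 / tp

for p in (1, 16, 64, 256, 1024, 2200):
    print(f"{p:5d} procs -> speedup {speedup(p):6.1f}  (linear would be {p})")
[/code]

With these assumed numbers the curve tops out in the hundreds of processors and then turns back down; only a code with essentially no serial or communication component keeps climbing linearly.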

There are a huge range of supercomputing applications that will run slower on the VT cluster than on a more monolithic system with a fraction of the processors, as latency is the real killer in most cases. For example a cluster of a dozen custom Opteron-based boxes like this (www.octigabay.com) will outperform the VT cluster for many tasks. Not because a couple hundred Opterons is faster than a couple thousand PPC970s, but because the communication fabric is much faster and has much lower average latency in this case.

The only reason the Top500 list has so many clusters in it is because the benchmark they use is in the narrow set of codes that scale well on clusters. If they used a different code (e.g. the supercomputing codes I work on), the cluster benchmarks would drop like a rock and more monolithic systems would rule the roost. Clusters are excellent at what they can do, but there are many things in the supercomputing world that they scale very poorly on.
 
Re: Re: Re: Re: Re: Re: The Next Supercomputer?

Originally posted by Wombatronic
You can rest assured that nearly all super computer tasks are designed to distribute well over a large cluster of nodes. Adding in more nodes should get you basically a linear speed-up until you start clogging up the network.

People are not building 2000 node clusters to shave a few seconds off of the time it took the 1000 node cluster. They are doing it because nearly all linear algebra, discretized differential equations, and particle simulations can be nicely partitioned and effectively distributed over a gazillion [sp?] odd nodes.

The main tricky thing is to get the system to properly use the network, partitioning nodes and communication up so that folks aren't tromping all over each other. Properly laid out, the nodes spend very little time waiting for one another, and most of the time computing and sending/receiving.

I agree with you that getting these systems up to speed means proper use of the network; the VT cluster is highly dependent on the interconnects, the topology of the network, the software that does the clustering, the task at hand, etc.

However, I disagree that all supercomputing tasks are designed to distribute well, at least not in the sense of traditional distributed computing models like SETI@home and Folding, which was the comment I originally responded to.

Different types of problems (read: code) have different computing needs. For example, if you have a problem that requires a great deal of memory, you may have to pool memory across different nodes. Then the bandwidth and speed of your network play a much greater role in the speed of your supercomputer. In a distributed model you wouldn't even have the ability to pool memory efficiently, and thus this type of computing is best accomplished on a supercomputer like Big Mac, Earth Simulator, etc. (edit: or on the low-latency, traditional supercomputers suggested above).
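A rough sketch of why the network starts to dominate once memory has to be pooled across nodes (the latency and bandwidth numbers are ballpark illustrations only, not measurements):

[code]
# Illustrative sketch (ballpark numbers, not measurements): pulling a block of
# data from pooled memory on another node costs latency + size / bandwidth,
# so the network quickly dominates once memory has to be pooled across nodes.

def fetch_time_us(block_bytes, latency_us, bandwidth_bytes_per_s):
    return latency_us + block_bytes / bandwidth_bytes_per_s * 1e6

BLOCK = 64 * 1024                     # 64 KiB block, arbitrary example size
LINKS = {                             # (latency us, bandwidth bytes/s), rough
    "local DRAM": (0.1, 6e9),
    "InfiniBand": (7.0, 1e9),
    "Gigabit Ethernet": (60.0, 0.12e9),
}

for name, (lat, bw) in LINKS.items():
    print(f"{name:18s}: ~{fetch_time_us(BLOCK, lat, bw):7.1f} us per 64 KiB block")
[/code]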

There are computing problems where numerous nodes are fairly efficient, but these types of tasks require that you allocate data to a node infrequently. Such tasks will scale with the number of processors, but for those that require more frequent data allocation between the nodes, the scaling isn't linear.

To state the obvious, the tools that you use depend on the problem, and in reality neither one of us knows what VT is planning to use their computer for anyway. ;)
 
Re: Re: Re: Re: Re: Re: The Next Supercomputer?

Originally posted by Wombatronic
People are not building 2000 node clusters to shave a few seconds off of the time it took the 1000 node cluster. They are doing it because nearly all linear algebra, discretized differential equations, and particle simulations can be nicely partitioned and effectively distributed over a gazillion [sp?] odd nodes.

I disagree. The 1100-node supercomputer is more about allowing many different people to run jobs at the same time. The Top500 list is great for measuring the overall performance of a cluster, but most likely no one will be running a 2200-processor job. Most jobs will probably use fewer than 32 processors.

Real-world problems are generally not easily parallelized efficiently. It takes a tremendous amount of work and is often unnecessary. To cite one of your examples, numerically solving differential equations (using finite difference methods) means communication between processes is required. Such communication increases with the surface area of the individual domains. Thus if you have a huge number of processes, you will have a lot of little domains that have, comparatively, large surface areas, and your efficiency drops rapidly. There is no way to get around this if your domain stays the same size. Obviously you can increase your domain, but often a specific task doesn't need a larger domain, and you may not feel like waiting the extra time for a simulation over a larger domain to finish.
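A quick back-of-the-envelope illustration of that surface-area point (idealized cube-shaped subdomains and a made-up grid size, not anything from the weather model mentioned below):

[code]
# Quick sketch of the surface-area argument (idealized cube-shaped subdomains,
# made-up grid size): as a fixed 3-D grid is cut into more pieces, each piece's
# halo (surface) grows relative to its interior (volume), so the communication
# share of the work rises.

GRID = 512                                 # global grid is GRID^3 points (example)

def halo_fraction(cuts_per_axis):
    side = GRID // cuts_per_axis           # points along one edge of a subdomain
    interior = side ** 3
    halo = 6 * side ** 2                   # one-point-thick halo on all 6 faces
    return halo / interior

for cuts in (1, 2, 4, 8, 16):
    procs = cuts ** 3
    print(f"{procs:5d} processes: halo/interior = {halo_fraction(cuts):.3f}")
[/code]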

I work on a large, parallelized numerical weather model. Specifically I am developing a subroutine which more accurately calculates the effects of turbulence. Parallelizing is tough, dirty work, and often requires a major reorganization of the program. You have to have a good understanding of the theory of the problem, the details of numerical techniques, and the methods of parallelization. And you have to have a lot of time.

crackpip
 
Great to see Big Mac at number 3

Now will one of you nerds please explain what Big Mac has been put together to do, exactly? Run the uni slush fund, or is it to do real work like solving cancer? lol

It is really, really, really great to see that Apple machines came in on the list and blew away the rest of those machines in one go...


But still, there's a lot to be done.

Well done, VT.
 
Classic:

I wonder why Apple isn't advertising on the top500.org site. You'd think that they could have a banner ad that points out that their computer is third and cost much less than its competition....
Do you honestly think anyone in the business doesn't know about the G5 cluster, its rank, its cost, and exactly which algorithms it is effective on?
 
Va Tech will be ordering another cluster in '06

I remember Va Tech saying they will be ordering another cluster in 2006. I'm hoping it's a G5, but if it is, I wonder where the G5 will be by then? 4.5-5 GHz??

Also, I'm sure that once they've Pantherized the cluster it will show an improvement.
 
Voting negative? Here is a big negative for you people ..l., Congratulations, Apple and VT. :)
 
Originally posted by simply258
you have got to see this

click on "G5 Ordering" and click on the 1st pic ..

http://don.cc.vt.edu/

she's ordering :D

For those that don't get it, look at what she's running
Damn! The secret's out! :( The cluster is really a giant jukebox, sharing a central repository of music with the entire campus :)
Maybe they're setting up a new iTunes music store? Order your Eminem while cracking a few genetic sequences on the side. Would you like fries with that?
Bet the visualizations would be smooth! Cover the walls with giant plasma screens :D Psychedelic head-trip. Who needs drugs?
 
Originally posted by pkradd
As stated above, Big Mac is not yet operating at full potential. It may be able to add 3 or 4 more teraflops. There will be another list published in 6 months, so there should be some new computers named, and Big Mac may or may not stay where it is. As far as being the 3rd fastest computer in the world, I'd take that with a grain of salt. Many governments have computer systems that are not advertised (known) that may be as fast or faster than any of those posted on the list.

Right, if they are not known, then how do you know about them? I'd take what you said with a shaker of salt. Back your claim up with something, please, if you are serious about that assertion.
 
Let's not go into government computers...
considering the fact that there's no mention of any DoD computers, or other ciphering machines used by the US or UK...
Like I said, let's not get into that subject.
 
I hear that Apple actually has a computer running in a top secret underground lab that they stole from the Borg and use to reverse engineer their PPC chip designs. ;)
 
Someone at IBM (maybe with a little help from Apple), after sitting on the PowerPC architecture for so many years and doing nothing much with it, has finally realized its potential over other CPU architectures (read: x86) for high-end computing. Now all of a sudden there's a lot of PPC buzz going around. A few days ago we heard about Microsoft and Nintendo going with PPC for their next-gen gaming consoles. A supercomputer the size of one TV set, and confirmation that the VT PPC 970 cluster is 3rd in the world with such an incredible price/performance ratio delivered by Apple. Can you believe it! :D Apple and price/performance. Today I read an article about how SOI (Silicon on Insulator) is helping IBM sample 90nm CPUs without the problems that are plaguing Intel's shift to 90nm.

Right now, the future looks really promising.


- Windows NT: Windows Nice Try.
- Windows XP: Windows eXpress Problem
 
CPU Upgrades?

I didn't see the question posed, but I may have missed it. What is the possibility of these systems being upgraded to faster CPUs next year? Obviously upgrading a CPU would be a whole heck of a lot cheaper than buying brand new boxes in a few years. Yes, it's a good bet the architecture for the Power Mac will have changed by 2006, but by next spring? Summer? Fall? Wouldn't it be cool if they could order a few thousand G5 chips running somewhere from 2.x-3 GHz and jump into second place? God knows the heat sinks should easily handle hotter chips in the future.
 