
27 May 2018

Supercomputers

Roll back time a half-century or so and the smallest computer in the world was a gargantuan machine that filled a room. When transistors and integrated circuits were developed, computers could pack the same power into microchips as big as your fingernail. So what if you build a room-sized computer today and fill it full of those same chips? What you get is a supercomputer—a computer that's millions of times faster than a desktop PC and capable of crunching the world's most complex scientific problems.

 What makes supercomputers different from the machine you're using right now? Let's take a closer look!

What is a supercomputer?

Before we make a start on that question, it helps if we understand what a computer is: it's a general-purpose machine that takes in information (data) by a process called input, stores and processes it, and then generates some kind of output (result). A supercomputer is not simply a fast or very large computer: it works in an entirely different way, typically using parallel processing instead of the serial processing that an ordinary computer uses. Instead of doing one thing at a time, it does many things at once.
Chart: Who has the most supercomputers? Almost 90 percent of the world's 500 most powerful machines can be found in just six countries: China, the USA, Japan, Germany, France, and the UK. Drawn in January 2018 using the latest data from TOP500, November 2017.

Serial and parallel processing

What's the difference between serial and parallel? An ordinary computer does one thing at a time, so it does things in a distinct series of operations; that's called serial processing. It's a bit like a person sitting at a grocery store checkout, picking up items from the conveyor belt, running them through the scanner, and then passing them on for you to pack in your bags. It doesn't matter how fast you load things onto the belt or how fast you pack them: the speed at which you check out your shopping is entirely determined by how fast the operator can scan and process the items, which is always one at a time. (Since computers first appeared, most have worked by simple, serial processing, inspired by a basic theoretical design called a Turing machine, originally conceived by Alan Turing.)

A typical modern supercomputer works much more quickly by splitting problems into pieces and working on many pieces at once, which is called parallel processing. It's like arriving at the checkout with a giant cart full of items, but then splitting your items up between several different friends. Each friend can go through a separate checkout with a few of the items and pay separately. Once you've all paid, you can get together again, load up the cart, and leave. The more items there are and the more friends you have, the faster you can get the job done by parallel processing—at least, in theory. Parallel processing is more like what happens in our brains.
Artwork: Serial and parallel processing: Top: In serial processing, a problem is tackled one step at a time by a single processor. It doesn't matter how fast different parts of the computer are (such as the input/output or memory), the job still gets done at the speed of the central processor in the middle. Bottom: In parallel processing, problems are broken up into components, each of which is handled by a separate processor. Since the processors are working in parallel, the problem is usually tackled more quickly even if the processors work at the same speed as the one in a serial system.
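
To make the distinction concrete, here's a minimal sketch of the two approaches in Python (my own illustration, not something from the article): the same calculation is done once by a single process, one step at a time, and once by splitting the work into chunks handed to several worker processes.

```python
# A minimal sketch of serial vs. parallel processing in Python, using the
# standard multiprocessing module. Summing the squares of a large range of
# numbers stands in for the "items on the conveyor belt"; the worker
# processes stand in for the extra checkouts.

from multiprocessing import Pool

def sum_of_squares(chunk):
    """Work through one chunk of the problem (one 'checkout')."""
    start, stop = chunk
    return sum(n * n for n in range(start, stop))

def serial(limit):
    # One processor does everything, one step at a time.
    return sum_of_squares((0, limit))

def parallel(limit, workers=4):
    # Split the problem into independent chunks, one per worker...
    step = limit // workers
    chunks = [(i * step, (i + 1) * step) for i in range(workers)]
    chunks[-1] = (chunks[-1][0], limit)  # last chunk absorbs any remainder
    # ...solve the chunks at the same time, then recombine the results.
    with Pool(workers) as pool:
        return sum(pool.map(sum_of_squares, chunks))

if __name__ == "__main__":
    limit = 10_000_000
    print(serial(limit) == parallel(limit))  # same answer, different route
```

On a multi-core machine the parallel version usually finishes sooner, but only when the problem is big enough to outweigh the cost of splitting the work and gathering the results, a point we'll come back to below.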

Why do supercomputers use parallel processing?

Most of us do quite trivial, everyday things with our computers that don't tax them in any way: looking at web pages, sending emails, and writing documents use very little of the processing power in a typical PC. But if you try to do something more complex, like changing the colors on a very large digital photograph, you'll know that your computer does, occasionally, have to work hard to do things: it can take a minute or so to do really complex operations on very large digital photos. If you play computer games, you'll be aware that you need a computer with a fast processor chip and quite a lot of "working memory" (RAM), or things really slow down. Add a faster processor or double the memory and your computer will speed up dramatically—but there's still a limit to how fast it will go: one processor can generally only do one thing at a time.

Now suppose you're a scientist charged with forecasting the weather, testing a new cancer drug, or modeling how the climate might be in 2050. Problems like that push even the world's best computers to the limit. Just like you can upgrade a desktop PC with a better processor and more memory, so you can do the same with a world-class computer. But there's still a limit to how fast a processor will work and there's only so much difference more memory will make. The best way to make a difference is to use parallel processing: add more processors, split your problem into chunks, and get each processor working on a separate chunk of your problem in parallel.

Massively parallel computers

Once computer scientists had figured out the basic idea of parallel processing, it made sense to add more and more processors: why have a computer with two or three processors when you can have one with hundreds or even thousands? Since the 1990s, supercomputers have routinely used many thousands of processors in what's known as massively parallel processing; at the time I'm updating this, in January 2018, the world's fastest supercomputer, the Sunway TaihuLight, has around 40,960 processing modules, each with 260 processor cores, which means 10,649,600 processor cores in total!

Unfortunately, parallel processing comes with a built-in drawback. Let's go back to the supermarket analogy. If you and your friends decide to split up your shopping to go through multiple checkouts at once, the time you save by doing this is obviously reduced by the time it takes you to go your separate ways, figure out who's going to buy what, and come together again at the end. We can guess, intuitively, that the more processors there are in a supercomputer, the harder it will probably be to break up problems and reassemble them to make maximum efficient use of parallel processing. Moreover, there will need to be some sort of centralized management system or coordinator to split the problems, allocate and control the workload between all the different processors, and reassemble the results, which will also carry an overhead.

With a simple problem like paying for a cart of shopping, that's not really an issue. But imagine if your cart contains a billion items and you have 65,000 friends helping you with the checkout. If you have a problem (like forecasting the world's weather for next week) that seems to split neatly into separate sub-problems (making forecasts for each separate country), that's one thing. Computer scientists refer to complex problems like this, which can be split up easily into independent pieces, as embarrassingly parallel computations (EPC)—because they are trivially easy to divide.

But most problems don't cleave neatly that way. The weather in one country depends to a great extent on the weather in other places, so making a forecast for one country will need to take account of forecasts elsewhere. Often, the parallel processors in a supercomputer will need to communicate with one another as they solve their own bits of the problem. Or one processor might have to wait for results from another before it can do a particular job. A typical problem worked on by a massively parallel computer will thus fall somewhere between the two extremes of a completely serial problem (where every single step has to be done in an exact sequence) and an embarrassingly parallel one; while some parts can be solved in parallel, other parts will need to be solved in a serial way. A law of computing known as Amdahl's law (named for computer pioneer Gene Amdahl) explains how the part of the problem that remains serial effectively determines the maximum improvement in speed you can get from using a parallel system.
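
Amdahl's law boils down to one formula: if a fraction P of a job can run in parallel and the rest is stubbornly serial, the best possible speedup on N processors is 1 / ((1 - P) + P/N). The little Python sketch below (with purely illustrative numbers) shows how the serial part sets a hard ceiling no matter how many processors you throw at the problem.

```python
# Amdahl's law: with a parallel fraction P and N processors, the maximum
# speedup is 1 / ((1 - P) + P / N). The numbers here are illustrative only.

def amdahl_speedup(parallel_fraction, processors):
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)

# A job that is 95% parallelizable can never run more than 20 times faster
# than it does on a single core (1 / 0.05 = 20), even with millions of cores,
# because the 5% serial part always has to be done one step at a time.
for n in (2, 100, 10_000, 10_000_000):
    print(f"{n:>10} processors -> speedup {amdahl_speedup(0.95, n):.2f}x")
```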

Clusters

You can make a supercomputer by filling a giant box with processors and getting them to cooperate on tackling a complex problem through massively parallel processing. Alternatively, you could just buy a load of off-the-shelf PCs, put them in the same room, and interconnect them using a very fast local area network (LAN) so they work in a broadly similar way. That kind of supercomputer is called a cluster. Google does its web searches for users with clusters of off-the-shelf computers dotted in data centers around the world.
Photo: Supercomputer cluster: NASA's Pleiades ICE Supercomputer is a cluster of 112,896 cores made from 185 racks of Silicon Graphics (SGI) workstations. Picture by Dominic Hart courtesy of NASA Ames Research Center.
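
Programs that span the nodes of a cluster are commonly written with a message-passing library such as MPI. As a hedged illustration (the article doesn't name any particular tool), here's a minimal sketch using the mpi4py binding: every process works out its own slice of an approximation of pi, and the partial results are combined at the end, whether the processes share one machine or sit on different nodes of a cluster.

```python
# A minimal MPI sketch using the third-party mpi4py binding (an assumption;
# the article doesn't prescribe any particular library or program).
# Run with something like:  mpirun -n 8 python approximate_pi.py

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()   # which process am I?
size = comm.Get_size()   # how many processes are cooperating?

# Each process integrates its own share of 4 / (1 + x^2) over [0, 1],
# which approximates pi; the slices are completely independent.
steps = 10_000_000
width = 1.0 / steps
local_sum = 0.0
for i in range(rank, steps, size):
    x = (i + 0.5) * width
    local_sum += 4.0 / (1.0 + x * x)

# Gather the partial sums onto the root process (rank 0) and print the result.
pi = comm.reduce(local_sum * width, op=MPI.SUM, root=0)
if rank == 0:
    print("pi is approximately", pi)
```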

Grids

A grid is a supercomputer similar to a cluster (in that it's made up of separate computers), but the computers are in different places and connected through the Internet (or other computer networks). This is an example of distributed computing, which means that the power of a computer is spread across multiple locations instead of being located in one, single place (that's sometimes called centralized computing).

Grid supercomputing comes in two main flavors. In one kind, we might have, say, a dozen powerful mainframe computers in universities linked together by a network to form a supercomputer grid. Not all the computers will be actively working in the grid all the time, but generally we know which computers make up the network. The CERN Worldwide LHC Computing Grid, assembled to process data from the LHC (Large Hadron Collider) particle accelerator, is an example of this kind of system. It consists of two tiers of computer systems, with 11 major (tier-1) computer centers linked directly to the CERN laboratory by private networks, which are themselves linked to 160 smaller (tier-2) computer centers around the world (mostly in universities and other research centers), using a combination of the Internet and private networks.

The other kind of grid is much more ad-hoc and informal and involves far more individual computers—typically ordinary home computers. Have you ever taken part in an online computing project such as SETI@home, GIMPS, FightAIDS@home, Folding@home, MilkyWay@home, or ClimatePrediction.net? If so, you've allowed your computer to be used as part of an informal, ad-hoc supercomputer grid. This kind of approach is called opportunistic supercomputing, because it takes advantage of whatever computers just happen to be available at the time. Grids like this, which are linked using the Internet, are best for solving embarrassingly parallel problems that easily break up into completely independent chunks.
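
To see why such problems suit this opportunistic approach, here's a rough sketch (the task and all the numbers are hypothetical) of how a job can be carved into work units that need no communication with one another, so volunteer machines can crunch them in any order and simply send back their results.

```python
# A rough sketch of an embarrassingly parallel job: the search range is cut
# into work units that are completely independent, so a central server could
# hand them to volunteer computers and merge the results in any order.
# The task and the numbers are hypothetical, purely for illustration.

def make_work_units(start, stop, unit_size):
    """Carve a big search range into independent work units."""
    return [(lo, min(lo + unit_size, stop))
            for lo in range(start, stop, unit_size)]

def is_interesting(n):
    # Stand-in test: primality by trial division (a real project such as
    # GIMPS uses far more sophisticated mathematics than this).
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

def crunch(unit):
    """Process one work unit in complete isolation."""
    lo, hi = unit
    return [n for n in range(lo, hi) if is_interesting(n)]

units = make_work_units(1, 100_000, 10_000)
results = [crunch(u) for u in units]   # on a grid, each unit runs elsewhere
print(sum(len(r) for r in results), "interesting numbers found")
```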

Hot stuff!

Photo: A Cray-2 supercomputer (left), photographed at NASA in 1989, with its own personal Fluorinert cooling tower (right). State of the art in the mid-1980s, this particular machine could perform a half-billion calculations per second. Picture courtesy of NASA Image Exchange (NIX).
If you routinely use a laptop (and sit it on your lap, rather than on a desk), you'll have noticed how hot it gets. That's because almost all the electrical energy that feeds in through the power cable is ultimately converted to heat energy. And it's why most computers need a cooling system of some kind, from a simple fan whirring away inside the case (in a home PC) to giant air-conditioning units (in large mainframes).

Overheating (or cooling, if you prefer) is a major issue for supercomputers. The early Cray supercomputers had elaborate cooling systems—and the famous Cray-2 even had its own separate cooling tower, which pumped a kind of cooling "blood" (Fluorinert™) around the cases to stop them overheating.

Modern supercomputers tend to be either air-cooled (with fans) or liquid cooled (with a coolant circulated in a similar way to refrigeration). Either way, cooling systems translate into very high energy use and very expensive electricity bills; they're also very bad environmentally. Some supercomputers deliberately trade off a little performance to reduce their energy consumption and cooling needs and achieve lower environmental impact.

What software do supercomputers run?

You might be surprised to discover that most supercomputers run fairly ordinary operating systems much like the ones running on your own PC, although that's less surprising when we remember that a lot of modern supercomputers are actually clusters of off-the-shelf computers or workstations. The most common supercomputer operating system used to be Unix, but it's now been superseded by Linux (an open-source, Unix-like operating system originally developed by Linus Torvalds and thousands of volunteers). Since supercomputers generally work on scientific problems, their application programs are sometimes written in traditional scientific programming languages such as Fortran, as well as popular, more modern languages such as C and C++.

What do supercomputers actually do?

Photo: Supercomputers can help us crack the most complex scientific problems, including modeling Earth's climate. Picture courtesy of NASA on the Commons.
As we saw at the start of this article, one essential feature of a computer is that it's a general-purpose machine you can use in all kinds of different ways: you can send emails on a computer, play games, edit photos, or do any number of other things simply by running a different program. If you're using a high-end cellphone, such as an Android phone or an iPhone or an iPod Touch, what you have is a powerful little pocket computer that can run programs by loading different "apps" (applications), which are simply computer programs by another name. Supercomputers are slightly different.
Typically, supercomputers have been used for complex, mathematically intensive scientific problems, including simulating nuclear missile tests, forecasting the weather, simulating the climate, and testing the strength of encryption (computer security codes). In theory, a general-purpose supercomputer can be used for absolutely anything.

While some supercomputers are general-purpose machines that can be used for a wide variety of different scientific problems, some are engineered to do very specific jobs. Two of the most famous supercomputers of recent times were engineered this way. IBM's Deep Blue machine from 1997 was built specifically to play chess (against Russian grandmaster Garry Kasparov), while its later Watson machine (named for IBM's founder, Thomas Watson) was engineered to play the TV quiz game Jeopardy! Specially designed machines like this can be optimized for particular problems; so, for example, Deep Blue would have been designed to search through huge databases of potential chess moves and evaluate which move was best in a particular situation, while Watson was optimized to analyze tricky general-knowledge questions phrased in (natural) human language.

How powerful are supercomputers?

Look through the specifications of ordinary computers and you'll find their performance is usually quoted in MIPS (million instructions per second), which is how many fundamental programming commands (read, write, store, and so on) the processor can manage. It's easy to compare two PCs by comparing the number of MIPS they can handle (or even their processor speed, which is typically rated in gigahertz or GHz).

Supercomputers are rated a different way. Since they're employed in scientific calculations, they're measured according to how many floating point operations per second (FLOPS) they can do, which is a more meaningful measurement based on what they're actually trying to do (unlike MIPS, which is a measurement of how they are trying to do it). Since supercomputers were first developed, their performance has been measured in successively greater numbers of FLOPS, as the table below illustrates:
Unit                   FLOPS                                      Example    Decade
Hundred FLOPS          100 = 10^2                                 ENIAC      ~1940s
KFLOPS (kiloflops)     1 000 = 10^3                               IBM 704    ~1950s
MFLOPS (megaflops)     1 000 000 = 10^6                           CDC 6600   ~1960s
GFLOPS (gigaflops)     1 000 000 000 = 10^9                       Cray-2     ~1980s
TFLOPS (teraflops)     1 000 000 000 000 = 10^12                  ASCI Red   ~1990s
PFLOPS (petaflops)     1 000 000 000 000 000 = 10^15              Jaguar     ~2010s
EFLOPS (exaflops)      1 000 000 000 000 000 000 = 10^18          ?????      ~2020s
The example machines listed in the table are described in more detail in the chronology, below.
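
To get a feel for what a FLOPS rating involves, here's a deliberately crude sketch (my own illustration) that times a loop of floating point operations and divides the count by the elapsed time. Interpreted Python vastly understates what the hardware can really do; official rankings such as the TOP500 rely on carefully tuned benchmarks like LINPACK instead.

```python
# A crude FLOPS estimate: time a loop of floating point multiply-adds and
# divide the operation count by the elapsed time. Interpreter overhead means
# this hugely understates the hardware's real capability; it only shows what
# the unit of measurement means.

import time

def estimate_flops(iterations=5_000_000):
    x = 1.0
    start = time.perf_counter()
    for _ in range(iterations):
        x = x * 1.0000001 + 1e-9    # one multiply and one add per pass
    elapsed = time.perf_counter() - start
    return (2 * iterations) / elapsed   # floating point operations per second

print(f"Roughly {estimate_flops():,.0f} FLOPS from this naive Python loop")
```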

Who invented supercomputers? A supercomputer timeline!

Study the history of computers and you'll notice something straight away: no single individual can lay claim to inventing these amazing machines. Arguably, that's much less true of supercomputers, which are widely acknowledged to owe a huge debt to the work of a single man, Seymour Cray (1925–1996). Here's a whistle stop tour of supercomputing, BC and AC—before and after Cray!
Photo: The distinctive C-shaped processor unit of a Cray-2 supercomputer. Picture courtesy of NASA Image Exchange (NIX).
  • 1946: John Mauchly and J. Presper Eckert construct ENIAC (Electronic Numerical Integrator And Computer) at the University of Pennsylvania. The first general-purpose, electronic computer, it's about 25m (80 feet) long and weighs 30 tons and, since it's deployed on military-scientific problems, is arguably the very first scientific supercomputer.
  • 1953: IBM develops its first general-purpose mainframe computer, the IBM 701 (also known as the Defense Calculator), and sells about 20 of the machines to a variety of government and military agencies. The 701 is arguably the first off-the-shelf supercomputer. IBM engineer Gene Amdahl later redesigns the machine to make the IBM 704, a machine capable of 5 KFLOPS (5000 FLOPS).
  • 1956: IBM develops the Stretch supercomputer for Los Alamos National Laboratory. It remains the world's fastest computer until 1964.
  • 1957: Seymour Cray co-founds Control Data Corporation (CDC) and pioneers fast, transistorized, high-performance computers, including the CDC 1604 (announced 1958) and 6600 (released 1964), which seriously challenge IBM's dominance of mainframe computing.
  • 1972: Cray leaves Control Data and founds Cray Research to develop high-end computers—the first true supercomputers. One of his key ideas is to reduce the length of the connections between components inside his machines to help make them faster. This is partly why early Cray computers are C-shaped, although the unusual circular design (and bright blue or red cabinets) also helps to distinguish them from competitors.
  • 1976: First Cray-1 supercomputer is installed at Los Alamos National Laboratory. It manages a speed of about 160 MFLOPS.
  • 1979: Cray develops an even faster model, the eight-processor, 1.9 GFLOP Cray-2. Where wire connections in the Cray-1 were a maximum of 120cm (~4 ft) long, in the Cray-2 they are a mere 41cm (16 inches).
  • 1983: Thinking Machines Corporation unveils the massively parallel Connection Machine, with 64,000 parallel processors.
  • 1989: Seymour Cray starts a new company, Cray Computer, where he develops the Cray-3 and Cray-4.
  • 1990s: Cuts in defense spending and the rise of powerful RISC workstations, made by companies such as Silicon Graphics, pose a serious threat to the financial viability of supercomputer makers.
  • 1993: Fujitsu Numerical Wind Tunnel becomes the world's fastest computer using 166 vector processors.
  • 1994: Thinking Machines files for bankruptcy protection.
  • 1995: Cray Computer runs into financial difficulties and files for bankruptcy protection. Tragically, Seymour Cray dies on October 5, 1996, after sustaining injuries in a car accident.
  • 1996: Cray Research (Cray's original company) is purchased by Silicon Graphics.
  • 1997: ASCI Red, a supercomputer made from Pentium processors by Intel and Sandia National Laboratories, becomes the world's first teraflop (TFLOP) supercomputer.
  • 1997: IBM's Deep Blue supercomputer beats Garry Kasparov at chess.
  • 2008: The Jaguar supercomputer built by Cray Research and Oak Ridge National Laboratory becomes the world's first petaflop (PFLOP) scientific supercomputer. Briefly the world's fastest computer, it is soon superseded by machines from Japan and China.
  • 2011–2013: Jaguar is extensively (and expensively) upgraded, renamed Titan, and briefly becomes the world's fastest supercomputer before losing the top slot to the Chinese machine Tianhe-2.
  • 2014: Mont-Blanc, a European consortium, announces plans to build an exaflop (10^18 FLOPS) supercomputer from energy-efficient smartphone and tablet processors.
  • 2017: Chinese scientists announce they will soon unveil the prototype of an exaflop supercomputer, expected to be based on Tianhe-2.
  • 2018: China now dominates the TOP500 ranking of the world's 500 fastest supercomputers, leading the United States by 202 machines to 143 (a year earlier, both countries boasted 171 machines each). The Sunway TaihuLight remains the world's most powerful machine.
