Science Grid This Week
June 14, 2006
SUMS Schedules MIT

Xgrid at MIT

Image Courtesy of Adam Kocoloski
Faced with the challenge of sifting through petabytes of data and keeping up with the fast pace of technological progress in grid hardware and software, a nuclear physics collaboration developed a grid scheduler that has now been adapted to run on a cluster of Apple computers at the Massachusetts Institute of Technology.

Over the last three years, the STAR collaboration, a member of the Particle Physics Data Grid and the Open Science Grid, has developed a software package that provides a consistent interface to the ever-evolving hardware and software that define grid computing. The STAR Unified Meta Scheduler (SUMS) provides a simple and elegant definition of a physics user analysis and translates it into the commands required to allocate disk storage, locate data sets, break tasks into many processes that can run in parallel, launch jobs on the grid, and return the results to the user.
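To make the "break tasks into many processes" step concrete, here is a minimal Python sketch of how a list of input files might be divided into independent jobs. It is an illustration only, not SUMS's actual interface or algorithm: the function name `split_into_jobs`, the `max_files_per_job` parameter, and the file names are all hypothetical.

```python
from typing import List


def split_into_jobs(input_files: List[str], max_files_per_job: int) -> List[List[str]]:
    """Partition a list of input files into chunks, one chunk per job.

    Hypothetical helper for illustration; not part of SUMS.
    """
    if max_files_per_job < 1:
        raise ValueError("max_files_per_job must be at least 1")
    # Step through the list in strides of max_files_per_job;
    # the last chunk may be shorter than the others.
    return [
        input_files[i:i + max_files_per_job]
        for i in range(0, len(input_files), max_files_per_job)
    ]


# Made-up file names standing in for a real data set.
files = [f"run_{n:03d}.root" for n in range(7)]
jobs = split_into_jobs(files, max_files_per_job=3)

for job_id, chunk in enumerate(jobs):
    # A real scheduler would submit each chunk to the grid here.
    print(job_id, chunk)
```

Each chunk can then be processed independently, which is what lets an analysis spread across many machines. A real scheduler would also have to decide where each job runs and how its output gets back to the user.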

The MIT cluster is one of a growing number of university grid clusters that harvest unused resources from users' desktop machines. MIT's Apple Macintosh G5 desktops boast a dual-core, dual-CPU configuration with approximately 2 GB of RAM, providing a total of 10 GHz of CPU power. The MIT group established a prototype local grid cluster by linking 20 G5s using Apple's Xgrid software. Because Xgrid is a standard part of the Apple operating system, connecting the machines was simple.

"That was the key to getting people to donate their desktop machines," says Adam Kocoloski, a physics graduate student at MIT who set up the local grid cluster. "If it's as simple as clicking a checkbox on their computer, they're willing to do it."

While the Xgrid commands were relatively straightforward, the 15 STAR researchers at MIT could not efficiently use the available 50 GHz of computing power without investing significant resources in learning another parallel computing system and developing the software needed to launch jobs, manage tasks and retrieve results. After only two days of collaboration between MIT's Mike Miller and Brookhaven National Laboratory's Levente Hajdu, a primary developer of the scheduler, SUMS was adapted to run on the Apple operating system and interface with the MIT Xgrid cluster.

"SUMS fits our needs perfectly," says Miller. "It allocates resources efficiently, easily interfaces with the Xgrid software, and it addresses many of the subtleties that are intrinsic in Apple's first release of Xgrid, such as the need to retrieve the results manually for each task once it has finished."

The nuclear physics group at MIT, which includes experimentalists and theorists, is now using the scheduler full time on MIT-Xgrid. SUMS and Xgrid allow Kocoloski to get a jump start on the analysis of STAR data for his Ph.D. dissertation.

"The grid is really useful for data reduction—taking a data set that's too large to fit on one's laptop and would take too long to process, sending that early analysis step out to the grid and bringing the results back," explains Kocoloski.

The prototype Xgrid cluster has been so successful that the group is now pursuing funding to expand the cluster to a much larger scale.

—Adam Kocoloski and Mike Miller, MIT