Data mining: New 'virtual observatory' will let astronomers sift through galaxies of data

Monday, November 19, 2001

By Byron Spice, Science Editor, Post-Gazette

The nation's newest observatory isn't a telescope sitting atop some Western mountain, a radiation detector buried in the South Pole ice or a specialized camera orbiting on a satellite.

Rather, it's a bunch of computers.

The National Virtual Observatory, funded late last month by a five-year, $10 million National Science Foundation grant, will combine existing databases from ground-based and orbiting observatories and make them easily accessible to researchers.

It doesn't sound too exciting, but astronomers predict it will change how they do their work.

Discoveries no longer are based only on startling new observations, or on newly conceived theoretical models, explained Alex Szalay, a Johns Hopkins University astronomer and the virtual observatory's co-leader. Increasingly, "eureka moments" are achieved by using computers to sift through mounds of existing data.

Already, some astronomers aren't sure that hoarding their new observational data gives them much of a competitive advantage anymore, said Andrew Connolly, a University of Pittsburgh astrophysicist. Principal investigators, who traditionally have enjoyed exclusive use of new data for a year, increasingly are choosing to release their data early.

"Now," he said, "it's 'Can I get my hands on your N-point correlation code?' " Researchers who have a hot piece of analysis software, it seems, may sometimes have an advantage over colleagues with a virgin database.

If computation joins experimentation and theorizing as a mode of research, it will be as much from necessity as opportunity. Astronomers are buried in data. Automated observatories with digitized data fed directly into computers have made data easier to generate than to analyze.

"In the last 30 years, the amount of data we get just from detectors has grown by a thousand-fold," said Steven Beckwith, director of the Space Telescope Science Institute in Baltimore. "And so the amount of data out there is overwhelming." Many databanks go unused and unanalyzed once investigators have published the initial, most obvious findings.

Volume of data is one issue; type of data is another. Astronomers view stars and galaxies in a variety of wavelengths -- optical, radio, infrared, gamma ray, X-ray and more. Each wavelength can provide different information about a celestial event or object, but also requires a special expertise to interpret. The databases also aren't compatible with each other.

The National Virtual Observatory, headed by Szalay and computer scientist Paul Messina of the California Institute of Technology, will serve astronomers much as search engines help computer users wade through the mass of information on the Internet.

The virtual observatory will link the databases and let users search them without any special knowledge of the individual databases or special expertise in particular wavelengths.

"And this is just the start," Pitt's Connolly predicted. As overwhelming as astronomical databases already are, astronomers are demanding ever bigger ones. The Sloan Digital Sky Survey, for instance, promotes itself as the most ambitious astronomical survey ever attempted -- a five-year effort to catalog everything in one quarter of the sky. Yet, within 10 years, astronomers want to map the sky every four nights, he said.

"We do not know what goes bump in the night," explained Robert Nichol, an astrophysicist at Carnegie Mellon University. "Until we can monitor the sky every night, we won't really know how the stars change every night."

It was only three years ago, for instance, that astronomers chanced to see an optical flash associated with a gamma ray burst, a brief but phenomenally powerful cosmic explosion, noted Daniel Reichart, a Caltech astronomer. And it was less than two weeks ago that a team of astrophysicists announced that they had spied their first "orphan afterglow," the optical glow that can linger for months following a gamma ray burst. In this case, the afterglow was an "orphan" because it was seen without first detecting the burst of gamma rays.

Gamma ray bursts have been detected for decades using orbiting gamma ray telescopes, but their exact nature "is very much still a mystery," Reichart said.

They are thought to be associated with exploding stars, called supernovae, but they are far brighter than any supernova and may date to the earliest times in the universe. They thus could serve as flashlights, backlighting everything that lies between us and the early universe, and perhaps could provide the first pictures of the first stars -- if astronomers can manage to be looking at the right place at the right time.

Giving astronomers access to all of this information is only part of the solution to the data problem. They also must be given the tools to make sense of it all.

Andrew Moore, a CMU computer scientist, is one of the virtual observatory's senior personnel and for several years has been working with Nichol, Connolly and fellow CMU computer scientist Jeff Schneider to find computing shortcuts that can troll through large databases.

For certain types of problems that normally would require days of intensive computing to answer, Moore said, it's actually faster to pre-compute the answers to every statistical question that can be posed of the database. Only some of the answers -- the surprising or unusual ones -- are stored in a new database called an AD tree. The result is that the AD tree can answer a billion questions in five minutes, using the unusual answers to reconstruct even the mundane answers when necessary.
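
For readers who want to see the idea in miniature, the following is a rough sketch of the caching trick, not Moore's actual software. It assumes a catalog of records with a few categorical attributes, and it treats the "mundane" answer as the count for the most common value of each attribute: that count is never stored and is recovered by subtraction when needed. The names ADNode and count, and the toy catalog, are hypothetical and chosen only for illustration.

```python
class ADNode:
    """Cached record count for one conjunctive query (illustrative sketch).

    For every remaining attribute the node keeps child nodes for each
    value EXCEPT the most common one; that count is left implicit.
    """

    def __init__(self, records, start, n_attrs):
        self.count = len(records)
        self.vary = {}  # attribute index -> (most common value, children)
        if not records:
            return
        for attr in range(start, n_attrs):
            groups = {}
            for rec in records:
                groups.setdefault(rec[attr], []).append(rec)
            # The most common value is the "mundane" case; do not store it.
            mcv = max(groups, key=lambda v: len(groups[v]))
            children = {
                v: ADNode(g, attr + 1, n_attrs)
                for v, g in groups.items()
                if v != mcv
            }
            self.vary[attr] = (mcv, children)


def count(node, query):
    """Count records matching query, a list of (attribute index, value)
    pairs sorted by attribute index."""
    if node.count == 0:
        return 0
    if not query:
        return node.count
    (attr, val), rest = query[0], query[1:]
    mcv, children = node.vary[attr]
    if val == mcv:
        # Rebuild the unstored answer: everything matching the rest of
        # the query, minus the matches that have some other value here.
        return count(node, rest) - sum(count(c, rest) for c in children.values())
    child = children.get(val)
    return count(child, rest) if child is not None else 0


# Toy catalog: (object type, color, variable?) -- made-up data.
catalog = [
    ("galaxy", "red", "no"),
    ("galaxy", "red", "no"),
    ("galaxy", "blue", "no"),
    ("star", "red", "yes"),
    ("star", "blue", "no"),
    ("quasar", "blue", "yes"),
]
root = ADNode(catalog, 0, 3)
print(count(root, [(0, "galaxy"), (1, "red")]))  # red galaxies: 2
```

In this toy version the tree is built once, and each later question is answered from stored counts and a few subtractions rather than another pass over the catalog. The real AD tree includes further refinements that are not shown here.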

This automated science can free astronomers from mundane tasks, but it won't replace the need for confirmatory observations using actual telescopes. Aggressive searching for patterns within a statistical database is going to turn up illusions as well as insights.

"Once you find something, you're going to still want to get on a big telescope and look at it," Nichol said.
