It’s one thing to make sure you have enough room on your hard drive for every episode of Star Trek. It’s quite another to make room for a petabyte — a million gigabytes — of data for a billion stars in our galaxy and even beyond.
If you’ve ever wondered how Star Trek’s computer banks could possibly store the locations and trajectories of all those conveniently located class M planets, look no further than Gaia, one of the European Space Agency’s scientific programs. Gaia’s mission: to gather immense amounts of data on a billion stars and other objects over a five-year period, process it over a seventeen-year period and reconstruct the most precise 3D map of our galaxy ever created.
Xavier Luri is the current leader of the Gaia Archive and one of the original authors of the Gaia mission proposal. I met Luri earlier this year over Skype, and we discussed the literally astronomical scope of this mission and its peculiar data challenges. In our email exchanges more recently, Luri mentioned that in his PowerPoint presentations he often compares the Star Trek universe and the actual span of the Gaia catalogue. “I think it makes a nice comparison,” he said.
Gaia’s data challenge has already begun. The Gaia spacecraft was launched in December 2013 from French Guiana and has already started to record information with the help of its twin telescopes, deployable sunshield, micro-propulsion system and 1,000-megapixel, high-resolution image sensor. A form of triangulation, known as parallax in the astronomy world, is used to determine the distance between each object and the spacecraft, which orbits the Sun-Earth “L2 point,” about 930,000 miles from Earth. The distance of a star, together with its position in the sky, which Gaia also measures, will allow researchers to obtain the full 3D position of each star, along with many other objects of interest: galaxies, quasars, extragalactic supernovae, asteroids and other complex and curious bits of the universe.
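For readers who want to see the geometry, here is a minimal Python sketch of how a parallax angle and a sky position combine into a 3D position. It uses the standard textbook relation (distance in parsecs is one divided by the parallax in arcseconds) rather than anything from Gaia's actual processing pipeline, and the function names and sample values are illustrative assumptions.

```python
import math


def distance_parsecs(parallax_arcsec: float) -> float:
    """Textbook relation: distance (parsecs) = 1 / parallax (arcseconds)."""
    if parallax_arcsec <= 0:
        raise ValueError("parallax must be positive to yield a distance")
    return 1.0 / parallax_arcsec


def cartesian_position(ra_deg: float, dec_deg: float, parallax_arcsec: float):
    """Turn sky coordinates plus parallax into (x, y, z) in parsecs.

    ra_deg and dec_deg are right ascension and declination in degrees;
    these parameter names are hypothetical, chosen for this sketch.
    """
    d = distance_parsecs(parallax_arcsec)
    ra = math.radians(ra_deg)
    dec = math.radians(dec_deg)
    x = d * math.cos(dec) * math.cos(ra)
    y = d * math.cos(dec) * math.sin(ra)
    z = d * math.sin(dec)
    return (x, y, z)


if __name__ == "__main__":
    # Made-up star: parallax of 0.1 arcseconds, so 10 parsecs away.
    print(cartesian_position(ra_deg=45.0, dec_deg=30.0, parallax_arcsec=0.1))
```

The smaller the measured angle, the farther the star, which is why mapping distant stars demands such precise measurements.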
The raw astrometry, photometry and spectroscopy data that will be collected by the Gaia spacecraft (estimated to be approximately 150 terabytes, or 150,000 gigabytes of information total) will be transmitted back to Earth via Gaia’s high-gain antenna to be processed. Other quantities, such as each star’s luminosity, composition, temperature and gravity, will be derived from the raw data. This additional processing, however, is where data storage and management get especially tricky.
“To appreciate the complexity of the data processing,” Luri said, “the data processing consortium consists of more than 450 scientists and engineers from 24 countries.” The consortium, known as DPAC (the Data Processing and Analysis Consortium), runs six data processing centers around Europe. It has been working since 2006 and is scheduled to finish processing the raw data from Gaia around 2023. In the end, the archive will hold enough information to fill about 1.5 million CD-ROMs. Between now and then, information will be released in stages into the hands of eagerly awaiting astrophysicists.
Star Trek’s Lieutenant Commander Data himself, with his unbelievable android storage capacity of eight hundred quadrillion bits, is actually only a factor of one hundred ahead of the total digital space that Gaia requires for its own space mission. A data challenge, indeed.
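The numbers in this article can be checked with a few lines of arithmetic. The sketch below is a back-of-the-envelope calculation, not an official ESA figure. It assumes decimal units (a petabyte is 10^15 bytes), a quadrillion of 10^15 and a 700-megabyte CD-ROM, a capacity the article does not state.

```python
BITS_PER_BYTE = 8
TERABYTE = 10**12   # bytes, decimal convention (assumption)
PETABYTE = 10**15   # bytes, decimal convention (assumption)

# Figures quoted in the article.
gaia_archive_bytes = 1 * PETABYTE      # total processed archive
gaia_raw_bytes = 150 * TERABYTE        # raw data sent down from the spacecraft
data_android_bits = 800 * 10**15       # "eight hundred quadrillion bits"

# Convert Data's capacity to bytes and compare it with the Gaia archive.
data_android_bytes = data_android_bits // BITS_PER_BYTE
android_to_gaia_ratio = data_android_bytes // gaia_archive_bytes

# Assumed capacity of one CD-ROM (700 MB); not given in the article.
CD_ROM_BYTES = 700 * 10**6
cd_roms_needed = gaia_archive_bytes / CD_ROM_BYTES

# How many times larger the finished archive is than the raw data.
archive_to_raw_ratio = gaia_archive_bytes / gaia_raw_bytes

if __name__ == "__main__":
    print(f"Data vs. Gaia archive: {android_to_gaia_ratio}x")
    print(f"CD-ROMs needed: {cd_roms_needed:,.0f}")
    print(f"Archive vs. raw data: {archive_to_raw_ratio:.1f}x")
```

Under these assumptions Data holds 100 petabytes, one hundred times the Gaia archive, and the archive would fill roughly 1.4 million CD-ROMs, in line with the 1.5 million quoted above.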