The Renkei Project: A study of resource coordination techniques for the formation of research communities
In the Global Scientific Information and Computing Center (GSIC) at the Tokyo Institute of Technology, we are conducting research relating to resource coordination techniques for the formation of a collaborative research community with eight universities and organizations including the NII. The title of this project is RENKEI, which is an acronym for Resources Linkage for e-Science. We spoke to Professor Satoshi Matsuoka, Assistant Professor Shinichiro Takizawa and Assistant Professor Hitoshi Sato of the GSIC about the relationship between SINET3 and RENKEI-POP (point of presence), which is a sub-theme of this project.
(Date of interview: May 7, 2010)
It seems that GSIC is involved in a lot of other activities besides introducing information technology into the campus.
Matsuoka: We’re currently working on three major projects. The first is, as you mentioned, the construction and operation of an internal IT infrastructure for the university. In addition to operating the Super TITANET campus network, this project also includes operating computers and the TSUBAME supercomputer for education and research applications, and providing the university with cloud-based hosting and content services. We are also constructing a university-wide integrated authentication system that implements an environment where users can access other systems and services with a single sign-on once they have logged in to their own portals.
The other two projects are more along the lines of external activities. One involves cooperating with other universities and research organizations to promote advanced research in the IT field. We are researchers ourselves, so in addition to maintaining the Tokyo Institute of Technology’s IT infrastructure, we are also involved in various studies such as e-science projects and R&D involving our supercomputers and HPC. The other involves cooperating with the IT centers of other universities, and in April 2010 we became one of the constituent centers of the Joint Usage / Research Center for Interdisciplinary Large-Scale Information Infrastructures. Incidentally, as you might expect from the word “Global” in the GSIC’s name, we are also cooperating with universities and research organizations in other countries such as Thailand, Canada and the United States.
It has been said that you’re strengthening your ties with researchers in other fields besides IT.
Matsuoka: At the cutting edge of science, advanced information processing technology is indispensable in all fields. For example, in studies such as the human genome project, the latest supercomputers can process in a few minutes data that would originally have taken 10 years to analyze. This means that research can now be performed from a completely different viewpoint. Exactly the same can be said of research in fields such as astronomy and the global environment. However, researchers who specialize in astronomy or biology are not always familiar with computers or information processing technology. So to accelerate research in these various fields, we must provide the researchers working in them with support on the information technology side.
Can you tell us about your involvement in the ongoing RENKEI project?
Matsuoka: Like I said, an important theme of modern science is the simulation and analysis of huge amounts of data produced by equipment such as genome sequencers, radio telescopes, particle accelerators and synchrotron facilities. But if each of these research fields has its own separate e-science infrastructure and centers, then that isn’t very efficient. The aim of the RENKEI project is to create a common data platform that can be used by researchers in all fields of science.
In Japan, a lot of effort has been put into improving the processing power of supercomputers. This is of course very important, and here at Tokyo Institute of Technology we’re currently working on the development of the successor to our TSUBAME supercomputer. But speed is not the be-all and end-all of computer technology. To facilitate the advancement of e-science, it’s just as important to have a data infrastructure where large quantities of data can be stored, accessed at high speed, processed, and transmitted.
What is the role of RENKEI-POP in this project?
Takizawa: RENKEI-POP is one of the RENKEI project’s five sub-themes, and is aimed at practical evaluation and collaboration between users. For this purpose, we are researching and developing appliances that support the high-speed transfer of data between locations. This makes it easy for large quantities of research data held in the supercomputers of various universities and research organizations to be utilized at another location. There are currently two RENKEI-POP nodes located at Tokyo Institute of Technology, and one each at Osaka University, Nagoya University, the University of Tsukuba, NII, KEK and the National Institute of Advanced Industrial Science and Technology (AIST). Data can be retrieved and transmitted between these nodes. This technology is being developed using software resources like the Globus Toolkit and Gfarm, and a SINET3 layer 3 VPN is used to network the nodes together. Since we have a 10 Gbps bandwidth at our disposal, we can transfer 8 GB of data in just 15 seconds or so. Due to bottlenecks in the communication between nodes and the lack of fine-tuning in the TCP communication between hosts, we’re not currently able to take full advantage of the system’s capacity. However, we’re working to improve performance in collaboration with the administrators at each node.
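(Editorial note: the figures quoted above can be checked with some simple arithmetic. The sketch below is only an illustration and is not part of the RENKEI-POP software. It assumes decimal units, with 1 GB taken as 10^9 bytes, and ignores protocol overhead; the function names are hypothetical.)

```python
# Back-of-the-envelope check of the quoted transfer figures.
# Assumptions (not from the interview): decimal units (1 GB = 10**9 bytes)
# and no protocol overhead. Function names are illustrative only.

def ideal_transfer_seconds(size_gb: float, link_gbps: float) -> float:
    """Time to move size_gb gigabytes over a link running at full line rate."""
    return size_gb * 8 / link_gbps

def effective_gbps(size_gb: float, seconds: float) -> float:
    """Average throughput actually achieved by a transfer."""
    return size_gb * 8 / seconds

ideal = ideal_transfer_seconds(8, 10)   # 8 GB over a 10 Gbps link
observed = effective_gbps(8, 15)        # 8 GB in about 15 seconds
utilization = observed / 10             # fraction of the 10 Gbps link in use

print(f"ideal time at line rate: {ideal:.1f} s")
print(f"observed throughput:     {observed:.2f} Gbps")
print(f"link utilization:        {utilization:.0%}")
```

Under these assumptions the transfer would take about 6.4 seconds at full line rate, so 15 seconds corresponds to a little over 4 Gbps, which fits the remark that the system's capacity is not yet fully used.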
That’s pretty fast. Will you be able to transmit data even more quickly if SINET is sped up in the future?
Sato: Basically yes, but we would have to make changes to the current RENKEI-POP specifications. That’s because the bottleneck used to be network bandwidth, but now we have to keep a close eye on the I/O performance of the RENKEI-POP system. The current RENKEI-POP specifications call for a high-end, high-capacity PC with a Core i7 975 Extreme CPU, 12 GB of memory, a 10 GbE NIC, and 30 TB of data storage. These specifications are perfectly adequate for a bandwidth of 10 Gbps, but will become a bit stretched if the bandwidth gets even larger. To operate RENKEI-POP properly, we have to balance all the factors such as the network bandwidth and I/O performance, so if the bandwidth goes up to 40 Gbps or 100 Gbps, we’ll need to revise the system specifications accordingly.
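(Editorial note: the balance between network bandwidth and I/O performance can be illustrated with a rough calculation of the storage throughput a node would need in order to keep up with its link. This is a sketch under stated assumptions, not a description of the actual RENKEI-POP design: it uses decimal units, ignores protocol and file system overhead, and the function names are hypothetical.)

```python
# Rough estimate of the sustained storage I/O needed to keep a link busy.
# Assumptions (not from the interview): decimal units (1 GB = 10**9 bytes,
# 1 TB = 1000 GB), no protocol or file system overhead.
# Function names are illustrative only.

def required_io_gb_per_s(link_gbps: float) -> float:
    """Sustained disk throughput (GB/s) needed to match the link rate."""
    return link_gbps / 8

def hours_to_fill(storage_tb: float, link_gbps: float) -> float:
    """Hours needed to fill storage_tb of disk when writing at line rate."""
    seconds = storage_tb * 1000 / required_io_gb_per_s(link_gbps)
    return seconds / 3600

for link in (10, 40, 100):
    io = required_io_gb_per_s(link)
    fill = hours_to_fill(30, link)
    print(f"{link:>3} Gbps link: {io:5.2f} GB/s of disk I/O, "
          f"30 TB filled in {fill:.2f} h")
```

Under these assumptions a 10 Gbps link already calls for about 1.25 GB/s of sustained disk throughput, and 40 Gbps or 100 Gbps would raise that to 5 GB/s or 12.5 GB/s, which gives a sense of why the hardware specification would have to be revised along with the network.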
So the establishment of this environment is good news for people in all fields of research?
Matsuoka: Absolutely. There are many different studies that use large quantities of data, but until now it has been very difficult for them to transmit this data across networks. Even a 10 Gbps connection can be far too slow to deliver practical throughput, so researchers have been obliged to physically transport data on magnetic tape or disks instead. But using RENKEI-POP, it is easy for researchers to have their data analyzed by an advanced supercomputer located at another university. We plan to continue with our research and development in order to implement this system as soon as possible.
Finally, what are your hopes regarding SINET and your aspirations for future research?
Matsuoka: First of all, I’d like to see SINET extended into other services besides networks. We’ve been using the network services for a long time, and RENKEI-POP is also proving to be useful. But for the future progress of science, it is very important to set up a domestic data sharing platform. It would be great if you could cooperate with our efforts to build a basic infrastructure to connect the IT centers of different universities.
As for the future, I’d like to put more effort into training people to take on the grand challenges of scientific research. I tell students to stop doing mundane research (laughs). I think services based on the Web or mobile phones should be left to the private sector. If people are conducting research at university, I think we should encourage them to take on the grand challenges in all sorts of scientific fields. This also includes the grand challenge of information systems.