The Institute of Statistical Mathematics (ISM) is widely involved in cutting-edge research in statistical science and in collaborative studies with researchers in various fields. We spoke to Professor Yasuyoshi Tamura (vice director-general of the ISM) and Kazuhiro Nakamura (head of the computing facilities unit at the center for engineering and technical support) about the ISM’s use of network technology.
(Date of interview: March 11, 2010)
Can you describe briefly what sort of work the ISM does?
Tamura: In Japan, there are currently no universities with statistics departments or specialized statistics courses. The ISM therefore plays a central role in statistical research as a Japanese center of excellence for statistical science. Our work here has reached a world-leading standard in many areas. One example is the “Akaike Information Criterion” (devised by our former director, the late Dr. Hirotsugu Akaike), which is now used throughout the world as a model selection method.
As for statistics in general, some people might think we just spend our time drawing pie charts and bar graphs (laughs). In fact we are engaged in a wide range of activities from the latest cutting-edge science to fields that are closely related to our everyday lives. Statistics are used in all sorts of places. Their applications include not only opinion polls and electoral predictions, but also natural sciences, medicine, financial engineering and predictions of demand and sales figures in the retail sector.
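For readers unfamiliar with the Akaike Information Criterion mentioned above: it is defined as AIC = 2k - 2 ln(L), where k is the number of estimated parameters and L is the maximized likelihood, and the model with the lower AIC is preferred. The following is a minimal sketch, assuming independent Gaussian errors; the data points and helper names are made up for illustration and are not taken from the ISM's work.

```python
import math

def gaussian_aic(residuals, n_params):
    """AIC = 2k - 2 ln(L) for a model with i.i.d. Gaussian errors.

    With the error variance estimated by maximum likelihood (RSS / n),
    the maximized log-likelihood is -n/2 * (ln(2*pi*RSS/n) + 1).
    n_params must include the variance as one estimated parameter.
    """
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    log_lik = -0.5 * n * (math.log(2 * math.pi * rss / n) + 1)
    return 2 * n_params - 2 * log_lik

def constant_residuals(ys):
    """Residuals of the model y = mean."""
    mean = sum(ys) / len(ys)
    return [y - mean for y in ys]

def line_residuals(xs, ys):
    """Residuals of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Made-up illustrative data with a clear upward trend.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0.1, 1.2, 1.9, 3.2, 3.8, 5.1, 6.2, 6.8]

aic_constant = gaussian_aic(constant_residuals(ys), n_params=2)  # mean, variance
aic_line = gaussian_aic(line_residuals(xs, ys), n_params=3)      # intercept, slope, variance

print(f"constant model AIC: {aic_constant:.2f}")
print(f"linear model AIC:   {aic_line:.2f}")
```

Here the linear model pays a penalty for its extra parameter but fits so much better that its AIC is lower, which is the trade-off the criterion is designed to weigh.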
So as a result, the ISM has research groups set up in a wide range of different fields.
Tamura: That’s right. The famous 19th century statistician Karl Pearson called statistics the “grammar of science”. If you want to analyze data or use it to construct a model, then you absolutely need to have some knowledge of statistics. In any academic field, you always need to employ statistics when making use of data. You might be surprised to find that statistics have even been used to examine the stylistic features of written language in studies of classical literature. My particular specialty is time series analysis, which is a field of statistics used for the prediction and control of time-varying phenomena. I’m currently researching respiration and brain activity in a collaborative study with the faculty of medicine.
So you’re involved in studies at the very foundations of modern science.
What sort of role does IT play in your work?
Tamura: We were very early adopters of information technology. Japan’s first commercial computer was the Fujitsu FACOM128A, and the ISM was its first customer. Statisticians have to handle large amounts of data, so fast computers and networks are essential. We have been using networks for a long time too. Before the Internet became widely available, we were hooked up to Tokyo University’s TISN and JUNET networks. For our internal network, we had already set up a “yellow cable” 10BASE5 LAN by the late 1980s.
There’s also considerable growth in the demand for IT from users, isn’t there?
Nakamura: That’s right. In the computer infrastructure laboratory that I look after, we perform the role of the ISM information systems division, and we have to keep a close eye on the reliability, availability and security of our resources. In particular, computing resources such as email have recently become important tools for supporting research, so we have to make sure downtime is kept to the absolute minimum.
High speed is another important criterion. The ISM provides external universities and research organizations with computing resources such as supercomputers, and as a result the quantities of data we are having to handle are growing at an accelerating rate year on year. For that reason alone, we need our networks to be as fast as possible. We were also relatively early adopters of SINET in order to meet these demands.
Apart from supercomputers, what other applications do you use networks for?
Tamura: Well, for example, we have built an on-site physical random number generator board, which we have made available as an on-demand service to on-site and off-site users. Physical random numbers exhibit less periodicity and regularity than the output of pseudo-random number generators, but their drawback is that the generators are expensive to operate. Although there are of course other research laboratories and universities with their own physical random number generator boards, individual researchers often cannot afford them. Since the ISM has developed a world-class high-performance system, we’re keen to make it widely available to other people.
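The periodicity of pseudo-random generators is easy to see in a toy example. The sketch below uses a deliberately tiny linear congruential generator; its parameters (a=5, c=3, m=16) are chosen only for illustration and do not correspond to any generator used at the ISM. Practical generators have far longer periods, but they still repeat eventually because their state is finite.

```python
def lcg(seed, a=5, c=3, m=16):
    """Tiny linear congruential generator: x -> (a * x + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

def cycle_length(seed, **params):
    """Count the steps between the first repeated value and its earlier occurrence."""
    first_seen = {}
    for step, value in enumerate(lcg(seed, **params)):
        if value in first_seen:
            return step - first_seen[value]
        first_seen[value] = step

# With these illustrative parameters the sequence visits every value
# in 0..15 once and then starts over.
period = cycle_length(seed=1)
print(f"The toy generator repeats after {period} outputs")
```

A physical generator draws on a hardware noise source rather than a deterministic recurrence, so it has no such built-in cycle.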
Although it’s easy enough to generate random numbers that people can download to their hard drives, getting the maximum performance from this equipment is still difficult in the current network environment. In July 2010 we will be operating three types of physical random number generator boards, but running them online at full capacity will require a bandwidth of about 600 MB/s. So in this sense, I think we might also need to beef up the SINET bandwidth (laughs).
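To put the 600 MB/s figure in network terms, the conversion is simple arithmetic: about 4.8 Gbit/s. The sketch below assumes decimal units (1 MB = 10^6 bytes) and ignores protocol overhead. The link speeds it compares against are hypothetical round numbers, not SINET's actual capacity.

```python
REQUIRED_MB_PER_S = 600  # throughput quoted in the interview

def mb_per_s_to_gbit_per_s(mb_per_s):
    """Convert megabytes per second to gigabits per second (decimal units)."""
    return mb_per_s * 8 / 1000

required_gbit = mb_per_s_to_gbit_per_s(REQUIRED_MB_PER_S)

# Hypothetical link speeds for comparison only (an assumption, not from the interview).
HYPOTHETICAL_LINKS_GBIT = (1, 10)

for link in HYPOTHETICAL_LINKS_GBIT:
    utilization = required_gbit / link
    verdict = "fits" if utilization <= 1 else "does not fit"
    print(f"{link} Gbit/s link: {utilization:.0%} of capacity needed ({verdict})")
```

Under these assumptions the stream would overwhelm a 1 Gbit/s link and take up roughly half of a 10 Gbit/s one.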
As another example, we are using SINET to share test and analysis data in the research I mentioned earlier. It’s very useful for synchronizing data at high speed between the ISM and the laboratories of our research partners. Before SINET, we had no choice but to send these large amounts of data as individual file attachments, which was very inefficient.
Did SINET come in useful when the institute relocated to Tachikawa?
Nakamura: During the relocation, we still had a bit of time left in the rental period of the supercomputer, so we had to decide whether to physically move the supercomputer to our new location or to use it remotely over the network. It takes a lot of time and effort to disassemble a supercomputer and set it up again somewhere else, so we decided to use it via the network for the remainder of the rental period. Thanks to the SINET L2-VPN, we were able to use the supercomputer just as easily as before the relocation. Despite being in a different location, we were still able to access the supercomputer on the same network segment, so there was no need for us to bother with complicated reconfiguration of network equipment. SINET also made it possible for us to access our research data, email messages and so on without having to first transfer them onto tape storage. And it was great that we were able to accomplish the relocation with hardly any downtime.
What do you hope to achieve with SINET?
Nakamura: Our mission is to provide users with the best possible services, so we will continue making improvements in the future. In particular, the speed and capacity of computers and hard drives have increased dramatically in recent years, so we have to make sure that the network doesn’t become a bottleneck. We definitely hope to use SINET to provide network services that are faster and more reliable.
And finally, what are your hopes for the future?
Tamura: I’d like to see the ISM getting involved in outreach to other fields in addition to our cutting-edge research in statistical mathematics. In particular, we hope to foster talented researchers who have a comprehensive knowledge of statistics and the skills to manage research projects. We will also broaden the base of statistics by deepening our ties with other fields, which should lead to pioneering new fields of research. To do this, it is important that we continue with our advanced research and outreach activities.