IT Support for Research
The departments have made a major investment in High-Performance Computing (HPC) cluster systems, providing a powerful and accessible resource for data-intensive and compute-intensive tasks. The two linked clusters are jointly owned by Mathematics, Computing and Technology, by Physics and Astronomy in the Science Faculty, and by the Department of Design, Development, Environment and Materials. The systems also support projects in Chemistry, Earth Sciences and the Planetary and Space Sciences Research Institute.
The original initiative for developing the cluster system came from an SRIF grant, bid for under the auspices of the Department of Statistics. Faculty users are drawn from the Computing Department, Applied Maths, Engineering and Statistics.
In addition to the clustered system, a range of specialised computing services is maintained and developed around specific research needs. These services include a VMCenter, which provides VMware virtual machines for specific applications, a Storage Area Network and a Hierarchical Storage environment.
IBM Domino Workflow servers are provided for various academic and administrative activities, and for content management of web-presented information.
The Research computer systems are developed and managed jointly by Geoff Bradshaw (Physics & Astronomy) and Allan Thrower (Maths, Computing and Technology).
In recent news, the cluster very quickly produced significant results in a major combinatorial analysis relating to Steiner triple systems of order 15 (described below).
Both clusters have evolved from an ever-increasing need for greater processing power. This need stems from current and planned research in Maths & Computing, the Department of Design and Innovation, and Physics & Astronomy, and has led the departments to integrate their individual clusters, providing an essential and very powerful tool for various fields of research. Computations that formerly took several months to complete can in some cases be carried out in less than a day using the power of the IMPACT cluster.
The cluster and ancillary equipment have been moved to a purpose-built location in a new building on the Open University campus.
[Video: a walkthrough of the new HPC server room, designed and built by Annor Ltd, showing the optimised trunking and cluster equipment.]
Transition to Linux...
Our decision was to go with openMosix for load balancing and PVM/MPICH for parallel programs, with a standard Red Hat installation as the base operating system. First tests were remarkably fast, and when we got users to test their existing programs on the system they too saw very fast results. We ran openMosix as a secondary system to the Alphas for the first year; then we were contacted by Jeffrey Johnson, Professor of Complexity Science and Design, who wanted to build a Linux cluster to run TRANSIMS. As we already had one, it was logical to add capacity to ours.
Purchasing ten more Transtec dual 4GHz Xeon machines provided a system that has taken over as our main research platform. The final transition came in 2003, with a total switch from Alpha to Linux for all back-end systems.
Total Cost of Ownership...
Linux and open-source software have allowed us to reduce software costs dramatically while keeping our research system running at full strength. Using a cluster and a job-queuing system, we have been able to add users' desktop machines to the overall system, taking advantage of spare CPU cycles on the more powerful desktops available. With the introduction of the AMD Opteron architecture we have replaced the Alphas for 64-bit computing, while making use of 32-bit emulation when working with legacy code.
Improved turnaround time...
Research projects being undertaken using the High-Performance cluster.
Genoveva Burca: Design
Advanced analytical and experimental methods for neutron diffraction studies of materials.
Jonti Horner: Astronomy
Funded by an STFC rolling grant on "Dynamical studies of planetary habitability, and the formation and evolution of planetary systems" (with Barrie Jones). He performs detailed n-body dynamical simulations of the behaviour of objects in our Solar system and beyond, to investigate a number of questions on Solar system formation and terrestrial planet habitability. A typical suite of such integrations requires some 10-20 years of computing time, and is usually carried out spread over a number of nodes on the cluster. An average run on a given node lasts between 4 and 10 weeks, depending on the precision of the simulations and the ejection/collision rate of particles within the system. The first results have already attracted a significant amount of international coverage.
Jonti Horner has also based fellowship bids for 2009 on the IMPACT cluster.
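For readers unfamiliar with such simulations, the sketch below shows the core of a direct n-body integration in Python with numpy: a symplectic leapfrog (kick-drift-kick) step under Newtonian gravity. It is purely illustrative; the research codes use specialised long-term integrators and run for vastly longer.

# Minimal sketch of one step of a direct N-body integration.
# Illustrative only; units, softening and integrator choice are assumptions.
import numpy as np

G = 1.0  # gravitational constant in code units (assumption)

def accelerations(pos, mass, softening=1e-4):
    """Pairwise gravitational accelerations for N bodies (pos: N x 3)."""
    n = len(mass)
    acc = np.zeros_like(pos)
    for i in range(n):
        dr = pos - pos[i]                       # vectors from body i to all others
        r2 = (dr**2).sum(axis=1) + softening**2
        r2[i] = 1.0                             # placeholder to avoid divide-by-zero
        inv_r3 = r2**-1.5
        inv_r3[i] = 0.0                         # no self-force
        acc[i] = G * (mass[:, None] * dr * inv_r3[:, None]).sum(axis=0)
    return acc

def leapfrog_step(pos, vel, mass, dt):
    """One kick-drift-kick step; symplectic, so energy error stays bounded."""
    vel = vel + 0.5 * dt * accelerations(pos, mass)
    pos = pos + dt * vel
    vel = vel + 0.5 * dt * accelerations(pos, mass)
    return pos, vel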
Neil Edwards: Earth Sciences
Uses and develops simplified Earth System Models to study the interactions of climate change and economics, the origins of ice-age cycles, and mass-extinction events through the Phanerozoic. Thorough analysis of model error is essential for the credibility of this work, and requires large ensembles, of order 1000 simulations, to produce reliable statistics. This research therefore relies on regularly upgraded machines, so that processor speed allows the longest integrations, and on a large cluster of computing nodes with which to investigate high-dimensional parameter spaces. Competitor institutions (e.g. Bristol, Exeter, Nottingham and Hamburg, to name just four) have recently invested in new ~1000-processor clusters; remaining at the cutting edge of this field therefore requires significant, sustained investment in procuring, upgrading and managing a large-scale cluster.
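A minimal sketch of the ensemble idea follows, with a toy stand-in model and invented parameter names and ranges; the real Earth System Models are far more complex, which is exactly why ~1000 members need a cluster.

# Toy ensemble: run a stand-in model over ~1000 sampled parameter sets
# and summarise the spread of the output.  All names and ranges here are
# hypothetical illustrations, not the research model's actual parameters.
import numpy as np

rng = np.random.default_rng(0)

def toy_model(climate_sensitivity, ocean_diffusivity):
    """Hypothetical scalar model output for one parameter set."""
    return climate_sensitivity * np.log(1.0 + ocean_diffusivity)

n_members = 1000
sens = rng.uniform(1.5, 4.5, n_members)   # assumed prior range
diff = rng.uniform(0.1, 2.0, n_members)   # assumed prior range

outputs = np.array([toy_model(s, d) for s, d in zip(sens, diff)])
print(f"ensemble mean = {outputs.mean():.3f}, std = {outputs.std():.3f}")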
Joan Serras: Design
Involves the study of multilevel representations of very large systems. The cluster is used to run a multi-agent transport model called TRANSIMS (TRansportation ANalysis and SIMulation System). "TRANSIMS is an integrated system of travel forecasting models designed to give transportation planners accurate and complete information on traffic impacts, congestion and pollution" (Hobeika, 2005). TRANSIMS can model urban systems of any size at a fine time-scale (a 1-second basis), and goes beyond the traditional four-step model to achieve activity-based demand. The output data produced by the model are then integrated into a multilevel representation.
The cluster offers the chance to carry out a successful run of a model for Milton Keynes, modelling the movement of each inhabitant in the synthetic population (around 2×10^5 inhabitants) on a second-by-second basis. They are currently planning to address larger areas, of the order of 10^6 agents. The use of the cluster's parallel computing interfaces (MPI and PVM) is of key importance for executing the simulations in an acceptable time.
Michael Wilkinson: Applied Mathematics
Uses the cluster in connection with 'Strings in turbulent flow', investigating the statistics of the configuration of a string advected by a turbulent fluid flow: does it form a compact, folded conformation like a random walk, or an extended conformation pulled out by large-scale eddies in the flow?
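The distinction can be quantified by how the end-to-end distance R scales with string length N: R ~ N^(1/2) for a random-walk conformation, against R ~ N for a fully extended one. The toy sketch below checks the random-walk scaling; it is illustrative only, since the research uses strings advected by simulated turbulent velocity fields, not ideal random walks.

# RMS end-to-end distance of an ideal random walk, to illustrate the
# compact-conformation scaling R ~ sqrt(N).  Purely a toy calculation.
import numpy as np

rng = np.random.default_rng(1)

def end_to_end(n_segments, n_samples=2000):
    """RMS end-to-end distance over many random walks of n_segments steps."""
    steps = rng.normal(size=(n_samples, n_segments, 3))
    ends = steps.sum(axis=1)                 # vector from first to last point
    return np.sqrt((ends**2).sum(axis=1).mean())

for n in (100, 400, 1600):
    print(n, end_to_end(n))                  # grows roughly as sqrt(n)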
Other numerical studies will address the smallest objects produced by gravitational collapse. It is commonly claimed that a cloud of interstellar gas which undergoes gravitational collapse will fragment into smaller pieces until the fragments are dense enough to be opaque to their own black-body radiation. There are persuasive reasons to doubt this criterion, which suggest that fragmentation can continue until the pieces are significantly smaller than currently expected. Numerical work will be done to test theoretical estimates, using a variant of a smoothed-particle hydrodynamics (SPH) method which includes radiative transfer.
Robert Hasson: Applied Mathematics
Investigates the properties of inverse problems, in particular a brain imaging technique called magnetoencephalography (MEG). The IMPACT cluster is used to segment MRI slices to produce smooth surfaces which accurately follow brain surfaces. These smooth surfaces are in turn used to create integration meshes so that the Boundary Element Method (BEM) can be used to relate brain sources to MEG measurements. Once a BEM model has been constructed for a subject, the process of analysing MEG measurement data for that subject can begin. Constructing a BEM model is very time-consuming (currently about 4 days of computation time per subject if there are no complications). Without the IMPACT cluster, the current project, which compares data across ten subjects, would not be feasible.
James Hague: Physics
Research includes embolic stroke modelling, optical lattice simulations and electron-phonon interactions (a grant application as CI is to be submitted in the spring). These require a large quantity of CPU time, e.g. 3 months' run time on 16 processors for a calculation similar to PRL 98 (2007) 037002; more detailed work may require even more computational resources. Both this latter application and the success of his EPSRC First Grant application to work on graphene will rely heavily on the availability of computer resources at the OU. Enhanced computing resources will also significantly improve the chance of success of an application for a Responsive Mode grant on embolic stroke, to be submitted in autumn 2009, which will involve detailed fluid dynamics simulations.
Natural Language Processing: Computing
Modern Natural Language Processing (NLP) and Information Retrieval (IR) techniques typically rely on statistical language models extracted from large (e.g. 200M-word) corpora or text collections. More recent work has introduced increasingly context-sensitive approaches, i.e. approaches where the probability of a word occurring is not modelled as constant throughout a text, so that simple word-occurrence counts no longer suffice. Instead, the techniques include, for example, processing of large numbers of contexts and n-grams (of which there need to be many, otherwise the data are too sparse), advanced statistical techniques (e.g. Bayesian statistics supported by Markov Chain Monte Carlo methods), advanced machine-learning techniques for the induction of probabilistic grammars, and several compute-intensive methods for dimensionality reduction in shallow text-representation models (such as LSA). In other words, all these methods are compute-intensive as well as data-intensive. In addition, many standard packages in NLP and IR assume an underlying Unix (or Linux) architecture.
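As a concrete illustration of two of the techniques mentioned above, the sketch below counts bigrams and performs an LSA-style truncated SVD in Python with numpy. The corpus, vocabulary and rank are toy-sized stand-ins for the ~200M-word collections used in practice.

# Toy illustration: bigram counting plus rank-k SVD as in Latent Semantic
# Analysis.  The two-document "corpus" is a stand-in, not research data.
from collections import Counter
import numpy as np

docs = ["the cat sat on the mat", "the dog sat on the log"]

# Bigram counts (context-sensitive models need very many of these, hence
# the need for large corpora to avoid sparse counts).
bigrams = Counter()
for doc in docs:
    words = doc.split()
    bigrams.update(zip(words, words[1:]))

# Term-document matrix, then a rank-k approximation via truncated SVD.
vocab = sorted({w for doc in docs for w in doc.split()})
A = np.array([[doc.split().count(w) for doc in docs] for w in vocab], float)
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # reduced-dimension representation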
All the examples above are drawn from NLP and IR projects over the last 6 years or so. These include four PhDs (Sarkar on term burstiness; Chantree on ambiguity detection; Haley on using LSA in automatic marking; Nanas on adaptive information filtering) and two EPSRC-funded projects (Autoadapt; Context Sensitive Information Retrieval). The facilities will also be required for a recently funded EPSRC project based on Chantree's work (Matrex) and for a Jisc project, all of which will use this range of approaches and techniques. We also envisage using the array for grammar induction, to allow the OU to scale up its current capacity for marking short answers automatically (we are in the process of bidding for strategic funding for this, in collaboration with the COMSTL CETL, the VLE and the Science Faculty).
The Linux array is a vital piece of infrastructure for this line of research. The processing volume we require is increasing: several of the projects we have secured involve developing tools that need not only access to large datasets (on a scale we have not engaged with before) but also the ability to deliver reasonable results at run time.
Paul Upton: Applied Maths
Andrey Umerski: Applied Maths
Models the spin-dependent electronic properties of magnetic multilayers and other nanostructured materials at the atomic level, using the IMPACT cluster. These materials, and the quantum mechanical effects they exhibit, are currently of great theoretical, experimental and technological interest. This emergent research area is known as spintronics, and is the area for which the 2007 Nobel Prize in Physics was awarded.
Results obtained from the old cluster were crucial in a successful bid for a 3-year EPSRC PDRA fellowship. The theme of this grant is to investigate the spin-dependent transport of electrons across a semiconductor/ferromagnet interface, and the new cluster will be of central importance to the project, allowing realistic simulations that include interfacial roughness and defects. Such computations are highly CPU-intensive, but by using MPI (the Message Passing Interface) the workload is distributed over all the processors, making the calculations tractable.
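As an illustration of the pattern, and not the group's actual code (which may well be Fortran or C), a minimal sketch using the mpi4py bindings distributes a parameter scan across ranks and gathers the results on rank 0:

# Sketch of MPI-style workload distribution with mpi4py.
# Run with e.g.: mpirun -n 8 python scan.py
# The parameter grid and calculation are hypothetical stand-ins.
from mpi4py import MPI

def expensive_calculation(param):
    """Stand-in for one CPU-intensive transport calculation."""
    return param ** 2

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

params = list(range(64))                 # hypothetical parameter grid
my_params = params[rank::size]           # simple round-robin decomposition
my_results = [expensive_calculation(p) for p in my_params]

all_results = comm.gather(my_results, root=0)
if rank == 0:
    flat = [r for chunk in all_results for r in chunk]
    print(f"collected {len(flat)} results")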
Future applications for research grants will almost certainly be based on work performed on the new cluster.
Andrew J Norton: Astronomy
Has started an STFC rolling grant project, linked to that of Jones/Horner, to investigate the stability of the orbits of exoplanets in hierarchical multiple stellar systems (binaries, triples, quadruple stars, etc.). The code (swift_hjs) runs for millions of years of system time and each run takes several weeks of CPU time. Since the parameter space to explore is so large (different stellar masses, planetary masses, orbital separations/periods/eccentricities/inclinations and hierarchies), the project is currently open-ended and likely to use a significant amount of computer time for years to come.
Andrew Norton has also been using a custom-written period-search program to identify all the periodic stellar variables in the SuperWASP archive, which currently contains 14 billion data points on 23 million objects. Initial period searches on subsets of objects were run on the cluster at the OU while the code was being developed; final runs are carried out on a similar system local to the archive itself at Leicester University. Several papers are likely to follow, reporting the half-million newly identified variable stars that the search has uncovered, but an initial study based on period-searching done at the OU is published as the reference below (a generic sketch of such a search follows it):
A&A 467, 785-905 (2007) ‘New periodic variable stars coincident with ROSAT sources discovered using SuperWASP’, A. J. Norton, P. J. Wheatley, R. G. West, C. A. Haswell, R. A. Street, A. Collier Cameron, D. J. Christian, W. I. Clarkson, B. Enoch, M. Gallaway, C. Hellier, K. Horne, J. Irwin, S. R. Kane, T. A. Lister, J. P. Nicholas, N. Parley, D. Pollacco, R. Ryans, I. Skillen, D. M. Wilson.
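For illustration only, a generic phase-folding period search of the kind described above (not the custom SuperWASP code) can be sketched in a few lines: for each trial period, fold the light curve and measure how tightly the folded points cluster; the true period minimises the scatter.

# Phase-dispersion style period search sketch.  times and mags are numpy
# arrays; the binning scheme and trial grid are illustrative assumptions.
import numpy as np

def phase_dispersion(times, mags, period, n_bins=10):
    """Sum of within-bin variances of the phase-folded light curve."""
    phase = (times / period) % 1.0
    bins = np.minimum((phase * n_bins).astype(int), n_bins - 1)
    return sum(mags[bins == b].var()
               for b in range(n_bins) if (bins == b).sum() > 1)

def best_period(times, mags, trial_periods):
    """Return the trial period that minimises the folded scatter."""
    scores = [phase_dispersion(times, mags, p) for p in trial_periods]
    return trial_periods[int(np.argmin(scores))]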
Andrew Norton and Ollie Butters run a hydrodynamical particle code, HyDisc, which simulates accretion flows in magnetic cataclysmic variable stars. There is a vast parameter space to explore (different stellar mass ratios, magnetic field strengths, orbital periods and spin periods), each point of which yields a different accretion flow. Establishing flows at equilibrium requires the code to be run over many system orbital periods, which can take days of CPU time. Results are published in: The Astrophysical Journal, 672:524–530, 2008 January 1, 'The Accretion Flows and Evolution of Magnetic Cataclysmic Variables', A. J. Norton, O. W. Butters, T. L. Parker, G. A. Wynn. In an ongoing investigation they are now developing an additional code which simulates the X-ray emission arising from each flow, using a ray-tracing approach. This too requires significant cluster CPU time to calculate the simulated lightcurve for each set of system parameters.
Stelios Tsangarides: Astronomy
Stephen Justham: Astronomy
David Broadhurst: Physics
Elaine Moore: Chemistry
Avik Sarkar: Centre for Computational Linguistics
My research is in the field of statistical Natural Language Processing and Computational Linguistics, which requires processing large amounts of textual data.
First, these large textual data collections need to be stored, which requires large hard-disk resources; the RAID disks on the Linux clusters help with that. Processed data is stored in many small files, since fast access to them is required.
Processing large amounts of data requires large memory and CPU resources, which is only practical on powerful clusters like KRONOS. To understand the characteristics of the data, Bayesian models are fitted to it; these are very computationally intensive and require a large number of iterations for the parameters of the model to converge. The Linux cluster is the only answer to such computationally intensive methods.
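As an illustration of the kind of iterative fitting involved, here is a minimal Metropolis-Hastings sampler in Python; the Gaussian target is a stand-in for the actual Bayesian language models, which are far more expensive per iteration.

# Minimal Metropolis-Hastings MCMC sketch.  The log-posterior below is a
# hypothetical stand-in; real language models make each iteration costly,
# which is why long chains need cluster resources.
import numpy as np

rng = np.random.default_rng(2)

def log_posterior(theta):
    """Hypothetical log-posterior: a standard normal stand-in."""
    return -0.5 * theta ** 2

def metropolis(n_iter=100_000, step=0.5):
    theta = 0.0
    samples = np.empty(n_iter)
    for i in range(n_iter):
        proposal = theta + step * rng.normal()
        # Accept with probability min(1, posterior ratio).
        if np.log(rng.uniform()) < log_posterior(proposal) - log_posterior(theta):
            theta = proposal
        samples[i] = theta
    return samples

chain = metropolis()
print(chain.mean(), chain.std())   # should converge towards 0 and 1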
Collecting real-world textual data is often required for research purposes, so crawlers have to be run to gather data from the web. Crawlers are computationally intensive and also require large storage space for the collected files. Linux clusters with large storage are useful for such purposes.
Uwe Grimm: Applied Maths
Andrea Capiluppi: Computing
Francis Chantree: Computing
Heather Whitaker: Statistics
Debra Haley: Computing
Debra is using the IMPACT cluster for work involving Latent Semantic Analysis.
Mike Grannell and Terry Griggs from the Combinatorics Research Group, working with Martin Knor from the Slovak Technical University in Bratislava, have been investigating the biembeddability of Steiner triple systems of order 15 in an orientable surface.
Basically the idea is to take a sphere with a number of handles attached, to place points ("vertices") on this surface and join them by lines ("edges") also on the surface so that the edges only intersect at the vertices, every vertex is joined to every other vertex by a unique edge, and the resulting faces are all triangular (i.e. they all have exactly three edges). If, in addition, it is possible to colour the faces black and white so that no two faces of the same colour have a common edge, then the faces in each colour class form a Steiner triple system of order n, where n is the number of vertices. The two Steiner triple systems are said to be biembedded in the surface.
A Steiner triple system of order n is a set of triples taken from a set of n points with the property that each pair of points appears in exactly one triple. So in a biembedding, the faces in each colour class form the triples and the edges form the pairs. Such biembeddings are only possible if n has the form 12s+3 or 12s+7 for s = 0, 1, 2, ..., and the cases n=3 and n=7 are trivial. So the first really interesting and difficult case is n=15, and in this case the sphere must have eleven handles.
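The figure of eleven handles can be checked with Euler's formula, a standard computation added here for clarity rather than taken from the paper. An orientable surface with g handles has Euler characteristic

    V - E + F = 2 - 2g.

A biembedding of two Steiner triple systems of order n has V = n vertices, E = n(n-1)/2 edges (one per pair of points) and F = 2 * n(n-1)/6 = n(n-1)/3 triangular faces, since each system contributes n(n-1)/6 triples. For n = 15 this gives V = 15, E = 105 and F = 70, so

    2 - 2g = 15 - 105 + 70 = -20,

and hence g = 11.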
There are 80 essentially different Steiner triple systems of order 15 and it is a mammoth job to determine which pairs can be biembedded. For five of the systems that have particularly amenable structures, we have already determined the systems with which they can be biembedded. This work, which is to be published in the Journal of Combinatorial Mathematics and Combinatorial Computing, was carried out sequentially on ordinary PC equipment, but it took many weeks. A full sequential run for all 80 systems on a PC would take well over a year. So now we are about to run a parallel search on the remaining 75 systems using the IMPACT research cluster and we hope that this will enable us to get the definitive answers within a reasonable period.
Running in parallel, the calculations were completed in under three months. These results are detailed in: M. J. Grannell, T. S. Griggs, M. Knor and A. R. W. Thrower, 'A census of the orientable biembeddings of Steiner triple systems of order 15', Australasian Journal of Combinatorics 42, 25.
Further parallel use of the cluster is planned.
Stephen Lewis: Astronomy
Runs global atmosphere models of Mars, Venus, Earth, the giant planets and extrasolar planets. To remain scientifically competitive it is becoming increasingly imperative in this field to conduct experiments at higher spatial resolutions, and to analyse larger quantities of spacecraft data in combination with the models; this requires much more computer memory (at least 8GB), processor time and disk storage. Expanded use of the OU cluster will be essential to the success of current projects (e.g. Mars-related projects funded by NASA and ESA) and future ones (e.g. several grant applications submitted to STFC, a NERC application, the new version of the ESA contract, etc.).
Jimena Gorfinkiel: Physics
Runs ab initio calculations on electron-molecule collisions. A current EPSRC grant to study electron collisions with molecular clusters relies heavily on the Linux cluster, in particular the large-memory cores. A new collaboration with a Swiss group, studying mid-size alcohols and ethers with the aim of understanding how functional groups affect the formation of resonances and the subsequent dissociative electron attachment process, has been making intensive use of the large-memory (8 GB) cores. Routinely, running several tens of calculations takes about 2-3 days of computing time. These calculations make intensive use of I/O and therefore require large local scratch areas. Future grant applications to work on other processes of interest to biological radiation damage will require even more computing resources, and will therefore not be possible without expansion of the current capability.