University of Karlsruhe

The Scientific Supercomputing Center (SSC Karlsruhe), part of the Computing Center of the University of Karlsruhe in Germany, operates an HP XC6000 cluster as the high-performance computer for the federal state of Baden-Württemberg. This supercomputer is embedded in the infrastructure of the High Performance Computing Competence Center Baden-Württemberg (hkz-bw). It is available to projects from the state's universities whose computing requirements cannot be met by local resources, and to users from private enterprises on a cost-reimbursement basis.

People

Rudolf Lohner - Gelato Leader

Rudolf Lohner is a Professor of Mathematics at the Computing Center and the Department of Mathematics of the University of Karlsruhe in Germany. He holds a Diploma degree in Mathematics and Computer Science from the University of Karlsruhe as well as a PhD and a Habilitation in Mathematics from the same university.

Lohner was a postgraduate Research Assistant in Applied Mathematics at the University of Georgia, Athens, and at Georgia Tech, Atlanta. After working as a scientific staff member at the Institute of Applied Mathematics at the University of Karlsruhe, he served for two years as acting chair of the institute. In 2001, he joined the Scientific Supercomputing Group of the Scientific Supercomputing Center of the University of Karlsruhe (SSC Karlsruhe, SSCK), where he is now Director of the Department of Applications and Software. SSCK operates a large Itanium 2-based HP XC6000 cluster, offering service and support to scientific and industrial users from all over Germany.

Since May 2005, Lohner has been the Secretary of HP-CAST, the worldwide user group for technical high-performance computing on HP systems. His main research interests include applied mathematics, self-verifying numerical algorithms, scientific supercomputing, and grid computing.


Walter F. Tichy

Walter F. Tichy has been a professor of Computer Science at the University of Karlsruhe, Germany, since 1986. Previously, he was a senior scientist at Carnegie Group, Inc., in Pittsburgh, Pennsylvania, and served six years on the faculty of Computer Science at Purdue University in West Lafayette, Indiana. His primary research interests are software engineering and parallelism. He is currently directing research on a variety of topics, including empirical software engineering, autonomic computing, software configuration management, cluster computing, optimizing compilers for parallel computers, and optoelectronic interconnects. He has consulted widely for industry.

Dr. Tichy earned an M.S. and a PhD in Computer Science from Carnegie Mellon University in 1976 and 1980, respectively. He is the director of the Forschungszentrum Informatik, a technology transfer institute, and a co-founder of ParTec AG, a company specializing in software for computer clusters. He is the program co-chair for the 25th International Conference on Software Engineering (2003). Dr. Tichy is also a member of ACM, GI, and the IEEE Computer Society.



Overview/Problem Definition
  • Parallel access to storage resources within a cluster
  • Single system image
  • Logical parallelism: how a file is accessed by several processors
  • Physical parallelism: how a file is de-clustered over several nodes
  • Main performance problem: poor match between logical and physical parallelism
Technical Approach
  • Files are striped over several I/O nodes
  • Applications run on compute nodes
  • Flexible physical distribution: arbitrary file distribution over several cluster disks. Advantage: logical parallelism matches the physical parallelism => increased performance and scalability.
  • Views: application-defined logical windows onto subsets of parallel files. Advantages: accessing non-contiguous regions of a file with a single call, simplified offset computation.
  • Collective I/O: merge several I/O requests from different compute nodes before sending them to disks. Performed either at compute nodes (two-phase I/O: MPI-IO over Clusterfile) or at the disks (disk-directed I/O: Clusterfile). Advantage: improved network and disk throughput.
  • Cooperative Caching: Joint management of cluster’s distributed caches => global cache
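The relationship between striping and views can be illustrated with a small model. The following is a minimal sketch, not Clusterfile's actual interface: the names `StripedLayout`, `StridedView`, and `requests_for_view_read` are hypothetical, and the layout is assumed to be simple round-robin striping. It shows how one logical read on a view is translated into per-node extents, and how a view that matches the physical distribution collapses into a single contiguous request on one I/O node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StripedLayout:
    """Round-robin striping of a file over I/O nodes (assumed layout)."""
    stripe_size: int   # bytes per stripe unit
    io_nodes: int      # number of I/O nodes holding parts of the file

    def locate(self, offset):
        """Map a file offset to (I/O node index, offset in that node's subfile)."""
        stripe = offset // self.stripe_size
        node = stripe % self.io_nodes
        local = (stripe // self.io_nodes) * self.stripe_size + offset % self.stripe_size
        return node, local


@dataclass(frozen=True)
class StridedView:
    """Logical window exposing `block` bytes out of every `stride` bytes."""
    start: int
    block: int
    stride: int

    def to_file_offset(self, view_offset):
        """Translate a contiguous offset inside the view to a file offset."""
        return self.start + (view_offset // self.block) * self.stride + view_offset % self.block


def requests_for_view_read(layout, view, view_offset, length):
    """Turn one logical read on a view into per-node (local_offset, length) extents."""
    per_node = {}
    for v in range(view_offset, view_offset + length):
        node, local = layout.locate(view.to_file_offset(v))
        extents = per_node.setdefault(node, [])
        if extents and extents[-1][0] + extents[-1][1] == local:
            extents[-1][1] += 1          # byte continues the current extent
        else:
            extents.append([local, 1])   # byte starts a new extent
    return {n: [tuple(e) for e in ext] for n, ext in per_node.items()}


layout = StripedLayout(stripe_size=4, io_nodes=2)
view = StridedView(start=0, block=4, stride=8)   # every other 4-byte block
print(requests_for_view_read(layout, view, 0, 8))
```

In this toy configuration the view's stride equals the width of one full stripe row, so the non-contiguous logical access touches only I/O node 0 and becomes one contiguous extent there. A view whose stride does not line up with the striping would instead fan out into many small extents across nodes, which is the mismatch the problem definition describes.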
Results
  • Efficient non-contiguous I/O operations
  • Linux kernel interface
  • MPI-IO library
  • User-level library
Looking Ahead
  • Metadata distribution and replication
  • Moving functionality to the Linux kernel
  • Integrate cooperative caching policies

Overview/Problem Definition

Scalable Cluster-based Servers - building scalable servers on top of clusters of commodity-off-the-shelf computers

Technical Approach
  • Combining OS-level mechanisms into content-aware request distribution systems
  • Main mechanism: cooperative caching + connection migration
Results
  • Cooperative caching (CC):
  • CC mechanism: Cluster-Aware Remote Disks (CARD) + policies
  • CC policies: Home-based Server-less Cooperative Caching
  • CARDs & policies implemented as Linux kernel modules
  • Connection migration: implemented as Linux kernel module
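The idea behind home-based cooperative caching can be sketched in a few lines. This is an illustrative model only: the source does not describe the policy in detail, so the home assignment by modulo, the class names `Node` and `Cluster`, and the absence of eviction are all assumptions. Each block has a home node; a node that misses locally asks the home node's cache before falling back to disk.

```python
class Node:
    """One cluster node with a local block cache (no eviction in this sketch)."""

    def __init__(self, node_id, cluster):
        self.node_id = node_id
        self.cluster = cluster
        self.cache = {}

    def read(self, block_id):
        """Return (data, source), where source is 'local', 'remote', or 'disk'."""
        if block_id in self.cache:
            return self.cache[block_id], "local"
        home = self.cluster.home_of(block_id)
        if home is not self and block_id in home.cache:
            data, source = home.cache[block_id], "remote"
        else:
            data, source = self.cluster.disk[block_id], "disk"
            home.cache[block_id] = data   # home node keeps a copy for other nodes
        self.cache[block_id] = data
        return data, source


class Cluster:
    """Set of nodes sharing one backing store; home node chosen by modulo (assumed)."""

    def __init__(self, n_nodes, disk):
        self.disk = disk
        self.nodes = [Node(i, self) for i in range(n_nodes)]

    def home_of(self, block_id):
        return self.nodes[block_id % len(self.nodes)]


cluster = Cluster(3, {i: f"block-{i}" for i in range(6)})
print(cluster.nodes[0].read(4))   # first access goes to disk
print(cluster.nodes[2].read(4))   # served from the home node's cache
print(cluster.nodes[2].read(4))   # now cached locally
```

Because every node can compute a block's home without asking a central server, the lookup stays server-less; a real implementation would also need eviction and consistency handling, which this sketch omits.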
Looking Ahead
  • Locality-Aware Request Distribution (LARD) policy
  • Integrate LARD and CARD policies
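For orientation, the basic LARD idea can be sketched as follows. This is a simplified rendering of the policy as it is commonly described in the literature, not the project's planned implementation: the class name `LardDispatcher` and the threshold values are assumptions chosen for illustration. Requests for the same target go to the same back-end node (so its cache stays warm) unless that node is overloaded while another node is lightly loaded.

```python
class LardDispatcher:
    """Simplified locality-aware request distribution (illustrative thresholds)."""

    def __init__(self, n_nodes, t_low=2, t_high=4):
        self.load = [0] * n_nodes      # active requests per back-end node
        self.assigned = {}             # target -> node currently serving it
        self.t_low, self.t_high = t_low, t_high

    def _least_loaded(self):
        return min(range(len(self.load)), key=self.load.__getitem__)

    def dispatch(self, target):
        """Pick a node for a request and count it against that node's load."""
        node = self.assigned.get(target)
        if node is None:
            node = self._least_loaded()          # first request for this target
        else:
            imbalance = self.load[node] > self.t_high and min(self.load) < self.t_low
            if imbalance or self.load[node] >= 2 * self.t_high:
                node = self._least_loaded()      # give up locality to shed load
        self.assigned[target] = node
        self.load[node] += 1
        return node

    def complete(self, node):
        """Call when a request finishes on the given node."""
        self.load[node] -= 1


d = LardDispatcher(n_nodes=2)
print([d.dispatch("/a") for _ in range(3)], d.dispatch("/b"))
```

Integrating such a policy with cooperative caching would presumably let the dispatcher's view of "who has this content" and the cache layer's view agree, but how the two are combined is not specified in the source.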

Overview/Problem Definition
  • Gang Scheduling = simultaneously schedule heavily interacting processes running on several nodes
  • Motivation: reduce task switches due to communication blocking
  • Impact on, and applicability to, loosely coupled cluster nodes still unclear
Technical Approach
  • Synchronization of schedulers across the cluster
Results
  • Tools to record and display kernel events
  • ICMP-based remote schedule feature: broadcast/multicast special raw IP packets across network to simultaneously schedule groups of processes
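The effect of a remote schedule message can be modelled without any networking. The sketch below is a plain in-process simulation under stated assumptions: `NodeScheduler` and `run_gang_schedule` are hypothetical names, a simple loop stands in for the broadcast/multicast packet, and gangs are served round-robin. It is not the kernel mechanism itself.

```python
class NodeScheduler:
    """Per-node scheduler that runs the local member of the announced gang."""

    def __init__(self, local_procs):
        self.local_procs = local_procs   # gang id -> process name on this node
        self.running = None

    def on_schedule_message(self, gang_id):
        # A node with no member of this gang stays idle (None) for the slice.
        self.running = self.local_procs.get(gang_id)


def run_gang_schedule(nodes, gangs, slices):
    """Round-robin over gangs; every node switches at the same time slice."""
    timeline = []
    for t in range(slices):
        gang = gangs[t % len(gangs)]
        for node in nodes:               # stands in for one broadcast packet
            node.on_schedule_message(gang)
        timeline.append((gang, [n.running for n in nodes]))
    return timeline


nodes = [NodeScheduler({"A": "a0", "B": "b0"}), NodeScheduler({"A": "a1"})]
for gang, running in run_gang_schedule(nodes, ["A", "B"], 3):
    print(gang, running)
```

In each slice all processes of one gang run together, so a process that blocks on communication finds its peer already scheduled. The idle slot on the second node during gang B's slice shows the utilization cost that scheduling policies beyond classical gang scheduling would try to reduce.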
Looking Ahead
  • Integrate remote schedule feature in middleware, e.g. MPI or JavaParty
  • Develop and evaluate distributed schedule policies beyond classical Gang Scheduling

Overview/Problem Definition
  • Lack of language-level support for cluster-distributed programming
  • Java has built-in support for parallelism but little support for distribution
  • Required: Easy-to-use parallel distributed environment
Technical Approach
  • Transparent remote classes and objects
  • Object migration
  • Transparent distributed threads
  • Single system view of a cluster (single virtual machine)
Results
  • Source-to-source program transformation
    • annotated parallel Java program => parallel distributed program
    • JavaParty code => pure Java code with calls to [Ka]RMI
  • Library support for:
    • low-latency RMI (KaRMI)
    • high-performance object marshalling
    • cluster-wide transparent thread semantics
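What "transparent" remote objects mean in practice can be shown with a language-neutral toy. The sketch below is written in Python rather than JavaParty or Java, and it is not KaRMI's API: `RemoteHost`, `RemoteProxy`, and `Counter` are hypothetical names, and `pickle` stands in for object marshalling. The caller invokes methods on a proxy exactly as on a local object, while each call is marshalled, executed where the real object lives, and the result is marshalled back.

```python
import pickle


class RemoteHost:
    """Stands in for a second virtual machine that owns the real objects."""

    def __init__(self):
        self._objects = {}

    def create(self, cls, *args):
        """Instantiate an object on this host and return its id."""
        obj_id = len(self._objects)
        self._objects[obj_id] = cls(*args)
        return obj_id

    def invoke(self, request):
        """Unmarshal a call, run it on the real object, marshal the result."""
        obj_id, method, args = pickle.loads(request)
        result = getattr(self._objects[obj_id], method)(*args)
        return pickle.dumps(result)


class RemoteProxy:
    """Local stand-in that forwards every method call to the owning host."""

    def __init__(self, host, obj_id):
        self._host, self._obj_id = host, obj_id

    def __getattr__(self, method):
        def call(*args):
            request = pickle.dumps((self._obj_id, method, args))
            return pickle.loads(self._host.invoke(request))
        return call


class Counter:
    """Ordinary class; nothing in it knows about distribution."""

    def __init__(self, start):
        self.value = start

    def add(self, n):
        self.value += n
        return self.value


host = RemoteHost()
counter = RemoteProxy(host, host.create(Counter, 10))
print(counter.add(5), counter.add(1))
```

A source-to-source transformation of the kind listed above would generate the proxy and the marshalling calls automatically from an annotated class, so application code never writes them by hand. Here the bytes are passed in-process; a real system would send them over the network.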
Current Status
  • Implemented inside an existing Java compiler
  • KaRMI = world’s fastest pure Java RMI implementation
  • Applications:
    • Parallel data mining
    • Parallel image understanding
  • Available from http://www.ipd.uka.de/JavaParty
Looking Ahead
  • Transparent object replication
  • Checkpoint/restart





Resources
Related Links
Gelato at the University of Karlsruhe (pdf)
A poster describing IPD's Gelato-related projects as presented at the May 2004 Gelato Federation Meeting.




 

All content © copyright 2002-2006 Gelato Federation.

Gelato Central Operations is housed within the Coordinated Science Laboratory (CSL) of the College of Engineering at the University of Illinois at Urbana-Champaign (UIUC).