This project has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under the Marie Sklodowska-Curie grant agreement Nº 847635.
Department of Computer Architecture
Faculty of Computer Science and Engineering
The research activity of the ArTeCS (Architecture and Technology of Computing Systems) group covers a wide range of subjects related to high-performance computing, computer architecture and system design. Energy efficiency has always been an important research topic for our team; indeed, we have been exploring complexity-effective micro-architectural designs since the early 2000s. At the circuit level, we have years of experience in the design of arithmetic and cryptographic units using HLS methodologies. The memory hierarchy has also been a recurrent topic: lately, thanks to a long and fruitful collaboration with the IMEC institute (Belgium), we have acquired extensive experience in non-volatile memory technologies. Application mapping, code generation and optimization have also been very fertile research areas. In addition, we have extensive experience in the design and development of system software for asymmetric and heterogeneous processors, from OS schedulers to libraries, simulators and runtime systems. Other relevant tools designed and maintained by our team are PMCTrack, an architecture-independent software tool that provides convenient access to hardware performance counters, and AccelPower CAPE, a low-cost, open-hardware platform for energy measurement on HPC nodes.
Among other computing resources, we highlight the following: 5 racks with high-performance servers, interconnection networks and redundant power supplies, with nearly 1000 cores in total; workstations with the most representative Intel Xeon processors up to Skylake-X, as well as Intel Knights Landing; several heterogeneous nodes featuring different accelerator technologies, including two servers combining Intel Skylake and NVIDIA V100; and two servers featuring Intel Haswell together with latest-generation Altera FPGAs (Arria 10) and AMD Vega GPUs to support OpenCL implementations. All servers share central storage resources and high-performance local storage to support data-intensive applications. From the software perspective, the group constantly acquires new versions of benchmarks, compilers and other tools. Because we also focus on power-efficient architectures and accelerators, the infrastructure further includes several developer boards from ARM, NVIDIA, etc., as well as power measurement equipment.
Process scaling still continues nowadays, but the energy efficiency of traditional designs is now compromised. At the same time, applications are becoming more and more computationally demanding, both in traditional HPC domains and in emerging domains like machine learning. Industry has tackled this combined challenge by means of hardware specialization, seeking to optimize the relevant kernels of domain-specific applications. Specialization has in turn led to heterogeneous systems at every level: from technology to architecture and system level.
Our proposal addresses the aforementioned issues by exploring the implications of specialization and heterogeneity at three different levels, namely architectures, system software and applications. The main goal of the project is to evaluate the use of new techniques, methodologies and implementations at each level to face the new challenges and to exploit the new opportunities offered within the post-Moore era, with special emphasis on the memory system.
The impact of heterogeneity will be broad, not restricted to the cores themselves but also reaching the memory organization. Traditional cache-based architectures are likely to fail to meet energy and performance constraints in many emerging application domains. Thus, we will consider alternative memory architectures comprising mixed hardware/software-controlled memories built on different technologies (volatile and non-volatile) and organizations (e.g., Hybrid Memory Cube, HMC).
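To illustrate the kind of trade-off such mixed designs expose, the following toy model sketches software-controlled placement between a volatile (DRAM-like) and a non-volatile (NVM-like) tier. This is purely illustrative and not part of the project's tooling; all names and latency figures are hypothetical placeholders.

```python
# Toy model of a two-tier heterogeneous memory system with software-controlled
# data placement. Latency numbers below are hypothetical placeholders chosen
# only to reflect the usual qualitative trend (NVM writes are much slower).
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryTier:
    name: str
    read_ns: float   # read latency in nanoseconds (placeholder value)
    write_ns: float  # write latency in nanoseconds (placeholder value)


DRAM = MemoryTier("DRAM", read_ns=50.0, write_ns=50.0)
NVM = MemoryTier("NVM", read_ns=150.0, write_ns=500.0)


def place(read_ratio: float) -> MemoryTier:
    """Software-controlled placement policy: read-mostly data tolerates the
    slow writes of NVM, while write-heavy data stays in DRAM."""
    return NVM if read_ratio > 0.9 else DRAM


def avg_access_ns(tier: MemoryTier, read_ratio: float) -> float:
    """Average access latency for a given read/write mix on a tier."""
    return read_ratio * tier.read_ns + (1.0 - read_ratio) * tier.write_ns
```

A simulator with parametrized memory behavior would explore exactly this kind of policy space, but with per-technology timing models rather than fixed constants.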
To enable this broad exploration, one goal of the project will be to build a scalable architecture simulator. FireSim represents an excellent starting point, as it makes it possible to combine architectures, from RISC-V cores to accelerators based on NVDLA, with heterogeneous memory systems exhibiting parametrized behavior.