Topic for Discussion Sessions
High Performance Computing
for Artificial Intelligence and Machine Learning
Over the past few years, Artificial Intelligence (AI) and Machine Learning (ML) have proven useful for day-to-day operations in many different domains, and their use has grown significantly with the advent of Internet of Things (IoT) and drone technologies. At our ScalPerf 2015 meeting, we anticipated this trend and discussed the nature of the problems at the time (see attached presentation from ScalPerf 2015). In this year’s workshop, we would like to focus on two aspects of the impact of AI and ML on server design.
What is the nature of AI/ML workloads in various domains, and which aspects of server hardware, system software, and algorithms would significantly help AI computation? Some of the areas are:
Cases where large volumes of data are analyzed to develop Foundation Models (involving billions of parameters or more).
Customization of Foundation Models for problems at hand (related to the Homogenization step).
Network Embedding models.
Analysis involving quasi-real-time (i.e., online) transactional data.
Classical AI analysis, etc.
Value and cost-performance tradeoffs of variable precision arithmetic.
How can AI/ML, particularly the advanced algorithms developed in the past decade, be used to improve the value and availability of servers by enhancing resource utilization, reliability, serviceability, and security?
As we know, this is an area of tremendous excitement and major advances. We expect some of the positions we formulate in this workshop to be speculative, but educated speculation can be quite valuable for research.