Compiler infrastructure provides frameworks and design principles that simplify the work of compiler designers, maximizing the reuse of tools across different hardware targets. This infrastructure doesn’t provide “full” solutions for compiling to hardware; instead, it provides the general scaffolding that makes the design of such compilers as principled and interoperable as possible, with code sharing at the heart of its design.
LLVM is a set of compiler and toolchain technologies that can be used to develop a front end for any programming language and a back end for any instruction set architecture. LLVM is designed around a language-independent intermediate representation (IR) that serves as a portable, high-level assembly language and can be optimized with a variety of transformations over multiple passes. It is designed for compile-time, link-time, run-time, and “idle-time” optimization. It can provide the middle layers of a complete compiler system, taking IR code from a compiler and emitting optimized IR. This new IR can then be converted and linked into machine-dependent assembly language code for a target platform. It can also accept the IR from the GNU Compiler Collection (GCC) toolchain, allowing it to be used with a wide array of existing compiler front ends written for that project.
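The idea of optimizing an IR "over multiple passes" can be illustrated with a small sketch. This is not LLVM's API or LLVM IR: the `Instr` type, the three opcodes, and both pass functions are hypothetical names invented for illustration. The sketch only assumes that each pass takes a program and returns a new one, so that passes can be chained.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Instr:
    """One instruction of a toy IR (hypothetical, not LLVM IR)."""
    dest: str      # name of the value this instruction defines
    op: str        # "const", "add", or "mul"
    args: Tuple    # integer constants or names of earlier values


def constant_fold(prog):
    """Pass 1: evaluate add/mul whose operands are all known constants."""
    known = {}
    out = []
    for ins in prog:
        # Substitute names that are already known to hold a constant.
        args = tuple(known.get(a, a) if isinstance(a, str) else a
                     for a in ins.args)
        if ins.op == "const":
            known[ins.dest] = args[0]
            out.append(Instr(ins.dest, "const", args))
        elif all(isinstance(a, int) for a in args):
            value = args[0] + args[1] if ins.op == "add" else args[0] * args[1]
            known[ins.dest] = value
            out.append(Instr(ins.dest, "const", (value,)))
        else:
            out.append(Instr(ins.dest, ins.op, args))
    return out


def dead_code_elim(prog, live_out):
    """Pass 2: drop instructions whose results are never used."""
    live = set(live_out)
    kept = []
    for ins in reversed(prog):
        if ins.dest in live:
            kept.append(ins)
            live.update(a for a in ins.args if isinstance(a, str))
    kept.reverse()
    return kept


def run_passes(prog, passes):
    """Apply each pass in order; every pass maps a program to a program."""
    for p in passes:
        prog = p(prog)
    return prog


program = [
    Instr("a", "const", (2,)),
    Instr("b", "const", (3,)),
    Instr("c", "add", ("a", "b")),       # foldable to the constant 5
    Instr("d", "mul", ("c", "x")),       # "x" is an unknown input
    Instr("unused", "add", ("x", "a")),  # result is never read
]

optimized = run_passes(
    program,
    [constant_fold, lambda p: dead_code_elim(p, ["d"])],
)
print(optimized)
```

Because every pass consumes and produces the same representation, passes can be reordered, repeated, or shared between front ends and back ends, which is the property the paragraph above attributes to a common IR.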
The Multi-Level Intermediate Representation (MLIR) is a piece of compiler infrastructure designed to represent multiple levels of abstraction within a single framework. New abstractions and domain-specific IR constructs are easy to add, and source location is a first-class construct. It is part of the broader LLVM project. It aims to address software fragmentation, improve compilation for heterogeneous hardware, significantly reduce the cost of building domain-specific compilers, and aid in connecting existing compilers together. Compared to other parts of the overall ML stack, MLIR is designed to operate at a lower level than the neural network exchange formats. For example, the onnx-mlir compiler uses the MLIR compiler infrastructure to compile ONNX-defined models into native code.
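A rough sketch of what "multiple levels of abstraction" with first-class locations can look like is given below. It does not use MLIR's actual API or dialects: the `Op` and `Loc` types and the dialect names `toy_nn` and `toy_linalg` are hypothetical and exist only for illustration. The point is that operations from different abstraction levels share one container, and that a lowering step carries the source location along.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Loc:
    """Source location attached to every operation."""
    file: str
    line: int


@dataclass(frozen=True)
class Op:
    """A generic operation named '<dialect>.<op>' (hypothetical format)."""
    name: str
    operands: Tuple[str, ...]
    result: str
    loc: Loc

    @property
    def dialect(self):
        return self.name.split(".", 1)[0]


def lower_nn_to_linalg(ops):
    """Rewrite the high-level 'toy_nn.dense' op into two lower-level ops.

    Ops from other dialects pass through unchanged, so different levels of
    abstraction can coexist in the same program during lowering.
    """
    out = []
    for op in ops:
        if op.name == "toy_nn.dense":
            x, w, b = op.operands
            tmp = op.result + "_mm"
            # Both new ops inherit the location of the op they replace.
            out.append(Op("toy_linalg.matmul", (x, w), tmp, op.loc))
            out.append(Op("toy_linalg.add", (tmp, b), op.result, op.loc))
        else:
            out.append(op)
    return out


module = [Op("toy_nn.dense", ("x", "w", "b"), "y", Loc("model.py", 12))]
lowered = lower_nn_to_linalg(module)
for op in lowered:
    print(op.name, op.operands, "->", op.result, "at", op.loc)
```

Keeping the location on each lowered operation means a diagnostic raised at a low level can still point back to the line of the original model definition.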
oneAPI is an open standard for a unified Application Programming Interface (API) intended to be used across different compute accelerator (coprocessor) architectures, including GPUs, AI accelerators, and field-programmable gate arrays. At present the main user is Intel, which also authored the standard. The set of APIs spans several domains that benefit from acceleration, including libraries for linear algebra, deep learning, machine learning, video processing, and others. oneDNN is particularly relevant, focusing on neural network functions for deep learning training and inference. Intel CPUs and GPUs have accelerators for deep learning software, and oneDNN provides a unified interface to these accelerators, with much of the hardware-specific complexity abstracted away. Like MLIR, oneAPI is designed to operate at a lower level than the neural network exchange formats: its interface is more primitive, focusing on core low-level operations such as convolutions, matrix multiplications, and batch normalization. This makes oneDNN complementary to these formats. It can sit below the exchange formats in the overall stack, allowing accelerators to be fully leveraged while the oneDNN API hides most hardware-specific considerations. Indeed, oneAPI and MLIR can work in tandem, and the oneDNN project is working to integrate Tensor Processing Primitives into the MLIR compilers used underneath TensorFlow.
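The layering described above, with exchange-format operators on top and a small set of low-level primitives underneath, can be sketched as follows. This is not the oneDNN API: the functions `matmul`, `batch_norm`, and `run_node` and the `PRIMITIVES` table are hypothetical, and the batch normalization is simplified (no scale or bias terms). The operator names are borrowed from ONNX purely as labels.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Low-level primitive: dense matrix multiplication."""
    inner = len(b)
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def batch_norm(x: Matrix, mean: List[float], var: List[float],
               eps: float = 1e-5) -> Matrix:
    """Low-level primitive: per-column normalization (simplified)."""
    return [[(v - mean[j]) / math.sqrt(var[j] + eps)
             for j, v in enumerate(row)]
            for row in x]


# A hardware-specific library would register its own tuned kernels here;
# callers above this table never see which implementation was chosen.
PRIMITIVES = {
    "MatMul": matmul,
    "BatchNormalization": batch_norm,
}


def run_node(op_type: str, *inputs):
    """Execute one exchange-format-style node by dispatching to a primitive."""
    return PRIMITIVES[op_type](*inputs)


product = run_node("MatMul", [[1, 2], [3, 4]], [[5, 6], [7, 8]])
normed = run_node("BatchNormalization", [[1.0, 2.0]], [1.0, 0.0], [1.0, 4.0], 0.0)
print(product)
print(normed)
```

In this arrangement, swapping the entries of the table for accelerator-backed implementations changes nothing for the code that walks the model graph, which is the kind of separation the paragraph attributes to a primitive-level library.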