In 2019, the market capitalisation of the two tech giants, Microsoft and Apple, was around $900 billion each. A tech company called NVidia was a Lilliputian beside them, with a market capitalisation of only $100 billion. No one would have imagined that just five years later, NVidia's market cap of $2.2 trillion would be within striking distance of Microsoft's $3 trillion or Apple's $2.6 trillion. NVidia's phenomenal growth reflects a paradigm shift that has revolutionised hardware technology. Computers initially relied on their Central Processing Units (CPUs), chips containing millions of tiny transistors; increasing computing power meant packing more of them onto a single chip. Till about 2010, this increase was driven by Moore's Law, according to which transistor counts doubled every two years, making computers millions of times faster.
Companies like Intel or AMD that made these chips raked in billions of dollars from this CPU revolution. But Moore's Law has a physical limit. Transistors used in today's state-of-the-art chips are only a few atoms wide ~ and the width of a transistor cannot be less than atomic dimensions. Moore's Law is thus reaching its limit, slowing the improvement in transistor density and hence constraining the growth of CPU performance.
But meanwhile, by the 1990s, a more potent tool had taken the computing world by storm ~ the Graphics Processing Unit, or GPU. While CPUs process tasks serially, one after another, GPUs process tasks in parallel, executing multiple instructions simultaneously, and hence are much faster for such workloads. CPUs are capable of handling most of our daily computing needs, like web browsing, running software applications or file management, but GPUs are far better at computer graphics, video games and the like. Their parallel processing ability dramatically reduces the number of transistors needed for the same computation and speeds up memory access many times over, making them suited to the ultra-fast processing demands of advanced AI applications. CPUs also offer some parallelism through multi-core designs, where each core executes a different task simultaneously.
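To make the serial-versus-parallel distinction concrete, here is a minimal Python sketch. It assumes only the NumPy library and runs on an ordinary CPU; the vectorised call merely mimics, on a small scale, the data-parallel style of work that a GPU spreads across thousands of cores ~ it is an illustration, not NVidia's actual software stack.

```python
# Illustrative only: the same arithmetic done element by element (serial)
# versus as one data-parallel operation over the whole array at once.
import time
import numpy as np

n = 5_000_000
a = np.random.rand(n)
b = np.random.rand(n)

# Serial style: one multiply-add at a time, like a single core stepping
# through the data sequentially.
start = time.perf_counter()
serial = [a[i] * 2.0 + b[i] for i in range(n)]
print(f"element-by-element loop: {time.perf_counter() - start:.2f} s")

# Data-parallel style: a single vectorised call; the hardware is free to
# process many elements simultaneously.
start = time.perf_counter()
parallel = a * 2.0 + b
print(f"single vectorised call:  {time.perf_counter() - start:.2f} s")
```

On a typical machine the vectorised version finishes many times faster, even though both produce exactly the same result.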
Multi-core CPUs can run lightweight AI models, but they cannot match GPUs, whose hundreds or thousands of smaller cores each run a separate thread of a larger task such as rendering graphics. Today's AI chip design solutions typically use reinforcement learning (RL), which is about learning by interacting with the environment and observing the response, through a process of trial and error. An RL system becomes progressively better as it gathers more data, learning continuously and dynamically. RL facilitates electronic design automation (EDA) ~ the software used to design chips.
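The trial-and-error loop at the heart of RL can be sketched in a few lines. The Python below is a deliberately simple, hypothetical illustration ~ a "multi-armed bandit" choosing among five options whose quality it does not know in advance ~ and not the far more sophisticated RL actually embedded in commercial EDA tools.

```python
# Generic reinforcement-learning loop: try an option, observe a noisy reward,
# update an estimate, and gradually favour what works best.
import random

n_choices = 5                               # hypothetical design decisions
true_quality = [0.2, 0.5, 0.1, 0.8, 0.4]    # hidden quality of each choice
estimates = [0.0] * n_choices
counts = [0] * n_choices

def evaluate(choice):
    """Stand-in for the environment: return a noisy reward for a choice."""
    return true_quality[choice] + random.gauss(0, 0.1)

for step in range(10_000):
    # Explore 10% of the time; otherwise exploit the best estimate so far.
    if random.random() < 0.1:
        choice = random.randrange(n_choices)
    else:
        choice = max(range(n_choices), key=lambda c: estimates[c])
    reward = evaluate(choice)
    # Learn from feedback: update the running average for this choice.
    counts[choice] += 1
    estimates[choice] += (reward - estimates[choice]) / counts[choice]

print("best choice learned:", max(range(n_choices), key=lambda c: estimates[c]))
```

After enough trials the loop reliably settles on the highest-quality option, having been told nothing about the options except the rewards it observed.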
But it is generative AI ~ large language models and their kin, which can quickly generate text, videos, images and audio ~ that has given a tremendous boost to the development of AI chips. Generative AI models are developing astoundingly diverse abilities, as demonstrated by OpenAI's ChatGPT, but they require massive amounts of training data, demanding enormous bandwidth and processing power. AI chips therefore require a unique architecture comprising billions of transistors, memory arrays, security features and real-time data connectivity. Only the massive parallelism of GPUs can provide this; they serve as AI accelerators, enhancing performance for neural networks and similar AI workloads. AI chips are emerging as the answer to the slowing of Moore's Law. A single AI chip can store an entire AI algorithm.
GPUs were originally designed for rendering high-resolution graphics and video games but adapted quickly to the demands of AI. They have their limitations, however: they were not specifically designed for AI tasks and are not always the most efficient option for AI workloads. More specialised AI chips, such as Application-Specific Integrated Circuits (ASICs) and Field-Programmable Gate Arrays (FPGAs), emerged as the next stage in the evolution of AI chip technology. While ASICs are custom-built chips developed to execute specific tasks like neural network processing, FPGAs are chips that can be reprogrammed to perform a wide range of tasks. FPGAs are more flexible than ASICs, and hence suited to a larger variety of AI workloads, but they are also more complex and expensive.
The latest in AI chip technology is the Neural Processing Unit (NPU), designed specifically for processing neural networks, which are a key component of modern AI systems. NPUs are optimised for the high-volume, parallel computations required by neural networks, with high-bandwidth memory interfaces to handle large amounts of data efficiently. Neural network computations being power-intensive, NPUs must be designed to optimise power consumption. With the help of parallel processing, AI chips can distribute workloads more efficiently than other chips, minimising energy consumption, which can help reduce the AI industry's massive carbon footprint. State-of-the-art AI chips are also dramatically more cost-effective than the corresponding CPUs.
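In practice, "neural network computations" are dominated by huge batches of multiply-accumulate operations. The short NumPy sketch below, with purely illustrative layer sizes, shows the forward pass of a single dense layer ~ exactly the kind of independent, parallel arithmetic that NPUs and GPUs are built to accelerate.

```python
# Forward pass of one dense (fully connected) neural-network layer.
import numpy as np

batch, n_in, n_out = 64, 1024, 4096   # illustrative sizes, not any real model

x = np.random.rand(batch, n_in).astype(np.float32)   # input activations
w = np.random.rand(n_in, n_out).astype(np.float32)   # layer weights
b = np.zeros(n_out, dtype=np.float32)                # biases

# batch * n_in * n_out multiply-accumulates (~268 million here), all mutually
# independent, so they can be spread across thousands of processing elements.
y = np.maximum(x @ w + b, 0.0)        # matrix multiply, bias add, ReLU

print("output shape:", y.shape)       # (64, 4096)
```

A modern model chains thousands of such layers and repeats the computation billions of times during training, which is why parallelism and memory bandwidth, not raw clock speed, decide how fast it runs.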
One estimate tells us that an AI chip a thousand times as efficient as a CPU provides an output equivalent to 26 years of Moore's Law-driven CPU improvements. Older AI chips, with their larger, slower and more power-hungry transistors, consume more energy, leading to unaffordable costs and slowdowns. Cutting-edge AI algorithms need cutting-edge AI chips, which demand continuous innovation and improvement to contain the huge cost of training AI models ~ training runs can consume many terabytes of data and cost tens of millions of dollars. CPUs and other general-purpose chips can never match the economies of scale achieved by AI chips. AI chips enable AI processing on virtually any smart device ~ from watches, cameras and kitchen appliances to smart homes and smart cities.
This is known as Edge AI: processing takes place closer to where the data originates instead of in the cloud, thereby reducing latency and improving security and energy efficiency. AI chips can enhance the capabilities of driverless cars by processing and interpreting the huge datasets collected by the car's cameras and sensors, and can support sophisticated tasks like pattern and image recognition. They can help vehicles autonomously navigate complex environments, detect obstacles and respond to dynamic traffic conditions. They are also extremely useful in robotics, in areas ranging from "cobots" ~ collaborative robots that work alongside humans and are slowly changing the nature of work, from the harvesting of crops onwards ~ to humanoid robots that provide companionship.
The ability of AI chips to perform intricate calculations with high precision makes them an obvious choice for high-stakes AI applications like medical imaging or robot-controlled surgery. NVidia is currently the leading provider of AI chips. While it controls 80 per cent of the global market share in GPUs, it doesn't manufacture them, sourcing them instead from the Taiwan Semiconductor Manufacturing Company (TSMC), which makes almost 90 per cent of the world's advanced chips ~ chips that power everything from Apple's iPhones to Tesla's electric vehicles. TSMC is also the sole manufacturer of NVidia's H100 and A100 processors, considered the most powerful AI chips today. NVidia's AI chips are compatible with a broad range of AI frameworks, which makes them versatile for various AI and machine learning applications. Other players include AMD and Intel.
AMD, traditionally known for CPUs, has entered the AI space with products like the Radeon Instinct GPUs, which are tailored for machine learning and AI workloads, offering high-performance computing and deep learning capabilities. Intel is the world's second largest chip manufacturer by revenue. Besides CPUs with AI capabilities, it has developed dedicated AI hardware like the Habana Gaudi processors, specifically engineered for training deep learning models. These processors stand out for their efficiency and performance in AI training tasks, and are optimised for data centre workloads, providing a scalable solution for training large and complex AI models.
They also have significant capabilities for inter-processor communication, which enables efficient scaling across multiple chips. Others like Microsoft, Google and Amazon are also designing their own custom AI chips to reduce their reliance on NVidia. The USA and Taiwan have a competitive advantage here, controlling a majority of the chip fabrication factories, or "fabs", that make state-of-the-art AI chips. In the chip landscape, geopolitics between major powers plays an important role. TSMC's overwhelming control of the chip market impacts the global supply chain, as its limited domestic capacity and resources are inadequate to meet the soaring demand for AI chips. TSMC has already set up factories in Japan and the USA, and is in talks with Foxconn ~ which manufactures most of Apple's iPhones and iPads and has more than 30 factories and nine production campuses in India ~ to expand its chip production footprint in India.
Taiwan remains a hotspot of tension, not only between itself and China but also because of the US-China rivalry, which introduces an element of uncertainty into the chip supply chain ecosystem. China is already feeling the brunt of US export controls that have severely limited its access to AI chips ~ much of the market for which NVidia controls ~ as well as to chip-making equipment and chip design software. One reason it so desperately wants to annex Taiwan is TSMC. Even as the US tries to limit China's access to AI hardware, it is trying to reduce its own reliance on chip fabrication facilities in East Asia by incentivising TSMC to set up more factories in the USA. India, hitherto a non-player in the chip universe and still taking baby steps, should do the same with greater vigour to attract companies like NVidia and TSMC.
(The writer is an author, commentator and academic. Opinions expressed are personal)