Who is Jim Keller?
1. AMD’s Zen Architecture: Jim Keller was a key figure at AMD, where he led development of the Zen microarchitecture, which revitalized AMD’s CPU lineup and restored its competitiveness against Intel. This work is well documented by outlets such as AnandTech and TechCrunch.
2. Apple’s A4/A5 Processors: During his tenure at Apple, Keller contributed significantly to the A4 and A5 chips, which powered early iPhones and iPads. His role is frequently noted in industry retrospectives on Apple’s silicon history.
3. Positions at Tesla and Intel: Keller held influential roles at Tesla and Intel. At Tesla, he led Autopilot hardware engineering, developing the custom AI chip for Tesla’s vehicles. At Intel, he served as senior vice president of silicon engineering, leading work on new chip architectures. Outlets such as The Verge and Engadget have covered these career highlights.
4. Tenstorrent Leadership: Keller’s role as CEO of Tenstorrent and his vision for advancing AI hardware have been covered in industry announcements and Tenstorrent’s press releases. The company’s website, tenstorrent.com, provides updates on his leadership and strategic goals.
Keller is the CEO of Tenstorrent, a company at the forefront of advanced AI and computing hardware. He is renowned for his deep expertise in microprocessor engineering and his influence on modern computer architecture. Over his career he has designed breakthrough technology at Apple, AMD, Tesla, and Intel, playing a key role in successful products such as Apple’s A4/A5 processors and AMD’s Zen architecture.
As CEO of Tenstorrent, Keller draws on that experience in hardware and system design to drive innovation in AI acceleration and high-performance computing. Under his leadership, Tenstorrent aims to build processor technology that meets the demands of AI workloads, positioning the company as a serious player in a rapidly evolving field. His technical depth and long-range approach make him one of the industry’s leading figures in AI hardware.
Introduction
It’s my honor to introduce Jim Keller, CEO of Tenstorrent, as this year’s keynote speaker at 61DAC. Keller brings decades of experience, with significant contributions at Intel, Tesla, and now Tenstorrent. In his keynote, Keller shares insights into his journey through the tech industry, the evolution of AI hardware, and the lessons he’s learned along the way.
The Complexity of AI and Giving Keynotes
When I volunteered to give this talk—or rather, when a friend asked me—it felt easy to say yes; after all, it was six months away. But as the event loomed closer, I began stressing over what to focus on, given the complexity of the topics. Just this morning, five amazing things have already happened. There was an accident on 101 that stretched my trip from 40 minutes to an hour and a half; fortunately, Google’s AI rerouted me around it. It got me thinking: if we had autonomous driving, maybe that accident wouldn’t have happened, which is kind of wild.
Then the introduction touched on Lynn Conway’s book (Introduction to VLSI Systems, written with Carver Mead), which was phenomenal. It bridged the gap for me between logic design and transistor design, taking me from studying semiconductor physics to understanding the broader layers of computer architecture. That understanding of abstraction layers has always stuck with me.
From Logic Design to AI at Tesla and Intel
I studied electromagnetics and semiconductor physics in college and then took a job in Florida because I wanted to live by the beach and surf. At the time I was a pretty serious person, or at least I thought so. I started in logic design, then moved to Digital Equipment Corporation, where I became a computer architect. The shift into semiconductors came next, which was fascinating.
My journey took me to Tesla, where we built the Autopilot chip, and later to Intel, where I worked on CAD methodologies and on transforming IP and CAD processes. I’ve been involved in AI for over a decade now, and it has been a wild ride, especially over the last few years. Building AI isn’t just about making programs run faster—though in some ways it is—it’s about understanding how the whole architecture fits together.
What is AI and How We Build It
I was initially going to title this talk “What I Did Wrong,” but that seemed too negative—I’m generally a positive person, so I’ll get to that part later. Today, I’ll cover what AI is, some of the problems we’ve had, the role of open source, and where we go next.
There’s a joke I love: every time AI solves a new problem, people say, “That’s not AI anymore.” They said it about chess, then Go, then speech recognition. We redefine what AI is every time it succeeds, which is fascinating. Working on AI with people like Andrej Karpathy, I realized it’s a journey of constant learning and redefinition.
Back in 1982, I worked on the VAX 8800, a dual-processor system. We met with a Fortran group that had a one-year plan to vectorize and parallelize their code. That plan is still not finished, which shows just how hard it is to parallelize some programs. AI involves many of the same challenges: it looks deceptively simple at the level of individual operations—matrix multiplies, tensor operations—but the scale and the combination of operations make it very complex.
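To make that difficulty concrete, here is a minimal sketch (an editorial illustration, not from the talk): two loops of similar size, only one of which vectorizes naturally.

```python
import numpy as np

x = np.random.rand(100_000)

# Independent elements: trivially vectorizable and parallelizable.
y = x * 2.0 + 1.0

# Loop-carried dependency: each step needs the previous result, so a
# naive compiler cannot vectorize it; a parallel version requires a
# restructured algorithm (e.g., a parallel prefix scan).
acc = np.empty_like(x)
acc[0] = x[0]
for i in range(1, len(x)):
    acc[i] = acc[i - 1] + x[i]

# The recurrence is a prefix sum; np.cumsum computes it with tuned code.
assert np.allclose(acc, np.cumsum(x))
```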
The Hardware-Software Contract and AI Misconceptions
As a CPU designer, I’m used to a solid hardware-software contract: there’s memory, registers, a program counter, and defined instructions that let software do whatever it needs to within those constraints. In AI, the contract is much less clear. When people say, “I just want to write a PyTorch program that runs fast,” they often underestimate the complexity beneath that. PyTorch is just the beginning; beneath it lies a world of kernels, data movement, and multi-layer computation.
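To see how much sits beneath one line of PyTorch, a small sketch (assuming a stock PyTorch install): profiling a single matmul reveals the kernels dispatched on its behalf.

```python
import torch

# One line of PyTorch: it looks like "just a matmul".
x = torch.randn(1024, 1024)
w = torch.randn(1024, 1024)

# Profiling exposes the kernel dispatches and memory traffic beneath it.
with torch.profiler.profile(
    activities=[torch.profiler.ProfilerActivity.CPU]
) as prof:
    y = x @ w

# Each row in this table is a dispatched kernel, not "one instruction".
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=5))
```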
We built our first AI computer with hardware specifically designed for data movement, because moving data is an extra step that needs its own optimization. Despite seeming simple, the process involves breaking a PyTorch program down into a graph, which is then distributed across multiple chips. Deceptively simple operations like matrix multiply turn into complex challenges at the scale of modern datasets.
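Here is a minimal sketch of that first step, using torch.fx as a stand-in for a real graph compiler (the partitioning across chips is vendor-specific and not shown): tracing lowers a Python program into an explicit graph of operations.

```python
import torch
import torch.fx

# A toy module standing in for a real PyTorch workload.
class TinyBlock(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = torch.nn.Linear(64, 64)

    def forward(self, x):
        return torch.relu(self.proj(x)) + x

# symbolic_trace lowers the forward pass to a graph of operations; an
# AI-hardware compiler would then partition this graph across chips.
graph_module = torch.fx.symbolic_trace(TinyBlock())
print(graph_module.graph)  # nodes: placeholder x, call proj, relu, add
```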
Scaling AI with Hardware: A Layered Approach
We designed our hardware to handle vectors, matrices, and tensor operations, scaling from CPUs to GPU-like tensor units. CPUs are great at scalar operations, but as AI demands have grown, we’ve added vector and tensor capability. Each layer of AI computation calls for a different scale of processing: CPUs for scalar work, GPU-style units for vectors, and specialized tensor processors for dense math.
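The three layers can be seen in one computation. A rough sketch (sizes are illustrative) of the same matmul viewed as scalar, vector, and tensor work:

```python
import numpy as np

n = 64
a = np.random.rand(n, n)
b = np.random.rand(n, n)

# Scalar view: one multiply-add at a time, as a plain CPU core executes it.
def matmul_scalar(a, b):
    out = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            acc = 0.0
            for k in range(n):
                acc += a[i, k] * b[k, j]
            out[i, j] = acc
    return out

# Vector view: one inner product per step, as a SIMD/vector unit executes it.
def matmul_vector(a, b):
    out = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            out[i, j] = np.dot(a[i, :], b[:, j])
    return out

# Tensor view: the whole operation at once, as a tensor engine consumes it.
out_tensor = a @ b

assert np.allclose(matmul_scalar(a, b), out_tensor)
assert np.allclose(matmul_vector(a, b), out_tensor)
```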
As we scaled our AI models, we needed a hardware architecture that allowed us to manage this complexity efficiently. We designed our processors to handle these transformations, taking advantage of both vector processing and large-scale tensor operations. The challenge isn’t just in the math itself but also in managing data across processors and chips while ensuring everything stays synchronized.
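As a toy illustration of that data-management problem (one possible sharding scheme, not Tenstorrent’s actual partitioning): split a matmul’s rows across “chips,” compute locally, then gather and check the result.

```python
import numpy as np

# Shard A's rows across n_chips "chips"; each chip also holds a full copy of B.
n_chips = 4
a = np.random.rand(1024, 256)
b = np.random.rand(256, 512)
shards = np.array_split(a, n_chips, axis=0)

# Local compute on each chip; this layout needs no mid-computation traffic.
partials = [shard @ b for shard in shards]

# The gather step: real systems overlap this communication with compute
# and must keep every chip's view of the data synchronized.
out = np.vstack(partials)
assert np.allclose(out, a @ b)
```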