
AWS Summit AI-Ready Infrastructure Panel with Prasad Kalyanaraman, David Isaacs & Neil Thompson

CXO Bytes host Sanjay Podder is joined by Prasad Kalyanaraman, David Isaacs and Neil Thompson at the AI-Ready Infrastructure Panel at the AWS Summit in Washington, June 2024. The discussion featured insights on the transformative potential of generative AI, the global semiconductor innovation race, and the impact of the CHIPS Act on supply chain resilience. The panel also explored the infrastructure requirements for AI, including considerations for sustainable data center locations, responsible AI usage, and innovations in water and energy efficiency. The episode offers a comprehensive look at the future of AI infrastructure and its implications for business and sustainability.
Learn more about our people:

Find out more about the GSF:

Resources:

If you enjoyed this episode then please either:
Connect with us on Twitter, Github and LinkedIn!
TRANSCRIPT BELOW:
Sanjay Podder: Hello and welcome to CXO Bytes, a podcast brought to you by the Green Software Foundation and dedicated to supporting Chiefs of Information, Technology, Sustainability, and AI as they aim to shape a sustainable future through green software. We will uncover the strategies and big green moves that help drive results for business and for the planet.
I am your host, Sanjay Podder.
Welcome to another episode of CXO Bytes, where we bring you unique insights into the world of sustainable software development. I am your host, Sanjay Podder. Today we are excited to bring you highlights from a captivating panel discussion at the recent AWS Summit in Washington, held in June 2024. The AI-Ready Infrastructure Panel featured industry heavyweights including Prasad Kalyanaraman, VP of Infrastructure Services at AWS, David Isaacs from the Semiconductor Industry Association, and renowned researcher Neil Thompson from MIT, and it was chaired by Axios Senior Business Reporter Hope King.
During this panel, we take a look at the transformative potential of generative AI, the global race for semiconductor innovation, and the significance of the CHIPS Act in strengthening supply chain resilience. Together, we will hopefully have a better picture of the future of AI infrastructure and the innovations driving this field forward.
And before we dive in here, a reminder that everything we talk about will be linked in the show notes below this episode. So without further ado, let's dive into the AI-Ready Infrastructure Panel from the AWS Summit.
Prasad Kalyanaraman: Well, first, for the avoidance of doubt, generative AI is an extremely transformative technology for us. You know, sometimes I liken it to the internet, right, the internet revolution. So I think we're very early in that journey. I would say that, at least the way we've thought about generative AI, we think about it in three layers of the stack, right?
The underlying infrastructure layer is one of them, and I'll get into more details there. Then there are the frameworks: we build a set of capabilities that makes it easy to run generative AI models. And then the third layer is the application layer, which is where, you know, many people are familiar with things like chat applications and so on.
That's the third layer of the stack, right? Coming to the infrastructure layer, it always starts from, you know, obviously, finding land and pouring concrete and building data centers out of it. And then, on top of that, there's a lot more that goes on inside a data center in terms of the networks that you build, in terms of how you think about the electrical systems designed for it, how you land a set of servers, what kind of servers you land. It's not just the GPUs that many people are familiar with, because you need a lot more in terms of storage, in terms of network, in terms of other compute capability.
And then you have to actually cluster these servers together, because it's not a single server that trains these models. You can broadly think about generative AI as training versus inference, and they both require slightly different infrastructure.
Hope King: Okay, so talk about the training, what that needs first, and then the inference.
Prasad Kalyanaraman: Yeah, so the training models are typically large models. You might have heard the term "number of parameters," and typically there are billions of parameters. So you take the content which is actually available out there on the internet, and the models start learning about it.
And once they start learning about it, they start associating weights with different parameters, with different parts of that content. And when you ask the generative AI models to complete a set of tasks, that's the part which is inference. So you first create the model, which requires large clusters to be built, and then you have a set of capabilities that allows you to do inference on these models.
So the outcome of the model training exercise is a model with a set of parameters and weights. And then inference workloads require these models. And then you merge that with your own customer data, so that customers can actually go and look at it to say, okay, what does this model produce for my particular use case?
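For readers who want the two phases above in concrete form, here is a minimal, illustrative Python sketch using a toy linear model. Real generative models have billions of parameters and train on GPU clusters, but the shape of the workflow, learn weights once, then reuse the frozen weights for many cheap queries, is the same. All names and sizes here are hypothetical.
```python
import numpy as np

# --- Training: the expensive, cluster-scale phase ---
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))        # stand-in for content gathered for training
true_w = rng.normal(size=8)
y = X @ true_w + 0.1 * rng.normal(size=1000)

w = np.zeros(8)                        # model parameters start untrained
for _ in range(500):                   # gradient descent associates weights with the data
    grad = X.T @ (X @ w - y) / len(y)
    w -= 0.1 * grad

np.save("model_weights.npy", w)        # the training artifact is just parameters and weights

# --- Inference: apply frozen weights to new (customer) data ---
w_frozen = np.load("model_weights.npy")
customer_x = rng.normal(size=(3, 8))   # a customer's own inputs
print(customer_x @ w_frozen)           # cheap per query, but runs millions of times
```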
Hope King: Okay, let's, you know, go backwards to just even finding the land. Where are the areas in the world where a company like Amazon AWS is first looking? What are the areas that are most ideal to actually build the data centers that will end up producing these models and training and all the applications on top of that?
Prasad Kalyanaraman: There are a lot of parameters that go into picking those locations. Well, first, you know, we're a very customer-obsessed company, so our customers really tell us where they need the capacity. But land is one part of the equation.
It's also about the availability of green renewable power, which I'm sure we'll talk about through the course of this conversation. Being able to provide enough power from renewable sources to run these compute capabilities is a fairly important consideration.
Beyond that, there are regulations about what kind of content you can actually process. Then the availability of networks that allows you to connect these servers together, as well as connect them to the users who are going to use those models. And finally, it's about the availability of hardware and chips capable of processing this. And, you know, I'd say this is an area of pretty significant innovation over the last few years.
We've been investing in machine learning chips and machine learning for 12 years now, so we have a lot of experience designing these servers. So it takes network, land, power, regulations, renewable energy, and so on.
Hope King: David, I want to bring you in here, because obviously the chips are a very important part of building the entity of AI, the brain, and connecting that with the physical infrastructure.
Where do you look geographically, you or your body of organizations, when you're looking at maybe even diversifying the supply chain to build, you know, even more chips as demand increases?
David Isaacs: Yeah, so I think it's around the world, quite frankly. Many of you may be familiar with the CHIPS Act, passed two years ago here in the US, something very near and dear to my heart.
That's incentivizing significant investment here in the US, and we think that's extremely important to make the supply chain more resilient overall, to help feed the demand that AI is bringing about. I would also add that the green energy that was just alluded to will also require substantial semiconductor innovation and create new demand.
So we think we need to improve the diversity of chip output. Right now it's overly concentrated in ways that are subject to geopolitical tensions, natural disasters, and other disruptions. You know, we saw during the pandemic the problems that can arise from the supply chain, most prominently illustrated in the automotive industry. We don't want that holding up the growth in AI.
And so we think that having a more diversified supply chain, including a robust manufacturing presence here in the US, is what we're trying to achieve.
Hope King: Is there any area of the world, though, that is safe from any of those risks? I mean, you know, we're in the middle of a heat wave right now, right?
And we're gonna talk about cooling because it's an important part. But do you, just to be specific, see any parts of the world that are more ideal to set up these systems and these buildings, these data centers, for resiliency going forward?
David Isaacs: No, probably not. But just like, you know, your investment portfolio, rule one is to diversify.
I think we need a diversified supply chain for semiconductors. And, you know, right now the US, and the world for that matter, relies on the island of Taiwan for 92 percent of leading edge chips, and on South Korea for the remaining 8 percent. You don't need to be a geopolitical genius or a risk analyst to recognize that is dangerous and a problem waiting to happen. So, as a result of the investments we're seeing under the CHIPS Act, we projected, in a report we issued last month with Boston Consulting Group, that the US will achieve 28 percent of leading edge chip production by 2032. That's, I think, good for the US and good for the global economy.
Hope King: Alright, I'll ask this question one last time in a different way. Are there governments that are more proactive in reaching out to the industry to say, please come and build your plants here, data centers here?
David Isaacs: I think there's sort of a global race to attract these investments. There are counterparts to the CHIPS Act being enacted in other countries.
I think governments around the world view this as an industry of strategic importance. Not just for AI, but for clean energy, for national defense, for telecom, etcetera. And so there's a race for these investments, and, you know, we're just glad to see that the US is stepping up and implementing policy measures to attract some of these investments.
Hope King: Sanjay, can you plug any holes that maybe Prasad and David haven't mentioned, in terms of looking just at the land and where the most ideal areas around the world are to build new infrastructure to support the growth of generative AI and other AI?
Sanjay Podder: Well, I can only talk from the perspective of building data centers which are, for example, greener, because, as you know, AI, classical and now gen AI, consumes a lot of energy, right? And depending on the carbon intensity of the electricity used, it causes emissions. So wearing my sustainability hat, you know, I'm obviously concerned about energy, but I'm also concerned about carbon emissions. A recent study we did with AWS, in fact, points to the regional variability of what, for example, AWS has today in various parts of the world.
So if you look at North America, that's the US and Canada, and the EU, you will see that the data centers there, thanks to the cooler weather, the PUE, the Power Usage Effectiveness, is much better, lower, right? Because you don't need a lot of energy just to cool the data centers, for example. Whereas you will see that in AsiaPac, for example, because of warmer conditions, you might need more power not only to power your IT but also to keep the data centers cool.
Right? So, purely from a geography perspective, you will see that there are areas of the world today where the carbon intensity of electricity is lower, because the electricity is largely generated from renewable energy, like the Nordics. But at the same time, if you go to certain parts of the world, even today a lot of the electricity is generated from fossil fuels, which means the carbon intensity is high.
So purely from that perspective, if I look at some of the locations, the EU, North America, Canada, even Brazil, for example, a lot of their grid has renewable energy and the PUE factors are more favorable. But having said that, I have seen, for example, the government in Singapore creating new standards for how you run data centers in tropical climates.
In fact, one of the interesting things they have done is raise the accepted temperature level inside the data center by one degree Celsius, because that translates to a lot of energy savings.
So, wearing my sustainability hat, if you ask me where should the data centers be, I would say they should be in locations where, you know, the carbon intensity of electricity is lower so that we can keep the emissions low. That is very important. And obviously, there are various other factors because one needs to also remember that these data centers are not small.
They take a lot of space. And where will this space come from, you know? Hopefully they don't cause a trade-off with other sustainability areas like nature and biodiversity preservation. The last thing I would like to see is, you know, large pieces of forest making way for data centers, right? Hopefully good sense will prevail.
Those things won't happen. But these are some of the factors one needs to keep in mind, you know, if you bring in the sustainability dimension. How do I keep emissions lower? How do I make sure the impact on water resources is less? One of the studies shows that 40 to 50 inferences translate to half a liter of water.
So how do I make sure natural resources are less impacted? How do I make sure forests and biodiversity are preserved? These are the things one has to think about holistically. And obviously there are other factors that Prasad will know, like proximity to water supply for cooling the centers. So it's a complex decision when you select a data center location.
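As a reference point for the PUE numbers Sanjay mentions: PUE is simply total facility energy divided by the energy delivered to IT equipment, so a cooler climate lowers the numerator. The figures in this sketch are invented purely to illustrate the ratio.
```python
def pue(total_facility_kwh: float, it_equipment_kwh: float) -> float:
    """Power Usage Effectiveness: 1.0 is ideal (every kWh reaches IT gear)."""
    return total_facility_kwh / it_equipment_kwh

# Hypothetical sites with identical IT load but different cooling overhead.
nordic = pue(total_facility_kwh=11_500, it_equipment_kwh=10_000)    # ~1.15
tropical = pue(total_facility_kwh=16_000, it_equipment_kwh=10_000)  # ~1.60

print(f"Nordic site PUE:   {nordic:.2f}")
print(f"Tropical site PUE: {tropical:.2f}")
# Lower PUE means less energy spent on cooling and overhead per unit of compute.
```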
Hope King: I love the description and how detailed you went into it. I mean, for all of us, you know, looking at our jobs as members of the press, right, we want to know where the future is going and what it's going to look like. And from what I'm putting together from what everyone has said so far, I'm thinking more data centers are going to be closer to the poles, where it's cooler, and maybe in more remote areas away from people, so that we're not draining resources from those communities.
And Neil, you know, I don't know if this is just, well, it's probably a personality thing. But, like, I sit there and I say to myself, I could ask ChatGPT, because I've been dabbling with it, you know, to help me with maybe restructuring the sentence that I'm writing. Is it worth taking away water from a community?
Is me asking the query of it worth all the things that are powering it? I mean, these are things that I think about. I'm an avid composter, like, this is my life, right? What are we ultimately doing, right? What are we all ultimately talking about when we now say AI is going to be a big part of our lives and it's going to be a forever part of our lives,
but then, you know, you're hearing David and Sanjay and Prasad talk about everything that is required just to do that one thing, to give me a grammar check?
Neil Thompson: Yeah, so, I mean, for sure it is remarkable the sort of demand that AI can place on the resources that we need. And it's been a real change, right?
You don't think of saying, like, well, should I run Excel? You know, am I going to use a bunch of water from a community because I'm using Excel, right? You don't think about that. But, you know, there are things that you can do in Excel where you're like, oh, maybe I've put it on ChatGPT and I should have done it in Excel or something like that.
So, yeah, you absolutely have this larger appetite for resources that comes in with AI. And the question is sort of what you do about that, right? And one of the nice things is, of course, that we don't have to put everything in the data center, right? People are working very hard to build models that are smaller, so that they can live on your phone, right?
And then you still have the energy of recharging your phone, but it's not so disproportionate to training a model that requires tens of thousands of GPUs running for months. So I think that's one of the things we can be thinking about here: the efficiency gains that we're going to get and how we can do that. That's happening both at the chip level and at the algorithmic level, and I'm happy to go into a lot more detail on that if folks would like.
But I think that's the trade-off we have there: okay, we're going to make these models more efficient. But at the same time, there's this overwhelming trend you see in AI, which is that if you make your models bigger, they become more powerful. And this is the race that all of the big folks, OpenAI, Anthropic, are in, which is scaling up these models.
And what you see is that scaling up does produce remarkable changes, right? Even the difference between, say, ChatGPT and GPT-4: if you look at its performance on things like the LSAT or other tests, there are big, big jumps as they scale up these models. So that's quite exciting, but it does come with all of these resource questions that we're having.
And so there's a real tension here.
Prasad Kalyanaraman: Yeah, I would add that one of the things that's important for everyone to realize is that it is so critical to use responsible AI, right? And that is...
Hope King: Me, using that for grammar? Is that responsible use of AI? Well, I mean, I think I should know.
Prasad Kalyanaraman: Responsible AI, yeah. What that means is that we have to be careful about how we use these resources, right? Because you asked a question about how much water it consumes when you use Excel or any other such application. The key is, you know, this is the reason why, when we looked at it, we said, look, we have a responsibility to the environment. We were actually the ones that came back and said we need to get to net zero carbon by 2040 with the Climate Pledge, 10 years ahead of the Paris Agreement.
And then we said we have to get to water positive. I'll give you a couple of anecdotes on this. We will be returning more water to the communities than what we actually take, in AWS, by 2030. That's quite impressive if you actually think about it. And that is a capability you can really innovate on if you think about how you use cooling, and what you actually need to cool, and so on.
The other day I was reading an article about Dublin, in Ireland, where we actually used heat from a data center to help with community heating, right? So district heating is another one. I think there are lots of opportunities to innovate on this, to try and get the benefits of AI and at the same time be responsible in terms of how we actually use it. So...
Hope King: Talk more about the water. How exactly is AWS going to return more water than it takes?
Prasad Kalyanaraman: Yeah, so I'll tell you a few things there. One is the technology allowing us to look at leaks that we have in our pipes. A pretty significant amount of water gets leaked and wasted today.
And we've done a lot of research in this area, trying to use some of our models and some of the technology that we built to look for these leaks when municipalities transfer water. That's one area. Renewable water sources are another. So there's a lot of innovation that has happened already in trying to get to water positive.
We took a pretty bold stand on getting to water positive because we have a high degree of confidence that this research will actually get us there.
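Prasad doesn't detail the leak-detection models, but to make the idea concrete, here is a purely illustrative sketch of the simplest version of the technique: flag anomalous night-time flow in a supply line against a rolling baseline. This is not AWS's method, just a hedged example of what "using models to find leaks" can mean.
```python
import numpy as np

def flag_leaks(night_flow_lpm: np.ndarray, window: int = 14, threshold: float = 3.0):
    """Flag days whose minimum night flow jumps well above a rolling baseline.

    Sustained night flow is a classic leak signal: legitimate demand is low
    at night, so a step change suggests water escaping somewhere.
    """
    flags = []
    for day in range(window, len(night_flow_lpm)):
        baseline = night_flow_lpm[day - window:day]
        mu, sigma = baseline.mean(), baseline.std() + 1e-9
        if (night_flow_lpm[day] - mu) / sigma > threshold:
            flags.append(day)
    return flags

# Synthetic data: steady ~50 L/min night flow, then a leak adds ~20 L/min.
rng = np.random.default_rng(1)
flow = 50 + rng.normal(0, 2, size=60)
flow[40:] += 20                       # leak begins on day 40
print(flag_leaks(flow))               # expect day 40 (and possibly neighbors)
```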
Hope King: Neil, how do you, is this the first time you're hearing about the water? I mean, this sounds like an incredible development.
Neil Thompson: It is the first time I'm hearing it, so without more details, I'm not sure I can say more about it specifically. But...
Hope King: Is it, but it's almost like, you know, we need AI to solve these big problems. It's a quagmire. I mean, yeah, you have to use energy to save energy. It's a paradox.
Neil Thompson: Sure, well, let me say two things here. One is that it is absolutely true that there are a bunch of costs associated with using AI, but there are a bunch of benefits that come as well, right?
And we have this with almost all technologies, right? When we produce, you know, concrete roads and things like that, there's a bunch of stuff that goes into them, but they also make us more efficient and the like. And in some of the modeling that my lab and others have done, the upside of using AI in the economy, and even more so in research and development to make new discoveries, can have a huge benefit to the economy, right?
And so the question is, okay, if we can get that benefit, it may come with some costs, but then we need to think carefully about what we're going to do to mitigate the fact that those costs exist, and how we can deal with them.
Hope King: All of these things are racing at the same time, right?
So you've got the race to build these models, to use them to get the solutions, but then you've got to build the thing at the same time, and then you've got to find the chips. And, I mean, obviously it's not my job. My job is to ask the questions. I don't understand how we can measure which of these branches of this infrastructure is actually moving faster, and does one need to move faster than the others in order for everything to kind of follow along?
I don't know if anybody in here...
Prasad Kalyanaraman: Yeah. Look, there's sometimes a misconception that these things started over the last 18 months. That's just not true. Data centers have been around for a long period of time. If I talk about cloud computing right now, many of us are very familiar with it, but a decade back we were very early on that.
So it's not a change that has to happen overnight, right? It's something that has evolved over a long period of time. And the investments we've been making over 15 years are actually helping us do the next set of improvements. You know, we say this internally: there's no compression algorithm for experience.
So you have to have that experience. You have to have spent time going all the way down to the chip level, the hardware level, the cooling level, the network level. And when you start adding up all of these things, you start getting really large benefits.
Hope King: So, on that point though, is it about building new or retrofitting old when it comes to... because I've seen reports that suggest that building new is more efficient in another way because, you know, rack space or whatever.
So just really quickly on that, I know Neil wants to jump in.
Prasad Kalyanaraman: It's a combination. It's never one size fits all.
Hope King: Generally speaking, as you look at the development data.
Prasad Kalyanaraman: I would say that there's obviously a lot of new capacity being brought online. But it's also about efficiencies in existing capacity.
Because when we design our data centers, we design them for 20-plus years. But the hardware that lands in them is typically useful for about six to seven years or so. After that you have to refresh. So you have an opportunity to go and refresh older data centers.
Hope King: Yeah, it's not an overnight update. You know, what were you going to say?
Neil Thompson: So you asked specifically about the race. I think that to me is one of the most interesting questions, and something we spend a lot of time on. And it certainly is the case that there's been an enormous escalation. Since 2017, if you look at large language models, there's been something like a 10x increase per year in the amount of compute being used to train them, right?
So that's a giant increase. Compared to that, some of these other increases in efficiency have not kept pace. So at the most cutting edge, what you're talking about is the resources required going up. But it's also the case that if you look at the general diffusion of the technology that's going on, right?
Maybe some capacity already exists, but other firms want to be able to use it; that's just spreading out. There, you get to take advantage of the efficiency improvements that are going on. And at the chip level, for example, if you look at the floating point operations being done, the improvement is between 40 and 50 percent per year in terms of flops per dollar.
That's pretty rapid, but even that is not that rapid compared to algorithmic efficiency improvements. For those who don't think about it in this way: I have some algorithm that I have to run on the chip, and it's going to use a certain number of operations, and the question is, can I design something that achieves the same goal while asking for fewer operations?
That's a pure efficiency gain. And what we see is that in large language models, efficiency is growing by 2 to 3x every year, which is huge, right? So if you think about the diffusion side of things, efficiency gains there are very high, and we should feel a little reassured that as that happens, the demands will drop.
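To put rough numbers on the race Neil describes, the sketch below compounds the three growth rates he cites: about 10x per year in frontier training compute, 40 to 50 percent per year in flops per dollar, and 2 to 3x per year in algorithmic efficiency. The rates are his approximations, not precise figures.
```python
years = 5

frontier_compute = 10 ** years        # ~10x/yr growth in frontier training compute
hardware_gain = 1.45 ** years         # ~40-50%/yr improvement in flops per dollar
algorithm_gain = 2.5 ** years         # ~2-3x/yr algorithmic efficiency gain
combined_gain = hardware_gain * algorithm_gain

print(f"Frontier compute demand after {years} years: {frontier_compute:,.0f}x")
print(f"Hardware efficiency gain:    {hardware_gain:,.1f}x")
print(f"Algorithmic efficiency gain: {algorithm_gain:,.1f}x")
print(f"Combined efficiency gain:    {combined_gain:,.0f}x")
# Efficiency compounds to roughly 600x over five years, while frontier demand
# grows ~100,000x: diffusion gets cheaper even as the cutting edge outruns it.
```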
Hope King: Yeah, but then you have to contend with the volume. Because, you know, even if you're making each one more efficient, you're still multiplying it by other needs. So what's the net math on that offset?
Neil Thompson: Yeah, so, I mean, the question there is exactly how fast AI is growing. Sure. And that turns out to be a very deep question that people are struggling with.
Unfortunately, many of the early surveys done in this area were what you might call a convenience sample: you called the people you cared about because they were your customers. But that was not representative of the whole country. So a couple of years ago the Census did some work and found that only about six percent of firms had actually been operationalizing AI.
Not that many. So we know that's going to be growing a lot, but exactly how fast, we're not sure. I think what we can say, though, is that we could have a moment where it's happening faster, but as long as these efficiency increases continue over the longer term, and so far we've seen them to be remarkably robust,
that suggests that as that diffusion happens, demand will go down, so long as we don't all move to the biggest cutting-edge model in order to do it.
Hope King: David, I think you probably have some insight into how quickly the chip makers themselves are trying to design more efficient chips, you know, to this end.
David Isaacs: Yeah, let me just start with the caveat that I'm not a technologist. I'm a scientist, but I'm a political scientist. But, you know, I'm glad that the conversation has turned to innovation and efficiency gains, because some of the questions ten minutes ago were assuming that the technology would remain the same, and that's not the case.
Many people in the room are probably familiar with Moore's Law and the improvements in computing power, the improvements in efficiency, and the reduced costs that have been happening for decades now. That innovation pathway is continuing. It's not necessarily the same in terms of scaling and more transistors on silicon, but it's advanced packaging.
It's new designs, new architectures. And then there are the software and algorithm gains and the like. So we believe that will result in efficiency gains and resource savings. There have been a lot of third-party studies. We know that for the semiconductor industry, the technologies we enable have a significant multiplier effect in reducing climate emissions and conserving resources, whether it's in transportation or energy generation or manufacturing. So we think there's a very substantial net gain from all these technologies. And I guess the other thing I would add is, you know, the CHIPS Act, the formal name is the CHIPS and Science Act, and there are very substantial research investments.
Some of which have been appropriated and are getting up and running. Unfortunately, some of the science programs have been, and this is Washington-speak, authorized but not yet appropriated. We need to fund those programs so that this innovation trajectory can continue.
Hope King: Yeah, I mean, just by nature, you know, we're the skeptics in the room, and we're looking at just the present day, and the imagination that is required for your jobs is not easily accessible when a lot of us are concerned about the here and now.
So I appreciate that context, David. Sanjay, I want to talk about what it is going to take, though, in terms of energy, right? That is something that will remain the same in terms of the needs. So what are you seeing in terms of how new data centers are being built, or even the systems, the clusters of infrastructure that support these data centers, that you see emerging? And what is the best solution as, you know, more companies are building and looking for land to actually grow all these AI systems?
Like, what are the actual renewable sources of energy that are easy and sort of at hand right now to be built?
Sanjay Podder: Well, I'm not an expert on that topic, so I won't be able to comment much on that. But I can speak to what you were discussing about efficiencies: for the same energy, you know, you can do a lot more with AI. And what I mean by that is, in the way we build AI today, whether it's training or inferencing, there are a number of easy things to do which have a huge impact on the amount of energy you need for that AI. I think there was a reference, for example, to large gen AI models.
They are typically preferred because they probably give more accuracy. But the reality is, in different business scenarios, you don't necessarily have to go to the largest of the models, right? In fact, most of the LLM providers today are giving you LLMs in different t-shirt sizes: 7 billion parameters, 80 billion parameters.
One of the intelligent things to do is fit-for-purpose: you select a model which is good enough for your business use case. You don't necessarily go to the largest model, and the energy need is substantially lowered in the process, for example, right? Those to me are very practical levers. The other thing, for example: I think Prasad referred to the fact that these models, at the end of the day, are deep learning models.
You know, there are techniques like quantizing and pruning by which you can compress the models, such that smaller models will likely take less energy, for example. And then, of course, there's inferencing. Traditionally, we have always been thinking about training, with classical AI. But with generative AI, the game has changed.
Because with gen AI, given how pervasive it is and how everybody is using it, millions of queries, the inferencing part takes more energy than training. Now again, simple techniques can be used to lower that, like one-shot inferencing, where you batch your prompts. When you do these things, you come to your answers quicker.
So, you know, you don't have to go and query the large model again and again. If you want to design your trip itinerary to San Francisco, instead of asking 15 prompts, you think about how to batch it: this is my purpose, this is why I'm going, this many days I'll be there. And it is very likely you'll get your itinerary in one shot, and in the process a lot less energy will be used.
So the point here is, and of course data centers, right? All the chips that we are talking about, the custom silicon, can lower the energy needs. So while I may not be able to comment on the best sources of energy, because renewable energies are of various types, solar, wind, now people are talking about nuclear fusion and whatnot, I'm seeing that even within the energy we have, we can use it in a very sustainable way, in a very intelligent way, so that you get the business value you're seeking without wasting energy unnecessarily.
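As a back-of-the-envelope illustration of the right-sizing and quantization levers Sanjay lists, this sketch compares the weight-storage footprint, a first-order proxy for serving hardware and energy, across hypothetical "t-shirt" model sizes and precisions. The sizes and names are made up for the example.
```python
def weight_footprint_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate weight storage only; real serving adds activations, KV cache, etc."""
    return params_billions * 1e9 * bytes_per_param / 1e9

menu = {"small-7b": 7, "medium-80b": 80, "large-400b": 400}  # hypothetical sizes

for name, size in menu.items():
    fp16 = weight_footprint_gb(size, 2.0)   # 16-bit weights
    int8 = weight_footprint_gb(size, 1.0)   # 8-bit quantized weights
    print(f"{name:>11}: {fp16:6.0f} GB fp16 -> {int8:5.0f} GB int8")

# Fit-for-purpose: if the 7B model clears the task's quality bar, serving it
# quantized takes ~7 GB of weights versus ~800 GB for the largest model in
# fp16 -- a far smaller hardware and energy footprint per query.
```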
Hope King: It sounds like you want companies to be more mindful of their own data needs and not overshoot, but be even more efficient in how they're engineering what the applications can do for them.
Sanjay Podder: Absolutely right. And, you know, on that point, about 70 to 80 percent, maybe more, of data is dark data. And what is dark data? It's data that organizations store with the hope that one day they will need it, and they never require it. So you are simply storing a lot of data, and that data also corresponds to, you know, energy needs.
Right? And there's a lot of the rebound effect you mentioned some time back: because the cost of compute went down and the cost of storage went down, the programming community became lazy. So what happened as a result is, you know, we're not bringing efficiency into the way we do software engineering.
You have all these virtual machines which are hardly utilized, and they all require energy to operate. So there's a lot of housekeeping we can do, and that can lower energy needs in a very big way. And even if there's a lot of renewable energy, IT is not the only place, and AI is not the only place, where renewable energy should be used.
There are other human endeavors. So we have to change our mindset and be more sustainable, more responsible, in the way we do IT and AI today.
Hope King: Did you want to add to that?
Prasad Kalyanaraman: Yeah, I would say that what he said is 100 percent right, which is that, as I said, you kind of have to think through the entire stack for this.
You have to go through every layer of the stack and think about how you optimize it. I'll give you an instance about cooling, and I know you wanted to talk about cooling.
Hope King: Yes, air conditioning for the data centers, and now the liquid cooling that's coming in to try to...
Prasad Kalyanaraman: Exactly.
Hope King: Yes, go ahead.
Prasad Kalyanaraman: So over the years, what we've figured out is that we don't need liquid cooling for the vast majority of compute.
In fact, even today, pretty much all our data centers run on air cooling, and what that means is we use outside air to actually cool our data centers. And none of our data centers are nearly as cool as this room, by the way, just to be very clear.
Hope King: It's not as cold as this room?
Prasad Kalyanaraman: Not even close.
Hope King: What's the average temperature in a data center?
Prasad Kalyanaraman: It's well above 80 degrees.
Hope King: Really? Yeah. Okay. What's the range?
Prasad Kalyanaraman: It's between 80 and 85. Now, the thing is that you have to be careful about cooling the data centers too much, because you have to worry about relative humidity at that point as well.
And so we've spent a lot of time on computational fluid dynamics, looking at whether we really need to cool the data centers as much. It's one of the reasons why our data centers run primarily on air, outside air actually. Now, there's a certain point where you cannot do it with just air; that's where liquid cooling comes in.
Liquid cooling comes into effect because some of these AI chips have to be cooled at a microscopic level, and air cannot actually get there fast enough. But even there, if you think about it, you only need to liquid cool the AI chip itself. As I said, AI does not require just the chip; you need networks, you need storage and all that.
Those things are still cooled by air. So my estimate is that even in a data center that has primarily AI chips, only about 60 to 70 percent of it needs to be liquid cooled. The rest is just pure air cooled.
Hope King: I think that's pretty fascinating, because that's been discussed more recently, that liquid cooling is actually more crucial.
I don't know if you saw Elon Musk tweeting a picture of the fans that he has. He made a pun about how his fans are helping. Whatever, anyways, go check it out. It's on X. Thank you for circling back on the cooling. I think that was definitely a big question for energy use, because those systems require a lot of energy.
As we come to the last couple of minutes, you know, I want to turn the conversation forward-looking. The next two to five years: if I were to be back here with the four of you sitting on this stage, what would we be talking about? And where, at that point, do you think there would still be gaps that haven't been filled since sitting here now? Especially because, as you mentioned, infrastructure is not as easily upgradable as software is, and there will need to be those investments, physical investments, whether it's labor, whether it's land, physical resources. So, in two years, what do you think we're going to be talking about?
Prasad Kalyanaraman: I'll start. We're already starting to talk about that, so I expect we'll talk more about those things. There's going to be a lot more innovation, and generative AI will spur that as well, in terms of how to think about renewable sources and how to actually run these things very efficiently.
Because one of the realities is that generative AI is actually really expensive to run. And so you're not going to spend a lot of money unless you actually get value out of it. So there's going to be a lot of innovation on the size of these models, there's going to be a lot of innovation on chips, we're already starting to see that, and of course nuclear will be a part of the energy equation.
I think the path from here, at least in our minds, our path to get to net zero carbon by 2040, is going to be very nonlinear. It's not going to be one size fits all; it's going to be extremely nonlinear. But I expect there will be a lot more efficiency in how we run it. It's one of the reasons why we harp so much on efficiency, on how we actually run our infrastructure.
One, it actually helps us in terms of our costs, which we actually translate to our customers. But it's also a very responsible thing for us to do.
Hope King: David, I actually want to jump over to you just for a second on this, because, you know, a lot could happen in the next couple of months when it comes to the administration here in the US. Is there anything that you see in the next two years, politically, that could change the direction or the pace of development when it comes to AI, infrastructure building, investments from corporations, maybe even pulling back, right, pressures from investors to see the ROI on the cost of these systems?
David Isaacs: Yeah, I'm hesitant to engage in speculation on the political landscape, but I think things like the CHIPS Act and US leadership in AI enjoy bipartisan support, and I think that will continue regardless of the outcome of elections and short-term political considerations. You know, getting to your question on what we'll be talking about a few years down the road, I'm an optimist, so I think we'll be talking about how we have a more resilient supply chain for chips around the world.
I think we'll be enjoying the benefits of some of the research investments that propel innovation. One additional point I'd like to raise real quickly is soft infrastructure and human talent. I think that's an important challenge, and, at least in the US, we have a huge skills gap among the workforce.
Whether it's, you know, K-through-12 STEM education or retaining the foreign students at our top universities. And that's not just a semiconductor issue; that's all technology and advanced manufacturing throughout the economy. So I think that will be a continuing challenge.
Hope King: Are you seeing governments willing and interested to increase funding in those areas?
David Isaacs: On a limited basis, but I think we have a lot of work to do.
Hope King: What do you mean by limited?
David Isaacs: I think there's a strong interest in this, but, you know, I'm not sure governments are willing to step up and invest in the way we need to as a society.
Hope King: Sanjay, two years from now, we're talking again. What's going to be on our minds?
Sanjay Podder: So we have not seen what gen AI will do to us. We are still, like, in kindergarten, talking about infrastructure. It's like the internet boom time, right? You know, and we know what happened with that.
Our whole lives changed. With gen AI, enterprises will reinvent themselves. Our society will reinvent itself. And all that will be possible because of a lot of innovation happening at the hardware end as well as the software layer. But we need to keep something in mind as we transform.
None of us in this room knows how the world will look. It will look very different; that's what's certain. But one thing we need to keep in mind is that this transformation should be responsible. It should keep a human at the center of this transformation. We have to keep the environment in mind; in ESG, all three are very important, so that, you know, our AI does not disenfranchise communities and people.
I think that is going to be the biggest innovation challenge for us, because I'm very certain that human ingenuity is such that we will build a better world than what we have today. There will be a lot of innovation in all spheres, but in the journey, let's make sure that responsible AI becomes the central aspect, that sustainable and responsible AI become a central theme as we go through this journey, right?
So I'm as eagerly looking forward as all of us here to how this world will look.
Hope King: Yeah. No digital hoarding. Lastly, Neil, we didn't talk about the last-mile customization problem today, but I don't know if that's something you're looking to see solved in the next two years. What other things?
And you can speak to the last mile too, if you want.
Neil Thompson: Sure. So, for those who don't know, the idea of the last-mile problem in AI is that you can build, say, a large language model and say, this works really well, in general, for people asking questions on the internet. But that might still not mean it works really well for your company, for some really specific thing.
You know, if you're interacting with your own customers, right, on your own products, with those specific terms, that system may not work that well. And you have to do some customization. That customization may be easy: you just need a little prompt engineering, or you feed it a little bit of information from your company.
Or it could be more substantial: you could actually have to retrain it. And in those cases, that's going to slow the spread of AI. Because it's going to mean we get lots of improvement in one area, but then think about all of the different companies that might want to adopt.
They have to figure out how they can adapt the systems to work for the things they do, and that's going to take time and effort. And so that last mile is going to be really important, because I think it's very easy to say, I read in the newspaper, or I saw a demonstration, that AI can do amazing things, much more than it could do even three months ago.
And that's absolutely true. But then there's also this diffusion process, and that's going to take a lot longer. So I think what we should expect over the next two years is more of this sort of wedge: there are going to be some folks who are leading, for whom these resource questions we've been talking about are incredibly salient, and there are going to be people who are behind, for whom the economics of customization still don't work, and they're going to be in the model they were in ten years ago.
And so that divide, I think, is going to get bigger over the next two years.
Hope King: Alright, thank you for joining us, David, Sanjay, Neil, Prasad. And for all of you, hopefully you found this as informative as I did. Thanks, thanks everyone.
Prasad Kalyanaraman: Thank you.
Sanjay Podder: Hey, everyone. Thanks for listening. Just a reminder to follow CXO Bytes on Spotify, Apple, YouTube, or wherever you get your podcasts. And please do leave a rating and review if you like what we're doing. It helps other people discover the show. And of course, we want more listeners. To find out more about the Green Software Foundation, please visit greensoftware.foundation. Thanks again, and see you in the next episode.

  continue reading

4集单集

Artwork
icon分享
 
Manage episode 432935636 series 3582716
内容由Sonic Futures and The Green Software Foundation提供。所有播客内容(包括剧集、图形和播客描述)均由 Sonic Futures and The Green Software Foundation 或其播客平台合作伙伴直接上传和提供。如果您认为有人在未经您许可的情况下使用您的受版权保护的作品,您可以按照此处概述的流程进行操作https://zh.player.fm/legal
CXO Bytes host Sanjay Podder is joined by Prasad Kalyanaraman, David Isaacs and Neil Thompson at the AI-Ready Infrastructure Panel at the AWS Summit in Washington, June 2024. The discussion featured insights on the transformative potential of generative AI, the global semiconductor innovation race, and the impact of the CHIPS Act on supply chain resilience. The panel also explored the infrastructure requirements for AI, including considerations for sustainable data center locations, responsible AI usage, and innovations in water and energy efficiency. The episode offers a comprehensive look at the future of AI infrastructure and its implications for business and sustainability.
Learn more about our people:

Find out more about the GSF:

Resources:

If you enjoyed this episode then please either:
Connect with us on Twitter, Github and LinkedIn!
TRANSCRIPT BELOW:
Sanjay Podder: Hello and welcome to CXO Bytes, a podcast brought to you by the Green Software Foundation and dedicated to supporting Chiefs of Information, Technology, Sustainability, and AI as they aim to shape a sustainable future through green software. We will uncover the strategies and a big green move that's helped drive results for business and for the planet.
I am your host, Sanjay Poddar.
Welcome to another episode of CXO Bytes, where we bring you unique insights into the world of sustainable software development. I am your host Sanjay Poddar. Today we are excited to bring you highlights from a captivating panel discussion at the recent AWS Summit in Washington held in June 2024. The AI-Ready Infrastructure Panel featured industry heavyweights including Prasad Kalyanaraman, VP of Infrastructure Services at AWS, David Isaacs from the Semiconductor Industry Association, and renowned researcher Neil Thompson from MIT, and it was chaired by Axios Senior Business Reporter Hope King.
During this panel, we take a look at the transformative potential of generative AI, the global race for semiconductor innovation, and the significance of the CHIPS Act in strengthening supply chain resilience. Together, we will hopefully have a better picture of the future of AI infrastructure and the innovations driving this field forward.
And before we dive in here, a reminder that everything we talk about will be linked in the show notes below this episode. So without further ado, let's dive into the AI-Ready Infrastructure Panel from the AWS Summit.
Prasad Kalyanaraman: Well, well first, for the avoidance of doubt, generative AI is an extremely transformative technology for us. You know, sometimes I liken it to the internet, right, the internet revolution. So, I think there's, we're very early in that journey. I would say that, at least the way we've thought about generative AI, we think about it in three layers of the stack, right?
The underlying infrastructure layer is one of them, and I'll get into more details there. And then there is the frameworks. We build a set of capabilities that makes it easy to run generative AI models. And then the third layer is the application layer, which is where, you know, many people are familiar with like chat applications and so on.
That's the third layer of the stack, right? Thinking into the infrastructure layer It always starts from, you know, obviously, finding land and pouring concrete and building data centers out of it. And then, on top of it, there's a lot more that goes in inside a data center in terms of the networks that you build, in terms of how you think about electrical systems that are designed for it, how you land a set of servers, what kind of servers do you land, it's not just the GPUs that many people are familiar with, because you need a lot more in terms of storage, in terms of network, in terms of other compute capability.
And then you have to actually cluster these, servers together because it's not a single cluster that does these, training models. You can broadly think about generative AIs like training versus inference, and they both require slightly different infrastructure.
Hope King: Okay, so talk about the generative, what that needs first, and then the inference.
Prasad Kalyanaraman: Yeah, so the training models are typically large models that have, you know, you might have heard the term number of parameters, and typically, think of them as, and there are billions of parameters. So, you take, the content which is actually available out there on the internet and then you start, the models start learning about them.
And once they start learning about them, then they start associating weights with different parameters, with different, parts of that content that's there. And so when you ask, the generated AI models, for completing the set of tasks, that's the part which is inference. So you first create the model, which requires large clusters to be built, and then you have, a set of capabilities that allows you to do inference on these models.
So outcome of the model training exercise is a model with a different set of parameters and weights. And then inference workloads require these models. And then you merge that with your own customer data, so that customers can actually go and look at it to say, okay, what does, this model produce for my particular use case?
Hope King: Okay, let's, and, you know, let's go backwards to, to just even finding the land. Yeah. You know, where are the areas in the world where a company like Amazon AWS is first looking? Where are they, what, are the areas that are most ideal to actually build data centers that will end up?
Producing these models and training and all the applications on top of that.
Prasad Kalyanaraman: There's a lot of parameters that go into, picking those locations. Well, first, you know, we're a very customer obsessed company, so our customers really tell us that we, that they need, the capacity. But land is one part of the equation.
It's actually also about availability of green renewable power, which I'm sure we'll talk about through the course of this conversation. Being able to actually provide enough amounts of power and renewable sources to be able to run these compute capabilities is a fairly important consideration.
Beyond that, there are regulations about, like, what kind of, of, content that you can actually process. Then the availability of networks that allows you to connect these, these, servers together, as well as connect them to users who are going to use those models. And finally, it's about the availability of hardware and chips that are capable of processing this. And, you know, I'd say that, this is an area of pretty significant innovation over the last few years now.
We've been investing in machine learning chips and machine learning for 12 years now. And so, we have a lot of experience designing these servers. And so it takes network, land, power, regulations, renewable energy and so on.
Hope King: David, I want to bring you in here because you know obviously the chips are a very important part of building the entity of AI and the brain and connecting that with the physical infrastructure.
Where do you look geographically? What, or your, body of organizations when you're looking at maybe even diversifying the supply chain, to build, you know, even more chips as demand increases.
David Isaacs: Yeah, so I think it's around the world, quite frankly, and many of you may be familiar with the CHIPS Act, the past, two years ago here in the US, something very near and dear to my heart.
That's resulting in incentivizing significant investment here in the US, and we think that's extremely important to make the supply chain more resilient overall, to help feed the demand that AI is bringing about. I would also add that the green energy that was just alluded to that will also require, substantial, semiconductor innovation and new demand.
So we think that improving our, you know, the diversity of chip output, right now it's overly concentrated in ways that are, subject to geopolitical tensions, natural disasters, other disruptions, you know, we saw during the pandemic, the problems that can arise from the supply chain and the problems, I guess most prominently illustrated in the, automotive industry, we don't want that holding up the growth in AI.
And so we think that having a more diversified supply chain, including a robust, manufacturing presence here in the US is what we're trying to achieve.
Hope King: Is there any area of the world, though, that is safe from any of those risks? I mean, you know, we're in the middle of a heat wave right now, right?
And we're, you know, and we're not, and we're gonna talk about cooling because it's an important part. But do you, just to be specific, see any parts of the world that are more ideal to set up these systems and these buildings, these data centers for resiliency going forward?
David Isaacs: No, probably not. But just like, you know, your investment portfolio, rule one is to diversify.
I think we need a diversified supply chain for semiconductors. And, you know, right now, The US and the world, for that matter, is reliant 92 percent on leading edge chips from the island of Taiwan and the remaining 8 percent from South Korea. You don't need to be a geopolitical genius or a risk analyst to recognize that is dangerous and a problem waiting to happen. So, as a result of the investments we're seeing under the CHIPS Act, we projected, and in a report we issued last month with Boston Consulting Group, that the US will achieve 28 percent of leading edge chip production by 2032. That's, I think, good for the US and, good for the global economy.
Hope King: Alright, I'll ask this question one last time in a different way. Are there governments that are more proactive in reaching out to the industry to say, please come and build your plants here, data centers here?
David Isaacs: I think there's sort of a global race to attract these investments. There are counterparts to the CHIPS Act being enacted in other countries.
I think governments around the world view this as an industry of strategic importance. Not just for AI, but for clean energy, for national defense, for telecom, et cetera. And so there's a race for these investments, and we're just glad to see that the US is stepping up and implementing policy measures to attract some of them.
Hope King: Sanjay, can you plug any holes that maybe Prasad and David haven't mentioned, in terms of looking just at the land, and where the most ideal areas around the world are to build new infrastructure to support the growth of generative AI and other AI?
Sanjay Podder: Well, I can only talk from the perspective of building data centers that are greener, because, as you know, AI, classically and now gen AI, consumes a lot of energy, right? And depending on the carbon intensity of the electricity used, it causes emissions. So wearing my sustainability hat, I'm obviously concerned about energy, but I'm also concerned about carbon emissions. A recent study we did with AWS points to the regional variability of what AWS has today in various parts of the world.
So if you look at North America, that's the US and Canada, and the EU, you will see that the data centers there, thanks to the cooler weather, have a much better, lower PUE, Power Usage Effectiveness, because you don't need a lot of energy just to cool the data centers. Whereas in AsiaPac, for example, because of warmer conditions, you might need more power not only to run your IT but also to keep the data centers cool.
Right? So, purely from a geography perspective, there are areas of the world today where the carbon intensity of electricity is lower because the grid is largely powered by renewable energy, like the Nordics. But at the same time, in certain parts of the world, even today a lot of the electricity is generated from fossil fuels, which means the carbon intensity is high.
So from that perspective, in some locations like the EU, North America, Canada, even Brazil, for example, a lot of the grid runs on renewable energy and the PUE factors are more favorable. But having said that, I have seen, for example, the government in Singapore creating new standards for how you run data centers in tropical climates.
In fact, one of the interesting things they have done is raise the accepted temperature in the data center by one degree Celsius, because that translates to a lot of energy savings.
So, wearing my sustainability hat, if you ask me where the data centers should be, I would say in locations where the carbon intensity of electricity is lower, so that we can keep the emissions low. That is very important. And obviously there are various other factors, because one also needs to remember that these data centers are not small.
They take a lot of space. And where will this space come from? Hopefully they don't cause a trade-off with other sustainability areas like nature and biodiversity preservation. The last thing I would like to see is large tracts of forest making way for data centers. Hopefully good sense will prevail.
Those things won't happen. But these are some of the factors one needs to keep in mind if you bring in the sustainability dimension: how do I keep emissions lower? How do I make sure the impact on water resources is less? One study shows that 40 to 50 inferences translate to half a liter of water.
So how do I make sure natural resources are less impacted? How do I make sure forests and biodiversity are preserved? These are the things one has to think about holistically. And obviously there are other factors that Prasad will know, like proximity to a water supply for cooling the centers. So it's a complex decision when you select a data center location.
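To make the energy and water arithmetic above concrete, here is a minimal sketch in Python. Every figure is an illustrative assumption, except the half-liter-per-40-to-50-inferences figure, which is the study number quoted in the conversation:

```python
# All figures below are illustrative assumptions, not AWS data.
it_energy_kwh = 1_000_000    # assumed monthly energy drawn by IT equipment
pue = 1.2                    # Power Usage Effectiveness: facility energy / IT energy
grid_intensity_kg = 0.4      # assumed kg CO2e per kWh; varies widely by grid

facility_energy_kwh = it_energy_kwh * pue
emissions_tonnes = facility_energy_kwh * grid_intensity_kg / 1000
print(f"Facility draw: {facility_energy_kwh:,.0f} kWh")
print(f"Emissions:     {emissions_tonnes:,.1f} t CO2e")

# Water figure quoted above: roughly 0.5 L per 40-50 inferences.
inferences = 1_000_000
water_litres = inferences / 45 * 0.5   # midpoint of the quoted range
print(f"Water:         ~{water_litres:,.0f} L for {inferences:,} inferences")
```

The same numbers show why location matters: moving the workload to a grid with half the carbon intensity halves the emissions line while leaving the IT work unchanged.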
Hope King: I love the description and how detailed you went into it. I think for all of us, looking at our jobs as members of the press, we want to know where the future is going and what it's going to look like. And from what I'm putting together from what everyone has said so far, I'm thinking more data centers are going to be closer to the poles, where it's cooler, and in more remote areas away from people, so that we're not draining resources from those communities.
And Neil, I don't know if this is just, well, it's probably a personality thing. But I sit there and I say to myself, I could ask ChatGPT, because I've been dabbling with it, to help me with maybe restructuring a sentence that I'm writing. Is it worth taking away water from a community?
Is me asking the query worth all the things that are powering it? These are things that I think about. I'm an avid composter; this is my life, right? What are we ultimately doing? What are we all ultimately talking about when we now say AI is going to be a big part of our lives, and a forever part of our lives,
but then you're hearing David and Sanjay and Prasad talk about everything that is required just to do that one thing, to give me a grammar check?
Neil Thompson: Yeah, so, I mean, it is for sure remarkable, the sort of demand that AI can place on the resources we need. And it's been a real change, right?
You don't think of saying, well, should I run Excel? Am I going to use a bunch of water from a community because I'm using Excel? You don't think about that. But for some of these calculations, there are things that you can do in Excel, and you're like, oh, maybe I've put it on ChatGPT when I should have done it in Excel, or something like that.
So, yeah, you absolutely have this larger appetite for resources that comes in with AI. And the question is, what do you do about that? One of the nice things is, of course, that we don't have to put everything on the data center, right? People are working very hard to build models that are smaller, so that they can live on your phone.
And then you still have the energy of recharging your phone, but it's not so disproportionate to training a model that requires tens of thousands of GPUs running for months. So I think that's one of the things we can be thinking about here: the efficiency gains that we're going to get and how we can get them. That's happening both at the chip level and at the algorithmic level, and I'm happy to go into a lot more detail on that if folks would like.
But I think that's the trade-off we have here: okay, we're going to make these models more efficient. But at the same time, there's this overwhelming trend you see in AI, which is that if you make your models bigger, they become more powerful. And this is the race that all of the big folks, OpenAI, Anthropic, are in, which is scaling up these models.
And what you see is that scaling up does produce remarkable changes. Even the difference between, say, ChatGPT and GPT-4, if you look at performance on things like the LSAT or other tests, shows big, big jumps as they scale up these models. So that's quite exciting, but it does come with all of these resource questions.
And so there's a real tension here.
Prasad Kalyanaraman: Yeah, I would add that one of the things that's important for everyone to realize is that it is so critical to use responsible AI, right? And that is,
Hope King: Me, using that for grammar? Is that responsible use of AI? Well, I mean, I think I should know.
Prasad Kalyanaraman: Responsible AI.
Yeah. What that means is that we have to be careful about how we use these resources, right? Because you asked a question about how much water is consumed when you use Excel or any other such application. This is the reason why, when we looked at it, we said, look, we have a responsibility to the environment, and we were the ones that came back and said we need to get to net zero carbon by 2040 with the Climate Pledge, 10 years ahead of the Paris climate accord.
And then we said we have to get to water positive. I'll give you a couple of anecdotes on this. We will be returning more water to the communities than what we actually take at AWS by 2030. That's quite impressive if you think about it. And that is a capability you can really innovate on if you think about how you do cooling and what you actually need to cool, and so on.
The other day I was reading an article about Dublin, in Ireland, where we actually used heat from a data center to help with community heating, right? District heating is another one. So I think there are lots of opportunities to innovate on this, to get the benefits of AI and at the same time be responsible in how we use it.
Hope King: So, talk more about the water. How exactly is AWS going to return more water than it takes?
Prasad Kalyanaraman: Yeah, so I'll tell you a few things there. One is the technology allowing us to look at leaks that we have in our pipes. A pretty significant amount of water gets leaked and wasted today.
And we've done a lot of research in this area, trying to use some of our models and some of the technology that we've built to look for these leaks when municipalities transfer water. That's one area. Renewable water sources are another. So there's a lot of innovation that has happened already in trying to get to water positive.
We took a pretty bold stand on getting to water positive because we have a high degree of confidence that this research will actually get us there.
Hope King: Neil, is this the first time you're hearing about the water? I mean, this sounds like an incredible development.
Neil Thompson: It is the first time I'm hearing it, so without more details, I'm not sure I can say more about it specifically.
Hope King: But it's almost like, you know, we need AI to solve these big problems. It's a quagmire. I mean, you have to use energy to save energy. It's a paradox.
Neil Thompson: Sure, well, let me say two things here. One is that it is absolutely true that there are a bunch of costs associated with using AI, but there are a bunch of benefits that come as well, right?
And we have this with almost all technologies, right? When we produce concrete roads and things like that, there's a bunch of stuff that goes into that, but it also makes us more efficient, and the like. And in some of the modeling that my lab and others have done, the upside of using AI in the economy, and even more so in research and development to make new discoveries, can have a huge benefit to the economy.
And so the question is: okay, if we can get that benefit, it may come with some cost, but then we need to think carefully about what we're going to do to mitigate the fact that these costs exist, and how we can deal with the forms they come in.
Hope King: All of these things are racing at the same time, right?
So you've got the race to build these models, to use them to get the solutions, but then you've got to build the infrastructure at the same time, and then you've got to find the chips. And, I mean, obviously it's not my job, my job is to ask the questions, but I don't understand how we can measure which of these branches of this infrastructure is actually moving faster. Does one need to move faster than the others in order for everything to follow along?
I don't know if anybody in here...
Prasad Kalyanaraman: Yeah. Look, there's sometimes a misconception that these things started over the last 18 months. That's just not true. Data centers have been there for a long period of time. If I talk about cloud computing, which many of us are very familiar with, a decade back we were very early on that.
So, it's not a change that has to happen overnight, right? It's something that has evolved over a long period of time. And so the investments we've been making for over 15 years now are helping us make the next set of improvements. You know, we say this internally: there's no compression algorithm for experience.
And so you have to have that experience, and you have to have spent time going all the way down to the chip level, the hardware level, the cooling level, the network level. Then you start adding up all of these things, and you start getting really large benefits.
Hope King: So, on that point though, is it about building new or retrofitting old when it comes to... because I've seen reports that suggest that building new is more efficient in another way because, you know, rack space or whatever.
So just really quickly on that, I know Neil wants to jump in.
Prasad Kalyanaraman: It's a combination. It's never one size fits all.
Hope King: Generally speaking, as you look at the development data?
Prasad Kalyanaraman: I would say that there's obviously a lot of new capacity being brought online. But it's also about efficiencies in existing capacity.
Because when we design our data centers, we design them for 20-plus years. But hardware is typically useful for about six to seven years or so; after that you have to refresh. So you have an opportunity to go and refresh older data centers.
Hope King: Yeah, so it's not an over-the-air update. Neil, what were you going to say?
Neil Thompson: So you asked specifically about the race. I think that, to me, is one of the most interesting questions, and something we spend a lot of time on. And it certainly is the case that there's been an enormous escalation. Since 2017, if you look at large language models, there's been something like a 10x increase each year in the amount of compute used to train them.
So that's a giant increase. Compared to that, some of these other increases in efficiency have not kept pace. So at the most cutting edge, the resources required are going up. But it's also the case that there's a general diffusion of the technology going on, right?
Maybe some capacity already exists, but other firms want to be able to use it; that's just spreading out. There, you get to take advantage of the efficiency improvements that are going on. At the chip level, for example, if you look at the floating point operations being done, the improvement is between 40 and 50 percent per year in terms of flops per dollar.
That's pretty rapid, but even that is not that rapid compared to algorithmic efficiency improvements. For those who don't think about it in this way, think of it like this: I have some algorithm that I have to run on the chip, and it's going to use a certain number of operations. The question is, can I design something that achieves the same goal while asking for fewer operations?
It's just a pure efficiency gain. And what we see is that in large language models, efficiency is growing by 2 to 3x every year, which is huge. So if you think about the diffusion side of things, efficiency gains there are very high, and we should feel a little reassured that as that happens, the demands will drop.
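A quick worked comparison makes the tension in these numbers visible. The growth rates are the ones quoted here (10x per year frontier training compute, 40 to 50 percent per year in flops per dollar, 2 to 3x per year algorithmic efficiency); the five-year horizon and the use of midpoints are assumptions for illustration:

```python
years = 5
frontier_compute = 10.0 ** years   # ~10x per year at the training frontier
chip_gain = 1.45 ** years          # 40-50%/year flops per dollar, midpoint
algo_gain = 2.5 ** years           # 2-3x/year algorithmic efficiency, midpoint

print(f"Frontier training compute: x{frontier_compute:,.0f}")
print(f"Chip efficiency:           x{chip_gain:,.1f}")
print(f"Algorithmic efficiency:    x{algo_gain:,.1f}")
print(f"Combined efficiency:       x{chip_gain * algo_gain:,.1f}")
# Frontier demand (x100,000) far outruns combined efficiency (~x600),
# while diffusion of a fixed capability rides the efficiency curve down.
```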
Hope King: Yeah, but then you have to contend with the volume. Because even if you're making each one more efficient, you're still multiplying it by other needs. So what does that offset math look like, net?
Neil Thompson: Yeah, so the question there is exactly how fast AI is growing. And that, it turns out, is a very deep question that people are struggling with.
Unfortunately, many of the early surveys done in this area were what you might call a convenience sample: you called the people you cared about because they were your customers. But that was not representative of the whole country. So a couple of years ago the Census Bureau did some work and found that only about six percent of firms had actually been operationalizing AI.
Not that many. So we know that's going to grow a lot, but exactly how fast, we're not sure. I think what we can say, though, is that we could have a moment where it's happening faster, but as long as these efficiency increases continue over the longer term, and so far we've seen them to be remarkably robust,
that suggests that as that diffusion happens, demand will go down, so long as we don't all move to the biggest cutting-edge model in order to do it.
Hope King: David, I think you probably have some insight into how quickly the chip makers themselves are trying to design more efficient chips to this end.
David Isaacs: Yeah, let me start with the caveat that I'm not a technologist. I am a scientist, but a political scientist. But I'm glad the conversation has turned to innovation and efficiency gains, because some of the questions ten minutes ago were assuming that the technology would remain the same, and that's not the case.
Many people in the room are probably familiar with Moore's Law and the improvements in computing power, the improvements in efficiency, and the reduced costs that have been happening for decades now. That innovation pathway is continuing. It's not necessarily the same in terms of scaling and more transistors on silicon; it's advanced packaging.
It's new designs, new architectures. And then there are the software and algorithm gains and the like. So we believe that will result in efficiency gains and resource savings. There have been a lot of third-party studies. We know that for the semiconductor industry, the technologies we enable have a significant multiplier effect in reducing climate emissions and conserving resources, whether it's in transportation, energy generation, or manufacturing. So we think there's a very substantial net gain from all these technologies. And the other thing I would add is that the formal name of the CHIPS Act is the CHIPS and Science Act, and there are very substantial research investments in it.
Some of those have been appropriated and are getting up and running. Unfortunately, some of the science programs have been, and this is Washington-speak, authorized but not yet appropriated. We need to fund those programs so that this innovation trajectory can continue.
Hope King: Yeah, I mean, just by nature, we're the skeptics in the room, and we're looking at just the present day. The imagination that is required for your jobs is not easily accessible when a lot of us are concerned about the here and now.
So, I appreciate that context, David. Sanjay, I want to talk about what it's going to take in terms of energy, because that is something that will remain the same in terms of needs. What are you seeing in terms of how new data centers are being built, or even the clusters of infrastructure that support these data centers? As more companies are building and looking for land to grow all these AI systems, what is the best solution? What are the actual renewable sources of energy that are easy and at hand right now to be built?
Sanjay Podder: Well, I'm not an expert on that topic, so I won't be able to comment much on it. But I can speak to what you were discussing about efficiencies: for the same energy, you can do a lot more with AI. What I mean by that is, in the way we build AI today, whether it's training or inferencing, there are a number of easy things to do that have a huge impact on the amount of energy you need for that AI. I think there was a reference, for example, to large gen AI models.
They are typically preferred because they probably give more accuracy. But the reality is, in different business scenarios, you don't necessarily have to go to the largest of the models. In fact, most of the LLM providers today give you LLMs in different t-shirt sizes: 7 billion parameters, 80 billion parameters.
One of the intelligent things to do is fit for purpose: you select a model which is good enough for your business use case. You don't necessarily go to the largest model, and the energy need is substantially lower in the process, for example. That, to me, is a very practical lever. The other thing, as I think Prasad referred to: these models, at the end of the day, are deep learning models.
There are techniques like quantizing and pruning by which you can compress the models, so that the smaller models will likely take less energy, for example. And then, of course, there's inferencing. Traditionally, with classical AI, we were always thinking about training. But with generative AI, the game has changed.
Because gen AI is so pervasive and everybody is using it, with millions of queries, the inferencing part takes more energy than training. Now again, simple techniques can be used to lower that, like one-shot inferencing: if you batch your prompts, you come to your answers quicker.
You don't have to go and query the large model again and again. So if you want to design your trip itinerary for San Francisco, instead of asking 15 prompts, you think about how to batch it: this is my purpose, this is why I'm going, this is how many days I'll be there. It is very likely you'll get your itinerary in one shot, and in the process a lot less energy will be used.
So the point here is, and of course data centers too, right, all the chips that we are talking about, the custom silicon, can lower the energy needs. While I may not be able to comment on the best sources of energy, because renewable energy comes in various types, solar, wind, and now people are talking about nuclear fusion and whatnot, I'm seeing that even with the energy we have, we can use it in a very sustainable, very intelligent way, so that you get the business value you're seeking without wasting that energy unnecessarily.
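As a minimal sketch of the quantization lever mentioned here, assuming PyTorch and an arbitrary stand-in network (dynamic quantization is one of several compression techniques; pruning and distillation are others):

```python
import os
import torch
import torch.nn as nn

# Stand-in for a trained network; the layer sizes are arbitrary.
model = nn.Sequential(
    nn.Linear(768, 3072),
    nn.ReLU(),
    nn.Linear(3072, 768),
)

# Dynamic quantization stores Linear weights as int8 and quantizes
# activations on the fly: a smaller model, less memory traffic, and
# typically less energy per inference, at a modest accuracy cost.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

def size_mb(m: nn.Module) -> float:
    # Serialize the weights to measure the checkpoint footprint.
    torch.save(m.state_dict(), "tmp_model.pt")
    size = os.path.getsize("tmp_model.pt") / 1e6
    os.remove("tmp_model.pt")
    return size

print(f"fp32 checkpoint: {size_mb(model):.1f} MB")
print(f"int8 checkpoint: {size_mb(quantized):.1f} MB")
```

The same fit-for-purpose logic applies one level up: picking the 7-billion-parameter t-shirt size over the 80-billion one, when accuracy allows, cuts the compute per query before any compression is applied.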
Hope King: It sounds like you want companies to be more mindful of their own data needs and not overshoot, to be more efficient in how they're engineering what the applications can do for them.
Sanjay Podder: Absolutely right. And on that point: about 70 to 80 percent, maybe more, of data is dark data. And what is dark data? It's data that organizations store in the hope that one day they will need it, and they never do. So you are simply storing a lot of data, and that data also corresponds to energy needs.
Right? So there's a lot of rebound effect, which you mentioned some time back: because the cost of compute went down and storage went down, the programming community became lazy. And what happened as a result is that you're not bringing efficiency into the way you do software engineering.
You have all these virtual machines which you are hardly using; they're hardly utilized, and they all require energy to operate. So there's a lot of housekeeping we can do, and that can lower energy needs in a very big way. And even if there's a lot of renewable energy, IT is not the only claim on it; AI is not the only place where renewable energy should be used.
There are other human endeavors. So we have to change our mindset and be more sustainable, more responsible in the way we do IT today and the way we do AI today.
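A hedged sketch of the dark-data arithmetic: the 70 to 80 percent share is the figure quoted above, while the storage volume and the per-terabyte energy cost are pure assumptions (the real cost varies enormously by storage medium and redundancy):

```python
stored_tb = 500           # assumed total data an organization keeps
dark_fraction = 0.75      # 70-80% quoted above as never read again
kwh_per_tb_year = 5.0     # assumed annual energy to keep 1 TB online

dark_tb = stored_tb * dark_fraction
wasted_kwh = dark_tb * kwh_per_tb_year
print(f"Dark data: {dark_tb:.0f} TB")
print(f"Energy spent keeping it online: ~{wasted_kwh:,.0f} kWh/year")
```

Whatever the exact coefficients, the structure of the calculation is the point: deleting or archiving data that will never be read again is an energy saving that requires no new technology at all.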
Hope King: Did you want to add to that?
Prasad Kalyanaraman: Yeah, I would say that what he said is 100 percent right, which is that, as I said, you kind of have to think through the entire stack for this.
You have to go through every layer of the stack and think about how you optimize it. I'll give you an instance about cooling, and I know you wanted to talk about cooling.
Hope King: Yes, air conditioning for the data centers, and now the liquid cooling that's coming in to try to... exactly. Yes, go ahead.
Prasad Kalyanaraman: So over the years, what we've figured out is that we don't need liquid cooling for the vast majority of compute.
In fact, even today, pretty much all our data centers run on air cooling, and what that means is we use outside air to cool our data centers. And none of our data centers are nearly as cool as this room, by the way, just to be very clear.
Hope King: It's not as cold as this room?
Prasad Kalyanaraman: Not even close.
Hope King: What's the average temperature in a data center?
Prasad Kalyanaraman: It's well above 80 degrees.
Hope King: Really? Yeah. Okay. What's the range?
Prasad Kalyanaraman: It's between 80 and 85. Now, the thing is that you have to be careful about cooling the data centers too much, because you have to worry about relative humidity at that point as well.
And so we've spent a lot of time on computational fluid dynamics, looking at whether we really need to cool the data centers as much. It's one of the reasons why our data centers run primarily on air, and outside air actually. Now, there's a certain point where you cannot do it with just air; that's where liquid cooling comes in.
But liquid cooling comes into effect because some of these AI chips have to be cooled at the microscopic level, and air cannot get there fast enough. Even there, if you think about it, you only need to liquid-cool a particular AI chip. As I said, AI does not require just the chip; you need networks, you need storage, and all that.
Those things are still cooled by air. So my estimate is that even in a data center that has primarily AI chips, only about 60 to 70 percent of it needs to be liquid-cooled. The rest of it is pure air cooling.
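To make the 60-to-70-percent estimate concrete, here is a tiny sketch with an assumed rack power budget; the split between accelerators and everything else is hypothetical:

```python
accelerator_kw = 65.0   # assumed AI-chip load that needs liquid cooling
other_it_kw = 35.0      # assumed network, storage, CPU load; still air-cooled

liquid_share = accelerator_kw / (accelerator_kw + other_it_kw)
print(f"Liquid-cooled share of IT load: {liquid_share:.0%}")  # 65%
```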
Hope King: I think that's pretty fascinating, because it's been discussed more recently that liquid cooling is actually more crucial.
I don't know if you saw Elon Musk tweeting a picture of the fans that he has; he made a pun about how his fans are helping. Anyways, go check it out, it's on X. Thank you for circling back on the cooling. I think that was definitely a big question for energy use, because those systems require a lot of energy.
As we come to the last couple of minutes, I want to turn the conversation forward-looking. In the next two to five years, if I were back here with the four of you sitting on this stage, what would we be talking about? And at that point, where would there still be gaps that haven't been filled since we sat here now? Especially because, as you mentioned, infrastructure is not easily upgradable the way software is, and there will need to be physical investments, whether it's labor, land, or physical resources. So, in two years, what do you think we're going to be talking about?
Prasad Kalyanaraman: I'll start. We're already starting to talk about some of it, so I expect we'll talk more about those things. There's going to be a lot more innovation, and generative AI will spur that as well, in terms of how to think about renewable sources and how to run these things very efficiently.
Because one of the realities is that generative AI is really expensive to run, and so you're not going to spend a lot of money unless you actually get value out of it. So there's going to be a lot of innovation on the size of these models, a lot of innovation on chips, we're already starting to see that, and of course nuclear will be a part of the energy equation.
I think the path from here, at least in our minds, our path to get to net zero carbon by 2040, is going to be very non-linear. It's not going to be one size fits all; it's going to be extremely non-linear. But I expect that there will be a lot more efficiency in running it. It's one of the reasons why we harp so much on efficiency in how we run our infrastructure.
One, it helps us in terms of our costs, which we translate into savings for our customers. But it's also a very responsible thing for us to do.
Hope King: David, I actually want to jump over to you for a second on this, because a lot could happen in the next couple of months when it comes to the administration here in the US. Is there anything you see in the next two years, politically, that could change the direction or the pace of development when it comes to AI: infrastructure building, investments from corporations, maybe even pulling back, pressures from investors to see the ROI on the cost of these systems?
David Isaacs: Yeah, I'm hesitant to engage in speculation on the political landscape, but I think things like the CHIPS Act and US leadership in AI enjoy bipartisan support, and I think that will continue regardless of the outcome of elections and short-term political considerations. Getting to your question about what we'll be talking about a few years down the road: I'm an optimist, so I think we'll be talking about how we have a more resilient supply chain for chips around the world.
I think we'll be enjoying the benefits of some of the research investments that propel innovation. One additional point I'd like to raise real quickly is soft infrastructure and human talent. I think that's an important challenge, and at least in the US, we have a huge skills gap among the workforce.
Whether it's K-through-12 STEM education or retaining the foreign students at our top universities. And that's not just a semiconductor issue; that's all technology and advanced manufacturing throughout the economy. So I think that will be a continuing challenge.
Hope King: Are you seeing governments willing and interested to increase funding in those areas?
David Isaacs: On a limited basis, but I think we have a lot of work to do.
Hope King: What do you mean by limited?
David Isaacs: I think there's a strong interest in this, but I'm not sure governments are willing to step up and invest in the way we need to as a society.
Hope King: Sanjay, two years from now, we're talking again. What's going to be on our minds?
Sanjay Podder: So we have not yet seen what gen AI will do for us. We are still in kindergarten, talking about infrastructure. It's like the internet boom times, right? And we know what happened with that.
Our whole lives changed. With gen AI, enterprises will reinvent themselves, our society will reinvent itself, and all of that will be possible because of a lot of innovation happening at the hardware end as well as the software layer. What we need to keep in mind, however, is how we transform.
None of us in this room knows how the world will look. It will look very different; that's what's certain. But one thing we need to keep in mind is that this transformation should be responsible. It should keep the human at the center of the transformation. We have to keep the environment in view, all three dimensions of ESG, all very important, so that our AI does not disenfranchise communities and people.
I think that is going to be the biggest innovation challenge for us, because I'm very certain, human ingenuity being what it is, that we will build a better world than what we have today. There will be a lot of innovation in all spheres, but along the journey, let's make sure that responsible AI becomes the central aspect, that sustainable and responsible AI become a central theme as we go through this journey.
So I'm as eagerly looking forward as all of us here to how this world will look.
Hope King: Yeah. No digital hoarding. Lastly, Neil, we didn't talk about the last-mile customization problem today, and I don't know if that's something you're looking to see solved in the next two years. What other things?
And you can speak to the last mile too, if you want.
Neil Thompson: Sure. So, for those who don't know, the idea of the last-mile problem in AI is that you can build, say, a large language model and say, this works really well in general for people asking questions on the internet, but that might still not mean it works really well for your company, for some really specific thing.
If you're interacting with your own customers, on your own products, with those specific terms, that system may not work that well, and you have to do some customization. That customization may be easy: you just need a little prompt engineering, or you feed it a little bit of information from your company.
Or it could be more substantial: you could actually have to retrain it. And in those cases, that's going to slow the spread of AI. Because it's going to mean that we get lots of improvement in one area, but then think about all of the different companies that might want to adopt it.
They have to figure out how they can adapt the systems to work for the things they do, and that's going to take time and effort. And so that last mile is going to be really important, because it's very easy to say, I read in the newspaper, or I saw a demonstration, that AI can do amazing things, much more than it could do even three months ago.
And that's absolutely true. But there's also this diffusion process, and that's going to take a lot longer. So I think what we should expect over the next two years is more of a wedge: there are going to be some folks who are leading, for whom the resource questions we've been talking about are incredibly salient, and there are going to be people who are behind, for whom the economics of customization still don't work, and they're going to be in the model they were in ten years ago.
And so that divide, I think, is going to get bigger over the next two years.
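As a minimal sketch of the lightweight end of that customization spectrum, prompt-level grounding rather than retraining; `generate`, the company name, and the policy details are all placeholders, not a real API or real data:

```python
def generate(prompt: str) -> str:
    """Placeholder for any hosted text-generation endpoint."""
    raise NotImplementedError("wire this up to a real model endpoint")

# Hypothetical company-specific context: the cheap alternative to retraining.
COMPANY_CONTEXT = (
    "You are a support assistant for AcmeCo (a made-up company). "
    "Products: Widget Pro, Widget Lite. Refunds are allowed within 30 days."
)

def customized_answer(question: str) -> str:
    # Prepend domain context so a general-purpose model answers in
    # company-specific terms without any fine-tuning.
    return generate(f"{COMPANY_CONTEXT}\n\nCustomer question: {question}")
```

When prompt-level grounding isn't enough, the options escalate to feeding in company documents and then fine-tuning, and the cost of each step is what drives the divide Neil describes.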
Hope King: Alright. David, Sanjay, Neil, Prasad, thank you for joining us. And for all of you, hopefully you found this as informative as I did. Thanks, everyone.
Prasad Kalyanaraman: Thank you.
Sanjay Podder: Hey, everyone. Thanks for listening. Just a reminder to follow CXO Bytes on Spotify, Apple, YouTube, or wherever you get your podcasts. And please do leave a rating and review if you like what we're doing. It helps other people discover the show. And of course, we want more listeners. To find out more about the Green Software Foundation, please visit greensoftware.foundation. Thanks again, and see you in the next episode.
