One of the biggest IT trends of recent years is the ability to run applications and their data anywhere across IT infrastructures, from private data centers to public cloud services. It also means running them at the edge, where data is captured and generated. According to Steve McDowell, chief analyst and founder at NAND Research, that trend has paved the way for new AI capabilities.
In a Tech Barometer podcast segment, McDowell shares insights from his report Taming the AI-Enabled Edge with HCI-Based Cloud Architectures, commissioned by hybrid multicloud software company Nutanix. The report explores the impact of extending IT resources to the edge and the driving force of AI, particularly in areas like image recognition for retail, manufacturing and other industries.
“The reason we push AI to the edge is because that's where the data is,” McDowell said.
He described what it takes to run and manage AI applications at the edge. Edge computing architectures range in size from a single smart device or small set of decentralized servers to a microcosm of a full-blown data center. Edge infrastructure often connects with data centers or centralized cloud resources, moving processing to the appropriate location based on the operation being performed and its latency requirements.
He said the AI edge differs from traditional edge deployments in its need for more compute cycles and data management, as well as in software lifecycle maintenance and security requirements. Leaning on hyperconverged infrastructure, he said, can help.
“Traditional edge computing involves things like point-of-sale systems in retail,” McDowell explained. “Once we start putting AI in, then suddenly we have processing requirements that can require AI accelerators.”
The need for GPUs at the edge becomes mandatory “when I start doing things like generative AI,” he said. This involves automating the creation of text, images, videos, or other data using machine learning-trained large language models (LLMs), often in response to prompts.
“Ten years ago, the edge was largely about embedded systems or compute systems that we treated as embedded, which means [a software configuration that’s] fairly locked down and doesn't get updated very often,” McDowell observed. “AI, on the other hand, creates more of a living workflow” that requires regular attention.
Transcription:
Steve McDowell: The reason we push AI to the edge is because that's where the data is, you know, we want to do the processing close to where the data is so that we don't have latency. And in a lot of environments, if we're ever disconnected, it's going to shut down my business.
Jason Lopez: The question is, how do you deploy edge resources in real time? In this Tech Barometer podcast, Steve McDowell, chief analyst at NAND Research, talks about his paper "Taming the AI-Enabled Edge with HCI-Based Cloud Architectures." I'm Jason Lopez. Our aim in the next several minutes is to discuss how AI impacts edge computing.
Steve McDowell: We've always defined edge as any resources that live outside the confines of your data center. And there's some definitions that say the extension of data center resources to a location where there are no data center personnel. It's remote.
Jason Lopez: But AI, of course, adds complexity. One example McDowell cites is automated train car switching. The sides of train cars have bar codes which are scanned, and a local stack of servers processes where the cars are and where they need to be.
Steve McDowell: I can do this in real time. I can partition my workloads so that, you know, computationally expensive stuff or maybe batch stuff can still live in the core. And I don't have to do that at the edge all the time. So I can really fine tune what it is I'm deploying and managing.
Jason Lopez: This is important when you consider that AI at the edge differs from traditional edge deployments primarily due to its need for greater computational power.
Steve McDowell: Once we start putting AI in, then suddenly we have to have the ability to process that AI, which often means the use of GPUs or other kinds of AI accelerators. Ten years ago, if we talked about edge, we're talking largely about embedded systems or compute systems that we treat as embedded. Embedded is a special word in IT. It means it's fairly locked down. It doesn't get updated very often. When we look at things like AI, on the other hand, that's a very living workflow. If I'm doing image processing for manufacturing, for example, for quality assurance, I want to update those models continuously to make sure I've got the latest and the greatest.
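The "living workflow" McDowell describes can be pictured with a small sketch. The Python snippet below is purely illustrative; the registry URL, its endpoints and the polling interval are assumptions, not any vendor's actual API. An edge node periodically asks a central model registry which version is current and swaps in the newer model when one appears.

```python
import time
import urllib.request

# Hypothetical central model registry; in practice this would be whatever
# registry or object store the organization already runs in its core or cloud.
REGISTRY_URL = "https://models.example.internal/qa-vision"

current_version = None
current_model = None

def fetch_latest_version() -> str:
    # Ask the registry which model version is current.
    with urllib.request.urlopen(f"{REGISTRY_URL}/latest-version") as resp:
        return resp.read().decode().strip()

def download_model(version: str) -> bytes:
    # Pull the model artifact for that version down to the edge node.
    with urllib.request.urlopen(f"{REGISTRY_URL}/artifacts/{version}") as resp:
        return resp.read()

while True:
    latest = fetch_latest_version()
    if latest != current_version:
        # Swap in the newer model so the edge keeps "the latest and the greatest."
        current_model = download_model(latest)
        current_version = latest
        print(f"Edge node now serving model version {latest}")
    time.sleep(300)  # check again in five minutes
```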
Jason Lopez: And along with managing fleets of hardware and software in AI deployments at the edge, there's also the issue of security.
Steve McDowell: By treating edge systems as connected and part of my infrastructure, and not treating them, as we historically have, as kind of embedded systems, if you will, it also allows me to, in real time, manage patches, look at vulnerabilities, surface alerts back up to my security operations center, my SOC. It makes the edge look like it's part of my data center.
Jason Lopez: Tools like Nutanix allow for this approach, applying a consistent management practice across both core and edge environments. This involves deciding what tasks to perform at the edge versus the core due to constraints like cost, security, and physical space.
Steve McDowell: A key part of the conversation becomes what lives where? And that's not a tool problem, right? That's kind of a system architecture problem. But once you start partitioning your workloads and say, this certain kind of AI really needs to be done in the core, Nutanix gives me that ability and cloud native technologies give me that ability to say, well, I'll just put this kind of inference in the cloud and I'll keep this part local.
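To make the "what lives where" decision concrete, here is a hedged Python sketch of a simple request router. The endpoints, latency threshold and request fields are hypothetical placeholders; as McDowell notes, the real partitioning is an architecture decision rather than a tool, and code like this only encodes the outcome of that decision.

```python
from dataclasses import dataclass

# Hypothetical endpoints: a small model served on the edge node itself and a
# larger model served from the core data center or cloud.
EDGE_ENDPOINT = "http://localhost:8000/infer"
CORE_ENDPOINT = "https://inference.core.example.com/infer"

@dataclass
class InferenceRequest:
    payload: bytes
    max_latency_ms: int      # how quickly an answer is needed
    needs_large_model: bool  # e.g. generative AI rather than simple classification

def route(request: InferenceRequest) -> str:
    """Pick where this request should run: edge or core. Thresholds are illustrative."""
    if request.needs_large_model:
        return CORE_ENDPOINT   # heavy generative workloads stay in the core
    if request.max_latency_ms < 100:
        return EDGE_ENDPOINT   # tight latency budgets stay local to the data
    return CORE_ENDPOINT       # batch or non-urgent work can travel

# Example: a camera frame that must be classified within 50 ms runs locally.
print(route(InferenceRequest(payload=b"...", max_latency_ms=50, needs_large_model=False)))
```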
Jason Lopez: McDowell's thinking springs from the flexibility afforded by hyperconverged infrastructure. In his view, AI at the edge is part of the whole architecture of storage, network and compute.
Steve McDowell: That can be as disaggregated as it needs to be. So if I need a whole lot of compute in the cloud, I can do that and then put the little bit at the edge and I can manage all of that through that single pane of glass, very, very powerful.
Jason Lopez: Treating edge computing as a part of the data center becomes so interesting because of how the data center itself is being transformed by AI and machine learning.
Steve McDowell: Once we abstract the workload away from the hardware, I've broken a key dependency. I don't have to physically touch a machine to manage it, to update it, to do whatever.
Jason Lopez: The point McDowell makes is how management, not just of the configuration of a node, but across a fleet, is simplified. It enhances efficiency and scalability.
Steve McDowell: We're taking technology that evolved to solve problems in cloud, but they apply equally to the edge, I think. It turns out, it's a fantastic way to manage edge.
Jason Lopez: AI at the edge is increasingly adopting cloud-native technologies like virtualization and containers. The shift is to container-based deployments for AI models, sharing GPUs and managing them remotely.
Steve McDowell: If you look at how, you know, NVIDIA, for example, suggests pushing out models and managing workloads on GPUs, it's very container-driven.
Jason Lopez: And McDowell explains why this simplifies edge management.
Steve McDowell: A GPU in a training environment is a very expensive piece of hardware. And giving users bare metal access to that, you know, requires managing that as a separate box. Using cloud-native technologies, I can now share that GPU among multiple users, very, very simply. That same flexibility now allows me to manage GPUs at the edge with the level of abstraction that works. So I can sit in my data center, push a button and manage that box without actually worrying about what that box looks like necessarily. So I don't need that expertise kind of onsite, right? Which is a key enabler for edge. If you have to have trained IT specialists wherever you're deploying, that doesn't scale. And edge is all about scalability.
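The container-driven pattern McDowell references can be sketched with the standard Kubernetes Python client, one common cloud-native way to push a GPU-backed workload to a remote cluster. The cluster, namespace and container image below are hypothetical; this is a generic illustration of the approach, not Nutanix's or NVIDIA's actual tooling.

```python
from kubernetes import client, config

# Assumes kubeconfig access to the remote edge cluster; all names are illustrative.
config.load_kube_config()

container = client.V1Container(
    name="vision-inference",
    image="registry.example.com/vision-inference:1.4",  # hypothetical image
    resources=client.V1ResourceRequirements(
        # Request a single GPU via the NVIDIA device plugin resource name.
        limits={"nvidia.com/gpu": "1"},
    ),
)

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="edge-vision-inference"),
    spec=client.V1DeploymentSpec(
        replicas=1,
        selector=client.V1LabelSelector(match_labels={"app": "edge-vision-inference"}),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels={"app": "edge-vision-inference"}),
            spec=client.V1PodSpec(containers=[container]),
        ),
    ),
)

# Push the workload to the remote edge cluster without touching the box itself.
client.AppsV1Api().create_namespaced_deployment(namespace="edge", body=deployment)
```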
Jason Lopez: GPUs are typically what power AI, but they are not commonly found at the edge. Inference, though, is a facet of AI that many technologists see value in at the edge. GPUs would be the right fit if generative AI is needed at the edge, but what's needed now are inference engines, especially around vision and natural language processing.
Steve McDowell: Take, for example, a retail environment where they have intelligent cameras that are positioned all up and down the aisles of the grocery store. And the only job that these cameras have is to monitor the inventory on the shelf across from the camera. And when they've sold out of Chex Mix and there's a gap there, it sends an alert, come restock. I mean, it's very kind of data intensive and you don't want to send that to the cloud necessarily.
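A minimal sketch of that shelf-monitoring loop might look like the following. The alert endpoint, camera identifiers and detection function are stand-ins; the point is that frames are analyzed locally and only a small alert travels upstream.

```python
import json
import time
import urllib.request

ALERT_URL = "https://store-ops.example.com/restock-alerts"  # hypothetical central endpoint

def capture_frame(camera_id: str) -> bytes:
    # Stand-in for reading a frame from the aisle camera.
    return b"<jpeg bytes>"

def shelf_has_gap(frame: bytes) -> bool:
    # Stand-in for the local vision model; the frame never leaves the store.
    return False

def send_restock_alert(camera_id: str) -> None:
    # Only this small alert travels to the core, not the video itself.
    body = json.dumps({"camera": camera_id, "event": "shelf_gap"}).encode()
    req = urllib.request.Request(
        ALERT_URL, data=body, headers={"Content-Type": "application/json"}
    )
    urllib.request.urlopen(req)

while True:
    frame = capture_frame("aisle-7-cam-3")
    if shelf_has_gap(frame):
        send_restock_alert("aisle-7-cam-3")
    time.sleep(5)  # check the shelf every few seconds
```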
Jason Lopez: Technology is moving toward managing infrastructure environments seamlessly, such as edge, data centers, and cloud, without changing tools or management models.
Steve McDowell: Nutanix has capabilities for managing AI in your workflow, kind of period, full stop. A good example of this is GPT-in-a-Box, where it's a technology stack and I plug a GPU in and I can do natural language processing. If I want to push that out to the edge, I don't have to change my tools. I mean, the beautiful thing, and the reason that we use tools like Nutanix, is that it gives me kind of a consistent control plane across my infrastructure. Now, infrastructure used to mean data center, and then it meant data center and cloud. And now with edge, it means data center and cloud and edge. The power of Nutanix, though, is it allows me to extend outside of my traditional kind of infrastructure into the edge without changing my management models. So, as AI goes to the edge, I think the things that already make Nutanix great for AI in the data center are equally applicable at the edge.
Jason Lopez: Steve McDowell is founder and chief analyst at NAND Research. This is the Tech Barometer podcast, I’m Jason Lopez. Tech Barometer is a production of The Forecast, where you can find more articles, podcasts and video on tech and the people behind the innovations. It’s at theforecastbynutanix dot com.
Jason Lopez is executive producer of Tech Barometer, the podcast outlet for The Forecast. He’s the founder of Connected Social Media. Previously, he was executive producer at PodTech and a reporter at NPR.
Ken Kaplan contributed to this podcast.
© 2024 Nutanix, Inc. All rights reserved. For additional information and important legal disclaimers, please go here.