How do AI and Data work together?
00;00;09;02 - 00;00;34;16 Welcome to the Oracle Academy Tech Chat. This podcast provides educators and students in-depth discussions with thought leaders around computer science, cloud technologies, and software design to help students on their journey to becoming industry-ready technology leaders of the future. Let's get started. Welcome to Oracle Academy Tech Chat, where we discuss how Oracle Academy helps prepare our next generation's workforce.
00;00;34;18 - 00;01;01;19 I'm your host, Tyra Peirce. In this episode, I speak with Oracle MySQL developer advocate Scott Stroz about how AI uses data, and the database skills students need to have as they work with AI. So, Scott, you're a returning guest for me. Can you give me a little bit about your background and role at Oracle, for those who may not have listened to our previous podcasts?
00;01;01;22 - 00;01;24;27 Sure. So, first and foremost, I consider myself a full stack developer and I've been a full stack developer for longer than the term full stack developer has actually been in existence. And in that time, the only technology in my stack that has remained constant is MySQL. I used it on my first job as a web developer, and I still use it today, and I've used it pretty much every day in between.
00;01;25;00 - 00;01;47;09 A lot of people, even people in the tech industry, frequently ask, what does it mean that you're a developer advocate? And there are a lot of people I'm friends with, and that I've met in developer relations, where we each kind of have our own elevator pitch, like a quick, you know, one- or two-sentence answer to basically get people to understand what it is that we do.
00;01;47;11 - 00;02;10;12 And the one that I've come up with is: my job is to help developers be better at their jobs. And I accomplish, or I hope I accomplish, this in different ways, by producing content for developers through blog posts or videos or podcasts such as this. I also speak at conferences, and I do guest lectures for colleges and schools.
00;02;10;15 - 00;02;29;26 You are such a wonderful resource. And we work together quite a bit, and I think it's good, because I think that a lot of times developers don't understand what they're getting into or the different ways that they can develop and change code. And I love that I've got an expert that we can call on, especially about this new topic.
00;02;29;28 - 00;03;00;18 So, to segue into that new topic: how does AI use data to learn? It's kind of a complicated answer, but it's also probably simpler than some people might think. With the recent release of HeatWave GenAI, which is an AI solution in HeatWave that's available on Oracle Cloud, I've started playing around a little bit more than I had been, but it hasn't been with the front end, where I ask an AI chatbot questions and it gives me answers.
00;03;00;21 - 00;03;22;07 It's been more feeding data into an AI engine, which in this case was HeatWave GenAI. And while I was tinkering, I came to the realization that AI is not some black magic that just can pull answers out of thin air when you ask the question. It's just all based on math, and it's not even a new kind of math.
00;03;22;07 - 00;03;41;28 The math has actually been around for a long time, and I gotta say, I was a little disappointed. I was kind of hoping for some black magic. But a GenAI solution needs to be able to find pertinent data, so if you ask a chatbot a question, it needs a way to find the answers.
00;03;41;28 - 00;04;14;16 And the first thing we need to do is make sure that we get what are called embeddings of the source data. This is basically what we use to feed an AI. And if we wanted to, say, get embeddings for a collection of PDFs in a GenAI solution, we can actually use HeatWave GenAI to run a stored procedure that will fetch the documents out of a bucket, a storage bucket in Oracle Cloud, and then it breaks down each document into a series of tokens.
00;04;14;18 - 00;04;43;24 Now, in the AI world, a token is a small set of characters. It can be a single word. It can be multiple words, or sometimes it can actually just be a small part of a longer word. And then, using some mathematical algorithms, these tokens are converted into what we call a vector. And a vector is nothing more than an array of hexadecimal values that represent that token.
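To make the token and vector idea above concrete, here is a minimal toy sketch in Python. It is not how HeatWave GenAI or any real embedding model works: real systems use learned tokenizers and trained models. The names `toy_tokenize` and `toy_vector` are made up for this illustration, the six-character piece length and four-number vector length are arbitrary assumptions, and the vector values are shown as plain floating-point numbers derived from a hash.

```python
import hashlib

def toy_tokenize(text, max_len=6):
    """Split text into lowercase words, then cut long words into pieces.

    Illustrative only: a token may be a whole word or part of a longer word.
    """
    tokens = []
    for word in text.lower().split():
        for start in range(0, len(word), max_len):
            tokens.append(word[start:start + max_len])
    return tokens

def toy_vector(token, dims=4):
    """Turn a token into a fixed-length list of numbers in [0, 1).

    A hash stands in for the trained model a real system would use, so these
    numbers carry no meaning; they only show the shape of the data.
    """
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return [digest[i] / 256 for i in range(dims)]

tokens = toy_tokenize("Embeddings represent documents")
vectors = [toy_vector(token) for token in tokens]

for token, vector in zip(tokens, vectors):
    print(token, vector)
```

Each token ends up paired with one array of numbers, which is the row-per-embedding shape that gets stored.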
00;04;43;27 - 00;05;11;01 And this is our embedding. And those are stored typically in a database. There are other systems. This isn't unique to HeatWave GenAI; other GenAI systems have a very similar process for ingesting the source data. And one of the most popular uses of AI solutions is what's called retrieval-augmented generation, or RAG, or what we would know as an AI chatbot.
00;05;11;07 - 00;05;33;29 So, you open up a chat window and you start asking questions, and the AI spits back answers based off of the source data. The way it gets the answers to the questions is it actually takes the question and, using those same algorithms that were used with the source data, it gets embeddings for the question itself.
00;05;34;01 - 00;06;04;13 And then it uses some built-in database functions. In the case of HeatWave GenAI, it uses a function that determines how far the tokens in the question are from the tokens in the source data. And again, that's some mathematical algorithm that I haven't even begun to try and understand. And with HeatWave GenAI, when you ask the questions, there's actually another stored procedure you can run to actually run like a chatbot.
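The "how far apart" idea can be sketched with a distance function. The transcript does not say which function HeatWave GenAI uses, so cosine distance is an assumption here, chosen only because it is one common way to compare embedding vectors. The three-number vectors are made-up values for illustration.

```python
import math

def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two equal-length vectors.

    Values near 0 mean the vectors point the same way; larger values mean
    they are less alike. Assumes neither vector is all zeros.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

# Hypothetical embeddings: one for a question, two for pieces of source data.
question = [1.0, 0.0, 1.0]
close_chunk = [0.9, 0.1, 1.1]
far_chunk = [0.0, 1.0, 0.0]

print(cosine_distance(question, close_chunk))  # small: likely relevant
print(cosine_distance(question, far_chunk))    # large: likely unrelated
```

The source data whose vectors sit closest to the question's vector is treated as the most relevant match.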
00;06;04;25 - 00;06;41;06 And it will take the embeddings that matched in the source data and send that to the large language model that resides in the database, or LLM, as people more commonly know it. And it takes that information, those embeddings that matched, and it generates a more human-sounding response to the question. I've noticed when I've gone through and I've asked an AI a question, you can still tell it's a computer when they respond, but they are getting better and better and better, and the information coming back is excellent.
00;06;41;09 - 00;07;15;05 And it is really, really good information. They still don't have the human inflection, the human tone. And you can tell when you're right, but it's getting really good. And so, then on to my next question. How can AI use data to improve performance? So, because the entire process, you know, from ingesting the data to actually retrieving relevant data from the embeddings, is based on math, the results can actually be incredibly accurate.
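As a rough sketch of that retrieve-then-generate flow: rank the stored chunks by distance from the question, keep the closest ones, and hand their text to a language model along with the question. Everything here is hypothetical: the chunk texts and vectors are invented, cosine distance is an assumed metric, and `retrieve` and `build_prompt` are illustrative names, not HeatWave GenAI procedures. The sketch stops at building the prompt and does not call any model.

```python
import math

def cosine_distance(a, b):
    """1 minus cosine similarity; smaller means more alike."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def retrieve(question_vec, chunks, k=2):
    """Return the k stored chunks whose vectors are closest to the question."""
    ranked = sorted(chunks, key=lambda c: cosine_distance(question_vec, c["vector"]))
    return ranked[:k]

def build_prompt(question, matches):
    """Combine the matched text with the question for a language model."""
    context = "\n".join("- " + m["text"] for m in matches)
    return "Answer using only this context:\n" + context + "\nQuestion: " + question

# Made-up source data with made-up embedding vectors.
chunks = [
    {"text": "MySQL supports stored procedures.", "vector": [0.9, 0.1, 0.0]},
    {"text": "Embeddings can use a lot of storage.", "vector": [0.1, 0.9, 0.2]},
    {"text": "The office is closed on Fridays.", "vector": [0.0, 0.1, 0.9]},
]

question = "How much space do embeddings need?"
question_vec = [0.2, 1.0, 0.1]  # assumed embedding of the question

matches = retrieve(question_vec, chunks, k=2)
prompt = build_prompt(question, matches)
print(prompt)
```

The language model only sees the text that matched, which is why the quality of the stored source data matters so much to the final answer.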
00;07;15;05 - 00;07;37;29 Like you just said, it gets better and better almost every day, and AI systems can process large amounts of data and make that data available in a way that tends to be more accurate than if you did a database LIKE search in a SQL query, or even a full-text search in a database, or even better than what you get in some search engines.
00;07;38;02 - 00;08;13;02 Why does the quality of data impact the results AI provides? Well, it's funny that even in a new technology like AI, an old adage is still true. And that is garbage in, garbage out. And all this really means is if you input bad or inaccurate data, your results are not going to be accurate. When you feed data into an AI system, you need to make sure that it's high quality and it's accurate, because if it's not, then any results you get are not going to be high quality and they're not going to be accurate.
00;08;13;04 - 00;08;34;24 What are some of the skills that database professionals should have when they work with AI? Something a lot of people may not be aware of is that every AI solution, whether it's a chat bot or something else, it uses some kind of storage, usually a database, on the back end to store the embeddings for the data.
00;08;34;27 - 00;09;04;09 And while there's nothing particularly different or special about how this data is stored compared to other data such as strings and dates, there is one thing database professionals need to keep in mind when ingesting this data.
00;09;04;12 - 00;09;31;18 Embeddings can take up a lot of space. So, for example, I was working on a demo for HeatWave GenAI, and I used a PDF that was just a few megabytes in size, and it generated over 55,000 rows of embeddings. Now, the process of creating and retrieving embeddings can also be resource intensive.
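A quick back-of-the-envelope calculation shows why that row count matters. Only the 55,000 rows figure comes from the example above. The vector length of 384 and the 4 bytes per number are assumptions picked purely for illustration; the actual values depend on the embedding model and storage format, and the estimate ignores the stored text and any index overhead.

```python
ROWS = 55_000          # from the demo described above
DIMENSIONS = 384       # assumed numbers per embedding (model dependent)
BYTES_PER_VALUE = 4    # assumed 32-bit floating-point storage

def embedding_storage_bytes(rows, dimensions, bytes_per_value):
    """Estimate raw vector storage, ignoring text, indexes, and overhead."""
    return rows * dimensions * bytes_per_value

estimate = embedding_storage_bytes(ROWS, DIMENSIONS, BYTES_PER_VALUE)
estimate_mib = estimate / (1024 * 1024)

print(f"{estimate:,} bytes, about {estimate_mib:.1f} MiB")
```

Under these assumptions, the vectors alone take roughly 80 MiB, many times the size of a PDF of a few megabytes.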
00;09;31;21 - 00;09;53;02 So, there might not be special skills needed to store and retrieve the embeddings in the AI data. But we need to make sure that the system has enough resources, so that the CPU and RAM are adequate, as well as having storage space. So those are probably the three things that database administrators or people who work in databases need to concentrate on
00;09;53;02 - 00;10;11;29 more than anything else when working with an AI system. So, Scott, one final question then: if you could give one piece of advice to faculty or students, what would it be? All right, so I'm going to cheat here. I'm going to give two pieces of advice. So, the first piece of advice will be to learn as much as you can about AI.
00;10;12;01 - 00;10;32;22 AI is everywhere, probably even in places it doesn't belong. And while I think the fad is going to die down, I also think AI is here to stay. There won't be a single area in the IT field that won't be influenced by AI in some way. So, the more you know about different AI solutions, the more valuable you're going to be,
00;10;32;22 - 00;11;01;07 not just to your current employer, but to future employers as well. The second piece of advice would be not to fear AI. I talk to a lot of developers, and I often get questions at conferences from younger developers concerned that AI is going to take their jobs. And while AI tools, especially those that are designed to help write code, are impressive, they are far from perfect.
00;11;01;09 - 00;11;22;20 I have a personal project that I work on, and I use AI tools when I'm coding to try to cut down on some of the boilerplate code that I might need to write, and I'll ask for snippets of code given a specific language and the task I need to achieve. And I have never gotten a response where I could just plug the code in, and it worked.
00;11;22;22 - 00;11;43;18 It's never happened. I'd say probably the best I got was about 90%. I'd say probably in the area of about 85% would be the average. So even though the AI chatbot is helping me write the code, I still have to know what the code is supposed to do. I've always needed to tweak it. And so that's a nice adjunct.
00;11;43;19 - 00;12;11;22 It saves me time from, you know, having to write everything, but you still need somebody to look at the code and be like, oh, this is wrong. You know, I told it I wanted it in Node, but for some reason it's giving it to me in Java notation, is one thing that actually happened. And again, when I talk to people and they say they're concerned about AI, that they're afraid they're going to lose their job,
00;12;11;22 - 00;12;41;11 I tell them something that I heard at a keynote talk about AI, and it was about two years ago. And the interesting topic of the keynote was, is AI going to take your job? And the answer to that question, which the presenter gave on the last slide, was: no, as a developer, AI is not going to take your job, but a person using AI will.
00;12;41;13 - 00;13;00;08 I think that's very poignant advice. And so, it's almost like you've got to be a developer. You've got to understand it so you know how to use it, so you can maintain your job. I think that's a really good piece of advice to all those faculty and students out there that are really wondering what the impact of AI is going to be.
00;13;00;21 - 00;13;22;25 A big thank you to Scott for speaking with me today about data and AI. It was really insightful. Please visit academy.oracle.com to learn more about Oracle Academy and the resources we offer to faculty and students and subscribe to our podcast. Thanks for listening. That wraps up this episode. Thanks for listening and stay tuned for the next Oracle Academy Tech Chat podcast.