
Implications of AI and Privacy


In this episode, host Tyra Crockett Peirce speaks with Wei Jiang, an Oracle AI researcher and former university professor, about the privacy impacts of AI.

---------------------------------------------------------

Episode Transcript:

00;00;09;03 - 00;00;35;11 Welcome to the Oracle Academy Tech Chat. This podcast provides educators and students in-depth discussions with thought leaders around computer science, cloud technologies, and software design to help students on their journey to becoming the industry-ready technology leaders of the future. Let's get started. Welcome to Oracle Academy Tech Chat, where we discuss how Oracle Academy helps prepare our next generation's workforce.

00;00;35;13 - 00;01;05;09 I'm your host, Tyra Crockett Peirce. In this episode, I speak with Wei Jiang, an Oracle AI researcher, about the privacy impacts of AI. To start off, can you give me a bit about your background and your role at Oracle? Sure, thank you for inviting me today. Before I joined Oracle in 2022, I had been a computer science faculty member within the University of Missouri System for over 14 years.

00;01;05;12 - 00;01;38;03 My main technical background is privacy-preserving data analytics and applied cryptography. Currently, I am a research scientist at Oracle Labs East, working on privacy-preserving machine learning solutions using secure multi-party computation techniques. We investigate privacy-preserving solutions for both traditional machine learning tasks and fine-tuning language models. As AI is all around us, what are some of the AI impacts on personal privacy?

00;01;38;05 - 00;02;17;27 This is a great question. There are a few factors I can think of. One, from the data management perspective, AI requires a large amount of training data, and the increasing amount of training data makes data protection more and more challenging, especially when data come from multiple sources and contain sensitive information. The second factor I can think of is the memorization capability of AI: as AI systems become more accurate, their memorization capability can potentially leak sensitive information.

00;02;17;29 - 00;02;48;05 For example, it has happened in the past that a large language model's response contained a real Social Security number. Another issue is the fake images and news generated by AI. AI-generated images and text are very difficult to distinguish from real ones. These fake images and news can cause a lot of harm to individuals, damaging their reputations and invading their privacy.

00;02;48;07 - 00;03;18;17 Another point: AI is also enabling the development of smarter malware and email spam that could bypass current malware detection systems. Once a person's computer or mobile device is compromised, the invasion of privacy is a great risk. Honestly, that is terrifying to hear. Given everything that is going on today, it is a really terrifying thing to hear that we just can't tell the difference anymore.

00;03;18;17 - 00;03;52;02 And so, with AI, algorithms require large amounts of data to operate. How is data managed to maintain privacy? The very first thing to do is to minimize the information flow: we can set up proper access control policies to eliminate unnecessary access to the data, share only what is needed, and keep data within their own silos. Secondly, we can apply data anonymization techniques to hide private information.

00;03;52;05 - 00;04;25;26 For example, we can suppress and generalize direct or indirect identifying information found in data, such as Social Security numbers, addresses, personal names, zip codes, and diagnoses. We can also protect the data by adding some random noise. Another tool we can use is encryption. Encryption adds another layer of protection by ensuring that only a party holding the decryption key can access the original data.
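
As a rough illustration of the suppression-and-generalization approach described above, here is a minimal Python sketch; the table and column names are hypothetical, and real anonymization pipelines require far more care than this:

import pandas as pd

# Hypothetical records; column names are illustrative only.
records = pd.DataFrame({
    "name": ["Alice Smith", "Bob Jones"],
    "ssn": ["123-45-6789", "987-65-4321"],
    "zip_code": ["65201", "65203"],
    "age": [34, 41],
    "diagnosis": ["flu", "diabetes"],
})

# Suppression: drop direct identifiers outright.
anonymized = records.drop(columns=["name", "ssn"])

# Generalization: coarsen quasi-identifiers so records blend together.
anonymized["zip_code"] = anonymized["zip_code"].str[:3] + "**"        # 65201 -> 652**
anonymized["age"] = (anonymized["age"] // 10 * 10).astype(str) + "s"  # 34 -> 30s

print(anonymized)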
00;04;25;28 - 00;04;59;23 We also need to use privacy-preserving technology to train a machine learning model. By applying different privacy-preserving technologies, we can prevent the disclosure of the training data, the trained model, user queries, and inference results. AI combines various pieces of data about a person to make inferences, creating risks of privacy invasion. Are these inferences subject to privacy risks, since information can be used in new, unintended ways?

00;04;59;25 - 00;05;32;06 Yes, definitely so. These inferences allow attackers to carry out certain attacks, and they can potentially leak sensitive information. For example, a membership inference attack determines whether a specific piece of information was used to train a machine learning model. This attack can be used to infer sensitive information about any individual whose data might have been used during training, which clearly poses a privacy threat to the individual.

00;05;32;08 - 00;06;13;25 Another kind of attack is called a model inversion attack, which allows the attacker to learn the training data, or some properties of the training data, which could reveal private information. Another attack is called a model extraction attack, which enables the attacker to derive a model similar to a hidden proprietary model by repeatedly querying the hidden model and observing its outputs. With it, an attacker can reproduce a highly similar model at a fraction of the cost of training the original model.

00;06;13;27 - 00;06;43;27 The extracted model can subsequently be used to initiate model inversion and membership inference attacks. What are the commonly used privacy-enhancing technologies for machine learning, and what are their pros and cons? Traditionally, there are a lot of privacy-enhancing technologies, so I will talk about a few of the popular ones these days.

00;06;44;00 - 00;07;11;25 The first one is called synthetic data generation. Basically, synthetic data can be generated based on certain properties of the real data. In general, synthetic data generation is reasonably efficient, and the resulting data are effective for machine learning tasks. However, it is hard to analyze and prove their privacy-preserving properties. Another technique is called k-anonymity.

00;07;11;27 - 00;07;45;18 It is one of the data anonymization techniques. To make a dataset anonymous, the original attribute values are suppressed and generalized such that any data record in the resulting dataset is indistinguishable from at least k minus one other data records. Although this technique offers a formal security guarantee, it may be difficult to determine the right value for k, and it often reduces model accuracy.

00;07;45;20 - 00;08;20;22 Another very popular technique is called differential privacy. Differential privacy adds carefully calibrated noise to the original data, the model parameters, or the query results to prevent inference attacks. The technique offers a formal security guarantee with good efficiency, but it is often challenging to know the right amount of noise to use, and model accuracy can be adversely affected.
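
To make the idea of carefully calibrated noise concrete, here is a minimal sketch of the Laplace mechanism applied to a simple counting query; the data, query, and epsilon values are illustrative assumptions, not a production implementation:

import numpy as np

rng = np.random.default_rng()

def dp_count(values, threshold, epsilon):
    # A count query changes by at most 1 when one record is added or
    # removed, so its sensitivity is 1 and the Laplace noise scale is
    # sensitivity / epsilon.
    true_count = sum(v > threshold for v in values)
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

# Hypothetical query: how many patients are older than 40?
ages = [34, 41, 29, 52, 47, 38]
print(dp_count(ages, threshold=40, epsilon=0.5))  # noisy answer near 3

Smaller epsilon means more noise and stronger privacy at the cost of accuracy, which is exactly the trade-off described above.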
Another popular technique is called federated learning. 00;08;20;24 - 00;08;47;01 In federated learning, data remain in their own silos during training; models are trained locally and then combined to derive a global model, which can be shared among the participating parties. Although federated learning is efficient and effective, local model parameters can still reveal information about the sensitive training data.

00;08;47;04 - 00;09;21;02 The technique I am most familiar with is called secure multi-party computation. That is my main research area. It is a cryptographic technique that allows us to train machine learning models without directly accessing the sensitive training data. It can also keep the trained model, user queries, and inference results private, as well as guarantee model accuracy. However, secure multi-party computation protocols are hard to design and implement. 00;09;21;04 - 00;09;43;19 They are computationally expensive and do not scale well.
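
The "train locally, then combine" step of federated learning can be illustrated with a toy federated-averaging sketch; the linear model and client data below are hypothetical:

import numpy as np

def local_fit(x, y):
    # Ordinary least squares for y = w*x + b on one client's silo.
    A = np.stack([x, np.ones_like(x)], axis=1)
    return np.linalg.lstsq(A, y, rcond=None)[0]

# Hypothetical silos: each party keeps its raw data private and
# shares only the fitted parameters [w, b].
clients = [
    (np.array([1.0, 2.0, 3.0]), np.array([2.1, 4.0, 6.2])),
    (np.array([4.0, 5.0, 6.0]), np.array([7.9, 10.1, 11.8])),
]

local_params = [local_fit(x, y) for x, y in clients]
global_params = np.mean(local_params, axis=0)  # the "combine" step
print(global_params)  # roughly [2.0, 0.0] for y ≈ 2x

As noted above, even these shared parameters can leak information about the underlying training data, which is one motivation for the cryptographic techniques discussed next.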
That's a lot of information about different ways to train, and it's entirely fascinating to think about all of the things that are coming up: everybody is going to need to start learning about how we secure our data more effectively as we enter this new era of artificial intelligence.

00;09;43;21 - 00;10;13;29 My last question: if you could give one piece of advice to faculty or students, what would it be? Based on my working experience in both academia and industry, I believe collaboration is essential. To find the right person to work with and achieve a highly effective collaboration, we need to be experts in our own technical domain and knowledgeable about other technologies and tools. The first quality 00;10;13;29 - 00;10;34;28 allows us to be the go-to person when a problem related to our technical domain comes up in the organization. The second quality enables us to find the go-to person for guidance and help when we need to solve a problem outside of our own expertise. So, I would like to end this question 00;10;34;28 - 00;10;54;27 with this quote: know something about everything and everything about something. And I think that is a really wonderful piece of advice, because teamwork is so essential, especially as we're going into this next phase of technology with AI: making sure that we become an expert on one thing and that we can find people to help us.

00;10;54;27 - 00;11;15;06 I think being able to work effectively with others is always a really important skill to have. So, thank you, Wei, for speaking with me today about AI privacy; it was very insightful. Please visit academy.oracle.com to learn more about Oracle Academy and the resources we offer to faculty and students. And please subscribe to our podcast. 00;11;15;11 - 00;11;22;25 Thanks, Wei. That wraps up this episode. Thanks for listening and stay tuned for the next Oracle Academy Tech Chat podcast.