USA: AI vendor management - human programming for machine learning

Machine learning and artificial intelligence (AI) have permeated the supply chain. The reasons are apparent. Low cost and efficiency are an easy sell in today's economy, with rampant inflation in the supply chain and tight labor markets.

Yet the economic motivation for AI must be tempered by human (or human-programmed) review of AI systems. Rules are necessary to ensure that the fundamental privacy and moral rights of individuals are protected. From data input to disaster recovery, AI vendor management protects both businesses and broader society.

In this Insight article, Lily Li, Founder of Metaverse Law, discusses data minimization for AI vendors, algorithmic bias and disgorgement, considerations for AI terms and conditions, and business continuity and disaster recovery considerations for AI.

Data minimization for AI vendors

In some ways, AI vendors are similar to any other vendor. If you are uncertain about an AI vendor's commitment to data confidentiality and security, conduct due diligence and impose contractual limitations on data use. Alternatively, limit the information you provide to the vendor to non-sensitive, low-risk data.

While many organizations have already incorporated data minimization at the training and policy level, relying solely on this approach has its flaws. It requires employees to understand company systems and their sensitivity, and on a case-by-case basis, refrain from inputting sensitive data into AI platforms or connecting sensitive company systems to AI plugins and application programming interfaces (APIs). This is like engaging in phishing training for employees but failing to automatically filter out spam or other harmful messages from company email accounts.

The better approach is to combine training and policy with data minimization by design. Even better, leverage AI techniques to reduce the sensitivity of data. For example, an organization's toolkit can include:

  • deploying data loss prevention (DLP) tools to prevent the copying of personal data or sensitive data from production environments;
  • limiting access to AI accounts to managed corporate or business accounts, or APIs subject to data processing terms;
  • restricting AI vendor access to development, test, or QA environments;
  • engaging AI vendors that will create on-premises or private cloud servers to run algorithms on company environments; and
  • partnering with AI vendors that mask or de-identify data prior to submitting it to third-party AI platforms.
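
As a minimal sketch of the last item, masking data before it reaches a third-party AI platform, the Python example below replaces a few common identifiers with placeholders. The patterns, the `mask_sensitive` function name, and the sample prompt are hypothetical and chosen only for illustration. Simple patterns like these miss names, addresses, and many other identifiers, so a production control would more likely rely on a vetted DLP or de-identification tool.

```python
import re

# Hypothetical, illustrative patterns only. They cover a few common
# US-style identifiers and will miss names, addresses, and other data.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def mask_sensitive(text: str) -> str:
    """Replace each pattern match with a placeholder such as [EMAIL]."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


# Example: mask a prompt before it is sent to an external AI service.
prompt = "Contact Jane at jane.doe@example.com or 555-123-4567 about SSN 123-45-6789."
masked = mask_sensitive(prompt)
print(masked)
```

Note that the name "Jane" passes through unchanged, which illustrates why pattern-based masking is best treated as one layer alongside training, policy, and contractual controls.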

Algorithmic assessments and bias

In other respects, AI vendors can be quite different, especially when dealing with generative AI or machine learning models that:

  • train on extensive personal data; and
  • produce unpredictable outputs due to this training.

In such cases, it becomes crucial to assess the sources of training data, including their inherent inaccuracies or biases, and to test the model's output. Flawed training data, whether drawn from a few publicly available datasets or the software developer's personal contacts, can lead to poor or biased results. Even with high-quality and unbiased data, training an unpredictable or generative model may still produce bad or biased results.

While such errors may be amusing to some, they can be embarrassing and risky for businesses that rely on accurate content for financial transactions, healthcare analysis, or customer service. Worse, flawed data can lead to racial profiling and wrongful arrests if employed for law enforcement or national security purposes.

In addition, if AI systems are trained on personal data without appropriate privacy disclosures, AI systems and vendors may face disgorgement and algorithm deletion. For example, the Federal Trade Commission (FTC) reached a settlement with a photo storage service over allegations that it deceived consumers about its use of facial recognition technology. As part of the settlement order, the FTC required the company to delete all facial recognition models or algorithms developed using users' photos or videos.

More recently, the FTC required algorithmic destruction in a case against an international weight management and wellness company and a subsidiary. FTC Chair Lina Khan stated that the companies marketed weight management services for use by children as young as eight and then illegally collected their personal and sensitive health information. According to Khan, the FTC's order requires the companies to delete the data they acquired, destroy any algorithms derived from it, and pay penalties for their violations.

Consequently, before engaging an AI vendor for critical operations, it is important to assess the following components of the AI system:

  • training data source: determine the source of the vendor's training data. Is it purely de-identified or aggregated data and, if not, what privacy disclosures or legal bases support the processing of such data?
  • bias testing: if the AI system may affect individual rights and liberties, have the training data and/or output been tested for bias?
  • data accuracy controls: identify the controls or testing methods in place to ensure the accuracy of data output from the system. Does the system identify inaccurate input prior to generating output?
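
To make the bias testing item more concrete, the Python sketch below compares the rate of favorable outcomes across groups in a vendor's test output. It is an illustration under stated assumptions, not a complete fairness audit: the function names, the group labels, and the sample records are hypothetical, and the ratio of lowest to highest selection rate is only one of several possible screening measures. The 0.8 threshold mentioned in the comments is the "four-fifths" rule of thumb used in US employment selection guidance, not a requirement stated in this article.

```python
from collections import defaultdict


def selection_rates(records):
    """Return the share of favorable outcomes per group.

    records: iterable of (group, favorable) pairs, where favorable is a bool.
    """
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for group, outcome in records:
        totals[group] += 1
        if outcome:
            favorable[group] += 1
    return {group: favorable[group] / totals[group] for group in totals}


def disparate_impact_ratio(rates):
    """Ratio of the lowest group selection rate to the highest."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no group received a favorable outcome, so rates are equal
    return min(rates.values()) / highest


# Hypothetical test output: group A is approved 8 of 10 times, group B 5 of 10.
sample = (
    [("A", True)] * 8 + [("A", False)] * 2
    + [("B", True)] * 5 + [("B", False)] * 5
)
rates = selection_rates(sample)
ratio = disparate_impact_ratio(rates)

# A ratio below 0.8 is a common rule-of-thumb signal for closer review.
print(rates, ratio, "review" if ratio < 0.8 else "ok")
```

A check like this can be requested from the vendor or run by the customer on sample output; a low ratio does not prove unlawful bias, but it indicates where further questions about training data and model design are warranted.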

Like cybersecurity vendor management, AI vendor management should be risk-based and depend on the AI vendor's criticality to your operations and the sensitivity of the systems and data that they access. Furthermore, different models carry different risks. While many systems are marketed as AI systems, some primarily process large volumes of data quickly in response to inflexible rules. These models may not carry as many risks related to transformative or 'surprise' output but can still pose significant privacy and cybersecurity risks.

Flowing through terms

While much of the discussion on AI vendor risk management focuses on risks stemming from the AI vendor itself, the controller or company using the AI vendor also shares responsibility. Many AI vendors implement privacy and security controls for their systems, but they have limited control over the data input by their customers. Therefore, AI vendors generally impose acceptable use policies and other terms that prohibit special categories of personal data, illegal content, and copyrighted material on their platforms.

Working with an AI vendor is a two-way street. It is as important to understand the AI vendor's terms and communicate them to your employees and users as it is to convey privacy and security obligations to the AI vendor. As with other platform technologies, AI vendors generally can terminate and ban user accounts at their discretion and with limited liability. Companies that rely on AI vendors may experience business disruptions and customer losses if they lose access to the AI platform.

Business continuity/disaster recovery and the AI vendor

As supply chains become more and more dependent on AI and machine learning, a critical question arises: How resilient are these systems to external shocks? While these tools can be incredibly useful in helping organizations draft business continuity or disaster recovery plans, what if an AI vendor loses access to its algorithms, either due to regulatory action or failure to maintain backups?

In the context of business continuity or disaster recovery, AI vendor considerations need to be adjusted for risk. For off-the-shelf products, there are usually alternatives, making vendor swapping less concerning. However, for mission-critical AI vendors, companies may need to revisit concepts like software escrow and extend them to AI training data and architecture. Alternatively, they may need to impose more significant disaster recovery and backup obligations on their AI vendor partners.


AI systems, including generative AI models, operate based on human design. As far as we know, they do not possess independent thought. This is a positive feature, not something to overcome. It allows us to establish rules for their operation and propagate our own human programming into our AI ecosystem.

Lily Li, Founder
Metaverse Law, Orange County/Los Angeles