USA: Making the grade - privacy considerations for AI in the education sector

In this Insight article, Zach Lerner and Hannah Schaller, from ZwillGen PLLC, analyze the privacy challenges confronting artificial intelligence (AI) developers in US education, navigating compliance nuances with laws and state privacy regulations to ensure responsible AI use.

Introduction

The classroom has long existed to cultivate human intelligence, but today, it is also a testing ground for AI. Applications of AI in education range from personalized learning and lesson creation to web traffic monitoring and cheating detection. One feature that distinguishes these emerging technologies is their ability to 'learn' and improve as they are used due to a type of AI known as machine learning (ML).1 The algorithms and neural networks comprising an ML model adapt as they are exposed to data over time, refining both the model's analysis of data inputs and its conclusions or responses in relation to such inputs. For example, an AI-powered teaching assistant that is designed to deliver personalized content may start to tailor its answers to a specific student after interacting with them for long enough, because it is learning about the individual learning needs and preferences of that student.

Developing these tools typically requires a great deal of data on which the tool is 'trained' to function as desired, and these tools continue to train as they are used. For example, a plagiarism detection tool may be trained to differentiate between plagiarized and non-plagiarized text by analyzing thousands of examples of student writing. However, it is not static; when a college admissions officer submits applicants' essays to check for plagiarism, the tool continues to 'learn' from analyzing these writing samples.

Both the training dataset and data subsequently processed by an AI-powered tool may contain personal data about students or other data subjects. This potentially triggers a variety of US privacy laws, including the Children's Online Privacy Protection Act (COPPA); student privacy laws such as the Family Educational Rights and Privacy Act (FERPA), the Protection of Pupil Rights Amendment of 1978 (PPRA), and state student privacy laws; and state consumer privacy laws such as the California Consumer Privacy Act (CCPA) and the Colorado Privacy Act (CPA). Similar laws outside the US likely also apply if a tool is available there. For example, if the tool is offered to EU and UK individuals, the EU General Data Protection Regulation (GDPR) and the UK Data Protection Act 2018 (the Data Protection Act) may apply, but those laws are outside the scope of this article.

AI developers creating products for the US education sector should be aware of the privacy laws that may impact the training and deployment of their products, whether they are likely subject to these laws, and their key obligations thereunder. In particular, non-US companies should take stock of these laws before launching their products in US markets, as some of the requirements (such as 'sale' and 'sharing' opt-outs) may be unfamiliar to companies focused on the GDPR and laws modeled after it. This article provides an overview of these laws and non-exhaustive examples of issues AI developers should consider in connection with them.

COPPA

COPPA is a federal law that applies to operators of online services that collect personal information from children under the age of 13, including services directed to children and those that are directed to a general audience but have actual knowledge that they collect data about children. For example, the developer of a general-use AI product would have actual knowledge that it collects personal information from children if the use of the product requires information such as full name, username, and audio/video content, and:

  • the developer contracts with an elementary school to provide its product for use by students; or
  • the developer becomes aware that the product is being used by students under the age of 13, for instance, because users enter birthdates or other information indicating that they are under 13 or in elementary school.
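
The two triggers above can be expressed as a simple screening check. The following is a minimal sketch rather than a legal test: the function and parameter names are hypothetical, and it assumes the developer already records whether it contracts with an elementary school and any birthdates that users enter.

```python
from datetime import date

COPPA_AGE = 13  # COPPA covers children under the age of 13


def age_on(birthdate: date, today: date) -> int:
    """Return the person's age in whole years as of `today`."""
    had_birthday = (today.month, today.day) >= (birthdate.month, birthdate.day)
    return today.year - birthdate.year - (0 if had_birthday else 1)


def has_actual_knowledge_signal(
    contracts_with_elementary_school: bool,
    user_birthdates: list[date],
    today: date,
) -> bool:
    """Flag the two example triggers described in the text (hypothetical helper).

    Trigger 1: the developer contracts with an elementary school.
    Trigger 2: at least one user-entered birthdate indicates a child under 13.
    """
    if contracts_with_elementary_school:
        return True
    return any(age_on(b, today) < COPPA_AGE for b in user_birthdates)
```

A positive result is only a prompt for legal review. Other signals, such as a user stating that they are in elementary school, would need separate handling.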

AI developers in the education space should evaluate whether their products are directed to or intended to be used by children. As it relates to the use of AI, in-scope operators should keep in mind the following core COPPA requirements:

  • notice: Drafting a COPPA-compliant notice of the operator's information handling practices, not only providing detail about how data is collected and used generally, but with a specific eye towards transparency regarding the training of the AI model. For example, operators could consider explaining whether data is de-identified before it is used for training;
  • consent: Obtaining verifiable parental consent (or a permissible proxy therefor) to the operator's practices and restricting its use to the consented-to purposes; and
  • retention: Retaining children's personal information only as long as is reasonably necessary for the purpose for which it was collected, which may require reassessing the ways in which data is stored to train the AI model. For example, rather than retaining data in identifiable form, operators may consider retaining data in de-identified form if possible.
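
As an illustration of the retention point, the sketch below strips identifying fields from records once a retention window has passed. The field names, the `collected_on` key, and the retention window are all assumptions made for the example. Removing direct identifiers is only a first step, and whether the result counts as de-identified under a given law is a separate legal question.

```python
from datetime import date, timedelta

# Hypothetical names of fields that directly identify a child.
IDENTIFYING_FIELDS = {"full_name", "username", "email", "recording_url"}


def apply_retention(records: list[dict], today: date, retention_days: int) -> list[dict]:
    """Return copies of `records`, dropping identifying fields from old ones.

    Each record is assumed to carry a `collected_on` date. Records collected
    before the cutoff keep only their non-identifying fields.
    """
    cutoff = today - timedelta(days=retention_days)
    result = []
    for record in records:
        if record["collected_on"] < cutoff:
            record = {k: v for k, v in record.items() if k not in IDENTIFYING_FIELDS}
        result.append(record)
    return result
```

Free-text fields such as essays can themselves contain identifying details, so a field-level filter like this one would not be sufficient on its own.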

Federal student privacy laws: FERPA and PPRA

FERPA

FERPA is a federal law that governs educational entities' disclosure of student records to third-party entities, such as edtech vendors. AI developers and other private operators may obtain adequately de-identified personal information from education records without restriction. However, if an AI developer obtains personal information from student records, the AI developer must fit within an exception, such as the 'school official' exception (i.e., the developer is performing a task that the school would otherwise perform on its own) or 'directory information' exception, or the educational entity must obtain written consent to the disclosure and place contractual limits on how the AI developer may use the personal information.

An AI developer contracting with a school should confirm whether the school will provide it with personal information from students' education records. If so, the developer may need to assess whether the school has obtained appropriate consent or whether the disclosure is covered by a qualifying exception.

AI developers should also be aware of relevant restrictions on their use of personal information from student education records (such as restrictions on the developer's use and disclosure of the information) and put processes in place to comply with such restrictions. For instance, the consent obtained by the school to the disclosure may be for a narrow set of purposes. Similarly, for developers relying on the 'school official' exception, the developer should ensure that its use of student records to train AI models does not fall outside of the permitted purposes, which must conform to the service or function that the developer is tasked with performing.

PPRA

The PPRA establishes rights for students and parents in relation to the collection of students' information through surveys, analyses, evaluations, and similar tools. Under the PPRA, consent from a parent (or the student if over 18) is required for students to participate in a survey, etc., that reveals their political affiliation, religious practices, psychological problems, sexual behavior, critical appraisals of other individuals, and certain other information that is similarly sensitive.

If conducting surveys or evaluations in a school setting that may be covered by the PPRA, developers of AI tools should evaluate their obligations under the PPRA and, if contracting with a school to provide such tools, ensure that they have sufficient contractual protections, including a clear division of responsibility for obtaining and documenting any required consent.

State student privacy laws

Most states have enacted privacy laws that limit the collection, use, and disclosure of students' personal information. The great majority of these laws apply to operators of websites, online services, or online or mobile applications if they have actual knowledge that their services are used primarily for school purposes, and/or if the services were designed and marketed for K-12 school purposes. In other words, these laws generally do not apply to general-use websites, etc., that happen to be used by students, teachers, or schools. For example, the developer of an all-purpose AI-powered chatbot would not be subject to these laws if a teacher used the chatbot with their students.

Student privacy laws place a range of obligations on operators subject to them. In general, the burden is on schools to impose these obligations via contract on operators. Most state student privacy laws require operators to:

  • not sell student personal information;
  • not use student personal information to create a profile or for targeted advertising;
  • limit how and why student personal information may be disclosed;
  • provide prominent notice of material changes to privacy policies; and
  • implement appropriate security measures.

Developers of AI products that are designed and marketed for K-12 school purposes, or that have actual knowledge that this is the primary use case, should evaluate what obligations they have under state student privacy laws, and position themselves to comply with such obligations as imposed by schools with which the developers contract. If contracts limit the developer's use of student personal information in ways that are necessary to provide or maintain the product (e.g., to train the model), the developer may want to discuss with schools whether this is required by the relevant law and provide any mitigating considerations (e.g., if the developer only uses de-identified information to train its model, or deletes personal information immediately after it is used for training purposes).

State consumer privacy laws

Scope

A number of US states have passed consumer privacy laws. Five such laws (the CCPA, the CPA, and laws in Connecticut, Utah, and Virginia) are in effect as of 2023, with others taking effect over the next several years. These laws typically apply to companies that do business in a state or produce goods or services targeted at consumers in the state and meet certain user and/or revenue thresholds.

Under these laws, 'personal data' means data about an identified or identifiable individual, which includes not only data like name and email address, but also indirect identifiers like IP address, device identifiers, and online identifiers and information linked to those identifiers. Except for the CCPA, these laws do not apply to employee and business-to-business data. Certain personal data is exempt under all of these laws, including FERPA-covered personal information.

If an AI developer meets relevant thresholds for any of the state consumer privacy laws, it is subject to these laws with regard to personal data it processes in the education setting that is not subject to FERPA (such as personal data provided directly by students or parents, including personal data contained in exam answers or other assignments) and is not otherwise exempt. State consumer privacy laws apply in addition to any relevant state student privacy laws, the PPRA, and/or COPPA.

Core obligations

State consumer privacy laws regulate 'controllers' ('businesses' under the CCPA) and 'processors' ('service providers' under the CCPA). Controllers determine the means and purposes of processing personal data, while processors process personal data on the controller's behalf and pursuant to its instructions.

Most obligations fall on controllers, which are required to publish a compliant privacy notice, respond to consumer requests to exercise access, correction, and deletion rights, comply with special restrictions on children's data, and impose certain terms on processors and other parties to which the controller provides personal data. Processors have more limited obligations, which mainly involve following the controller's instructions as set forth in the processor's contract with a controller.

State consumer privacy laws impose certain obligations regarding the 'sale' and 'sharing' of personal data. A 'sale' is the exchange of personal data with a third party for monetary or other valuable consideration, including higher-quality analytics, targeted advertising, or other benefits. Examples of 'selling' personal data include a developer uploading lists of student users to social media sites to create custom ad audiences or, arguably, creating training datasets from its users' data and licensing those datasets to other AI ed-tech companies. More broadly, 'selling' can include providing personal data to a third party where adequate contractual provisions do not limit how the recipient can use or disclose the information. 'Sharing' is a type of selling that means using personal data for targeted advertising that relies on data about a consumer's activity across different companies' websites (e.g., targeting a teacher who signs up for a product demo with ads for a product subscription via the teacher's social media account).

Under state consumer privacy laws:

  • controllers must enable consumers to opt out of the sale or sharing of their personal data; and
  • processors may not sell or share personal data they process on behalf of controllers.

General considerations for AI developers

To ensure that it processes student personal data in compliance with state privacy laws, an AI developer must distinguish between personal data for which it is a controller as opposed to a processor. Sometimes this determination is straightforward. For example, insofar as a developer processes student personal data to provide its services to educational organizations, these organizations will likely require the developer to be a processor. However, if a developer provides a tool directly to students rather than to educational entities, it is more likely to be a controller of the students' personal data.

A more ambiguous area is when developers use personal data to improve their products, as often occurs when an AI-powered tool is designed to 'learn' from the data it ingests in the course of its use. Some developers take the position that they are controllers with respect to data that trains their models, while others argue that they are processors. The legal soundness of either position depends on the context of the specific processing activities at issue.

Whether a developer is a controller or a processor regarding student data, they should consider how to meet core requirements, such as the following:

  • Unless there is an exception, developers must correct or delete personal data upon request (including where a developer is a processor and must follow the instructions provided by its customer with respect to such a request). Developers should consider what technical processes are necessary to, for example, identify what data elements the developer processes about a particular student, change those data elements, or delete those data elements. Developers should consider how to mitigate any adverse impacts on the AI model that may result from these activities, especially where the developer is a processor and may make representations about its tool's performance and functionality in its agreements with educational institutions or other customers.
  • As a controller, a developer must craft a privacy notice that provides sufficient transparency about how its AI tool processes consumers' personal data. Privacy notices must be sufficiently clear to their intended audiences, so developers providing products directly to students should consider how to make these notices clear to this audience. 
  • Developers must comply with 'sale' and 'sharing' requirements, as set forth in more detail above and below. In particular, developers acting in a processor capacity should consider whether using personal data to train their models constitutes a 'sale' by the customer to the developer and/or by the developer to others, and be prepared to discuss this with customers. Developers should also consider California-specific requirements set forth below.
  • Developers acting as processors should consider how to comply with statutory limitations on their use of personal data, including, in California, the prohibition on using personal data outside the direct business relationship with the relevant controller or for any commercial purpose not set forth in the contract with the controller. Using personal data to train AI models may pose difficulties in meeting these requirements, and developers should be prepared to discuss this with customers.
  • Developers that are controllers should consider how to comply with special requirements on processing children's personal data as set forth below, including the requirement to obtain consent (outside of California), meet California-specific consent requirements, and perform a data protection impact assessment (DPIA).
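
The first consideration above (identifying, changing, and deleting the data elements held about a particular student) can be pictured with a small in-memory index. This is a sketch under stated assumptions: the class and method names are hypothetical, and a real system would need to cover every store where student data lives.

```python
class StudentDataIndex:
    """Hypothetical index of the data elements held about each student."""

    def __init__(self) -> None:
        # Maps student_id -> {data element name: value}
        self._elements: dict[str, dict[str, object]] = {}

    def add(self, student_id: str, name: str, value: object) -> None:
        """Record a data element for a student."""
        self._elements.setdefault(student_id, {})[name] = value

    def find(self, student_id: str) -> dict[str, object]:
        """Return a copy of everything held about one student."""
        return dict(self._elements.get(student_id, {}))

    def correct(self, student_id: str, name: str, new_value: object) -> None:
        """Change one existing data element in response to a correction request."""
        held = self._elements.get(student_id, {})
        if name not in held:
            raise KeyError(f"no element {name!r} held for student {student_id!r}")
        held[name] = new_value

    def delete(self, student_id: str) -> list[str]:
        """Delete all elements for a student; return their names for an audit trail."""
        removed = self._elements.pop(student_id, {})
        return sorted(removed)
```

The sketch deliberately leaves out the harder question raised above: what to do about data that a model has already been trained on, and how deletion may affect the model's performance.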

Requirements for children's data

US state consumer privacy laws impose special protections on personal data regarding children under the age of 13, and some also regulate data about consumers between the ages of 13 and 18. These requirements are in addition to any requirements under COPPA as discussed above.

Consent and opt-out requirements

Most state consumer privacy laws designate personal data about a child under the age of 13 as a type of sensitive data and require parent or guardian consent to process it. Such data includes, for example, information that an AI tool collects from and about students using the tool, such as their names, usernames, class assignments they complete using the tool, videos they record including their voice and image, their grades, teacher feedback, and other data that is or can reasonably be linked to specific students. Certain laws provide that parent or guardian consent obtained in compliance with COPPA meets the consent requirement, while other laws outline criteria for consent that are very similar to those required by COPPA. AI developers collecting personal data from and about children should consider how best to obtain and document the required consent, as well as how to verify that consent is given by a parent or guardian. Unlike COPPA, state consumer privacy laws do not specifically provide for schools to provide proxy consent. This may affect developers in their capacity as controllers, but where a developer is a processor of student personal data on behalf of a school, the school is in a better position to obtain consent and the developer may want to contractually provide for this.

California does not designate children's personal data as sensitive and generally does not require consent to process children's data. However, opt-in consent is required if a controller has actual knowledge that it sells or shares personal data of children under the age of 16. Parent or guardian consent is required for children under the age of 13, and the child's own consent is required for children who are at least 13 but younger than 16. Controllers must also notify these individuals of the ability to opt out at any time. If developers of AI-based education tools are selling or sharing data of California students under 16, they should establish the appropriate consent mechanisms. It is important to remember that developers may not sell or share student personal data for which they are processors. Examples of sales and sharing are discussed above.
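
The California age bands can be summarized as a small lookup. This is an illustrative sketch only: the function name and return labels are hypothetical, it assumes the controller has actual knowledge of the student's age, and it treats a 13-year-old as falling within the 13-to-15 band.

```python
def california_sale_or_share_rule(age: int, is_processor: bool = False) -> str:
    """Return a label for the rule that applies before selling or sharing data.

    Labels are hypothetical shorthand for the rules described in the text.
    """
    if is_processor:
        # Processors may not sell or share data they process for a controller.
        return "not_permitted"
    if age < 13:
        return "parent_or_guardian_opt_in"
    if age < 16:
        return "consumer_opt_in"
    # Consumers aged 16 and over must be able to opt out of sale or sharing.
    return "opt_out_available"
```

For example, `california_sale_or_share_rule(14)` returns `"consumer_opt_in"`, while any call with `is_processor=True` returns `"not_permitted"` regardless of age.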

DPIAs

A DPIA is essentially an analysis that weighs the risks and benefits of processing personal data and discusses how the controller can mitigate the risks. Under most state consumer privacy laws, controllers must perform a DPIA if they process the personal data of children under 13.

AI developers of educational tools aimed only at students aged 13 and older may still need to perform a DPIA if their personal data processing activities pose a heightened risk of harm, particularly if the developer processes personal data automatically to evaluate, analyze, or predict personal aspects related to personal preferences, interests, behavior, and other factors (profiling). Education-oriented AI products could profile students by, for example, predicting what topic a particular student will struggle with, evaluating which students are most or least well-behaved in class, or analyzing student participation patterns. DPIAs are generally required where profiling presents a reasonably foreseeable risk of 'substantial injury' to students.
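
Outside California, the two DPIA triggers discussed in this section can be written as a single screening condition. The sketch below is a simplification with hypothetical names. It covers only the triggers named here, and other kinds of heightened-risk processing may also call for a DPIA.

```python
def dpia_indicated(
    processes_under_13_data: bool,
    profiles_students: bool,
    foreseeable_substantial_injury: bool,
) -> bool:
    """Return True if either trigger described in the text is present.

    Trigger 1: the controller processes personal data of children under 13.
    Trigger 2: profiling presents a reasonably foreseeable risk of
    substantial injury to students.
    """
    risky_profiling = profiles_students and foreseeable_substantial_injury
    return processes_under_13_data or risky_profiling
```

Whether a risk of substantial injury is 'reasonably foreseeable' is a judgment call that the boolean input hides, so the result is best treated as a prompt for further assessment.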

The CCPA currently does not require DPIAs, but draft CCPA regulations, if adopted, would require controllers to perform DPIAs if they process personal data to train AI or automated decision-making technology. 'AI' is defined very broadly as 'an engineered or machine-based system that is designed to operate with varying levels of autonomy and that can, for explicit or implicit objectives, generate outputs such as predictions, recommendations, or decisions that influence physical or virtual environments. AI includes generative models, such as large language models, that can learn from inputs and create new outputs, such as text, images, audio, or video; and facial or speech recognition or detection technology.' Despite referencing certain technologies commonly understood as AI, the first sentence of the definition is broad enough to include many other technologies. For example, online search functionalities and features in certain word processing software fit the broad definition of that sentence, although they are not typically what comes to mind when thinking about AI.

Consider, for instance, a teacher's use of spellcheck software that simply compares words in its dictionary to words input into it. Even if it makes no use of large language models, ML, or other technologies commonly understood as AI, it may qualify as AI under the draft CCPA regulations because it somewhat autonomously checks spelling and renders decisions about whether words are correctly spelled, which influence the teacher's decisions about grading. The draft CCPA regulations would greatly expand the concept of AI and the requirement for DPIAs for ed-tech providers in California, including those not using what we typically consider to be AI.

Conclusion

State and federal privacy laws impacting student data influence how AI developers offering services to US-based schools and districts should design their products and handle personal data, including how data is collected, used, disclosed, and retained. To avoid compliance headaches and after-the-fact changes to the way they operate, developers should consider relevant privacy law requirements while they are developing products and stay up to date on developing legal obligations after their product launches. Building in privacy-compliant elements and internal data handling processes can help AI developers achieve streamlined compliance that goes hand-in-hand with the tools they operate.

Zach Lerner Legal Director
Hannah Schaller Attorney
ZwillGen PLLC, Washington, D.C.


1 Throughout this article, when we refer to AI, we also refer to ML unless otherwise specified.