Artificial Intelligence (AI) is revolutionizing how parties gather, review, and produce enormous amounts of electronic data in litigation. When discovery involves potentially millions of documents, new machine learning models that employ Continuous Active Learning (CAL), also called Technology Assisted Review 2.0 (TAR 2.0), can turn an overwhelming task into a manageable, defensible process. CAL/TAR 2.0 uses attorney coding decisions to continuously teach the software which documents are likely responsive, serving up the most promising material first. The model updates automatically as the review proceeds, until the review reaches an agreed-upon stopping point, typically once newly reviewed batches consist largely of nonresponsive documents.
TAR 2.0 has quickly become the default technology for prioritizing and identifying responsive electronically stored information (ESI) at scale. Yet litigants have been drawn into protracted and expensive battles over how to design and govern TAR 2.0 within an ESI protocol. Courts generally allow the producing party to choose reasonable TAR methods but require transparency and validation under Federal Rule of Civil Procedure 26(g), and proportionality under Rule 26(b)(1). See, e.g., In re Insulin Pricing Litigation, 2025 U.S. Dist. LEXIS 70494.
Knowing how TAR 2.0 works, and where negotiations often snag, will help you craft practical, defensible ESI protocols. This article highlights four recurring negotiation flashpoints and offers practical guidance for an efficient, defensible approach when using TAR 2.0.
TAR 2.0 Training: How the Model Learns
Older TAR 1.0 software often utilized an initial sample set, or “seed-set,” of training documents. TAR 2.0 models do not require a preliminary seed set due to the continuous learning aspect of the software, and some courts have declined to adopt this step into newer protocols. See, e.g., In re Uber Techs., Inc., 2024 U.S. Dist. LEXIS 131567 (N.D. Cal.) (the court omitted plaintiffs’ training-set language, noting TAR 2.0 learns continuously).
Once prioritized review begins, reviewers code documents for responsiveness as the system continuously updates a predictive model. Each document receives a score (often 0–100) representing the likelihood of responsiveness. TAR 2.0 typically serves the highest-scoring uncoded documents to reviewers first. Ongoing quality control is maintained by monitoring score distributions, reviewer consistency, and other metrics.
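For readers who want to see the mechanics, the following is a minimal sketch of a CAL-style prioritized review loop. It is an illustration only, not any vendor's implementation; the sample documents, the stand-in "attorney" coding function, and the batch size are all hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical mini-collection; real matters involve far larger volumes.
docs = [
    "pricing agreement with distributor", "lunch plans for friday",
    "rebate strategy memo", "holiday party invite",
    "contract amendment on pricing terms", "fantasy football league",
    "distributor margin analysis", "parking garage notice",
]

def attorney_codes_responsive(text):
    # Stand-in for an attorney's responsiveness call (purely illustrative).
    return any(word in text for word in ("pricing", "rebate", "margin"))

X = TfidfVectorizer().fit_transform(docs)
coded = {0: True, 1: False}   # a few starter judgments seed the model
BATCH = 2                     # documents served per round (hypothetical)

while len(coded) < len(docs):
    # Re-train on every decision made so far, then score the uncoded documents
    # (a 0-1 probability, analogous to the 0-100 scores described above).
    model = LogisticRegression().fit(X[list(coded)], [coded[i] for i in coded])
    uncoded = [i for i in range(len(docs)) if i not in coded]
    scores = model.predict_proba(X[uncoded])[:, 1]
    # Serve the highest-scoring uncoded documents to reviewers first.
    batch = [i for _, i in sorted(zip(scores, uncoded), reverse=True)][:BATCH]
    for i in batch:
        coded[i] = attorney_codes_responsive(docs[i])  # model learns from each call
```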
One issue that often arises is whether the parties will apply agreed keyword search terms to cull the collection before TAR 2.0 review. Producing parties often seek the efficiency of pre-culling, while requesting parties have argued that pre-culling suppresses recall (the percentage of responsive documents found) and overall responsiveness. Courts have been split on the subject. If search terms are used, a best practice is to validate recall end-to-end, across both the culled and TAR-reviewed populations, rather than against the TAR set alone.
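To illustrate why end-to-end validation matters, consider the arithmetic below. All counts are invented for the example: a recall figure computed only against the TAR set can look healthy even when the figure that accounts for documents removed by the keyword cull is meaningfully lower.

```python
# Hypothetical illustration of TAR-only versus end-to-end recall when keyword
# culling precedes TAR 2.0 review. All counts are invented for the example.
responsive_found_by_tar = 8_000     # responsive documents identified in the TAR set
responsive_missed_by_tar = 1_000    # responsive documents left in the TAR discard pile
responsive_culled_by_terms = 1_000  # responsive documents excluded by the keyword cull

tar_only_recall = responsive_found_by_tar / (
    responsive_found_by_tar + responsive_missed_by_tar)
end_to_end_recall = responsive_found_by_tar / (
    responsive_found_by_tar + responsive_missed_by_tar + responsive_culled_by_terms)

print(f"TAR-only recall:   {tar_only_recall:.0%}")    # ~89%: looks healthy
print(f"End-to-end recall: {end_to_end_recall:.0%}")  # 80%: the number that matters
```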
Early alignment of the parties on these TAR-training issues will likely reduce the risk of significant and expensive downstream disputes.
Stopping Criteria: When to Pause for Validation
One of the most hotly debated issues in ESI protocol negotiations that utilize TAR 2.0 is the “stopping criteria” that signal when the producing party may pause the review and move to the quality control process called validation. Several methods have been proposed and approved by courts.
A common, defensible approach is to stop when consecutive batches of the highest-scoring, uncoded documents contain a low percentage of responsive documents, often 10 percent or less. See, e.g., In re Broiler Chicken (utilizing this “precision drop” method). Parties frequently disagree on what that threshold should be. Requesting parties may push for a lower threshold (e.g., 5 percent) to maximize recall, while producing parties may argue for a higher threshold to limit review costs.
Another sticking point is the size of the batches used to measure responsiveness. Requesting parties typically want a concrete minimum, such as at least 1,000 documents across consecutive batches, to ensure that the result is not skewed by random “dry spells” of non-responsive documents. Producing parties may prefer smaller or vaguely defined “reasonably sized” batches, which can risk stopping too early and potentially missing responsive material.
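As a concrete illustration, a stopping-criteria provision of the kind described above can be reduced to an objective check. The sketch below is hypothetical: the 10 percent threshold, two-batch window, and 1,000-document floor are the sort of values parties negotiate, not a fixed standard.

```python
def may_stop(recent_batches, threshold=0.10, window=2, min_docs=1_000):
    """Return True if the last `window` batches together contain at least
    `min_docs` documents and their combined responsiveness rate is at or
    below `threshold`. recent_batches: (responsive_count, batch_size), newest last."""
    last = recent_batches[-window:]
    if len(last) < window:
        return False
    total_docs = sum(size for _, size in last)
    if total_docs < min_docs:        # guards against small-sample "dry spells"
        return False
    responsive = sum(count for count, _ in last)
    return responsive / total_docs <= threshold

# Example: the last two 600-document batches ran 4% and 6% responsive.
print(may_stop([(150, 600), (24, 600), (36, 600)]))   # True: 60/1,200 = 5%
```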
Validation: How to Prove Reasonableness and Defensibility
Validation is the quality control process that tests whether a TAR 2.0 process has successfully identified the vast majority of responsive documents. It provides confidence to the parties and the court that the review was thorough and defensible.
When it comes time to negotiate how validation will be done, however, parties often find themselves at odds. These disputes matter because a narrow validation can give a false sense of security and lead to missed evidence. A robust, transparent validation process helps to build trust and reduce the risk of later challenges. It is therefore important to address these issues clearly in the ESI protocol before review begins.
The principal debate typically involves the scope of the validation sample. Producing parties often want to utilize what is known as an “elusion test,” which validates only the documents that the TAR system has set aside as non-responsive and that were never reviewed by a human. The elusion test is simple and requires less work, but it can potentially miss errors made earlier in the process, such as mistakes in human review or documents overlooked by keyword culling.
Requesting parties, on the other hand, usually push for a broader “confusion test” that uses “end-to-end” validation, which means sampling not just the unreviewed set but also documents coded as non-responsive by humans and even those excluded by search terms. The so-called “Broiler Chicken” approach, for example, uses a combined validation sample of approximately 3,000 documents drawn from several strata, including documents produced as responsive, documents coded nonresponsive, and documents that were never reviewed. See In re Broiler Chicken Antitrust Litig., 2018 WL 1146371 (N.D. Ill. Jan. 3, 2018). This method yields an end-to-end recall estimate and surfaces both false positives and false negatives. It is more likely to catch systematic misses but requires more work and may expose more documents to the other side.
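The arithmetic behind such an estimate is straightforward. The sketch below shows one way an end-to-end recall figure might be built from a stratified validation sample in the spirit of that approach; the strata, population sizes, sample sizes, and responsive counts are entirely hypothetical.

```python
# Hypothetical stratified validation sample (~3,000 documents total).
# Each stratum: (population size, documents sampled, responsive found in sample).
strata = {
    "produced_as_responsive": (120_000, 500, 480),
    "coded_nonresponsive":    (300_000, 500, 15),
    "unreviewed_by_tar":      (900_000, 2_000, 20),
}

def estimated_responsive(population, sample, hits):
    # Extrapolate the sample's responsiveness rate to the whole stratum.
    return population * hits / sample

found = estimated_responsive(*strata["produced_as_responsive"])
missed = sum(estimated_responsive(*strata[name])
             for name in ("coded_nonresponsive", "unreviewed_by_tar"))
recall = found / (found + missed)
print(f"Estimated end-to-end recall: {recall:.0%}")   # ~86% on these invented numbers
```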
Another sticking point is who reviews the validation sample, and whether they know how the documents were previously coded. Producing parties typically want their own team to do the review. Requesting parties often ask for a “blind” review by a subject-matter expert, or even for their own reviewers to participate, arguing that this helps reduce bias in determining which documents are relevant. Such participation, however, raises privilege and confidentiality concerns.
Parties also often disagree over what happens if the validation sample shows that recall is lower than expected or if important types of documents were missed. Recall in the 70–80 percent range is commonly considered acceptable when measured end-to-end, but lower or higher recall may not be dispositive without examining what was missed. Producing parties may want the flexibility to argue that the review was still reasonable, while requesting parties often want pre-set remedies such as further review, retraining on the TAR model, or expanding the search.
Collaboration Platforms: Special Considerations for TAR 2.0
The rise of collaboration platforms like Slack, Microsoft Teams, and Google Workspace has transformed how organizations communicate. These platforms have also created new challenges for e-Discovery, and they are now often a major source of friction. These points of contention can significantly impact the scope, cost, and defensibility of discovery. If not addressed clearly in the ESI protocol, they can lead to downstream disputes, motion practice, and even sanctions.
Unlike traditional email, collaboration platforms generate streams of messages, comments, reactions, and shared files and folders. Parties often disagree on what constitutes a “document” for review and production. Should each message be treated separately, or should messages be grouped by day, topic, or conversation thread? Producing parties may prefer larger groupings for efficiency, while requesting parties may want more granular units to ensure complete production and preserve context.
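To make the dispute concrete, the sketch below groups the same hypothetical chat export two different ways, by conversation thread and by channel-and-day; the field names and messages are illustrative, not any platform's actual export schema.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical chat export; field names are illustrative only.
messages = [
    {"channel": "pricing", "thread_id": "T1", "ts": "2023-05-01T09:14", "text": "new rebate tiers?"},
    {"channel": "pricing", "thread_id": "T1", "ts": "2023-05-01T09:20", "text": "see linked sheet"},
    {"channel": "pricing", "thread_id": "T2", "ts": "2023-05-02T11:02", "text": "lunch?"},
]

def group_into_units(msgs, key):
    # Collect messages into review/production "documents" under a chosen definition.
    units = defaultdict(list)
    for m in msgs:
        units[key(m)].append(m)
    return dict(units)

by_thread = group_into_units(messages, key=lambda m: m["thread_id"])
by_channel_day = group_into_units(
    messages, key=lambda m: (m["channel"], datetime.fromisoformat(m["ts"]).date()))

print(len(by_thread), "units when grouped by thread;",
      len(by_channel_day), "units when grouped by channel and day")
```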
Collaboration tools frequently use hyperlinks to share files instead of attaching them directly. This raises the question of whether hyperlinked documents should be produced as part of the same “family” as the message, like traditional email attachments. Producing parties may argue that technical limitations or burdens make this impractical, especially when the linked content is dynamic or stored outside the platform. Requesting parties, however, often insist that linked documents are essential for context and completeness and should therefore be produced together with the referencing message whenever possible.
Linked documents in collaboration platforms may be edited after they are shared, creating multiple versions. Requesting parties may want the version as it existed at the time it was linked (the “contemporaneous version”), while producing parties may only be able or willing to provide the current version, citing technical or cost barriers to retrieving historical versions.
Collaboration platforms also store rich metadata that may document who participated, when, and with what permissions. Requesting parties may seek detailed metadata to understand context, while producing parties may resist, citing privacy, technical limitations, or proportionality concerns.
The informal and fast-paced nature of collaboration platform communications, which are rife with slang, emojis, and abbreviations, makes keyword searching and review more difficult. Parties may disagree on the appropriateness of search terms, the use of TAR, and how to handle non-textual content like images, reactions, or audio/video files.
Collaboration data can also be ephemeral, with short retention periods or user-controlled deletion. Disputes may arise over whether relevant data was timely preserved and whether certain channels, private messages, or deleted content are “reasonably accessible” for collection and production.
CAL/TAR 2.0 can deliver faster, cheaper, and more complete discovery, but only if the parties align on the process from the outset to avoid protracted and expensive discovery battles. When negotiating an ESI protocol that utilizes TAR 2.0, efforts should be made to directly address and resolve important issues, including clear training inputs and disclosures, objective stopping criteria, an agreed-upon end-to-end validation methodology, and how to handle hyperlinks and data from collaboration platforms. Done right, your protocol will be both efficient and defensible.
This article originally appeared on Wilson Elser. www.wilsonelser.com.
About the Authors:
Ian A. Stewart is a partner at Wilson Elser. ian.stewart@wilsonelser.com
Adam Wayne is of counsel at Wilson Elser. adam.wayne@wilsonelser.com