- “[P]roportionality requires consideration of results as well as costs. And if stopping at 40,000 [out of a universe of 3 million documents] is going to leave a tremendous number of likely highly responsive documents unproduced, [the defendant’s] proposed cutoff doesn’t work.” Id. at *3.
- A seed set of 2,400 documents would be culled and reviewed until the predictive coding process reached a 95% confidence level that the documents generated by the program were responsive. Id. at *5.
- “[A]ll of the documents [in] the seed set, whether . . . ultimately coded relevant or irrelevant, aside from privilege, will be turned over to” plaintiffs. Id. Both sides could code documents in the seed set. Id.
- The seed set coding itself involved two processes: keyword searching and “judgmental sampling,” with the latter performed by “senior attorneys” who would not otherwise be conducting a manual document review. Id.
- The number of training iterations for the computer was initially set at seven, with the possibility of more if the results had not stabilized. Id. at *6. (A rough sketch of how such an iterative loop might work appears after this list.)
- Predictive coding was accurate enough to support a certification that a disclosure was “complete and correct” under Fed. R. Civ. P. 26(g)(1)(A). Id. at *7.
- Daubert requirements do not apply to a determination of the validity of an e-discovery method. Id.
- Accuracy concerns would be addressed “down the road” by reviewing documents from that seed set that the predictive coding system had judged irrelevant. Id. at *8. If the system was deeming “hot documents” to be “irrelevant,” then the software would have to be “retrained” or “some other search method employed.” Id.
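To make the mechanics concrete, here is a minimal sketch of the kind of iterative training loop the bullets describe. It is illustrative only: the court’s order does not prescribe a particular classifier, stability metric, or batch size, so the scikit-learn model, the 2% “flip rate” threshold, and the `review_fn` callback standing in for attorney coding are all assumptions.

```python
# Minimal sketch of an iterative predictive coding loop, assuming a
# generic scikit-learn text classifier. The seven-round cap mirrors the
# protocol; the stability metric, batch size, and review_fn callback
# are hypothetical, not drawn from the court's order.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

MAX_ROUNDS = 7              # initial cap on training iterations
STABILITY_THRESHOLD = 0.02  # assumed: <2% of calls flipping = "stabilized"

def train_iteratively(seed_texts, seed_labels, corpus_texts, review_fn,
                      batch_size=500):
    """Train on the coded seed set, then keep iterating until the
    model's responsiveness calls on the full corpus stop changing
    (or the round cap is reached)."""
    vectorizer = TfidfVectorizer(max_features=50_000)
    X_corpus = vectorizer.fit_transform(corpus_texts)

    texts, labels = list(seed_texts), list(seed_labels)
    coded = set()               # corpus indices already sent for review
    prev_calls, calls = None, None
    for _ in range(MAX_ROUNDS):
        model = LogisticRegression(max_iter=1000)
        model.fit(vectorizer.transform(texts), labels)
        probs = model.predict_proba(X_corpus)[:, 1]
        calls = (probs >= 0.5).astype(int)
        if prev_calls is not None and \
                np.mean(calls != prev_calls) < STABILITY_THRESHOLD:
            break  # results have stabilized; stop training
        prev_calls = calls
        # Attorneys code the documents the model is least certain about,
        # and those judgments feed the next training round.
        uncertain = [i for i in np.argsort(np.abs(probs - 0.5))
                     if i not in coded][:batch_size]
        for i in uncertain:
            coded.add(i)
            texts.append(corpus_texts[i])
            labels.append(review_fn(corpus_texts[i]))
    return model, calls

def elusion_sample(calls, rng, n=100):
    """Randomly sample documents the model deemed irrelevant; if
    attorneys find 'hot documents' here, retrain or switch methods."""
    irrelevant = np.flatnonzero(calls == 0)
    return rng.choice(irrelevant, size=min(n, len(irrelevant)),
                      replace=False)
```

The uncertainty-based batch selection is just one common way to pick documents for the next training round; the actual protocol relied on keyword searching and judgmental sampling by senior attorneys, as noted above.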
The question to ask in that situation is what methodology would the requesting party suggest instead? Linear manual review is simply too expensive where, as here, there are over three million emails to review. Moreover, while some lawyers still consider manual review to be the “gold standard,” that is a myth, as statistics clearly show that computerized searches are at least as accurate, if not more so, than manual review. . . . [O]n every measure, the performance of [predictive coding] was at least as accurate (measured against the original review) as that of human re-review.
2012 WL 607412, at *9 (citation and quotation marks omitted).
[T]he confusion [over plaintiff’s consent] is immaterial because the ESI protocol contains standards for measuring the reliability of the process and the protocol builds in levels of participation by Plaintiffs. It provides that the search methods will be carefully crafted and tested for quality assurance, with Plaintiffs participating in their implementation. . . . If there is a concern with the relevance of the culled documents, the parties may raise the issue before [the magistrate] before the final production. Further, upon the receipt of the production, if Plaintiffs determine that they are missing relevant documents, they may revisit the issue of whether the software is the best method.
Moore II, 2012 WL 1446534, at *2. The reliability of predictive coding can only be determined by looking at its results. Thus, it is “premature” and “speculative” to raise reliability concerns before the system is tested in practice. If problems arise, then “the parties are allowed to reconsider their methods.” Id. (A sketch of what such a results-based check might look like follows the quotation below.)
There simply is no review tool that guarantees perfection. . . . [T]here are risks inherent in any method of reviewing electronic documents. Manual review with keyword searches is costly, though appropriate in certain situations. However, even if all parties here were willing to entertain the notion of manually reviewing the documents, such review is prone to human error and marred with inconsistencies from the various attorneys’ determination of whether a document is responsive.
2012 WL 1446534, at *3.
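Since reliability can only be determined by looking at results, the natural follow-up is a statistical check on the production itself. Below is a back-of-the-envelope sketch in the same illustrative spirit: draw a random sample from the documents the system withheld as irrelevant, count how many turn out to be responsive, and put a confidence interval around that miss rate. The 95% level echoes the protocol; the 400-document sample and the normal-approximation (Wald) interval are assumptions, not anything the opinions prescribe.

```python
# Hedged sketch of a results-based validation check. The 95% level
# comes from the protocol; the 400-document sample and the Wald
# interval are illustrative assumptions.
import math

def miss_rate_interval(sample_size: int, misses: int, z: float = 1.96):
    """95% normal-approximation confidence interval for the elusion
    (miss) rate among documents the system withheld as irrelevant."""
    p = misses / sample_size
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    return max(0.0, p - margin), min(1.0, p + margin)

# Example: 5 responsive documents turn up in a 400-document random
# sample of the withheld pile -> miss rate ~1.25%, 95% CI ~0.2%-2.3%.
low, high = miss_rate_interval(400, 5)
print(f"estimated miss rate: {5/400:.2%} (95% CI {low:.2%} to {high:.2%})")
```

If the interval’s upper bound came back uncomfortably high, that is exactly the scenario in which the court says the parties “may revisit the issue of whether the software is the best method.”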
The PLAC presentation also indicated that predictive coding had been approved in the case of Kleen Products LLC et al. v. Packaging Corporation of America, 1:10-cv-05711 (N.D. Ill.), earlier that very week (that is to say, last week, now). We have a PACER account, and we’re not afraid to use it, so we looked up the docket for that case. Unfortunately, we can neither confirm nor deny approval of predictive coding in Kleen, because no such order appears on PACER. PACER does, however, indicate that a discovery hearing was held on April 20, 2012, so it’s likely that an oral ruling was issued and the parties are still working out the terms of the order. There’s also a “transcript” entry in the docket, but it was not accessible through PACER, so all we can say at this point is that it exists.