Last week, the U.S. Tax Court denied an Internal Revenue Service motion to compel the production of electronically stored information that had not been produced under the parties’ mutually agreed-upon use of “predictive coding” in eDiscovery. This ruling comes two years after the same court authorized the use of predictive coding in the case (Dynamo Holdings, Ltd. vs. Commissioner, 143 T.C. No. 9 (2014)) and serves as an important judicial reminder that predictive coding is an accepted method of eDiscovery in U.S. courts and that it’s here to stay.
Predictive coding uses machine learning to “train” a computer to identify the documents in a review set that are potentially relevant to discovery requests. The software “learns” how to code from the attorneys by tracking their review decisions, then uses algorithms to predict how the attorneys would have coded each document in the set. The attorneys then inspect the initial sample results to identify errors and fine-tune the algorithms along the way, which makes reviewing a large number of documents for relevance far more efficient.
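For readers who want to see the idea in miniature, here is a rough sketch of that train-then-predict loop. It is not how any particular eDiscovery product works; it assumes a simple Naive Bayes word-count model, and the function names (`train`, `relevance_score`) and the four-document seed set are invented purely for illustration. It also assumes the seed set contains at least one relevant and one non-relevant document.

```python
import math
from collections import Counter


def tokenize(text):
    # Deliberately naive: lowercase and split on whitespace.
    return text.lower().split()


def train(labeled_docs):
    """Build word counts from (text, is_relevant) pairs coded by reviewers."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_relevant in labeled_docs:
        doc_counts[is_relevant] += 1
        word_counts[is_relevant].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def relevance_score(model, text):
    """Log-odds that a document is relevant; above zero leans 'relevant'."""
    word_counts, doc_counts, vocab = model
    # Prior: how often reviewers marked documents relevant vs. not.
    score = math.log(doc_counts[True] / doc_counts[False])
    totals = {label: sum(word_counts[label].values()) for label in (True, False)}
    for word in tokenize(text):
        if word not in vocab:
            continue  # words never seen in training carry no evidence
        # Add-one smoothing so a word unseen in one class never zeroes the odds.
        p_rel = (word_counts[True][word] + 1) / (totals[True] + len(vocab))
        p_not = (word_counts[False][word] + 1) / (totals[False] + len(vocab))
        score += math.log(p_rel / p_not)
    return score


# Hypothetical seed set: documents the attorneys have already coded.
seed_set = [
    ("merger agreement draft terms", True),
    ("tax shelter partnership transfer", True),
    ("lunch menu friday", False),
    ("office parking reminder", False),
]
model = train(seed_set)

# Hypothetical unreviewed documents, ranked from most to least likely relevant.
unreviewed = ["friday lunch reminder", "partnership transfer terms"]
ranked = sorted(unreviewed, key=lambda d: relevance_score(model, d), reverse=True)

# Feedback loop: when a reviewer corrects a prediction, add it and retrain.
seed_set.append(("partnership lunch friday", False))
retrained = train(seed_set)
```

The last two lines mirror the inspect-and-fine-tune step: every corrected coding decision becomes another training example, so the ranking improves as the review proceeds.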
Predictive coding first received judicial endorsement from New York Magistrate Judge Andrew Peck back in 2011 and has been gaining acceptance in U.S. courts since that time as an alternative to the costly and inefficient physical review of documents. However, as industry experts such as Craig Ball have noted, predictive coding has failed to generate the traction in the litigation marketplace that many forecast in the aftermath of Judge Peck’s widely publicized endorsement.
With judicial support and basic economics in its favor, why hasn’t predictive coding fulfilled its promise in eDiscovery?
“Litigation teams have access to wonderfully advanced machine learning software applications, but they don’t necessarily use them like we would expect and the biggest obstacle I see centers on adoption of the tools,” said Michael Etgen, Ph.D., senior user experience architect at LexisNexis. “Lawyers and their staff members talk about using predictive coding in eDiscovery, but what we see in practice is that they only use it in very limited cases and circumstances, not broadly across their work.”
Etgen contends that this unfulfilled promise of predictive coding can be traced to three primary factors:
- Lack of incentive
“You have to understand that using technologies such as predictive coding will enable litigation teams to review more documents faster than ever before and with fewer resources,” he explains. “Since document review is by far the largest cost of eDiscovery, there may be some organizations whose livelihood depends upon document review services not becoming cheaper. They’re not really looking for new and unique ways of using technology to improve their businesses.”
- Threat of job replacement
“This is a ubiquitous fear in our world today as global economies transition to the digital age, and the eDiscovery industry is no exception,” said Etgen. “The people who do large-scale document review work are often contract reviewers asked to identify specific content by reading document after document in a case. These folks and the professionals who manage their work have a real sense that their jobs may be at risk to machine learning systems that can do the same thing faster and more reliably than they can.”
- Complexity of applications
“I’ve also spoken to many litigators who are uncomfortable with their inability to grasp what’s going on inside the ‘black box’ that was so common with the first generation of predictive coding tools,” he said. “Imagine for a moment that you had to stand in front of a judge and defend the output of a software application when in reality you have no clue what it’s doing and how it’s doing it.”
Etgen notes that the lack of financial incentive for working more efficiently is a bigger-picture consideration that will likely have to sort itself out in the marketplace. However, he believes the other two factors inhibiting the growth of predictive coding are being addressed right now.
“With respect to the threat of job replacement, there is ample evidence out there now that we get better outcomes when we combine the human touch of people and the technology power of machine learning, as opposed to just one of those elements on its own,” he said. “There is no reason to fear technology as it will always require human collaboration and oversight in order to get the job done.”
Finally, the answer to overly complex applications in the marketplace is to design next-generation eDiscovery tools that unite both the technical and user experience (UX) aspects of software, according to Etgen. His primary focus in recent months has been on the UX for Lexis DiscoveryIQ, a new eDiscovery enterprise software platform from LexisNexis that reimagines how and when predictive coding is used in the workflow.
“We’re overcoming one of the barriers to wider adoption of predictive coding by introducing greater transparency into the way that the machine learning works in Lexis DiscoveryIQ,” said Etgen. “By making it easier and more intuitive for litigators to use the next generation of predictive coding applications, we’ll pave the way for these tools to better deliver on their promise in eDiscovery.”
* * *
This post is by Daryn Teague, who provides support to the litigation software product line based in the LexisNexis Raleigh Technology Center.