Well, it was comfortable while it lasted.
I’m referring to the widespread attitude towards artificial intelligence in law, to the effect that we don’t have to do anything about it yet because the technology has yet to prove itself. Some day, maybe, and we’ll worry about it then. Maybe.
Time’s up.
Across my desk this week came a just-released report from Blue Hill, the Boston-based research and advisory firm specializing in enterprise technology. Their particular foci include AI, cloud, neural science and machine learning, mobile, and security technologies, and perhaps not surprisingly they conduct research and write reports on legal technology on a fairly regular basis.
The report that hit my desk is ROSS Intelligence and Artificial Intelligence in Legal Research, which addresses head-on the psychological, cultural, and quasi-tribal disconnect between the believers and the deniers when it comes to legal AI. From the executive summary:
The growing availability and practicality of artificial intelligence (AI) technologies such as machine learning and Natural Language processing within the legal sector has created a new class of tools that assist legal analysis within activities like legal research, discovery and document review, and contract review. Often, the promised value of these tools is significant, while lingering cultural reluctance and skepticism within the legal profession can lead to hyperbolic reactions to so-called “robot lawyers,” both positive and negative.
To confront this chasm, Blue Hill did what any serious researcher would do: Set up, conducted, and reported the results of an experiment comparing AI-enabled tools to conventional ones on about as level a playing field as it’s possible to imagine.
Specifically, they compared the ROSS Intelligence tool to Lexis and Westlaw, using both Boolean and natural language search. (“Boolean” is basically the familiar keyword approach, allowing you to specify required words, impermissible words, and combinations using “and” and “or.” Natural language is what you’re reading.)
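For readers who have never had to write one, here is a minimal sketch of what a Boolean query boils down to: required words, forbidden words, and an "or" group. The function name `boolean_match` and the three sample sentences are my own illustrations, not anything from Lexis, Westlaw, or the Blue Hill report, and real research tools add proximity operators, stemming, and much else.

```python
import re


def tokenize(text):
    """Lowercase the text and return its words as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def boolean_match(text, all_of=(), any_of=(), none_of=()):
    """Return True if the text satisfies a simple Boolean query.

    all_of  -- every term must appear (AND)
    any_of  -- at least one term must appear (OR); ignored if empty
    none_of -- no term may appear (NOT)
    Terms are expected in lowercase. Hypothetical helper for illustration.
    """
    words = tokenize(text)
    if not all(term in words for term in all_of):
        return False
    if any(term in words for term in none_of):
        return False
    if any_of and not any(term in words for term in any_of):
        return False
    return True


# Made-up sample documents, for illustration only.
documents = [
    "The debtor filed a Chapter 7 petition and sought a discharge.",
    "The creditor objected to the Chapter 13 repayment plan.",
    "The court dismissed the tax appeal as untimely.",
]

# chapter AND (debtor OR creditor) AND NOT discharge
hits = [
    doc
    for doc in documents
    if boolean_match(
        doc,
        all_of=["chapter"],
        any_of=["debtor", "creditor"],
        none_of=["discharge"],
    )
]
```

Only the second sentence survives the query: the first is knocked out by the forbidden word, and the third never mentions the required one. That literalness is what separates the keyword approach from natural language search.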
While I’m not formally trained in objective research protocols, their setup of the research struck me as scrupulous:
- Sixteen experienced legal research professionals were randomly assigned into four groups of four apiece.
- Each received a standard set of seven questions designed to emulate real-world queries practicing lawyers would pose.
- For consistency, the subject matter was limited to US federal bankruptcy law.
- Each of the sixteen was asked to research and provide a written answer to the legal question posed, within a two-hour time limit.
- Although experienced in legal research generally, none of the sixteen had more than passing acquaintance with bankruptcy law.
- And each was assigned to a tool (Lexis, Westlaw, or ROSS) that they were largely unfamiliar with. (Westlaw mavens were sicced on Lexis and vice versa; presumably none were familiar with ROSS out of the box.)
To evaluate the results, Blue Hill measured the time each spent (a) researching; and (b) writing their responses. They also compared:
- Information retrieval quality: What portion of the total results retrieved were drawn from truly relevant sources, what portion of all the items presented were themselves relevant, and were the most relevant placed at the top of the list.
- User satisfaction: How easy was the tool to use and how much confidence did the researcher have in the results.
- Efficiency: Time it took to arrive at a satisfactory answer.
Finally, for simplicity and ease of comparison in evaluating quality and relevance of the search tools, Blue Hill took into account only the first 20 results produced in response to each research query.
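To make the retrieval-quality idea concrete, here is a rough sketch of two measures in the same spirit, both limited to the first 20 results: the share of those results that are relevant, and how high up the first relevant one appears. The report as quoted here does not give Blue Hill's exact formulas, so the function names, the cutoff handling, and the sample judgments below are my assumptions rather than their method.

```python
def precision_at_k(relevance, k=20):
    """Fraction of the first k results judged relevant.

    relevance -- list of booleans, one per result, in ranked order.
    """
    top = relevance[:k]
    if not top:
        return 0.0
    return sum(top) / len(top)


def first_relevant_rank(relevance, k=20):
    """1-based position of the first relevant result in the top k, or None."""
    for rank, is_relevant in enumerate(relevance[:k], start=1):
        if is_relevant:
            return rank
    return None


# Made-up relevance judgments for one query: 25 results, 3 of them relevant.
judged = [True, False, True, True, False] + [False] * 20

share_relevant = precision_at_k(judged)      # looks only at the first 20
best_position = first_relevant_rank(judged)  # where the first useful hit sits
```

A tool that buries its best authority at position 19 can post the same share of relevant results as one that puts it first, which is why ranking gets its own look.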
Their analysis looked at (a) information retrieval accuracy; (b) user satisfaction; and (c) efficiency.
Recall that our three contestants are ROSS, the new entrant and challenger, and the composite Lexis/Westlaw incumbent and reigning champion, in Boolean and natural language flavors.
With that lengthy stage-setting, shall we see what they learned?
A reader who prefers anonymity (but who knows to a fare-thee-well whereof he speaks) wrote me as follows, verbatim:
I would be most interested in other knowledgeable comments.