From the abstract for Geneviève Vanderstichele, The Normative Value of Legal Analytics. Is There a Case for Statistical Precedent?:

This work contributes to the harmonisation of the quantitative methodologies of data science, computer science and statistics with the qualitative methodology of the law.

It gives a layered answer to the research question whether the outcome of a machine learning algorithm with case law as an input can have normative value. The thesis argues first that the outcome of a machine learning algorithm with case law as an input is not ‘law’ as we know it today. Neither is it a fact in a court case, nor a secondary source of law. The thesis claims furthermore that for methodological reasons, such an outcome is to be considered as a ‘sui generis’ concept, a concept of its own kind, with which courts can, and even should, engage in adjudication. In addition, it is argued that modelling with machine learning can have an implicit normativity through the definition of the purpose of the algorithm, its design and the choices that are made by the software engineers.

In the first part, the work introduces several building blocks that inform the following parts. The second part is a critical analysis of 9 experiments with mainly supervised machine learning algorithms, with case law as an input. The final part discusses the use of the outcome of such algorithms in court cases.

From the blurb for Law as Data: Computation, Text, and the Future of Legal Analysis (SFI Press, 2019):

In recent years, the digitization of legal texts and developments in the fields of statistics, computer science, and data analytics have opened entirely new approaches to the study of law. This volume explores the new field of computational legal analysis, an approach marked by its use of legal texts as data. The emphasis herein is work that pushes methodological boundaries, either by using new tools to study longstanding questions within legal studies or by identifying new questions in response to developments in data availability and analysis. By using the text and underlying data of legal documents as the direct objects of quantitative statistical analysis, Law as Data introduces the legal world to the broad range of computational tools already proving themselves relevant to law scholarship and practice, and highlights the early steps in what promises to be an exciting new approach to studying the law.

Swiss researchers have found that algorithms that mine large swaths of data can eliminate anonymity in federal court rulings. This could have major ramifications for transparency and privacy protection. The study relied on a “web scraping technique” or mining of large swaths of data. The researchers created a database of all decisions of the Supreme Court available online from 2000 to 2018 – a total of 122,218 decisions. Additional decisions from the Federal Administrative Court and the Federal Office of Public Health were also added. Using an algorithm and manual searches for connections between data, the researchers were able to de-anonymise, in other words reveal identities, in 84% of the judgments in less than an hour.

H/T to beSpacific for discovering this report.

From the abstract for G. Patrick Flanagan & Michelle Dewey, Where Do We Go from Here? Transformation and Acceleration of Legal Analytics in Practice (Georgia State University Law Review, Vol. 35, No. 4, 2019):

The advantages of evidence-based decision-making in the practice and theory of law should be obvious: Don’t make arguments to judges that seldom persuade; Jurisprudential analysis ought to align with sound social science; Attorneys should pitch legal work to clients that demonstrably need it. Despite the appearance of simplicity, there are practical and attitudinal barriers to finding and incorporating data into the practice of law.

This article evaluates the current technologies and systems used to publish and analyze legal information from a researcher’s perspective. The authors also explore the technological, economic, political, and legal impediments that have prevented legal information systems from being able to keep pace with other industries and more open models. The authors detail tangible recommendations for necessary next steps toward making legal analytics more widely adopted by practitioners.

Your litigation analytics tool says your win rate for summary judgment motions in class action employment discrimination cases ranks best in your local jurisdiction, according to the database used. Set aside the problems with using PACER data for litigation analytics, possible modeling error, and possible bias embedded in the tool. Can you communicate this applied AI output to a client or potential client? Are you creating an “unjustified expectation” that you will achieve the same result in that client’s matter?

According to the ABA’s Model Rules of Professional Conduct Rule 7.1, you are probably creating an “unjustified expectation.” However, you may be required to use that information under Model Rule 1.1 because that rule creates a duty of technological competence. This tension between Model Rule 7.1 and Model Rule 1.1 is just beginning to be played out.

For more, see Roy Strom’s The Algorithm Says You’ll Win the Case. What Do You Say? in US Law Week’s Big Law Business column for August 5, 2019. See also Melissa Heelan Stanzione, Courts, Lawyers Must Address AI Ethics, ABA Proposal Says, Bloomberg Law, August 6, 2019.

From the abstract for Charlotte Alexander and Mohammed Javad Feizollahi, On Dragons, Caves, Teeth, and Claws: Legal Analytics and the Problem of Court Data Access, in Computational Legal Studies: The Promise and Challenge of Data-Driven Legal Research (Ryan Whalen, ed., Edward Elgar, 2019, Forthcoming):

This chapter provides a case study of data access challenges in a legal analytics project that attempted to study all U.S. district court judges’ decisions in employee misclassification disputes over a ten-year period. The chapter details the data assembly process, problems, and workarounds, and considers the implications for legal analytics and computational law.

Early results from 25% of the AmLaw 200 participating so far in a Feit Consulting survey indicate that the adoption rate of Westlaw Edge and Context by LN is roughly the same, trending at 15%. “Context seems to be getting much more consideration, however, because of its much lower cost. At this point 40% of firms with Lexis are actively considering Context,” according to Feit Consulting’s blog post.

My primary concern is that comparing Westlaw Edge and Context simply because both offer litigation analytics tells only part of the story. Westlaw Edge offers much more than the litigation analytics offered by Context; it also includes WestSearch Plus, KeyCite Overruling Risk, Statutes Compare and Regulations Compare. And Westlaw Edge will eventually replace Westlaw, whereas Context will not replace Lexis Advance.

From the blurb for Kevin D. Ashley, Artificial Intelligence and Legal Analytics: New Tools for Law Practice in the Digital Age (Cambridge UP, 2017):

The field of artificial intelligence (AI) and the law is on the cusp of a revolution that began with text analytic programs like IBM’s Watson and Debater and the open-source information management architectures on which they are based. Today, new legal applications are beginning to appear and this book – designed to explain computational processes to non-programmers – describes how they will change the practice of law, specifically by connecting computational models of legal reasoning directly with legal text, generating arguments for and against particular outcomes, predicting outcomes and explaining these predictions with reasons that legal professionals will be able to evaluate for themselves. These legal applications will support conceptual legal information retrieval and allow cognitive computing, enabling a collaboration between humans and computers in which each does what it can do best. Anyone interested in how AI is changing the practice of law should read this illuminating work.

Trust is a state of readiness to take a risk in a relationship. Once upon a time most law librarians were predisposed to trust legal information vendors and their products and services. Think Shepard’s in print when Shepard’s was the only available citator with signals that were by default the industry standard. Think late 1970s-early 1980s for computer-assisted legal research where the degree of risk taken by a searcher was partially controlled by properly using Boolean operators when Lexis was the only full-text legal search vendor.

Today, output from legal information platforms does not always build confidence in the information provided, be it legal search or legal citator output, as comparative studies of each by Mart and Hellyer have demonstrated. What about the output we are now being offered by way of the implementation of artificial intelligence for legal analytics and predictive technology? As legal information professionals, are we willing to be vulnerable to the actions of our vendors based on some sort of expectation that vendors will provide actionable intelligence important to our user population, irrespective of our ability to monitor or control vendors’ use of artificial intelligence for legal analytics and predictive technology?

Hopefully we are not so naive as to trust our vendors’ applied AI output at face value. But we won’t be given the opportunity to shine a light into the “black box” because of understandable proprietary concerns. What’s needed is a way to identify the impact of model error and bias. One way is to compare similar outputs across vendors, as Mart did for legal search and Hellyer did for citators: either legal analytics outputs that identify trends and patterns using data points from past case law, win/loss rates and even a judge’s history, or predictive technology outputs that forecast litigation outcomes. At the present time, however, our legal information providers do not offer similar enough AI tools for comparative studies, and who knows if they will. Early days…
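For readers wondering what a comparison of this kind might look like in practice, here is a minimal Python sketch. It is an illustration only, not the methodology used by Mart or Hellyer: the function names, the two "tools," and the sample data are all hypothetical. It assumes you can export, for the same query or the same set of matters, the list of results or the predicted outcome from each of two products, and it computes two simple measures: how much two result sets overlap, and how often two tools agree on a predicted outcome.

```python
def jaccard_overlap(results_a, results_b):
    """Share of unique items that both tools returned (0.0 to 1.0)."""
    a, b = set(results_a), set(results_b)
    if not a and not b:
        return 1.0  # two empty result sets are treated as identical
    return len(a & b) / len(a | b)


def agreement_rate(predictions_a, predictions_b):
    """Fraction of matters covered by both tools where the predictions match.

    Each argument maps a matter identifier to a predicted outcome.
    Returns None when the tools have no matters in common.
    """
    shared = predictions_a.keys() & predictions_b.keys()
    if not shared:
        return None
    matches = sum(1 for key in shared if predictions_a[key] == predictions_b[key])
    return matches / len(shared)


# Hypothetical exports from two unnamed products for the same query.
tool_a_cases = ["case-1", "case-2", "case-3"]
tool_b_cases = ["case-2", "case-3", "case-4"]
overlap = jaccard_overlap(tool_a_cases, tool_b_cases)

# Hypothetical outcome predictions keyed by matter identifier.
tool_a_predictions = {"m1": "grant", "m2": "deny", "m3": "grant"}
tool_b_predictions = {"m1": "grant", "m2": "grant", "m4": "deny"}
agreement = agreement_rate(tool_a_predictions, tool_b_predictions)

print(f"Result overlap: {overlap:.2f}")
print(f"Prediction agreement: {agreement:.2f}")
```

Low overlap or low agreement would not say which tool is right, only that the outputs differ enough to warrant a closer look before anyone relies on either one.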

Until there is a legitimate certification process that validates each individual AI product to the end user whenever that user calls up specific applied AI output for legal analytics and predictive technology, is there any reason to assume the risk of using these tools? No, not really, but use them our end users will. Trust but (try to) validate; otherwise the output remains opaque to the end user, and that can lead to illusions of understanding.