From the abstract for Jake Goldenfein, Algorithmic Transparency and Decision-Making Accountability: Thoughts for Buying Machine Learning Algorithms (Sept. 9, 2019):

There has been a great deal of research on how to achieve algorithmic accountability and transparency in automated decision-making systems – especially those used in public governance. However, good accountability in the implementation and use of automated decision-making systems is far from simple. It involves multiple overlapping institutional, technical, and political considerations, and becomes all the more complex in the context of machine-learning-based, rather than rule-based, decision systems. This chapter argues that relying on human oversight of automated systems, so-called ‘human-in-the-loop’ approaches, is entirely deficient, and suggests that addressing transparency and accountability during the procurement phase of machine learning systems – during their specification and parameterisation – is critical. In a machine-learning-based automated decision system, the accountability typically associated with a public official making a decision has already been displaced into the actions and decisions of those creating the system – the bureaucrats and engineers involved in building the relevant models, curating the datasets, and implementing the system institutionally. But what should those system designers be thinking about and asking for when specifying those systems?

There are many accountability mechanisms available for system designers to consider, including new computational transparency mechanisms, ‘fairness’ and non-discrimination, and ‘explainability’ of decisions. If an official specifies that a system be transparent, fair, or explainable, however, it is important that they understand the limitations of such a specification in the context of machine learning. Each of these approaches is fraught with risks, limitations, and the challenging political economy of technology platforms in government. Without an understanding of the complexities and limitations of those accountability and transparency ideas, such specifications risk disempowering public officials in the face of private industry technology vendors, who use trade secrets and market power in deeply problematic ways, as well as producing deficient accountability outcomes. This chapter therefore outlines the risks associated with corporate co-option of those transparency and accountability mechanisms, and suggests that significant resources must be invested in developing the necessary skills in the public sector for deciding whether a machine learning system is useful and desirable, and how it might be made as accountable and transparent as possible.

“Managing the uncertainty that is inherent in machine learning for predictive modeling can be achieved via the tools and techniques from probability, a field specifically designed to handle uncertainty,” writes Jason Brownlee in A Gentle Introduction to Uncertainty in Machine Learning. From his post, readers will learn that:

  • Uncertainty is the biggest source of difficulty for beginners in machine learning, especially developers.
  • Noise in data, incomplete coverage of the domain, and imperfect models provide the three main sources of uncertainty in machine learning.
  • Probability provides the foundation and tools for quantifying, handling, and harnessing uncertainty in applied machine learning (see the sketch after this list).
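
That last point is easy to make concrete. Below is a minimal Python sketch, not from Brownlee's post; the synthetic dataset and model choice are illustrative. A classifier reports per-class probabilities rather than bare labels, turning the uncertainty from noisy data and an imperfect model into explicit numbers:

```python
# A minimal sketch (not from Brownlee's post): reporting predictive
# uncertainty as class probabilities instead of bare labels.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic, deliberately noisy data (flip_y mislabels 10% of examples)
# stands in for the three sources of uncertainty: noise in the data,
# incomplete coverage of the domain, and an imperfect model.
X, y = make_classification(n_samples=1000, n_features=10, flip_y=0.1,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# predict_proba exposes the model's uncertainty: one probability per class.
for p in model.predict_proba(X_test[:5]):
    print(f"P(class 0) = {p[0]:.2f}  P(class 1) = {p[1]:.2f}")
```

Probabilities like these, rather than hard yes/no answers, are what the post means by quantifying and harnessing uncertainty.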

From the blurb for Law as Data: Computation, Text, and the Future of Legal Analysis (SFI Press, 2019):

In recent years, the digitization of legal texts and developments in the fields of statistics, computer science, and data analytics have opened entirely new approaches to the study of law. This volume explores the new field of computational legal analysis, an approach marked by its use of legal texts as data. The emphasis herein is work that pushes methodological boundaries, either by using new tools to study longstanding questions within legal studies or by identifying new questions in response to developments in data availability and analysis. By using the text and underlying data of legal documents as the direct objects of quantitative statistical analysis, Law as Data introduces the legal world to the broad range of computational tools already proving themselves relevant to law scholarship and practice, and highlights the early steps in what promises to be an exciting new approach to studying the law.

Swiss researchers have found that algorithms that mine large swaths of data can eliminate anonymity in federal court rulings. This could have major ramifications for transparency and privacy protection. The study relied on web scraping to assemble its data: the researchers created a database of all decisions of the Federal Supreme Court available online from 2000 to 2018 – a total of 122,218 decisions – and added further decisions from the Federal Administrative Court and the Federal Office of Public Health. Using an algorithm and manual searches for connections between the data, the researchers were able to de-anonymise – that is, reveal the identities behind – 84% of the judgments in less than an hour.

H/T to beSpacific for discovering this report.
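
The linkage step at the heart of the study can be sketched in a few lines of Python. The example below is hypothetical: the invented case IDs, party names, and register entries stand in for the scraped rulings and Federal Office of Public Health data the researchers actually used. The point is that a plain join on a shared attribute is often enough to undo anonymization:

```python
# Hypothetical illustration of the linkage step; the study's actual data
# pipeline is not reproduced here. All case IDs, parties, and register
# entries below are invented.
import pandas as pd

rulings = pd.DataFrame({
    "case_id": ["9C_1/2017", "9C_2/2018"],   # invented docket numbers
    "party": ["A. AG", "B. GmbH"],           # anonymized party names
    "drug": ["DrugX", "DrugY"],              # attribute left in the ruling
})

public_register = pd.DataFrame({             # e.g., a public approval list
    "drug": ["DrugX", "DrugY"],
    "company": ["Acme Pharma AG", "Beta Biotech GmbH"],
})

# Joining on the shared attribute re-identifies the anonymized party.
linked = rulings.merge(public_register, on="drug")
print(linked[["case_id", "party", "company"]])
```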

At its annual meeting in August, the ABA adopted this AI resolution:

RESOLVED, That the American Bar Association urges courts and lawyers to address the emerging ethical and legal issues related to the usage of artificial intelligence (“AI”) in the practice of law including: (1) bias, explainability, and transparency of automated decisions made by AI; (2) ethical and beneficial usage of AI; and (3) controls and oversight of AI and the vendors that provide AI.

In this Digital Detectives podcast episode on Legal Talk Network, hosts Sharon Nelson and John Simek are joined by Fastcase CEO Ed Walters to discuss this resolution. Recommended.

End Note: It will be interesting to see whether, and if so how, the ABA fulfills its promise regarding controls and oversight of AI vendors.

From Chapter One in Evaluating Machine Learning Models by Alice Zheng:

One of the core tasks in building a machine learning model is to evaluate its performance. It’s fundamental, and it’s also really hard. My mentors in machine learning research taught me to ask these questions at the outset of any project: “How can I measure success for this project?” and “How would I know when I’ve succeeded?” These questions allow me to set my goals realistically, so that I know when to stop. Sometimes they prevent me from working on ill-formulated projects where good measurement is vague or infeasible. It’s important to think about evaluation up front.
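
Zheng's two questions translate directly into practice: fix a metric and a held-out test set before modeling begins. Here is a minimal sketch, not from her book; the dataset and metric are illustrative:

```python
# A minimal sketch (not from Zheng's book): deciding how to measure
# success up front, via a fixed metric on a held-out test set.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)

# The held-out split is the measurement instrument; the model never sees it.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# One pre-chosen number answers "how would I know when I've succeeded?"
print(f"Held-out accuracy: {accuracy_score(y_test, model.predict(X_test)):.3f}")
```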

From the summary for Counting Regulations: An Overview of Rulemaking, Types of Federal Regulations, and Pages in the Federal Register (R43056, Updated September 3, 2019):

Federal rulemaking is an important mechanism through which the federal government implements policy. Federal agencies issue regulations pursuant to statutory authority granted by Congress. Therefore, Congress may have an interest in performing oversight of those regulations, and measuring federal regulatory activity can be one way for Congress to conduct that oversight. The number of federal rules issued annually and the total number of pages in the Federal Register are often referred to as measures of the total federal regulatory burden.

Bloomberg BNA has changed its name to Bloomberg Industry Group because “The new name better reflects the diverse range of businesses and professionals the company serves and the wide range of markets where it operates.” From the press release:

“Since our company was acquired by Bloomberg in 2011, we’ve developed a broad portfolio of products and solutions while serving a changing marketplace,” said Josh Eastright, CEO of Bloomberg Industry Group. “At the same time, we’ve transformed from a periodical publisher to a product- and technology-focused company. Our new name more accurately reflects who we are today—a company that empowers industry professionals with critical information to take decisive action and make the most of every opportunity.”

Jean O’Grady opines “The new name … gives us a wonderful new acronym in Legal Publishing: BIG.”

From the abstract for G. Patrick Flanagan & Michelle Dewey, Where Do We Go from Here? Transformation and Acceleration of Legal Analytics in Practice (Georgia State University Law Review, Vol. 35, No. 4, 2019):

The advantages of evidence-based decision-making in the practice and theory of law should be obvious: don’t make arguments to judges that seldom persuade; jurisprudential analysis ought to align with sound social science; attorneys should pitch legal work to clients that demonstrably need it. Despite the appearance of simplicity, there are practical and attitudinal barriers to finding and incorporating data into the practice of law.

This article evaluates the current technologies and systems used to publish and analyze legal information from a researcher’s perspective. The authors also explore the technological, economic, political, and legal impediments that have prevented legal information systems from being able to keep pace with other industries and more open models. The authors detail tangible recommendations for necessary next steps toward making legal analytics more widely adopted by practitioners.

From the abstract for John Nay, Natural Language Processing and Machine Learning for Law and Policy Texts (Aug. 23, 2019):

Almost all law is expressed in natural language; therefore, natural language processing (NLP) is a key component of understanding and predicting law at scale. NLP converts unstructured text into a formal representation that computers can understand and analyze. The intersection of NLP and law is poised for innovation because there are (i) a growing number of repositories of digitized, machine-readable legal text data, (ii) advances in NLP methods driven by algorithmic and hardware improvements, and (iii) the potential to improve the effectiveness of legal services due to inefficiencies in its current practice.

NLP is a large field and, like many research areas related to computer science, it is rapidly evolving. Within NLP, this paper focuses primarily on statistical machine learning techniques because they demonstrate significant promise for advancing text informatics systems and will likely be relevant in the foreseeable future.

First, we provide a brief overview of the different types of legal texts and the different types of machine learning methods to process those texts. We introduce the core idea of representing words and documents as numbers. Then we describe NLP tools for leveraging legal text data to accomplish tasks. Along the way, we define important NLP terms in italics and offer examples to illustrate the utility of these tools. We describe methods for automatically summarizing content (sentiment analyses, text summaries, topic models, extracting attributes and relations, document relevance scoring), predicting outcomes, and answering questions.
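
The paper's core idea of representing words and documents as numbers can be illustrated in a few lines. The sketch below is not from the paper, and the sample sentences are invented; it builds a TF-IDF document-term matrix, the kind of formal representation on which the tools Nay surveys operate:

```python
# An illustrative sketch (not from Nay's paper): turning legal text into
# numbers with a TF-IDF document-term matrix. The sentences are invented.
from sklearn.feature_extraction.text import TfidfVectorizer

documents = [
    "The agency shall issue regulations pursuant to statutory authority.",
    "The court reviews agency action for abuse of discretion.",
    "Congress may perform oversight of federal regulations.",
]

vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(documents)   # rows: documents, columns: terms

# Each document is now a numeric vector a downstream model can analyze.
print(X.shape)                            # (3, number of distinct terms)
print(vectorizer.get_feature_names_out()[:8])
```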

From the abstract for Harry Surden’s The Ethics of Artificial Intelligence in Law: Basic Questions (Forthcoming chapter in Oxford Handbook of Ethics of AI, 2020):

Ethical issues surrounding the use of Artificial Intelligence (AI) in law often share a common theme. As AI becomes increasingly integrated within the legal system, how can society ensure that core legal values are preserved?

Among the most important of these legal values are: equal treatment under the law; public, unbiased, and independent adjudication of legal disputes; justification and explanation for legal outcomes; outcomes based upon law, principle, and facts rather than social status or power; outcomes premised upon reasonable, and socially justifiable grounds; the ability to appeal decisions and seek independent review; procedural fairness and due process; fairness in design and application of the law; public promulgation of laws; transparency in legal substance and process; adequate access to justice for all; integrity and honesty in creation and application of law; and judicial, legislative, and administrative efficiency.

The use of AI in law may diminish or enhance how these values are actually expressed within the legal system or alter their balance relative to one another. This chapter surveys some of the most important ethical topics involving the use of AI within the legal system itself (but not its use within society more broadly) and examines how central legal values might unintentionally (or intentionally) change with increased use of AI in law.

The first of its kind, Paul T. Jaeger & Natalie Greene Taylor, Foundations of Information Policy (ALA Neal-Schuman, 2019) provides a much-needed introduction to the myriad information policy issues that impact information professionals, information institutions, and the patrons and communities served by those institutions. In this key textbook for LIS students and reference text for practitioners, noted scholars Jaeger and Taylor —

  • draw from current, authoritative sources to familiarize readers with the history of information policy;
  • discuss the broader societal issues shaped by policy, including access to infrastructure, digital literacy and inclusion, accessibility, and security;
  • elucidate the specific laws, regulations, and policies that impact information, including net neutrality, filtering, privacy, openness, and much more;
  • use case studies from a range of institutions to examine the issues, bolstered by discussion questions that encourage readers to delve more deeply;
  • explore the intersections of information policy with human rights, civil rights, and professional ethics; and
  • prepare readers to turn their growing understanding of information policy into action, through activism, advocacy, and education.