With continued advances in AI, machine learning, and legal analytics anticipated, we can expect legal information platforms to be supplanted by legal intelligence platforms in the not-too-distant future. But what would a legal intelligence (or “smart law”) platform look like? I can’t describe a prototypical legal intelligence platform in any technical detail, but it will exist at the convergence of expert analysis with text- and data-driven features for core legal search across all market segments. I can, however, see what some “smart law” platform elements would be by looking at what Fastcase and Casetext are offering right now.

In my opinion, the best contemporary perspective on what a legal intelligence platform would be is to imagine that Fastcase and Casetext were one company. The imagined vendor would offer, in integrated fashion: Fastcase’s and Casetext’s extensive collections of primary and secondary resources, including legal news and contemporary analysis from the law blogosphere; Fastcase’s search engine algorithms for keyword searching; Casetext’s CARA for contextual searching; Casetext’s SmartCite; Fastcase’s Docket Alarm; Fastcase BK; and Fastcase’s installed base of some 70-75% of US attorneys, all in the context of the industry’s most transparent pricing model, which both Fastcase and Casetext have already adopted.

Obviously, pricing models are not an essential element of a legal intelligence platform. But wouldn’t most potential “smart law” customers prefer transparent pricing? That won’t happen if WEXIS deploys the first legal intelligence platforms. Neither Fastcase nor Casetext (nor Thomson Reuters, LexisNexis, BBNA, or WK) has a “smart law” platform right now. Who will be the first? Perhaps one possibility is hiding in plain sight.

A snip from Casetext’s blog post, Cite-checking the Smart Way: An Interview about SmartCite with Casetext Co-Founder and Chief Product Officer, Pablo Arredondo (May 15, 2019):

“SmartCite was developed through a combination of cutting-edge machine learning, natural language processing, and experienced editorial review. Let’s start with the technology.

“SmartCite looks for patterns in millions of cases and uses judges’ own words to determine whether a case is good law and how a case has been cited by other cases. There are three key data sources analyzed by SmartCite. First, SmartCite looks at “explanatory parentheticals.” You know how judges will summarize other cases using parentheses? By looking for these phrases in opinions, we were able to extract 4.3 million case summaries and explanations written by judges! These explanatory parentheticals provide what I call “artisanal citator entries”: they are insightful, reliable, judge-written summaries of cases.

“The second key data source leveraged by SmartCite are phrases in judicial opinions that indicate that a case has been negatively treated. For example, when a judicial decision cites to a case that is bad law, the judge will often explain why that case is bad law by saying “overruled by” or “reversed by” or “superseded by statute, as stated in…” The same is true with good law. Judicial opinions will often indicate that a case is “affirmed by” another case.

“The third data source we use are Bluebook signals that judges use to characterize and distinguish cases. Bluebook signals can actually tell us a lot about a case. For example, when a judge introduces a case using “but see” or “cf.” or “contra,” the judge is indicating that this case is contrary authority, or that it has treated a legal issue differently from other cases. These contrary signals are powerful indicators of tension in the case law.

“However, using machine learning to look for judicial phrases and Bluebook signals is only the starting point of SmartCite’s analysis. We also rely on experienced editors to manage that process, review the case law, and make decisions on the ‘edge cases.'”
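The pattern-matching idea in the quote above can be sketched in a few lines of Python. This is a toy illustration, not Casetext’s actual pipeline: the phrase lists are hypothetical stand-ins for the signals Arredondo describes, and SmartCite’s real analysis combines machine learning with editorial review.

```python
import re

# Hypothetical phrase lists loosely modeled on the signals described above;
# SmartCite's actual models and editorial process are far more sophisticated.
NEGATIVE_PHRASES = re.compile(
    r"\b(overruled by|reversed by|abrogated by|superseded by statute)\b", re.I)
POSITIVE_PHRASES = re.compile(r"\b(affirmed by|followed by)\b", re.I)
CONTRARY_SIGNALS = re.compile(r"\b(but see|contra|cf\.)\s", re.I)

def classify_treatment(sentence: str) -> str:
    """Crudely bucket a citing sentence by the treatment phrases it contains."""
    if NEGATIVE_PHRASES.search(sentence):
        return "negative"
    if POSITIVE_PHRASES.search(sentence):
        return "positive"
    if CONTRARY_SIGNALS.search(sentence):
        return "contrary-signal"
    return "neutral"

print(classify_treatment("That holding was overruled by Smith v. Jones."))
# negative
```

Real citator work starts, not ends, where a sketch like this stops: the hard cases are exactly the ones these regular expressions miss, which is why the interview stresses editorial review of the “edge cases.”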

See also this page for SmartCite product information.

H/T to Scott Fruehwald for calling attention to Kevin Bennardo & Alexa Chew (UNC), Citation Stickiness, 20 Journal of Appellate Practice & Process (forthcoming), in his Legal Skills Prof Blog post Are Lawyers Citing the Best Cases to Courts? Fruehwald solicits comments on his post. One interesting question is whether the results of Bennardo & Chew’s empirical study mean we have to start teaching legal research differently.

Here’s the abstract to Citation Stickiness:

This Article is an empirical study of what we call citation stickiness. A citation is sticky if it appears in one of the parties’ briefs and then again in the court’s opinion. Imagine that the parties use their briefs to toss citations in the court’s direction. Some of those citations stick and appear in the opinion — these are the sticky citations. Some of those citations don’t stick and are unmentioned by the court — these are the unsticky ones. Finally, some sources were never mentioned by the parties yet appear in the court’s opinion. These authorities are endogenous — they spring from the internal workings of the court itself.

In a perfect adversarial world, the percentage of sticky citations in courts’ opinions would be something approaching 100%. The parties would discuss the relevant authorities in their briefs, and the court would rely on the same authorities in its decision-making. Spoiler alert: our adversarial world is imperfect. Endogenous citations abound in judicial opinions and parties’ briefs are brimming with unsticky citations.

So we crunched the numbers. We analyzed 325 cases in the federal courts of appeals. Of the 7552 cases cited in those opinions, more than half were never mentioned in the parties’ briefs. But there’s more — in the Article, you’ll learn how many of the 23,479 cases cited in the parties’ briefs were sticky and how many were unsticky. You’ll see the stickiness data sliced and diced in numerous ways: by circuit, by case topic, by an assortment of characteristics of the authoring judge. Read on!
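The stickiness bookkeeping described in the abstract is simple set arithmetic. A minimal sketch, assuming citations have already been extracted and normalized to comparable strings (the case names below are invented):

```python
def stickiness(brief_cites: set[str], opinion_cites: set[str]):
    """Partition citations into sticky, unsticky, and endogenous."""
    sticky = brief_cites & opinion_cites      # cited in briefs and opinion
    unsticky = brief_cites - opinion_cites    # tossed at the court, never stuck
    endogenous = opinion_cites - brief_cites  # the court found these itself
    return sticky, unsticky, endogenous

briefs = {"Smith v. Jones", "Doe v. Roe", "A v. B"}
opinion = {"Smith v. Jones", "C v. D"}
sticky, unsticky, endo = stickiness(briefs, opinion)
print(len(sticky), len(unsticky), len(endo))  # 1 2 1
```

The hard empirical work in the study is upstream of this arithmetic: extracting and matching citations across thousands of briefs and opinions so that the sets actually line up.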

H/T to Bob Ambrogi for calling attention to Trialdex, a comprehensive resource for finding and comparing federal and state jury instructions. Bob observes “the site provides a searchable collection of all official or quasi-official federal civil and criminal instructions and annotations, as well as an index of 20,000 legal terms, statutes, CFRs and Supreme Court cases referenced in jury instructions. The index includes every reference in a federal instruction or annotation to a U.S. Supreme Court decision, a U.S. Code statute, a C.F.R. provision, and a federal rule.” Do note that Trialdex does not index state instructions, but it provides links to all state instructions that are posted online and uses a Google search integration to enable full-text search of all state instructions.

From the blurb for National Survey of State Laws, 8th edition, edited by Richard Leiter:

The National Survey of State Laws (NSSL) is a print and online resource that provides an overall view of some of the most asked-about and controversial legal topics in the United States. This database is derived from Richard Leiter’s National Survey of State Laws print editions. Presented in chart format, NSSL allows users to make basic state-by-state comparisons of current state laws. The database is updated regularly as new laws are passed or updated.

The current 8th edition, along with the 7th, 6th, and 5th editions, is included in database format, which allows users to compare the same laws as they existed in 2005, 2008, 2015, and 2018, and to make more current comparisons with laws added or updated in the database since 2018. All print editions are included in HeinOnline’s image-based, fully searchable, user-friendly platform.

The resource is available from Hein here.

From the abstract for Alexa Chew, Stylish Legal Citation, Arkansas Law Review, Vol. 71, Forthcoming:

Can legal citations be stylish? Is that even a thing? Yes, and this Article explains why and how. The usual approach to writing citations is as a separate, inferior part of the writing process, a perfunctory task that satisfies a convention but isn’t worth the attention that stylish writers spend on the “real” words in their documents. This Article argues that the usual approach is wrong. Instead, legal writers should strive to write stylish legal citations — citations that are fully integrated with the prose to convey information in a readable way to a legal audience. Prominent legal style expert Bryan Garner and others have repeatedly pinned legal style problems on citations. For example, Garner has argued that in-line (or textual) citations supposedly interrupt the prose and cause writers to ignore “unshapely” paragraphs and poor flow between sentences. Garner’s cause célèbre has been to persuade lawyers and judges to move their citations into footnotes, which he asserts will fix the stylistic problems caused by citations. This Article proposes both a different explanation for unstylish citations and a different solution. The explanation is that legal style experts don’t address citation as a component of legal style, leaving practitioners with little guidance about how to write stylish citations or even what they look like. This Article summarizes the citation-writing advice offered to practitioners in legal-style books like Plain English for Lawyers. Spoiler alert: it’s not much. The solution is to restructure the revision and editing processes to incorporate citations and treat them like “real” words, too. Rather than cordoning off citations from the rest of the prose, writers should embrace them as integral to the text as a whole. This Article describes a method for writing citations that goes well beyond “Bluebooking.” This method should be useful to any legal writer — from first-semester 1Ls to judicial clerks to experienced appellate practitioners.

H/T to beSpacific for calling attention to Kristina Niedringhaus’ Is it a “Good” Case? Can You Rely on BCite, KeyCite, and Shepard’s to Tell You?, JOTWELL (April 22, 2019) (reviewing Paul Hellyer, Evaluating Shepard’s, KeyCite, and BCite for Case Validation Accuracy, 110 Law Libr. J. 449 (2018)). Here’s a snip:

Hellyer’s article is an important read for anyone who relies on a citator for case validation, or determining whether a case is still “good” law. The results are fascinating, and his methodology is thorough and detailed. Before delving into his findings, Hellyer reviews previous studies and explains his process in detail. His dataset is available upon request. The article has additional value because Hellyer shared his results with the three vendors prior to publication and describes and responds to some of their criticisms in his article, allowing the reader to make their own assessment of the critiques.

From the abstract for Stefan H. Krieger & Katrina Fischer Kuh, Accessing Law: An Empirical Study Exploring the Influence of Legal Research Medium (Vanderbilt Journal of Entertainment & Technology Law, Vol. 16, No. 4, 2014):

The legal profession is presently engaged in an uncontrolled experiment. Attorneys now locate and access legal authorities primarily through electronic means. Although this shift to an electronic research medium radically changes how attorneys discover and encounter law, little empirical work investigates impacts from the shift to an electronic medium.

This Article presents the results of one of the most robust empirical studies conducted to date comparing research processes using print and electronic sources. While the study presented in this Article was modest in scope, the extent and type of the differences that it reveals are notable. Some of the observed differences between print and electronic research processes confirm predictions offered, but never before confirmed, about how the research medium changes the research process. This Article strongly supports calls for the legal profession and legal academy to be more attentive to the implications of the shift to electronic research.

On Politico, Seamus Hughes, deputy director of George Washington University’s Program on Extremism, calls out PACER: “I’m here to tell you that PACER—Public Access to Court Electronic Records—is a judicially approved scam. The very name is misleading: Limiting the public’s access by charging hefty fees, it has been a scam since it was launched and, barring significant structural changes, will be a scam forever.” Read The Federal Courts Are Running An Online Scam (Mar. 20, 2019) here.

From the abstract for Vicenç Feliú, Moreau Lislet: The Man Behind the Digest of 1808:

The Louisiana legal system is unique in the United States, and legal scholars have been interested in learning how this situation came to pass. Most assume that the origin of this system is the Code Napoleon, and even legal scholars steeped in Louisiana law have a hard time answering the question of the roots of the Louisiana legal system. This book solves the riddle through painstaking research into the life of Louis Moreau Lislet, the driving force behind the Digest of 1808.

From the abstract for Ronen Avraham, Database of State Tort Law Reforms (6.1):

This manuscript of the Database of State Tort Law Reforms (6th) (DSTLR) updates the DSTLR (5th) and contains the most detailed, complete, and comprehensive legal dataset of the most prevalent tort reforms in the United States between 1980 and 2018. The DSTLR has been downloaded more than 2,700 times and has become the standard tool in empirical research on tort reform. The dataset records state laws in all fifty states and the District of Columbia over the last several decades. For each reform we record the effective date, a short description of the reform, whether or not the jury is allowed to know about the reform, whether the reform was upheld or struck down by the states’ courts, and whether it was amended by the state legislature. Scholarship studying the empirical effects of tort reforms relies on various datasets (tort reform datasets and other legal compilations). Some of the datasets are created and published independently, and some are created ad hoc by the researchers. The usefulness of these datasets frequently suffers from various defects. They are often incompatible and do not accurately record judicial invalidation of laws. Additionally, they frequently lack reforms adopted before 1986, amendments adopted after 1986, court-based reforms, and effective dates of legislation. It is possible that some of the persisting variation across empirical studies about the effects of tort reforms might be due to the variations in legal datasets used by the studies. This dataset builds upon and improves existing data sources. It does so through a careful review of original legislation and case law to determine the exact text and effective dates. The fifth draft corrects errors that were found in the fourth draft, focuses only on the most prevalent reforms, and standardizes the descriptions of the reforms. A link to an Excel file which codes ten reforms found in DSTLR (6th) can be found here.
It is hoped that creating one “canonized” dataset will increase our understanding of tort reform’s impacts on our lives.

From the blurb for Kendall Svengalis, A Layperson’s Guide to Legal Research and Self-Help Law Books (New England Press 2018):

This unique and revolutionary new reference book provides reviews of nearly 800 significant self-help law books in 85 subject areas, each of which is preceded by a concise and illuminating overview of the subject area, with links to online sources for further information. The appendices include the most complete directory of public law libraries in the United States. This is an essential reference work for any law, public, or academic library that fields legal questions or inquiries.

Highly recommended.

From the abstract for Neal Goldfarb’s Corpus Linguistics in Legal Interpretation: When Is It (In)appropriate? (Feb. 2019):

Corpus linguistics can be a powerful tool in legal interpretation, but like all tools, it is suited for some uses and not for others. At a minimum, that means there are likely to be cases in which corpus data doesn’t yield any useful insights. More seriously, in some cases where the data seems useful, that appearance might prove on closer examination to be misleading. So it is important to be able to distinguish issues on which corpus results are genuinely useful from those on which they are not. A big part of the motivation behind introducing corpus linguistics into legal interpretation is to increase the sophistication and quality of interpretive analysis. That purpose will be disserved if corpus data is cited in support of conclusions that the data doesn’t really support.

This paper is an initial attempt to deal with the problem of distinguishing uses of corpus linguistics that can yield useful data from those that cannot. In particular, the paper addresses a criticism that has been made of the use of corpus linguistics in legal interpretation: namely, that the hypothesis underlying the legal-interpretive use of frequency data is flawed. That hypothesis, according to one of the critics, is that “where an ambiguous term retains two plausible meanings, the ordinary meaning of the term… is the more frequently used meaning[.]” (Although that description is not fully accurate, it will suffice for present purposes.)

The asserted flaw in this hypothesis is that differences in the frequencies of different senses of a word might be due to “reasons that have little to do with the ordinary meaning of that word.” Such differences, rather than reflecting the “sense of a word or phrase that is most likely implicated in a given linguistic context,” might instead reflect at least in part “the prevalence or newsworthiness of the underlying phenomenon that the term denotes.” That argument is referred to in this paper as the Purple-Car Argument, based on a skeptical comment about the use of corpus linguistics in legal interpretation: “If the word ‘car’ is ten times more likely to co-occur with the word ‘red’ than with the word ‘purple,’ it would be ludicrous to conclude from this data that a purple car is not a ‘car.’”

This paper deals with the Purple-Car Argument in two ways. First, it attempts to clarify the argument’s scope by showing that there are ways of using corpus linguistics that do not involve frequency analysis and that are therefore not even arguably subject to the Purple-Car Argument. The paper offers several case studies illustrating such uses.

Second, the paper acknowledges that when frequency analysis is in fact used, there will be cases that do implicate the flaw that the Purple-Car Argument identifies. The problem, therefore, is to figure out how to distinguish these Purple-Car cases from cases in which the Purple-Car Argument does not apply. The paper discusses some possible methodologies that might be helpful in making that determination. It then presents three case studies, focusing on cases that are well known to those familiar with the law-and-corpus-linguistics literature: Muscarello v. United States, State v. Rasabout, and People v. Harris. The paper concludes that the Purple-Car Argument does not apply to Muscarello, that it does apply to Rasabout, and that a variant of the argument applies to the dissenting opinion in Harris.
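The frequency-analysis point at the heart of the Purple-Car Argument is easy to see with a toy corpus. The sketch below (invented sentences, not real corpus data) counts color words that immediately precede “car”; the skewed counts reflect how often red versus purple cars are talked about, which is exactly why raw frequency alone cannot settle what counts as a “car.”

```python
from collections import Counter

# Toy corpus standing in for real corpus data. The Purple-Car point: raw
# co-occurrence frequency tracks how often a phenomenon is *discussed*,
# not whether it falls inside the category the word denotes.
sentences = [
    "she drove a red car", "he bought a red car", "a red car sped past",
    "they painted a purple car",
]
colors = Counter()
for s in sentences:
    words = s.split()
    for i, w in enumerate(words):
        if w == "car" and i > 0:
            colors[words[i - 1]] += 1  # word immediately before "car"

print(colors.most_common())  # [('red', 3), ('purple', 1)]
```

No one would infer from the 3-to-1 skew that a purple car is not a car; Goldfarb’s question is when the analogous inference about contested legal terms is, and is not, legitimate.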

From the abstract for Clark D. Cunningham & Jesse Egbert, Scientific Methods for Analyzing Original Meaning: Corpus Linguistics and the Emoluments Clauses, Fourth Annual Conference of Law & Corpus Linguistics (2019):

In interpreting the Constitution’s text, courts “are guided by the principle that ‘[t]he Constitution was written to be understood by the voters; its words and phrases were used in their normal and ordinary as distinguished from their technical meaning’.” District of Columbia v. Heller, 554 U.S. 570, 576 (2008). According to James Madison: “[W]hatever respect may be thought due to the intention of the Convention, which prepared and proposed the Constitution, as a presumptive evidence of the general understanding at the time of the language used, it must be kept in mind that the only authoritative intentions were those of the people of the States, as expressed through the Conventions which ratified the Constitution.”

In looking for “presumptive evidence of the general understanding at the time of the language used” courts have generally relied on dictionary definitions and selected quotations from texts dating from the period of ratification. This paper presents a completely different, scientifically-grounded approach: applying the tools of linguistic analysis to “big data” about how written language was used at the time of ratification. This data became publicly available in Fall 2018 when the website of the Corpus of Founding Era American English (COFEA) was launched. COFEA contains in digital form over 95,000 texts created between 1760 and 1799, totaling more than 138,800,000 words.

The authors illustrate this scientific approach by analyzing the usage of the word emolument by writers in America during the period covered by COFEA, 1760-1799. The authors selected this project both because the interpretation of two clauses in the Constitution using emolument are of considerable current interest and because the meaning of emolument is a mystery to modern Americans.

The District of Columbia and State of Maryland are currently suing President Donald Trump alleging that his continued ownership of the Trump Hotel in Washington puts him in violation of Constitutional prohibitions on receiving or accepting “emoluments” from either foreign or state governments. The President’s primary line of defense is a narrow reading of emolument as “profit arising from an office or employ.”

The authors accessed every text in COFEA in which emolument appeared – over 2500 examples of actual usage – and analyzed all of these examples using three different computerized search methods. The authors found no evidence that emolument had a distinct narrow meaning of “profit arising from an office or employ.” All three analyses indicated just the opposite: emolument was consistently used and understood as a general and inclusive term.
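Concordance (keyword-in-context) search is one of the basic corpus-linguistics views that analyses like the authors’ build on. A minimal sketch, with a hypothetical `kwic` helper and an invented one-sentence corpus standing in for COFEA’s 95,000 texts:

```python
import re

def kwic(texts, term, window=5):
    """Keyword-in-context: return each occurrence of `term` (singular or
    plural) with `window` words of context on either side."""
    hits = []
    pattern = re.compile(rf"\b{re.escape(term)}s?\b", re.I)
    for text in texts:
        words = text.split()
        for i, w in enumerate(words):
            if pattern.fullmatch(w.strip(".,;:")):
                lo, hi = max(0, i - window), i + window + 1
                hits.append(" ".join(words[lo:hi]))
    return hits

docs = ["No person holding any office shall accept any present or "
        "emolument of any kind whatever."]
print(kwic(docs, "emolument", window=3))
# ['any present or emolument of any kind']
```

Reading concordance lines like these across thousands of texts, rather than relying on a handful of dictionary definitions, is what the authors mean by a scientifically grounded approach to usage.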

Security breach laws typically have provisions regarding who must comply with the law (e.g., businesses, data/information brokers, government entities); definitions of “personal information” (e.g., name combined with SSN, driver’s license or state ID, account numbers); what constitutes a breach (e.g., unauthorized acquisition of data); requirements for notice (e.g., timing or method of notice, who must be notified); and exemptions (e.g., for encrypted information). According to the National Conference of State Legislatures survey, all 50 states, the District of Columbia, Guam, Puerto Rico, and the Virgin Islands have enacted legislation requiring private or governmental entities to notify individuals of security breaches involving personally identifiable information.