From the abstract for Matthew Sag, The New Legal Landscape for Text Mining and Machine Learning, Journal of the Copyright Society of the USA, Vol 66 (2019):
Individually and collectively, copyrighted works have the potential to generate information that goes far beyond what their individual authors expressed or intended. Various methods of computational and statistical analysis of text, usually referred to as text data mining (“TDM”) or just text mining, can unlock that information. However, because almost every use of TDM involves making copies of the text to be mined, the legality of that copying has become a fraught issue in copyright law in the United States and around the world. One of the most fundamental questions for copyright law in the Internet age is whether the protection of the author’s original expression should stand as an obstacle to the generation of insights about that expression. How this question is answered will have a profound influence on the future of research across the sciences and the humanities, and for the development of the next generation of information technology: machine learning and artificial intelligence.
This Article consolidates a theory of copyright law that I have advanced in a series of articles and amicus briefs over the past decade. It explains why applying copyright’s fundamental principles in the context of new technologies necessarily implies that copying expressive works for non-expressive purposes should not be counted as infringement and must be recognized as fair use. The Article shows how that theory was adopted and applied in the recent high-profile test cases, Authors Guild v. HathiTrust and Authors Guild v. Google, and takes stock of the legal context for TDM research in the United States in the aftermath of those decisions.
The Article makes important contributions to copyright theory, but it also integrates that theory with a practical assessment of various interrelated legal issues that text mining researchers and their supporting institutions must confront if they are to realize the full potential of these technologies. These issues range from the enforceability of website terms of service, the effect of laws prohibiting computer hacking and the circumvention of technological protection measures (i.e., encryption and other digital locks), and cross-border copyright issues.
Kudos to Carl Malamud! The 11th Circuit appeals court has just overturned a lower court ruling and said that Georgia’s laws, including annotations, are not covered by copyright, and it is not infringing to post them online. From the opinion in Code Revision Commission v. Public.Resource.Org:
The OCGA annotations are created by Georgia’s legislative body, which has been entrusted with exercising sovereign power on behalf of the people of Georgia. While the annotations do not carry the force of law in the way that statutes or judicial opinions do, they are expressly given legal significance so that, while not “law,” the annotations undeniably are authoritative sources on the meaning of Georgia statutes. The legislature has stamped them “official” and has chosen to make them an integral part of the official codification of Georgia’s laws. By wrapping the annotations and the statutory text into a single unified edict, the Georgia General Assembly has made the connection between the two inextricable and, thereby, ensured that obtaining a full understanding of the laws of Georgia requires having unfettered access to the annotations. Finally, the General Assembly’s annual adoption of the annotations as part of the laws of Georgia is effected by the legislative process — namely bicameralism and presentment — that is ordinarily reserved for the exercise of sovereign power.
Thus, we conclude that the annotations in the OCGA are attributable to the constructive authorship of the People. To advance the interests and effect the will of the People, their agents in the General Assembly have chosen to create an official exposition on the meaning of the laws of Georgia. In creating the annotations, the legislators have acted as draftsmen giving voice to the sovereign’s will. The resulting work is intrinsically public domain material, belonging to the People, and, as such, must be free for publication by all.
As a result, no valid copyright can subsist in these works.
From the abstract for Kyle K. Courtney’s Fair Use is a Right, Haters to the Left: A Primer for Libraries and Other Cultural Institutions (Mar. 29, 2018):
[I]n the 30 years of fair use’s modern statutory existence, a terrible myth has crept its way into the doctrine: fair use is an “affirmative defense.” This myth has created more controversy and misunderstanding of our most critical of all copyright exceptions. But, in the end, it is just a myth, and citizens, journalists, teachers, librarians, authors, students, artists, and others should work to dispel this myth. Fair use is a right.
This article outlines the history of fair use and the development of the “affirmative defense” myth. Part I discusses the origins and history of fair use and its development from English law. Part II reveals the treatises and other secondary sources that began to muddy the fair use doctrine. And Part III reveals how the myth was adapted into the common law by a few courts. Finally, in Part IV, the author offers three areas to examine and help reverse the affirmative defense myth: the accurate legislative history of fair use, the plain meaning textualism of the statute, and modern fair use case interpretation. Each of these studies will reveal that fair use is, and was always meant to be, a fundamental right.