GenAI @BC Libraries: Ethical & Legal Issues
We have seen how Generative AI can be a useful tool. However, it comes with several ethical and legal concerns that should be weighed when using it. Questions of copyright and creators' rights, bias in the models, and environmental impacts all counterbalance the utility that Generative AI tools provide.
Copyright & Creators' Rights

The first issue is the sourcing of training data for the models. This is a two-part concern. First, there is the question of the legality of how AI companies obtained the massive quantity of data they needed for training. Publishers and AI companies are currently fighting over whether scraping data from the internet and using it to train a model constitutes copyright infringement or is allowable under fair use. This issue is currently being litigated, and we will not have a definitive answer from the courts for years, though we finally have a first decision regarding copying and non-generative AI.
In Thomson Reuters v. Ross Intelligence, a Delaware federal district court held that the defendant, Ross Intelligence, was liable for infringing Thomson Reuters' copyright in elements of its Westlaw database. The elements in question were Westlaw's Headnotes, short statements summarizing a particular point of law. Ross (via a third party) copied these Headnotes to create its own legal research algorithm. The court explicitly said that this case was not about generative AI. However, the court relied on the fairly recent Warhol decision to hold that using a copy in a way that competes with the copyright holder's business is very unlikely to be fair use, reasoning closely analogous to arguments made by publishers and artists against companies like OpenAI. This opening salvo in the AI copyright wars will likely go through multiple appeals, so there is still much to be decided.
Even if the training of these models turns out to be fair use, we should consider what happens to the content creators whose work was used for free, without permission, payment, or credit, once AI chatbots supplant the need to access that content directly. If AI-generated answers are sufficient for users, fewer people will be able to make a living creating content online, eventually leading to less content of all types.
AI in Publications

Researchers looking to use generative AI in publishing have additional ethical concerns. The Committee on Publication Ethics has taken the position that AI tools cannot be the author of a paper. Authors are responsible for everything in their manuscript, even if something was made up by a chatbot. Many journals have specific rules about where AI is and is not permitted, so authors need to carefully check a journal’s policies before submitting a manuscript. Authors also need to consider the privacy implications of submitting data to an online AI tool for analysis.
Data Bias

Another significant concern is bias in the training data. Since large language models (LLMs) are trained on internet data, they can inherit the biases present online. This includes racist, sexist, and other harmful content, as well as imbalances in representation. The language and cultural context of the training data also play a role, potentially leading to incomplete or incorrect information. Despite efforts by AI companies to document and reduce bias, it is challenging to eliminate these issues entirely.
AI Environmental Impacts

Finally, people must also consider the environmental impact of AI. Training and using AI models both require significant amounts of electricity and water. The Center on Global Energy Policy predicts that by 2026, 4% of total energy sales in the United States will go to LLM data centers, up from 1% in 2022. This growth comes both from training LLMs and from the higher energy cost of querying them compared with traditional search products. Furthermore, the electronic waste generated by the computing hardware needed for AI, particularly GPUs, adds to the environmental burden, and all of this infrastructure also requires vast amounts of water to cool servers. Overall, there are real societal impacts to consider when using AI.