GenAI@BC Libraries: The Basics

The Task Force

As we know too well by now, GenAI is developing so fast that today’s pronouncements can be stale by the end of the week. Librarians are by nature thorough and deliberative, so how do we respond in this quickly changing environment?

A neo-gothic building with a square tower and many high windows stands in the middle distance across a snow-covered lawn, framed by snow-covered branches in the foreground. It all looks picture-postcard perfect.
Burns Library in snow, courtesy of BC Office of Marketing & Communications. This image was uploaded to Google Gemini in an attempt to generate images based on it. The 4 images that follow document several attempts.

We assemble a task force. Here are the task force’s goals: 

  • understand the basics of GenAI tools in the research process, 
  • understand best use cases by scholars and researchers, 
  • understand ethical and legal issues, 
  • uncover other areas of concern, and 
  • prepare documentation and workshops for the BC community

With the ground shifting so quickly, we recognize there is no way to provide definitive answers in any of the above categories. This blog article is the first in a series we’re calling “GenAI@BCLibraries,” which will share our task force members’ encounters with GenAI tools and learning experiences to serve broadening conversations about how our uses of GenAI at BC are evolving.

Generative AI and LLMs

Two glowing blue translucent human figures stand at floating transparent screens while a few people in VR goggles in shadows to the sides of the image type at computer screens. The setting is similar to the interior of a high cathedral, with books lining the walls floor to ceiling. Bright sunlight comes in through a skylight.
GenAI image created by Google Gemini in response to this prompt: Please use this picture of a library as a base to create an imaginative image of a university library using generative artificial intelligence.

First of all, let’s get some definitions out of the way. We’re not trying to learn everything about all of AI: that’s much too broad a category. We’re concerned primarily with the most commonly used Generative AI (GenAI) tools, especially chat tools, that are built on Large Language Models (LLMs). Here at BC, the tools ITS has approved for data security are Microsoft Copilot, Google Gemini, and NotebookLM [1]. Many students are also using ChatGPT, in both free and paid versions, along with any number of other proliferating tools.

These tools are cloud-based and built on Large Language Models (LLMs): statistical models trained on huge datasets of language harvested from the open web and other sources such as book and newspaper publishers. (Companies disclose little about the full range of their training data, leading to suspicions that some have circumvented copyright law.) Training a GenAI tool consists of turning all the language data in the training corpus into numeric data, and running statistical modeling that “teaches” complex algorithms to predict what “words” (numeric representations of words) are most likely to follow others. For more detail, here’s a good primer produced by the UK digital education nonprofit Jisc. Though we’ll cover ethics in later articles, it’s important to note that a significant amount of contingent human labor in the Global South is involved in this LLM training.
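For readers who like to see an idea in code, the next-word prediction described above can be sketched at toy scale. This is only an illustration, not how any commercial tool is built: real LLMs use neural networks trained on vastly more text, and the function names and the tiny sample corpus here are invented for the example. The sketch turns each word into a number, counts which numbers follow which, and then "predicts" the most frequent follower.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Map each word to a number, then count which word follows which."""
    words = text.lower().split()

    # Step 1: turn language into numeric data (one ID per distinct word).
    word_to_id = {}
    for word in words:
        word_to_id.setdefault(word, len(word_to_id))
    ids = [word_to_id[word] for word in words]

    # Step 2: gather statistics on which ID tends to follow each ID.
    follow_counts = defaultdict(Counter)
    for current_id, next_id in zip(ids, ids[1:]):
        follow_counts[current_id][next_id] += 1

    return word_to_id, follow_counts


def predict_next(word, word_to_id, follow_counts):
    """Return the word seen most often after `word`, or None if unknown."""
    word_id = word_to_id.get(word.lower())
    followers = follow_counts.get(word_id)
    if word_id is None or not followers:
        return None
    id_to_word = {i: w for w, i in word_to_id.items()}
    best_id, _count = followers.most_common(1)[0]
    return id_to_word[best_id]


# Hypothetical miniature "training corpus" for illustration only.
corpus = (
    "the library opens at nine . "
    "the library closes at five . "
    "the archive opens at ten ."
)
word_to_id, follow_counts = train_bigram_model(corpus)
prediction = predict_next("the", word_to_id, follow_counts)
```

In this corpus "library" follows "the" twice and "archive" only once, so the model predicts "library". Nothing in the code knows what a library is, which is the point of the "approximation of understanding" caution: the output comes from counting, not comprehension.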

With snow falling in the foreground, we are looking in through a vast window into a very modern building with white walls, floors, and ceiling, and many broad windows. People here and there use what seem to be projected translucent screens. On the left side are 3 floors of rows of books on shelves.
GenAI image created by Google Gemini in response to this update to the prior image: I would like you to use the uploaded image in your image design.

Many AI products continuously refine their predictive models by asking users whether answers are satisfactory, and whether they can use your responses for continued training. Many tools (such as ChatGPT) may also incorporate anything you upload (data, your own writing, articles, etc.) into their training corpora. ITS has only signed contracts with providers that claim not to automatically ingest uploaded information. It’s important to remember that these tools do not understand language; they only give a very convincing approximation of understanding it. One heavily cited conference paper characterized GenAI chat tools as “stochastic parrots”.

NotebookLM is somewhat different. Its underlying model is still an LLM, but it applies its predictive text generation to smaller local datasets created by users. For example, a professor might create a dataset of all the articles students are assigned in a class; when students enter prompts, the tool creates answers based on that local dataset rather than on the LLM’s general training data. These local-data tools both avoid the risk of your own data being absorbed into the training corpus and tend to have fewer accuracy problems.
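The "answer from the local dataset" idea can also be sketched in a few lines. This is a simplified illustration under stated assumptions, not a description of how NotebookLM is actually implemented: the word-overlap scoring, the function names, and the two sample readings are all hypothetical. The sketch only shows the first step, choosing which local document a question should be answered from, and declining when nothing in the dataset matches.

```python
import re


def tokenize(text):
    """Lowercase the text and return its distinct words as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=1):
    """Rank local documents by how many words they share with the question.

    Returns up to `top_k` titles, best match first. Documents that share
    no words with the question are left out, so an empty list means the
    local dataset has nothing relevant to draw on.
    """
    question_words = tokenize(question)
    scored = []
    for title, body in documents.items():
        overlap = len(question_words & tokenize(body))
        if overlap > 0:
            scored.append((overlap, title))
    # Highest overlap first; ties are broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:top_k]]


# Hypothetical course readings standing in for a professor's dataset.
course_readings = {
    "Reading A": "Medieval manuscripts were copied by hand in monastery scriptoria.",
    "Reading B": "The printing press made books cheaper and spread literacy across Europe.",
}

matches = retrieve("How did the printing press change literacy?", course_readings)
no_matches = retrieve("quantum computing", course_readings)
```

Here the first question is matched to "Reading B", while the second finds nothing and returns an empty list. Restricting answers to what the local dataset contains is one plausible reason such tools tend to have fewer accuracy problems.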

Embedding in library tools

Several enormous blue crystals tower over a very modern two-story building with broad floor-to-ceiling windows on both floors, so that the entire interior of the building is visible. Snow blankets the ground and coats some shrubs. People inside the windows use massive blue projected screens. Library shelves full of books are in the background.
GenAI image created by Google Gemini in response to a followup prompt about the prior image: I do like the snow, but I was thinking you would combine the building in the uploaded image with the idea of generative ai.

Database vendors and platforms that serve libraries, such as ProQuest, JSTOR, Scopus, and Clarivate, are beginning to add this latter type of AI chat tool, which queries their own database data. As of this writing, ProQuest is piloting its “Research Assistant” in some databases. JSTOR also offers its pilot chat tool by request. BC Libraries is watching these developments carefully. While some vendors offer pilots as free extensions of existing contracts, others, like Clarivate (Web of Science), have announced products that can be added for additional fees. Our library search engine, Primo, has also introduced AI chat as an optional add-on. The non-profit organization Ithaka (which brings us JSTOR) has developed a continually updated list of Generative AI products with brief descriptions.

Supporting faculty

In all ways but the pace of development, our response to these tools is the same as it ever was: we experiment with them, we research the affordances and limits and how other universities and libraries are using them, we gain familiarity, and we partner with faculty to understand your needs and concerns and facilitate your research and teaching. We look forward to hearing from you.

An enormous cluster of blue glowing crystals towers over tiny people standing near shelves and computer screens in a dark outdoor library under a snowy night sky. Strange spirals of blue light rise from the crystals and envelop a gothic tower.
GenAI image generated by Google Gemini with this final, revised prompt: Please use details of the neo-gothic library in this image to create an imaginative picture about how a library uses generative ai.

Author’s note about the images: The point isn’t so much that Google Gemini is poor at image generation, but that a novice user might have to experiment quite a lot to get anything like satisfactory results. Often, creating effective GenAI prompts is more an art than a science.

  1. At the time of this writing, use of NotebookLM is limited to the standard version, and is intended for exploration, as it’s not a core Google service and could be removed or suspended by Google. In other words, don’t design necessary functions around it.