Project 9: Transformers

Overview

Large language models (LLMs) built atop transformer networks have become extremely popular. Given a context, an LLM estimates a probability distribution over possible next words and generates one that is likely to sound plausible. Because that distribution is encoded in a very large, highly specialized neural network, the resulting text can be quite humanlike.
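The "probabilistic guess" described above can be illustrated with a minimal sketch. It assumes a toy four-word vocabulary and made-up scores (both hypothetical, not taken from any real model): the scores are turned into probabilities with a softmax, and one word is drawn at random according to those probabilities. Real LLMs work over tens of thousands of tokens and compute the scores with a transformer network, so treat this only as an illustration of the sampling step.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Convert raw scores into probabilities that sum to 1."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_next_token(vocab, logits, temperature=1.0, rng=random):
    """Draw one word at random, weighted by its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Hypothetical scores a model might assign after "The cat sat on the".
vocab = ["mat", "sofa", "moon", "equation"]
logits = [3.0, 2.0, 0.5, -1.0]

probs = softmax(logits)
next_token = sample_next_token(vocab, logits, rng=random.Random(0))

for word, p in zip(vocab, probs):
    print(f"{word:10s} {p:.3f}")
print("sampled:", next_token)
```

Lowering `temperature` concentrates probability on the highest-scoring word, while raising it makes less likely words more frequent.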

Level 1

Select two openly available LLMs, such as ChatGPT and Google Bard. Assess them as follows:

Create ten LLM prompts.
Give each prompt to each of the two LLMs you are assessing.
Save the response to each prompt.
Study the prompts and the pairs of responses. Then answer the following questions:
- For each of the ten prompts, which LLM answer did you prefer? Why?
- Overall, what impressions do you have about similarities and differences between these two LLMs?
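One possible way to keep the prompts, responses, and preferences organized is sketched below. It assumes you paste each response in by hand; the field names, the labels `model_a` and `model_b`, and the sample prompt are all hypothetical placeholders. The sketch writes the records as CSV and tallies how often each LLM was preferred.

```python
import csv
import io

# Hypothetical column names for one record per (prompt, model) pair.
FIELDS = ["prompt_id", "prompt", "model", "response", "preferred"]

def write_responses(rows, handle):
    """Write the collected records as CSV to an open file-like object."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)

def preference_counts(rows):
    """Count how many prompts each model 'won'."""
    counts = {}
    for row in rows:
        if row["preferred"]:
            counts[row["model"]] = counts.get(row["model"], 0) + 1
    return counts

rows = [
    {"prompt_id": 1, "prompt": "Explain recursion.", "model": "model_a",
     "response": "(paste response here)", "preferred": True},
    {"prompt_id": 1, "prompt": "Explain recursion.", "model": "model_b",
     "response": "(paste response here)", "preferred": False},
]

buffer = io.StringIO()  # swap for open("responses.csv", "w", newline="") to save a file
write_responses(rows, buffer)
counts = preference_counts(rows)
print(buffer.getvalue())
print(counts)
```

The tally only supports the written answers; the "Why?" for each prompt still needs to be explained in prose.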

Level 2

This Kaggle notebook demonstrates a 400 million parameter version of the BlenderBot model from Meta. This model is specialized for conversation.

Perform the following experiments with it:

Use each of the ten prompts you devised for Level 1 to start a conversation with it. Continue each conversation for at least twelve cycles of questions and answers before saying “bye”. Then answer the following questions:
- What are some similarities and differences between how it answered your prompts and how the LLMs you examined for Level 1 answered them?
- What are the longest, shortest, and average times it needed to generate an answer?
  - Was there any significant difference between the longest and shortest times?
  - If so, was there any particular pattern or situation that seemed to trigger this difference?
- Overall, what were the strengths and weaknesses of this small version of BlenderBot as a conversation partner?
Find three people not enrolled in this course.
- Have each of them sit down with the BlenderBot program.
- Be upfront that it is an AI.
- Ask them to chat with it for a while about a topic of interest to them.
- Then ask them to assess the degree to which they found it to be humanlike.
  - Ask them to be specific about aspects of its responses that were more or less convincing in this regard.
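The timing questions above (longest, shortest, and average generation time) could be answered with a small measurement harness along these lines. This is a sketch under assumptions: `placeholder_generate` is a hypothetical stand-in for whatever function in the notebook actually produces a BlenderBot reply, and you would substitute that call when measuring.

```python
import statistics
import time

def timed_reply(generate, prompt):
    """Call generate(prompt) and return (reply, elapsed seconds)."""
    start = time.perf_counter()
    reply = generate(prompt)
    return reply, time.perf_counter() - start

def summarize(durations):
    """Report the shortest, longest, and average duration."""
    return {
        "shortest": min(durations),
        "longest": max(durations),
        "average": statistics.mean(durations),
    }

def placeholder_generate(prompt):
    """Hypothetical stand-in for the chatbot call; replace with the real one."""
    return "echo: " + prompt

durations = []
for prompt in ["Hello!", "What is your favorite book?", "bye"]:
    reply, seconds = timed_reply(placeholder_generate, prompt)
    durations.append(seconds)

summary = summarize(durations)
print(summary)
```

Recording the prompt and reply length next to each duration makes it easier to spot a pattern behind unusually slow or fast answers.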

Level 3

Do one of the following:

Write a nontrivial program that makes use of the BlenderBot version from Level 2 in an essential way. The program can do anything you would like, but it must be something interesting that has a chatbot as an integral component. Feel free to discuss project ideas with the instructor for feedback. The intention is that any program that employs an LLM in a nontrivial way and is itself an application that someone would conceivably want to use will receive credit.
Select a HuggingFace transformer that performs a task other than text generation (e.g., text-to-image). As with the previous option, embed the transformer in an interesting program for which its use is integral to the program’s purpose.
Configure and set up (either in a Kaggle notebook or on your own machine) an LLM with at least 7 billion parameters in a HuggingFace pipeline.
- Feel free to consult this searchable list of models or this alphabetical list to select one.
- Be sure to set it up to run on a GPU; otherwise, it will run far too slowly. (If you use Kaggle, you have a limit of 30 hours per week of GPU acceleration.)
  - Carefully measure and document its response time with your GPU setup.
- Then repeat the Level 2 work with this newly configured LLM. Compare its performance with BlenderBot for each of the described experiments.
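A rough starting point for the GPU setup is sketched below. It assumes the `transformers`, `torch`, and `accelerate` packages are installed, and the model identifier is a hypothetical placeholder that you would replace with the model you select. The helper names `pipeline_settings` and `load_generator` are invented for this illustration. Argument names such as `torch_dtype` can differ between library releases, so check the documentation for your installed version.

```python
def pipeline_settings(model_id, use_gpu=True):
    """Build keyword arguments for transformers.pipeline (hypothetical helper)."""
    settings = {"task": "text-generation", "model": model_id}
    if use_gpu:
        # device_map="auto" lets the accelerate package place weights on the GPU(s).
        settings["device_map"] = "auto"
    return settings

def load_generator(model_id, use_gpu=True):
    """Load the model; imports are deferred because they are slow and optional."""
    import torch
    from transformers import pipeline

    if use_gpu and not torch.cuda.is_available():
        raise RuntimeError("No CUDA GPU detected; enable GPU acceleration first.")
    settings = pipeline_settings(model_id, use_gpu)
    if use_gpu:
        # Half precision roughly halves memory use compared with float32.
        settings["torch_dtype"] = torch.float16
    return pipeline(**settings)

# Placeholder identifier (an assumption); substitute the model you chose.
MODEL_ID = "your-org/your-7b-model"
print(pipeline_settings(MODEL_ID))

# Uncomment once a real model id is filled in and a GPU is available:
# generator = load_generator(MODEL_ID)
# print(generator("Hello, how are you?", max_new_tokens=40)[0]["generated_text"])
```

Failing early when no GPU is detected avoids accidentally spending a long time on a CPU-only run.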

For all of the above options, schedule a meeting with the instructor to demonstrate your work.

Paper

When you are finished with your experiments, write a paper that includes answers to all questions above.

For the Level 3 programming options, describe your application and its purpose, discuss why a transformer is pertinent, and assess how well it fulfills its stated purpose.

For the Level 3 7-billion+ option, document your machine configuration along with the installation and configuration process at a level sufficient to enable another student in the course to replicate your setup given comparable equipment.