Project 9: Transformers
Overview
Large language models (LLMs) built atop transformer networks have become extremely popular. Given a context, they repeatedly make a probabilistic guess at a plausible next word. Because the probability distribution over words is encoded in a very large, highly specialized neural network, they can produce quite humanlike text.
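To make the "probabilistic guess" concrete, here is a minimal Python sketch of sampling one next word from a score table. The scores, the word list, and the function names are invented for illustration. A real LLM computes scores like these with its neural network over a vocabulary of many thousands of tokens, but the final sampling step is similar in spirit.

```python
import math
import random
from typing import Dict


def softmax(scores: Dict[str, float], temperature: float = 1.0) -> Dict[str, float]:
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {w: math.exp((s - top) / temperature) for w, s in scores.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}


def sample_next_word(scores: Dict[str, float], rng: random.Random,
                     temperature: float = 1.0) -> str:
    """Draw one word at random, weighted by its probability."""
    probs = softmax(scores, temperature)
    words = list(probs)
    return rng.choices(words, weights=[probs[w] for w in words], k=1)[0]


# Made-up scores for the context "The cat sat on the" (an assumption, not model output).
toy_scores = {"mat": 3.0, "sofa": 2.0, "roof": 1.0, "equation": -2.0}

if __name__ == "__main__":
    rng = random.Random()
    print(softmax(toy_scores))
    print([sample_next_word(toy_scores, rng) for _ in range(5)])
```

Lowering `temperature` makes the most likely word dominate, while raising it spreads the choices out. This is one reason the same prompt can yield different responses on different runs.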
Level 1
Select two openly available LLMs, such as ChatGPT and Google Bard. Assess them as
follows:
- Create ten LLM prompts.
- Give each prompt to each of the two LLMs you are assessing.
- Save the response to each prompt.
- Study the prompts and the pairs of responses. Then answer the following questions:
  - For each of the ten prompts, which LLM answer did you prefer? Why?
  - Overall, what impressions do you have about similarities and differences
    between these two LLMs?
Level 2
This Kaggle notebook demonstrates a 400-million-parameter version of the BlenderBot
model from Meta. This model is specialized for conversation.
Perform the following experiments with it:
- Use each of the ten prompts you devised for Level 1 to start a conversation with
  it. Continue each conversation for at least twelve cycles of questions and answers
  before saying “bye”. Then answer the following questions:
  - How did its answers to your prompts resemble, and differ from, those of the
    LLMs you examined for Level 1?
  - What are the longest, shortest, and average times it needed to generate an answer?
    - Was there any significant difference between the longest and shortest times?
      - If so, was there any particular pattern or situation that seemed to trigger
        this difference?
  - Overall, what were the strengths and weaknesses of this small version of
    BlenderBot as a conversation partner?
- Find three people not enrolled in this course.
  - Have each of them sit down with the BlenderBot program.
  - Be upfront that it is an AI.
  - Ask them to chat with it for a while about a topic of interest to them.
  - Then ask them to assess the degree to which they found it to be humanlike.
    - Ask them to be specific about aspects of its responses that were more or
      less convincing in this regard.
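The timing questions above (longest, shortest, and average generation time) can be answered with a small harness along these lines. This is only a sketch: `reply_fn` is a hypothetical stand-in for whatever function in your notebook takes a prompt and returns BlenderBot's reply, and `echo_bot` exists only so the example runs by itself.

```python
import statistics
import time
from typing import Callable, Dict, List, Tuple


def timed_reply(reply_fn: Callable[[str], str], prompt: str) -> Tuple[str, float]:
    """Call reply_fn once and return (reply, elapsed seconds)."""
    start = time.perf_counter()
    reply = reply_fn(prompt)
    elapsed = time.perf_counter() - start
    return reply, elapsed


def summarize(times: List[float]) -> Dict[str, float]:
    """Shortest, longest, and average of the recorded times."""
    return {
        "shortest": min(times),
        "longest": max(times),
        "average": statistics.mean(times),
    }


def echo_bot(prompt: str) -> str:
    """Placeholder chatbot (an assumption); swap in the real model call."""
    return "You said: " + prompt


if __name__ == "__main__":
    times = []
    for prompt in ["Hello!", "What is your favorite food?", "bye"]:
        reply, elapsed = timed_reply(echo_bot, prompt)
        times.append(elapsed)
        # Keeping the prompt next to its time helps spot patterns later.
        print(f"{elapsed:.4f}s  {prompt!r} -> {reply!r}")
    print(summarize(times))
```

Recording the prompt (and perhaps the reply length) alongside each time makes it easier to look for a pattern behind unusually slow or fast responses.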
Level 3
Do one of the following:
- Write a nontrivial program that makes use of the BlenderBot version from Level 2
in an essential way. The program can do anything you would like, but it must be
something interesting that has a chatbot as an integral component. Feel free to
discuss project ideas with the instructor for feedback. The intent is to give
credit to any program that employs an LLM in a nontrivial way and is itself an
application that someone would conceivably want to use.
- Select a HuggingFace transformer that performs a task other than text
generation (e.g., text-to-image). As with the previous option, embed the transformer
in an interesting program in which its use is integral to the program’s purpose.
- Configure and set up (either in a Kaggle notebook or on your own machine) an
  LLM with at least 7 billion parameters in a HuggingFace pipeline.
  - Feel free to consult this searchable list of models or this alphabetical list to select one.
  - Be sure to set it up to run on a GPU; otherwise, it will run far too slowly. (If you use Kaggle, you have a limit of 30 hours per week of GPU acceleration.)
  - Carefully measure and document its response time with your GPU setup.
  - Then repeat the Level 2 work with this newly configured LLM. Compare its
    performance with BlenderBot for each of the described experiments.
For all of the above options, the student should schedule a meeting with the
instructor to demonstrate their work.
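For the 7-billion-parameter option, the setup might look roughly like the following sketch. It assumes the `transformers` and `torch` packages are installed and that a CUDA GPU is visible. The model name is a placeholder, not a recommendation, and the helper names are invented for illustration. Half precision is used because 7 billion parameters at 2 bytes each already occupy about 14 GB of GPU memory. Check the documentation for your installed version of `transformers`, since argument names can change between releases.

```python
import time
from typing import Tuple


def pick_device(cuda_available: bool) -> int:
    """Device index in the pipeline's convention: 0 = first GPU, -1 = CPU."""
    return 0 if cuda_available else -1


def build_generator(model_name: str):
    """Create a text-generation pipeline on the GPU, or fail loudly."""
    # Imported here so the rest of the file works without the heavy packages.
    import torch
    from transformers import pipeline

    device = pick_device(torch.cuda.is_available())
    if device < 0:
        raise RuntimeError("No CUDA GPU visible; enable GPU acceleration first.")
    return pipeline(
        "text-generation",
        model=model_name,
        device=device,
        torch_dtype=torch.float16,  # half precision to fit in GPU memory
    )


def time_generation(generator, prompt: str, max_new_tokens: int = 64) -> Tuple[str, float]:
    """Run one generation and return (generated text, elapsed seconds)."""
    start = time.perf_counter()
    outputs = generator(prompt, max_new_tokens=max_new_tokens)
    elapsed = time.perf_counter() - start
    return outputs[0]["generated_text"], elapsed


if __name__ == "__main__":
    # Hypothetical model identifier; replace it with the model you selected.
    generator = build_generator("your-org/your-7b-model")
    text, seconds = time_generation(generator, "Tell me about transformers.")
    print(f"{seconds:.2f}s: {text}")
```

The first call often includes one-time warm-up costs, so it may be worth reporting it separately from later measurements.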
Paper
When you are finished with your experiments, write a paper that includes answers
to all questions above.
For the Level 3 programming options, describe your
application and its purpose, discuss why a transformer is pertinent, and assess
how well it fulfills its stated purpose.
For the Level 3 7-billion+ option, document your machine configuration along
with the installation and configuration process at a level sufficient to enable
another student in the course to replicate your setup given comparable equipment.