Project 9: Language Models
Overview
Language models have transformed the field of Artificial Intelligence. In this lab,
we will first explore a very simple language model for classifying the language in which
text is written. We will then employ a transformer-based language model as a component
of a larger application. Going further, we will compare and contrast the use of
different language models for the same application.
Level 1: Modeling Language with Markov Chains
The csci335 repository contains a
learning.markov package that we will use in this project.
It contains the following files; files you modify are marked with an asterisk (*):
- LanguageGuesser: A GUI that allows the user to:
  - Train a Markov chain using a reference text. For your convenience,
    reference texts in English, Spanish, French, and German are provided in the
    books directory.
  - Type a sentence and see its classification, along with the probability
    distribution underlying that classification.
- MarkovChain*: Implements a collection of Markov chains, one for each designated label
  (a conceptual sketch of the idea appears after this file list):
  - count(): Increases the count for the transition from prev to next.
  - probability(): Returns the probability of the chain for label generating sequence.
  - labelDistribution(): Returns a map whose keys are labels and whose values are the probabilities that a sequence corresponds to each label.
  - bestMatchingChain(): Calls labelDistribution() and returns the label
    with the highest probability.
- MarkovLanguage: Extends MarkovChain with utility methods to assist
  with language classification.
- SimpleMarkovTest: Unit tests featuring some simple examples.
- MajorMarkovTest: Unit tests trained using the four provided books.
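To build intuition for what these classes do, here is a minimal, purely illustrative Python sketch of the idea: one table of transition counts per label, a per-label probability for a sequence, a normalized distribution over labels, and an argmax. It assumes character-level transitions and crude add-one smoothing; the Java implementation in learning.markov may differ in these details, so treat this only as a conceptual guide.

from collections import defaultdict

class TinyMarkov:
    """Illustrative only: character-level Markov chains, one per label."""

    def __init__(self):
        # counts[label][prev][next] = number of observed prev -> next transitions
        self.counts = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))

    def count(self, label: str, prev: str, nxt: str):
        self.counts[label][prev][nxt] += 1

    def train(self, label: str, text: str):
        for prev, nxt in zip(text, text[1:]):
            self.count(label, prev, nxt)

    def probability(self, label: str, sequence: str) -> float:
        # Product of smoothed transition probabilities under this label's chain.
        p = 1.0
        for prev, nxt in zip(sequence, sequence[1:]):
            row = self.counts[label][prev]
            p *= (row[nxt] + 1) / (sum(row.values()) + 1)  # crude add-one smoothing
        return p

    def label_distribution(self, sequence: str) -> dict:
        raw = {label: self.probability(label, sequence) for label in list(self.counts)}
        total = sum(raw.values())
        return {label: p / total for label, p in raw.items()} if total > 0 else raw

    def best_matching_label(self, sequence: str) -> str:
        dist = self.label_distribution(sequence)
        return max(dist, key=dist.get)

For example, after training one chain on English text and another on Spanish text, best_matching_label("the dog") should favor the English chain, because its observed character transitions assign that sequence a higher probability.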
- Obtain four sentences in each of English, Spanish, French, and German, and test and record how well
LanguageGuesser classifies each sentence.
- You may obtain sentences by searching for them on the web, writing them
yourself, using Google Translate, etc.
- Select four other languages, and obtain four sentences in each of them.
- Each language must have a writing system that employs Latin characters.
- Run each sentence through
LanguageGuesser. Given how it was trained,
how plausible are its guesses?
Paper
When you are finished with your experiments, write a paper summarizing your findings. Include the following:
- The URL for the private GitHub repository containing your code.
- A table containing the 32 sentences you gathered. The first column should give the language, the second column the sentence, the third column the best matching language, and the fourth through seventh columns should give the probability of each of English, Spanish, French, and German.
- An analysis of the performance of your implementation based on the data recorded in the table.
- An analysis of the plausibility of the classifications of the sentences from
languages other than those on which it was trained.
- Based on the performance of Markov chains for the language classification task,
for what other kinds of tasks do you believe this approach would be useful?
Carefully explain your answer.
Level 2: Employing a Language Model in an Application
The Ollama project enables developers to download and run language models, both
interactively and from within application software. Install Ollama. Then
open a command prompt and enter the following command to download and run the gemma3:1b model:

    ollama run gemma3:1b

At the prompt, type: Explain your capabilities.
After reading over its capabilities, formulate a concept for an application that would make
use of gemma3:1b in a natural and effective way. To access Ollama from Python, install the
ollama package by entering the following command at the command prompt:

    pip install ollama

Having done that, the following Python code will give your program access to Ollama:
from ollama import chat
import time

class Chatbot:
    def __init__(self, role: str):
        # The system message establishes the chatbot's role for the whole conversation.
        self.message_history = [{'role': 'system', 'content': f"{role}"}]

    def talk(self, msg: str) -> str:
        # Record the user's message, send the full history to the model,
        # and record the model's reply so later turns keep their context.
        self.message_history.append({"role": "user", "content": msg})
        start = time.time()
        response = chat(
            model="gemma3:1b",
            messages=self.message_history
        )
        duration = time.time() - start
        print(f"{duration:.2f}s to reply")
        assistant_reply = response.message.content
        self.message_history.append({"role": "assistant", "content": assistant_reply})
        return assistant_reply

    def converse(self, prompt: str):
        # Simple read-talk-print loop; the user types `bye` to end the conversation.
        print(f"{prompt} Type `bye` to exit.")
        while True:
            line = input("You> ")
            if line == 'bye':
                return
            print(self.talk(line))

if __name__ == '__main__':
    bot1 = Chatbot("You are an auto mechanic, highly experienced in diagnosing car trouble.")
    bot1.converse("Enter a question about car trouble.")
    bot2 = Chatbot("You are a guide to helping lost people get out of the Amazon rain forest safely.")
    bot2.converse("You just landed in the Amazon jungle and you need to find your way to civilization.")
When creating a Chatbot instance, it is important to give it a particular role
that establishes its context. Examine the bot1 and bot2 object definitions above
for examples of such roles.
Feel free to use the above code as inspiration for your own design,
and to modify it as you see fit.
Your own application must meet the following requirements:
- The chatbots must fit naturally into the context of the application.
- The application should have at least three distinct chatbot personalities,
each with its own role and conversation history (one possible structure is sketched after this list).
- The application must use the
gemma3:1b model from Ollama.
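As one possible structure (purely illustrative, not a requirement), an application can keep several Chatbot instances, each constructed with its own role and therefore its own message history, and route each user turn to the appropriate one. The personality names and roles below are made-up placeholders, and the sketch assumes the Chatbot class defined above:

# Hypothetical example: three personalities inside one application.
# Assumes the Chatbot class shown earlier; names and roles are placeholders.
personalities = {
    "planner": Chatbot("You are a meticulous travel planner who builds day-by-day itineraries."),
    "foodie": Chatbot("You are a local food critic who recommends restaurants and dishes."),
    "budgeter": Chatbot("You are a frugal accountant who estimates and trims trip costs."),
}

while True:
    choice = input("Talk to planner, foodie, or budgeter (or type bye)> ")
    if choice == "bye":
        break
    if choice in personalities:
        question = input("You> ")
        print(personalities[choice].talk(question))

Because each Chatbot keeps its own message_history, each personality remembers only its own conversation, which is what the "own role and history" requirement above asks for.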
Then add the following to your evaluation document:
- A description of your application, and why including chatbots makes
sense for it.
- A description of each chatbot personality and the prompts you provided
to induce each role.
- A discussion of the degree to which each chatbot acts in the manner
you intended. Discuss specifically how you modified each chatbot's prompt
in order to induce the behavior you desired.
- A discussion of the overall success of the application.
Level 3: Comparing Language Models
For the application you wrote for Level 2, try out three additional language models from
Ollama's library. They
could be larger or smaller variants of Gemma 3, or they could come from entirely different
model families. How do they compare to gemma3:1b? Discuss the advantages and disadvantages
of each of the models you examine, and in your evaluation document draw conclusions about
which of the four language models is best suited for your task.
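One convenient way to run this comparison (a sketch under assumptions, not a required design) is to make the model tag a parameter instead of the hard-coded "gemma3:1b". The subclass below and the model tags in the loop are hypothetical examples; check Ollama's library for the exact names and sizes you want, and pull each model before using it.

# Hypothetical variation of the Chatbot class that takes the model tag as a parameter.
# Assumes the Chatbot class and the `chat` import shown earlier; timing is omitted for brevity.
class ComparableChatbot(Chatbot):
    def __init__(self, role: str, model: str):
        super().__init__(role)
        self.model = model

    def talk(self, msg: str) -> str:
        self.message_history.append({"role": "user", "content": msg})
        response = chat(model=self.model, messages=self.message_history)
        reply = response.message.content
        self.message_history.append({"role": "assistant", "content": reply})
        return reply

# Example model tags only; verify exact names in Ollama's library and
# run `ollama pull <tag>` for each before comparing.
for tag in ["gemma3:1b", "gemma3:4b", "llama3.2:1b", "qwen2.5:1.5b"]:
    bot = ComparableChatbot("You are an auto mechanic, highly experienced in diagnosing car trouble.", tag)
    print(tag, "->", bot.talk("My car squeals when I brake. What should I check?"))

Each model in the loop gets a fresh chatbot and a fresh history, so differences in the replies reflect the models rather than accumulated context.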