Project 9: Language Models
Overview
Language models have transformed the field of Artificial Intelligence. In this lab,
we will first explore a very simple language model for classifying the language in which
text is written. We will then employ a transformer-based language model as a component
of a larger application. Going further, we will compare and contrast the use of
different language models for the same application.
Level 1: Modeling Language with Markov Chains
The csci335 repository contains a
learning.markov package that we will use in this project.
It contains the following files; files you modify are marked with an asterisk (*):
- LanguageGuesser: A GUI that allows the user to:
  - Train a Markov chain using a reference text. For your convenience,
    reference texts in English, Spanish, French, and German are provided in the
    books directory.
  - Type a sentence and see its classification, along with the probability
    distribution underlying that classification.
- MarkovChain*: Implements a collection of Markov chains, one for each designated label
  (a conceptual sketch of the idea appears after this file list):
  - count(): Increases the count for the transition from prev to next.
  - probability(): Returns the probability of the chain for label generating sequence.
  - labelDistribution(): Returns a map whose keys are labels and whose values are the probabilities that a sequence corresponds to each label.
  - bestMatchingChain(): Calls labelDistribution() and returns the label
    with the highest probability.
- MarkovLanguage: Extends MarkovChain with utility methods to assist
  with language classification.
- SimpleMarkovTest: Unit tests featuring some simple examples.
- MajorMarkovTest: Unit tests trained using the four provided books.
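To build intuition for what these classes do, here is a minimal, purely illustrative Python sketch of the idea: one table of transition counts per label, a per-label probability for a sequence, a normalized distribution over labels, and an argmax. It assumes character-level transitions and crude add-one smoothing; the Java implementation in learning.markov may differ in these details, so treat this only as a conceptual guide.

from collections import defaultdict

class TinyMarkov:
    """Illustrative only: character-level Markov chains, one per label."""

    def __init__(self):
        # counts[label][prev][next] = number of observed prev -> next transitions
        self.counts = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))

    def count(self, label: str, prev: str, nxt: str):
        self.counts[label][prev][nxt] += 1

    def train(self, label: str, text: str):
        for prev, nxt in zip(text, text[1:]):
            self.count(label, prev, nxt)

    def probability(self, label: str, sequence: str) -> float:
        # Product of smoothed transition probabilities under this label's chain.
        p = 1.0
        for prev, nxt in zip(sequence, sequence[1:]):
            row = self.counts[label][prev]
            p *= (row[nxt] + 1) / (sum(row.values()) + 1)  # crude add-one smoothing
        return p

    def label_distribution(self, sequence: str) -> dict:
        raw = {label: self.probability(label, sequence) for label in list(self.counts)}
        total = sum(raw.values())
        return {label: p / total for label, p in raw.items()} if total > 0 else raw

    def best_matching_label(self, sequence: str) -> str:
        dist = self.label_distribution(sequence)
        return max(dist, key=dist.get)

For example, after training one chain on English text and another on Spanish text, best_matching_label("the dog") should favor the English chain, because its observed character transitions assign that sequence a higher probability.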
- Obtain four sentences in each of English, Spanish, French, and German, and test and record how well
LanguageGuesser classifies each sentence.
- You may obtain sentences by searching for them on the web, writing them
yourself, using Google Translate, etc.
- Select four other languages, and obtain four sentences in each of them.
- Each language must have a writing system that employs Latin characters.
- Run each sentence through
LanguageGuesser. Given how it was trained,
how plausible are its guesses?
Paper
When you are finished with your experiments, write a paper summarizing your findings. Include the following:
- The URL for the private GitHub repository containing your code.
- A table containing the 32 sentences you gathered. The first column should give the language, the second column the sentence, the third column the best matching language, and the fourth through seventh columns should give the probability of each of English, Spanish, French, and German.
- An analysis of the performance of your implementation based on the data recorded in the table.
- An analysis of the plausibility of the classifications of the sentences from
languages other than those on which it was trained.
- Based on the performance of Markov chains for the language classification task,
for what other kinds of tasks do you believe this approach would be useful?
Carefully explain your answer.
Level 2: Employing a Language Model in an Application
The Ollama project enables developers to download and run language models, both
interactively and from within application software. Install Ollama. Then
open a command prompt and enter the following command to download and run the gemma3:1b model:

    ollama run gemma3:1b

At the prompt, type: Explain your capabilities.
After reading over its capabilities, formulate a concept for an application that would make
use of gemma3:1b in a natural and effective way. To access Ollama from Python, install the
ollama package by entering the following command at the command prompt:

    pip install ollama

Having done that, the following Python code will give your program access to Ollama:
from ollama import chat
import time

class Chatbot:
    def __init__(self, role: str):
        # The system message establishes the chatbot's role for the whole conversation.
        self.message_history = [{'role': 'system', 'content': f"{role}"}]

    def talk(self, msg: str) -> str:
        # Record the user's message, send the full history to the model,
        # and record the model's reply so later turns keep their context.
        self.message_history.append({"role": "user", "content": msg})
        start = time.time()
        response = chat(
            model="gemma3:1b",
            messages=self.message_history
        )
        duration = time.time() - start
        print(f"{duration:.2f}s to reply")
        assistant_reply = response.message.content
        self.message_history.append({"role": "assistant", "content": assistant_reply})
        return assistant_reply

    def converse(self, prompt: str):
        # Simple read-talk-print loop; the user types `bye` to end the conversation.
        print(f"{prompt} Type `bye` to exit.")
        while True:
            line = input("You> ")
            if line == 'bye':
                return
            print(self.talk(line))

if __name__ == '__main__':
    bot1 = Chatbot("You are an auto mechanic, highly experienced in diagnosing car trouble.")
    bot1.converse("Enter a question about car trouble.")
    bot2 = Chatbot("You are a guide to helping lost people get out of the Amazon rain forest safely.")
    bot2.converse("You just landed in the Amazon jungle and you need to find your way to civilization.")
When creating a Chatbot instance, it is important to give it a particular role
that establishes its context. Examine the bot1 and bot2 object definitions above
for examples of such roles.
Feel free to use the above code as inspiration for your own design,
and to modify it as you see fit.
Your own application must meet the following requirements:
- The chatbots must fit naturally into the context of the application.
- The application should have at least three distinct chatbot personalities,
each with its own role and conversation history (one possible structure is sketched after this list).
- The application must use the
gemma3:1b model from Ollama.
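As one possible structure (purely illustrative, not a requirement), an application can keep several Chatbot instances, each constructed with its own role and therefore its own message history, and route each user turn to the appropriate one. The personality names and roles below are made-up placeholders, and the sketch assumes the Chatbot class defined above:

# Hypothetical example: three personalities inside one application.
# Assumes the Chatbot class shown earlier; names and roles are placeholders.
personalities = {
    "planner": Chatbot("You are a meticulous travel planner who builds day-by-day itineraries."),
    "foodie": Chatbot("You are a local food critic who recommends restaurants and dishes."),
    "budgeter": Chatbot("You are a frugal accountant who estimates and trims trip costs."),
}

while True:
    choice = input("Talk to planner, foodie, or budgeter (or type bye)> ")
    if choice == "bye":
        break
    if choice in personalities:
        question = input("You> ")
        print(personalities[choice].talk(question))

Because each Chatbot keeps its own message_history, each personality remembers only its own conversation, which is what the "own role and history" requirement above asks for.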
Then add the following to your evaluation document:
- A description of your application, and why including chatbots makes
sense for it.
- A description of each chatbot personality and the prompts you provided
to induce each role.
- A discussion of the degree to which each chatbot acts in the manner
you intended. Discuss specifically how you modified each chatbot's prompt
in order to induce the behavior you desired.
- A discussion of the overall success of the application.
Level 3: Comparing Language Models
For the application you wrote for Level 2, try out three additional language models from
Ollama's library. They
could be larger or smaller variants of Gemma 3, or they could come from entirely different
model families. How do they compare to gemma3:1b? Discuss the advantages and disadvantages
of each of the models you examine, and in your evaluation document draw conclusions about
which of the four language models is best suited for your task.
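One convenient way to run this comparison (a sketch under assumptions, not a required design) is to make the model tag a parameter instead of the hard-coded "gemma3:1b". The subclass below and the model tags in the loop are hypothetical examples; check Ollama's library for the exact names and sizes you want, and pull each model before using it.

# Hypothetical variation of the Chatbot class that takes the model tag as a parameter.
# Assumes the Chatbot class and the `chat` import shown earlier; timing is omitted for brevity.
class ComparableChatbot(Chatbot):
    def __init__(self, role: str, model: str):
        super().__init__(role)
        self.model = model

    def talk(self, msg: str) -> str:
        self.message_history.append({"role": "user", "content": msg})
        response = chat(model=self.model, messages=self.message_history)
        reply = response.message.content
        self.message_history.append({"role": "assistant", "content": reply})
        return reply

# Example model tags only; verify exact names in Ollama's library and
# run `ollama pull <tag>` for each before comparing.
for tag in ["gemma3:1b", "gemma3:4b", "llama3.2:1b", "qwen2.5:1.5b"]:
    bot = ComparableChatbot("You are an auto mechanic, highly experienced in diagnosing car trouble.", tag)
    print(tag, "->", bot.talk("My car squeals when I brake. What should I check?"))

Each model in the loop gets a fresh chatbot and a fresh history, so differences in the replies reflect the models rather than accumulated context.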