Project 10/11A: Compiler I: Tokenizer

  • Description: Complete Stage I (tokenizer) of Project 10 on nand2tetris.org.

  • What to turn in: Turn in all files necessary for running your tokenizer. Your tokenizer can be written in any language you wish.

  • Hints

    • Test your tokenizer on the .jack files in the ArrayTest, ExpressionLessSquare, and Square folders.

    • For each Xxx.jack source file, have your tokenizer write its output to a file named XxxT.xml (note the T!). Run your tokenizer on each class file in the test programs, then use the supplied TextComparer utility to compare each generated file with the corresponding supplied .xml compare file. On Linux or macOS, you can also use the built-in diff utility for this comparison.

    • Since the output files generated by your tokenizer test will have the same names and extensions as those of the supplied comparison files, we suggest putting them in separate directories.
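If you would rather script the comparison than run TextComparer or diff by hand, a minimal Python sketch along the following lines may help. The function names `first_difference` and `compare_files` are hypothetical and not part of the supplied tools. The sketch assumes both files are plain text and treats line-ending differences as insignificant, which may be more lenient than the supplied utility.

```python
from pathlib import Path
from typing import Optional, Sequence, Tuple

# (1-based line number, generated line or None, expected line or None)
Mismatch = Tuple[int, Optional[str], Optional[str]]


def first_difference(generated: Sequence[str],
                     expected: Sequence[str]) -> Optional[Mismatch]:
    """Return the first mismatching line, or None if the sequences match."""
    for number, (got, want) in enumerate(zip(generated, expected), start=1):
        if got != want:
            return number, got, want
    # All shared lines match; one file may still be longer than the other.
    if len(generated) != len(expected):
        number = min(len(generated), len(expected)) + 1
        got = generated[number - 1] if number <= len(generated) else None
        want = expected[number - 1] if number <= len(expected) else None
        return number, got, want
    return None


def compare_files(generated_path: Path, expected_path: Path) -> Optional[Mismatch]:
    """Compare two text files line by line (hypothetical helper)."""
    # splitlines() drops line terminators, so \n and \r\n compare as equal.
    generated = Path(generated_path).read_text().splitlines()
    expected = Path(expected_path).read_text().splitlines()
    return first_difference(generated, expected)
```

For example, with your generated files kept in a separate directory (here a hypothetical `out/`), `compare_files(Path("out/MainT.xml"), Path("Square/MainT.xml"))` returns `None` when the files match and the first differing line otherwise.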

  • Specification: To get credit for this project, you must complete a working tokenizer that accepts the name of a .jack file as a command-line argument and writes a corresponding XML file whose name is the input's base name followed by a T and an .xml extension. (For example, foo.jack should become fooT.xml.) For each of the following input .jack files, your tokenizer must produce an .xml file that is identical to the provided comparison .xml file:

    • ArrayTest/Main.jack
    • ExpressionLessSquare/Main.jack
    • ExpressionLessSquare/SquareGame.jack
    • ExpressionLessSquare/Square.jack
    • Square/Main.jack
    • Square/SquareGame.jack
    • Square/Square.jack
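
The naming rule in the specification can be sketched as follows. This is only an illustration in Python (any language is acceptable), and it covers the command-line handling and output naming, not the tokenizing itself. The names `output_path_for` and `main`, and the optional `out_dir` parameter, are assumptions made for this sketch and are not required by the project.

```python
from pathlib import Path
from typing import List, Optional


def output_path_for(jack_file: str, out_dir: Optional[str] = None) -> Path:
    """Map Xxx.jack to XxxT.xml, optionally inside a separate directory."""
    source = Path(jack_file)
    if source.suffix != ".jack":
        raise ValueError(f"expected a .jack file, got: {jack_file}")
    name = source.stem + "T.xml"  # foo.jack -> fooT.xml
    if out_dir is not None:
        # Hypothetical option: keep generated files away from the
        # supplied comparison files, which use the same names.
        return Path(out_dir) / name
    return source.with_name(name)


def main(argv: List[str]) -> int:
    """Entry-point sketch: argv[1] is the .jack file to tokenize."""
    if len(argv) != 2:
        print("usage: tokenizer.py <file.jack>")
        return 1
    target = output_path_for(argv[1])
    # A real tokenizer would write the token XML to `target` here.
    print(f"tokens for {argv[1]} would be written to {target}")
    return 0
```

A real program would call `main(sys.argv)` from its entry point. Without `out_dir`, the output lands next to the source file, so run the tokenizer on a copy of the test folders if the supplied comparison files live there.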