CS241 Lecture Notes - Lecture 5: Binary File, Lexeme, Lexical Analysis

May 15, 2018
Assembly language ---cs241.binasm----> machine language
(this is what an assembler does; cs241.binasm is an assembler)
Assignment 3 is writing your own cs241.binasm
What are the different things you need to consider?
The input is a text file with lines that look like this:
add $3, $1, $2
jr $31
The output is a binary file that looks like this:
101101....
0101.....
How do we start to write our own assembler?
1. Read the input string and make sense out of it (is it valid, is it an add, etc.)
2. Translate it into binary
These steps have actual names...
Stages
1. Analysis: making sense of the input string
- understanding the meaning/intent
2. Synthesis: output equivalent machine instruction in binary
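To make the synthesis stage concrete, here is a minimal sketch of how the two example instructions could be turned into 32-bit words. The helper names (encodeAdd, encodeJr, toBytes) are hypothetical and not part of the assignment code. The bit layouts are the standard MIPS encodings for add and jr. Writing the most significant byte first is an assumption about the expected output order.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: encode "add $d, $s, $t".
// Standard MIPS layout: 000000 sssss ttttt ddddd 00000 100000
std::uint32_t encodeAdd(std::uint32_t d, std::uint32_t s, std::uint32_t t) {
    assert(d < 32 && s < 32 && t < 32);  // MIPS registers are $0..$31
    return (s << 21) | (t << 16) | (d << 11) | 0x20u;
}

// Hypothetical helper: encode "jr $s".
// Standard MIPS layout: 000000 sssss 000000000000000 001000
std::uint32_t encodeJr(std::uint32_t s) {
    assert(s < 32);
    return (s << 21) | 0x08u;
}

// Split a 32-bit word into 4 bytes, most significant byte first.
// Assumption: this is the order in which the bytes go into the binary file.
std::string toBytes(std::uint32_t word) {
    std::string out;
    for (int shift = 24; shift >= 0; shift -= 8) {
        out.push_back(static_cast<char>((word >> shift) & 0xFFu));
    }
    return out;
}
```

With these helpers, encodeAdd(3, 1, 2) gives 0x00221820 for add $3, $1, $2, and encodeJr(31) gives 0x03E00008 for jr $31.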
Analysis
Can have multiple steps:
- First thing is to tokenize input: break down the input string into a sequence of tokens
ex.
add $1, $2, $3
Can break this into ID token (add), REG token ($1), COMMA token, REG token ($2), COMMA
token, REG token ($3)
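As an illustration of the breakdown above, here is a rough sketch of a tokenizer that handles only the three token kinds from the example. SimpleToken and tokenizeLine are made-up names for this sketch. This is not the scanner provided with the assignment, which may recognize more kinds and report errors differently.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Simplified stand-in for a token (public fields for brevity).
struct SimpleToken {
    std::string kind;    // "ID", "REG", or "COMMA"
    std::string lexeme;  // the input characters making up the token
};

// Break one line of assembly into ID, REG, and COMMA tokens.
std::vector<SimpleToken> tokenizeLine(const std::string& line) {
    std::vector<SimpleToken> tokens;
    std::size_t i = 0;
    while (i < line.size()) {
        unsigned char c = static_cast<unsigned char>(line[i]);
        if (std::isspace(c)) {
            ++i;  // whitespace only separates tokens
        } else if (c == ',') {
            tokens.push_back({"COMMA", ","});
            ++i;
        } else if (c == '$') {
            std::size_t start = i++;  // remember where '$' is, then skip it
            while (i < line.size() &&
                   std::isdigit(static_cast<unsigned char>(line[i]))) {
                ++i;
            }
            if (i == start + 1) {
                throw std::runtime_error("expected digits after '$'");
            }
            tokens.push_back({"REG", line.substr(start, i - start)});
        } else if (std::isalpha(c)) {
            std::size_t start = i;
            while (i < line.size() &&
                   std::isalnum(static_cast<unsigned char>(line[i]))) {
                ++i;
            }
            assert(i > start);  // the isalpha check guarantees one character
            tokens.push_back({"ID", line.substr(start, i - start)});
        } else {
            throw std::runtime_error(
                std::string("unexpected character: ") + line[i]);
        }
    }
    return tokens;
}
```

Calling tokenizeLine("add $1, $2, $3") yields six tokens in the order ID, REG, COMMA, REG, COMMA, REG.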
Tokenization is already done for you in A3
class Token {
std::string kind; // the type of token (ID, REG, COMMA)
std::string lexeme; // the actual input characters making up this token ($1, $2)
public:
...
};
This class is given in A3
The scanner produces a vector<vector<Token>>
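One possible way to use a vector<vector<Token>> in the analysis stage is to check each inner vector against the token pattern of a known instruction. The sketch below assumes one inner vector per input line and that blank lines produce empty vectors. SimpleToken is a simplified stand-in with public fields, since the real Token interface is elided in these notes. All function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for the Token class (public fields for brevity).
struct SimpleToken {
    std::string kind;
    std::string lexeme;
};

// True if the token kinds of one line are exactly the given sequence.
bool hasKinds(const std::vector<SimpleToken>& line,
              const std::vector<std::string>& kinds) {
    if (line.size() != kinds.size()) return false;
    for (std::size_t i = 0; i < line.size(); ++i) {
        if (line[i].kind != kinds[i]) return false;
    }
    return true;
}

// Matches the shape "add REG, REG, REG".
bool isAdd(const std::vector<SimpleToken>& line) {
    return hasKinds(line, {"ID", "REG", "COMMA", "REG", "COMMA", "REG"}) &&
           line[0].lexeme == "add";
}

// Matches the shape "jr REG".
bool isJr(const std::vector<SimpleToken>& line) {
    return hasKinds(line, {"ID", "REG"}) && line[0].lexeme == "jr";
}

// Convert a REG token such as "$31" into the number 31.
// Assumption: a REG lexeme is '$' followed by decimal digits.
int registerNumber(const SimpleToken& tok) {
    assert(tok.kind == "REG");
    return std::stoi(tok.lexeme.substr(1));
}

// Return the 1-based numbers of lines that are neither add nor jr.
// Empty lines are skipped (assumed to come from blank input lines).
std::vector<std::size_t> findInvalidLines(
        const std::vector<std::vector<SimpleToken>>& program) {
    std::vector<std::size_t> bad;
    for (std::size_t i = 0; i < program.size(); ++i) {
        const std::vector<SimpleToken>& line = program[i];
        if (line.empty()) continue;
        if (!isAdd(line) && !isJr(line)) bad.push_back(i + 1);
    }
    return bad;
}
```

Only add and jr are recognized here because they are the two instructions shown in the notes. A full assembler would need one such check per supported instruction.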