
MKang25

macrumors regular
Original poster
Jun 12, 2010
115
13
I have to build a simple assembler that reads in an assembly instruction and converts it to its machine code equivalent. I need some help getting started. I know how to find the machine code equivalents by looking up opcodes, register codes, etc. I am just stuck on how to actually do the conversion in the program. In MIPS you can only alter bytes, but what I need to do is combine or alter bits.
 

chown33

Moderator
Staff member
Aug 9, 2009
10,732
8,408
A sea of green
... In Mips you can only alter bytes, but what I need to do is combine or alter bits.

Everything made sense until that last sentence. The MIPS instruction-set has opcodes that load & store bytes, half-words, words, and double-words. I'm not sure how this is relevant, though, unless you're writing a MIPS assembler in MIPS assembler.

In any case, you combine and alter bits by ANDing and ORing them into place in longer operands (bytes, words, etc.). You may also have to apply some shifting, depending on where the bits are located in the longer operand. This is extremely common in high-level languages, and also common in assembly languages that have no direct bit-manipulation op-codes.
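
As a rough illustration of the AND/OR/shift idea, here is a minimal Python sketch. The helper names `set_field` and `get_field` are made up for this example; the same three steps (build a mask, clear the field with AND, merge the new bits with OR after shifting) carry over to MIPS `and`, `or`, `sll` and `srl`.

```python
def set_field(word, value, shift, width):
    """Return the 32-bit word with the width-bit field at bit `shift` replaced by value."""
    mask = (1 << width) - 1                    # e.g. width 5 -> 0b11111
    word &= ~(mask << shift) & 0xFFFFFFFF      # AND clears the old field
    word |= (value & mask) << shift            # shift the new bits into place, OR them in
    return word


def get_field(word, shift, width):
    """Extract the width-bit field that starts at bit `shift`."""
    return (word >> shift) & ((1 << width) - 1)
```

For example, `set_field(0, 9, 21, 5)` places the value 9 in bits 21 to 25 of an otherwise empty word, and `get_field` reads it back out.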
 

lee1210

macrumors 68040
Jan 10, 2005
3,182
3
Dallas, TX
What language are you writing your assembler in? Like chown33, I was confused by your last sentence, and you didn't say.

Ultimately in MIPS-32 you'll have (if I remember correctly) a 6-bit opcode then a series of operands that are either constants or 5-bit register numbers. (I looked and in one format there's also a 6-bit function code.) My approach would be to have constants for:
Each opcode 0-63
Each register 0-31 (named R#)
The number of bits to shift an opcode left (26)
The number of bits to shift each operand

Then it gets pretty boring. Maybe a switch or other lookup that maps the ASCII for each instruction to your constant opcode value, the ASCII for R0-R31 to your constant with the same value, etc. Each instruction has a fixed format (I thought there were 4-5 instruction formats... I looked, only 3, lucky you) so you could write a subroutine for each instruction format that shifts its arguments into the proper positions in the resulting 32-bit instruction.
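
A minimal sketch of the one-subroutine-per-format idea, in Python for readability. The function names are invented for this example; the field positions follow the standard MIPS32 R, I and J layouts, where the 6-bit opcode sits in the top bits and is therefore shifted left by 26.

```python
def encode_r(opcode, rs, rt, rd, shamt, funct):
    """R-type: opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return ((opcode & 0x3F) << 26 | (rs & 0x1F) << 21 | (rt & 0x1F) << 16 |
            (rd & 0x1F) << 11 | (shamt & 0x1F) << 6 | (funct & 0x3F))


def encode_i(opcode, rs, rt, imm):
    """I-type: opcode(6) rs(5) rt(5) immediate(16). Negative immediates wrap to 16 bits."""
    return ((opcode & 0x3F) << 26 | (rs & 0x1F) << 21 |
            (rt & 0x1F) << 16 | (imm & 0xFFFF))


def encode_j(opcode, target):
    """J-type: opcode(6) target(26). The target is the word address (byte address >> 2)."""
    return (opcode & 0x3F) << 26 | (target & 0x03FFFFFF)
```

With $t0 = 8, $t1 = 9 and $t2 = 10, `add $t0, $t1, $t2` is `encode_r(0, 9, 10, 8, 0, 0x20)`, which gives 0x012A4020. Note that the destination register comes first in the assembly text but lands in the rd field, after rs and rt.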

-Lee
 

MKang25

macrumors regular
Original poster
Jun 12, 2010
115
13
Sorry for not being clear. I am writing a simple MIPS assembler in MIPS assembly. I figured out how to modify the bits. My question now is about lookup: I have to read in a MIPS instruction and convert it to machine code, so I would read in something like "add $t0, $t1, $t2". I would store that string in the data section, along with a list of all the registers and opcodes I am using and their respective hex values. How would I go about looking up the hex values for the registers and opcodes? For the instruction above, when I parse the string and get to a register, how would I "search" the data section for the memory location of $t0 so that I could load its hex value into a register?
 

subsonix

macrumors 68040
Feb 2, 2008
3,551
79
Create a hash table for the mnemonics, where each one resolves to an index position in an array. At this array position you store the opcode. If there are several different ones, depending on addressing mode and so on, add an array of opcodes at that index position.

The values to the right of the mnemonic I would parse using a switch/case state machine. So each row would be read from right to left.
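
A rough Python sketch of the mnemonic table, using a dict as the hash table. The names `MNEMONICS`, `OPCODES`, `lookup` and `split_line` are assumptions for illustration, and only four instructions are listed. For brevity the operands are split on commas from left to right instead of being walked with the switch/case state machine described above.

```python
# Hash table: mnemonic -> index into the OPCODES array (partial, for illustration).
MNEMONICS = {"add": 0, "sub": 1, "addi": 2, "lw": 3}

# Entry at each index: (opcode, funct). funct is None for non-R-type instructions.
OPCODES = [
    (0x00, 0x20),   # add
    (0x00, 0x22),   # sub
    (0x08, None),   # addi
    (0x23, None),   # lw
]


def lookup(mnemonic):
    """Return (opcode, funct) for a known mnemonic, or None if it is not in the table."""
    index = MNEMONICS.get(mnemonic)
    return None if index is None else OPCODES[index]


def split_line(line):
    """Split one source line into its mnemonic and a list of operand strings."""
    mnemonic, _, rest = line.strip().partition(" ")
    operands = [tok.strip() for tok in rest.split(",") if tok.strip()]
    return mnemonic, operands
```

For the line " add $t0, $t1, $t2 ", `split_line` returns the mnemonic "add" and the three register names, and `lookup("add")` returns the opcode and function code to encode.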

Are you going to support labels and constants as well?
 

MKang25

macrumors regular
Original poster
Jun 12, 2010
115
13
I haven't learned how to use hash tables. I also have to create a label/address table for the little segment of code that I input.
 

subsonix

macrumors 68040
Feb 2, 2008
3,551
79
I haven't learned how to use hash tables. I also have to create a label/address table for the little segment of code that I input.

OK, but you could simply leave out the hash function then, and search an array of mnemonics linearly for a match. The array could, for example, contain structs with the mnemonic and the opcode, or something like that.
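
Something like the following, sketched in Python, shows the linear search. `REGISTERS` and `find_value` are hypothetical names, and the table is deliberately partial. In MIPS the same thing would be a table of name/value records in the data section, a loop that compares the parsed token against each name, and a load of the value next to the first name that matches.

```python
# Each record pairs a name with its numeric code (partial table, for illustration).
REGISTERS = [
    ("$zero", 0), ("$at", 1), ("$v0", 2), ("$v1", 3),
    ("$a0", 4), ("$a1", 5), ("$a2", 6), ("$a3", 7),
    ("$t0", 8), ("$t1", 9), ("$t2", 10), ("$t3", 11),
]


def find_value(table, name):
    """Walk the table from the start and return the value stored with the first matching name."""
    for entry_name, value in table:
        if entry_name == name:      # in MIPS this is a byte-by-byte string compare
            return value
    return None                     # reached the end of the table without a match
```

So `find_value(REGISTERS, "$t0")` returns 8, the register number to shift into the instruction word. The same function works unchanged on a table of mnemonics and opcodes.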
 

chown33

Moderator
Staff member
Aug 9, 2009
10,732
8,408
A sea of green
It sounds like you need to do some studying and some research. Asking for answers here is neither.

If you're expected to write an assembler in assembler, then you've presumably been given enough information in class to do this, or pointers to where you can study the information. If not, then you should approach your in-class information provider, i.e. your teacher or teaching aide, and describe what information you lack.

There are any number of representations for mnemonic tables and such. Even a brute force search of an array of pointers to char-strings would work, and you wouldn't need to know what a hashtable is.

If you're in a class, you're expected to come up with an answer yourself, even a sub-optimal one. Having multiple people supply you with answers isn't the same thing at all.
 