-
Notifications
You must be signed in to change notification settings - Fork 3
/
assembly.html
853 lines (743 loc) · 26.8 KB
/
assembly.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
<!DOCTYPE html>
<html lang="en">
<head>
<link rel="stylesheet" href="https://unpkg.com/[email protected]">
<link rel="stylesheet" href="https://unpkg.com/[email protected]/prism">
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>The Little Man Stack Machine</title>
</head>
<body>
<main>
<h1 class="massivetext">🙋 The Little Man Stack Machine</h1>
<nav>
<p class="tool-bar">
<a href="index.html">Intro</a>
<a href="assembly.html">Assembly & Execution</a>
<a href="functions.html">Functions</a>
<a href="firth.html">Firth</a>
<a href="emulator.html">Emulator</a>
<a href="https://github.com/bigskysoftware/littlemanstackmachine.org">Github</a>
</p>
</nav>
<h1>The LMSM Instruction Cycle</h1>
<p>
As a <a href="https://en.wikipedia.org/wiki/Von_Neumann_architecture">Von Neumann</a> machine, the
LMSM follows a simple <a href="https://en.wikipedia.org/wiki/Instruction_cycle">Instruction Cycle</a>,
identical to the instruction cycle of the LMC:
</p>
<ol>
<li>
The value in the memory location pointed to by the <code>program_counter</code> register is retrieved
from memory and placed in the <code>current_instruction</code> register.
</li>
<li>
The program counter is incremented by one.
</li>
<li>
The instruction is executed.
</li>
<li>
The cycle is repeated.
</li>
</ol>
<p>
This simple loop is the basis for computation on the LMSM and, in a broad sense, of many of the computers
we use today.
</p>
<p>
Instructions, when executed, update the state of the computer in some manner: moving a value between memory
locations and registers, updating registers, outputing data, etc.
</p>
<h2>Example Execution</h2>
<p>
In order to understand how execution works, let's step through a cycle of execution in detail. Consider an
LMSM whose <code>program_counter</code> is set to 0 and who has the following numeric values
in the first three slots in memory:
</p>
<img src="img/execution1.png">
<h3>Cycle 1</h3>
<p>
Execution would begin by retrieving instruction from memory slot 0, which is where the
<code>program_counter</code>
initially points to and placing it in the <code>current_instruction</code> register:
</p>
<img src="img/execution2.png">
<p>
Next, the <code>program_counter</code> value is incremented from 0 to 1, so that it points to the next
instruction
in memory:
</p>
<img src="img/execution3.png">
<p>
Next, the instruction is executed. In this case, the instruction is 401, which tells the
LMSM to place the value 1 into the <code>accumulator</code> register.
(We will go over exactly what each instruction means in the next section.)
</p>
<img src="img/execution4.png">
<p>
This completes a single instruction cycle for the LMSM. The LMSM then starts again from the beginning of
the cycle and repeats the same logic.
</p>
<h3>Cycle 2</h3>
<p>
This time the <code>program_counter</code> is pointing to slot 1, so the instruction 902 is loaded into the
<code>current_instruction</code> register and the <code>program_counter</code> is incremented to the value 2.
The 902 instruction tells the LMSM to print the current value of the <code>accumulator</code> register to
output.
</p>
<p>
Once again, the cycle repeats.
</p>
<h3>Cycle 3</h3>
<p>
Now the <code>program_counter</code> is pointing to slot 2, so the instruction 000 is loaded into the
<code>current_instruction</code> register and the <code>program_counter</code> is incremented to the value 3.
</p>
<p>
The 000 instruction tells the LMSM to halt, and so it stops executing.
</p>
<h3>A Completed Program!</h3>
<p>
The program above prints the number 1 to output. It might not look like much, but this is the "Hello World" of
LMSM, and it demonstrates the basic operation mechanics of the system.
</p>
<h1>LMSM Instructions</h1>
<p>
In the program above we executed LMSM machine instructions to achieve something. In this section we will
go over all the instructions available and what they do.
</p>
<p>
The first thing to bear in mind is that LMSM instructions are stored in LMSM memory as "regular" data. There is
no distinction between "instructions" and "data". Just like any other piece of data in the LMSM, instructions
are numeric values between -999 and 999.
</p>
<p>
Let's look at the available instructions.
</p>
<h2>LMC Instructions</h2>
<p>
The LMSM supports the following instructions, which are all taken, unchanged, from the LCM:
</p>
<table>
<thead>
<tr>
<th>Machine Code</th>
<th>Assembly Instruction</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>
1XX
</td>
<td>
ADD <XX>
</td>
<td>
Adds the value located in memory at position XX to the value of the accumulator.
</td>
</tr>
<tr>
<td>
2XX
</td>
<td>
SUB <XX>
</td>
<td>
Subtracts the value located in memory at position XX to the value of the accumulator.
</td>
</tr>
<tr>
<td>
3XX
</td>
<td>
STA <XX>
</td>
<td>
Stores the value of the accumulator to the memory location at position XX
</td>
</tr>
<tr>
<td>
5XX
</td>
<td>
LDA <XX>
</td>
<td>
Loads the value of the memory location at position XX into the accumulator.
</td>
</tr>
<tr>
<td>
6XX
</td>
<td>
BRA <XX>
</td>
<td>
Unconditionally sets the program counter to the given value
</td>
</tr>
<tr>
<td>
7XX
</td>
<td>
BRZ <XX>
</td>
<td>
Sets the program counter to the given value if and only if the value of the accumulator is 0
</td>
</tr>
<tr>
<td>
8XX
</td>
<td>
BRP <XX>
</td>
<td>
Sets the program counter to the given value if and only if the value of the accumulator is 0 or
a number greater than 0
</td>
</tr>
<tr>
<td>
901
</td>
<td>
INP
</td>
<td>
Gets a numeric value from the user and stores it in the accumulator
</td>
</tr>
<tr>
<td>
902
</td>
<td>
OUT
</td>
<td>
Prints the current value of the accumulator to output
</td>
</tr>
<tr>
<td>
000
</td>
<td>
HLT
</td>
<td>
Halts the system, ending the execution loop
</td>
</tr>
<tr>
<td>
</td>
<td>
DAT <XXX>
</td>
<td>
An assembler-only instruction that allows a program to insert a raw value into a given memory location
</td>
</tr>
</tbody>
</table>
<p>
Because the LMSM supports all these instructions, a LMC program will execute on the LMSM. The LMSM is
<em>backwards compatible</em> with the LMC, which is a very common situation in CPU evolution: as CPUs
developed, old code often continued to work on the new CPUs.
</p>
<h2>The "Load Immediate" Instruction</h2>
<p>
In addition to these standard LMC instructions, the LMSM also supports an "immediate" instruction to load
a value directly into the accumulator. We saw this instruction in our simple "Hello World" program:
</p>
<table>
<thead>
<tr>
<th>Machine Code</th>
<th>Assembly Instruction</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>
4XX
</td>
<td>
LDI <XX>
</td>
<td>
Loads the numeric value XX directly into the accumulator.
</td>
</tr>
</tbody>
</table>
<h2>Stack Instructions</h2>
<p>
The most significant addition to the LMC model in the LMSM is the addition of several instructions for
working with the new stacks available in upper memory:
</p>
<table>
<thead>
<tr>
<th>Machine Code</th>
<th>Assembly Instruction</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>
920
</td>
<td>
SPUSH
</td>
<td>
"Pushes" the value of the <code>accumulator</code> onto the top of the value stack, by
<em>decrementing</em> the <code>stack_pointer</code> register by 1 and then copying the
value from the <code>accumulator</code> to the memory location that the <code>stack_pointer</code>
points to.
</td>
</tr>
<tr>
<td>
921
</td>
<td>
SPOP
</td>
<td>
"Pops" the value off top of the value stack into the <code>accumulator</code> by copying
the value in the memory location that the <code>stack_pointer</code> points to the
<code>accumulator</code>
and then <em>incrementing</em> the <code>stack_pointer</code>.
</td>
</tr>
<tr>
<td>
922
</td>
<td>
SDUP
</td>
<td>
"Duplicates" the top of the value stack by decrementing the value of the <code>stack_pointer</code> and
then copying the value directly above the memory location that the <code>stack_pointer</code> points
to to the memory location that the <code>stack_pointer</code> points to.
</td>
</tr>
<tr>
<td>
923
</td>
<td>
SDROP
</td>
<td>
"Drops" the top of the value stack by incrementing the value of the <code>stack_pointer</code>.
</td>
</tr>
<tr>
<td>
924
</td>
<td>
SSWAP
</td>
<td>
"Swaps" the top of the value stack by swapping the value in the memory location directly above the
memory location that the <code>stack_pointer</code> points to with the value in the memory location
that the <code>stack_pointer</code> points to.
</td>
</tr>
<tr>
<td>
930
</td>
<td>
SADD
</td>
<td>
Removes the top two values on the stack, adds them together and places the result onto the stack.
</td>
</tr>
<tr>
<td>
931
</td>
<td>
SSUB
</td>
<td>
Removes the top two values on the stack, subtracts them and places the result onto the stack.
</td>
</tr>
<tr>
<td>
932
</td>
<td>
SMUL
</td>
<td>
Removes the top two values on the stack, multiplies them and places the result onto the stack.
</td>
</tr>
<tr>
<td>
933
</td>
<td>
SDIV
</td>
<td>
Removes the top two values on the stack, divides them and places the result onto the stack.
</td>
</tr>
<tr>
<td>
934
</td>
<td>
SMAX
</td>
<td>
Removes the top two values on the stack, and places the maximum of the two values back on the stack.
</td>
</tr>
<tr>
<td>
935
</td>
<td>
SMIN
</td>
<td>
Removes the top two values on the stack, and places the minimum of the two values back on the stack.
</td>
</tr>
<tr>
<td>
910
</td>
<td>
JAL
</td>
<td>
The "Jump And Link" instruction updates the program counter to the value that the stack pointer
currently points to and then increases the value of the return address register by one and saves the
address of the next instruction <em>after</em> the "Jump And Link".
</td>
</tr>
<tr>
<td>
911
</td>
<td>
RET
</td>
<td>
The "Return" instruction updates the program counter to the value that the return address pointer
currently points to and then decreases the value of the return address register by one.
</td>
</tr>
</tbody>
</table>
<h2>Assembly Instruction Examples</h2>
<p>
Let's look at the mechanics for a few instructions on the LMSM to get a feel for how things work.
</p>
<h3>The <code>ADD</code> Instruction</h3>
<p>
The <code>ADD</code> assembly instruction takes a two digit argument and tells the LMSM to add the value
at that memory location to the accumulator.
</p>
<p>
Consider the assembly <code>ADD 10</code>. This would assemble down to the machine instruction <code>110</code>.
</p>
<p>
If this instruction were loaded into memory and ready to execute, the LSMS would look like this:
</p>
<img src="img/add1.png">
<p>
Note that the program counter is pointing to the instruction after the <code>ADD</code> machine instruction,
and the <code>ADD</code> machine instruction has been loaded into current instruction register. The LMSM
is ready to execute the instruction and continue.
</p>
<p>
Note also that the value stored in memory location <code>10</code> is the value <code>20</code>, and that
the accumulator currently has the value <code>22</code>.
</p>
<img src="img/add2.png">
<p>
Once the instruction has executed the value <code>20</code> has been added to the accumulator, giving the
value <code>42</code>.
</p>
<p>
The LMSM then moves on to the next instruction.
</p>
<h3>The <code>BRZ</code> Instruction</h3>
<p>
The <code>BRZ</code> assembly instruction takes a two digit argument and tells the LMSM to "jump" to that
memory location if the accumulator is zero. This is an example of assembly control flow, allowing the
LMSM to conditionally execute code depending on the state of data.
</p>
<p>
Consider the assembly <code>BRZ 10</code>. This would assemble down to the machine instruction <code>710</code>.
</p>
<p>
If this instruction were loaded into memory and ready to execute, the LSMS would look like this:
</p>
<img src="img/brz1.png">
<p>
Note that the program counter is pointing to the instruction after the <code>BRZ</code> machine instruction,
and the <code>BRZ</code> machine instruction has been loaded into current instruction register.
</p>
<p>
The accumulator currently has the value <code>0</code>, so the condition to make the jump is true.
</p>
<img src="img/brz2.png">
<p>
Once the instruction has executed, because the accumulator had the value 0, the program counter has been updated
to point to the instruction at memory position <code>10</code>, which has the value <code>000</code>.
</p>
<p>
The value <code>000</code> tells the LMSM to halt, so this <code>BRZ</code> instruction is effectively telling
the LSMS: "Stop executing if the accumulator is zero."
</p>
<h3>The <code>SPUSH</code> Instruction</h3>
<p>
The next instruction we are going to look at is the <code>SPUSH</code> instruction, which takes the current
value of the accumulator and "pushes" it onto the top of the value stack.
</p>
<p>
Note that the stack pointer starts with the value <code>200</code>, indicating that there are no values
on the stack, and that the accumulator starts with the value 22. This latter value will be the value
"pushed" onto the stack.
</p>
<img src="img/spush1.png">
<p>
The first step in the push operation is to decrement the stack pointer register by <code>1</code> to
<code>199</code>. This will be the memory location that the LMSM stores the value into.
</p>
<img src="img/spush2.png">
<p>
The second step in the push operation is to move the value of the accumulator into the memory location that
the stack pointer points to.
</p>
<img src="img/spush3.png">
<p>
At this point, the LMSM is done: the value 22 has been "pushed" onto the top of the value stack. If another
<code>SPUSH</code> were to occur, the new value would be located in the next slot down from this, effectively
"growing" the value stack downwards.
</p>
<h3>The <code>SADD</code> Instruction</h3>
<p>
The final instruction we are going to look at is the <code>SADD</code> instruction, which adds the top two
values on the value stack together and replaces them with the result.
</p>
<p>
Note that in this case the stack pointer starts with the value <code>198</code>, and that there are two values
on the stack, <code>5</code> and <code>8</code>, with <code>8</code> being on the top of the stack.
</p>
<img src="img/sadd1.png">
<p>
The first step in execution is to add the two values together, producing the value <code>13</code> and placing
it in the slot just above the current stack pointer.
</p>
<img src="img/sadd2.png">
<p>
The next step in execution is to increment the stack pointer to <code>199</code>. This step effectively drops
the top value <code>8</code> from the stack, leaving <code>13</code>, the result of the addition, at the top
of the stack.
</p>
<p>
Note also that the <code>8</code> value is not zero'd out: it is no longer relevant since the stack pointer
is "above" (or, if you like, "below" it).
</p>
<img src="img/sadd3.png">
<p>
At this point, the LMSM is done with this instruction: the top two values have been "popped" off the stack, added
and the result has been "pushed" onto it.
</p>
<h2><code>DAT</code> Assembly Instructions</h2>
<p>
The <code>DAT</code> assembly instruction tells the LMSM assembler: "Put the value following me directly into
the memory spot I correspond with.
</p>
<p>
Consider the following LMSM Assembly program:
</p>
<img src="img/dat1.png"/>
<p>
The <code>DAT</code> assembly instruction tells the assembler to place the raw value <code>1</code> into the
memory location that the instruction corresponds to. In this case, the value <code>1</code> is placed in
memory location <code>4</code>, since that is the memory slot that corresponds with the <code>DAT</code>
assembly instruction.
</p>
<p>
The <code>DAT</code> assembly instruction lets programmers save numbers in memory for use in their programs. In
this case, the program asks the user for input, adds the value 1 to it (even though the second instruction
reads <code>ADD 4</code>, remember that means "Add the value in <em>memory location</em> 4 to the accumulator")
and then outputs the result.
</p>
<h2>Labels</h2>
<p>
Labels are a feature of LMSM that let programmers use symbolic, rather than absolute, locations when writing
assembly. In the previous example, the <code>ADD</code> instruction had a hard-coded reference to memory location
<code>4</code>.
</p>
<p>
But what if we added another instruction, say another <code>ADD</code>, to add the number twice:
</p>
<img src="img/label1.png"/>
<p>
Now we are in a bad spot! The <code>ADD 4</code> instructions reference memory location <code>4</code>, but that
holds the <code>HLT</code> instruction, which has the value <code>0</code>. So we are now inadvertently adding
the value <code>0</code> to the accumulator twice!
</p>
<p>
We could fix this by updating the two instructions to be <code>ADD 5</code>, but it would get annoying if, every
time we changed the program, we had to go back and edit all these instructions. What would be better is if
we could give a particular instruction a <em>symbolic name</em> and then let the assembler figure out the
location that the instruction ends up at.
</p>
<p>
This is exactly what labels give us. Any assembly instruction, including <code>DAT</code> instructions,
can be labeled with a prefix, and then referred to by other instructions via that prefix.
</p>
<p>
Let's give the <code>DAT</code> instruction the label <code>ONE</code>, since it holds the value <code>1</code>
for use in our program:
</p>
<img src="img/label2.png"/>
<p>
We prefix the <code>DAT</code> instruction with the label <code>ONE</code> and then update the <code>ADD</code>
instructions to refer to that label, rather than fixed positions in memory. Now the assembler goes through
the trouble of figuring exactly where the value <code>1</code> ends up in memory, and generates the correct
machine instructions for the <code>ADD</code> assembly instructions.
</p>
<p>
Much nicer!
</p>
<h2>Pseudo-Instructions</h2>
<p>
In addition to the "real" assembly instructions above, LMSM assembly supports two "pseudo-instructions".
Pseudo-instructions are assembly instructions that are valid assembly, but that do not correspond directly
to a single machine instruction.
</p>
<p>
Pseudo-instructions are included in the LMSM to make assembly programming more convenient and because they
are common in other more realistic assembly languages, particularly
<a href="https://en.wikipedia.org/wiki/Reduced_instruction_set_computer">RISC-based</a> languages like
<a href="https://en.wikipedia.org/wiki/MIPS_architecture">MIPS</a>.
</p>
<table>
<thead>
<tr>
<th>Pseudo-Instruction</th>
<th>Assembly Instructions</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>
SPUSHI <XX>
</td>
<td>
LDI <XX><br/>
SPUSH
</td>
<td>
This pseudo-instruction pushes the immediate value given onto the stack, compiling down to two
real instructions
</td>
</tr>
<tr>
<td>
CALL <XX>
</td>
<td>
LDI <XX><br/>
SPUSH<br/>
JAL
</td>
<td>
This pseudo-instruction pushes the address of a function onto the stack and then jump-and-links to it
</td>
</tr>
</tbody>
</table>
<h2>Loops & Conditionals In LMSM</h2>
<p>
To understand how loops and conditionals work in LMSM, let's look at a classic LMC program that asks a user
for input and then counts down:
</p>
<pre>
INP
OUT
LOOP BRZ QUIT
SUB ONE
OUT
BRA LOOP
QUIT HLT
ONE DAT 1
</pre>
<p>
Here is each line explained:
</p>
<ul>
<li>
Line 1 asks the user to enter a number and saves it in the accumulator
</li>
<li>
Line 2 prints that number out.
</li>
<li>
Line 3 is labeled <code>LOOP</code> and jumps (or branches) to the location labeled <code>QUIT</code>
if the accumulator is <code>0</code>
</li>
<li>
Line 4 subtracts the value stored in the memory location labeled <code>ONE</code> from the accumulator.
</li>
<li>
Line 5 prints the accumulator out.
</li>
<li>
Line 6 unconditionally jumps back to the instruction on line 3, labled <code>LOOP</code>
</li>
<li>
Line 7 is labeled <code>QUIT</code> and halts the machine
</li>
<li>
Line 8 is labeled <code>ONE</code> and holds the raw value <code>1</code>
</li>
</ul>
<p>
So, this program will ask the user for a number and then print out a count-down from that number to
<code>0</code>. The conditional branch on line 3 gives us the conditional, akin to an <code>if</code>
statement in a higher level programming language. The unconditional branch on line 6 is what makes the
code "loop".
</p>
<h2>Function Calls</h2>
<p>
The last major bit of functionality to understand about the LMSM is how function calls work. This topic
is sufficiently complicated and important as to deserve its own section, which you can find <a href="functions.html">here</a>.
</p>
</main>
<footer>
</footer>
</body>
</html>