UNIT - 5
Programming – 3
Often it is a good idea to link assembly language programs or routines with high-level programs which may contain resources unavailable to you through direct assembly programming--such as using C's built-in graphics library functions or string-processing functions. Conversely, it is often necessary to include short assembly routines in a compiled high-level program to take advantage of the speed of machine language.
All high-level languages have specific calling conventions that allow one language to communicate with another; i.e., to send variables, values, etc. An assembly-language program written in conjunction with a high-level language must follow these conventions if the two are to be successfully integrated. Usually, high-level languages pass parameters to subroutines on the stack. This is also the case for C.
5.2.1 Using Assembly Language with C/C++ for 16-Bit DOS Applications
Procedure Setup
In order to ensure that the assembly language procedure and the C program will combine and be compatible, the following steps should be followed:
Declare the procedure label global by using the GLOBAL directive. In addition, also declare global any data that will be used.
Use the EXTERN directive to declare global data and procedures as external. It is best to place the EXTERN statement outside the segment definitions and to place near data inside the data segment.
Follow the C naming conventions--i.e., precede all names (both procedures and data) with underscores.
Stack Setup
Whenever entering a procedure, it is necessary to set up a stack frame on which to pass parameters. Of course, if the procedure doesn't use the stack, then it is not necessary. To accomplish the stack setup, include the following code in the procedure:
push ebp
mov ebp, esp
EBP allows us to use this pointer as an index into the stack, and should not be altered throughout the procedure unless caution is taken. Each parameter passed to the procedure can now be accessed as an offset from EBP. This is commonly known as a standard stack frame.
Preserving Registers
It is necessary that the procedure preserve the contents of the registers EBX, ESI, EDI, EBP, and all segment registers. If these registers are corrupted, the calling C program may malfunction when the procedure returns.
Passing Parameters in C to the Procedure
C passes arguments to procedures on the stack. For example, consider the following statements from a C main program:
extern int Sum();
...
int a1, a2, x;
...
x = Sum(a1, a2);
When C executes the function call to Sum, it pushes the input arguments onto the stack in reverse order, then executes a call to Sum. Upon entering Sum, before the procedure sets up its own stack frame, the stack would contain the following (lowest address first):
[ESP] return address
[ESP+4] a1
[ESP+8] a2
Since a1 and a2 are declared as int variables, each takes up one doubleword (four bytes) on the stack. The above method of passing input arguments is called passing by value. The code for Sum, which outputs the sum of the input arguments via register EAX, might look like the following:
_Sum:
push ebp ; create stack frame
mov ebp, esp
mov eax, [ebp+8] ; grab the first argument
mov ecx, [ebp+12] ; grab the second argument
add eax, ecx ; sum the arguments, result stays in EAX
pop ebp ; restore the base pointer
ret ; the C caller removes the arguments
It is interesting to note several things. First, the assembly code returns the value of the result to the C program through EAX implicitly. Second, a simple RET statement is all that is necessary when returning from the procedure. This is due to the fact that C takes care of removing the passed parameters from the stack. Unfortunately, passing by value has the drawback that we can only return one output value. What if Sum must output several values, or if Sum must modify one of the input variables? To accomplish this, we must pass arguments by reference. In this method of argument transmission, the addresses of the arguments are passed, not their values.
The address may be just an offset, or both an offset and a segment. For example, suppose Sum wishes to modify a2 directly, perhaps storing the result in a2 such that a2 = a1 + a2. The following function call from C could be used: Sum(a1, &a2); The first argument is still passed by value (i.e., only its value is placed on the stack), but the second argument is passed by reference (its address is placed on the stack). The "&" prefix means "address of." We say that &a2 is a "pointer" to the variable a2. Using the above statement, the stack would contain the following upon entering Sum (lowest address first):
[ESP] return address
[ESP+4] value of a1
[ESP+8] address of a2
Note that the address of a2 is pushed on the stack, not its value. With this information, Sum can access the variable a2 directly.
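The difference between the two calling styles can be sketched in C++. This is only an illustration under stated assumptions: the names sum_by_value and sum_into are hypothetical and stand in for the assembly routine _Sum.

```cpp
#include <cassert>

// Pass by value: both arguments are copied onto the stack, and the
// single result comes back in the return register (EAX on 32-bit x86).
int sum_by_value(int a1, int a2)
{
    return a1 + a2;
}

// Pass by reference: the caller passes the address of a2, so the
// routine can store the result directly into the caller's variable.
void sum_into(int a1, int *a2)
{
    assert(a2 != nullptr);   // the address must be valid
    *a2 = a1 + *a2;          // a2 = a1 + a2
}
```

A call such as sum_into(a1, &a2) corresponds to the Sum(a1, &a2) call described above.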
Returning a Value from the Procedure
Assembly returns values to the calling C program through the EAX register. If the returned value is four bytes or smaller, the value itself is returned in EAX (or in its low part, AX or AL). If the item is larger than four bytes, EAX holds a pointer to the item. Here is a short table of the C variable types and how they are returned by the assembly code:
Data Type             Register
char                  AL
short                 AX
int, long, pointer    EAX
Allocating Local Data Space on the Stack
Temporary storage space for local variables or data can be created by decreasing the contents of ESP just after setting up a stack frame at the beginning of the procedure. It is important to restore the stack space at the end of the procedure. The following code fragment illustrates the basic idea:
push ebp ; Save caller's stack frame
mov ebp, esp ; Establish new stack frame
sub esp, 4 ; Allocate 4 bytes of local data space
push esi ; Save critical registers
push edi
; ... (procedure body)
pop edi ; Restore critical registers
pop esi
mov esp, ebp ; Restore the stack
pop ebp ; Restore the frame
ret
Using C Functions in Assembly Procedures
In most cases, calling C library routines or functions from an assembly program is more complex than calling assembly programs from C. An example of how to call the printf library function from within an assembly program is shown next, followed by comments on how it actually works.
global _main
extern _printf
section .data
text db "291 is the best!", 10, 0
strformat db "%s", 0
section .code
_main:
push dword text
push dword strformat
call _printf
add esp, 8
ret
Key takeaway
Notice that the procedure is declared global, and its name must be _main, which is the starting point of all C code. Since C pushes its arguments onto the stack in reverse order, the offset of the string is pushed first, followed by the offset of the format string. The C function can then be called, but care must be taken to restore the stack once it has completed. When linking the assembly code, include the standard C library (or the library containing the functions you use) in the link. For a more detailed (and perhaps more accurate) description of the procedures involved in calling C functions, refer to another text on the subject.
5.2.2 Using Assembly Language with C/C++ for 32-Bit Applications
A major difference exists between 16-bit and 32-bit applications. The 32-bit applications are written using Microsoft Visual C/C++ Express for Windows, and the 16-bit applications are written using Microsoft C++ for DOS. Visual C/C++ Express for Windows is more common today, but it cannot easily call DOS functions such as INT 21H. It is suggested that embedded applications that do not require a visual interface be written in 16-bit C or C++, and that applications that incorporate Microsoft Windows or Windows CE (available for use on a ROM or Flash device for embedded applications) use 32-bit Visual C/C++ Express for Windows.
A 32-bit application is written by using any of the 32-bit registers, and the memory space is essentially limited to 2 GB for Windows. The free version of Visual C++ Express does not support 64-bit applications written in assembly language at this time. The only difference from a DOS program is that you may not use the DOS function calls; instead, use the getch() or getche() and putch() C/C++ console functions available for use with DOS console applications. Embedded applications use direct assembly language instructions to access I/O devices in an embedded system. In the Visual interface, all I/O is handled by the Windows operating system framework.
Console applications in WIN32 run in native mode, which allows assembly language to be included in the program with nothing more than the _asm keyword. Windows forms applications are more challenging because they operate in managed mode, which does not run in the native mode of the microprocessor. Managed applications operate in a pseudo mode that does not generate native code.
Many programs are too large to be developed by one person [Ref 1]. This means that programs are routinely developed by teams of programmers. The linker program is provided with Visual Studio so that programming modules can be linked together into a complete program. Linking is also available from the command prompt provided by Windows. This section of the text describes the linker, the linking task, library files, EXTRN, and PUBLIC as they apply to program modules and modular programming. The assembler program converts a symbolic source module (file) into a hexadecimal object file.
The PUBLIC and EXTRN directives are very important to modular programming because they allow communication between modules. We use PUBLIC to declare that labels of code, data, or entire segments are available to other program modules. EXTRN (external) declares that labels are external to a module. Without these statements, modules could not be combined into a program by using modular programming techniques: they might link, but one module would not be able to communicate with another. The PUBLIC directive is placed in the opcode field of an assembly language statement to define a label as public, so that the label can be used (seen) by other modules. The label declared as public can be a jump address, a data address, or an entire segment. The example below shows the PUBLIC statement used to define some labels and make them public to other modules in a program fragment. When segments are made public, they are combined with other public segments that contain data with the same segment name.
EXAMPLE
.model flat, c
.data
public Data1 ;declare Data1 and Data2 public
public Data2
0000 0064[ 00 ] Data1 db 100 dup(?)
0064 0064[ 00 ] Data2 db 100 dup(?)
.code
.startup
public Read ;declare Read public
0006 B4 06 Read proc far
mov ah,6
The EXTRN statement appears in both data and code segments to define labels as external to the segment. If data are defined as external, their sizes must be defined as BYTE, WORD, or DWORD. If a jump or call address is external, it must be defined as NEAR or FAR.
Library files are collections of procedures that are used by many different programs. These procedures are assembled or compiled into a library file by the LIB program that accompanies the MASM assembler program. Libraries allow common procedures to be collected into one place so they can be used by many different applications. A library file is created with the LIB command, which executes the LIB.EXE program that is supplied with Visual Studio. A library file is a collection of assembled .OBJ files that contain procedures or tasks written in assembly language or any other language. The example below shows the C++ declarations of two separate functions (UpperCase and LowerCase) included in a module that is written for Windows, which will be used to structure a library file.
EXAMPLE
extern "C" char UpperCase(char);
extern "C" char LowerCase(char);
A macro is a group of instructions that perform one task, just as a procedure performs one task. The difference is that a procedure is accessed via a CALL instruction, whereas a macro, and all the instructions defined in the macro, is inserted in the program at the point of usage. Creating a macro is very similar to creating a new opcode, which is actually a sequence of instructions, in this case, that can be used in the program.
Macro sequences execute faster than procedures because there is no CALL or RET instruction to execute. The instructions of the macro are placed in your program by the assembler at the point where they are invoked. The MACRO and ENDM directives delineate a macro sequence. The first statement of a macro is the MACRO instruction, which contains the name of the macro and any parameters associated with it. An example is MOVE MACRO A,B, which defines the macro name as MOVE.
The example below shows how a macro is created and used in a program. The first six lines of code define the macro. This macro moves the word-sized contents of memory location B into word-sized memory location A. After the macro is defined in the example, it is used twice. The listing shows the macro as expanded by the assembler, so that you can see how it assembles to generate the moves. Any hexadecimal machine language statement followed by a number (1, in this example) is a macro expansion statement.
EXAMPLE
MOVE MACRO A,B
PUSH AX
MOV AX,B
MOV A,AX
POP AX
ENDM
MOVE VAR1,VAR2 ;move VAR2 into VAR1
0000 50 1 PUSH AX
0001 A1 0002 R 1 MOV AX,VAR2
0004 A3 0000 R 1 MOV VAR1,AX
0007 58 1 POP AX
MOVE VAR3,VAR4 ;move VAR4 into VAR3
0008 50 1 PUSH AX
0009 A1 0006 R 1 MOV AX,VAR4
000C A3 0004 R 1 MOV VAR3,AX
000F 58 1 POP AX
Key takeaway
Macro definitions can be placed in the program file as shown, or they can be placed in their own macro module. A file can be created that contains only macros to be included with other program files. We use the INCLUDE directive to indicate that a program file will include a module that contains external macro definitions. Although this is not a library file, for all practical purposes it functions as a library of macro sequences.
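The same trade-off between a macro and a procedure exists in C and C++, which may make the distinction easier to see. The sketch below is an analogy only, not MASM: the C preprocessor macro MOVE is pasted inline at each use, while the hypothetical function move_proc is reached by a call. The names move_proc and demo are illustrative assumptions.

```cpp
#include <cassert>

// Macro: the preprocessor pastes the body at every point of use,
// so no call or return is executed.
#define MOVE(A, B) do { (A) = (B); } while (0)

// Procedure: one copy of the code, reached by a call.
void move_proc(short *a, const short *b)
{
    assert(a != nullptr && b != nullptr);
    *a = *b;
}

short demo()
{
    short var1 = 0, var2 = 7, var3 = 0, var4 = 9;
    MOVE(var1, var2);         // expands inline: var1 = var2
    move_proc(&var3, &var4);  // real call: var3 = var4
    return static_cast<short>(var1 + var3);
}
```

Placing such macros in a header file and pulling them in with #include plays the same role as the INCLUDE directive described above.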
5.4.1 Using the Keyboard Display
The keyboard of the personal computer is read by many different objects available to Visual C++. Data read from the keyboard are either in ASCII-coded or in extended ASCII-coded form. They are then either stored in 8-bit ASCII form or in 16-bit Unicode form. As mentioned in an earlier chapter, Unicode contains ASCII code in the codes 0000H–00FFH. The remaining codes are used for foreign language character sets. Do not use cin or getch to read keys in Visual C++ as we do in a DOS C++ console application; in place of cin or getch we use controls in Visual C++ that accomplish the same task.
Creating a Visual C++ Express application that contains a simple textbox gives a better understanding of reading a key in Windows. The figure below shows such an application written as a forms-based application. Recall that to create a forms-based application:
1. Start Visual C++ Express.
2. Click on Create: Project.
3. Select a CLR Windows Forms Application, then give it a name and click on OK.
Fig Textbox with filtering [Ref1]
Once the new forms-based application is created, select the textbox control from the toolbox and draw it on the screen of the dialog box, as illustrated in the figure above.
The first thing that should be added to the application is a call that sets focus to the textbox control. When focus is set, the cursor moves to the object, in this case the textbox. Focus is set to a control by using textBox1->Focus(), because in our case the textbox control is named textBox1. This statement is placed in the Form1_Load function, which must be installed by double-clicking on a blank area of the form. The Form1_Load function can also be installed by clicking on the yellow lightning bolt, selecting Load, and then double-clicking on the blank textbox to its right. The application will now set focus to the textBox1 control when started. This means that the blinking cursor appears inside the textbox control.
When the application runs, the textbox accepts and displays every character typed into it. In some cases this may be undesirable and may require some filtering. One such case is if the program requires that the user enter only hexadecimal data. In order to intercept keystrokes as they are typed, the event handlers KeyDown and KeyPress are used for the textbox. The KeyDown event handler is called when the key is pressed down, which is followed by a call to the KeyPress event handler. To insert these functions into the application for the textbox control, click on the textbox and then select the Properties window. Next find the yellow lightning bolt and click on it, and install KeyDown and KeyPress event handlers for the textBox1 control.
To illustrate filtering, this application uses the KeyDown function to look at each keyboard character that is typed before the program uses the keystroke. This allows the characters to be modified. Here the program only allows the numbers 0 through 9 and the letters A through F to be typed from the keyboard. If a lowercase letter is typed, it is converted into uppercase. If any other key is typed, it is ignored.
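The filtering rule itself is independent of Windows and can be sketched in portable C++. The function names filter_hex_key and filter_hex_text are hypothetical, and returning 0 to mean "ignore this key" is an assumption made for this sketch; in the forms application the same decision would be made inside the key event handlers.

```cpp
#include <cassert>
#include <string>

// Returns the character to keep, or 0 when the key should be ignored.
char filter_hex_key(char key)
{
    if (key >= '0' && key <= '9') return key;
    if (key >= 'A' && key <= 'F') return key;
    if (key >= 'a' && key <= 'f') return static_cast<char>(key - 32); // to uppercase
    return 0; // any other key is ignored
}

// Applies the filter to a whole line of typed characters.
std::string filter_hex_text(const char *typed)
{
    assert(typed != nullptr);
    std::string kept;
    for (; *typed != 0; ++typed) {
        char c = filter_hex_key(*typed);
        if (c != 0) kept += c;
    }
    return kept;
}
```

Subtracting 32 works because the lowercase ASCII letters sit exactly 20H above their uppercase forms.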
5.4.2 Using Video Display
As with the keyboard, in Visual C++ objects are used to display information. The textbox control can be used to either read data or display data as can most objects. Notice that a few label controls have been added to the form to identify the contents of the textbox controls. In this new application the keyboard data is still read into textbox control textBox1, but when the Enter key is typed, a decimal version of the data entered into textBox1 appears in textBox2—the second textbox control. Make sure that the second control is named textBox2 to maintain compatibility with the software presented here. To cause the program to react to the Enter key, ASCII code 13 (0DH or 0x0d), modify the KeyPress function as shown in Example below. Notice how the Enter key is detected using an else if.
EXAMPLE
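private: System::Void textBox1_KeyPress(System::Object^ sender,
    System::Windows::Forms::KeyPressEventArgs^ e)
{
    // keyHandled is not declared in this fragment; it is assumed to be
    // a bool member of the form that the KeyDown handler sets
    if (e->KeyChar >= 'a' && e->KeyChar <= 'f')
    {
        e->KeyChar -= 32;   // convert lowercase a-f to uppercase
    }
    else if (e->KeyChar == 13)   // Enter key
    {
        // software to display the decimal version in textBox2
        keyHandled = true;
    }
    e->Handled = keyHandled;
}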
Timers are important in programming. A timer is programmed to fire or trigger after a known amount of time, which is given in milliseconds. The allowable range for the timer is from 1 to 2G milliseconds. This allows a timer to be programmed for just about any time needed. If programmed with 1000, the timer fires in 1 second, and if programmed with 60000, the timer fires in 1 minute and so forth. A timer may be used a single time or multiple times and as many timers as needed (up to 2 billion) may appear in a program. The timer is found in the toolbox at the Design window.
The design of the example application, which uses a timer to shift or rotate a binary number, contains two label controls and two button controls plus the timer control. Add all five of these controls to the application. The timer does not appear on the form, but in an area near the bottom of the design screen. Notice that a few properties of the form are changed so the icon is not shown and the Text property of the form is changed to Shift/Rotate instead of Form1.
Fig Shift/Rotate Application Design[Ref1]
Once the form appears as shown in Figure above, add event handler functions (yellow lightning bolt) for the two command buttons (Click) and for the timer (Tick). To add an event handler, click on the button or timer, go to the Properties window at the right of the screen, and click on the yellow lightning bolt and select the event. The program contains three handlers, two for button clicks and one for a timer tick.
Key takeaway
The software for the two button click handlers is nearly identical except for the Boolean variable shift. The two statements in each place text onto the labels. If the shift button is pressed, “Shifted” is displayed, and if the rotate button is pressed, “Rotated” is displayed on label1. The second label has the test number 00011001 displayed. The Boolean variable shift is set to true for the shift button and false for the rotate button. In both button handlers, count is set to 8 to shift or rotate the number 8 places.
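What the timer handler does on each tick can be sketched in portable C++. The source does not show the handler or state the direction, so a left shift and a left rotate of an 8-bit value are assumptions here, as are the function names step and run.

```cpp
#include <cassert>
#include <cstdint>

// One timer tick: move the 8-bit pattern one place to the left.
// shift == true  : the bit leaving the left end is discarded.
// shift == false : the bit leaving the left end re-enters on the right.
std::uint8_t step(std::uint8_t value, bool shift)
{
    std::uint8_t carry = static_cast<std::uint8_t>(value >> 7);  // bit leaving the left end
    std::uint8_t moved = static_cast<std::uint8_t>(value << 1);
    return shift ? moved : static_cast<std::uint8_t>(moved | carry);
}

// Applies the step count times, as the timer would over count ticks.
std::uint8_t run(std::uint8_t value, bool shift, int count)
{
    assert(count >= 0);
    while (count-- > 0) value = step(value, shift);
    return value;
}
```

With the test number 00011001 and a count of 8, shifting clears the value to zero, while rotating brings the original pattern back.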
Conversion from binary to ASCII is accomplished in three ways[Ref1]: (1) by the AAM instruction if the number is less than 100 (provided the 64-bit extensions are not used for the conversion), (2) by a series of decimal divisions (divide by 10), or (3) by using the C++ Convert class function ToString. The AAM instruction converts the value in AX into a two-digit unpacked BCD number in AX. If the number in AX is 0062H (98 decimal) before AAM executes, AX contains 0908H after AAM executes. This is not ASCII code, but it is converted to ASCII code by adding 3030H to AX.
The example below illustrates a program in which a procedure processes the binary value in AL (0–99) and displays it on the video screen as a decimal number. The procedure blanks a leading zero, which occurs for the numbers 0–9, with an ASCII space code. This example program displays the number 74 (testdata) on the video screen. To implement this program, create a forms-based application in Visual C++ and place a single label called label1 on the form. The number 74 will appear if the assembly language function is placed at the top of the program and called from the Form1_Load function, as the example shows.
EXAMPLE
// place at top of program
// will not function in 64-bit mode
void ConvertAam(char number, char* data)
{
_asm
{
mov ebx,data ;pointer to ebx
mov al,number ;get test data
mov ah,0 ;clear AH
aam ;convert to BCD
add ah,20h
cmp ah,20h ;test for leading zero (tens digit is in AH)
je D1 ;if leading zero
add ah,10h ;convert to ASCII
D1:
mov [ebx], ah
add al,30h ;convert to ASCII
mov [ebx+1], al
}
}
private: System::Void Form1_Load(System::Object^ sender,
System::EventArgs^ e)
{
char temp[2]; // place for result
ConvertAam(74, temp);
Char a = temp[0];
Char b = temp[1];
label1->Text = Convert::ToString(a) + Convert::ToString(b);
}
AAM converts any number between 0 and 99 to a two-digit unpacked BCD number because it divides AL by 10. The result is left in AX, so AH contains the quotient (the tens digit) and AL the remainder (the units digit).
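The arithmetic performed by AAM followed by the addition of 3030H can be modeled in portable C++. This is a sketch of the calculation only, not a replacement for the instruction; the function name aam_to_ascii is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Models AAM followed by ADD AX,3030H for a value in the range 0..99.
// Returns the two ASCII digits packed the way AX would hold them:
// high byte = tens digit, low byte = units digit.
std::uint16_t aam_to_ascii(std::uint8_t al)
{
    assert(al < 100);                                   // two digits only below 100
    std::uint8_t ah = static_cast<std::uint8_t>(al / 10);  // quotient goes to AH
    al = static_cast<std::uint8_t>(al % 10);               // remainder stays in AL
    std::uint16_t ax = static_cast<std::uint16_t>((ah << 8) | al);  // unpacked BCD
    return static_cast<std::uint16_t>(ax + 0x3030);                 // to ASCII
}
```

For the value 98 (62H), the intermediate unpacked BCD result is 0908H and the returned ASCII pair is 3938H, which matches the figures given earlier in this section.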
The example below [Ref1] shows how an unsigned 32-bit number is converted to ASCII and displayed on the video screen. Here, we divide EAX by 10 (for decimal) and save the remainder on the stack after each division for later conversion to ASCII. After all the digits have been converted, the result is displayed on the video screen by removing the remainders from the stack and converting them to ASCII code. This program also blanks any leading zeros that occur. Any number base can be used by changing the radix variable in this example. Again, to implement this example create a forms application with the /CLR option and a single label called label1. If the number base is greater than 10, letters are used for the representation of characters beyond 9.
EXAMPLE
void Converts(int number, int radix, char* data)
{
_asm
{
mov ebx,data ;initialize pointer
push radix
mov eax, number ;get test data
L1:
mov edx,0 ;clear edx
div radix ;divide by base
push edx ;save remainder
cmp eax,0
jnz L1 ;repeat until 0
L2:
pop edx ;get remainder
cmp edx,radix
je L4 ;if finished
add dl,30h ;convert to ASCII
cmp dl,39h
jbe L3
add dl,7
L3:
mov [ebx],dl ;save digit
inc ebx ;point to next
jmp L2 ;repeat until done
L4:
mov byte ptr[ebx],0 ;save null in string
}
}
private: System::Void Form1_Load(System::Object^ sender,
System::EventArgs^ e)
{
char temp[32]; // place for result
Converts(7423, 10, temp);
String^ a = "";
int count = 0;
while (temp[count] != 0) // convert to string
{
Char b = temp[count++];
a += b;
}
label1->Text = a;
}
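The divide-and-stack algorithm used above can also be written in portable C++, which may help when checking the assembly version by hand. This is a minimal sketch, assuming a 32-bit unsigned int; the function name to_radix and the local array standing in for the hardware stack are illustrative choices, not part of the original program.

```cpp
#include <cassert>
#include <string>

// Converts an unsigned value to text in the given base by repeated division.
std::string to_radix(unsigned int number, unsigned int radix)
{
    assert(radix >= 2 && radix <= 36);
    char stack[32];              // remainders, least significant digit first
    int depth = 0;
    do {
        stack[depth++] = static_cast<char>(number % radix);  // save remainder
        number /= radix;                                      // keep quotient
    } while (number != 0);

    std::string text;
    while (depth > 0) {          // pop: most significant digit comes out first
        char digit = stack[--depth];
        text += static_cast<char>(digit < 10 ? '0' + digit : 'A' + (digit - 10));
    }
    return text;
}
```

The assembly version marks the bottom of its stack by pushing radix first, which works because every remainder is smaller than radix; the sketch uses a depth counter for the same purpose.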
Conversions from ASCII to binary usually start with keyboard entry. If a single key is typed, the conversion occurs when 30H is subtracted from the number. If more than one key is typed, conversion from ASCII to binary still requires 30H to be subtracted, but there is one additional step. After subtracting 30H, the number is added to the result after the prior result is first multiplied by 10. The algorithm for converting from ASCII to binary is:
1. Begin with a binary result of 0.
2. Subtract 30H from the character to convert it to BCD.
3. Multiply the result by 10, and then add the new BCD digit.
4. Repeat steps 2 and 3 for each character of the number.
The example below shows a program that implements this algorithm. Here, the binary number returned in the variable number is displayed on label1 using the Convert class to convert it to a string. Each time this program executes, it reads a number from the char array temp and converts it to binary for display on the label.
EXAMPLE
int ConvertAscii(char* data)
{
int number = 0;
_asm
{
mov ebx,data ;initialize pointer
mov ecx,0
B1:
mov cl,[ebx] ;get digit
inc ebx ;address next digit
cmp cl,0 ;if null found
je B2
sub cl,30h ;convert from ASCII to BCD
mov eax,10 ;x10
mul number
add eax,ecx ;add digit
mov number,eax ;save result
jmp B1
B2:
}
return number;
}
private: System::Void Form1_Load(System::Object^ sender,
System::EventArgs^ e)
{
char temp[] = "2316"; // string
int number = ConvertAscii(temp);
label1->Text = Convert::ToString(number);
}
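For comparison, the four-step algorithm can be sketched in portable C++ without inline assembly. The function name ascii_to_binary is hypothetical, and the sketch assumes the input holds decimal digits only and fits in an unsigned int.

```cpp
#include <cassert>

// Converts a null-terminated string of decimal digits to binary.
unsigned int ascii_to_binary(const char *text)
{
    assert(text != nullptr);
    unsigned int result = 0;                       // step 1: start at 0
    for (; *text != 0; ++text) {                   // step 4: repeat per character
        assert(*text >= '0' && *text <= '9');      // digits only (assumption)
        unsigned int digit = static_cast<unsigned int>(*text - 0x30);  // step 2
        result = result * 10 + digit;              // step 3
    }
    return result;
}
```

Each pass multiplies the running result by 10 before adding the new digit, so the string "2316" is built up as 2, 23, 231, and finally 2316.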
Key takeaway
Conversion from binary to ASCII is accomplished in three ways[Ref1]: (1) by the AAM instruction if the number is less than 100 (provided the 64-bit extensions are not used for the conversion), (2) by a series of decimal divisions (divide by 10), or (3) by using the C++ Convert class function ToString.
References:
1. Barry B. Brey: The Intel Microprocessors, 8th Edition, Pearson Education, 2009.
2. Douglas V. Hall: Microprocessors and Interfacing, Revised 2nd Edition, TMH, 2006.
3. K. Udaya Kumar & B.S. Umashankar: Advanced Microprocessors & IBM-PC Assembly Language Programming, TMH, 2003.
4. James L. Antonakos: The Intel Microprocessor Family: Hardware and Software Principles and Applications, Cengage Learning, 2007.