Unit - 2
Macro Processor and Compilers
A macro is a single-line abbreviation for a group of instructions. Once a macro is defined, that group of instructions can be used anywhere in a program simply by writing the macro name. An assembly language programmer often needs to repeat a block of code several times in a program. Instead of writing the block out each time, the programmer defines one macro instruction to represent it and then writes that instruction wherever the block is needed. An assembly language macro is thus a single instruction that stands for several machine language instructions.
The macro facility lets you attach a name to a sequence of instructions that occurs several times in a program and then use that name wherever the sequence is needed. The name is attached by means of a macro instruction definition.
A macro definition is laid out as follows. The first line is the MACRO pseudo-op. The next line is the macro name line, which gives the name of the macro instruction. The name line is followed by the sequence of instructions being abbreviated, that is, the instructions that make up the macro. The last statement is the MEND pseudo-op, which marks the end of the macro definition.
Key takeaway
Macros are single-line abbreviations for a certain group of instructions. Once the macro is defined, these groups of instructions can be used anywhere in a program.
Macro instruction arguments
Example 1 used a single fixed operand, DATA. To apply the same sequence of operations to different operands, the macro facility provides arguments (parameters) in macro calls. The corresponding dummy arguments appear in the macro definition.
Consider the following example 2:
A 1,DATA1
A 2,DATA1
A 3,DATA1
A 1,DATA2
A 2,DATA2
A 3,DATA2
DATA1 DC F'5'
DATA2 DC F'4'
In the above example, DATA1 and DATA2 are the operands added to registers 1, 2 and 3. The same sequence of add instructions is repeated twice with different operands. For this case the macro can be defined in the following manner:
SOURCE | EXPANDED SOURCE |
MACRO | |
INCR &arg | |
A 1,&arg | |
A 2,&arg | |
A 3,&arg | |
MEND | |
INCR DATA1 (use DATA1 as operand) | A 1,DATA1 |
 | A 2,DATA1 |
 | A 3,DATA1 |
INCR DATA2 (use DATA2 as operand) | A 1,DATA2 |
 | A 2,DATA2 |
 | A 3,DATA2 |
DATA1 DC F'5' | DATA1 DC F'5' |
DATA2 DC F'4' | DATA2 DC F'4' |
An operand that varies from one macro call to the next is passed as a macro instruction argument. The matching placeholder in the macro definition is called a dummy argument and is marked with the & symbol. A macro call may supply more than one argument. Each argument must correspond to a dummy argument on the macro name line of the macro definition. When a macro call is processed, the arguments supplied are substituted for the respective dummy arguments in the macro definition.
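The substitution of actual arguments for dummy arguments can be sketched in a few lines of Python. This is only an illustration, not how any particular assembler implements it: the function name expand_call and the list-based representation of the macro name line and body are assumptions made for the example.

```python
import re

def expand_call(params, body, args):
    """Replace each dummy argument (&name) in the body with the actual argument."""
    if len(params) != len(args):
        raise ValueError("argument count does not match the macro name line")
    binding = dict(zip(params, args))
    # A dummy argument is an ampersand followed by an identifier, e.g. &arg.
    pattern = re.compile(r"&(\w+)")
    return [pattern.sub(lambda m: binding[m.group(1)], line) for line in body]

# The INCR macro of example 2: one dummy argument, three add instructions.
INCR_PARAMS = ["arg"]
INCR_BODY = ["A 1,&arg", "A 2,&arg", "A 3,&arg"]

print(expand_call(INCR_PARAMS, INCR_BODY, ["DATA1"]))
# prints ['A 1,DATA1', 'A 2,DATA1', 'A 3,DATA1']
```

Calling expand_call a second time with ["DATA2"] yields the second group of add instructions, which is exactly what the two INCR calls in the table above expand to.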
Consider the following example 3:
Loop1 A 1,DATA1
A 2,DATA2
A 3,DATA3
Loop2 A 1,DATA3
A 2,DATA2
A 3,DATA1
DATA1 DC F'5'
DATA2 DC F'4'
DATA3 DC F'2'
This program could be written as
SOURCE | EXPANDED SOURCE |
MACRO | |
&1 INCR &arg1,&arg2,&arg3 | |
&1 A 1,&arg1 | |
A 2,&arg2 | |
A 3,&arg3 | |
MEND | |
Loop1 INCR DATA1,DATA2,DATA3 | Loop1 A 1,DATA1 |
 | A 2,DATA2 |
 | A 3,DATA3 |
Loop2 INCR DATA3,DATA2,DATA1 | Loop2 A 1,DATA3 |
 | A 2,DATA2 |
 | A 3,DATA1 |
DATA1 DC F'5' | DATA1 DC F'5' |
DATA2 DC F'4' | DATA2 DC F'4' |
DATA3 DC F'2' | DATA3 DC F'2' |
Here four arguments are specified, including the label argument. A label argument is treated like any other operand of the macro instruction.
Two pseudo-operations are used to write conditional macros:
AIF
AGO
Consider the following example 4:
Loop1 A 1,DATA1
A 2,DATA2
A 3,DATA3
Loop2 A 1,DATA3
A 2,DATA2
Loop3 A 1,DATA1
DATA1 DC F'5'
DATA2 DC F'4'
DATA3 DC F'2'
In the above example, a different number of add instructions follows each loop label. The program can be written with a single conditional macro:
MACRO
&arg0 vary &count,&arg1,&arg2,&arg3
&arg0 A 1,&arg1
AIF (&count EQ 1).FINI
A 2,&arg2
AIF (&count EQ 2).FINI
A 3,&arg3
.FINI MEND
SOURCE | EXPANDED SOURCE |
Loop1 vary 3,DATA1,DATA2,DATA3 | Loop1 A 1,DATA1 |
 | A 2,DATA2 |
 | A 3,DATA3 |
Loop2 vary 2,DATA3,DATA2 | Loop2 A 1,DATA3 |
 | A 2,DATA2 |
Loop3 vary 1,DATA1 | Loop3 A 1,DATA1 |
DATA1 DC F'5' | DATA1 DC F'5' |
DATA2 DC F'4' | DATA2 DC F'4' |
DATA3 DC F'2' | DATA3 DC F'2' |
Labels starting with a period (.), such as .FINI, are macro labels and do not appear in the output of the macro processor. The statement AIF (&count EQ 1).FINI directs the macro processor to skip to the statement labelled .FINI if the argument corresponding to &count is 1; otherwise the macro processor continues with the statement following the AIF pseudo-operation. AIF is thus a conditional branch pseudo-operation. AGO is an unconditional branch, or go-to, statement. It specifies a label appearing on some other statement in the macro instruction definition.
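How AIF and AGO steer expansion can be sketched in Python. The sketch below is a simplified illustration under stated assumptions: the function name expand_conditional is hypothetical, only the condition form (&x EQ value) is recognised, the body is a list of strings, and the label argument (&arg0) is left out to keep the example short.

```python
import re

def expand_conditional(body, binding):
    """Expand a macro body that may contain AIF, AGO and period-prefixed macro labels."""
    # Record the line index of every macro label such as ".FINI".
    labels = {}
    for index, line in enumerate(body):
        first = line.split()[0]
        if first.startswith("."):
            labels[first] = index
    output, pc = [], 0
    while pc < len(body):
        tokens = body[pc].split()
        if tokens[0].startswith("."):
            tokens = tokens[1:]              # macro labels never reach the output
        op = tokens[0]
        if op == "MEND":
            break
        if op == "AGO":                      # unconditional branch
            pc = labels[tokens[1]]
            continue
        if op == "AIF":                      # conditional branch: AIF (&x EQ v).LABEL
            m = re.fullmatch(r"\((&\w+) EQ (\w+)\)(\.\w+)", " ".join(tokens[1:]))
            if binding[m.group(1)] == m.group(2):
                pc = labels[m.group(3)]
            else:
                pc += 1
            continue
        text = " ".join(tokens)
        output.append(re.sub(r"&\w+", lambda m: binding.get(m.group(0), ""), text))
        pc += 1
    return output

VARY_BODY = [
    "A 1,&arg1",
    "AIF (&count EQ 1).FINI",
    "A 2,&arg2",
    "AIF (&count EQ 2).FINI",
    "A 3,&arg3",
    ".FINI MEND",
]

print(expand_conditional(VARY_BODY, {"&count": "2", "&arg1": "DATA3", "&arg2": "DATA2"}))
# prints ['A 1,DATA3', 'A 2,DATA2']
```

With &count equal to 2 the second AIF jumps straight to .FINI, so the third add instruction is never generated, and neither the AIF lines nor the .FINI label appear in the result.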
One macro within another macro is called a nested macro. Consider the following example 5:
MACRO
ADD1 &arg
L 1,&arg
A 1,=F'1'
ST 1,&arg
MEND
MACRO
ADDS &arg1,&arg2,&arg3
ADD1 &arg1
ADD1 &arg2
ADD1 &arg3
MEND
The above program uses macros at more than one level: the macro ADDS invokes another macro, ADD1. Macro calls within macros can involve several levels.
SOURCE | EXPANDED SOURCE (Level 1) | EXPANDED SOURCE (Level 2) |
MACRO | | |
ADD1 &arg | | |
L 1,&arg | | |
A 1,=F'1' | | |
ST 1,&arg | | |
MEND | | |
MACRO | | |
ADDS &arg1,&arg2,&arg3 | | |
ADD1 &arg1 | | |
ADD1 &arg2 | | |
ADD1 &arg3 | | |
MEND | | |
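Multi-level expansion can be illustrated with a short recursive Python sketch. It is an assumption-laden illustration rather than a real macro processor: the MACROS table, the function name expand, the depth limit and the sample call ADDS DATA1,DATA2,DATA3 are all made up for the example; only the ADD1 and ADDS definitions come from example 5.

```python
import re

# Macro table: name -> (dummy arguments, body lines), following example 5.
MACROS = {
    "ADD1": (["&arg"], ["L 1,&arg", "A 1,=F'1'", "ST 1,&arg"]),
    "ADDS": (["&arg1", "&arg2", "&arg3"], ["ADD1 &arg1", "ADD1 &arg2", "ADD1 &arg3"]),
}

def expand(line, depth=0, limit=10):
    """Return the fully expanded lines for one source line, following nested calls."""
    if depth > limit:
        raise RecursionError("macro nesting too deep")
    parts = line.split(None, 1)
    name = parts[0]
    if name not in MACROS:
        return [line]                        # ordinary instruction: copy unchanged
    params, body = MACROS[name]
    args = parts[1].split(",") if len(parts) > 1 else []
    binding = dict(zip(params, args))
    result = []
    for body_line in body:
        substituted = re.sub(r"&\w+", lambda m: binding[m.group(0)], body_line)
        # Level 1 produces ADD1 calls; the recursive call expands them at level 2.
        result.extend(expand(substituted, depth + 1, limit))
    return result

for expanded_line in expand("ADDS DATA1,DATA2,DATA3"):
    print(expanded_line)
```

The hypothetical call first expands into three ADD1 calls (level 1), and each of those expands into a load, add and store sequence (level 2), giving nine instructions in total.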
Example of nested macro
TEST START 2000h
MACROS MACRO
CELTOFER MACRO &CEL,&FER
LDA &CEL
MULT NINE
DIV FIVE
ADD THIRTYTWO
STA &FER
MEND
MEND
MACROF MACRO
CELTOFER MACRO &CEL,&FER
LDAF &CEL
MULTF NINE
DIVF FIVE
ADD THIRTYTWO
STA &FER
MEND
MEND
Key takeaway
A nested macro call is a macro call that appears within the body of another macro. A macro may also be defined within another macro definition. When the macro processor encounters a macro call whose expansion contains another macro call, it generates the text of the nested macro and places it on the input stack to be processed next.
Consider the following example 1
A 1,DATA //add contents of DATA to register 1
A 2,DATA //add contents of DATA to register 2
A 3,DATA //add contents of DATA to register 3
A 1,DATA //add contents of DATA to register 1
A 2,DATA //add contents of DATA to register 2
A 3,DATA //add contents of DATA to register 3
In the above example the following sequence is repeated two times.
A 1,DATA
A 2,DATA
A 3,DATA
Rather than typing this sequence more than once, we can simply define a macro.
A macro definition starts with the keyword MACRO and ends with the keyword MEND, i.e. it is defined in the following manner:
MACRO
//macro name
//body of macro
MEND
The above example can be written using macro as follows
SOURCE | EXPANDED SOURCE |
MACRO | |
INCR | |
A 1,DATA | |
A 2,DATA | |
A 3,DATA | |
MEND | |
INCR | A 1,DATA |
 | A 2,DATA |
 | A 3,DATA |
INCR | A 1,DATA |
 | A 2,DATA |
 | A 3,DATA |
DATA DC F'5' | DATA DC F'5' |
In the above case the macro processor replaces each macro call with the lines
A 1,DATA
A 2,DATA
A 3,DATA
A macro is invoked by writing its name, in this case INCR.
The process of replacement is called expanding the macro. The macro definition itself does not appear in the expanded source code.
A one-pass macro processor that alternates between macro definition and macro expansion can handle a "macro in macro", that is, a macro definition that appears inside another macro.
Because of the one-pass structure, the definition of a macro must appear in the source program before any statement that calls that macro.
This is a reasonable restriction that causes no real inconvenience.
Global variables
In a one-pass macro processor, there are three basic data structures involved:
1. DEFTAB
● Stores the macro definition, consisting of the macro prototype and the macro body.
● Comment lines are omitted.
● References to macro instruction parameters are converted to positional notation so that arguments can be substituted efficiently.
2. NAMTAB
● Stores the macro names, which serve as an index to DEFTAB, together with pointers to the beginning and end of each definition.
3. ARGTAB
● Used during the expansion of macro invocations.
● When a macro invocation statement is encountered, the arguments are stored in this table according to their position in the argument list.
Data structure
● Each macro name is entered into NAMTAB together with two pointers, one to the beginning and one to the end of its definition in DEFTAB.
● The third data structure is the ARGTAB argument table, which is used during macro invocation expansion.
● The arguments are saved in ARGTAB in the order in which they appear in the argument list.
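The three tables can be pictured with a small Python sketch. Everything concrete here is an assumption made for illustration: the helper names, the use of a list and a dictionary, and the ?1, ?2, ... spelling of positional notation, which the text above does not specify.

```python
import re

DEFTAB = []   # macro prototypes and bodies, one line per entry
NAMTAB = {}   # macro name -> (index of first line, index of last line) in DEFTAB
ARGTAB = []   # arguments of the invocation being expanded, by position

def store_definition(name, params, body):
    """Append a definition to DEFTAB in positional notation and index it in NAMTAB."""
    start = len(DEFTAB)
    DEFTAB.append(name + " " + ",".join(params))           # prototype line
    position = {p: "?%d" % (i + 1) for i, p in enumerate(params)}
    for line in body:
        DEFTAB.append(re.sub(r"&\w+", lambda m: position.get(m.group(0), m.group(0)), line))
    DEFTAB.append("MEND")
    NAMTAB[name] = (start, len(DEFTAB) - 1)

def load_arguments(operands):
    """Fill ARGTAB from the operand field of a macro invocation."""
    ARGTAB[:] = operands.split(",")

def body_with_arguments(name):
    """Return the body lines of a macro with each ?n replaced by ARGTAB[n-1]."""
    start, end = NAMTAB[name]
    return [re.sub(r"\?(\d+)", lambda m: ARGTAB[int(m.group(1)) - 1], line)
            for line in DEFTAB[start + 1:end]]

store_definition("INCR", ["&arg1", "&arg2", "&arg3"],
                 ["A 1,&arg1", "A 2,&arg2", "A 3,&arg3"])
load_arguments("DATA1,DATA2,DATA3")
print(DEFTAB)   # prints ['INCR &arg1,&arg2,&arg3', 'A 1,?1', 'A 2,?2', 'A 3,?3', 'MEND']
print(NAMTAB)   # prints {'INCR': (0, 4)}
print(body_with_arguments("INCR"))   # prints ['A 1,DATA1', 'A 2,DATA2', 'A 3,DATA3']
```

Storing the body in positional notation means that expansion only has to index into ARGTAB; it never needs to search for parameter names.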
Algorithm
Procedure DEFINE
● When the beginning of a macro definition is recognized, this function is called.
● Fill in the appropriate fields in DEFTAB and NAMTAB.
Procedure EXPAND
● Called to expand a macro invocation statement and set up the argument values in ARGTAB.
Procedure GETLINE
● Gets the next line to be processed, either from DEFTAB (while a macro is being expanded) or from the input file.
Algorithm for one pass macro processor
Begin {macro processor}
  EXPANDING := FALSE
  While OPCODE ≠ 'END' do
    Begin
      GETLINE
      PROCESSLINE
    End {while}
End {macro processor}
Procedure PROCESSLINE
Begin
  Search NAMTAB for OPCODE
  If found then
    EXPAND
  Else if OPCODE = 'MACRO' then
    DEFINE
  Else write source line to expanded file
End {PROCESSLINE}
Procedure EXPAND
Begin
  EXPANDING := TRUE
  Get first line of macro definition {prototype} from DEFTAB
  Set up arguments from macro invocation in ARGTAB
  Write macro invocation to expanded file as a comment
  While not end of macro definition do
    Begin
      GETLINE
      PROCESSLINE
    End {while}
  EXPANDING := FALSE
End {EXPAND}
Procedure GETLINE
Begin
  If EXPANDING then
    Begin
      Get next line of macro definition from DEFTAB
      Substitute arguments from ARGTAB for positional notation
    End {if}
  Else
    Read next line from input file
End {GETLINE}
Procedure DEFINE
Begin
  Enter macro name into NAMTAB
  Enter macro prototype into DEFTAB
  LEVEL := 1
  While LEVEL > 0 do
    Begin
      GETLINE
      If this is not a comment line then
        Begin
          Substitute positional notation for parameters
          Enter line into DEFTAB
          If OPCODE = 'MACRO' then
            LEVEL := LEVEL + 1
          Else if OPCODE = 'MEND' then
            LEVEL := LEVEL - 1
        End {if not comment}
    End {while}
  Store in NAMTAB pointers to beginning and end of definition
End {DEFINE}
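The procedures above can be turned into a compact runnable Python sketch. It rests on several assumptions that are not in the text: a definition is written as NAME MACRO params on one line, ordinary statements carry no labels, comment lines start with "*", positional notation is spelled ?1, ?2, ..., and the program ends with an END statement. Like the pseudocode, it handles macro definitions inside macros but not macro calls inside macro bodies.

```python
import re

COMMENT = "*"   # assumed comment marker; the text does not fix one

class OnePassMacroProcessor:
    def __init__(self, lines):
        self.source = iter(lines)
        self.deftab = []          # prototypes and bodies in positional notation
        self.namtab = {}          # name -> (start, end) indices into deftab
        self.argtab = []          # arguments of the invocation being expanded
        self.expanding = False
        self.next_def = 0         # next deftab index to read while expanding
        self.expanded = []        # the expanded output file

    @staticmethod
    def opcode(line):
        tokens = line.split()
        if len(tokens) > 1 and tokens[1] == "MACRO":
            return "MACRO"
        return tokens[0] if tokens else ""

    def getline(self):
        if self.expanding:
            line = self.deftab[self.next_def]
            self.next_def += 1
            # Substitute arguments from ARGTAB for positional notation.
            return re.sub(r"\?(\d+)", lambda m: self.argtab[int(m.group(1)) - 1], line)
        return next(self.source)

    def processline(self, line):
        op = self.opcode(line)
        if op in self.namtab:
            self.expand(line)
        elif op == "MACRO":
            self.define(line)
        else:
            self.expanded.append(line)

    def define(self, prototype):
        tokens = prototype.split()
        name = tokens[0]
        params = tokens[2].split(",") if len(tokens) > 2 else []
        position = {p: "?%d" % (i + 1) for i, p in enumerate(params)}
        start = len(self.deftab)
        self.deftab.append(prototype)
        level = 1
        while level > 0:
            line = self.getline()
            if line.startswith(COMMENT):
                continue                      # comment lines are not stored
            line = re.sub(r"&\w+", lambda m: position.get(m.group(0), m.group(0)), line)
            self.deftab.append(line)
            op = self.opcode(line)
            if op == "MACRO":
                level += 1                    # a nested definition begins
            elif op == "MEND":
                level -= 1
        self.namtab[name] = (start, len(self.deftab) - 1)

    def expand(self, invocation):
        tokens = invocation.split()
        start, end = self.namtab[tokens[0]]
        self.argtab = tokens[1].split(",") if len(tokens) > 1 else []
        self.expanded.append(COMMENT + " " + invocation)   # invocation kept as a comment
        self.expanding = True
        self.next_def = start + 1             # skip the prototype line
        while self.next_def < end:            # stop at the closing MEND
            self.processline(self.getline())
        self.expanding = False

    def run(self):
        op = ""
        while op != "END":
            line = self.getline()
            op = self.opcode(line)
            self.processline(line)
        return self.expanded

program = ["INCR MACRO &arg1,&arg2", "A 1,&arg1", "A 2,&arg2", "MEND",
           "INCR DATA1,DATA2", "END"]
print(OnePassMacroProcessor(program).run())
# prints ['* INCR DATA1,DATA2', 'A 1,DATA1', 'A 2,DATA2', 'END']
```

Because DEFINE counts MACRO and MEND with LEVEL, an outer macro whose body contains a complete inner definition is stored whole; invoking the outer macro then feeds the inner definition back through PROCESSLINE, which is how "macro in macro" is handled in a single pass.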
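Difference between two pass vs one pass macro processor
Two Pass Macro Processor | One Pass Macro Processor |
(i) Pass 1: Recognize macro definitions. (ii) Pass 2: Recognize macro calls. | Recognizes macro definitions and macro calls in a single pass, alternating between definition and expansion; a macro must be defined before it is called. |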
A minimally useful compiler is a large project and a fully optimising compiler doing everything imaginable is impossibly large (a trip to the library or CiteSeer makes it even bigger).
There are several different compiler designs that could make sense for an object oriented dynamically typed language. The possible compiler designs run from compiling as quickly as an interpreter interprets to very slow compilers that produce very high quality code.
Most JIT (Just-in-Time) compilers compile fast, though not as fast as interpreters interpret. They stick to algorithms that are linear-time in the worst case, because compilation happens at run time, where noticeable pauses are not acceptable. Most mature batch compilers tend to be very slow and aim at fast execution times (at least at higher optimization levels), though many are kept simpler so that they remain feasible to build.
JITs are a great strategy for very fast compilers because the compilation cost can be recouped on the second execution. They are much less appealing when execution time is important, because the compile-time pauses get longer (potentially a lot longer, as compilers often use O(N^2) or worse algorithms).
Compilers offer some strong benefits for up front design because of the rich body of literature, but they also have some strong disadvantages. It's likely that there are no successful projects working in the same design space: language implementation; optimization style (fast compilation or fast execution); and basic framework (SSA, dataflow, simple tree walkers, one pass). It is also very likely that no-one on the team has done anything similar.
Judging design literature without having personally implemented similar projects is hard, if not impossible, but many key decisions have to be made early. These are really project-level design, not technical, but they'll influence the choice of algorithm and intermediate forms.
Key takeaway
There are several different compiler designs that could make sense for an object oriented dynamically typed language.
JITs are a great strategy for very fast compiling compilers because they can recoup on the second execution.
Compiler | Interpreter |
Scan the entire program first and then translate it into machine code. | Translate the program line-by-line. |
Execution time is less. | Execution time is more. |
Source code is not required for execution, so it cannot be tampered with. | Source code can be easily modified, so programs have no protection. |
Convert the entire program to machine code; when all the syntax errors have been removed execution takes place. | Each time the program is executed, every line is checked for syntax error and then converted to equivalent machine code. |
Slow for debugging. | Fast for debugging. |
Machine code can be saved and used; source code and compiler are no longer needed. | Machine code cannot be saved; an interpreter is always required for translation. |
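The row about syntax errors can be made concrete with a small Python experiment. It is only an analogy under stated assumptions: Python's built-in compile and exec stand in for a real compiler and interpreter, and the helper names run_compiled and run_line_by_line are invented for the example.

```python
import contextlib
import io

def run_compiled(source):
    """Translate the whole program first; execute only if translation succeeds."""
    out = io.StringIO()
    try:
        code = compile(source, "<program>", "exec")    # whole-program translation
    except SyntaxError:
        return out.getvalue(), "rejected before execution"
    with contextlib.redirect_stdout(out):
        exec(code, {})
    return out.getvalue(), "ok"

def run_line_by_line(source):
    """Translate and execute one line at a time, as an interpreter would."""
    out = io.StringIO()
    env = {}
    for number, line in enumerate(source.splitlines(), start=1):
        try:
            with contextlib.redirect_stdout(out):
                exec(compile(line, "<line %d>" % number, "exec"), env)
        except SyntaxError:
            return out.getvalue(), "stopped at line %d" % number
    return out.getvalue(), "ok"

# The second line contains a deliberate syntax error.
PROGRAM = "print('step 1')\ntotal = = 2\n"

print(run_compiled(PROGRAM))       # prints ('', 'rejected before execution')
print(run_line_by_line(PROGRAM))   # prints ('step 1\n', 'stopped at line 2')
```

The whole-program translation rejects the input before anything runs, while the line-by-line version has already executed the first statement by the time it meets the error.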
m4 is a macro processor in the sense that it copies its input to its output, expanding macros as it goes. Macros are either built-in or user-defined and can take any number of arguments. Besides macro expansion, m4 has built-in functions for including named files, running shell commands, doing integer arithmetic, manipulating text in various ways, performing recursion, and so on. m4 can be used either as a front end to a compiler or as a macro processor in its own right.
The m4 macro processor is widely available on all UNIXes and has been standardized by POSIX. Usually only a small percentage of users are aware of its existence, but those who find it often become committed users. The popularity of GNU Autoconf, which requires GNU m4 to generate configure scripts, is an incentive for many people to install it, even if they never program in m4 themselves. GNU m4 is mostly compatible with the System V, Release 4 version, except for some minor differences.
M4 can be quite addictive for certain folks. They begin by using m4 to solve easy problems, then progress to more difficult tasks, learning how to construct complicated sets of m4 macros along the way. Once addicted, users would write intricate m4 programs to tackle even simple problems, dedicating more time to debugging their m4 scripts than to accomplishing actual work. Compulsive programmers should be aware that m4 may be harmful to their health.
Problems and Bugs
If you're having trouble with GNU M4 or believe you've discovered a bug, please let us know. Make sure you've spotted a genuine problem before reporting it. Reread the documentation to make sure it states you can do what you're attempting to do. If it's unclear whether you should be able to perform something or not, please report that as well; it's a documentation flaw!
Before reporting a bug or trying to fix it yourself, try to isolate it to the smallest possible input file that reproduces the problem. Then send us the input file and the exact results m4 gave you. Also say what you expected to occur; this will help us decide whether the problem was really in the documentation.
Once you have a precise problem, send an email to bug-m4@gnu.org. Please include the version number of m4 you are using, which you can obtain with the command m4 --version. Also provide details about the platform you are running on.
Suggestions for things that aren't bugs are always welcome. Please also submit any questions you have regarding things that are unclear in the manual or are merely obscure features.
Invoking M4
The m4 command has the following format:
m4 [option…] [file…]
All options begin with '-', or with '--' if the long option name is used. It is not necessary to write the entire long option name; any unambiguous prefix is sufficient. POSIX requires m4 to recognize arguments intermixed with files, even when POSIXLY_CORRECT is set in the environment. Most options take effect at startup regardless of their position on the command line, but some are documented below as taking effect after any files that occurred earlier on the command line. The argument -- marks the end of the options.
With short options, options that do not take arguments may be combined into a single command line argument with subsequent options; options with mandatory arguments may be provided either as a single command line argument or as two arguments; and options with optional arguments must be provided as a single argument. In other words, m4 -QPDfoo -d a -d+f is equivalent to m4 -Q -P -D foo -d -d+f -- ./a, although the latter form is considered canonical.
With long options, options with mandatory arguments may be provided with an equal sign ('=') in a single argument or as two arguments, and options with optional arguments must be provided as a single argument. In other words, m4 --def foo --debug a is equivalent to m4 --define=foo --debug= -- ./a, although the latter form is considered canonical (not to mention more robust, in case a future version of m4 introduces an option named --default).
m4 understands the following options, grouped by functionality:
● Operation modes - Command line options for the overall mode of operation.
● Preprocessor features - Command line options for preprocessor features.
● Limits control - Command line options for limits control.
● Frozen state - Command line options for frozen state.
● Debugging options - Command line options for debugging.
● Command line files - Specifying input files on the command line.
Optional operation modes through the command line
m4's overall operation is controlled by a number of options:
--help
On standard output, print a help summary, then exit m4 without reading any input files or performing any further activities.
--version
On standard output, print the program's version number, then quit m4 without reading any input files or performing any other operations.
-E
--fatal-warnings
Controls the effect of warnings. If the option is not specified, execution continues and the exit status is unaffected when a warning is printed. If it is specified exactly once, warnings become fatal: when one is issued, execution continues, but the exit status will be non-zero. If it is specified multiple times, execution halts with a non-zero status the first time a warning is issued. The introduction of behavior levels is new to M4 1.4.9; for behavior consistent with earlier versions, specify -E twice.
-i
--interactive
-e
Makes this invocation of m4 interactive. All output will be unbuffered, and interrupts will be ignored. The -e spelling exists for compatibility with other m4 implementations and issues a warning, because it may be removed in a future version of GNU M4.
Using the command line to specify input files
The remaining command-line arguments are assumed to be input file names. Standard input is read if no names are present. A file with the name - is assumed to be standard input. It is common for input files to end with ‘.m4', but it is not needed.
The input files are read in the order that is specified. Because standard input can be read several times, the file name - may appear on the command line numerous times; this makes a difference when using a terminal or other special file type. If an input file terminates in the middle of an argument collection, a comment, or a quoted text, it's an error.
The options --define (-D), --undefine (-U), --synclines (-s), and --trace (-t) take effect only after input from any file names that appear earlier on the command line has been processed. For example, assume the file foo contains:
$ cat foo
bar
The text 'bar' can then be redefined over multiple uses of foo:
$ m4 -Dbar=hello foo -Dbar=world foo
⇒hello
⇒world
If none of the input files invoked m4exit, the exit status of m4 is 0 for success, 1 for general failure (such as a problem reading an input file), and 63 for a version mismatch when loading a frozen file.
If you need to read a file whose name starts with a -, you can specify it as './-file', or use -- to mark the end of the options.
Input comments in m4 format
Comments in m4 are normally delimited by the characters '#' and newline. All characters between the comment delimiters are ignored, but the entire comment (including the delimiters) is passed through to the output; m4 does not discard comments.
Comments cannot be nested, so the first newline after a '#' ends the comment. The commenting effect of the begin-comment string can be inhibited by quoting it.
$ m4
`quoted text' # `commented text'
⇒quoted text # `commented text'
`quoting inhibits' `#' `comments'
⇒quoting inhibits # comments
The built-in macro changecom can be used to change the comment delimiters to any string at any time.