Unit 2 | unit 2 macro processor and compilers

SP&OS

Unit - 2

Macro Processor and Compilers

2.1 Introduction

A macro is a single-line abbreviation for a group of instructions. Once the macro is defined, that group of instructions can be used anywhere in the program simply by writing the macro name.

An assembly language programmer often needs to repeat the same block of code several times in a program. Instead of writing the whole block again and again, the programmer defines a single macro instruction that stands for the block and then writes that one instruction wherever the block is needed. A macro instruction therefore represents several assembly language instructions at once.

The macro facility lets you attach a name to a sequence that appears several times in a program and then use that name in place of the sequence. The name is attached by means of a macro instruction definition.

A macro definition is laid out as follows. The MACRO pseudo-op is the first line of the definition. The next line is the macro name line, which identifies the macro instruction name. After the name line comes the body, that is, the sequence of instructions that the macro abbreviates. The MEND pseudo-op is the last statement and marks the end of the macro definition.

Key takeaway

A macro is a single-line abbreviation for a group of instructions. Once the macro is defined, that group of instructions can be used anywhere in a program.

2.2 Features of a Macro facility: Macro instruction arguments

Macro instruction arguments

In example 1 (section 2.5) a single operand, DATA, was used. To perform the same operation on different operands, the macro facility allows arguments (parameters) in macro calls. The corresponding dummy arguments appear in the macro definition.

Consider the following example 2:

        A     1,DATA1
        A     2,DATA1
        A     3,DATA1

        A     1,DATA2
        A     2,DATA2
        A     3,DATA2

DATA1   DC    F'5'
DATA2   DC    F'4'

In the above example, DATA1 and DATA2 are the operands of the add operations on registers 1 to 3. The same sequence of add instructions appears twice with different operands. For this case we can define a macro in the following manner:

SOURCE

        MACRO
        INCR  &arg
        A     1,&arg
        A     2,&arg
        A     3,&arg
        MEND
        ...
        INCR  DATA1        (use DATA1 as operand)
        ...
        INCR  DATA2        (use DATA2 as operand)
        ...
DATA1   DC    F'5'
DATA2   DC    F'4'

EXPANDED SOURCE

        ...
        A     1,DATA1
        A     2,DATA1
        A     3,DATA1
        ...
        A     1,DATA2
        A     2,DATA2
        A     3,DATA2
        ...
DATA1   DC    F'5'
DATA2   DC    F'4'

A variable operand of this kind is called a macro instruction argument, or dummy argument, and is marked by the & symbol. More than one argument may be supplied in a macro call. Each argument must correspond to a dummy argument on the macro name line of the macro definition. When a macro call is processed, the arguments supplied are substituted for the respective dummy arguments in the macro definition.
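The substitution step described above can be sketched in a few lines of Python. This is only an illustration: the function name expand_call and the list-of-strings representation of the macro body are assumptions made for the sketch, not part of any real assembler.

```python
def expand_call(params, body, args):
    """Replace each dummy argument (for example '&arg') in the body with the actual argument."""
    if len(params) != len(args):
        raise ValueError("argument count does not match the macro name line")
    # Substitute longer names first so that a short name never clobbers a longer one.
    pairs = sorted(zip(params, args), key=lambda pair: len(pair[0]), reverse=True)
    expanded = []
    for line in body:
        for dummy, actual in pairs:
            line = line.replace(dummy, actual)
        expanded.append(line)
    return expanded


# The INCR macro of example 2: one dummy argument, three add instructions.
incr_params = ["&arg"]
incr_body = ["A 1,&arg", "A 2,&arg", "A 3,&arg"]

for operand in ("DATA1", "DATA2"):
    print(expand_call(incr_params, incr_body, [operand]))
```

Each call produces the three add instructions with the dummy argument replaced by the operand given in the call, which is what the EXPANDED SOURCE column of example 2 shows.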

Consider the following example 3:

Loop1   A     1,DATA1
        A     2,DATA2
        A     3,DATA3
        ...
Loop2   A     1,DATA3
        A     2,DATA2
        A     3,DATA1
        ...
DATA1   DC    F'5'
DATA2   DC    F'4'
DATA3   DC    F'2'

This program could be written as

SOURCE

        MACRO
&lab    INCR  &arg1,&arg2,&arg3
&lab    A     1,&arg1
        A     2,&arg2
        A     3,&arg3
        MEND
        ...
Loop1   INCR  DATA1,DATA2,DATA3
        ...
Loop2   INCR  DATA3,DATA2,DATA1
        ...
DATA1   DC    F'5'
DATA2   DC    F'4'
DATA3   DC    F'2'

EXPANDED SOURCE

        ...
Loop1   A     1,DATA1
        A     2,DATA2
        A     3,DATA3
        ...
Loop2   A     1,DATA3
        A     2,DATA2
        A     3,DATA1
        ...
DATA1   DC    F'5'
DATA2   DC    F'4'
DATA3   DC    F'2'

Here four arguments are specified, including the label argument. A label argument is treated like any other operand of the macro instruction.

2.3 Conditional Macro expansion

Two pseudo-operations are used for writing conditional macros:

AIF

AGO

Consider the following example 4:

Loop1   A     1,DATA1
        A     2,DATA2
        A     3,DATA3
        ...
Loop2   A     1,DATA3
        A     2,DATA2
        ...
Loop3   A     1,DATA1
        ...
DATA1   DC    F'5'
DATA2   DC    F'4'
DATA3   DC    F'2'

In the above example a different number of add instructions, and hence a different number of operands, is used at each label. The program can be written with a single conditional macro:

SOURCE

        MACRO
&arg0   vary  &count,&arg1,&arg2,&arg3
&arg0   A     1,&arg1
        AIF   (&count EQ 1).FINI
        A     2,&arg2
        AIF   (&count EQ 2).FINI
        A     3,&arg3
.FINI   MEND
        ...
Loop1   vary  3,DATA1,DATA2,DATA3
        ...
Loop2   vary  2,DATA3,DATA2
        ...
Loop3   vary  1,DATA1
        ...
DATA1   DC    F'5'
DATA2   DC    F'4'
DATA3   DC    F'2'

EXPANDED SOURCE

        ...
Loop1   A     1,DATA1
        A     2,DATA2
        A     3,DATA3
        ...
Loop2   A     1,DATA3
        A     2,DATA2
        ...
Loop3   A     1,DATA1
        ...
DATA1   DC    F'5'
DATA2   DC    F'4'
DATA3   DC    F'2'

Labels starting with a period (.), such as .FINI, are macro labels and do not appear in the output of the macro processor. The statement AIF (&count EQ 1).FINI directs the macro processor to skip to the statement labelled .FINI if the parameter corresponding to &count is 1; otherwise the macro processor continues with the statement following the AIF pseudo-operation. AIF is thus a conditional branch pseudo-operation. AGO is an unconditional branch, or go-to, pseudo-operation: it specifies a label appearing on some other statement in the macro instruction definition, and the macro processor continues with that statement.
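The way AIF and AGO steer expansion can be sketched as a small interpreter over the macro body. This is a minimal sketch under stated assumptions: the body is held as (label, opcode, operand) tuples, only EQ and NE comparisons are recognized, omitted arguments are passed as empty strings, and the names VARY_BODY, substitute and expand_conditional are hypothetical.

```python
import re

# Body of the 'vary' macro from example 4 as (label, opcode, operand) tuples.
VARY_BODY = [
    ("&arg0", "A", "1,&arg1"),
    ("", "AIF", "(&count EQ 1).FINI"),
    ("", "A", "2,&arg2"),
    ("", "AIF", "(&count EQ 2).FINI"),
    ("", "A", "3,&arg3"),
    (".FINI", "MEND", ""),
]


def substitute(text, env):
    """Replace dummy arguments by their values, longest names first."""
    for name in sorted(env, key=len, reverse=True):
        text = text.replace(name, env[name])
    return text


def expand_conditional(body, env):
    # Macro labels (starting with '.') are branch targets only; they are never output.
    labels = {lab: i for i, (lab, _, _) in enumerate(body) if lab.startswith(".")}
    out, i = [], 0
    while i < len(body):
        label, opcode, operand = body[i]
        if opcode == "MEND":
            break
        if opcode == "AGO":                      # unconditional branch
            i = labels[operand.strip()]
            continue
        if opcode == "AIF":                      # conditional branch
            m = re.fullmatch(r"\((\S+) (EQ|NE) (\S+)\)(\.\w+)", operand.strip())
            if m is None:
                raise ValueError(f"unsupported AIF operand: {operand}")
            left, relation, right, target = m.groups()
            equal = substitute(left, env) == substitute(right, env)
            if equal == (relation == "EQ"):
                i = labels[target]
                continue
            i += 1
            continue
        shown_label = "" if label.startswith(".") else substitute(label, env)
        out.append(f"{shown_label} {opcode} {substitute(operand, env)}".strip())
        i += 1
    return out


call = {"&arg0": "Loop2", "&count": "2", "&arg1": "DATA3", "&arg2": "DATA2", "&arg3": ""}
print(expand_conditional(VARY_BODY, call))
```

With &count equal to 2 the second AIF branches to .FINI, so only the first two add instructions are generated and the .FINI label itself never reaches the output.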

2.4 Macro calls within Macros

One macro within another macro is called a nested macro. Consider the following example 5:

        MACRO
        ADD1  &arg
        L     1,&arg
        A     1,=F'1'
        ST    1,&arg
        MEND
        MACRO
        ADDS  &arg1,&arg2,&arg3
        ADD1  &arg1
        ADD1  &arg2
        ADD1  &arg3
        MEND

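Macro calls within macros can involve several levels of expansion. Here the macro ADDS invokes the macro ADD1 three times. A call to ADDS is first expanded into three ADD1 calls (level 1), and each of those calls is then expanded into the L, A, ST sequence of ADD1 (level 2).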

[Table: SOURCE, EXPANDED SOURCE (Level 1), EXPANDED SOURCE (Level 2). The SOURCE column holds the ADD1 and ADDS definitions of example 5.]
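The level-by-level expansion of nested macro calls can be sketched as repeated single-level expansion until no macro call remains. The table MACROS, the helper names, and the call ADDS DATA1,DATA2,DATA3 are assumptions chosen for illustration; a macro that calls itself would make this simple loop run forever.

```python
# Macro table for example 5: name -> (dummy arguments, body lines).
MACROS = {
    "ADD1": (["&arg"], ["L 1,&arg", "A 1,=F'1'", "ST 1,&arg"]),
    "ADDS": (["&arg1", "&arg2", "&arg3"], ["ADD1 &arg1", "ADD1 &arg2", "ADD1 &arg3"]),
}


def is_call(line):
    fields = line.split(None, 1)
    return bool(fields) and fields[0] in MACROS


def expand_once(line):
    """Expand a macro call by one level; any other line is returned unchanged."""
    if not is_call(line):
        return [line]
    fields = line.split(None, 1)
    params, body = MACROS[fields[0]]
    args = fields[1].split(",") if len(fields) > 1 else []
    expanded = []
    for text in body:
        # Longest dummy names first, so no name is replaced as part of a longer one.
        for dummy, actual in sorted(zip(params, args), key=lambda p: -len(p[0])):
            text = text.replace(dummy, actual)
        expanded.append(text)
    return expanded


def expand_levels(source):
    """Return the source followed by each successive level of expansion."""
    levels = [list(source)]
    while any(is_call(line) for line in levels[-1]):
        levels.append([out for line in levels[-1] for out in expand_once(line)])
    return levels


for depth, lines in enumerate(expand_levels(["ADDS DATA1,DATA2,DATA3"])):
    print(depth, lines)
```

Level 1 contains the three ADD1 calls and level 2 contains nine machine instructions, three for each operand.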

Example of nested macro

TEST     START 2000h
MACROS   MACRO
CELTOFER MACRO &CEL,&FER
         LDA   &CEL
         MULT  NINE
         ADD   THIRTYTWO
         STA   &FER
         MEND
         MEND
MACROF   MACRO
CELTOFER MACRO &CEL,&FER
         LDAF  &CEL
         MULTF NINE
         DIVF  FIVE
         ADD   THIRTYTWO
         STA   &FER
         MEND
         MEND

Key takeaway

Nested macro calls are macro calls that appear inside a macro; a macro may also be defined inside another macro definition. When a macro call expands into text that contains another macro call, the macro processor generates the expansion as text and places it on the input stack, so that the inner call is processed next.

2.5 Macro instructions

Consider the following example 1:

        A     1,DATA      //add contents of DATA to register 1
        A     2,DATA      //add contents of DATA to register 2
        A     3,DATA      //add contents of DATA to register 3

        A     1,DATA      //add contents of DATA to register 1
        A     2,DATA      //add contents of DATA to register 2
        A     3,DATA      //add contents of DATA to register 3

In the above example the following sequence is repeated two times.

        A     1,DATA
        A     2,DATA
        A     3,DATA

Rather than typing these sequences more than one time, we can simply define a macro.

2.6 Defining Macro, Design of two pass Macro processor

A macro definition starts with the keyword MACRO and ends with the keyword MEND, i.e. it is defined in the following manner:

        MACRO
        //macro name
        //body of macro
        MEND

The above example can be written using a macro as follows:

SOURCE

        MACRO
        INCR
        A     1,DATA
        A     2,DATA
        A     3,DATA
        MEND
        ...
        INCR
        ...
        INCR
        ...
DATA    DC    F'5'

EXPANDED SOURCE

        ...
        A     1,DATA
        A     2,DATA
        A     3,DATA
        ...
        A     1,DATA
        A     2,DATA
        A     3,DATA
        ...
DATA    DC    F'5'

In the above case the macro processor replaces each macro call with the lines

        A     1,DATA
        A     2,DATA
        A     3,DATA

A macro is invoked by writing its name, in this case INCR.

The process of replacement is called expanding the macro. Note that the macro definition itself does not appear in the expanded source code.
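The replacement process can be sketched as two passes over the source: the first pass recognizes and stores macro definitions, and the second pass replaces each call by the stored body. This is a simplified sketch that assumes macros without arguments and without nesting; the function names pass_one and pass_two are hypothetical.

```python
def pass_one(lines):
    """Pass 1: remove macro definitions from the source and store them by name."""
    table, remaining = {}, []
    source = iter(lines)
    for line in source:
        if line.strip() == "MACRO":
            name = next(source).strip()          # macro name line
            body = []
            for body_line in source:             # collect the body up to MEND
                if body_line.strip() == "MEND":
                    break
                body.append(body_line.strip())
            table[name] = body
        else:
            remaining.append(line)
    return table, remaining


def pass_two(table, remaining):
    """Pass 2: replace every macro call by the stored body."""
    expanded = []
    for line in remaining:
        expanded.extend(table.get(line.strip(), [line]))
    return expanded


SOURCE = [
    "MACRO", "INCR", "A 1,DATA", "A 2,DATA", "A 3,DATA", "MEND",
    "INCR",
    "INCR",
    "DATA DC F'5'",
]

macro_table, program = pass_one(SOURCE)
print(pass_two(macro_table, program))
```

The output contains the three add instructions twice followed by the DC statement; the MACRO ... MEND definition itself does not appear in it.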

2.7 Concept of single pass Macro processor

A one-pass macro processor that alternates between macro definition and macro expansion can handle a macro definition inside another macro ("macro in macro").

Because of the one-pass structure, the definition of a macro must appear in the source program before any statement that invokes that macro.

This restriction is reasonable and causes no real inconvenience.

A one-pass macro processor uses three main data structures:

1. DEFTAB (definition table)

● Stores the macro definition: the macro prototype and the statements of the macro body.

● Comment lines are not entered.

● References to macro instruction parameters are converted to positional notation so that arguments can be substituted efficiently.

2. NAMTAB (name table)

● Stores the macro names and serves as an index to DEFTAB: for each name it holds pointers to the beginning and end of the definition in DEFTAB.

3. ARGTAB (argument table)

● Used during the expansion of macro invocations.

● When a macro invocation statement is recognized, the arguments are stored in ARGTAB according to their position in the argument list.

Algorithm

Procedure DEFINE

● Called when the beginning of a macro definition is recognized.

● Makes the appropriate entries in DEFTAB and NAMTAB.

Procedure EXPAND

● Called to set up the argument values in ARGTAB and expand a macro invocation statement.

Procedure GETLINE

● Called to get the next line to be processed, either from DEFTAB (during expansion) or from the input file.

Algorithm for one pass macro processor

Begin {macro processor}
    EXPANDING := FALSE
    While OPCODE ≠ 'END' do
        Begin
            GETLINE
            PROCESSLINE
        End {while}
End {macro processor}

Procedure PROCESSLINE
Begin
    Search NAMTAB for OPCODE
    If found then
        EXPAND
    Else if OPCODE = 'MACRO' then
        DEFINE
    Else write source line to expanded file
End {PROCESSLINE}

Procedure EXPAND
Begin
    EXPANDING := TRUE
    Get first line of macro definition (prototype) from DEFTAB
    Set up arguments from macro invocation in ARGTAB
    Write macro invocation to expanded file as a comment
    While not end of macro definition do
        Begin
            GETLINE
            PROCESSLINE
        End {while}
    EXPANDING := FALSE
End {EXPAND}

Procedure GETLINE
Begin
    If EXPANDING then
        Begin
            Get next line of macro definition from DEFTAB
            Substitute arguments from ARGTAB for positional notation
        End {if}
    Else
        Read next line from input file
End {GETLINE}

Procedure DEFINE
Begin
    Enter macro name into NAMTAB
    Enter macro prototype into DEFTAB
    LEVEL := 1
    While LEVEL > 0 do
        Begin
            GETLINE
            If this is not a comment line then
                Begin
                    Substitute positional notation for parameters
                    Enter line into DEFTAB
                    If OPCODE = 'MACRO' then
                        LEVEL := LEVEL + 1
                    Else if OPCODE = 'MEND' then
                        LEVEL := LEVEL - 1
                End {if not comment}
        End {while}
    Store in NAMTAB pointers to beginning and end of definition
End {DEFINE}
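The procedures above can be turned into a short runnable sketch. Several choices here are assumptions rather than part of the algorithm: the class name, a statement layout of label, opcode and operands separated by blanks (a line starting with a blank has no label), operands without embedded blanks, comment lines starting with a period, and ?1, ?2, ... as the positional notation. As in the algorithm, a macro call inside an expansion is not supported, and a label on an invocation is kept only in the comment line.

```python
class OnePassMacroProcessor:
    def __init__(self, source):
        self.source = iter(source)
        self.deftab = []        # DEFTAB: prototype and body lines, parameters stored as ?1, ?2, ...
        self.namtab = {}        # NAMTAB: macro name -> (start, end) indices into DEFTAB
        self.argtab = []        # ARGTAB: arguments of the invocation being expanded
        self.expanding = False
        self.pos = 0            # next DEFTAB line to read while expanding
        self.output = []

    @staticmethod
    def parse(line):
        """Return (label, opcode, operands); a line starting with a blank has no label."""
        fields = line.split()
        if line[:1].isspace():
            fields.insert(0, "")
        fields += [""] * (3 - len(fields))
        return fields[0], fields[1], fields[2]

    def getline(self):
        if not self.expanding:
            return next(self.source)
        line = self.deftab[self.pos]
        self.pos += 1
        for i in range(len(self.argtab), 0, -1):      # highest index first: ?10 before ?1
            line = line.replace(f"?{i}", self.argtab[i - 1])
        return line

    def define(self, name, operands):
        params = operands.split(",") if operands else []
        start = len(self.deftab)
        self.deftab.append(f"{name} MACRO {operands}".rstrip())
        level = 1
        while level > 0:
            line = self.getline()
            if line.lstrip().startswith("."):          # comment lines are not stored
                continue
            for i, param in sorted(enumerate(params, 1), key=lambda t: -len(t[1])):
                line = line.replace(param, f"?{i}")
            self.deftab.append(line)
            opcode = self.parse(line)[1]
            if opcode == "MACRO":
                level += 1
            elif opcode == "MEND":
                level -= 1
        self.namtab[name] = (start, len(self.deftab) - 1)

    def expand(self, label, name, operands):
        start, end = self.namtab[name]
        self.argtab = operands.split(",") if operands else []
        # The invocation itself is written to the output as a comment line.
        self.output.append(" ".join(f for f in (".", label, name, operands) if f))
        self.expanding, self.pos = True, start + 1     # skip the prototype
        while self.pos < end:                          # stop before the closing MEND
            self.process_line(self.getline())
        self.expanding = False

    def process_line(self, line):
        label, opcode, operands = self.parse(line)
        if opcode in self.namtab:
            self.expand(label, opcode, operands)
        elif opcode == "MACRO":
            self.define(label, operands)
        else:
            self.output.append(line)

    def run(self):
        for line in self.source:                       # main loop: read until END
            self.process_line(line)
            if self.parse(line)[1] == "END":
                break
        return self.output


SOURCE = [
    "INCR   MACRO &arg",
    ". this comment line is not entered into DEFTAB",
    "       A     1,&arg",
    "       A     2,&arg",
    "       MEND",
    "       INCR  DATA1",
    "DATA1  DC    F'5'",
    "       END",
]
for line in OnePassMacroProcessor(SOURCE).run():
    print(line)
```

Because DEFINE reads its lines through GETLINE, a definition nested inside another macro is entered into DEFTAB and NAMTAB at the moment the outer macro is expanded, which is how the single pass copes with "macro in macro".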

Difference between two pass and one pass macro processor

Two Pass Macro Processor	One Pass Macro Processor
Pass 1 recognizes macro definitions; Pass 2 recognizes macro calls.	Alternates between macro definition and macro expansion in a single pass; every macro must be defined before it is called.
Nested macro definitions are not allowed.	Nested macro definitions are allowed but nested calls are not.

2.8 Introduction to Compilers: Phases of Compiler with one example

A minimally useful compiler is a large project and a fully optimising compiler doing everything imaginable is impossibly large (a trip to the library or CiteSeer makes it even bigger).

There are several different compiler designs that could make sense for an object oriented dynamically typed language. The possible compiler designs run from compiling as quickly as an interpreter interprets to very slow compilers that produce very high quality code.

Most JIT (Just-in-Time) compilers compile fast, though not as fast as an interpreter interprets. They stick to algorithms that are linear-time in the worst case, because compilation happens at run time, where noticeable pauses are not acceptable. Most mature batch compilers tend to be very slow because they aim at fast execution times (at least at higher optimization levels), though many are kept simpler so that they remain buildable.

JITs are a great strategy for very fast compiling compilers because they can recoup the compilation cost on the second execution. They are much less appealing when execution time is important, because the compile-time pauses get longer (potentially a lot longer, as compilers often use O(N^2) or worse algorithms).

Compilers offer some strong benefits for up front design because of the rich body of literature, but they also have some strong disadvantages. It's likely that there are no successful projects working in the same design space: language implementation; optimization style (fast compilation or fast execution); and basic framework (SSA, dataflow, simple tree walkers, one pass). It is also very likely that no-one on the team has done anything similar.

Judging design literature without having personally implemented similar projects is hard, if not impossible, but many key decisions have to be made early. These are really project-level design, not technical, but they'll influence the choice of algorithm and intermediate forms.

Key takeaway

There are several different compiler designs that could make sense for an object oriented dynamically typed language.

JITs are a great strategy for very fast compiling compilers because they can recoup the compilation cost on the second execution.

2.9 Comparison of Compiler and Interpreter

Compiler	Interpreter
Scans the entire program first and then translates it into machine code.	Translates the program line by line.
Execution time is less.	Execution time is more.
The source code is not needed to run the program, so users of the program cannot tamper with it.	The source code is always available and can easily be modified, so programs are less secure.
Converts the entire program to machine code; execution takes place only after all syntax errors have been removed.	Each time the program is executed, every line is checked for syntax errors and then converted to equivalent machine code.
Slow for debugging.	Fast for debugging.
Machine code can be saved and reused; the source code and compiler are no longer needed.	Machine code is not saved; the interpreter is always required for translation.
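The difference in when syntax errors surface can be imitated in Python. This is only an analogy under stated assumptions: Python itself compiles source to bytecode, so the two helper functions below (hypothetical names) merely mimic "translate everything first" versus "translate one line at a time" using the built-in compile and exec.

```python
def run_whole_program(lines, env):
    """Compiler style: translate the entire program first, then execute it."""
    code = compile("\n".join(lines), "<program>", "exec")   # every syntax error surfaces here
    exec(code, env)


def run_line_by_line(lines, env):
    """Interpreter style: translate and execute one line at a time."""
    for number, line in enumerate(lines, start=1):
        exec(compile(line, f"<line {number}>", "exec"), env)


PROGRAM = [
    "log.append('step 1')",
    "log.append('step 2')",
    "log.append('step 3'",        # syntax error: missing closing parenthesis
]


def try_run(runner):
    env = {"log": []}
    try:
        runner(PROGRAM, env)
    except SyntaxError:
        pass
    return env["log"]


print("whole program:", try_run(run_whole_program))
print("line by line: ", try_run(run_line_by_line))
```

The whole-program run executes nothing because the error is found during translation, while the line-by-line run executes the first two statements before it reaches the faulty line.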

2.10 Case study on GNU M4 Macro Processor

m4 is a macro processor in the sense that it copies its input to the output, expanding macros as it goes. Macros are either built-in or user-defined and can take any number of arguments. Besides macro expansion, m4 has built-in functions for including named files, running shell commands, doing integer arithmetic, manipulating text in various ways, performing recursion, and so on. m4 can be used either as a front end to a compiler or as a macro processor in its own right.

The m4 macro processor has been standardized by POSIX and is widely available on UNIX systems. Usually only a small percentage of users are aware of its existence. Those who do find it, however, often become committed users. The popularity of GNU Autoconf, which requires GNU m4 to generate configure scripts, is an incentive for many to install it, even though they will never program in m4 themselves. GNU m4 is mostly compatible with the System V, Release 4 version, except for some minor differences.

m4 can be quite addictive for some people. They begin by using m4 to solve easy problems, then progress to more difficult tasks, learning how to write complex sets of m4 macros along the way. Once really addicted, users write sophisticated m4 applications even to solve simple problems, devoting more time to debugging their m4 scripts than to doing real work. Compulsive programmers should be aware that m4 may be dangerous to their health.

Problems and Bugs

If you're having trouble with GNU M4 or believe you've discovered a bug, please let us know. Make sure you've spotted a genuine problem before reporting it. Reread the documentation to make sure it states you can do what you're attempting to do. If it's unclear whether you should be able to perform something or not, please report that as well; it's a documentation flaw!

Before reporting a bug or trying to fix it yourself, try to isolate it to the smallest possible input file that reproduces the problem. Then send us the input file and the exact results m4 gave you. Also say what you expected to occur; this will help us decide whether the problem was really in the documentation.

Once you have a precise problem, send an email to bug-m4@gnu.org. Please include the version number of m4 you are using, which you can get with the command m4 --version. Also provide details about the platform you are running on.

Suggestions for things that aren't bugs are always welcome. Please also submit any questions you have regarding things that are unclear in the manual or are merely obscure features.

Invoking M4

The m4 command has the following format:

m4 [option…] [file…]

All options begin with '-', or, if the long option name is used, with '--'. A long option name need not be written in full; any unambiguous prefix is sufficient. POSIX requires m4 to recognize arguments intermixed with files, even when POSIXLY_CORRECT is set in the environment. Most options take effect at startup regardless of their position on the command line, but some are documented below as taking effect after any files that occurred earlier on the command line. The argument -- marks the end of options.

With short options, options that take no argument may be combined into a single command line argument with subsequent options; options with mandatory arguments may be provided either as a single command line argument or as two arguments; and options with optional arguments must be provided as a single argument. In other words, m4 -QPDfoo -d a -df is equivalent to m4 -Q -P -D foo -d -df -- ./a, although the latter form is considered canonical.

With long options, options with mandatory arguments may be provided with an equal sign ('=') in a single argument or as two arguments, and options with optional arguments must be provided as a single argument. In other words, m4 --def foo --debug a is equivalent to m4 --define=foo --debug= -- ./a, although the latter form is considered canonical (not to mention more robust, in case a future version of m4 introduces an option named --default).

m4 understands the following options, grouped by functionality.

● Operation modes - command line options for operation modes.

● Preprocessor features - command line options for preprocessor features.

● Limits control - command line options for limits control.

● Frozen state - command line options for frozen state.

● Debugging options - command line options for debugging.

● Command line files - specifying input files on the command line.

Optional operation modes through the command line

m4's overall operation is controlled by a number of options:

--help

On standard output, print a help summary, then exit m4 without reading any input files or performing any further activities.

--version

On standard output, print the program's version number, then quit m4 without reading any input files or performing any other operations.

-E

--fatal-warnings

Controls the effect of warnings. If the option is not given, execution continues and the exit status is unaffected when a warning is printed. If it is given exactly once, warnings become fatal: when one is issued, execution continues, but the exit status will be non-zero. If it is given more than once, execution halts with a non-zero status the first time a warning is issued. The introduction of behavior levels is new to M4 1.4.9; for behavior consistent with earlier versions, specify -E twice.

-i

--interactive

-e

Makes this invocation of m4 interactive. This means that all output will be unbuffered and interrupts will be ignored. The spelling -e exists for compatibility with other m4 implementations, and it issues a warning because it may be withdrawn in a future version of GNU M4.

Using the command line to specify input files

The remaining arguments on the command line are taken to be input file names. If no names are present, standard input is read. A file name of - is taken to mean standard input. It is conventional, but not required, for input files to end in '.m4'.

The input files are read in the sequence given. Standard input can be read more than once, so the file name - may appear multiple times on the command line; this makes a difference when input is from a terminal or other special file type. It is an error if an input file ends in the middle of argument collection, a comment, or a quoted string.

The options --define (-D), --undefine (-U), --synclines (-s), and --trace (-t) only take effect after processing input from any file names that appear earlier on the command line. For example, assume the file foo contains:

$ cat foo
bar

The text 'bar' can then be redefined over multiple uses of foo:

$ m4 -Dbar=hello foo -Dbar=world foo
⇒hello
⇒world

Unless one of the input files invoked m4exit (see M4exit), the exit status of m4 will be 0 for success, 1 for general failure (such as problems with reading an input file), and 63 for version mismatch (see Using frozen files).

If you need to read a file whose name starts with a -, you can specify it as './-file', or use -- to mark the end of options.

Comments in m4 input

Comments in m4 are normally delimited by the characters '#' and newline. All characters between the comment delimiters are ignored, but the entire comment (including the delimiters) is passed through to the output; m4 does not discard comments.

Comments cannot be nested, so the first newline after a '#' ends the comment. The commenting effect of the begin-comment string can be inhibited by quoting it.

$ m4
`quoted text' # `commented text'
⇒quoted text # `commented text'
`quoting inhibits' `#' `comments'
⇒quoting inhibits # comments

The comment delimiters can be changed to any string at any time, using the built-in macro changecom.
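The interplay of quoting, comments and macro names can be imitated with a toy scanner. This is not how m4 is implemented: the function scan and its dictionary of macros are assumptions for illustration, only the default delimiters are handled, macros take no arguments, and expansions are not rescanned as real m4 would do.

```python
def scan(text, macros):
    """Toy m4-like scanner: copy comments through, strip one level of quotes, expand names."""
    out, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch == "#":                                   # comment: copied unchanged through the newline
            end = text.find("\n", i)
            end = len(text) if end == -1 else end + 1
            out.append(text[i:end])
            i = end
        elif ch == "`":                                 # quoted string: remove the outer quotes only
            depth, j = 1, i + 1
            while j < len(text) and depth:
                depth += (text[j] == "`") - (text[j] == "'")
                j += 1
            if depth:
                raise ValueError("end of input inside a quoted string")
            out.append(text[i + 1:j - 1])
            i = j
        elif ch.isalpha() or ch == "_":                 # a name: replace it if it is a macro
            j = i
            while j < len(text) and (text[j].isalnum() or text[j] == "_"):
                j += 1
            word = text[i:j]
            out.append(macros.get(word, word))
            i = j
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print(scan("`quoted text' # `commented text'\n", {}), end="")
print(scan("`quoting inhibits' `#' `comments'\n", {}), end="")
print(scan("bar # bar stays as it is inside a comment\n", {"bar": "hello"}), end="")
```

In the last line the name bar is replaced outside the comment but left untouched inside it, because the scanner copies the comment through without looking for macro names.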

