From 952c5c128f9efaea89d41d882c4ea3ade7df4591 Mon Sep 17 00:00:00 2001
From: zakk
Date: Fri, 26 Aug 2005 04:48:05 +0000
Subject: Itsa me, quake3io!

git-svn-id: svn://svn.icculus.org/quake3/trunk@2 edf5b092-35ff-0310-97b2-ce42778d08ea
---
 lcc/doc/4.html | 754 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 754 insertions(+)
 create mode 100755 lcc/doc/4.html

diff --git a/lcc/doc/4.html b/lcc/doc/4.html
new file mode 100755
index 0000000..0158d8b
--- /dev/null
+++ b/lcc/doc/4.html
@@ -0,0 +1,754 @@

The lcc 4.1 Code-Generation Interface

Christopher W. Fraser and David R. Hanson, Microsoft Research

Introduction

Version 4.1 is the latest release of lcc, the ANSI C compiler described in our book A Retargetable C Compiler: Design and Implementation (Addison-Wesley, 1995, ISBN 0-8053-1670-1). This document summarizes the differences between the 4.1 code-generation interface and the 3.x interface described in Chap. 5 of A Retargetable C Compiler.

Previous versions of lcc supported only three sizes of integers and two sizes of floats, and insisted that pointers fit in unsigned integers (see Sec. 5.1 of A Retargetable C Compiler). These assumptions simplified the compiler and were suitable for 32-bit architectures. But on 64-bit architectures, such as the DEC ALPHA, it's natural to have four sizes of integers and perhaps three sizes of floats, and on 16-bit architectures, 32-bit pointers don't fit in unsigned integers. Also, the 3.x constraints limited the use of lcc's back ends for other languages, such as Java.

Version 4.x removes all of these restrictions: It supports any number of sizes for integers and floats, and the size of pointers need not be related to the size of any of the integer types. The major changes in the code-generation interface are:

The number of type suffixes has been reduced to 6.
Dag operators are composed of a generic operator, a type suffix, and a size.
Unsigned variants of several operators have been added.
Several interface functions have new signatures.

In addition, version 4.x is written in ANSI C and uses the standard I/O library and other standard C functions.

The sections below parallel the subsections of Chap. 5 of A Retargetable C Compiler and summarize the differences between the 3.x and 4.x code-generation interfaces. Unaffected subsections are omitted. Page citations refer to pages in A Retargetable C Compiler.

5.1 Type Metrics

There are now 10 metrics in an interface record:

Metrics charmetric;
Metrics shortmetric;
Metrics intmetric;
Metrics longmetric;
Metrics longlongmetric;
Metrics floatmetric;
Metrics doublemetric;
Metrics longdoublemetric;
Metrics ptrmetric;
Metrics structmetric;

Each of these specifies the size and alignment of the corresponding type. ptrmetric describes all pointers.

5.3 Symbols

The actual value of a constant is stored in the u.c.v field of a symbol, which holds a Value:

typedef union value {
	long i;
	unsigned long u;
	long double d;
	void *p;
	void (*g)(void);
} Value;

The value is stored in the appropriate field according to its type, which is given by the symbol's type field.

5.5 Dag Operators

The op field of a node structure holds a dag operator, which consists of a generic operator, a type suffix, and a size indicator. The type suffixes are:

enum {
	F=FLOAT,
	I=INT,
	U=UNSIGNED,
	P=POINTER,
	V=VOID,
	B=STRUCT
};

#define sizeop(n) ((n)<<10)

Given a generic operator o, a type suffix t, and a size s, a type- and size-specific operator is formed by o+t+sizeop(s). For example, ADD+F+sizeop(4) forms the operator ADDF4, which denotes the sum of two 4-byte floats. Similarly, ADD+F+sizeop(8) forms ADDF8, which denotes 8-byte floating addition. In the 3.x code-generation interface, ADDF and ADDD denoted these operations. There was no size indicator in the 3.x operators because the type suffix supplied both a type and a size.

Table 5.1 lists each generic operator, its valid type suffixes, and the number of kids and syms that it uses; multiple values for kids indicate type-specific variants. The notations in the syms column give the number of syms values and a one-letter code that suggests their uses: 1V indicates that syms[0] points to a symbol for a variable, 1C indicates that syms[0] is a constant, and 1L indicates that syms[0] is a label. For 1S, syms[0] is a constant whose value is a size in bytes; 2S adds syms[1], which is a constant whose value is an alignment. For most operators, the type suffix and size indicator denote the type and size of operation to perform and the type and size of the result.

Table 5.1. Node Operators.
syms	kids	Operator	Type Suffixes	Sizes	Operation
1V	0	`ADDRF`	`...P..`	p	address of a parameter
1V	0	`ADDRG`	`...P..`	p	address of a global
1V	0	`ADDRL`	`...P..`	p	address of a local
1C	0	`CNST`	`FIUP..`	fdx csilh p	constant

	1	`BCOM`	`.IU...`	ilh	bitwise complement
1S	1	`CVF`	`FI....`	fdx ilh	convert from float
1S	1	`CVI`	`FIU...`	fdx csilh csilhp	convert from signed integer
1S	1	`CVP`	`..U...`	p	convert from pointer
1S	1	`CVU`	`.IUP..`	csilh p	convert from unsigned integer
	1	`INDIR`	`FIUP.B`	fdx csilh p	fetch
	1	`NEG`	`FI....`	fdx ilh	negation

	2	`ADD`	`FIUP..`	fdx ilh ilhp p	addition
	2	`BAND`	`.IU...`	ilh	bitwise AND
	2	`BOR`	`.IU...`	ilh	bitwise inclusive OR
	2	`BXOR`	`.IU...`	ilh	bitwise exclusive OR
	2	`DIV`	`FIU...`	fdx ilh	division
	2	`LSH`	`.IU...`	ilh	left shift
	2	`MOD`	`.IU...`	ilh	modulus
	2	`MUL`	`FIU...`	fdx ilh	multiplication
	2	`RSH`	`.IU...`	ilh	right shift
	2	`SUB`	`FIUP..`	fdx ilh ilhp p	subtraction

2S	2	`ASGN`	`FIUP.B`	fdx csilh p	assignment
1L	2	`EQ`	`FIU...`	fdx ilh ilhp	jump if equal
1L	2	`GE`	`FIU...`	fdx ilh ilhp	jump if greater than or equal
1L	2	`GT`	`FIU...`	fdx ilh ilhp	jump if greater than
1L	2	`LE`	`FIU...`	fdx ilh ilhp	jump if less than or equal
1L	2	`LT`	`FIU...`	fdx ilh ilhp	jump if less than
1L	2	`NE`	`FIU...`	fdx ilh ilhp	jump if not equal

2S	1	`ARG`	`FIUP.B`	fdx ilh p	argument
1	1 or 2	`CALL`	`FIUPVB`	fdx ilh p	function call
	1	`RET`	`FIUPV.`	fdx ilh p	return from function

	1	`JUMP`	`....V.`		unconditional jump
1L	0	`LABEL`	`....V.`		label definition

The entries in the Sizes column indicate sizes of the operators that back ends must implement. Letters denote the size of float (f), double (d), long double (x), character (c), short integer (s), integer (i), long integer (l), "long long" integer (h), and pointer (p). These sizes are separated into sets for each type suffix, except that a single set is used for both I and U when the set for I is identical to the set for U.

The actual values for the size indicators, fdxcsilhp, depend on the target. A specification like ADDFf denotes the operator ADD+F+sizeop(f), where "f" is replaced by a target-dependent value, e.g., ADDF4 and ADDF8. For example, back ends must implement the following CVI and MUL operators.

CVIFf CVIFd CVIFx
CVIIc CVIIs CVIIi CVIIl CVIIh
CVIUc CVIUs CVIUi CVIUl CVIUh CVIUp

MULFf MULFd MULFx
MULIi MULIl MULIh
MULUi MULUl MULUh

On most platforms, there are fewer than three sizes of floats and six sizes of integers, and pointers are usually the same size as one of the integers. Also, lcc doesn't support the "long long" type, so h is not currently used. The set of platform-specific operators is therefore usually smaller than the list above suggests. For example, the X86, SPARC, and MIPS back ends implement the following CVI and MUL operators.

CVIF4 CVIF8
CVII1 CVII2 CVII4
CVIU1 CVIU2 CVIU4

MULF4 MULF8
MULI4
MULU4

The set of operators is thus target-dependent; for example, ADDI8 appears only if the target supports an 8-byte integer type. ops.c is a program that, given a set of sizes, prints the required operators and their values, e.g.,

% ops c=1 s=2 i=4 l=4 h=4 f=4 d=8 x=8 p=4
...
 CVIF4=4225 CVIF8=8321
 CVII1=1157 CVII2=2181 CVII4=4229
 CVIU1=1158 CVIU2=2182 CVIU4=4230
...
 MULF4=4561 MULF8=8657
 MULI4=4565
 MULU4=4566
...
131 operators
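
The listed values are consistent with the composition rule o+t+sizeop(s). The following sketch recomputes a few of them as a sanity check. The numeric codes for CVI, MUL, F, I, and U are assumptions inferred from the listing above (for instance, CVIF4=4225 is 128+1+4*1024); this document does not state them, and the names OP_CVI, SUF_F, and make_op are hypothetical.

```c
#include <assert.h>

/* Places the size above the generic-operator and type-suffix bits. */
#define sizeop(n) ((n) << 10)

/* ASSUMPTION: numeric codes inferred from the ops listing
   (CVIF4=4225, CVII4=4229, CVIU4=4230, MULF4=4561), not quoted
   from lcc's headers. */
enum { SUF_F = 1, SUF_I = 5, SUF_U = 6 };
enum { OP_CVI = 128, OP_MUL = 464 };

/* Form a type- and size-specific operator as o + t + sizeop(s). */
static int make_op(int generic_op, int suffix, int size)
{
    assert(size > 0);
    return generic_op + suffix + sizeop(size);
}
```

For instance, make_op(OP_CVI, SUF_F, 4) yields 4225, matching CVIF4 in the listing.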

The type suffix for a conversion operator denotes the type of the result and the size indicator gives the size of the result. For example, CVUI4 converts an unsigned (U) to a 4-byte signed integer (I4). The syms[0] field points to a symbol-table entry for an integer constant that gives the size of the source operand. For example, if syms[0] in a CVUI4 points to a symbol-table entry for 2, the conversion widens a 2-byte unsigned integer to a 4-byte signed integer. Conversions that widen unsigned integers zero-extend; those that widen signed integers sign-extend.
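
The widening rule can be illustrated in plain C. This is only a sketch of the arithmetic a back end's instructions would need to perform for a 4-byte result; the function names are hypothetical, the source size plays the role of the constant in syms[0], and the final cast to int32_t assumes the usual two's-complement behavior.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extend the low src_size bytes of bits to 32 bits, as a
   conversion from a narrower unsigned operand (e.g. CVUI4) would. */
static uint32_t widen_unsigned(uint32_t bits, int src_size)
{
    assert(src_size == 1 || src_size == 2 || src_size == 4);
    if (src_size == 4)
        return bits;
    return bits & ((UINT32_C(1) << (8 * src_size)) - 1);
}

/* Sign-extend the low src_size bytes of bits to 32 bits, as a
   conversion from a narrower signed operand (e.g. CVII4) would. */
static int32_t widen_signed(uint32_t bits, int src_size)
{
    uint32_t low = widen_unsigned(bits, src_size);
    uint32_t sign = UINT32_C(1) << (8 * src_size - 1);

    /* (low ^ sign) - sign copies the sign bit into the upper bits,
       computed modulo 2^32. */
    return (int32_t)((low ^ sign) - sign);
}
```

With a 2-byte source holding 0xFFFF, widen_unsigned gives 65535 while widen_signed gives -1.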

The front end composes conversions between types T₁ and T₂ by widening T₁ to its "supertype", if necessary, converting that result to T₂'s supertype, then narrowing the result to T₂, if necessary. The following table lists the supertypes; omitted entries are their own supertypes.

Type	Supertype
signed char	int
signed short	int
unsigned char	int, if sizeof (char) < sizeof (int); unsigned, otherwise
unsigned short	int, if sizeof (short) < sizeof (int); unsigned, otherwise
void *	an unsigned type as large as a pointer

Pointers are converted to an unsigned type of the same size, even when that type is not one of the integer types.

For example, the front end converts a signed short to a float by first converting it to an int and then to a float. It converts an unsigned short to an int with a single CVUIi conversion, when shorts are smaller than ints.
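
The supertype rule for the small integer types can be written as a short function. This is a sketch under stated assumptions: the type tags and the sizes structure are hypothetical stand-ins, not lcc's own type representation, and pointers are left out.

```c
#include <assert.h>

/* HYPOTHETICAL type tags for this sketch only. */
enum tag { T_SCHAR, T_SSHORT, T_UCHAR, T_USHORT, T_INT, T_UNSIGNED };

/* Target sizes in bytes; the field names are illustrative. */
struct sizes { int char_size, short_size, int_size; };

/* Return the supertype of t on a target with the given sizes.
   Types not listed in the table are their own supertypes. */
static enum tag supertype(enum tag t, const struct sizes *s)
{
    assert(s != 0);
    switch (t) {
    case T_SCHAR:
    case T_SSHORT:
        return T_INT;
    case T_UCHAR:
        return s->char_size < s->int_size ? T_INT : T_UNSIGNED;
    case T_USHORT:
        return s->short_size < s->int_size ? T_INT : T_UNSIGNED;
    default:
        return t;
    }
}
```

On a target where short and int are both 2 bytes, an unsigned short widens to unsigned rather than int, because int cannot represent every unsigned short value.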

There are now signed and unsigned variants of ASGN, INDIR, BCOM, BOR, BXOR, BAND, ARG, CALL, and RET to simplify code generation on platforms that use different instructions or register sets for signed and unsigned operations. Likewise there are now pointer variants of ASGN, INDIR, ARG, CALL, and RET.

5.6 Interface Flags

unsigned unsigned_char:1;

tells the front end whether plain characters are signed or unsigned. If it's zero, char is a signed type; otherwise, char is an unsigned type.

All the interface flags can be set by command-line options, e.g., -Wf-unsigned_char=1 causes plain characters to be unsigned.

5.8 Definitions

The front end announces local variables by calling

void (*local)(Symbol);

It announces temporaries likewise; these have the symbol's temporary flag set, which indicates that the symbol will be used only in the next call to gen. If a temporary's u.t.cse field is nonnull, it points to the node that computes the value assigned to the temporary; see page 346.

The front end calls

void (*address)(Symbol p, Symbol q, long n);

to initialize q to a symbol that represents an address of the form x+n, where x is the address represented by p and the long integer n is positive or negative.
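
One plausible way for a back end to name such a symbol is to append the signed offset to the assembler name of p. The sketch below shows only that string formation. It assumes symbols carry a printable name, which this document does not describe; offset_name and its buffer-based interface are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write "base+n" or "base-n" into out. A negative n prints its own
   '-', so '+' is added only when n >= 0. Returns the length written,
   or -1 if out is too small to hold the whole name. */
static int offset_name(char *out, size_t outsize, const char *base, long n)
{
    int len;

    assert(out != NULL && base != NULL);
    len = snprintf(out, outsize, "%s%s%ld", base, n >= 0 ? "+" : "", n);
    return (len < 0 || (size_t)len >= outsize) ? -1 : len;
}
```

Taking n as a long matches the new signature, so offsets are no longer limited to the range of int.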

5.9 Constants

The interface function

void (*defconst)(int suffix, int size, Value v);

initializes constants. defconst emits directives to define a cell and initialize it to a constant value. v is the constant value, suffix identifies the type of the value, and size is the size of the value in bytes. The value of suffix indicates which field of v holds the value, as shown in the following table.

suffix	v field	size (sizeof one of these types)
`F`	`v.d`	float, double, long double
`I`	`v.i`	signed char, signed short, signed int, signed long
`U`	`v.u`	unsigned char, unsigned short, unsigned int, unsigned long
`P`	`v.p`	void *

defconst must narrow v.x when size is less than sizeof v.x; e.g., to emit an unsigned char, defconst should emit (unsigned char)v.u.
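
The narrowing step might look like the following for integer constants. This is a sketch under assumptions, not lcc's defconst: the directive names (.byte, .half, .word), the suffix codes, and the function emit_int_const are hypothetical, and the text is written to a buffer rather than to the standard output.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef union value {
    long i;
    unsigned long u;
    long double d;
    void *p;
    void (*g)(void);
} Value;

/* ASSUMPTION: suffix codes chosen for illustration only. */
enum { SUF_I = 5, SUF_U = 6 };

/* Format an integer constant of the given size as an assembler
   directive, narrowing the value to size bytes first. Returns the
   snprintf result, or -1 for an unsupported size. */
static int emit_int_const(char *out, size_t outsize,
                          int suffix, int size, Value v)
{
    const char *dir;
    unsigned long bits;

    assert(suffix == SUF_I || suffix == SUF_U);
    switch (size) {
    case 1: dir = ".byte"; break;   /* hypothetical directive names */
    case 2: dir = ".half"; break;
    case 4: dir = ".word"; break;
    default: return -1;
    }
    /* Pick the field named by the suffix, then keep the low bytes. */
    bits = suffix == SUF_I ? (unsigned long)v.i : v.u;
    if ((size_t)size < sizeof bits)
        bits &= (1UL << (8 * size)) - 1;
    return snprintf(out, outsize, "%s 0x%lx", dir, bits);
}
```

The mask is skipped when size equals sizeof bits, which avoids shifting by the full width of unsigned long.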

5.12 Upcalls

lcc 4.x uses standard I/O and its I/O functions have been changed accordingly. lcc reads input from the standard input, emits code to the standard output, and writes diagnostics to the standard error output. It uses freopen to redirect these streams to explicit files, when necessary.

bp, outflush, and outs have been eliminated.

extern void fprint(FILE *f, const char *fmt, ...);
extern void  print(const char *fmt, ...);

print formatted data to file f (fprint) or the standard output (print). These functions are like standard C's printf and fprintf, but support only some of the standard conversion specifiers and do not support flags, precision, or field-width specifications. They support the following new conversion specifiers in addition to those described on page 99.

Specifiers	Corresponding printf Specifiers
`%c`	`%c`
`%d %D`	`%d %ld`
`%u %U`	`%u %lu`
`%x %X`	`%x %lx`
`%f %e %g`	`%f %e %g`
`%p`	Converts the corresponding void * argument to unsigned long and prints it with the `printf` `%#x` specifier, or just `%x` when the argument is null.
`%I`	Prints the number of spaces given by the corresponding argument.
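
To make the table concrete, here is a miniature formatter that handles three of the specifiers: %d, %D, and %I. It is a sketch only and not lcc's implementation; the name mini_print and the choice to write into a caller-supplied buffer are assumptions made so the behavior is easy to check.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* HYPOTHETICAL miniature of a print-style formatter.
   %d takes an int, %D a long, %I an int count of spaces.
   Output is truncated to fit outsize, always NUL-terminated. */
static void mini_print(char *out, size_t outsize, const char *fmt, ...)
{
    va_list ap;
    size_t len = 0;
    char tmp[32];
    int i, n;

    assert(outsize > 0);
    va_start(ap, fmt);
    for (; *fmt != '\0'; fmt++) {
        tmp[0] = *fmt;              /* default: copy one character */
        tmp[1] = '\0';
        if (*fmt == '%' && fmt[1] != '\0') {
            switch (*++fmt) {
            case 'd':
                snprintf(tmp, sizeof tmp, "%d", va_arg(ap, int));
                break;
            case 'D':
                snprintf(tmp, sizeof tmp, "%ld", va_arg(ap, long));
                break;
            case 'I':               /* indentation */
                n = va_arg(ap, int);
                for (i = 0; i < n && len + 1 < outsize; i++)
                    out[len++] = ' ';
                tmp[0] = '\0';
                break;
            default:                /* e.g. "%%" emits '%' */
                tmp[0] = *fmt;
                break;
            }
        }
        for (i = 0; tmp[i] != '\0' && len + 1 < outsize; i++)
            out[len++] = tmp[i];
    }
    va_end(ap);
    out[len] = '\0';
}
```

Upper-case letters select the long variants, so a long argument is passed with %D where printf would need %ld.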

#define generic(op)  ((op)&0x3F0)
#define specific(op) ((op)&0x3FF)

generic(op) returns the generic variant of op; that is, without its type suffix and size indicator. specific(op) returns the type-specific variant of op; that is, without its size indicator.
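
As an illustration, the macros can be applied to CVII4, whose value the ops listing earlier in this document gives as 4229. The helpers size_of_op and suffix_of_op are hypothetical additions for this sketch and are not described as part of the interface; they assume the size occupies the bits above bit 9, as sizeop implies.

```c
#include <assert.h>

#define sizeop(n)    ((n) << 10)
#define generic(op)  ((op) & 0x3F0)
#define specific(op) ((op) & 0x3FF)

/* HYPOTHETICAL helper: recover the size by undoing the shift
   applied by sizeop. */
static int size_of_op(int op)
{
    assert(op >= 0);
    return op >> 10;
}

/* HYPOTHETICAL helper: the type suffix is what remains in the low
   four bits once the generic operator is masked off. */
static int suffix_of_op(int op)
{
    return specific(op) - generic(op);
}
```

With op = 4229, generic(op) is 128, specific(op) is 133, and size_of_op(op) is 4.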

newconst has been replaced by

extern Symbol intconst(int n);

which installs the integer constant n in the symbol table, if necessary, and returns a pointer to the symbol-table entry.

Chris Fraser / cwfraser@microsoft.com
David Hanson / drh@microsoft.com
$Revision: 145 $ $Date: 2001-10-17 16:53:10 -0500 (Wed, 17 Oct 2001) $
