ASSEMBLER
An assembler is a program that converts assembly
language into machine code.
It takes the basic commands and operations from assembly
code and converts them into binary code that can be
recognized by a specific type of processor.
The assembler reads the assembly language source code twice
before it outputs object code. Each read of the source code is called a pass.
This is because assembly language source code often contains
forward references. A forward reference occurs when a label is used as an
operand, for example as a branch target, earlier in the code than the
definition of the label. The assembler cannot know the address of the forward
reference label until it reads the definition of the label.
During each pass, the assembler performs different
functions. In the first pass, the assembler:
- Checks the syntax of the instruction or directive. It faults
  if there is an error in the syntax, for example if a label is specified on a
  directive that does not accept one.
- Determines the size of the instruction and data being
  assembled and reserves space.
- Determines offsets of labels within sections.
- Creates a symbol table containing label definitions and
  their memory addresses.
In the second pass, the assembler:
- Faults if an undefined reference is specified in an
  instruction operand or directive.
- Encodes the instructions using the label offsets from pass
  1, where applicable.
- Generates relocations.
- Generates debug information if requested.
- Outputs the object file.
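The two passes described above can be sketched in Python. This is a toy model, not how any real assembler is implemented: it assumes every instruction occupies 4 bytes, that text starting in column 0 is a label, and that B is the only instruction taking a label operand. The function names (parse, pass1, pass2, assemble) are made up for the illustration.

```python
INSTR_SIZE = 4  # assumption: every instruction occupies 4 bytes


def parse(line):
    """Split a source line into (label, op, operand); missing fields are None."""
    line = line.split(";")[0].rstrip()      # drop the comment, if any
    if not line.strip():
        return None, None, None
    label = None
    if not line[0].isspace():               # text in column 0 is a label
        parts = line.split(None, 1)
        label = parts[0]
        line = parts[1] if len(parts) > 1 else ""
    fields = line.split()
    op = fields[0] if fields else None
    operand = fields[1] if len(fields) > 1 else None
    return label, op, operand


def pass1(source):
    """Pass 1: size each instruction and record the offset of every label."""
    symbols, offset = {}, 0
    for line in source:
        label, op, _ = parse(line)
        if label is not None:
            if label in symbols:
                raise SyntaxError(f"duplicate label {label}")
            symbols[label] = offset
        if op is not None:
            offset += INSTR_SIZE
    return symbols


def pass2(source, symbols):
    """Pass 2: encode instructions, replacing label operands with offsets."""
    out, offset = [], 0
    for line in source:
        _, op, operand = parse(line)
        if op is None:
            continue
        if op == "B":                       # branch: operand must be a label
            if operand not in symbols:
                raise NameError(f"undefined reference {operand}")
            out.append((offset, op, symbols[operand]))
        else:
            out.append((offset, op, operand))
        offset += INSTR_SIZE
    return out


def assemble(source):
    return pass2(source, pass1(source))


program = [
    "        B    done      ; forward reference: done is defined later",
    "        NOP",
    "done    NOP",
]
print(assemble(program))
```

The branch on the first line can only be encoded in pass 2, because the offset of done (8 in this model) is not known until pass 1 has read the whole program.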
Memory addresses of labels are determined and finalized in
the first pass. Therefore, the assembly code must not change during the second
pass. All instructions must be seen in both passes. Therefore you must not
define a symbol after a :DEF: test for the symbol. The assembler faults if it
sees code in pass 2 that was not seen in pass 1.
Line not seen in pass 1
The following example shows that num EQU 42 is not seen in
pass 1 but is seen in pass 2:
        AREA x,CODE
        [ :DEF: foo     ; false in pass 1: foo is not yet defined
num     EQU 42          ; so this line is assembled only in pass 2
        ]
foo     DCD num
        END
Assembling this code generates the error:
A1903E: Line not seen in first pass; cannot be assembled.
Line not seen in pass 2
The following example shows that MOV r1,r2 is seen in pass 1
but not in pass 2:
        AREA x,CODE
        [ :LNOT: :DEF: foo  ; true in pass 1, false in pass 2
        MOV r1, r2          ; so this line is assembled only in pass 1
        ]
foo     MOV r3, r4
        END
Assembling this code generates the error:
A1909E: Line not seen in second pass; cannot be assembled.
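Why the two examples fail can be sketched with a small Python model of conditional assembly. It is only an illustration under simplifying assumptions: the symbol table starts empty in pass 1 and is complete in pass 2, only the [ :DEF: sym, [ :LNOT: :DEF: sym and ] forms are understood, and any text in column 0 is treated as a label. The helper names visible_lines and compare_passes are hypothetical.

```python
def visible_lines(source, symbols):
    """Return the indices of the lines that get assembled.

    `symbols` holds the labels known on entry and grows as labels are read.
    """
    seen, active = [], [True]               # stack of enclosing conditions
    for i, line in enumerate(source):
        text = line.strip()
        if not text:
            continue
        if text.startswith("["):
            cond = text[1:].split()
            negate = cond[0] == ":LNOT:"
            name = cond[-1]
            result = (name in symbols) != negate
            active.append(active[-1] and result)
        elif text == "]":
            active.pop()
        elif active[-1]:
            seen.append(i)
            if not line[0].isspace():       # label in column 0
                symbols.add(text.split()[0])
    return seen


def compare_passes(source):
    """Return (lines missing from pass 1, lines missing from pass 2)."""
    symbols = set()
    first = visible_lines(source, symbols)   # pass 1: table starts empty
    second = visible_lines(source, symbols)  # pass 2: table is complete

    def tidy(i):
        return " ".join(source[i].split())

    not_in_pass1 = [tidy(i) for i in second if i not in first]
    not_in_pass2 = [tidy(i) for i in first if i not in second]
    return not_in_pass1, not_in_pass2


ex1 = [
    "        AREA x,CODE",
    "        [ :DEF: foo",
    "num     EQU 42",
    "        ]",
    "foo     DCD num",
    "        END",
]
ex2 = [
    "        AREA x,CODE",
    "        [ :LNOT: :DEF: foo",
    "        MOV r1, r2",
    "        ]",
    "foo     MOV r3, r4",
    "        END",
]
print(compare_passes(ex1))
print(compare_passes(ex2))
```

In this model the first example reports num EQU 42 as missing from pass 1, and the second reports MOV r1, r2 as missing from pass 2, matching the two cases above.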
LINKER
In computer science, a linker is a program
that takes one or more object files generated by
a compiler or assembler and combines them into a single executable program.
How the Linker Works
The compiler compiles a single high-level language file (C
language, for example) into a single object module file. The linker (ld) can
only work with object modules to link them together. Object modules are the
smallest unit that the linker works with.
Typically, on the linker command line, you specify a
set of object modules (that have been previously compiled) and then a list of
libraries, including the Standard C Library. The linker takes the set of object
modules that you specify on the command line and links them together.
Afterwards there will probably be a set of "undefined references". A
reference is essentially a function call. An undefined reference is a function
call with no defined function to match the call.
The linker will then go through the libraries, in order, to
match the undefined references with function definitions that are found in the
libraries. If it finds the function that matches the call, the linker will then
link in the object module in which the function is located. This part is
important: the linker links in THE ENTIRE OBJECT MODULE in which the function
is located. Remember, the linker knows nothing about the functions internal to
an object module, other than symbol names (such as function names). The
smallest unit the linker works with is the object module.
When there are no more undefined references, the linker has
linked everything it needs and outputs the final application.
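The resolution process described above can be sketched in Python. This is a simplified model under stated assumptions, not the algorithm of ld or any other real linker: an object module is reduced to the set of symbols it defines and the set it references, and a library is just an ordered list of such modules. All module and symbol names below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectModule:
    name: str
    defines: frozenset
    references: frozenset = frozenset()


def link(objects, libraries):
    """Return the names of all modules in the final link, in link order."""
    linked = list(objects)                  # command-line modules always go in
    defined, referenced = set(), set()
    for module in linked:
        defined |= module.defines
        referenced |= module.references

    for library in libraries:               # libraries are searched in order
        progress = True
        while progress:                     # rescan: new modules add new references
            progress = False
            for module in library:
                if module in linked:
                    continue
                if module.defines & (referenced - defined):
                    linked.append(module)   # the ENTIRE module is linked in
                    defined |= module.defines
                    referenced |= module.references
                    progress = True

    undefined = referenced - defined
    if undefined:
        raise NameError(f"undefined references: {sorted(undefined)}")
    return [module.name for module in linked]


main_o = ObjectModule("main.o", frozenset({"main"}), frozenset({"printf", "helper"}))
util_o = ObjectModule("util.o", frozenset({"helper"}))
libc = [
    ObjectModule("printf.o", frozenset({"printf", "fprintf"}), frozenset({"write"})),
    ObjectModule("write.o", frozenset({"write"})),
    ObjectModule("qsort.o", frozenset({"qsort"})),
]
print(link([main_o, util_o], [libc]))
```

Here printf.o is pulled in because main.o calls printf, and fprintf comes along with it even though nothing calls it, since the whole module is linked. qsort.o is left out because nothing references it.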
LOADER
The loader is the part of the operating system that loads the
executable from disk into primary memory (RAM) for execution. It
allocates memory space for the executable module in main memory and then
transfers control to the first instruction of the program.
How the Loader Works
If you use the strace command to trace a program compiled with the xlC
compiler, the first call reported is usually execve(), which is, in effect,
the loader.
The loader creates the process, which involves:
- Reading the file and creating an address space for the
  process.
- Creating page table entries for the instructions, data, and program
  stack, and initializing the register set.
- Executing a jump instruction to the first instruction
  of the program, which generally causes a page fault that brings the first
  page of your instructions into memory.
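The steps above can be illustrated with a toy Python model of demand paging. It is a sketch under loud assumptions, not a description of any real operating system: pages are 4 KiB, the "executable" is a plain bytes object standing in for the file on disk, the register set is reduced to a program counter, and the names Process, load and fetch are invented for the example.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


class Process:
    def __init__(self, image, entry):
        self.image = image                       # stands in for the file on disk
        n_pages = -(-len(image) // PAGE_SIZE)    # ceiling division
        # Page table entries exist, but no page is resident yet (None).
        self.page_table = {page: None for page in range(n_pages)}
        self.pc = entry                          # register set: just the PC here
        self.page_faults = 0

    def fetch(self):
        """Fetch the byte at the PC, faulting the page in if necessary."""
        page, offset = divmod(self.pc, PAGE_SIZE)
        if self.page_table[page] is None:        # page fault: bring the page in
            self.page_faults += 1
            start = page * PAGE_SIZE
            self.page_table[page] = self.image[start:start + PAGE_SIZE]
        byte = self.page_table[page][offset]
        self.pc += 1
        return byte


def load(image, entry=0):
    """Create the address space and set the PC; copy nothing into memory yet."""
    return Process(image, entry)


image = bytes([0x90]) * 5000                     # hypothetical 5000-byte program
proc = load(image)
first = proc.fetch()                             # first fetch triggers a page fault
print(proc.page_faults, hex(first))
```

Loading copies nothing in this model. The first fetch faults page 0 in, later fetches from the same page do not fault, and page 1 stays on "disk" until the program counter reaches it.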