Understanding Different Phases of C Program Compilation - BunksAllowed


Let us write a sample C program test.c to understand the phases of the compilation process.

#include <stdio.h>

int main(void)
{
    printf("Hello World!");
    return 0;
}

To compile the program, use the cc or gcc command as shown below.
$ cc test.c

This generates an executable file named a.out. To run the program and check the result, enter the following.
$ ./a.out


Compilation of a program - step by step


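Let us walk through the same "Hello World!" program step by step. Instead of compiling the code with gcc test.c, we now compile it with gcc test.c -save-temps, so that the intermediate files the compiler normally deletes (test.i, test.s and test.o) are kept. Recent GCC releases may prefix these names with the output name, for example a-test.i.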

First, when the source file test.c is fed to the compiler, it is pre-processed. In this step, all the pre-processor directives are handled: each #include is replaced by the contents of the named header, macros are expanded, and comments are removed. The result is an expanded source file named test.i. Because test.c includes stdio.h, test.i is far larger than test.c. The end of test.i is shown below.

Content of test.i


.
static __inline int _putchar_unlocked(int _c)
{
    struct _reent *_ptr;
    _ptr = (__getreent());
    return (__sputc_r(_ptr, _c, ((_ptr)->_stdout)));
}
# 797 "/usr/include/stdio.h" 3 4
# 2 "test.c" 2
# 3 "test.c"
int main (void)
{
    printf("Hello World!");
    return 0;
}

Next, test.i is fed to the compiler proper. If the code has no syntax or type errors, the compiler produces test.s, which contains the equivalent assembly-language code for the target machine. test.s then serves as the input to the assembler. The content of test.s is shown below.

Content of test.s


	.file	"test.c"
	.def	__main;	.scl	2;	.type	32;	.endef
	.section .rdata,"dr"
.LC0:
	.ascii "Hello World!\0"
	.text
	.globl	main
	.def	main;	.scl	2;	.type	32;	.endef
	.seh_proc	main
main:
	pushq	%rbp
	.seh_pushreg	%rbp
	movq	%rsp, %rbp
	.seh_setframe	%rbp, 0
	subq	$32, %rsp
	.seh_stackalloc	32
	.seh_endprologue
	call	__main
	leaq	.LC0(%rip), %rcx
	call	printf
	movl	$0, %eax
	addq	$32, %rsp
	popq	%rbp
	ret
	.seh_endproc
	.ident	"GCC: (GNU) 6.4.0"
	.def	printf;	.scl	2;	.type	32;	.endef

Once test.s is generated, it is fed to the assembler, which converts it into the object file test.o. The file test.o contains machine-level instructions only for the code written in test.c. Calls to functions defined elsewhere, such as printf, are not resolved at this stage; they are recorded in test.o as undefined symbols.

Content of test.o


(test.o is a binary file, so most of it is not readable as text. The legible fragments are the section names .text, .data, .bss, .rdata, .xdata, .pdata and .rdata$zzz, the string literal Hello World!, the compiler identification GCC: (GNU) 6.4.0, and the symbol names test.c, main, __main and printf.)

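Next, the linker processes test.o. Using the symbol table in the object file, it matches each unresolved reference, such as the call to printf, with a definition in the C library or the startup code, and combines these with test.o to produce the executable a.out. With dynamic linking, which is the default on most systems, the library code itself is not copied into the executable; the linker only records which shared library is needed, and the reference is bound when the program starts. The loader comes into play only at run time: when we execute the program, it copies the executable from secondary memory into primary memory, maps any shared libraries it needs, and transfers control to the program.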

Finally, we run the executable with the ./a.out command.
We hope you now have a clear idea of how a C program is compiled and run.

Happy Exploring!
