Debugging
Several debuggers are available for debugging; they ship with the compiler packages Intel Cluster Studio and PGI Compiler. Additionally, the graphical frontend ddd for the GNU debugger gdb is available as a module.
Debugging using the GNU debugger
The GNU Project debugger (GDB) is a valuable tool that can help you analyze what caused your program to crash, i.e. it assists you in the process of debugging. With GDB you can execute your program line by line, stop at specified positions, and display or alter the state of the variables in scope. GDB is able to debug programs written in many compiled languages such as C and C++. Below we consider a basic (malfunctioning) program written in C and see how GDB can be used to discover where the problem is located. This example is meant to illustrate several features of GDB.
A basic malfunctioning example program
To illustrate how GDB can be used to detect bugs in a program, consider the following source code for a malfunctioning program written in C (with annotated line numbers), here referred to as myExample.c:
 1  #include <stdio.h>
 2  #include <stdlib.h>
 3
 4
 5  void listArray(int *, int);
 6
 7
 8  void listArray(int *myArray, int N){
 9    int i;
10
11    printf("array = [");
12    for(i=0; i<N; i++){
13      printf("%d ",myArray[i]);
14    }
15    printf("]\n");
16
17  }
18
19
20  int main(int argc, char *argv[]) {
21    int *myArray,N,i;
22
23    N=10;
24    myArray = (int *) malloc(-N*sizeof(int));
25
26    for(i=0; i<N; i++){
27      myArray[i]=i;
28    }
29
30    free(myArray);
31    return 0;
32  }
Besides the main function, which simply allocates an array and initializes its entries, the program implements one additional function called listArray, which lists a specified number of leading array entries.
Compiling for elaborate debugging information
To benefit from GDB you need to compile your program so that the executable contains debugging information. Using gcc this is done by adding the compiler option -g. For the above example you might thus type
gcc myExample.c -o myExample -g
This yields the executable myExample, compiled with debugging symbols.
A debugging session using GDB
If you invoke the program in a straightforward manner it will fail, resulting in
$ ./myExample
Segmentation fault
So, why is this? Although you might have already spotted the reason for the segmentation fault, let's use GDB to hunt down the error! To start GDB in order to debug the example program simply type
gdb myExample
on the command line. This only enters the interactive GDB mode; it does not run your program right away!
To have a look at the source code that belongs to the executable myExample you might use the GDB command list in one of several ways. If you vaguely remember the name of a GDB command but cannot recall how to properly use it, you can use the help function to find out about the command. E.g. to obtain details on the list command, simply type:
(gdb) help list
List specified function or line.
With no argument, lists ten more lines after or around previous listing.
"list -" lists the ten lines before a previous ten-line listing.
One argument specifies a line, and ten lines are listed around that line.
Two arguments with comma between specify starting and ending lines to list.
Lines can be specified in these ways:
  LINENUM, to list around that line in current file,
  FILE:LINENUM, to list around that line in that file,
  FUNCTION, to list around beginning of that function,
  FILE:FUNCTION, to distinguish among like-named static functions.
  *ADDRESS, to list around the line containing that address.
With two args if one is empty it stands for ten lines away from the other arg.
Hence to see the beginning of the main function one has to type
(gdb) list main
15        printf("]\n");
16
17      }
18
19
20      int main(int argc, char *argv[]) {
21        int *myArray,N,i;
22
23        N=10;
24        myArray = (int *) malloc(-N*sizeof(int));
which yields the 10 lines surrounding the beginning of the main function. To get the next 10 lines you can simply press Enter.
Now, in order to execute your program type
(gdb) run
Starting program: /user/fk5/ifp/agcompphys/alxo9476/wmwr/debug/gdb/c/myExample

Program received signal SIGSEGV, Segmentation fault.
0x000000000040070f in main (argc=1, argv=0x7fffffffe5a8) at myExample.c:27
27          myArray[i]=i;
So, there appears to be a problem at line 27 in the source code, at which point GDB reports a segmentation fault. To have a look at the context where the error occurred you can type list 27 to find:
(gdb) list 27
22
23        N=10;
24        myArray = (int *) malloc(-N*sizeof(int));
25
26        for(i=0; i<N; i++){
27          myArray[i]=i;
28        }
29
30        free(myArray);
31        return 0;
One way to proceed is to do a clean restart and set a so-called breakpoint at line 27. This will cause the program flow to be interrupted as soon as it reaches line 27, right before the statement in that line, namely myArray[i]=i;, is executed:
(gdb) kill
Kill the program being debugged? (y or n) y
(gdb) break 27
Breakpoint 1 at 0x4006f8: file myExample.c, line 27.
(gdb) run
Starting program: /user/fk5/ifp/agcompphys/alxo9476/wmwr/debug/gdb/c/myExample
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x2aaaaaaab000

Breakpoint 1, main (argc=1, argv=0x7fffffffe5a8) at myExample.c:27
27          myArray[i]=i;
Now, you can display the variables before line 27 is executed. E.g. to see the current value of the iteration variable i you can type print i to find
(gdb) print i
$1 = 0
Similarly, the address in memory of the array myArray can be listed by typing
(gdb) print myArray
$2 = (int *) 0x0
At this point, note that the address stored in myArray looks suspicious. Obviously something went wrong! A successfully allocated array would have a nonzero address, similar in form to 0x7fffffffe5a8, i.e. the address of the array argv in the argument list of the main function. Now, 0x0 is the hexadecimal representation of 0. Hence, at line 27, myArray is a NULL pointer. Bear in mind that whenever malloc fails and no memory is reserved, it returns the NULL pointer. Hence, the address 0x0 is a hint that something went wrong during the allocation.
To narrow down the error further, let's see how the allocation affects the pointer myArray. The memory allocation is done at line 24, so let's set an additional breakpoint via break 24 and do a clean restart by calling kill again. By the way, if you have already set several breakpoints and want an overview of them, you can type
(gdb) info b
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x00000000004006f8 in main at myExample.c:27
        breakpoint already hit 1 time
2       breakpoint     keep y   0x00000000004006d8 in main at myExample.c:24
Now, running the program will first stop prior to the statement in line 24, where we can check the address stored in myArray before and after the attempted memory allocation:
(gdb) run
Starting program: /user/fk5/ifp/agcompphys/alxo9476/wmwr/debug/gdb/c/myExample
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x2aaaaaaab000

Breakpoint 2, main (argc=1, argv=0x7fffffffe5a8) at myExample.c:24
24        myArray = (int *) malloc(-N*sizeof(int));
(gdb) print myArray
$3 = (int *) 0x7fffffffe5a0
(gdb) next
26        for(i=0; i<N; i++){
(gdb) print myArray
$4 = (int *) 0x0
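And indeed, looking at the argument passed to malloc makes the cause clear: the minus sign in front of the size of the desired memory block caused the trouble. Since sizeof(int) has the unsigned type size_t, -N is converted to an unsigned value before the multiplication, so the request wraps around to an enormous number of bytes that malloc cannot provide.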
Now, correcting the error by changing line 24 to
myArray = (int *) malloc(N*sizeof(int));
yields a functioning program.
Further features of GDB
The following examples use the corrected variant of the example program in a new GDB session to explore some further features of GDB:
Modifying the value of variables
Say you set a breakpoint at line 24. Looking up the value of N there yields 10, of course. If you want to change the value of N to 15 for the remaining session, you might type
set var N=15
Calling functions within GDB
Say you set a breakpoint at line 29, i.e. right after the values of the array are initialized. Since line 29 is empty, GDB actually stops at the next line containing code, which is line 30. If you are interested in whether all entries are properly initialized you can call the function listArray defined in the source code by typing:
(gdb) break 29
Breakpoint 1 at 0x40071b: file myExample.c, line 29.
(gdb) run
Starting program: /user/fk5/ifp/agcompphys/alxo9476/wmwr/debug/gdb/c/myExample
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x2aaaaaaab000

Breakpoint 1, main (argc=1, argv=0x7fffffffe5a8) at myExample.c:30
30        free(myArray);
(gdb) call listArray(myArray,N)
array = [0 1 2 3 4 5 6 7 8 9 ]
The possibility to directly use functions declared in the source code is especially useful if you want to display the content of intricate custom data structures.
Listing array entries
An easier way to list single entries of an array as well as whole ranges of array entries is as follows (continuing the previous GDB session):
(gdb) print myArray[0]
$1 = 0
(gdb) print myArray[0]@5
$2 = {0, 1, 2, 3, 4}
(gdb) print myArray[0]@15
$3 = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 135121, 0, 0, 0, 0}
Note that the last command also displays memory outside the range of the array myArray. This is because an expression of the form myArray[0]@X simply shows X consecutive values starting at the address of myArray[0], without checking the array bounds. If you are familiar with the concept of pointers, you might immediately think of the correspondence between pointers and addresses in memory. By using the dereference operator, i.e. the prefix *, you can access the value stored at the respective memory address:
(gdb) print *(myArray)
$4 = 0
(gdb) print *(myArray+1)
$5 = 1
(gdb) print *(myArray+10)
$6 = 135121
Executing source code line by line
Say you set a breakpoint at line 27. Then you can continue to execute the program line by line by typing the GDB command step. As an additional argument you can specify the number of times the command should be executed:
(gdb) break 27
Breakpoint 1 at 0x4006f6: file myExample.c, line 27.
(gdb) run
Starting program: /user/fk5/ifp/agcompphys/alxo9476/wmwr/debug/gdb/c/myExample
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x2aaaaaaab000

Breakpoint 1, main (argc=1, argv=0x7fffffffe5a8) at myExample.c:27
27          myArray[i]=i;
(gdb) print i
$1 = 0
(gdb) step 2

Breakpoint 1, main (argc=1, argv=0x7fffffffe5a8) at myExample.c:27
27          myArray[i]=i;
(gdb) print i
$2 = 1
(gdb) step 2

Breakpoint 1, main (argc=1, argv=0x7fffffffe5a8) at myExample.c:27
27          myArray[i]=i;
(gdb) print i
$3 = 2
The command next executes the program line by line, similarly to step. However, when a subroutine is called, next treats the whole subroutine as a single instruction, whereas step enters the subroutine and continues there line by line.
Conditional breakpoints
You can also attach a condition that must be met for the program to stop at a specified breakpoint. Say, e.g., you want to examine how the array entries change within the for loop (lines 26-28) when i==5. To achieve this, you can set a breakpoint at line 27 and let the program run until i==5 holds:
(gdb) break 27
Breakpoint 1 at 0x4006f6: file myExample.c, line 27.
(gdb) condition 1 i==5
(gdb) run
Starting program: /user/fk5/ifp/agcompphys/alxo9476/wmwr/debug/gdb/c/myExample

Breakpoint 1, main (argc=1, argv=0x7fffffffe5a8) at myExample.c:27
27          myArray[i]=i;
(gdb) print i
$1 = 5
(gdb) print myArray[0]@N
$2 = {0, 1, 2, 3, 4, 0, 0, 0, 0, 0}
(gdb) step
26        for(i=0; i<N; i++){
(gdb) print myArray[0]@N
$3 = {0, 1, 2, 3, 4, 5, 0, 0, 0, 0}
Using DDD for debugging
Note that the HPC system also supports users who prefer the "Data Display Debugger" (DDD) over plain GDB, since it features an intuitive graphical user interface. To use it, you need to log in to your HPC account using the -X option, similar to
ssh -X abcd1234@hero.hpc.uni-oldenburg.de
to forward the X Window connection through the SSH connection. DDD is not available by default; you need to load the respective module first. This is done via
module load ddd
After this you can, e.g., run DDD on the example discussed above. For this you might simply navigate to the respective working directory and type (remember to compile the program with debugging information)
ddd myExample
A snapshot from an actual debugging session (for the above example; breakpoint at line 23, program flow currently at line 27) using DDD is shown below.