Warning
|
This chapter is still a stub and it’s being heavily worked on. If planning a major addition to this chapter, please synchronize with the authors to avoid conflicts or duplicated efforts. You are still welcome to submit your feedback using a GitHub Issue. You can use the same mechanism to suggest sections that you believe should be included in this chapter, too. |
Debugging is the art of identifying and removing errors (i.e. bugs) from software. This section covers the most common Erlang debugging tools and techniques. Even if step-by-step debugging tools such as the Debugger exist in Erlang, the most effective debugging techniques in Erlang are the ones based on the so-called Erlang tracing facilities, which will be discussed in detail in chapter [CH-Tracing]. This chapter also covers the concept of Crash Dump, a readable text file generated by the Erlang Runtime System when an unrecoverable error is detected, for example when the system runs out of memory or when an emulator limit is reached. Crash Dumps can are extremely precious for post-mortem analysis of Erlang nodes and you will learn how to read and interpret them.
Redbug is a debugging utility which allows you to easily interact with the Erlang tracing facilities. It is an external library which is part of the eper performance and debugging suite for Erlang and therefore it has to be installed separately. One of the best Redbug features is its ability to shut itself down in case of overload.
You can clone redbug via:
$ git clone https://github.com/massemanet/eper.git
You can then compile it with:
$ cd eper
$ make
Ensure eper
is included in your path when starting an Erlang shell
and you are set to go. This can be done by explicitely adding the path
to the eper beam files when invoking erl
:
$ erl -pa /path/to/eper/ebin
Alternatively, the following line can be added to the ~/.erlang
file. This will ensure that the path to redbug gets included
automatically at every startup:
code:add_patha("/path/to/eper/ebin").
Redbug is safe to be used in production, thanks to a self-protecting mechanism against overload, which kills the tool in case too many tracing messages are sent, preventing the Erlang node to become overloaded. Let’s see it in action:
$ erl
Erlang/OTP 19 [erts-8.2] [...]
Eshell V8.2 (abort with ^G)
1> l(redbug). (1)
{module,redbug}
2> redbug:start("lists:sort/1"). (2)
{30,1}
3> lists:sort([3,2,1]).
[1,2,3]
% 15:20:20 <0.31.0>({erlang,apply,2}) (3)
% lists:sort([3,2,1])
redbug done, timeout - 1 (4)
-
First, we ensure that the
redbug
module is available and loaded. -
We then start
redbug
. We are interested in the function namedsort
with arity1
, exported by the modulelists
. Remember that, in Erlang lingo, the arity represents the number of input arguments that a given function takes. -
Finally, we invoke the
lists:sort/1
function and we verify that a message is produced by redbug. -
After the default timeout (15 seconds) is reached, redbug stops and displays the message "redbug done". Redbug is also kind enough to tell us the reason why it stopped (timeout) and the number of messages that collected until that point (1).
Let’s now look at the actual message produced by redbug. By default messages are printed to the standard output, but it’s also possible to dump them to file:
% 15:20:20 <0.31.0>({erlang,apply,2})
% lists:sort([3,2,1])
Depending on the version of redbug you are using, you may get a
slightly different message. In this case, the message is split across
two lines. The first line contains a timestamp, the Process Identifier
(or PID) of the Erlang process which invoked the function and the
caller function. The second line contains the function called,
including the input arguments. Both lines are prepended with a %
,
which reminds us of the syntax for Erlang comments.
We can also ask Redbug to produce an extra message for the return value. This is achieved using the following syntax:
4> redbug:start("lists:sort/1->return").
{30,1}
Let’s invoke the lists:sort/1
function again. This time the output
from redbug is slightly different.
5> lists:sort([3,2,1]).
[1,2,3]
% 15:35:52 <0.31.0>({erlang,apply,2})
% lists:sort([3,2,1])
% 15:35:52 <0.31.0>({erlang,apply,2})
% lists:sort/1 -> [1,2,3]
redbug done, timeout - 1
In this case two messages are produced, one when entering the function and one when leaving the same function.
When dealing with real code, trace messages can be complex and therefore hardly readable. Let’s see what happens if we try to trace the sorting of a list containing 10.000 elements.
6> lists:sort(lists:seq(10000, 1, -1)).
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
23,24,25,26,27,28,29|...]
% 15:48:42.208 <0.77.0>({erlang,apply,2})
% lists:sort([10000,9999,9998,9997,9996,9995,9994,9993,9992,9991,9990,9989,9988,9987,9986,
% 9985,9984,9983,9982,9981,9980,9979,9978,9977,9976,9975,9974,9973,9972,9971,
% 9970,9969,9968,9967,9966,9965,9964,9963,9962,9961,9960,9959,9958,9957,9956,
% 9955,9954,9953,9952,9951,9950,9949,9948,9947,9946,9945,9944,9943,9942,9941,
% 9940,9939,9938,9937,9936,9935,9934,9933,9932,9931,9930,9929,9928,9927,9926,
% 9925,9924,9923,9922,9921,9920,9919,9918,9917,9916,9915,9914,9913,9912,9911,
% [...]
% 84,83,82,81,80,79,78,77,76,75,74,73,72,71,70,69,68,67,66,65,64,63,62,61,60,
% 59,58,57,56,55,54,53,52,51,50,49,48,47,46,45,44,43,42,41,40,39,38,37,36,35,
% 34,33,32,31,30,29,28,27,26,25,24,23,22,21,20,19,18,17,16,15,14,13,12,11,10,9,
% 8,7,6,5,4,3,2,1])
% 15:48:42.210 <0.77.0>({erlang,apply,2}) lists:sort/1 ->
% [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
% 23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,
% 42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,
% 61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,
% 80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,
% 99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,
% [...]
% 9951,9952,9953,9954,9955,9956,9957,9958,9959,9960,9961,
% 9962,9963,9964,9965,9966,9967,9968,9969,9970,9971,9972,
% 9973,9974,9975,9976,9977,9978,9979,9980,9981,9982,9983,
% 9984,9985,9986,9987,9988,9989,9990,9991,9992,9993,9994,
% 9995,9996,9997,9998,9999,10000]
redbug done, timeout - 1
Most of the output has been truncated here, but you should get the
idea. To improve things, we can use a couple of redbug options. The
option {arity, true}
instructs redbug to only display the number of
input arguments for the given function, instead of their actual
value. The {print_return, false}
option tells Redbug not to display
the return value of the function call, and to display a …
symbol,
instead. Let’s see these options in action.
7> redbug:start("lists:sort/1->return", [{arity, true}, {print_return, false}]).
{30,1}
8> lists:sort(lists:seq(10000, 1, -1)).
[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,
23,24,25,26,27,28,29|...]
% 15:55:32 <0.77.0>({erlang,apply,2})
% lists:sort/1
% 15:55:32 <0.77.0>({erlang,apply,2})
% lists:sort/1 -> '...'
redbug done, timeout - 1
By default, redbug stops after 15 seconds or after 10 messages are
received. Those values are a safe default, but they are rarely
enough. You can bump those limits by using the time
and msgs
options. time
is expressed in milliseconds.
9> redbug:start("lists:sort/1->return", [{arity, true}, {print_return, false}, {time, 60 * 1000}, {msgs, 100}]).
{30,1}
We can also activate redbug for several function calls
simultaneously. Let’s enable tracing for both functions lists:sort/1
and lists:sort_1/3
(an internal function used by the former):
10> redbug:start(["lists:sort/1->return", "lists:sort_1/3->return"]).
{30,2}
11> lists:sort([4,4,2,1]).
[1,2,4,4]
% 18:39:26 <0.32.0>({erlang,apply,2})
% lists:sort([4,4,2,1])
% 18:39:26 <0.32.0>({erlang,apply,2})
% lists:sort_1(4, [2,1], [4])
% 18:39:26 <0.32.0>({erlang,apply,2})
% lists:sort_1/3 -> [1,2,4,4]
% 18:39:26 <0.32.0>({erlang,apply,2})
% lists:sort/1 -> [1,2,4,4]
redbug done, timeout - 2
Last but not least, redbug offers the ability to only display results for matching input arguments. This is when the syntax looks a bit like magic.
12> redbug:start(["lists:sort([1,2,5])->return"]).
{30,1}
13> lists:sort([4,4,2,1]).
[1,2,4,4]
14> lists:sort([1,2,5]).
[1,2,5]
% 18:45:27 <0.32.0>({erlang,apply,2})
% lists:sort([1,2,5])
% 18:45:27 <0.32.0>({erlang,apply,2})
% lists:sort/1 -> [1,2,5]
redbug done, timeout - 1
In the above example, we are telling redbug that we are only
interested in function calls to the lists:sort/1
function when the
input arguments is the list [1,2,5]
. This allows us to remove a huge
amount of noise in the case our target function is used by many actors
at the same time and we are only interested in a specific use case.
Oh, and don’t forget that you can use the underscore as a wildcard:
15> redbug:start(["lists:sort([1,_,5])->return"]). {30,1}
16> lists:sort([1,2,5]). [1,2,5]
% 18:49:07 <0.32.0>({erlang,apply,2}) lists:sort([1,2,5])
% 18:49:07 <0.32.0>({erlang,apply,2}) lists:sort/1 -> [1,2,5]
17> lists:sort([1,4,5]). [1,4,5]
% 18:49:09 <0.32.0>({erlang,apply,2}) lists:sort([1,4,5])
% 18:49:09 <0.32.0>({erlang,apply,2}) lists:sort/1 -> [1,4,5] redbug
% done, timeout - 2
This section does not pretend to be a comprehensive guide to redbug, but it should be enough to get you going. To get a full list of the available options for redbug, you can ask the tool itself:
18> redbug:help().