Tracking the reference count of a GstMiniObject using gdb

As part of my work at Collabora, I'm currently adding Valgrind support to the awesome gst-validate tool. The ultimate goal is to run our hundreds of GStreamer tests inside Valgrind as part of the existing QA infrastructure to automatically track memory related regressions (invalid reads, leaks, etc).

Most of the gst-validate changes have already landed and can be very easily used by passing the --valgrind argument to gst-validate-launcher. I'm now focusing on making sure most of our existing tests are passing with Valgrind which means doing quite a lot of memory leaks debugging (everyone love doing those right?).

A lot of GStreamer types are based on GstMiniObject instead of the usual GObject. It makes a lot of sense from a performance pov but can make tracking ref count issues harder as we can't rely on tools such as RefDbg or gobject-list.

I was tracking one GstMiniObject leak today and was looking for a way to get a trace each time its reference is modified. We can use GST_DEBUG="GST_REFCOUNTING:9" to get logs each time the object is reffed/unreffed but I was actually interested in the full stack trace. With to the help of the French gang (kudos to Dodji, Bastien and Christophe!) I managed to do so using this good old gdb.

First thing is to break when the object you want to track is created, you can either do this by using in gdb b mysource:line or just add a G_BREAKPOINT() in your source code. Start your app with gdb as usual then uses:

set logging on
set pagination off

The ouput can be pretty long so this will ensure that logs are saved to a file (gdb.txt by default) and that gdb won't bother you asking for confirmation before printing ouput. Now start your app (run) and once it has paused use the following command:

watch -location ((GstMiniObject*)caps)->refcount

caps is the name of the instance of the object I want to track, as defined in the scope where I installed my breakpoint; update it to match yours. This command adds a watchpoint on the refcount of the object, that means gdb will now stop each time its value is modified. The -location option ensures that gdb watches the memory associated with the expression, not using it would limit us to the local scope of the variable.

Now we want to display a backtrace each time gdb pauses when this watchpoint is hit. This is done using commands:

commands
bt
continue
end

All the gdb instructions between commands and end will be automatically executed each time gdb pauses because of the watchpoint we just defined. In our case we first want to display a stack trace (bt) and then continue the execution of the program.

We are now all set, we just have to ask gdb to resume the normal execution of the program we are debugging:

continue

This should generate a gdb.txt log file containing something like:

Old value = 1
New value = 2
gst_mini_object_ref (mini_object=0x7ffff001c8f0) at gstminiobject.c:362
362	  return mini_object;
#0  0x00007ffff6f384ed in gst_mini_object_ref (mini_object=0x7ffff001c8f0) at gstminiobject.c:362
#1  0x00007ffff6f38b00 in gst_mini_object_replace (olddata=0x7ffff67f0c58, newdata=0x7ffff001c8f0) at gstminiobject.c:501
#2  0x00007ffff72573ed in gst_caps_replace (old_caps=0x7ffff67f0c58, new_caps=0x7ffff001c8f0) at ../../../gst/gstcaps.h:312
#3  0x00007ffff72578fa in helper_find_suggest (data=0x7ffff67f0c30, probability=GST_TYPE_FIND_MAXIMUM, caps=0x7ffff001c8f0) at gsttypefindhelper.c:230
#4  0x00007ffff6f7d606 in gst_type_find_suggest_simple (find=0x7ffff67f0bf0, probability=100, media_type=0x7ffff5de8a66 "video/mpegts", fieldname=0x7ffff5de8a4e "systemstream") at gsttypefind.c:197
#5  0x00007ffff5ddbec3 in mpeg_ts_type_find (tf=0x7ffff67f0bf0, unused=<optimized out>) at gsttypefindfunctions.c:2381
#6  0x00007ffff6f7dbc7 in gst_type_find_factory_call_function (factory=0x6dbbc0 [GstTypeFindFactory], find=0x7ffff67f0bf0) at gsttypefindfactory.c:215
#7  0x00007ffff7257d83 in gst_type_find_helper_get_range (obj=0x7e8280 [GstProxyPad], parent=0x7d0500 [GstGhostPad], func=0x7ffff6f2aa37 <gst_proxy_pad_getrange_default>, size=10420224, extension=0x7ffff00010e0 "MTS", prob=0x7ffff67f0d04) at gsttypefindhelper.c:355
#8  0x00007ffff683fd43 in gst_type_find_element_loop (pad=0x7e47d0 [GstPad]) at gsttypefindelement.c:1064
#9  0x00007ffff6f79895 in gst_task_func (task=0x7ef050 [GstTask]) at gsttask.c:331
#10 0x00007ffff6f7a971 in default_func (tdata=0x61ac70, pool=0x619910 [GstTaskPool]) at gsttaskpool.c:68
#11 0x0000003ebb070d68 in g_thread_pool_thread_proxy (data=<optimized out>) at gthreadpool.c:307
#12 0x0000003ebb0703d5 in g_thread_proxy (data=0x7ceca0) at gthread.c:764
#13 0x0000003eb880752a in start_thread (arg=0x7ffff67f1700) at pthread_create.c:310
#14 0x0000003eb850022d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Hardware watchpoint 1: -location ((GstMiniObject*)caps)->refcount

Old value = 2
New value = 1
0x00007ffff6f388b6 in gst_mini_object_unref (mini_object=0x7ffff001c8f0) at gstminiobject.c:442
442	  if (G_UNLIKELY (g_atomic_int_dec_and_test (&mini_object->refcount))) {
#0  0x00007ffff6f388b6 in gst_mini_object_unref (mini_object=0x7ffff001c8f0) at gstminiobject.c:442
#1  0x00007ffff6f7d027 in gst_caps_unref (caps=0x7ffff001c8f0) at ../gst/gstcaps.h:230
#2  0x00007ffff6f7d615 in gst_type_find_suggest_simple (find=0x7ffff67f0bf0, probability=100, media_type=0x7ffff5de8a66 "video/mpegts", fieldname=0x7ffff5de8a4e "systemstream") at gsttypefind.c:198
#3  0x00007ffff5ddbec3 in mpeg_ts_type_find (tf=0x7ffff67f0bf0, unused=<optimized out>) at gsttypefindfunctions.c:2381
#4  0x00007ffff6f7dbc7 in gst_type_find_factory_call_function (factory=0x6dbbc0 [GstTypeFindFactory], find=0x7ffff67f0bf0) at gsttypefindfactory.c:215
#5  0x00007ffff7257d83 in gst_type_find_helper_get_range (obj=0x7e8280 [GstProxyPad], parent=0x7d0500 [GstGhostPad], func=0x7ffff6f2aa37 <gst_proxy_pad_getrange_default>, size=10420224, extension=0x7ffff00010e0 "MTS", prob=0x7ffff67f0d04) at gsttypefindhelper.c:355
#6  0x00007ffff683fd43 in gst_type_find_element_loop (pad=0x7e47d0 [GstPad]) at gsttypefindelement.c:1064
#7  0x00007ffff6f79895 in gst_task_func (task=0x7ef050 [GstTask]) at gsttask.c:331
#8  0x00007ffff6f7a971 in default_func (tdata=0x61ac70, pool=0x619910 [GstTaskPool]) at gsttaskpool.c:68
#9  0x0000003ebb070d68 in g_thread_pool_thread_proxy (data=<optimized out>) at gthreadpool.c:307
#10 0x0000003ebb0703d5 in g_thread_proxy (data=0x7ceca0) at gthread.c:764
#11 0x0000003eb880752a in start_thread (arg=0x7ffff67f1700) at pthread_create.c:310
#12 0x0000003eb850022d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Hardware watchpoint 1: -location ((GstMiniObject*)caps)->refcount

If you see this kind of error in your log

Error evaluating expression for watchpoint 1 value has been optimized out

this means you may have to rebuild your application and/or its libraries without any optimization. This is pretty easy with autotools:

make clean
CFLAGS="-g3 -ggdb3 -O0" make

Now all you need is to grab a good cup of tea and start digging through this log to find your leak. Good fun!

Edit

Daniel pointed out to me that watchpoints can be pretty unreliable in gdb. So here is another version where we ask gdb to break when our object is reffed/unreffed. You just have to figure out its address using gdb or simply by printing the value of its pointer.

b gst_mini_object_ref if (mini_object == 0xdeadbeef)
b gst_mini_object_unref if (mini_object == 0xdeadbeef)
commands 1 2
bt
cont
end