[Free Style] Using Sanitizer to find bugs buried deep in your code
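We all know that the hardest part of a project schedule to predict is how quickly bugs get fixed; a bug that resists analysis can often hold up the whole project.
When we build programs in C/C++, we run into this kind of problem often. Typical symptoms are:
• The problem is hard to reproduce.
• It only shows up after days of running under heavy load.
• It occurs under high concurrency, and the state left behind at the point of failure is not enough to locate the cause.
• ...
After analyzing and summarizing many such cases, we found that the root causes of these hard problems cluster around out-of-bounds memory access, double free, memory leaks, and data races. The company also mandates checkers such as PC-lint and CodeX, but these tools are largely limited to rule-based static checking and cannot find some bugs that only appear at run time. Sanitizer, by contrast, checks the program while it runs and does not suffer from false positives. It is integrated directly into the toolchain and needs no source changes, which makes it both convenient and effective.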
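2、Introduction to Sanitizer
Sanitizer is an open-source project from Google that contains four components:
• AddressSanitizer (ASan)
• ThreadSanitizer (TSan)
• LeakSanitizer
• MemorySanitizer
Sanitizer can detect many kinds of problems at run time.
ASan mainly checks for: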
• Use after free (dangling pointer dereference)
TSan mainly checks for:
• Destruction of a locked mutex
• Signal-unsafe malloc/free calls in signal handlers
• Potential deadlocks (lock order inversions)
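3、How to use ASan
Sanitizer is already integrated into gcc, clang, and golang. Using ASan or TSan requires no extra tools: you only add a compile option to gcc or g++ and link one additional library.
Here we assume the gcc compiler and show how to use ASan.
• Sanitizer's principle is that whatever it reports is a real problem; there are no false positives. For memory leaks, for example, a block that the program obtained from malloc, has not freed by the time the process exits, and can no longer reach through any pointer is reported as a leak.
• Sanitizer's detection happens at run time. We only need to link the ASan library into our executable and add the compile option; every run of the binary is then checked, and any error found is written to stderr. Some kinds of checks can only report when the process ends, such as memory leak detection; others are reported at the moment they occur, such as an out-of-bounds write.
Add -fsanitize=address to both the compile and the link options of your code. In the environment used here (gcc 5.4 or later), the gold linker was also needed, selected by adding -fuse-ld=gold to the link options. Finally, link the ASan runtime with -lasan. When -fsanitize=address is passed to the gcc driver at link time, the runtime is normally linked automatically, so the explicit -lasan mainly matters when the link step does not see that option.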
use after free example
Code: use_after_free.c
#include <stddef.h>
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char **argv) {

    int *value = malloc(sizeof(int));
    if (NULL == value)
    {
        printf("malloc failed\n");
        exit(-1);
    }

    free(value);
    return *value; // use after free. BOOM!!!!
}
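Compile
gcc -g -o use_after_free -fsanitize=address -lasan -fuse-ld=gold use_after_free.c
Run
# The ASan runtime is linked here as a shared library, so it has to be preloaded.
LD_PRELOAD=${LD_PRELOAD}:/usr/lib/gcc/x86_64-linux-gnu/5/libasan.so ./use_after_free
The output tells you explicitly:
• where the freed memory was accessed
• where the memory was freed
• where the memory was allocated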
=================================================================
==68368==ERROR: AddressSanitizer: heap-use-after-free on address 0x60200000eff0 at pc 0x0000004009b6 bp 0x7fff4dc06190 sp 0x7fff4dc06188
READ of size 4 at 0x60200000eff0 thread T0 // the freed memory is accessed here
#0 0x4009b5 in main /home/zyn/store1/san_example/use_after_free.c:15
#1 0x7fd93b1b1f44 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21f44)
#2 0x400868 (/home/zyn/store1/san_example/use_after_free+0x400868)
0x60200000eff0 is located 0 bytes inside of 4-byte region [0x60200000eff0,0x60200000eff4)
freed by thread T0 here: // where the memory was freed
#0 0x7fd93b7ef222 in __interceptor_free (/usr/lib/gcc/x86_64-linux-gnu/5/libasan.so+0x94222)
#1 0x40097e in main /home/zyn/store1/san_example/use_after_free.c:14
#2 0x7fd93b1b1f44 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21f44)
previously allocated by thread T0 here: // where the memory was allocated
#0 0x7fd93b7ef4fa in malloc (/usr/lib/gcc/x86_64-linux-gnu/5/libasan.so+0x944fa)
#1 0x40094e in main /home/zyn/store1/san_example/use_after_free.c:7
#2 0x7fd93b1b1f44 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21f44)
SUMMARY: AddressSanitizer: heap-use-after-free /home/zyn/store1/san_example/use_after_free.c:15 main
Shadow bytes around the buggy address:
0x0c047fff9da0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff9db0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff9dc0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff9dd0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff9de0: fa fa 00 00 fa fa 00 00 fa fa 00 00 fa fa 00 00
=>0x0c047fff9df0: fa fa 00 00 fa fa 00 00 fa fa 00 00 fa fa[fd]fa
0x0c047fff9e00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff9e10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff9e20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff9e30: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff9e40: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Heap right redzone: fb
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack partial redzone: f4
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
==68368==ABORTING
heap-buffer-overflow example
Code: heap_buffer_overflow.c
#include <stddef.h>
#include <stdlib.h>
#include <stdio.h>

int main(int argc, char **argv) {

    char *value = malloc(sizeof(char));
    if (NULL == value)
    {
        printf("malloc failed\n");
        exit(-1);
    }
    int a = 100;
    *((int *)value) = a; // heap buffer overflow. BOOM
    return *value;
}
Compile
gcc -g -o heap_buffer_overflow -fsanitize=address -lasan -fuse-ld=gold heap_buffer_overflow.c
Run
LD_PRELOAD=${LD_PRELOAD}:/usr/lib/gcc/x86_64-linux-gnu/5/libasan.so ./heap_buffer_overflow
The output tells you:
• where the out-of-bounds access happened
• where the memory was allocated
=================================================================
==71705==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60200000eff0 at pc 0x000000400971 bp 0x7ffe0576adc0 sp 0x7ffe0576adb8
WRITE of size 4 at 0x60200000eff0 thread T0 // the out-of-bounds write happens here
#0 0x400970 in main /home/zyn/store1/san_example/heap_buffer_overflow.c:14
#1 0x7f563fcf9f44 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21f44)
#2 0x400828 (/home/zyn/store1/san_example/heap_buffer_overflow+0x400828)
0x60200000eff1 is located 0 bytes to the right of 1-byte region [0x60200000eff0,0x60200000eff1)
allocated by thread T0 here:
#0 0x7f56403374fa in malloc (/usr/lib/gcc/x86_64-linux-gnu/5/libasan.so+0x944fa)
#1 0x40090e in main /home/zyn/store1/san_example/heap_buffer_overflow.c:7
#2 0x7f563fcf9f44 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21f44)
SUMMARY: AddressSanitizer: heap-buffer-overflow /home/zyn/store1/san_example/heap_buffer_overflow.c:14 main
Shadow bytes around the buggy address:
0x0c047fff9da0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff9db0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff9dc0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff9dd0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff9de0: fa fa 00 00 fa fa 00 00 fa fa 00 00 fa fa 00 00
=>0x0c047fff9df0: fa fa 00 00 fa fa 00 00 fa fa 00 00 fa fa[01]fa
0x0c047fff9e00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff9e10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff9e20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff9e30: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
0x0c047fff9e40: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Heap right redzone: fb
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack partial redzone: f4
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
==71705==ABORTING
memory leak example
Code: memory_leak.c
#include <stdlib.h>

void *p;

int main() {
    p = malloc(7);
    p = 0; // The memory is leaked here.
    return 0;
}
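Compile with the same options as in the previous examples, then run
LD_PRELOAD=${LD_PRELOAD}:/usr/lib/gcc/x86_64-linux-gnu/5/libasan.so ./memory_leak
The output tells you where the leaked memory was allocated.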
=================================================================
==7829==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 7 byte(s) in 1 object(s) allocated from:
#0 0x42c0c5 in __interceptor_malloc /usr/home/hacker/llvm/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:74
#1 0x43ef81 in main /usr/home/hacker/memory-leak.c:6
#2 0x7fef044b876c in __libc_start_main /build/buildd/eglibc-2.15/csu/libc-start.c:226
SUMMARY: AddressSanitizer: 7 byte(s) leaked in 1 allocation(s).
ASan can detect many other kinds of memory errors that are not listed one by one here; for details see https://github.com/google/sanitizers/wiki/AddressSanitizer
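4、How to use TSan
TSan mainly checks for concurrency problems. As mentioned above, it can detect many kinds of them, but in my view the Normal data race is the mistake we make most easily when coding, and the bugs it causes are very hard to analyze and verify. A brief description of a Normal data race: if several threads access the same memory location without the protection of a lock or other synchronization, and at least one of those accesses is a write, TSan reports a Normal data race.
Note: TSan is platform-independent. It assumes that even a plain assignment is not atomic unless it carries atomic semantics supported by the compiler. For example, with int a; a = 100, TSan does not treat the assignment as atomic (x86 hardware can guarantee that a four-byte-aligned store to memory is atomic, but the programmer cannot guarantee that the compiler always turns an assignment into a single four-byte-aligned instruction). An access with platform-supported atomic semantics, such as int a; __atomic_load_n(&a, __ATOMIC_SEQ_CST), is treated as an atomic operation.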
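Some data race bugs are extremely hard to fix. First, because the problem comes from concurrency, simple test cases rarely expose it. Second, it occurs with very low probability, and even when it does occur, the observed symptoms rarely lead back to the offending code; when a data race combines with other kinds of bugs, the program can behave in all sorts of bizarre ways from which the root cause cannot be worked out backwards. Third, concurrent programming is difficult and hard to describe in words, and our coding standards set no explicit requirements for it, so data race bugs are easily introduced. For these reasons, using a tool to check whether the code hides data races is well worth it.
Below is a simple example of a Normal data race.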
Normal data race example
Code: data_race.c
#include <pthread.h>

int Global;

void *Thread1(void *x)
{
    Global = 42; // access the shared memory
    return NULL;
}

void *Thread2(void *x)
{
    Global = 43; // access the shared memory
    return NULL;
}

int main()
{
    pthread_t t[2];
    pthread_create(&t[0], NULL, Thread1, NULL);
    pthread_create(&t[1], NULL, Thread2, NULL);
    pthread_join(t[0], NULL);
    pthread_join(t[1], NULL);
    return 0;
}
Compile and run
gcc -g -o data_race -fsanitize=thread -ltsan -fuse-ld=gold data_race.c
./data_race
Output:
```
WARNING: ThreadSanitizer: data race (pid=63191)
  Write of size 4 at 0x00000040205c by thread T2: // thread T2 accessed the contended location
    #0 Thread2 /home/zyn/store1/san_example/data_race.c:13 (data_race+0x0000004008c9) // Global = 43; // access the shared memory
    #1 (libtsan.so.0+0x000000025e9b)

  Previous write of size 4 at 0x00000040205c by thread T1: // thread T1 accessed the contended location
    #0 Thread1 /home/zyn/store1/san_example/data_race.c:7 (data_race+0x000000400888) // Global = 42; // access the shared memory
    #1 (libtsan.so.0+0x000000025e9b)

  Location is global 'Global' of size 4 at 0x00000040205c (data_race+0x00000040205c)

  Thread T2 (tid=63194, running) created by main thread at: // call stack that created T2
    #0 pthread_create (libtsan.so.0+0x000000029163)
    #1 main /home/zyn/store1/san_example/data_race.c:21 (data_race+0x000000400936)

  Thread T1 (tid=63193, finished) created by main thread at: // call stack that created T1
    #0 pthread_create (libtsan.so.0+0x000000029163)
    #1 main /home/zyn/store1/san_example/data_race.c:20 (data_race+0x000000400917)

SUMMARY: ThreadSanitizer: data race /home/zyn/store1/san_example/data_race.c:13 in Thread2
```
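5、Takeaways
Integrate Sanitizer checks into the project's CI to catch latent bugs early, keep the schedule under control, and improve the project's overall efficiency.
If the project is written in golang, a Sanitizer-like capability is available directly: add the -race flag when building and running Go code.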