How Program Analysis Is Done - Programming Research Laboratory

Airac/Mairac
Static Analyzers for Automatic Verification of
Buffer Overrun/Memory Leak
Errors in C Programs
Prof. Kwangkeun Yi (이광근)
ropas.snu.ac.kr/~kwang
Programming Research Lab
Seoul National University
11/1/2005
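Speaker Introduction
• 93: Ph.D., Univ. of Illinois at Urbana-Champaign
• 93-95: Researcher, Bell Labs, Software Principles Research Dept. (Murray Hill)
• 95-03: Assistant/Associate Professor, Dept. of Computer Science, KAIST
• 98-03: Director, [Research On Program Analysis System center], Ministry of Science and Technology
• 03-present: Associate Professor, School of Computer Science and Engineering, Seoul National University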
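Research On Program Analysis System Center
• 1998-2003: Director of the [Research On Program Analysis System center], designated under the Ministry of Science and Technology's Creative Research Initiatives program
• Goal: research on core technology for building and verifying defect-free software
Core Technology
Program Analysis Technology
Static Program Analysis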
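Program Analysis
• Program analysis (static program analysis) = a general method for automatically and safely estimating a program's run-time properties before it runs
– “before it runs”: before the program is executed
– “run-time properties”: properties of the program during execution
– “automatically”: a program analyzes programs
– “safely”: covers every situation that can actually occur
– “estimating”: some imprecision is unavoidable
– “general”: no limit on the languages or run-time properties handled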
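Defect-Free Software Needs Two Pillars
• Software development process
– a system for organizing and running the development team
– development tools that enforce systematic operation
– achieves only 50% of the goal: sw errors keep appearing
• Automatic verification technology for program errors
– automatic: software analyzes software
– verification: confirms that there are no errors
– maturity of the technology: ripe and flowing into industry
– will achieve the remaining 49% of the goal: defect-free sw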
Examples of Automatic Verification in Use (Abroad)
• Microsoft (since 2001)
– device driver sw verification: SLAM technology
– focus on developing safe, error-free sw: the basis of Bill Gates's recent speeches
• Unix/Linux kernel verification (since 2000)
– combination of model checking and static analysis
– the technology drawing the most attention in the os community
• AirBus (since 2002)
– static analysis applied to verifying aviation controller module sw
– static analysis adopted as a standard step of the AirBus sw development process
• Companies specializing in this technology have emerged:
– AbsInt, Astree, PolySpace technologies, Trusted Logic,
GrammaTech, Esterel Technologies, Galois Connections, etc.
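Examples of Automatic Verification in Use (Korea)
• Examples
– C (Samsung Electronics): verify whether any access falls outside the allocated memory region.
– C (Samsung Electronics): verify whether all allocated memory is released once it is no longer used.
– C (Ministry of Information and Communication): verify errors in embedded parallel software.
– Web programs (National Security Research Institute): verify whether the web source can be breached by known attack methods.
– and so on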
Contents
1. Introduction
– what
2. Performance
– for realistic sw’s
– strength and weakness
– in global competition
3. Discussion
Airac
Static Analyzer for Detecting All
Buffer Overrun Errors in C Programs
• “static”: no test runs
• “all”: no un-noticed overruns
• “C”: full ANSI C + (GNU C)
int *c = (int *)malloc(sizeof(int)*10);
c[i] = 1; c[i + f()] = 1; c[*k + (*g)()] = 1;
x = c+5; x[1] = 1;
z->a = c; (z->a)[i] = 1;
foo(c+2); int foo(int *d) {… d[i] = 1; …}
Airac: technology keywords
• static program analysis
– exhaustive: detects all buffer overruns
– sound: errs on the safe side when in doubt
– automatic: needs no help from C programmers
– always stops: even for non-terminating C programs
– modular: separate C files
– correct: based on a firm theoretical framework
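To make "sound: errs on the safe side" concrete, here is a minimal sketch, assuming the analyzer summarizes each array index as an integer interval. The names `Interval`, `check_access`, and `interval_add` are hypothetical and are not taken from Airac itself; overflow of the interval bounds is ignored in this sketch.

```c
/* Hypothetical sketch: an index is over-approximated by an interval [lo, hi]. */
typedef struct { long lo, hi; } Interval;

typedef enum { SAFE, ALARM } Verdict;

/* Sound decision: answer SAFE only if EVERY value in idx lies in [0, size-1].
 * Whenever the interval is too wide to prove that, raise an alarm. */
Verdict check_access(Interval idx, long size) {
    if (idx.lo >= 0 && idx.hi < size) return SAFE;
    return ALARM;
}

/* Interval addition, e.g. for an index expression such as i + f().
 * Assumption: bounds are small enough that the sums do not overflow. */
Interval interval_add(Interval a, Interval b) {
    Interval r = { a.lo + b.lo, a.hi + b.hi };
    return r;
}
```

With a buffer of 10 elements, an index known to be in [0, 9] is reported safe, while [0, 10] raises an alarm even if the value 10 never occurs at run time; that over-reporting is the price of never missing a real overrun.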
Airac: internals (1/2)
x1 = F1(x1,…,xN)
x2 = F2(x1,…,xN)
…
xN = FN(x1,…,xN)
C files
equation solver
C’ files
bug identification
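The equation system x1 = F1(x1,…,xN), …, xN = FN(x1,…,xN) can be solved by iterating from the empty solution until nothing changes. The sketch below is only an illustration under stated assumptions: it uses a tiny finite domain (bit k of a `Set` means "i may equal k") and four hand-written equations for the loop `i = 0; while (i < 3) i++;`. The names `Set`, `F`, and `solve` are hypothetical; Airac's real domains and solver are not shown in the slides.

```c
#include <stdbool.h>
#include <stdint.h>

#define N 4
typedef uint32_t Set;   /* bit k set means "i may equal k" at that program point */

/* Right-hand sides F_k of the equations x_k = F_k(x_0, ..., x_{N-1}). */
static Set F(int k, const Set x[N]) {
    switch (k) {
    case 0:  return 1u;                    /* entry: i = 0 */
    case 1:  return x[0] | (x[2] << 1);    /* loop head: from entry, or after i++ */
    case 2:  return x[1] & 0x7u;           /* loop body: guard i < 3 holds */
    default: return x[1] & ~0x7u;          /* loop exit: guard i < 3 fails */
    }
}

/* Iterate from the least element (all sets empty) until no unknown changes.
 * Termination here relies on the domain being finite and each F_k monotone. */
void solve(Set x[N]) {
    for (int k = 0; k < N; k++) x[k] = 0;
    bool changed = true;
    while (changed) {
        changed = false;
        for (int k = 0; k < N; k++) {
            Set next = x[k] | F(k, x);
            if (next != x[k]) { x[k] = next; changed = true; }
        }
    }
}
```

After `solve`, the loop-head unknown holds {0,1,2,3} and the exit unknown holds {3}; a bug-identification pass would then inspect such solved values at each buffer access.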
Airac: internals (2/2)
• sound design by abstract interpretation
• accuracy improvement by
– narrowing, flow-sensitivity, context pruning,
static inlining(bounded polyvariance), static
loop unrolling
• cost reduction by
– widening, economic join/partial-order ops
– careful worklist order: lazy at join points
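Widening and narrowing can be illustrated on the interval domain. The following is a minimal sketch of the textbook operators, not Airac's implementation: it analyzes the loop head of `i = 0; while (i < bound) i++;`, first jumping unstable bounds to infinity (widening, for cost) and then recovering precision (narrowing). All names (`Itv`, `widen`, `narrow`, `loop_head`, `analyze_loop`) are hypothetical, and `LONG_MIN`/`LONG_MAX` stand in for the infinities.

```c
#include <limits.h>
#include <stdbool.h>

#define NEG_INF LONG_MIN
#define POS_INF LONG_MAX

typedef struct { bool empty; long lo, hi; } Itv;

static Itv join(Itv a, Itv b) {
    if (a.empty) return b;
    if (b.empty) return a;
    Itv r = { false, a.lo < b.lo ? a.lo : b.lo, a.hi > b.hi ? a.hi : b.hi };
    return r;
}

static bool equal(Itv a, Itv b) {
    if (a.empty || b.empty) return a.empty == b.empty;
    return a.lo == b.lo && a.hi == b.hi;
}

/* Widening: a bound that is still growing jumps straight to infinity. */
static Itv widen(Itv old, Itv next) {
    if (old.empty) return next;
    if (next.empty) return old;
    Itv r = { false,
              next.lo < old.lo ? NEG_INF : old.lo,
              next.hi > old.hi ? POS_INF : old.hi };
    return r;
}

/* Narrowing: only infinite bounds may be refined by the new value. */
static Itv narrow(Itv old, Itv next) {
    if (old.empty || next.empty) return old;
    Itv r = { false,
              old.lo == NEG_INF ? next.lo : old.lo,
              old.hi == POS_INF ? next.hi : old.hi };
    return r;
}

/* Equation for the loop head: X = [0,0] join ((X meet [-inf, bound-1]) + 1). */
static Itv loop_head(Itv x, long bound) {
    Itv init = { false, 0, 0 };
    if (x.empty || x.lo > bound - 1) return init;
    long hi = x.hi < bound - 1 ? x.hi : bound - 1;          /* guard i < bound */
    Itv body = { false, x.lo == NEG_INF ? NEG_INF : x.lo + 1, hi + 1 };
    return join(init, body);
}

Itv analyze_loop(long bound) {
    Itv x = { true, 0, 0 };
    for (;;) {                                   /* ascending phase with widening */
        Itv next = widen(x, loop_head(x, bound));
        if (equal(next, x)) break;
        x = next;
    }
    for (;;) {                                   /* descending phase with narrowing */
        Itv next = narrow(x, loop_head(x, bound));
        if (equal(next, x)) break;
        x = next;
    }
    return x;
}
```

For `bound = 100`, widening reaches [0, +inf] after two steps instead of a hundred, and one narrowing step brings the result back to [0, 100].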
Airac’s Performance
Warnings About Performance
• Assumes typeful C programs
– array sizes remain the same as declared
• Artificial semantics after errors
• No side effects for library functions
• If there is no main():
– procedures are analyzed in the order they are defined
• No alarms about buffers whose size is unknown
• Worst values for free variables
Airac: performance (1/3)
Linux kernel 2.6.4 (3.2GHz P4, 4GB RAM)
File                 Alarms   Real Errors   LOC     Time (sec)
vmax301.c (79)       1        1             246     3
xfrm_user.c (235)    2        1             1,201   109
usb-midi.c (332)     10       2             2,206   3617
atkbd.c (332)        5        2             811     285
keyboard.c (411)     2        1             1,256   9
af_inet.c (48)       1        1             1,273   79
eata_pio.c (183)     3        1             984     8
cdc-acm.c (468)      5        3             849     119
ip6_output.c (198)   0        0             1,110   45
mptbase.c (777)      2        1             6,158   8251
aty128fb.c (98)      2        1             2,466   3671
Airac: performance (2/3)
GNU Software
Program               Alarms   Real Errors   LOC      Time (sec)
tar-1.13 (2,630)      66       1             20,258   577
bison-1.875 (5,164)   50       0             15,907   809
sed-4.0.8 (461)       29       0             6,053    1154
gzip-1.2.4a (799)     17       0             7,327    794
grep-2.5.1 (187)      2        0             9,297    604
Airac: performance (3/3)
(commercial software)
X Software   Alarms   Real Errors   LOC         Time (min)
A1           18       9             280,379     8
A2           196      56            3,584,664   789
A3           78       15            119,211     82
A4           435      7             806,829     112
A5           197      112           517,314     8
Airac: scalability
Airac vs Swat (1/3)
Linux kernel 2.6.4
File                                SWAT (Stanford/Coverity)   AIRAC (SNU)
                                    Found Errors               Found Errors
/drivers/mtd/maps/vmax301.c         1                          1
/net/xfrm/xfrm_user.c               1                          1
/drivers/usb/class/usb-midi.c       2                          2
/drivers/input/keyboard/atkbd.c     2                          2
/drivers/char/keyboard.c            1                          1
/net/ipv4/af_inet.c                 1                          1
/drivers/scsi/eata_pio.c            1                          1
/drivers/usb/class/cdc-acm.c (*)    1                          3
/net/ipv6/ip6_output.c (**)         1                          0
/drivers/message/fusion/mptbase.c   1                          1
/drivers/video/aty/aty128fb.c       1                          1
Airac vs Swat (2/3)
[Figure: diagram relating Airac, Bugs, and Coverity]
Airac vs Swat (3/3)
Category                SWAT (Coverity/Stanford)                AIRAC (SNU)
Error detection power   62% detection rate (8/13)               100% detection rate (13/13)
Results on A            #Alarms: 19 buffers                     #Alarms: 78
                        #Real Errors: 2 buffers                 #Real Errors: 15 accesses (5 buffers)
                        #False Alarms: 17 buffers               #False Alarms: 63 accesses (18 buffers)
                        Time: 7 min                             Time: 82 min
Results on B            #Alarms: 2 buffers                      #Alarms: 18
                        #Real Errors: 2 buffers                 #Real Errors: 9 accesses (2 buffers)
                        #False Alarms: 0 buffers                #False Alarms: 9 accesses (6 buffers)
                        Time: 4 min                             Time: 8 min
Remark                  Error reporting basis: buffer (array)   Error reporting basis: every access to a buffer
Taming False Alarms
• For each alarm from Airac, compute its true-alarm probability
– conditional probability given its symptoms
• Sift out “probably false” alarms
– threshold by user-provided risk ratio
• Report “probably true” alarms first
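The sift-then-rank steps can be sketched as follows. This is a hypothetical illustration, not the post-analysis actually used with Airac: it assumes each alarm already carries an estimated true-alarm probability `p_true`, and it assumes one particular decision rule (keep an alarm when the expected risk of silencing it is at least the expected risk of reporting it). The names `Alarm`, `risk_miss`, `risk_false`, and `sift_and_rank` are all made up for this example.

```c
#include <stdlib.h>

typedef struct {
    int id;          /* hypothetical alarm identifier */
    double p_true;   /* assumed: estimated P(true alarm | symptoms) */
} Alarm;

/* Comparator for qsort: higher probability first. */
static int by_p_desc(const void *a, const void *b) {
    double pa = ((const Alarm *)a)->p_true;
    double pb = ((const Alarm *)b)->p_true;
    return (pa < pb) - (pa > pb);
}

/* Keeps the alarms judged "probably true" at the front of the array, sorted
 * most probable first, and returns how many were kept.
 * Assumed rule: keep when p * risk_miss >= (1 - p) * risk_false. */
size_t sift_and_rank(Alarm *alarms, size_t n, double risk_miss, double risk_false) {
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        double p = alarms[i].p_true;
        if (p * risk_miss >= (1.0 - p) * risk_false)
            alarms[kept++] = alarms[i];
    }
    qsort(alarms, kept, sizeof alarms[0], by_p_desc);
    return kept;
}
```

Under this rule, raising `risk_miss` relative to `risk_false` lowers the probability threshold, so fewer true alarms are swept out at the cost of showing more false ones.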
Sifting Out False Alarms
• evaluated on parts of the Linux kernel
• half of the alarms, chosen at random, are used for training
• :-) 74.84% of false alarms filtered out when Rs = 3 x Rr
• :-| 31.40% of true alarms swept out
Ranking False Alarms
• The user sees “truer” alarms first
• 15.17% of false alarms were mixed in by the time the user has seen 50% of the true alarms
Airac: in global competition
• one of the few real-world static analyzers supporting full ANSI C
• vs. world-class powers in static analysis:
– Coverity (USA): not sound, ad hoc. Beaten by Airac.
– Polyspace (France): comparable, sound, cost, assumption
– all in the static analysis research community:
• I know what they can do.
• If I didn’t know them, they would be people with either shallow technology or the “disruptive technology”
Mairac
Static Analyzer for Detecting All
Memory Leak Errors in C Programs
• Mairac = Airac + malloc/free-analysis
• Soundly approximate all possible
execution flows with pointer values
• Check if all malloc’ed addresses are
freed
Mairac
• option 1: free-before-end
– check if malloc’ed addresses are freed
before the end of the program
• option 2: free-before-return
– check if malloc’ed addresses are freed
before the return of a procedure
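The difference between the two options shows up when an allocation deliberately outlives the procedure that made it. The sketch below uses hypothetical function names and describes what one would expect each option to report; it is not Mairac output.

```c
#include <stdlib.h>

static char *cache;   /* intentionally outlives fill_cache() */

/* Expected to pass free-before-end (the block is released by release_cache()
 * before the program ends) but to be flagged by free-before-return, because
 * the block is still allocated when fill_cache() returns. */
int fill_cache(void) {
    cache = malloc(16);
    return cache != NULL;
}

void release_cache(void) {
    free(cache);
    cache = NULL;
}

/* Expected to pass both options: the block is freed before every return
 * that follows a successful allocation. */
int scratch_sum(int n) {
    int *tmp = malloc(sizeof *tmp * (size_t)n);
    if (tmp == NULL) return -1;
    int sum = 0;
    for (int i = 0; i < n; i++) { tmp[i] = i; sum += tmp[i]; }
    free(tmp);
    return sum;
}
```

So free-before-return is the stricter check: it suits code where each procedure is meant to clean up after itself, while free-before-end tolerates caches and other long-lived data.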
Mairac’s Limitation
• one abstract location/“malloc(size)”
– no information about the structure of heap
data
...
• all locations of ptr/“free(ptr)”
– our design choice
– soundness violation in principle yet rare in
practice
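A linked list shows why these two choices matter. In the sketch below (hypothetical names, written for illustration only), every node comes from the same `malloc` call, so an analysis with one abstract location per allocation site sees all nodes as a single location. If `free(ptr)` is then treated as freeing every location `ptr` may denote, `free_first_only` would presumably look leak-free even though all nodes after the first are lost.

```c
#include <stdlib.h>

typedef struct Node { struct Node *next; int v; } Node;

/* Every node is allocated at the single site marked S, so an analysis with
 * one abstract location per malloc site cannot tell the nodes apart. */
Node *build(int n) {
    Node *head = NULL;
    for (int i = 0; i < n; i++) {
        Node *node = malloc(sizeof *node);   /* site S */
        if (node == NULL) break;
        node->v = i;
        node->next = head;
        head = node;
    }
    return head;
}

int length(const Node *p) {
    int len = 0;
    for (; p != NULL; p = p->next) len++;
    return len;
}

/* Leaky on purpose: releases only the first node, the rest become unreachable. */
void free_first_only(Node *head) {
    free(head);
}

/* Correct release: walks the list and frees each node. */
void free_all(Node *head) {
    while (head != NULL) {
        Node *next = head->next;
        free(head);
        head = next;
    }
}
```

Distinguishing `free_first_only` from `free_all` needs information about the shape of the heap, which is exactly what the slide says is not tracked.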
Mairac Understands
Interprocedural Pointer Aliasing
void pointer(char **p, char* s){ *p = s; }
int ResourceLeak_TC03 (int arg1)
{
    char str[10] ="STRING";
    char *p1, *p2;
    p1 = (char *)malloc(sizeof(char)*10);
    if( p1 == NULL) return 1;
    strcat(p1,str);
    pointer(&p2,p1);
    free(p2);
    return 0;
}
// both Mairac and Prevent conclude OK
Mairac Understands
Pointer Arithmetic
int pointer_arithmetic(int arg1)
{
    char *buf1, *buf2, *p;
    int i;
    buf1 = malloc(10);
    p = buf2 = malloc(10);
    for(i=0;i<10;i++){
        buf1++; buf2++;
    }
    free(buf1);
    free(p);
}
// Prevent doesn’t alarm
Mairac Understands Paths
int unclear_condition(int cond)
{
    char *buf1, *buf2;
    int i;
    if (cond)               // cond != 0
        buf1 = alloc(10);
    else                    // cond == 0
        buf2 = alloc(10);
    cond = cond + 10;
    if (cond) free(buf1);
    else free(buf2);
}
Mairac’s Performance (1/2)
small test cases
9 files
360 LOC
* Pointer
Arithmetic
Mairac
3
True
False
4
2
Prevent
* No heap data
structure analysis
Mairac’s Performance (2/2)
real commercial sw
102 files
72,293 LOC
Pentium 2.8 3GB
Mairac
• duplicate alarms
Prevent
• library functions
about 1hr
True
False
72
20
23
22
337
8
Goal: false alarm ratio · 50%
Technology Keywords
• static analysis: abstract interpretation
• fully automatic
• always terminates
• detecting all targeted bugs
• false alarms
• “software MRI”
QnA
Thank you.
ropas.snu.ac.kr