How to Use the Linux GNU GCC Profiling Tool

Posted by Tiamo_T on 2022/07/06 16:05:38

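Profiling is an important part of software development. It identifies the parts of a program that consume the most time and are candidates for rewriting, which helps make the program run faster.
In very large projects, profiling not only pinpoints the parts of the program that run slower than expected, it also yields other statistics that can help uncover and fix potential bugs.

In this article, we will explore the GNU profiling tool 'gprof'.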

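How to use gprof

Using gprof is not complicated. At a high level, you need to:

  • Enable profiling when compiling the code
  • Run the program to generate the profiling data
  • Run the gprof tool on the profiling data file (generated in the step above)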

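The last step produces a human-readable analysis file. Along with some other information, this file contains two tables: the flat profile and the call graph. The flat profile gives an overview of the timing of each function, such as how much time was spent executing it and how many times it was called. The call graph, on the other hand, focuses on the relationships between functions: which functions called a particular function and which functions it called in turn. This also gives an idea of how much execution time was spent in subroutines.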

Let's work through the three steps above with a practical example. The following test code is used throughout the article:

//test_gprof.c
#include<stdio.h>

void new_func1(void);

void func1(void)
{
    printf("\n Inside func1 \n");
    /* unsigned: the loop limit does not fit in a 32-bit int */
    unsigned int i = 0;

    for(;i<0xffffffff;i++);   /* busy loop, only burns CPU time */
    new_func1();

    return;
}

/* static on purpose: used later to show the -a option */
static void func2(void)
{
    printf("\n Inside func2 \n");
    unsigned int i = 0;

    for(;i<0xffffffaa;i++);
    return;
}

int main(void)
{
    printf("\n Inside main()\n");
    int i = 0;

    for(;i<0xffffff;i++);
    func1();
    func2();

    return 0;
}

//test_gprof_new.c
#include<stdio.h>

void new_func1(void)
{
    printf("\n Inside new_func1()\n");
    unsigned int i = 0;

    for(;i<0xffffffee;i++);

    return;
}

Note that the 'for' loops inside the functions are there only to consume execution time.

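Step 1: Enable profiling at compile time

In this first step, we need to make sure profiling is enabled when the code is compiled. This is done by adding the '-pg' option to the compile step.

From the gcc man page:

-pg : Generate extra code to write profile information suitable for the analysis program gprof. You must use this option when compiling the source files you want data about, and you must also use it when linking.

So, let's compile our code with the '-pg' option: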

$ gcc -Wall -pg test_gprof.c test_gprof_new.c -o test_gprof
$

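Note: the '-pg' option can be used with a gcc command that only compiles (the -c option), with a gcc command that only links (object files with the -o option), and with a gcc command that does both (as in the example above).

Step 2: Execute the code

In the second step, run the binary produced in step 1 so that the profiling data is generated.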

$ ls
test_gprof  test_gprof.c  test_gprof_new.c

$ ./test_gprof 

 Inside main()

 Inside func1 

 Inside new_func1()

 Inside func2 

$ ls
gmon.out  test_gprof  test_gprof.c  test_gprof_new.c

$

So when the binary is executed, a new file, 'gmon.out', is generated in the current working directory.

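Note that if the program changes its current working directory while running (using chdir), gmon.out is produced in the new current working directory. Also, the program needs sufficient permissions to create gmon.out in that directory.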

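Step 3: Run the gprof tool

In this step, gprof is run with the executable name and the 'gmon.out' generated above as arguments. This produces the analysis output containing all the profiling information.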

$ gprof test_gprof gmon.out > analysis.txt

Note that the output can be redirected to a file (as in the example above); otherwise gprof writes it to standard output.

$ ls
analysis.txt gmon.out test_gprof test_gprof.c test_gprof_new.c

So a file named 'analysis.txt' has been generated.


Understanding the profiling information

As described above, all the profiling information is now in 'analysis.txt'. Let's take a look at this text file:

Flat profile:

Each sample counts as 0.01 seconds.
%    cumulative self          self   total
time seconds    seconds calls s/call s/call name
33.86 15.52     15.52    1    15.52  15.52  func2
33.82 31.02     15.50    1    15.50  15.50  new_func1
33.29 46.27     15.26    1    15.26  30.75  func1
0.07  46.30     0.03                        main

 %         the percentage of the total running time of the
time       program used by this function.

cumulative a running sum of the number of seconds accounted
 seconds   for by this function and those listed above it.

 self      the number of seconds accounted for by this
seconds    function alone. This is the major sort for this
           listing.

calls      the number of times this function was invoked, if
           this function is profiled, else blank.

 self      the average number of milliseconds spent in this
ms/call    function per call, if this function is profiled,
           else blank.

 total     the average number of milliseconds spent in this
ms/call    function and its descendents per call, if this
           function is profiled, else blank.

name       the name of the function. This is the minor sort
           for this listing. The index shows the location of
           the function in the gprof listing. If the index is
           in parenthesis it shows where it would appear in
           the gprof listing if it were to be printed.

Call graph (explanation follows)

granularity: each sample hit covers 2 byte(s) for 0.02% of 46.30 seconds

index % time self children called name

[1]   100.0  0.03  46.27          main [1]
             15.26 15.50    1/1      func1 [2]
             15.52 0.00     1/1      func2 [3]
-----------------------------------------------
             15.26 15.50    1/1      main [1]
[2]   66.4   15.26 15.50    1     func1 [2]
             15.50 0.00     1/1      new_func1 [4]
-----------------------------------------------
             15.52 0.00     1/1      main [1]
[3]   33.5   15.52 0.00     1     func2 [3]
-----------------------------------------------
             15.50 0.00     1/1      func1 [2]
[4] 33.5     15.50 0.00     1     new_func1 [4]
-----------------------------------------------

This table describes the call tree of the program, and was sorted by
the total amount of time spent in each function and its children.

Each entry in this table consists of several lines. The line with the
index number at the left hand margin lists the current function.
The lines above it list the functions that called this function,
and the lines below it list the functions this one called.
This line lists:
index A unique number given to each element of the table.
Index numbers are sorted numerically.
The index number is printed next to every function name so
it is easier to look up where the function in the table.

% time This is the percentage of the `total' time that was spent
in this function and its children. Note that due to
different viewpoints, functions excluded by options, etc,
these numbers will NOT add up to 100%.

self This is the total amount of time spent in this function.

children This is the total amount of time propagated into this
function by its children.

called This is the number of times the function was called.
If the function called itself recursively, the number
only includes non-recursive calls, and is followed by
a `+' and the number of recursive calls.

name The name of the current function. The index number is
printed after it. If the function is a member of a
cycle, the cycle number is printed between the
function's name and the index number.

For the function's parents, the fields have the following meanings:

self This is the amount of time that was propagated directly
from the function into this parent.

children This is the amount of time that was propagated from
the function's children into this parent.

called This is the number of times this parent called the
function `/' the total number of times the function
was called. Recursive calls to the function are not
included in the number after the `/'.

name This is the name of the parent. The parent's index
number is printed after it. If the parent is a
member of a cycle, the cycle number is printed between
the name and the index number.

If the parents of the function cannot be determined, the word
`<spontaneous>' is printed in the `name' field, and all the other
fields are blank.

For the function's children, the fields have the following meanings:

self This is the amount of time that was propagated directly
from the child into the function.

children This is the amount of time that was propagated from the
child's children to the function.

called This is the number of times the function called
this child `/' the total number of times the child
was called. Recursive calls by the child are not
listed in the number after the `/'.

name This is the name of the child. The child's index
number is printed after it. If the child is a
member of a cycle, the cycle number is printed
between the name and the index number.

If there are any cycles (circles) in the call graph, there is an
entry for the cycle-as-a-whole. This entry shows who called the
cycle (as parents) and the members of the cycle (as children.)
The `+' recursive calls entry shows the number of function calls that
were internal to the cycle, and the calls entry for each member shows,
for that member, how many times it was called from other members of
the cycle.

Index by function name

[2] func1 [1] main
[3] func2 [4] new_func1

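So (as already discussed) this file is broadly divided into two parts:

1. Flat profile
2. Call graph

The individual columns of both the flat profile and the call graph are explained well in the output itself.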

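Customizing gprof output with flags

Several flags are available to customize gprof's output. Some of them are discussed below:

1. Suppress printing of statically (privately) declared functions using -a

If there are static functions whose profiling information you do not need, use the -a option: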

$ gprof -a test_gprof gmon.out > analysis.txt

Now, if we look at the analysis file:

Flat profile:

Each sample counts as 0.01 seconds.
%        cumulative self           self    total
time  seconds       seconds calls  s/call  s/call  name
67.15 30.77         30.77     2    15.39  23.14    func1
33.82 46.27         15.50     1    15.50  15.50    new_func1
0.07   46.30         0.03                          main

...
...
...

Call graph (explanation follows)

granularity: each sample hit covers 2 byte(s) for 0.02% of 46.30 seconds

index   %time        self  children  called  name

[1]     100.0        0.03   46.27             main [1]
                     30.77  15.50     2/2      func1 [2]
-----------------------------------------------------
                     30.77  15.50     2/2      main [1]
[2]     99.9         30.77  15.50     2      func1 [2]
                     15.50   0.00     1/1      new_func1 [3]
----------------------------------------------------
                     15.50   0.00     1/1      func1 [2]
[3]        33.5      15.50 0.00       1      new_func1 [3]
-----------------------------------------------

...
...
...

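So there is no separate entry for func2 (which is defined as static). Its time and its call are not dropped, though: they are folded into func1, which now shows 2 calls and 30.77 seconds.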

2. Suppress verbose blurbs using -b

As you have seen, gprof's output includes lengthy explanatory text. If you do not need it, use the -b flag:

$ gprof -b test_gprof gmon.out > analysis.txt

Now if we look at the analysis file:

Flat profile:

Each sample counts as 0.01 seconds.
%       cumulative    self            self    total
time    seconds       seconds  calls  s/call  s/call   name
33.86 15.52            15.52      1    15.52  15.52    func2
33.82 31.02            15.50      1    15.50  15.50    new_func1
33.29 46.27            15.26      1    15.26  30.75    func1
0.07   46.30            0.03                           main

Call graph

granularity: each sample hit covers 2 byte(s) for 0.02% of 46.30 seconds
index % time self children called name

[1]   100.0  0.03  46.27          main [1]
             15.26 15.50    1/1      func1 [2]
             15.52 0.00     1/1      func2 [3]
-----------------------------------------------
             15.26 15.50    1/1      main [1]
[2]   66.4   15.26 15.50    1     func1 [2]
             15.50 0.00     1/1      new_func1 [4]
-----------------------------------------------
             15.52 0.00     1/1      main [1]
[3]   33.5   15.52 0.00     1     func2 [3]
-----------------------------------------------
             15.50 0.00     1/1      func1 [2]
[4] 33.5     15.50 0.00     1     new_func1 [4]
-----------------------------------------------
Index by function name

[2] func1 [1] main
[3] func2 [4] new_func1

So the explanatory text is no longer present in the analysis file.

3. Print only the flat profile using -p

If only the flat profile is needed:

$ gprof -p -b test_gprof gmon.out > analysis.txt

Note that I have used (and will keep using) the -b option to keep the explanatory text out of the output.

Now, if we look at the analysis output:

Flat profile:

Each sample counts as 0.01 seconds.
%       cumulative    self            self   total
time    seconds       seconds  calls  s/call  s/call  name
33.86   15.52          15.52      1   15.52   15.52    func2
33.82   31.02          15.50      1   15.50   15.50    new_func1
33.29   46.27          15.26      1   15.26   30.75    func1
0.07    46.30          0.03                            main

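So only the flat profile appears in the output.

4. Print information for a specific function in the flat profile

This is done by passing the function name along with the -p option: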

$ gprof -pfunc1 -b test_gprof gmon.out > analysis.txt

Now, if we look at the analysis output:

Flat profile:

Each sample counts as 0.01 seconds.
%          cumulative     self            self     total
time       seconds        seconds  calls  s/call   s/call  name
103.20     15.26          15.26     1     15.26   15.26    func1

So a flat profile is displayed that contains only the information for func1.

5. Suppress the flat profile using -P

If the flat profile is not needed, suppress it with the -P option:

$ gprof -P -b test_gprof gmon.out > analysis.txt

Now, if we look at the analysis output:

Call graph

granularity: each sample hit covers 2 byte(s) for 0.02% of 46.30 seconds
index % time self children called name

[1]   100.0  0.03  46.27          main [1]
             15.26 15.50    1/1      func1 [2]
             15.52 0.00     1/1      func2 [3]
-----------------------------------------------
             15.26 15.50    1/1      main [1]
[2]   66.4   15.26 15.50    1     func1 [2]
             15.50 0.00     1/1      new_func1 [4]
-----------------------------------------------
             15.52 0.00     1/1      main [1]
[3]   33.5   15.52 0.00     1     func2 [3]
-----------------------------------------------
             15.50 0.00     1/1      func1 [2]
[4] 33.5     15.50 0.00     1     new_func1 [4]
-----------------------------------------------
Index by function name

[2] func1 [1] main
[3] func2 [4] new_func1

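So the flat profile was suppressed and only the call graph appears in the output.

Additionally, to print the flat profile while excluding a particular function, pass the name of the function to exclude along with the -P flag: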

$ gprof -Pfunc1 -b test_gprof gmon.out > analysis.txt

In the example above, we exclude 'func1' by passing it to gprof along with the -P option. Now let's look at the analysis output:

Flat profile:

Each sample counts as 0.01 seconds.
%         cumulative      self              self    total
time      seconds         seconds   calls   s/call  s/call  name
50.76     15.52            15.52      1     15.52   15.52   func2
50.69     31.02            15.50      1     15.50   15.50   new_func1
0.10      31.05            0.03                             main

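So the flat profile is displayed, but the information on func1 is suppressed.

6. Print only the call graph using -q

$ gprof -q -b test_gprof gmon.out > analysis.txt

The example above uses the -q option. Let's see its effect on the analysis output: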

Call graph

granularity: each sample hit covers 2 byte(s) for 0.02% of 46.30 seconds
index % time self children called name

[1]   100.0  0.03  46.27          main [1]
             15.26 15.50    1/1      func1 [2]
             15.52 0.00     1/1      func2 [3]
-----------------------------------------------
             15.26 15.50    1/1      main [1]
[2]   66.4   15.26 15.50    1     func1 [2]
             15.50 0.00     1/1      new_func1 [4]
-----------------------------------------------
             15.52 0.00     1/1      main [1]
[3]   33.5   15.52 0.00     1     func2 [3]
-----------------------------------------------
             15.50 0.00     1/1      func1 [2]
[4] 33.5     15.50 0.00     1     new_func1 [4]
-----------------------------------------------
Index by function name

[2] func1 [1] main
[3] func2 [4] new_func1

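So only the call graph was printed in the output.

7. Print information for a specific function in the call graph

This is done by passing the function name along with the -q option: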

$ gprof -qfunc1 -b test_gprof gmon.out > analysis.txt

Now, if we look at the analysis output:

Call graph

granularity: each sample hit covers 2 byte(s) for 0.02% of 46.30 seconds
index % time self children called name

             15.26 15.50    1/1      main [1]
[2]   66.4   15.26 15.50    1     func1 [2]
             15.50 0.00     1/1      new_func1 [4]
-----------------------------------------------
             15.50 0.00     1/1      func1 [2]
[4]   33.5   15.50 0.00     1     new_func1 [4]
-----------------------------------------------
Index by function name

[2] func1 (1) main
(3) func2 [4] new_func1

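So the call graph now shows only the entries for func1 and its child new_func1.

8. Suppress the call graph using -Q

If the call graph is not needed in the analysis output, use the -Q option: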

$ gprof -Q -b test_gprof gmon.out > analysis.txt

Now, if we look at the analysis output:

Flat profile:

Each sample counts as 0.01 seconds.
%       cumulative    self            self    total
time    seconds       seconds  calls  s/call  s/call   name
33.86 15.52            15.52      1   15.52   15.52    func2
33.82 31.02            15.50      1   15.50   15.50    new_func1
33.29 46.27            15.26      1   15.26   30.75    func1
0.07   46.30            0.03                           main

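So only the flat profile appears in the output. The entire call graph was suppressed.

Additionally, to suppress a specific function in the call graph, pass its name to gprof along with the -Q option: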

$ gprof -Qfunc1 -b test_gprof gmon.out > analysis.txt

In the example above, the function name func1 is passed to the -Q option.

Now, if we look at the analysis output:

Call graph

granularity: each sample hit covers 2 byte(s) for 0.02% of 46.30 seconds
index % time self children called name

[1]   100.0  0.03  46.27          main [1]
             15.26 15.50    1/1      func1 [2]
             15.52 0.00     1/1      func2 [3]
-----------------------------------------------
             15.52 0.00     1/1      main [1]
[3]   33.5   15.52 0.00     1     func2 [3]
-----------------------------------------------
             15.50 0.00     1/1      func1 [2]
[4]   33.5   15.50 0.00     1     new_func1 [4]
-----------------------------------------------
Index by function name

(2) func1 [1] main
[3] func2 [4] new_func1

So the call graph entry for func1 was suppressed.
