【概要】
eclipse の登場もあり、プロファイリングツールも充実してきた今日この頃であるが、どんなに便利になろうとも昔から存在する機能やツールというのは意外と使い手があったりする。ということで、何とも今更ではあるが、hprof の使い方の紹介。4年前に書いたものをそのまま貼り付けただけなので情報が古くなっている可能性あり。

【キーワード】
プロファイリング、プロファイル、profile、サンプリング、sampling、CPU 使用状況、メモリ使用状況、メソッド呼び出し回数、メソッド呼び出し頻度、オブジェクトのサイズ、インスタンスのサイズ

【詳細】

以下、資料貼り付け。


=======================================
Java アプリケーションのプロファイリング
2002/11/29 小野 和俊
=======================================


1. hprof の使い方
=================

C:\javatest\prof\guideline>java -Xrunhprof:help
Hprof usage: -Xrunhprof[:help]|[<option>=<value>, ...]

Option Name and Value  Description                Default
---------------------  ----------------------     -------
heap=dump|sites|all    heap profiling             all
cpu=samples|times|old  CPU usage                  off
monitor=y|n            monitor contention         n
format=a|b             ascii or binary output     a
file=<file>            write data to file         java.hprof(.txt for ascii)
net=<host>:<port>      send data over a socket    write to file
depth=<size>           stack trace depth          4
cutoff=<value>         output cutoff point        0.0001
lineno=y|n             line number in traces?     y
thread=y|n             thread in traces?          n
doe=y|n                dump on exit?              y

Example: java -Xrunhprof:cpu=samples,file=log.txt,depth=3 FooClass


2. ホットスポットの検出(cpu=samples, cpu=times)
===============================================

(1) 調査に利用するプログラム

CpuTest.java
-------------------------------------------------------------------------------
import java.util.*;

public class CpuTest {
  private static void method1(int count) {
    List list = new ArrayList();
    for (int i = 0; i < count; i++) {
      list.add(new Object());
    }
  }

  private static void method2(int count) {
    List list = new ArrayList(count);
    for (int i = 0; i < count; i++) {
      list.add(new Object());
    }
  }

  public static void main(String[] args) {
    int count = 1000000;

    long start = System.currentTimeMillis();
    method1(count);
    long end = System.currentTimeMillis();
    System.out.println("* performance: " + (end - start));
  }
}
-------------------------------------------------------------------------------

(2) プログラムの実行(cpu=samples)

[説明]
CPU SAMPLES     is a statistical profile of program execution.  The VM
                periodically samples all running threads, and assigns
                a quantum to active TRACEs in those threads.  Entries
                in this record are TRACEs ranked by the percentage of
                total quanta they consumed; top-ranked TRACEs are
                typically hot spots in the program.

[実行]
C:\javatest\prof\guideline>java -Xrunhprof:cpu=samples CpuTest
* performance: 4766
Dumping CPU usage by sampling running threads ... done.

[プロファイル]
CPU SAMPLES BEGIN (total = 12) Thu Nov 28 20:23:31 2002
rank   self  accum   count trace method
   1 33.33% 33.33%       4     8 CpuTest.method1
   2 16.67% 50.00%       2     9 java.util.ArrayList.ensureCapacity
   3  8.33% 58.33%       1     2 java.util.zip.ZipEntry.initIDs
   4  8.33% 66.67%       1     3 java.util.jar.JarFile.getManifest
   5  8.33% 75.00%       1     5 java.io.FileInputStream.open
   6  8.33% 83.33%       1     4 java.util.zip.ZipFile.open
   7  8.33% 91.67%       1     6 java.security.BasicPermission.newPermissionCollection
   8  8.33% 100.00%       1     1 java.lang.String.indexOf
CPU SAMPLES END

TRACE 8:
  CpuTest.method1(CpuTest.java:7)
  CpuTest.main(CpuTest.java:22)
TRACE 9:
  java.util.ArrayList.ensureCapacity(ArrayList.java:165)
  java.util.ArrayList.add(ArrayList.java:350)
  CpuTest.method1(CpuTest.java:7)
  CpuTest.main(CpuTest.java:22)

-> サンプリングの total が 12 しかなく、プロファイリングは困難。


(3) プログラムの実行(cpu=time)

[説明]
CPU TIME  is a profile of program execution obtained by measuring
                the time spent in individual methods (excluding the time
                spent in callees), as well as by counting the number of
                times each method is called. Entries in this record are
                TRACEs ranked by the percentage of total CPU time. The
                "count" field indicates the number of times each TRACE 
                is invoked.

[実行]
C:\javatest\prof\guideline>java -Xrunhprof:cpu=times CpuTest
HPROF ERROR: thread local table NULL in method exit 00841E2C
HPROF ERROR: thread local table NULL in method exit 00841E2C
HPROF ERROR: thread local table NULL in method exit 008432B4
HPROF ERROR : stack underflow in method exit
HPROF ERROR : stack underflow in method exit
* performance: 23203
Dumping CPU usage by timing methods ... done.

[プロファイル]
CPU TIME (ms) BEGIN (total = 12990) Thu Nov 28 20:25:58 2002
rank   self  accum   count trace method
   1 52.56% 52.56% 1000000    16 java.util.ArrayList.add
   2 27.66% 80.22% 1000000     3 java.util.ArrayList.ensureCapacity
   3 17.07% 97.30%       1    13 CpuTest.method1
   4  0.96% 98.26%      28    12 java.lang.System.arraycopy
   5  0.24% 98.50%    1008     7 java.lang.Character.toLowerCase
   6  0.12% 98.61%       1     4 sun.misc.URLClassPath$FileLoader.getResource
   7  0.12% 98.73%     920    19 java.lang.String.charAt
   8  0.12% 98.85%       1    17 sun.net.www.protocol.file.Handler.createFileURLConnection
   9  0.12% 98.96%       1     2 java.security.UnresolvedPermission.<clinit>
  10  0.12% 99.08%       4     5 java.util.jar.Manifest.<init>
  11  0.12% 99.19%       5    10 java.util.zip.ZipFile.open
  12  0.12% 99.31%       5    18 java.util.jar.JarFile.getEntry
  13  0.12% 99.42%      63    15 java.util.HashMap.addEntry
  14  0.12% 99.54%       5    14 java.util.zip.ZipFile.<init>
  15  0.12% 99.65%     280    11 java.lang.String.charAt
  16  0.12% 99.77%       4     8 java.util.jar.JarVerifier.<init>
  17  0.12% 99.88%      77     6 java.util.HashMap.put
  18  0.12% 100.00%       5     9 java.io.Win32FileSystem.normalize
CPU TIME (ms) END

TRACE 16:
  java.util.ArrayList.add(ArrayList.java:Unknown line)
  CpuTest.method1(CpuTest.java:Unknown line)
  CpuTest.main(CpuTest.java:Unknown line)
TRACE 3:
  java.util.ArrayList.ensureCapacity(ArrayList.java:Unknown line)
  java.util.ArrayList.add(ArrayList.java:Unknown line)
  CpuTest.method1(CpuTest.java:Unknown line)
  CpuTest.main(CpuTest.java:Unknown line)
TRACE 13:
  CpuTest.method1(CpuTest.java:Unknown line)
  CpuTest.main(CpuTest.java:Unknown line)

(4) プログラムの改良
method1(count) -> method2(count)

[実行]
C:\javatest\prof\guideline>java -Xrunhprof:cpu=times CpuTest
HPROF ERROR: thread local table NULL in method exit 00841E2C
HPROF ERROR: thread local table NULL in method exit 00841E2C
HPROF ERROR: thread local table NULL in method exit 008432B4
HPROF ERROR : stack underflow in method exit
HPROF ERROR : stack underflow in method exit
* performance: 20891
Dumping CPU usage by timing methods ... done.

[プロファイル]
CPU TIME (ms) BEGIN (total = 15522) Thu Nov 28 20:35:09 2002
rank   self  accum   count trace method
   1 44.49% 44.49% 1000000    15 java.util.ArrayList.add
   2 27.58% 72.07% 1000000     4 java.util.ArrayList.ensureCapacity
   3 26.47% 98.54%       1    16 CpuTest.method2
   4  0.20% 98.74%      77    17 java.util.jar.Attributes$Name.isValid
   5  0.10% 98.84%       1     9 java.security.SecureClassLoader.getProtectionDomain
   6  0.10% 98.94%       5     3 java.util.zip.ZipFile.<init>
   7  0.10% 99.03%      63     2 java.util.HashMap.indexFor
   8  0.10% 99.13%    1008    12 java.lang.Character.toLowerCase
   9  0.10% 99.23%       4    13 java.util.jar.JarVerifier.<init>
  10  0.10% 99.32%       1    18 java.security.AccessController.getContext
  11  0.10% 99.42%       5    11 java.io.Win32FileSystem.normalize
  12  0.10% 99.52%       1     5 sun.net.www.protocol.file.FileURLConnection.getPermission
  13  0.10% 99.61%     252     6 java.util.jar.Manifest.toLower
  14  0.10% 99.71%       4     7 java.util.jar.JarFile.getBytes
  15  0.10% 99.81%      14     8 java.util.zip.Inflater.inflate
  16  0.10% 99.90%       4    14 java.util.jar.Manifest.read
  17  0.10% 100.00%      78    10 java.lang.String.toLowerCase
CPU TIME (ms) END

[改良前とのパフォーマンスの比較]
改良前: 3156(ms)
改良後: 1531(ms)

(5) 補足
- TRACE の深さを変更するには: depth=n (ex. depth=8)
ex.
depth=4
TRACE 6:
  java.security.BasicPermission.newPermissionCollection(BasicPermission.java:233)
  java.security.Permissions.getPermissionCollection(Permissions.java:205)
  java.security.Permissions.add(Permissions.java:94)
  sun.misc.Launcher$AppClassLoader.getPermissions(Launcher.java:274)

depth=8
TRACE 6:
  java.security.BasicPermission.newPermissionCollection(BasicPermission.java:233)
  java.security.Permissions.getPermissionCollection(Permissions.java:205)
  java.security.Permissions.add(Permissions.java:94)
  sun.misc.Launcher$AppClassLoader.getPermissions(Launcher.java:274)
  java.security.SecureClassLoader.getProtectionDomain(SecureClassLoader.java:155)
  java.security.SecureClassLoader.defineClass(SecureClassLoader.java:123)
  java.net.URLClassLoader.defineClass(URLClassLoader.java:250)
  java.net.URLClassLoader.access$100(URLClassLoader.java:54)

- 実行途中で snapshot を取るには: Ctrl-Break

- HotSpot のサンプリングより、Classic VM のサンプリングの方がサンプリングレートが高い。
 -> jdk1.3 を使用している場合、-classic オプションによってより細かなサンプリングを行う
    ことができる。ただし、あくまでも Classic VM を用いてのサンプリングなので、HotSpot
    の使用を前提としたサンプリングを行いたい場合には有効ではない。


3. インスタンスのメモリ使用量の調査(heap=sites)
===============================================

(1) 調査に利用するプログラム

MemoryTest.java
-------------------------------------------------------------------------------
public class MemoryTest {
  private static class Company {
    private Employee[] employees;

    public Company(Employee[] employees) {
      this.employees = employees;
    }
  }

  private static class Employee {
    private String name;
    private int age;

    public Employee(String name, int age) {
      this.name = name;
      this.age = age;
    }
  }

  private static final String[] EMPLOYEE_NAMES = new String[] {"信長", "家康", "秀吉"};

  private static Employee[] findEmployees(int headcount) {
    Employee[] employees = new Employee[headcount];
    for (int i = 0; i < headcount; i++) {
      int nameIndex = (int)(Math.random() * EMPLOYEE_NAMES.length);
      int age = (int)(Math.random() * 100);
      employees[i] = new Employee(EMPLOYEE_NAMES[nameIndex], age);
    }
    return employees;
  }

  public static void main(String[] args) {
    int count = 100000;

    long start = System.currentTimeMillis();
    Employee[] startupMembers = findEmployees(count);
    Company myCompany = new Company(startupMembers);
    long end = System.currentTimeMillis();
    System.out.println("* performance: " + (end - start));
  }
}
-------------------------------------------------------------------------------

(2) プログラムの実行(heap=sites)

[説明]
SITES           is a sorted list of allocation sites.  This identifies
                the most heavily allocated object types, and the TRACE
                at which those allocations occurred.

[実行]
C:\javatest\prof\guideline>java -Xrunhprof:heap=sites,cutoff=0.01 MemoryTest
* performance: 2391
Dumping allocation sites ... done.

[プロファイル]
SITES BEGIN (ordered by live bytes) Fri Nov 29 10:57:10 2002
          percent         live       alloc'ed  stack class
 rank   self  accum    bytes objs   bytes objs trace name
    1 74.56% 74.56%  2400000 100000 2400000 100000   278 MemoryTest$Employee
    2 12.43% 86.99%   400016    1  400016    1   274 MemoryTest$Employee
    3  5.08% 92.06%   163360 1133  163712 1143     1 [C
    4  1.65% 93.72%    53216  255   53248  257     1 [B
SITES END

TRACE 274:
  MemoryTest.findEmployees(MemoryTest.java:23)
  MemoryTest.main(MemoryTest.java:36)
TRACE 278:
  MemoryTest.findEmployees(MemoryTest.java:27)
  MemoryTest.main(MemoryTest.java:36)

(3) プログラムの改良
long age -> short age

[実行]
C:\javatest\prof\guideline>java -Xrunhprof:heap=sites,cutoff=0.01 MemoryTest
* performance: 2250
Dumping allocation sites ... done.

[プロファイル]
SITES BEGIN (ordered by live bytes) Fri Nov 29 10:59:42 2002
          percent         live       alloc'ed  stack class
 rank   self  accum    bytes objs   bytes objs trace name
    1 66.13% 66.13%  1600000 100000 1600000 100000   278 MemoryTest$Employee
    2 16.53% 82.66%   400016    1  400016    1   274 MemoryTest$Employee
    3  6.75% 89.42%   163392 1135  163712 1143     1 [C
    4  2.20% 91.62%    53248  257   53248  257     1 [B
    5  1.12% 92.74%    27136  342   27136  342     1 [S
SITES END

-> Employee インスタンスのメモリ使用量が1インスタンスにつき 24byte -> 16byte に減少。
   Employee インスタンス全体のメモリ使用量は 2.4MB -> 1.6MB に減少。

(4) 補足
- ランキングに表示するアイテムをカットするには: cutoff=x.xx
ex. cutoff=0.01 -> percent が 1% 以上のアイテムのみ表示。

- インスタンスのメモリ使用量は 8byte 毎に増減していく。
ex. Employee クラスの age を short にしても int にしても Employee 1インスタンスにつき 16byte。



以上