Sep 13, 2010

System monitoring with top

Probably the most important tool for any Linux system administrator is top, whose interface provides a real-time view of what is happening on the system: CPU consumption, memory usage, process states, etc.

[root@centos ~]# top
top - 11:29:56 up 53 min,  1 user,  load average: 0.16, 0.05, 0.05
Tasks: 136 total,   1 running, 135 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.5%us,  0.6%sy,  0.0%ni, 98.2%id,  0.5%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:   2059768k total,   352036k used,  1707732k free,    21248k buffers
Swap:  4095992k total,        0k used,  4095992k free,   207520k cached

PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
3057 root      15   0 12732 1004  716 R  2.0  0.0   0:00.01 top
1 root      15   0 10344  672  560 S  0.0  0.0   0:00.45 init
2 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 migration/0
3 root      34  19     0    0    0 S  0.0  0.0   0:00.00 ksoftirqd/0
4 root      RT  -5     0    0    0 S  0.0  0.0   0:00.00 watchdog/0
5 root      10  -5     0    0    0 S  0.0  0.0   0:04.19 events/0
6 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 khelper
23 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kthread
27 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kblockd/0
28 root      20  -5     0    0    0 S  0.0  0.0   0:00.00 kacpid
85 root      20  -5     0    0    0 S  0.0  0.0   0:00.00 cqueue/0
88 root      20  -5     0    0    0 S  0.0  0.0   0:00.00 khubd
90 root      10  -5     0    0    0 S  0.0  0.0   0:00.00 kseriod
154 root      25   0     0    0    0 S  0.0  0.0   0:00.00 pdflush
155 root      15   0     0    0    0 S  0.0  0.0   0:00.03 pdflush
156 root      20  -5     0    0    0 S  0.0  0.0   0:00.00 kswapd0
157 root      20  -5     0    0    0 S  0.0  0.0   0:00.00 aio/0
298 root      11  -5     0    0    0 S  0.0  0.0   0:00.00 kpsmoused
...

The first line shows the current time (11:29:56), how long the machine has been running (up 53 min), the number of logged-in users and the load average. The two most important values are the uptime and the load average, which is the average number of processes that were either running or waiting for a system resource (typically the CPU or disk I/O) during the last 1, 5 and 15 minutes.
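To make the reading of that first line concrete, here is a minimal Python sketch that pulls the uptime and the three load averages out of the header shown above. The function name parse_top_header and the regular expression are illustrative assumptions, not part of top, and the pattern only targets the header format of the sample output; other procps versions may print it differently.

```python
import re

# Assumed layout: "top - HH:MM:SS up <uptime>, N user(s), load average: a, b, c"
HEADER_RE = re.compile(
    r"top - (?P<clock>\d{2}:\d{2}:\d{2}) up (?P<uptime>.+?),\s+"
    r"(?P<users>\d+) users?,\s+load average: "
    r"(?P<l1>[\d.]+), (?P<l5>[\d.]+), (?P<l15>[\d.]+)"
)


def parse_top_header(line):
    """Return clock time, uptime text, user count and load averages."""
    match = HEADER_RE.search(line)
    if match is None:
        raise ValueError("unrecognised top header: %r" % line)
    return {
        "clock": match.group("clock"),
        "uptime": match.group("uptime"),
        "users": int(match.group("users")),
        # Load averages over the last 1, 5 and 15 minutes, in that order.
        "load": tuple(float(match.group(k)) for k in ("l1", "l5", "l15")),
    }


header = "top - 11:29:56 up 53 min,  1 user,  load average: 0.16, 0.05, 0.05"
info = parse_top_header(header)
print(info["uptime"], info["load"])
```

As a rough rule of thumb, a load average that stays above the number of CPU cores suggests that processes are queuing for resources.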

Next comes a block of data that shows the overall state of the system:

Tasks shows the total number of processes and how many of them are in the running, sleeping, stopped or zombie state.

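Cpu(s) shows how CPU time is being spent: in user processes (%us), in the kernel (%sy), in niced user processes (%ni), idle (%id), waiting for I/O (%wa), servicing hardware (%hi) and software (%si) interrupts, and stolen by the hypervisor (%st).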
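The Cpu(s) line can be reduced to a single "busy" figure by subtracting the idle percentage from 100. The following Python sketch does that for the sample line above; the name parse_cpu_line is a hypothetical helper, and the pattern assumes the "0.5%us" notation of this top version (newer versions print the labels differently).

```python
import re


def parse_cpu_line(line):
    """Map each CPU label of top's Cpu(s) line (us, sy, id, ...) to its percentage."""
    return {label: float(value)
            for value, label in re.findall(r"([\d.]+)%(\w+)", line)}


sample = "Cpu(s):  0.5%us,  0.6%sy,  0.0%ni, 98.2%id,  0.5%wa,  0.0%hi,  0.1%si,  0.0%st"
cpu = parse_cpu_line(sample)

# Everything that is not idle time counts as CPU in use.
busy = round(100.0 - cpu["id"], 1)
print(busy)  # 1.8
```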

Mem shows how the RAM is distributed: the total amount available (total), the memory currently in use (used), the free memory (free) and the memory used for buffers (buffers). The cached figure, which is printed at the end of the Swap line, tells how much of the used RAM is page cache.

Swap shows the state of the swap space: the total amount available (total), the part in use (used) and the part that is free (free).

The other block of information presented by top is a set of columns with information about each process.

  • PID: process ID number.

  • USER: name of the user who owns the process.

  • PR: process priority.

  • NI: nice value of the process; a negative value means a higher priority and a positive value a lower one.

  • VIRT: amount of virtual memory used by the process, including all code, data and shared libraries (if N instances of the same program are running at the same time, the shared code is held in memory only once). VIRT = SWAP + RES.

  • RES: physical memory (RAM) currently used by the process, not counting what has been swapped out.

  • SHR: amount of memory that can be shared with other processes.

  • S: process status; D (uninterruptible sleep), R (running), S (sleeping), T (stopped or traced) and Z (zombie).

  • %CPU: percentage of CPU usage.

  • %MEM: percentage of physical memory usage.

  • TIME+: total CPU time used by the process.

  • COMMAND: name of the command that started the process.

top can display other per-process fields that are not shown by default. To view them, first press the 'f' key to list all available fields, and then press the key associated with the field you want to add (e.g. the 'p' key for SWAP).

The process list can also be sorted by memory (shift + m), PID (shift + n), CPU (shift + p) and the total CPU time used by the process (shift + t).
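What these sort keys do can be imitated outside top. The sketch below, in Python, parses a few of the process rows from the sample output and orders them from highest to lowest, which is the order top uses by default. The names parse_row, sort_processes and SORT_KEYS are hypothetical, and the parser assumes the twelve-column layout and the "minutes:seconds.hundredths" TIME+ format shown in this article.

```python
def parse_row(row):
    """Split one process row of top into the fields needed for sorting."""
    f = row.split(None, 11)  # COMMAND is the 12th field and may contain spaces
    minutes, seconds = f[10].split(":")
    return {
        "pid": int(f[0]),
        "user": f[1],
        "cpu": float(f[8]),
        "mem": float(f[9]),
        "time": int(minutes) * 60 + float(seconds),  # CPU time in seconds
        "command": f[11],
    }


# Same letters as the interactive keys: shift + m, n, p, t.
SORT_KEYS = {"M": "mem", "N": "pid", "P": "cpu", "T": "time"}


def sort_processes(processes, key):
    """Order processes from highest to lowest value of the chosen field."""
    field = SORT_KEYS[key]
    return sorted(processes, key=lambda p: p[field], reverse=True)


rows = [
    "3057 root      15   0 12732 1004  716 R  2.0  0.0   0:00.01 top",
    "1 root      15   0 10344  672  560 S  0.0  0.0   0:00.45 init",
    "5 root      10  -5     0    0    0 S  0.0  0.0   0:04.19 events/0",
    "155 root      15   0     0    0    0 S  0.0  0.0   0:00.03 pdflush",
]
processes = [parse_row(r) for r in rows]

for proc in sort_processes(processes, "T"):
    print(proc["pid"], proc["time"], proc["command"])
```

Sorted by total CPU time, events/0 comes first in this sample because it has accumulated 4.19 seconds.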

Finally, it sometimes looks as if almost all the physical memory is in use, yet when the processes are sorted by memory, their usage does not add up to the total amount shown as used. In that case look at the cached field (and at buffers): the operating system is using part of that memory as cache, and a system that caches memory is really in the optimal situation, because the cache is given back as soon as applications need the memory.
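As a worked example of that last point, the memory really held by applications can be approximated by subtracting buffers and cache from the used figure. The Python sketch below applies this to the values of the sample output; the function name application_memory_kb is a hypothetical helper and the result is only an approximation.

```python
def application_memory_kb(used_kb, buffers_kb, cached_kb):
    """Approximate the memory held by applications, excluding buffers and cache."""
    return used_kb - buffers_kb - cached_kb


# Values taken from the Mem and Swap lines of the sample output (in kB).
used, buffers, cached = 352036, 21248, 207520

apps = application_memory_kb(used, buffers, cached)
print(apps)  # 123268
```

So of the 352036k reported as used, only about 123268k is taken by processes; the rest is buffers and cache that the kernel can reclaim.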

