Many people who use cloud services ask about CPU steal time. It is a common metric that exists because cloud instances, unlike physical servers, run on virtualized hardware: it tells you how much of your virtual CPU's time is being taken away while the hypervisor distributes physical CPU among virtual machines. High CPU steal time can even cause web services to fail. Let's take a look at CPU steal time.
CPU Steal Time is the amount of time, expressed as a percentage, that a virtual CPU waits for a physical CPU while the hypervisor services another virtual processor. Virtual machines (VMs) operating in a virtual environment share resources with other instances on a single host. CPU Steal Time tells you how long the CPUs running in a VM are waiting to be allocated resources from the physical machine.
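As a rough illustration of the percentage described above, steal time can be derived from two readings of the aggregate cpu line in Linux's /proc/stat, where the eighth counter is steal. The following is a minimal Python sketch, not a definitive implementation. The function names are hypothetical, the sample counter values are made up, and it assumes a kernel that reports at least eight counters on that line.

```python
def parse_cpu_line(line):
    """Return the first eight counters of the aggregate 'cpu' line of /proc/stat."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    if len(fields) < 9:
        raise ValueError("this line does not include a steal counter")
    # Order: user, nice, system, idle, iowait, irq, softirq, steal
    return [int(value) for value in fields[1:9]]


def steal_percent(before, after):
    """Share of elapsed CPU time counted as steal between two readings."""
    first = parse_cpu_line(before)
    second = parse_cpu_line(after)
    deltas = [b - a for a, b in zip(first, second)]
    total = sum(deltas)
    if total <= 0:
        return 0.0  # no time elapsed between the two readings
    return 100.0 * deltas[7] / total


# Made-up readings, as if taken a few seconds apart.
earlier = "cpu  100 0 50 800 10 0 5 35 0 0"
later = "cpu  140 0 60 830 10 0 5 155 0 0"
print(steal_percent(earlier, later))  # prints 60.0
```

In practice the two lines would be read from /proc/stat with a short pause in between. The guest counters that follow steal are left out of the total because the kernel already includes guest time in user time.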
First, run the top command in Linux to see a real-time view of key performance metrics. Below is sample output from the top command.
top - 10:00:00 up 120 days, 7:00, 3 users, load average: 1.15, 0.88, 0.86
Tasks: 122 total, 10 running, 112 sleeping, 0 stopped, 0 zombie
%Cpu(s): 40.0%us, 0.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.1%si, 70.0%st
The top command only shows what is happening right now. If you want to see historical metrics after a service issue, you should use an infrastructure monitoring service that records them, so that you can look at the values from the time of the issue. Most monitoring services track the CPU steal metric around the clock.
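To make the idea of continuous tracking concrete, here is a small Python sketch of how recorded steal samples could be checked for a sustained problem rather than a momentary spike. It does not describe how any particular monitoring service works. The class name, the window size, and the 10% threshold are all assumptions chosen for illustration.

```python
from collections import deque


class StealWindow:
    """Keeps the most recent steal samples and flags sustained high steal."""

    def __init__(self, size=5, threshold=10.0):
        self.samples = deque(maxlen=size)  # older samples drop off automatically
        self.threshold = threshold         # hypothetical alert level, in percent

    def add(self, steal_pct):
        """Record one sample and report whether steal is now sustained."""
        self.samples.append(steal_pct)
        return self.is_sustained()

    def is_sustained(self):
        """True only when the window is full and every sample exceeds the threshold."""
        window_full = len(self.samples) == self.samples.maxlen
        return window_full and min(self.samples) > self.threshold


window = StealWindow(size=3, threshold=10.0)
for sample in (50.0, 60.0, 70.0, 5.0):
    print(sample, window.add(sample))
```

Requiring every sample in the window to exceed the threshold is one possible design choice. It avoids alerting on a single noisy reading, at the cost of reacting a few samples late.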
[Figure: CPU usage]
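The fields of the CPU(s) line are: us (user processes), sy (system, that is, the kernel), ni (user processes with a changed nice priority), id (idle), wa (waiting for I/O), hi (hardware interrupts), si (software interrupts), and st (steal).
CPU steal time is the very last entry on the CPU(s) line, st. If you are not running in a virtualized environment, CPU steal time has no meaning.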
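For scripts that need the steal value rather than a person reading the screen, the CPU summary line can be split into its fields. The Python sketch below is one way to do that under stated assumptions: the function name is hypothetical, the sample line uses made-up values, and the pattern is written to accept both the "40.0%us" and the "40.0 us" layouts that different versions of top print.

```python
import re

# A number, an optional percent sign, then a two-letter field name such as "st".
_FIELD = re.compile(r"(\d+(?:\.\d+)?)\s*%?\s*([a-z]{2})\b")


def parse_top_cpu_line(line):
    """Map each field name on top's CPU summary line to its percentage."""
    return {name: float(value) for value, name in _FIELD.findall(line)}


# Made-up sample line for illustration.
sample = "%Cpu(s): 25.0 us, 3.0 sy, 0.0 ni, 60.0 id, 0.5 wa, 0.0 hi, 0.5 si, 11.0 st"
fields = parse_top_cpu_line(sample)
print(fields["st"])  # prints 11.0
```

Looking the value up by the name st, instead of taking the last number on the line, keeps the script working even if the field order differs between versions.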
CPU steal occurs either because the physical host the VM runs on is short of resources to begin with, or because the host has enough resources but too little CPU is allocated to the VM. This can happen when too many VMs are running on the host, or when the administrator has set the resource limits for a VM incorrectly. Alternatively, the physical hardware may be aging and unable to keep up with the hosting workload.
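The "too many VMs" case can be expressed as a simple overcommit ratio: the number of virtual CPUs promised to VMs divided by the number of physical cores on the host. The Python sketch below is only an illustration. The function name and the VM sizes are hypothetical, and real hypervisors schedule in more nuanced ways than a single ratio captures.

```python
def overcommit_ratio(vcpus_per_vm, physical_cores):
    """Total vCPUs allocated across VMs, divided by the host's physical cores."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(vcpus_per_vm) / physical_cores


# Hypothetical host: four VMs with 4, 4, 8 and 8 vCPUs on 8 physical cores.
print(overcommit_ratio([4, 4, 8, 8], 8))  # prints 3.0
```

A ratio above 1 does not by itself mean steal will occur, since idle VMs consume little CPU, but the higher the ratio, the more VMs can end up waiting for the same physical cores at busy times.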
First of all, for jobs that run for a long time in the background, such as batch jobs, CPU steal time is usually not a problem. It does not stop the job; the job simply finishes a little later because it shares CPU cycles with other VMs.
However, it can be a problem for web applications. Web applications must process customer requests in real time. As CPU steal time increases, performance decreases, and eventually the service fails because requests can no longer be handled in time.
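A back-of-the-envelope model shows why steal hurts request latency. If a request needs a fixed amount of CPU time and the vCPU is only actually running for the fraction of time that is not stolen, the elapsed time grows as cpu_time / (1 - steal). The Python sketch below encodes that simplified assumption. The function name is hypothetical, and the model ignores queuing, I/O, and uneven steal, so treat the result as a rough lower bound.

```python
def wall_time_ms(cpu_ms, steal_pct):
    """Estimate elapsed time for work needing cpu_ms of CPU at a given steal percentage."""
    if not 0 <= steal_pct < 100:
        raise ValueError("steal_pct must be at least 0 and below 100")
    available = 1 - steal_pct / 100  # fraction of time the vCPU really runs
    return cpu_ms / available


# A request needing 30 ms of CPU, with no steal and with 50% steal.
print(wall_time_ms(30, 0))   # prints 30.0
print(wall_time_ms(30, 50))  # prints 60.0
```

Under this model the same 30 ms request takes roughly 100 ms at 70% steal, which is how a batch job that merely "finishes a little slower" becomes a web request that misses its deadline.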
As an end user, there is very little you can do to directly address high CPU steal time. If it is impacting your service, first check with your hosting provider that the VMs you are paying for are delivering the resources your contract specifies, although most cloud service providers will tell you that they are. If CPU steal time is still impacting your service, you will need to take one of two actions.
When using the cloud, you cannot afford to neglect monitoring. Whether you are a developer, operator, or planner, you need to have the tools in place to analyze the cause if something goes wrong. And hopefully, you have a hotline to connect with a cloud specialist or expert at all times.