Managing Resource in RHEL6 Part 1

The most prominent updates coming to Red Hat Enterprise Linux, and for the hosting business the most important, are those introducing control groups in RHEL6. These have been around in a supported capacity since Fedora 8 but really won't be seen in full until the new Red Hat release arrives.

So, what are control groups?

Fundamentally, control groups are a way of shelving certain PIDs into different groups which provide different classes of resource management. This is different from traditional process groups in that a process can be added to and removed from a control group in real time. Groups can also be stacked hierarchically, which is important for the resource management features of control groups.

Control groups offer a method of guaranteeing a quality of service within a single system via the operating system itself, something that was not possible before control groups.

Basic Control Group Explanation

To get an idea of what control groups do, let's look at their architecture and how to administer them at a basic level.

Control groups are exposed to userland via the cgroup virtual filesystem, which can be mounted at a directory of your choice. I only want to cover one subsystem today so I’ll be mounting just the cpu subsystem.

Accessing Control Groups

[root@home ~]# mount -t cgroup -o cpu none /cgroup/
[root@home ~]# ll /cgroup/
total 0
-r--r--r--. 1 root root 0 Aug 18 09:32 cgroup.procs
-rw-r--r--. 1 root root 0 Aug 18 09:32 cpu.rt_period_us
-rw-r--r--. 1 root root 0 Aug 18 09:32 cpu.rt_runtime_us
-rw-r--r--. 1 root root 0 Aug 18 09:32 cpu.shares
-rw-r--r--. 1 root root 0 Aug 18 09:32 notify_on_release
-rw-r--r--. 1 root root 0 Aug 18 09:32 release_agent
-rw-r--r--. 1 root root 0 Aug 18 09:32 tasks

Control group components are broken down into subsystems which alter the way processes behave when in a control group. Each subsystem exposes certain parameters for controlling a resource within that group, which are then enforced on the PIDs that are members of the group. Parameter contents can be read, for example using cat, and you can write to some of the parameters using echo, much like with /proc, /sys or other pseudo filesystems.
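As a rough illustration of the cat and echo pattern, here is a minimal Python sketch. The helper names read_param and write_param are my own and not part of any cgroup tooling; they simply read and write a parameter file inside a group directory, and the usage at the bottom assumes the cpu subsystem is mounted on /cgroup as above.

```python
import os

def read_param(group_dir, name):
    """Return the contents of a parameter file (the equivalent of cat)."""
    with open(os.path.join(group_dir, name)) as f:
        return f.read().strip()

def write_param(group_dir, name, value):
    """Write a value to a parameter file (the equivalent of echo value > file)."""
    with open(os.path.join(group_dir, name), "w") as f:
        f.write("%s\n" % value)

if __name__ == "__main__":
    # Assumption: the cpu subsystem is mounted on /cgroup.
    root_group = "/cgroup"
    if os.path.exists(os.path.join(root_group, "cpu.shares")):
        print(read_param(root_group, "cpu.shares"))
```

Writing needs the same permissions as echo would, so in practice this has to run as root or as the owner of the parameter file.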

There are a few different subsystems available, each managing a different resource, but the mount options I used expose only the CPU subsystem for now.

CPU Subsystem Parameters

The CPU subsystem gives us access to the Completely Fair Scheduler (CFS) and provides us with an opportunity to refine how the CPU schedules work for the process IDs within the group itself. Let's take a look at each parameter.

  • cgroup.procs – the list of process IDs that are members of this control group.
  • cpu.rt_period_us – defines a real-time scheduling period in microseconds, used with the option below. Together they prevent real-time processes in the group from hogging the CPU.
  • cpu.rt_runtime_us – how long, in microseconds, real-time tasks in this cgroup may have the CPU within each period (used with the option above).
  • cpu.shares – An arbitrary integer which defines the share of the CPU that processes in this group receive relative to the cpu.shares of other control groups.
  • notify_on_release – When all the tasks in the control group have exited, define whether we should run the release agent (0 = no, 1 = yes)
  • release_agent – The path to a program to execute when the last task has exited the control group.
  • tasks – The list of processes + threads that are members of this control group. This differs from cgroup.procs as it will display lightweight processes too.
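To make the arithmetic behind cpu.shares and the two rt parameters concrete, here is a small sketch. It is a simplified model, assuming sibling groups under the same parent that are all busy at once; the group names and numbers are purely illustrative.

```python
def cpu_fraction(shares, group):
    """Fraction of CPU time a group receives when every sibling group is busy."""
    total = sum(shares.values())
    return float(shares[group]) / total

def rt_fraction(runtime_us, period_us):
    """Fraction of each period that real-time tasks in a group may run."""
    return float(runtime_us) / period_us

# Hypothetical sibling groups and their cpu.shares values.
shares = {"users": 1024, "fiftypc": 512}
print("%.2f" % cpu_fraction(shares, "fiftypc"))  # 0.33
print("%.2f" % rt_fraction(950000, 1000000))     # 0.95
```

Shares only matter under contention: if the other groups are idle, a group with a small share can still use the whole CPU.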

The first control group you make (the one you mounted) becomes the default, root control group, and all tasks running on the system will be members of it.

Creating Control Groups

To add another control group you can simply go into the cgroup filesystem and mkdir. This initializes a new group inheriting all the subsystems of its parent (in this case, just CPU) with further tuning parameters.

[root@home cgroup]# mkdir fiftypc
[root@home cgroup]# ll -R
.:
total 0
-r--r--r--. 1 root root 0 Aug 18 09:32 cgroup.procs
-rw-r--r--. 1 root root 0 Aug 18 09:32 cpu.rt_period_us
-rw-r--r--. 1 root root 0 Aug 18 09:32 cpu.rt_runtime_us
-rw-r--r--. 1 root root 0 Aug 18 09:32 cpu.shares
drwxr-xr-x. 2 root root 0 Aug 18 10:09 fiftypc
-rw-r--r--. 1 root root 0 Aug 18 09:32 notify_on_release
-rw-r--r--. 1 root root 0 Aug 18 09:32 release_agent
-rw-r--r--. 1 root root 0 Aug 18 09:32 tasks

./fiftypc:
total 0
-r--r--r--. 1 root root 0 Aug 18 10:09 cgroup.procs
-rw-r--r--. 1 root root 0 Aug 18 10:09 cpu.rt_period_us
-rw-r--r--. 1 root root 0 Aug 18 10:09 cpu.rt_runtime_us
-rw-r--r--. 1 root root 0 Aug 18 10:09 cpu.shares
-rw-r--r--. 1 root root 0 Aug 18 10:09 notify_on_release
-rw-r--r--. 1 root root 0 Aug 18 10:09 tasks

As demonstrated, running mkdir in the cgroup filesystem with the name of the new control group will initialize a new control group which is a child of the root control group.
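The same step can be scripted. This is a minimal sketch with a helper name of my own choosing (create_group); on a real cgroup mount the kernel populates the new directory with parameter files, whereas on an ordinary directory the listing will simply be empty.

```python
import os

def create_group(parent_dir, name):
    """Create a child control group and return its path and the files inside it."""
    path = os.path.join(parent_dir, name)
    os.mkdir(path)
    # On a cgroup mount the kernel fills this directory in automatically.
    return path, sorted(os.listdir(path))

if __name__ == "__main__":
    # Assumption: the cpu subsystem is mounted on /cgroup and we are root.
    if os.path.isdir("/cgroup") and not os.path.exists("/cgroup/fiftypc"):
        try:
            print(create_group("/cgroup", "fiftypc"))
        except OSError as err:
            print("could not create group: %s" % err)
```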

Adding Tasks to the Control Group

To add a process to the control group you need to echo the PID of the task into the tasks file of the control group.

Let's use a simple script that creates a few processes to see what happens once we’ve added it to the cgroup.

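#!/usr/bin/python
# Forks five children that each sleep, so their PIDs show up in the tasks file.
import os, sys, time

# Give the shell time to write our PID into the control group before forking.
time.sleep(5)

for i in range(5):
        if os.fork() == 0:
                # Child: announce ourselves, stay alive for a while, then exit.
                print("Iteration %d is sleeping" % i)
                time.sleep(10)
                sys.exit(0)

# Parent: reap all five children.
for i in range(5):
        os.wait()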

Let's run the script and assign it to our new control group.

[root@home cgroup]# python /dev/shm/simple.py & echo $! > /cgroup/fiftypc/tasks; \
sleep 7; \
cat /cgroup/fiftypc/tasks
[1] 6263

Iteration 0 is sleeping
Iteration 1 is sleeping
Iteration 2 is sleeping
Iteration 3 is sleeping
Iteration 4 is sleeping
6263
6265
6266
6267
6268
6269

What this does is execute the script, which waits five seconds before spawning children. $! expands to the PID of the background job we just started, and we redirect it into the tasks file of the control group we want to assign it to. We sleep for seven seconds to allow the process to spawn its children, then output the contents of the tasks file.

This is a good demonstration of the way control groups behave. Once a process is a member of a control group, any child process or thread it spawns inherits membership of the same control group. Thus we are able to group a whole series of processes into the control group, and if we assign resource control parameters to the group then the child processes inherit the limits set.
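One way to confirm which group a process ended up in is to read /proc/<pid>/cgroup, where each line has the form hierarchy-ID:subsystems:path. The parser below is a minimal sketch of my own, and the sample line is illustrative rather than captured from the system above.

```python
def parse_proc_cgroup(text):
    """Map each subsystem name to the control group path a process belongs to."""
    membership = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Format: hierarchy-ID:comma,separated,subsystems:/path/of/group
        hierarchy_id, subsystems, path = line.split(":", 2)
        for subsystem in subsystems.split(","):
            membership[subsystem] = path
    return membership

# Illustrative content for a process sitting in the fiftypc group.
sample = "1:cpu:/fiftypc\n"
print(parse_proc_cgroup(sample))  # {'cpu': '/fiftypc'}
```

To check a live process, pass in the contents of /proc/<pid>/cgroup for the parent or any of its children; they should all report the same path.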

We can also take a PID out of a control group by echoing it into the tasks file of another control group (such as the root group).
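Moving a task is therefore just another write to a tasks file. A minimal sketch, with a helper name (move_task) of my own invention and paths that assume the cpu subsystem is mounted on /cgroup:

```python
import os

def move_task(pid, group_dir):
    """Attach one PID to the control group at group_dir via its tasks file."""
    # Write a single PID per call, just as echo would.
    with open(os.path.join(group_dir, "tasks"), "w") as f:
        f.write("%d\n" % pid)

if __name__ == "__main__":
    # Assumption: /cgroup is the root group of a mounted cpu hierarchy.
    if os.path.exists("/cgroup/tasks"):
        try:
            # Writing to the root group's tasks file takes this process out of any child group.
            move_task(os.getpid(), "/cgroup")
        except (IOError, OSError) as err:
            print("could not move task: %s" % err)
```

A task is only ever in one group per hierarchy, so attaching it to the new group implicitly removes it from the old one.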

Stateful Resource Management

Control groups are great like this, but this way of creating and managing them is crude and difficult to implement on a system-wide basis. How could we really use this? We cannot maintain the state of this setup easily. As soon as we reboot we’ll lose all the cgroups, the parameters we assigned to them and of course any PIDs that were members of them. How can we make this more stateful, more meaningful and more elegant to manage?

Well, this is where Red Hat comes in. libcgroup is a package that ships with RHEL6 and offers us a means to abstract away the filesystem management of control groups and deploy them consistently. Red Hat has altered its initscripts to allow you to start services directly in their accompanying control group by using a special variable in /etc/sysconfig/<service>. More critically, it has developed a means by which you can define your control groups and their parameters so the system can be rebooted and come up in the correct state. The mechanism still needs some work but is elegant enough to take advantage of.

Before anything can be done with this, you need to install the libcgroup package using yum. Once this has been done, the file /etc/cgconfig.conf can be used to define the control group names, ownerships and subsystems.

Let's take a step-by-step look at a configuration I have deployed to test out the cpu subsystem.

group users {
        perm {
                task {
                        uid = root;
                        gid = root;
                }
                admin {
                        uid = root;
                        gid = root;
                }
        }
        cpu {
                cpu.shares = 1024;
        }
}
mount {
	cpu = /cgroup/cpu;
}

Let's break down each important aspect and provide further explanation.

group users {
..
}

The group clause defines the name of our control group, which in this case will be called “users”.

        perm {
        ..
        }

The perm section simply denotes that the subsections within it refer to permissions of the control group.

                task {
                        uid = root;
                        gid = root;
                }

The task section defines who has the ability to control the contents of the tasks file in the control group (SELinux permitting on SELinux-enabled systems). In my case, root is fine for my purposes. I could actually leave this whole section out and ownership would implicitly be assigned to root, but for the sake of verbosity I have added it. Setting it to another user allows that person to re-assign PIDs into this group manually in real time if necessary.

                admin {
                        uid = root;
                        gid = root;
                }

The admin section defines who has the ability to write to the parameters of the control group. Because the cgroup filesystem is fundamentally a filesystem, we can assign ownerships and permissions to the parameter files of the group that processes are placed into. admin defines which user and group get ownership of the control group's subsystem parameter files.

        cpu {
                cpu.shares = 1024;
        }

This is the meat of the control group definition. The cpu section refers to the name of the subsystem (cpu) and the entries inside it are the parameters of that subsystem you can control. Parameter names follow dot notation and must be the complete name of the file (as reflected in the filesystem listings above). This part sets the parameters for the group itself, allowing more stateful control of the cgroup filesystem. In my example I have set the “shares” value of the cpu subsystem to 1024.

mount {
	cpu = /cgroup/cpu;
}

The mount section determines where in the filesystem each subsystem's hierarchy is mounted. I can mount different subsystems at different locations, but for now I have mounted the cpu subsystem at /cgroup/cpu.
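If you manage several groups, a file like this can be generated rather than written by hand. The following is only a sketch: render_group and render_mount are hypothetical helpers of my own, not part of libcgroup, and they leave out the perm block, relying on the implicit root ownership mentioned above.

```python
def render_group(name, subsystems):
    """Render a cgconfig.conf group stanza from {subsystem: {parameter: value}}."""
    lines = ["group %s {" % name]
    for subsystem in sorted(subsystems):
        lines.append("        %s {" % subsystem)
        params = subsystems[subsystem]
        for param in sorted(params):
            lines.append("                %s = %s;" % (param, params[param]))
        lines.append("        }")
    lines.append("}")
    return "\n".join(lines)

def render_mount(mounts):
    """Render the mount stanza from {subsystem: mount point}."""
    lines = ["mount {"]
    for subsystem in sorted(mounts):
        lines.append("        %s = %s;" % (subsystem, mounts[subsystem]))
    lines.append("}")
    return "\n".join(lines)

print(render_group("users", {"cpu": {"cpu.shares": 1024}}))
print(render_mount({"cpu": "/cgroup/cpu"}))
```

Run as is, this prints the users group and mount stanzas from my configuration, minus the perm block.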

The only thing left to do is run /etc/init.d/cgconfig start; chkconfig cgconfig on to enable stateful cgrouping.

In Conclusion

Control groups are, for me, going to be the most dramatic and beneficial update in RHEL. In terms of risk management, quality of service has never been easy to implement on multi-role Linux machines before.

This is soon going to change. Control groups will allow system administrators to decide who can hog which resource, for how long and how much. Control groups bring us closer to a proper system where a single process can't demand too much of the CPU and one leaky application can't bring down an entire system. Effectively, we are able to contain a problematic service, application or user through the use of control groups so they have little to no impact on the performance of more critical services.

In the next part we’ll talk about how you can place users into a cgroup automatically and look more closely at what effect the cpu subsystem has on work load.
