Slab allocator
Latest revision as of 14:32, 27 October 2011
Here is some information on the slab allocator in Linux.
(As of this writing, in August 2007, a new allocator called "slub" has been submitted for inclusion in mainline. It remains to be seen whether this means that the slab allocator will be removed.)
You can get a lot of information about the status of the slab allocator by examining the data in /proc/slabinfo.
The slab allocator is a system of allocating memory that is optimized for the allocation and freeing of same-sized memory objects. The slab allocator organizes memory into caches, slabs and objects. A cache consists of multiple slabs, and each slab is a contiguous region of memory containing objects that are all the same size.
When a cache is created, the creator specifies the object size and a name for the cache, as well as some flags. When an object is allocated, if there are no free objects in any slabs, a new slab is allocated. For small objects, a slab is a single page of memory. For example, on a system with 4096-byte pages, a cache of objects that are 160 bytes each would consist of slabs containing 24 objects per slab. Subsequent calls to allocate objects of the same size are quick, because the page has already been allocated for the object. Calls to free an object are also quick. If the system gets into a low memory condition, slabs with no allocated objects can be freed. Fragmentation is reduced (hopefully) because similar sized objects are grouped together.
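The behaviour described above can be sketched as a toy model. This is only an illustration in Python, not kernel code: the class name `ToySlabCache`, its methods, and the idea of tracking slots as integers are all assumptions made for the example. It shows a new slab being added only when every existing slab is full, and empty slabs being released on request.

```python
class ToySlabCache:
    """Toy model of a slab cache: same-sized objects grouped into slabs."""

    def __init__(self, name, objs_per_slab):
        self.name = name
        self.objs_per_slab = objs_per_slab
        self.slabs = {}       # slab id -> set of slot numbers currently in use
        self._next_id = 0     # hypothetical counter used to label new slabs

    def alloc(self):
        # Fast path: take a free slot from a slab that already exists.
        for slab_id, used in self.slabs.items():
            if len(used) < self.objs_per_slab:
                slot = min(set(range(self.objs_per_slab)) - used)
                used.add(slot)
                return slab_id, slot
        # No free objects in any slab: allocate a new slab.
        slab_id = self._next_id
        self._next_id += 1
        self.slabs[slab_id] = {0}
        return slab_id, 0

    def free(self, handle):
        slab_id, slot = handle
        self.slabs[slab_id].remove(slot)

    def reap(self):
        """Release slabs with no allocated objects; return how many were freed."""
        empty = [slab_id for slab_id, used in self.slabs.items() if not used]
        for slab_id in empty:
            del self.slabs[slab_id]
        return len(empty)


# 160-byte objects on 4096-byte pages: 24 objects per slab, as in the text.
cache = ToySlabCache("rpc_tasks", objs_per_slab=24)
handles = [cache.alloc() for _ in range(25)]
print(len(cache.slabs))    # the 25th allocation needed a second slab
cache.free(handles[-1])    # the second slab is now empty
print(cache.reap())        # number of empty slabs released
```

A real allocator hands out memory addresses rather than (slab, slot) pairs; the pairs are used here only to make the grouping visible.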
Sample of /proc/slabinfo
/proc # cat /proc/slabinfo
slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
rpc_buffers 10 10 2048 2 1 : tunables 24 12 0 : slabdata 5 5 0
rpc_tasks 24 24 160 24 1 : tunables 120 60 0 : slabdata 1 1 0
rpc_inode_cache 10 18 416 9 1 : tunables 54 27 0 : slabdata 2 2 0
UNIX 2 10 384 10 1 : tunables 54 27 0 : slabdata 1 1 0
flow_cache 0 0 96 40 1 : tunables 120 60 0 : slabdata 0 0 0
cfq_io_context 0 0 88 44 1 : tunables 120 60 0 : slabdata 0 0 0
cfq_queue 0 0 88 44 1 : tunables 120 60 0 : slabdata 0 0 0
nfs_write_data 36 40 480 8 1 : tunables 54 27 0 : slabdata 5 5 0
nfs_read_data 32 36 448 9 1 : tunables 54 27 0 : slabdata 4 4 0
nfs_inode_cache 76 78 608 6 1 : tunables 54 27 0 : slabdata 13 13 0
nfs_page 0 0 64 59 1 : tunables 120 60 0 : slabdata 0 0 0
squashfs_inode_cache 0 0 384 10 1 : tunables 54 27 0 : slabdata 0 0 0
ext2_inode_cache 0 0 432 9 1 : tunables 54 27 0 : slabdata 0 0 0
journal_handle 0 0 20 169 1 : tunables 120 60 0 : slabdata 0 0 0
journal_head 0 0 52 72 1 : tunables 120 60 0 : slabdata 0 0 0
revoke_table 0 0 12 254 1 : tunables 120 60 0 : slabdata 0 0 0
revoke_record 0 0 16 203 1 : tunables 120 60 0 : slabdata 0 0 0
ext3_inode_cache 0 0 472 8 1 : tunables 54 27 0 : slabdata 0 0 0
ext3_xattr 0 0 44 84 1 : tunables 120 60 0 : slabdata 0 0 0
dnotify_cache 0 0 20 169 1 : tunables 120 60 0 : slabdata 0 0 0
inotify_event_cache 0 0 28 127 1 : tunables 120 60 0 : slabdata 0 0 0
inotify_watch_cache 0 0 40 92 1 : tunables 120 60 0 : slabdata 0 0 0
kioctx 0 0 160 24 1 : tunables 120 60 0 : slabdata 0 0 0
kiocb 0 0 160 24 1 : tunables 120 60 0 : slabdata 0 0 0
fasync_cache 0 0 16 203 1 : tunables 120 60 0 : slabdata 0 0 0
shmem_inode_cache 1 9 416 9 1 : tunables 54 27 0 : slabdata 1 1 0
posix_timers_cache 0 0 96 40 1 : tunables 120 60 0 : slabdata 0 0 0
uid_cache 0 0 64 59 1 : tunables 120 60 0 : slabdata 0 0 0
UDP-Lite 0 0 480 8 1 : tunables 54 27 0 : slabdata 0 0 0
tcp_bind_bucket 1 203 16 203 1 : tunables 120 60 0 : slabdata 1 1 0
inet_peer_cache 0 0 64 59 1 : tunables 120 60 0 : slabdata 0 0 0
secpath_cache 0 0 32 113 1 : tunables 120 60 0 : slabdata 0 0 0
xfrm_dst_cache 0 0 256 15 1 : tunables 120 60 0 : slabdata 0 0 0
ip_fib_alias 9 113 32 113 1 : tunables 120 60 0 : slabdata 1 1 0
ip_fib_hash 9 113 32 113 1 : tunables 120 60 0 : slabdata 1 1 0
ip_dst_cache 2 15 256 15 1 : tunables 120 60 0 : slabdata 1 1 0
arp_cache 1 30 128 30 1 : tunables 120 60 0 : slabdata 1 1 0
RAW 2 9 448 9 1 : tunables 54 27 0 : slabdata 1 1 0
UDP 1 8 480 8 1 : tunables 54 27 0 : slabdata 1 1 0
tw_sock_TCP 0 0 96 40 1 : tunables 120 60 0 : slabdata 0 0 0
request_sock_TCP 0 0 64 59 1 : tunables 120 60 0 : slabdata 0 0 0
TCP 1 3 1056 3 1 : tunables 24 12 0 : slabdata 1 1 0
eventpoll_pwq 0 0 36 101 1 : tunables 120 60 0 : slabdata 0 0 0
eventpoll_epi 0 0 96 40 1 : tunables 120 60 0 : slabdata 0 0 0
sgpool-128 2 2 2048 2 1 : tunables 24 12 0 : slabdata 1 1 0
sgpool-64 2 4 1024 4 1 : tunables 54 27 0 : slabdata 1 1 0
sgpool-32 2 8 512 8 1 : tunables 54 27 0 : slabdata 1 1 0
sgpool-16 2 15 256 15 1 : tunables 120 60 0 : slabdata 1 1 0
sgpool-8 2 30 128 30 1 : tunables 120 60 0 : slabdata 1 1 0
scsi_io_context 0 0 104 37 1 : tunables 120 60 0 : slabdata 0 0 0
blkdev_ioc 0 0 32 113 1 : tunables 120 60 0 : slabdata 0 0 0
blkdev_queue 9 12 900 4 1 : tunables 54 27 0 : slabdata 3 3 0
blkdev_requests 4 22 176 22 1 : tunables 120 60 0 : slabdata 1 1 0
biovec-256 2 2 3072 1 1 : tunables 24 12 0 : slabdata 2 2 0
biovec-128 2 2 1536 2 1 : tunables 24 12 0 : slabdata 1 1 0
biovec-64 2 5 768 5 1 : tunables 54 27 0 : slabdata 1 1 0
biovec-16 2 20 192 20 1 : tunables 120 60 0 : slabdata 1 1 0
biovec-4 2 59 64 59 1 : tunables 120 60 0 : slabdata 1 1 0
biovec-1 2 203 16 203 1 : tunables 120 60 0 : slabdata 1 1 0
bio 2 59 64 59 1 : tunables 120 60 0 : slabdata 1 1 0
sock_inode_cache 12 20 384 10 1 : tunables 54 27 0 : slabdata 2 2 0
skbuff_fclone_cache 0 0 352 11 1 : tunables 54 27 0 : slabdata 0 0 0
skbuff_head_cache 16 24 160 24 1 : tunables 120 60 0 : slabdata 1 1 0
file_lock_cache 0 0 96 40 1 : tunables 120 60 0 : slabdata 0 0 0
proc_inode_cache 1276 1276 336 11 1 : tunables 54 27 0 : slabdata 116 116 0
sigqueue 0 0 144 27 1 : tunables 120 60 0 : slabdata 0 0 0
radix_tree_node 34 39 288 13 1 : tunables 54 27 0 : slabdata 3 3 0
bdev_cache 1 9 448 9 1 : tunables 54 27 0 : slabdata 1 1 0
sysfs_dir_cache 1177 1248 48 78 1 : tunables 120 60 0 : slabdata 16 16 0
mnt_cache 16 30 128 30 1 : tunables 120 60 0 : slabdata 1 1 0
inode_cache 317 324 320 12 1 : tunables 54 27 0 : slabdata 27 27 0
dentry 1730 1736 124 31 1 : tunables 120 60 0 : slabdata 56 56 0
filp 96 96 160 24 1 : tunables 120 60 0 : slabdata 4 4 0
names_cache 1 1 4096 1 1 : tunables 24 12 0 : slabdata 1 1 0
idr_layer_cache 70 87 136 29 1 : tunables 120 60 0 : slabdata 3 3 0
buffer_head 0 0 52 72 1 : tunables 120 60 0 : slabdata 0 0 0
mm_struct 9 9 416 9 1 : tunables 54 27 0 : slabdata 1 1 0
vm_area_struct 184 184 84 46 1 : tunables 120 60 0 : slabdata 4 4 0
fs_cache 20 113 32 113 1 : tunables 120 60 0 : slabdata 1 1 0
files_cache 20 20 192 20 1 : tunables 120 60 0 : slabdata 1 1 0
signal_cache 30 30 384 10 1 : tunables 54 27 0 : slabdata 3 3 0
sighand_cache 21 21 1312 3 1 : tunables 24 12 0 : slabdata 7 7 0
task_struct 24 24 672 6 1 : tunables 54 27 0 : slabdata 4 4 0
anon_vma 65 339 8 339 1 : tunables 120 60 0 : slabdata 1 1 0
pid 36 101 36 101 1 : tunables 120 60 0 : slabdata 1 1 0
size-4194304(DMA) 0 0 4194304 1 1024 : tunables 1 1 0 : slabdata 0 0 0
size-4194304 0 0 4194304 1 1024 : tunables 1 1 0 : slabdata 0 0 0
size-2097152(DMA) 0 0 2097152 1 512 : tunables 1 1 0 : slabdata 0 0 0
size-2097152 0 0 2097152 1 512 : tunables 1 1 0 : slabdata 0 0 0
size-1048576(DMA) 0 0 1048576 1 256 : tunables 1 1 0 : slabdata 0 0 0
size-1048576 0 0 1048576 1 256 : tunables 1 1 0 : slabdata 0 0 0
size-524288(DMA) 0 0 524288 1 128 : tunables 1 1 0 : slabdata 0 0 0
size-524288 0 0 524288 1 128 : tunables 1 1 0 : slabdata 0 0 0
size-262144(DMA) 0 0 262144 1 64 : tunables 1 1 0 : slabdata 0 0 0
size-262144 0 0 262144 1 64 : tunables 1 1 0 : slabdata 0 0 0
size-131072(DMA) 0 0 131072 1 32 : tunables 8 4 0 : slabdata 0 0 0
size-131072 0 0 131072 1 32 : tunables 8 4 0 : slabdata 0 0 0
size-65536(DMA) 0 0 65536 1 16 : tunables 8 4 0 : slabdata 0 0 0
size-65536 0 0 65536 1 16 : tunables 8 4 0 : slabdata 0 0 0
size-32768(DMA) 0 0 32768 1 8 : tunables 8 4 0 : slabdata 0 0 0
size-32768 0 0 32768 1 8 : tunables 8 4 0 : slabdata 0 0 0
size-16384(DMA) 0 0 16384 1 4 : tunables 8 4 0 : slabdata 0 0 0
size-16384 0 0 16384 1 4 : tunables 8 4 0 : slabdata 0 0 0
size-8192(DMA) 0 0 8192 1 2 : tunables 8 4 0 : slabdata 0 0 0
size-8192 0 0 8192 1 2 : tunables 8 4 0 : slabdata 0 0 0
size-4096(DMA) 0 0 4096 1 1 : tunables 24 12 0 : slabdata 0 0 0
size-4096 4 4 4096 1 1 : tunables 24 12 0 : slabdata 4 4 0
size-2048(DMA) 0 0 2048 2 1 : tunables 24 12 0 : slabdata 0 0 0
size-2048 12 14 2048 2 1 : tunables 24 12 0 : slabdata 7 7 0
size-1024(DMA) 0 0 1024 4 1 : tunables 54 27 0 : slabdata 0 0 0
size-1024 11 12 1024 4 1 : tunables 54 27 0 : slabdata 3 3 0
size-512(DMA) 0 0 512 8 1 : tunables 54 27 0 : slabdata 0 0 0
size-512 208 208 512 8 1 : tunables 54 27 0 : slabdata 26 26 0
size-256(DMA) 0 0 256 15 1 : tunables 120 60 0 : slabdata 0 0 0
size-256 75 75 256 15 1 : tunables 120 60 0 : slabdata 5 5 0
size-192(DMA) 0 0 192 20 1 : tunables 120 60 0 : slabdata 0 0 0
size-192 40 40 192 20 1 : tunables 120 60 0 : slabdata 2 2 0
size-128(DMA) 0 0 128 30 1 : tunables 120 60 0 : slabdata 0 0 0
size-128 86 90 128 30 1 : tunables 120 60 0 : slabdata 3 3 0
size-96(DMA) 0 0 96 40 1 : tunables 120 60 0 : slabdata 0 0 0
size-96 388 400 96 40 1 : tunables 120 60 0 : slabdata 10 10 0
size-64(DMA) 0 0 64 59 1 : tunables 120 60 0 : slabdata 0 0 0
size-32(DMA) 0 0 32 113 1 : tunables 120 60 0 : slabdata 0 0 0
size-64 451 472 64 59 1 : tunables 120 60 0 : slabdata 8 8 0
size-32 871 904 32 113 1 : tunables 120 60 0 : slabdata 8 8 0
kmem_cache 125 160 96 40 1 : tunables 120 60 0 : slabdata 4 4 0
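Each data line in the version 2.1 format has three colon-separated sections, so it is easy to pull apart in a script. The following is a minimal sketch in Python; the names `SlabInfo` and `parse_slabinfo_line` are made up for this example, and the field names simply follow the header line of the sample.

```python
from collections import namedtuple

# Field names follow the "# name <active_objs> ..." header line.
SlabInfo = namedtuple(
    "SlabInfo",
    "name active_objs num_objs objsize objperslab pagesperslab "
    "limit batchcount sharedfactor active_slabs num_slabs sharedavail",
)


def parse_slabinfo_line(line):
    """Parse one slabinfo 2.1 data line; return None for header or blank lines."""
    line = line.strip()
    if not line or line.startswith("#") or line.startswith("slabinfo - version"):
        return None
    sections = [part.split() for part in line.split(":")]
    name, *counts = sections[0]
    tunables = sections[1][1:]   # drop the literal word "tunables"
    slabdata = sections[2][1:]   # drop the literal word "slabdata"
    # Any further sections (extra statistics) are ignored by this sketch.
    numbers = [int(value) for value in counts + tunables + slabdata]
    return SlabInfo(name, *numbers)


sample = "dentry 1730 1736 124 31 1 : tunables 120 60 0 : slabdata 56 56 0"
info = parse_slabinfo_line(sample)
print(info.name, info.active_objs, info.num_objs)   # dentry 1730 1736
```

To process a live system, the same function can be applied to each line read from /proc/slabinfo, skipping the None results.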
Documentation of /proc/slabinfo fields
Here is some information about /proc/slabinfo. Some of this information was obtained from a writeup done by Doug Ledford in 2003, for the 2.4.x kernel. See below for the source.
Here is a description of some of the fields in /proc/slabinfo:
- active_objs
- After creating a slab cache, you allocate your objects out of that slab cache. This is the count of objects you currently have allocated out of the cache.
- num_objs
- This is the current total number of objects in the cache.
- objsize
- This is the size of each allocated object, in bytes. There is overhead to maintaining the cache, so with a 512-byte object and a 4096-byte page size, you might fit only 7 objects in a single page and waste 512-slab_overhead bytes per slab allocation. In the sample output above, for example, 256-byte objects fit 15 to a 4096-byte slab rather than 16. Slab overhead varies with object size (smaller objects have more objects per allocation and require more overhead to track used vs. unused objects).
- objperslab
- This is how many objects can be put into each slab. To find the amount of wasted bytes per slab, multiply the objsize by objperslab, and subtract this from pagesperslab times the pagesize (often 4096).
- pagesperslab
- This is the size of each slab in units of memory pages. Page size is architecture specific, but the most common size is 4k (4096 bytes). Some architectures have an 8k page size, and ia64 can do a 16k page size. Each slab for the cache is pagesperslab * arch_page_size bytes at a time, and total memory used by this particular slab cache is num_slabs * pagesperslab * arch_page_size.
- active_slabs
- This is the number of slabs that have at least one object in use.
- num_slabs
- This is the number of slabs currently allocated for the given cache.
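The two formulas given under objperslab and pagesperslab can be worked through with numbers from the sample output. This is a small sketch in Python; the function names are invented for the example and the 4096-byte page size is an assumption that must be adjusted for other architectures.

```python
PAGE_SIZE = 4096  # assumed page size; architecture specific


def wasted_bytes_per_slab(objsize, objperslab, pagesperslab, page_size=PAGE_SIZE):
    """Slab size minus the space taken by the objects themselves."""
    return pagesperslab * page_size - objsize * objperslab


def cache_memory_bytes(num_slabs, pagesperslab, page_size=PAGE_SIZE):
    """Total memory held by one slab cache."""
    return num_slabs * pagesperslab * page_size


# nfs_inode_cache 76 78 608 6 1 : tunables 54 27 0 : slabdata 13 13 0
print(wasted_bytes_per_slab(objsize=608, objperslab=6, pagesperslab=1))  # 448
print(cache_memory_bytes(num_slabs=13, pagesperslab=1))                  # 53248
```

Note that the "wasted" figure includes whatever bookkeeping data the allocator keeps inside the slab, so not all of it is truly unused.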
On SMP machines, the slab cache will keep a per-CPU cache of objects so that an object freed on CPU0 will be reused on CPU0 instead of CPU1 if possible. This improves cache performance on SMP systems greatly.
- limit
- This is the limit on the number of free objects that can be stored in the per-CPU free list for this slab cache.
- batchcount
- On SMP systems, when we refill the available object list, instead of doing one object at a time, we do batchcount objects at a time.
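How limit and batchcount interact can be illustrated with a toy per-CPU free list. This is a sketch in Python under stated assumptions, not the kernel's implementation: the class `ToyPerCpuCache` and the plain list standing in for the shared pool of objects are invented for the example, and the rule that a full list returns batchcount objects to the pool on free is an assumption not stated in the text above.

```python
class ToyPerCpuCache:
    """Toy model of one CPU's free-object list for a single slab cache."""

    def __init__(self, limit, batchcount):
        self.limit = limit            # most free objects this CPU may hold
        self.batchcount = batchcount  # objects moved per refill (or flush)
        self.free_objs = []
        self.refills = 0

    def alloc(self, shared_pool):
        if not self.free_objs:
            # Refill in one batch instead of fetching a single object.
            # (Raises IndexError below if the shared pool is also empty.)
            count = min(self.batchcount, len(shared_pool))
            self.free_objs.extend(shared_pool.pop() for _ in range(count))
            self.refills += 1
        return self.free_objs.pop()

    def free(self, obj, shared_pool):
        if len(self.free_objs) >= self.limit:
            # Assumption for this sketch: a full list hands one batch back.
            for _ in range(self.batchcount):
                shared_pool.append(self.free_objs.pop(0))
        self.free_objs.append(obj)


# Tunables taken from the rpc_buffers line of the sample: limit 24, batchcount 12.
pool = list(range(100))
cpu0 = ToyPerCpuCache(limit=24, batchcount=12)
objs = [cpu0.alloc(pool) for _ in range(13)]
print(cpu0.refills)   # 2: thirteen allocations needed two batches of twelve
print(len(pool))      # 76 objects remain in the shared pool
```

Batching means the shared pool, which would need locking on a real SMP system, is touched once per twelve allocations here rather than once per allocation.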
Note that if slab debugging is enabled in the kernel configuration (CONFIG_DEBUG_SLAB), then you'll get more numbers on each line and the lines will have these fields:
<name> <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> :
tunables <limit> <batchcount> <sharedfactor> :
slabdata <active_slabs> <num_slabs> <sharedavail> :
globalstat <listallocs> <maxobjs> <grown> <reaped> <error> <maxfreeable> <nodeallocs> <remotefrees> <alienoverflow> :
cpustats <allochit> <allocmiss> <freehit> <freemiss>
These additional fields are not documented here yet.
Source: Documenting slabinfo (http://kerneltrap.org/node/1104) - article by Doug Ledford in 2003 (for the 2.4 kernel - may be obsolete for 2.6)
Programs
The 2.6.22 kernel includes a program called slabinfo.c, located at: Documentation/vm/slabinfo.c. This program can be used for debugging the slub allocator, not the slab allocator.