GlusterFS 3 on x86_64 CentOS 5.4
Required Packages
- Your list will vary based on install but here are the RPMs I needed:
- libibverbs-1.1.2-4.el5.x86_64
- fuse.x86_64
- fuse-libs.x86_64
System layout
- All systems are Supermicro Atom-based 1U servers (AKA one sexy testing environment at 100 W or less), except for node3.
- node0 - Client...was an Asterisk standby server, but it looked bored on the shelf in the storeroom.
- node1 - Spare client box.
- node2 - Yet another spare box. We like Supermicro around here.
- node3 - Intel Core2 Quad Q6600 - 8GB DDR2 - Supermicro C2SBA
Benchmarks
- Note on the Atoms being CPU/network bound: the Atoms use network cards based on MSI, not the more modern MSI-X, and also appear to have an issue with irqbalance. This means they use one core for all their transfers. To see whether this is the case for you, test whether your network cards use MSI or MSI-X. I am going to research this further and update as I have time and hardware available.
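One rough way to run that MSI vs. MSI-X check is to count the interrupt types the kernel lists. This is only a sketch: the function name `irq_types` is made up here, and it assumes the interrupts table labels MSI-X lines with the string "MSI-X" (older kernels such as the 2.6.18 series do; newer kernels may label both kinds the same way, in which case everything lands in the `msi` count).

```shell
# Count MSI and MSI-X interrupt lines in an interrupts table.
# Argument: path to the table (defaults to /proc/interrupts).
# Assumption: MSI-X lines contain the text "MSI-X"; other MSI lines contain "MSI".
irq_types() {
    file="${1:-/proc/interrupts}"
    awk '
        /MSI-X/ { msix++; next }   # check MSI-X first, since it also matches "MSI"
        /MSI/   { msi++;  next }
        END     { printf "msi=%d msix=%d\n", msi, msix }
    ' "$file"
}
```

Run `irq_types` with no argument on the machine under test. To see whether one core is taking all the network interrupts, look at the per-CPU columns of the NIC's line, for example with `grep eth0 /proc/interrupts` (the interface name is an assumption).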
Baseline Client/Server Benchmarks
- HDPARM
- (node3 Seagate ST3750640AS SCSI Driver : ata_piix) hdparm -Tt /dev/sda
- Timing cached reads: 13936 MB in 2.00 seconds = 6984.90 MB/sec
- Timing buffered disk reads: 188 MB in 3.01 seconds = 62.43 MB/sec
- (node0 2x Seagate ST9160412AS drives in Linux software Raid 1 - SCSI Driver : AHCI) hdparm -Tt /dev/sda
- Timing cached reads: 2280 MB in 2.00 seconds = 1138.96 MB/sec
- Timing buffered disk reads: 108 MB in 3.02 seconds = 35.72 MB/sec
- (node1 and node2 Seagate ST3500320NS - SCSI Driver : AHCI) hdparm -Tt /dev/sda
- Timing cached reads: 2340 MB in 2.00 seconds = 1169.44 MB/sec
- Timing buffered disk reads: 214 MB in 3.01 seconds = 71.19 MB/sec
- Bonnie++ (bonnie++ -d /tmp/ -u 99 -r 512)
- (node0 2x Seagate ST9160412AS drives in Linux software Raid 1)
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
gnbd_serv1       1G   128  97 61891  54 26970  14   541  98 75673  11 974.0  18
Latency                 326ms     646ms    1182ms   35568us   59974us     109ms
Version  1.96       ------Sequential Create------ --------Random Create--------
gnbd_serv1          -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 18367  86 +++++ +++ 25711  92 19561  88 +++++ +++ 26570  93
Latency                 419us    2470us    2359us     373us      20us     178us
- (node1 and node2 Seagate ST3500320NS)
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
drbd1            1G   104  99 80488  69 43908  25   561  99 130305  21 837.1  15
Latency               84830us     664ms     516ms   27590us   32567us     628ms
Version  1.96       ------Sequential Create------ --------Random Create--------
drbd1               -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 16085  83 +++++ +++ 23609  94 17329  87 +++++ +++ 22990  89
Latency                 809us    2269us    2374us     449us      19us      92us
node0 as client
Mounted Replicated (Raid 1) FS using FUSE w/ no cache or writeback
- Bonnie++ 1.9 (RPMforge)
- bonnie++ -d /mnt/gluster/ -u 99 -r 512
- dd if=/dev/zero of=/mnt/gluster/sample-file-1 bs=1M count=1000
- 134217728 bytes (134 MB) copied, 6.30928 seconds, 21.3 MB/s
- CPU Usage: 65%
Mounted Replicated (Raid 1) FS using FUSE w/ Client Cache and Writeback
- Benchmark Client : node0
- dd if=/dev/zero of=/mnt/gluster/sample-file-1 bs=1M count=1000
- 134217728 bytes (134 MB) copied, 2.98266 seconds, RANGE:30-40.0 MB/s
- CPU Usage: 100% - The speed is CPU bound.
- cache-size 512MB
- window-size 1MB
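The MB/s figure that dd prints is just bytes copied divided by elapsed seconds, in units of 1,000,000 bytes, so the results on this page can be sanity-checked by hand. A minimal sketch follows; the helper name `dd_rate` is made up for illustration. Note that 134217728 bytes is 128 MiB and 1073741824 bytes is 1 GiB, so the reported byte counts are the numbers to trust when comparing runs.

```shell
# Compute throughput the way dd reports it: bytes / seconds / 10^6, one decimal.
# Arguments: BYTES SECONDS (both taken from a dd result line).
dd_rate() {
    LC_ALL=C awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}
```

For example, `dd_rate 134217728 6.30928` prints `21.3`, matching the uncached node0 result.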
node3 as client
Test 1 Mounted Replicated (Raid 1) FS using FUSE w/ Client Cache
- Benchmark Application : dd
- dd if=/dev/zero of=/mnt/gluster/sample-file-3 bs=1M count=1000
- 1073741824 bytes (1.1 GB) copied, 19.8335 seconds, 54.1 MB/s
- CPU Usage: 80%
- cache-size 512MB
- window-size 1MB
- Benchmark Application : Bonnie++ (bonnie++ -d /tmp/ -u 99 -r 512)
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
stardust.80sretr 1G    28  10 44402   4  2167   0    58  20 59626   4 318.7   3
Latency                1918ms     131ms     335ms     174ms   66025us     352ms
Version  1.96       ------Sequential Create------ --------Random Create--------
stardust.80sretro.c -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   183   0  1003   0   257   0   164   0  1157   0   268   0
Latency                 171ms   14187us     218ms     332ms    7720us     149ms
Test 2 Mounted Replicated (Raid 1) FS using FUSE w/ Client Cache
- GlusterFS Test 2 client config
- GlusterFS Test 2 server config
- bonnie++ -u 99 -r 512
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
vserverdev.world 1G    46  35 49444   5  1244   0    52  28 114460  10 471.8   3
Latency                 204ms     215ms     261ms     190ms    1641us     288ms
Version  1.96       ------Sequential Create------ --------Random Create--------
dev.                -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   595   3  2845   3   696   1   572   3  3255   3   671   1
Latency                 270ms   48366us     202ms     148ms    1300us     216ms
Test 3 Mounted Replicated (Raid 1) FS using FUSE w/ Client Cache
- GlusterFS Test 3 client config
- GlusterFS Test 3 server config same as test 2
- bonnie++ -u 99 -r 512
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
dev              1G    50  33 48255   3  1207   0    73  22 114444   9 820.2   5
Latency                 181ms     215ms     261ms     115ms    1734us     684ms
Version  1.96       ------Sequential Create------ --------Random Create--------
dev                 -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   582   2  2811   3   615   1   566   2  3365   3   419   0
Latency                 192ms    3348us     204ms     137ms    2276us     403ms
1.96,1.96,dev,1,1270216200,1G,,50,33,48255,3,1207,0,73,22,114444,9,820.2,5,16,,,,,582,2,2811,3,615,1,566,2,3365,3,419,0,181ms,215ms,261ms,115ms,1734us,684ms,192ms,3348us,204ms,137ms,2276us,403ms
Test 4 Mounted Replicated (Raid 1) FS using FUSE w/ Client Cache
- GlusterFS Test 4 client config
- GlusterFS Test 4 server config same as test 2
- bonnie++ -u 99 -r 512
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
vserverdev       1G    60  32 50271   3  1296   0    66  23 114424   9 821.6   6
Latency                 178ms     205ms     260ms     186ms    1979us   91308us
Version  1.96       ------Sequential Create------ --------Random Create--------
vserverdev.         -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   546   3  2705   2   573   1   554   2  3152   3   404   0
Latency                 292ms    3525us     213ms     261ms    1276us     368ms
1.96,1.96,vserverdev,1,1270244440,1G,,60,32,50271,3,1296,0,66,23,114424,9,821.6,6,16,,,,,546,3,2705,2,573,1,554,2,3152,3,404,0,178ms,205ms,260ms,186ms,1979us,91308us,292ms,3525us,213ms,261ms,1276us,368ms
Test 5 Mounted Replicated (Raid 1) FS using FUSE w/ Client Cache
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
vserverdev       1G    61  32 50627   6  1282   0    66  22 114489   8 823.7   4
Latency                 177ms     120ms     409ms     190ms    1788us   93080us
Version  1.96       ------Sequential Create------ --------Random Create--------
vserverdev          -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   594   3  2940   2   306   0   569   3  3228   3   692   1
Latency                 179ms    2738us     583ms     140ms    1070us     645ms
Test 6 Mounted Replicated (Raid 1) FS using FUSE w/ Client and Server Cache
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
vserverdev.      1G    88  31 46680   3  1244   0    56  28 114452  13 826.6   4
Latency                 110ms     222ms     293ms     190ms    1878us   50357us
Version  1.96       ------Sequential Create------ --------Random Create--------
vserverdev.         -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16   549   2  2822   2   503   1   561   2  3403   3   459   0
Latency                 205ms    2763us     400ms     155ms     707us     405ms
Gluster Examples
Replicated (Raid 1)
- Client File (/etc/glusterfs/glusterfs.vol) :
volume drbd1
  type protocol/client
  option transport-type tcp
  option remote-host 192.168.1.201
  option remote-subvolume brick1
end-volume

volume drbd2
  type protocol/client
  option transport-type tcp
  option remote-host 192.168.1.203
  option remote-subvolume brick1
end-volume

volume replicate1
  type cluster/replicate
  subvolumes drbd1 drbd2
end-volume
- node1 config file (/etc/glusterfs/glusterfsd.vol)
- Generated by :
- $ /usr/bin/glusterfs-volgen --name storetest --raid 1 $server1_name_or_ip:$/dir/youwant/to/export $server2_name_or_ip:$/dir/youwant/to/export
volume posix1
  type storage/posix
  option directory /mnt/export
end-volume

volume locks1
  type features/locks
  subvolumes posix1
end-volume

volume brick1
  type performance/io-threads
  option thread-count 8
  subvolumes locks1
end-volume

volume server-tcp
  type protocol/server
  option transport-type tcp
  option auth.addr.brick1.allow *
  option transport.socket.listen-port 6996
  option transport.socket.nodelay on
  subvolumes brick1
end-volume
- node2 config file (/etc/glusterfs/glusterfsd.vol):
- Generated by :
- $ /usr/bin/glusterfs-volgen --name storetest --raid 1 $server1_name_or_ip:$/dir/youwant/to/export $server2_name_or_ip:$/dir/youwant/to/export
volume posix1
  type storage/posix
  option directory /mnt/export
end-volume

volume locks1
  type features/locks
  subvolumes posix1
end-volume

volume brick1
  type performance/io-threads
  option thread-count 8
  subvolumes locks1
end-volume

volume server-tcp
  type protocol/server
  option transport-type tcp
  option auth.addr.brick1.allow *
  option transport.socket.listen-port 6996
  option transport.socket.nodelay on
  subvolumes brick1
end-volume
Replicated + Distributed (Raid 10)
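A distributed-replicated client volfile can be sketched by stacking a cluster/distribute volume on top of several cluster/replicate pairs. The helper below is an illustration under assumptions, not a tested config from this setup: the function name `gen_raid10_client` and the volume names (client1, replicate1, distribute) are made up, and it assumes every server exports a subvolume named brick1 as in the replicated example.

```shell
# Print a client volfile that distributes files across replicated pairs.
# Arguments: an even number (at least 4) of server addresses;
# consecutive servers form one replicate pair.
# Assumption: each server exports "brick1"; all volume names are hypothetical.
gen_raid10_client() {
    if [ "$#" -lt 4 ] || [ $(( $# % 2 )) -ne 0 ]; then
        echo "usage: gen_raid10_client HOST1 HOST2 HOST3 HOST4 [...]" >&2
        return 1
    fi
    i=0
    pairs=""
    for host in "$@"; do
        i=$(( i + 1 ))
        # One protocol/client volume per server.
        printf 'volume client%d\n  type protocol/client\n  option transport-type tcp\n' "$i"
        printf '  option remote-host %s\n  option remote-subvolume brick1\nend-volume\n\n' "$host"
        # After every second server, mirror the last two clients.
        if [ $(( i % 2 )) -eq 0 ]; then
            n=$(( i / 2 ))
            printf 'volume replicate%d\n  type cluster/replicate\n' "$n"
            printf '  subvolumes client%d client%d\nend-volume\n\n' "$(( i - 1 ))" "$i"
            pairs="$pairs replicate$n"
        fi
    done
    # Spread files across the mirrored pairs.
    printf 'volume distribute\n  type cluster/distribute\n  subvolumes%s\nend-volume\n' "$pairs"
}
```

Something like `gen_raid10_client 192.168.1.201 192.168.1.203 192.168.1.205 192.168.1.207 > /tmp/raid10-client.vol` would write a volfile to review before use; the last two addresses and the output path are placeholders.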
Quick Cheat Sheet
Translator Cheat Sheet
- cluster/replicate and cluster/afr (same thing as far as I can tell)
Client Cheat Sheet
- To make files "reappear" on a new, replaced, or rebuilding node just do a `ls -al` or similar ls command on the client system. This will only work if the file was put on a server node by a glusterfs client.
- To run VMware VMs from the Gluster mount, add this setting to the .vmx file:
- mainMem.useNamedFile = "FALSE"
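The `ls -al` trick for making files reappear only looks at one directory. To touch every file under the mount, a recursive walk that stats each entry is commonly suggested for GlusterFS of this era. The sketch below assumes GNU find and xargs; the function name `trigger_self_heal` is made up, and the mount path is whatever your client uses.

```shell
# Walk a Gluster client mount and stat every entry so each file is looked up.
# Argument: the client mount point (e.g. /mnt/gluster).
# Assumption: GNU find (-noleaf, -print0) and an xargs that supports -0.
trigger_self_heal() {
    mount_point="${1:?usage: trigger_self_heal MOUNT_POINT}"
    find "$mount_point" -noleaf -print0 | xargs -0 stat > /dev/null
}
```

Run it on the client as `trigger_self_heal /mnt/gluster`. On a large volume this generates a lot of lookups, so it may be worth running off-hours.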
Reference and Further Reading
- http://www.gluster.com/community/documentation/index.php/Understanding_AFR_Translator
- http://www.gluster.com/community/documentation/index.php/GlusterFS_Translators_v1.3
- http://www.voicesofit.com/blogs/blog1.php?s=gluster&sentence=AND
- GlusterFS Optimizing Guide
Common Issues
- Apache reports "DocumentRoot must be a directory" at startup when the DocumentRoot is a Gluster mount point.
- SELinux issue: disable SELinux or set the proper permissions.
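The Apache error is often an SELinux symptom rather than a missing directory, so it helps to check both things before changing anything. A small sketch follows; the function name `docroot_report` is made up, and it only reports what it finds.

```shell
# Report whether a path is visible as a directory and what mode SELinux is in.
# Argument: the DocumentRoot path to check.
docroot_report() {
    docroot="${1:?usage: docroot_report PATH}"
    if [ -d "$docroot" ]; then
        echo "directory=yes"
    else
        echo "directory=no"
    fi
    # getenforce ships with the SELinux userland tools; it may be absent.
    if command -v getenforce >/dev/null 2>&1; then
        echo "selinux=$(getenforce)"
    else
        echo "selinux=unavailable"
    fi
}
```

If this prints `directory=yes` and `selinux=Enforcing` while Apache still refuses to start, running `setenforce 0` as root switches SELinux to permissive mode until the next reboot, which is a quick way to confirm whether SELinux is the cause.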