1. CentOS 6.5
2. GlusterFS
3. CTDB
4. Samba
# NFS/CIFS access
192.168.18.220 nas1.rickpc gluster01
192.168.18.2 nas2.rickpc gluster02
# CTDB interconnect
192.168.3.101 gluster01c
192.168.3.102 gluster02c
# GlusterFS interconnect
192.168.2.101 gluster01g
192.168.2.102 gluster02g
1.2. Prepare the physical disks
For the basics of Linux disk file systems and how to partition a disk with fdisk, see the excellent introduction on VBird's site [3]; only the essential commands are listed below.
## Prepare physical partitions to create /dev/sdb5
fdisk /dev/sdb
partprobe
Partition the disks on nas1 and nas2; the result is shown below. The disk used here is 8 GB, but only the following partitions are carved out:
/dev/sdb4 64M
/dev/sdb5 2.1G (to be used as physical-volume space)
Disk /dev/sdb: 8589 MB, 8589934592 bytes
255 heads, 63 sectors/track, 1044 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x9815603c
Device Boot Start End Blocks Id System
/dev/sdb1 9 1044 8321670 5 Extended
/dev/sdb4 1 8 64228+ 83 Linux
/dev/sdb5 9 270 2104483+ 83 Linux
## Create physical volume
pvcreate /dev/sdb5
## Create volume group
vgcreate vg_bricks /dev/sdb5
## Create logical volume
lvcreate -n lv_lock -L 64M vg_bricks
lvcreate -n lv_brick01 -L 1.5G vg_bricks
## Install XFS package
yum install -y xfsprogs
## Format the XFS file systems
mkfs.xfs -i size=512 /dev/vg_bricks/lv_lock
mkfs.xfs -i size=512 /dev/vg_bricks/lv_brick01
echo '/dev/vg_bricks/lv_lock /bricks/lock xfs defaults 0 0' >> /etc/fstab
echo '/dev/vg_bricks/lv_brick01 /bricks/brick01 xfs defaults 0 0' >> /etc/fstab
mkdir -p /bricks/lock
mkdir -p /bricks/brick01
mount /bricks/lock
mount /bricks/brick01
Create the PV, VG, and LVs on nas1 and nas2 respectively; lvdisplay shows:
LV Path /dev/vg_bricks/lv_lock
LV UUID rnRNbZ-QFun-pxvS-AS3f-pvn3-dvCY-h3qXgi
LV Write Access read/write
LV Creation host, time nas1.rickpc, 2014-07-04 16:54:20 +0800
LV Path /dev/vg_bricks/lv_brick01
LV UUID BwMD2T-YOJi-spM4-aarC-3Yyj-Jfe2-nsecIJ
LV Write Access read/write
LV Creation host, time nas1.rickpc, 2014-07-04 16:56:11 +0800
1.4. Install GlusterFS and create volumes
To understand how CTDB and GlusterFS work together and how to install them, see [5][6].
## Install GlusterFS packages on all nodes
yum install -y rpcbind glusterfs-server
chkconfig rpcbind on
service rpcbind restart
service glusterd restart
# Do not auto start glusterd with chkconfig.

## Configure cluster and create volumes from gluster01
## Add gluster02g to the trusted storage pool
gluster peer probe gluster02g
## Verify the peer relationship
gluster peer status
## Create volumes: in the GlusterFS architecture, each volume represents a standalone virtual file system.
# transport tcp
gluster volume create lockvol replica 2 gluster01g:/bricks/lock gluster02g:/bricks/lock force
gluster volume create vol01 replica 2 gluster01g:/bricks/brick01 gluster02g:/bricks/brick01 force
gluster vol start lockvol
gluster vol start vol01
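Once both volumes are started, `gluster volume info` run on either node should report roughly the following (sample output, abridged; only the lockvol entry is shown and the Volume ID is omitted):

```
Volume Name: lockvol
Type: Replicate
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: gluster01g:/bricks/lock
Brick2: gluster02g:/bricks/lock
```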
The GlusterFS virtual file systems are now built on nas1 and nas2; df shows:
/dev/mapper/vg_bricks-lv_lock
60736 3576 57160 6% /bricks/lock
/dev/mapper/vg_bricks-lv_brick01
1562624 179536 1383088 12% /bricks/brick01
localhost:/lockvol 60672 3584 57088 6% /gluster/lock
localhost:/vol01 1562624 179584 1383040 12% /gluster/vol01
1.5. Install and configure Samba/CTDB
## Install Samba/CTDB packages on all nodes
# samba-3.6.9, samba-client-3.6.9, ctdb-1.0.114.5
yum install -y samba samba-client ctdb
## Install NFS
# rpcbind-0.2.0, nfs-utils-1.2.3
yum install -y rpcbind nfs-utils
chkconfig rpcbind on
service rpcbind start
## Configure CTDB and Samba only on gluster01
mkdir -p /gluster/lock
mount -t glusterfs localhost:/lockvol /gluster/lock
## Edit /gluster/lock/ctdb
CTDB_PUBLIC_ADDRESSES=/gluster/lock/public_addresses
CTDB_NODES=/etc/ctdb/nodes
# Only when using Samba. Unnecessary for NFS.
CTDB_SET_DeterministicIPs=1
CTDB_SET_RecoveryBanPeriod=120
CTDB_SET_KeepaliveInterval=5
CTDB_SET_KeepaliveLimit=5
CTDB_SET_MonitorInterval=15
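The fragment above likely omits two lines a working CTDB setup needs. A hedged sketch follows: the lock-file name is an assumption, and CTDB_MANAGES_SAMBA is my guess at the line the "Only when using Samba" comment refers to:

```
# Recovery lock must live on the shared GlusterFS volume (file name is an assumption)
CTDB_RECOVERY_LOCK=/gluster/lock/lockfile
# Let CTDB start/stop smbd itself; only when using Samba, not needed for NFS
CTDB_MANAGES_SAMBA=yes
```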
## Edit /gluster/lock/nodes
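The nodes file simply lists one private IP per node, one per line, and must be identical on all nodes; using the CTDB interconnect addresses from the hosts file above:

```
192.168.3.101
192.168.3.102
```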
## Edit /gluster/lock/public_addresses
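public_addresses holds the floating IPs that CTDB moves between nodes, written as `IP/prefix interface` pairs. The addresses below match the ones tested with smbclient later in this article; the /24 prefix and the eth0 interface name are assumptions:

```
192.168.18.201/24 eth0
192.168.18.202/24 eth0
```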
## Edit /gluster/lock/smb.conf
server string = Samba Server Version %v
comment = Shared Directories
browseable = yes
writable = yes
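The options above need to sit in the proper smb.conf sections; a fuller sketch is below. The share name and path are assumptions (the path points at the mounted data volume), while `clustering = yes` is the stock Samba option that enables CTDB awareness:

```
[global]
    clustering = yes
    server string = Samba Server Version %v

[share]
    comment = Shared Directories
    path = /gluster/vol01
    browseable = yes
    writable = yes
```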
## Create symlink to config files on all nodes
mv /etc/sysconfig/ctdb /etc/sysconfig/ctdb.orig
mv /etc/samba/smb.conf /etc/samba/smb.conf.orig
ln -s /gluster/lock/ctdb /etc/sysconfig/ctdb
ln -s /gluster/lock/nodes /etc/ctdb/nodes
ln -s /gluster/lock/public_addresses /etc/ctdb/public_addresses
ln -s /gluster/lock/smb.conf /etc/samba/smb.conf
## Set SELinux permissive for smbd_t on all nodes due to the non-standard smb.conf location
yum install -y policycoreutils-python
semanage permissive -a smbd_t
# We'd better set an appropriate security context, but there's an open issue for using chcon with GlusterFS.
## Create the following script to start/stop services as /usr/local/bin/ctdb_manage
#!/bin/sh
# runcmd runs the given command on every node (password-less ssh between nodes is assumed)
runcmd () { echo "exec on all nodes: $@"; for n in gluster01 gluster02; do ssh $n "$@"; done; }
case "$1" in
start)
    runcmd service glusterd start
    runcmd mkdir -p /gluster/lock
    runcmd mount -t glusterfs localhost:/lockvol /gluster/lock
    runcmd mkdir -p /gluster/vol01
    runcmd mount -t glusterfs localhost:/vol01 /gluster/vol01
    runcmd service ctdb start
    ;;
stop)
    runcmd service ctdb stop
    runcmd umount /gluster/lock
    runcmd umount /gluster/vol01
    runcmd service glusterd stop
    ;;
esac
1.6. Start services
## Set the Samba password and check the shared directories via one of the floating IPs.
pdbedit -a -u root
## test samba connection
smbclient -L 192.168.18.201 -U root
smbclient -L 192.168.18.202 -U root
## check Windows connection
ssh gluster01 netstat -aT | grep microsoft
2. Testing your clustered Samba
2.1. Client Disconnection
On a Windows PC, map drive Z: to the share and run the run_client.bat shown below
:loop
echo "%time% (^_-) Writing on file in the shared folder...."
echo %time% >> z:/wintest.txt
timeout /t 2 > nul
echo "%time% (-_^) Writing on file in the shared folder...."
echo %time% >> z:/wintest.txt
timeout /t 2 > nul
goto loop
The script appends the current timestamp to Z:/wintest.txt every two seconds. Test steps:
1. Run run_client.bat
2. Disable the network interface on the Windows PC; the program can no longer write to the cluster file system
3. Re-enable the interface; within a short time the program resumes writing to the cluster file system
2.2. CTDB Failover
Use ctdb status and ctdb ip to inspect the current state of the cluster file system
1. Run run_client.bat on the Windows PC
2. On any one cluster node, stop CTDB with: service ctdb stop
3. Observe that the timestamps on the PC are still written to the cluster file system normally
2.3. Cluster Node Crash
Reboot one cluster node and observe the connection state on the Windows PC
Test steps:
1. Run run_client.bat on the Windows PC
2. Shut down the OS on any one cluster node
3. Observe how the timestamps on the PC change
"12:16:49.59 (-_^) Writing on file in the shared folder...."
"12:16:51.62 (^_-) Writing on file in the shared folder...."
"12:16:53.66 (-_^) Writing on file in the shared folder...."
"12:16:55.70 (^_-) Writing on file in the shared folder...."
"12:16:57.74 (-_^) Writing on file in the shared folder...."
"12:17:41.90 (^_-) Writing on file in the shared folder...."
"12:17:43.92 (-_^) Writing on file in the shared folder...."
"12:17:45.95 (^_-) Writing on file in the shared folder...."
"12:17:48.00 (-_^) Writing on file in the shared folder...."
From the two highlighted lines (12:16:57 and 12:17:41), the Windows connection is interrupted for about 44 seconds, but it recovers automatically, which still meets a reasonable level of HA recovery
2.4. Ping_pong for CTDB lock rate
ping_pong [7] is a small tool provided by the Samba open-source project for measuring CTDB's lock rate.
I made minor changes to the source and added reporting of the lock rate to Graphite [7], which makes it easy to observe how the lock rate changes over long periods
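As a sketch of that Graphite integration (not the author's actual code): ping_pong is typically pointed at a file on the clustered file system with one more process than there are nodes, e.g. `ping_pong /gluster/lock/test.dat 3` for this two-node cluster, and prints a running locks-per-second figure. The wrapper below assumes that output format and a Graphite server reachable at graphite.example.com:2003 (hostname, port, and metric name are all assumptions):

```shell
#!/bin/sh
# Hypothetical wrapper: extract the rate from a ping_pong output line such as
# "   9234 locks/sec" and emit it in Graphite's plaintext protocol.
GRAPHITE_HOST=graphite.example.com   # assumption
GRAPHITE_PORT=2003                   # Graphite's default plaintext port

parse_rate() {
    # Print the first field of any line containing "locks/sec"
    echo "$1" | awk '/locks\/sec/ {print $1}'
}

rate=$(parse_rate "   9234 locks/sec")
# Graphite's plaintext protocol is "metric value timestamp", one line per sample;
# uncomment the nc pipe to actually deliver it:
echo "ctdb.lock_rate $rate $(date +%s)"
# echo "ctdb.lock_rate $rate $(date +%s)" | nc $GRAPHITE_HOST $GRAPHITE_PORT
```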
3. Reference