Storage Data Migration with Oracle ASM
Scenario
The customer's existing two-node Oracle RAC cluster runs in a VMware virtualized environment and connects over iSCSI to an aging IBM FastT array hosting light database workloads. The customer has newly purchased a Huawei OceanStor 5500 V3 array and will carve out part of its capacity to replace the IBM array behind the RAC cluster, resolving the business impact of the old array's capacity and performance shortfalls.
OS: Red Hat Enterprise Linux Server release 6.4
DB: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
Old storage: IBM FastT700
New storage: Huawei OceanStor 5500 V3
Data migration approach: the old array's LUNs are presented to the Oracle database VMs and consumed through ASM (Automatic Storage Management). The migration maps LUNs from the new 5500 V3 to the Oracle RAC hosts and uses ASM's dynamic rebalancing to swap disks, moving the data from the old storage to the new one online.
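At its core, the swap is the standard ASM add/drop/rebalance pattern, run once per disk group (a sketch with illustrative names; the actual commands appear in section 3.2 below):
SQL> alter diskgroup DATA add disk '/dev/new-disk' name DATA_0001;
SQL> alter diskgroup DATA drop disk DATA_0000;
SQL> select operation,state,power from v$asm_operation;  -- rebalance is done when this returns no rows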
Migration Procedure
1. Plan the new storage system
Create the disk domain, storage pool, LUNs, and host mappings on the new array according to the plan; the process is not detailed here.
2. On the array, map the new LUNs to the two RAC nodes and verify that each node recognizes them.
2.1. Discover the new LUNs on the host:
# service iscsi restart
# Confirm that the newly mapped LUNs are visible to the OS:
# fdisk -l
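# Note: a gentler alternative, assuming the open-iscsi initiator used here, is to rescan the
# existing sessions; this discovers new LUNs without restarting the whole iSCSI service and
# avoids the I/O interruption described under the second error below:
# iscsiadm -m session --rescan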
2.2. Generate udev rules for the new disks (sde, sdf, and sdg here; adjust the drive letters to your environment):
for i in e f g; do
echo "KERNEL==\"sd*\", BUS==\"scsi\", PROGRAM==\"/sbin/scsi_id --whitelisted --replace-whitespace --device=/dev/\$name\", RESULT==\"`/sbin/scsi_id --whitelisted --replace-whitespace --device=/dev/sd$i`\", NAME=\"asm-disk$i\", OWNER=\"grid\", GROUP=\"asmadmin\", MODE=\"0660\""
done
# The script prints the following:
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id --whitelisted --replace-whitespace --device=/dev/$name", RESULT=="369c37f41007ca7d15a24f37100000041", NAME="asm-diske", OWNER="grid", GROUP="asmadmin", MODE="0660"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id --whitelisted --replace-whitespace --device=/dev/$name", RESULT=="369c37f41007ca7d15a2f915400000042", NAME="asm-diskf", OWNER="grid", GROUP="asmadmin", MODE="0660"
KERNEL=="sd*", BUS=="scsi", PROGRAM=="/sbin/scsi_id --whitelisted --replace-whitespace --device=/dev/$name", RESULT=="369c37f41007ca7d15af086ec00000045", NAME="asm-diskg", OWNER="grid", GROUP="asmadmin", MODE="0660"
# Append the output above to the rules file below, then restart udev:
# vi /etc/udev/rules.d/99-oracleasm.rules
# /sbin/start_udev
# ls -l /dev/asm*
brw-rw---- 1 grid asmadmin 8, 16 Oct 20 17:43 /dev/asm-diske
brw-rw---- 1 grid asmadmin 8, 32 Oct 20 17:43 /dev/asm-diskf
brw-rw---- 1 grid asmadmin 8, 48 Oct 20 17:43 /dev/asm-diskg
# Verify that the new disks are recognized by the OS with the correct ownership and permissions.
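# Each rule matches any sd* device whose scsi_id WWID equals the recorded RESULT value and
# creates it as /dev/asm-diskX owned by grid:asmadmin, keeping the name stable across reboots
# and nodes. To dry-run a rule against a single disk without restarting udev (a sketch,
# assuming the new disk is sde):
# udevadm test /block/sde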
While discovering the new LUNs on the hosts, the RAC cluster services were left running, which caused several cluster resources to fail; recovering from this cost some time.
The first error was as follows:
[grid@rac01 ~]$ sqlplus "/as sysdba"
SQL> select NAME , STATE FROM V$ASM_DISKGROUP;
NAME STATE
------------------------------ -----------
CRS DISMOUNTED
DATA MOUNTED
FRA MOUNTED
SQL>
SQL> alter diskgroup crs mount;
alter diskgroup crs mount
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15260: permission denied on ASM disk group
# Cause: the session was connected as SYSDBA, which cannot mount a disk group in 11g; after reconnecting as SYSASM, the command succeeds:
[grid@rac01 ~]$ sqlplus "/as sysasm"
SQL> alter diskgroup CRS mount;
Diskgroup altered.
SQL> select NAME , STATE FROM V$ASM_DISKGROUP;
NAME STATE
------------------------------ -----------
CRS MOUNTED
DATA MOUNTED
FRA MOUNTED
###################### Second error ########################
# The old storage is attached to the hosts over iSCSI. Restarting the iSCSI service to discover the new disks cut off I/O between the database and the existing disks, and the Oracle RAC cluster failed as follows:
[grid@rac01 ~]$ crs_stat -t
CRS-0184: Cannot communicate with the CRS daemon.
[grid@rac01 ~]$ crsctl status resource -t
CRS-4535: Cannot communicate with Cluster Ready Services
CRS-4000: Command Status failed, or completed with errors.
[root@rac01 bin]# ps -ef | grep crsd.bin | grep -v grep
[root@rac01 bin]#
# After finding the RAC cluster services down, try to restart them:
[root@rac01 bin]# ./crsctl start crs
CRS-4640: Oracle High Availability Services is already active
CRS-4000: Command Start failed, or completed with errors.
[root@rac01 bin]# ./crsctl stop crs
CRS-2796: The command may not proceed when Cluster Ready Services is not running
CRS-4687: Shutdown command has completed with errors.
CRS-4000: Command Stop failed, or completed with errors.
# With both a normal start and a normal stop failing, force-stop the stack:
[root@rac01 bin]# ./crsctl stop crs -f
# This takes a while and prints errors along the way; they can be ignored.
# Check crsd.bin, cssd.bin, and evmd.bin; none of the processes remain:
[root@rac01 bin]# ps -ef | grep crsd.bin | grep -v grep
[root@rac01 bin]# ps -ef | grep cssd.bin | grep -v grep
[root@rac01 bin]# ps -ef | grep evmd.bin | grep -v grep
# Start CRS again; this time it succeeds:
[root@rac01 bin]# ./crsctl start crs
CRS-4123: Oracle High Availability Services has been started.
# Startup also takes some time; confirm that each process has come up:
[root@rac01 bin]# ps -ef | grep crsd.bin | grep -v grep
[root@rac01 bin]# ps -ef | grep cssd.bin | grep -v grep
grid 1596 1 1 17:37 ? 00:00:00 /oracle/app/11.2.0/grid/bin/ocssd.bin
[root@rac01 bin]# ps -ef | grep evmd.bin | grep -v grep
grid 1699 1 0 17:37 ? 00:00:00 /oracle/app/11.2.0/grid/bin/evmd.bin
# crsd.bin takes the longest; keep polling until it appears:
[root@rac01 bin]# ps -ef | grep crsd.bin | grep -v grep
[root@rac01 bin]# ps -ef | grep crsd.bin | grep -v grep
[root@rac01 bin]# ps -ef | grep crsd.bin | grep -v grep
root 1828 1 2 17:38 ? 00:00:00 /oracle/app/11.2.0/grid/bin/crsd.bin reboot
# Query the resource status again; it now responds normally:
[root@rac01 bin]# ./crsctl stat resource -t
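# To confirm the whole stack is healthy, the daemon-level checks can be used as well; they
# report the online/offline state of OHAS, CRS, CSS, and EVM:
[root@rac01 bin]# ./crsctl check crs
[root@rac01 bin]# ./crsctl check cluster -all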
# With that, node 1 is done, though by a decidedly rocky path.
2.3. Prepare the newly mapped LUNs on node 2, this time stopping the cluster first:
a. Stop the cluster services on the current node:
[root@rac02 bin]# pwd
/oracle/app/11.2.0/grid/bin
[root@rac02 bin]# ./crsctl stop cluster
b. Confirm that the processes are gone:
[root@rac02 bin]# ps -ef | grep crsd.bin | grep -v grep
[root@rac02 bin]# ps -ef | grep pmon | grep -v grep
c. Restart iSCSI to discover the new LUNs:
[root@rac02 bin]# service iscsi restart
Append the output from step 2.2 to /etc/udev/rules.d/99-oracleasm.rules, then restart udev:
[root@rac02 bin]# /sbin/start_udev
[root@rac02 bin]# ls -l /dev/asm*
brw-rw---- 1 grid asmadmin 8, 16 Oct 20 17:43 /dev/asm-diske
brw-rw---- 1 grid asmadmin 8, 32 Oct 20 17:43 /dev/asm-diskf
brw-rw---- 1 grid asmadmin 8, 48 Oct 20 17:43 /dev/asm-diskg
Verify that the new disks are recognized with the correct ownership and permissions.
d. Restart the cluster services on the current node:
[root@rac02 bin]# ./crsctl start cluster
e. Check the processes and services:
[root@rac02 bin]# ./crsctl status resource -t
# Output omitted here.
[root@rac02 bin]# ps -ef | grep pmon | grep -v grep
grid 6728 1 0 18:10 ? 00:00:00 asm_pmon_+ASM2
oracle 7118 1 0 18:10 ? 00:00:00 ora_pmon_orcl2
[root@rac02 bin]# ps -ef | grep crsd.bin | grep -v grep
root 6788 1 1 18:10 ? 00:00:00 /oracle/app/11.2.0/grid/bin/crsd.bin reboot
# At this point, both nodes recognize the newly added LUNs.
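# As a sanity check, compare the WWIDs of the udev-created devices on both nodes; identical
# IDs confirm the two nodes see the same LUNs (a sketch reusing the scsi_id call from step 2.2):
# for d in /dev/asm-diske /dev/asm-diskf /dev/asm-diskg; do /sbin/scsi_id --whitelisted --replace-whitespace --device=$d; done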
3. Migrate the data with ASM rebalancing
3.1. Review the current disk information:
Query and record the current disk group information:
SQL> select group_number,name,state,total_mb,free_mb from v$asm_diskgroup;
GROUP_NUMBER NAME STATE TOTAL_MB FREE_MB
------------ ---------- ----------- ---------- ----------
1 CRS MOUNTED 5270 3746
2 DATA CONNECTED 51200 19206
3 FRA CONNECTED 8192 7402
Query the ASM disks that belong to the current disk groups:
SQL> select header_status,name,failgroup,path,total_mb,free_mb from v$asm_disk where GROUP_NUMBER in(1,2,3);
HEADER_STATU NAME FAILGROUP PATH TOTAL_MB FREE_MB
------------ ---------- --------------- -------------------- ---------- ----------
MEMBER CRS_0000 CRS_0000 /dev/asm-diskb 5270 3746
MEMBER FRA_0000 FRA_0000 /dev/asm-diskc 8192 7402
MEMBER DATA_0000 DATA_0000 /dev/asm-diskd 51200 19206
List all disks visible to ASM, including unassigned candidates:
SQL> select header_status,name,failgroup,path,total_mb,free_mb from v$asm_disk;
HEADER_STATU NAME FAILGROUP PATH TOTAL_MB FREE_MB
------------ ---------- --------------- -------------------- ---------- ----------
MEMBER CRS_0000 CRS_0000 /dev/asm-diskb 5270 3746
MEMBER FRA_0000 FRA_0000 /dev/asm-diskc 8192 7402
MEMBER DATA_0000 DATA_0000 /dev/asm-diskd 51200 19196
CANDIDATE /dev/asm-diske 0 0
CANDIDATE /dev/asm-diskf 0 0
CANDIDATE /dev/asm-diskg 0 0
6 rows selected.
3.2. Add the new disks to the existing disk groups and prepare the data migration:
Pair each new disk with a disk group of matching capacity (asm-diskg with CRS, asm-diskf with FRA, asm-diske with DATA), then add the new disks to their corresponding groups:
SQL> select header_status,name,failgroup,path,os_mb,total_mb,free_mb from v$asm_disk;
HEADER_STATU NAME FAILGROUP PATH OS_MB TOTAL_MB FREE_MB
------------ ---------- --------------- -------------------- ------- ---------- ----------
MEMBER CRS_0000 CRS_0000 /dev/asm-diskb 5270 5270 3746
MEMBER FRA_0000 FRA_0000 /dev/asm-diskc 8192 8192 7402
MEMBER DATA_0000 DATA_0000 /dev/asm-diskd 51200 51200 19196
CANDIDATE /dev/asm-diske 0 0 0
CANDIDATE /dev/asm-diskf 0 0 0
CANDIDATE /dev/asm-diskg 0 0 0
6 rows selected.
SQL> alter diskgroup CRS add disk '/dev/asm-diskg' name CRS_0001;
SQL> alter diskgroup FRA add disk '/dev/asm-diskf' name FRA_0001;
SQL> alter diskgroup DATA add disk '/dev/asm-diske' name DATA_0001;
After these commands run, ASM automatically performs the rebalance in the background; check whether it has finished:
SQL> select operation,state,power from v$asm_operation;
OPERATION STAT POWER
---------- ---- ----------
REBAL RUN 1
SQL> select operation,state,power from v$asm_operation;
no rows selected
When the query returns no rows, the automatic rebalance is complete.
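The default rebalance power of 1 is deliberately gentle. If the extra I/O load is acceptable, the rebalance can be accelerated by raising the power (valid range 0-11 on 11.2.0.1), for example:
SQL> alter diskgroup DATA rebalance power 8;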
After the background rebalance completes, check the ASM disks again:
SQL> select header_status,name,failgroup,path,total_mb,free_mb from v$asm_disk;
HEADER_STATU NAME FAILGROUP PATH TOTAL_MB FREE_MB
------------ ---------- --------------- -------------------- ---------- ----------
MEMBER CRS_0000 CRS_0000 /dev/asm-diskb 5270 4442
MEMBER CRS_0001 CRS_0001 /dev/asm-diskg 5270 4574
MEMBER FRA_0000 FRA_0000 /dev/asm-diskc 8192 7754
MEMBER DATA_0001 DATA_0001 /dev/asm-diske 51350 35342
MEMBER FRA_0001 FRA_0001 /dev/asm-diskf 8342 7900
MEMBER DATA_0000 DATA_0000 /dev/asm-diskd 51200 35192
6 rows selected.
Drop the old ASM disks (on the legacy storage):
SQL> alter diskgroup CRS drop disk CRS_0000;
SQL> alter diskgroup FRA drop disk FRA_0000;
SQL> alter diskgroup DATA drop disk DATA_0000;
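As an aside, the add and the drop can be combined into a single statement per disk group, which triggers one rebalance instead of two; for DATA this would have been:
SQL> alter diskgroup DATA add disk '/dev/asm-diske' name DATA_0001 drop disk DATA_0000;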
Wait for the automatic rebalance to finish:
SQL> select operation,state,power from v$asm_operation;
OPERATION STAT POWER
---------- ---- ----------
REBAL RUN 1
SQL> select operation,state,power from v$asm_operation;
no rows selected
Disk status after the old (legacy-storage) ASM disks have been dropped:
SQL> select header_status,name,failgroup,path,total_mb,free_mb from v$asm_disk;
HEADER_STATU NAME FAILGROUP PATH TOTAL_MB FREE_MB
------------ ---------- --------------- -------------------- ---------- ----------
FORMER /dev/asm-diskb 0 0
FORMER /dev/asm-diskc 0 0
FORMER /dev/asm-diskd 0 0
MEMBER CRS_0001 CRS_0001 /dev/asm-diskg 5270 3620
MEMBER DATA_0001 DATA_0001 /dev/asm-diske 51350 19336
MEMBER FRA_0001 FRA_0001 /dev/asm-diskf 8342 7464
6 rows selected.
3.3. Remove the old storage disk information at the OS level:
# (Note: perform the following steps in a rolling fashion, finishing one node completely before starting on the other.)
a. Stop the Oracle cluster services: # crsctl stop cluster
b. Edit /etc/udev/rules.d/99-oracleasm.rules and delete the entries for the old disks.
c. On the storage side, remove the mappings for the old LUNs (see the iSCSI cleanup note after this list).
d. Reboot the RAC node: shutdown -r now
e. After the reboot, the Oracle cluster services start automatically; check the status of each service:
# crs_stat -t
# crsctl status resource -t
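If the old array exposes a dedicated iSCSI target, the initiator's stale node records can also be logged out and deleted so the host no longer tries to reach the retired array at boot (a sketch; the target IQN below is hypothetical):
# iscsiadm -m node -T iqn.1992-01.com.ibm:fastt700 -u
# iscsiadm -m node -T iqn.1992-01.com.ibm:fastt700 -o delete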
All services are running normally; the replacement of the old storage with the new is complete.