Tuesday, March 13, 2018

Cluster-Activities



1)      Database / instance / listener / agent / ASM / CRS services availability and health


10g/11gR1
                               
Cluster Related Commands
crs_stat -t           - Shows HA resource status
crsstat               - Output of crs_stat -t
ps -ef|grep d.bin     - Lists the crsd.bin, evmd.bin and ocssd.bin daemons
crsctl check crs      - CSS, CRS and EVM appear healthy
crsctl stop crs       - Stops CRS and all other services on the local node
crsctl disable crs*   - Prevents CRS from starting on reboot
crsctl enable crs*    - Enables CRS start on reboot
crs_stop -all         - Stops all registered resources
crs_start -all        - Starts all registered resources
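
The ps check in the list above can be scripted. The following is only a minimal sketch: the function name missing_daemons is hypothetical, and it does nothing more than look for the three daemon names listed above in ps -ef output supplied on standard input.

```shell
#!/bin/sh
# missing_daemons (hypothetical helper): reads `ps -ef` style output on
# standard input and prints the 10g/11gR1 Clusterware daemons that do
# not appear in it. Prints an empty line when all three are present.
missing_daemons() {
    ps_out=$(cat)
    missing=""
    for d in crsd.bin evmd.bin ocssd.bin; do
        # grep -F treats the daemon name as a fixed string, not a regex
        if ! printf '%s\n' "$ps_out" | grep -F -q "$d"; then
            missing="$missing $d"
        fi
    done
    # Strip the leading space before printing
    printf '%s\n' "${missing# }"
}
```

On a cluster node it could be used as: ps -ef | missing_daemons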



Database Related Commands
srvctl start instance -d <db_name> -i <inst_name>    - Starts an instance
srvctl start database -d <db_name>                   - Starts all instances
srvctl stop database -d <db_name>                    - Stops all instances, closes database
srvctl stop instance -d <db_name> -i <inst_name>     - Stops an instance
srvctl start service -d <db_name> -s <service_name>  - Starts a service
srvctl stop service -d <db_name> -s <service_name>   - Stops a service
srvctl status service -d <db_name>                   - Checks status of the database's services
srvctl status instance -d <db_name> -i <inst_name>   - Checks an individual instance
srvctl status database -d <db_name>                  - Checks status of all instances
srvctl start nodeapps -n <node_name>                 - Starts gsd, vip, listener, and ons
srvctl stop nodeapps -n <node_name>                  - Stops gsd, vip, listener, and ons
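
As an illustration of how the status commands above can feed a quick health check, here is a small sketch. It assumes srvctl status database prints one line per instance in the form "Instance <inst_name> is running on node <node_name>" or "Instance <inst_name> is not running on node <node_name>"; that wording and the function name down_instances are assumptions, so verify them against your own output first.

```shell
#!/bin/sh
# down_instances (hypothetical helper): reads `srvctl status database`
# output on standard input and prints the name of every instance that
# is reported as not running. The line format is an assumption.
down_instances() {
    awk '$1 == "Instance" && / is not running / { print $2 }'
}
```

Possible usage: srvctl status database -d <db_name> | down_instances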


11gR2 Grid


2) Checking CRS Status: 

grdoradr103:/apps/ngdbf1/oracle/nghome:actnlf_1> su - oragrd

a)    for local node
grdoradr103:/apps/grid/grdhome:+ASM3> crsctl check crs

b)      for remote nodes in the cluster
grdoradr103:/apps/grid/grdhome:+ASM3> crsctl check cluster -all
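
The output of the two check commands above can be filtered so that only problems remain. This is a sketch under an assumption: healthy lines start with a CRS- message code and end with "is online", as in typical 11gR2 output. The function name crs_check_failures is hypothetical.

```shell
#!/bin/sh
# crs_check_failures (hypothetical helper): reads `crsctl check crs` or
# `crsctl check cluster -all` output on standard input and prints only
# the CRS- message lines that do not end in "is online".
crs_check_failures() {
    awk '/^CRS-/ && !/is online$/'
}
```

Possible usage: crsctl check cluster -all | crs_check_failures
An empty result means every reported service was online.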

c)    Viewing Cluster name: 
Check the $CRS_HOME/cdata directory; it contains a subdirectory named <cluster_name>

d)      Viewing No. Of Nodes configured in Cluster: 
grdoradr103:/apps/grid/grdhome:+ASM3> olsnodes -n -s
grdoradr101     1       Active
grdoradr102     2       Active
grdoradr103     3       Active
grdoradr104     4       Active
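
With many nodes it can help to print only those that are not active. The sketch below relies on the three-column olsnodes -n -s layout shown above (node name, node number, status); the function name inactive_nodes is hypothetical.

```shell
#!/bin/sh
# inactive_nodes (hypothetical helper): reads `olsnodes -n -s` output on
# standard input and prints the names of nodes whose status column is
# anything other than "Active".
inactive_nodes() {
    awk 'NF >= 3 && $3 != "Active" { print $1 }'
}
```

Possible usage: olsnodes -n -s | inactive_nodes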

e)      Check status of cluster resources
grdoradr103:/apps/grid/grdhome:+ASM3> crsctl stat res -t

f)        Check status of local node cluster resources
grdoradr103:/apps/grid/grdhome:+ASM3>  crsctl stat res -t -init

g)      If any resource is down, start it using
crsctl start resource <resource_name>

h)      To start or stop a database, instances, or node applications:
srvctl start database -d DBname
srvctl stop database -d DBname
 
srvctl start instance -d DBname -i INSTANCEname
srvctl stop instance -d DBname -i INSTANCEname

srvctl status nodeapps -n NODEname
srvctl start nodeapps -n NODEname

srvctl config database -d DBname      -> to get some information about the database from OCR.

i)        Stop the clusterware stack on all nodes
[root@rac1 bin]# ./crsctl stop cluster -all

j)        Stop/start all resources on the local node. You need to be logged on to the server as the root user to run these commands. crsctl stop crs stops all HA resources on the local node.
                    
                    crsctl stop crs
                    crsctl start crs
 
Note: crsctl stop crs acts only on the local node, so to stop the whole cluster, contact the cc system team and have the above command run on each individual server.



k)      To check the Cluster version
             ========================
clmoraprd02:/apps/grid/grdhome:+ASM2>  crsctl query crs activeversion
Oracle Clusterware active version on the cluster is [11.2.0.3.0]
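
When the version is needed inside a script, the bracketed value can be extracted from the message shown above. This is a minimal sketch; the function name active_version is hypothetical, and it assumes the version is the only bracketed value on the line.

```shell
#!/bin/sh
# active_version (hypothetical helper): reads the output of
# `crsctl query crs activeversion` on standard input and prints only
# the version string found between the square brackets.
active_version() {
    sed -n 's/.*\[\([0-9.]*\)\].*/\1/p'
}
```

Possible usage: crsctl query crs activeversion | active_version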

3)  To stop the CRS
Run the commands below as the root user, since root is the owner of the ohasd and crsd services:
root      5464     1  0 Feb08 ?        01:48:54 /apps/crs/grid/product/11.2.0.3/bin/ohasd.bin reboot - High Availability service
oracrs    6291     1  0 Feb08 ?        00:02:58 /apps/crs/grid/product/11.2.0.3/bin/mdnsd.bin
oracrs    6304     1  0 Feb08 ?        00:31:45 /apps/crs/grid/product/11.2.0.3/bin/gpnpd.bin
oracrs    6318     1  0 Feb08 ?        03:04:05 /apps/crs/grid/product/11.2.0.3/bin/gipcd.bin
root      6331     1  0 Feb08 ?        13:23:19 /apps/crs/grid/product/11.2.0.3/bin/osysmond.bin
oracrs    6377     1  3 Feb08 ?        2-03:28:34 /apps/crs/grid/product/11.2.0.3/bin/ocssd.bin
root      6573     1  0 Feb08 ?        01:36:54 /apps/crs/grid/product/11.2.0.3/bin/octssd.bin reboot
oracrs    6606     1  0 Feb08 ?        00:03:20 /apps/crs/grid/product/11.2.0.3/bin/evmd.bin
root      7900     1  0 Feb08 ?        03:11:07 /apps/crs/grid/product/11.2.0.3/bin/crsd.bin reboot


---> cd $ORA_CRS_HOME/bin
crsctl stop crs       - This command needs to be run locally on each host.
crsctl start crs      - This command needs to be run locally on each host.
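
To see which Clusterware daemons are owned by root, the process listing above can be filtered. This sketch assumes the same ps -ef column layout as in that listing (owner in column 1, executable path in column 8); the function name root_owned_daemons is hypothetical.

```shell
#!/bin/sh
# root_owned_daemons (hypothetical helper): reads `ps -ef` output on
# standard input and prints the base name of every *.bin executable
# whose owner column is root. Assumes the path is the 8th field.
root_owned_daemons() {
    awk '$1 == "root" && $8 ~ /\.bin$/ {
        n = split($8, parts, "/")   # parts[n] is the file name
        print parts[n]
    }'
}
```

Possible usage: ps -ef | grep d.bin | root_owned_daemons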

4) General Troubleshooting Instructions:

If the services do not go down when stopping, use the command below.
crsctl stop crs -f  - This command needs to be run locally on each DB host.
After starting/stopping, use
crsctl check crs on all the DB hosts to find the status of the CRS services.
To check all the CRS resources, use the script below to identify which resources are down/up.
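
The stop sequence described above (normal stop first, forced stop only if that fails) could be wrapped as follows. This is a sketch, not a tested procedure: it assumes crsctl returns a non-zero exit status when the stop fails, and the function name stop_crs_with_fallback and the CRSCTL variable are hypothetical. It must be run as root on the local node.

```shell
#!/bin/sh
# CRSCTL may be set to a full path such as $ORA_CRS_HOME/bin/crsctl;
# by default the command is looked up on the PATH.
CRSCTL=${CRSCTL:-crsctl}

# stop_crs_with_fallback (hypothetical helper): tries a normal stop and
# falls back to `stop crs -f` only when the first attempt fails.
# Assumption: crsctl signals failure through its exit status.
stop_crs_with_fallback() {
    if "$CRSCTL" stop crs; then
        echo "stopped"
    elif "$CRSCTL" stop crs -f; then
        echo "stopped (forced)"
    else
        echo "stop failed"
        return 1
    fi
}
```

After calling stop_crs_with_fallback, crsctl check crs can be run to confirm the state of the stack.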






To check the status of all the CRS resources: save the script in a .sh file, then run it
============================================

#!/usr/bin/ksh
# Sample 10g CRS resource status query script
# Description:
#    - Returns formatted version of crs_stat -t, in tabular
#      format, with the complete rsc names and filtering keywords
#   - The argument, $RSC_KEY, is optional and if passed to the script, will
#     limit the output to HA resources whose names match $RSC_KEY.
# Requirements:
#   - $ORA_CRS_HOME should be set in your environment
ORA_CRS_HOME=/apps/crs/grid/product/11.2.0.3
RSC_KEY=$1
QSTAT=-u
AWK=/bin/awk    # if not available use /usr/bin/awk
# Table header:
echo ""
$AWK \
  'BEGIN {printf "%-45s %-10s %-18s\n", "HA Resource", "Target", "State";
          printf "%-45s %-10s %-18s\n", "-----------", "------", "-----";}'
# Table body:
$ORA_CRS_HOME/bin/crs_stat $QSTAT | $AWK \
 'BEGIN { FS="="; state = 0; }
  $1~/NAME/ && $2~/'$RSC_KEY'/ {appname = $2; state=1};
  state == 0 {next;}
  $1~/TARGET/ && state == 1 {apptarget = $2; state=2;}
  $1~/STATE/ && state == 2 {appstate = $2; state=3;}
  state == 3 {printf "%-45s %-10s %-18s\n", appname, apptarget, appstate; state=0;}'
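
The script above relies on crs_stat, the 10g-style utility. On 11gR2 a similar filter can be applied to crsctl status resource, which prints NAME=, TYPE=, TARGET= and STATE= lines when run without -t. The following is a sketch only; the function name offline_resources is hypothetical, and the exact layout of the STATE line should be checked on your own cluster.

```shell
#!/bin/sh
# offline_resources (hypothetical helper): reads `crsctl status resource`
# output (NAME=/TYPE=/TARGET=/STATE= lines) on standard input and prints
# each resource whose target includes ONLINE but whose state reports
# OFFLINE on at least one server.
offline_resources() {
    awk -F= '
        $1 == "NAME"   { name = $2 }
        $1 == "TARGET" { target = $2 }
        # Report a resource that should be online but has an OFFLINE state
        $1 == "STATE" && target ~ /ONLINE/ && $2 ~ /OFFLINE/ { print name }
    '
}
```

Possible usage: crsctl status resource | offline_resources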

5) To start/stop particular resource associated with the CRS.
===========================================================
You can get the details of the CRS resources from the above script. Based on that, if any resource is down or up, you can start or stop it as needed using the commands below.
crsctl start resource myResource -n server1 - To start a particular CRS resource
crsctl stop resource myResource -n server1  - To stop a particular CRS resource.





---> crsctl status res biwqav.JAVA.db -p
NAME=biwqav.JAVA.db
TYPE=cluster_resource
ACL=owner:oracrs:rwx,pgrp:sapdba:rwx,other::r--
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=/apps/oraconfig/crs_scripts/ora_biw_JAVA_database.sh
ACTIVE_PLACEMENT=0
AGENT_FILENAME=%CRS_HOME%/bin/scriptagent
AUTO_START=restore
CARDINALITY=1
CHECK_INTERVAL=60
DEFAULT_TEMPLATE=
DEGREE=1
DESCRIPTION=
ENABLED=1
FAILOVER_DELAY=0
FAILURE_INTERVAL=0
FAILURE_THRESHOLD=0
HOSTING_MEMBERS=
LOAD=1
LOGGING_LEVEL=1
NOT_RESTARTING_TEMPLATE=
OFFLINE_CHECK_INTERVAL=0
PLACEMENT=balanced
PROFILE_CHANGE_TEMPLATE=
RESTART_ATTEMPTS=1
SCRIPT_TIMEOUT=60
SERVER_POOLS=
START_DEPENDENCIES=hard(biwqav.JAVA.vip) pullup(biwqav.JAVA.vip)
START_TIMEOUT=0
STATE_CHANGE_TEMPLATE=
STOP_DEPENDENCIES=hard(biwqav.JAVA.vip)
STOP_TIMEOUT=0
UPTIME_THRESHOLD=1h
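
A single attribute can be pulled out of a resource profile like the one above. This is a minimal sketch that assumes the KEY=value layout shown; the function name profile_attr is hypothetical.

```shell
#!/bin/sh
# profile_attr (hypothetical helper): reads `crsctl status res <name> -p`
# output on standard input and prints the value of the attribute whose
# name is given as the first argument. The match is anchored at the
# start of the line, so CHECK_INTERVAL does not match
# OFFLINE_CHECK_INTERVAL.
profile_attr() {
    awk -v key="$1" 'index($0, key "=") == 1 {
        print substr($0, length(key) + 2)
    }'
}
```

Possible usage: crsctl status res biwqav.JAVA.db -p | profile_attr START_DEPENDENCIES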





crsctl status  res vip  -p

crsctl stop  res eccqav.JAVA.vip -f

crsctl start  res eccqav.JAVA.vip  -f -n saporaqav03

crsctl relocate res eccqav.JAVA.vip  -f -n saporaqav03



Refer:
=====

For the functionality of each service and for more commands.