Tuesday, March 21, 2017

Data Guard

Primary Database
A Data Guard configuration contains one production database, also referred to as the primary database, that functions in the primary role.

Standby Database
A standby database is a transactionally consistent copy of the primary database.

A Data Guard configuration can contain three types of standby database:
1. Physical standby database
2. Logical standby database
3. Snapshot standby database

Physical standby database:
Provides a physically identical copy of the primary database, with on-disk database structures that are identical to the primary database on a block-for-block basis. The database schema, including indexes, is the same. A physical standby database is kept synchronized with the primary database through Redo Apply, which recovers the redo data received from the primary database and applies it to the physical standby database.
Logical standby database:
Contains the same logical information as the production database, although the physical organization and structure of the data can be different. The logical standby database is kept synchronized with the primary database through SQL Apply, which transforms the data in the redo received from the primary database into SQL statements and then executes the SQL statements on the standby database.
Snapshot Standby Database:
A snapshot standby database is a fully updatable standby database.
Like a physical or logical standby database, a snapshot standby database receives and archives redo data from a primary database. Unlike a physical or logical standby database, a snapshot standby database does not apply the redo data that it receives. The redo data received by a snapshot standby database is not applied until the snapshot standby is converted back into a physical standby database, after first discarding any local updates made to the snapshot standby database.

Data Guard Services
1. Redo Transport Services
2. Apply Services
3. Role Transitions

Data Guard Protection Modes
1. Maximum availability
2. Maximum performance (default)
3. Maximum protection

Steps to Configure Data Guard in Max Performance Mode
Keep two terminals open, one for the primary and one for the standby.
primary database = prod
standby database = std

on primary
export ORACLE_SID=prod
sqlplus / as sysdba
startup;

archive log list;
Make sure the primary is in ARCHIVELOG mode. Data Guard redo transport depends on archived redo logs, and we are going to use RMAN to build the standby.
show parameter spfile;
(checks whether the instance was started with an spfile or a pfile)

create pfile from spfile;
(this creates initprod.ora in $ORACLE_HOME/dbs)
shut immediate;
exit;

cd $ORACLE_HOME/dbs
rm spfileprod.ora

vi initprod.ora
add the following parameters:
*.db_unique_name='hyd'
*.instance_name='prod'
*.log_archive_dest_2='service=topune'
*.standby_file_management=auto
:wq

cp initprod.ora initstd.ora

create the necessary directory structure for standby database
cd /u01
mkdir -p /u01/std/arch
mkdir -p /u01/app/oracle/admin/std/adump

cd $ORACLE_HOME/dbs
vi initstd.ora
replace all "prod" with "std"
:%s/prod/std/g

change the following parameters:
db_name='prod' (must match the primary on every standby; the global replace above changed it to 'std', so set it back)
db_unique_name='pune'
instance_name='std'
remove log_archive_dest_2
*.db_file_name_convert='/u01/prod','/u01/std'
*.log_file_name_convert='/u01/prod','/u01/std'
:wq
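
The vi edits above can also be scripted. The following is only a sketch under the assumptions of this walkthrough (lowercase parameter names, the /u01/prod and /u01/std paths, and the names hyd, pune, prod and std). The function name make_standby_pfile is hypothetical and not part of any Oracle tool.

```shell
#!/bin/sh
# Hypothetical helper: derive initstd.ora from initprod.ora without opening vi.
# Usage: make_standby_pfile <primary_pfile> <standby_pfile>
make_standby_pfile() {
    src=$1
    dst=$2
    # 1. Replace every "prod" with "std" (same effect as :%s/prod/std/g in vi).
    # 2. Drop log_archive_dest_2; the standby does not ship redo in this setup.
    # 3. Drop the identity parameters so they can be re-added with the right values.
    sed -e 's/prod/std/g' \
        -e '/log_archive_dest_2/d' \
        -e '/db_name=/d' \
        -e '/db_unique_name=/d' \
        -e '/instance_name=/d' \
        "$src" > "$dst"
    # Append the standby-specific values used in this walkthrough.
    cat >> "$dst" <<'EOF'
*.db_name='prod'
*.db_unique_name='pune'
*.instance_name='std'
*.db_file_name_convert='/u01/prod','/u01/std'
*.log_file_name_convert='/u01/prod','/u01/std'
EOF
}
```

Example call from $ORACLE_HOME/dbs: make_standby_pfile initprod.ora initstd.ora. Review the generated file before starting the standby instance with it.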

on primary
export ORACLE_SID=prod
sqlplus / as sysdba
startup mount;
ALTER DATABASE FORCE LOGGING;
exit;

cd $ORACLE_HOME/dbs
create password files for both instances
orapwd file=orapwprod password=manager force=y ignorecase=y
orapwd file=orapwstd password=manager force=y ignorecase=y

sqlplus / as sysdba
select status from v$instance;
(this is the primary; it must be in MOUNT state)
exit;

on standby
export ORACLE_SID=std
sqlplus / as sysdba
startup nomount;

Configure a listener for the standby database and a TNS entry named "topune" that points to it.
The standby is assumed to be located in Pune, so the connection to the standby listener is called "topune".
The database name must remain "prod"; in the TNS entry, the SID is std.

Go to Linux terminal
$netmgr
create a listener list_std for standby database
$lsnrctl start list_std

on the primary server, create a TNS service name (connect string) called "topune"
$tnsping topune
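
If you prefer to write the entry by hand instead of using netmgr, the sketch below prints a tnsnames.ora entry of the usual shape. The function name tns_entry is hypothetical, and the host m2.dba.com and port 1521 in the usage line are assumptions (1521 is only the customary default; use the port of your standby listener).

```shell
#!/bin/sh
# Hypothetical helper: print a tnsnames.ora entry.
# Arguments: alias host port sid
tns_entry() {
    cat <<EOF
$1 =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = $2)(PORT = $3))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SID = $4)
    )
  )
EOF
}
```

Example: tns_entry topune m2.dba.com 1521 std >> $ORACLE_HOME/network/admin/tnsnames.ora, then confirm with tnsping topune.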

on primary
$rman target / nocatalog auxiliary sys/manager@topune
RMAN> duplicate target database for standby from active database;
exit;

sqlplus / as sysdba
select name, open_mode,database_role,protection_mode from v$database;
col db_unique_name format a15
select db_unique_name, open_mode,database_role,protection_mode from v$database;
alter database open;

on standby
shut immediate;
startup nomount;
alter database mount standby database;
alter database open read only;

alter database recover managed standby database disconnect from session;
("disconnect" on its own is shorthand for "disconnect from session"; run the command only once.)

select name, open_mode,database_role,protection_mode from v$database;
select db_unique_name, open_mode,database_role,protection_mode from v$database;

on primary
connect as the scott user and run a few transactions
sqlplus / as sysdba
conn scott/tiger
create table t1 as select * from emp;
insert into t1 select * from t1;
/
/
commit;

conn / as sysdba
alter system switch logfile;
archive log list;
Note the latest sequence number.
Check the same archive log list on the standby.
Both must match. Once managed recovery is started with the "disconnect" option, the standby receives archived logs from the primary and applies them locally. Depending on the number of archived logs, the two sides will be in sync after a few minutes.
If a file did not reach the standby, copy it over manually and register it with the command below. Once the missing log is applied, the remaining logs are applied automatically.
SQL> alter database register logfile '/var/arch/arch_1_101.arc';
If there are too many files to register by hand, use the following RMAN command instead:
rman> catalog start with '/var/arch';
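
As an alternative to the RMAN catalog command, the register statements can be generated with a small shell loop. This is a sketch only: the function name gen_register_sql is hypothetical, and it assumes the copied archive logs end in .arc as in the example above.

```shell
#!/bin/sh
# Hypothetical helper: print one ALTER DATABASE REGISTER LOGFILE statement
# per archived log found in the given directory.
# Argument: directory holding the copied archive logs (files ending in .arc).
gen_register_sql() {
    for f in "$1"/*.arc; do
        # Skip the unexpanded pattern when the directory has no matching files.
        [ -e "$f" ] || continue
        printf "alter database register logfile '%s';\n" "$f"
    done
}
```

Example: gen_register_sql /var/arch > register_logs.sql, then review the file and run it from sqlplus on the standby with @register_logs.sql.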

Verify the Physical Standby Database Is Performing Properly
Step 1 Identify the existing archived redo log files.
select sequence#,first_time,next_time from v$archived_log order by 1;
Step 2 Force a log switch to archive the current online redo log file.
SQL> ALTER SYSTEM SWITCH LOGFILE;
Step 3 Verify the new redo data was archived on the standby database.
select sequence#,first_time,next_time from v$archived_log order by 1;

Step 4 Verify that received redo has been applied.
SQL> SELECT SEQUENCE#,APPLIED FROM V$ARCHIVED_LOG ORDER BY 1;

Note: The value of the APPLIED column for the most recently received log file will be either IN-MEMORY or YES if that log file has been applied.
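
When the list is long, the unapplied sequences can be picked out with a small filter. This sketch assumes the output of the Step 4 query has been saved as plain text with two whitespace-separated columns (sequence number, then APPLIED value); the function name unapplied_sequences is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "SEQUENCE# APPLIED" rows on standard input and
# print the sequence numbers whose APPLIED value is NO.
# Header lines and blank lines are ignored because their first field is not numeric.
unapplied_sequences() {
    awk '$1 ~ /^[0-9]+$/ && $2 == "NO" { print $1 }'
}
```

Example: unapplied_sequences < applied.txt, where applied.txt is a spool file of the query output. No output means every listed log is either applied or currently being applied in memory.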
Once apply is up to date, managed recovery can be stopped and the standby kept in read-only mode.
alter database recover managed standby database cancel;

Converting from Max Performance Mode to Max Availability Mode
shut down both the primary and the standby

at primary
cd $ORACLE_HOME/dbs
vi initprod.ora
log_archive_dest_2='service=topune lgwr sync affirm'
:wq

sqlplus / as sysdba
startup mount
alter database set standby database to maximize availability;
alter database open;
select name,open_mode,database_role, protection_mode from v$database;
With this, the primary is in max availability mode. On the standby we need to create three standby redo log files so that the primary can write redo directly into them.

At Standby
sqlplus / as sysdba
startup nomount;
alter database mount standby database;
alter database open read only;

select name,open_mode,database_role, protection_mode from v$database;

column member format a30
select member,type from v$logfile;
alter database add standby logfile group 4 '/u01/std/redo04.log' size 50m;
alter database add standby logfile group 5 '/u01/std/redo05.log' size 50m;
alter database add standby logfile group 6 '/u01/std/redo06.log' size 50m;

select member,type from v$logfile;
select group#,status from v$managed_standby;
select group#,status from v$standby_log;
select name,open_mode,database_role, protection_mode from v$database;
shut immediate;
startup nomount;
alter database mount standby database;
alter database open read only;
select name,open_mode,database_role, protection_mode from v$database;
alter database recover managed standby database disconnect using current logfile;
alter database recover managed standby database cancel;


Converting from Max Availability Mode to Max Protection
at primary
shut immediate
exit

cd $ORACLE_HOME/dbs
vi initprod.ora
log_archive_dest_2='service=topune lgwr sync affirm'
:wq

At standby
sqlplus / as sysdba
startup

At Primary
sqlplus / as sysdba
startup mount
alter database set standby database to maximize protection;
alter database open;
select name,open_mode,database_role,protection_mode from v$database;

at standby
select name,open_mode,database_role,protection_mode from v$database;
should be same
shut immediate
startup

alter database recover managed standby database disconnect using current logfile;
alter database recover managed standby database cancel;


Snapshot Standby Database
A fast recovery area (FRA) must be configured on the standby before it can be converted to a snapshot standby database.
show parameter recover;
shut immediate;
startup mount
alter database convert to snapshot standby;
alter database open;

select name,open_mode,database_role,protection_mode from v$database;

The open mode will now be READ WRITE.
Run some DML and DDL for testing.

Going back to physical standby
shut immediate;
startup mount;
alter database convert to physical standby;
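shut immediate;
startup mount;
alter database recover managed standby database disconnect from session;
(On 11g, the convert command leaves the instance dismounted, so restart it in mount state before resuming Redo Apply.)

All the DDL and DML done while it was a snapshot standby is lost, and the standby returns to the point at which it was converted to a snapshot standby database.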








DGMGRL (Data Guard Broker Utility)
One of the prerequisites for using DGMGRL is that a primary database and any standby databases must already exist. The DG_BROKER_START initialization parameter must be set to TRUE for all databases in the configuration. You must use a server parameter file with the broker. If an instance was not started with a server parameter file, then you must shut down the instance and restart it using the server parameter file.
show parameter spfile;

The DGMGRL utility requires Oracle Enterprise Edition.
select banner from v$version;

compatible must be at least 10.2.0.1.0
show parameter compatible

Here both the primary and the standby are configured on a single machine, so one listener is enough. We still need two TNS service names: one for the primary (toprod) and one for the standby (topune).
Primary database name = prod
SID name              = prod
db_unique_name        = hyd
tns service name      = toprod
Standby database name = prod (db_name is the same as on the primary)
SID name              = std
db_unique_name        = pune
tns service name      = topune
$lsnrctl start
$tnsping toprod
$tnsping topune
both must succeed.
$hostname
m2.dba.com
Here m2 is the name of my VM and dba.com is my domain.
Set this domain as the value of the db_domain parameter:
alter system set db_domain='dba.com' scope=spfile;
In the listener, each database must be statically registered twice: once for normal connections and once for DGMGRL.
The two entries are identical except for the global database name, which should look like this:
at Primary
normal
Global Database Name = prod
dgmgrl
Global Database Name  = hyd_DGMGRL.dba.com
syntax is DbUniqueName_DGMGRL.DomainName

at Standby
normal
Global Database Name = std
dgmgrl
Global Database Name  = pune_DGMGRL.dba.com
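
For reference, a static registration in listener.ora is a SID_DESC block inside the listener's SID_LIST. The sketch below prints one such block; the function name static_sid_desc is hypothetical, and the Oracle home path is taken from whatever you pass in (the value of $ORACLE_HOME is assumed here).

```shell
#!/bin/sh
# Hypothetical helper: print one SID_DESC block for static registration.
# Arguments: global_dbname oracle_home sid
static_sid_desc() {
    cat <<EOF
    (SID_DESC =
      (GLOBAL_DBNAME = $1)
      (ORACLE_HOME = $2)
      (SID_NAME = $3)
    )
EOF
}
```

Example for the primary: static_sid_desc prod "$ORACLE_HOME" prod followed by static_sid_desc hyd_DGMGRL.dba.com "$ORACLE_HOME" prod. Place the printed blocks inside the (SID_LIST = ...) section of SID_LIST_LISTENER in listener.ora (the section name follows your listener name) and reload the listener. For the standby, use std, pune_DGMGRL.dba.com, and the SID std.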

At Primary and standby both
show parameter dg_broker
alter system set dg_broker_start = true;

Before creating the broker configuration, add standby redo log files on both sides. They are required for the MaxAvailability and MaxProtection modes, and they are needed on the primary as well because a future switchover or failover will reverse the roles.

at primary
alter database add standby logfile group 4 '/u01/prod/redo04.log' size 50m;
alter database add standby logfile group 5 '/u01/prod/redo05.log' size 50m;
alter database add standby logfile group 6 '/u01/prod/redo06.log' size 50m;
select group#, type from v$logfile order by 1;

at standby
alter database add standby logfile group 4 '/u01/std/redo04.log' size 50m;
alter database add standby logfile group 5 '/u01/std/redo05.log' size 50m;
alter database add standby logfile group 6 '/u01/std/redo06.log' size 50m;
select group#, type from v$logfile order by 1;

Actual DGMGRL configuration
dgmgrl>
connect sys/manager@toprod

To create the broker configuration, first define the configuration together with a profile for the primary database, which is identified by its db_unique_name, hyd. A later command adds a profile for the standby database, pune.
create configuration 'dgb'
as primary database is 'hyd'
connect identifier is toprod;
Here 'hyd' is the db_unique_name of prod database.
we can confirm the above change using the following command:
show configuration;

To add a standby database to the 'dgb' configuration, use the ADD DATABASE command to create a broker configuration profile for the standby database.
add database 'pune' as
connect identifier is topune
maintained as physical;
Here 'pune' is the db_unique_name of the standby database std.

Setting Database Properties
After you create the configuration with DGMGRL, you can set database properties at any time.
EDIT DATABASE 'pune' SET PROPERTY 'LogArchiveFormat'='log_%t_%s_%r_%d.arc';
Property "LogArchiveFormat" updated.

EDIT DATABASE 'pune' SET PROPERTY 'StandbyArchiveLocation'='/u01/std/arch/';
Property "StandbyArchiveLocation" updated.

Use the SHOW DATABASE VERBOSE command to view all properties and their values for a database.
show database verbose pune;
You can change a property if the database is enabled or disabled. However, if the database is disabled when you change a property, the change does not take effect until the database is enabled.

Enabling the Configuration and Databases
So far, the DGB configuration is disabled, which means it is not under the control of the Data Guard broker. When you finish configuring the databases into a broker configuration and setting any necessary database properties, you must enable the configuration to allow the Data Guard broker to manage it.
You can enable:
·       The entire configuration, including all of its databases.
·       A standby database

Enable the entire configuration.
You can enable the entire configuration, including all of the databases, with the following command:
enable configuration;

Show the configuration.
Use the SHOW command to verify that the configuration and its databases were successfully enabled:
show configuration;

Enable the database.
This step is unnecessary except if the standby database was previously disabled with the DISABLE DATABASE command. Normally, enabling the configuration also enables the standby database.
DGMGRL> ENABLE DATABASE 'pune';
show database 'pune';
show database 'hyd';

On Standby
dgmgrl>
connect sys/manager@topune
show configuration;
should be same as primary.






Setting the Configuration Protection Mode
You can change the protection mode of the configuration at any time. However, it is best if you do this when there is no activity occurring in the configuration if you are moving to the maximum protection or maximum availability modes.
If the protection mode to be set is maximum protection mode, the broker automatically restarts the primary database.
This scenario sets the protection mode of the configuration to the MAXAVAILABILITY mode. Note that this protection mode requires that there be at least one standby database configured to use standby redo log files, with its LogXptMode configurable database property set to SYNC.

Step 1   Configure standby redo log files, if necessary.
Step 2   Set the LogXptMode configurable database property appropriately.
DGMGRL> EDIT DATABASE 'pune' SET PROPERTY 'LogXptMode'='SYNC';
The broker will not allow this command to succeed unless the standby database is configured with standby redo log files in the configuration.
Step 3   Change the overall protection mode for the configuration.
DGMGRL> EDIT CONFIGURATION SET PROTECTION MODE AS MAXAVAILABILITY;
Step 4   Verify the protection mode was changed.
DGMGRL> SHOW CONFIGURATION;

Changing the Modes
On the primary side only:
DGMGRL> edit configuration set protection mode as MaxAvailability;
DGMGRL> edit configuration set protection mode as MaxProtection;
DGMGRL> edit configuration set protection mode as MaxPerformance;




Performing Routine Management Tasks
Changing Properties and States
Alter a Database Property
DGMGRL> EDIT DATABASE 'hyd' SET PROPERTY 'LogArchiveTrace'='127';

Alter the State of a Standby Database
You might want to temporarily stop Redo Apply on a physical standby.
DGMGRL> EDIT DATABASE 'pune' SET STATE='APPLY-OFF';
Redo data is still being received when you put the physical standby database in the APPLY-OFF state.

Alter the State of a Primary Database
DGMGRL> EDIT DATABASE 'hyd' SET STATE='TRANSPORT-OFF';
DGMGRL> EDIT DATABASE 'hyd' SET STATE='TRANSPORT-ON';

Disable a Configuration
DGMGRL> DISABLE CONFIGURATION;

Disable a Standby Database
You use the DISABLE DATABASE command when you temporarily do not want the broker to manage and monitor a standby database.
DGMGRL> DISABLE DATABASE 'pune';



Removing the Configuration or a Standby Database
When you use either the REMOVE CONFIGURATION or REMOVE DATABASE command, you effectively delete the configuration or standby database profile from the broker configuration file, removing the ability of the Data Guard broker to manage the configuration or the standby database, respectively.
Step 1   Remove a standby database from the configuration.
DGMGRL> SHOW CONFIGURATION;
DGMGRL> REMOVE DATABASE 'pune';
When operating under either maximum protection mode or maximum availability mode, the broker prevents you from deleting the last standby database that supports the protection mode.
Step 2   Remove the broker configuration.
DGMGRL> REMOVE CONFIGURATION;
DGMGRL> SHOW CONFIGURATION;


Performing a Switchover Operation
You can switch the role of the primary database and a standby database using the SWITCHOVER command. Before you issue the SWITCHOVER command, you must ensure:
·       The primary database is in the TRANSPORT-ON state and the standby database is in the APPLY-ON state.
·       All participating databases are in good health, without any errors or warnings present.
·       The standby database properties were set on the primary database, so that the primary database can function correctly when transitioning to a standby database (see the properties listed in Step 1).
·       Standby redo log files on the primary database are set up, and the LogXptMode configurable database property is set to SYNC if the configuration is operating in either maximum availability mode or maximum protection mode.
·       If fast-start failover is enabled, you can perform a switchover only to the standby database that was specified as the target standby database.


Step 1   Check the primary database.
Use the SHOW DATABASE VERBOSE command to check the state, health, and properties of the primary database, as follows:
DGMGRL> SHOW DATABASE VERBOSE 'hyd';
Important properties to look for are
LogXptMode                       = 'SYNC'  
DbFileNameConvert               = 'dbs/bt, dbs/t'
LogFileNameConvert              = 'dbs/bt, dbs/t'
StandbyArchiveLocation        = '/archfs/arch/'
LogArchiveFormat                   = 'db1r_%d_%t_%s_%r.arc'

Step 2   Check the standby database that is the target of the switchover.
DGMGRL> SHOW DATABASE VERBOSE 'pune';
Look for the same properties here as well.

Step 3   Issue the switchover command.
Issue the SWITCHOVER command to swap the roles of the primary and standby databases.
DGMGRL> switchover to 'pune';

Step 4   Show the configuration.
After the switchover completes, use the SHOW CONFIGURATION and SHOW DATABASE commands to verify that the switchover operation was successful.

Converting a Physical Standby to a Snapshot Standby
If you have a physical standby database that you would like to convert to a snapshot standby database, use the DGMGRL CONVERT DATABASE command. Redo data will continue to be received by the database while it is operating as a snapshot standby database, but it will not be applied until the snapshot standby is converted back into a physical standby database.
Note that the Flashback Database feature is required to create a snapshot standby database. If Flashback database is disabled, it is automatically enabled during conversion to a snapshot standby database. The broker automatically restarts the database to the mounted state if it had been opened with Flashback Database disabled. No user action is required.
DGMGRL> CONVERT DATABASE 'pune' to SNAPSHOT STANDBY;
DGMGRL> SHOW CONFIGURATION;

When you are ready to revert the database back to a physical standby database, use the DGMGRL CONVERT DATABASE command again as follows. Any updates made to the database while it was operating as a snapshot standby database will be discarded. All accumulated redo data will be applied by Redo Apply services after the database is converted back to a physical standby database.
DGMGRL> CONVERT DATABASE 'pune' to PHYSICAL STANDBY;









Monitoring a Data Guard Configuration
The scenario in this section demonstrates how to use the SHOW command and monitorable database properties to identify and resolve a failure situation.
Step 1   Check the configuration status.
The status of the broker configuration is an aggregated status of all databases and instances in the broker configuration. You can check the configuration status first to determine whether or not any further action needs to be taken. If the configuration status is SUCCESS, everything in the broker configuration is working fine. However, if SHOW CONFIGURATION reports an error or warning status instead, something is wrong in the configuration:
DGMGRL> SHOW CONFIGURATION;

Step 2   Check the database status.
To identify which database has the failure, you need to go through all of the databases in the configuration one by one.
DGMGRL> SHOW DATABASE 'hyd';
DGMGRL> SHOW DATABASE 'pune';

Step 3   Check the StatusReport monitorable database property.
When you see message ORA-16810, you can use the StatusReport monitorable database property to identify each of the errors or warnings:
DGMGRL> SHOW DATABASE 'hyd' 'StatusReport';

Step 4   Check the LogXptStatus monitorable database property.
You see error ORA-16737 in the previous status report in Step 3. To identify the exact log transport error, you can use LogXptStatus monitorable database property:
DGMGRL> SHOW DATABASE 'hyd' 'LogXptStatus';
LOG TRANSPORT STATUS
PRIMARY_INSTANCE_NAME STANDBY_DATABASE_NAME               STATUS
              sales1             DR_Sales ORA-12541: TNS:no listener

Now you know the exact reason why redo transport services failed. To fix this error, start the listener for the physical standby database pune.

Step 5   Check the InconsistentProperties monitorable database property.
You also see warning ORA-16714 reported in Step 3. To identify the inconsistent values for property LogArchiveTrace, you can use the InconsistentProperties monitorable database property:
DGMGRL> SHOW DATABASE 'hyd' 'InconsistentProperties';
INCONSISTENT PROPERTIES
   INSTANCE_NAME    PROPERTY_NAME      MEMORY_VALUE    SPFILE_VALUE    BROKER_VALUE
            prod    LogArchiveTrace             255               0               0
It seems that the current database memory value (255) is different from both the server parameter file (SPFILE) value (0) and Data Guard broker's property value (0). If you decide the database memory value is correct, you can update Data Guard broker's property value using the following command:
DGMGRL> EDIT DATABASE 'hyd' SET PROPERTY 'LogArchiveTrace'=255;
Property "LogArchiveTrace" updated

In the previous command, Data Guard broker also updates the spfile value for you so that value for LogArchiveTrace is kept consistent.

Step 6 Check the InconsistentLogXptProps monitorable database property.
Another warning you see in the status report returned in Step 3 is ORA-16715. To identify the inconsistent values for the redo transport configurable database property, ReopenSecs, you can use the InconsistentLogXptProps monitorable database property.
DGMGRL> SHOW DATABASE 'hyd' 'InconsistentLogXptProps';

INCONSISTENT LOG TRANSPORT PROPERTIES
   INSTANCE_NAME    STANDBY_NAME    PROPERTY_NAME    MEMORY_VALUE    BROKER_VALUE
             std            pune       ReopenSecs             600             300

The current database memory value (600) is different from the Data Guard broker's property value (300). If you think the broker's property value is correct, you can fix the inconsistency by re-editing the property of the standby database with the same value, as shown in the following example:
DGMGRL> EDIT DATABASE 'pune' SET PROPERTY 'ReopenSecs'=300;
Property "ReopenSecs" updated










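While these changes are happening, we can watch the alert log and the broker log for live activity in this database.
go to the alert log folder
ls -ltr drc*
$tail -f drcorcl.log
(drcorcl.log is an example; use the drc*.log file name that ls lists for your instance)
$tail -f alert_prod.log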
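
Besides tailing the logs live, it can help to summarize which ORA- errors a log already contains. The sketch below counts each distinct error code in a file; the function name ora_error_summary is hypothetical, and it relies on grep -o, which GNU and BSD grep provide.

```shell
#!/bin/sh
# Hypothetical helper: list each distinct ORA- error code in a log file
# with the number of times it occurs, most frequent first.
# Argument: path to the alert log or broker log.
ora_error_summary() {
    grep -o 'ORA-[0-9][0-9]*' "$1" | sort | uniq -c | sort -rn
}
```

Example: ora_error_summary alert_prod.log. Each code can then be looked up in the log itself to see the surrounding messages.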


RAC INTERVIEW QUESTIONS

1. What is the major difference between 10g and 11g RAC?
There is not much difference between 10g and 11gR1 RAC,
but there is a significant difference in 11gR2.
Before 11gR2 (that is, in 10g and 11gR1), RAC resources were managed by Oracle CRS.
From 11gR2 onwards, it is a complete HA stack that, like other cluster software such as VCS, manages and provides the following resources:
  • Databases
  • Instances
  • Applications
  • Cluster Management
  • Node Management
  • Event Services
  • High Availability
  • Network Management (provides DNS/GNS/mDNS services on behalf of other traditional services), SCAN (Single Client Access Name), and HAIP
  • Storage Management (with the help of ASM and the new ACFS filesystem)
  • Time synchronization (rather than depending on traditional NTP)
  • Removed the OS-dependent hang checker; hangs are handled by its own additional monitor process
2.  What are Oracle Cluster Components?
Cluster Interconnect (HAIP)
Shared Storage (OCR/Voting Disk)
Clusterware software
3. What are Oracle RAC Components?
VIP, Node apps etc.
4. What are Oracle Kernel Components (that is, how does an Oracle RAC database differ from a normal single-instance database in terms of binaries and processes)?
The Oracle kernel must be relinked with the RAC option switched on when you convert to RAC. That is the difference: it enables RAC background processes such as LMON, LCK, LMD, and LMS.
To turn on RAC
# link the oracle libraries
$ cd $ORACLE_HOME/rdbms/lib
$ make -f ins_rdbms.mk rac_on
# rebuild oracle
$ cd $ORACLE_HOME/bin
$ relink oracle
Oracle RAC is composed of two or more database instances. Each instance has the same memory structures and background processes as a single-instance database. In addition, Oracle RAC instances use two services, GES (Global Enqueue Service) and GCS (Global Cache Service), that enable Cache Fusion. Oracle RAC instances include the following additional background processes:
ACMS—Atomic Controlfile to Memory Service (ACMS)
GTX0-j—Global Transaction Process
LMON—Global Enqueue Service Monitor
LMD—Global Enqueue Service Daemon
LMS—Global Cache Service Process
LCK0—Instance Enqueue Process
RMSn—Oracle RAC Management Processes (RMSn)
RSMN—Remote Slave Monitor
5. What is Clusterware?
Software that provides various interfaces and services for a cluster. Typically, this includes capabilities that:
  • Allow the cluster to be managed as a whole
  • Protect the integrity of the cluster
  • Maintain a registry of resources across the cluster
  • Deal with changes to the cluster
  • Provide a common view of resources
6. What are the background processes that exist in 11gR2, and what do they do?
Process Name
Functionality
crsd
•The CRS daemon (crsd) manages cluster resources based on configuration information that is stored in Oracle Cluster Registry (OCR) for each resource. This includes start, stop, monitor, and failover operations. The crsd process generates events when the status of a resource changes.
cssd
•Cluster Synchronization Service (CSS): Manages the cluster configuration by controlling which nodes are members of the cluster and by notifying members when a node joins or leaves the cluster. If you are using certified third-party clusterware, then CSS processes interface with your clusterware to manage node membership information. CSS has three separate processes: the CSS daemon (ocssd), the CSS Agent (cssdagent), and the CSS Monitor (cssdmonitor). The cssdagent process monitors the cluster and provides input/output fencing. This service formerly was provided by Oracle Process Monitor daemon (oprocd), also known as OraFenceService on Windows. A cssdagent failure results in Oracle Clusterware restarting the node.
diskmon
•Disk Monitor daemon (diskmon): Monitors and performs input/output fencing for Oracle Exadata Storage Server. As Exadata storage can be added to any Oracle RAC node at any point in time, the diskmon daemon is always started when ocssd is started.
evmd
•Event Manager (EVM): Is a background process that publishes Oracle Clusterware events
mdnsd
•Multicast domain name service (mDNS): Allows DNS requests. The mDNS process is a background process on Linux and UNIX, and a service on Windows.
gnsd
•Oracle Grid Naming Service (GNS): Is a gateway between the cluster mDNS and external DNS servers. The GNS process performs name resolution within the cluster.
ons
•Oracle Notification Service (ONS): Is a publish-and-subscribe service for communicating Fast Application Notification (FAN) events
oraagent
•oraagent: Extends clusterware to support Oracle-specific requirements and complex resources. It runs server callout scripts when FAN events occur. This process was known as RACG in Oracle Clusterware 11g Release 1 (11.1).
orarootagent
•Oracle root agent (orarootagent): Is a specialized oraagent process that helps CRSD manage resources owned by root, such as the network, and the Grid virtual IP address
oclskd
•Cluster kill daemon (oclskd): Handles instance/node eviction requests that have been escalated to CSS
gipcd
•Grid IPC daemon (gipcd): Is a helper daemon for the communications infrastructure
ctssd
•Cluster Time Synchronization Service daemon (ctssd): Manages time synchronization between nodes, rather than depending on NTP
7. Under which user or owner does each process start?
Component                                      Name of the Process              Owner
Oracle High Availability Service               ohasd                            init, root
Cluster Ready Service (CRS)                    crsd                             root
Cluster Synchronization Service (CSS)          ocssd, cssdmonitor, cssdagent    grid owner
Event Manager (EVM)                            evmd, evmlogger                  grid owner
Cluster Time Synchronization Service (CTSS)    octssd                           root
Oracle Notification Service (ONS)              ons, eons                        grid owner
Oracle Agent                                   oraagent                         grid owner
Oracle Root Agent                              orarootagent                     root
Grid Naming Service (GNS)                      gnsd                             root
Grid Plug and Play (GPnP)                      gpnpd                            grid owner
Multicast domain name service (mDNS)           mdnsd                            grid owner
8. What is startup sequence in Oracle 11g RAC? 11g RAC startup sequence?
9. Voting and OCR disks reside in ASM disk groups, but according to the startup sequence OCSSD starts before ASM. How is that possible?
How does OCSSD start if the voting disk and OCR reside in ASM disk groups?
You might wonder how CSSD, which is required to start the clustered ASM instance, can be started if voting disks are stored in ASM? This sounds like a chicken-and-egg problem: without access to the voting disks there is no CSS, hence the node cannot join the cluster. But without being part of the cluster, CSSD cannot start the ASM instance. To solve this problem the ASM disk headers have new metadata in 11.2: you can use kfed to read the header of an ASM disk containing a voting disk. The kfdhdb.vfstart and kfdhdb.vfend fields tell CSS where to find the voting file. This does not require the ASM instance to be up. Once the voting disks are located, CSS can access them and joins the cluster.
10. How does SCAN work?

1. The client connects using the SCAN name of the cluster. All three SCAN IP addresses resolve, in round-robin fashion, to the same host name (the SCAN name); in this case the SCAN name is cluster01-scan.cluster01.example.com.
2. The request reaches the corporate DNS server, which resolves the name to one of the three addresses. a. If GNS (Grid Naming Service) is configured, that is, a subdomain is delegated in DNS to resolve cluster addresses, the request is handed over to GNS (gnsd).
3. In our case, assume there is no GNS. The request arrives at a SCAN listener, whose endpoints are registered with the database listeners.
4. The database listener receives the request and processes it further.
5. When a node is added (a fourth listener, for example), clients do not need to know about it or change anything in their TNS entries (such as the address of the fourth node or instance), because they connect only through the SCAN name.
6. The same applies when a node is deleted.
11. What is GNS?
Grid Naming Service is an alternative to DNS: it acts as a subdomain in your DNS but is managed by Oracle. With GNS, the connection is routed to the cluster IP and managed internally.
12. What is GPNP?
Grid Plug and Play along with GNS provide dynamic
In previous releases, adding or removing servers in a cluster required extensive manual preparation.
In Oracle Database 11g Release 2, GPnP allows each node to perform the following tasks dynamically:
    • Negotiating appropriate network identities for itself
    • Acquiring additional information from a configuration profile
    • Configuring or reconfiguring itself using profile data, making host names and addresses resolvable on the network
For example, a domain should contain:
  • Cluster name: cluster01
  • Network domain: example.com
  • GPnP domain: cluster01.example.com
To add a node, simply connect the server to the cluster and allow the cluster to configure the node.
To make this possible, Oracle uses the profile located at $GI_HOME/gpnp/profiles/peer/profile.xml, which describes cluster resources such as the ASM disk locations.
When a node is plugged into the cluster, this profile is read locally or from a remote machine, and the node is added to the cluster dynamically.
13. What are the file types that ASM support and keep in disk groups?
Control files
Flashback logs
Data Pump dump sets
Data files
DB SPFILE
Data Guard configuration
Temporary data files
RMAN backup sets
Change tracking bitmaps
Online redo logs
RMAN data file copies
OCR files
Archive logs
Transport data files
ASM SPFILE
14. List Key benefits of ASM?
  • Stripes files rather than logical volumes
  • Provides redundancy on a file basis
  • Enables online disk reconfiguration and dynamic rebalancing
  • Reduces the time significantly to resynchronize a transient failure by tracking changes while disk is offline
  • Provides adjustable rebalancing speed
  • Is cluster-aware
  • Supports reading from mirrored copy instead of primary copy for extended clusters
  • Is automatically installed as part of the Grid Infrastructure
15. List key benefits of Oracle Grid Infrastructure?
16. List some of the background process that used in ASM?
Process | Description
RBAL | Opens all device files as part of discovery and coordinates the rebalance activity
ARBn | One or more slave processes that do the rebalance activity
GMON | Responsible for managing the disk-level activities such as drop or offline and advancing the ASM disk group compatibility
MARK | Marks ASM allocation units as stale when needed
Onnn | One or more ASM slave processes forming a pool of connections to the ASM instance for exchanging messages
PZ9n | One or more parallel slave processes used in fetching data on clustered ASM installation from GV$ views
13. What is node listener?
In 11gr2 the listeners will run from Grid Infrastructure software home
  • The node listener is a process that helps establish network connections from ASM clients to the ASM instance.
  • Runs by default from the Grid $ORACLE_HOME/bin directory
  • Listens on port 1521 by default
  • Is the same as a database instance listener
  • Is capable of listening for all database instances on the same machine in addition to the ASM instance
  • Can run concurrently with separate database listeners or be replaced by a separate database listener
  • Is named tnslsnr on the Linux platform
15. What is SCAN listener?
A SCAN listener is an additional listener, alongside the node listener, that listens for incoming database connection requests that clients send through the SCAN IP. Its endpoints are configured to the node listeners, and it routes each connection request to a particular node listener.
16. What is the difference between CRSCTL and SRVCTL?
crsctl manages clusterware-related operations:
  • Starting and stopping Oracle Clusterware
  • Enabling and disabling Oracle Clusterware daemons
  • Registering cluster resources
srvctl manages Oracle resource–related operations:
  • Starting and stopping database instances and services
  • Also from 11gR2 manages the cluster resources like network,vip,disks etc
17. How to control Oracle Clusterware?
To start or stop Oracle Clusterware on a specific node:
# crsctl stop crs
# crsctl start crs
To enable or disable Oracle Clusterware on a specific node:
# crsctl enable crs
# crsctl disable crs
19. How to check the cluster (all nodes) status?
To check the viability of Cluster Synchronization Services (CSS) across nodes:
$ crsctl check cluster
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
20. How to check the cluster (one node) status?
$ crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
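For scripted health checks, the status lines shown above can be tested rather than read by eye. The following is a minimal sketch, assuming every status line starts with a CRS-nnnn: code and that a healthy component ends with "is online"; the function name all_services_online is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads "crsctl check crs" (or "crsctl check cluster")
# output on stdin and succeeds only if at least one CRS-nnnn status line was
# seen and every such line ends in "is online".
all_services_online() {
    awk '/^CRS-[0-9]+:/ { seen++; if ($0 !~ /is online[ \t\r]*$/) bad++ }
         END { exit (seen > 0 && bad == 0) ? 0 : 1 }'
}

# Example (on a cluster node):
#   crsctl check crs | all_services_online && echo "stack is up"
```

Treating empty input as a failure is a deliberate choice here, so that a missing crsctl binary is not mistaken for a healthy stack.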
21. How to find Voting Disk location?
•To determine the location of the voting disk:
# crsctl query css votedisk
##  STATE    File Universal Id                File Name   Disk group
--  -----    -----------------                ---------   ----------
 1. ONLINE   8c2e45d734c64f8abf9f136990f3daf8 (ASMDISK01) [DATA]
 2. ONLINE   99bc153df3b84fb4bf071d916089fd4a (ASMDISK02) [DATA]
 3. ONLINE   0b090b6b19154fc1bf5913bc70340921 (ASMDISK03) [DATA]
Located 3 voting disk(s).
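If you want to check this from a script, the listing above can be reduced to a count of online voting disks. This is a minimal sketch, assuming the STATE value is the second whitespace-separated column, as in the output shown; the function name online_votedisks is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads "crsctl query css votedisk" output on stdin and
# prints how many voting disks are listed with STATE = ONLINE (2nd column).
online_votedisks() {
    awk '$2 == "ONLINE" { n++ } END { print n + 0 }'
}

# Example (on a cluster node):
#   crsctl query css votedisk | online_votedisks
```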
22. How to find Location of OCR?
  • # cat /etc/oracle/ocr.loc
ocrconfig_loc=+DATA
local_only=FALSE
  • # ocrcheck (also reports OCR integrity)
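The OCR location can also be read from ocr.loc in a script. This is a minimal sketch, assuming the file holds plain key=value lines as shown above; the function name ocr_location is hypothetical, and the default path is the Linux one used in this post.

```shell
#!/bin/sh
# Hypothetical helper: prints the value of ocrconfig_loc from an ocr.loc-style
# file of key=value lines. The default path is the one shown above; it differs
# on some platforms.
ocr_location() {
    file=${1:-/etc/oracle/ocr.loc}
    sed -n 's/^ocrconfig_loc=//p' "$file"
}

# Example:
#   ocr_location                  # uses /etc/oracle/ocr.loc
#   ocr_location /path/to/ocr.loc
```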
24. What are types of ASM Mirroring?
Disk Group Type | Supported Mirroring Levels | Default Mirroring Level
External redundancy | Unprotected (None) | Unprotected (None)
Normal redundancy | Two-way, Three-way, Unprotected (None) | Two-way
High redundancy | Three-way | Three-way
25. What is ASM Striping?
ASM can use variable size data extents to support larger files, reduce memory requirements, and improve performance.
Each data extent resides on an individual disk.
Data extents consist of one or more allocation units.
The data extent size is:
  • Equal to AU for the first 20,000 extents (0–19999)
  • Equal to 4 × AU for the next 20,000 extents (20000–39999)
  • Equal to 16 × AU for extents above 40,000
ASM stripes files using extents with a coarse method for load balancing or a fine method to reduce latency.
  • Coarse-grained striping is always equal to the effective AU size.
  • Fine-grained striping is always equal to 128 KB.
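The extent-size tiers listed above can be written as a small lookup. This is a minimal sketch that only encodes the three thresholds from the list (extents numbered from 0); the function name extent_size_in_au is hypothetical.

```shell
#!/bin/sh
# Prints how many allocation units (AU) make up ASM data extent number N,
# following the three tiers listed above (extent numbers start at 0).
extent_size_in_au() {
    n=$1
    if [ "$n" -lt 20000 ]; then
        echo 1      # first 20,000 extents: 1 x AU
    elif [ "$n" -lt 40000 ]; then
        echo 4      # next 20,000 extents: 4 x AU
    else
        echo 16     # extents from 40,000 onward: 16 x AU
    fi
}

# Example: with a 1 MB AU, extent 25000 occupies
#   $(extent_size_in_au 25000) MB
```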
26. How many ASM Diskgroups can be created under one ASM Instance?
ASM imposes the following limits:
  • 63 disk groups in a storage system
  • 10,000 ASM disks in a storage system
  • Two-terabyte maximum storage for each ASM disk (non-Exadata)
  • Four-petabyte maximum storage for each ASM disk (Exadata)
  • 40-exabyte maximum storage for each storage system
  • 1 million files for each disk group
  • ASM file size limits (database limit is 128 TB):
1.External redundancy maximum file size is 140 PB.
2.Normal redundancy maximum file size is 42 PB.
3.High redundancy maximum file size is 15 PB.
27. How to find the cluster network settings?
To determine the list of interfaces available to the cluster:
$ oifcfg iflist -p -n
To determine the public and private interfaces that have been configured:
$ oifcfg getif
eth0 192.0.2.0 global public
eth1 192.168.1.0 global cluster_interconnect
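To pick out the interconnect interface in a script, the getif output above can be filtered by its last column. This is a minimal sketch, assuming the four-column layout shown (interface, subnet, scope, type); the function name interfaces_of_type is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads "oifcfg getif" output on stdin and prints the
# interface name(s) whose type (4th column) matches the requested value.
interfaces_of_type() {
    awk -v role="$1" '$4 == role { print $1 }'
}

# Example (on a cluster node):
#   oifcfg getif | interfaces_of_type cluster_interconnect
```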
To determine the Virtual IP (VIP) host name, VIP address, VIP subnet mask, and VIP interface name:
$ srvctl config nodeapps -a
VIP exists.:host01
VIP exists.: /192.0.2.247/192.0.2.247/255.255.255.0/eth0
28. How to change Public or VIP Address in RAC Cluster?
29. How to change Cluster interconnect in RAC?
On a single node in the cluster, add the new global interface specification:
$ oifcfg setif -global eth2/192.0.2.0:cluster_interconnect
Verify the changes with oifcfg getif and then stop Clusterware on all nodes by running the following command as root on each node:
# oifcfg getif
# crsctl stop crs
Assign the network address to the new network adapters on all nodes using ifconfig:
# ifconfig eth2 192.0.2.15 netmask 255.255.255.0 broadcast 192.0.2.255
Remove the former adapter/subnet specification and restart Clusterware:
$ oifcfg delif -global eth1/192.168.1.0
# crsctl start crs
30. Managing or Modifying SCAN in Oracle RAC?
To add a SCAN VIP resource:
$ srvctl add scan -n cluster01-scan
To remove Clusterware resources from SCAN VIPs:
$ srvctl remove scan [-f]
To add a SCAN listener resource:
$ srvctl add scan_listener
$ srvctl add scan_listener -p 1521
To remove Clusterware resources from all SCAN listeners:
$ srvctl remove scan_listener [-f]
31. How to check the node connectivity in Oracle Grid Infrastructure?
$ cluvfy comp nodecon -n all -verbose
32. Can I stop all nodes in one command? Meaning that stopping whole cluster ?
In 10g it is not possible, whereas in 11g Release 2 it is:
[root@pic1]# crsctl start cluster -all
[root@pic2]# crsctl stop cluster -all
33. What is OLR? Which of the following statements regarding the Oracle Local Registry (OLR) is true?
1.Each cluster node has a local registry for node-specific resources.
2.The OLR should be manually created after installing Grid Infrastructure on each node in the cluster.
3.One of its functions is to facilitate Clusterware startup in situations where the ASM stores the OCR and voting disks.
4.You can check the status of the OLR using ocrcheck.
34. What is runfixup.sh script in Oracle Clusterware 11g release 2 installation
With Oracle Clusterware 11g release 2, Oracle Universal Installer (OUI) detects when the minimum requirements for an installation are not met, and creates shell scripts, called fixup scripts, to finish incomplete system configuration steps. If OUI detects an incomplete task, then it generates fixup scripts (runfixup.sh). You can run the fixup script after you click the Fix and Check Again Button.
The Fixup script does the following:
  • If necessary, sets kernel parameters to the values required for a successful installation, including:
      - Shared memory parameters.
      - Open file descriptor and UDP send/receive parameters.
  • Sets permissions on the Oracle Inventory (central inventory) directory.
  • Reconfigures primary and secondary group memberships for the installation owner, if necessary, for the Oracle Inventory directory and the operating system privileges groups.
  • Sets shell limits to the required values, if necessary.
35. How to stop whole cluster with single command
crsctl stop cluster -all (possible only from 11gR2). crsctl stop cluster acts on the local node unless you pass -all or name nodes with -n, for example:
crsctl stop cluster -all (stops the Clusterware stack on all nodes)
crsctl stop cluster -n <nodename> (stops it only on the specified node)
36. CRS is not starting automatically after a node reboot, what you do to make it happen?
crsctl enable crs (as root)
to disable
crsctl disable crs (as root)
37. What are server pools in 11gr2?
38. What is policy managed databases in RAC?
39. What is Load balancing & how does it work?
40. Describe high level Steps to convert single instance to RAC?
41. What is the difference between TAF and FAN & FCF? at what conditions you use them?
1) TAF with tnsnames
TAF (Transparent Application Failover) is a feature of Oracle Net Services for OCI8 clients that moves a session to a backup connection if the session fails. With Oracle 10g Release 2, you can define the TAF policy on the service using the dbms_service package. It works only with OCI clients. It moves only the session and, if the failover type is set accordingly, also fails over the select statement. For insert, update, or delete transactions, the application must be TAF aware and roll back the transaction. You should also enable FCF on your OCI client when you use TAF, because it makes the failover faster.
Note: TAF will not work with JDBC thin.

2) FAN with tnsnames with aq notifications true
FAN is a feature of Oracle RAC which stands for Fast Application Notification. This allows the database to notify the client of any change (Node up/down, instance up/down, database up/down). For integrated clients, inflight transactions are interrupted and an error message is returned. Inactive connections are terminated. 
FCF is the client feature for Oracle Clients that have integrated with FAN to provide fast failover for connections. Oracle JDBC Implicit Connection Cache, Oracle Data Provider for .NET (ODP.NET) and Oracle Call Interface are all integrated clients which provide the Fast Connection Failover feature.
3) FCF, along with FAN when using connection pools
FCF is a feature of Oracle clients that are integrated to receive FAN events and abort inflight transactions, clean up connections when a down event is received as well as create new connections when a up event is received. Tomcat or JBOSS can take advantage of FCF if the Oracle connection pool is used underneath. This can be either UCP (Universal Connection Pool for JAVA) or ICC (JDBC Implicit Connection Cache). UCP is recommended as ICC will be deprecated in a future release.
4) ONS, with clusterware either FAN/FCF
ONS is part of the clusterware and is used to propagate messages both between nodes and to application-tiers
ONS is the foundation for FAN upon which is built FCF.
RAC uses FAN to publish configuration changes and LBA (Load Balancing Advisory) events. Applications can react to those published events in two ways:
- by using the ONS API (you need to program it)
- by using FCF (automatic when the JDBC implicit connection cache is used on the application server)
You can also respond to FAN events with server-side callouts, which, as the name suggests, run on the server side.
Relationship between FAN/FCF/ONS
ONS –> FAN –> FCF
ONS -> sends/receives messages on local and remote nodes.
FAN -> uses ONS to notify other processes about changes in the configuration of service level.
FCF -> uses FAN information, working with Java connection pools and others.
42. Can you add voting disk online? Do you need voting disk backup?
Yes. As per the documentation, if you have multiple voting disks you can add one online. If you have only one voting disk and it is lost, the cluster will be down; in that case start CRS in exclusive mode and add the voting disk using
crsctl add votedisk <path>
43. You have lost OCR disk, what is your next step?
In 10g the whole cluster stack goes down because cssd cannot maintain integrity. From 11gR2 onwards only the crsd stack goes down, while ohasd stays up and running. You can bring the OCR back by restoring an automatic backup or importing a manual backup.
44. What happens when ocssd fails, what is node eviction? how does node eviction happens? For all answer will be same.
45. What is virtual IP and how does it works?
46. Describe some rac wait events you experienced?
47. Can you modify VIP address after your cluster installation?
Yes, read here
48. How do you interpret AWR report in RAC instances, what sections in awr report for rac instances are most important?
Read here.
Update 12-May-2013, Some practical questions added here
1. Viewing Contents in OCR/Voting disks
         There are three possible ways to view the OCR contents.
         a.       OCRDUMP (or)
         b.       crs_stat -p  (or)
         c.       By using strings.
         Voting disk contents are not persistent and are overwritten, so there is normally no need to view them. If you still need to, use strings.
2. Server pools – Read in my blog
3. Verifying Cluster Interconnect
                  
                 Cluster interconnects can be verified by:
         i.       oifcfg getif
         ii.      From AWR Report.
         iii.     show parameter cluster_interconnect
         iv.      srvctl config network
4. Is the SCAN IP required, or can we disable it?

          The SCAN IP can be disabled if it is not required. However, the SCAN IP is mandatory during RAC installation. Enabling/disabling the SCAN IP is mostly done in Oracle Apps environments, for the concurrent manager (a kind of job scheduler in Oracle Apps).
         To disable the SCAN IP:
         i.       Do not use the SCAN IP at the client end.
         ii.      Stop the SCAN listener
               srvctl stop scan_listener
         iii.     Stop the SCAN
               srvctl stop scan (this stops the SCAN VIPs)
         iv.      Disable the SCAN and the SCAN listener
              srvctl disable scan
              srvctl disable scan_listener
5. Migrating to a new disk group: scenarios
a.       Case 1: Migrating a disk group from one storage to another with the same name
        1. Consider the disk group DATA.
        2. Create new disks for DATA on the new storage (EMC).
                  a) Partitioning and provisioning are done by the storage team, who give you the device name or mapper path, like /dev/mapper/asakljdlas
        3. Add the new disk to disk group DATA
                  a) alter diskgroup data add disk '/dev/mapper/asakljdlas'
        4. Drop the old disks from DATA; rebalancing is then done automatically.
        To speed up the rebalance you can raise the power limit, for example alter system set asm_power_limit=12.
            alter diskgroup data drop disk <ASM disk name of the Hitachi disk>
            Note: drop disk takes the ASM disk name, not the path; map the old storage path to its disk name using the name and path columns of v$asm_disk.
        5. Request the SAN team to detach the old storage (HITACHI).

b.       Case 2: Migrating from one disk group to another with a different disk group name.
        1) Create the disk group with the new name on the new storage.
        2) Create the spfile in the new disk group and change the parameters (control files etc.) with scope=spfile.
        3) Take a control file backup in format +newdiskgroup
        4) Shut down the database, then startup nomount
        5) Restore the control file from the backup (the control file is now restored to the new disk group)
        6) Take an RMAN backup as copy of the whole database with the new format.
               RMAN> backup as copy database format '+newdiskgroup name' ;
        7) RMAN> switch database to copy;
        8) Verify in dba_data_files, dba_temp_files, and v$log that all files point to the new disk group name.

c.       Case 3: Migrating a disk group to new storage when no additional disk group is given
         1) Take an RMAN backup as copy of the whole database with the new format and place it on disk.
         2) Prepare rename commands from v$log, v$datafile etc. (dynamic queries)
         3) Take a backup of the pfile and modify the following parameters to refer to the new disk group name
                  .control_files
                  .db_create_file_dest
                  .db_create_online_log_dest_1
                  .db_create_online_log_dest_2
                  .db_recovery_file_dest
         4) Stop the database
         5) Unmount the disk group
                  asmcmd umount ORA_DATA
         6) Use the renamedg command (11gR2 only) to rename the disk group
                  renamedg phase=both dgname=ORA_DATA newdgname=NEW_DATA verbose=true
         7) Mount the disk group
                  asmcmd mount NEW_DATA
         8) Start the database in mount state with the new pfile backed up in step 3
         9) Run the rename file scripts generated in step 2
         10) Add the disk group to the cluster configuration (if using RAC)
                  srvctl modify database -d orcl -p +NEW_FRA/orcl/spfileorcl.ora
                  srvctl modify database -d orcl -a "NEW_DATA"
                  srvctl config database -d orcl
                  srvctl start database -d orcl
         11) Delete the old disk group from the cluster
                  crsctl delete resource ora.ORA_DATA.dg
         12) Open the database.
7. Database rename in RAC, what could be the checklist for you?
         a.       Take the outputs of all the services that are running on the databases.
         b.       set cluster_database=FALSE
         c.       Drop all the services associated with the database.
         d.       Stop the database
         e.       Startup mount
         f.       Use nid to change the DB Name.
                   Generic question: if using ASM, the usual location for a datafile would be '+DATA/datafile/OLDDBNAME/system01.dbf'.
                   Does nid change this path too, to reflect the new database name?
                   Yes, it will; using the proper directory structure, it creates links to the original directory structure: '+DATA/datafile/NEWDBNAME/system01.dbf'.
                   This has to be tested. We do not have a test bed, but thanks to Anji, who confirmed that it will.

         g.       Change the parameters according to the new database name
         h.       Change the password file.
         i.       Stop the database.
         j.       Mount the database
         k.       Open database with Reset logs
         l.       Create spfile from pfile.
         m.       Add database to the cluster.
         n.       Create the services that are dropped in prior to rename.
         o.       Bounce the database.
8. How do you find which database a particular service is attached to when a large number of databases run on the server and you cannot check them one by one manually?
Write a shell script that reads the database names from oratab and loops over them, passing each name to srvctl to get the result.
#!/bin/ksh
# Usage: pass the service name as the first argument.
# ORACLE_HOME is left blank in the original; set it to your Oracle home before running.
ORACLE_HOME=
PATH=$ORACLE_HOME/bin:$PATH
LD_LIBRARY_PATH=${SAVE_LLP}:${ORACLE_HOME}/lib
export TNS_ADMIN ORACLE_HOME PATH LD_LIBRARY_PATH
# The first field of every non-comment oratab entry is used as the database name
for INSTANCE in $(grep -v "^#" /etc/oratab | cut -f1 -d: -s)
do
    export ORACLE_SID=$INSTANCE
    # Print only the lines that report the service as running in this database
    srvctl status service -d "$INSTANCE" -s "$1" | grep -i "is running"
done
9. Difference between OHAS and CRS
OHAS is the complete cluster stack, which includes kernel-level tasks such as managing the network, time synchronization, disks, etc., whereas CRS manages resources such as databases, listeners, and applications. Together they provide high availability.


Oracle RAC Interview Questions & Answers

1. Where are the Clusterware files stored on a RAC environment?
The Clusterware is installed on each node (in an Oracle Home) and on the shared disks (the voting disks and the OCR file).
2. Where are the database software files stored on a RAC environment?
The base software is installed on each node of the cluster and the
database storage on the shared disks.
3. What kind of storage we can use for the shared Clusterware files?
- OCFS (Release 1 or 2)
- raw devices
- third party cluster file system such as GPFS or Veritas
4. What kind of storage we can use for the RAC database storage?
- OCFS (Release 1 or 2)
- ASM
- raw devices
- third party cluster file system such as GPFS or Veritas
5. What is a CFS?
A cluster File System (CFS) is a file system that may be accessed (read and write) by all members in a cluster at the same time. This implies that all members of a cluster have the same view.
6. What is an OCFS2?
The OCFS2 is the Oracle (version 2) Cluster File System which can be used for the Oracle Real Application Cluster.
7. Which files can be placed on an Oracle Cluster File System?
- Oracle Software installation (Windows only)
- Oracle files (controlfiles, datafiles, redologs, files described by the bfile datatype)
- Shared configuration files (spfile)
- OCR and voting disk
- Files created by Oracle during runtime
Note: There are some platform specific limitations.
8. Do you know another Cluster Vendor?
HP Tru64 Unix, Veritas, Microsoft
9. How is it possible to install RAC if we don’t have a CFS?
This is possible by using a raw device.
10. What is a raw device?
A raw device is a disk drive that does not yet have a file system set up. Raw devices are used for Real Application Clusters since they enable the sharing of disks.
11. What is a raw partition?
A raw partition is a portion of a physical disk that is accessed at the lowest possible level. A raw partition is created when an extended partition is created and logical partitions are assigned to it without any formatting. Once formatting is complete, it is called cooked partition.
12. When to use CFS over raw?
A CFS offers:
- Simpler management
- Use of Oracle Managed Files with RAC
- Single Oracle Software installation
- Autoextend enabled on Oracle datafiles
- Uniform accessibility to archive logs in case of physical node failure
- With Oracle_Home on CFS, when you apply Oracle patches CFS guarantees that the updated Oracle_Home is visible to all nodes in the cluster.
Note: This option is very dependent on the availability of a CFS on your platform.
13. When to use raw over CFS?
- Always when CFS is not available or not supported by Oracle.
- The performance is very, very important: Raw devices offer best performance without any intermediate layer between Oracle and the disk.
Note: Autoextend fails on raw devices if the space is exhausted. However the space could be added online if needed.
14. What is CRS?
Oracle RAC 10g Release 1 introduced Oracle Cluster Ready Services (CRS), a platform-independent set of system services for cluster environments. In Release 2, Oracle has renamed this product to Oracle Clusterware.
15. What is VIP IP used for?
It returns a dead connection IMMEDIATELY when its primary node fails. Without a VIP, clients have to wait around 10 minutes to receive ORA-3113: “end of file on communications channel”. However, using Transparent Application Failover (TAF) could avoid ORA-3113.
16. Why we need to have configured SSH or RSH on the RAC nodes?
SSH (Secure Shell, 10g+) or RSH (Remote Shell, 9i+) allows the “oracle” UNIX account to connect to another RAC node and copy files or run commands as the local “oracle” UNIX account.
17. Is the SSH, RSH needed for normal RAC operations?
No. SSH or RSH is needed only for RAC and patch set installation and for clustered database creation.
18. Do we have to have Oracle RDBMS on all nodes?
Each node of a cluster that is being used for a clustered database will typically have the RDBMS and RAC software loaded on it, but not actual data files (these need to be available via shared disk).
19. What are the restrictions on the SID with a RAC database? Is it limited to 5 characters?
The SID prefix in 10g Release 1 and prior versions was restricted to five characters by install/ config tools so that an ORACLE_SID of up to max of 5+3=8 characters can be supported in a RAC environment. The SID prefix is relaxed up to 8 characters in 10g Release 2, see bug 4024251 for more information.
20. Does Real Application Clusters support heterogeneous platforms?
The Real Application Clusters do not support heterogeneous platforms in the same cluster.

21. Are there any issues for the interconnect when sharing the same switch as the public network by using VLAN to separate the network?
RAC and Clusterware deployment best practices suggest that the interconnect (private connection) be deployed on a stand-alone, physically separate, dedicated switch. On a big network, the connections could be unstable.
22. What is the Load Balancing Advisory?
To assist in the balancing of application workload across designated resources, Oracle Database 10g Release 2 provides the Load Balancing Advisory. This Advisory monitors the current workload activity across the cluster and for each instance where a service is active; it provides a percentage value of how much of the total workload should be sent to this instance as well as service quality flag.
23. How many nodes are supported in a RAC Database?
With 10g Release 2, we support 100 nodes in a cluster using Oracle Clusterware, and 100 instances in a RAC database. Currently DBCA has a bug where it will not go beyond 63 instances. There is also a documentation bug for the max-instances parameter. With 10g Release 1 the Maximum is 63.

24. What is the Cluster Verification Utility (cluvfy)?
The Cluster Verification Utility (CVU) is a validation tool that you can use to check all the important components that need to be verified at different stages of deployment in a RAC environment.
25. What versions of the database can I use the cluster verification utility (cluvfy) with?
The Cluster Verification Utility is released with Oracle Database 10g Release 2 but can also be used with Oracle Database 10g Release 1.
26. If I am using Vendor Clusterware such as Veritas, IBM, Sun or HP, do I still need Oracle Clusterware to run Oracle RAC 10g?
Yes. When certified, you can use Vendor Clusterware however you must still install and use Oracle Clusterware for RAC. Best Practice is to leave Oracle Clusterware to manage RAC. For details see Metalink Note 332257.1 and for Veritas SFRAC see 397460.1.
27. Is RAC on VMWare supported?
Yes.
28. What is hangcheck timer used for ? 
The hangcheck timer regularly checks the health of the system. If the system hangs or stops, the node is restarted automatically.
There are 2 key parameters for this module:
-> hangcheck-tick: defines the period of time between checks of system health. The default value is 60 seconds; Oracle recommends setting it to 30 seconds.
-> hangcheck-margin: defines the maximum hang delay that should be tolerated before hangcheck-timer resets the RAC node.
29. Is the hangcheck timer still needed with Oracle RAC 10g?
Yes.
30. What files can I put on Linux OCFS2?
For optimal performance, you should only put the following files on Linux OCFS2:
- Datafiles
- Control Files
- Redo Logs
- Archive Logs
- Shared Configuration File (OCR)
- Voting File
- SPFILE
31. Is it possible to use ASM for the OCR and voting disk?
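In 10g, no: the OCR and voting disk must be on raw devices or a CFS (cluster file system). From 11g Release 2 onward they can be stored in ASM disk groups, as described earlier.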
32. Can I change the name of my cluster after I have created it when I am using Oracle Clusterware?
No, you must properly uninstall Oracle Clusterware and then re-install.
33. What is O2CB?
The O2CB is the OCFS2 cluster stack. OCFS2 includes some services. These services must be started before using OCFS2 (mount/ format the file systems).
34. What is the OCR file used for?
OCR is a file that manages the cluster and RAC configuration.
35. What is the Voting Disk file used for?
The voting disk is nothing but a file that contains and manages information of all the node memberships.
36. What is the recommended method to make backups of a RAC environment?
RMAN to make backups of the database, dd to back up your voting disk, and hard copies of the OCR file.
37.  What command would you use to check the availability of the RAC system?
crs_stat -t -v (-t -v are optional)
38. What is the minimum number of instances you need to have in order to create a RAC?
You can create a RAC with just one server.
39.  Name two specific RAC background processes
RAC processes are: LMON, LMDx, LMSn, LKCx and DIAG.
40.  Can you have many database versions in the same RAC?
Yes, but the Clusterware version must be equal to or higher than the highest database version.
41.  What was RAC's previous name before it was called RAC?
OPS: Oracle Parallel Server
42.  What RAC component is used for communication between instances?
Private Interconnect.
43.  What is the difference between normal views and RAC views?
A RAC view has the prefix ‘G’. For example, GV$SESSION instead of V$SESSION
44.  Which command will we use to manage (stop, start) RAC services in command-line mode?
srvctl
45.  How many alert logs exist in a RAC environment?
A- One for each instance.
46. What are Oracle Clusterware Components
Voting Disk — Oracle RAC uses the voting disk to manage cluster membership by way of a health check and arbitrates cluster ownership among the instances in case of network failures. The voting disk must reside on shared disk.
Oracle Cluster Registry (OCR) — Maintains cluster configuration information as well as configuration information about any cluster database within the cluster. The OCR must reside on shared disk that is accessible by all of the nodes in your cluster
47. How do you backup voting disk
#dd if=voting_disk_name of=backup_file_name
48. How do I identify the voting disk location
#crsctl query css votedisk
49. How do I identify the OCR file location
check /var/opt/oracle/ocr.loc or /etc/ocr.loc ( depends upon platform)
or
#ocrcheck
50. What is SCAN?
Single Client Access Name (SCAN) is a new Oracle Real Application Clusters (RAC) 11g Release 2 feature that provides a single name for clients to access an Oracle Database running in a cluster. The benefit is that clients using SCAN do not need to change their configuration if you add or remove nodes in the cluster.






--------------------------------------------------------------------------------------------------------------