Performing an Exadata Health Check Using exachk


Run exachk as part of periodic maintenance on the Exadata. It is strongly recommended to run it before or after any upgrade, configuration change, or any other change to the software or hardware.
Download the latest version from My Oracle Support (Doc ID 1070954.1).
Check the installed version as follows:
oracle@anarexadatatest:~$ cd /u01/
oracle@anarexadatatest:/u01$ ls
app repos staging
oracle@anarexadatatest:/u01$ cd staging/
oracle@anarexadatatest:/u01/staging$ ls
exachk OPatch RDBMS-11.2.0.3.25-Solaris-x86 Solaris11-x86-idr1178 Solaris11-x86-idr679
ExadataConfigurations patches SOLARIS-11.1-REPO-ISO-IMAGE-x64 Solaris11-x86-idr1211 Solaris11.1-SRU19.6-x86
onecommand RDBMS-11.2.0.3-Solaris-x86 Solaris11-x86-12.1.1.1.1-exa-family-repo Solaris11-x86-idr1401.3 Solaris11.1-SRU9.5.1-x86
oracle@anarexadatatest:/u01/staging$ cd exachk/
oracle@anarexadatatest:/u01/staging/exachk$ ./exachk -v
EXACHK VERSION: 12.1.0.2.1_20141009
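Because the advice is to use the latest exachk, it can help to check how old the installed build is. The sketch below assumes that the suffix after the last underscore in the version line (20141009 above) is a build date in YYYYMMDD form. The function names and the cutoff date are hypothetical and not part of exachk.

```shell
#!/bin/sh
# Minimal sketch: pull the build stamp out of an "EXACHK VERSION" line.
# Assumption: the text after the last underscore is a YYYYMMDD build date.
exachk_build_date() {
    stamp="${1##*_}"    # keep only what follows the last underscore
    case "$stamp" in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9])
            printf '%s\n' "$stamp" ;;
        *)
            return 1 ;;  # not an 8-digit stamp, so report failure
    esac
}

# Succeeds when the build date is earlier than a cutoff date you choose.
# Both arguments are YYYYMMDD strings, so a numeric comparison is enough.
exachk_is_older_than() {
    [ "$1" -lt "$2" ]
}

# Example with the version line shown above and a hypothetical cutoff.
build=$(exachk_build_date "EXACHK VERSION: 12.1.0.2.1_20141009")
if exachk_is_older_than "$build" 20150101; then
    echo "exachk build $build predates the cutoff; consider downloading a newer one"
fi
```

On a live system the argument could be the line printed by `./exachk -v` instead of the hard-coded string.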
Then run exachk and answer the interactive prompts:
oracle@anarexadatatest:/u01/staging/exachk$ ./exachk
CRS stack is running and CRS_HOME is not set. Do you want to set CRS_HOME to /u01/app/11.2.0.3/grid?[y/n][y]y
Checking ssh user equivalency settings on all nodes in cluster
Node bakuexa1dbadm02 is configured for ssh user equivalency for oracle user
Searching for running databases . . . . .
. . 
List of running databases registered in OCR
1. ANARLIVE
2. None of above
Select databases from list for checking best practices. For multiple databases, select 1 for All or comma separated number like 1,2 etc [1-2][1].1
. .
Checking Status of Oracle Software Stack - Clusterware, ASM, RDBMS
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
-------------------------------------------------------------------------------------------------------
 Oracle Stack Status 
-------------------------------------------------------------------------------------------------------
Host Name CRS Installed RDBMS Installed CRS UP ASM UP RDBMS UP DB Instance Name
-------------------------------------------------------------------------------------------------------
anarexadatatest Yes Yes Yes Yes Yes ANARLIVE1 
bakuexa1dbadm02 Yes Yes Yes Yes Yes ANARLIVE2 
-------------------------------------------------------------------------------------------------------
Copying plug-ins
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
root user equivalence is not setup between anarexadatatest and STORAGE SERVER bakuexa1cel01 (192.168.10.3).
1. Enter 1 if you will enter root password for each STORAGE SERVER when prompted.
2. Enter 2 to exit and configure root user equivalence manually and re-run exachk.
3. Enter 3 to skip checking best practices on STORAGE SERVER.
Please indicate your selection from one of the above options for STORAGE SERVER[1-3][1]:- 1
Is root password same on all STORAGE SERVER?[y/n][y]y
Enter root password for STORAGE SERVER :-
Verifying root password.
. . . . . . . . . . . . . . . . . . . .
80 of the included audit checks require root privileged data collection on DATABASE SERVER. If sudo is not configured or the root password is not available, audit checks which require root privileged data collection can be skipped.
1. Enter 1 if you will enter root password for each on DATABASE SERVER host when prompted
2. Enter 2 if you have sudo configured for oracle user to execute root_exachk.sh script on DATABASE SERVER
3. Enter 3 to skip the root privileged collections on DATABASE SERVER
4. Enter 4 to exit and work with the SA to configure sudo on DATABASE SERVER or to arrange for root access and run the tool later.
Please indicate your selection from one of the above options for root access[1-4][1]:- 1
Is root password same on all compute nodes?[y/n][y]y
Enter root password on DATABASE SERVER:-
Verifying root password.
. . .
9 of the included audit checks require nm2user privileged data collection on INFINIBAND SWITCH .
1. Enter 1 if you will enter nm2user password for each INFINIBAND SWITCH when prompted
2. Enter 2 to exit and to arrange for nm2user access and run the exachk later.
3. Enter 3 to skip checking best practices on INFINIBAND SWITCH
Please indicate your selection from one of the above options for INFINIBAND SWITCH[1-3][1]:- 1
Is nm2user password same on all INFINIBAND SWITCH ?[y/n][y]y
Enter nm2user password for INFINIBAND SWITCH :-
Verifying nm2user password.
. .
You can still continue but root privileged checks will not be executed on following nodes.
1. bakuexa1sw-ibb01
Do you want to continue[y/n][y]:- y
*** Checking Best Practice Recommendations (PASS/WARNING/FAIL) ***
Collections and audit checks log file is 
/u01/staging/exachk/exachk_anarexadatatest_ANARLIVE_112814_120155/log/exachk.log
Checking for prompts in /export/home/oracle/.profile on anarexadatatest for oracle user...
Checking for prompts in /export/home/oracle/.profile on bakuexa1dbadm02 for oracle user...
Starting to run exachk in background on bakuexa1dbadm02
=============================================================
 Node name - anarexadatatest 
=============================================================
Collecting - ASM Diskgroup Attributes 
Collecting - ASM diskgroup usable free space 
Collecting - ASM initialization parameters 
Collecting - Database Parameters for ANARLIVE database
Collecting - Database Undocumented Parameters for ANARLIVE database
Collecting - RDBMS Feature Usage for ANARLIVE database
Collecting - Clusterware and RDBMS software version
Collecting - Patches for Grid Infrastructure 
Collecting - Patches for RDBMS Home 
Collecting - RDBMS patch inventory
Preparing to run root privileged commands on DATABASE SERVER anarexadatatest.
Collecting - CRS user time zone check 
Collecting - Clusterware patch inventory 
Collecting - Discover switch type(spine or leaf) 
Collecting - Exadata Critical Issue DB22 
Collecting - Exadata software version on database server 
Collecting - HCA firmware version on database server 
Collecting - HCA transfer rate on database server 
Collecting - Infiniband Switch counters on all switches for Solaris 
Collecting - Minimum exadata version required for ASR 
Collecting - Operating system and Kernel version on database server 
Collecting - Verify Database Server Disk Controller Configuration 
Collecting - Verify Database Server Physical Drive Configuration 
Collecting - Verify Database Server Virtual Drive Configuration 
Collecting - Verify Database Server ZFS RAID Configuration 
Collecting - Verify Disk Cache Policy on database server 
Collecting - Verify ILOM Power Up Configuration for HOST_AUTO_POWER_ON 
Collecting - Verify ILOM Power Up Configuration for HOST_LAST_POWER_STATE 
Collecting - Verify InfiniBand Fabric Topology (verify-topology) 
Collecting - Verify InfiniBand subnet manager is not running on database server 
Collecting - Verify InfiniBand subnet manager is running on an InfiniBand switch 
Collecting - Verify NTP server on database server to compare systemwide 
Collecting - Verify NTP sync on database server to compare systemwide 
Collecting - Verify RAID Controller Battery Condition [Database Server] 
Collecting - Verify RAID Controller Battery Temperature [Database Server] 
Collecting - Verify database server disk controllers use writeback cache 
Collecting - Verify the file /.updfrm_exact does not exist [Database Server] 
Collecting - Verify there are no memory (ECC) errors 
Collecting - root time zone check 
Collecting - verify asr exadata configuration check via ASREXACHECK on database server
Starting to run root privileged commands in background on STORAGE SERVER bakuexa1cel01 (192.168.10.3)
Starting to run root privileged commands in background on STORAGE SERVER bakuexa1cel02 (192.168.10.5)
Starting to run root privileged commands in background on STORAGE SERVER bakuexa1cel03 (192.168.10.7)
Skipping nm2user privileged commands on INFINIBAND SWITCH bakuexa1sw-ibb01
Starting to run nm2user privileged commands in background on INFINIBAND SWITCH bakuexa1sw-iba01.
Skipping nm2user privileged checks for bakuexa1sw-iba01.
The nm2user password must have been changed since the passwords were validated at the beginning of tool execution
Collections from STORAGE SERVER:
----------------------------------
Collecting - Ambient Temperature on storage server 
Collecting - Exadata Critical Issue EX10 
Collecting - Exadata Critical Issue EX11 
Collecting - Exadata critical issue EX14 
Collecting - Exadata critical issue EX15 
Collecting - Exadata software version on storage server 
Collecting - Exadata software version on storage servers 
Collecting - RAID controller version on storage servers 
Collecting - Verify Disk Cache Policy on storage servers 
Collecting - Verify Exadata Smart Flash Cache is created 
Collecting - Verify Hardware and Firmware on Database and Storage Servers (CheckHWnFWProfile) [Storage Server] 
Collecting - Verify ILOM Power Up Configuration for HOST_AUTO_POWER_ON on storage servers 
Collecting - Verify ILOM Power Up Configuration for HOST_LAST_POWER_STATE on storage servers 
Collecting - Verify InfiniBand subnet manager is not running on storage server 
Collecting - Verify Master (Rack) Serial Number is Set [Storage Server] 
Collecting - Verify NTP server on storage server to compare systemwide 
Collecting - Verify NTP sync on storage server to compare systemwide 
Collecting - Verify OSSCONF/cellinit.ora consistency across storage servers 
Collecting - Verify RAID Controller Battery Condition [Storage Server] 
Collecting - Verify RAID Controller Battery Temperature [Storage Server] 
Collecting - Verify storage server disk controllers use writeback cache 
Collecting - Verify storage server network configuration with ipconf 
Collecting - Verify the file /.updfrm_exact does not exist [Storage Server] 
Collecting - Verify there are no memory (ECC) errors on Storage Servers 
Collecting - verify asr exadata configuration check via ASREXACHECK on storage servers 
Collecting - Configure Storage Server alerts to be sent via email 
Collecting - ExaWatcher status on storage servers 
Collecting - Exadata Celldisk predictive failures 
Collecting - Exadata storage server root filesystem free space 
Collecting - HCA firmware version on storage server 
Collecting - Operating system and Kernel version on storage server 
Collecting - Scan storage server alerthistory for non-test open alerts 
Collecting - Scan storage server alerthistory for stateful alerts not cleared 
Collecting - Scan storage server alerthistory for test open alerts 
Collecting - Storage server flash cache mode 
Collecting - Storage server make and model 
Collecting - Verify Data Network is Separate from Management Network on storage server 
Collecting - Verify Ethernet Cable Connection Quality on storage servers 
Collecting - Verify Exadata Smart Flash Cache is actually in use 
Collecting - Verify Exadata Smart Flash Log is Created 
Collecting - Verify InfiniBand Cable Connection Quality on storage servers 
Collecting - Verify average ping times to DNS nameserver 
Collecting - Verify celldisk configuration on disk drives 
Collecting - Verify celldisk configuration on flash memory devices 
Collecting - Verify griddisk ASM status 
Collecting - Verify griddisk count matches across all storage servers where a given prefix name exists 
Collecting - Verify storage server metric CD_IO_ST_RQ 
Collecting - Verify there are no griddisks configured on flash memory devices 
Collecting - Verify total number of griddisks with a given prefix name is evenly divisible of celldisks 
Collecting - Verify total size of all griddisks fully utilizes celldisk capacity 
Collecting - mpt_cmd_retry_count from /etc/modprobe.conf on Storage Servers
Data collections completed. Checking best practices on anarexadatatest.
--------------------------------------------------------------------------------------
 WARNING => SYS or SYSTEM objects were found to be INVALID for ANARLIVE
 FAIL => Database parameter DB_BLOCK_CHECKSUM is NOT set to recommended value on ANARLIVE1 instance
 FAIL => Storage Server alerts are not configured to be sent via email
 FAIL => InfiniBand network error counters are non-zero
 FAIL => Database parameter CLUSTER_INTERCONNECTS is NOT set to the recommended value for ANARLIVE
 FAIL => Database parameter COMPATIBLE should be set to recommended value for ANARLIVE
 FAIL => Database parameters log_archive_dest_n with Location attribute are NOT all set to recommended value for ANARLIVE
 WARNING => Database parameter DB_BLOCK_CHECKING on PRIMARY is NOT set to the recommended value. for ANARLIVE
 FAIL => Flashback on PRIMARY is not configured for ANARLIVE
 INFO => Operational Best Practices
 INFO => Database Consolidation Best Practices
 INFO => Computer failure prevention best practices
 INFO => Data corruption prevention best practices
 INFO => Logical corruption prevention best practices
 INFO => Database/Cluster/Site failure prevention best practices
 INFO => Client failover operational best practices
INFO => Database failure prevention best practices
 FAIL => Primary database is NOT protected with Data Guard (standby database) for real-time data protection and availability for ANARLIVE
 WARNING => Hidden database initialization parameters should be set per best practice recommendations for ANARLIVE
INFO => Storage failures prevention best practices
 INFO => Network failure prevention best practices
 INFO => Software maintenance best practices
 FAIL => FRA space management problem file types are present without an RMAN backup completion within the last 7 days. for ANARLIVE
 INFO => Oracle recovery manager(rman) best practices
 WARNING => RMAN controlfile autobackup should be set to ON for ANARLIVE
 INFO => Exadata Critical Issues (Doc ID 1270094.1):- DB01-DB04,DB06,DB09-DB25, EX1-EX15 and IB1-IB3
Collecting patch inventory on CRS HOME /u01/app/11.2.0.3/grid
Collecting patch inventory on ORACLE_HOME /u01/app/oracle/product/11.2.0.3/dbhome_1
Copying results from bakuexa1dbadm02 and generating report. This might take a while. Be patient.
=============================================================
 Node name - bakuexa1dbadm02 
=============================================================
Collecting - Clusterware and RDBMS software version
Collecting - Patches for Grid Infrastructure 
Collecting - Patches for RDBMS Home 
Collecting - RDBMS patch inventory
Preparing to run root privileged commands on DATABASE SERVER bakuexa1dbadm02.
Collecting - CRS user time zone check 
Collecting - Clusterware patch inventory 
Collecting - Exadata Critical Issue DB22 
Collecting - Exadata software version on database server 
Collecting - HCA firmware version on database server 
Collecting - HCA transfer rate on database server 
Collecting - Minimum exadata version required for ASR 
Collecting - Operating system and Kernel version on database server 
Collecting - Verify Database Server Disk Controller Configuration 
Collecting - Verify Database Server Physical Drive Configuration 
Collecting - Verify Database Server Virtual Drive Configuration 
Collecting - Verify Database Server ZFS RAID Configuration 
Collecting - Verify Disk Cache Policy on database server 
Collecting - Verify ILOM Power Up Configuration for HOST_AUTO_POWER_ON 
Collecting - Verify ILOM Power Up Configuration for HOST_LAST_POWER_STATE 
Collecting - Verify InfiniBand subnet manager is not running on database server 
Collecting - Verify NTP server on database server to compare systemwide 
Collecting - Verify NTP sync on database server to compare systemwide 
Collecting - Verify RAID Controller Battery Condition [Database Server] 
Collecting - Verify RAID Controller Battery Temperature [Database Server] 
Collecting - Verify database server disk controllers use writeback cache 
Collecting - Verify the file /.updfrm_exact does not exist [Database Server] 
Collecting - Verify there are no memory (ECC) errors 
Collecting - root time zone check 
Collecting - verify asr exadata configuration check via ASREXACHECK on database server
Data collections completed. Checking best practices on bakuexa1dbadm02.
--------------------------------------------------------------------------------------
FAIL => Database parameter DB_BLOCK_CHECKSUM is NOT set to recommended value on ANARLIVE2 instance
 FAIL => Database parameter CLUSTER_INTERCONNECTS is NOT set to the recommended value for ANARLIVE
 FAIL => Database parameter COMPATIBLE should be set to recommended value for ANARLIVE
 FAIL => Database parameters log_archive_dest_n with Location attribute are NOT all set to recommended value for ANARLIVE
 WARNING => Database parameter DB_BLOCK_CHECKING on PRIMARY is NOT set to the recommended value. for ANARLIVE
WARNING => Hidden database initialization parameters should be set per best practice recommendations for ANARLIVE
Collecting patch inventory on CRS HOME /u01/app/11.2.0.3/grid
Collecting patch inventory on ORACLE_HOME /u01/app/oracle/product/11.2.0.3/dbhome_1
---------------------------------------------------------------------------------
 CLUSTERWIDE CHECKS
---------------------------------------------------------------------------------
 FAIL => cellinit.ora does match across database servers
---------------------------------------------------------------------------------
Detailed report (html) - /u01/staging/exachk/exachk_anarexadatatest_ANARLIVE_112814_120155/exachk_anarexadatatest_ANARLIVE_112814_120155.html
UPLOAD(if required) - /u01/staging/exachk/exachk_anarexadatatest_ANARLIVE_112814_120155.zip
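The console output above is long, so a quick tally of the findings can be useful before opening the HTML report. The sketch below assumes the console output was saved to a text file (the name exachk_console.txt is hypothetical) and that findings keep the "SEVERITY => message" shape shown above. On Solaris, nawk or /usr/xpg4/bin/awk may be needed in place of awk.

```shell
#!/bin/sh
# Minimal sketch: count findings by severity in a saved console transcript.
# Reads the named files, or standard input when no file is given.
summarize_findings() {
    awk '$2 == "=>" && ($1 == "FAIL" || $1 == "WARNING" || $1 == "INFO") { n[$1]++ }
         END { printf "FAIL=%d WARNING=%d INFO=%d\n", n["FAIL"], n["WARNING"], n["INFO"] }' "$@"
}

# Print each distinct FAIL message once. The same check is often reported
# on every node, so duplicates are collapsed with sort -u.
list_failures() {
    awk '$1 == "FAIL" && $2 == "=>" { sub(/^[ \t]*FAIL => /, ""); print }' "$@" | sort -u
}
```

For example, `summarize_findings exachk_console.txt` prints a single line of counts, and `list_failures exachk_console.txt` prints the distinct failures to work through.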

 
