Maintaining a highly available database infrastructure is not just a technical requirement; it is a critical business necessity. Oracle Real Application Clusters (RAC) is designed to keep the database available through rolling maintenance, but applying quarterly Release Updates (RUs) can still be exceptionally challenging. Many technical documents provide generic steps but fail to address the edge cases, permission bugs, and cluster health dependencies that arise in a real-world production environment. In this article, we walk through the complete lifecycle of upgrading an Oracle 19c RAC database from RU 19.24 to 19.25 on Oracle Linux 7.9. The guide is structured to help database administrators, system architects, and IT professionals understand the "why" behind every command, ensuring a smooth and secure transition.
- 1. Understanding Architecture and Staging Requirements
- 2. The Critical Role of the OPatch Utility Upgrade
- 3. Cluster Health Checks and Background Daemon Management
- 4. Executing the Patch on the Primary Node (Node 1)
- 5. Navigating the Notorious Node 2 Permission Bug
- 6. Data Dictionary Compilation and Post-Patching Verification
1. Understanding Architecture and Staging Requirements
Before executing any modifications to the core database binaries, establishing a secure and stable staging environment is paramount. In a typical two-node Oracle RAC configuration, administrators must ensure that the patch files are accessible without compromising the user permission model enforced by the Linux operating system. The patch we are deploying, Patch 36916690, is a comprehensive Release Update that modifies both the Grid Infrastructure home and the database home.
The best practice for staging large Oracle patch files is to use local storage rather than shared network storage such as NFS or ACFS. Network latency or a temporary disconnection during the unzip can cause silent file corruption. Creating a dedicated local directory, such as /tmp/patch, on every node participating in the cluster is therefore the safest approach.
[root@trdb1]$ mkdir -pv /tmp/patch
[root@trdb2]$ mkdir -pv /tmp/patch
[root@trdb1]$ chmod -R 777 /tmp/patch
[root@trdb2]$ chmod -R 777 /tmp/patch
You might wonder why we apply chmod 777 to the staging directory, since this is generally frowned upon by security auditors. During the patching workflow, the scripts switch context between the root, grid, and oracle users. If the unzipped files retain strict root-only ownership, the patching steps initiated by the grid or oracle users will terminate abruptly with a 'permission denied' error. Opening the permissions temporarily, for the duration of the maintenance window only, eliminates this risk and ensures smooth execution across the different user profiles. Always verify the MD5 checksums of your downloaded files before proceeding; a dropped network packet during an SFTP transfer can cost hours of troubleshooting.
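The checksum verification mentioned above can be scripted so it is never skipped. The sketch below rehearses the logic against a stand-in file in a temporary directory; for the real archives, the file name and the expected checksum line come from the My Oracle Support download page for Patch 36916690.

```shell
#!/bin/sh
# Sketch: verify a staged patch archive against its published checksum before
# unzipping. The archive and checksum here are stand-ins; substitute the real
# values from the My Oracle Support download page.
set -e
STAGE=$(mktemp -d)
cd "$STAGE"

# Stand-in for p36916690_190000_Linux-x86-64.zip
printf 'demo payload' > patch_archive.zip

# Record the expected checksum (normally copied from MOS), then verify.
md5sum patch_archive.zip > patch.md5
if md5sum -c --quiet patch.md5; then
    echo "checksum OK"
else
    echo "checksum MISMATCH - re-download before patching" >&2
    exit 1
fi
```

Running the check on both nodes takes seconds and rules out transfer corruption before any binaries are touched.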
2. The Critical Role of the OPatch Utility Upgrade
One of the most common reasons a quarterly Release Update fails in enterprise environments is an outdated patching engine. Oracle regularly evolves the metadata structures inside its patch files, and an OPatch utility that predates the RU may be unable to parse the patch's XML instructions, causing an immediate failure during the prerequisite checking phase.
For the 19.25 Release Update, Oracle mandates OPatch version 12.2.0.1.44 or higher. This utility must be downloaded separately (Patch 6880880) and extracted into both the Grid Infrastructure home and the database home on all nodes.
[root@trdb1]$ cd $GRID_HOME
[root@trdb1]$ mv OPatch OPatch.bak_2
[root@trdb1]$ unzip /tmp/patch/p6880880_190000_Linux-x86-64.zip -d $GRID_HOME
[root@trdb1]$ chown -R grid:oinstall $GRID_HOME/OPatch
[+ASM1:grid@trdb1]$ $GRID_HOME/OPatch/opatch version
Notice that we do not delete the old OPatch directory; instead, we rename it so it can serve as an immediate rollback path. The chown command is also critical: if an administrator unzips the OPatch archive as root and forgets to return ownership to the grid or oracle user, those users will be locked out of the executable. Always verify the installation by explicitly checking the version output. If the terminal reports an older version than expected, investigate your PATH and any hardcoded aliases in your shell profile.
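The version check can also be automated so that a stale OPatch stops the run before any patching starts. This is a minimal sketch: the installed version is hard-coded here for illustration, where a live script would capture it from `$GRID_HOME/OPatch/opatch version`.

```shell
#!/bin/sh
# Sketch: fail fast if the installed OPatch is older than the minimum the
# 19.25 RU requires. INSTALLED is a stand-in; in practice it would come from:
#   $GRID_HOME/OPatch/opatch version | awk '/Version/{print $3}'
MIN_VERSION=12.2.0.1.44
INSTALLED=12.2.0.1.44

# sort -V orders dotted version strings numerically; the older one sorts first.
OLDEST=$(printf '%s\n%s\n' "$MIN_VERSION" "$INSTALLED" | sort -V | head -n1)
if [ "$OLDEST" = "$MIN_VERSION" ]; then
    echo "OPatch $INSTALLED meets the minimum ($MIN_VERSION)"
else
    echo "OPatch $INSTALLED is too old - install Patch 6880880 first" >&2
    exit 1
fi
```

The `sort -V` comparison avoids the classic string-comparison trap where "12.2.0.1.9" would incorrectly sort after "12.2.0.1.44".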
3. Cluster Health Checks and Background Daemon Management
Oracle RAC operates through a complex web of background daemons and synchronization services. Before initiating the patching sequence, which unlocks system binaries and restarts core services, we must confirm that the cluster is in a healthy state. Attempting to patch a cluster already suffering from split-brain symptoms or network heartbeat issues invites corruption.
One component that deserves particular attention is the Cluster Health Advisor (CHA), registered as the clusterware resource ora.chad. This monitoring service continuously analyzes database performance and frequently holds open file handles against core database libraries. If it remains active during patching, the operating system will refuse to replace those library files, producing a 'text file busy' error.
[+ASM1:grid@trdb1]$ srvctl status mgmtlsnr
[+ASM1:grid@trdb1]$ srvctl status mgmtdb
[+ASM1:grid@trdb1]$ srvctl stop cha
[+ASM1:grid@trdb1]$ crsctl status resource ora.chad -t
In my experience managing large-scale Oracle deployments, failing to stop the Cluster Health Advisor is a reliable way to extend your downtime by several hours. When the automated patching script attempts to unlock the Grid home, an active CHA process can cause the script to hang indefinitely, forcing you to trace process IDs and issue hard kills at the operating system level, which introduces unnecessary risk. Cleanly shutting down the ora.chad resource first is a mandatory preventative measure.
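The final check above can be made mechanical by parsing the `crsctl` output and refusing to continue while any instance of the resource is still online. This sketch uses a captured sample of `crsctl status resource ora.chad -t` output (the layout is illustrative) so the parsing logic can be rehearsed outside a cluster.

```shell
#!/bin/sh
# Sketch: confirm ora.chad reports OFFLINE on every node before patching.
# STATUS_OUTPUT is a stand-in for: crsctl status resource ora.chad -t
STATUS_OUTPUT='--------------------------------------------------------------------------------
Name           Target  State        Server                   State details
--------------------------------------------------------------------------------
ora.chad
               OFFLINE OFFLINE      trdb1                    STABLE
               OFFLINE OFFLINE      trdb2                    STABLE
--------------------------------------------------------------------------------'

# -w matches ONLINE as a whole word, so OFFLINE lines do not false-positive.
if printf '%s\n' "$STATUS_OUTPUT" | grep -qw 'ONLINE'; then
    echo "ora.chad still ONLINE somewhere - run: srvctl stop cha" >&2
    exit 1
fi
echo "ora.chad is OFFLINE on all nodes - safe to proceed"
```

Wiring this into a pre-patch checklist script removes the temptation to eyeball the status table under time pressure.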
4. Executing the Patch on the Primary Node (Node 1)
We are using a rolling patch methodology, which is the primary architectural advantage of an Oracle RAC environment: we fully upgrade the binaries and restart the software stack on the first node while the second node continues to handle live application traffic. The sequence is driven by a combination of root scripts shipped with the Grid Infrastructure and the OPatch utility itself.
The process involves four distinct phases: first, root runs the prepatch script to unlock the Grid Infrastructure binaries; second, the grid user applies the patch to the Grid home; third, the oracle user applies the patch to the database home; finally, root runs the postpatch script to relock the binaries and restart the newly patched clusterware stack.
[root@trdb1]$ $GRID_HOME/crs/install/rootcrs.sh -prepatch
[+ASM1:grid@trdb1]$ $GRID_HOME/OPatch/opatch apply -oh $GRID_HOME -local /tmp/patch/36916690/36912597
[PRDB1:oracle@trdb1]$ $ORACLE_HOME/OPatch/opatch apply -oh $ORACLE_HOME -local /tmp/patch/36916690/36912597
[root@trdb1]$ $GRID_HOME/crs/install/rootcrs.sh -postpatch
The -local flag on the opatch commands is absolutely critical. Oracle's patching engine is designed to be highly automated, and without this flag it may attempt to propagate the binary changes to the secondary node over SSH. Because the secondary node is still processing live production transactions, modifying its core binaries in flight would cause an immediate and unrecoverable instance crash. The -local flag strictly confines all binary modifications to the server you are logged into.
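Because each of the four phases must succeed before the next one starts, a thin wrapper that stops on the first failure is a useful safety net. The sketch below exercises only the sequencing logic; the real commands (rootcrs.sh -prepatch, the two `opatch apply -local` runs, and rootcrs.sh -postpatch) are swapped for no-op placeholders, since each must also run as a different user.

```shell
#!/bin/sh
# Sketch: stop-on-first-failure driver for the four node-1 phases.
# Placeholders (true) stand in for the real root/grid/oracle commands.
run_phase() {
    desc=$1; shift
    echo "PHASE: $desc"
    if ! "$@"; then
        echo "FAILED during: $desc - stop and diagnose before continuing" >&2
        exit 1
    fi
}

run_phase "prepatch (root): unlock GI binaries"        true  # rootcrs.sh -prepatch
run_phase "opatch apply (grid): patch GRID_HOME"       true  # opatch apply -local ...
run_phase "opatch apply (oracle): patch ORACLE_HOME"   true  # opatch apply -local ...
run_phase "postpatch (root): relock and restart stack" true  # rootcrs.sh -postpatch
echo "node 1 patch sequence complete"
```

In practice the user switching means these phases live in separate terminal sessions; the value of the wrapper is simply that a failed phase is never silently skipped.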
5. Navigating the Notorious Node 2 Permission Bug
Many generic tutorials suggest that once the primary node is successfully patched, you simply repeat the same commands on the secondary node. Seasoned database administrators know this is a dangerous oversimplification: a well-documented quirk in the Oracle inventory system manifests precisely at this stage, affecting the central inventory directory at /u01/app/oraInventory.
When the postpatch script completes on the first node, it can inadvertently tighten the ownership and read/write permissions of the central inventory's XML configuration files. If you then start the patching sequence on node 2 without correcting this, the prerequisite checks fail because the oui-patch.xml file cannot be read.
chmod 770 /u01/app/oraInventory/ContentsXML
chmod 660 /u01/app/oraInventory/ContentsXML/comps.xml
chmod 660 /u01/app/oraInventory/ContentsXML/inventory.xml
chmod 660 /u01/app/oraInventory/ContentsXML/libs.xml
chmod 660 /u01/app/oraInventory/ContentsXML/oui-patch.xml
ls -l /u01/app/oraInventory/ContentsXML/oui-patch.xml
By resetting the file permissions to 660 and ensuring the installation group (oinstall) retains write access to these XML files, we preemptively avoid the dreaded 'CRS-41053: checking Oracle Grid Infrastructure for file permission issues' error. If that error appears during the final postpatch run on the secondary node, also review the ownership of the diagnostic directories (/u01/app/oracle/diag). Resolving these permission layers methodically is a hallmark of an experienced systems administrator.
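The reset-and-verify loop above can be rehearsed safely before touching a live inventory. In this sketch a temporary directory stands in for /u01/app/oraInventory/ContentsXML; the chmod values mirror the ones used in the real fix.

```shell
#!/bin/sh
# Sketch: reset central-inventory XML permissions before patching node 2.
# A temp directory stands in for /u01/app/oraInventory/ContentsXML so the
# logic can be tested without a live inventory.
set -e
INV=$(mktemp -d)/ContentsXML
mkdir -p "$INV"
for f in comps.xml inventory.xml libs.xml oui-patch.xml; do
    : > "$INV/$f"
    chmod 600 "$INV/$f"     # simulate the over-tightened post-patch state
done

chmod 770 "$INV"            # group needs to traverse the directory
chmod 660 "$INV"/*.xml      # group (oinstall) needs read/write on the XML files

# Verify the result in octal, node by node, before starting the node-2 run.
stat -c '%a %n' "$INV"/oui-patch.xml
```

On the real system the same `stat -c '%a'` check gives an unambiguous octal answer, which is easier to audit than eyeballing `ls -l` output.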
6. Data Dictionary Compilation and Post-Patching Verification
Successfully replacing the physical binaries on the operating system does not complete the upgrade. The database's internal logical structure, the data dictionary, must now be synchronized so it reflects the new software parameters and security fixes introduced in Release Update 19.25. This logical step is performed with the datapatch utility.
During the binary replacement, many internal PL/SQL packages and views may become temporarily invalid. We run the recompilation script utlrp.sql to restore them. In addition, issuing specific grants before recompilation can prevent datapatch from stalling on unhandled exceptions in standard system packages.
[PRDB1:oracle@trdb1]$ sqlplus "/ as sysdba"
GRANT EXECUTE ON dbms_random TO PUBLIC;
GRANT EXECUTE ON utl_file TO PUBLIC;
GRANT EXECUTE ON utl_http TO PUBLIC;
GRANT EXECUTE ON dbms_sql TO PUBLIC;
@?/rdbms/admin/utlrp.sql
[PRDB1:oracle@trdb1]$ cd $ORACLE_HOME/OPatch
[PRDB1:oracle@trdb1]$ ./datapatch -verbose
The GRANT EXECUTE statements in the workflow above are the product of many hours spent resolving stalled datapatch runs; making these core packages accessible to PUBLIC during the recompilation phase eliminates the vast majority of dependency errors. Once datapatch concludes successfully, query the DBA_REGISTRY_SQLPATCH view to confirm the exact timestamp and status of the 19.25 Release Update. Mastering this detailed, multi-layered approach to Oracle RAC patching is what guarantees system stability and protects the enterprise's mission-critical data.
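The final sign-off can be scripted as well. This sketch checks a captured fragment of `datapatch -verbose` output for its completion banner and for error text; the sample log lines are illustrative, not real output from this system. The SQL-side confirmation, shown in the comment, would run in SQL*Plus on the live database.

```shell
#!/bin/sh
# Sketch: confirm datapatch completed cleanly by inspecting its output.
# LOG is a stand-in for captured "datapatch -verbose" output. On the live
# system, also confirm registration in SQL*Plus with something like:
#   SELECT patch_id, status, action_time FROM dba_registry_sqlpatch
#   ORDER BY action_time;
LOG='Patch 36912597 apply (pdb CDB$ROOT): SUCCESS
SQL Patching tool complete on Mon Dec  2 10:15:03 2024'

if printf '%s\n' "$LOG" | grep -q 'SQL Patching tool complete' &&
   ! printf '%s\n' "$LOG" | grep -qi 'error'; then
    echo "datapatch completed cleanly"
else
    echo "datapatch did not complete - review the log" >&2
    exit 1
fi
```

Archiving the datapatch log alongside the verification output gives the change record auditors will ask for after the maintenance window closes.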
