Tuesday, May 15, 2012

HOWTO (ebiz) - Howto to restart OPP without having to restart full Concurrent Managers

This post describes the steps to restart OPP processes manually without having to take a full outage on concurrent managers.

Sometimes OPP does not re-start normally when DBA tries to re-start OPP from the Form "Concurrent Manager -> Administer" form.

Login to APPS and navigate to Concurrent Managers/Administer form

Query for %Output Post Processor% manager.

You will see Actual=4 and Target=4 processes

Click on Restart Button. Make sure Oputput Post Processor is selected.

Wait for at least 2 minutes.


You will see Target=0 and Actual=0


If Actual is still 4, then proceed with steps below to manual kill OPP processes.

Identify OPP process in the UNIX
Login to APPTIER server and identify the OPP processes.


[appl@papp1 log]$ ps -ef |grep FNDOPP | grep -v grep
appl  6947  7203  2 10:31 ?        00:01:21 /u01/oracle/PROD/apps/tech_st/10.1.3/appsutil/jdk/bin/java -DCLIENT_PROCESSID=6947 -server -Xmx384m -XX:NewRatio=2 -XX:+UseSerialGC -Doracle.apps.fnd.common.Pool.leak.mode=stderr:off -verbose:gc -mx1024m -Ddbcfile=/u01/oracle/PROD/inst/apps/PROD_papp1/appl/fnd/12.0.0/secure/PROD.dbc -Dcpid=486574 -Dconc_queue_id=6269 -Dqueue_appl_id=0 -Dlogfile=/u01/oracle/PROD/apps/logs/appl/conc/log/FNDOPP486574.txt -DLONG_RUNNING_JVM=true -DOVERRIDE_DBC=true -DFND_JDBC_BUFFER_MIN=1 -DFND_JDBC_BUFFER_MAX=2 oracle.apps.fnd.cp.gsf.GSMServiceController
appl  6948  7203  3 10:31 ?        00:01:30 /u01/oracle/PROD/apps/tech_st/10.1.3/appsutil/jdk/bin/java -DCLIENT_PROCESSID=6948 -server -Xmx384m -XX:NewRatio=2 -XX:+UseSerialGC -Doracle.apps.fnd.common.Pool.leak.mode=stderr:off -verbose:gc -mx1024m -Ddbcfile=/u01/oracle/PROD/inst/apps/PROD_papp1/appl/fnd/12.0.0/secure/PROD.dbc -Dcpid=486575 -Dconc_queue_id=6269 -Dqueue_appl_id=0 -Dlogfile=/u01/oracle/PROD/apps/logs/appl/conc/log/FNDOPP486575.txt -DLONG_RUNNING_JVM=true -DOVERRIDE_DBC=true -DFND_JDBC_BUFFER_MIN=1 -DFND_JDBC_BUFFER_MAX=2 oracle.apps.fnd.cp.gsf.GSMServiceController
appl  6949  7203  2 10:31 ?        00:01:06 /u01/oracle/PROD/apps/tech_st/10.1.3/appsutil/jdk/bin/java -DCLIENT_PROCESSID=6949 -server -Xmx384m -XX:NewRatio=2 -XX:+UseSerialGC -Doracle.apps.fnd.common.Pool.leak.mode=stderr:off -verbose:gc -mx1024m -Ddbcfile=/u01/oracle/PROD/inst/apps/PROD_papp1/appl/fnd/12.0.0/secure/PROD.dbc -Dcpid=486576 -Dconc_queue_id=6269 -Dqueue_appl_id=0 -Dlogfile=/u01/oracle/PROD/apps/logs/appl/conc/log/FNDOPP486576.txt -DLONG_RUNNING_JVM=true -DOVERRIDE_DBC=true -DFND_JDBC_BUFFER_MIN=1 -DFND_JDBC_BUFFER_MAX=2 oracle.apps.fnd.cp.gsf.GSMServiceController
appl  6953  7203  0 10:31 ?        00:00:27 /u01/oracle/PROD/apps/tech_st/10.1.3/appsutil/jdk/bin/java -DCLIENT_PROCESSID=6953 -server -Xmx384m -XX:NewRatio=2 -XX:+UseSerialGC -Doracle.apps.fnd.common.Pool.leak.mode=stderr:off -verbose:gc -mx1024m -Ddbcfile=/u01/oracle/PROD/inst/apps/PROD_papp1/appl/fnd/12.0.0/secure/PROD.dbc -Dcpid=486577 -Dconc_queue_id=6269 -Dqueue_appl_id=0 -Dlogfile=/u01/oracle/PROD/apps/logs/appl/conc/log/FNDOPP486577.txt -DLONG_RUNNING_JVM=true -DOVERRIDE_DBC=true -DFND_JDBC_BUFFER_MIN=1 -DFND_JDBC_BUFFER_MAX=2 oracle.apps.fnd.cp.gsf.GSMServiceController

Kill OPP processes

By Killing the processes you are forcing OPP process to stop. ICM will detect all OPP processes were found dead and Manually DBA has to Activate OPP manager from the Administer (concurrent managers) form.

$ kill -9 6953  6949  6948  6947


Tail the Main concurrent Manager log and confirm OPP processes have been found dead

$ cd $APPLCSF/log/PROD_1017.mgr

NOTE 1017 is the month and date of the last time concurrent managers started. Output is is a follows. You need to confirm all four processes are found dead.


                   Process monitor session started : 10-NOV-2011 10:29:13
Found dead process: spid=(7307), cpid=(485567), Service Instance=(6269)
Found dead process: spid=(7295), cpid=(485566), Service Instance=(6269)
Found dead process: spid=(7278), cpid=(485565), Service Instance=(6269)
Found dead process: spid=(7268), cpid=(485564), Service Instance=(6269)

Confirm Actual and target processes count is 0

Navigate to Concurrent Managers/Administer form and query for %Output Post Processor% manager.

You will see Actual=0 and Target=0 processes

Click on Activate Button. Make sure Oputput Post Processor is selected.

Wait for at least 2 minutes.

You will see Target=4 and Actual=4

Tail the Main concurrent Manager log and confirm OPP processes have started

$ cd $APPLCSF/log/PROD_1017.mgr


NOTE 1017 is the month and date of the last time concurrent managers started. Output is a follows. You need to confirm all four processes have started.



                 Process monitor session started : 10-NOV-2011 10:31:50
Starting FNDCPOPP Concurrent Manager               : 10-NOV-2011 10:31:51
Starting FNDCPOPP Concurrent Manager               : 10-NOV-2011 10:31:51
Starting FNDCPOPP Concurrent Manager               : 10-NOV-2011 10:31:51
Starting FNDCPOPP Concurrent Manager               : 10-NOV-2011 10:31:52

Check OPP log files are processing concurrent requests post actions correctly

[appl@papp1 log]$ $ cd $APPLCSF/log/FNDOPP*

check timestamp, normally last four processes

[appl@papp1 log]$ ls -lrt FNDOPP*
-rw-rw-r-- 1 appl dba 2836639 Nov  9 20:10 FNDOPP485566.txt
-rw-rw-r-- 1 appl dba 5704431 Nov  9 20:10 FNDOPP485565.txt
-rw-rw-r-- 1 appl dba 1316142 Nov 10 09:09 FNDOPP485567.txt
-rw-rw-r-- 1 appl dba 1016488 Nov 10 10:21 FNDOPP485564.txt
-rw-rw-r-- 1 appl dba   43403 Nov 10 11:59 FNDOPP486577.txt
-rw-rw-r-- 1 appl dba   47016 Nov 10 11:59 FNDOPP486576.txt
-rw-rw-r-- 1 appl dba   37590 Nov 10 11:59 FNDOPP486575.txt
-rw-rw-r-- 1 appl dba   42614 Nov 10 11:59 FNDOPP486574.txt

[appl@papp1 log]$ /sbin/fuser FNDOPP486574.txt FNDOPP486575.txt FNDOPP486576.txt FNDOPP486577.txt
FNDOPP486574.txt:     6947
FNDOPP486575.txt:     6948
FNDOPP486576.txt:     6949
FNDOPP486577.txt:     6953

Output file FNDOPP486574.txt

[11/10/11 10:34:43 AM] [486574:RT10109319] Executing post-processing actions for request 10109319.
[11/10/11 10:34:43 AM] [486574:RT10109319] Starting XML Publisher post-processing action.
[11/10/11 10:34:43 AM] [486574:RT10109319]
Template code: X142R
Template app:  XC
Language:      en
Territory:     null
Output type:   PDF
...
...
...
[11/10/11 10:34:44 AM] [486574:RT10109319] XML Publisher post-processing action complete.
[11/10/11 10:34:44 AM] [486574:RT10109319] Completed post-processing actions for request 10109319.

HOWTO (Oracle DB) - Monitor the progress and ETA of your RMAN Backup or Duplicate

Have you ever wondered how good or bad your RMAN backup or duplicate is performing?

This happened to me very often, backup and restore or duplicate operations are business critical activities. Normally, when a critical restore of your Production database is happening, everyone is looking over you shoulder asking "Is it done? What is the ETA?". These are questions that sometimes are hard to answer, specially if you have slow backup infrastructure or slow disk or Tapes.

The good thing is that Oracle provides all the information required to predict the ETA of an RMAN operation. All the information in there in the dictionary views. I have make my life easier by creating a shell script that gives you a full picture of your RMAN operation.

With this script you tell:
  • When the operation will be finished.
  • How much MBytes have been done.
  • What is the current throughput/sec of the operation
  • and mainly, give that information to your manager watching over your shoulder every second!!!
Output of the script:

$ . oraenv
$ bash rman_duplicate_monitor_eta.sh 
 
-------- RMAN: timing information - Tue May 15 22:05:47 EST 2012 ----------
 
OPNAME                              SOFAR  TOTALWORK   PCT_DONE    ELA_MIN    ETA_MIN START_TIME          FINISH_TIME
------------------------------ ---------- ---------- ---------- ---------- ---------- ------------------- -------------------
RMAN: incremental datafile bac        638        656      97.26          1          0 15-05-2012 22:02:30 15-05-2012 22:05:47
RMAN: aggregate input              737488     738214       99.9        5.2          0 15-05-2012 21:59:13 15-05-2012 22:05:47
RMAN: incremental datafile bac        640          0        100          1            15-05-2012 22:02:30
RMAN: aggregate output             519231          0        100        1.9            15-05-2012 22:02:30
 
'DUPLICATE/RESTORETHROUGHPUT MBYTES_SOFAR MBYTES_PER_SEC NAME
---------------------------- ------------ -------------- --------------------------
DUPLICATE/RESTORE THROUGHPUT         22.3            .07 physical write total bytes
 
'BACKUPTHROUGHPUT MBYTES_SOFAR MBYTES_PER_SEC NAME
----------------- ------------ -------------- --------------------------
BACKUP THROUGHPUT         5868          50.59 physical read total bytes

Shell Script code below:

#!/bin/bash
# $Id: ora_watch.sh 24 2010-03-16 00:18:42Z cricci $
#=========================================================================+
#  Copyright (c) 2009 Christian Daniel Ricci, Sydney, Australia           |
#                          All rights reserved.                           |
#=========================================================================+
# FILENAME
#       rman_duplicate_monitor_eta.sh.sh
#
# DESCRIPTION
#       Monitor RMAN operations
#
# HISTORY
#       CR - 16-Mar-2010 - Created
#
#=========================================================================*/

while [ 1 -eq 1 ]; do

ps -ef | grep rman | grep -v grep > /dev/null
[ $? -eq 1 ] && exit 0

echo "-------- RMAN: timing information - `date` ----------"

sqlplus -s '/as sysdba' <<EOF
col opname for a30
col message for a30 wra on
col units for a15
set lin 300 pages 1000 feed off echo off

alter session set nls_date_format='dd-mm-yyyy hh24:mi:ss';

select OPNAME,
--message,
SOFAR,TOTALWORK,round((SOFAR*100/decode(TOTALWORK,0,SOFAR,TOTALWORK)),2) PCT_DONE,
--UNITS,
round(elapsed_seconds/60,1) ela_min,round(time_remaining/60,1) eta_min,
START_TIME,round(time_remaining/60,1)/60/24+sysdate finish_time
 from v\$session_longops
where
OPNAME like 'RMAN%'
and SOFAR != TOTALWORK
order by 8;

SELECT
  'DUPLICATE/RESTORE THROUGHPUT',
  round(SUM(v.value/1024/1024),1) mbytes_sofar,
  round(SUM(v.value     /1024/1024)/nvl((SELECT MIN(elapsed_seconds)
  FROM v\$session_longops
  WHERE OPNAME LIKE 'RMAN: aggregate input'
  AND SOFAR            != TOTALWORK
  AND elapsed_seconds IS NOT NULL
  ),SUM(v.value     /1024/1024)),2) mbytes_per_sec,
  n.name
FROM gv\$sesstat v,
  v\$statname n,
  gv\$session s
WHERE v.statistic#=n.statistic#
AND n.name = 'physical write total bytes'
AND v.sid = s.sid
AND v.inst_id=s.inst_id
AND s.program like 'rman@%'
GROUP BY 'DUPLICATE/RESTORE THROUGHPUT',n.name;

SELECT
  'BACKUP THROUGHPUT',
  round(SUM(v.value/1024/1024),1) mbytes_sofar,
  round(SUM(v.value     /1024/1024)/nvl((SELECT MIN(elapsed_seconds)
  FROM v\$session_longops
  WHERE OPNAME LIKE 'RMAN: aggregate output'
  AND SOFAR            != TOTALWORK
  AND elapsed_seconds IS NOT NULL
  ),SUM(v.value     /1024/1024)),2) mbytes_per_sec,
  n.name
FROM gv\$sesstat v,
  v\$statname n,
  gv\$session s
WHERE v.statistic#=n.statistic#
AND n.name = 'physical read total bytes'
AND v.sid = s.sid
AND v.inst_id=s.inst_id
AND s.program like 'rman@%'
GROUP BY 'BACKUP THROUGHPUT',n.name;

exit
EOF

sleep 60

done