Tuesday, May 15, 2012

HOWTO (Oracle DB) - Monitor the progress and ETA of your RMAN Backup or Duplicate

Have you ever wondered how good or bad your RMAN backup or duplicate is performing?

This happened to me very often, backup and restore or duplicate operations are business critical activities. Normally, when a critical restore of your Production database is happening, everyone is looking over you shoulder asking "Is it done? What is the ETA?". These are questions that sometimes are hard to answer, specially if you have slow backup infrastructure or slow disk or Tapes.

The good thing is that Oracle provides all the information required to predict the ETA of an RMAN operation. All the information in there in the dictionary views. I have make my life easier by creating a shell script that gives you a full picture of your RMAN operation.

With this script you tell:
  • When the operation will be finished.
  • How much MBytes have been done.
  • What is the current throughput/sec of the operation
  • and mainly, give that information to your manager watching over your shoulder every second!!!
Output of the script:

$ . oraenv
$ bash rman_duplicate_monitor_eta.sh 
 
-------- RMAN: timing information - Tue May 15 22:05:47 EST 2012 ----------
 
OPNAME                              SOFAR  TOTALWORK   PCT_DONE    ELA_MIN    ETA_MIN START_TIME          FINISH_TIME
------------------------------ ---------- ---------- ---------- ---------- ---------- ------------------- -------------------
RMAN: incremental datafile bac        638        656      97.26          1          0 15-05-2012 22:02:30 15-05-2012 22:05:47
RMAN: aggregate input              737488     738214       99.9        5.2          0 15-05-2012 21:59:13 15-05-2012 22:05:47
RMAN: incremental datafile bac        640          0        100          1            15-05-2012 22:02:30
RMAN: aggregate output             519231          0        100        1.9            15-05-2012 22:02:30
 
'DUPLICATE/RESTORETHROUGHPUT MBYTES_SOFAR MBYTES_PER_SEC NAME
---------------------------- ------------ -------------- --------------------------
DUPLICATE/RESTORE THROUGHPUT         22.3            .07 physical write total bytes
 
'BACKUPTHROUGHPUT MBYTES_SOFAR MBYTES_PER_SEC NAME
----------------- ------------ -------------- --------------------------
BACKUP THROUGHPUT         5868          50.59 physical read total bytes

Shell Script code below:

#!/bin/bash
# $Id: ora_watch.sh 24 2010-03-16 00:18:42Z cricci $
#=========================================================================+
#  Copyright (c) 2009 Christian Daniel Ricci, Sydney, Australia           |
#                          All rights reserved.                           |
#=========================================================================+
# FILENAME
#       rman_duplicate_monitor_eta.sh.sh
#
# DESCRIPTION
#       Monitor RMAN operations
#
# HISTORY
#       CR - 16-Mar-2010 - Created
#
#=========================================================================*/

while [ 1 -eq 1 ]; do

ps -ef | grep rman | grep -v grep > /dev/null
[ $? -eq 1 ] && exit 0

echo "-------- RMAN: timing information - `date` ----------"

sqlplus -s '/as sysdba' <<EOF
col opname for a30
col message for a30 wra on
col units for a15
set lin 300 pages 1000 feed off echo off

alter session set nls_date_format='dd-mm-yyyy hh24:mi:ss';

select OPNAME,
--message,
SOFAR,TOTALWORK,round((SOFAR*100/decode(TOTALWORK,0,SOFAR,TOTALWORK)),2) PCT_DONE,
--UNITS,
round(elapsed_seconds/60,1) ela_min,round(time_remaining/60,1) eta_min,
START_TIME,round(time_remaining/60,1)/60/24+sysdate finish_time
 from v\$session_longops
where
OPNAME like 'RMAN%'
and SOFAR != TOTALWORK
order by 8;

SELECT
  'DUPLICATE/RESTORE THROUGHPUT',
  round(SUM(v.value/1024/1024),1) mbytes_sofar,
  round(SUM(v.value     /1024/1024)/nvl((SELECT MIN(elapsed_seconds)
  FROM v\$session_longops
  WHERE OPNAME LIKE 'RMAN: aggregate input'
  AND SOFAR            != TOTALWORK
  AND elapsed_seconds IS NOT NULL
  ),SUM(v.value     /1024/1024)),2) mbytes_per_sec,
  n.name
FROM gv\$sesstat v,
  v\$statname n,
  gv\$session s
WHERE v.statistic#=n.statistic#
AND n.name = 'physical write total bytes'
AND v.sid = s.sid
AND v.inst_id=s.inst_id
AND s.program like 'rman@%'
GROUP BY 'DUPLICATE/RESTORE THROUGHPUT',n.name;

SELECT
  'BACKUP THROUGHPUT',
  round(SUM(v.value/1024/1024),1) mbytes_sofar,
  round(SUM(v.value     /1024/1024)/nvl((SELECT MIN(elapsed_seconds)
  FROM v\$session_longops
  WHERE OPNAME LIKE 'RMAN: aggregate output'
  AND SOFAR            != TOTALWORK
  AND elapsed_seconds IS NOT NULL
  ),SUM(v.value     /1024/1024)),2) mbytes_per_sec,
  n.name
FROM gv\$sesstat v,
  v\$statname n,
  gv\$session s
WHERE v.statistic#=n.statistic#
AND n.name = 'physical read total bytes'
AND v.sid = s.sid
AND v.inst_id=s.inst_id
AND s.program like 'rman@%'
GROUP BY 'BACKUP THROUGHPUT',n.name;

exit
EOF

sleep 60

done
  

7 comments:

  1. Thanks Chris it Helped.

    Reg "DUPLICATE/RESTORE THROUGHPUT" script,

    Do you have script which can tell how much mb of data written by each channel?

    ReplyDelete
  2. Thanks, nice Script. I'm restoring 37TB database with ~2K datafiles using 8 Auxiliary Channels, this script gives me the status/ETA etc. for the channels currently working which is really good. But is there a way to find out how long the entire restore is going to take?

    ReplyDelete
  3. thanks christian for the script.

    is there is a way we can check how much time is left.

    ReplyDelete