Automating and delegating datapump imports with PL/SQL

We recently bought another company and we are in the process of integrating our systems with theirs. One requirement was to import their daily business data into our Business Intelligence system for reporting. We agreed they would provide an Oracle datapump export file for us to import. We wanted to automate this and have a non-privileged application owner perform the import. I created a stored procedure that could run the import as the system user and granted execute on the procedure to the application owner. The dbms_datapump package has lots of options and it was fun exploring them to find the best way of doing this.

Setup –

Run as sys:

 grant execute on dbms_lock to system;

Run as system:

CREATE OR REPLACE DIRECTORY SOURCE AS '/nfsmount/exp/incoming';

CREATE OR REPLACE PROCEDURE SYSTEM.SCHEMA_IMPORT (dumpfilename IN VARCHAR2)
IS
   h1            NUMBER;           -- data pump job handle
   job_state     VARCHAR2 (30);
   status        ku$_Status;       -- data pump status
   job_not_exist EXCEPTION;
   PRAGMA EXCEPTION_INIT (job_not_exist, -31626);
BEGIN
   h1 := DBMS_DATAPUMP.open (operation => 'IMPORT',
                             job_mode  => 'SCHEMA',
                             job_name  => NULL);
   DBMS_DATAPUMP.set_parameter (h1, 'TABLE_EXISTS_ACTION', 'TRUNCATE');
   DBMS_DATAPUMP.add_file (h1, dumpfilename, 'SOURCE');
   DBMS_DATAPUMP.add_file (h1,
                           dumpfilename || '.log',
                           'SOURCE',
                           NULL,
                           DBMS_DATAPUMP.KU$_FILE_TYPE_LOG_FILE);
   DBMS_DATAPUMP.metadata_remap (h1,
                                 'REMAP_SCHEMA',
                                 'FROMSCHEMA',
                                 'TOSCHEMA');
   DBMS_DATAPUMP.start_job (h1);

   -- Poll until the job completes or stops. ORA-31626 (job does not
   -- exist) means the job finished and cleaned itself up between polls.
   job_state := 'UNDEFINED';

   BEGIN
      WHILE (job_state != 'COMPLETED') AND (job_state != 'STOPPED')
      LOOP
         status := DBMS_DATAPUMP.get_status (
                      handle  => h1,
                      mask    =>   DBMS_DATAPUMP.ku$_status_job_error
                                 + DBMS_DATAPUMP.ku$_status_job_status
                                 + DBMS_DATAPUMP.ku$_status_wip,
                      timeout => -1);
         job_state := status.job_status.state;
         DBMS_LOCK.sleep (10);
      END LOOP;
   EXCEPTION
      WHEN job_not_exist
      THEN
         DBMS_OUTPUT.put_line ('job finished');
   END;

   COMMIT;
END;
/
grant execute on schema_import to APP_OWNER;

Execution –

Run as APP_OWNER:

EXECUTE SYSTEM.SCHEMA_IMPORT('FILENAME.DMP')
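
While the import runs, you can watch it from another session. A minimal check, assuming the job was created by SYSTEM as above, is to query the data pump job view:

SELECT owner_name, job_name, operation, state
  FROM dba_datapump_jobs;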

Using OEM 12c canned clone process

Today we got a request to refresh a development database from a copy of production. I decided to use the canned clone process that ships with OEM 12c instead of the usual scripts or manual processes. I was pleasantly surprised with the results. There was a failure related to an NFS problem and I was able to restart the process easily.

Pro:

  • Minimizes the amount of scratch disk space by backing up and restoring one datafile at a time.
  • No scripting or manual process required.
  • Uses standard RMAN under the covers.
  • Works with older versions (this one was 10.2).
  • Works with existing backups or directly from the source database (this database had never been backed up and was not in archivelog mode).
  • Can duplicate a live database if it is in archivelog mode and running version 11.
  • Easy to restart after errors.
  • Runs as a database job in the background, unattended.
  • Works on the same machine or across the network.

Con:

  • More time required – one datafile at a time, no parallelization; 763G ran for 6.5 hours.
  • No email notification for errors or completion.
  • Black box – you really need to know how to do the steps manually, and this will not teach you.
  • Needs scratch space on local disks to avoid NFS performance problems (size of the largest datafile).

 

Migrating to different endianness

I recently moved a 2TB database from Oracle 11.1 on Linux to Oracle 11.2 on AIX using impdp with the network_link and parallel options. The data was imported in about two hours and the indexes took another seven hours. There was no downtime, but we kept the users off the system while this was happening.

Here is the command I used on the new server:

nohup impdp system/secret NETWORK_LINK=olddb FULL=y  PARALLEL=25 &
Then you can attach to the running job to see which parallel process is importing which table:

impdp system attach
Import> status

From the same session you can raise the number of parallel processes if you want:

Import> parallel=30
I had created all the datafiles in the new database and created a db link named olddb beforehand. The servers were on the same local area network and the network was the bottleneck.
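
For reference, the database link looked something like this; a minimal sketch, with the password and TNS alias as placeholders:

-- Run on the new (importing) database as SYSTEM.
-- 'secret' and OLDDB_TNS are made-up values.
CREATE DATABASE LINK olddb
   CONNECT TO system IDENTIFIED BY secret
   USING 'OLDDB_TNS';

-- Quick connectivity test before starting the import.
SELECT 1 FROM dual@olddb;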

 

SQLDeveloper on Oracle Enterprise Linux 6.3

Today while installing SQLDeveloper on OEL 6.3 I ran into a couple of issues I wasn’t expecting. The installation went fine using the Oracle rpm file. The first execution of the /opt/sqldeveloper/sqldeveloper.sh script prompted for the location of a jdk. A find came up with /etc/alternatives/java_sdk_openjdk and that worked fine. The next issue was “Can’t connect to X11 window server using ‘:0.0’ as the value of the DISPLAY variable”. I tried running xterm to see if it would work. Xterm was not installed on the system and I couldn’t find any of the other X11 stuff. So I ran “yum install xterm”. Now xterm works and so does SQLDeveloper.

OEM 12c

I took the opportunity to attend the OEM 12c Workshop which was offered here in Indianapolis at the Oracle office last week. I had seen a presentation at the Oracle User’s Group meeting in April. Greg Walters was the presenter at the INOUG meeting and he also ran the workshop. It was a very good hands-on workshop. Each attendee got their own servers. One of the exercises involved locking up a database and then connecting using real-time ADDM (Automatic Database Diagnostics Monitor). It connects directly to the shared memory on the server and reads the v$ information. It reported the specific Unix process that was causing the problem. I logged on to the server and ran a kill -9 on that process, which cleared up the database problem. Another good exercise involved generating an ADM (Application Data Model) and then using the model to subset the data on a new database. I typed in the SQL to select a subset of one table, using bind variables for future flexibility. When I ran the job it selected a subset for that table plus all the other tables in the schema, according to the referential integrity rules the ADM had discovered. That was very cool and fast too.

Training a new DBA on investigating performance problems

We have a new DBA in training and part of my job is to mentor and demonstrate good Oracle practices. Here’s a write up I made for him on how I solved a performance issue.

Issue: The BI team complained about a large query in the production BI database that had been running slowly for the past few days.

Steps to resolve:

1. Action: Find the SQL ids for the past three days.
   Result: Found three different SQL ids, one for each day.
   Tool: @find_sql_awr (uses dba_hist_* views)

2. Action: Review the execution plan for each of the past three days.
   Result: Because bind variables are not used, the optimizer saw each day as separate SQL, yet it chose the same plan every time; the implication is that stats were stable over the past three days. SQL Monitor showed a mismatch between expected rows and actual rows returned at each step (this was apparent to me while the SQL was running).
   Tool: @dplan_awr (uses dbms_xplan.display_awr)

3. Action: Determine the driving tables.
   Result: Identified three tables; queried dba_tables to find the owner.
   Tool: Eyeballs 🙂

4. Action: Check statistics variability over the past 10 days.
   Result: Two tables showed no change in stats for the past 10 days.
   Tool: @diff_table_stats (uses dbms_stats.diff_table_stats_in_history)

5. Action: Gather fresh stats on all three tables.
   Tool: dbms_stats.gather_table_stats

6. Action: Check stats variability again.
   Result: One indexed column on one table changed from having 2 unique values to 70.
   Tool: @diff_table_stats

7. Action: Generate a new execution plan, with force matching.
   Result: The new plan's calculated cost is less than half the original plan's. The real performance improvement is much larger, because the original cost was calculated on old statistics and was reported as much, much lower than the actual cost.
   Tool: SQL Tuning Advisor

Discussion:

The underlying data changed over the past 10 days. The SQL was being re-parsed every day because bind variables are not used. The automatic statistics gathering job did not recalculate statistics on two of the tables because it has a 10% threshold. The data changed less than 10% for the whole table even though one column changed significantly. When the optimizer saw the new statistics, it changed the indexes it uses. I added the force matching option so that the optimizer wouldn’t need to re-parse every day just because bind variables are not used. Quote from the BI team: “It is now running in single digit minutes as opposed to hours.”
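
If you want to try the stats-history comparison yourself, dbms_stats.diff_table_stats_in_history can be queried directly. A minimal sketch, with a hypothetical owner and table name:

-- Compare current statistics with those in effect 10 days ago.
-- SALES_OWNER and SALES_FACT are made-up names.
SELECT report, maxdiffpct
  FROM TABLE (DBMS_STATS.diff_table_stats_in_history (
                 ownname      => 'SALES_OWNER',
                 tabname      => 'SALES_FACT',
                 time1        => SYSTIMESTAMP,
                 time2        => SYSTIMESTAMP - INTERVAL '10' DAY,
                 pctthreshold => 0));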

Resources:

The cardinality presentation by Maria Colgan and the scripts presentation by Kerry Osborne provide more background on this method.

Truncating tables you don’t own

This topic came up again today – “I have delete privileges on the table, why can’t I truncate it?” I remembered getting surprised by this when Oracle changed it. In 7.0 anyone who had delete privileges could also truncate the table. When I upgraded to 7.1 my solution was to create a dynamic PL/SQL procedure owned by the table owner and grant execute to the other user. This is still the best workaround (today we would use execute immediate). I was surprised to see that an old post in comp.databases.oracle is still around – since 1994! I guess it’s true that nothing ever gets deleted on the internet 🙂
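
For what it's worth, the modern version of the same workaround is only a few lines. A minimal sketch, created by the table owner, with dbms_assert added as a cheap guard against SQL injection:

-- Owned by the table owner; grant execute to whoever needs
-- truncate on this user's tables.
CREATE OR REPLACE PROCEDURE truncate_table (p_table_name IN VARCHAR2)
AS
BEGIN
   EXECUTE IMMEDIATE 'truncate table '
                     || DBMS_ASSERT.simple_sql_name (p_table_name);
END;
/

GRANT EXECUTE ON truncate_table TO some_user;   -- placeholder grantee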

Discussion subject changed to “TRUNCATE & Version 7.1.3” by Steve Harville
Steve Harville – Dec 2 1994, 10:19 am
Oracle shipped a new plsql package with 7.1.3 that
provides a way to circumvent the new restrictions
on truncate. The dbms_sql package is installed
during a standard install. Set serveroutput on in
the glogin.sql script to see the output of this
procedure.
-----------------------------------------------------------

create or replace procedure truncate_table (
table_name   varchar2,
storage_type varchar2)
as
-- This procedure accepts 2 arguments. table_name is
-- the name of the table to truncate. It must be owned
-- by the owner of this procedure. storage_type can be
-- 'drop storage' or 'reuse storage'. Grant execute on
-- this procedure to whoever needs truncate privileges
-- on this user's tables.
-- Steve Harville, Stemco Inc, 12/1/94

crsor integer;
rval  integer;
begin
dbms_output.put_line('Truncating Table : '|| table_name ||
' Storage : '|| storage_type);
crsor := dbms_sql.open_cursor;
dbms_sql.parse(crsor, 'truncate table '|| table_name ||
' '|| storage_type ,dbms_sql.v7);
rval := dbms_sql.execute(crsor);
dbms_sql.close_cursor(crsor);
end;
/
-- grant execute to any user or role that needs truncate
grant execute on truncate_table to username, role;

-----------------------------------------------------------

-- truncate_table.sql
-- example : sqlplus / @truncate_table emp reuse storage
-- Steve Harville, Stemco Inc, 12/1/94

execute <owner name goes here>.truncate_table('&1','&2 &3')

-----------------------------------------------------------

# truncate_table unix shell script
# example : truncate_table emp reuse storage
# Steve Harville, Stemco Inc, 12/1/94

sqlplus / @truncate_table $1 $2 $3

-----------------------------------------------------------

Hope this helps.

Steve Harville

Installing ASM/RAC without asmlib

I’m installing RAC (using ASM) on Red Hat Linux 6.2. Oracle has decided not to provide asmlib for Red Hat 6 and above, so I needed to figure out how to ensure correct permissions and ownership for the shared disks. Red Hat has an article on the support website which recommends using udev or multipath. I set up multipath on my VMware systems but it did not see any disks: VMware was not reporting the UUIDs of the disks to Linux. A little research on the VMware website gave me a clue, and I inserted this line into the vmx files:

disk.EnableUUID = "TRUE"

After rebooting, the multipath software could see the shared disks and automatically created the multipath devices.

Migrating databases to new storage

This was a last minute request to move five databases on two servers from an old failing storage system to a new one. There was a total of 5.6 terabytes to move and the databases were in heavy use by the development team. The DBA who usually supported these systems was unavailable so I started working with the storage team to get this done.

I was able to get the storage team to create 67 new disks with the same sizes as the original disks and assign them to the correct servers. I researched how to get the disks to show up for ASM on Windows. There were 37 diskgroups and I assigned the new disks to each disk group according to the size of the original disks. So each diskgroup now had double the storage.

The next step shows the power of this technique: I dropped the old disks from each diskgroup and ASM moved all the data from the old disks to the new disks, then released the old disks. I started these late in the day and they finished the next morning. No downtime, no impact to the development team.
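
The whole move boils down to two statements per diskgroup. A minimal sketch, with hypothetical diskgroup and disk names (the Windows ASM disk strings on these servers were different):

-- Add the new disk, then drop the old one. ASM rebalances the data
-- onto the new disk and releases the old disk when the rebalance ends.
ALTER DISKGROUP data ADD DISK '\\.\ORCLDISKDATA2';
ALTER DISKGROUP data DROP DISK data_0000 REBALANCE POWER 4;

-- Watch the rebalance; the old disk is released when no rows return.
SELECT operation, state, est_minutes FROM v$asm_operation;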

Lessons learned: use standard disk sizes and minimize the number of diskgroups.

RAC on AIX

I just finished the class X0161 Oracle RAC on AIX Systems Workshop, taught by Andrei Socoliuc of IBM Romania. Andrei knew a lot about Oracle RAC and AIX, so it was a good class. I thought the best part was the hands-on labs, especially the pre-installation preparation of the operating system. There was also a lot of good information on hardware and LPAR configuration. This is my first RAC class since 2003, when RAC had just been released, so the overview was a good way for me to get a refresher on RAC internals. My only complaint is that we spent a lot of time on IBM’s shared disk solution (GPFS) and very little on Oracle’s ASM.

How do I know if the cardinality estimates in a plan are accurate?

Maria Colgan has another great blog post today about using the GATHER_PLAN_STATISTICS hint to see the actual rows returned as well as the optimizer’s estimate of the number of rows returned. This is very good information and will make it obvious if your statistics are wrong.
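
The recipe is two statements. A minimal sketch against a hypothetical table:

-- Run the query with the hint so actual row counts are collected.
SELECT /*+ gather_plan_statistics */ COUNT (*)
  FROM sales_fact                          -- made-up table
 WHERE region = 'WEST';

-- Then compare E-Rows (estimated) with A-Rows (actual) at each step.
SELECT * FROM TABLE (DBMS_XPLAN.display_cursor (format => 'ALLSTATS LAST'));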

Opening a standby READ_ONLY and continuing to apply logs

After reading Guenadi’s post “Creating Oracle 11g active standby database from physical standby database”, I decided to try it myself.

10:34:21 PROD-DB1SYS>select * from v$instance;
INSTANCE_NUMBER INSTANCE_NAME    HOST_NAME                                                        VERSION
--------------- ---------------- ---------------------------------------------------------------- -----------------
              1 PROD-DB             dev-box                                                     11.1.0.7.0
1 row selected.
Elapsed: 00:00:00.01
10:37:23 PROD-DB1SYS>select open_mode from v$database;
OPEN_MODE
----------
MOUNTED
1 row selected.
Elapsed: 00:00:00.03
10:37:48 PROD-DB1SYS>alter database recover managed standby database cancel;
Database altered.
Elapsed: 00:00:02.06
10:38:52 PROD-DB1SYS>alter database open read only;
Database altered.
Elapsed: 00:00:23.96
10:39:45 PROD-DB1SYS>alter database recover managed standby database using current logfile disconnect;
Database altered.
Elapsed: 00:00:09.08
10:40:40 PROD-DB1SYS>select open_mode from v$database;
OPEN_MODE
----------
READ ONLY
1 row selected.
Elapsed: 00:00:00.02
……………………………………………………………………………….
SQL> select HOST_NAME from v$instance;

HOST_NAME
----------------------------------------------------------------

prod-box

SQL> create user sharville identified by secret account unlock;
User created.
SQL> grant unlimited tablespace to sharville;
Grant succeeded.
SQL> create table sharville.sh (col_id number(10));
Table created.
……………………………………………………………………………..
SQL> select HOST_NAME from v$instance;

HOST_NAME
----------------------------------------------------------------
dev-box
SQL> select object_name from all_objects where owner = 'SHARVILLE';
OBJECT_NAME
------------------------------
SH

WAIT_FOR_GAP status on standby

This morning I noticed one of the managed standby databases was in a pending shutdown state. I restarted it and verified the logs were getting applied from production. However, I noticed the WAIT_FOR_GAP status on one of the old logs. Some of the logs had not been applied and were no longer on the production system. So I found out which logs were in the gap and used RMAN to recover them in production. Oracle automatically shipped them to the standby and applied them.

Discover a gap:
SELECT PROCESS, CLIENT_PROCESS, SEQUENCE#, STATUS FROM V$MANAGED_STANDBY;
PROCESS   CLIENT_P  SEQUENCE# STATUS
--------- -------- ---------- ------------
ARCH      ARCH         136626 CLOSING
ARCH      ARCH              0 CONNECTED
ARCH      ARCH         136627 CLOSING
ARCH      ARCH         136628 CLOSING
RFS       LGWR         136629 IDLE
RFS       UNKNOWN           0 IDLE
RFS       UNKNOWN           0 IDLE
MRP0      N/A          130107 WAIT_FOR_GAP
RFS       UNKNOWN           0 IDLE

Find out the extent of the gap:

SELECT high.thread#, "LowGap#", "HighGap#"
FROM
    (
    SELECT thread#, MIN(sequence#)-1 "HighGap#"
    FROM
    (
        SELECT a.thread#, a.sequence#
        FROM
        (
            SELECT *
            FROM v$archived_log
        ) a,
        (
            SELECT thread#, MAX(next_change#) gap1
            FROM v$log_history
            GROUP BY thread#
        ) b
        WHERE a.thread# = b.thread#
        AND a.next_change# > gap1
    )
    GROUP BY thread#
) high,
(
    SELECT thread#, MIN(sequence#) "LowGap#"
    FROM
    (
        SELECT thread#, sequence#
        FROM v$log_history, v$datafile
        WHERE checkpoint_change# <= next_change#
        AND checkpoint_change# >= first_change#
    )
    GROUP BY thread#
) low
WHERE low.thread# = high.thread#;
   THREAD#    LowGap#   HighGap#
---------- ---------- ----------
         1     130106     130153

Recover the logs in Production:

RMAN>  restore archivelog from sequence 130106 until sequence 130153;

I had to repeat this sequence of commands several times because there were multiple gaps.
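
As an aside, the standby’s v$archive_gap view reports the current gap directly, one gap at a time, which fits the repeat-until-clean loop I ended up doing:

SELECT thread#, low_sequence#, high_sequence# FROM v$archive_gap;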

Hotsos Symposium

This was the best conference I have ever attended. Very smart speakers and attendees. It was fun and I learned a lot about Oracle performance. Some of the themes were: statistics, Exadata and fine-grained measurement. I would definitely go again! Kerry Osborne of Enkitec gave the keynote presentation and I had a chance to talk to him several times. Enkitec is a leader in Exadata and Kerry shared lots of good information. Hotsos Symposium 2011

CRS-2800: Cannot start resource ‘ora.asm’ as it is already in the INTERMEDIATE state on server

I got this error when I was installing Oracle RAC 11.2 on Red Hat Enterprise Linux 5.6. I was installing Clusterware using ASM on VMware shared disks. When I created the independent persistent virtual disks, I left the “allocate all disk space now” option unselected. Oracleasm was happy on both RAC nodes. The Oracle installer was happy when it created +ASM1 on the first RAC node. But when the ASM instance started on node 2, it did not like the “virtually provisioned” disk. The +ASM2 instance was not open and was complaining that one of the shared disks was corrupt at a certain byte. When I checked the virtual disk files, I saw that Oracle was trying to read past the end of the file. I started all over with new, fully allocated shared disks and that fixed the problem. Everything is up and running now.

Oracle error 19505 returned by remote Oracle server

I was re-running an rman duplicate from active database today and got this error. I had assumed that the duplicate command would overwrite the files from the previous run. Instead it created new filenames and caused the ASM disk group to fill up.

Here’s the rman message on the target system:

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 11/16/2010 10:08:27
RMAN-03015: error occurred in stored script Memory Script
RMAN-03009: failure of backup command on ORA_DISK_6 channel at 11/16/2010 10:00:25
ORA-17628: Oracle error 19505 returned by remote Oracle server

Here’s the alert log entry on the auxiliary system:

ORA-19505: failed to identify file "+data"
ORA-17502: ksfdcre:4 Failed to create file +data
ORA-15041: diskgroup "DATA" space exhausted

The target (source) is on ASM using OMF (Oracle Managed Files). The auxiliary (destination) is also using ASM and OMF. I looked through the rman log and found a lot of set newname commands like this:

executing command: SET NEWNAME

No filename is specified, so Oracle generates new OMF filenames that differ from those of the previous run. That’s why I ran out of space.

I used asmcmd to remove all of the files and reclaim the space.

Easy fix, but initially confusing.

ORA-38794: Flashback target time not in current incarnation

I got this error this morning when I attempted to flashback a standby database to an earlier time.

 flashback database to timestamp
   to_timestamp('05/01/10 04:00:00', 'MM/DD/YY HH24:MI:SS')
SQL> /
flashback database to timestamp
*
ERROR at line 1:
ORA-38794: Flashback target time not in current incarnation

This is Bug 6843554 – ORA-38794 during Flashback Database on Standby previously activated with Restore Point [ID 6843554.8]. It’s fixed in 11.2 but we are on 11.1.0.7. It’s caused by “orphaned incarnations” which you can see using this query:
select * from v$database_incarnation;

There is an easy work-around: flash back to the SCN.

 flashback database to scn 44062315534;
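
If you only know the target time, the timestamp_to_scn function can translate it, assuming the SCN-to-time mapping for that time is still retained (it is only kept for a limited period):

SELECT timestamp_to_scn (
          TO_TIMESTAMP ('05/01/10 04:00:00', 'MM/DD/YY HH24:MI:SS'))
  FROM dual;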

Also, flashback database performs a recovery, so it needs the archived redo logs from just before the SCN. We had to restore those from tape because they had already been backed up and deleted by RMAN.