We recently bought another company and we are in the process of integrating our systems with theirs. One requirement was to import their daily business data into our Business Intelligence system for reporting. We agreed they would provide an Oracle datapump export file for us to import. We wanted to automate this and have a non-privileged application owner perform the import. I created a stored procedure that could run the import as the system user and granted execute on the procedure to the application owner. The dbms_datapump package has lots of options and it was fun exploring them to find the best way of doing this.
Run as sys :
grant execute on dbms_lock to system;
Run as system:
CREATE OR REPLACE DIRECTORY SOURCE AS '/nfsmount/exp/incoming';
CREATE OR REPLACE PROCEDURE SYSTEM.SCHEMA_IMPORT (
dumpfilename IN VARCHAR2)
h1 NUMBER; -- data pump job handle
job_state VARCHAR2 (30);
status ku$_Status; -- data pump status
PRAGMA EXCEPTION_INIT (job_not_exist, -31626);
DBMS_DATAPUMP.open (operation => 'IMPORT',
job_mode => 'SCHEMA',
job_name => NULL);
DBMS_DATAPUMP.set_parameter (h1, 'TABLE_EXISTS_ACTION', 'TRUNCATE');
DBMS_DATAPUMP.add_file (h1, dumpfilename, 'SOURCE');
dumpfilename || '.log',
job_state := 'UNDEFINED';
WHILE (job_state != 'COMPLETED') AND (job_state != 'STOPPED')
handle => h1,
mask => DBMS_DATAPUMP.ku$_status_job_error
timeout => -1);
job_state := status.job_status.state;
DBMS_OUTPUT.put_line ('job finished');
grant execute on schema_import to APP_OWNER;
Run as APP_OWNER:
Today we got a request to refresh a development database from a copy of production. I decided to use the canned clone process that ships with OEM 12c instead of the usual scripts or manual processes. I was pleasantly surprised with the results. There was a failure related to an NFS problem and I was able to restart the process easily.
- Minimizes the amount of scratch disk space by backing up and restoring one datafile at a time.
- No scripting or manual process required.
- Uses standard RMAN under the covers.
- Works with older versions (This one was 10.2)
- Works with existing backups or directly from the source database (This database was not previously backed up and was not in archivelog mode)
- Can duplicate a live database if it is in archivelog mode and running at version 11
- Easy to restart after errors.
- Runs as a database job in the background, unattended.
- Works on the same machine or across the network.
- More time required – one datafile at a time, no parallelization. 763G ran for 6.5 hours.
- No email notification for errors or completion.
- Black box – you really need to know how to do the steps manually and this will not teach you.
- Needs scratch space on local disks to avoid NFS performance problems. (size of largest datafile)
I recently moved a 2TB database from Linux, Oracle 11.1 to AIX, Oracle 11.2 using impdp with network_link and parallel options. The data was imported in about two hours and the indexes took another 7 hours. There was no downtime but we kept the users off the system while this was happening.
Here is the command I used on the new server:
nohup impdp system/secret NETWORK_LINK=olddb FULL=y PARALLEL=25 &
Then you can run this to see which parallel process is importing which table:
impdp system attach
Import> parallel=30 << this will increase the parallel processes if you want
I had created all the datafiles in the new database and created a db link named olddb beforehand. The servers were on the same local area network and the network was the bottleneck.
Today while installing SQLDeveloper on OEL 6.3 I ran into a couple of issues I wasn’t expecting. The installation went fine using the Oracle rpm file. The first execution of the /opt/sqldeveloper/sqldeveloper.sh script prompted for the location of a jdk. A find came up with /etc/alternatives/java_sdk_openjdk and that worked fine. The next issue was “Can’t connect to X11 window server using ‘:0.0’ as the value of the DISPLAY variable”. I tried running xterm to see if it would work. Xterm was not installed on the system and I couldn’t find any of the other X11 stuff. So I ran “yum install xterm”. Now xterm works and so does SQLDeveloper.
I took the opportunity to attend the OEM 12c Workshop which was offered here in Indianapolis at the Oracle office last week. I had seen a presentation at the Oracle User’s Group meeting in April. Greg Walters was the presenter at the INOUG meeting and he also ran the workshop. It was a very good hands on workshop. Each attendee got their own servers. One of the exercises involved locking up a database and then connecting using real time ADDM (Automatic Database Diagnostics Monitor). It connects directly to the shared memory on the server and reads the v$ information. It reported the specific Unix process that was causing the problem. I logged on to the server and ran a kill -9 on that process which cleared up the database problem. Another good exercise involved generating an ADM (Application Data Model) and then using the model to subset the data on a new database. I typed in the SQL to select a subset of one table using bind variables for future flexibility. When I ran the job it selected a subset for that table plus all the other tables in the schema according to the ADM discovered referential integrity rules. That was very cool and fast too.
We have a new DBA in training and part of my job is to mentor and demonstrate good Oracle practices. Here’s a write up I made for him on how I solved a performance issue.
Issue: The BI team complained about a large query in the production BI database that had been running slowly for the past few days.
Steps to resolve:
|Find the past three days SQL ids
||Found three different SQL ids for the past three days.
||@find_sql_awr (uses dba_hist_* views)
|Review execution plan for past three days
||The optimizer saw each day as separate sql because bind variables are not used but it used the same plan anyway. The implication is that stats are stable over past three days. SQL Monitor showed a mismatch between expected rows and actual rows returned each step (this was apparent to me while the SQL was running).
||@dplan_awr (uses dbms_xplan.display_awr)
|Determine the driving tables
||Identified three tables – queried dba_tables to find owner
|Check statistics variability over the past 10 days
||Two tables showed no change in stats for past 10 days.
||@diff_table_stats (uses dbms_stats.diff_table_stats_in_history)
|Gather fresh stats on all three tables
|Check stats variability again
||One indexed column on one table changed from having 2 unique values to 70.
|Generate a new execution plan, force matching
||New plan’s calculated cost is less than half the original plan. However the real performance improvement is a lot better because the original plan was calculated on old statistics and the original calculated cost was reported as much, much lower than the actual cost.
||SQL Tuning Advisor
The underlying data changed over the past 10 days. The SQL was being re-parsed every day because bind variables are not used. The automatic statistics gathering job did not recalculate statistics on two of the tables because it has a 10% threshold. The data changed less than 10% for the whole table even though one column changed significantly. When the optimizer saw the new statistics, it changed the indexes it uses. I added the force matching option so that the optimizer wouldn’t need to re-parse every day just because bind variables are not used. Quote from the BI team: “It is now running in single digit minutes as opposed to hours.”
The cardinality presentation by Maria Colgan and the scripts presentation by Kerry Osborne provide more background on this method.
This topic came up again today – “I have delete privileges on the table, why can’t I truncate it?” I remembered getting surprised by this when Oracle changed it. In 7.0 anyone who had delete privileges could also truncate the table. When I upgraded to 7.1 my solution was to create a dynamic PL/SQL procedure owned by the table owner and grant execute to the other user. This is still the best workaround (today we would use execute immediate). I was surprised to see that an old post in comp.databases.oracle is still around – since 1994! I guess it’s true that nothing ever gets deleted on the internet 🙂
|Discussion subject changed to “TRUNCATE & Version 7.1.3” by Steve Harville
|| More options Dec 2 1994, 10:19 am
Oracle shipped a new plsql package with 7.1.3 that
provides a way to circumvent the new restrictions
. The dbms_sql package is installed
during a standard install. Set serveroutput on in
the glogin.sql script to see the output of this
———————————————————–create or replace procedure truncate_table (
— This procedure accepts 2 arguments. table_name is
— the name of the table to truncate
. It must be owned
— by the owner of this procedure. storage_type can be
— ‘drop storage’ or ‘reuse storage’. Grant execute on
— this procedure to whoever needs truncate
— on this user’s tables.
— Steve Harville, Stemco
dbms_output.put_line(‘Truncating Table : ‘|| table_name ||
‘ Storage : ‘|| storage_type);
crsor := dbms_sql.open_cursor;
table ‘|| table_name ||
‘ ‘|| storage_type ,dbms_sql.v7);
rval := dbms_sql.execute(crsor);
— grant execute to any user or role that needs truncate
grant execute on truncate_table to username, role;
— example : sqlplus / @truncate_table emp reuse storage
— Steve Harville, Stemco Inc, 12/1/94
execute <owner name goes here>.truncate_table(‘&1′,’&2 &3’)
# truncate_table unix shell script
# example : truncate_table emp reuse storage
# Steve Harville, Stemco Inc, 12/1/94
sqlplus / @truncate_table $1 $2 $3
Hope this helps.
I’m installing RAC (using ASM) on Red Hat Linux 6.2. Oracle has decided not to provide asmlib for Red Hat 6 and above. So I needed to figure out how to ensure correct permissions and ownership for the shared disks. Red Hat has an article on the support website which recommends using udev or multipath. I set up multipath on my VMware systems but it did not see any disks. VMware was not reporting the UUID’s of the disks to Linux. A little research on the VMware website gave me a clue, and I inserted this line into the vmx files:
disk.EnableUUID = “TRUE”
After rebooting, the multipath software could see the shared disks and automatically created the multipath devices.
This was a last minute request to move five databases on two servers from an old failing storage system to a new one. There was a total of 5.6 terabytes to move and the databases were in heavy use by the development team. The DBA who usually supported these systems was unavailable so I started working with the storage team to get this done.
I was able to get the storage team to create 67 new disks with the same sizes as the original disks and assign them to the correct servers. I researched how to get the disks to show up for ASM on Windows. There were 37 diskgroups and I assigned the new disks to each disk group according to the size of the original disks. So each diskgroup now had double the storage.
The next step shows the power of this technique: I dropped the old disks from each diskgroup and ASM moved all the data from the old disks to the new disks, then released the old disks. I started these late in the day and they finished the next morning. No downtime, no impact to the development team.
Lessons learned: use standard disk sizes, minimize the number of diskgroups
I just finished the class: X0161 Oracle RAC on AIX Systems Workshop taught by Andrei Socoliuc of IBM Romania. Andrei knew a lot about Oracle RAC and AIX so it was a good class. I thought the best part was the hands on labs, especially the pre-installation preparation of the operating system. There was also a lot of good information on hardware and LPAR configuration. This is my first RAC class since 2003 when it had just been released, so the overview of RAC was a good way for me to get a refresher on RAC internals. My only complaint is that we spent a lot of time on IBM’s shared disk solution (GPFS) and very little on Oracle’s ASM .
Maria Colgan has another great blog post today about using the GATHER_PLAN_STATISTICS hint to see the actual rows returned as well as the optimizer’s estimate of number of rows returned. This is very good information and will make it obvious if your statistics are wrong.
After reading Guenadi’s post I decided to try it myself. Creating Oracle 11g active standby database from physical standby database.
10:34:21 PROD-DB1SYS>select * from v$instance;
INSTANCE_NUMBER INSTANCE_NAME HOST_NAME VERSION
--------------- ---------------- ---------------------------------------------------------------- -----------------
1 PROD-DB dev-box 18.104.22.168.0
1 row selected.
10:37:23 PROD-DB1SYS>select open_mode from v$database;
1 row selected.
10:37:48 PROD-DB1SYS>alter database recover managed standby database cancel;
10:38:52 PROD-DB1SYS>alter database open read only;
10:39:45 PROD-DB1SYS>alter database recover managed standby database using current logfile disconnect;
10:40:40 PROD-DB1SYS>select open_mode from v$database;
1 row selected.
SQL> select HOST_NAME from v$instance;
SQL> create user sharville identified by secret account unlock;
SQL> grant unlimited tablespace to sharville;
SQL> create table sharville.sh (col_id number(10));
SQL> select HOST_NAME from v$instance;
22.214.171.124.0 04-APR-11 OPEN NO 1 STARTED
ALLOWED NO ACTIVE PRIMARY_INSTANCE NORMAL NO
SQL> select object_name from all_objects where owner = 'SHARVILLE';
This morning I noticed one of the managed standby databases was in a pending shutdown state. I restarted it and verified the logs were getting applied from production. However, I noticed the WAIT_FOR_GAP status on one of the old logs. Some of the logs had not been applied and were no longer on the production system. So I found out which logs were in the gap and used RMAN to recover them in production. Oracle automatically shipped them to the standby and applied them.
Discover a gap:
1* SELECT PROCESS, CLIENT_PROCESS, SEQUENCE#, STATUS FROM V$MANAGED_STANDBY
PROCESS CLIENT_P SEQUENCE# STATUS
--------- -------- ---------- ------------
ARCH ARCH 136626 CLOSING
ARCH ARCH 0 CONNECTED
ARCH ARCH 136627 CLOSING
ARCH ARCH 136628 CLOSING
RFS LGWR 136629 IDLE
RFS UNKNOWN 0 IDLE
RFS UNKNOWN 0 IDLE
MRP0 N/A 130107 WAIT_FOR_GAP
RFS UNKNOWN 0 IDLE
Find out the extent of the gap:
1 SELECT high.thread#, "LowGap#", "HighGap#"
4 SELECT thread#, MIN(sequence#)-1 "HighGap#"
7 SELECT a.thread#, a.sequence#
10 SELECT *
11 FROM v$archived_log
12 ) a,
14 SELECT thread#, MAX(next_change#)gap1
15 FROM v$log_history
16 GROUP BY thread#
17 ) b
18 WHERE a.thread# = b.thread#
19 AND a.next_change# > gap1
21 GROUP BY thread#
22 ) high,
24 SELECT thread#, MIN(sequence#) "LowGap#"
27 SELECT thread#, sequence#
28 FROM v$log_history, v$datafile
29 WHERE checkpoint_change# <= next_change#
30 AND checkpoint_change# >= first_change#
32 GROUP BY thread#
33 ) low
34* WHERE low.thread# = high.thread#
THREAD# LowGap# HighGap#
---------- ---------- ----------
1 130106 130153
Recover the logs in Production:
RMAN> restore archivelog from sequence 130106 until sequence 130153;
I had to repeat this sequence of commands several times because there were multiple gaps.
This was the best conference I have ever attended. Very smart speakers and attendees. It was fun and I learned a lot about Oracle performance. Some of the themes were : statistics, Exadata and fine grained measurement. I would definitely go again! Kerry Osborne of Enkitek gave the keynote presentation and I had a chance to talk to him several times. Enkitek is a leader in Exadata and Kerry shared lots of good information. Hotsos Symposium 2011
I got this error when I was installing Oracle RAC 11.2 on Red Hat Enterprise Linux 5.6. I was installing Clusterware using ASM on VMware shared disks. When I created the independent persistent virtual disks, I left the “allocate all disk space now” option unselected. Oracleasm was happy on both RAC nodes. The Oracle installer was happy when it created +ASM1 on the first RAC node. But when the ASM instance started on node 2 it did not like the “virtually provisioned” disk. The +ASM2 instance was not open and was complaining about one of the shared disks being corrupt at a certain byte. When I checked the virtual disk files, I saw that Oracle was trying to read past the end of the file. I started all over with new fully allocated shared disk and that fixed the problem. Everything is up and running now.
I was re-running an rman duplicate from active database today and got this error. I had assumed that the duplicate command would overwrite the files from the previous run. Instead it created new filenames and caused the ASM disk group to fill up.
Here’s the rman message on the target system:
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-03002: failure of Duplicate Db command at 11/16/2010 10:08:27
RMAN-03015: error occurred in stored script Memory Script
RMAN-03009: failure of backup command on ORA_DISK_6 channel at 11/16/2010 10:00:25
ORA-17628: Oracle error 19505 returned by remote Oracle server
Here’s the alert log entry on the auxiliary system:
ORA-19505: failed to identify file "+data"
ORA-17502: ksfdcre:4 Failed to create file +data
ORA-15041: diskgroup "DATA" space exhausted
The target (source) is on ASM using OMF (Oracle Managed Files). The auxiliary (destination) is also using ASM and OMF. I looked through the rman log and found a lot of set newname commands like this:
executing command: SET NEWNAME
No filename is specified so it makes up new filenames, which are different from the previous run. So that’s why I ran out of space.
I used asmcmd to remove all of the files and reclaim the space.
Easy fix, but initially confusing.
Thanks to Zeeshan Baig for the great article about installing Webutil for Oracle Forms. It is much better than the Oracle documentation.
Here’s a good blog post about cpu utilization.
I got this error this morning when I attempted to flashback a standby database to an earlier time.
flashback database to timestamp
to_timestamp(’05/01/10 04:00:00′, ‘MM/DD/YY HH24:MI:SS’)
flashback database to timestamp
ERROR at line 1:
ORA-38794: Flashback target time not in current incarnation
This is Bug 6843554 – ORA-38794 during Flashback Database on Standby previously activated with Restore Point [ID 6843554.8]. It’s fixed in 11.2 but we are on 126.96.36.199. It’s caused by “orphaned incarnations” which you can see using this query:
select * from v$database_incarnation;
There is an easy work-around, flash back to the scn.
flashback database to scn 44062315534;
Also, flashback database does a recovery so it needs the archived redo log from before the scn. We had to restore that from tape because it had already been backed up and deleted by RMAN.
I found out that sometimes the OC4J processes won’t restart because the shutdown script fails to clean out the J2EE persistent lock files. Lucky for me someone had already documented how to fix the problem (delete the files).