Oracle Application Express 5.1.1 on Exadata cluster

Our DBA team really likes using APEX to develop DBA applications. We are a pretty small team and we like to automate and standardize everything we do. We make a habit of using database tables to store our configuration information and we write scripts that read those DBA tables in order to support our business users.

We recently took control of a new Exadata cluster and of course we created a DBA repository database as we were creating the business databases. Since APEX 5.1.1 was just released, we decided to replace our old APEX system. That system was was a single instance running on a virtual machine and controlled by an outsourced administration team. We are a lot more comfortable having control over installation and maintenance of a system we depend on to get our work done.

The new Exadata is a four node RAC and our new DBA repository has an instance running on each node. We did not want to install software on the Exadata so that ruled out using Rest Data Services (aka APEX Listener). It also ruled out using Oracle HTTP Server with mod_plsql.

Since this APEX system is only used by our DBA team and is not exposed to outside networks we felt comfortable using the Embedded PL/SQL Gateway. However, the Oracle documentation contains this ominous note:”Oracle recommends that you do not select the Embedded PL/SQL Gateway option for Oracle RAC installations. Because the Embedded PL/SQL Gateway uses an HTTP Server built into the database instance, it does not take advantage of the Oracle RAC shared architecture.” (https://docs.oracle.com/database/apex-5.1/HTMIG/choosing-web-listener.htm#HTMIG370)

What to do? We decided to move forward with the Embedded PL/SQL Gateway and see how useable it would be. We could always spin up another virtual machine later if we needed to go the Rest Data Services route. I went through the steps in the APEX documentation to set up the Embedded PL/SQL Gateway. I could not get a response when I hit the database machine APEX URL that I had configured. I poked around on the servers and noticed that the XML db listeners were not running. I had forgotten that when I created the DBA database I intentionally did not install any options like Java and XML db. After installing XML db and restarting the database APEX started working. (https://docs.oracle.com/database/121/ADXDB/appaman.htm#ADXDB2700)

But that’s not all – here’s the real reason I wanted to blog about this. This APEX system is reachable using the scan listener! So that scary quote in the documentation should be ignored. We can connect using a URL like this: http://corpexa-scan.corp.com:8080/apex . No server names are required. So we CAN take advantage of RAC functionality.

This is just what we wanted so we are very satisfied with our new APEX system.

Oracle In Memory Column Store

I attended the Greater Cincinnati Oracle User’s Group meeting yesterday at the local Oracle office. I was expecting a good presentation because Maria Colgan (@sqlmaria) was presenting.

I was not disappointed and this was an excellent presentation and conversation with Maria who used to head up the optimizer team and now runs the In Memory team.  She went into deep dives of a lot of technical innovations they have used for performance and reliability. I took lots of notes.

All of the code is embedded in Oracle database software (12.1) so it’s just a parameter change to turn it on. The only other step is to alter table INMEMORY. Some of the inmemory stuff can be offloaded to Exadata but you have to pay the license when you turn it on.

In Oracle 12.2 the whole thing can run in exadata flash instead of in the db server dram. Oracle 12.2 will be out “sometime this year”.

Automating Redo Log Maintenance

We asked a consultant to review our production database and one of the recommendations was to switch our redo logs from 2GB to 4GB. This surprised me at first but after checking the 11.2 documentation, I saw that 4GB is the current recommendation.  This is a 4 node RAC cluster and we had previously added several additional redo logs based on the application vendor’s  recommendations :-).  I decided I would take the opportunity to try out the PL/SQL debugger in SQL Developer 4.0.1.  Doing the development entirely in SQL Developer instead of vi turned out to be a good decision and saved me a lot of time.  In addition to changing the redo sizes, I wanted to move them from our original Exadata X2 ASM disk groups to our new Exadata X3 disk groups.

Manual process:

  • Create a new group like the original group
  • If the old group is inactive and has been archived, drop the old group
  • If the old group is active, current or has not been archived, force a log switch on all RAC nodes, wait until it’s inactive, then drop the old group
  • Repeat until all of the old log groups are gone (I’m using Oracle Managed Files so I don’t need to log on to asmcmd and manually delete the files)

I automated these manual steps in the sys.REDO_MAINT package:

CREATE OR REPLACE PACKAGE REDO_MAINT AS
type log_rec_type is record (log_group number,log_thread number,is_archived varchar2(3),log_status varchar2(10) ,log_size number);
cursor log_records return log_rec_type;
procedure manage_logs (
  new_size      varchar2,
  new_location  varchar2,
  new_location2 varchar2);
function create_new_group (
  newgroup    number,
  newthread   number,
  newfilespec varchar2,
  newfilespec2 varchar2,
  newfilesize varchar2)
    return boolean;
function drop_old_group (
  oldgroup  number)
    return boolean;
function redo_in_use (
  groupnum  number)
    return boolean;
function redo_not_archived (
  groupnum number)
    return boolean;
END REDO_MAINT;
/

CREATE OR REPLACE PACKAGE BODY REDO_MAINT AS

  cursor log_records return log_rec_type IS
    SELECT group# ,thread#,archived,status,BYTES from v$log;

  procedure manage_logs (
  new_size      varchar2,
  new_location  varchar2,
  new_location2 varchar2) AS
  BEGIN
    for existing_logs in log_records
    loop
      dbms_output.put_line('Group '||existing_logs.log_group 
                        || ' Thread ' || existing_logs.log_thread
                        || ' Archived ' || existing_logs.is_archived
                        || ' Status ' || existing_logs.log_status
                        || ' Size ' || existing_logs.log_size);

      if create_new_group(existing_logs.log_group + 100, existing_logs.log_thread,new_location,new_location2,new_size) 
      then
        dbms_output.put_line('create new group success');
        loop
          if drop_old_group(existing_logs.log_group)
          then
            dbms_output.put_line('Dropped old group ' || existing_logs.log_group);
            exit;
          end if;
          dbms_output.put_line('old group was not dropped - retrying ' || existing_logs.log_group);
        end loop;
      else
        dbms_output.put_line('create new group FAILED');
        exit;
      end if;
    end loop;
    NULL;
  END manage_logs;

  function create_new_group (
  newgroup    number,
  newthread   number,
  newfilespec varchar2,
  newfilespec2 varchar2,
  newfilesize varchar2)
    return boolean AS
  create_group varchar(200);
  BEGIN
    create_group:='ALTER DATABASE ADD LOGFILE THREAD ' || newthread 
    || ' GROUP ' || newgroup 
    || ' ( ' || '''' || newfilespec || '''' || ' , ' || '''' || newfilespec2 || '''' 
    || ' ) size ' || newfilesize ;
    dbms_output.put_line(create_group);
    execute immediate create_group;
    RETURN TRUE;
    exception
    when others then 
     dbms_output.put_line('An error was encountered - '||SQLCODE||' -ERROR- '||SQLERRM);
     return FALSE;
  END create_new_group;

  function drop_old_group (
  oldgroup  number)
    return boolean AS
  drop_group varchar(200);
  BEGIN
    drop_group:='alter database drop logfile group ' || oldgroup ;
     if redo_in_use(oldgroup) or redo_not_archived(oldgroup)
     then
      execute immediate 'ALTER SYSTEM ARCHIVE LOG CURRENT' ;
      -- wait until target redo log becomes inactive
      loop
        if redo_in_use(oldgroup) or redo_not_archived(oldgroup)
        then
          dbms_output.put_line('Sleeping - group ' || oldgroup);
          DBMS_LOCK.SLEEP(60);
        else
          -- was active - now inactive 
          dbms_output.put_line(drop_group);
          execute immediate drop_group;
          RETURN TRUE;
          exit;
        end if;
      end loop;
      return FALSE;
    end if;
    -- was not active - no wait required
    dbms_output.put_line(drop_group);
    execute immediate drop_group;
    RETURN TRUE;
        exception
    when others then 
     dbms_output.put_line('An error was encountered - '||SQLCODE||' -ERROR- '||SQLERRM);
     return FALSE;
  END drop_old_group;

  function redo_in_use (
  groupnum  number)
    return boolean AS
  current_status varchar2(10);  
  BEGIN
  select status into current_status from v$log where group# = groupnum;
  if current_status in ( 'ACTIVE','CURRENT') 
  then
    RETURN TRUE;
  else
    return FALSE;
  END IF;
        exception
    when others then 
     dbms_output.put_line('An error was encountered - '||SQLCODE||' -ERROR- '||SQLERRM);
     return FALSE;
  END redo_in_use;

  function redo_not_archived (
  groupnum number)
    return boolean AS
  is_it_archived varchar(3);
  BEGIN
    select archived into is_it_archived from v$log where group# = groupnum;
    if is_it_archived = 'NO'
    then
    -- redo not archived
       RETURN TRUE;
    else
      return FALSE;
    END IF;
      exception
    when others then 
     dbms_output.put_line('An error was encountered - '||SQLCODE||' -ERROR- '||SQLERRM);
     return FALSE;
  END redo_not_archived;

END REDO_MAINT;
/

Example:
SYS>select group#,thread#,bytes,archived,status from v$log order by  group#, thread#;

    GROUP#    THREAD#            BYTES ARC STATUS
---------- ---------- ---------------- --- ----------------
         1          1    2,147,483,648 YES INACTIVE
         2          1    2,147,483,648 YES INACTIVE
         3          2    2,147,483,648 YES INACTIVE
         4          2    2,147,483,648 YES INACTIVE
         5          1    2,147,483,648 YES INACTIVE
         6          2    2,147,483,648 YES INACTIVE
         7          3    2,147,483,648 YES INACTIVE
         8          3    2,147,483,648 YES ACTIVE
         9          3    2,147,483,648 NO  CURRENT
        10          4    2,147,483,648 YES INACTIVE
        11          4    2,147,483,648 YES INACTIVE
        12          4    2,147,483,648 YES INACTIVE
        13          1    2,147,483,648 YES INACTIVE
        14          1    2,147,483,648 YES INACTIVE
        15          1    2,147,483,648 YES INACTIVE
        16          1    2,147,483,648 NO  CURRENT
        17          1    2,147,483,648 YES INACTIVE
        20          2    2,147,483,648 YES ACTIVE
        21          2    2,147,483,648 NO  CURRENT
        22          2    2,147,483,648 YES INACTIVE
        23          2    2,147,483,648 YES INACTIVE
        24          2    2,147,483,648 YES INACTIVE
        30          3    2,147,483,648 YES INACTIVE
        31          3    2,147,483,648 YES INACTIVE
        32          3    2,147,483,648 YES INACTIVE
        33          3    2,147,483,648 YES INACTIVE
        34          3    2,147,483,648 YES INACTIVE
        40          4    2,147,483,648 YES INACTIVE
        41          4    2,147,483,648 YES INACTIVE
        42          4    2,147,483,648 YES INACTIVE
        43          4    2,147,483,648 NO  CURRENT
        44          4    2,147,483,648 YES INACTIVE

32 rows selected.
Kick it off:
BEGIN
  REDO_MAINT.MANAGE_LOGS( NEW_SIZE => '4g', NEW_LOCATION => '+RECO_EDP2', NEW_LOCATION2 => '+DATA_EDP2');
END;
/

on another session:
SYS>select group#,thread#,bytes,archived,status from v$log order by  group#, thread#;

    GROUP#    THREAD#            BYTES ARC STATUS
---------- ---------- ---------------- --- ----------------
         3          2    2,147,483,648 YES INACTIVE
         4          2    2,147,483,648 YES INACTIVE
         5          1    2,147,483,648 YES INACTIVE
         6          2    2,147,483,648 YES INACTIVE
         7          3    2,147,483,648 YES INACTIVE
         8          3    2,147,483,648 YES INACTIVE
         9          3    2,147,483,648 YES INACTIVE
        10          4    2,147,483,648 YES INACTIVE
        11          4    2,147,483,648 NO  CURRENT
        12          4    2,147,483,648 YES INACTIVE
        13          1    2,147,483,648 YES INACTIVE
        14          1    2,147,483,648 YES INACTIVE
        15          1    2,147,483,648 YES INACTIVE
        16          1    2,147,483,648 YES INACTIVE
        17          1    2,147,483,648 YES INACTIVE
        20          2    2,147,483,648 YES INACTIVE
        21          2    2,147,483,648 YES INACTIVE
        22          2    2,147,483,648 YES INACTIVE
        23          2    2,147,483,648 YES INACTIVE
        24          2    2,147,483,648 NO  CURRENT
        30          3    2,147,483,648 YES INACTIVE
        31          3    2,147,483,648 YES INACTIVE
        32          3    2,147,483,648 NO  CURRENT
        33          3    2,147,483,648 YES INACTIVE
        34          3    2,147,483,648 YES INACTIVE
        40          4    2,147,483,648 YES INACTIVE
        41          4    2,147,483,648 YES INACTIVE
        42          4    2,147,483,648 YES INACTIVE
        43          4    2,147,483,648 YES INACTIVE
        44          4    2,147,483,648 YES INACTIVE
       101          1    4,294,967,296 NO  CURRENT
       102          1    4,294,967,296 YES UNUSED

32 rows selected.

Output from PL/SQL:

Group 1 Thread 1 Archived NO Status CURRENT Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 1 GROUP 101 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
Sleeping - group 1
Sleeping - group 1
Sleeping - group 1
Sleeping - group 1
Sleeping - group 1
alter database drop logfile group 1
Dropped old group 1
Group 2 Thread 1 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 1 GROUP 102 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 2
Dropped old group 2
Group 3 Thread 2 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 2 GROUP 103 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 3
Dropped old group 3
Group 4 Thread 2 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 2 GROUP 104 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 4
Dropped old group 4
Group 5 Thread 1 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 1 GROUP 105 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 5
Dropped old group 5
Group 6 Thread 2 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 2 GROUP 106 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 6
Dropped old group 6
Group 7 Thread 3 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 3 GROUP 107 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 7
Dropped old group 7
Group 8 Thread 3 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 3 GROUP 108 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 8
Dropped old group 8
Group 9 Thread 3 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 3 GROUP 109 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 9
Dropped old group 9
Group 10 Thread 4 Archived NO Status CURRENT Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 4 GROUP 110 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 10
Dropped old group 10
Group 11 Thread 4 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 4 GROUP 111 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
Sleeping - group 11
Sleeping - group 11
Sleeping - group 11
Sleeping - group 11
alter database drop logfile group 11
Dropped old group 11
Group 12 Thread 4 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 4 GROUP 112 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 12
Dropped old group 12
Group 13 Thread 1 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 1 GROUP 113 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 13
Dropped old group 13
Group 14 Thread 1 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 1 GROUP 114 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 14
Dropped old group 14
Group 15 Thread 1 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 1 GROUP 115 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 15
Dropped old group 15
Group 16 Thread 1 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 1 GROUP 116 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 16
Dropped old group 16
Group 17 Thread 1 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 1 GROUP 117 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 17
Dropped old group 17
Group 20 Thread 2 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 2 GROUP 120 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 20
Dropped old group 20
Group 21 Thread 2 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 2 GROUP 121 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 21
Dropped old group 21
Group 22 Thread 2 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 2 GROUP 122 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 22
Dropped old group 22
Group 23 Thread 2 Archived NO Status CURRENT Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 2 GROUP 123 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 23
Dropped old group 23
Group 24 Thread 2 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 2 GROUP 124 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 24
Dropped old group 24
Group 30 Thread 3 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 3 GROUP 130 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 30
Dropped old group 30
Group 31 Thread 3 Archived NO Status CURRENT Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 3 GROUP 131 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 31
Dropped old group 31
Group 32 Thread 3 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 3 GROUP 132 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 32
Dropped old group 32
Group 33 Thread 3 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 3 GROUP 133 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 33
Dropped old group 33
Group 34 Thread 3 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 3 GROUP 134 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 34
Dropped old group 34
Group 40 Thread 4 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 4 GROUP 140 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 40
Dropped old group 40
Group 41 Thread 4 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 4 GROUP 141 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 41
Dropped old group 41
Group 42 Thread 4 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 4 GROUP 142 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 42
Dropped old group 42
Group 43 Thread 4 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 4 GROUP 143 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 43
Dropped old group 43
Group 44 Thread 4 Archived YES Status INACTIVE Size 2147483648
ALTER DATABASE ADD LOGFILE THREAD 4 GROUP 144 ( '+RECO_EDP2' , '+DATA_EDP2' ) size 4g
create new group success
alter database drop logfile group 44
Dropped old group 44

PL/SQL procedure successfully completed.

Elapsed: 00:15:12.47

SYS>select group#,thread#,bytes,archived,status from v$log order by  group#, thread#;

    GROUP#    THREAD#            BYTES ARC STATUS
---------- ---------- ---------------- --- ----------------
       101          1    4,294,967,296 YES INACTIVE
       102          1    4,294,967,296 NO  CURRENT
       103          2    4,294,967,296 NO  CURRENT
       104          2    4,294,967,296 YES UNUSED
       105          1    4,294,967,296 YES UNUSED
       106          2    4,294,967,296 YES UNUSED
       107          3    4,294,967,296 NO  CURRENT
       108          3    4,294,967,296 YES UNUSED
       109          3    4,294,967,296 YES UNUSED
       110          4    4,294,967,296 NO  CURRENT
       111          4    4,294,967,296 YES UNUSED
       112          4    4,294,967,296 YES UNUSED
       113          1    4,294,967,296 YES UNUSED
       114          1    4,294,967,296 YES UNUSED
       115          1    4,294,967,296 YES UNUSED
       116          1    4,294,967,296 YES UNUSED
       117          1    4,294,967,296 YES UNUSED
       120          2    4,294,967,296 YES UNUSED
       121          2    4,294,967,296 YES UNUSED
       122          2    4,294,967,296 YES UNUSED
       123          2    4,294,967,296 YES UNUSED
       124          2    4,294,967,296 YES UNUSED
       130          3    4,294,967,296 YES UNUSED
       131          3    4,294,967,296 YES UNUSED
       132          3    4,294,967,296 YES UNUSED
       133          3    4,294,967,296 YES UNUSED
       134          3    4,294,967,296 YES UNUSED
       140          4    4,294,967,296 YES UNUSED
       141          4    4,294,967,296 YES UNUSED
       142          4    4,294,967,296 YES UNUSED
       143          4    4,294,967,296 YES UNUSED
       144          4    4,294,967,296 YES UNUSED

32 rows selected.

Getting ORA-15204 on full database restore

Last week I was testing our backups and got this error: ORA-15204: database version 11.2.0.0.0 is incompatible with diskgroup DATA_EDT1. This surprised me because all of our databases and Oracle homes on this Exadata are at version 11.2.0.3. I was testing full restore/recovery without a control file or an spfile. The first step was to restore spfile and controlfile from autobackup to their ASM locations.
I queried V$ASM_DISKGROUP in ASM and saw that all of our diskgroups have compatibility set to 11.2.0.2.0. In order to start RMAN restore I had created a minimal pfile. I didn’t bother setting the compatible parameter. I edited the pfile and set compatible=”11.2.0.3.0″ – that fixed it and the restore/recovery worked fine.

 

 

.rman

High load average, high IO wait, slow process startup, slow RAC communications

We have been having performance problems on our test Exadata for several months. I have opened five Oracle service requests for multiple symptoms. While the cpu utilization was fairly low, Oracle background processes would hang, the OEM 12c agent would hang, backup jobs would hang, we would experience slow communications between RAC nodes, half of the cpus would be in 100% IO wait state, the system load average would exceed 6000, etc. We noticed one of the NFS mounts was unreachable and it happened to be the mount point where we keep our DBA scripts. Processes would be in IO wait state “D” and we noticed several were the DBA scripts running from NFS. We could resolve the problem by killing these scripts. So I moved the backup scripts to local drives and eliminated some issues.
The problem kept returning though so I kept opening more SRs with no solution. Yesterday I escalated and had one of my SRs reopened. I was finally able to get to the correct person in Oracle support who gave me two things to try. The first was to add the “noac” option for the NFS mounts. The idea was that this would resolve issues where synchronous writes are induced. Since we are backing up to NFS using RMAN and tar this seemed a good bet. And it did help a lot. But we were still able to bring the problem back by tarring to NFS.
These NFS mounts are across Infiniband to an Exalogic ZFS storage system. The second fix was based on the fact that the new OEL kernel 11.2.3.2.1 update has memory management changes that may result in high TCP/IP traffic causing memory starvation for contiguous memory free space. See Knowledge base article 1546861.1 System hung with large numbers of page allocation failures with “order:5” : <Future Exadata releases will be changing the MTU size on the InfiniBand Interfaces to 7000 (down from 65520) for new installations, so the 7000 MTU for Exadata environments is known to be appropriate>  So I changed the Infiniband MTU from 65520 to 7000 and restarted the network service. That finally fixed the issue.

Exadata Database Machine Administration Workshop

I spent last week in downtown Chicago attending this class. It was a very good class and the instructor, Andy Fortunak was excellent. We had lots of labs to work through. The computers we used ran virtual database servers and virtual storage servers. This setup caused some problems, mainly slowness. My virtual database machine’s clusterware failed towards the end of the week but the virtual storage servers continued to work fine.

The main message was: invest the time needed to set up services and Database Resource Manager.  IO Resource Manager and Instance Caging depends on it.

We also discussed consolidation and using RAC one-node. We are consolidating scores of databases on Exadata. Most of them do not require multiple RAC nodes for workload but we do want high availability and load balancing. This week I tested manually moving one-node databases between nodes and it works great. The client does not get disconnected but does get an error message. Srvctl actually makes it a two node RAC and then stops the original node. I tested automatic failover of rac one-node and compared it with a full rac two-node with one instance down. They both work the way I wanted, a second instance starts up. I can’t see any technical reason for going with one-node.

Hotsos Symposium

This was the best conference I have ever attended. Very smart speakers and attendees. It was fun and I learned a lot about Oracle performance. Some of the themes were : statistics, Exadata and fine grained measurement. I would definitely go again! Kerry Osborne of Enkitek gave the keynote presentation and I had a chance to talk to him several times. Enkitek is a leader in Exadata and Kerry shared lots of good information.  Hotsos Symposium 2011