ORA-02396: exceeded maximum idle time, please connect again

I was doing a schema import last night, over a database link. Lack of sleep had set in and I was getting the following error

Processing object type SCHEMA_EXPORT/TABLE/INDEX/INDEX 
ORA-39126: Worker unexpected fatal error in KUPW$WORKER.FETCH_XML_OBJECTS [INDEX:"SCHEMA_OWNER"."TABLE2"] 
ORA-02396: exceeded maximum idle time, please connect again 

ORA-06512: at "SYS.DBMS_SYS_ERROR", line 95 
ORA-06512: at "SYS.KUPW$WORKER", line 11259 

----- PL/SQL Call Stack ----- 
object line object 
handle number name 
0x2156ad620 27116 package body SYS.KUPW$WORKER 
0x2156ad620 11286 package body SYS.KUPW$WORKER 
0x2156ad620 13515 package body SYS.KUPW$WORKER 
0x2156ad620 3173 package body SYS.KUPW$WORKER 
0x2156ad620 12035 package body SYS.KUPW$WORKER 
0x2156ad620 2081 package body SYS.KUPW$WORKER 
0x20d5f9bb0 2 anonymous block 

KUPP$PROC.CHANGE_USER 
KUPP$PROC.CHANGE_USER 
NONE connected to AAR00287 
DBMS_LOB.TRIM 
DBMS_LOB.TRIM 
DBMS_LOB.TRIM 
DBMS_LOB.TRIM 
DBMS_METADATA.FETCH_XML_CLOB 
DBMS_METADATA.FETCH_XML_CLOB 
In procedure DETERMINE_FATAL_ERROR with ORA-02396: exceeded maximum idle time, please connect again 

Now if you aren’t tired this should be a pretty obvious fix, the idle_time value on the remote server was set to 60 and was disconnecting me during the index build section of the import. I altered the profile to give my user/profile unlimited idle_time. Problem solve, import complete.

RMAN-20036: invalid record order

Trying to register a database with my rman catalog I was getting the following error

$ rman target sys/xxx@DB01 catalog=RMAN_DB01/xxx@rmancat

Recovery Manager: Release 11.2.0.4.0 - Production on Mon Aug 7 11:42:37 2017

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: DB01 (DBID=114156133)
connected to recovery catalog database

RMAN> create catalog;

recovery catalog created

RMAN> register database;

starting full resync of recovery catalog
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of register command at 08/07/2017 11:42:44
RMAN-03014: implicit resync of recovery catalog failed
RMAN-03009: failure of full resync command on default channel at 08/07/2017 11:42:44
RMAN-20036: invalid record order

Poking around metalink I found note 113325.1

Basicly it says there is a mismatch in the number of standby redo logs, but this database does not have a standby. However it was cloned from a database that had a standby, and I apparently forgot to remove the standby redo logs after I cloned it.

SQL> select INST_ID,GROUP#,TYPE from gv$logfile;
         1          1         ONLINE
         1          2         ONLINE
         1          3         ONLINE
         1          4 INVALID STANDBY
         1          5         STANDBY
         2          6         ONLINE
         2          7         ONLINE
         2          8         ONLINE
         2          9 INVALID STANDBY
         2         10         STANDBY

So cleanup the standby logfiles

SQL> alter database drop standby logfile group 1 thread 4;
		 
Database altered.

SQL> alter database drop standby logfile group 1 thread 5;
		 
Database altered.

SQL> alter database drop standby logfile group 2 thread 9;
		 
Database altered.

SQL> alter database drop standby logfile group 2 thread 9;
		 
Database altered.

I dropped and re-created the catalog

RMAN> create catalog;

recovery catalog created

RMAN> register database;

database registered in recovery catalog
starting full resync of recovery catalog
full resync complete

 

Targets.xml was rejected: loaded with a wrong agent token

I noticed an agent on one of my database servers was down this morning. Agents crash from time to time, no big deal. I will just restart it

$ ./emctl start agent
Oracle Enterprise Manager Cloud Control 12c Release 2
Copyright (c) 1996, 2012 Oracle Corporation.  All rights reserved.
Starting agent ............ failed.
Target Manager failed at Startup: targets.xml was rejected: loaded with a wrong agent token
Consult emctl.log and emagent.nohup in: /u01/app/oracle/product/12.1.0.4/agent12c/agent_inst/sysman/log

Well that sucks. Something is out of sync. Usually this will happen when your file system is filled up and a change cannot be written to the targets.xml. I wasn’t out of space though. I’m not quite sure what caused this, but I know how to fix it.

make a backup of the targets.xml file, blank out the file, and add the opening and closign targets tag

cd /u01/app/oracle/product/12.1.0.4/agent12c/agent_inst/sysman/emd/
mv targets.xml targets.xml.20170220
echo "<Targets></Targets>" > targets.xml

Then re-synchronize the agent from OMS, this will re-populate the targets.xml

emcli login -username=SYSMAN
emcli sync
emcli resyncAgent -agent="dbsrv01:3872"

it takes about another 10 minutes to resynchronize but while it is running the agent will look like this

$ ./emctl status agent
Oracle Enterprise Manager Cloud Control 12c Release 2
Copyright (c) 1996, 2012 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
Agent Version     : 12.1.0.2.0
OMS Version       : (unknown)
Protocol Version  : 12.1.0.1.0
Agent Home        : /u01/app/oracle/product/12.1.0.4/agent12c/agent_inst
Agent Binaries    : /u01/app/oracle/product/12.1.0.4/agent12c/core/12.1.0.2.0
Agent Process ID  : 794210
Parent Process ID : 793987
Agent URL         : https://dbsrv01:3872/emd/main/
Repository URL    : https://oem1:4900/empbs/upload
Started at        : 2017-02-20 11:56:00
Started by user   : oracle
Last Reload       : (none)
Last successful upload                       : (none)
Last attempted upload                        : (none)
Total Megabytes of XML files uploaded so far : 0
Number of XML files pending upload           : 95
Size of XML files pending upload(MB)         : 0.9
Available disk space on upload filesystem    : 73.85%
Collection Status                            : [RESYNC]
Heartbeat Status                             : Agent is blocked
Last attempted heartbeat to OMS              : 2017-02-20 11:59:03
Last successful heartbeat to OMS             : (none)
Next scheduled heartbeat to OMS              : 2017-02-20 12:02:03

After a little waiting around the agent is back to normal

$ ./emctl status agent
Oracle Enterprise Manager Cloud Control 12c Release 2
Copyright (c) 1996, 2012 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
Agent Version     : 12.1.0.2.0
OMS Version       : 12.1.0.4.0
Protocol Version  : 12.1.0.1.0
Agent Home        : /u01/app/oracle/product/12.1.0.4/agent12c/agent_inst
Agent Binaries    : /u01/app/oracle/product/12.1.0.4/agent12c/core/12.1.0.2.0
Agent Process ID  : 794210
Parent Process ID : 793987
Agent URL         : https://dbsrv01:3872/emd/main/
Repository URL    : https://oem1:4900/empbs/upload
Started at        : 2017-02-20 11:56:00
Started by user   : oracle
Last Reload       : (none)
Last successful upload                       : 2017-02-20 12:19:53
Last attempted upload                        : 2017-02-20 12:19:53
Total Megabytes of XML files uploaded so far : 2.15
Number of XML files pending upload           : 0
Size of XML files pending upload(MB)         : 0
Available disk space on upload filesystem    : 73.62%
Collection Status                            : Collections enabled
Heartbeat Status                             : Ok
Last attempted heartbeat to OMS              : 2017-02-20 12:19:04
Last successful heartbeat to OMS             : 2017-02-20 12:19:04
Next scheduled heartbeat to OMS              : 2017-02-20 12:20:04

---------------------------------------------------------------
Agent is Running and Ready

 

Oracle OEM 13c LDAP error

My new OEM 13c system has been having lots of issues, I have multiple bugs registered with oracle. This one is particularly odd, trying to create a new user I get an LDAP error

The user actually does get created, but when I try to assign additional privileges to the user I get the error again, and this time it doesn’t actually add them.

The kicker is, I am not using LDAP, the user is created as an internal user, no single sign on!

I am trying to give a user access to the performance page, so off to emcli!!

I need to find the privileges related to the performance tab. It’s better to copy and paste these since they are case sensitive.

$ emcli get_supported_privileges | grep -i perf
PERFORM_OPERATION_AS_ANY_AGENT  Resource Type    TARGET
DB_PERFSQL_OPT_ADMIN            Resource         TARGET                        ENTITY_GUID           TARGET_NAME:TARGET_TYPE
PERFORM_OPERATION               Resource         TARGET                        ENTITY_GUID           TARGET_NAME:TARGET_TYPE
DB_PERFORMANCE_HOME_VIEW        Resource         TARGET                        ENTITY_GUID           TARGET_NAME:TARGET_TYPE
DB_PERFORMANCE_ADMIN            Resource         TARGET                        ENTITY_GUID           TARGET_NAME:TARGET_TYPE
DB_PERFSQL_OPT_CONSOLE_VIEW     Resource         TARGET                        ENTITY_GUID           TARGET_NAME:TARGET_TYPE
PERFORM_OPERATION_ANYWHERE      Resource Type    TARGET
DB_PERFORMANCE_VIEW             Resource         TARGET                        ENTITY_GUID           TARGET_NAME:TARGET_TYPE
PERFORM_OPERATION_AS_AGENT      Resource         TARGET                        ENTITY_GUID           TARGET_NAME:TARGET_TYPE

I want to give myUser DB_PERFORMANCE_VIEW and DB_PERFORMANCE_HOME_VIEW. I have my databases assigned to groups, so lets do that grant

emcli grant_privs -name="myUser" -privilege="DB_PERFORMANCE_HOME_VIEW;TARGET_NAME=Dev:TARGET_TYPE=group"
emcli grant_privs -name="myUser" -privilege="DB_PERFORMANCE_VIEW;TARGET_NAME=Dev:TARGET_TYPE=group"
emcli modify_group -name="Dev" -privilege_propagation=true

I looped through my dev/test/prod groups

Checking inside OEM, the grants are now in place

I don’t understand what OEM is doing though the web interface that is difference from using emcli. I had another issue deploying plugins using the web interface. I was able to deploy them with emcli. I will have a post on that soon.

ohasd permissions

I was working kind of late last night, trying to get a new install working. I ran into a bug where the permissions on the ohasd were incorrect after patching GI. I went out to a working server to see what the permissions should be, build my chown and chmod statements and pasted them into my terminal window. Unfortunately I got them in the wrong terminal, and had managed to copy the wrong permissions. I changed the ownership on ohasd on the first node of my production RAC Cluster. Apparently the permissions are really important because the whole node went down.

A little bit of panic set in and I wasn’t sure what I had done. I didn’t realize i had pasted the permission statements into the wrong window, and the error messages weren’t very helpful.

[root@node1 bin]# ./crsctl stop crs -f
CRS-4639: Could not contact Oracle High Availability Services
CRS-4000: Command Stop failed, or completed with errors.
[root@node1 bin]# ps -ef | grep d.bin
root 33615 1 0 16:53 ? 00:00:00 /u01/app/12.1.0.2/grid/bin/ohasd.bin reboot
root 53203 1 0 16:57 ? 00:00:00 /u01/app/12.1.0.2/grid/bin/ohasd.bin reboot
root 79744 1 0 17:07 ? 00:00:00 /u01/app/12.1.0.2/grid/bin/ohasd.bin reboot
root 105598 103980 0 17:17 pts/2 00:00:00 grep d.bin
[root@node1 bin]# kill -9 33615 53203 79744
[root@node1 bin]# ps -ef | grep d.bin
root 106623 103980 0 17:17 pts/2 00:00:00 grep d.bin
[root@node1 bin]# date
Thu Jan 5 17:18:02 EST 2017
[root@node1 bin]# ./crsctl start crs
CRS-4124: Oracle High Availability Services startup failed.
CRS-4000: Command Start failed, or completed with errors.

I was getting nothing in the logs.

I figured it must be a permission issue, but I wasn’t quite sure what to reset them to.

Apparently I am not the first person to do this since Oracle has a document for fixing this!

How to check and fix file permissions on Grid Infrastructure environment (Doc ID 1931142.1)

I ran

./rootcrs.pl -init

rebooted the node, and all was right with the world, except for my ego being kind of damaged from making such a silly mistake.

Mind your spaces

I am trying to export a database, everything except for a single table. I need to send a copy of the database to the vendor, but they don’t need a table that takes up %90 of the space in the database due to it storing a massive number of BLOBs and CLOBs

This should be really simple

$ expdp "'/ as sysdba'" directory=A dumpfile=fullexport.dmp exclude=TABLE:"IN ('TABLEOFCLOBS')"

Export: Release 11.2.0.4.0 - Production on Fri Jun 17 10:37:01 2016

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options
ORA-39001: invalid argument value
ORA-39071: Value for EXCLUDE is badly formed.
ORA-00936: missing expression

That syntax looks correct to me! Maybe I copy/pasted in a special character or something…. I type out the full command by hand and get the same error. I am stumped!

idontgetit

I worked the googles on the internet machine, and see a bunch of examples that look pretty much identical to mine. Then I notice it. I have a space after the “IN” key word.

$ expdp "'/ as sysdba'" directory=A dumpfile=fullexport.dmp exclude=table:"IN('TABLEOFCLOBS')" FULL=Y

Export: Release 11.2.0.4.0 - Production on Fri Jun 17 10:38:14 2016

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, Automatic Storage Management, OLAP, Data Mining
and Real Application Testing options
Starting "SYS"."SYS_EXPORT_FULL_01":  "/******** AS SYSDBA" directory=A dumpfile=exportfile.dmp exclude=table:IN('TABLEOFCLOBS') FULL=Y
Estimate in progress using BLOCKS method...
Processing object type DATABASE_EXPORT/SCHEMA/TABLE/TABLE_DATA
Total estimation using BLOCKS method: 53.28 GB

Thwarted by spaces again! It seems trival, and kind of incorrect since the “IN” in a sql statement needs that space there.

Rushing = Stupid Mistakes

I am currently being asked to refresh a database. The end users are pissed because they asked for the database to be refreshed at noon yesterday, this was the first they had asked for it, and they were doing “really important things” and by 3pm were pissed that I hadn’t responded to them, let alone completed the work. The thing was, I took a vacation. I took a whole two days off to spend some time with my girlfriend and wasn’t checking my email. Never mind that we have a whole team of DBAs that can refresh a database, they had emailed me. What they should have done was open a proper ticket, and someone would have helped out even though they hadn’t planned and were now in a panic.

So After escalating it to their boss because I wasn’t providing them with “good service”, who in turn talked to my boss, who informed him how they had dropped the ball, and I would deal with it today, that they were being completely unprofessional. (My boss is a good guy, and sticks up for me).

So here is the technical part of this post. Ever DBA has cloned a database, thousands and thousands of databases, but when you are rushing you can make stupid errors. I just fought this error for 20 minutes

Oracle instance shut down
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of Duplicate Db command at 06/08/2016 14:23:04
RMAN-05501: aborting duplication of target database
RMAN-03015: error occurred in stored script Memory Script
ORA-27191: sbtinfo2 returned error
Additional information: 3480

This should have been a no brainier, but in my rush I wasn’t fully paying attention to the error.

I was just issuing the command

duplicate target database to testdb from active database

My defaults for RMAN are using sbt ie tape, so for some reason the tape channels were being loaded. This causes an error. The solution is simple and should have been fairly immediate, just allocate disk channels.

run
{
ALLOCATE CHANNEL d1 TYPE DISK ;
ALLOCATE CHANNEL d2 TYPE DISK ;
ALLOCATE CHANNEL d3 TYPE DISK ;
ALLOCATE AUXILIARY CHANNEL a1 TYPE DISK ;
ALLOCATE AUXILIARY CHANNEL a2 TYPE DISK ;
ALLOCATE AUXILIARY CHANNEL a3 TYPE DISK ;
duplicate target database to testdb from active database;
}

Even simple things become very hard when you are rushing.

ASM du -s

one of my biggest complaints about ASM is the lack of the ability to do the equivalent of “du -hs *” to view the size of all the directories. I found a little piece of shell script that does just this. It is extremely slow but it is very helpful!

dg=RECO; for dir in $(asmcmd ls $dg); do echo $dg/$dir; asmcmd du $dg/$dir; done

Just change the first argment “dg=RECO” to the actual disc group you want to examine.

HOTSOS In Texas!

I am currently in Texas at the HOTSOS Symposium. I look forward to this conference every year. There are some brilliant people here, and it is a great place to learn new things. Kerry Osborn did a great live presentation yesterday, can’t wait to get home and try some of this stuff out. Also saw a great presentation by Tanel Poder.

The food down here is amazing too! We went to the Gas Monkey Bar and Grill last night and the food was amazing. I love traveling for work. I mean it is more fun to travel just for fun, but when I can learn something new, and have someone else pickup the majority of the cost, that is pretty amazing!

DBSNMP By Way Of EMCLI

Currently I am changing passwords in several hundred databases, not that uncommon of a task for any DBA. Unfortunately it is time to rotate the dbsnmp password. Usually I would log into OEM after changing the password and manually update the password in the monitoring configuration for each and every database. This is fine if there are only a few passwords, but a huge pain when you are dealing with hundreds. The documentation oracle provides isn’t completely clear [1592390.1], at least it wasn’t to me at first.

First I would connect to the database as the sys user and change the password for dbsnmp

alter user dbsnmp identified by MyFancyPassword;

Then you connect to EM through emcli and update the monitoring credentials. For this post lets assume my Oracle RAC cluster has the name “MyRAC” which contains two nodes “MyRAC1” and “MyRAC2”. We don’t really need to know the individual instance names since the dbsnmp user is shared between both nodes.

Connect to the repository

$ emcli login -username=sysman
Enter password :

Login successful
oracle@dcaxoem1:/home/oracle
$ emcli sync
Synchronized successfully

Now you can change the password via emcli

emcli set_monitoring_credential -target_type="rac_database" -target_name="MyRAC" -set_name="DBCredsMonitoring" -cred_type="DBCreds" -attributes="DBUserName:dbsnmp;DBPassword:MyFancyPassword;DBRole:Normal"

Where I ran into trouble was the last argument, if you look at the monitoring configuration page for a database, it becomes clear what is being asked for

editdbsmp

The role is the same role you would select on the monitoring setup page, Normal or SYSDBA.

The syntax is slightly different if you are doing a single instance, but basically the same idea

emcli set_monitoring_credential -target_type=oracle_database -target_name="ora1" -set_name="DBMonCred" -cred_type="DBCreds" -attributes=”DBUserName:dbsnmp;DBPassword:MyFancyPassword;DBRole:Noraml”

So I am going to modify the change password proceedure to spool out these commands and just run them against OEM when I have completed changing the passwords on the databases.