Category Archives: Oracle

Hacking SLOB to run on Solaris

Kevin Closson’s Silly Little Oracle Benchmark https://kevinclosson.net/slob/ is not ported to Solaris. This means that to get it working is not as easy as it should be. Kevin himself suggests that you use a small linux host to run the tool if your database is running on a non-supported operating system.

I’m going to track the changes I make here.

setup.sh

First problem – grep!

./setup.sh test 16
SLOB 2.4.0
FATAL : 2018.05.15-14:40:57 : Usage : ./setup.sh.orig: <tablespace name> <number of SLOB schemas to create and load>
FATAL : 2018.05.15-14:40:57 : Option 2 must be an integer

This is caused by the function f_is_int requiring the gnu grep command behaviour. Simplest way to this is to change the grep to point to /usr/gnu/bin/grep in this script.

Next problem..

Once this change was made, I was then able to create the first schema ( yay!) but then when it came to make the remaining 15, it failed.

NOTIFY : 2018.05.15-14:37:27 : Waiting for background batch 1. Loading up to user11
FATAL : 2018.05.15-14:37:29 : 
FATAL : 2018.05.15-14:37:29 : f_flag_abort: Triggering abort
FATAL : 2018.05.15-14:37:29 : 
FATAL : 2018.05.15-14:37:30 : 
FATAL : 2018.05.15-14:37:30 : f_flag_abort: Triggering abort
FATAL : 2018.05.15-14:37:30 : 
FATAL : 2018.05.15-14:37:40 : 
FATAL : 2018.05.15-14:37:40 : f_flag_abort: Triggering abort
FATAL : 2018.05.15-14:37:40 : 
FATAL : 2018.05.15-14:37:40 : 
FATAL : 2018.05.15-14:37:40 : f_check_abort_flag: discovered abort flag
FATAL : 2018.05.15-14:37:40 : 
FATAL : 2018.05.15-14:37:40 : Aborting SLOB setup. See /export/home/oracle/mel/SLOB/cr_tab_and_load.out

 

Handily there is an error message in the logfile

ALTER TABLE cf1 MINIMIZE RECORDS_PER_BLOCK
*
ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-00039: error during periodic action
ORA-04036: PGA memory used by the instance exceeds PGA_AGGREGATE_LIMIT




Disconnected from Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options
FATAL : 2018.05.15-14:37:40 :
FATAL : 2018.05.15-14:37:40 : f_setup: Failed to load user4 SLOB table
FATAL : 2018.05.15-14:37:40 :

I’m going to assume the clue is in the error and increase the PGA aggregate limit

SQL> alter system set pga_aggregate_limit=36G scope=both sid='*';

After this change, the script ran to the end successfully.

 

$ diff setup.sh setup.sh.orig
30,31d29
< 
< 
86c84
< if ( ! echo $s | /usr/gnu/bin/grep -q "^-\?[0-9]*$" )
---
> if ( ! echo $s | grep -q "^-\?[0-9]*$" )
250c248
< if ( echo "$cdb" | /usr/gnu/bin/grep -q "YES" > /dev/null 2>&1 )
---
> if ( echo "$cdb" | grep -q "YES" > /dev/null 2>&1 )
254c252
< if ( echo "$cdb" | /usr/gnu/bin/grep -q "NO" > /dev/null 2>&1 )
---
> if ( echo "$cdb" | grep -q "NO" > /dev/null 2>&1 )
410c408
< done | sqlplus -s "$constring" 2>&1 | tee -a $fname | /usr/gnu/bin/grep -i "dropped" | wc -l | while read num_processed
---
> done | sqlplus -s "$constring" 2>&1 | tee -a $fname | grep -i "dropped" | wc -l | while read num_processed

runit.sh

So – we already know that the scripts need gnu grep, especially in the function f_is_int. Safest option will be  to change every occurrence of grep with /usr/gnu/bin/grep.

Try running it with a single schema

 ./runit.sh 1

The duration of the run is based on the slob.conf variable RUN_TIME. The default is 300 seconds, but I am dropping it to 90 while I am doing debugging.

This ran successfully (I think!). Certainly AWR reports were created.

There was one obvious failure – the mpstat output was not generated.

$ cat mpstat.out 
mpstat: -P expects a number
Usage: mpstat [-aqm] [-A core|soc|bin] [-k key1,[key2,...]] [-o num] [-p | -P processor_set] [-T d|u] [-I statfile | -O statfile ] [interval [count]]

Ok – so this is the line causing the problem.

 ( mpstat -P ALL 3 > mpstat.out 2>&1) &

On Linux mpstat without the -P flag will give you summary output, -P separates the output by processor.  On Solaris the default is to show one line per CPU, so I just need to change the line to

 ( mpstat 3 > mpstat.out 2>&1) &

also check for other invocations of mpstat

Next - try with 2 sessions

./runit.sh 2

Arrgh. That fails with

: List of monitored sqlplus PIDs written to /tmp/.SLOB.2018.05.15.153524/13174.f_wait_pids.out.
usage: ps [ -aAdefHlcjLPyZ ] [ -o format ] [ -t termlist ]
 [ -u userlist ] [ -U userlist ] [ -G grouplist ]
 [ -p proclist ] [ -g pgrplist ] [ -s sidlist ] [ -z zonelist ] [-h lgrplist]
 'format' is one or more of:
 user ruser group rgroup uid ruid gid rgid pid ppid pgid sid taskid ctid
 pri opri pcpu pmem vsz rss rssprivate rssshared osz nice class time etime stime zone zoneid env
 f s c lwp nlwp psr tty addr wchan fname comm args projid project pset lgrp

This wasn’t obvious where the fault was happening. So lets add the -x flag to the first line of our script to try and find the line causing our problems.

++ sed /PID/d
++ wc -l
+++ cat /tmp/.SLOB.2018.05.15.153852/15451.f_wait_pids.out
++ sed 's/[^0-9]//g'
++ ps -p 32526 32527
usage: ps [ -aAdefHlcjLPyZ ] [ -o format ] [ -t termlist ]
 [ -u userlist ] [ -U userlist ] [ -G grouplist ]
 [ -p proclist ] [ -g pgrplist ] [ -s sidlist ] [ -z zonelist ] [-h lgrplist]
 'format' is one or more of:
 user ruser group rgroup uid ruid gid rgid pid ppid pgid sid taskid ctid
 pri opri pcpu pmem vsz rss rssprivate rssshared osz nice class time etime stime zone zoneid env
 f s c lwp nlwp psr tty addr wchan fname comm args projid project pset lgrp
++ return 0
+ tmp=0

So there is something with the ps -p command that is not happy in solaris. I  can reproduce the problem at my command line, by providing ps -p with multiple pids.


# ps -p 7 8
usage: ps [ -aAdefHlcjLPyZ ] [ -o format ] [ -t termlist ].....

If we look at the ps man page, on solaris it says

Some options accept lists as arguments. Items in a list can be either
separated by commas or else enclosed in quotes and separated by commas
or spaces. Values for proclist and grplist must be numeric.

so this means the following will work, but not the format within runit.sh

# ps -p 7,8
 PID TTY TIME CMD
 7 ? 127:47 zpool-rp
 8 ? 1:39 kmem_tas
# ps -p "7 8"
 PID TTY TIME CMD
 7 ? 127:47 zpool-rp
 8 ? 1:39 kmem_tas

 

So – I need to find the line with ps command, and try to change the format presented.

while ( ps -p $pidstring > /dev/null 2>&1 )

 ps -fp $pidstring

both occur in function f_wait_pids

but, they are actually being presented with the list of pids built up in line 1440

sqlplus_pids="${sqlplus_pids} $!"

So – how do I go about getting either a comma or a quote in the string that is passed to the function f_wait_pids?

Well, rather than destroying something that Kevin may rely on later, I add a line below 1440 to generate the pid list with commas

 melsqlplus_pids="${melsqlplus_pids},$!"

This is pretty ugly, as it means my string leads with a comma. However it seems that Solaris doesn’t care about this.

Now I need to add this to the function call on line 1493

if ( ! f_wait_pids "$(( SCHEMAS * THREADS_PER_SCHEMA ))" "$RUN_TIME" "$WORK_LOOP" "$sqlplus_pids"  "$melsqlplus_pids")

Of course I need to add something to pick up that new 5th argument to the function, so within the function itself

local melpidstring="$5"

while ( ps -p "$melpidstring" > /dev/null 2>&1 )

ps -fp "$melpidstring"

So this all seems ok.. but then I discover some more little rats in f_count_pid.

So once again – create my own custom usage of the pid list. There probably is a MUCH simpler way to do this, I’m just building it up as I go along. So – here are the changes I have made to runit.sh to get it to execute without error

 

oracle@sc8avm-25:~/mel/SLOB$ diff runit.sh runit.sh.orig
1c1
< #!/bin/bash 
---
> #!/bin/bash
45c45
< if ( ! echo "$s" | /usr/gnu/bin/grep -q "^-\?[0-9]*$" ) 
---
> if ( ! echo "$s" | grep -q "^-\?[0-9]*$" ) 
275c275
< for string in 'iostat -xm 3' 'mpstat 3' 'vmstat 3'
---
> for string in 'iostat -xm 3' 'mpstat -P ALL 3' 'vmstat 3'
300c300
< ls -l /proc/${tmp}/fd | /usr/gnu/bin/grep deleted
---
> ls -l /proc/${tmp}/fd | grep deleted
416c416
< sqlplus $user/${user}${non_admin_connect_string} <<EOF 2>/dev/null | sed 's/^.* FATAL/FATAL/g' | /usr/gnu/bin/grep FATAL > $tmpfile
---
> sqlplus $user/${user}${non_admin_connect_string} <<EOF 2>/dev/null | sed 's/^.* FATAL/FATAL/g' | grep FATAL > $tmpfile
457c457
< if ( /usr/gnu/bin/grep FATAL "$tmpfile" > /dev/null 2>&1 )
---
> if ( grep FATAL "$tmpfile" > /dev/null 2>&1 )
485d484
< local melpidstring="$5"
488d486
< local meltmpfile="${SLOB_TEMPDIR}/${RANDOM}.MEL${FUNCNAME}.out"
502d499
< echo "$melpidstring" > $meltmpfile 2>&1
508d504
< f_msg NOTIFY "List of monitored sqlplus PIDs with commas written to ${meltmpfile}."
514c510
< tmp=`f_count_pids "$meltmpfile"`
---
> tmp=`f_count_pids "$tmpfile"`
542,544c538,539
< echo "This is the pidstring value $pidstring"
< echo "This is the melpidstring value $melpidstring"
< while ( ps -p "$melpidstring" > /dev/null 2>&1 )
---
> 
> while ( ps -p $pidstring > /dev/null 2>&1 )
551c546
< ps -fp "$melpidstring"
---
> ps -fp $pidstring
875c870
< if ( ! echo "$tmp" | /usr/gnu/bin/grep -q '\-' > /dev/null 2>&1 )
---
> if ( ! echo "$tmp" | grep -q '\-' > /dev/null 2>&1 )
1397c1392
< ( mpstat 3 > mpstat.out 2>&1) &
---
> ( mpstat -P ALL 3 > mpstat.out 2>&1) &
1446d1440
< melsqlplus_pids="${melsqlplus_pids},$!"
1496c1490,1491
< if ( ! f_wait_pids "$(( SCHEMAS * THREADS_PER_SCHEMA ))" "$RUN_TIME" "$WORK_LOOP" "$sqlplus_pids" "$melsqlplus_pids" )
---
> 
> if ( ! f_wait_pids "$(( SCHEMAS * THREADS_PER_SCHEMA ))" "$RUN_TIME" "$WORK_LOOP" "$sqlplus_pids" )

Tomorrow – time to get someone else to run it and see if it behaves how they expect.

 

Advertisements

What solaris packages do I need to use runinstaller?

If you are running runinstaller on Solaris 11, and you have not installed the solaris-large-server incorporation you are likely to be missing some packages that you need to install the Oracle database.

oracle@host-8-200:/var/tmp/database$ pkg list |grep server
 group/system/solaris-small-server 0.5.11-0.175.3.11.0.4.0 i--

So – let’s start at what you need to run runinstaller..

oracle@host-8-200:/var/tmp/database$ ./runInstaller
 Starting Oracle Universal Installer...

Checking Temp space: must be greater than 547 MB. Actual 1956079 MB Passed
 Checking swap space: must be greater than 150 MB. Actual 1964769 MB Passed
 Checking monitor: must be configured to display at least 256 colors
 >>> Could not execute /usr/bin/xdpyinfo Failed <<<<

Some requirement checks failed. You must fulfill these requirements before

continuing with the installation,

Continue? (y/n) [n] n

You could just install solaris -large-server or just a minimal packages (if you prefer) to allow runinstaller to run.

# pkg install xauth x11/diagnostic/x11-info-clients library/motif terminal/xterm

 

If you receive an error

VM warning: PICL (libpicl.so.1) is missing. Performance will not be optimal.

you can resolve this by installing the package picl

# pkg install /system/picl

Changing Exadata Cell access parameters

One challenge is with Exadata cells in a lab environment is that they are secure! This means that it has long lock out times in the event of an incorrect login and tough lock settings. You can manually change these.. but every time you update your cell there is a chance they will be reset.

A more permanent way is to use /opt/oracle.cellos/host_access_control on each storage cell. https://docs.oracle.com/cd/E58626_01/html/E58630/z40036a01393423.html#scrolltoc

For example, if you want to drop the lock time in the event of a failed login from 10 minutes to a more manageable 60 seconds  you would issue the command

/opt/oracle.cellos/host_access_control pam-auth --lock=60

You can combine multiple pam-auth commands on the same line.. e.g. if I also want to say that the cell only remembers one previous password I could say

/opt/oracle.cellos/host_access_control pam-auth --lock=60 --remember=1

 

There are a lot of options for this tool – you can set the system back to secure defaults, or make it even more secure, such as locking an account after a single failed login!

Installing Enterprise Manager agent on Oracle Linux 6.7 SPARC

I have access to an EM13 Enterprise Manager server, and I am going to add my Oracle Linux 6.7 SPARC to this system for monitoring.

First – check that you have the latest plugins and agents installed for the platform.

Screenshot-Self Update: Agent Software - Oracle Enterprise Manager - Mozilla Firefox

Next, on the hosts, create a user to ‘own’ the agent software

[root@host-8-160 ~]# groupadd -g 10001 oinstall
[root@host-8-160 ~]# useradd -g oinstall -s /bin/bash -d /home/agent13 -m agent13
[root@host-8-160 ~]# passwd agent13
Changing password for user agent13.

Create a directory structure for the software

[root@host-8-160 ~]# mkdir -p /u01/app
[root@host-8-160 ~]# chgrp -R oinstall /u01
[root@host-8-160 ~]# chmod g+rwx /u01

Now, back in Enterprise Manager go.. Setup -> Add Target -> Add Target Manually -> Install Agent on Host.

Enter the fully qualified domain name of your host, and the correct Platform

Screenshot-Add Host Targets : Host and Platform - Mozilla Firefox

 

Enter the installation location

Enter the credentials for agent13 and root user and hit next.

Then you can hit deploy agent.

 

Post install configuration/Worries

 

The agent installed successfully – but the host target is not being marked as available.

Looking at the output of emctl status agent I have 2 concerns.

[agent13@host-8-160 bin]$ ./emctl status agent
Oracle Enterprise Manager Cloud Control 13c Release 2 
Copyright (c) 1996, 2016 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
Agent Version : 13.2.0.0.0
OMS Version : 13.2.0.0.0
Protocol Version : 12.1.0.1.0
Agent Home : /u01/app/agent13/ngc13/agent_inst
Agent Log Directory : /u01/app/agent13/ngc13/agent_inst/sysman/log
Agent Binaries : /u01/app/agent13/ngc13/agent_13.2.0.0.0
Core JAR Location : /u01/app/agent13/ngc13/agent_13.2.0.0.0/jlib
Agent Process ID : 20636
Parent Process ID : 20497
Agent URL : https://host-8-160.blah.com:3876/emd/main/
Local Agent URL in NAT : https://host-8-160.blah.com:3876/emd/main/
Repository URL : https://ngc13c.blah.com:4901/empbs/upload
Started at : 2017-06-13 10:52:26
Started by user : agent13
Operating System : Linux version 4.1.12-94.3.4.el6uek.sparc64 (sparcv9)
Number of Targets : (none)
Last Reload : (none)
Last successful upload : (none)
Last attempted upload : (none)
Total Megabytes of XML files uploaded so far : 0
Number of XML files pending upload : 0
Size of XML files pending upload(MB) : 0
Available disk space on upload filesystem : 98.09%
Collection Status : Collections enabled
Heartbeat Status : Ok
Last attempted heartbeat to OMS : 2017-06-13 10:57:31
Last successful heartbeat to OMS : 2017-06-13 10:57:31
Next scheduled heartbeat to OMS : 2017-06-13 10:58:31

 

There are no targets, and there has not been a successful upload.

[agent13@host-8-160 bin]$ ./emctl pingOMS
Oracle Enterprise Manager Cloud Control 13c Release 2 
Copyright (c) 1996, 2016 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
EMD pingOMS completed successfully

[agent13@host-8-160 bin]$ ./emctl upload agent
Oracle Enterprise Manager Cloud Control 13c Release 2 
Copyright (c) 1996, 2016 Oracle Corporation. All rights reserved.
---------------------------------------------------------------
EMD upload completed successfully

If I look in my targets.xml it is pretty empty

[agent13@host-8-160 ngc13]$ cat ./agent_inst/sysman/emd/targets.xml
<Targets AGENT_TOKEN="67DBE4C8ECBA03FA5DC991893B75619C55C9B1CEACAA6ED68074AB9C65CFF973"/>

[agent13@host-8-160 bin]$ ./emctl config agent listtargets
Oracle Enterprise Manager Cloud Control 13c Release 2 
Copyright (c) 1996, 2016 Oracle Corporation. All rights reserved.
[agent13@host-8-160 bin]$

On the enterprise manager server I had errors similar to this

 
Metric evaluation error start - Unable to connect to the agent at https://host-8-161.blah.com:3876/emd/main/ [No route to host]

Tried putting that URL into my browser… cannot connect to it.

Firewall! DOH! Of course!

Temporarily disabled the iptables firewall

[root@host-8-161 /]# service iptables stop
iptables: Setting chains to policy ACCEPT: filter [ OK ]
iptables: Flushing firewall rules: [ OK ]
iptables: Unloading modules: [ OK ]

Now I can connect to the agent address in my browser.

 

So.. the choice is disable the firewall, or alter the rules. As I’m in a lab, I’m going straight to disabling the firewall.

[root@host-8-161 /]# chkconfig iptables off

Now, try to get the agent to generate the internal target list (host, ORACLE_HOME)

[agent13@host-8-161 bin]$ ./emctl config agent listtargets
Oracle Enterprise Manager Cloud Control 13c Release 2 
Copyright (c) 1996, 2016 Oracle Corporation. All rights reserved.

[agent13@host-8-161 bin]$ ./emctl config agent addinternaltargets
Oracle Enterprise Manager Cloud Control 13c Release 2 
Copyright (c) 1996, 2016 Oracle Corporation. All rights reserved.
2017-06-13 12:00:37,234 [main] WARN oracle.sysman.gcagent.comm.agent.http.SSLInit - User requested cipher suite SSL_RSA_WITH_RC4_128_MD5, which is not supported for SSLContext TLSv1.2
2017-06-13 12:00:37,242 [main] WARN oracle.sysman.gcagent.comm.agent.http.SSLInit - User requested cipher suite SSL_RSA_WITH_RC4_128_SHA, which is not supported for SSLContext TLSv1.2

[agent13@host-8-161 bin]$ ./emctl config agent listtargets
Oracle Enterprise Manager Cloud Control 13c Release 2 
Copyright (c) 1996, 2016 Oracle Corporation. All rights reserved.

Now when I look at my targets.xml it has entries

[agent13@host-8-161 agent_inst]$ cat ./sysman/emd/targets.xml
<Targets AGENT_TOKEN="6A415CAF76EC952756AE3BC675B0080ADAEE066B3F9B10B1B4A6410870130843">
 <Target TYPE="host" NAME="host-8-161.blah.com" DISPLAY_NAME="host-8-161.osc.uk.oracle.com" EMD_URL="https://host-8-161.blah.com:3876/emd/main/" TIMEZONE_REGION="" IDENTIFIER="TARGET_GUID=51D4595ED5982DE8E0539011038AD7DB"/>
 <Target TYPE="oracle_emd" NAME="host-8-161.blah.com:3876" DISPLAY_NAME="host-8-161.blah.com:3876" EMD_URL="https://host-8-161.blah.com:3876/emd/main/" TIMEZONE_REGION="" IDENTIFIER="TARGET_GUID=0958C84AFB17CE4D3F9FB85C81250615"/>
 <Target TYPE="oracle_home" NAME="agent13c1_1_host-8-161.blah.com_1639" DISPLAY_NAME="agent13c1_1_host-8-161.blah.com_1639" EMD_URL="https://host-8-161.blah.com:3876/emd/main/" TIMEZONE_REGION="" IDENTIFIER="TARGET_GUID=C13E4BCE40F4509C3FC788A3C08EED68">
 <Property NAME="HOME_TYPE" VALUE="O"/>
 <Property NAME="INVENTORY" VALUE="/u01/app/oraInventory"/>
 <Property NAME="INSTALL_LOCATION" VALUE="/u01/app/agent13/ngc13/agent_13.2.0.0.0"/>
 </Target>
</Targets>

When I look at the hosts in Enterprise Manager they are now marked as up.

 

linux

Thoughts and other questions..

The agent13 user on the primary domain has automatically been given the permission to run read only ldm commands (similar to the privileges that need to be manually applied to the user on Solaris).

Unlike on other platforms (e.g. SuperCluster) the hierachy of LDOMs does not seem to be recorded.

Scribbled notes on installing the Oracle database on Oracle Linux 6.7 SPARC

I had a very short time to play with my Oracle Linux SPARC box before I handed it to my customers, so I only had a very quick attempt to install the Oracle RDBMS and start a database. I did only a very basic install using database storage on filesystem, and allowed the installer to create the DB. So these notes are even more scrappy than usual.

While not yet a certified platform, you can download the Oracle Database 12.1.0.2 for Oracle Linux 6.7 SPARC images on e-delivery. There is not a publicly available install document, so I’m going to follow the install guide for Linux

Preparing for the Install

https://docs.oracle.com/database/121/LADBI/olinrpm.htm#LADBI7477

I couldn’t find the pre-installation rpm for Linux-SPARC on ULN. So I am going to have to follow the documentation and hope I have all the packages.

Verify openssh is installed

[root@host-8-161 yum.repos.d]# rpm -qa |grep ssh
 openssh-clients-5.3p1-117.el6.sparc64
 openssh-5.3p1-117.el6.sparc64
 openssh-server-5.3p1-117.el6.sparc64
 libssh2-1.4.2-2.el6_7.1.sparc64

Check that the required packages are installed.

https://docs.oracle.com/database/121/LADBI/pre_install.htm#LADBI7534

I added the following packages..

  • compat-libcap1
  • compat-libstdc++-33

On the installation media the is an additional rpm to install cvuqdisk-1.0.9-1.rpm – but this requires an oracle user….

So.. lets create my user and groups for now

[root@host-8-161 rpm]# groupadd -g 1001 oinstall
 [root@host-8-161 rpm]# groupadd -g 1002 dba
 [root@host-8-161 rpm]# useradd -g dba -G oinstall -s /bin/bash -d /home/oracle/ oracle
 [root@host-8-161 rpm]# passwd oracle

Now retry installing the package

[root@host-8-161 rpm]# rpm -i cvuqdisk-1.0.9-1.rpm

Using default group oinstall to install package

Create directories

[root@host-8-161 rpm]# mkdir /u01
[root@host-8-161 rpm]# mkdir -p /u01/app/oracle
[root@host-8-161 rpm]# chown -R oracle:dba /u01

 

Set the Oracle user resource limits

[root@host-8-161 rpm]# cat /etc/security/limits.conf

# End of file
 oracle soft nofile 1024
 oracle hard nofile 65536
 oracle soft nproc 2047
 oracle hard nproc 16384
 oracle soft stack 10240
 oracle hard stack 10240
 oracle soft memlock 3145728
 oracle hard memlock 3145728

 

  • Set the display and try running runinstaller
  • .. fails with PRVF-0002 Unable to retrieve local node name
  • Added the hostname and IP to the local /etc/hosts and the install continued.
  • Pre-installation checks give warning about kernel parameters and swap size… but it does offer me a fixit script for the kernel parameters. Need to ensure that these changes to parameters are added to /etc/sysctl.conf
  • Ran the fixit, and still some parameters giving a warning – semms etc – I guess these would need a reboot. So the kernel parameters will need reviewing

Things I might want to consider adding to the /etc/sysctl.conf (stolen from another system)


kernel.msgmni = 2878
kernel.msgmax = 8192
kernel.msgmnb = 65536
kernel.shmmni = 4096
kernel.shmmax = 229916494233
kernel.shmall = 28065978

 

After the install completed, the database started ok.

 

 

 

 

Adding an Oracle Exadata Storage Server to Enterprise Manager using the command line

 

Ok, I’m just noodling around here… I have some ‘spare’ storage servers that are in the same fabric as my SuperCluster and I wanted to discover them in EM.

oracle@odc-em-sc7a:/u01/app/oracle/agent13/agent_inst/sysman/log$ emcli add_target -type=oracle_exadata -name=”expbcel09.osc.uk.oracle.com” -host=”sc7ach00pd00-d1.osc.uk.oracle.com” -properties=”CellName:expbcel09.osc.uk.oracle.com;MgmtIPAddr:138.3.3.82″
Target “expbcel09.osc.uk.oracle.com:oracle_exadata” added successfully

Changing attributes on ASM Diskgroups

You can see the attributes set on your ASM diskgroups in 2 ways, via the view  v$asm_attribute

SQL> select a.name, b.name, b.value from v$asm_attribute b, v$asm_diskgroup a where a.group_number=b.group_number and a.name='DATAX6C1' ;

Or via asmcmd and the lsattr

grid@sc7ach00pd00-d1:~$ asmcmd

ASMCMD> cd DATAX6C1
ASMCMD> lsattr -G DATAXC1 -l

Attributes you might want to pay attention to in an Exadata environment are

  • compatible.advm
  • cell.smart_scan_capable
  • appliance.mode
  • compatible.asm
  • compatible.rdbms

If you manually create a diskgroup via asmca these attributes will not normally be set, and so you may want to go manually set them.

SQL> ALTER DISKGROUP DATAX6C1 SET ATTRIBUTE 'appliance.mode'='TRUE';

The attribute is set immediately, but based on my experience, it does not come into effect until the disk group has  been rebalanced.

SQL> alter diskgroup DATAX6C1 rebalance power 2;

Redirecting Oracle Management agent to another OMS

I have an Enterprise Manager 13.1 system that is regularly reset to a ‘clean’ state using zfs snapshots. If you already have the management agents installed on other hosts you have the choice of

  1. De-installing the existing agent and installing a clean one
  2. Persuading the agent to talk to your ‘new’ server

A way of doing 2) is to follow the steps outlined in the MOS note EM 12c : Redirecting Oracle Management Agent to Another Oracle Management Service (Doc ID 1390717.1) which despite being for 12c appears to work for Enterprise Manager 13.

 

agent13@$ rm -rf agent_inst
agent13@$ mkdir agent_inst
agent13@$ /u01/app/agent13/agent_13.1.0.0.0/sysman/install/agentDeploy.sh \
AGENT_BASE_DIR=/u01/app/agent13 OMS_HOST=odc-em-ssc \
EM_UPLOAD_PORT=1159 AGENT_REGISTRATION_PASSWORD=welcome1 \
AGENT_INSTANCE_HOME=/u01/app/agent13/agent_inst \
AGENT_PORT=3872 -configOnly

 

This takes about 5 minutes to re-configure the agent to talk to the new server and then promote the targets. You will then be prompted to re-run the root.sh associated with the agent’s Oracle Home.

 

Allowing a non-privileged user to view LDOM configuration (updated for EM13.2)

In Solaris 11.2 (and probably other releases – I haven’t checked) if you try to view the ldom configuration as a non-privileged user you will get the following message

emagent @sscadb01:~$ ldm ls
Authorization failed

You can grant the user the ability to view the LDOM config  using the built in ‘LDoms Review’ profile

# usermod -P 'LDoms Power Mgmt Observability' emagent
# usermod -P 'LDoms Review' emagent

Login and out again and it should work.

emagent@sscadb01:~$ ldm ls
NAME STATE FLAGS CONS VCPU MEMORY UTIL NORM UPTIME
primary active -n-cv- UART 128 523776M 3.7% 3.7% 41d 6h 9m
ssccn1-app1 active -n---- 5001 128 512G 0.4% 0.2% 41d 6h 32m

Enterprise Manager 13 is LDOMs aware, and you will need to add this privilege to the agent software owner if you want the virtualization to be shown in this tool.

The Enterprise Manager 13.2 manual now lists slightly different privileges as required for the Enterprise Manager Systems Integration plugin. http://docs.oracle.com/cd/E73210_01/EMADM/GUID-62C91671-4F42-40A0-B929-22CBFEE73672.htm#EMADM15592

# usermod -P 'LDoms Power Mgmt Observability' emagent
# usermod -A solaris.ldoms.read,solaris.ldoms.ldmpower emagent

 

Removing a stubborn target it Enterprise Manager 12c R4

Very rarely you can encounter this problem in enterprise manager – a host or agent target that you cannot remove via the web interface. This is usually caused in my environment by decommissioning a host BEFORE the targets have all been dropped (Basically being a gungho dumbass and not considering things properly).

You can do a lot of operations by the emcli command line interface from your enterprise manager host

 

Login to Enterprise Manager

-bash-4.1$ ./emcli login -username=SYSMAN
Enter password : 

Login successful

Run the sync command to synchronize with the current OMS

-bash-4.1$ ./emcli sync
Synchronized successfully

Get the target that you want to remove – in my case it is a zone host that has been trashed.

-bash-4.1$ ./emcli get_targets |grep ssca-z7
4       Agent Unreachab  host                  ssca-z7-mgmt.osc.uk.oracle.com     
4       Agent Unreachab  oracle_emd            ssca-z7-mgmt.osc.uk.oracle.com:3872

I need to remove both the agent and the host

-bash-4.1$ ./emcli delete_target -name="ssca-z7-mgmt.osc.uk.oracle.com:3872" -type="oracle_emd" -delete_monitored_target

And then you can login to the GUI and check that it has gone!