
Hacking SLOB to run on Solaris

Kevin Closson’s Silly Little Oracle Benchmark https://kevinclosson.net/slob/ has not been ported to Solaris, so getting it working is not as easy as it should be. Kevin himself suggests that you use a small Linux host to run the tool if your database is running on an unsupported operating system.

I’m going to track the changes I make here.


First problem – grep!

./setup.sh test 16
SLOB 2.4.0
FATAL : 2018.05.15-14:40:57 : Usage : ./setup.sh.orig: <tablespace name> <number of SLOB schemas to create and load>
FATAL : 2018.05.15-14:40:57 : Option 2 must be an integer

This is caused by the function f_is_int relying on GNU grep behaviour. The simplest fix is to change the grep in this script to point to /usr/gnu/bin/grep.
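For reference, the GNU-only part is the \? quantifier in a basic regex. Here is a sketch of a more portable f_is_int (my rework, not Kevin's code) using -E extended regexes, which GNU grep and Solaris 11 /usr/bin/grep (or /usr/xpg4/bin/grep on older releases) both understand:

```shell
#!/bin/sh
# Hypothetical portable rework of f_is_int: BRE "\?" is a GNU extension,
# but "-E" extended regexes with "?" work in GNU grep and Solaris 11 grep
# (use /usr/xpg4/bin/grep on older Solaris releases).
f_is_int() {
    echo "$1" | grep -Eq '^-?[0-9]+$'
}

f_is_int 16  && echo "16 is an integer"
f_is_int foo || echo "foo is not an integer"
```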

Next problem..

Once this change was made, I was able to create the first schema (yay!), but when it came to creating the remaining 15, it failed.

NOTIFY : 2018.05.15-14:37:27 : Waiting for background batch 1. Loading up to user11
FATAL : 2018.05.15-14:37:29 : 
FATAL : 2018.05.15-14:37:29 : f_flag_abort: Triggering abort
FATAL : 2018.05.15-14:37:29 : 
FATAL : 2018.05.15-14:37:30 : 
FATAL : 2018.05.15-14:37:30 : f_flag_abort: Triggering abort
FATAL : 2018.05.15-14:37:30 : 
FATAL : 2018.05.15-14:37:40 : 
FATAL : 2018.05.15-14:37:40 : f_flag_abort: Triggering abort
FATAL : 2018.05.15-14:37:40 : 
FATAL : 2018.05.15-14:37:40 : 
FATAL : 2018.05.15-14:37:40 : f_check_abort_flag: discovered abort flag
FATAL : 2018.05.15-14:37:40 : 
FATAL : 2018.05.15-14:37:40 : Aborting SLOB setup. See /export/home/oracle/mel/SLOB/cr_tab_and_load.out


Handily there is an error message in the logfile

ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-00039: error during periodic action
ORA-04036: PGA memory used by the instance exceeds PGA_AGGREGATE_LIMIT

Disconnected from Oracle Database 12c Enterprise Edition Release - 64bit Production
With the Partitioning, Real Application Clusters, Automatic Storage Management, OLAP,
Advanced Analytics and Real Application Testing options
FATAL : 2018.05.15-14:37:40 :
FATAL : 2018.05.15-14:37:40 : f_setup: Failed to load user4 SLOB table
FATAL : 2018.05.15-14:37:40 :

I’m going to assume the clue is in the error and increase the PGA aggregate limit

SQL> alter system set pga_aggregate_limit=36G scope=both sid='*';

After this change, the script ran to the end successfully.


$ diff setup.sh setup.sh.orig
< if ( ! echo $s | /usr/gnu/bin/grep -q "^-\?[0-9]*$" )
> if ( ! echo $s | grep -q "^-\?[0-9]*$" )
< if ( echo "$cdb" | /usr/gnu/bin/grep -q "YES" > /dev/null 2>&1 )
> if ( echo "$cdb" | grep -q "YES" > /dev/null 2>&1 )
< if ( echo "$cdb" | /usr/gnu/bin/grep -q "NO" > /dev/null 2>&1 )
> if ( echo "$cdb" | grep -q "NO" > /dev/null 2>&1 )
< done | sqlplus -s "$constring" 2>&1 | tee -a $fname | /usr/gnu/bin/grep -i "dropped" | wc -l | while read num_processed
> done | sqlplus -s "$constring" 2>&1 | tee -a $fname | grep -i "dropped" | wc -l | while read num_processed


So – we already know that the scripts need GNU grep, especially in the function f_is_int. The safest option is to change every occurrence of grep to /usr/gnu/bin/grep.
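A quick (and admittedly blunt) way to do that bulk edit is a sed pass, sketched here against a one-line stand-in for setup.sh. The pattern would also hit the word grep inside strings or comments, so diff the result against the original before trusting it:

```shell
#!/bin/sh
# Blunt bulk edit: prefix every "grep " invocation with the GNU path.
# Demonstrated on a stand-in file; run against setup.sh/runit.sh for real.
echo 'if ( ! echo $s | grep -q "^-\?[0-9]*$" )' > setup.sh.demo
sed 's|grep |/usr/gnu/bin/grep |g' setup.sh.demo > setup.sh.demo.fixed
cat setup.sh.demo.fixed
```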

Try running it with a single schema

 ./runit.sh 1

The duration of the run is based on the slob.conf variable RUN_TIME. The default is 300 seconds, but I am dropping it to 90 while debugging.

This ran successfully (I think!). Certainly AWR reports were created.

There was one obvious failure – the mpstat output was not generated.

$ cat mpstat.out 
mpstat: -P expects a number
Usage: mpstat [-aqm] [-A core|soc|bin] [-k key1,[key2,...]] [-o num] [-p | -P processor_set] [-T d|u] [-I statfile | -O statfile ] [interval [count]]

Ok – so this is the line causing the problem.

 ( mpstat -P ALL 3 > mpstat.out 2>&1) &

On Linux, mpstat without the -P flag gives you summary output, and -P ALL separates the output by processor. On Solaris the default is to show one line per CPU, so I just need to change the line to

 ( mpstat 3 > mpstat.out 2>&1) &
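If the script has to stay cross-platform, one option (my sketch, not part of SLOB) is to pick the mpstat flags based on the OS, assuming uname -s is enough to tell the platforms apart:

```shell
#!/bin/sh
# Sketch: choose per-CPU mpstat flags by platform. The command is only
# echoed here; uncomment the last line to run it for real.
case "$(uname -s)" in
    SunOS) MPSTAT_CMD="mpstat 3" ;;        # Solaris: one line per CPU by default
    *)     MPSTAT_CMD="mpstat -P ALL 3" ;; # Linux sysstat needs -P ALL
esac
echo "would run: $MPSTAT_CMD"
# ( $MPSTAT_CMD > mpstat.out 2>&1 ) &
```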

I also need to check for other invocations of mpstat.

Next - try with 2 sessions

./runit.sh 2

Arrgh. That fails with

: List of monitored sqlplus PIDs written to /tmp/.SLOB.2018.05.15.153524/13174.f_wait_pids.out.
usage: ps [ -aAdefHlcjLPyZ ] [ -o format ] [ -t termlist ]
 [ -u userlist ] [ -U userlist ] [ -G grouplist ]
 [ -p proclist ] [ -g pgrplist ] [ -s sidlist ] [ -z zonelist ] [-h lgrplist]
 'format' is one or more of:
 user ruser group rgroup uid ruid gid rgid pid ppid pgid sid taskid ctid
 pri opri pcpu pmem vsz rss rssprivate rssshared osz nice class time etime stime zone zoneid env
 f s c lwp nlwp psr tty addr wchan fname comm args projid project pset lgrp

It wasn’t obvious where the fault was happening, so let’s add the -x flag to the first line of our script to try and find the line causing our problems.

++ sed /PID/d
++ wc -l
+++ cat /tmp/.SLOB.2018.05.15.153852/15451.f_wait_pids.out
++ sed 's/[^0-9]//g'
++ ps -p 32526 32527
usage: ps [ -aAdefHlcjLPyZ ] [ -o format ] [ -t termlist ]
 [ -u userlist ] [ -U userlist ] [ -G grouplist ]
 [ -p proclist ] [ -g pgrplist ] [ -s sidlist ] [ -z zonelist ] [-h lgrplist]
 'format' is one or more of:
 user ruser group rgroup uid ruid gid rgid pid ppid pgid sid taskid ctid
 pri opri pcpu pmem vsz rss rssprivate rssshared osz nice class time etime stime zone zoneid env
 f s c lwp nlwp psr tty addr wchan fname comm args projid project pset lgrp
++ return 0
+ tmp=0

So there is something about the ps -p command that is not happy on Solaris. I can reproduce the problem at the command line by giving ps -p multiple PIDs.

# ps -p 7 8
usage: ps [ -aAdefHlcjLPyZ ] [ -o format ] [ -t termlist ].....

If we look at the ps man page on Solaris, it says

Some options accept lists as arguments. Items in a list can be either
separated by commas or else enclosed in quotes and separated by commas
or spaces. Values for proclist and grplist must be numeric.

So this means the following will work, but not the format used within runit.sh:

# ps -p 7,8
 7 ? 127:47 zpool-rp
 8 ? 1:39 kmem_tas
# ps -p "7 8"
 7 ? 127:47 zpool-rp
 8 ? 1:39 kmem_tas


So – I need to find the lines with the ps command and try to change the format of the list presented to them.

while ( ps -p $pidstring > /dev/null 2>&1 )

 ps -fp $pidstring

Both occur in the function f_wait_pids.

But they are actually presented with the list of PIDs built up on line 1440:

sqlplus_pids="${sqlplus_pids} $!"

So – how do I go about getting either a comma or a quote in the string that is passed to the function f_wait_pids?

Well, rather than destroying something that Kevin may rely on later, I add a line below 1440 to build a second copy of the PID list with commas

melsqlplus_pids="${melsqlplus_pids},$!"

This is pretty ugly, as it means my string leads with a comma. However, it seems that Solaris doesn’t care about this.

Now I need to add this to the function call on line 1493

if ( ! f_wait_pids "$(( SCHEMAS * THREADS_PER_SCHEMA ))" "$RUN_TIME" "$WORK_LOOP" "$sqlplus_pids"  "$melsqlplus_pids")

Of course I need to add something to pick up that new 5th argument to the function, so within the function itself

local melpidstring="$5"

while ( ps -p "$melpidstring" > /dev/null 2>&1 )

ps -fp "$melpidstring"

So this all seems OK… but then I discover some more little rats in f_count_pids.

So once again – I create my own custom handling of the PID list. There is probably a MUCH simpler way to do this; I’m just building it up as I go along. So – here are the changes I have made to runit.sh to get it to execute without error:


oracle@sc8avm-25:~/mel/SLOB$ diff runit.sh runit.sh.orig
< #!/bin/bash 
> #!/bin/bash
< if ( ! echo "$s" | /usr/gnu/bin/grep -q "^-\?[0-9]*$" ) 
> if ( ! echo "$s" | grep -q "^-\?[0-9]*$" ) 
< for string in 'iostat -xm 3' 'mpstat 3' 'vmstat 3'
> for string in 'iostat -xm 3' 'mpstat -P ALL 3' 'vmstat 3'
< ls -l /proc/${tmp}/fd | /usr/gnu/bin/grep deleted
> ls -l /proc/${tmp}/fd | grep deleted
< sqlplus $user/${user}${non_admin_connect_string} <<EOF 2>/dev/null | sed 's/^.* FATAL/FATAL/g' | /usr/gnu/bin/grep FATAL > $tmpfile
> sqlplus $user/${user}${non_admin_connect_string} <<EOF 2>/dev/null | sed 's/^.* FATAL/FATAL/g' | grep FATAL > $tmpfile
< if ( /usr/gnu/bin/grep FATAL "$tmpfile" > /dev/null 2>&1 )
> if ( grep FATAL "$tmpfile" > /dev/null 2>&1 )
< local melpidstring="$5"
< local meltmpfile="${SLOB_TEMPDIR}/${RANDOM}.MEL${FUNCNAME}.out"
< echo "$melpidstring" > $meltmpfile 2>&1
< f_msg NOTIFY "List of monitored sqlplus PIDs with commas written to ${meltmpfile}."
< tmp=`f_count_pids "$meltmpfile"`
> tmp=`f_count_pids "$tmpfile"`
< echo "This is the pidstring value $pidstring"
< echo "This is the melpidstring value $melpidstring"
< while ( ps -p "$melpidstring" > /dev/null 2>&1 )
> while ( ps -p $pidstring > /dev/null 2>&1 )
< ps -fp "$melpidstring"
> ps -fp $pidstring
< if ( ! echo "$tmp" | /usr/gnu/bin/grep -q '\-' > /dev/null 2>&1 )
> if ( ! echo "$tmp" | grep -q '\-' > /dev/null 2>&1 )
< ( mpstat 3 > mpstat.out 2>&1) &
> ( mpstat -P ALL 3 > mpstat.out 2>&1) &
< melsqlplus_pids="${melsqlplus_pids},$!"
< if ( ! f_wait_pids "$(( SCHEMAS * THREADS_PER_SCHEMA ))" "$RUN_TIME" "$WORK_LOOP" "$sqlplus_pids" "$melsqlplus_pids" )
> if ( ! f_wait_pids "$(( SCHEMAS * THREADS_PER_SCHEMA ))" "$RUN_TIME" "$WORK_LOOP" "$sqlplus_pids" )
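For what it’s worth, the “much simpler way” I suspect exists probably looks like this: leave Kevin’s space-separated list untouched and convert it with tr only at the point where Solaris ps consumes it (a sketch, not tested against SLOB itself):

```shell
#!/bin/sh
# Sketch: convert the space-separated PID list to the comma form Solaris ps
# accepts, without touching how runit.sh builds the list.
sqlplus_pids="32526 32527 32528"            # stand-in for the collected PIDs
pidstring=$(echo "$sqlplus_pids" | tr ' ' ',')
echo "$pidstring"                           # 32526,32527,32528
# ps -p "$pidstring"                        # would now work on Solaris
```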

Tomorrow – time to get someone else to run it and see if it behaves how they expect.



Creating a ramdisk in Solaris 11

Ramdisks are a great way to ‘prove’ that it’s not the performance of the underlying disk device that is stopping a process from writing a file quickly (it doesn’t prove anything about the filesystem, though…). Ramdisks are transient and are lost on system reboot. They also consume the memory on your system, so if you make them too large you can cause yourself other problems.

Creating a Ramdisk

The ramdiskadm command is used to create a ramdisk. In this example I am creating a 20G ramdisk called ‘idisk’ (the newfs and ramdiskadm output below confirm the size)

# ramdiskadm -a idisk 20G

Then you create the filesystem on the ramdisk (in this case UFS)

# newfs /dev/ramdisk/idisk

newfs: construct a new file system /dev/ramdisk/idisk: (y/n)? y
Warning: 2688 sector(s) in last cylinder unallocated
/dev/ramdisk/idisk:    41942400 sectors in 6827 cylinders of 48 tracks, 128 sectors
        20479.7MB in 427 cyl groups (16 c/g, 48.00MB/g, 5824 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
 32, 98464, 196896, 295328, 393760, 492192, 590624, 689056, 787488, 885920,
Initializing cylinder groups:
super-block backups for last 10 cylinder groups at:
 40997024, 41095456, 41193888, 41292320, 41390752, 41489184, 41587616,
 41686048, 41784480, 41882912

Now you have a filesystem, you can mount it onto the correct location

# mkdir /export/home/tuxedo/DATA2
# mount /dev/ramdisk/idisk /export/home/tuxedo/DATA2 

Remember to set the ownership/permissions to allow non-root users to write to the new filesystem

# chown tuxedo:oinstall /export/home/tuxedo/DATA2

Maintaining Ramdisks

You can check if a ramdisk exists by just running ramdiskadm without parameters

# ramdiskadm

Block Device                                                  Size  Removable 
/dev/ramdisk/idisk                                     21474836480    Yes

You can remove a ramdisk by unmounting the filesystem and using ramdiskadm -d

# umount /export/home/tuxedo/DATA2 
# ramdiskadm -d idisk

Changing the number of CPUs in a zone’s processor set (pooladm) Solaris 11.1

This post is mainly related to SuperCluster configuration, but can be applied to other Solaris-based systems.

In SuperCluster you can run the Oracle RDBMS in zones, and the zones are CPU capped. You may want to change the number of CPUs assigned to your zone for a couple of reasons

1) You have changed the number of CPUs that are available in the LDOM supporting this zone using a tool such as setcoremem and want to resize the zone to take this into account.

2) You need to change the zone size because you need more / less capacity.

Determine the number of processors that need to be reserved for the global zone (and all of your other zones!)

For SuperCluster you should reserve a minimum of 2 cores per IB HCA in domains with a single HCA, and a minimum of 4 cores for domains with 2 or more HCAs. You also need to take into account other zones running on the system.

Find out how many CPUs are in the global zone

# psrinfo -pv

The physical processor has 16 cores and 128 virtual processors (0-127)

(... snipped output)
The core has 8 virtual processors (488-495)
The core has 8 virtual processors (496-503)
The core has 8 virtual processors (504-511)
SPARC-T5 (chipid 3, clock 3600 MHz)

So in my case there are 512 virtual CPUs, of which I need to reserve at least 32 (4 cores x 8 threads) for my global zone, and as I will also have an app zone on there, possibly a lot more. So let’s say I’m going to keep 16 cores (128 vCPUs) for the global zone; that would leave 384 for my dbzone.
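The sizing arithmetic above, spelled out as a small shell snippet (the numbers are the ones from this domain):

```shell
#!/bin/sh
# vCPU sizing from this post: 512 vCPUs in the LDOM, 8 threads per core,
# 16 cores held back for the global (and app) zones.
TOTAL_VCPUS=512
THREADS_PER_CORE=8
GLOBAL_CORES=16

RESERVED=$(( GLOBAL_CORES * THREADS_PER_CORE ))  # 128 vCPUs reserved
DBZONE_VCPUS=$(( TOTAL_VCPUS - RESERVED ))       # 384 vCPUs for the db zone
echo "db zone gets $DBZONE_VCPUS vCPUs"
```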

Get the name of the pool

The SuperCluster db zone creation tools create the pools with logical names based on the zonename. However, the way to be sure what pool is in use is to look at the zone’s definition

# zonecfg -z sc5acn01-d5 export |grep pool
set pool=pool_sc5acn01-d5.blah.uk.mydomain.com_id_6135

If this doesn’t return the name of the pool, your zone is not using resource pools and may be using one of the other methods of capping CPU usage such as allocating dedicated cpu resources (ncpus=X). If so.. stop reading here as this is not the blog post you are looking for.

Find the processor set associated with your pool by either looking in /etc/pooladm.conf or by running pooladm with no parameters to get the current running config. Checking in pooladm is WAY more readable so that is my preferred method.

# pooladm |more

system default
string system.comment
int system.version 1
boolean system.bind-default true
string system.poold.objectives wt-load

pool pool_sc5acn01-d5.blah.uk.mydomain.com_id_6135
int pool.sys_id 1
boolean pool.active true
boolean pool.default false
string pool.scheduler TS
int pool.importance 1
string pool.comment
pset pset_sc5acn01-d5.blah.uk.mydomain.com_id_6135

Here’s an extract of my pooladm.conf

<pool name="pool_sc5acn01-d5.blah.uk.mydomain.com_id_6135" active="true" default="false" importance="1" comment="" res="pset_1" ref_id="pool_1">
<property name="pool.sys_id" type="int">1</property>
<property name="pool.scheduler" type="string">TS</property>
<pool name="pool_default" active="true" default="true" importance="1" comment="" res="pset_-1" ref_id="pool_0">
<property name="pool.sys_id" type="int">0</property>
<property name="pool.scheduler" type="string">TS</property>
<res_comp type="pset" sys_id="1" name="pset_sc5acn01-d5.blah.uk.mydomain.com_id_6135" default="false" min="8" max="8" units="population" comment="" ref_id="pset_1">

It looks to me that the res=”pset_1″ in the pool definition points to the ref_id=”pset_1″ in the pset definition.

OK, so now I know my pset is called pset_sc5acn01-d5.blah.uk.mydomain.com_id_6135 and it currently has 8 cpus. I also know that my running config and my persistent config are synchronised.

Shutdown the database zone.

This may not be necessary, but since Oracle can make assumptions based on CPU count at startup, I think it is safest.

# zoneadm -z sc5acn01-d5 shutdown

Change the pset configuration

I’m going to do this by changing the config file to make it persistent, as there’s nothing more embarrassing than making a change that is lost on a reboot. I set the processor set to have a minimum of 384 CPUs and a maximum of 384 CPUs.

# poolcfg -c 'modify pset pset_sc5acn01-d5.blah.uk.mydomain.com_id_6135 ( uint pset.min=384 ; uint pset.max=384 )' /etc/pooladm.conf

Check that it has applied to your config file
# grep pset_sc5acn01-d5 /etc/pooladm.conf

Force it to re-read the file and use the new configuration

# pooladm -c
# pooladm -s

Now you can run pooladm without any arguments to see the running config. If it all looks OK, go ahead and boot your zone

# pooladm

(snipped output)

pset pset_sc5acn01-d5.blah.uk.mydomain.com_id_6135
int pset.sys_id 1
boolean pset.default false
uint pset.min 384
uint pset.max 384
string pset.units population
uint pset.load 723
uint pset.size 384
string pset.comment

Using CURL to transfer files from a ftps server

Unfortunately Solaris does not come with an FTPS-compatible client by default, but you can get round this by either a) installing another client such as lftp or b) using curl.

curl is not a super-friendly program, however, so here are some of my notes on using it to access one of my servers.

Get a directory listing of the files in the source directory

curl  -k -3 --ftp-ssl -u techie ftp://files.blah.com:990//export/home/sources/* -l

-k says to ignore certificate problems (this may be bad in the outside world – but this is all internal to my network and we don’t have ssl certificates)

-u specifies the username to connect to the server. You can optionally follow this with a colon and a password (-u username:password), but this will be clear text. If you don’t specify a password on the command line it will prompt for one. Another option is to use a curl config file (--config) with the username:password specified in it.

-3 says to use SSLv3 (this may not be relevant in your environment)

// in the directory path says that it is an absolute path not a relative one.

This will give you a list of the files inside the directory, which is super handy for scripting purposes.

Then you can issue curl command for each file you need to download  like

curl  -k -3 --ftp-ssl -u techie ftp://files.blah.com:990//export/home/sources/myfile.txt  -O

-O specifies to create the file in your local directory, with the same filename.
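The listing can drive one download per file to script the whole transfer. In this sketch the listing is a hard-coded stand-in (myfile.txt is from this post; the second filename is made up) and the curl commands are only echoed, so you can eyeball them before running anything against the real server:

```shell
#!/bin/sh
# Sketch: generate one curl download command per file in the listing.
# Remove the leading "echo" inside the loop to actually transfer files.
BASE="ftp://files.blah.com:990//export/home/sources"

listing="myfile.txt
otherfile.txt"            # stand-in for the output of the -l listing command

echo "$listing" | while read f
do
    echo curl -k -3 --ftp-ssl -u techie "${BASE}/${f}" -O
done
```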

Curl has a whole bunch of options and this is only a tiny snapshot. There is a really useful man page available online at http://curl.haxx.se/docs/manpage.html

Disable access time (atime) recording on ZFS

If you have a filesystem that contains data which is accessed often, but you do not want to record the access time information because it is static data (e.g. content for a webserver) you can change this in zfs properties.

You can see the full list of zfs properties

root@sc5acn02-d2:/var/fmw# zfs get all logpool/fmw_app
NAME             PROPERTY              VALUE                  SOURCE
logpool/fmw_app  aclinherit            restricted             default
logpool/fmw_app  aclmode               discard                default
logpool/fmw_app  atime                 on                     default
logpool/fmw_app  available             1.57T                  -
logpool/fmw_app  canmount              on                     default
logpool/fmw_app  casesensitivity       mixed                  -
logpool/fmw_app  checksum              on                     default
logpool/fmw_app  compression           off                    default
logpool/fmw_app  compressratio         1.00x                  -
logpool/fmw_app  copies                1                      default
logpool/fmw_app  creation              Wed Jun  4 15:47 2014  -
logpool/fmw_app  dedup                 off                    default
logpool/fmw_app  devices               on                     default
logpool/fmw_app  encryption            off                    -
logpool/fmw_app  exec                  on                     default
logpool/fmw_app  keychangedate         -                      default
logpool/fmw_app  keysource             none                   default
logpool/fmw_app  keystatus             none                   -
logpool/fmw_app  logbias               latency                default
logpool/fmw_app  mlslabel              none                   -
logpool/fmw_app  mounted               yes                    -
logpool/fmw_app  mountpoint            /logpool/fmw_app       default
logpool/fmw_app  multilevel            off                    -
logpool/fmw_app  nbmand                off                    default
logpool/fmw_app  normalization         none                   -
logpool/fmw_app  primarycache          all                    default
logpool/fmw_app  quota                 none                   default
logpool/fmw_app  readonly              off                    default
logpool/fmw_app  recordsize            128K                   default
logpool/fmw_app  referenced            30.1G                  -
logpool/fmw_app  refquota              none                   default
logpool/fmw_app  refreservation        none                   default
logpool/fmw_app  rekeydate             -                      default
logpool/fmw_app  reservation           none                   default
logpool/fmw_app  rstchown              on                     default
logpool/fmw_app  secondarycache        all                    default
logpool/fmw_app  setuid                on                     default
logpool/fmw_app  shadow                none                   -
logpool/fmw_app  share.*               ...                    default
logpool/fmw_app  snapdir               hidden                 default
logpool/fmw_app  sync                  standard               default
logpool/fmw_app  type                  filesystem             -
logpool/fmw_app  used                  30.1G                  -
logpool/fmw_app  usedbychildren        0                      -
logpool/fmw_app  usedbydataset         30.1G                  -
logpool/fmw_app  usedbyrefreservation  0                      -
logpool/fmw_app  usedbysnapshots       0                      -
logpool/fmw_app  utf8only              off                    -
logpool/fmw_app  version               6                      -
logpool/fmw_app  vscan                 off                    default
logpool/fmw_app  xattr                 on                     default
logpool/fmw_app  zoned                 off                    default

You can use the set command to change the property.

root@sc5acn02-d2:/var/fmw# zfs set atime=off logpool/fmw_app

You do not need to remount the filesystem; the change is applied instantly

root@sc5acn01-d2:/var/bea# mount
[snipped output]
/var/fmw/app on logpool/fmw_app read/write/setuid/devices/rstchown/nonbmand/exec/xattr/noatime/dev=47d0012 on Wed Jun  4 16:19:57 2014

Converting Apache SSL certificates for use in Oracle Traffic Director

Certificates that have been exported by Apache cannot be directly imported into OTD. Apache exports consist of 2 files, a .pem and a .key, and if you try to load the .pem file you will get the error ‘OTD-64112’

First you convert them to PKCS12 format

otd_user@otd-zone:~$ openssl pkcs12 -export -in wildcard.load.melnet.net.pem -inkey wildcard.load.melnet.net.key -out wildcard.p12
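Before importing, it’s worth sanity-checking that the converted bundle parses cleanly. This demo generates a throwaway key/cert pair first so it is self-contained; with a real Apache export you would already have the .pem and .key, and the demo filenames and password here are hypothetical:

```shell
#!/bin/sh
# Self-contained demo: make a throwaway key/cert, convert to PKCS12 the
# same way as above, then verify the result parses cleanly.
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=demo" \
    -keyout demo.key -out demo.pem 2>/dev/null
openssl pkcs12 -export -in demo.pem -inkey demo.key \
    -passout pass:secret -out demo.p12
# -info -noout parses the bundle without dumping keys; a failure here
# means the conversion went wrong
openssl pkcs12 -info -noout -passin pass:secret -in demo.p12 2>/dev/null \
    && echo "p12 OK"
```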

Then you import them into Oracle Traffic Director. The first time I tried this I had problems: the new certificates were not visible in the GUI, despite it noticing that the files cert8.db/key3.db had changed and prompting for a reload. cert8.db/key3.db are the older Berkeley format databases, and OTD uses the newer SQLite format files cert9.db/key4.db. You can control this by setting the environment variable NSS_DEFAULT_DB_TYPE, or by putting the prefix sql: on the path to the certificate directory specified by -d.

otd_user@otd-zone:/u01/OTDInstance/net-sso-home/config$ export NSS_DEFAULT_DB_TYPE="sql"
otd_user@otd-zone:/u01/OTDInstance/net-sso-home/config$ pk12util -i /export/home/otd_user/wildcard.p12 -d .
 Enter password for PKCS12 file:

Use certutil to verify it has loaded successfully

otd_user@otd-zone:/u01/OTDInstance/net-sso-home/config$ certutil -K -d .
certutil: Checking token "NSS Certificate DB" in slot "NSS User Private Key and Certificate Services"
rsa 7fe2205c3cc98df1df0a907c9ed81900048bf434 cert-*.load.melnet.net

Then log in to the OTD management user interface and select to pull and deploy the changes.

Useful related knowledge

certutil utility reference: https://developer.mozilla.org/en-US/docs/Mozilla/Projects/NSS/Reference/NSS_tools_:_certutil

How To Import Pre-existing SSL Server Certificate and Key Files Into Oracle Traffic Director (Doc ID 1495668.1)