Tag Archives: Solaris 11

What packages did that incorporation just install?

So you just installed a Solaris package incorporation and you want to work out what it actually included.

First, find out when your package was installed:

root@host-8-200:/var/log/pkg# pkg history 
START OPERATION CLIENT OUTCOME
2017-10-12T14:51:03 set-property transfer module Succeeded
2017-10-12T14:51:03 image-create transfer module Succeeded
2017-10-12T14:51:04 refresh-publishers transfer module Succeeded
2017-10-12T14:51:20 rebuild-image-catalogs transfer module Succeeded
2017-10-12T14:51:27 install transfer module Succeeded
2017-10-12T15:33:18 install pkg Succeeded
2017-10-12T15:39:07 install pkg Succeeded

We are going to dig into the install that occurred at 15:39:

root@host-8-200:/var/log/pkg# pkg history -t 2017-10-12T15:39:07 -l

This gives a really long listing, but the key part for me was the section headed:

 

Package version changes:
None -> pkg://solaris/developer/build/make@0.5.11,5.11-0.175.2.0.0.34.0:20140303T132010Z
None -> pkg://solaris/developer/assembler@0.5.11,5.11-0.175.3.9.0.2.0:20160528T012706Z
None -> pkg://solaris/group/prerequisite/oracle/oracle-rdbms-server-12-1-preinstall@0.5.11,5.11-0.175.3.11.0.4.0:20160804T020607Z

This shows that 3 packages were installed as they went from version ‘None’ to an actual version number.
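To pull just the newly installed packages out of that long listing, one option is to filter for the lines that start at version ‘None’. This is only a sketch, assuming the "None -> <fmri>" layout shown above; new_packages is a helper name made up for illustration, and the sample input is copied from the listing.

```shell
#!/bin/sh
# new_packages (hypothetical helper): read `pkg history -l` output on stdin
# and print the FMRI of every package whose version changed from 'None',
# i.e. packages that were newly installed by that operation.
new_packages() {
    awk '$1 == "None" && $2 == "->" { print $3 }'
}

# On a live system the input would come from something like:
#   pkg history -t 2017-10-12T15:39:07 -l | new_packages
new_packages <<'EOF'
Package version changes:
None -> pkg://solaris/developer/build/make@0.5.11,5.11-0.175.2.0.0.34.0:20140303T132010Z
None -> pkg://solaris/developer/assembler@0.5.11,5.11-0.175.3.9.0.2.0:20160528T012706Z
EOF
```

Lines for packages that were updated rather than installed (old version -> new version) are skipped, because their first field is not ‘None’.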


Daxstat in Solaris 11.3/SuperCluster problems

One of the big challenges with the Software in Silicon features is monitoring them to see if they are actually doing anything. To monitor the DAX on the SPARC M7 chip you used to have to use busstat commands and then interpret the results (not easy, as the documentation is not very thorough).

In later releases of Solaris 11.3 (SRU 19 onwards?) there is a command called daxstat which shows the activity on your DAX in a much more human-readable form: https://docs.oracle.com/cd/E86824_01/html/E54764/daxstat-1m.html

However, when I went to use it on my SuperCluster I hit a problem… it was failing with an error I couldn’t understand.

# daxstat
Traceback (most recent call last):
 File "/usr/bin/daxstat", line 969, in <module>
 sys.exit(main())
 File "/usr/bin/daxstat", line 962, in main
 return process_opts()
 File "/usr/bin/daxstat", line 905, in process_opts
 dax_ids, dax_queue_ids = derive_dax_opts(args, parser)
 File "/usr/bin/daxstat", line 844, in derive_dax_opts
 dax_ids = find_ids(query, parser, None)
 File "/usr/bin/daxstat", line 683, in find_ids
 all_dax_kstats = RCU.list_objects(kbind.Kstat(), query)
 File "/usr/lib/python2.7/vendor-packages/rad/connect.py", line 391, in list_objects
 a RADInterface object
 File "/usr/lib/python2.7/vendor-packages/rad/client.py", line 213, in _raise_error
 packer.pack_int((timestamp % 1000000) * 1000)
rad.client.NotFoundError: Error listing com.oracle.solaris.rad.kstat:type=Kstat: not found (3)

It *should* have worked on my current version of Solaris

# pkg list entire
NAME (PUBLISHER) VERSION IFO
entire 0.5.11-0.175.3.22.0.3.0 i--

 

So, I did some tweaking. I am not sure which of steps 1 or 2 actually fixed my problem, as it seemed to need the reboot to activate my ‘fix’

Step 1 – Make sure you have the Remote Administration Daemon packages installed. https://docs.oracle.com/cd/E53394_01/html/E54825/index.html I installed the package group group/system/management/rad/rad-server-interfaces to make sure I wasn’t missing anything.

 # pkg list |grep rad
group/system/management/rad/rad-server-interfaces 0.5.11-0.175.3.0.0.30.0 i--
system/management/rad 0.5.11-0.175.3.21.0.4.0 i--
system/management/rad/client/rad-c 0.5.11-0.175.3.21.0.3.0 i--
system/management/rad/client/rad-java 0.5.11-0.175.3.17.0.1.0 i--
system/management/rad/client/rad-python 0.5.11-0.175.3.17.0.1.0 i--
system/management/rad/module/rad-dlmgr 0.5.11-0.175.3.17.0.1.0 i--
system/management/rad/module/rad-files 0.5.11-0.175.3.17.0.1.0 i--
system/management/rad/module/rad-kstat 0.5.11-0.175.3.17.0.1.0 i--
system/management/rad/module/rad-network 0.5.11-0.175.3.17.0.1.0 i--
system/management/rad/module/rad-panels 0.5.11-0.175.3.17.0.1.0 i--
system/management/rad/module/rad-smf 0.5.11-0.175.3.17.0.1.0 i--
system/management/rad/module/rad-time 0.5.11-0.175.3.17.0.1.0 i--
system/management/rad/module/rad-usermgr 0.5.11-0.175.3.17.0.4.0 i--
system/management/rad/module/rad-zfsmgr 0.5.11-0.175.3.17.0.1.0 i--
system/management/rad/module/rad-zonemgr 0.5.11-0.175.3.22.0.1.0 i--

Step 2 – Make sure the RAD service is running (mine was disabled)

# svcs -a |grep rad

online 9:22:10 svc:/system/rad:local
online 9:22:10 svc:/system/logadm-upgrade:default
online 9:22:10 svc:/system/rad:local-http
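Note that grepping for "rad" also matches system/logadm-upgrade (the letters appear in "upgrade"). A tighter check, sketched here under the assumption that svcs prints STATE, STIME and FMRI as three whitespace-separated columns, is to match on the FMRI itself. rad_not_online is a hypothetical helper name.

```shell
#!/bin/sh
# rad_not_online (hypothetical helper): read `svcs -a` output on stdin and
# report every svc:/system/rad instance whose state is not "online".
rad_not_online() {
    awk '$3 ~ /^svc:\/system\/rad:/ && $1 != "online" { print $3 " is " $1 }'
}

# On a live system:  svcs -a | rad_not_online
# Sample input (the "disabled" line is invented to show a hit):
rad_not_online <<'EOF'
online 9:22:10 svc:/system/rad:local
online 9:22:10 svc:/system/logadm-upgrade:default
disabled 9:22:10 svc:/system/rad:local-http
EOF
```

Any instance it reports can then be enabled with svcadm enable followed by the FMRI.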

It was still failing to work though, and I couldn’t work out why. In desperation I tried a reboot and the command stopped failing.

# daxstat 1
No data available to display.

Run a workload that uses the DAX and the output will be populated.

# daxstat 1
DAX commands fallbacks input output %busy
 0 466 9 4.0M 0.0M 0
 1 469 13 4.0M 0.0M 0
 2 462 12 2.0M 0.0M 0
 3 461 16 4.0M 0.0M 0
 4 473 9 4.0M 0.0M 0
 5 457 8 2.0M 0.0M 0
 6 459 6 2.0M 0.0M 0
 7 465 10 4.0M 0.0M 0

vmstat – who is in kthr w status

This post leans heavily on the work of clever people before me, especially this blog post https://blogs.oracle.com/swan/entry/find_out_process_es_with

I had a zone which had been heavily used until recently, but had mostly been quiescent for the past few weeks. In vmstat, it was noticed that there were approximately 150 LWPs in the w state.

# vmstat 5 5
 kthr memory page disk faults cpu
 r b w swap free re mf pi po fr de sr sd sd sd vc in sy cs us sy id
 0 0 188 873445368 54537152 472 699 13 0 0 0 0 20 0 32 0 17621 32317 17008 0 0 99
 0 0 149 863818176 9628832 42 733 38 0 0 0 0 31 0 33 0 19384 147817 16927 20 1 79
 0 0 149 864814392 10389536 125 919 0 0 0 0 0 14 0 20 0 19523 135382 16785 26 1 73
 0 0 149 864914688 10422784 63 544 0 0 0 0 0 16 0 33 0 19072 128355 16572 26 1 73
 0 0 149 863905112 9505416 46 579 0 0 0 0 0 14 0 29 0 19057 130498 16582 25 1 75

 

The vmstat man page says..

 w the number of swapped out lightweight processes (LWPs)
   that are waiting for processing resources to finish.

Hmm..

So I had a look at this blog post, which uses mdb and some knowledge of the Solaris source to find the PIDs: https://blogs.oracle.com/swan/entry/find_out_process_es_with

 

Of course as I’m in a Zone, I need to do my investigations at the Global Zone level.

First get the list of PIDs and their swapped count in hex

 

# echo '::walk proc|::print -t proc_t p_pidp->pid_id p_swapcnt'|mdb -k|awk '{if(NR%2){printf("%s\t",$0);}else{printf("%s\n",$0);}}'|awk '{if($NF!=0){printf("pid: %s\tp_swapcnt: %s\n",$4,$NF);}}'

giving an output like

pid: 0x2f p_swapcnt: 0x1
pid: 0x25 p_swapcnt: 0x3
pid: 0x11 p_swapcnt: 0x1
pid: 0xf p_swapcnt: 0x5
pid: 0xcb7d p_swapcnt: 0x1

which I saved to a text file. Now, the blog post only had 17 to play with. I’ve got over 150, so I’m not going to be looking up all the individual PIDs by hand. There is almost certainly a more elegant way of doing this, through cunning use of pipe and awk or maybe dtrace, but I was pressed for time.

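#!/bin/bash
# Read the "pid: 0x.. p_swapcnt: 0x.." lines saved from mdb (walk.txt)
# and show each process along with its count of swapped-out LWPs.
runcounter=0
while read -r _ pidder _ counter
do
 # mdb prints hex values; printf "%d" converts them to decimal
 outpid=$(printf "%d" "$pidder")
 outcounter=$(printf "%d" "$counter")
 echo "Number of LWPS swapped : $outcounter"
 echo "Process=$(ps -fp "$outpid")"
 echo "-------------------------------------------"
 # keep a running total to compare against the vmstat w column
 runcounter=$((runcounter + outcounter))
done < walk.txt
echo "Total number of lwps in state w: $runcounter"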

This gave an output similar to :

<snip>

-------------------------------------------
Number of LWPS swapped : 5
Process= UID PID PPID C STIME TTY TIME CMD
 root 15 1 0 Feb 27 ? 1:22 /lib/svc/bin/svc.startd
-------------------------------------------
Number of LWPS swapped : 1
Process= UID PID PPID C STIME TTY TIME CMD
 root 52093 15 0 Mar 09 console 0:00 /usr/sbin/ttymon -g -d /dev/console -l console -m ldterm,ttcompat -h -p sc7ach00pd01-d2 console login: 
-------------------------------------------
Total number of lwps in state w: 149
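If all you want is the total to compare against the w column in vmstat, the per-PID ps lookup can be skipped. A minimal sketch, assuming input lines in the "pid: 0x.. p_swapcnt: 0x.." format saved earlier; sum_swapcnt is a name invented for this example.

```shell
#!/bin/sh
# sum_swapcnt (hypothetical helper): read "pid: 0x2f  p_swapcnt: 0x1" lines
# on stdin and print the decimal total of the p_swapcnt column.
sum_swapcnt() {
    total=0
    while read -r label1 pid label2 cnt; do
        # skip blank or short lines
        [ -n "$cnt" ] || continue
        # shell arithmetic understands the 0x prefix, so no conversion needed
        total=$((total + $cnt))
    done
    echo "$total"
}

# Sample input taken from the mdb output shown earlier.
# On a live system:  sum_swapcnt < walk.txt
sum_swapcnt <<'EOF'
pid: 0x2f	p_swapcnt: 0x1
pid: 0x25	p_swapcnt: 0x3
pid: 0x11	p_swapcnt: 0x1
pid: 0xf	p_swapcnt: 0x5
pid: 0xcb7d	p_swapcnt: 0x1
EOF
```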


From the Solaris Internals manual (2.4.1 The Process Structure, Table 10.3 and 10.3.6 The Memory Scheduler), processes with p_swapcnt > 0 are those that have been swapped out by the memory scheduler to free up memory pages. This is a separate operation from page-out, and is relatively inexpensive, though it does dramatically affect the process’s performance. Swapping out a process involves removing all of the process’s thread structures and private pages from memory and setting flags in the process table to show that this process has been swapped out. The memory scheduler is started at boot time and doesn’t do anything until free memory is consistently below desfree over a 30 second average. Desfree is a calculated value (https://docs.oracle.com/cd/E53394_01/html/E54818/chapter2-10.html#OSTUNchapter2-103), set at 1/128th of the memory of the system, with a minimum of 256K.
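To get a feel for how small that threshold is, the rule of thumb above (1/128th of physical memory, with a 256K floor) can be worked through in a few lines of shell. This is only an illustration of the arithmetic as described in this post, not a reading of the live kernel value; desfree_bytes is a made-up helper name and the 512 GiB figure is an arbitrary example.

```shell
#!/bin/sh
# desfree_bytes (hypothetical helper): given physical memory in bytes,
# print 1/128th of it, but never less than 256 KB.
desfree_bytes() {
    mem=$1
    d=$((mem / 128))
    min=$((256 * 1024))
    if [ "$d" -lt "$min" ]; then
        d=$min
    fi
    echo "$d"
}

# Example: a system with 512 GiB of memory (549755813888 bytes)
desfree_bytes 549755813888
```

For the 512 GiB example this comes out at 4 GiB, which is under 1% of the memory on the box.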

 

At some point in the recent past, this system suffered extreme memory pressure due to someone starting up a huge SGA + PGA on the system. It looks like the memory scheduler does not automatically swap the processes back in when the memory pressure eases, instead waiting for the process to do ‘something’ and need to run those LWPs. This makes sense: it’s better not to do work unless it’s needed, and as desfree is not actually a lot of free memory, if you’re bumping along that threshold the last thing you need is the scheduler to un-swap something and tip you back into a memory shortage.

I basically ‘touched’ each one of these pids by using the pfiles command … and now I have no processes sitting in state ‘w’

# vmstat 5 3
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr sd sd sd vc   in   sy   cs us sy id
 0 0 187 873530456 54500568 471 698 13 0 0 0 0 20 0 32 0 17621 32769 17006 0 0 99
 0 0 0 909716856 43286880 585 1103 0 0 0 0 0 28 0 32 0 18478 206015 16744 2 0 98
 0 0 0 909647064 43275192 62 260 0 0 0 0 0 14  0 27  0 17117 202019 15744 2 0 98
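The ‘touch’ step can be scripted from the same saved file. A rough sketch, assuming the "pid: 0x.. p_swapcnt: 0x.." line format from earlier; swapped_pids is a hypothetical helper that just converts the hex PID column to decimal so it can be fed to pfiles.

```shell
#!/bin/sh
# swapped_pids (hypothetical helper): read "pid: 0x2f  p_swapcnt: 0x1" lines
# on stdin and print each PID in decimal, one per line.
swapped_pids() {
    while read -r label1 pid label2 cnt; do
        [ -n "$pid" ] || continue
        echo "$(($pid))"
    done
}

# On the global zone, each PID could then be touched with pfiles:
#   swapped_pids < walk.txt | while read -r p; do pfiles "$p" > /dev/null; done
#
# Sample input (two lines from the mdb output earlier):
swapped_pids <<'EOF'
pid: 0xf	p_swapcnt: 0x5
pid: 0xcb7d	p_swapcnt: 0x1
EOF
```

The two sample lines convert to 15 and 52093, the svc.startd and ttymon processes in the output above.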

 

 

Creating a ramdisk in Solaris 11

Ramdisks are a great way to ‘prove’ that it’s not the performance of the underlying disk device that is stopping a process from writing a file quickly (it doesn’t prove anything about the filesystem though…). Ramdisks are transient and are lost on system reboot. They also consume the memory on your system, so if you make them too large you can cause yourself other problems.

Creating a Ramdisk

The ramdiskadm command is used to create a ramdisk. In this example I am creating a 20G ramdisk called ‘idisk’

# ramdiskadm -a idisk 20G

Then you create the filesystem on the ramdisk (in this case UFS)

# newfs /dev/ramdisk/idisk

newfs: construct a new file system /dev/ramdisk/idisk: (y/n)? y
Warning: 2688 sector(s) in last cylinder unallocated
/dev/ramdisk/idisk:    41942400 sectors in 6827 cylinders of 48 tracks, 128 sectors
        20479.7MB in 427 cyl groups (16 c/g, 48.00MB/g, 5824 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
 32, 98464, 196896, 295328, 393760, 492192, 590624, 689056, 787488, 885920,
Initializing cylinder groups:
........
super-block backups for last 10 cylinder groups at:
 40997024, 41095456, 41193888, 41292320, 41390752, 41489184, 41587616,
 41686048, 41784480, 41882912

Now that you have a filesystem, you can mount it at the required location

# mkdir /export/home/tuxedo/DATA2
# mount /dev/ramdisk/idisk /export/home/tuxedo/DATA2 

Remember to set the ownership/permissions to allow the non-root users to write to the device

# chown tuxedo:oinstall /export/home/tuxedo/DATA2

Maintaining Ramdisks

You can check if a ramdisk exists by just running ramdiskadm without parameters

# ramdiskadm

Block Device                                                  Size  Removable 
/dev/ramdisk/idisk                                     21474836480    Yes
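The Size column here is in bytes, and newfs reports 512-byte sectors, so a little arithmetic is a quick way to check the ramdisk came out the size you expected. A small sketch; both function names are invented for this example.

```shell
#!/bin/sh
# bytes_to_gib (hypothetical helper): byte count -> whole GiB (integer division)
bytes_to_gib() {
    echo $(( $1 / 1024 / 1024 / 1024 ))
}

# sectors_to_bytes (hypothetical helper): 512-byte sector count -> bytes
sectors_to_bytes() {
    echo $(( $1 * 512 ))
}

bytes_to_gib 21474836480      # Size column from ramdiskadm
sectors_to_bytes 41942400     # sector count from the newfs output
```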

You can remove a ramdisk by unmounting the filesystem and using ramdiskadm -d

# umount /export/home/tuxedo/DATA2 
# ramdiskadm -d idisk

Expanding a zpool backed by an iSCSI LUN

So, you have a zpool provided by an iSCSI LUN which is tight on space, and you’ve done all the tidying you can think of. What do you do next? Well, if you’re lucky, you have space to expand the iSCSI LUN and then make it available to your zpool.

First – find the LUN that holds the zpool using zpool status <poolname>

 

# zpool status zonepool
  pool: zonepool
 state: ONLINE
  scan: none requested
config:

        NAME                                     STATE     READ WRITE CKSUM
        zonepool                                 ONLINE       0     0     0
          c0t600144F0A22997000000574BFAA90004d0  ONLINE       0     0     0

errors: No known data errors

Note the LUN identifier, starting with ‘c0’ and ending with ‘d0’.

Locate the LUN on your storage appliance. If you are on a ZFS appliance there is a really handy script in Oracle Support Document 1921605.1. Otherwise you’ll have to use the tools supplied with your storage array or your eyes 😉
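The appliance lists the GUID on its own, without the device-name wrapping, so it helps to strip the prefix and suffix before searching. A minimal sketch, assuming device names of the form c<N>t<GUID>d<N> as in the zpool status output above; lun_guid is a hypothetical helper name.

```shell
#!/bin/sh
# lun_guid (hypothetical helper): strip the leading "c<N>t" and trailing
# "d<N>" from a device name, leaving the GUID to match on the storage side.
lun_guid() {
    echo "$1" | sed -e 's/^c[0-9]*t//' -e 's/d[0-9]*$//'
}

lun_guid c0t600144F0A22997000000574BFAA90004d0
```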

So, I’ve located my LUN on my ZFS appliance by matching the LUN identifier, and now I need to change the LUN size.

shares> select sc1-myfs
shares sc1-myfs> 
shares sc1-myfs> select zoneshares_zonepool 
shares sc1-myfs/zoneshares_zonepool> get lunguid
 lunguid = 600144F0A22997000000574BFAA90004
shares sc1-myfs/zoneshares_zonepool> set volsize=500G
 volsize = 500G (uncommitted)
shares sc1-myfs/zoneshares_zonepool> commit

 

Now I just need to get my zpool to expand into the available space on the LUN

# zpool online -e zonepool c0t600144F0A22997000000574BFAA90004d0

And now we’re done

 

Allowing a non-privileged user to view LDOM configuration (updated for EM13.2)

In Solaris 11.2 (and probably other releases – I haven’t checked) if you try to view the ldom configuration as a non-privileged user you will get the following message

emagent@sscadb01:~$ ldm ls
Authorization failed

You can grant the user the ability to view the LDOM config using the built-in ‘LDoms Review’ profile

# usermod -P 'LDoms Power Mgmt Observability' emagent
# usermod -P 'LDoms Review' emagent

Log out and back in again and it should work.

emagent@sscadb01:~$ ldm ls
NAME STATE FLAGS CONS VCPU MEMORY UTIL NORM UPTIME
primary active -n-cv- UART 128 523776M 3.7% 3.7% 41d 6h 9m
ssccn1-app1 active -n---- 5001 128 512G 0.4% 0.2% 41d 6h 32m

Enterprise Manager 13 is LDOMs aware, and you will need to add this privilege to the agent software owner if you want the virtualization to be shown in this tool.

The Enterprise Manager 13.2 manual now lists slightly different privileges as required for the Enterprise Manager Systems Integration plugin. http://docs.oracle.com/cd/E73210_01/EMADM/GUID-62C91671-4F42-40A0-B929-22CBFEE73672.htm#EMADM15592

# usermod -P 'LDoms Power Mgmt Observability' emagent
# usermod -A solaris.ldoms.read,solaris.ldoms.ldmpower emagent

 

Exempting root from password complexity rules on Solaris 11

THIS IS STRONGLY NOT ADVISED!!!!

In Solaris 11.1 complexity rules are enforced on root passwords:

# passwd root
New Password:
passwd: The password must contain at least 1 numeric or special character(s).

The /etc/pam.d/other file lists the rules to be used, and at the tail of the file it gives you the instructions:

# Password construction requirements apply to all users.
# Edit /usr/lib/security/pam_authtok_common and remove force_check
# to have the traditional authorized administrator bypass of construction
# requirements.
password include pam_authtok_common
password required pam_authtok_store.so.1

Edit /usr/lib/security/pam_authtok_common and remove the force_check from the line

other password requisite pam_authtok_check.so.1 force_check

Giving…

#
# Copyright (c) 2012, Oracle and/or its affiliates. All rights reserved.
#
# PAM common include file for PAM authentication token manipulation.
# Remove the 'force_check' option from pam_authtok_check(5) to have the
# traditional authorized administrator bypass of construction requirements.
#
other password required pam_dhkeys.so.1
other password requisite pam_authtok_get.so.1
other password requisite pam_authtok_check.so.1

Pooladm can hang you up!

In my last post I wrote about using pooladm to change the number of CPUs assigned to a zone. I’ve been using this command a lot more recently and I’ve hit upon some quirky behaviour when the contents of /etc/pooladm.conf are inconsistent with the reality on your system.

So, I have a T5-8 LDOM. This originally had 32 cores (256 threads) of T5-8 CPU and a zone using a pool of 240 vcpus. I used supported tools to increase the number of cores in my LDOM to 60 (480 threads) and rebooted. After the reboot I tried to increase the pset size for my zone:

# poolcfg -c 'modify pset pset_sc5acn01-d8.blah.com_id_6138 ( uint pset.min=448 ; uint pset.max=448 )' /etc/pooladm.conf
# pooladm -c

And then I wait, and I wait, and I wait… and still nothing happens. The OS is up and running OK and I can do other jobs, but my pooladm isn’t returning. When I truss it, it seems to just sit there doing nothing:

41937: brk(0x0035FA30) = 0x00000000
41937: brk(0x0035FA30) = 0x00000000
41937: brk(0x0035FA30) = 0x00000000
41937: brk(0x00361A30) = 0x00000000
41937: brk(0x00361A30) = 0x00000000
41937: brk(0x00361A30) = 0x00000000
41937: brk(0x00363A30) = 0x00000000

I gave up after 5 minutes of waiting. I tried rebooting the LDOM again, and after the reboot the zone hadn’t automatically restarted because the system pools service was stuck transitioning to online:

offline* 9:59:15 svc:/system/pools:default

After a bit of experimentation, I discover that I can increase the pset size up to 240, but if I make the jump to 296 I get the hang. Then I look at my pooladm.conf

<property name="pool.sys_id" type="int">1</property>
<property name="pool.scheduler" type="string">TS</property>
</pool>
<pool name="pool_default" active="true" default="true" importance="1" comment="" res="pset_-1" ref_id="pool_0">
<property name="pool.sys_id" type="int">0</property>
<property name="pool.scheduler" type="string">TS</property>
</pool>
<res_comp type="pset" sys_id="1" name="pset_sc5acn01-d8.osc.uk.oracle.com_id_6138" default="false" min="448" max="448" units="population" comment="" ref_id="pset_1">

You can see here that my system default pset size is still shown as 256 despite the changes in the underlying LDOM. If I do

# pooladm -s

to save the current running config to /etc/pooladm.conf, this value increases to match the new size of the LDOM. Now I can manipulate the size of the zone’s pool and enable it successfully using pooladm -c.

So, if you need to change the underlying number of CPUs in your LDOM then you must take care!

- If increasing, use pooladm -s to refresh pooladm.conf with the new total number of CPUs.
- If decreasing (I’m only guessing how it will behave here), reduce the zone’s pool size first.
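For the increasing case, the order of operations matters, so here is a dry-run sketch that only prints the three commands in the order that worked for me. It does not run anything; resize_pset_plan is a made-up name, and the pset name in the example is a placeholder rather than a real one.

```shell
#!/bin/sh
# resize_pset_plan (hypothetical helper): print, but do not execute, the
# commands for growing a zone's pset after the LDOM has gained CPUs.
#   $1 = pset name, $2 = new size, $3 = config file (optional)
resize_pset_plan() {
    pset=$1
    size=$2
    conf=${3:-/etc/pooladm.conf}
    # 1. save the running config so the default pset reflects the new CPU count
    echo "pooladm -s"
    # 2. change the zone's pset size in the saved config
    echo "poolcfg -c 'modify pset $pset ( uint pset.min=$size ; uint pset.max=$size )' $conf"
    # 3. activate the updated config
    echo "pooladm -c"
}

# Placeholder pset name, for illustration only
resize_pset_plan pset_example_id_1 448
```

Review the printed commands and run them by hand as root once they look right.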

UPDATE: There is also some information in the MOS note SuperCluster – Avoiding system/pools maintenance state when modifying zone pool CPU count after using setcoremem (Doc ID 1991360.1)

Enabling Multicast routing on a specific interface – Solaris 11

I had a customer with an application running on a Solaris 11 zone that was failing to connect to a multicast group.

You can check your multicast group memberships by running the command

Zone# netstat -g
Group Memberships: IPv4
Interface Group RefCnt
--------- -------------------- ------
lo0 all-systems.mcast.net 1
sc_ipmp0 all-systems.mcast.net 1

Group Memberships: IPv6
If Group RefCnt
----- --------------------------- ------
lo0 ff02::202 1
lo0 ff02::1:ff00:1 1
lo0 ff02::1 1

You cannot add a multicast route at the zone level:

Zone# route add -interface 224.0/4 -gateway 168.4.9.88
add net 224.0/4: gateway 168.4.9.88: insufficient privileges

You need to add it at the global zone interface (168.4.10.29 is the IP address for my interface sc_ipmp0 in the global zone)

GZ# route -p add -interface 224.0/4 -gateway 168.4.10.29

Once the application in the zone tries to use multicast on the sc_ipmp0 interface we should see a new group membership

Zone# netstat -g
Group Memberships: IPv4
Interface Group RefCnt
--------- -------------------- ------
lo0 all-systems.mcast.net 1
sc_ipmp0 239.255.255.250 1
sc_ipmp0 all-systems.mcast.net 1

Group Memberships: IPv6
If Group RefCnt
----- --------------------------- ------
lo0 ff02::202 1
lo0 ff02::1:ff00:1 1
lo0 ff02::1 1
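If you want to check for the new membership from a script rather than by eye, something like the following would do. It is a sketch that assumes the three-column Interface / Group / RefCnt layout shown above; joined_group is a hypothetical helper name, and on Solaris you may need nawk or /usr/xpg4/bin/awk if your awk does not accept -v.

```shell
#!/bin/sh
# joined_group (hypothetical helper): read `netstat -g` output on stdin.
# Exit status 0 if interface $1 lists group $2, non-zero otherwise.
joined_group() {
    awk -v ifc="$1" -v grp="$2" '
        $1 == ifc && $2 == grp { found = 1 }
        END { exit !found }'
}

# On a live system:  netstat -g | joined_group sc_ipmp0 239.255.255.250
if joined_group sc_ipmp0 239.255.255.250 <<'EOF'
lo0 all-systems.mcast.net 1
sc_ipmp0 239.255.255.250 1
sc_ipmp0 all-systems.mcast.net 1
EOF
then
    echo "sc_ipmp0 has joined 239.255.255.250"
fi
```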

 

 

Allowing a user to use ports under 1024 on Solaris 11

You can allow a normal Unix user to create processes on privileged ports (i.e. under 1024) by assigning them the privilege net_privaddr. This is useful if you want your webserver to run as a non-root user.

# usermod -K defaultpriv=basic,net_privaddr webservd

This change will be recorded in the file /etc/user_attr. The user will need to re-login and restart processes to pick up these changes.
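To confirm the entry landed, you can look for the privilege in the user's line. A rough sketch, assuming the colon-separated user_attr layout with the attributes in the fifth field as semicolon-separated key=value pairs; has_defaultpriv is a made-up helper name and the sample line is illustrative, not copied from a real system.

```shell
#!/bin/sh
# has_defaultpriv (hypothetical helper): read user_attr-style lines on stdin.
# Exit status 0 if user $1 has privilege $2 in its defaultpriv list.
has_defaultpriv() {
    awk -F: -v u="$1" -v p="$2" '
        $1 == u {
            n = split($5, attrs, ";")
            for (i = 1; i <= n; i++) {
                if (attrs[i] ~ /^defaultpriv=/) {
                    sub(/^defaultpriv=/, "", attrs[i])
                    m = split(attrs[i], privs, ",")
                    for (j = 1; j <= m; j++)
                        if (privs[j] == p) found = 1
                }
            }
        }
        END { exit !found }'
}

# On a live system:  has_defaultpriv webservd net_privaddr < /etc/user_attr
# Illustrative sample line:
if has_defaultpriv webservd net_privaddr <<'EOF'
webservd::::defaultpriv=basic,net_privaddr
EOF
then
    echo "webservd has net_privaddr"
fi
```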

There are LOTS of other privileges you can assign this way. You can see a listing with a brief description by running, as root,

# ppriv -lv

Useful related knowledge

http://www.c0t0d0s0.org/archives/4075-Less-known-Solaris-features-RBAC-and-Privileges-Part-3-Privileges.html