Crash dump analysis – or who shot ssca01?

So, I had a system fail with

Sep  7 17:30:48 ssca01 SC Alert: [ID 665947 daemon.notice] Audit | minor: root : Close Session : object = "/SP/session/type" : value = "shell" : success
 Sep  7 17:31:40 ssca01 unix: [ID 836849 kern.notice]
 Sep  7 17:31:40 ssca01 ^Mpanic[cpu243]/thread=302adde5c20:
 Sep  7 17:31:40 ssca01 unix: [ID 156897 kern.notice] forced crash dump initiated at user request

So I need to know which process requested the crash dump.

root@ssca01: cd /var/crash
root@ssca01:/var/crash# file *.0
 vmdump.0:       SunOS 5.11 11.0 64-bit SPARC compressed crash dump from 'ssca01'

First unpack the compressed crash dump

root@ssca01:/var/crash# savecore -vf vmdump.0
root@ssca01:/var/crash# mdb -k unix.0 vmcore.0
 Loading modules: [ unix genunix specfs dtrace zfs scsi_vhci sd mpt_sas mac px ldc crypto ip hook neti arp usba kssl sockfs qlc fctl random niumx idm fcp cpc mdesc fcip logindmux ptm sppp nsmb ufs ipc nfs ]
 > :: status
 mdb: syntax error near ":"
 > ::status
 debugging crash dump vmcore.0 (64-bit) from ssca01
 operating system: 5.11 11.0 (sun4v)
 image uuid: 8d9326c5-a01c-e66e-d5dc-fe299173999d
 panic message: forced crash dump initiated at user request
 dump content: kernel pages only
> ::showrev
 Hostname: ssca01
 Release: 5.11
 Kernel architecture: sun4v
 Application architecture: sparcv9
 Kernel version: SunOS 5.11 sun4v 11.0
 Platform: sun4v
> ::panicinfo
 cpu              243
 thread      302adde5c20
 message forced crash dump initiated at user request
 tstate       9900001605
 g1                4
 g2                4
 g3          193e800
 g4                1
 g5          183f800
 g6                0
 g7      302adde5c20
 o0          12b7820
 o1      2a11bb1b9b8
 o2          7100000
 o3               32
 o4                2
 o5      2a11bb1b9d8
 o6      2a11bb1b081
 o7          107ed4c
 pc          105650c
 npc          1056510
 y                0

Then you can use this shell script to dump the stack of every thread in the crash dump, grouped by process:

#!/usr/bin/env sh
# For every process in the dump, list the stacks of all of its threads.
# ::ps columns: S PID PPID PGID SID UID FLAGS ADDR NAME
echo "::ps" | mdb -k unix.0 vmcore.0 | \
 nawk '$8 !~ /ADDR/ {print $8" "$NF}' > /tmp/.core.$$
cat /dev/null > /tmp/core.$$
while read addr name; do
 echo "process name: ${name}" >> /tmp/core.$$
 # <proc address>::walk thread | ::findstack prints one stack per thread
 echo "${addr}::walk thread | ::findstack" | \
 mdb -k unix.0 vmcore.0 >> /tmp/core.$$
 echo >> /tmp/core.$$
 done < /tmp/.core.$$
rm /tmp/.core.$$
exit 0

Have a look at the generated listing and search for the address of the panicking thread (302adde5c20, from ::panicinfo)

vi /tmp/core.*

process name: cssdagent
 [snip of lots of threads]
stack pointer for thread 302adde5c20: 2a11bb1b081
 000002a11bb1b131 kadmin+0x5a0()
 000002a11bb1b201 uadmin+0x1c0()
 000002a11bb1b2d1 syscall_trap+0xac()

So the crash dump was initiated by cssdagent, part of Oracle Clusterware.
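
The search through the listing can also be scripted. This is only a minimal sketch, assuming the ::panicinfo output and the /tmp/core.* listing have exactly the layouts shown above; the function names panic_thread and find_thread_owner and the file name /tmp/core.12345 are made up for illustration.

```shell
# panic_thread: read ::panicinfo output on stdin and print the address
# of the panicking thread (the value on the "thread" line).
panic_thread() {
    awk '$1 == "thread" { print $2; exit }'
}

# find_thread_owner ADDR LISTING: print the name of the process whose
# thread stacks in LISTING include the thread at address ADDR.
find_thread_owner() {
    awk -v addr="$1" '
        /process name:/ { name = $NF }
        /stack pointer for thread/ && index($0, "thread " addr ":") {
            print name
            exit
        }
    ' "$2"
}

# Usage on the host holding the dump (not run here):
#   t=$(echo "::panicinfo" | mdb -k unix.0 vmcore.0 | panic_thread)
#   find_thread_owner "$t" /tmp/core.12345
```

With the listing above, find_thread_owner 302adde5c20 would print cssdagent.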

<to be continued>

Using tr to strip strange characters from a text file…

I had a text file that had several weird characters in it and I was struggling to remove them using substitution in vi.

^X This is my test file ^Y ^S It^Xs full of problems ^\ like this ^]

First find the octal values of the problem characters. Use od -c on a small sample of the problem text (seriously, not the whole file; the output is very difficult to read).

me@home2 # od -c mel.txt
0000000 030       T   h   i   s       i   s       m   y       t   e   s
0000020   t       f   i   l   e     031     023       I   t 030   s
0000040   f   u   l   l       o   f       p   r   o   b   l   e   m   s
0000060     034       l   i   k   e       t   h   i   s     035      \n
0000100

You can see from this that my problem characters are octal 030, 031, 023, 034 and 035.
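
If the od output is still too long to read by eye, the list of odd bytes can be pulled out mechanically. This is a rough sketch, assuming the file is otherwise plain ASCII; list_odd_bytes is a made-up helper name.

```shell
# list_odd_bytes FILE: print the octal value of each distinct byte in FILE
# that is neither printable ASCII nor a tab or newline.
list_odd_bytes() {
    LC_ALL=C tr -d '[:print:]\t\n' < "$1" |   # keep only the odd bytes
        od -An -v -b |                        # dump them as octal, no offsets
        tr -s ' ' '\n' |                      # one value per line
        grep . |                              # drop empty lines
        sort -u                               # distinct values only
}

# Usage:
#   list_odd_bytes mel.txt
```

For the sample file this should list 023, 030, 031, 034 and 035, one per line.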

From context I can guess what these characters should be: 030 and 031 are single quotes, 034 and 035 are double quotes and 023 is a hyphen (-).

I’m going to use tr '<string1>' '<string2>'. With this form of the tr command, each character found in <string1> is replaced by the character in the same position in <string2>. The control characters are written as backslash-octal escapes, and the only complication in my case is that I need to substitute in ' characters, so I’m going to specify those by their octal value too (\047).

cat mel.txt | tr '\023\034\035\031\030' '-""\047\047'
' This is my test file ' - It's full of problems " like this "

I’ve been left with some padding spaces around my replaced characters, but these are pretty simple to remove in vi.
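
Here is the same translation as a small, self-contained filter that can be tried without the original file. It is a sketch only: fix_quotes is a made-up name, and the sample line is rebuilt with printf from the octal values found above. The hyphen is placed last in the second string so that the argument does not start with a - and cannot be mistaken for an option.

```shell
# fix_quotes: filter stdin, mapping each control byte to the punctuation
# it stood for:
#   \030 and \031 -> '  (written as \047)
#   \034 and \035 -> "
#   \023          -> -
fix_quotes() {
    tr '\030\031\034\035\023' '\047\047""-'
}

# Rebuild the sample line and run it through the filter.
printf '\030 This is my test file \031 \023 It\030s full of problems \034 like this \035\n' |
    fix_quotes
```

On a real file this would be run as fix_quotes < mel.txt > mel.fixed, where mel.fixed is just an example output name.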

 

 

Zoning a Brocade Switch Using WWNs

Way back in the mists of time I used port-based zoning on Brocade switches. However, I started having problems with it on newer storage systems (almost certainly pilot error!). I needed to zone some switches for a customer’s piece of work, and this time I thought I’d get with the future and use WWN-based zoning.

So, in my setup I have 2 hosts, each with 2 connections per switch, and 2 storage arrays, each with 1 connection to the switch.

swd77:admin> switchshow
switchName:     swd77
switchType:     34.0
switchState:    Online
switchMode:     Native
switchRole:     Principal
switchDomain:   1
switchId:       fffc01
switchWwn:      10:00:00:05:1e:02:a2:08
zoning:         OFF
switchBeacon:   OFF

Area Port Media Speed State
==============================
  0   0   id    N4   Online    F-Port  20:14:00:a0:b8:29:f5:56 <- Storage Array 1
  1   1   id    N4   Online    F-Port  20:16:00:a0:b8:29:cd:b4 <- Storage Array 2
  2   2   id    N4   Online    F-Port  21:00:00:24:ff:20:3a:f6 <- Host A
  3   3   id    N4   Online    F-Port  21:00:00:24:ff:20:3a:e0 <- Host A
  4   4   --    N4   No_Module
  5   5   --    N4   No_Module
  6   6   id    N4   No_Light
  7   7   id    N4   No_Light
  8   8   id    N4   Online    F-Port  21:00:00:24:ff:20:3b:92 <- Host B
  9   9   id    N4   Online    F-Port  21:00:00:24:ff:25:6d:ac <- Host B
 10  10   id    N4   No_Light
 11  11   id    N4   No_Light
 12  12   id    N4   No_Light
 13  13   id    N4   No_Light
 14  14   --    N4   No_Module
 15  15   --    N4   No_Module

Create aliases for your hosts and storage arrays

swd77:admin> alicreate host1_a,"21:00:00:24:ff:20:3a:f6"
swd77:admin> alicreate host1_b,"21:00:00:24:ff:20:3a:e0"
swd77:admin> alicreate host2_a,"21:00:00:24:ff:20:3b:92"
swd77:admin> alicreate host2_b,"21:00:00:24:ff:25:6d:ac"
swd77:admin> alicreate "a6140","20:14:00:a0:b8:29:f5:56"
swd77:admin> alicreate "b6140","20:16:00:a0:b8:29:cd:b4"

Create zones containing your aliases, one zone per host port

swd77:admin> zonecreate "port2","host1_a; a6140; b6140"
swd77:admin> zonecreate "port3","host1_b; a6140; b6140"
swd77:admin> zonecreate "port8","host2_a;  a6140; b6140"
swd77:admin> zonecreate "port9","host2_b;  a6140; b6140"

Create a configuration for your zones and save it

swd77:admin> cfgcreate "customer1","port2; port3; port8; port9"
swd77:admin> cfgsave
You are about to save the Defined zoning configuration. This
action will only save the changes on Defined configuration.
Any changes made on the Effective configuration will not
take effect until it is re-enabled.
Do you want to save Defined zoning configuration only?  (yes, y, no, n): [no] yes
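
With more than a handful of host ports, typing these commands by hand gets error-prone. As a rough sketch, the command list could be generated from a table of aliases and WWNs and reviewed before pasting into the switch session. Everything here is illustrative: gen_zoning, the hostA_p2/hostA_p3 alias names and the z_<alias> zone naming are assumptions, not something the switch requires.

```shell
# gen_zoning CFG "TARGET_ALIASES": read "alias wwn" pairs for host ports on
# stdin and print alicreate/zonecreate/cfgcreate commands for review.
# Each host port gets its own zone, named z_<alias>, containing that port
# plus the already-defined storage aliases given in TARGET_ALIASES.
gen_zoning() {
    cfg=$1
    targets=$2
    zones=
    while read -r alias wwn; do
        [ -n "$alias" ] || continue            # skip blank lines
        printf 'alicreate "%s","%s"\n' "$alias" "$wwn"
        printf 'zonecreate "z_%s","%s; %s"\n' "$alias" "$alias" "$targets"
        zones="${zones:+$zones; }z_$alias"     # build the zone list for cfgcreate
    done
    printf 'cfgcreate "%s","%s"\n' "$cfg" "$zones"
}

# Example: the two Host A ports from the switchshow output above.
gen_zoning customer1 "a6140; b6140" <<'EOF'
hostA_p2 21:00:00:24:ff:20:3a:f6
hostA_p3 21:00:00:24:ff:20:3a:e0
EOF
```

The output is plain text, so nothing is changed on the switch until the generated lines are run there, followed by cfgsave and cfgenable as shown in this post.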

When you’re happy with your configuration, enable it.

swd77:admin> cfgenable customer1
You are about to enable a new zoning configuration.
This action will replace the old zoning configuration with the
current configuration selected.
Do you want to enable 'customer1' configuration  (yes, y, no, n): [no] y
zone config "customer1" is in effect
Updating flash ...

Check at the OS level to see if you can see all your required volumes.

Memory usage on Solaris 10

I needed to try and work out what was using the memory on one of my Solaris 10 boxes under test.

Check that no-one has created any large files under /tmp. By default on Solaris /tmp is mounted on swap, so any large files created in that directory will consume memory/swap until removed.
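
A quick way to spot large files there is to sort the top-level entries by size. This is a minimal sketch; largest_entries is a made-up helper, and it only looks at non-hidden entries directly under the directory.

```shell
# largest_entries [DIR] [COUNT]: list the COUNT largest entries directly
# under DIR (default /tmp, 10 entries), sizes in KB, largest first.
# Hidden files (names starting with a dot) are not matched by the glob.
largest_entries() {
    dir=${1:-/tmp}
    count=${2:-10}
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}

# Usage:
#   largest_entries /tmp 5
```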

Use prstat to display processes ordered by memory used, and show a summary output of memory consumption by user.

prstat -a -s rss

root@ellg-v140-114 # prstat -a -s rss
   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP      
 26692 tango    2077M 2064M sleep   59    0   0:00:02 0.0% M3UA_stack/7
 26702 tango    2077M 2064M sleep   59    0   0:00:02 0.0% M3UA_stack/7
 26676 tango     774M  771M sleep   59    0   0:00:01 0.0% msc/3
 26678 tango     774M  771M sleep   49    0   0:00:01 0.0% msc/3
 26675 tango     774M  771M sleep   59    0   0:00:01 0.0% msc/3
 26677 tango     774M  771M sleep   59    0   0:00:01 0.0% msc/3
 26681 tango     768M  765M sleep   59    0   0:00:01 0.0% msc/3
 26682 tango     768M  765M sleep   59    0   0:00:01 0.0% msc/3
 26664 tango     744M  740M sleep   59    0   0:00:01 0.0% MAP_router2/3
 26665 tango     744M  740M sleep   59    0   0:00:01 0.0% MAP_router2/3
 26666 tango     744M  740M sleep   59    0   0:00:01 0.0% MAP_router2/3
 26669 tango     744M  740M sleep   59    0   0:00:01 0.0% MAP_router2/3
 26668 tango     744M  740M sleep   59    0   0:00:01 0.0% MAP_router2/3
 26667 tango     744M  740M sleep   59    0   0:00:01 0.0% MAP_router2/3
 26684 tango     343M  340M sleep   59    0   0:00:00 0.0% hlr/3
 NPROC USERNAME  SWAP   RSS MEMORY      TIME  CPU                            
    44 tango      16G   16G    50%   0:00:48 0.6%
   113 root      234M  227M   0.7%   0:26:02 0.8%
     1 noaccess  135M  115M   0.4%   0:03:53 0.0%
     6 daemon     10M 7616K   0.0%   0:00:01 0.0%
     1 smmsp    1900K 4240K   0.0%   0:00:01 0.0%
Total: 165 processes, 386 lwps, load averages: 0.61, 0.66, 0.43

Be careful with this output, though. If your application has lots of processes attached to a single shared memory segment (e.g. Oracle), each of them will show up as a separate consumer of memory, with the full size of the shared segment counted in its RSS.

Use ::memstat in mdb to have a peek at how physical memory is split between the kernel, anonymous memory and the free lists
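
One way to see how much this double counting could matter is to compare, per process name, the largest single RSS with the naive sum over all processes of that name. This is a rough sketch, assuming input lines of the form "RSS_KB COMMAND" such as ps -eo rss,comm produces; rss_by_name is a made-up helper. If the processes share one big segment, the real footprint is likely much closer to the max figure than to the sum.

```shell
# rss_by_name: read "RSS_KB COMMAND" lines on stdin and print, for each
# command name, the process count, the largest single RSS and the naive
# sum of all RSS values (both in whole MB). Header lines are skipped.
rss_by_name() {
    awk '$1 ~ /^[0-9]+$/ {
             n[$2]++
             sum[$2] += $1
             if ($1 > max[$2]) max[$2] = $1
         }
         END {
             for (c in n)
                 printf "%s procs=%d max=%dMB sum=%dMB\n",
                        c, n[c], max[c] / 1024, sum[c] / 1024
         }' | sort
}

# Usage:
#   ps -eo rss,comm | rss_by_name
```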

root@ellg-v140-114 # echo "::memstat" |mdb -k
Page Summary                Pages                MB  %Tot
------------     ----------------  ----------------  ----
Kernel                     353282              1380    4%
Anon                      5920763             23127   71%
Exec and libs               14034                54    0%
Page cache                  32185               125    0%
Free (cachelist)            11157                43    0%
Free (freelist)           2052859              8018   24%

Total                     8384280             32751
Physical                  8175445             31935

If you’re using ZFS, check how much memory is being used by the ARC cache.

# kstat -m zfs |grep size
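
The grep above matches several statistics. To get just the current ARC size as a readable number, the single statistic can be requested in parseable form and converted from bytes. This is a sketch, assuming the statistic is named zfs:0:arcstats:size on your release; arc_size_mb is a made-up helper.

```shell
# arc_size_mb: read `kstat -p` style lines on stdin
# ("module:instance:name:statistic<TAB>value") and print the value of
# zfs:0:arcstats:size converted from bytes to whole megabytes.
arc_size_mb() {
    awk '$1 == "zfs:0:arcstats:size" { printf "%d\n", $2 / 1048576 }'
}

# Usage on a Solaris host (not run here):
#   kstat -p zfs:0:arcstats:size | arc_size_mb
```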

Carrot Cake

Well, not really technical, but here’s my favourite carrot cake recipe. I had to phone my grandma to get the recipe recently, and decided to share it.

6 fl oz/180ml sunflower oil
8 oz/230g soft brown sugar
5 oz/141g plain flour (wholemeal for preference)
5 oz/141g grated carrot
2 large eggs
2 oz/56g chopped walnuts (or chopped mixed nuts)
1 tsp/5ml ground nutmeg (or mixed spice if you like it more cinnamony)
1 tsp/5ml cinnamon
1 tsp/5ml bicarbonate of soda

Whisk together the oil and sugar until smooth.
Whisk in the eggs.
Mix the cinnamon, nutmeg, bicarbonate of soda and flour together.
Add the flour mix to the liquid mixture.
Beat in the carrots and nuts until evenly combined.

Bake in a 7″ round baking tin for 1 hour at 350°F/Gas Mark 4/180°C.

Moving your voting disks

11gR2 provides some good facilities for moving voting disks. http://download.oracle.com/docs/cd/E11882_01/rac.112/e10717/votocr.htm#BGBBIGJH describes the various options.

root@c1718-3-50 # ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   23da5a8ec4fe4f7abf87392211a614ab (/dev/rdsk/c20t600A0B8000475BDC0000F5CA4C03684Cd0s6) [DATA]
Located 1 voting disk(s).

I need to migrate the voting disks from disk group +DATA, which has external redundancy (hence only a single voting disk), to disk group +DATA1, which has normal redundancy

root@c1718-3-50 # ./crsctl replace votedisk +DATA1

That gives me the following voting disks…

root@c1718-3-50 # ./crsctl query css votedisk
##  STATE    File Universal Id                File Name Disk group
--  -----    -----------------                --------- ---------
 1. ONLINE   23da5a8ec4fe4f7abf36197331a614ff (/dev/rdsk/c20t600A0B8000475BDC0000F5CA4C063B1Dd0s6) [DATA1]
 2. ONLINE   02e77b8b7af04f6dbfb0bd7af1604eee (/dev/rdsk/c20t600A0B8000475BDC0000F5CD4C063B2Fd0s6) [DATA1]
 3. ONLINE   21a6741b7d0e4fafbf7a81d22c177521 (/dev/rdsk/c20t600A0B8000475BDC0000F5D04C063B43d0s6) [DATA1]
Located 3 voting disk(s).
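
If the move is part of a scripted change, the result can be checked rather than eyeballed. This is a minimal sketch that assumes the query output keeps the column layout shown above; count_votedisks is a made-up helper.

```shell
# count_votedisks GROUP: read `crsctl query css votedisk` output on stdin
# and print how many voting disks are ONLINE in disk group GROUP.
count_votedisks() {
    awk -v dg="[$1]" '
        $2 == "ONLINE" && $NF == dg { n++ }
        END { print n + 0 }
    '
}

# Usage on a cluster node (not run here):
#   ./crsctl query css votedisk | count_votedisks DATA1
```

For the output above this would print 3, the number of voting disks expected in a normal redundancy disk group.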

Nokia – Free Navigation for all?

There has been a lot of buzz about Nokia making their navigation free forever.

However, it isn’t that simple. Unless you are on the short list of phones that support Ovi Maps 3.0.3, you don’t get the free navigation at all. This means that if you have a smartphone such as the E71 you miss out, and there are some rumours that you cannot even buy new navigation licenses.

Next time, no Nokia for me.