Tag Archives: SuperCluster

Adding an Oracle Exadata Storage Server to Enterprise Manager using the command line

 

Ok, I’m just noodling around here… I have some ‘spare’ storage servers that are in the same fabric as my SuperCluster and I wanted to discover them in EM.

oracle@odc-em-sc7a:/u01/app/oracle/agent13/agent_inst/sysman/log$ emcli add_target -type=oracle_exadata -name="expbcel09.osc.uk.oracle.com" -host="sc7ach00pd00-d1.osc.uk.oracle.com" -properties="CellName:expbcel09.osc.uk.oracle.com;MgmtIPAddr:138.3.3.82"
Target “expbcel09.osc.uk.oracle.com:oracle_exadata” added successfully

Expanding a zpool backed by an iSCSI LUN

So, you have a zpool backed by an iSCSI LUN which is tight on space, and you’ve done all the tidying you can think of... what do you do next? Well, if you’re lucky, you have space to expand the iSCSI LUN and then make the extra space available to your zpool.

First – find the LUN that holds the zpool using zpool status <poolname>

 

# zpool status zonepool
  pool: zonepool
 state: ONLINE
  scan: none requested
config:

        NAME                                     STATE     READ WRITE CKSUM
        zonepool                                 ONLINE       0     0     0
          c0t600144F0A22997000000574BFAA90004d0  ONLINE       0     0     0

errors: No known data errors

Note the LUN identifier, starting with ‘c0’ and ending with ‘d0’.
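The GUID you will match on the appliance is the middle of that device name; stripping the ‘c0t’ prefix and the ‘d0’ suffix recovers it (a portable-shell sketch):

```shell
# Device name as reported by zpool status; the middle section is the LUN GUID
dev=c0t600144F0A22997000000574BFAA90004d0

# Strip the leading controller/target prefix and the trailing slice suffix
guid=$(echo "$dev" | sed -e 's/^c[0-9]*t//' -e 's/d[0-9]*$//')
echo "$guid"
# → 600144F0A22997000000574BFAA90004
```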

Locate the LUN on your storage appliance. If you are on a ZFS appliance there is a really handy script in Oracle Support Document 1921605.1. Otherwise you’ll have to use the tools supplied with your storage array, or your eyes 😉

So, I’ve located my LUN on my ZFS appliance by matching the LUN identifier, and then I need to change the LUN size.

shares> select sc1-myfs
shares sc1-myfs> 
shares sc1-myfs> select zoneshares_zonepool 
shares sc1-myfs/zoneshares_zonepool> get lunguid
 lunguid = 600144F0A22997000000574BFAA90004
shares sc1-myfs/zoneshares_zonepool> set volsize=500G
 volsize = 500G (uncommitted)
shares sc1-myfs/zoneshares_zonepool> commit

 

Now I just need to get my zpool to expand into the newly available space on the LUN:

# zpool online -e zonepool c0t600144F0A22997000000574BFAA90004d0

And now we’re done

 

Creating an Application Zone on a SuperCluster

Application Zones on a SuperCluster Solaris 11 LDOM are subject to far fewer restrictions than the Exadata Database Zones. This also means that the documentation is less prescriptive and detailed.

This post shows a simple Solaris 11 zone creation; it is meant for example purposes only, not as a product. I am going to use a T5 SuperCluster for this walkthrough. The main differences you will need to consider for an M7 SuperCluster are:

  1. both heads of the ZFS-ES are active so you will need to select the correct head and infiniband interface name.
  2. there is only 1 QGBE card available per PDOM. This means you may need to present vnics from the domain that owns the card for the management network if you require this connectivity.

 

Considerations

As per note 2041460.1, the best practice for zone root filesystems is to create 1 LUN per LDOM, create a zpool on it, and create a filesystem in this shared pool for each application zone. Reservations and quotas can be used to prevent a zone from using more than its share.

You need to make sure you calculate the minimum number of cores required for the global zone, as per note 1625055.1.

You also need to make sure that the IPS repos are all available, and that any IDRs you have applied to your global zone are available.

Preparation


Put entries into the global zone’s hosts file for your new zone. I will use 3 addresses: one for the 1Gbit management network, one for the 10Gbit client network, and one for the InfiniBand network on the storage partition (p8503).

 

10.10.14.15     sc5bcn01-d4.blah.mydomain.com   sc5bcn01-d4
10.10.10.78     sc5b01-d4.blah.mydomain.com sc5b01-d4
192.168.28.10   sc5b01-d4-storIB.blah.mydomain.com      sc5102-d4-storIB

 

Create an iSCSI LUN for the zone root filesystem if you do not already have one defined to hold zone roots. I am going to use the iscsi-lun.sh script that is designed for use by the tools which create the Exadata Database Zones. The good thing about using this is that it follows the naming convention etc. used for the other zones. However, it is not installed by default on Application Zones (it is provided by the system/platform/supercluster/iscsi package in the exa-family repository) and this is not a supported use of the script.

  • -z is the name of my ZFS-ES.
  • -i is the 1Gbit hostname of my global zone.
  • -n and -N are used by the exavm utility to create the LUNs. In our case they will both be set to 1.
  • -s is the size of the LUN to be created.
  • -l is the volume block size. I have selected 32K; you may have performance metrics that lead you to a different block size.
root@sc5bcn01-d3:/opt/oracle.supercluster/bin# ./iscsi-lun.sh create  \
-z sc5bsn01 -i sc5bcn01-d3  -n 1 -N 1 -s 500G -l 32K
Verifying sc5bcn01-d3 is an initiator node
The authenticity of host 'sc5bcn01-d3 (10.10.14.14)' can't be established.
RSA key fingerprint is 72:e6:d1:a1:be:a3:b3:d9:96:ea:77:61:bd:c7:f8:de.
Are you sure you want to continue connecting (yes/no)? yes
Password: 
Getting IP address of IB interface ipmp1 on sc5bsn01
Password: 
Setting up iscsi service on sc5bcn01-d3
Password: 
Setting up san object(s) and lun(s) for sc5bcn01-d3 on sc5bsn01
Password: 
Setting up iscsi devices on sc5bcn01-d3
Password: 
c0t600144F0F0C4EECD00005436848B0001d0 has been formatted and ready to use

Create a zpool to hold all of your zone roots

root@sc5bcn01-d3:/# zpool create zoneroots c0t600144F0F0C4EECD00005436848B0001d0

Now create a filesystem for your zone root and set a quota on it (optional).

root@sc5bcn01-d3:/# zfs create zoneroots/sc5b01-d4-rpool 
root@sc5bcn01-d3:/# zfs set quota=100G zoneroots/sc5b01-d4-rpool

Create partitions so your zone can access the IB storage network (optional, but nice to have, and my example will include them). First locate the interfaces that have access to the IB storage network partition (PKEY=8503) using dladm, and then create partitions over these interfaces.

root@sc5bcn01-d3:~# dladm show-part
LINK         PKEY  OVER         STATE    FLAGS
stor_ipmp0_0 8503  net7         up       f---
stor_ipmp0_1 8503  net8         up       f---
bondib0_0    FFFF  net8         up       f---
bondib0_1    FFFF  net7         up       f---
root@sc5bcn01-d3:~# dladm create-part -l net8 -P 8503 sc5b01d4_net8_p8503
root@sc5bcn01-d3:~# dladm create-part -l net7 -P 8503 sc5b01d4_net7_p8503

Create the Zone

Prepare your zone configuration file; here is mine. Note that I have used non-standard link names to make it more readable. You will need to use ipadm to determine the lower-link names that match your system.

create -b
set brand=solaris
set zonepath=/zoneroots/sc5b01-d4-rpool
set autoboot=true
set ip-type=exclusive
add net
set configure-allowed-address=true
set physical=sc5b01d4_net7_p8503
end
add net
set configure-allowed-address=true
set physical=sc5b01d4_net8_p8503
end
add anet
set linkname=net0
set lower-link=auto
set configure-allowed-address=true
set link-protection=mac-nospoof
set mac-address=random
end
add anet
set linkname=mgmt0
set lower-link=net0
set configure-allowed-address=true
set link-protection=mac-nospoof
set mac-address=random
end
add anet
set linkname=mgmt1
set lower-link=net1
set configure-allowed-address=true
set link-protection=mac-nospoof
set mac-address=random
end
add anet
set linkname=client0
set lower-link=net2
set configure-allowed-address=true
set link-protection=mac-nospoof
set mac-address=random
end
add anet
set linkname=client1
set lower-link=net5
set configure-allowed-address=true
set link-protection=mac-nospoof
set mac-address=random
end

 

Implement the zone configuration using your pre-configured file, or type it in manually.

root@sc5bcn01-d3:~# zonecfg -z sc5b01-d4 -f <yourzonefile>

 

Install the zone. Optionally you can specify a template to install required packages on top of the standard solaris-small-server group, or specify another package group. I base this on the standard XML file used by zone installs and customize the <software_data> section (see this blog post https://blogs.oracle.com/zoneszone/entry/automating_custom_software_installation_in for an example).

root@sc5bcn01-d3:~# cp /usr/share/auto_install/manifest/zone_default.xml myzone.xml
root@sc5bcn01-d3:~# zoneadm -z sc5b01-d4 install -m myzone.xml

Next, boot the zone and use zlogin -C to log in to the console and answer the usual Solaris configuration questions about the root password, timezone and locale. I do not usually configure the networking at this time, and add it later instead.

root@sc5bcn01-d3:~# zoneadm -z sc5b01-d4 boot
root@sc5bcn01-d3:~# zlogin -C sc5b01-d4

Create the required networking

# ipadm create-ip  mgmt0
# ipadm create-ip  mgmt1
# ipadm create-ip  client1
# ipadm create-ip  client0
# ipadm create-ipmp -i mgmt0 -i mgmt1 scm_ipmp0
# ipadm create-ipmp -i client0 -i client1 sc_ipmp0
# ipadm create-addr -T static -a local=10.10.10.78/22 sc_ipmp0/v4
# ipadm create-addr -T static -a local=10.10.14.15/24 scm_ipmp0/v4
# route -p add default 10.10.8.1
# ipadm create-ip sc5b01d4_net8_p8503
# ipadm create-ip sc5b01d4_net7_p8503
# ipadm create-ipmp -i sc5b01d4_net8_p8503 -i sc5b01d4_net7_p8503 stor_ipmp0
# ipadm set-ifprop -p standby=on -m ip sc5b01d4_net8_p8503
# ipadm create-addr -T static -a local=192.168.28.10/22 stor_ipmp0/v4

Optional Post Install steps

Root Login

Allow root to login over ssh by editing /etc/ssh/sshd_config and changing ‘PermitRootLogin no’ to ‘PermitRootLogin yes’.
# svcadm restart ssh
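The same edit can be scripted with sed; a sketch against a throwaway sample file rather than the live config (note that sshd_config separates keyword and value with a space, not ‘=’):

```shell
# Practise on a throwaway copy rather than the live /etc/ssh/sshd_config
printf 'PermitRootLogin no\nPasswordAuthentication yes\n' > /tmp/sshd_config.sample

# Flip only the PermitRootLogin line; everything else passes through untouched
sed 's/^PermitRootLogin no$/PermitRootLogin yes/' /tmp/sshd_config.sample
# → PermitRootLogin yes
# → PasswordAuthentication yes
```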

Configure DNS support

# svccfg -s dns/client setprop config/search = astring: "blah.mydomain.com"
# svccfg -s dns/client setprop config/nameserver = net_address: \(10.10.34.4 10.10.34.5\)
# svccfg -s dns/client refresh 
# svccfg -s dns/client:default  validate
# svccfg -s dns/client:default  refresh 
# svccfg -s /system/name-service/switch setprop config/default = astring: \"files dns\"
# svccfg -s system/name-service/switch:default refresh
# svcadm enable dns/client

 

 Resource Capping

At the time of writing (20/04/16) virtual and physical memory capping is not supported on SuperCluster. This is mentioned in Oracle Support Document 1452277.1 (SuperCluster Critical Issues) as issue SOL_11_1.

Creating Processor sets and associating with your zone

See more detail about pools and processor sets in my earlier blog posts, and of course in the Solaris 11.3 manuals.

A quick summary of the commands follows.

This creates a fixed size processor set, consisting of 64 threads.

poolcfg -c "create pset pset_sc5bcn02-d4.osc.uk.oracle.com_id_6160 (uint pset.min = 64; uint pset.max = 64)" /etc/pooladm.conf

Then a pool is created, and associated with the processor set.

poolcfg -c "create pool pool_sc5bcn02-d4.osc.uk.oracle.com_id_6160" /etc/pooladm.conf
poolcfg -c "associate pool pool_sc5bcn02-d4.osc.uk.oracle.com_id_6160 (pset pset_sc5bcn02-d4.osc.uk.oracle.com_id_6160)" /etc/pooladm.conf
poolcfg -c 'modify pool pool_sc5bcn02-d4.osc.uk.oracle.com_id_6160 (string pool.scheduler="TS")' /etc/pooladm.conf

Enable the pool configuration saved in /etc/pooladm.conf

pooladm -c

modify the zone config to set the pool

zonecfg -z sc5bcn02-d4
zonecfg:sc5bcn02-d4> set pool=pool_sc5bcn02-d4.osc.uk.oracle.com_id_6160
zonecfg:sc5bcn02-d4> verify
zonecfg:sc5bcn02-d4> commit

Then you can stop and restart the zone to associate it with the processor set.

Keeping ssh sessions alive for longer

The default sshd configuration on SuperCluster seems to have changed to a more secure setup, which is great in a production type environment but sucks in my gung-ho lab environment.

The things I like to re-enable in my /etc/ssh/sshd_config are

X11Forwarding yes
KeepAlive yes
ClientAliveInterval 30 
ClientAliveCountMax 99999

 

Then I stop and restart sshd and my life becomes a lot easier.

# svcadm restart ssh

Changing the number of CPUs in a zone’s processor set (pooladm) Solaris 11.1

This post is mainly related to SuperCluster configuration, but can be applied to other solaris based systems.

In SuperCluster you can run the Oracle RDBMS in zones, and the zones are CPU capped. You may want to change the number of CPUs assigned to your zone for a couple of reasons:

1) You have changed the number of CPUs available in the LDOM supporting this zone, using a tool such as setcoremem, and want to resize the zone to take this into account.

2) You need to change the zone size because you need more or less capacity.

Determine the number of processors that need to be reserved for the global zone (and all of your other zones!)


For SuperCluster you should reserve a minimum of 2 cores per IB HCA in domains with a single HCA, and a minimum of 4 cores in domains with 2 or more HCAs. You also need to take into account any other zones running on the system.


Find out how many CPUs are in the global zone

# psrinfo -pv

The physical processor has 16 cores and 128 virtual processors (0-127)

(... snipped output)
The core has 8 virtual processors (488-495)
The core has 8 virtual processors (496-503)
The core has 8 virtual processors (504-511)
SPARC-T5 (chipid 3, clock 3600 MHz)

So in my case there are 512 virtual CPUs, of which I need to reserve at least 32 (4 cores x 8 threads) for my global zone, and as I will also have an app zone on there, possibly a lot more. So let’s say I’m going to keep 16 cores in the global zone; that would leave 384 virtual CPUs for my dbzone.
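The sizing sum above is simple enough to script; a quick shell sketch of the arithmetic:

```shell
# Figures from the psrinfo -pv output above: 512 virtual processors in total
total_vcpus=512
threads_per_core=8
global_cores=16               # cores I have decided to keep for the global zone

# Whatever is not reserved for the global zone is available to the dbzone
dbzone_vcpus=$((total_vcpus - global_cores * threads_per_core))
echo "$dbzone_vcpus"
# → 384
```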

Get the name of the pool

The SuperCluster DB zone creation tools create the pools with logical names based on the zone name. However, the way to be sure which pool is in use is to look at the zone’s definition:

# zonecfg -z sc5acn01-d5 export |grep pool
set pool=pool_sc5acn01-d5.blah.uk.mydomain.com_id_6135

If this doesn’t return the name of a pool, your zone is not using resource pools and may be using one of the other methods of capping CPU usage, such as allocating dedicated CPU resources (ncpus=X). If so, stop reading here, as this is not the blog post you are looking for.

Find the processor set associated with your pool by either looking in /etc/pooladm.conf or by running pooladm with no parameters to get the current running config. Checking in pooladm is WAY more readable so that is my preferred method.

# pooladm |more

system default
string system.comment
int system.version 1
boolean system.bind-default true
string system.poold.objectives wt-load

pool pool_sc5acn01-d5.blah.uk.mydomain.com_id_6135
int pool.sys_id 1
boolean pool.active true
boolean pool.default false
string pool.scheduler TS
int pool.importance 1
string pool.comment
pset pset_sc5acn01-d5.blah.uk.mydomain.com_id_6135

Here’s an extract of my pooladm.conf

<pool name="pool_sc5acn01-d5.blah.uk.mydomain.com_id_6135" active="true" default="false" importance="1" comment="" res="pset_1" ref_id="pool_1">
<property name="pool.sys_id" type="int">1</property>
<property name="pool.scheduler" type="string">TS</property>
</pool>
<pool name="pool_default" active="true" default="true" importance="1" comment="" res="pset_-1" ref_id="pool_0">
<property name="pool.sys_id" type="int">0</property>
<property name="pool.scheduler" type="string">TS</property>
</pool>
<res_comp type="pset" sys_id="1" name="pset_sc5acn01-d5.blah.uk.mydomain.com_id_6135" default="false" min="8" max="8" units="population" comment="" ref_id="pset_1">

It looks to me that the res="pset_1" in the pool definition points to the ref_id="pset_1" in the pset definition.

OK, so now I know my pset is called pset_sc5acn01-d5.blah.uk.mydomain.com_id_6135 and it currently has 8 CPUs. I also know that my running config and my persistent config are synchronised.

Shutdown the database zone.

This may not be necessary, but since Oracle can make assumptions based on CPU count at startup, I think it is safest.

# zoneadm -z sc5acn01-d5 shutdown

Change the pset configuration

I’m going to do this by changing the config file to make the change persistent, as there’s nothing more embarrassing than making a change that is lost on a reboot. I set the processor set to have a minimum of 384 CPUs and a maximum of 384 CPUs.

# poolcfg -c 'modify pset pset_sc5acn01-d5.blah.uk.mydomain.com_id_6135 ( uint pset.min=384 ; uint pset.max=384 )' /etc/pooladm.conf

Check that it has applied to your config file
# grep pset_sc5acn01-d5 /etc/pooladm.conf

Force it to re-read the file and use the new configuration

# pooladm -c
# pooladm -s

Now you can run pooladm without any arguments and get the running config. If it all looks ok, go ahead and boot your zone

# pooladm

(snipped output)


pset pset_sc5acn01-d5.blah.uk.mydomain.com_id_6135
int pset.sys_id 1
boolean pset.default false
uint pset.min 384
uint pset.max 384
string pset.units population
uint pset.load 723
uint pset.size 384
string pset.comment

Creating an Infiniband listener on SuperCluster

There is a common IB network between all nodes in a SuperCluster. This is primarily used for access to the ZFS storage appliance, but it can also be used to provide IB access to the database.

I have a 2 node RAC cluster, and both nodes have a network adapter similar to this

# ifconfig stor_ipmp0 
stor_ipmp0: flags=108001000843 mtu 65520 index 2
        inet 192.168.28.2 netmask fffffc00 broadcast 192.168.31.255
        groupname stor_ipmp0

First, create entries for the IB VIPs in /etc/hosts on all nodes in the cluster and on the hosts that want to access the IB listener (e.g. Exalogic, or application nodes in the SuperCluster).

192.168.28.102  sc5a01-d1-ibvip sc5a01-d1-ibvip.osc.uk.oracle.com
192.168.28.103  sc5a02-d1-ibvip sc5a02-d1-ibvip.osc.uk.oracle.com

Create the required Oracle network resources on one node of the cluster. If you’re uncertain about the subnet to specify, you can use one of the online calculators such as http://www.subnet-calculator.com/
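If you prefer to stay at the command line, the network address is just the IP with the host bits cleared; a sketch in portable awk (the ipv4_network helper name is mine):

```shell
# Hypothetical helper: clear the host bits of an IPv4 address for a prefix length
ipv4_network() {
  echo "$1" | awk -F. -v p="$2" '{
    addr = $1 * 16777216 + $2 * 65536 + $3 * 256 + $4
    block = 2 ^ (32 - p)                # addresses per subnet of this size
    net = int(addr / block) * block     # round down to the start of the subnet
    printf "%d.%d.%d.%d\n", int(net / 16777216) % 256, int(net / 65536) % 256,
                            int(net / 256) % 256, net % 256
  }'
}

ipv4_network 192.168.28.2 22
# → 192.168.28.0
```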

Define the network interface as a global public cluster interface as root

# /u01/app/11.2.0.3/grid/bin/oifcfg setif -global stor_ipmp0/192.168.28.0:public


# /u01/app/11.2.0.3/grid/bin/oifcfg getif                                       
bondeth0  138.3.16.0  global  public
bondib0  192.168.0.0  global  cluster_interconnect
bondib1  192.168.0.0  global  cluster_interconnect
bondib2  192.168.0.0  global  cluster_interconnect
bondib3  192.168.0.0  global  cluster_interconnect
stor_ipmp0  192.168.28.0  global  public

Create the new network resource in the grid infrastructure as root

# /u01/app/11.2.0.3/grid/bin/srvctl add network -k 2 -S 192.168.28.0/255.255.252.0/stor_ipmp0

# ./srvctl config network
Network exists: 1/138.3.16.0/255.255.254.0/bondeth0, type static
Network exists: 2/192.168.28.0/255.255.252.0/stor_ipmp0, type static

Create your VIP resources as root

# ./srvctl add vip -n sc5acn01-d1 -A sc5a01-d1-ibvip/255.255.252.0/stor_ipmp0 -k 2
# ./srvctl add vip -n sc5acn02-d1 -A sc5a02-d1-ibvip/255.255.252.0/stor_ipmp0 -k 2

Create IB Listener as the grid home owner

$ srvctl add listener -l LISTENER_IB -k 2 -p "TCP:1522,/SDP:1522" -s

Add entries to the grid home tnsnames.ora. There are slightly different configurations per node, as we will explicitly name the ‘local’ and ‘remote’ nodes. If we had more than 2 nodes, then our entries for LISTENER_IBREMOTE and LISTENER_IPREMOTE would list all of the non-local nodes.

Node 1

## IB Listener configuration


DBM01_IB =
  (DESCRIPTION =
    (LOAD_BALANCE=on)
    (ADDRESS = (PROTOCOL = TCP)(HOST = sc5a01-d1-ibvip)(PORT = 1522))
    (ADDRESS = (PROTOCOL = TCP)(HOST = sc5a02-d1-ibvip)(PORT = 1522))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = dbm01)
    )
  )

LISTENER_IBREMOTE =
  (DESCRIPTION =
        (ADDRESS_LIST = 
            (ADDRESS = (PROTOCOL = TCP)(HOST = sc5a02-d1-ibvip)(PORT = 1522))
    )
  )  

LISTENER_IBLOCAL =
  (DESCRIPTION =
        (ADDRESS_LIST = 
            (ADDRESS = (PROTOCOL = TCP)(HOST = sc5a01-d1-ibvip)(PORT = 1522))
            (ADDRESS = (PROTOCOL = SDP)(HOST = sc5a01-d1-ibvip)(PORT = 1522))
    )
  )
LISTENER_IPLOCAL =
  (DESCRIPTION =
        (ADDRESS_LIST = 
            (ADDRESS = (PROTOCOL = TCP)(HOST = sc5a01-d1-vip)(PORT = 1521))
    )
  )
LISTENER_IPREMOTE =
  (DESCRIPTION =
        (ADDRESS_LIST = 
            (ADDRESS = (PROTOCOL = TCP)(HOST = sc5a02-d1-vip)(PORT = 1521))
    )
  )

Node 2


## IB Listener configuration


DBM01_IB =
  (DESCRIPTION =
    (LOAD_BALANCE=on)
    (ADDRESS = (PROTOCOL = TCP)(HOST = sc5a01-d1-ibvip)(PORT = 1522))
    (ADDRESS = (PROTOCOL = TCP)(HOST = sc5a02-d1-ibvip)(PORT = 1522))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = dbm01)
    )
  )

LISTENER_IBREMOTE =
  (DESCRIPTION =
        (ADDRESS_LIST = 
            (ADDRESS = (PROTOCOL = TCP)(HOST = sc5a01-d1-ibvip)(PORT = 1522))
    )
  )

LISTENER_IBLOCAL =
  (DESCRIPTION =
        (ADDRESS_LIST = 
            (ADDRESS = (PROTOCOL = TCP)(HOST = sc5a02-d1-ibvip)(PORT = 1522))
            (ADDRESS = (PROTOCOL = SDP)(HOST = sc5a02-d1-ibvip)(PORT = 1522))
    )
  )
LISTENER_IPLOCAL =
  (DESCRIPTION =
        (ADDRESS_LIST = 
            (ADDRESS = (PROTOCOL = TCP)(HOST = sc5a02-d1-vip)(PORT = 1521))
    )
  )
LISTENER_IPREMOTE =
  (DESCRIPTION =
        (ADDRESS_LIST = 
            (ADDRESS = (PROTOCOL = TCP)(HOST = sc5a01-d1-vip)(PORT = 1521))
    )
  )

Start your new listener as the grid home owner

grid@sc5acn01-d1:/u01/app/11.2.0.3/grid/network/admin$ srvctl start listener -l LISTENER_IB

Copy your new tnsnames.ora entries to your $ORACLE_HOME/network/admin/tnsnames.ora

Log in to your database as sysdba and set the listener_networks parameter:

SQL> alter system set listener_networks='((NAME=network2)(LOCAL_LISTENER=LISTENER_IBLOCAL)(REMOTE_LISTENER=LISTENER_IBREMOTE))','((NAME=network1)(LOCAL_LISTENER=LISTENER_IPLOCAL)(REMOTE_LISTENER=LISTENER_IPREMOTE))' scope=both sid='*';

This will cause the database to register with the new infiniband listener.

Creating a basic DNS Server in Solaris 11

Create a zone (optional)

I created a zone to hold my temporary DNS server so it was quick and easy to remove at the end of the testing

root@sc5acn02-d1:~# zfs create -o mountpoint=/zones rpool/zones
root@sc5acn02-d1:~# zonecfg -z dns-zone
Use 'create' to begin configuring a new zone.
zonecfg:dns-zone> create
create: Using system default template 'SYSdefault'
zonecfg:dns-zone> set zonepath=/zones/dns-zone
zonecfg:dns-zone> commit
zonecfg:dns-zone> exit
root@sc5acn02-d1:~# zoneadm -z dns-zone install
The following ZFS file system(s) have been created:
    rpool/zones/dns-zone
Progress being logged to /var/log/zones/zoneadm.20140523T153804Z.dns-zone.install
       Image: Preparing at /zones/dns-zone/root.



Boot the zone
# zoneadm -z dns-zone boot

Login to the console and setup the network interfaces
# zlogin -C dns-zone

Configure DNS

Install  the BIND dns package

root@dns-zone:/var/tmp# pkg install service/network/dns/bind

Use the h2n script to convert a hosts-file based setup to a BIND DNS setup (I got my copy from ftp://ftp.hpl.hp.com/pub/h2n/h2n.tar.gz).

./h2n -d load.melnet.net -n 138.3

Create your named.conf file

options {
        directory       "/etc/namedb/working";
        pid-file        "/var/run/named/pid";
        dump-file       "/var/dump/named_dump.db";
        statistics-file "/var/stats/named.stats";
};

zone "load.melnet.net" {
        type master;
        file "/etc/namedb/master/load.db";
};
zone "3.138.in-addr.arpa" {
        type master;
        file "/etc/namedb/master/3.138.db";
};

root@dns-zone:/var/tmp# mkdir -p /etc/namedb/working
root@dns-zone:/var/tmp# mkdir /var/run/named
root@dns-zone:/var/tmp# mkdir -p /var/dump
root@dns-zone:/var/tmp# mkdir -p /var/stats
root@dns-zone:/var/tmp# mkdir -p /etc/namedb/master
root@dns-zone:/var/tmp# cp db.load /etc/namedb/master/load.db
root@dns-zone:/var/tmp# cp db.138.3 /etc/namedb/master/3.138.db

My files looked like this

# cat /etc/namedb/master/load.db
@ IN  SOA dns-zone.load.melnet.net. root.dns-zone.load.melnet.net. ( 1 10800 3600 604800 86400 )
  IN  NS  dns-zone.load.melnet.net.

localhost            IN  A     127.0.0.1

dns-zone             IN  A     138.3.1.39
dns-zone             IN  MX    10 dns-zone.load.melnet.net.

host-17-128          IN  A     138.3.17.128
host-17-128          IN  MX    10 host-17-128.load.melnet.net.


root@dns-zone:/etc/namedb/master# cat /etc/namedb/master/3.138.db
@ IN  SOA dns-zone.load.melnet.net. root.dns-zone.load.melnet.net. ( 1 10800 3600 604800 86400 )
  IN  NS  dns-zone.load.melnet.net.

39.1.3.138.IN-ADDR.ARPA.        IN  PTR   dns-zone.load.melnet.net.
128.17.3.138.IN-ADDR.ARPA.      IN  PTR   host-17-128.load.melnet.net.

Setup a client to your dns

svccfg -s /network/dns/client setprop config/nameserver = net_address: 138.3.1.39
svccfg -s /network/dns/client setprop config/domain = astring: "load.melnet.net"
svccfg -s /network/dns/client setprop config/search = astring: "load.melnet.net"
svccfg -s /network/dns/client setprop config/ipnodes = astring: '"files dns"'
svccfg -s /network/dns/client setprop config/host = astring: '"files dns"'

Verify the configuration is correct:

root@dns-zone:/etc/namedb/master# svcadm enable dns/client
root@dns-zone:/etc/namedb/master# nslookup host-17-128
Server:         138.3.1.39
Address:        138.3.1.39#53

Name:   host-17-128.load.melnet.net
Address: 138.3.17.128

root@dns-zone:/etc/namedb/master# nslookup 138.3.17.128
Server:         138.3.1.39
Address:        138.3.1.39#53

128.17.3.138.in-addr.arpa       name = host-17-128.load.melnet.net.

Adding new records to your DNS

You have a couple of ways to add new records to your DNS. You can:

1) Add the new entries to your host file and re-run h2n
2) Manually add entries to the load.db and 3.138.db files

and then refresh/restart the dns service.

To manually add a new host (sc5a02-d2, 138.3.17.172), edit the files in /etc/namedb/master/ as follows.

Add a ‘forwards’ entry to /etc/namedb/master/load.db

sc5a02-d2            IN  A     138.3.17.172
sc5a02-d2            IN  MX    10 sc5a02-d2.load.melnet.net.

Add a reverse entry to /etc/namedb/master/3.138.db

172.17.3.138.IN-ADDR.ARPA.      IN  PTR   sc5a02-d2.load.melnet.net.

Refresh and restart the server:

root@dns-zone:/etc/namedb/master# svcadm refresh /network/dns/server
root@dns-zone:/etc/namedb/master# svcadm restart /network/dns/server

Test it forwards and backwards.

root@dns-zone:/etc/namedb/master# nslookup 138.3.17.172
Server:         138.3.1.39
Address:        138.3.1.39#53

172.17.3.138.in-addr.arpa       name = sc5a02-d2.load.melnet.net.

root@dns-zone:/etc/namedb/master# nslookup sc5a02-d2
Server:         138.3.1.39
Address:        138.3.1.39#53

Name:   sc5a02-d2.load.melnet.net
Address: 138.3.17.172

Appendix

Manually creating entries is a bit of a pain though if you have a lot of them. I’ve written a very dumb script to generate the entries:

#!/bin/sh
# quick generate DNS entries script
# accepts host and IP, produces entries for files
#Fixed variables
DOMAINER=load.melnet.net
HOSTER=$1
IPPER=$2
echo "Forwards entry"
echo "$HOSTER       IN  A             $IPPER"
echo "$HOSTER       IN  MX            10 ${HOSTER}.${DOMAINER}."

echo "Backwards entry"
BACKWARDSIP=`echo $IPPER | awk -F. '{print $4 "." $3 "." $2 "." $1}'`
echo "${BACKWARDSIP}.IN-ADDR.ARPA.     IN  PTR  ${HOSTER}.${DOMAINER}."

Enabling DNFS and configuring a RMAN backup to a ZFS 7320

DNFS Configuration process

This process is based on the setup required to attach a ZFS-BA to an Exadata. Unlike the ZFS-7320, a ZFS-BA has more InfiniBand links connected to the system and so can support greater throughput.

On the ZFS appliance

Create a  new project to hold the backup destination ‘MyCompanyBackuptest’

Edit project ‘MyCompanyBackuptest’
General Tab

→ Set ‘Synchronous write bias’ to Throughput
→ Set ‘Mountpoint’ to /export/mydb

Protocols Tab

→ Add nfs exceptions for all of ‘MyCompany’ servers for read/write and root access, using ‘Network’ and giving the individual IP addresses.

192.168.28.7/32
192.168.28.6/32
192.168.28.3/32
192.168.28.2/32

Shares Tab

→ Create filesystems backup1 to backup8

On SPARC node

As root

Check the required kernel parameters are set in /etc/system (done automatically by ssctuner service)

set rpcmod:clnt_max_conns = 8
set nfs:nfs3_bsize = 131072

Set the suggested ndd parameters by creating a script in /etc/rc2.d so they are applied after every boot.

root@sc5acn01-d1:/etc/rc2.d# cat S99ndd
/usr/sbin/ndd -set /dev/tcp tcp_max_buf 4194304
/usr/sbin/ndd -set /dev/tcp tcp_xmit_hiwat 2097152
/usr/sbin/ndd -set /dev/tcp tcp_recv_hiwat 2097152
/usr/sbin/ndd -set /dev/tcp tcp_conn_req_max_q 16384
/usr/sbin/ndd -set /dev/tcp tcp_conn_req_max_q0 16384

Create mountpoints for the backup directories

root@sc5acn01-d1:/# for i in 1 2 3 4 5 6 7 8 
do 
mkdir /backup${i} 
done

Add /etc/vfstab entries for the mountpoints

sc5a-storIB:/export/mydb/backup1 - /backup1 nfs - yes rw,bg,hard,nointr,rsize=1048576,wsize=1048576,proto=tcp,vers=3,forcedirectio
sc5a-storIB:/export/mydb/backup2 - /backup2 nfs - yes rw,bg,hard,nointr,rsize=1048576,wsize=1048576,proto=tcp,vers=3,forcedirectio
sc5a-storIB:/export/mydb/backup3 - /backup3 nfs - yes rw,bg,hard,nointr,rsize=1048576,wsize=1048576,proto=tcp,vers=3,forcedirectio
sc5a-storIB:/export/mydb/backup4 - /backup4 nfs - yes rw,bg,hard,nointr,rsize=1048576,wsize=1048576,proto=tcp,vers=3,forcedirectio
sc5a-storIB:/export/mydb/backup5 - /backup5 nfs - yes rw,bg,hard,nointr,rsize=1048576,wsize=1048576,proto=tcp,vers=3,forcedirectio
sc5a-storIB:/export/mydb/backup6 - /backup6 nfs - yes rw,bg,hard,nointr,rsize=1048576,wsize=1048576,proto=tcp,vers=3,forcedirectio
sc5a-storIB:/export/mydb/backup7 - /backup7 nfs - yes rw,bg,hard,nointr,rsize=1048576,wsize=1048576,proto=tcp,vers=3,forcedirectio
sc5a-storIB:/export/mydb/backup8 - /backup8 nfs - yes rw,bg,hard,nointr,rsize=1048576,wsize=1048576,proto=tcp,vers=3,forcedirectio
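The eight vfstab lines differ only in the backup index, so they can be generated rather than typed; a sketch assuming the same share and mount naming as above:

```shell
# Mount options shared by every backup filesystem
opts='rw,bg,hard,nointr,rsize=1048576,wsize=1048576,proto=tcp,vers=3,forcedirectio'

# Emit one vfstab line per backup share; only the index varies
for i in 1 2 3 4 5 6 7 8; do
  printf 'sc5a-storIB:/export/mydb/backup%s - /backup%s nfs - yes %s\n' "$i" "$i" "$opts"
done
```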

Mount the filesystems and set ownership to oracle:dba

root@sc5acn01-d1:/# for i in 1 2 3 4 5 6 7 8 
do 
mount /backup${i} 
done
root@sc5acn01-d1:/# for i in 1 2 3 4 5 6 7 8
do
chown oracle:dba /backup${i}
done

As Oracle

Stop any databases running from the ORACLE_HOME where you want to enable DNFS.
Ensure you can remotely authenticate as sysdba, creating a password file using orapwd if required.
Relink for DNFS support:

oracle@sc5acn01-d1:/u01/app/oracle/product/11.2.0.3/dbhome_1/rdbms/lib$ make -f $ORACLE_HOME/rdbms/lib/ins_rdbms.mk dnfs_on

I was a little uncertain about the oradnfstab entries, as most examples relate to a ZFS-BA, which has many IB connections and 2 active heads, whereas the 7320 in this case was set up Active/Passive. I created $ORACLE_HOME/dbs/oradnfstab with the following entries.

server:sc5a-storIB path:192.168.28.1
export: /export/mydb/backup1 mount:/backup1
export: /export/mydb/backup2 mount:/backup2
export: /export/mydb/backup3 mount:/backup3
export: /export/mydb/backup4 mount:/backup4
export: /export/mydb/backup5 mount:/backup5
export: /export/mydb/backup6 mount:/backup6
export: /export/mydb/backup7 mount:/backup7
export: /export/mydb/backup8 mount:/backup8

Restart your database and check the alert log to see whether DNFS has been enabled, by grepping for NFS.

Oracle instance running with ODM: Oracle Direct NFS ODM Library Version 3.0
 Wed Mar 26 16:50:43 2014

Backup and restore scripts will need to be adjusted to set the suggested underscore parameters and to use the new locations.

oracle@sc5acn01-d1:~/mel$ cat dnfs_backup.rman
startup mount
run
{
sql 'alter system set "_backup_disk_bufcnt"=64';
sql 'alter system set "_backup_disk_bufsz"=1048576';
ALLOCATE CHANNEL ch01 DEVICE TYPE DISK connect 'sys/welcome1@mydb' format '/backup1/mydb/%U';
ALLOCATE CHANNEL ch02 DEVICE TYPE DISK connect 'sys/welcome1@mydb' format '/backup2/mydb/%U';
ALLOCATE CHANNEL ch03 DEVICE TYPE DISK connect 'sys/welcome1@mydb' format '/backup3/mydb/%U';
ALLOCATE CHANNEL ch04 DEVICE TYPE DISK connect 'sys/welcome1@mydb' format '/backup4/mydb/%U';
ALLOCATE CHANNEL ch05 DEVICE TYPE DISK connect 'sys/welcome1@mydb' format '/backup5/mydb/%U';
ALLOCATE CHANNEL ch06 DEVICE TYPE DISK connect 'sys/welcome1@mydb' format '/backup6/mydb/%U';
ALLOCATE CHANNEL ch07 DEVICE TYPE DISK connect 'sys/welcome1@mydb' format '/backup7/mydb/%U';
ALLOCATE CHANNEL ch08 DEVICE TYPE DISK connect 'sys/welcome1@mydb' format '/backup8/mydb/%U';
backup database TAG='dnfs-backup';
backup current controlfile format '/backup/dnfs-backup/backup-controlfile';
}
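As a rough sanity check on memory use, and assuming (as the usual RMAN tuning guidance suggests) that each channel allocates _backup_disk_bufcnt buffers of _backup_disk_bufsz bytes each, the settings above imply:

```shell
# Approximate RMAN disk buffer memory implied by the underscore parameters above.
bufcnt=64
bufsz=1048576                          # 1 MiB per buffer
channels=8
per_channel=$(( bufcnt * bufsz ))      # 64 MiB per channel
total=$(( per_channel * channels ))
echo "$(( total / 1048576 )) MiB of buffer memory across all channels"
```

So roughly 512 MiB of buffers in flight, which is worth knowing before raising the buffer count further.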
oracle@sc5acn01-d1:~/mel$ cat dnfs_restore.rman
startup nomount
restore controlfile from '/backup/dnfs-backup/backup-controlfile';
alter database mount;
configure device type disk parallelism 2;
run
{
sql 'alter system set "_backup_disk_bufcnt"=64';
sql 'alter system set "_backup_disk_bufsz"=1048576';
ALLOCATE CHANNEL ch01 DEVICE TYPE DISK connect 'sys/welcome1@mydb' format '/backup1/mydb/%U';
ALLOCATE CHANNEL ch02 DEVICE TYPE DISK connect 'sys/welcome1@mydb' format '/backup2/mydb/%U';
ALLOCATE CHANNEL ch03 DEVICE TYPE DISK connect 'sys/welcome1@mydb' format '/backup3/mydb/%U';
ALLOCATE CHANNEL ch04 DEVICE TYPE DISK connect 'sys/welcome1@mydb' format '/backup4/mydb/%U';
ALLOCATE CHANNEL ch05 DEVICE TYPE DISK connect 'sys/welcome1@mydb' format '/backup5/mydb/%U';
ALLOCATE CHANNEL ch06 DEVICE TYPE DISK connect 'sys/welcome1@mydb' format '/backup6/mydb/%U';
ALLOCATE CHANNEL ch07 DEVICE TYPE DISK connect 'sys/welcome1@mydb' format '/backup7/mydb/%U';
ALLOCATE CHANNEL ch08 DEVICE TYPE DISK connect 'sys/welcome1@mydb' format '/backup8/mydb/%U';
restore database from TAG='dnfs-backup';
}

Results of the changes

The timings are based on the longest-running backup piece rather than the wall-clock time, as the latter can include other RMAN operations such as re-cataloguing files.

           Standard NFS    DNFS
Backup     2:32:09         44:58
Restore    33:42           24:46

So, it’s clear from these results that DNFS can have a huge impact on the backup performance and also a positive effect on restore performance.

If you look at the ZFS analytics for the backup, you can see that we were writing at approximately 2 GB/s.

[ZFS analytics screenshot: backup write throughput]

We were also seeing approximately 1.2 GB/s of reads during the restore.

[ZFS analytics screenshot: restore read throughput]

Showing the SuperCluster software release version

If you’re logging into a strange T5-8 SuperCluster to work on it, you may want to know which release of the software was used to install the system.

This information is held in an SMF service:

root@sc5acn01-d1:~# svcs -a |grep oes
online         Mar_18   svc:/system/oes/id:default

From this service you can extract properties describing the software release and SuperCluster build type:

root@sc5acn01-d1:~# svcprop /system/oes/id:default
oes/type astring SuperCluster
oes/node astring ssccn1
configuration/name astring F4-1
configuration/build astring ssc-1.5.8
configuration/domain_type astring db
configuration/ovm_domain_type astring root
rack/serial_number astring unknown
filesystem-local/entities fmri svc:/system/filesystem/local:default
filesystem-local/grouping astring require_all
filesystem-local/restart_on astring none
filesystem-local/type astring service
general/complete astring 
general/enabled boolean true
general/entity_stability astring Unstable
start/exec astring :true
start/timeout_seconds count 0
start/type astring method
stop/exec astring :true
stop/timeout_seconds count 0
stop/type astring method
manifestfiles/lib_svc_manifest_system_oes_oes_id_xml astring /lib/svc/manifest/system/oes/oes_id.xml
startd/duration astring contract
tm_common_name/C ustring OES\ identification\ information
tm_description/C ustring Provides\ OES\ identification\ information
restarter/logfile astring /var/svc/log/system-oes-id:default.log
restarter/start_method_timestamp time 1395150946.060111000
restarter/start_method_waitstatus integer 0
restarter/auxiliary_state astring dependencies_satisfied
restarter/next_state astring none
restarter/state astring online
restarter/state_timestamp time 1395150946.115875000
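If you just want the build release in a script, you can filter for the one property you need. On a live system, `svcprop -p configuration/build svc:/system/oes/id:default` would return it directly; sketched here against the output captured above so it can be tried anywhere:

```shell
# Pull a single property value out of saved svcprop output.
svcprop_out='oes/type astring SuperCluster
oes/node astring ssccn1
configuration/build astring ssc-1.5.8
configuration/domain_type astring db'

build=$(echo "$svcprop_out" | awk '$1 == "configuration/build" { print $3 }')
echo "$build"      # ssc-1.5.8
```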


ZFS Appliance NFS exceptions

I had a situation where I wanted to restrict access to a project on my ZFS storage appliance (7320) to a small list of hosts on a private network. The project needs to be accessible read/write, with root permissions, from four hosts that I need to specify by IP address:

192.168.28.2     
192.168.28.3    
192.168.28.6   
192.168.28.7

However, other hosts in the 192.168.28.X/22 range must not be able to mount the share.
The way to achieve this is to lock down the permissions and then explicitly grant access to the systems you need. There are three ways of specifying hosts for exceptions:

  • Host(FQDN) or Netgroup – This requires your private hostnames to be registered in DNS, which was not possible in my case. You CANNOT enter an IP address in this field.
  • DNS Domain – All of my hosts are in the same domain, so this was not fine-grained enough.
  • Network – Counter-intuitively, it is Network that allows individual IP addresses to be specified, using a CIDR netmask that matches only one host (the netmask does not have to match that of the underlying interface).

First thing – set the default NFS share mode to ‘NONE’ so that non-excepted hosts cannot mount the share.

Then add an exception for each host, using a /32 netmask, which limits the entry to a single IP.

[ZFS appliance screenshot: NFS exceptions using Network entries with /32 netmasks]

So, a quick test. This one should work

root@myhost-d1:/stage# ifconfig stor_ipmp0
stor_ipmp0: flags=8001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,IPMP> mtu 65520 index 3
        inet 192.168.28.2 netmask fffffc00 broadcast 192.168.31.255
        groupname stor_ipmp0
root@myhost-d1:/# mount -f nfs -o rw 192.168.28.1:/export/stage /mnt
root@myhost-d1:/# df -k /mnt
Filesystem           1024-blocks        Used   Available Capacity  Mounted on
192.168.28.1:/export/stage
                     10737418209          31 10737418178     1%    /mnt

This one should fail

root@myhost-d3:~# ifconfig stor_ipmp0
stor_ipmp0: flags=8001000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,IPMP> mtu 65520 index 3
        inet 192.168.28.4 netmask fffffc00 broadcast 192.168.31.255
        groupname stor_ipmp0
root@myhost-d3:~#  mount -f nfs -o rw 192.168.28.1:/export/stage /mnt
nfs mount: mount: /mnt: Permission denied