Service Management Facility

From Genunix

SMF(5) is an innovative replacement for the aging System V init system. Its advantages include dependency handling, automated restart, delegated control (via RBAC), custom environments without the need for setuid, and much more.

When Solaris boots, init(1M) reads /etc/default/init and then processes /etc/inittab. The inittab does the following:

  1. Run autopush: configures lists of automatically pushed STREAMS modules
  2. Run soconfig: configure transport providers for use by sockets
  3. Run /lib/svc/bin/svc.startd: This is where SMF takes over.

You interact with SMF using three main commands:

  • svcs: Report Service Status (svc = "service", s for "status")
  • svcadm: Service Administration, interact with a service (start/stop/restart, etc)
  • svccfg: Service Configuration, modify attributes of a given service


SMF Boot Arguments

You can pass several SMF-related arguments to the kernel at boot:

  • -m milestone=ABC: This will cause the system to boot only to the specified milestone. Typically used for milestone "none".
  • -m quiet: Prints standard per-service output and error messages requiring administrative intervention.
  • -m verbose: Prints standard per-service output with more informational messages.
  • -m debug: Prints standard per-service output and all svc.startd messages to log.

Refer to kernel(1M) and boot(1M) for additional details.


Milestones

Milestones are special services that utilize dependencies to act upon a group of related services. Stock milestone manifests can be found in /var/svc/manifest/milestone.

The following is an example of the single-user milestone:


<service_bundle type='manifest' name='SUNWcsr:single-user'>
 <service name='milestone/single-user' type='service' version='1'>
        <create_default_instance enabled='true' />
        <single_instance />

        <dependency name='sysidtool' grouping='require_all' restart_on='none' type='service' delete='true'>
                <service_fmri value='svc:/system/sysidtool:net' />
                <service_fmri value='svc:/system/sysidtool:system' />
        </dependency>
        <dependency name='nodename' grouping='require_all' restart_on='none' type='service'>
                <service_fmri value='svc:/system/identity:node' />
        </dependency>
        <dependency name='filesystem-minimal' grouping='require_all' restart_on='none' type='service'>
                <service_fmri value='svc:/system/filesystem/minimal' />
        </dependency>
        <dependency name='milestone-devices' grouping='require_all' restart_on='none' type='service'>
                <service_fmri value='svc:/milestone/devices' />
        </dependency>
        <dependency name='manifests' grouping='require_all' restart_on='none' type='service'>
                <service_fmri value='svc:/system/manifest-import' />
        </dependency>
        <dependency name='loopback-network' grouping='require_any' restart_on='none' type='service'>
                <service_fmri value='svc:/network/loopback' />
        </dependency>
        <dependency name='network' grouping='optional_all' restart_on='none' type='service'>
                <service_fmri value='svc:/milestone/network' />
        </dependency>

        <exec_method type='method' name='start' exec='/sbin/rcS start' timeout_seconds='1800' />
        <exec_method type='method' name='stop' exec=':true' timeout_seconds='0' />

        <property_group name='startd' type='framework'>
                <propval name='duration' type='astring' value='transient' />
                <propval name='timeout_retry' type='boolean' value='false' />
        </property_group>
        ....

Notice the following in the above example:

  • The service_bundle is type "manifest"
  • The service is type "service"
  • Several of the dependencies use the "require_all" grouping
  • The "start" exec_method starts the legacy init scripts by exec'ing /sbin/rcS start.
  • A property group for startd is present (duration=transient tells svc.startd not to restart the service if no processes are left running, and timeout_retry=false tells svc.startd not to retry the start method if it times out.)
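
The relationships called out above can also be pulled out of a manifest mechanically. The following is a rough sketch, not part of SMF: list_deps is a hypothetical helper that assumes single-quoted attributes and one element per line, as in the excerpt above, and prints each dependency name, its grouping, and the FMRI it refers to. On a live system, svcs -d reports the same relationships.

```shell
# Hypothetical helper: print "name grouping fmri" for every service_fmri
# found inside a <dependency> element of a manifest.
# Reads the files given as arguments, or standard input if there are none.
list_deps() {
    awk -F"'" '
        /<dependency / {
            # Splitting on the quote character puts each attribute value
            # in the field right after the text ending in attr=
            name = ""; grouping = ""
            for (i = 1; i < NF; i += 2) {
                if ($i ~ /name=$/)     name = $(i + 1)
                if ($i ~ /grouping=$/) grouping = $(i + 1)
            }
        }
        /<service_fmri / { print name, grouping, $2 }
    ' "$@"
}

# Two dependencies copied from the single-user milestone excerpt.
list_deps <<'EOF'
<dependency name='sysidtool' grouping='require_all' restart_on='none' type='service' delete='true'>
        <service_fmri value='svc:/system/sysidtool:net' />
        <service_fmri value='svc:/system/sysidtool:system' />
</dependency>
<dependency name='network' grouping='optional_all' restart_on='none' type='service'>
        <service_fmri value='svc:/milestone/network' />
</dependency>
EOF
```

For the two dependencies in the sample this prints three lines, one per service_fmri.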

Standard Milestones

The following milestones are present in B52:

  • milestone/network
  • milestone/devices
  • milestone/name-services
  • milestone/sysconfig
  • milestone/single-user (Run Level S)
  • milestone/multi-user (Run Level 2)
  • milestone/multi-user-server (Run Level 3)

Additionally, two special milestones are available:

  • none
  • all

When moving back down to a milestone, only the services that the milestone depends on, directly or indirectly, continue to run; all others are temporarily disabled. Please see Dependency question related with milestone.

Creating Custom Milestones

Viewing Services with svcs

The svcs command reports the status of services. The following options are available:

  • -a: Show all services, regardless of state.
  • -d: List dependencies of this service
  • -D: List services that depend on this service
  • -H: Omit headers
  • -l: Display all available information
  • -o: specify columns (-o state,fmri)
  • -p: List processes associated with service
  • -R: List the service instances that use the given instance as their restarter
  • -s: Sort output by column.
  • -S: Same as -s but reverse order.
  • -v: Verbose
  • -x: Display explanation for service state.
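
These options combine well in scripts. As a rough sketch (needs_attention is a hypothetical helper, and the sample lines are made up for illustration), -H and -o state,fmri produce output that is easy to filter for instances that are in maintenance, degraded, or offline. svcs -x gives a fuller explanation for the same instances.

```shell
# Hypothetical helper: print "state fmri" for instances that are in
# maintenance, degraded, or offline.
# Expects two-column input such as the output of: svcs -aH -o state,fmri
needs_attention() {
    awk '{
        state = $1
        sub(/\*$/, "", state)    # drop the "*" that marks a state in transition
        if (state == "maintenance" || state == "degraded" || state == "offline")
            print state, $2
    }' "$@"
}

# Made-up sample input; on a real system: svcs -aH -o state,fmri | needs_attention
needs_attention <<'EOF'
online         svc:/system/fmd:default
maintenance    svc:/network/http:apache2
disabled       svc:/network/talk:default
offline*       svc:/network/ssh:default
EOF
```

With the sample input only the maintenance and offline lines are printed.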

Interacting with Services using svcadm

svcadm accepts the following commands:

  • enable [-rst] <service> ... - enable and online service(s)
  • disable [-st] <service> ... - disable and offline service(s)
  • restart <service> ... - restart specified service(s)
  • refresh <service> ... - re-read service configuration
  • milestone [-d] <milestone> - advance to a service milestone
  • mark [-It] <state> <service> ... - set maintenance state
  • clear <service> ... - clear maintenance state

The options to these are (where applicable, as above):

  • -v: Available to all, acts verbosely
  • -r: Recursively enable dependencies
  • -s: "sleep", command doesn't return until action is complete.
  • -t: "temporary", perform the action but don't make it persistent.
  • -d: "default", make the specified milestone the default boot milestone.
  • -I: Immediate, don't wait.

Examples

In the following example we verbosely (-v) restart fmd:

root@aeon ~$ svcadm -v restart fmd
Action restart set for svc:/system/fmd:default.
root@aeon ~$ svcs fmd
STATE          STIME    FMRI
online*         0:51:41 svc:/system/fmd:default
root@aeon ~$ svcs fmd
STATE          STIME    FMRI
online*         0:51:41 svc:/system/fmd:default
root@aeon ~$ svcs fmd
STATE          STIME    FMRI
online*         0:51:41 svc:/system/fmd:default
root@aeon ~$ svcs fmd
STATE          STIME    FMRI
online          0:52:02 svc:/system/fmd:default
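
Rather than re-running svcs by hand until the asterisk disappears, the wait can be scripted. This is a minimal sketch, assuming the FMRI matches exactly one instance; wait_online and its defaults (30 tries, one second apart) are illustrative choices, not part of SMF:

```shell
# Hypothetical helper: poll until the instance reports "online",
# or give up after a number of tries.
# Usage: wait_online <fmri> [tries] [seconds-between-tries]
wait_online() {
    fmri=$1
    tries=${2:-30}
    pause=${3:-1}
    while [ "$tries" -gt 0 ]; do
        # -H drops the header and -o state prints only the STATE column;
        # a trailing "*" (as in "online*") means a transition is in progress.
        state=$(svcs -H -o state "$fmri" 2>/dev/null)
        [ "$state" = "online" ] && return 0
        tries=$((tries - 1))
        sleep "$pause"
    done
    echo "$fmri did not come online (last state: ${state:-unknown})" >&2
    return 1
}

# Example (not run here): svcadm restart fmd && wait_online svc:/system/fmd:default
```

For enable and disable, svcadm's own -s flag already provides this kind of wait; a loop like this is mainly useful after a restart.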

In the next example, we verbosely start Apache and wait (-s) for it to start before returning the prompt:

root@aeon ~$ svcadm -v enable -s  http:cswapache2
svc:/network/http:cswapache2 enabled.
root@aeon ~$ svcs http
STATE          STIME    FMRI
online          0:57:23 svc:/network/http:cswapache2

Next we'll change milestones and see the result:

# svcadm milestone multi-user
# svcs -l multi-user-server
fmri         svc:/milestone/multi-user-server:default
name         multi-user plus exports milestone
enabled      false (temporary)
state        disabled
next_state   none
state_time   Sat Sep 15 00:59:12 2007
logfile      /var/svc/log/milestone-multi-user-server:default.log
restarter    svc:/system/svc/restarter:default
dependency   require_all/none svc:/milestone/multi-user (online)
dependency   optional_all/none svc:/network/ssh (disabled)
dependency   optional_all/none svc:/network/nfs/server (disabled)
dependency   optional_all/none svc:/network/rpc/smserver (disabled)
dependency   optional_all/none svc:/network/rpc/bootparams (disabled)
dependency   optional_all/none svc:/network/rarp (disabled)
dependency   optional_all/none svc:/network/dhcp-server (disabled)
dependency   optional_all/none svc:/appliance/kit/dhcp-server (disabled)
dependency   optional_all/none svc:/network/samba (disabled)
dependency   optional_all/none svc:/network/winbind (disabled)
dependency   optional_all/none svc:/network/wins (disabled)

# svcs -a | grep  0:59:11
disabled        0:59:11 svc:/network/routing/ndp:default
disabled        0:59:11 svc:/network/routing/route:default
disabled        0:59:11 svc:/network/rpc/gss:default
disabled        0:59:11 svc:/network/security/ktkt_warn:default
disabled        0:59:11 svc:/network/rpc/cde-calendar-manager:default
disabled        0:59:11 svc:/system/console-login:default
disabled        0:59:11 svc:/system/fmd:default
disabled        0:59:11 svc:/network/rpc/cde-ttdbserver:tcp
disabled        0:59:11 svc:/system/sac:default
disabled        0:59:11 svc:/system/intrd:default
disabled        0:59:11 svc:/system/resource-mgmt:default
disabled        0:59:11 svc:/application/print/ppd-cache-update:default
disabled        0:59:11 svc:/system/picl:default
disabled        0:59:11 svc:/system/boot-archive-update:default
disabled        0:59:11 svc:/system/basicreg:default
disabled        0:59:11 svc:/network/shares/group:zfs
disabled        0:59:11 svc:/network/shares/group:default
disabled        0:59:11 svc:/system/coreadm:default
disabled        0:59:11 svc:/system/dumpadm:default
disabled        0:59:11 svc:/application/cde-printinfo:default
disabled        0:59:11 svc:/system/filesystem/rmvolmgr:default

# svcadm -v milestone multi-user-server
# svcs -l multi-user-server
fmri         svc:/milestone/multi-user-server:default
name         multi-user plus exports milestone
enabled      true
state        online
next_state   none
state_time   Sat Sep 15 01:01:26 2007
logfile      /var/svc/log/milestone-multi-user-server:default.log
restarter    svc:/system/svc/restarter:default
dependency   require_all/none svc:/milestone/multi-user (online)
dependency   optional_all/none svc:/network/ssh (online)
dependency   optional_all/none svc:/network/nfs/server (disabled)
dependency   optional_all/none svc:/network/rpc/smserver (online)
dependency   optional_all/none svc:/network/rpc/bootparams (disabled)
dependency   optional_all/none svc:/network/rarp (disabled)
dependency   optional_all/none svc:/network/dhcp-server (disabled)
dependency   optional_all/none svc:/appliance/kit/dhcp-server (disabled)
dependency   optional_all/none svc:/network/samba (disabled)
dependency   optional_all/none svc:/network/winbind (disabled)
dependency   optional_all/none svc:/network/wins (disabled)

Note two interesting things about milestones above:

  • Firstly, the ssh service was never stopped even though the system retreated from the milestone. (BUG)
  • Secondly, the milestone change wasn't flagged with -d and was therefore only temporary.


Modifying Services with svccfg

svccfg can be quick and easy or complex and confusing depending on how deeply you go, so we'll break down the most common uses and then look at the deeper complexities:

Quick & Easy Common Functions

The most common uses of svccfg are importing a new manifest, exporting a manifest (dumping XML to STDOUT), and deleting a service entirely.

Examples:

  • svccfg delete uptmagnt/tcp: Completely delete the existing service FMRI
  • svccfg import /tmp/uptmagnt-tcp.xml: Import a new service from an XML manifest
  • svccfg export uptmagnt/tcp: Dump the manifest to STDOUT.

The "lamer" way to modify a service is to export the FMRI to a file, modify the output XML, delete the service, and then re-import the manifest as if it were new. This avoids the hassle of interacting with svccfg and its more complex-looking features.

Here is a complete example of doing this:

root@aeon tmp$ svccfg export http > /tmp/http.xml
root@aeon tmp$ svccfg delete http
svccfg: Instance svc:/network/http:cswapache2 may be running.  Use delete -f if it is not.
root@aeon tmp$ svccfg delete -f http
root@aeon tmp$ svcs -a http   
svcs: -a ignored when used with arguments.
svcs: Pattern 'http' doesn't match any instances
STATE          STIME    FMRI
root@aeon tmp$ svccfg import /tmp/http.xml 
root@aeon tmp$ svcs -a http
svcs: -a ignored when used with arguments.
STATE          STIME    FMRI
disabled        1:14:51 svc:/network/http:apache2
online          1:14:52 svc:/network/http:cswapache2

Modifying a Service Properly

While export/delete/import isn't the kosher way to do things, it's a bit simpler to the minds of most. For details on the "proper way" please see the Property Commands section below.

Checking and Setting Service Properties

  • Get all properties for a service: svcprop system/console-login
  • Get specific properties for a service: svcprop -f -p ttymon/label system/console-login
  • Set a specific property for a service: svccfg -s system/console-login setprop ttymon/label = console (svccfg -s {fmri} setprop {prop} = {value})
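
A property change made with svccfg only reaches the running service after a refresh, so the set, refresh, and check steps tend to travel together. A minimal sketch, assuming a simple value without spaces; set_prop_and_refresh is a hypothetical wrapper around the three commands shown above:

```shell
# Hypothetical wrapper: set one property, refresh the service, and print
# the value that svcprop reports afterwards.
# Usage: set_prop_and_refresh <fmri> <pg/name> <value>
set_prop_and_refresh() {
    fmri=$1
    prop=$2
    value=$3
    svccfg -s "$fmri" setprop "$prop" = "$value" || return 1
    # Without a refresh the running service keeps using the old value.
    svcadm refresh "$fmri" || return 1
    svcprop -p "$prop" "$fmri"
}

# Example (not run here):
#   set_prop_and_refresh system/console-login ttymon/label console
```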

svccfg Advanced Features

svccfg accepts the following commands:

svccfg: General commands:        help set repository end
Manifest commands:       inventory validate import export archive
Profile commands:        apply extract
Entity commands:         list select unselect add delete
Snapshot commands:       listsnap selectsnap revert
Property group commands: listpg addpg delpg
Property commands:       listprop setprop delprop editprop
Property value commands: addpropvalue delpropvalue setenv unsetenv

Each command can be preceded by -s FMRI (the CLI shortcut for select FMRI) and -v for verbose. Additionally, you can stash svccfg commands in a "script" which is then fed to svccfg via the -f /path/to/file flag.
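
As a rough illustration of the script form, the following writes a small command file whose lines are the same commands one would type at the svc:> prompt. The file name is an arbitrary choice for this sketch, and the property is the console-login label used earlier; nothing is applied until svccfg -f is run by hand.

```shell
# Write a small svccfg command file; the path is a hypothetical example.
script=${TMPDIR:-/tmp}/console-label.svccfg
cat > "$script" <<'EOF'
select system/console-login
setprop ttymon/label = console
listprop ttymon/label
EOF

# Show what would be fed to svccfg, and how to apply it.
cat "$script"
echo "Apply with: svccfg -f $script"
```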

Let's take each command group one-by-one:

Manifest Commands

These commands work on manifests:

  • inventory /some/file: Print the services and instances contained in a manifest.
  • validate /some/file: Process a manifest file without changing the repository.
  • import /some/file: Import a manifest into the repository.
  • export: Print a manifest for service to file, or standard output if not specified.
  • archive: Print an archive to file, or standard output if not specified.

An archive is a full XML service description for all services, instances, and their persistent properties in the repository. This does not include transient properties such as service state, and is suitable for a relocatable repository backup. See also extract.

Entity commands

Entities are services independent of a given instance. For example, "network/http:default" is an instance, whereas "network/http" is an entity.

  • list: List children of the currently selected entity.
  • select: Select the named child entity or the one indicated by fmri.
  • unselect: Select the parent of the currently selected entity.
  • add: Add a new child entity (instance) to the currently selected entity.
  • delete: Delete the named child entity or the one indicated by fmri. With -f, delete running services.

The following is an example of selecting a service (by glob), adding a child entity to it (instance), unselecting and re-selecting, and then deleting the child-instance.

root@aeon tmp$ svccfg
svc:> select *mysql*
svc:/network/cswmysql5> unselect
svc:> select http
svc:/network/http> add boobs
svc:/network/http> list
:properties
cswapache2
apache2
boobs
svc:/network/http> select boobs
svc:/network/http:boobs> unselect
svc:/network/http> unselect
svc:> select boobs
svc:/network/http:boobs> delete boobs
Current selection has been deleted.
svc:/network/http> list
:properties
cswapache2
apache2

Profile Commands

A "profile" is a file that contains a set of service instances and values for the enabled property on each instance.

  • apply /path/to/profile: Apply a profile
  • extract: Print a profile to file, or standard output if not specified.

The following is an example profile extract of a running system:

$ svccfg extract | head    
<?xml version='1.0'?>
<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>
<service_bundle type='profile' name='extract'>
  <service name='system/boot-archive' type='service' version='0'>
    <instance name='default' enabled='true'/>
  </service>
  <service name='system/device/local' type='service' version='0'>
    <instance name='default' enabled='true'/>
  </service>

So why do we care about profiles? Let's say you want to put the system into a specific state without using a milestone. You could create a production_profile.xml and a maintance_profile.xml and then change the complete system state just by applying the appropriate profile.

The following is an example of the previous case:

root@aeon tmp$ svccfg extract > /tmp/production_profile.xml
root@aeon tmp$ cp /tmp/production_profile.xml /tmp/maintance_profile.xml
root@aeon tmp$ vi /tmp/maintance_profile.xml 
 (disable talk, http, isns, and iscsi target)

root@aeon tmp$ svccfg apply /tmp/maintance_profile.xml 
root@aeon tmp$ svcs -a | grep  1:52   
disabled        1:52:22 svc:/network/talk:default
disabled        1:52:24 svc:/network/isns_server:default
disabled        1:52:24 svc:/network/http:cswapache2
disabled        1:52:27 svc:/system/iscsitgt:default

root@aeon tmp$ svccfg -v apply /tmp/production_profile.xml 
svccfg: svc:/system/boot-archive:default updated.
svccfg: svc:/system/device/local:default updated.
svccfg: svc:/milestone/devices:default updated.
svccfg: svc:/system/identity:node updated.
svccfg: svc:/system/identity:domain updated.
svccfg: svc:/system/filesystem/local:default updated.
svccfg: svc:/system/manifest-import:default updated.
....

root@aeon tmp$ svcs -a | grep 1:53
online          1:53:48 svc:/network/talk:default
online          1:53:49 svc:/network/isns_server:default
online          1:53:50 svc:/network/http:cswapache2
online          1:53:50 svc:/system/iscsitgt:default

The power afforded here is non-intuitive but substantial when used appropriately.
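
Hand-editing a full extract is not the only way to build such a profile. As a sketch (make_profile and the bundle name 'maintenance' are hypothetical), a small profile in the same format as the extract above can be generated for just the instances of interest. This assumes svccfg apply accepts a profile that lists only some services, which is worth verifying on a test system first:

```shell
# Hypothetical helper: emit a profile that sets the enabled flag for the
# listed instances. Each item must be written as service:instance.
# Usage: make_profile <true|false> <service:instance>...
make_profile() {
    enabled=$1
    shift
    echo "<?xml version='1.0'?>"
    echo "<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>"
    echo "<service_bundle type='profile' name='maintenance'>"
    for item in "$@"; do
        service=${item%:*}      # text before the last colon
        instance=${item##*:}    # text after the last colon
        echo "  <service name='$service' type='service' version='0'>"
        echo "    <instance name='$instance' enabled='$enabled'/>"
        echo "  </service>"
    done
    echo "</service_bundle>"
}

# Print a profile that would disable two of the services from the example.
make_profile false network/talk:default network/http:cswapache2
```

Redirecting the output to a file gives something that can be passed to svccfg apply in the same way as the hand-edited profiles above.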

Snapshot Commands

Historical data about each instance in the repository is maintained by the service management facility. This data is made available as read-only snapshots for administrative inspection and rollback.

  • listsnap: List snapshots of the currently selected instance.
  • selectsnap [some_snap]: Select a snapshot of the currently selected instance, or the Editing snapshot by default.
  • revert [some_snap]: Change the properties of the currently selected instance and its ancestors to those in a snapshot, or the currently selected snapshot by default.

The following is an example of navigating entities (services), child entities (instances), and snapshots:

svc:> select http
svc:/network/http> list
:properties
cswapache2
apache2
svc:/network/http> select apache2
svc:/network/http:apache2> listsnap
last-import
initial
running
svc:/network/http:apache2> selectsnap running
[running]svc:/network/http:apache2> ...
[running]svc:/network/http:apache2> revert
svc:/network/http:apache2> 

...

Property Group (PG) Commands

Properties are, unsurprisingly, grouped into Property Groups (PG). These commands complement the property commands.

  • listpg: List property groups of the currently selected entity.
  • addpg: Add a new property group to the currently selected entity.
  • delpg: Delete the named property group from the currently selected entity.

Example:

svc:> select network/radius
svc:/network/radius> listpg
general                framework
tm_common_name         template
tm_man_radiusd         template
tm_doc_freeradius_org  template

Use the listprop command to examine the properties and values within each pg.

Valid property group types are defined by the XML schema.

Property Commands

Properties are arbitrary name/value pairs that can add finesse to your SMF experience:

  • listprop: List property groups and properties of the currently selected entity.
  • setprop pg/name = value: Set the pg/name property of the currently selected entity. Values may be enclosed in double-quotes. Value lists may span multiple lines.
  • delprop pg/name: Delete the pg/name property of the currently selected entity.
  • editprop: Invoke $EDITOR to edit the properties of the currently selected entity.

The following is an example of the properties for the RADIUS service:

svc:/network/radius> listprop
general                     framework
general/entity_stability    astring  Evolving
tm_common_name              template
tm_common_name/C            ustring  "Radius Server"
tm_man_radiusd              template
tm_man_radiusd/manpath      astring  :default
tm_man_radiusd/section      astring  8
tm_man_radiusd/title        astring  radiusd
tm_doc_freeradius_org       template
tm_doc_freeradius_org/name  astring  freeradius.org
tm_doc_freeradius_org/uri   astring  http://httpd.freeradius.org

Here is a sampling of what it looks like to edit properties in vi:

select svc:/network/http

# Property group "general"
# delprop general
# addpg general framework
# setprop general/entity_stability = astring: Evolving
...

When using editprop you simply uncomment and edit the properties you want to change, then save and exit the editor.

Please note: after editing properties you must refresh the service (svcadm refresh) for them to take effect!

Understanding Properties: Thinking in Layers

At this point we need to take an aside and look at properties and at what layers they are present. For the purpose of discussion, we'll use the RADIUS service.

Let's first review the XML Manifest:

svc:> export radius
<?xml version='1.0'?>
<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>
<service_bundle type='manifest' name='export'>
  <service name='network/radius' type='service' version='0'>
    <instance name='default' enabled='false'>
      <dependency name='loopback' grouping='require_all' restart_on='error' type='service'>
        <service_fmri value='svc:/network/loopback:default'/>
      </dependency>
      <dependency name='paths' grouping='require_all' restart_on='error' type='path'>
        <service_fmri value='file://localhost/opt/csw/etc/raddb/radiusd.conf'/>
      </dependency>
      <dependency name='physical' grouping='optional_all' restart_on='error' type='service'>
        <service_fmri value='svc:/network/physical:default'/>
      </dependency>
      <exec_method name='start' type='method' exec='/opt/csw/sbin/radiusd' timeout_seconds='60'>
        <method_context/>
      </exec_method>
      <exec_method name='stop' type='method' exec=':kill' timeout_seconds='60'>
        <method_context/>
      </exec_method>
      <exec_method name='refresh' type='method' exec=':kill -HUP' timeout_seconds='60'>
        <method_context/>
      </exec_method>
    </instance>
    <stability value='Evolving'/>
    <template>
      <common_name>
        <loctext xml:lang='C'>Radius Server</loctext>
      </common_name>
      <documentation>
        <manpage title='radiusd' section='8'/>
        <doc_link name='freeradius.org' uri='http://httpd.freeradius.org'/>
      </documentation>
    </template>
  </service>
</service_bundle>

Notice the structure here:

  1. service_bundle
    1. service
      1. instance ...
        1. dependency ...
        2. ....
        3. exec_method ...
        4. ...
      2. stability
      3. template
        1. common_name ...
        2. documentation ...

Now, recall that we have "entities" (services) and "child entities" (instances). With that knowledge we examine the properties:

svc:> select radius
svc:/network/radius> listpg
general                framework
tm_common_name         template
tm_man_radiusd         template
tm_doc_freeradius_org  template

svc:/network/radius> listprop
general                     framework
general/entity_stability    astring  Evolving
tm_common_name              template
tm_common_name/C            ustring  "Radius Server"
tm_man_radiusd              template
tm_man_radiusd/manpath      astring  :default
tm_man_radiusd/section      astring  8
tm_man_radiusd/title        astring  radiusd
tm_doc_freeradius_org       template
tm_doc_freeradius_org/name  astring  freeradius.org
tm_doc_freeradius_org/uri   astring  http://httpd.freeradius.org

So for the entity "network/radius" we have 4 property groups, 1 "general" and 3 "template", which correspond to our hierarchy in the manifest above.

But why don't we see methods, dependencies, and such? Because they are bound to the default instance, the child entity named "default", and so we descend into the child entity:

svc:/network/radius> list
:properties
default
svc:/network/radius> select default

svc:/network/radius:default> listpg
loopback   dependency
paths      dependency
physical   dependency
general    framework
start      method
stop       method
refresh    method
restarter  framework    NONPERSISTENT

svc:/network/radius:default> listprop
loopback                   dependency
loopback/entities          fmri     svc:/network/loopback:default
loopback/grouping          astring  require_all
loopback/restart_on        astring  error
loopback/type              astring  service
paths                      dependency
paths/entities             fmri     file://localhost/opt/csw/etc/raddb/radiusd.conf
paths/grouping             astring  require_all
paths/restart_on           astring  error
paths/type                 astring  path
physical                   dependency
physical/entities          fmri     svc:/network/physical:default
physical/grouping          astring  optional_all
physical/restart_on        astring  error
physical/type              astring  service
general                    framework
general/enabled            boolean  false
start                      method
start/exec                 astring  /opt/csw/sbin/radiusd
start/timeout_seconds      count    60
start/type                 astring  method
stop                       method
stop/exec                  astring  :kill
stop/timeout_seconds       count    60
stop/type                  astring  method
refresh                    method
refresh/exec               astring  ":kill -HUP"
refresh/timeout_seconds    count    60
refresh/type               astring  method
restarter                  framework    NONPERSISTENT
restarter/auxiliary_state  astring  none
restarter/next_state       astring  none
restarter/state            astring  disabled
restarter/state_timestamp  time     1189243586.192735000

Now we see the properties we expect.

Please note that in the following "list" example ":properties" is not an entity but rather an indication that properties are bound to the entity:

svc:/network/radius> list
:properties
default
svc:/network/radius> select default
svc:/network/radius:default> list
:properties

For contrast, let's now examine a manifest that uses a single_instance. First the manifest excerpt:

svc:> export example
<?xml version='1.0'?>
<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>
<service_bundle type='manifest' name='export'>
  <service name='network/mongrel/example' type='service' version='0'>
    <create_default_instance enabled='true'/>
    <single_instance/>
    <dependency name='fs' grouping='require_all' restart_on='none' type='service'>
      <service_fmri value='svc:/system/filesystem/local'/>
    </dependency>
    ....
    <exec_method name='stop' type='method' exec=':kill' timeout_seconds='60'>
      <method_context/>
    </exec_method>
  </service>
</service_bundle>

Notice how different the flow is because we're using single_instance:

  1. service_bundle
    1. service
      1. create_default_instance
      2. single_instance
      3. dependency ...
      4. ...
      5. exec_method ...
      6. ...

The hierarchy is much flatter in this example, and so viewing properties is very different:

svc:> select example
svc:/network/mongrel/example> listpg
general     framework
fs          dependency
net         dependency
dependents  framework
start       method
stop        method

svc:/network/mongrel/example> listprop
general                     framework
general/single_instance     boolean  true
fs                          dependency
fs/entities                 fmri     svc:/system/filesystem/local
fs/grouping                 astring  require_all
fs/restart_on               astring  none
fs/type                     astring  service
net                         dependency
net/entities                fmri     svc:/network/nfs/client svc:/network/loopback
net/grouping                astring  require_all
net/restart_on              astring  none
net/type                    astring  service
dependents                  framework
dependents/mongrel_example  fmri     svc:/milestone/multi-user
start                       method
start/environment           astring  PATH=/usr/bin:/bin:/opt/csw/bin
start/exec                  astring  "/opt/csw/bin/mongrel_rails cluster::start"
start/group                 astring  example
start/limit_privileges      astring  :default
start/privileges            astring  :default
start/project               astring  :default
start/resource_pool         astring  :default
start/supp_groups           astring  :default
start/timeout_seconds       count    60
start/type                  astring  method
start/use_profile           boolean  false
start/user                  astring  example
start/working_directory     astring  /home/example/rails/example.org/current
stop                        method
stop/exec                   astring  :kill
stop/project                astring  :default
stop/resource_pool          astring  :default
stop/timeout_seconds        count    60
stop/type                   astring  method
stop/working_directory      astring  :default

Seeing the two side by side, it's clear how services can scale with instances if you choose to do so, and why using single_instance really isn't the best choice in the long run.

Property Value Commands

  • addpropvalue pg/name [type:] value: Add the given value to the named property.
  • delpropvalue pg/name glob: Delete all values matching the glob pattern from the given property.
  • setenv [-s | -i | -m method] NAME value: Set an environment variable for the given service, instance, or method context.
  • unsetenv [-s | -i | -m method] NAME: Unset an environment variable for the given service, instance, or method context.

The flags for setenv/unsetenv are:

  • -i: If method_name not selected, inserts "method_context" pg for selected instance
  • -s: If method_name not selected, inserts "method_context" pg for selected service
  • -m method_name: Define which method the context should apply to (eg: "start")

An example of adding an ENV to an instance:

svc:/network/radius:default> setenv HOME /tmp

svc:/network/radius:default> listprop *environment
start/environment  astring  HOME=/tmp

An example of adding an ENV to a service (all instances):

svc:/network/radius:default> setenv -s HOME /tmp
svc:/network/radius:default> unselect

svc:/network/radius> listprop
general                     framework
general/entity_stability    astring  Evolving
tm_common_name              template
tm_common_name/C            ustring  "Radius Server"
....
method_context              framework
method_context/environment  astring  HOME=/tmp

When viewed in XML manifest form we can see both levels (-i for instance, -s for service):

svc:/network/radius>  export radius
...
<service_bundle type='manifest' name='export'>
  <service name='network/radius' type='service' version='0'>
    <method_context>
      <method_environment>
        <envvar name='HOME' value='/tmp'/>
      </method_environment>
    </method_context>
    <instance name='default' enabled='false'>
      <dependency name='loopback' grouping='require_all' restart_on='error' type='service'>
        <service_fmri value='svc:/network/loopback:default'/>
       .....
      <exec_method name='start' type='method' exec='/opt/csw/sbin/radiusd' timeout_seconds='60'>
        <method_context>
          <method_environment>
            <envvar name='HOME' value='/tmp'/>
          </method_environment>
        </method_context>
      </exec_method>
     .....

Environmental Variables in Manifests

Environment variables can be defined for a service's start method context. A quick way to set them is:

     $ svccfg -s system/service setenv LD_PRELOAD libumem.so
     $ svccfg -s system/service setenv UMEM_DEBUG default

To define environment variables from within a manifest please see the method_context section.
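
To show how the NAME=value pairs given to setenv relate to the manifest form, here is a minimal sketch of a shell helper that prints the matching method_environment element. The function name envvars_to_xml is hypothetical (it is not part of SMF), and the sketch assumes the values contain no characters that need XML escaping.

```shell
# envvars_to_xml NAME=value...
# Hypothetical helper: prints a <method_environment> element holding
# one <envvar> per NAME=value argument. No XML escaping is attempted.
envvars_to_xml() {
    echo "<method_environment>"
    for pair in "$@"; do
        name=${pair%%=*}     # text before the first '='
        value=${pair#*=}     # text after the first '='
        printf "  <envvar name='%s' value='%s'/>\n" "$name" "$value"
    done
    echo "</method_environment>"
}

# The two variables set with svccfg setenv above:
envvars_to_xml LD_PRELOAD=libumem.so UMEM_DEBUG=default
```

The output can be pasted inside a method_context element of a manifest.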

Dependencies & Dependants

Dependencies are what really separate SMF from SysV Init; each service instance can have custom dependencies that impact the state of the given instance, or define dependants that impact the state of other instances. The purpose of both is to inform the restarter of where a given instance fits into the universe of services on the system. By stitching the tapestry of dependencies and dependants together, a 'graph' is created describing what goes where and in what order.

Dependencies

The following is an example dependency as found in an XML Manifest:

        <dependency name='syslog' grouping='optional_all' restart_on='none' type='service'>
                <service_fmri value='svc:/system/system-log' />
        </dependency>

A dependency identifies a group of FMRIs upon which the service is in some sense dependent.

  1. dependency: name, grouping, restart_on, type, delete
    1. service_fmri: value (path dependencies in the form: 'file://path/to/file')

The dependency properties seen above are:

  • name: Name of this dependency
  • grouping: Relationship between the various FMRIs grouped here,
    • "require_all" of the FMRIs to be online
    • "require_any" of the FMRIs to be online
    • "exclude_all" of the FMRIs from being online (inverse of "require_all")
    • "optional_all" is satisfied when every FMRI is either online (or degraded) or cannot come online without administrative action (disabled, in maintenance, or not present)
  • restart_on: Type of events from the FMRIs that should trigger restart of this instance,
    • "error" restarts the service if a dependency is restarted due to a hardware fault
    • "restart" restarts the service if a dependency is restarted for any reason at all
    • "refresh" restarts the service if a dependency is refreshed or restarted for any reason
    • "none" will never restart the service due to dependency state change
  • type: Type of this dependency, either "service" or "path" for filesystem paths
  • delete: This dependency should be deleted, "true" or "false" (Optional)
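
The grouping rules can be illustrated with a small shell sketch. This is not how svc.startd is implemented; it is a simplified model based on the descriptions in smf(5). The function name dep_satisfied and the state label "absent" (standing for a service that is not present) are hypothetical, and any state other than those listed (for example offline) is treated as neither running nor blocked.

```shell
# dep_satisfied GROUPING STATE...
# Returns 0 if a dependency group with the given member states would be
# considered satisfied under this simplified model, 1 if not, 2 on an
# unknown grouping.
dep_satisfied() {
    grouping=$1
    shift
    running=0    # members that are online or degraded
    blocked=0    # members that will not run without admin action
    total=0
    for state in "$@"; do
        total=$((total + 1))
        case $state in
            online|degraded)             running=$((running + 1)) ;;
            disabled|maintenance|absent) blocked=$((blocked + 1)) ;;
        esac
    done
    case $grouping in
        require_all)  [ "$running" -eq "$total" ] ;;
        require_any)  [ "$running" -gt 0 ] ;;
        exclude_all)  [ "$blocked" -eq "$total" ] ;;
        optional_all) [ $((running + blocked)) -eq "$total" ] ;;
        *)            return 2 ;;
    esac
}

# An optional_all group with one online and one disabled member:
if dep_satisfied optional_all online disabled; then
    echo "satisfied"
else
    echo "not satisfied"
fi
```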

Dependants

Dependants are different: they make some other service dependent on this one. Because they impact other services they should be used with great caution!

The following is an example of a dependant defined in an XML Manifest:

        <dependent name='rcap_multi-user' grouping='optional_all' restart_on='none'>
                        <service_fmri value='svc:/milestone/multi-user' />
        </dependent>

In this example, the "rcap" service makes the multi-user milestone dependent on the rcap service, even though the multi-user milestone manifest does not define it as a dependency.

The following is the structure of dependent groups ("grouping=require_all" and "type=service" properties are implied):

  1. dependent: name, grouping, restart_on, delete, override (see dependency above)
    1. service_fmri: value

The "override" property ("true" or "false") specifies whether to replace an existing dependent of the same name. Default is "false".

Determining an Instance's Deps

To list FMRIs that a given service is dependent on (ie: their state impacts this service's state) use svcs -d fmri:

benr@aeon network$ svcs -d ssh
STATE          STIME    FMRI
disabled       Sep_08   svc:/network/physical:nwam
disabled       Sep_08   svc:/system/filesystem/autofs:default
online         Sep_08   svc:/network/loopback:default
online         Sep_08   svc:/network/physical:default
online         Sep_08   svc:/system/cryptosvc:default
online         Sep_08   svc:/system/filesystem/local:default
online         Sep_08   svc:/system/utmp:default

To list service FMRIs that are dependent upon this service (ie: this service's state impacts their state) use svcs -D fmri:

benr@aeon network$ svcs -D ssh
STATE          STIME    FMRI
online         Sep_08   svc:/milestone/multi-user-server:default

To determine the type of dependency use svcs -l fmri:

benr@aeon network$ svcs -l ssh
fmri         svc:/network/ssh:default
...
dependency   require_all/none svc:/system/filesystem/local (online)
dependency   optional_all/none svc:/system/filesystem/autofs (disabled)
dependency   require_all/none svc:/network/loopback (online)
dependency   require_all/none svc:/network/physical (multiple)
dependency   require_all/none svc:/system/cryptosvc (online)
dependency   require_all/none svc:/system/utmp (online)
dependency   require_all/restart file://localhost/etc/ssh/sshd_config (online)

In the example above, a dependency listed as "require_all/none" (grouping/restart_on) must be online for this service (ssh in this case) to run, but a restart of that dependency does not restart this one.
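
For scripting, the dependency lines of svcs -l can be split into their parts. The following is a minimal sketch, assuming the output layout shown above; the function name list_deps is hypothetical. It uses awk's gsub, so on Solaris nawk or /usr/xpg4/bin/awk may be needed in place of the default awk.

```shell
# list_deps
# Reads `svcs -l` output on stdin and prints, for each dependency line:
#   grouping restart_on target state
list_deps() {
    awk '$1 == "dependency" {
        split($2, kind, "/")        # kind[1]=grouping, kind[2]=restart_on
        state = $4
        gsub(/[()]/, "", state)     # "(online)" becomes "online"
        print kind[1], kind[2], $3, state
    }'
}

# Sample lines copied from the ssh output above; on a live system
# the input would come from: svcs -l ssh | list_deps
list_deps <<'EOF'
fmri         svc:/network/ssh:default
dependency   optional_all/none svc:/system/filesystem/autofs (disabled)
dependency   require_all/restart file://localhost/etc/ssh/sshd_config (online)
EOF
```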

Example:

  ... insert here ...

Startup Ordering by Dependencies

Startup ordering is based on the dependency chain. Use svcs -d to see which services a given service depends on and svcs -D to see what depends on that service. Using svcs -l we can see the nature of these deps (optional/required) and form an ordering.

$ svcs -l name-services | grep ^dep
dependency   optional_all/none svc:/network/dns/client (online)
dependency   optional_all/none svc:/network/ldap/client (disabled)
dependency   optional_all/none svc:/network/nis/client (disabled)
dependency   optional_all/none svc:/network/rpc/nisplus (disabled)

$ svcs -d name-services
STATE          STIME    FMRI
disabled        2:02:34 svc:/network/rpc/nisplus:default
disabled        2:02:34 svc:/network/nis/client:default
disabled        2:02:34 svc:/network/ldap/client:default
online          2:02:47 svc:/network/dns/client:default

$ svcs -D name-services
STATE          STIME    FMRI
disabled        2:02:34 svc:/network/nfs/cbd:default
disabled        2:02:34 svc:/network/nfs/mapid:default
disabled        2:02:34 svc:/network/nfs/client:default
disabled        2:02:34 svc:/system/filesystem/autofs:default
disabled        2:02:34 svc:/network/rpc/bootparams:default
disabled        2:02:35 svc:/network/apocd/udp:default
online          2:02:49 svc:/system/cron:default
online          2:02:50 svc:/application/opengl/ogl-select:default
online          2:02:52 svc:/network/inetd:default
online          2:02:54 svc:/system/system-log:default
online          2:02:54 svc:/network/smtp:sendmail
online          2:02:55 svc:/system/auditd:default
online          2:03:15 svc:/milestone/multi-user:default

From that we can come up with something like this:

[NIS] [NIS+] [DNS] [LDAP]
   \      |     |   /
   [ name-services ]
   /    |     |     \
[smtp] [nfs] [cron] [inetd] ...

We can continue to do this to form an even larger picture, but it would be overwhelmingly big. Use the SMF Dependency Graph Generator tool to get something close.
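
A rough ordering can also be derived by hand. The sketch below is an illustration only and is not how svc.startd orders startup: it turns svcs -d output into "dependency service" pairs and feeds them to the standard tsort utility, which prints each dependency before the services that need it. The function name edges_from_deps is hypothetical, the sample data is a subset of the name-services output above, and the availability of tsort on the system is an assumption.

```shell
# edges_from_deps SERVICE
# Reads `svcs -d SERVICE` output on stdin and prints one
# "dependency SERVICE" pair per line, the input format tsort expects.
edges_from_deps() {
    awk -v svc="$1" '$1 != "STATE" && NF > 0 { print $NF, svc }'
}

# Two dependencies of name-services from the output above, plus one
# hand-written pair for a service that depends on name-services.
{
    edges_from_deps name-services <<'EOF'
STATE          STIME    FMRI
disabled        2:02:34 svc:/network/ldap/client:default
online          2:02:47 svc:/network/dns/client:default
EOF
    echo "name-services svc:/system/cron:default"
} | tsort
```

When several orderings are valid (here, the two client services), tsort is free to pick any of them.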

Example Service: Mongrels

...

Many of our users are using the SMF Manifest created by Mark Cornick, a Joyent user and Rails developer who used to be a Solaris SysAdmin. His Manifest (no profile) looks like this:

<?xml version='1.0'?>
<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>
<service_bundle type='manifest' name='mongrel/yourapp'>
  <service name='network/mongrel/yourapp' type='service' version='0'>
    <create_default_instance enabled='true'/>
    <single_instance/>

    <dependency name='fs' grouping='require_all' restart_on='none' type='service'>
      <service_fmri value='svc:/system/filesystem/local'/>
    </dependency>
    <dependency name='net' grouping='require_all' restart_on='none' type='service'>
      <service_fmri value='svc:/network/loopback'/>
      <!-- need nfs/client if the mongrel stuff is on nfs -->
      <!-- for instance, you need this for TextDrive container hosting -->
      <!-- if you keep your stuff on /home -->
      <service_fmri value='svc:/network/nfs/client'/>
    </dependency>
    <dependent name='mongrel_yourapp' restart_on='none' grouping='optional_all'>
      <service_fmri value='svc:/milestone/multi-user'/>
    </dependent>

    <exec_method name='start' type='method' exec='/opt/csw/bin/mongrel_rails cluster::start' timeout_seconds='60'>
      <method_context working_directory='/path/to/yourapp'>
        <method_credential user='youruser' group='yourgroup' />
        <method_environment>
          <!-- you'll need this if you use blastwave -->
          <envvar name="PATH" value="/usr/bin:/bin:/opt/csw/bin" />
        </method_environment>
      </method_context>
    </exec_method>

    <exec_method name='stop' type='method' exec=':kill' timeout_seconds='60'>
      <method_context/>
    </exec_method>

  </service>
</service_bundle>


Methods

Methods are used to perform actions such as start, stop, restart, and refresh on a service. They can be simple "exec-and-fork" one-line statements in the Manifest XML or complete scripts.

Exec Methods

Exec Methods are responsible for defining what actually should happen when a service changes state. Each exec_method can include elements that define the context in which the method runs. Please see the "Method Context" section below.

exec_method accepts the following properties:

  • type: "method" or "monitor"
  • name: Name of this exec method, usually a defined interface of the restarter
  • exec: String identifying the action to take, for svc.startd this is a string passed to exec(2)
  • timeout_seconds: An integer duration in seconds to wait for the method to complete. A value of "0" or "-1" denotes infinite timeout.
  • delete: Specifies that this exec_method should be removed (Optional)

svc.startd(1M) lists the following valid method names:

  • refresh: Reload appropriate configuration params from repository (SMF) or config file (app) without interrupting service. Typically this is a SIGHUP. (Optional)
  • start: Start the service. Return success after app is available to consumers. Fail if a conflicting instance is already running, or if service is unable to start.
  • stop: Stop the service.

A "restart" is a stop then start of the service.


Monitor Methods

The service_bundle DTD currently notes two types of exec_methods, 'method' and 'monitor'. According to Liane, monitor is currently only supported by restarters. See 'monitor Methods' post from SMF-Discuss Mailing list.

Method Context

One of the great powers of SMF is the ability to add specific context to a service. That is, for each exec_method (individually or collectively) you can define context using the following XML structure:

  1. exec_method ...: type, name, exec, timeout_seconds, delete
    1. method_context: working_directory, project, resource_pool
      1. method_credential ...: user, group, supp_groups, privileges, limit_privileges
      2. method_profile ...: name
      3. method_environment
        1. envvar: name, value

Valid properties of a method context are:

  • use_profile: Boolean that specifies whether the RBAC profile should be used instead of user, group, privs, and limit_privs properties
  • environment: Environment variables to insert into the environment of the method in the form of multiple NAME=value strings
  • profile: Name of an RBAC profile that identifies an entry in exec_attr matching the method
  • user: User ID in numeric or text form
  • group: Group ID in numeric or text form
  • supp_groups: Optional string that specifies supplemental group membership by ID
  • privileges: Optional string that specifies the privilege set as defined in privileges(5).
  • limit_privileges: Optional string specifying the limit priv set as defined by privileges(5).
  • working_directory: Home directory from which to launch the method. ":home" can be used to match the home dir of the UID specified to launch the method. If not set, ":home" is default. Note: at present you must set the user/group method_credential for working_directory to work!
  • corefile_pattern: (Not in DTD?) Optional string that specifies the corefile pattern to use for the service as per coreadm(1M). This setting overrides the global core pattern.
  • project: The Project ID in numeric or text form. ":default" can be used to identify "getdefaultproj()" for the user whose uid is used to launch the method
  • resource_pool: Resource pool name on which to launch method. ":default" will use the entry from /etc/project specified for the uid used to launch the method.


Method context can be defined per-service or per-method; that is, defined within a specific exec_method or between the service and instance tags.

Simple Example:

        <method_context>
              <method_credential user='root' group='root' />
              <method_environment>
                     <envvar name="PATH" value="/bin:/opt/csw/bin" />
               </method_environment>
         </method_context>

Complex Example:

 <method_context
   working_directory='/u01/app/oracle/product/10.2.0/db_1'
   project='oracle'
   resource_pool=':default'>

   <method_credential
    user='oracle'
    group='dba'
    supp_groups=':default'
    privileges=':default'
    limit_privileges=':default'/>

   <method_environment>
    <envvar name='ORACLE_SID' value='orcl1' />
    <envvar name='ORACLE_HOME' value='/u01/app/oracle/product/10.2.0/db_1' />
    <envvar name='ORA_NLS10' value='' />

   </method_environment>
  </method_context>

See smf_method(5) for additional details.

Writing Methods

SMF Methods are complete scripts resembling legacy SysV init scripts. The recommended invocation is the same: supply a start, stop, etc. argument to the script. The primary difference is that SMF passes information to the method in the form of environment variables. These are:

  • SMF_FMRI: The service fault management resource identifier (FMRI) of the instance for which the method is invoked.
  • SMF_METHOD: The full method name of the method that is invoked
  • SMF_RESTARTER: The service FMRI of the restarter that invokes the method
  • SMF_ZONENAME: The name of the zone in which the method is running. This can also be obtained by using the zonename(1) command.
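
As a minimal sketch of how a method script might use these variables, the function below reports how it was invoked and refuses to run when SMF_FMRI is not set. The function name describe_invocation is hypothetical; the fallback value 99 for SMF_EXIT_ERR_NOSMF is the one smf_include.sh defines, set here only so the sketch runs without the include.

```shell
# Fallback so the sketch runs where smf_include.sh is not available.
: "${SMF_EXIT_ERR_NOSMF:=99}"

# describe_invocation
# Prints the SMF-provided context, or fails when run outside smf(5).
describe_invocation() {
    if [ -z "${SMF_FMRI:-}" ]; then
        echo "this script can only be invoked by smf(5)" >&2
        return "$SMF_EXIT_ERR_NOSMF"
    fi
    echo "method=$SMF_METHOD fmri=$SMF_FMRI restarter=$SMF_RESTARTER zone=$SMF_ZONENAME"
}

# In a real method script this would typically be:
#   describe_invocation || exit $?
```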

A variety of tokens are available to use in your Manifest when calling a method, which are:

  • %r: Name of the restarter
  • %m: Name of the method
  • %s: Name of the service
  • %i: Name of the instance
  • %f: Name of the FMRI
  • %{prop[:,]}: Value(s) of a property. (ie: %{general/action_authorization})

Token example, the following invocation is sent to a script that outputs these values:

<exec_method name='start' type='method' exec='/tmp/testing %m %s %i' timeout_seconds='60' />

The output from the above is:

[ Sep 16 01:30:17 Executing start method ("/tmp/testing start network/test default") ]
Method: start
Service: network/test
Instance: default
[ Sep 16 01:30:17 Method "start" exited with status 0 ]
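
To make the substitution concrete, here is a small shell sketch that mimics the expansion of the simple tokens. It is an illustration only, not the restarter's implementation: the function name expand_tokens is hypothetical, %{prop} tokens are not handled, the FMRI is assumed to be a full instance FMRI of the form svc:/service:instance, and the values are assumed not to contain "|" or "&".

```shell
# expand_tokens EXEC_STRING METHOD FMRI [RESTARTER]
# Prints EXEC_STRING with %m, %s, %i, %f and %r replaced.
expand_tokens() {
    exec_str=$1
    method=$2
    fmri=$3
    restarter=${4:-svc.startd}
    rest=${fmri#svc:/}       # e.g. network/test:default
    service=${rest%%:*}      # e.g. network/test
    instance=${rest##*:}     # e.g. default
    printf '%s\n' "$exec_str" | sed \
        -e "s|%m|$method|g" \
        -e "s|%s|$service|g" \
        -e "s|%i|$instance|g" \
        -e "s|%f|$fmri|g" \
        -e "s|%r|$restarter|g"
}

# Reproduces the command line seen in the log above:
expand_tokens '/tmp/testing %m %s %i' start svc:/network/test:default
```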

Several convenience shell functions are provided by /lib/svc/share/smf_include.sh. These functions are:

  • smf_present(): Returns 0 if the system is running SMF (use if you want to have one script that is used for both SMF and Init)
  • smf_clear_env(): Unset the SMF environmental variables above
  • smf_console(): Used for SMF logging (ie: echo message 2>&1 | smf_console)
  • smf_zonename(): Print the name of the zone
  • smf_is_globalzone(): Return 0 if globalzone, 1 otherwise
  • smf_is_nonglobalzone(): Inverse of previous
  • smf_configure_ip(): Return 0 if the zone needs IP to be configured (global or exclusive zone; ie: you actually can modify network stack)
  • smf_dont_configure_ip(): Inverse of above (non-global or shared zone)
  • smf_is_system_labeled(): Return 0 if system is labeled (ie: Trusted Extensions)
  • smf_netstrategy (): Sets _INIT_NET_IF of the network-booted interface if the system was booted from the network. _INIT_NET_STRATEGY is also set as "none", "dhcp", or "rarp".
  • smf_kill_contract(): Called from stop methods, accepts the args "CONTRACT SIGNAL WAIT TIMEOUT"

This include also provides exit status definitions:


     SMF_EXIT_OK            0           Method exited, performing its
                                        operation successfully.
     SMF_EXIT_ERR_FATAL     95          Method failed fatally and  is
                                        unrecoverable  without admin-
                                        istrative intervention.
     SMF_EXIT_ERR_CONFIG    96          Unrecoverable   configuration
                                        error.   A  common  condition
                                        that returns this exit status
                                        is  the  absence  of required
                                        configuration  files  for  an
                                        enabled service instance.
     SMF_EXIT_MON_DEGRADE   97          Monitor   assesses    service
                                        instance  as  operating  in a
                                        degraded mode.
     SMF_EXIT_MON_OFFLINE   98          Monitor   assesses    service
                                        instance   as  non-responsive
                                        and effectively offline.
     SMF_EXIT_ERR_NOSMF     99          Method  has  been  mistakenly
                                        invoked  outside  the  smf(5)
                                        facility.    Services    that
                                        depend on smf(5) capabilities
                                        should exit with this  status
                                        value.
     SMF_EXIT_ERR_PERM      100         Method  requires  a  form  of
                                        permission   such   as   file
                                        access, privilege, authoriza-
                                        tion,   or  other  credential
                                        that is  not  available  when
                                        invoked.
     SMF_EXIT_ERR_OTHER     non-zero    Any non-zero exit status from
                                        a  method  is  treated  as an
                                        unknown error.  A  series  of
                                        unknown  errors  can be diag-
                                        nosed as a fault by the  res-
                                        tarter  or  on  behalf of the
                                        restarter.

SMF_EXIT_ERR_OTHER is not defined by the include; it stands for any other non-zero exit status.
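
As a sketch of how these statuses might be used, the function below checks a configuration file before a daemon is started. The function name check_config and the path in the usage comment are hypothetical. The fallback values are those from the table above and are set only when the include has not already defined them.

```shell
# Use the real definitions when the include exists (Solaris); otherwise
# fall back to the table values so the sketch can be tried anywhere.
if [ -r /lib/svc/share/smf_include.sh ]; then
    . /lib/svc/share/smf_include.sh
fi
: "${SMF_EXIT_OK:=0}" "${SMF_EXIT_ERR_CONFIG:=96}" "${SMF_EXIT_ERR_PERM:=100}"

# check_config FILE
# Returns the status a start method could exit with for this file.
check_config() {
    if [ ! -f "$1" ]; then
        echo "required configuration file $1 is missing" >&2
        return "$SMF_EXIT_ERR_CONFIG"
    fi
    if [ ! -r "$1" ]; then
        echo "configuration file $1 is not readable" >&2
        return "$SMF_EXIT_ERR_PERM"
    fi
    return "$SMF_EXIT_OK"
}

# In a start method (hypothetical path):
#   check_config /etc/myapp.conf || exit $?
```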


The following is an example of the method for SNMPd:

. /lib/svc/share/smf_include.sh

SNMPD_FILE=/etc/sma/snmp/snmpd.conf
SNMPCONFPATH=/etc/sma/snmp:/var/sma_snmp
MIBDIRS=/etc/sma/snmp/mibs

export SNMPCONFPATH
export MIBDIRS

if /usr/bin/egrep  '#DISABLE=YES' ${SNMPD_FILE} > /dev/null 2>&1; then
        echo "snmpd disabled by config file ${SNMPD_FILE}"
        svcadm disable svc:/application/management/sma
        exit $SMF_EXIT_MON_OFFLINE;
else
        /usr/sfw/sbin/snmpd
fi

Here is an example of a simple service that doesn't execute a daemon, nfs/client. Here exiting with SMF_EXIT_OK keeps the method from failing due to the lack of resultant PIDs:

. /lib/svc/share/smf_include.sh

stop_nfsclnt()
{
        /sbin/umountall -F cachefs
        /sbin/umountall -F nfs
}

case "$1" in
'start')

        /sbin/mountall -F nfs
        /sbin/mountall -F cachefs
        /sbin/swapadd
        ;;

'stop')
        stop_nfsclnt
        ;;

*)
        echo "Usage: $0 { start | stop }"
        exit 1
        ;;
esac
exit $SMF_EXIT_OK

And here is an example for svc-zones:

. /lib/svc/share/smf_include.sh

#
# Return a list of running, non-global zones for which a shutdown via
# "/sbin/init 0" may work (typically only Solaris zones.)
#
# At present, this means any running "lx" zones don't qualify.
#
shutdown_zones()
{
        zoneadm list -p | nawk -F: '{
                if (($5 != "lx") && ($2 != "global")) {
                        print $2
                }
        }'
}

[ ! -x /usr/sbin/zoneadm ] && exit 0    # SUNWzoneu not installed

if [ -z "$SMF_FMRI" ]; then
        echo "this script can only be invoked by smf(5)"
        exit $SMF_EXIT_ERR_NOSMF
fi

# Make sure working directory is / to prevent unmounting problems.
cd /
PATH=/usr/sbin:/usr/bin; export PATH

case "$1" in
'start')
        egrep -vs '^#|^global:' /etc/zones/index || exit 0  # no local zones
        ZONES=""
        for zone in `zoneadm list -pi | nawk -F: '{
                        if ($3 == "installed") {
                                print $2
                        }
                }'`; do
                zonecfg -z $zone info autoboot | grep "true" >/dev/null 2>&1
                if [ $? -eq 0 ]; then
                        [ -z "$ZONES" ] && echo "Booting zones:\c"
                        ZONES=yes
                        echo " $zone\c"
                        #
                        # zoneadmd puts itself into its own contract so
                        # this service will lose sight of it.  We don't
                        # support restart so it is OK for zoneadmd to
                        # to be in an orphaned contract.
                        #
                        zoneadm -z $zone boot &
                fi
        done
        #
        # Wait for all zoneadm processes to finish before allowing the
        # start method to exit.
        #
        wait
        [ -n "$ZONES" ] && echo .
        ;;

'stop')
        egrep -vs '^#|^global:' /etc/zones/index || exit 0  # no local zones
        [ "`zoneadm list`" = "global" ] && exit 0   # no zones running

        SVC_TIMEOUT=`svcprop -p stop/timeout_seconds $SMF_FMRI`

        #
        # First, try shutting down any running zones for which an "init 0" may
        # work.
        #
        MAXSHUT=`expr 3 \* $SVC_TIMEOUT \/ 4` # 3/4 of time to zone shutdown
        MAXHALT=`expr $SVC_TIMEOUT \/ 4`      # rest of time goes to halt

        zonelist=`shutdown_zones`

        if [ -n "$zonelist" ]; then
                SHUTDOWN=0
                echo "Shutting down running zones (for up to $MAXSHUT" \
                    "seconds):\c"

                for zone in $zonelist; do
                        echo " $zone\c"
                        zlogin -S $zone /sbin/init 0 < /dev/null >&0 2>&0 &
                        SHUTDOWN=1
                done

                [ $SHUTDOWN -eq 1 ] && echo "."

                # Allow time for zones to shutdown cleanly

                while [ $MAXSHUT -gt 0 -a "`shutdown_zones`" != "" ]; do
                        MAXSHUT=`expr $MAXSHUT - 1`
                        sleep 1 # wait a bit longer
                done
        fi

        #
        # Second, try halting any non-global zones still running
        #
        WAITPIDS=""
        for zone in `zoneadm list`; do
                if [ "$zone" != "global" ]; then
                        [ -z "$WAITPIDS" ] &&
                            echo "Zones failed to shutdown; trying to halt " \
                            "(for up to $MAXHALT seconds):\c"
                        echo " $zone\c"
                        zoneadm -z $zone halt &
                        WAITPIDS="$WAITPIDS $!"
                fi
        done
        [ ! -z "$WAITPIDS" ] && echo .

        # Wait for the 'zoneadm halt' commands to complete.  We will let this
        # run forever, since the restart daemon will eventually kill us off
        # anyway if the halts do not complete after a certain period of time.
        wait $WAITPIDS

        # If the halts complete but a zone is still not shutdown, it might
        # be in a state like 'shutting_down' or 'down'.  So we give it some
        # time to come all the way down.

        while [ $MAXHALT -gt 0 -a "`zoneadm list`" != "global" ]; do
                MAXHALT=`expr $MAXHALT - 1`
                sleep 1 # wait a bit longer
        done

        #
        # Report on zones which failed to shutdown.
        #
        for zone in `zoneadm list`; do
                if [ "$zone" != "global" ]; then
                        echo "Zone '$zone' failed to halt."
                fi
        done
        [ "`zoneadm list`" != "global" ] && exit 1   # zones still running
        ;;

*)
        echo "Usage: $0 { start | stop }"
        exit 1
        ;;
esac
exit 0

Transient Services

SMF's criterion for success is a PID spawned from a method that continues to run. This PID or set of PIDs is tracked via contracts. If all the PIDs exit, the restarter responds by restarting the service.

But what if you want to write a service that doesn't spawn a long-running daemon but rather just does something once? A great example is nfs/client, which binds to several dependencies and runs mountall -F nfs. Services of this kind are known as transient services.

To accomplish this and avoid a restart loop of doom we must set the startd/duration property to "transient". The following is an example manifest:

<?xml version='1.0'?>
<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>
<service_bundle type='manifest' name='test'>
  <service name='network/test' type='service' version='0'>

    <instance name="default" enabled="false">
     <dependency name='fs' grouping='require_all' restart_on='none' type='service'>
        <service_fmri value='svc:/system/filesystem/local'/>
     </dependency>
     <exec_method name='start' type='method' exec='/home/benr/svn/smf/test/testing %m %s %i' timeout_seconds='60'>
        <method_context working_directory='/home/benr/svn/smf/test/' />
     </exec_method>
     <exec_method name='stop' type='method' exec=':true' timeout_seconds='60' />

     <property_group name='startd' type='framework'>
        <propval name='duration' type='astring' value='transient' />
     </property_group>
    </instance>

  </service>
</service_bundle>

And here is the accompanying method:


# Include SMF convenience library:
. /lib/svc/share/smf_include.sh

# This script is invoked thusly:
#   testing %m %s %i

echo "Method: ${1}"
echo "Service: ${2}"
echo "Instance: ${3}"


exit $SMF_EXIT_OK

The result in the log is:

[ Sep 18 03:01:03 Executing start method ("/home/benr/svn/smf/test/testing start network/test default") ]
Method: start
Service: network/test
Instance: default
[ Sep 18 03:01:03 Method "start" exited with status 0 ]

Service Instances

Services can be defined as single or multi-instance. With a single instance you are limited to 'my/service:default' and can not create derivative instances in the future. With multi-instance you can create multiple instances of the same service that share the following definitions (by defining them prior to the <instance ...> tag):

  • restarter
  • dependency
  • dependent
  • method_context
  • exec_method
  • stability
  • template
  • property_group

Example:

<?xml version='1.0'?>
<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>
<service_bundle type='manifest' name='test'>
  <service name='network/test' type='service' version='0'>

     <dependency name='fs' grouping='require_all' restart_on='none' type='service'>
        <service_fmri value='svc:/system/filesystem/local'/>
     </dependency>
     <exec_method name='start' type='method' exec='./testing %m %s %i' timeout_seconds='60'/>
     <exec_method name='stop' type='method' exec=':true' timeout_seconds='60' />

     <property_group name='startd' type='framework'>
        <propval name='duration' type='astring' value='transient' />
     </property_group>

    <instance name="default" enabled="false">
        <method_context working_directory="/home/benr/svn/smf/test/">
                <method_credential user="root" group="staff"/>
                <method_environment>
                        <envvar name="PATH" value="/usr/bin:/sbin:/usr/sbin:/usr/local/bin" />
                </method_environment>
        </method_context>
    </instance>

    <instance name="defunct" enabled="false">
        <method_context working_directory="/home/benr/svn/smf/test/">
                <method_credential user="benr" group="staff"/>
                <method_environment>
                        <envvar name="PATH" value="/usr/bin:/sbin:/usr/sbin:/usr/local/bin" />
                </method_environment>
        </method_context>
    </instance>

  </service>
</service_bundle>

Over-rides do not work; anything different between multiple instances must be defined individually in each instance.

Which is seen as:

root@aeon test$ svcadm enable test:defunct
root@aeon test$ svcs test
STATE          STIME    FMRI
online          3:58:12 svc:/network/test:default
online          3:58:31 svc:/network/test:defunct

SMF Service Properties

One powerful capability of SMF is the ability to store key/value pairs of properties within the SMF Repository. These properties can then be queried by scripts (methods most likely) to change behaviour.

For example, the following code excerpt (from an Apache2 Method) uses a service property to decide whether or not to enable SSL:

        ssl=`svcprop -p httpd/ssl svc:/network/http:cswapache2`
        if [ "$ssl" = false ]; then
                cmd="start"
        else
                cmd="startssl"
        fi

The query itself looks like this:

# svcprop -p httpd/ssl svc:/network/http:cswapache2
false
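
When a property may be missing, a method can fall back to a default. The following is a minimal sketch, assuming svcprop exits non-zero when the property cannot be read; the function name prop_or_default is hypothetical.

```shell
# prop_or_default PROPERTY FMRI DEFAULT
# Prints the property value, or DEFAULT when svcprop fails or
# returns nothing.
prop_or_default() {
    value=$(svcprop -p "$1" "$2" 2>/dev/null) || value=$3
    [ -n "$value" ] || value=$3
    printf '%s\n' "$value"
}

# Usage in the Apache2 method above:
#   ssl=`prop_or_default httpd/ssl svc:/network/http:cswapache2 false`
```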

svcprop

The svcprop tool allows you to easily and quickly work with service properties without the need to use svccfg. The output produced by svcprop is far more easily scripted than other methods.

...

Delegated Control with RBAC

Using RBAC, individual services (FMRIs) can be added as authorizations in /etc/security/auth_attr. These authorizations can then be attributed directly to a user account (usermod -A solaris.smf.manage.XYZ user), added to a role (rolemod ...), or bundled into a profile which is then attributed to a user or role.

To determine existing authorizations for users and roles please examine /etc/user_attr.

Example of giving access to the http service to the user "benr". First add the following line to /etc/security/auth_attr:

solaris.smf.manage.http:::Manage HTTP service states::

Then associate it:

$ usermod -A solaris.smf.manage.http benr
UX: usermod: benr is currently logged in, some changes may not take effect until next login.

$ grep ^benr /etc/user_attr 
benr::::type=normal;auths=solaris.smf.manage.http;profiles=Software Installation;roles=zfs
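
The auths entry can also be pulled out of user_attr with a short script. This is a sketch that assumes the colon-separated layout shown above, with no colons inside the attribute field; the function name user_auths is hypothetical. It uses awk's sub, so on Solaris nawk or /usr/xpg4/bin/awk may be needed. On a live system the auths(1) command reports a user's authorizations directly.

```shell
# user_auths USER
# Reads user_attr-style lines on stdin and prints the authorizations
# assigned directly to USER, one per line.
user_auths() {
    awk -F: -v user="$1" '$1 == user {
        n = split($5, attrs, ";")            # key=value pairs
        for (i = 1; i <= n; i++) {
            if (attrs[i] ~ /^auths=/) {
                sub(/^auths=/, "", attrs[i])
                m = split(attrs[i], list, ",")
                for (j = 1; j <= m; j++) print list[j]
            }
        }
    }'
}

# Sample line copied from the grep output above; on a live system:
#   user_auths benr < /etc/user_attr
user_auths benr <<'EOF'
benr::::type=normal;auths=solaris.smf.manage.http;profiles=Software Installation;roles=zfs
EOF
```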

Now test:

...

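Note: this did not work as expected when tested, particularly on non-default instances. One possible cause (an assumption, not verified here) is that the instance has no general/action_authorization property naming the new authorization; see smf_security(5).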


See RBAC#Example_1:_SMF_Control for details.

Enhanced SMF Profiles

As noted above, SMF Profiles are a dump of the complete running state of a system, essentially limited to FMRI and enabled or disabled. The Enhanced SMF Profile Project aims to expand the concept to have a tiered overlay structure.

For details read the Enhanced SMF Profiles Design Document (PDF available).

Understanding Contracts

...

Snapshots

Historical data about each instance in the repository is maintained by the service management facility. This data is made available as read-only snapshots for administrative inspection and rollback.

The smf(5) man page defines the following snapshot types:

  • initial: Initial configuration of the instance created by the administrator or produced during package installation.
  • last_import: Configuration as prescribed by the manifest of the service that is taken during svccfg(1M) import operation. This snapshot provides a baseline for determining property customization.
  • previous: Current configuration captured when an administrative undo operation is performed.
  • running: The running configuration of the instance.
  • start: Configuration captured during a successful transition to the online state.

Please refer to the svccfg Snapshot commands section above for details on interacting with snapshots. Snapshots can not be manually created.

Why snapshots are useful? No idea.

Restarters

For each managed service, a restarter relies on retrieving properties on the service instance to determine configuration. The restarter manages a set of property groups to communicate the current disposition of a service with display tools such as svcs(1).

From the smf_restarter(5) man page:

  Service configuration
     The common restarter configuration for all services is  cap-
     tured in the general property group. This group includes the
     following required and optional property settings.

     enabled              This is a required  property.  If  set,
                          the  restarter  of an instance attempts
                          to maintain availability  of  the  ser-
                          vice.

     restarter            This  is  an  optional  property   that
                          allows  the  specification of an alter-
                          nate restarter to  manage  the  service
                          instance.  If the restarter property is
                          empty or absent, the restarter defaults
                          to svc.startd(1M).

     single_instance      This is an optional property. When set,
                          only  one  instance  of  the service is
                          allowed to transition to an  online  or
                          degraded status at any time.

     transient_process    This  is  an  optional  property.  When
                          specified, the service does not contain
                          objects that run indefinitely

  Service reporting
     All restarters report status using  the  restarter  property
     group, which includes the following properties:

     next_state         The current  state  and  next  state,  if
                        currently  in  transition,  for instances
                        stored in these  properties.  See  smf(5)
                        for   a   description  of  the  potential
                        states.

     auxiliary_state    An astring with no spaces that contains a
                        precise   term   to   describe  the  full
                        restarter-specific state  in  combination
                        with  the  restarter state property.  The
                        auxiliary state may not always be set and
                        is  always  cleared during transition out
                        of any state. Each restarter must  define
                        the  precise  list of auxiliary states it
                        will use.

     state_timestamp    The  time  when  the  current  state  was
                        reached.

     contract           The primary process contract ID, if  any,
                        under  which the service instance is exe-
                        cuting.
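
The restarter property group described above can be read with svcprop(1). The following is a rough sketch: restarter_prop is a hypothetical helper, and it assumes svcprop prints one property per line as "group/name type value". The sample values in the usage note are illustrative, not captured from a real system.

```shell
#!/bin/sh
# restarter_prop NAME   (hypothetical helper)
# Reads "svcprop -p restarter FMRI" style lines on stdin, assumed to look like
#   restarter/<name> <type> <value...>
# and prints the value of restarter/NAME. Exits non-zero if it is not present.
restarter_prop() {
    awk -v want="restarter/$1" '
        $1 == want {
            # Drop the property name and type, keep the rest as the value.
            sub(/^[^ \t]+[ \t]+[^ \t]+[ \t]+/, "")
            print
            found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}
```

On an SMF system this would be used as `svcprop -p restarter svc:/network/test:default | restarter_prop state`, which should agree with the STATE column of svcs(1).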

Test Manifest & Method

The best way to learn SMF is to play with it. The following is an example service for doing just that.

<?xml version='1.0'?>
<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>
<service_bundle type='manifest' name='test'>
  <service name='network/test' type='service' version='0'>

    <instance name='default' enabled='false'>
      <!-- Wait for local filesystems before starting. -->
      <dependency name='fs' grouping='require_all' restart_on='none' type='service'>
        <service_fmri value='svc:/system/filesystem/local'/>
      </dependency>

      <!-- %m, %s and %i expand to the method name, service name and instance name. -->
      <exec_method name='start' type='method' exec='./testing %m %s %i' timeout_seconds='60'>
        <method_context working_directory='/home/benr/svn/smf/test/'>
          <method_credential user='root' group='staff'/>
          <method_environment>
            <envvar name='PATH' value='/usr/bin:/sbin:/usr/sbin:/usr/local/bin'/>
          </method_environment>
        </method_context>
      </exec_method>
      <exec_method name='stop' type='method' exec=':true' timeout_seconds='60'/>

      <!-- Transient: the start method runs once and leaves no long-running process. -->
      <property_group name='startd' type='framework'>
        <propval name='duration' type='astring' value='transient'/>
      </property_group>
    </instance>

  </service>
</service_bundle>

And the method:

#!/bin/bash
# Test Method
# benr - 9/14/07

# Include SMF convenience library:
. /lib/svc/share/smf_include.sh

# This script is invoked thusly:
#   testing %m %s %i

echo "Method: ${1}"
echo "Service: ${2}"
echo "Instance: ${3}"
echo "PWD is: $PWD"
echo "HOME is: $HOME"
echo "UID is: $UID"

env

exit $SMF_EXIT_OK
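
To try the pair out, the manifest has to be imported and the instance enabled, after which the method's output lands in the service log. The following is a sketch under a few assumptions: the manifest is saved as test.xml (a made-up file name), the working_directory in the manifest has been changed to wherever the testing script lives, and logs follow the usual /var/svc/log naming. smf_log_path is a hypothetical helper. On a machine without the SMF tools the script only prints the log path it would read.

```shell
#!/bin/sh
# smf_log_path FMRI   (hypothetical helper)
# Prints the conventional svc.startd log path for an instance FMRI, e.g.
#   svc:/network/test:default -> /var/svc/log/network-test:default.log
smf_log_path() {
    name=${1#svc:/}                      # strip the scheme prefix
    printf '/var/svc/log/%s.log\n' "$(printf '%s' "$name" | tr '/' '-')"
}

manifest=${MANIFEST:-test.xml}           # assumed file name for the manifest above
fmri=svc:/network/test:default

if command -v svccfg >/dev/null 2>&1; then
    # Validate, import, then enable and wait for the instance to come online.
    svccfg validate "$manifest" &&
    svccfg import "$manifest" &&
    svcadm enable -s "$fmri"
    svcs -l "$fmri"                      # detailed status
    cat "$(smf_log_path "$fmri")"        # output of the echo and env calls
else
    echo "SMF tools not found; would read $(smf_log_path "$fmri")"
fi
```

Because the service is transient, it should show as online once the start method has exited successfully, even though no process is left running.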


SMF & FMD

...


Repository and Rebuilding

The repository, a SQLite database, lives at /etc/svc/repository.db and can be manipulated....

Full "seed" Recovery

The following will completely re-initialize SMF. Here the -r option points the script at an alternate root, in this case a zone's root directory:

[christina:/zones/z02380AA/root] root# /lib/svc/bin/restore_repository -r `pwd`

See http://sun.com/msg/SMF-8000-MY for more information on the use of
this script to restore backup copies of the smf(5) repository.

If there are any problems which need human intervention, this script will
give instructions and then exit back to your shell.

Note that upon full completion of this script, the system will be rebooted
using reboot(1M), which will interrupt any active services.

The following backups of /zones/z02380AA/root/etc/svc/repository.db exist, from
oldest to newest:

manifest_import-20060803_153418
manifest_import-20060807_203011
boot-20070123_144209
boot-20070224_104229
boot-20070224_150958
boot-20080205_141435

The backups are named based on their type and the time what they were taken.
Backups beginning with "boot" are made before the first change is made to
the repository after system boot.  Backups beginning with "manifest_import"
are made after svc:/system/manifest-import:default finishes its processing.
The time of backup is given in YYYYMMDD_HHMMSS format.

Please enter either a specific backup repository from the above list to
restore it, or one of the following choices:

        CHOICE            ACTION
        ----------------  ----------------------------------------------
        boot              restore the most recent post-boot backup
        manifest_import   restore the most recent manifest_import backup
        -seed-            restore the initial starting repository  (All
                            customizations will be lost, including those
                            made by the install/upgrade process.)
        -quit-            cancel script and quit

Enter response [boot]: -seed-

After confirmation, the following steps will be taken:

/zones/z02380AA/root/etc/svc/repository.db
    -- renamed --> /zones/z02380AA/root/etc/svc/repository.db_old_20080207_231718
/zones/z02380AA/root/lib/svc/seed/global.db
    -- copied --> /zones/z02380AA/root/etc/svc/repository.db

Proceed [yes/no]? yes
/zones/z02380AA/root/etc/svc/repository.db
    -- renamed --> /zones/z02380AA/root/etc/svc/repository.db_old_20080207_231718
/zones/z02380AA/root/lib/svc/seed/global.db
    -- copied --> /zones/z02380AA/root/etc/svc/repository.db

The backup repository has been successfully restored.
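
The backup names in the transcript above follow the type-YYYYMMDD_HHMMSS pattern the script describes. As a small sketch, the hypothetical helper below splits such a name into its type and a readable timestamp, which can make a long list of backups easier to scan before choosing one.

```shell
#!/bin/sh
# backup_info NAME   (hypothetical helper)
# Splits a repository backup name of the form <type>-YYYYMMDD_HHMMSS into its
# type and a readable timestamp, e.g.
#   boot-20070224_104229 -> boot 2007-02-24 10:42:29
backup_info() {
    kind=${1%-*}         # everything before the last "-"
    stamp=${1##*-}       # everything after the last "-"
    when=$(printf '%s\n' "$stamp" |
        sed 's/^\(....\)\(..\)\(..\)_\(..\)\(..\)\(..\)$/\1-\2-\3 \4:\5:\6/')
    printf '%s %s\n' "$kind" "$when"
}

# Names taken from the listing above.
for b in manifest_import-20060803_153418 boot-20080205_141435; do
    backup_info "$b"
done
```

Splitting on the last hyphen rather than the first keeps types such as manifest_import intact.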

See Also

Attribution

Content donated by Joyent.