UNIX Unleashed, System Administrator's Edition
- 16 -
Startup and Shutdown
by David Gumkowski and John Semencar

Starting up and shutting down UNIX are unlike most system administration tasks in that after deciding when either occurs, the administrator is more a passive observer than a proactive participant. Vigilance and informed understanding are required more than anticipation of problems and needs. Outputting to the system console, startup generates a wealth of information about what is transpiring. Most importantly, it shows what devices appear or are interrupting and shows what tasks are beginning. Also, most boot problems produce some kind of message on the system console. This chapter discusses what some common console messages mean during startup and shutdown, identifies what commands are involved in either process, and describes daemons normally spawned as a result of restarting the system.

Startup
In the most basic sense, starting up a UNIX-based operating system, or booting, is an orderly method to accomplish a predefined set of tasks. Those tasks would normally include
An abbreviated sample startup from a Hewlett Packard HP-UX Release 10.x machine is found in Listing 16.1. Note that most startup messages are written to the system console device as well as the system log file. Please refer to your system's manual page for syslogd to find where your syslog configuration file is located. The last column of the configuration file indicates where the system log files are located.

Listing 16.1. Sample startup from a Hewlett Packard HP-UX Release 10.x machine.

**************************************************
HP-UX Start-up in progress
Thu May 01 06:00:00 EST 1997
**************************************************
Mount file systems
Output from "/sbin/rc1.d/S100hfsmount start":
----------------------------
Setting hostname
Output from "/sbin/rc1.d/S320hostname start":
----------------------------
Save system core image if needed
Output from "/sbin/rc1.d/S440savecore start":
----------------------------
EXIT CODE: 2 - savecore found no core dump to save
"/sbin/rc1.d/S440savecore start" SKIPPED
----------------------------
Recover editor crash files
Output from "/sbin/rc2.d/S200clean_ex start":
----------------------------
preserving editor files (if any)
List and/or clear temporary files
Output from "/sbin/rc2.d/S204clean_tmps start":
----------------------------
Starting the ptydaemon
Start network tracing and logging daemon
Output from "/sbin/rc2.d/S300nettl start":
----------------------------
Initializing Network Tracing and Logging...
Done.
Configure HP Ethernet interfaces
Output from "/sbin/rc2.d/S320hpether start":
----------------------------
Start NFS server subsystem
Output from "/sbin/rc3.d/S100nfs.server start":
----------------------------
starting NFS SERVER networking
Starting OpenView
Output from "/sbin/rc3.d/S940ov500 start":
----------------------------

Initialization Process
Specifically, the kernel, commonly named /vmunix or /unix, whether it is located directly on the root partition or in some subdirectory such as /stand on HP systems, will execute and give rise to a system father task, init. This father task will propagate child processes commonly needed for operation. Common operations normally completed during boot include such things as setting the machine's name, checking and mounting disks and file systems, starting system logs, configuring network interfaces and beginning network and mail services, commencing line printer services, enabling accounting and quotas, clearing temporary partitions, and saving core dumps. To understand how those functions come into being is to grasp how the father process operates. Though blurred by what constitutes BSD versus SYS V UNIX flavors today, the two flavors generate the identically named init, but their respective calling modes differ significantly.

Configuration File
Systems such as HP-UX, IRIX, Linux, and Solaris all use a very flexible init
process that creates jobs directed from a file named /etc/inittab. Init's general arguments are shown here:

0 Shut down the machine into a halted state. The machine enters a PROM monitor mode or a powered off condition.
Listing 16.2 shows an abbreviated sample inittab file.

Listing 16.2. Abbreviated sample inittab file.

strt:2:initdefault:
lev0:06s:wait:/etc/rc0 > /dev/console 2>&1 < /dev/console
lev2:23:wait:/etc/rc2 > /dev/console 2>&1 < /dev/console
lev3:3:wait:/etc/rc3 > /dev/console 2>&1 < /dev/console
rebt:6:wait:/etc/init.d/announce restart
ioin::sysinit:/sbin/ioinitrc > /dev/console 2>&1
brcl::bootwait:/sbin/bcheckrc < /dev/console 2>&1
cons:123456:respawn:/usr/sbin/getty console console
powf::powerwait:/sbin/powerfail > /dev/console 2>&1

The general form of an entry in this file is as follows:

identifier:run-level:action-keyword:process

identifier is a text string of up to four characters in length and is used to uniquely identify an entry. Two-character identifiers should be used with care because it is possible for PTY identities to conflict with an identifier. Ultimately, this would lead to corruption of the utmp file, which keeps a record of all users currently logged in.

The run level is one or more of the init arguments described previously, or blank to indicate all run levels. A default run level of 2 or 3 is common, depending on your system. Run level 1 is usually reserved for special tasks such as system installation. When init changes run levels, all processes not belonging to the requested run level will eventually be killed. The exception to that rule is commands started with a, b, or c.
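The four-field entry format and the run-level rule described above can be explored with a small shell sketch. This is only an illustration, not how init itself reads the file: the function names parse_inittab_line and entry_applies are hypothetical, and the matching rule (an empty run-level field means "all run levels") is taken from the description above.

```shell
#!/bin/sh
# Hypothetical helpers for experimenting with inittab-style entries.
# They are a sketch for illustration, not part of any vendor's init.

# parse_inittab_line ENTRY
# Split one entry into identifier, run levels, action keyword, and process.
# The final variable of `read` receives the rest of the line, so colons
# inside the process field are preserved.
parse_inittab_line() {
    printf '%s\n' "$1" | {
        IFS=: read -r id levels action process
        printf 'id=%s levels=%s action=%s process=%s\n' \
            "$id" "${levels:-all}" "$action" "$process"
    }
}

# entry_applies ENTRY LEVEL
# Succeed if ENTRY is active at run level LEVEL.
# An empty run-level field is treated as matching every run level.
entry_applies() {
    levels=$(printf '%s\n' "$1" | cut -d: -f2)
    case "$levels" in
        '')     return 0 ;;   # blank field: all run levels
        *"$2"*) return 0 ;;   # LEVEL appears in the field
        *)      return 1 ;;
    esac
}
```

For example, `parse_inittab_line 'cons:123456:respawn:/usr/sbin/getty console console'` prints `id=cons levels=123456 action=respawn process=/usr/sbin/getty console console`, and `entry_applies 'lev3:3:wait:/etc/rc3' 2` fails because that entry lists only run level 3.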
The action keyword defines the course of action executed by init. Values and their meaning are found in Table 16.1.

Table 16.1. Action keyword table.
The process is any daemon, executable script, or program. This process can invoke other scripts or binaries.

In the example of an inittab file given previously, a system powerup would default to run level 2. Run levels 0 (shutdown), 6 (reboot), and s (single user) would all execute the script /etc/rc0, which in turn could call subscripts. Run levels 2 or 3 (multiuser/expanded multiuser) would execute the /etc/rc2 script. Additionally, run level 3 (expanded multiuser) would also execute the /etc/rc3 script. Run level 6 (reboot) would announce what is happening by executing the /etc/init.d/announce script given the restart argument. For any run level, before the console receives the login prompt, init runs /sbin/ioinitrc to check the consistency between the kernel data and I/O configuration file. For any run level going from single-user to multiuser, init runs a file system consistency check by executing /sbin/bcheckrc. For run levels 1-6, if a getty does not exist for the console, init starts one. For any run level, when init receives a power-failure signal, it runs the /sbin/powerfail script.

BSD type systems use an init process that is somewhat less flexible in usage. It runs a basic reboot sequence and, depending on how it is invoked, begins a multiuser or single-user system. init changes states via signals, which are sent using the UNIX kill command. For example, to drop back to single-user mode from multiuser mode, the superuser would run kill -TERM 1. Table 16.2 lists the signals.

Table 16.2. Signals used with kill command.

RC Scripts
Each system type begins similarly by initializing an operating condition through calls to scripts or directories containing scripts, generally of the type /etc/rc*. BSD systems normally would call /etc/rc, /etc/rc.local, or /etc/rc.boot. Because of the flexibility of the inittab version, it is best to look in that file for the location of the startup scripts. A methodology that is now favored by vendors supporting inittab, such as HP-UX and IRIX, creates directories such as /sbin/rc[run-level].d or /etc/rc[run-level].d. These directories contain files such as S##name (startup) or K##name (kill/shutdown) that are links to scripts in /sbin/init.d or /etc/init.d. The ## numbers determine the order in which a superscript calls them. A sample startup sequence is found in Listing 16.3. Listing 16.4 shows a sample shutdown sequence.

Listing 16.3. A sample startup sequence from an HP-UX system.

lrwxr-xr-x 1 root sys 16 Apr 9

Listing 16.4. A sample shutdown sequence from an IRIX system.

lrwxr-xr-x 1 root sys 14 Mar 18 1997

In the last example, K02midi would be executed followed by (in order) K02videod, K02xdm, K03announce, and so on, until K99disk_patch was executed. On this system, the ~/init.d/* files (where ~ stands for /sbin or /etc, depending on the system) are used for startup and shutdown, depending on whether they are invoked with a start (S types) or stop (K types) parameter. Some systems have a template superscript called ~/init.d/template that should be used to initiate the daemon, program, or script. If a configuration file is needed by the superscript, it should be placed in /etc/rc.config.d. Listing 16.5 is a partial example of a script utilizing the template to initiate the startup or shutdown of a relational database management system, in this case Oracle 7 Server. Links are required for the execution of the superscript.
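The ordering rule for S##name and K##name links described above can be sketched in a few lines of shell. This is a simplified illustration under stated assumptions, not any vendor's actual superscript: the function name run_rc_dir is hypothetical, and it relies only on the fact that the shell expands a glob in sorted order, so S100 sorts before S320 and K02 before K03.

```shell
#!/bin/sh
# run_rc_dir DIR {start|stop}
# Hypothetical sketch of a superscript loop: run every S* link (start)
# or K* link (stop) found in DIR, passing the mode as the argument.
run_rc_dir() {
    dir=$1
    mode=$2
    case "$mode" in
        start) prefix=S ;;
        stop)  prefix=K ;;
        *)
            echo "usage: run_rc_dir DIR {start|stop}" >&2
            return 1
            ;;
    esac
    # Glob results come back sorted, which is what makes the ## prefix
    # control the calling sequence.
    for script in "$dir"/"$prefix"*; do
        [ -f "$script" ] || continue   # skip an unmatched pattern
        sh "$script" "$mode"
    done
}
```

Calling `run_rc_dir /sbin/rc2.d start` on a system laid out as described would run each S link in that directory in sequence. Real superscripts typically add logging and exit-code handling that this sketch omits.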
Referring to the Oracle example in Listing 16.5, the following commands can be used to create the links (~ stands for /sbin or /etc, depending on the system):

ln -s ~/init.d/oracle ~/rc2.d/S900oracle
ln -s ~/init.d/oracle ~/rc1.d/K100oracle

In this case, Oracle will be stopped whenever the system is shut down from a run level higher than 1. It will be started when entering run level 2. The numbering of the start and kill scripts on your system may differ from the numbers used here.

Listing 16.5. Example of an init startup/shutdown script.

# Default exit status; the branches below override it on skip or error
rval=0

case "$1" in
'start_msg')
    echo "Starting ORACLE"
    ;;
'stop_msg')
    echo "Stopping ORACLE"
    ;;
'start')
    # source the system configuration variables
    if [ -f /etc/rc.config.d/oracle ] ; then
        . /etc/rc.config.d/oracle
    else
        echo "ERROR: /etc/rc.config.d/oracle file MISSING"
    fi
    # Check to see if this script is allowed to run...
    # (quoted so that an unset ORACLE_START does not break the test)
    if [ "$ORACLE_START" != 1 ]; then
        rval=2
    else
        # Starting Oracle
        su - oracle -c /u99/home/dba/oracle/product/7.2.3/bin/dbstart
    fi
    ;;
'stop')
    # source the system configuration variables
    if [ -f /etc/rc.config.d/oracle ] ; then
        . /etc/rc.config.d/oracle
    else
        echo "ERROR: /etc/rc.config.d/oracle file MISSING"
    fi
    # Check to see if this script is allowed to run...
    if [ "$ORACLE_START" != 1 ]; then
        rval=2
    else
        # Stopping Oracle
        su - oracle -c /u99/home/dba/oracle/product/7.2.3/bin/dbshut
    fi
    ;;
*)
    echo "usage: $0 {start|stop|start_msg|stop_msg}"
    rval=1
    ;;
esac

# Report the result to the calling superscript
exit $rval

Startup Daemons and Programs
When the system is operational, after you log in, run ps -ef (SYS V type) or ps ax (BSD type) from a shell prompt. This will list the processes currently running. An idle system with no users will most likely include at least a subset of the following tasks:
One last note about the startup process: if the file /etc/nologin exists, only the superuser may log in. Other users attempting to log in would see the textual contents of /etc/nologin.

Shutdown
As they say, "what goes up must come down." This is as true for computers as it is for other things in life. A normal shutdown is an attempt to terminate processes in an orderly fashion so that when the system comes back up, there will be few errors. A graceful shutdown will kill running tasks as smoothly as it can. It will then synchronize the disks with any outstanding buffers in memory and dismount them.

When this needs to be done, first and foremost, make sure a shutdown really needs to occur. Your decision to do this will have a lot to do with your site culture and how many users you impact. Also, many times, a little research will lead you to try killing and restarting a daemon or living with a non-volatile problem until a patch can be applied later, during the night.

If the system must come down, however, there are a variety of ways to bring down a running system, depending on the current circumstances. Among the methods are the commands shutdown, reboot, sync, init, and halt, as well as removing power from the machine. Generally, as you would expect, removing power, or shutting down without synchronized (all disk writes completed), quiescent disks, will almost ensure that some file system corruption will occur that will need correction during the next boot. More than likely, the file system consistency check program, fsck, will be able to autocorrect the problems, but given a choice, use a safer method of bringing your system down.

fsck is automatically invoked during system startup unless specifically turned off (fastboot). Its function is to check the consistency of inodes, free space, links, directory entries, pathnames, and superblocks. It does not perform a surface scan or remap bad blocks.
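Because the decision to shut down depends partly on how many users are affected, a quick count of who is logged in can inform it. The following is a minimal sketch, assuming who-style input with the login name in the first column; the function name distinct_users is hypothetical.

```shell
#!/bin/sh
# distinct_users
# Read who-style lines on standard input (login name in column one)
# and print how many different login names appear.
distinct_users() {
    awk 'NF { seen[$1] = 1 } END { n = 0; for (u in seen) n++; print n }'
}

# Typical use on a live system:
#   who | distinct_users
```

A user with several terminal sessions is counted once, which is usually the number that matters when judging the impact of a shutdown.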
Here, in more detail, is a summary of the possible commands for various operating environments. Note that these commands do not necessarily include all of the possible options for the command. See your local manual page for a complete list of options.

HP-UX
IRIX
Solaris
Linux
In every example, the most graceful way to shut down was purposely identified as such, because the shutdown command is always the preferred way to bring the system down without complications. Listing 16.6 shows a sample shutdown from an IRIX system.

Listing 16.6. Sample shutdown from an IRIX system.

Shutdown started. Wed Apr 16 01:46:29 EDT 1997
Broadcast Message from root (ttyq0) on indy Wed Apr 16 01:46:29 1997
THE SYSTEM IS BEING SHUT DOWN!
Log off now.

On the system console, once shutdown began, the following appeared:

The system is shutting down.
Please wait.
unexported /usr1
unexported /usr2
Removing swap areas.
Unmounting file systems:

As with starting up the system, shutting down the system will write parts of what is happening to the system console and system log file.

Summary
To recap, during system startup and shutdown, as the console spews information out, be an educated observer; unless problems occur, keep the rest of your system administration skills ready but in the background. This will probably be the easiest task of the day. Have a cup of coffee; you'll probably need it later in the day.