Architectural relation between Linux and Unix

Is UNIX a real ancestor of Linux?

The computing community has long debated whether Linux is a clone of UNIX. Linux is indeed built on the architectural foundation of UNIX, and this article examines how close the relationship between the two really is by comparing both operating systems.

Linux is an operating system based on UNIX standards. It provides a programming interface and user interface compatible with standard UNIX systems, and it can run a large number of UNIX applications, including an increasing number of commercially supported applications. However, the Linux system includes many components that were developed independently of Linux itself. The Linux kernel is entirely original, but it allows much existing free UNIX software to run, resulting in a complete UNIX-compatible operating system.

Linux resembles a traditional, non-microkernel UNIX implementation. It is a multiuser, multitasking system with a full set of UNIX-compatible tools. Its file system adheres to traditional UNIX semantics, and the standard UNIX networking model is fully implemented.

Initially, the only file system supported by Linux was the Minix file system, because the first Linux kernels were cross-developed on a Minix platform. Even so, the kernel did implement proper UNIX processes with protected address spaces. The next milestone, Linux 1.0, released on March 14, 1994, added support for UNIX’s standard TCP/IP networking protocols, along with a BSD-compatible socket interface for network programming. It also included UNIX-style interprocess communication (IPC) features such as shared memory, semaphores, and message queues. Linux has since matured, and distributions now ship precompiled sets of packages for easy installation. These distributions include many UNIX tools, such as news servers, web browsers, and text-processing and text-editing tools.

Kernel: In line with most traditional UNIX implementations, the Linux system comprises three main bodies of code: (i) the kernel, (ii) the system libraries, and (iii) the system utilities. Linux retains UNIX’s historical model, in which the kernel is built as a single, monolithic binary to improve performance. Since all kernel code and data structures are kept in a single address space, no context switches are necessary when a process calls an OS function or when a hardware interrupt is delivered. On its own, however, the interface provided by the Linux kernel looks little like a UNIX system: it is missing many of the extra features of UNIX, and the features it does provide are not necessarily in the form in which a UNIX application expects them to appear. It is the system libraries of Linux that contain the functions necessary to support the operation of UNIX or POSIX applications.

Process Management: Although a major part of Linux process management is identical to that of UNIX, Linux operates differently in certain key areas. The basic principle of UNIX process management is to separate two distinct operations: (i) the creation of a process and (ii) the running of a new program. A new process is created by the fork system call, and a new program is run after a call to execve. A new process may be created with fork without a new program being run; the new subprocess simply continues to execute the same program that the parent process was running. Equally, running a new program does not require that a new process be created first; any process may call execve at any time. The currently running program is immediately terminated, and the new program starts executing in the context of the existing process. This model has the advantage of great simplicity: rather than having to specify every detail of the environment for a new program, the new program simply runs in the existing environment. Should the parent process need to modify the environment in which the new program is to run, it can fork and then, still running the original program in the child process, make any system calls needed to modify that child process before finally executing the new program. Thus, under UNIX, a process encompasses all the information that the OS must maintain to track the context of a single execution of a single program. Under Linux, this context can be broken down into a number of specific sections; broadly, the process properties fall into three groups: the process identity, environment, and context.
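The fork-then-execve pattern described above can be sketched from user space with Python's os module, which wraps the same system calls. This is a minimal illustration, not a reference implementation: the helper name run_in_child and the DEMO_FLAG variable are made up for the example, and the new program is simply another Python interpreter.

```python
import os
import sys

def run_in_child(argv, extra_env):
    """Fork, adjust the child's environment, then exec a new program.

    Hypothetical helper for illustration; returns the child's exit code.
    """
    pid = os.fork()                      # operation 1: create a new process
    if pid == 0:
        # Child: still running the parent's program here, so it can
        # change its own environment before the new program starts.
        env = dict(os.environ)
        env.update(extra_env)
        try:
            os.execve(argv[0], argv, env)  # operation 2: run a new program
        finally:
            os._exit(127)                # reached only if execve failed
    # Parent: wait for the child to finish and report how it exited.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1

# The new program exits with 0 only if it sees the variable set by the child.
check = "import os, sys; sys.exit(0 if os.environ.get('DEMO_FLAG') == '1' else 1)"
code = run_in_child([sys.executable, "-c", check], {"DEMO_FLAG": "1"})
print("child exit code:", code)
```

The parent's own environment is never touched; only the forked child changes before it replaces itself with the new program.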

Process personalities are not traditionally found on UNIX systems, but under Linux each process has an associated personality identifier that can slightly modify the semantics of certain system calls. Personalities are used primarily by emulation libraries to request that system calls be compatible with certain flavors of UNIX.

User Program’s Loading & Execution: Older versions of the Linux kernel also understood the a.out format for binary files, a simple format common on older UNIX systems. Newer Linux systems use the more modern ELF format, now supported by most current UNIX implementations. The ELF format has a number of advantages over a.out, including flexibility and extensibility: new sections can be added to an ELF binary without confusing the loader routines. Linux has no single routine for loading a new program. Instead, it maintains a table of possible loader functions and gives each such function the opportunity to try loading the given file when an exec system call is made. By allowing the registration of multiple loader routines, Linux can support the ELF and a.out binary formats in a single running system.
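The idea of a loader table can be modeled in a few lines. The sketch below is only a user-space analogy; the real table lives inside the kernel and is written in C. All names here (register_loader, exec_file, the loader functions) are hypothetical, and a loader for "#!" scripts stands in as the second format because its magic bytes are simple to check.

```python
# Illustrative model of a loader table; not kernel code.

LOADERS = []

def register_loader(loader):
    """Add a loader function to the table of candidates."""
    LOADERS.append(loader)

def load_elf(header):
    # ELF files begin with the magic bytes 0x7f 'E' 'L' 'F'.
    if header[:4] == b"\x7fELF":
        return "elf"
    return None  # not recognized; let the next loader try

def load_script(header):
    # Interpreter scripts begin with "#!".
    if header[:2] == b"#!":
        return "script"
    return None

def exec_file(header):
    """Offer the file to each registered loader in turn."""
    for loader in LOADERS:
        result = loader(header)
        if result is not None:
            return result
    raise OSError("exec format error: no loader accepted the file")

register_loader(load_elf)
register_loader(load_script)

print(exec_file(b"\x7fELF\x02\x01\x01"))
print(exec_file(b"#!/bin/sh\n"))
```

Supporting another format in this model means registering one more function; the dispatch loop itself does not change.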

File Systems: At its inception, Linux used the Minix file system, but Linux now retains UNIX’s standard file system model. In UNIX, a file does not have to be an object stored on disk or fetched over a network from a remote file server; rather, UNIX files can be anything capable of handling the input or output of a stream of data. Thus, device drivers, interprocess communication channels, and network connections also appear as files to the user. The Linux kernel handles all these types of files by hiding the implementation details of any single file type behind a layer of software known as the virtual file system (VFS).

The proc file system of Linux gives programs access to statistics about the Linux kernel and its loaded drivers as plain text files, which can be processed with the powerful tools of the standard UNIX user environment. For instance, the traditional UNIX ps command for listing the states of all running processes was implemented as a privileged process that read the process state directly from the kernel’s virtual memory. Under Linux, this command is implemented as an entirely unprivileged program that simply parses and formats the information from proc.
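To give a feel for how an unprivileged tool can obtain process state from proc, the following sketch parses /proc/self/status, the status file of the calling process. It assumes a Linux system with proc mounted at /proc; the helper name read_proc_status is made up for this example and is not how ps itself is written.

```python
def read_proc_status(pid="self"):
    """Parse /proc/<pid>/status into a dict of field name -> value."""
    fields = {}
    with open(f"/proc/{pid}/status") as status_file:
        for line in status_file:
            # Each line looks like "Name:\tpython3".
            key, _, value = line.partition(":")
            fields[key] = value.strip()
    return fields

status = read_proc_status()
print(status["Name"], status["State"], status["Pid"])
```

No special privileges are required: the file is plain text and readable by the process it describes.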

Input/Output: To the user, the I/O system in Linux looks much like that of UNIX: all device drivers appear as normal files. A user can access a device in the same way as any other file, because devices appear as objects within the file system. The system administrator can create special files within a file system that contain references to a specific device driver, so that a user opening such a file can read from and write to the device referenced. Also, by using the normal file-protection system that determines who can access a particular file, an administrator can set access permissions for each device.
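As a small illustration of devices appearing as files, the sketch below inspects and uses /dev/null and /dev/zero with ordinary file calls. It assumes a Linux or UNIX system where these two standard device files exist; describe_device is a hypothetical helper name.

```python
import os
import stat

def describe_device(path):
    """Report what the file system knows about a device special file."""
    st = os.stat(path)
    return {
        "is_char_device": stat.S_ISCHR(st.st_mode),   # character device?
        "permissions": stat.filemode(st.st_mode),     # normal file-protection bits
        "major": os.major(st.st_rdev),                # identifies the driver
        "minor": os.minor(st.st_rdev),                # identifies the device instance
    }

info = describe_device("/dev/null")
print(info)

# Devices are read and written exactly like regular files.
with open("/dev/zero", "rb") as source:
    data = source.read(4)
with open("/dev/null", "wb") as sink:
    written = sink.write(b"discarded")
print(data, written)
```

The permission string in the output is the same protection mask that applies to regular files, which is how an administrator restricts access to a device.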

Synchronization: The standard UNIX mechanism for informing a process that an event has occurred is the signal. Signals can be sent from any process to any other process, with restrictions on signals sent to processes owned by another user. However, only a limited number of signals are available, and they cannot carry information: only the fact that a signal occurred is available to a process. The kernel also generates signals internally; it can send a signal to a server process when data arrive on a network channel, to a parent process when a child terminates, or to a waiting process when a timer expires. The Linux kernel itself, however, does not use signals to communicate with processes running in kernel mode. If a kernel-mode process is expecting an event to occur, it will not normally use signals to receive notification of that event. Instead, communication about incoming asynchronous events within the kernel is performed through the use of scheduling states and wait_queue structures. These mechanisms allow kernel-mode processes to inform one another about relevant events, and they also allow events to be generated by device drivers or by the networking system. Whenever a process wants to wait for some event to complete, it places itself on a wait_queue associated with that event and tells the scheduler that it is no longer eligible for execution. Once the event has completed, every process on the wait_queue is woken up. This procedure allows multiple processes to wait for a single event.
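The wait-queue behavior described above, where several waiters sleep until one event completes and are then all woken, can be imitated in user space with a condition variable. This is only an analogy built on Python threads, not the kernel's wait_queue structure; the WaitQueue class and its method names are invented for the sketch.

```python
import threading

class WaitQueue:
    """User-space analogy of a kernel wait queue for a single event."""

    def __init__(self):
        self._cond = threading.Condition()
        self._event_done = False

    def wait(self):
        # The caller gives up the CPU until the event has completed.
        with self._cond:
            while not self._event_done:
                self._cond.wait()

    def wake_all(self):
        # Mark the event complete and wake every waiter on the queue.
        with self._cond:
            self._event_done = True
            self._cond.notify_all()

queue = WaitQueue()
woken = []

def waiter(name):
    queue.wait()
    woken.append(name)

threads = [threading.Thread(target=waiter, args=(n,)) for n in ("a", "b", "c")]
for t in threads:
    t.start()
queue.wake_all()          # the awaited event completes
for t in threads:
    t.join()
print(sorted(woken))
```

All three waiters resume after the single wake_all call, mirroring how multiple processes can wait on one event.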

Although signals have always been the main mechanism for communicating asynchronous events among processes, Linux also implements the semaphore mechanism of System V UNIX. A process can wait on a semaphore as easily as it can wait for a signal, but semaphores have two advantages: large numbers of semaphores can be shared among multiple independent processes, and operations on multiple semaphores can be performed atomically. Internally, the standard Linux wait_queue mechanism synchronizes processes that are communicating with semaphores.

Linux offers several mechanisms for passing data among processes. The standard UNIX pipe mechanism allows a child process to inherit a communication channel from its parent; data written to one end of the pipe can be read at the other. Under Linux, pipes appear as just another type of inode to the virtual file system software, and each pipe has a pair of wait queues to synchronize the reader and the writer. UNIX also defines a set of networking facilities that can send streams of data to both local and remote processes.
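A pipe inherited across fork can be sketched as follows. The example uses Python's os.pipe and os.fork, which wrap the underlying system calls; the function name child_to_parent is hypothetical, and the direction of the data (child writes, parent reads) is just one possible arrangement.

```python
import os

def child_to_parent(message):
    """Create a pipe, fork, and let the child send bytes to the parent."""
    read_fd, write_fd = os.pipe()       # channel exists before the fork
    pid = os.fork()
    if pid == 0:
        # Child: it inherited both ends; keep only the write end.
        os.close(read_fd)
        try:
            os.write(write_fd, message)
        finally:
            os._exit(0)                 # exiting closes the write end
    # Parent: keep only the read end, then read until end-of-file.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        data = reader.read()
    os.waitpid(pid, 0)
    return data

received = child_to_parent(b"hello through the pipe")
print(received)
```

The parent's read blocks until the child has written and closed its end, which is the reader and writer synchronization the pipe provides.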

Network Structure: Networking is a key area of functionality for Linux. Not only does Linux support the standard Internet protocols used for most UNIX-to-UNIX communications, but it also implements a number of protocols native to other, non-UNIX operating systems. Because Linux was originally developed primarily for PCs, rather than for large workstations or server-class systems, it supports many of the protocols typically used on PC networks, such as IPX and AppleTalk.

Security: The Linux security structure is closely related to typical UNIX security mechanisms. As far as authentication is concerned, UNIX has traditionally performed it through the use of a publicly readable password file. A user’s password is combined with a random salt value, and the result is encoded with a one-way transformation function and stored in the password file. The use of a one-way transformation function means that the original password cannot be deduced from the password file except by trial and error. When a user presents a password to the system, the password is recombined with the salt value stored in the password file and passed through the same one-way transformation. If the result matches the contents of the password file, the password is accepted. Since the UNIX implementations of this mechanism had several problems, a new mechanism known as pluggable authentication modules (PAM) was developed. This system is based on a shared library that can be used by any system component that needs to authenticate users. An implementation of this system is also available under Linux. PAM allows authentication modules to be loaded on demand as specified in a system-wide configuration file. If a new authentication mechanism is added at a later date, it can be added to the configuration file, and all system components will immediately be able to take advantage of it. PAM modules can specify authentication methods, account restrictions, session-setup functions, or password-changing functions.
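The salt-and-hash check described above can be sketched in a few lines. Note the assumptions: PBKDF2 with SHA-256 from Python's hashlib stands in for the one-way transformation (it is not the function historical UNIX systems used), and the helper names make_entry and verify, the salt length, and the iteration count are arbitrary choices for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # arbitrary work factor chosen for this sketch

def make_entry(password):
    """Return (salt, digest) as it might be stored in a password file."""
    salt = os.urandom(16)                       # random salt value
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def verify(password, entry):
    """Recombine the offered password with the stored salt and compare."""
    salt, stored_digest = entry
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, stored_digest)

entry = make_entry("s3cret")
print(verify("s3cret", entry), verify("wrong", entry))
```

Only the salt and the digest are stored; the password itself never is, so recovering it from the stored entry requires trial and error.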

Access control under UNIX systems, including Linux, is performed through the use of unique numeric identifiers. A user identifier (uid) identifies a single user or a single set of access rights. A group identifier (gid) is an extra identifier that can be used to identify rights belonging to more than one user. Every object in a UNIX system under user and group access control has a single uid and a single gid associated with it. Linux performs access control by assigning objects a protection mask that specifies which access modes (read, write, or execute) are to be granted to processes with owner, group, or world access. Thus, the owner of an object might have full read, write, and execute access to a file; other users in a certain group might be given read access but denied write access; and everyone else might be given no access at all.
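The protection-mask check can be expressed as a small pure function. This is a simplified model of the owner, group, and world classes only: it ignores the superuser and any other refinements, and the function name allowed and its parameters are hypothetical.

```python
def allowed(mode, owner_uid, owner_gid, uid, gids, want):
    """Check a request against a 9-bit protection mask.

    mode  : permission bits, e.g. 0o740
    uid   : uid of the requesting process
    gids  : set of gids the requesting process belongs to
    want  : access requested, a string drawn from "r", "w", "x"
    """
    # Exactly one class applies: owner first, then group, then world.
    if uid == owner_uid:
        shift = 6
    elif owner_gid in gids:
        shift = 3
    else:
        shift = 0
    granted = (mode >> shift) & 0o7

    needed = 0
    for letter in want:
        needed |= {"r": 4, "w": 2, "x": 1}[letter]
    return granted & needed == needed

# The example from the text: owner has full access, the group may only
# read, and everyone else has no access (mask 0o740).
print(allowed(0o740, 1000, 100, 1000, {100}, "rwx"))  # owner
print(allowed(0o740, 1000, 100, 1001, {100}, "r"))    # group member, read
print(allowed(0o740, 1000, 100, 1001, {100}, "w"))    # group member, write
print(allowed(0o740, 1000, 100, 1002, {200}, "r"))    # everyone else
```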

Linux implements the standard UNIX setuid mechanism that allows a program to run with privileges different from those of the user running the program. The UNIX implementation of setuid distinguishes between a process’ real and effective uid. The real uid is that of the user running the program and the effective uid is that of the file’s owner.

Linux augments the above mechanism in two ways. First, it implements the POSIX specification’s saved user-id mechanism, which allows a process to drop and reacquire its effective uid repeatedly. For security reasons, a program may want to perform most of its operations in a safe mode, waiving the privileges granted by its setuid status, but may wish to perform selected operations with all its privileges. Standard UNIX implementations achieve this capacity only by swapping the real and effective uids. The previous effective uid is remembered, but the program’s real uid does not always correspond to the uid of the user running the program. Saved uids allow a process to set its effective uid to its real uid and then return to the previous value of its effective uid, without having to modify the real uid at any time.
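The drop-and-reacquire pattern made possible by the saved uid might look like the sketch below, which uses os.getresuid and os.seteuid as exposed by Python on Linux. The helper name with_dropped_privileges is hypothetical. The pattern only has a visible effect inside a setuid program; in an ordinary process the real, effective, and saved uids are all equal, so the calls succeed but change nothing.

```python
import os

def with_dropped_privileges(action):
    """Run action() with the effective uid temporarily set to the real uid."""
    real_uid, effective_uid, _saved_uid = os.getresuid()
    os.seteuid(real_uid)            # drop: the saved uid keeps the old value
    try:
        return action()
    finally:
        os.seteuid(effective_uid)   # reacquire: permitted because of the saved uid

before = os.getresuid()
uid_during = with_dropped_privileges(os.geteuid)
after = os.getresuid()
print(before, uid_during, after)
```

The real uid is never modified at any point, which is the difference from the older approach of swapping real and effective uids.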

The second enhancement provided by Linux is the addition of a process characteristic that grants just a subset of the rights of the effective uid. The fsuid and fsgid process properties are used when access rights to files are checked, and they are set every time the effective uid or gid is set. However, the fsuid and fsgid can be set independently of the effective ids, allowing a process to access files on behalf of another user without taking on the identity of that other user in any other way. Specifically, server processes can use this mechanism to serve files to a certain user without becoming vulnerable to being killed or suspended by that user.

Linux provides another mechanism, now common in modern versions of UNIX, for the flexible passing of rights from one program to another. When a local socket has been set up between any two processes on the system, either of those processes may send the other a file descriptor for one of its open files; the other process receives a duplicate file descriptor for the same file. This mechanism allows a client to pass access to a single file selectively to some server process, without granting that process any other privileges.
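Descriptor passing over a local socket can be tried with socket.send_fds and socket.recv_fds, available in Python 3.9 and later on UNIX-like systems. To stay short, this sketch keeps both ends of the socket in one process; in real use the two ends would belong to a client and a server. The function name pass_descriptor and the temporary file are made up for the example.

```python
import os
import socket
import tempfile

def pass_descriptor(path):
    """Send an open descriptor for path over a local socket and read via the copy."""
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    with sender, receiver:
        fd = os.open(path, os.O_RDONLY)
        try:
            # The descriptor travels as ancillary data next to a short message.
            socket.send_fds(sender, [b"fd"], [fd])
        finally:
            os.close(fd)   # the sender's copy is no longer needed
        _msg, fds, _flags, _addr = socket.recv_fds(receiver, 16, 1)
        # fds[0] is a duplicate descriptor referring to the same open file.
        with os.fdopen(fds[0], "rb") as shared_file:
            return shared_file.read()

with tempfile.NamedTemporaryFile() as tmp:
    tmp.write(b"shared contents")
    tmp.flush()
    received = pass_descriptor(tmp.name)
print(received)
```

The receiving side never opens the path itself; it gains access to that one file only through the descriptor it was handed.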

This discussion suggests that, despite its many similarities to UNIX, Linux contains original developments and practical advancements in various components, most notably its kernel. Linux looks and feels much like a UNIX system; indeed, UNIX compatibility has been a major design goal of the Linux project, so many of the similarities were kept for the sake of cross-compatibility. In its first phase, Linux development was closely tied to Minix: the first kernels were cross-developed on a Minix platform and used the Minix file system. Minix itself was a small UNIX-like system developed to provide academic support to students of operating systems. By that time, use of the UNIX source code had been restricted, and technical teaching was seriously hampered by the lack of a well-crafted, fully operational system to study, which encouraged Professor Tanenbaum to develop Minix.
