Chapter 3 Process Concept¶
3.1 Process Concept¶
Process
A program in execution, the basis of all computation.
- Batch system: jobs (= process)
- Time-shared system: user programs or tasks
3.1.1 The process¶
Process consists
- text section: program code
- data section: global variables
- heap: memory
- current activity (program counter + registers)
- stack: temporary data:
- function parameters
- return addresses
-
local variables
Program | Process |
---|---|
passive entity | active entity |
a file containing a list of instructions stored on disk (executable file) | program counter: specifying the next instruction to execute + a set of associated resources |
When
- double-clicking an icon
- prog.exe
- a.out
an executable file is loaded into memory: program $\to$ process.
Two different processes: the text section are equivalent, the data, heap and stack vary.
Process can be an execution environment for other code. (simulation)
e.g.
1 |
|
The command java
runs the JVM as an ordinary process, then executes the Java program testProgram
in the VM.
3.1.2 Process State¶
- New
- Running: execute instructions
- Waiting: wait some event (I/O, signal)
- Ready: wait to be assigned to a processor
- Terminated
Process and Processor
Only 1 process runs on any processor. (Many processes may be ready and waiting)
3.1.3 Process Control Block¶
- Process state
- Program counter: address of the next instruction
- CPU registers: accumulators, index registers, stack pointers, general-purpose registers, and any condition-code information
- CPU-scheduling information
- Memory-management information
- Accounting information: the amount of CPU and real time used, time limits, account numbers, job or process numbers
- I/O status information: the list of I/O devices allocated to the process, a list of open files
3.2 Process Scheduling¶
- Multiprogramming: to have some process running at all times $\to$ maximize CPU utilization
- Time sharing: switch the CPU among processes
- Process scheduler: selects an available process
3.2.1 Scheduling Queues¶
As processes enter the system, they are put into a job queue.
- Job queue: consists of all processes in the system.
- Ready queue: keep ready and waiting processes.
When a process exit, it is removed from all queues and has its PCB and resources deallocated.
3.2.2 Schedulers¶
Processes are first spooled to a mass-storage device (e.g. disk). Then
-
Long-term scheduler (job)
- selects processes from this pool
- loads them into memory for execution
-
Short-term scheduler (CPU)
- selects from among the processes that are ready to execute
- allocates CPU to one of them
Long-term scheduler
- Controls the degree of multiprogramming (# processes)
- Selects a good process mix of I/O-bound and CPU-bound
Medium-term scheduler
Swapping
3.2.3 Context Switch¶
When a context switch occurs
The kernel saves the context of the old process in its PCB and loads the saved context of the new process scheduled to run.
3.3 Operations on Processes¶
3.3.1 Process Creation¶
Process Identifier (pid)
An integer number, which provides a unique value for each process in the system, and it can be used as an index to access various attributes of a process within the kernel.
init process
A process has pid = 1, and serves as the root parent process for all user processes.
When a process creates a child process, that child process may obtain the resources from
- OS
- a subset of parent process
When a process creates a new process
- The parent continues to execute concurrently with its children.
- The parent waits until some or all of its children have terminated.
There are also two address-space possibilities for the new process
- The child process is a duplicate of the parent process (it has the same program and data as the parent).
- The child process has a new program loaded into it.
fork()
The new process created by fork()
consists of a copy of the address of parent process.
-
Return code
- Child process: 0
- Parent process: pid of the child
After fork()
syscall
One of the two processes uses the exec()
syscall to replace the process's memory space with a new program.
Creating a separate process using the UNIX fork()
system call.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
|
3.3.2 Process Termination¶
A process terminates when it finishes executing its final statement and asks the operating system to delete it by using the exit()
system call.
Terminating process
A parent needs to know the identities of its children if it is to terminate them.
A parent can terminate its children by
- The child use too much resources. (The parent have a mechanism to inspect the state of its children)
- The task assigned to the child is no longer required.
- The parent is exiting.
Cascading Termination
If a process terminates (either normally or abnormally), then all its children must also be terminated.
exit()
may be called either directly or indirectly (return
):
1 |
|
Process Table Entry (PTE)
Contains the process's exit status.
Zombie
A process terminated, but whose parent hasn't called wait()
. Once the parent calls wait()
, the pid of the zombie process and its entry in the PTE are released.
The init
process periodically invokes wait()
to collect and release the orphan's pid and PTE.
3.4 Interprocess Communication¶
Processes have two classifications:
- Independent
- Cooperating
- Information sharing
- Computation speedup - multicore
- Modularity
- Convenience - parallel tasks
Interprocess communication (IPC)
- Shared memory: slower (syscalls are required)
- Message passing: faster (syscalls are required only to establish shared memory regions)
3.4.1 Shared-Memory Systems¶
Producer–consumer problem¶
A producer process produces information that is consumed by a consumer process.
e.g.
- A compiler produce assembly code that is consumed by an assembler. The assembler, in turn, may produce object modules that are consumed by the loader.
- A server as a producer and a client as a consumer.
We need a buffer which resides in a region of shared memory (producer & consumer), and can be filled by the producer and emptied by the consumer.
- Unbounded buffer
- Bounded buffer (more practical)
Implement the shared buffer
as a circular array.
1 2 3 4 5 6 7 8 9 |
|
1 2 3 4 5 6 7 8 9 |
|
1 2 3 4 5 6 7 8 9 10 11 |
|
3.4.2 Message-Passing Systems¶
Message passing provides a mechanism to allow processes to communicate and to synchronize their actions without sharing the same address space.
Communication link
If processes $P$ and $Q$ want to communicate, they must send messages to and receive messages from each other.
Several implentation of send()
/receive()
operations:
- Direct of indirect communication
- Synchronous or asynchronous communication
- Automatic or explicit buffering
3.4.2.1 Naming¶
-
Direct communication
The messages are sent to and received from processes.
-
Symmetry
send(P, message)
receive(Q, message)
-
Asymmetry
send(P, message)
receive(id, message)
-
-
Indirect communication
The messages are sent to and received from mailboxes, or ports.
send(A, message)
— send a message to mailbox Areceive(A, message)
— receive a message from mailbox A
The process that creates a new mailbox is that mailbox's owner by default.
A mailbox can be owned by the OS.
3.4.2.2 Synchronization¶
Message passing may be either
-
Blocking (synchronous)
- Blocking send (blocked until the message is received)
- Blocking receive
-
Nonblocking (asynchronous)
- Nonblocking send
- Nonblocking receive (valid message or a null)
Rendezvous
When both send()
and receive()
are blocking.
3.4.2.3 Buffering¶
Messages reside in a temporary queue:
- Zero capacity (no buffering)
- Bounded capacity
- Unbounded capacity
3.5 Examples of IPC Systems¶
3.5.1 An Example: POSIX Shared Memory¶
1 2 3 4 5 6 7 |
|
A process must first create a shared-memory:
1 |
|
The ftruncate()
function configure the size of the object in bytes:
1 |
|
3.5.2 An Example: Mach¶
Even system calls are made by messages. When a task is created, two special mailboxes
- kernel mailbox
- notify mailbox
are also created.
There are three syscalls needed:
-
msg_send()
If the mailbox is full:
- Wait indefinitely until there is room in the mailbox
- Wait at most $n$ milliseconds
- Do not wait at all but rather return immediately
- Temporarily cache a message (server tasks)
-
msg_receive()
msg_rpc()
: sends a message and waits for exactly one return message from the sender.
Remote
The RPC (Remote Procedure Call) models a typical subroutine procedure call but can work between systems.
port_allocate()
Creates a new mailbox and allocates space for its queue of messages.
Mach guarantees that multiple messages from the same sender are queued in first-in, first-out (FIFO) order but does not guarantee an absolute ordering
One task can either own or receive from a mailbox
Mailbox set
A collection of mailboxes.
port_status()
e.g. # of messages in a mailbox.
3.5.3 An Example: Windows¶
Application programs can be considered clients of a subsystem server.
Advanced Local Procedure Call (ALPC)
It is used for communication between two processes on the same machine.
Windows uses two types of ports
- Connection ports
- Communication ports
Callback
Allows the client and server to accept requests when they would normally be expecting a reply.
When an ALPC channel is created, 1 of 3 message-passing techniques is chosen:
- Small messages: using the port's message queue.
- Larger messages: passed through a section object (a region of shared memory)
- Very large messages: calling API to read/write directly into the address space.
3.6 Communication in Client–Server Systems¶
3.6.1 Sockets¶
Socket
An endpoint for communication. (IP + port#)
A pair of processes communicating over a network employs a pair of sockets—one for each process.
Socket behavior
The server waits for incoming client requests by listening to a specified port. Once a request is received, the server accepts a connection from the client socket to complete the connection.
Well-known ports: (all ports below 1024 are considered well known)
- 23: telnet
- 21: FTP
- 80: HTTP
Java provides:
- Connection-oriented (TCP) sockets:
Socket
- Connectionless (UDP) sockets:
DatagramSocket
MulticastSocket
: a subclass ofDatagramSocket
. It allows data to be sent to multiple recipients.
Loopback
IP address 127.0.0.1.
3.6.2 Remote Procedure Calls¶
The RPC was designed as a way to abstract the procedure-call mechanism for use between systems with network connections.
Each message is addressed to an RPC daemon listening to a port on the remote system, and each contains an identifier specifying the function to execute and the parameters to pass to that function.
The semantics of RPCs allows a client to invoke a procedure on a remote host as it would invoke a procedure locally.
Stub
The RPC system hides the details that allow communication to take place by providing a stub on the client side.
Parameter marshalling
Packaging the parameters into a form that can be transmitted over a network.
Procedure of RPCs:
- The client invokes a RPC
- RPC system
- calls the appropriate stub (client side)
- passes the stub the parameters to the RPC
- Marshals parameter: packaging the parameters into a form that can be transmitted over a network
- The stub transmits a message to the server using message passing.
- A stub (server side)
- receives this message
- invokes the procedure on the server
- (optional) Return values using the same technique
Issues for RPC:
- Data representation
- External Data Representation (XDR)
- Parameter marshalling
- External Data Representation (XDR)
- Semantics of a call
- at most once
- exactly once (ACK)
- Binding of the client and server port
- Matchmaker (a rendezvous mechanism)
3.6.3 Pipes¶
In implementing a pipe, four issues:
- Does the pipe allow bidirectional communication, or is communication unidirectional?
- If two-way communication is allowed, is it half duplex (data can travel only one way at a time) or full duplex (data can travel in both directions at the same time)?
- Must a relationship (such as parent–child) exist between the communicating processes?
- Can the pipes communicate over a network, or must the communicating processes reside on the same machine?
3.6.3.1 Ordinary Pipes¶
1 |
|
Ordinarya pipes on on Windows: anonymous pipes (similar to UNIX)
3.6.3.2 Named Pipes¶
Ordinary Pipes | Named Pipes |
---|---|
unidirectional | bidirectional |
parent-child required | not required |
In UNIX, named pipes = FIFOs. A FIFO is created with the mkfifo()
.
Pipes in practice:
1 2 |
|