Process Concept – IV

INTER-PROCESS COMMUNICATION (IPC)

Continuing from the previous post, this post covers Inter-Process Communication (IPC). We look at shared memory systems and message passing systems, and then at the design choices for message passing: direct or indirect communication, synchronous or asynchronous communication, and automatic or explicit buffering.

Why do we need IPC?

  • Processes executing concurrently in OS can either be independent or cooperating.
     

    #  | Independent Process | Cooperating Process
    ---|---------------------|--------------------
    1  | Cannot affect or be affected by other processes executing in the system | Can affect or be affected by other processes executing in the system
    2  | Does not share data with any other process | Shares data with other processes

  • Cooperating processes need an IPC mechanism to exchange data and information.

Why is process cooperation needed?

There is a great need for cooperating processes in parallel and distributed systems where a complex problem is spread over several processes to speed up the computation. Often, these processes reside on different machines connected by a network and therefore, need mechanisms for cooperation.
  • Information Sharing
    • several users might be interested in the same piece of information
    • an environment is needed that allows concurrent access to such information
  • Computation Speedup
    • parallel execution by breaking one task into several sub-tasks
    • allows a program to handle many user requests at the same time
  • Modularity
    • constructing the system in a modular fashion
    • dividing system functions into separate processes or threads
  • Convenience
    • lets a user work on multiple tasks simultaneously

Achieving IPC

Two fundamental models of IPC

  • Shared memory
  • Message Passing
     

    #  | Attribute | Shared memory model | Message passing model
    ---|-----------|---------------------|----------------------
    1  | Achieving IPC | Establish a region of memory that is shared between the cooperating processes | Exchange messages between the cooperating processes
    2  | Ease of implementation | Harder to implement, because conflicting accesses must be handled | Easier to implement, especially in a distributed system
    3  | Speed | Usually faster: kernel intervention is required only once, to establish the shared memory via system calls. After that, all accesses are treated as routine memory accesses with no kernel intervention. | Usually slower: messages are exchanged via system calls, so every exchange requires the more time-consuming kernel intervention.
    4  | Systems with multiple cores | Suffers from cache coherency issues, which arise because shared data migrate among the several caches | Can provide better performance than shared memory on systems with several processing cores. As the number of cores increases, message passing may become the preferred mechanism for IPC.

     

SHARED MEMORY SYSTEMS
As discussed earlier, IPC using shared memory is achieved by establishing a region of shared memory.

  • Usually, this shared memory region resides in the address space of the process creating the shared memory segment.
  • If another process wishes to communicate via this shared memory → it must attach the segment to its own address space.
    • Normally, the OS tries to prevent one process from accessing another process’s memory → the cooperating processes must agree to remove this restriction.
    • The form of the data and its location in memory → determined by the processes themselves [the OS has no control in this regard].
    • The cooperating processes are also responsible for ensuring that they do not write to the same location simultaneously.
[Figure: Shared memory systems]
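
To make the create-and-attach idea concrete, here is a minimal sketch using Python's multiprocessing.shared_memory module. It is only an illustration under some assumptions: the helper name shared_memory_roundtrip is made up for this post, and both handles live in the same process to keep the example runnable, whereas in practice the second handle would be opened by a different process that has been told the segment's name.

```python
from multiprocessing import shared_memory


def shared_memory_roundtrip(payload: bytes) -> bytes:
    """Write payload through one handle and read it back through another.

    Hypothetical helper for illustration; payload must be non-empty.
    """
    # The creating side asks the OS for a named shared memory segment.
    creator = shared_memory.SharedMemory(create=True, size=len(payload))
    try:
        # A second party attaches the same segment by its name.
        attached = shared_memory.SharedMemory(name=creator.name)
        try:
            # From here on, reads and writes are ordinary memory accesses.
            creator.buf[:len(payload)] = payload
            return bytes(attached.buf[:len(payload)])
        finally:
            attached.close()
    finally:
        creator.close()
        creator.unlink()  # remove the segment once nobody needs it


print(shared_memory_roundtrip(b"hello"))
```

Note that nothing in this sketch stops two writers from touching the same bytes at the same time; as stated above, that remains the responsibility of the cooperating processes.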


Understanding Cooperating Processes through example of Producer-Consumer Problem

  • Producer-Consumer problem → Common paradigm for cooperating processes
  • Producer : Process that produces information to be consumed by the consumer
  • Consumer : Process that consumes information produced by the producer
  • Real World Example : 
    • A compiler may produce assembly code to be consumed by an assembler.
    • The assembler, in turn, may produce object modules to be consumed by the loader.
    • Another example is the Client-Server paradigm → Server : Producer (of resources such as HTML files, images), Client : Consumer (of those resources)
  • Problem Statement : Design a solution that allows both the producer process and the consumer process to execute concurrently.

Solution to Producer-Consumer problem using Shared memory paradigm

  • Make available a buffer of items → that can be filled by producer and emptied by consumer
  • Buffer location → region of memory shared by both processes
  • Producer and consumer must be synchronized → so that the consumer does not try to consume an item that has not been produced yet.
  • Buffer sizes : two types possible
    • Unbounded buffer
      • No practical limit on size of buffer → Producer can produce items at all times
      • Consumer has to wait for items, if buffer is empty
    • Bounded buffer
      • fixed size of the buffer → producer must wait if the buffer is full
      • Consumer has to wait for items, if buffer is empty.

Let us look more closely at the solution to the Producer-Consumer problem using a bounded buffer.

  • In this solution, the shared buffer is implemented as a circular array.
  • Two logical pointers are used : 
    • in : points to the next empty position in the buffer
    • out : points to the first full position in the buffer
  • The buffer is empty when in == out, and full when ((in + 1) % BUFFER_SIZE) == out.
  • Because one slot is always left unused to tell the full case apart from the empty case, at most BUFFER_SIZE – 1 items can be stored using this scheme.
  • Also, this solution does not handle the case when both the producer and the consumer try to access the shared buffer simultaneously. This will be discussed in further posts when we discuss process synchronization.
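
The circular-array scheme described above can be sketched in a few lines of Python. This is a single-process illustration, not a synchronized implementation: the class name BoundedBuffer, the value of BUFFER_SIZE, and the choice to return False/None where a real producer or consumer would wait are all assumptions made for the demo.

```python
BUFFER_SIZE = 5  # assumed size, chosen only for the demonstration


class BoundedBuffer:
    """Circular array with 'in' and 'out' pointers (no synchronization)."""

    def __init__(self):
        self.buffer = [None] * BUFFER_SIZE
        self.in_ = 0   # next empty position ('in' is a Python keyword)
        self.out = 0   # first full position

    def is_empty(self):
        return self.in_ == self.out

    def is_full(self):
        return (self.in_ + 1) % BUFFER_SIZE == self.out

    def produce(self, item):
        if self.is_full():
            return False  # a real producer would wait here
        self.buffer[self.in_] = item
        self.in_ = (self.in_ + 1) % BUFFER_SIZE
        return True

    def consume(self):
        if self.is_empty():
            return None  # a real consumer would wait here
        item = self.buffer[self.out]
        self.out = (self.out + 1) % BUFFER_SIZE
        return item


buf = BoundedBuffer()
accepted = [buf.produce(i) for i in range(BUFFER_SIZE)]
print(accepted)  # the last attempt is rejected: only BUFFER_SIZE - 1 items fit
```

Trying to produce BUFFER_SIZE items shows the limit directly: the first four are accepted and the fifth is rejected, because the full test fires while one slot is still unused.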

MESSAGE PASSING SYSTEMS
Why Message passing?

  • To provide a means for cooperating processes to communicate with each other without sharing an address space
  • Useful for distributed systems → the communicating processes may reside on different computers connected by a network, so IPC via shared memory is not possible.
    • e.g. an internet chat program, designed so that participants communicate by exchanging messages

Message passing systems work in the following manner

  • Every message passing scheme provides support for at least two operations:
    • send(message)
    • receive(message)

Types of Messages


 

    #  | Fixed-size Messages | Variable-sized Messages
    ---|---------------------|------------------------
    1  | System-level implementation : straightforward | System-level implementation : complex
    2  | Programming : difficult | Programming : easy

This kind of trade-off is seen throughout OS design.

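
To get a feel for the fixed-size versus variable-sized trade-off, here is a small sketch in Python. Everything in it is an assumption made for illustration: the 16-byte slot size, the zero-byte padding, the 4-byte length prefix, and the function names. It only shows where the extra work lands in each case.

```python
import struct

FIXED_SIZE = 16  # assumed slot size in bytes


def pack_fixed(text: str) -> bytes:
    """Fixed-size message: simple for the system, but the programmer
    has to make every message fit (or split it up by hand)."""
    data = text.encode("utf-8")
    if len(data) > FIXED_SIZE:
        raise ValueError("message does not fit in a fixed-size slot")
    return data.ljust(FIXED_SIZE, b"\x00")  # pad with zero bytes


def unpack_fixed(slot: bytes) -> str:
    return slot.rstrip(b"\x00").decode("utf-8")


def pack_variable(text: str) -> bytes:
    """Variable-sized message: a 4-byte length prefix, then the data."""
    data = text.encode("utf-8")
    return struct.pack("!I", len(data)) + data


def unpack_variable(stream: bytes):
    """Return (message, remaining bytes) from a stream of framed messages."""
    (length,) = struct.unpack("!I", stream[:4])
    return stream[4:4 + length].decode("utf-8"), stream[4 + length:]


stream = pack_variable("hi") + pack_variable("a much longer message")
first, rest = unpack_variable(stream)
second, rest = unpack_variable(rest)
print(first, "|", second)
```

With fixed-size slots the receiving side always reads exactly FIXED_SIZE bytes, while the variable-sized version needs framing logic to find where each message ends.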
  • For two processes to communicate
    • they must first establish a communication link 
    • they must implement send(),receive() operations
    • This can be done in 3 ways : 
      • Direct or indirect communication
      • Synchronous or Asynchronous Communication
      • Automatic or explicit buffering
[Figure: Message Passing Systems]

Let us now discuss these ways in detail.

 Direct Communication

  • Processes that want to communicate must explicitly name the recipient or sender of the communication.
  • Primitives for communication use naming directly to send/receive messages:
    • send(P, message) → Send a message to process P.
    • receive(Q, message) → Receive a message from process Q.
    • This scheme exhibits symmetry in addressing, i.e. both the sender process and the receiver process must name each other to communicate.
  • Another variant of this scheme employs asymmetry in addressing
    • only the sender names the recipient
    • The recipient is not required to name the sender.
    • send(P, message) → Send a message to Process P.
    • receive(id, message) → Receive a message from any process. The variable id is set to the name of the process with which communication has taken place.
  • Properties of Communication link
    • Processes need to know each other’s identity before communicating. A link is established automatically between every pair of processes that want to communicate.
    • A link is associated with exactly two processes.
    • Between each pair of processes, there exists exactly one link.
  • Disadvantages of this scheme
    • Limited modularity of the resulting process definitions
    • Why? If the identifier of a process changes, all references to the old identifier have to be found and replaced.
    • This is similar to hard-coding, where identifiers are stated explicitly.
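
The two addressing variants can be modelled with a toy in-memory class. This is only a sketch: the class DirectMessaging and its method names are invented for this post, and since everything runs inside one program, the sender's name is passed explicitly to send() instead of being implied by the calling process.

```python
from collections import deque


class DirectMessaging:
    """Toy in-memory model of direct communication between named processes."""

    def __init__(self, names):
        # One inbox per process; each entry is a (sender, message) pair.
        self.inbox = {name: deque() for name in names}

    def send(self, sender, recipient, message):
        self.inbox[recipient].append((sender, message))

    def receive_from(self, me, sender):
        """Symmetric addressing: the receiver names the sender."""
        for entry in self.inbox[me]:
            if entry[0] == sender:
                self.inbox[me].remove(entry)
                return entry[1]
        return None  # nothing from that sender yet

    def receive_any(self, me):
        """Asymmetric addressing: returns (id, message) from any sender."""
        if not self.inbox[me]:
            return None
        return self.inbox[me].popleft()


net = DirectMessaging(["P", "Q", "R"])
net.send("P", "Q", "hello from P")
net.send("R", "Q", "hello from R")
from_r = net.receive_from("Q", "R")   # Q explicitly names R
any_msg = net.receive_any("Q")        # Q accepts whoever sent next
print(from_r, any_msg)
```

Notice that process names such as "P" and "R" appear directly in the calls. That is exactly the hard-coding drawback listed above: rename a process and every such call has to change.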

Indirect Communication

  • Communication via mailboxes , or ports
  • Mailbox → Abstract object into which messages can be placed and from which messages can be removed.
    • Each mailbox has a unique identification. e.g. in POSIX [Portable Operating System Interface, family of standards specified by IEEE], message queues use an integer to identify mailboxes.
    • A process can communicate with another process via a number of different mailboxes.
    • Two processes can communicate only if they have a shared mailbox.
  • Mailbox ownership :
    • Mailbox can be owned either by a process or the Operating system
    • Mailbox owned by the process
      • mailbox is part of address space of the process
      • Owner : Can only receive messages through this mailbox
      • User : Can only send messages to the mailbox.
      • If owner terminates → mailbox disappears → Senders of messages to this mailbox must be notified that mailbox no longer exists
    • Mailbox owned by the OS
      • Independent mailbox → has its own existence → not attached to any particular process
      • The OS must provide a mechanism that allows a process to : 
        • Create a new mailbox
        • Send and receive messages through the mailbox
        • Delete the mailbox
      • Initially, the creating process is the owner of the mailbox.
        • Ownership and receiving privileges can be passed to other processes via appropriate system calls.
        • This can result in multiple receivers for a mailbox.
  • Primitives for communication
    • send(A,message) : Send a message to mailbox A.
    • receive(A,message) : Receive a message from mailbox A.
  • Properties of Communication link
    • A link is established between a pair of processes only if both members of the pair have a shared mailbox.
    • A link may be associated with more than two processes.
    • Between each pair of communicating processes, a number of different links may exist, with each link corresponding to one mailbox.

Let us discuss an example of communication via indirect message passing.

  • 3 processes → P1,P2,P3 →  Share a mailbox A
  • P1 sends a message to A.
  • Both P2 and P3 can execute a receive from A.
  • Which process actually receives the message sent by P1 depends on the scheme we choose : 
    • Allow a link to be associated with at most two processes → a mailbox shared by three processes is not permitted, so the ambiguity cannot arise.
    • Allow at most one process at a time to execute a receive() operation → only one process receives the message.
    • Allow the system to select arbitrarily which process receives the message → the system may define an algorithm for the selection (say, round robin) and may identify the receiver to the sender → either P2 or P3 receives the message, but not both.
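
The third option, where the system picks the receiver, can be sketched as follows. The Mailbox class and its deliver() method are hypothetical names for this illustration, and round robin is just one possible selection algorithm.

```python
from collections import deque
from itertools import cycle


class Mailbox:
    """Toy mailbox where the 'system' picks the receiver round robin."""

    def __init__(self, receivers):
        self.messages = deque()
        self._next_receiver = cycle(receivers)  # round-robin order

    def send(self, message):
        self.messages.append(message)

    def deliver(self):
        """Hand the oldest message to the receiver chosen by the system."""
        if not self.messages:
            return None
        return next(self._next_receiver), self.messages.popleft()


A = Mailbox(["P2", "P3"])   # P2 and P3 both receive from mailbox A
A.send("m1")                # P1 sends two messages to A
A.send("m2")
first = A.deliver()
second = A.deliver()
print(first, second)
```

Each message goes to exactly one of P2 and P3, never both, and the returned pair tells us which receiver was selected, which is the information the system could pass back to the sender.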

Message Passing via Synchronization

  • Design options to implement message passing primitives
    • Blocking (Synchronous)
    • Non-blocking (Asynchronous)
       

      #  | Type         | Send | Receive
      ---|--------------|------|--------
      1  | Blocking     | The sending process is blocked until the message is received by the receiving process or by the mailbox. | The receiver blocks until a message is available.
      2  | Non-blocking | The sending process sends the message and resumes operation. | The receiver retrieves either a valid message or a null.

    • When both send() and receive() are blocking → we have a rendezvous between the sender and the receiver.
    • The producer merely invokes the blocking send() call and waits until the message is delivered to either the receiver or the mailbox.
    • Likewise, when the consumer invokes receive(), it blocks until a message is available.
    • The solution to the Producer-Consumer problem becomes trivial in this case.
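
Here is a small sketch of blocking and non-blocking receive using Python's queue module. Two caveats: threads stand in for processes, and the queue here is unbounded, so only the receive side blocks and this is not a true rendezvous. The helper nonblocking_receive is a made-up name.

```python
import queue
import threading

mailbox = queue.Queue()


def nonblocking_receive(q):
    """Return a message if one is waiting, otherwise None (the 'null')."""
    try:
        return q.get_nowait()
    except queue.Empty:
        return None


# Nothing has been sent yet, so the non-blocking receive returns None.
empty_result = nonblocking_receive(mailbox)

received = []


def consumer():
    for _ in range(3):
        # Blocking receive: waits here until a message is available.
        received.append(mailbox.get())


t = threading.Thread(target=consumer)
t.start()
for item in ("a", "b", "c"):
    mailbox.put(item)  # producer sends
t.join()
print(empty_result, received)
```

The consumer needs no explicit checks or retry loops, since the blocking get() does the waiting for it. That is what makes the producer-consumer solution so short with blocking primitives.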

Message passing via Automatic or explicit buffering

Whether communication is direct or indirect, messages exchanged by the communicating processes reside in a temporary queue until they are received. Such a queue can be implemented in three ways:

  • Zero capacity queue
    • no buffering
    • the communication link cannot have any messages waiting in it
    • the sender must block until the recipient receives the message
  • Bounded capacity queue
    • finite queue length, say n → at most n messages can reside in it at a time
    • queue not full while sending →
      • sender places the message in the queue
      • sender continues execution without waiting
    • queue full → 
      • sender must block until space is available in the queue
  • Unbounded capacity queue
    • potentially infinite queue length
    • sender never blocks
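
The bounded and unbounded cases are easy to try out with Python's queue module. In this sketch try_send is a hypothetical helper that reports whether a send would have had to block. The zero capacity case is left out, because queue.Queue(maxsize=0) means "no limit" in Python rather than "no buffering".

```python
import queue


def try_send(q, message):
    """Non-blocking send: True if queued, False if the sender would block."""
    try:
        q.put_nowait(message)
        return True
    except queue.Full:
        return False


# Bounded capacity queue with n = 2.
bounded = queue.Queue(maxsize=2)
bounded_results = [try_send(bounded, m) for m in ("m1", "m2", "m3")]

# Once the receiver removes a message, there is space again.
bounded.get()
after_receive = try_send(bounded, "m3")

# Unbounded capacity queue: the sender never has to wait.
unbounded = queue.Queue()
unbounded_ok = all(try_send(unbounded, i) for i in range(1000))

print(bounded_results, after_receive, unbounded_ok)
```

The third send on the bounded queue is the one that would block, and it goes through as soon as the receiver frees a slot.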

Issues involved in message Passing

  • Reliability
    • Messages may fail to arrive, or may arrive in garbled form.
  • Order
    • Messages can arrive out of order due to buffering delays or network congestion.
  • Access
    • Different approaches impose different restrictions on ports.
    • Bound port : only one reader and one writer
    • Free port : any number of readers and writers
    • Input port : one reader but any number of writers
    • Output port : any number of readers but only one writer
  • Integration with I/O
    • Integrating message passing with files and I/O → makes the OS API smaller and easier to understand
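
As a rough illustration of how a receiver might cope with the reliability and order issues, the sketch below attaches a sequence number and a CRC-32 checksum to every message. This is one common idea and not something prescribed by any particular OS; the packet layout and the function names make_packet and accept are assumptions for the demo.

```python
import zlib


def make_packet(seq, payload: bytes):
    """Attach a sequence number and a checksum to the payload."""
    return {"seq": seq, "payload": payload, "crc": zlib.crc32(payload)}


def accept(packets):
    """Drop garbled packets, then restore the sending order."""
    intact = [p for p in packets if zlib.crc32(p["payload"]) == p["crc"]]
    return [p["payload"] for p in sorted(intact, key=lambda p: p["seq"])]


p0, p1, p2 = (make_packet(i, d) for i, d in enumerate([b"a", b"b", b"c"]))

# Simulate a garbled message: the payload changes but the checksum does not.
p1 = dict(p1, payload=b"X")

# Simulate out-of-order arrival.
result = accept([p2, p1, p0])
print(result)
```

The garbled packet is detected and dropped, and the remaining two come out in sending order. A real protocol would also ask for the missing message to be sent again, which is beyond this sketch.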




In the next blog post, we will conclude our discussion on Process Concept. Until then, keep reading and sharing 🙂