Process Synchronization – IV – Mutex and Semaphores

In the previous post, we discussed a solution to the Critical Section problem using test_and_set() atomic instruction which satisfied all the three requirements mandated for a correct solution.

In this post, we shall start with software-based solutions designed to solve the critical section problem.

This is necessary, because

  • hardware-based solutions are generally inaccessible to application programmers.
  • Also, hardware level locks are difficult to implement in multi-processor architectures. Reason :
    • Uniprocessor architectures → can use uninterruptable sequence of instructions → using special instructions or instruction prefixes to disable interrupts temporarily
    • Multiprocessor architectures → need complex hardware/software support → need to handle synchronization issues

Let us discuss the simplest software tool available to handle critical section problem – the mutex lock

MUTEX LOCKS

Mutex stands for Mutual exclusion.

Definition : A mutex lock, in simple terms, is a boolean variable whose value indicates whether lock is available or not.

  • Mutex lock is used to protect critical regions and prevent race conditions.
  • Therefore, a process must acquire the lock before entering a critical section and should release the lock when it exits the critical section.
  • acquiring the lock → acquire() function
    • If the lock is available, a call to acquire() succeeds → the lock is then considered unavailable
    • If a process tries to acquire an unavailable lock → it is blocked until the lock is released
  • releasing the lock → release() function
  • acquire() and release() calls must be performed atomically
  • In real world, a mutex is similar to airplane lavatory. When it is occupied, other persons have to wait. When a person comes out, he/she signals the next person in waiting queue to use it.

Thus, the solution to critical section problem using mutex is as simple as acquiring the lock, executing critical section, and releasing the lock. However, if we notice one thing, which we have mentioned previously too, that if  process tries to acquire an unavailable lock, it is blocked,

  • The while loop in acquire() function make a process loop continuously, until the process executing its critical section releases the lock.
  • This is called busy waiting. Infact, it is a disadvantage of this implementation.
    • In a multi-programming system, one CPU is shared among many processes.
    • Busy waiting cycles waste CPU cycles, which could have been used to do some productive work instead.
  • This kind of mutex lock is also called a spinlock, since the process “spins” while waiting for the lock to become available.
  • Busy waiting was also present in implementations using test_and_set() and compare_and_swap() instructions.

However, not all is bad about spinlocks. They do have an advantage.

  • Since a process spins while waiting on a lock, therefore no context switch is required (which may take considerable time)
  • When locks are expected to be held for short times, spinlocks are useful because they save the overhead time which occurs in switching context.
  • As a result, they are often employed on multiprocessor systems, where one thread can spin on one processor, while the other thread executes its critical section on another processor.

SEMAPHORES

  • Semaphore : Sema [to signal] + phoros (Origin in Greek)
  • Definition : A semaphore S is an integer variable, for which only initialization and two atomic operations (wait() and signal()) are defined.
  • Originally
    • wait() operation was called P (from Dutch word proberen)to signal
    • signal() operation was called V (from dutch word verhogen) → to increment
  • It behaves similar to a mutex lock, but can provide more robust and sophisticated ways to implement process synchronization.
  • In a real world analogy, Semaphore is similar to having multiple lavatories on an airplane. Initially, all are available. The available count is decreased as they are occupied. If all are full, available count is 0, and all waiting persons are blocked. → they have to wait till one becomes empty.

The following points must be kept in mind before using semaphores : 

  • The variable changing steps [S++ and S–] must be executed indivisibly. This means that when a process modifies the semaphore value, no other process can simultaneously modify the same semaphore value.
  • For the wait() function, if the test in the while loop (S <= 0) is false, then there should be no interruptions before S gets decremented.
  • It is okay for the loop to be interrupted when the test is true. This prevents the system from hanging forever.

Semaphore Types

OS often distinguishes between semaphores based on the range of values they can take.

  • Binary Semaphore
    • Can range only between 0 and 1
    • Behaviour similar to mutex locks → Therefore, can be used to provide mutual exclusion in systems that do not provide a separate mutex mechanism

  • Counting Semaphore
    • Value can range over an unrestricted domain
    • Since they are not restricted to values 0 and 1, they can be used to control access to resources which consist of a finite number of instances
      • Say an organization has 20 printers to be shared throughout the office.
      • The semaphore S is initialized to the number of resource instances available. (in this case 20)
      • When a process wishes to use the resource, it performs a wait() operation on the semaphore → S– [decrementing the count]
      • Process releases the resource → signal() operation → S++
      • If semaphore count goes to 0 → all the resource instances are being used → all processes now wishing to use resources will get blocked until Semaphore count becomes greater than 0

Using semaphores to solve synchronization problems

Let us say we have to processes P1 and P2.

  • P1 wants to execute statement S1 and P2 wants to executes S2.
  • However, it is required to execute S2 only after S1.
  • In this case, we can implement this scheme using a common semaphore shared between the two processes.
  • We initialize the semaphore S to 0.

  • P2 will execute only after signal(S) has been executed, which can happen only after S1 has been executed.

Difference between a mutex and a Semaphore

Please note that mutex and semaphore use the same API, they are used for different purposes.

 

Mutex

Semaphore

1

Locking mechanism

Signaling mechanism

2

Correct use : Meant to be used as a lock. i.e. it can be taken and released

Correct use : Meant to be used as a signal from one task to another.

3

A single process acquires the mutex lock and then releases it. Only then can it be used by other processes.

Processes that use semaphore can either signal() or wait() – not both. e.g. a process P1 may contain statement to signal(increment) a particular semaphore at a particular time. A different process P2, which waits on that semaphore, will then carry out its task. In this example, P1 is the producer of the event, and P2 is the consumer.

4

Mutex lock : can have only two values → 0 or 1

Binary Semaphores : 0 or 1 Counting Sempahores : can have any value

5

In the next post, we will discuss Semaphore implementation, and the concept of Priority inversion. Until then, keep reading and sharing 🙂
Sources : 1, 2, 3, 4