Operating Systems (OS)

An Operating System (OS) is system software that manages computer hardware and software resources and provides common services for computer programs.

Think of it as the "manager" of your computer.

+---------------------+
|   Your Apps         |
| (Chrome, VS Code)   |
+----------+----------+
| Operating System    |
| (Windows, Linux, macOS) |
+----------+----------+
| Hardware            |
| (CPU, RAM, Disk, GPU) |
+---------------------+

Why Do We Need an OS?

| Problem | How the OS Solves It |
|---|---|
| Hardware is complex | OS hides the complexity |
| Apps need CPU, RAM, disk | OS allocates them fairly |
| Multiple apps run together | OS schedules them |
| You click the mouse → an app opens | OS translates the input |

Without OS → You’d write machine code for every CPU instruction.
With OS → You just double-click an app.

Core Functions of an OS

| Function | What It Does | Example |
|---|---|---|
| Process Management | Create, run, kill apps | Open Chrome → new process |
| Memory Management | Give RAM to apps | 2 GB to VS Code |
| File System Management | Read/write files | Save report.docx |
| Device Management | Talk to hardware | Print to printer |
| Security | Protect data/users | Login password |
| User Interface | GUI or CLI | Desktop, Terminal |

Types of Operating Systems

| Type | Examples | Use Case |
|---|---|---|
| Desktop OS | Windows, macOS, Linux (Ubuntu) | Laptops, PCs |
| Mobile OS | Android, iOS | Phones |
| Server OS | Ubuntu Server, Windows Server, RHEL | Cloud, data centers |
| Real-Time OS (RTOS) | FreeRTOS, VxWorks | Robots, cars |
| Embedded OS | Linux (Raspberry Pi), TinyOS | IoT devices |

Key Components of an OS

+---------------------+
|     User Space      |
|  Apps, Libraries    |
+----------+----------+
|     Kernel Space    |
|  [Kernel]           |
|  ├─ Process Mgr     |
|  ├─ Memory Mgr      |
|  ├─ File System     |
|  ├─ Device Drivers  |
|  └─ System Calls    |
+----------+----------+
|     Hardware        |
+---------------------+

Von Neumann Architecture

The Von Neumann Architecture is the foundational design of most modern computers, where a single memory stores both program instructions and data, and the CPU executes instructions in a fetch-decode-execute cycle.

+------------------+
|     Memory       | ← Stores BOTH Instructions & Data
| [1010] [Data]    |
+--------+---------+
         ↑
         |
       CPU
  +------+------+
  |  ALU  |     |
  |  Registers  |
  |  Control    |
  +-------------+

"Stored Program Concept" – Instructions and data are stored in the same memory and are indistinguishable in format.

Core Components

| Component | Full Name | Role | Details |
|---|---|---|---|
| 1. CPU | Central Processing Unit | Brain of the computer | Executes instructions via the fetch-decode-execute cycle |
| 2. Memory | Main Memory (RAM) | Stores instructions + data | Single shared memory (the Von Neumann bottleneck) |
| 3. I/O System | Input/Output Devices | Interact with user/world | Keyboard, screen, disk, network |
| 4. Bus System | Interconnect | Data highways | Connects CPU, Memory, I/O |

1. CPU (Central Processing Unit) – The Executor

+---------------------+
|        CPU          |
|  +---------------+  |
|  |  ALU          |  | ← Arithmetic Logic Unit (ADD, AND, etc.)
|  +---------------+  |
|  |  Control Unit |  | ← Orchestrates fetch-decode-execute
|  +---------------+  |
|  |  Registers    |  | ← High-speed storage (PC, IR, ACC, etc.)
|  +---------------+  |
+---------------------+

Sub-Components of CPU

| Part | Function |
|---|---|
| ALU | Performs math/logic (+, &, ==) |
| Control Unit | Sends control signals (e.g., "read memory") |
| Registers | Tiny, fast storage inside the CPU |

2. Memory (RAM) – The Shared Workspace

Memory Address | Content
---------------|---------
0x1000         | LOAD R1, [0x2000]
0x1004         | ADD R1, R2
0x1008         | STORE R1, [0x2008]
0x2000         | 5        ← Data
0x2004         | 3        ← Data

3. I/O System – The Bridge to the World

| Device Type | Direction | Example |
|---|---|---|
| Input | User → CPU | Keyboard, mouse, microphone |
| Output | CPU → User | Monitor, speaker, printer |
| Storage | Persistent | SSD, HDD, USB |

Disk → Memory → CPU → Screen

Bus System – The Data Highways

The Bus System is a set of parallel wires (physical or logical) that carry data, addresses, and control signals between CPU, Memory, and I/O.

CPU ↔ [Address Bus] ↔ Memory
CPU ↔ [Data Bus]    ↔ Memory
CPU ↔ [Control Bus] ↔ Memory/I/O

Three Types of Buses

| Bus Type | Direction | Width | Purpose | Example |
|---|---|---|---|---|
| Address Bus | CPU → Memory/I/O | 32/64 bits | "Where?" – memory address to read/write | 0x2000 |
| Data Bus | Bidirectional | 32/64 bits | "What?" – the actual data/instruction | LOAD R1, R2 or 42 |
| Control Bus | Bidirectional | ~10–20 lines | "When/How?" – control signals | READ, WRITE, IRQ |


Summary Table

| Component | Analogy | Key Role |
|---|---|---|
| CPU | Brain | Executes instructions |
| Memory | Desk | Stores code + data |
| I/O | Hands/Eyes | Input/Output |
| Bus System | Roads | Connects everything |

Golden Rule:

"CPU asks Memory for instructions/data using Address Bus, gets them via Data Bus, and uses Control Bus to say READ/WRITE."

CPU Registers (The "Scratchpad")

| Register | Purpose | Example |
|---|---|---|
| PC (Program Counter) | Holds the address of the next instruction | 0x1000 |
| IR (Instruction Register) | Holds the instruction currently being executed | ADD R1, R2 |
| MAR (Memory Address Register) | Holds the memory address to read/write | 0x2000 |
| MDR (Memory Data Register) | Holds data going to/from memory | 42 |
| ACC (Accumulator) | Stores intermediate results | Result of ADD |
| General-Purpose Registers (R0–Rn) | Fast storage for variables | R1 = 10 |
| Status Register (Flags) | Stores condition codes | Z=1 (zero), N=1 (negative) |

Instruction Cycle (Fetch-Decode-Execute)

REPEAT FOREVER:
  1. FETCH
     PC → MAR
     Memory[MAR] → MDR → IR
     PC = PC + 1  (or + instruction size, e.g., +4 on a 32-bit ISA)

  2. DECODE
     Decode IR → know operation (ADD, LOAD, etc.)

  3. EXECUTE
     Do the operation (ALU, memory, I/O)
     Update flags if needed
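
To make the cycle concrete, here is a toy fetch-decode-execute interpreter in C. The one-word instruction format (opcode in the high byte, address in the low byte) is invented for illustration, not a real ISA:

#include <stdio.h>

enum { LOAD = 1, ADD = 2, STORE = 3, HALT = 4 };

int main(void) {
    int mem[256] = {0};
    // Tiny program: ACC = mem[100] + mem[101]; mem[102] = ACC
    mem[0] = LOAD  << 8 | 100;
    mem[1] = ADD   << 8 | 101;
    mem[2] = STORE << 8 | 102;
    mem[3] = HALT  << 8;
    mem[100] = 5; mem[101] = 3;

    int pc = 0, acc = 0, running = 1;
    while (running) {
        int ir = mem[pc++];                 // FETCH: mem[PC] → IR, advance PC
        int op = ir >> 8, addr = ir & 0xFF; // DECODE: split opcode and operand
        switch (op) {                       // EXECUTE
            case LOAD:  acc = mem[addr];  break;
            case ADD:   acc += mem[addr]; break;
            case STORE: mem[addr] = acc;  break;
            case HALT:  running = 0;      break;
        }
    }
    printf("mem[102] = %d\n", mem[102]);    // Prints 8
    return 0;
}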


Modes of Execution

| Mode | Privilege | Use |
|---|---|---|
| User Mode | Limited | Run apps (Chrome, VS Code) |
| Kernel/Supervisor Mode | Full access | OS kernel, drivers |
| Hypervisor Mode (modern) | Highest | Virtual machines (VMware, KVM) |

Switching: Apps use system calls → trap to kernel mode.

Example: Simple Program in Memory

Memory Address | Content
---------------|---------
0x1000         | LOAD R1, [0x2000]   ← PC points here
0x1004         | LOAD R2, [0x2004]
0x1008         | ADD R1, R2
0x100C         | STORE R1, [0x2008]
0x1010         | HALT

Data:

0x2000: 5
0x2004: 3
→ Result: 0x2008 = 8


Process Management

1. What is a Program?

A program is a passive set of instructions stored on disk.

hello.c  →  Source
hello.exe → Executable (binary file on disk)

2. What is a Process?

A process is an active program in execution.

Program (on disk) → OS loads → Process (in RAM + CPU)

| Component | Meaning |
|---|---|
| Code (Text) | Instructions |
| Data | Global variables, heap, stack |
| Registers | PC, SP, CPU registers |
| State | Running, Ready, Blocked |

3. Process Control Block (PCB) – Process Attributes

PCB is the "ID card" of a process – OS stores all info here.

struct PCB {
    int   pid;               // Process ID
    int   state;             // Running, Ready, Blocked
    int   priority;          // Scheduling priority
    void *pc;                // Saved program counter
    long  registers[16];     // Saved CPU registers (size is illustrative)
    void *mem_base;          // Memory limits: base ...
    void *mem_limit;         // ... and limit
    int   open_files[16];    // Open file descriptors (illustrative)
    int   cpu_time_used;     // Total CPU time consumed
    int   arrival_time;      // When it entered the system
    int   burst_time;        // CPU time needed
    // ... more
};

4. Process States (Lifecycle)

New → Ready → Running → Terminated
        ↑        ↓ (waits for I/O or an event)
        └─ Waiting/Blocked

| State | Meaning |
|---|---|
| New | Just created, not yet in memory |
| Ready | In memory, waiting for the CPU |
| Running | Executing on the CPU |
| Waiting/Blocked | Waiting for I/O or a signal |
| Terminated | Finished or killed |

5. Process Operations (System Calls)

| Operation | System Call | Action |
|---|---|---|
| Create | fork() | Duplicate the calling process |
| Execute | exec() | Load a new program into the process |
| Wait | wait() | Parent waits for a child |
| Exit | exit() | Terminate the process |
| Kill | kill() | Send a signal |

pid_t pid = fork();                     // Creates a child (fork returns 0 in the child)
if (pid == 0) execlp("ls", "ls", NULL); // Child replaces itself with ls
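
A fuller sketch of the create/execute/wait lifecycle (standard POSIX calls; error handling kept minimal):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void) {
    pid_t pid = fork();                  // Duplicate this process
    if (pid < 0) { perror("fork"); exit(1); }
    if (pid == 0) {                      // Child branch
        execlp("ls", "ls", "-l", (char *)NULL); // Replace image with ls
        perror("execlp");                // Only reached if exec fails
        exit(1);
    }
    int status;
    wait(&status);                       // Parent blocks until the child exits
    printf("child %d exited with status %d\n", pid, WEXITSTATUS(status));
    return 0;
}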

6. Scheduling Queues

OS maintains queues of PCBs:

+-------------+     +-------------+     +-------------+
| Ready Queue | →   | Device Queue| →   | Job Queue   |
| (FCFS, RR)  |     | (Disk, Net) |     | (Batch)     |
+-------------+     +-------------+     +-------------+

| Queue | Contains |
|---|---|
| Job Queue | All processes in the system |
| Ready Queue | Processes in RAM, ready for the CPU |
| Device Queue | Processes waiting for I/O |

7. CPU Scheduler & Dispatcher

| Component | Role |
|---|---|
| Scheduler | Decides which process runs next |
| Dispatcher | Switches the CPU to that process |

8. Context Switching

Saving state of current process → Loading state of next process

Process A (Running)
   ↓
Save: PCB_A (PC, registers, SP)
   ↓
Load: PCB_B
   ↓
Process B (Running)

Time: 1–100 μs (overhead)

9. CPU Scheduling Timings

| Term | Formula | Meaning |
|---|---|---|
| Arrival Time (AT) | – | When the process enters the system |
| Burst Time (BT) | – | CPU time needed |
| Completion Time (CT) | – | When the process finishes |
| Turnaround Time (TAT) | CT − AT | Total time in the system |
| Waiting Time (WT) | TAT − BT | Time spent waiting |
| Response Time (RT) | First CPU − AT | Time until the first run |
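
A quick worked example under FCFS (hypothetical numbers):

P1: AT=0, BT=5 → runs 0–5 → CT=5, TAT = 5−0 = 5, WT = 5−5 = 0
P2: AT=1, BT=3 → runs 5–8 → CT=8, TAT = 8−1 = 7, WT = 7−3 = 4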

10. Types of Schedulers

| Type | When | Goal |
|---|---|---|
| Long-term (Job) | Admits processes to RAM | Balance load |
| Mid-term | Swaps processes in/out | Manage memory |
| Short-term (CPU) | Picks the next process | Minimize waiting |


11. Golden Rules

| Rule | Why |
|---|---|
| Process ≠ Program | One program → many processes |
| PCB = process DNA | The OS tracks everything here |
| Context switch = expensive | Registers must be saved/restored |
| Scheduler = fairness + performance | FCFS, SJF, RR, Priority |

CPU Scheduling Algorithms

Goal: Maximize CPU utilization, throughput, minimize waiting time, turnaround time, response time.

1. First-Come, First-Served (FCFS)

"Queue at the counter" – Non-preemptive

Example (bursts of 20, 2, 2 arriving in that order at t = 0): Avg WT = (0 + 20 + 22) / 3 = 14

Cons: Convoy Effect – Short jobs wait behind long ones.

2. Shortest Job First (SJF) – Optimal (Non-Preemptive)

"Do small tasks first"

Cons: Starvation for long jobs, needs burst time prediction

3. Shortest Remaining Time First (SRTF) – Preemptive SJF

"If a shorter job arrives, pause current"

Same as SJF if no new arrivals during execution.
With new short job, it preempts.

4. Round Robin (RR) – Preemptive, Time-Sharing

"Everyone gets 1 turn" – Time Quantum (e.g., 4 ms)

Pros: Fair, responsive
Cons: High context switch overhead if quantum too small

5. Priority Scheduling

"VIPs go first" – Preemptive or Non-Preemptive

Cons: Starvation of low-priority jobs
Fix: Aging – Increase priority over time

6. Multilevel Queue Schedulers

Why?

| Problem | Solution |
|---|---|
| Different types of processes (interactive, batch, real-time) | Separate queues |
| A single algorithm is not optimal for all | Tailored scheduling per queue |
| Starvation under priority scheduling | Feedback promotes/demotes |

"Fixed priority queues – processes stay in their queue."

+--------------------+
| Queue 0 (RR, q=8)  | ← System processes (Highest)
+--------------------+
| Queue 1 (RR, q=16) | ← Interactive (Foreground)
+--------------------+
| Queue 2 (FCFS)     | ← Batch (Background)
+--------------------+

| Feature | Detail |
|---|---|
| Fixed Assignment | A process enters one queue and stays there |
| Queue Priority | The higher queue always runs first |
| Intra-Queue Scheduling | Each queue has its own algorithm |
| No Movement | No promotion/demotion |

Pros & Cons

| Pros | Cons |
|---|---|
| Tailored scheduling | Starvation of lower queues |
| Simple | Inflexible |
| Predictable | Poor for dynamic workloads |

7. Multilevel Feedback Queue (MLFQ)

"Dynamic priority – processes move between queues based on behavior."

+--------------------+
| Q0 (RR, q=8)       | ← New / Short jobs
+--------------------+
| Q1 (RR, q=16)      | ← Used full quantum
+--------------------+
| Q2 (FCFS)          | ← CPU-bound (long jobs)
+--------------------+

Rules of MLFQ

| Rule | Meaning |
|---|---|
| 1 | If Priority(A) > Priority(B) → A runs |
| 2 | If Priority(A) == Priority(B) → round-robin |
| 3 | New process → top queue |
| 4 | Uses its full quantum → demoted to the next queue |
| 5 | After some time → promoted to a higher queue (aging) |
| 6 | I/O-bound → stays in a high queue |

Queue Movement

| Time | Action |
|---|---|
| 0 | P1 starts in Q0 |
| 2 | P1 uses its full quantum → demoted to Q1 |
| 4 | P2 returns early (I/O) → stays in Q0 |
| 8 | P1 runs in Q1, uses 2 units of its quantum → 1 left |

In real systems, MLFQ wins because short jobs finish fast.

Aging (Anti-Starvation)

Every 5 seconds:
  for each process in lower queue:
    priority++

Prevents infinite wait.

Real-World Use

| OS | Scheduler |
|---|---|
| Windows | MLFQ-inspired (priority + quantum) |
| Linux (since 2.6.23) | CFS (fair share) – evolved beyond MLFQ |
| macOS | MLFQ with fairness |
| Unix | Original MLFQ |

MLFQ is the foundation of modern OS schedulers.

Linux CFS

CFS = Completely Fair Scheduler
Goal: Give every process a fair share of CPU time, proportional to its nice value.

Core Idea: Virtual Runtime (vruntime)

"The process that has used the least CPU time runs next." "Every nanosecond of CPU is accounted for, and the task that has waited longest (in weighted time) runs to run."

vruntime = actual_runtime × (1024 / weight)
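
A toy C program showing how this formula turns nice values into vruntime rates (the kernel actually uses a precomputed weight table, but the 1.25× ratio per nice level is the same idea; compile with -lm):

#include <math.h>
#include <stdio.h>

int main(void) {
    for (int nice = -5; nice <= 5; nice += 5) {
        double weight = 1024.0 / pow(1.25, nice);  // Weight formula from above
        // vruntime advances at 1024/weight per unit of real runtime:
        // heavier (lower-nice) tasks age slower, so they get more CPU.
        printf("nice=%+d  weight=%4.0f  vruntime rate=%.2fx\n",
               nice, weight, 1024.0 / weight);
    }
    return 0;
}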

Key Concepts

| Concept | Meaning |
|---|---|
| Nice Value | −20 (highest priority) to +19 (lowest); default 0 |
| Weight | weight = 1024 / (1.25^nice) |
| Red-Black Tree | Stores runnable tasks sorted by vruntime |
| sched_latency | Target period in which every runnable task runs once (~20 ms for 4 tasks) |
| min_granularity | Minimum time slice (~1 ms) |

Algorithm: pick the leftmost task in the red-black tree (the one with the smallest vruntime), run it, add its weighted runtime to its vruntime, then re-insert it into the tree.

Summary Table: CPU Scheduling

| Algorithm | Preemptive? | Avg WT (example) | Best For |
|---|---|---|---|
| FCFS | No | 14 | Batch |
| SJF | No | 4 | Minimum wait |
| SRTF | Yes | ~4 | Optimal wait |
| RR | Yes | 2.67 | Interactive |
| Priority | Yes/No | Varies | Real-time |
| MLFQ | Yes | Balanced | General-purpose |

Inter-Process Communication (IPC)

IPC = How processes talk to each other

Process A        Process B
   ↓                ↓
[Shared Memory] OR [Message Passing]

1. Why IPC?

| Use Case | Example |
|---|---|
| Data Sharing | Web server → cache |
| Synchronization | Producer-Consumer |
| Modularity | Microservices |

2. Two Models of IPC

| Model | Mechanism | Speed | Example |
|---|---|---|---|
| Shared Memory | Same memory region | Fast | mmap, shmget |
| Message Passing | Send/receive messages | Slower | Pipes, sockets, message queues |

IPC Mechanisms

1. Pipes

Unidirectional data channel between related processes (parent-child).
- Anonymous: Created via pipe() → 2 file descriptors (read/write)
- Named (FIFO): mkfifo() → any process can open
Use: Shell pipelines (ls | grep), parent → child data

int pipefd[2];
pipe(pipefd);                       // pipefd[0] = read end, pipefd[1] = write end
if (fork() == 0) {
    read(pipefd[0], buf, 2);        // Child reads
} else {
    write(pipefd[1], "hi", 2);      // Parent writes
}

2. Message Queues

Kernel-persistent queue for structured messages.
- mq_open(), mq_send(), mq_receive()
- Messages have priority
- Survives process crash
Use: Decoupled communication, task queues

mqd_t mq = mq_open("/myqueue", O_CREAT | O_RDWR, 0644, NULL); // O_CREAT requires mode + attrs
mq_send(mq, "data", 4, 0);
mq_receive(mq, buf, size, &prio);

Pros: Decoupled, priority

3. Shared Memory

Fastest IPC: processes map same physical memory.
- shmget(), shmat()
- No kernel copy → zero-copy
- Requires synchronization (semaphore/mutex)
Use: High-performance data sharing (databases, video)

key_t key = ftok("/tmp", 'A');                   // Agree on a key (any shared path)
int shmid = shmget(key, 1024, IPC_CREAT | 0666);
char* data = shmat(shmid, NULL, 0);
strcpy(data, "Hello");

Fastest but needs synchronization

4. Semaphores

Semaphore – An integer variable used for synchronization and resource counting.
Supports two atomic operations:
- wait() (P): decrement; block if the value is 0
- signal() (V): increment; wake up a waiter

Types:
- Binary Semaphore = 0 or 1 → acts like a lock
- Counting Semaphore = N → allows N resources

Mutex vs Semaphore – Key Difference

| Feature | Mutex | Semaphore |
|---|---|---|
| Purpose | Mutual exclusion (lock) | Synchronization + counting |
| Ownership | Owned by one thread | No ownership (any thread can signal) |
| Value | Always binary (locked/unlocked) | Can be N (counting) |
| Use Case | Protect a critical section | Control access to N resources |
| Example | pthread_mutex_lock() | sem_wait(&empty) |

Mutex = "I own this lock"
Semaphore = "There are 5 slots available"

Producer-Consumer Problem

Problem:
One or more producers generate data → put it in a bounded buffer
One or more consumers take data from the buffer

Challenges:
- Buffer overflow (producer too fast)
- Buffer underflow (consumer too fast)
- Race condition on buffer access

Solution Using 3 Semaphores

#include <semaphore.h>

#define BUFFER_SIZE 10
int buffer[BUFFER_SIZE];
int in = 0, out = 0;

// Semaphores (initialized with sem_init; see the setup sketch below)
sem_t empty;    // Counts empty slots (init = 10)
sem_t full;     // Counts full slots  (init = 0)
sem_t mutex;    // Binary lock for buffer access (init = 1)

Producer Code

void producer() {
    int item;
    while (1) {
        item = produce_item();

        sem_wait(&empty);    // Wait for empty slot
        sem_wait(&mutex);    // Lock buffer

        buffer[in] = item;   // Critical section
        in = (in + 1) % BUFFER_SIZE;

        sem_post(&mutex);    // Unlock
        sem_post(&full);     // Signal: one more full slot
    }
}

Consumer Code

void consumer() {
    int item;
    while (1) {
        sem_wait(&full);     // Wait for full slot
        sem_wait(&mutex);    // Lock buffer

        item = buffer[out];  // Critical section
        out = (out + 1) % BUFFER_SIZE;

        sem_post(&mutex);    // Unlock
        sem_post(&empty);    // Signal: one more empty slot

        consume_item(item);
    }
}

How Semaphores Solve It

| Semaphore | Initial | Role |
|---|---|---|
| empty | 10 | Blocks the producer when the buffer is full |
| full | 0 | Blocks the consumer when the buffer is empty |
| mutex | 1 | Prevents a race condition on in/out |

Summary: Semaphores in Producer-Consumer

empty → "How many slots are free?"
full  → "How many items are ready?"
mutex → "Only one at a time can touch buffer"

No overflow, no underflow, no race condition — guaranteed by atomic wait()/signal()
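
For completeness, a minimal sketch of the setup these snippets assume (POSIX semaphores + pthreads; the thread wrappers are hypothetical glue around the producer()/consumer() loops above):

#include <pthread.h>
#include <semaphore.h>

void *producer_thread(void *arg) { producer(); return NULL; }
void *consumer_thread(void *arg) { consumer(); return NULL; }

int main(void) {
    sem_init(&empty, 0, BUFFER_SIZE);   // All slots start empty
    sem_init(&full,  0, 0);             // No items yet
    sem_init(&mutex, 0, 1);             // Buffer lock starts unlocked

    pthread_t p, c;
    pthread_create(&p, NULL, producer_thread, NULL);
    pthread_create(&c, NULL, consumer_thread, NULL);
    pthread_join(p, NULL);              // The loops above run forever; joins for form
    pthread_join(c, NULL);
    return 0;
}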

5. Sockets

Bidirectional communication endpoint (local or network).
- UNIX Domain: Local (AF_UNIX) → fast, file-based
- TCP/UDP: Network (AF_INET)
Use: Client-server, microservices, distributed systems

// TCP
socket() → bind() → listen() → accept()
// OR
socket() → connect()

Used in networking, microservices

6. Signals

Asynchronous notification to a process.
- kill(pid, SIGUSR1) → send
- signal(SIGUSR1, handler) → receive
- Limited data (just signal number)
Use: Terminate, pause, custom events (SIGUSR1/SIGUSR2)

kill(pid, SIGUSR1);  // Send
signal(SIGUSR1, handler); // Receive

Asynchronous, limited data
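
A minimal runnable sketch using sigaction() (preferred over signal() in modern code); the process signals itself so the example is self-contained:

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t got = 0;

static void handler(int sig) {
    got = sig;                       // Only async-signal-safe work in handlers
}

int main(void) {
    struct sigaction sa = {0};
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR1, &sa, NULL);   // Register the handler

    raise(SIGUSR1);                  // Send SIGUSR1 to ourselves
    while (!got)
        pause();                     // Sleep until a signal arrives
    printf("received signal %d\n", (int)got);
    return 0;
}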

Summary Table: IPC

| Method | Speed | Sync Needed? | Use Case |
|---|---|---|---|
| Shared Memory | Fastest | Yes | High performance |
| Message Queue | Medium | No | Decoupled tasks |
| Semaphores | Fast | Yes (with shared memory) | Sync + resource counting (Producer-Consumer) |
| Pipes | Medium | No | Parent-child |
| Sockets | Slower | No | Network |
| Signals | Fast | No | Notifications |

Golden Rules

| Scheduling | IPC |
|---|---|
| SJF = minimum wait time | Shared memory = fastest |
| RR = fair for interactive work | Use a semaphore with shared memory |
| MLFQ = best of both | Message passing = safer |

Concurrent Programming

Concurrent Programming is the practice of designing and writing programs that can execute multiple tasks at the same time, either by parallel execution on multiple CPU cores or by interleaving operations on a single core.

Sequential:
Task A → Task B → Task C

Concurrent:
Task A  ↘
         ↘  →  Result
Task B  ↗

Why Concurrent Programming?

| Need | Example |
|---|---|
| Speed | Video encoding on an 8-core CPU |
| Responsiveness | Web server handling 1000 users |
| Resource Use | Overlap I/O and CPU (download + process) |
| Real-World Modeling | Simulate traffic, users, sensors |

Concurrency vs Parallelism


All parallel is concurrent, but not all concurrent is parallel. "Concurrency is about structure. Parallelism is about speed."

Models of Concurrent Programming

| Model | Language | Example |
|---|---|---|
| Threads | C, Java, Python | pthread_create(), Thread.start() |
| Async/Await | Python, JS, C# | async def fetch() |
| Actors | Erlang, Akka | actor ! message |
| Futures/Promises | Java, JS | CompletableFuture |
| Coroutines | Go, Kotlin | go func(), suspend fun |

Challenges & Solutions

| Challenge | Description | Solution(s) |
|---|---|---|
| Race Condition | Multiple tasks access shared data concurrently; the outcome depends unpredictably on timing. | Mutexes, locks, atomic operations |
| Deadlock | Two or more tasks permanently block each other, each holding a resource while waiting for another's. | Avoid circular wait, timeouts, resource ordering |
| Starvation | A task is perpetually denied CPU or resources because higher-priority tasks dominate. | Fair scheduling, aging, priority boosting |
| Livelock | Tasks actively retry conflicting actions forever without progress (like deadlock, but busy). | Randomized backoff, leader election, timeouts |
| Priority Inversion | A low-priority task blocks a high-priority one by holding a resource it needs. | Priority inheritance, priority ceiling |
| Overhead | High cost of context switching, synchronization, and thread management. | Thread pools, async I/O, event-driven models |

What Is Deadlock?

Deadlock = A set of processes are permanently blocked, each holding a resource and waiting for another resource held by another process in the set.

P1 holds R1 → wants R2
P2 holds R2 → wants R1
→ Circular wait → DEADLOCK

4 Necessary Conditions for Deadlock (Coffman Conditions)

| Condition | Meaning |
|---|---|
| 1. Mutual Exclusion | A resource can be held by only one process at a time |
| 2. Hold and Wait | A process holds one resource while waiting for another |
| 3. No Preemption | Resources cannot be forcibly taken away |
| 4. Circular Wait | Processes form a cycle in the resource-allocation graph |

All 4 must be true → deadlock possible
Break any one → deadlock impossible

1. Deadlock Prevention

Design system so that at least one of the 4 conditions is never satisfied.

1: Eliminate Mutual Exclusion

| How | Example |
|---|---|
| Use spooling instead of exclusive access | Printer → spool to disk → multiple jobs "share" it |

Not practical for most resources (RAM, files).

Rarely used – most resources are inherently exclusive.

2: Eliminate Hold and Wait

| Method | How |
|---|---|
| Request all resources at once | The process declares all its needs upfront |
| Release before requesting new | If it can't get everything, it releases what it holds |

// Example
lock(R1, R2, R3);  // Get all or none
do_work();
unlock(R1, R2, R3);

Cons:
- Low resource utilization
- Starvation (a process with a long list of needs may wait forever)
- Hard to predict all needs upfront

3: Allow Preemption

| How | Example |
|---|---|
| Force release of a resource | OS kills the process or takes the resource |
| Checkpoint & rollback | Save state, restart later |

Used in:
- Virtual memory (page stealing)
- Some database systems

Cons:
- Complex
- Not all resources are preemptible (a half-printed page)

4: Prevent Circular Wait – MOST PRACTICAL

| Method | How |
|---|---|
| Total ordering of resources | Assign numbers: R1=1, R2=2, ..., Rn=n |
| Request in increasing order | A process must acquire lower-numbered resources before higher ones |

// Correct
lock(R1);  // 1
lock(R3);  // 3

// Wrong → violates the order
lock(R3);  // 3
lock(R1);  // 1 → DENIED or BLOCKED

Pros:
- Simple
- No runtime overhead
- Widely used

Cons:
- Arbitrary ordering
- May reduce concurrency

Prevention Summary Table

| Strategy | Breaks Condition | Practical? | Used In |
|---|---|---|---|
| Mutual Exclusion | 1 | No | Spooling |
| Hold and Wait | 2 | Medium | Batch systems |
| No Preemption | 3 | Medium | VM, databases |
| Circular Wait | 4 | Yes | Most OS, DBs, apps |

2. Deadlock Avoidance

Grant a resource request only if the resulting state is safe, so the system never enters an unsafe state. This requires runtime checks on every request.

Banker’s Algorithm (Dijkstra)

| Input | Meaning |
|---|---|
| Available | Free resources |
| Max | Maximum demand of each process |
| Allocation | Currently held |
| Need | Max − Allocation |

Safety Algorithm

  1. Find a process where Need ≤ Available
  2. Assume it finishes → add Allocation to Available
  3. Repeat
  4. If all processes can finish → Safe; otherwise → Unsafe

Example:
Available: [3,3,2]
P1 Need: [1,0,2] ≤ Available → P1 runs → releases its allocation
→ Available becomes [4,3,4]

Pros: Optimal utilization
Cons: Requires future knowledge, high overhead

Used in: Early systems, rarely in modern OS
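
A compact C sketch of the safety check (process/resource counts and all numbers are hypothetical):

#include <stdbool.h>
#include <stdio.h>

#define P 3  // processes
#define R 3  // resource types

// True if some completion order lets every process finish.
bool is_safe(int avail[R], int alloc[P][R], int need[P][R]) {
    int work[R];
    bool done[P] = {false};
    for (int r = 0; r < R; r++) work[r] = avail[r];

    for (int pass = 0; pass < P; pass++) {
        bool progress = false;
        for (int p = 0; p < P; p++) {
            if (done[p]) continue;
            bool ok = true;
            for (int r = 0; r < R; r++)
                if (need[p][r] > work[r]) { ok = false; break; }
            if (ok) {                       // Assume p finishes: release its resources
                for (int r = 0; r < R; r++) work[r] += alloc[p][r];
                done[p] = true;
                progress = true;
            }
        }
        if (!progress) break;               // No process can proceed
    }
    for (int p = 0; p < P; p++)
        if (!done[p]) return false;         // Someone can never finish → unsafe
    return true;
}

int main(void) {
    int avail[R] = {3, 3, 2};
    int alloc[P][R] = {{0,1,0}, {2,0,0}, {3,0,2}};
    int need[P][R]  = {{1,0,2}, {1,2,2}, {0,1,1}};
    printf("state is %s\n", is_safe(avail, alloc, need) ? "SAFE" : "UNSAFE");
    return 0;
}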

3. Deadlock Detection

Allow deadlock to occur, then detect and recover.

Resource Allocation Graph (RAG)

graph TD
    P1 --> R1
    P2 --> R2
    R1 --> P2
    R2 --> P1
    style P1 fill:#f96
    style P2 fill:#f96

Detection Algorithm

  1. Mark all processes with no outgoing request
  2. Remove them and their resources
  3. Repeat
  4. If any process remains → Deadlock

Run periodically or on resource request failure

4. Deadlock Recovery

Once detected, break the deadlock.

1: Process Termination

Method How
Kill one process Choose cheapest (least CPU, memory)
Kill all Restart system

Cons: Lose work

2: Resource Preemption

Steps
1. Select victim process
2. Rollback to safe state
3. Preempt resource
4. Allocate to waiting process

Used in: Databases (transaction rollback)

Recovery Cost Factors

Factor Choose Victim By
CPU time used Least
Resources held Fewest
Processes to kill Minimum
Priority Lowest

Real-World Example: Dining Philosophers

Philosopher i:
  think()
  pick_left_chopstick(i)
  pick_right_chopstick((i+1)%5)
  eat()
  put_down()

Deadlock: All pick left → wait for right → cycle

Prevention:

// Break circular wait
if (i == 4) {
  pick_right_first()
} else {
  pick_left_first()
}
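
A runnable pthread/semaphore sketch of this asymmetric fix (round counts and prints are illustrative; note that the scheme is exactly lock ordering, since every philosopher grabs the lower-numbered chopstick first):

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

#define N 5
sem_t chopstick[N];

void *philosopher(void *arg) {
    long i = (long)arg;
    for (int round = 0; round < 3; round++) {
        // Break circular wait: the last philosopher picks up right first.
        int first  = (i == N - 1) ? (int)((i + 1) % N) : (int)i;
        int second = (i == N - 1) ? (int)i : (int)((i + 1) % N);
        sem_wait(&chopstick[first]);
        sem_wait(&chopstick[second]);
        printf("philosopher %ld eats (round %d)\n", i, round); // eat()
        sem_post(&chopstick[second]);
        sem_post(&chopstick[first]);
    }
    return NULL;
}

int main(void) {
    pthread_t t[N];
    for (int i = 0; i < N; i++) sem_init(&chopstick[i], 0, 1);
    for (long i = 0; i < N; i++) pthread_create(&t[i], NULL, philosopher, (void *)i);
    for (int i = 0; i < N; i++) pthread_join(t[i], NULL);
    return 0;
}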

Summary Table

| Method | When | Overhead | Practical? |
|---|---|---|---|
| Prevention | Design time | Low | Yes (circular wait) |
| Avoidance | Runtime | High | Rare |
| Detection | Periodic | Medium | Yes (with recovery) |
| Recovery | After deadlock | High | Yes (kill/restart) |

Best Practice in Modern Systems

| System | Strategy |
|---|---|
| Linux | Prevention (lock ordering), detection (hung-task watchdog) |
| Databases | Detection + transaction rollback |
| Java | ReentrantLock with tryLock() |
| Applications | Avoid nested locks; use timeouts and lock ordering |
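
A sketch of the tryLock pattern in C/pthreads (lock names a and b are hypothetical):

#include <pthread.h>
#include <unistd.h>

pthread_mutex_t a = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t b = PTHREAD_MUTEX_INITIALIZER;

// Acquire both locks without risking deadlock: back off if the second is busy.
void lock_both(void) {
    for (;;) {
        pthread_mutex_lock(&a);
        if (pthread_mutex_trylock(&b) == 0)
            return;                  // Got both locks
        pthread_mutex_unlock(&a);    // Back off so others can make progress
        usleep(100);                 // Small pause reduces livelock risk
    }
}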

Golden Rules

| Rule | Why |
|---|---|
| Always acquire locks in a global order | Prevents circular wait |
| Use try_lock() with a timeout | Avoids indefinite waiting |
| Minimize lock scope | Reduces contention |
| Detect & recover | Better than crashing |

Multithreading

Multithreading = Running multiple threads within one process.
Threads share memory, code, and file descriptors.

Process
├── Thread 1
├── Thread 2
└── Thread 3
    └── Shared: Heap, Code, Global Vars
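
A minimal pthreads example showing the shared address space, and why a mutex is needed when threads touch the same variable:

#include <pthread.h>
#include <stdio.h>

int counter = 0;                       // Shared by all threads (same address space)
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void *worker(void *arg) {
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);     // Threads share memory → must synchronize
        counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("counter = %d\n", counter); // 200000, thanks to the mutex
    return 0;
}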

Types of Threads

| Type | Managed By | Context Switch | Speed | Example |
|---|---|---|---|---|
| User-Level Threads (ULT) | User library | Fast (no kernel involvement) | Very fast | Early POSIX pthreads, Go goroutines |
| Kernel-Level Threads (KLT) | OS kernel | Slow (kernel trap) | Slower | Linux (NPTL), Windows threads |
| Hybrid (1:1 or M:N) | Both | Mixed | Balanced | Modern Linux, Windows, Java |

1. User-Level Threads (ULT) – M:1 Model

Process
├── ULT1 → Scheduler (in user space)
├── ULT2 → Runs on one KLT
└── ULT3

| Pros | Cons |
|---|---|
| Fast context switch (no syscall) | A blocking call blocks all threads |
| Portable | No true parallelism |
| Low overhead | Kernel sees only one thread |

Example: Early Java Green Threads, Go (before 1.5)

2. Kernel-Level Threads (KLT) – 1:1 Model

Process
├── KLT1 → Kernel schedules
├── KLT2 → on CPU cores
└── KLT3

| Pros | Cons |
|---|---|
| True parallelism | Slow context switch |
| Non-blocking I/O | Higher overhead |
| Kernel-aware scheduling | |

Default in Linux (NPTL), Windows, macOS

3. Hybrid Threads (M:N Model)

Process
├── ULT1 → Mapped to KLT1
├── ULT2 → KLT1
├── ULT3 → KLT2
└── Scheduler (user + kernel)

| Pros | Cons |
|---|---|
| Best of both worlds | Complex |
| Scalable (100K+ threads) | Harder to debug |
| Efficient | |

Used in:
- Go (goroutines multiplexed onto kernel threads)
- Java (fibers/virtual threads, Project Loom)
- Erlang (actors)

Multithreading vs Multiprocessing

| Feature | Multithreading | Multiprocessing |
|---|---|---|
| Execution Unit | Thread | Process |
| Memory | Shared | Isolated |
| Creation Cost | Low | High |
| Context Switch | Fast | Slow |
| Parallelism | Yes (on multi-core) | Yes |
| Fault Isolation | No (one crash kills the whole process) | Yes |
| IPC | Direct (shared variables) | Pipes, sockets, queues |
| GIL (Python) | Limits CPU-bound threads | Not affected |
| Use Case | I/O-bound, shared data | CPU-bound, isolation |

When to Use What?

| Workload | Use |
|---|---|
| I/O-bound (web server, DB) | Multithreading |
| CPU-bound (ML, video encoding) | Multiprocessing |
| Needs isolation (plugins) | Multiprocessing |
| Low latency | Hybrid threads |
| CPU-bound Python code | multiprocessing (bypasses the GIL) |

5. Real-World Examples

| App | Model |
|---|---|
| Nginx | Event loop across worker processes |
| Apache | Multiprocessing (prefork) |
| Chrome | One process per tab |
| Go web server | 100K goroutines (hybrid M:N) |
| Python Flask | Multiprocessing for CPU-bound work |

6. Summary Table

| Feature | User Threads | Kernel Threads | Hybrid | Processes |
|---|---|---|---|---|
| Speed | Fastest | Slow | Fast | Slowest |
| Parallelism | No | Yes | Yes | Yes |
| Memory | Shared | Shared | Shared | Isolated |
| A blocking call blocks | All threads | One | One | One process |
| Scalability | 1000s | 100s | 100K+ | 100s |
| Use | Legacy | Default | Modern | Isolation |

Golden Rules

| Rule | Why |
|---|---|
| Use threads for I/O, processes for CPU | Maximize performance |
| Avoid shared mutable state | Prevent race conditions |
| Use thread pools | Reduce creation cost |
| In Python, use multiprocessing for CPU work | Bypass the GIL |

Memory Management

Memory management is the core OS function that allocates, tracks, protects, and reclaims memory for running programs. It ensures isolation, efficiency, and scalability by abstracting physical RAM into virtual address spaces.

1. Program Loading & Its Types

Program Loading = Process of transferring a program from disk to RAM and preparing it for execution.

Steps in Program Loading

  1. User invokes the program (e.g., ./app → the execve() syscall).
  2. The loader reads the executable file (ELF/PE format).
  3. It allocates virtual memory regions:
     - Text (code) – read-only
     - Data – initialized globals
     - BSS – uninitialized (zero-filled)
     - Heap – dynamic allocation
     - Stack – local vars, function calls
  4. Loads segments into memory.
  5. Performs relocation (adjusts addresses).
  6. Links libraries (static/dynamic).
  7. Sets the PC (Program Counter) to the entry point.
  8. Starts execution.

Types of Program Loading

| Type | Description | Pros | Cons | Use Case |
|---|---|---|---|---|
| Static Loading | Entire program + dependencies loaded at start | Fast startup, self-contained | Wastes memory, large binary | Embedded, small apps |
| Dynamic Loading | Modules loaded only when needed (lazy) | Saves memory, modular | Runtime delay | Large apps (Java, Python) |
| Absolute Loading | Fixed memory address | Simple | Inflexible | Bootloaders |
| Relocatable Loading | Base address assigned at load time | Flexible | Needs relocation | Most modern OS |

Examples:
gcc -static app.c → static loading
python script.py → dynamic loading (imports loaded on demand)

2. Main Memory Management

Goal: Allocate physical RAM to processes efficiently.

Two main approaches:

A. Contiguous Allocation

Each process gets one continuous block of memory.

Allocation Strategies

| Strategy | How It Works | Pros | Cons |
|---|---|---|---|
| First Fit | First hole ≥ size | Fast | External fragmentation |
| Best Fit | Smallest hole ≥ size | Less waste | Slow search, leaves tiny holes |
| Worst Fit | Largest hole | Leaves big usable holes | Often wasteful |

Fragmentation in Contiguous Allocation

External fragmentation: enough total free memory exists, but it is split into holes too small to satisfy a request.
Solution: Compaction (shuffle processes together to merge the holes) → expensive.

B. Non-Contiguous Allocation

Memory divided into fixed or variable units; process can span non-adjacent blocks.

3. Paging (Most Important – Non-Contiguous)

Paging = Divides memory into fixed-size pages (usually 4KB).

Virtual Address:  [ Page Number | Page Offset ]
                   (p bits)     (f bits)

Page Table

Maps virtual page number → physical frame number

Page Table Entry (PTE):
| Valid Bit | Frame Number | Protection | Dirty | ...

Example: Address Translation (assuming 64 KB pages, i.e., a 16-bit offset)

Virtual Address: 0x12345678
→ Page Number: 0x1234 (upper 16 bits)
→ Page Offset: 0x5678 (lower 16 bits)

Page Table[0x1234] = Frame 0x9ABC
→ Physical Address = 0x9ABC5678 (frame number concatenated with the offset)
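
The same translation as C bit manipulation (sticking with the 64 KB page assumption above; the frame number is hypothetical):

#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint32_t vaddr  = 0x12345678;
    uint32_t page   = vaddr >> 16;        // Page number: 0x1234
    uint32_t offset = vaddr & 0xFFFF;     // Page offset: 0x5678
    uint32_t frame  = 0x9ABC;             // Looked up in the page table
    uint32_t paddr  = (frame << 16) | offset;
    printf("page=0x%X offset=0x%X -> phys=0x%X\n", page, offset, paddr);
    return 0;
}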

Multi-Level Paging (To Reduce Page Table Size)

Problem: 64-bit system → 2^52 pages → huge page table

2-Level Paging (32-bit)

Page Directory (PD) → Points to Page Tables (PT)
PT → Points to Frames

4-Level Paging (x86-64)

PML4 → PDPT → PD → PT → Frame

Translation Lookaside Buffer (TLB)

TLB = Cache of recent page table entries (in CPU)

CPU → TLB → Hit? → Physical Address
         ↓ Miss
     Page Walk (MMU) → Load PTE → TLB

Paging with Hashing (Hashed Page Table)

For 64-bit systems with sparse addresses.

4. Segmentation

Segmentation = Divides memory into variable-sized segments based on logical structure.

| Segment | Purpose |
|---|---|
| Code | .text |
| Data | .data, .bss |
| Stack | Local variables |
| Heap | Dynamic allocation |

Segment Table

Each entry: Base | Limit | Protection

Pros
- Matches the program's logical structure (code, data, stack)
- Per-segment protection and sharing

Cons
- External fragmentation (variable-sized segments)
- More complex allocation than fixed-size pages

5. Segmentation + Paging (Paged Segmentation)

Best of both: Logical segments + fixed pages.

Segment Table → Points to Page Table
Page Table → Points to Frames

Summary Table

| Concept | Type | Fragmentation | Table Overhead | Use Case |
|---|---|---|---|---|
| Contiguous | Single block | External + internal | None | Simple systems |
| Paging | Fixed pages | Internal only | Page table | All modern OS |
| Segmentation | Variable | External | Segment table | Logical structure |
| Paged Segmentation | Hybrid | Internal | Two tables | Legacy |

Key Advantages of Paging (Why It's Dominant)

| Feature | Benefit |
|---|---|
| No external fragmentation | Any free frame can be used |
| Easy swapping | Fixed size → easy disk mapping |
| Sharing | The same frame can be mapped into multiple processes |
| Protection | Per-page R/W/X bits |
| Demand loading | Load a page only on first access |

Golden Rules

| Rule | Why |
|---|---|
| Paging > segmentation | Simpler, no external fragmentation |
| The TLB is critical | 90%+ of translations hit the TLB |
| Multi-level paging saves space | Only allocate the tables actually used |
| Page size = 4 KB (trade-off) | Balances internal fragmentation vs table size |

Virtual Memory & Demand Paging

Virtual Memory = An abstraction that gives each process the illusion of a large, private, contiguous memory space, even when physical RAM is limited. It allows programs to run as if they have more memory than is physically available, enables process isolation, and supports efficient memory sharing.

Process A sees: 0x00000000 → 0xFFFFFFFFFFFFFFFF (16 EB)
Process B sees: Same range
→ But they are isolated and mapped to different physical RAM

Key Goals of Virtual Memory

| Goal | How It Is Achieved |
|---|---|
| Large Address Space | 64-bit → 2^64 bytes (16 exabytes) |
| Process Isolation | Each process has a private virtual space |
| Memory Protection | Per-page R/W/X permissions |
| Efficient Sharing | Shared libraries, IPC |
| Transparent Execution | Programs run without knowing the physical limits |

Virtual vs Physical Address Space

| | Virtual Address | Physical Address |
|---|---|---|
| Seen by | Process | OS, hardware |
| Range | 0 to 2^n − 1 (e.g., 2^64) | 0 to actual RAM size |
| Translation | Via the page table | Direct to RAM |

How Virtual Memory Works

Paging is the foundation of virtual memory.

x86-64 Virtual Address (low 48 bits used):
+--------------------------------------------------+
|  PML4 (9) | PDPT (9) | PD (9) | PT (9) | Offset (12) |
+--------------------------------------------------+

Demand Paging – "Load Only When Needed"

Demand Paging = Pages are loaded from disk into RAM only when accessed (on a page fault).

Why Demand Paging?

- Faster startup: only the pages actually touched get loaded
- Less RAM used: rarely executed code (error handlers, obscure features) may never load
- Programs can be larger than physical RAM

Page Fault Handling – Step by Step

  1. The CPU accesses a virtual page whose valid bit is 0 → trap (page fault) to the OS
  2. The OS checks whether the access is legal (otherwise: segfault)
  3. It finds a free frame (or evicts one via page replacement)
  4. It reads the page from the backing store (disk) into the frame
  5. It updates the page table entry (frame number, valid bit = 1)
  6. It restarts the faulting instruction

Valid/Invalid Bit in Page Table

| Bit | Meaning |
|---|---|
| 1 (Valid) | Page is in RAM |
| 0 (Invalid) | Page not in RAM → page fault |

Backing Store (Swap Space)

Swap File/Swap Partition on disk stores pages evicted from RAM.

# Linux
swapon /swapfile
free -h
# → Swap: 8GB

Page Replacement Algorithms

When RAM is full, OS evicts a page to make room.

| Algorithm | How | Pros | Cons |
|---|---|---|---|
| FIFO | First in, first out | Simple | Belady's anomaly |
| LRU | Least recently used | Exploits locality | Expensive (stack or counters) |
| Optimal | Replace the page not used for the longest time in the future | Best (theoretical) | Impossible (needs future knowledge) |
| Clock (Second Chance) | Circular list + reference bit | Efficient approximation | Moderate |

Key Metrics

| Metric | Meaning |
|---|---|
| Page Fault Rate | Faults per second |
| Major Fault | Requires disk I/O |
| Minor Fault | Page already in RAM (just remap) |
| Working Set | Pages actively in use |

Golden Rules

| Rule | Why |
|---|---|
| Demand paging = efficiency | Don't load unused code |
| Page size = 4 KB (sweet spot) | Balances fragmentation vs table size |
| COW (copy-on-write) = fast fork | Share pages until one side writes |
| Avoid thrashing | Monitor the working set |
| TLB + cache = speed | 99% of translations come from cache |

Thrashing

Thrashing = A severe performance degradation in a virtual memory system where the CPU spends more time handling page faults (swapping pages in/out of disk) than executing actual program instructions.

CPU Utilization: 90% → 10% → 1%
Page Fault Rate: 100/sec → 10,000/sec → 100,000/sec
→ System becomes unresponsive

Why Does Thrashing Happen?

Root Cause: Insufficient Physical RAM

graph LR
    A[Too Many Processes] --> B[Working Set > RAM]
    B --> C[Frequent Page Faults]
    C --> D[OS Evicts Pages]
    D --> E[Pages Needed Again]
    E --> C
    style C fill:#f96,stroke:#333

Working Set of a process = Set of pages it references in a time window (e.g., last 10 million instructions).

| Process | Working Set Size |
|---|---|
| Firefox | 200 MB |
| VS Code | 150 MB |
| Database | 500 MB |

Total RAM = 1 GB
3 processes → 850 MB working set → fits → smooth
5 processes → 1.5 GB working set → thrashing

Thrashing in Action – Step-by-Step

Time 0:
  RAM: [P1 pages] [P2 pages] [P3 pages] → Full

Time 1:
  P4 starts → Needs 200 MB
  OS evicts P1's pages → P4 loads

Time 2:
  P1 runs → Page fault! → Evict P2 → Load P1
  P2 runs → Page fault! → Evict P3 → Load P2
  P3 runs → Page fault! → Evict P4 → Load P3

→ CPU busy with page faults, not work

Symptoms of Thrashing

| Symptom | How to Observe |
|---|---|
| High kernel CPU usage | top → %sy (system) > 80% |
| High disk I/O | iostat → high kB_read/s, kB_wrtn/s |
| Low CPU for user apps | %us < 5% |
| Unresponsive system | GUI freezes, commands lag |
| High page fault rate | vmstat → pgfault > 10,000/sec |
# Monitor
vmstat 1
# → si/so: swap in/out (high = bad)
# → bi/bo: blocks in/out

sar -B 1
# → pgpgin/s, pgpgout/s

Page Fault Types

| Type | Meaning |
|---|---|
| Minor Fault | Page in RAM (but not yet mapped) → fast |
| Major Fault | Page on disk → slow (~10 ms on HDD) |

Thrashing = too many major faults.

Belady’s Anomaly & Thrashing

With FIFO replacement, increasing frames can increase page faults → worsens thrashing.

Reference string: 1,2,3,4,1,2,5,1,2,3,4,5
3 frames → 9 faults
4 frames → 10 faults ← Anomaly
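
A small C simulator that reproduces these counts (FIFO as a circular buffer):

#include <stdio.h>
#include <string.h>

// Count FIFO page faults for a reference string with a given frame count.
static int fifo_faults(const int *refs, int n, int frames) {
    int mem[16];                   // Resident pages
    int used = 0, next = 0, faults = 0;
    memset(mem, -1, sizeof mem);
    for (int i = 0; i < n; i++) {
        int hit = 0;
        for (int j = 0; j < used; j++)
            if (mem[j] == refs[i]) { hit = 1; break; }
        if (hit) continue;
        faults++;
        if (used < frames) mem[used++] = refs[i];                  // Free frame
        else { mem[next] = refs[i]; next = (next + 1) % frames; }  // Evict oldest
    }
    return faults;
}

int main(void) {
    int refs[] = {1,2,3,4,1,2,5,1,2,3,4,5};
    int n = sizeof refs / sizeof refs[0];
    printf("3 frames: %d faults\n", fifo_faults(refs, n, 3)); // 9
    printf("4 frames: %d faults\n", fifo_faults(refs, n, 4)); // 10 — the anomaly
    return 0;
}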

How OS Detects Thrashing

| Method | How |
|---|---|
| Page Fault Frequency (PFF) | If faults/sec > threshold → thrashing |
| CPU Utilization Drop | If CPU < 10% while faults are high → suspend processes |
| Working Set Estimation | Track referenced pages over a time window |

Solutions to Prevent/Recover from Thrashing

  1. Increase physical RAM – the best fix: add more RAM or use faster storage (SSD).
  2. Reduce the degree of multiprogramming – suspend/swap out processes until the working sets fit.
  3. Improve locality – optimize code (access arrays sequentially, reduce random access); let the compiler reorder loops.
  4. Priority-based swapping – swap out low-priority or idle processes first.
  5. Working set model – the OS tracks a working-set window (Δ); if the total working set exceeds RAM, it suspends a process.
  6. Use local replacement (not global) – each process has a fixed frame allocation, so one process cannot steal every frame.

Linux Thrashing Protection

| Feature | How It Works |
|---|---|
| OOM Killer | Kills high-memory processes when swap is exhausted |
| PSI (Pressure Stall Information) | Monitors memory pressure (some/full) |
| Swappiness | Controls swap aggressiveness (vm.swappiness=60) |

Real-World Example

Laptop: 8GB RAM
Running: Chrome (50 tabs) + VS Code + Docker + Zoom
→ Working set = 12GB
→ Thrashing: Disk LED blinking, UI freezes
→ Fix: Close Chrome tabs or add RAM

Golden Rules to Avoid Thrashing

| Rule | Why |
|---|---|
| Keep the working set < RAM | The core principle |
| Monitor PSI / vmstat | Early detection |
| Use an SSD, not an HDD | ~100x faster page faults |
| Set swappiness low | Prefer RAM over swap |
| Limit containers' memory | Prevent memory explosions |

File System & Disk Management**

File systems and disk management are core OS components that organize, store, retrieve, and protect data on storage devices. They abstract raw disk blocks into user-friendly files and directories, while ensuring performance, reliability, and security.

1. What is a File?

File = A named collection of related data stored on disk.

File Attributes:
- Name: report.pdf
- Type: PDF
- Size: 2.1 MB
- Location: /home/user/docs/
- Permissions: rw-r--r--
- Timestamps: Created, Modified, Accessed
- Owner: user

File Types

| Type | Example |
|---|---|
| Regular | document.txt, image.jpg |
| Directory | folder/ |
| Symbolic Link | link → /real/path |
| Device | /dev/sda (block), /dev/tty (char) |
| Socket/Pipe | IPC endpoints |

2. File System – The Big Picture

File System = Software + Data Structures that manage how files are stored, organized, and accessed on a storage device.

3. Key File System Operations

| Operation | System Call | Action |
|---|---|---|
| Create | open("file", O_CREAT) | Allocate an inode + directory entry |
| Read | read(fd, buf, size) | Copy from disk to memory |
| Write | write(fd, buf, size) | Copy to disk (buffered) |
| Delete | unlink("file") | Remove the directory entry; free the inode |
| Open | open() | Get a file descriptor |
| Close | close() | Release the fd |
| Seek | lseek() | Move the file pointer |

4. File System Structure on Disk

+---------------------+
| Boot Block          | ← Bootloader (optional)
+---------------------+
| Superblock          | ← FS metadata (size, inodes, blocks)
+---------------------+
| Inode Table         | ← File metadata (size, blocks, perms)
+---------------------+
| Data Blocks         | ← Actual file content
+---------------------+

5. Inode – The Heart of File Metadata

Inode = Index Node – Stores all file metadata except name.

struct inode {
    mode_t     mode;        // Permissions + type
    uid_t      uid;         // Owner
    gid_t      gid;         // Group
    off_t      size;        // Size in bytes
    time_t     atime;       // Access time
    time_t     mtime;       // Modify time
    time_t     ctime;       // Change time
    blkcnt_t   blocks;      // Number of blocks
    blkptr_t   block[15];   // Pointers to data blocks
};

Inode Block Pointers (EXT2/3/4)

12 Direct → 4KB blocks
1 Indirect → 1024 direct pointers
1 Double Indirect → 1024 indirect
1 Triple Indirect → 1024 double
→ Max file: ~2TB

6. Directory Structure

Directory = A special file containing (name, inode) pairs.

Directory Entry:
+----------+---------+
| Name     | Inode # |
+----------+---------+
| .        | 2       |
| ..       | 2       |
| file.txt | 156     |
| docs/    | 789     |
+----------+---------+

7. Disk Management – Partitioning & Formatting

1. Partitioning

Divides disk into logical sections.

# MBR (Legacy)
Partition 1: /boot (ext4)
Partition 2: / (ext4)
Partition 3: swap

# GPT (Modern)
GUID Partition Table → Up to 128 partitions

Tools: fdisk, gdisk, parted

2. Formatting

Installs file system on partition.

mkfs.ext4 /dev/sda1
mkfs.ntfs /dev/sdb1

8. Popular File Systems

| FS | OS | Features | Use Case |
|---|---|---|---|
| EXT4 | Linux | Journaling, extents, 1 EB max | Default on Linux |
| XFS | Linux | High performance, 8 EB | Servers, big files |
| Btrfs | Linux | Snapshots, RAID, CoW | Advanced setups |
| NTFS | Windows | ACLs, encryption | Windows |
| APFS | macOS | Snapshots, encryption | macOS/iOS |
| FAT32 | Cross-platform | Simple, 4 GB file limit | USB drives |

9. Journaling – Crash Recovery

Journal = Log of pending operations before commit.

Write Intent → Journal → Commit → Data Blocks
→ If crash → Replay journal

Types:
- Data journaling: log data + metadata (slow, safe)
- Metadata journaling: log metadata only (fast) – the EXT3/4 default
- Ordered: write data first, then metadata

10. File System Mounting

Mount = Attach file system to directory tree.

mount /dev/sda1 /home
# → /home now accesses sda1

/ (root)
├── home/
│   └── user/ ← /dev/sda1
├── boot/
└── etc/

Unmount: umount /home

11. Virtual File System (VFS) – Abstraction Layer

VFS = Unified interface for all file systems.

struct file_operations {
    ssize_t (*read) (struct file *, char *, size_t, loff_t *);
    ssize_t (*write) (struct file *, const char *, size_t, loff_t *);
    // ...
};

12. Disk Scheduling Algorithms

Optimize disk head movement to reduce seek time.

| Algorithm | How | Pros | Cons |
|---|---|---|---|
| FCFS | First come, first served | Simple | High total seek |
| SSTF | Shortest Seek Time First | Low seek | Starvation of distant requests |
| SCAN (Elevator) | Sweep the head to one end, then reverse | Fair | Middle cylinders favored |
| C-SCAN | Circular SCAN (service one direction only) | More uniform wait | Idle return sweep |
| LOOK | SCAN, but reverse at the last request | Efficient | – |
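
To make "seek time" concrete, here is a small C program that totals FCFS head movement for a classic textbook request queue (the cylinder numbers are illustrative):

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int reqs[] = {98, 183, 37, 122, 14, 124, 65, 67}; // Pending cylinder requests
    int head = 53, total = 0;                          // Head starts at cylinder 53
    for (unsigned i = 0; i < sizeof reqs / sizeof reqs[0]; i++) {
        total += abs(reqs[i] - head);                  // Distance to the next request
        head = reqs[i];
    }
    printf("FCFS total head movement: %d cylinders\n", total); // 640
    return 0;
}

SSTF or SCAN on the same queue would cut that total sharply, which is exactly what the table above summarizes.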

13. RAID – Disk Redundancy & Performance

| Level | Min Disks | Features |
|---|---|---|
| RAID 0 | 2 | Striping → speed |
| RAID 1 | 2 | Mirroring → redundancy |
| RAID 5 | 3 | Striping + parity → balance |
| RAID 6 | 4 | Double parity → survives 2 disk failures |
| RAID 10 | 4 | Mirror + stripe → best of both |

14. Disk Caching & Buffering

The OS keeps recently used disk blocks in RAM (the page cache): reads are served from memory when possible, and writes are buffered and flushed to disk later (write-back). This trades a small crash-loss window for much higher throughput; fsync() forces a flush when durability matters.


Golden Rules

| Rule | Why |
|---|---|
| Inode ≠ name | The name lives in the directory; metadata lives in the inode |
| Journaling saves data | Recover after a crash |
| VFS = portability | Same API for every FS |
| Use SSD + EXT4 | Solid default performance |
| Backup > RAID | RAID is not a backup |