The shell is the interface provided by an operating system to its services. In this assignment, you will implement a simple shell for an UNIX-based operating system. Your shell will provide basic functionalities such as the ability to run multiple programs, directory tree navigation, I/O redirection and inter-process communication.

Learning outcomes

You will get an understanding of how a simple UNIX shell can be implemented. You will become familiar with pipes, I/O redirection, environment variables, etc.

Assignment

Task Read the man pages for the following system calls (Linux: man -S 2 fork etc).

During this assignment, you will change two functions in the file sh.c, parse_line and run_program, to make the tests pass.

Task Implement the functionality required to run a single program. Modify the run_program function such that you will be able to execute:

Use the fork and wait system calls. Your implementation must handle both programs that exist in the $PATH environment variable and programs that are given explicitly via their path. Check that the current user can access and execute the program file (use the access function). The global variable path_dir_list is a circular list containing the directories in the $PATH variable.

Task Extend the shell such that it can run a sequence of programs. Modify the gettoken function to handle ; between commands. You should be able to execute:

Task Add support for background processes. This should allow you to start a long-running process in the background, but immediately return to the shell prompt. Implementation-wise, this means that the shell does not need to wait for the child process to terminate. Once the child process terminates it will become a zombie process, that has exited but has been not waited for by the parent. You can see these processes by running ps. To remove these zombie processes, the shell should regularly check for exited children using a non-blocking waitpid system call with the WNOHANG flag as an argument. You should be able to execute:

Task Add directory tree navigation by implementing the cd command. The following commands should work:

Task Implement I/O redirection Allow for the standard output of a program to be redirected to a file and for the contents of a file to be redirected to the standard input.

Task Implement inter-process communication using pipes Redirect the output from one process to the input of another process.

Use the pipe and close system calls. Small changes in the parse_line(void) function might be needed.

Hand-in

Ensure the make test returns a PASS result. Once you are done, push the changes to a branch with a name starting with lab1.

Then go to the Gitlab page, create a merge request from the lab1 branch to master, and assign it to one of the instructors (as explained in Lab 0).

Hints

LAB2: Introduction

Signals are an important IPC mechanism present in most operating systems. Usually they are ready to use and require the least setup out of all other IPC mechanisms. However, they are useful in specific cases which might not suit all situations.

Learning outcomes

Assignment

Core dumps

Write a C program which crashes. Try to make it short, e.g. less than 30 bytes. Run it and see what happens (not much should happen). In general it is useful to get feedback from the system about where in the source code a program crashes. You will learn that next.

A core file is a dump of the memory and the register state of a process. Producing such a file is called dumping core or making a core dump. Debuggers can read core files so you can do post-mortem debugging, i.e. inspect a process after its death.

On some systems (e.g. Ubuntu 18.04), you also have to disable the apport service:

Compile with debugging info (i.e. -g) your crashing program and rerun it. You should get a core file, either in the same directory you ran the program from or in a specific directory dedicated to these. (E.g. on the student machines, check the /var/lib/apport/coredump/ directory.) Then you can run gdb a.out core to get info about where it crashed.

Sending signals to a process

Signals can be sent to a process, as above, due to an event in the program itself. But they can also be sent from outside the process. You can do so by using the kill command. Read the manual pages for this command, man 1 kill. For a complete list of signals, you can read man 7 signal.

Task Write a program which is an infinite loop, start your program and put in the background (e.g. ./a.out &). Now send the SIGINT signal to the process and see what happens. You can do so with the command kill -2 pid, or with the longer kill -INT pid. Did you get a core file? You should not. SIGINT causes a normal termination.

Now start the program again, but this time send the SIGQUIT signal to the process and see what happens. SIGQUIT is signal number 3. SIGQUIT is one of the signals which cause an abnormal termination and a core dump. Now you should get a core file. Again type gdb a.out core. You should be able to see e.g. where the program was when it received the signal. Post mortem debugging can be very useful when a program terminates unexpectedly.

Checking the return status of a program

Task Write a program (in a file called exitcode.c) that takes another program as argument, executes it, prints the PID of the process, prints its exit status and also whether it terminated normally or abnormally. If the process was terminated by a signal, it should also print that together with the signal number. You will have to use fork, exec, waitpid system calls and the WIFEXITED, WEXITSTATUS, WIFSIGNALED, WTERMSIG macros. You can test these by sending signals using the kill command.

C API for signal handling

ISO C defines the following signals (i.e. they must be supported also by non-UNIX C compilers):

A signal set (of type sigset_t) is a set of signal numbers. Typically a variable of type sigset_t is declared and then initialized using either sigemptyset or sigfillset, followed by sigaddset or sigdelset. See the man page for sigemptyset.

To install a signal handler for a signal using the UNIX interface directly, the sigaction system call is used. A signal set (sigset_t) specifies a set of signals which are put on hold (i.e. ignored) while the installed signal handler is executing.

Installing signal handlers

Task Extend your infinite-loop program (in a file called loop1.c) to print a message every time the user hits Ctrl-C (SIGINT).

Task Now, install a signal handler for the signal SIGINT which is itself an infinite loop. While it is looping, you should ignore signal USR1 but handle USR2. For the signal USR2, you should install a signal handler which prints out a suitable message. While a signal handler is executing due to a signal s then s itself is blocked.

Masking signals

Suppose your program is executing some critical code in which it does not want to be terminated or receive certain signals. One way to achieve that is to ignore the relevant signals (except SIGKILL which cannot be ignored). An alternative is to block the signals, or putting them on hold. We used the blocking of a signal when a signal handler was executing (i.e. the signal itself) but we can also block them without executing in a signal handler. The system call sigprocmask is used for this. It blocks some signals but not all.

Task Create another program (in file loop2.c) that executes a loop for about 10 seconds. Make it print a message before entering the loop. After the loop, it should print all the signals it received during the execution of the loop. How can you portably collect the signals even if you don't know how many there are? Your code should work even if a new signal number is added, so testing each known signal explicitly is not permitted. See sigprocmask, sigpending, and sigismember.

Alarms

Finally, one can use alarm to get the SIGALRM signal after a specified number of seconds. Task Write another program (in file loop3.c) containing a loop and install the appropriate signal handler to count how many iterations of the loop can be executed in 10 seconds. This is sometimes a useful way to benchmark a routine. Consider using a global boolean variable to control the termination of the loop. In the signal handler, change the value of the variable such that the loop exits. Try your program with and without optimization (e.g. -O4).

Hand-in

The source files for the programs exitcode.c, loop1.c, loop2.c, loop3.c. Once you are done, push the changes to a branch with a name starting with lab2.

LAB3: Introduction

NOTE: You may choose to implement this assignment in C or in Zig Links to an external site. (a newer system programming language). You should see this as a challenge and an opportunity to learn another language. The description below is made for the C version, but the differences in program structure are minor. Some specific instructions for the Zig version are given at the end.

In this assigment you will add virtual memory support to a simulator of a very simple CPU. As part of this you will implement three (3) page-replacement algorithms.

Learning outcomes

Understand the memory paging mechanism.

Before you start

Make sure you are familiar with the concepts: virtual memory, FIFO and second-chance page replacement algorithms, optimal page replacement, Bélády's anomaly.

Consider the following page access sequence: 1 2 3 4 1 2 5 1 2 3 4 5. Simulate (paper & pencil) this sequence on a machine with 3 memory frames using FIFO page replacement. How many page faults do you get. Simulate the same sequence on a machine with 4 memory frames. How many page faults do you get now? Why?

Assignment

Set up

The source code for this assigment is available in the lab3_vm directory. The whole machine simulator is contained in the machine.c file. The assembly program that will run on the simulator is fac.s.
Type make to check that everything builds.

Study the machine simulator

Make sure you understand what the functions read_memory and write_memory are doing and where they are called from. Study the page_table_entry_t and coremap_entry_t data structures.

Note also that:

translate converts a virtual address to a physical address, and may result in a page fault if the wanted address is in a page not present in memory (because it was never accessed before or was swapped out)
pagefault is the "exception handler" for the case when a certain virtual page is not present in memory. It will need to allocate a physical page and, if needed, bring a swapped out page back in
take_phys_page just needs to allocate a physical page; this can be an unused one if it exists, or it may need to swap out an occupied page using an eviction algorithm

Study the `fac.s` program

The program is computing 12! and writes the result in register R3. Be prepared to modify the program such that it computes 2!. Answer the following questions:

What is the instruction st 31,1,-1 doing?
What are the memory access patterns of the program?
When does the program require more memory: when computing 2! or 12!? Why?

Implement the FIFO page replacement algorithm

Complete the following functions:

pagefault - handles the case when a page is not available in the memory. This means allocating a new page (take_phys_page, which may evict a page), bringing in the wanted page from the disk and returning it,
take_phys_page - picks a page to evict using a replacement algorithm (replace) and swaps it out on the disk if necessary, and
fifo_page_replace - just picks the next page in line to replace using the fifo policy.

You can run the program by typing make run-fifo. How many page faults do you get?

Implement the second chance page replacement algorithm

Complete the second_chance_replace function. You can run the program by typing make run-sc. How many page faults do you get now? Compare with the FIFO algorithm.

Optimize the number of disk writes

Add a new variable to the simulator that count the disk writes and print it out at the end of the program. Use the page_table_entry_t::modified field to reduce the number of disk writes.

Optimal page replacement

Implement the Optimal page replacement algorithm and compare with the page replacement algorithms you have already implemented. Keep a trace of virtual page accesses when you run the program with one of the paging algorithms you have already implemented and then simulate the optimal page replacement algorithm on that trace.

Hand-in

Once you have implemented all of the above page replacement algorithms, add a README file to the repository describing the differences between them. Ensure that your code build and then push the changes to a branch with a name starting with lab3.

git push origin HEAD:lab3

Hints

The approximate size of the code changes is:

machine.c | 173 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 163 insertions(+), 10 deletions(-)

Zig

Uses zig version 0.12.0-dev.1108+aeadcb391 (tested also with 0.11.0)

Make Zig visible

Run /usr/local/cs/EDAF35/setup in the terminal you will use on your the lab machine. If you run this on your own machine make sure you install zig yourself - it's already installed on the Linux lab machines.

Build the Program

In the directory with build.zig run zig build.

Run the Unit Tests

To run the tests from main.zig: try zig build test - tests should fail for the skeleton code. You may extend these tests if you wish.

Run the Program

Try zig build run which will give you:

Options are:
   --fifo or --second-chance
   optionally --debug
error: ChooseOneReplacement
...

You need to pick either fifo or second-chance to run either of your implemented functions. So zig build run -- --fifo a.s will run the machine on code from file a.s using FIFO policy. You may leave out the file name if you want to use fac.s file by default. For more debug prints use --debug. Alternatively you can use directly the executable in zig-out/bin.

Code Changes for the Zig Version

Take the following with a grain of salt since you might implement things differently:

lab3_vm/zig/src/main.zig | 51 +++++++++++++++++++++++++++++++++++++++++++--------
1 file changed, 43 insertions(+), 8 deletions(-)

Lab 4: A FUSE File System

Due 30 Nov by 12:00

Introduction

In this assignment you will implement a simple, but usable file system. We will use FUSE Links to an external site., a library that enables running file system code in user-space.

Learning outcomes

Understand the organization of a file system and its elementary operations.

Before you start

Read the manual (e.g. man 2 rename) for the following system calls:

stat ([DIR_ENTRY])
rename ([RENAME])
truncate ([TRUNC_FREE])
unlink ([REMOVE])
lseek ([READ_OFFSET])

and also the comments in /usr/include/fuse/fuse.h for the following members of struct fuse_operations: gettattr, readdir, read, write, chmod, truncate, rename, unlink, create. You will implement some of these functions.

Assignment

Overview

You will extend a very primitive file system implementing a FUSE interface (can be mounted in Linux). The source code for this file system is located in lab4_fs. To build the project, run make build. The project comes with a test script which you can run using make test.

This project also comes with a set of tools. To build them, run make tools.

run format_myfs to format the file system;
run info_myfs to print information about the file system.

The physical disk is emulated by using a regular file (called RAWDISK_FS) and you will need to use the provided interface towards the disk as given, without changing rawdisk.c and rawdisk.h. In the current implementation, the file system uses a flat directory structure (no sub-directories) and a block map to support linked lists, for your files or for managing the free blocks. The functions needed for handling the block map are defined in fs_support.c. Make sure you understand what each of them does.

In principle all the changes you will implement, will very likely be located mainly in one file, ssfs.c. This implements all the FUSE operations you will work with (which you could extend, if you want to implement other FUSE operations).

Finally, there are a couple of useful tools you might also need to adjust. format_myfs.c does a soft format of the file system, bringing the file system to a valid initial empty state. info_myfs.c displays information about the current state of the file system, such as the block map and statistics - very useful for debugging your system. The Makefile helps you build the FUSE interface or the tools. Please study all of these carefully before embarking on changing them. The suggestion is to run ssfs in one shell and open another shell to test standard Linux file commands and run the tools.

Implementation

Your tasks are listed below. Search for the [ ] tag in ssfs.c to locate the code that needs to be changed / implemented.

[DIR_ENTRY] : Add modification time to the directory entry and handle it right (Note this will require you to reformat the file system, since the directory structure changed!).
[RENAME] : Implement file renaming.
[TRUNC_FREE] : Modify the size of a file - shrink or grow. Allocate and free blocks if needed.
[REMOVE] : Implement file deletion.
[READ_OFFSET] : Correctly implement reading with offset in do_read.
Optional [LARGE_DIR] : Allow the directory to grow over the one block limit (still flat)
Optional [LARGE_WRITE] : Modify do_write to allow writing more than one block

Important: all of the above should be implemented without calling existing file system functions such as truncate, unlink, rename ,etc. Your functions will simply operate on the structure in fs_support.h|c using the provided functions (with additions you deem necessary).

Hand-in

Ensure that all the tests pass when running make test. Once you are done, push the changes to a branch with a name starting with lab4.

git push origin HEAD:lab4

Hints

To test your implementation of [TRUNC_FREE] you will need to create files that span multiple blocks. If you don't implement [LARGE_WRITE], you can still create files that span more than one block using a sequence of small writes. One way to do this is to run for i in {1..512} ; do echo _ >> file.txt ; done.
You can use truncate -s SIZE FILE to create a FILE of given SIZE.
Warning macOS uses temporary files to coalesce writes and this will make debugging more difficult. We recommend using a GNU/Linux operating system instead.
The approximate size of the changes is:

fs_support.h |   4 +-
ssfs.c       | 244 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------------
2 files changed, 177 insertions(+), 71 deletions(-)

Lab 5: Custom Heap Memory Manager

Due 7 Dec by 12:00

NOTE: This assignment is to be written in C with a build script and unit test part using Zig 0.12.0-dev.1108+aeadcb391 Links to an external site. (a newer system programming language). You should see this as a challenge and an opportunity to check out another language. The Zig part is however not compulsory and the task can be completed fully by writting C code.

Introduction

In this assigment you will complete a simple heap memory manager and test it in action in a real Linux tool (gawk). For this, you will need to write your own malloc, free, and realloc functions, which will override the existing standard library user-space memory allocation functions. A skeleton already implementing calloc, along with other helper data structures and functions is provided to you. You will need to fill in functionality marked with TODO comments.

Learning outcomes

Understand heap memory management in Linux and be able to implement a simple custom memory manager.

Before you start

Malloc, free, realloc

Read the man pages for malloc, free, realloc, calloc. Pay attention to the corner cases and return values - to work properly your implementation must follow these closely.

Brk, sbrk

Read the man pages for brk and sbrk. These are used internally by the provided helper functions to reserve space in the program space. An overview of how sbrk works is in the picture below. Note that sbrk(0) returns the current program break.

Note: Be aware that sbrk may work differently on different UNIX systems, so do consult the man pages for your target system to figure out specifics. For instance, on Mac OS X, sbrk called with a negative increment will not decrease the program break, but only mark the freed memory as unused.

Assignment

To set your system up without Zig and use only C please see the instructions at the end of the description. You will anyway work exclusively (unless you want to add more unit tests) in C, in one file!

Setup

To use zig in a terminal on a lab machine, execute /usr/local/cs/EDAF35/setup first.
The source code for this assigment is available in the lab5_mm directory, as ll-mm.c.
In lab5_mm/malloc-tester is a zig project dedicated to unit testing your c, as well as setting up, building and checking your implementation with gawk. Check your available options or steps by running zig build -h in lab5_mm/malloc-tester. The most important are listed below:
1. zig build test - will run unit tests on your ll-mm.c file functions; this step does not need gawk, but will help you identify some errors in your implementation. These tests are found in src/main.zig and you may add more tests, split these or add more printouts if you wish. This will produce no output if all tests succeed. You may run zig build test --summary all for more information.
2. zig build setup - will automatically download and configure gawk sources; run this first
3. zig build install - after the setup above, this will compile and install gawk using your ll-mm.c file
4. zig build check - will run the gawk tests with the executable produced with install

Study the memory manager skeleton

In the ll-mm.c you will find a number of functions and a data structure already implemented. Study these carefully. A brief description follows:

block - is your linked list structure, that makes use of the fact that successive blocks lie in memory after each other
new_block - creates a new block by increasing the brk limit
find_block - takes in a pointer and figures out whether it is a an address previously allocated with malloc/realloc, in which case it returns the block. Use this to validate input addresses.
split_block - if the size of the given block is large enough, splits the block in two, still linked in the same list. Returns a negative number if it failed for some reason.
merge_blocks - merges all free adjacent blocks starting with the given one.

There are a number of other functions used either for unit testing or inside the functions above. None of these need modifications!

Task Have a look at the calloc implementation. What is the point of builtin_umull_overflow function?

Run the unit test

There are a number of unit tests already implemented in malloc-tester/src/main.zig. These assume the functions work properly, so if you run zig build test before implemeting anything you should get a failure. Once you start implementing and your functions behave properly you will get fewer failed tests. For more details you can also try runnig zig build test --summary all. As you implement stuff, do run the unit tests now and then, to see what works and what does not yet.

Implement `free`, `malloc`, `realloc`

Task Implement free in ll-mm.c. Think about handling also bad input, such as a random address or null. Write more unit tests if you deem it necessary. What do you do with free consecutive blocks?

Task Implement malloc in a simple way first, by creating new blocks. Then try to reuse free blocks that would fit the new allocation. Then try to split off the remaining space, to reduce fragmentation. All the functions you need to use for these tasks are already given.

Task Implement realloc. Identify the cases you need to handle and add them gradually. Pick easy solutions first, but do not allocate new blocks unnecessarily! Shrinking an allocation should not result in increasing the sbrk(0).

Optional Task Try to add tests for calloc by looking at how the other unit tests are written and checking the man page for calloc. Of course, if the test is correct, it will fail, since calloc uses malloc, which still needs to be implemented.

Test with `gawk`

If your unit tests pass, you are ready to test your memory manager with a real application. gawk is a UNIX tool that has a test suite which stresses memory allocation a lot, which is why we will use it to test your implementation.

This will require you to run zig build setup, zig build install, and zig build check, which will then run the gawk (version 3.1.8) tests with your implementation of heap manager. The goal is to have all of these tests pass.

Troubleshooting

If zig build check fails for any reason or simply does not finish, something is wrong with your implementation. This may happen even if all the unit tests you ran with zig build test passed! Sometimes you will get core dumps that may help you identify what went wrong. Please check Lab 2 to see how you can use core dumps with gdb.

Hand-in

git push origin HEAD:lab5

To think about

Is the linked-list solution inefficient in some cases? Which ones?
Compare this to another possible solution, such as the buddy system. Which one is better, why and in which cases?
What is the difference between malloc and kmalloc? (see man pages). What is the advantage of knowing the size of all objects in your program - e.g. in a kernel - in the case of a memory manager?

Hints

The approximate size of the code changes is:

lab5_mm/ll-mm.c                    | 91 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------------------
lab5_mm/malloc-tester/src/main.zig | 18 +++++++++++-------
2 files changed, 83 insertions(+), 26 deletions(-)

C without Zig

If you choose not to use Zig, you can always write your own unit tests in C and use existing tools such as make to build gawk and associated tests. However, you will need to do a number of steps manually.

Set up gawk

First make sure your CFLAGS are unset otherwise the patch will break, i.e. export CFLAGS=''

To download, configure gawk you will need to carry out the following steps:

wget https://ftp.gnu.org/gnu/gawk/gawk-3.1.8.tar.gz
gunzip gawk-3.1.8.tar.gz
tar xvf gawk-3.1.8.tar
rm gawk-3.1.8.tar
cd gawk-3.1.8/
./configure
touch .deps/ll-mm.Po

You will only need to do this once.

Make gawk use your heap manager

First, make sure your functions are actually called malloc, free, calloc, realloc. You might have changed these names for carrying out unit testing.

Then copy your ll-mm.c file in the gawk-3.1.8 directory: this needs to be here, to compile with the rest.

Now you will need to patch the Makefile so it will compile with your ll-mm.c. For this create a file named gawk.patch with the following contents:

--- Makefile    2023-10-02 16:48:57.001338786 +0200
+++ ../Makefile.2       2023-10-02 16:48:41.081451959 +0200
@@ -38,7 +38,7 @@
 # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA
 #

-
+mms = ll-mm
 pkgincludedir = $(includedir)/gawk
 pkglibdir = $(libdir)/gawk
 pkglibexecdir = $(libexecdir)/gawk
@@ -90,7 +90,7 @@
        floatcomp.$(OBJEXT) gawkmisc.$(OBJEXT) getopt.$(OBJEXT) \
        getopt1.$(OBJEXT) io.$(OBJEXT) main.$(OBJEXT) msg.$(OBJEXT) \
        node.$(OBJEXT) random.$(OBJEXT) re.$(OBJEXT) regex.$(OBJEXT) \
-       replace.$(OBJEXT) version.$(OBJEXT)
+       replace.$(OBJEXT) version.$(OBJEXT) $(mms).$(OBJEXT)
 am_gawk_OBJECTS = $(am__objects_1) eval.$(OBJEXT) profile.$(OBJEXT)
 gawk_OBJECTS = $(am_gawk_OBJECTS)
 gawk_LDADD = $(LDADD)
@@ -177,7 +177,7 @@
 AWK = gawk
 CC = gcc
 CCDEPMODE = depmode=gcc3
-CFLAGS = -g -O2
+CFLAGS = -g -O2 -fno-optimize-strlen
 CPP = gcc -E
 CPPFLAGS =
 CYGPATH_W = echo
@@ -376,6 +376,7 @@
        regex.h \
        replace.c \
        version.c \
+       $(mms).c \
        xalloc.h

 gawk_SOURCES = $(base_sources) eval.c profile.c
@@ -522,6 +523,7 @@
 include ./$(DEPDIR)/regex.Po
 include ./$(DEPDIR)/replace.Po
 include ./$(DEPDIR)/version.Po
+include ./$(DEPDIR)/$(mms).Po

 .c.o:
        $(COMPILE) -MT $@ -MD -MP -MF $(DEPDIR)/$*.Tpo -c -o $@ $<

Then use this file to fix the Makefile with patch -F 3 Makefile < gawk.patch

You should now be able to do make to compile gawk with your heap manager. Also you should be able to run make check in order to run the gawk tests with your implementation.

LAB1: Introduction

Learning outcomes

Assignment

Hand-in

Hints

LAB2: Introduction

Learning outcomes

Assignment

Core dumps

Sending signals to a process

Checking the return status of a program

C API for signal handling

Installing signal handlers

Masking signals

Alarms

Hand-in

LAB3: Introduction

Learning outcomes

Before you start

Assignment

Set up

Study the machine simulator

Study the fac.s program

Implement the FIFO page replacement algorithm

Implement the second chance page replacement algorithm

Optimize the number of disk writes

Optimal page replacement

Hand-in

Hints

Zig

Make Zig visible

Build the Program

Run the Unit Tests

Run the Program

Code Changes for the Zig Version

Lab 4: A FUSE File System

Introduction

Learning outcomes

Before you start

Assignment

Overview

Implementation

Hand-in

Hints

Lab 5: Custom Heap Memory Manager

Introduction

Learning outcomes

Before you start

Malloc, free, realloc

Brk, sbrk

Assignment

Setup

Study the memory manager skeleton

Run the unit test

Implement free, malloc, realloc

Test with gawk

Troubleshooting

Hand-in

To think about

Hints

C without Zig

Set up gawk

Make gawk use your heap manager

Study the `fac.s` program

Implement `free`, `malloc`, `realloc`

Test with `gawk`