The shell is the interface provided by an operating system to its services. In this assignment, you will implement a simple shell for an UNIX-based operating system. Your shell will provide basic functionalities such as the ability to run multiple programs, directory tree navigation, I/O redirection and inter-process communication.
You will get an understanding of how a simple UNIX shell can be implemented. You will become familiar with pipes, I/O redirection, environment variables, etc.
Task Read the man pages for the following system calls (Linux: man -S 2 fork
etc).
access
close
dup2
execv
exit
fork
open
pipe
wait
and waitpid
(it might be more convenient to use use waitpid
instead of wait
in your implementation).chdir
and getcwd
(to implement cd)Task Fetch the incomplete UNIX shell source code and ensure that it builds:
Move to the lab1_shell
directory and build the shell:
cd lab1_shell make
This should produce an executable, sh
, which is an incomplete shell.
Run the tests and watch them fail.
make test
During this assignment, you will change two functions in the file sh.c, parse_line
and run_program
, to make the tests pass.
Task Implement the functionality required to run a single program. Modify the run_program
function such that you will be able to execute:
% /usr/bin/pwd
Use the fork
and wait
system calls. Your implementation must handle both programs that exist in the $PATH
environment variable and programs that are given explicitly via their path. Check that the current user can access and execute the program file (use the access
function). The global variable path_dir_list
is a circular list containing the directories in the $PATH
variable.
Task Extend the shell such that it can run a sequence of programs. Modify the gettoken
function to handle ;
between commands. You should be able to execute:
% ls ; ls
Task Add support for background processes. This should allow you to start a long-running process in the background, but immediately return to the shell prompt. Implementation-wise, this means that the shell does not need to wait for the child process to terminate. Once the child process terminates it will become a zombie process, that has exited but has been not waited for by the parent. You can see these processes by running ps
. To remove these zombie processes, the shell should regularly check for exited children using a non-blocking waitpid
system call with the WNOHANG
flag as an argument. You should be able to execute:
% sleep 10 & ls
Task Add directory tree navigation by implementing the cd
command. The following commands should work:
% cd DIR
- change current directory to DIR
% cd -
- change to the previous working directoryObserve that cd
is a shell built-in, not a program in $PATH
.
Task Implement I/O redirection Allow for the standard output of a program to be redirected to a file and for the contents of a file to be redirected to the standard input.
% ls > filelist
% cat < filelist
The dup2
system call might be useful.
Task Implement inter-process communication using pipes Redirect the output from one process to the input of another process.
% ls | wc
% ls | cat | cat | cat | wc
% ls | cat | wc & echo 1
Use the pipe
and close
system calls. Small changes in the parse_line(void)
function might be needed.
Ensure the make test
returns a PASS
result. Once you are done, push the changes to a branch with a name starting with lab1
.
git push origin HEAD:lab1
Then go to the Gitlab page, create a merge request from the lab1
branch to master
, and assign it to one of the instructors (as explained in Lab 0).
The approximate size of the code changes is:
sh.c | 176 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------------
Signals are an important IPC mechanism present in most operating systems. Usually they are ready to use and require the least setup out of all other IPC mechanisms. However, they are useful in specific cases which might not suit all situations.
Understand and get experience with UNIX signals.
Write a C program which crashes. Try to make it short, e.g. less than 30 bytes. Run it and see what happens (not much should happen). In general it is useful to get feedback from the system about where in the source code a program crashes. You will learn that next.
A core file is a dump of the memory and the register state of a process. Producing such a file is called dumping core or making a core dump. Debuggers can read core files so you can do post-mortem debugging, i.e. inspect a process after its death.
To enable this, issue the following command in the shell:
ulimit -c unlimited
On some systems (e.g. Ubuntu 18.04), you also have to disable the apport
service:
sudo service apport stop
Compile with debugging info (i.e. -g) your crashing program and rerun it. You should get a core file, either in the same directory you ran the program from or in a specific directory dedicated to these. (E.g. on the student machines, check the /var/lib/apport/coredump/
directory.) Then you can run gdb a.out core
to get info about where it crashed.
Signals can be sent to a process, as above, due to an event in the program itself. But they can also be sent from outside the process. You can do so by using the kill
command. Read the manual pages for this command, man 1 kill
. For a complete list of signals, you can read man 7 signal
.
Task Write a program which is an infinite loop, start your program and put in the background (e.g. ./a.out &
). Now send the SIGINT
signal to the process and see what happens. You can do so with the command kill -2 pid
, or with the longer kill -INT pid
. Did you get a core file? You should not. SIGINT
causes a normal termination.
Now start the program again, but this time send the SIGQUIT
signal to the process and see what happens. SIGQUIT
is signal number 3. SIGQUIT
is one of the signals which cause an abnormal termination and a core dump. Now you should get a core file. Again type gdb a.out core
. You should be able to see e.g. where the program was when it received the signal. Post mortem debugging can be very useful when a program terminates unexpectedly.
Task Write a program (in a file called exitcode.c
) that takes another program as argument, executes it, prints the PID of the process, prints its exit status and also whether it terminated normally or abnormally. If the process was terminated by a signal, it should also print that together with the signal number. You will have to use fork
, exec
, waitpid
system calls and the WIFEXITED
, WEXITSTATUS
, WIFSIGNALED
, WTERMSIG
macros. You can test these by sending signals using the kill
command.
A process has three choices when it receives a signal:
ISO C defines the following signals (i.e. they must be supported also by non-UNIX C compilers):
SIGABRT
SIGFPE
SIGILL
SIGINT
SIGSEGV
SIGTERM
A signal set (of type sigset_t
) is a set of signal numbers. Typically a variable of type sigset_t
is declared and then initialized using either sigemptyset
or sigfillset
, followed by sigaddset
or sigdelset
. See the man page for sigemptyset
.
To install a signal handler for a signal using the UNIX interface directly, the sigaction
system call is used. A signal set (sigset_t
) specifies a set of signals which are put on hold (i.e. ignored) while the installed signal handler is executing.
Task Extend your infinite-loop program (in a file called loop1.c
) to print a message every time the user hits Ctrl-C (SIGINT
).
Task Now, install a signal handler for the signal SIGINT
which is itself an infinite loop. While it is looping, you should ignore signal USR1
but handle USR2
. For the signal USR2
, you should install a signal handler which prints out a suitable message. While a signal handler is executing due to a signal s then s itself is blocked.
Suppose your program is executing some critical code in which it does not want to be terminated or receive certain signals. One way to achieve that is to ignore the relevant signals (except SIGKILL
which cannot be ignored). An alternative is to block the signals, or putting them on hold. We used the blocking of a signal when a signal handler was executing (i.e. the signal itself) but we can also block them without executing in a signal handler. The system call sigprocmask
is used for this. It blocks some signals but not all.
Task Create another program (in file loop2.c
) that executes a loop for about 10 seconds. Make it print a message before entering the loop. After the loop, it should print all the signals it received during the execution of the loop. How can you portably collect the signals even if you don't know how many there are? Your code should work even if a new signal number is added, so testing each known signal explicitly is not permitted. See sigprocmask
, sigpending
, and sigismember
.
Finally, one can use alarm
to get the SIGALRM
signal after a specified number of seconds. Task Write another program (in file loop3.c
) containing a loop and install the appropriate signal handler to count how many iterations of the loop can be executed in 10 seconds. This is sometimes a useful way to benchmark a routine. Consider using a global boolean variable to control the termination of the loop. In the signal handler, change the value of the variable such that the loop exits. Try your program with and without optimization (e.g. -O4).
The source files for the programs exitcode.c
, loop1.c
, loop2.c
, loop3.c
. Once you are done, push the changes to a branch with a name starting with lab2
.
git push origin HEAD:lab2
NOTE: You may choose to implement this assignment in C or in Zig Links to an external site. (a newer system programming language). You should see this as a challenge and an opportunity to learn another language. The description below is made for the C version, but the differences in program structure are minor. Some specific instructions for the Zig version are given at the end.
In this assigment you will add virtual memory support to a simulator of a very simple CPU. As part of this you will implement three (3) page-replacement algorithms.
Understand the memory paging mechanism.
Make sure you are familiar with the concepts: virtual memory, FIFO and second-chance page replacement algorithms, optimal page replacement, Bélády's anomaly.
Consider the following page access sequence: 1 2 3 4 1 2 5 1 2 3 4 5. Simulate (paper & pencil) this sequence on a machine with 3 memory frames using FIFO page replacement. How many page faults do you get. Simulate the same sequence on a machine with 4 memory frames. How many page faults do you get now? Why?
lab3_vm
directory. The whole machine simulator is contained in the machine.c
file. The assembly program that will run on the simulator is fac.s
.make
to check that everything builds.Make sure you understand what the functions read_memory
and write_memory
are doing and where they are called from. Study the page_table_entry_t
and coremap_entry_t
data structures.
Note also that:
translate
converts a virtual address to a physical address, and may result in a page fault if the wanted address is in a page not present in memory (because it was never accessed before or was swapped out)pagefault
is the "exception handler" for the case when a certain virtual page is not present in memory. It will need to allocate a physical page and, if needed, bring a swapped out page back intake_phys_page
just needs to allocate a physical page; this can be an unused one if it exists, or it may need to swap out an occupied page using an eviction algorithmfac.s
programThe program is computing 12! and writes the result in register R3
. Be prepared to modify the program such that it computes 2!. Answer the following questions:
st 31,1,-1
doing?Complete the following functions:
pagefault
- handles the case when a page is not available in the memory. This means allocating a new page (take_phys_page
, which may evict a page), bringing in the wanted page from the disk and returning it,take_phys_page
- picks a page to evict using a replacement algorithm (replace
) and swaps it out on the disk if necessary, andfifo_page_replace
- just picks the next page in line to replace using the fifo policy.You can run the program by typing make run-fifo
. How many page faults do you get?
Complete the second_chance_replace
function. You can run the program by typing make run-sc
. How many page faults do you get now? Compare with the FIFO algorithm.
Add a new variable to the simulator that count the disk writes and print it out at the end of the program. Use the page_table_entry_t::modified
field to reduce the number of disk writes.
Implement the Optimal page replacement algorithm and compare with the page replacement algorithms you have already implemented. Keep a trace of virtual page accesses when you run the program with one of the paging algorithms you have already implemented and then simulate the optimal page replacement algorithm on that trace.
Once you have implemented all of the above page replacement algorithms, add a README file to the repository describing the differences between them. Ensure that your code build and then push the changes to a branch with a name starting with lab3
.
git push origin HEAD:lab3
The approximate size of the code changes is:
machine.c | 173 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------
1 file changed, 163 insertions(+), 10 deletions(-)
Uses zig version
0.12.0-dev.1108+aeadcb391 (tested also with 0.11.0)
Run /usr/local/cs/EDAF35/setup
in the terminal you will use on your the lab machine. If you run this on your own machine make sure you install zig yourself - it's already installed on the Linux lab machines.
In the directory with build.zig
run zig build
.
To run the tests from main.zig
: try zig build test
- tests should fail for the skeleton code. You may extend these tests if you wish.
Try zig build run
which will give you:
Options are:
--fifo or --second-chance
optionally --debug
error: ChooseOneReplacement
...
You need to pick either fifo or second-chance to run either of your implemented functions. So zig build run -- --fifo a.s
will run the machine on code from file a.s using FIFO policy. You may leave out the file name if you want to use fac.s
file by default. For more debug prints use --debug
. Alternatively you can use directly the executable in zig-out/bin
.
Take the following with a grain of salt since you might implement things differently:
lab3_vm/zig/src/main.zig | 51 +++++++++++++++++++++++++++++++++++++++++++--------
1 file changed, 43 insertions(+), 8 deletions(-)
In this assignment you will implement a simple, but usable file system. We will use FUSE Links to an external site., a library that enables running file system code in user-space.
Understand the organization of a file system and its elementary operations.
Read the manual (e.g. man 2 rename
) for the following system calls:
stat
([DIR_ENTRY]
)rename
([RENAME]
)truncate
([TRUNC_FREE]
)unlink
([REMOVE]
)lseek
([READ_OFFSET]
)and also the comments in /usr/include/fuse/fuse.h
for the following members of struct fuse_operations
: gettattr
, readdir
, read
, write
, chmod
, truncate
, rename
, unlink
, create
. You will implement some of these functions.
You will extend a very primitive file system implementing a FUSE interface (can be mounted in Linux). The source code for this file system is located in lab4_fs
. To build the project, run make build
. The project comes with a test script which you can run using make test
.
This project also comes with a set of tools. To build them, run make tools
.
format_myfs
to format the file system;info_myfs
to print information about the file system.The physical disk is emulated by using a regular file (called RAWDISK_FS
) and you will need to use the provided interface towards the disk as given, without changing rawdisk.c
and rawdisk.h
. In the current implementation, the file system uses a flat directory structure (no sub-directories) and a block map to support linked lists, for your files or for managing the free blocks. The functions needed for handling the block map are defined in fs_support.c
. Make sure you understand what each of them does.
In principle all the changes you will implement, will very likely be located mainly in one file, ssfs.c
. This implements all the FUSE operations you will work with (which you could extend, if you want to implement other FUSE operations).
Finally, there are a couple of useful tools you might also need to adjust. format_myfs.c
does a soft format of the file system, bringing the file system to a valid initial empty state. info_myfs.c
displays information about the current state of the file system, such as the block map and statistics - very useful for debugging your system. The Makefile
helps you build the FUSE interface or the tools. Please study all of these carefully before embarking on changing them. The suggestion is to run ssfs
in one shell and open another shell to test standard Linux file commands and run the tools.
Your tasks are listed below. Search for the [ ]
tag in ssfs.c
to locate the code that needs to be changed / implemented.
[DIR_ENTRY]
: Add modification time to the directory entry and handle it right (Note this will require you to reformat the file system, since the directory structure changed!).[RENAME]
: Implement file renaming.[TRUNC_FREE]
: Modify the size of a file - shrink or grow. Allocate and free blocks if needed.[REMOVE]
: Implement file deletion.[READ_OFFSET]
: Correctly implement reading with offset in do_read
.[LARGE_DIR]
: Allow the directory to grow over the one block limit (still flat)[LARGE_WRITE]
: Modify do_write
to allow writing more than one blockImportant: all of the above should be implemented without calling existing file system functions such as truncate
, unlink
, rename
,etc. Your functions will simply operate on the structure in fs_support.h|c
using the provided functions (with additions you deem necessary).
Ensure that all the tests pass when running make test
. Once you are done, push the changes to a branch with a name starting with lab4
.
git push origin HEAD:lab4
[TRUNC_FREE]
you will need to create files that span multiple blocks. If you don't implement [LARGE_WRITE]
, you can still create files that span more than one block using a sequence of small writes. One way to do this is to run for i in {1..512} ; do echo _ >> file.txt ; done
.truncate -s SIZE FILE
to create a FILE
of given SIZE
.fs_support.h | 4 +-
ssfs.c | 244 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------------
2 files changed, 177 insertions(+), 71 deletions(-)
NOTE: This assignment is to be written in C with a build script and unit test part using Zig 0.12.0-dev.1108+aeadcb391 Links to an external site. (a newer system programming language). You should see this as a challenge and an opportunity to check out another language. The Zig part is however not compulsory and the task can be completed fully by writting C code.
In this assigment you will complete a simple heap memory manager and test it in action in a real Linux tool (gawk
). For this, you will need to write your own malloc
, free
, and realloc
functions, which will override the existing standard library user-space memory allocation functions. A skeleton already implementing calloc
, along with other helper data structures and functions is provided to you. You will need to fill in functionality marked with TODO
comments.
Understand heap memory management in Linux and be able to implement a simple custom memory manager.
Read the man pages for malloc
, free
, realloc
, calloc
. Pay attention to the corner cases and return values - to work properly your implementation must follow these closely.
Read the man pages for brk
and sbrk
. These are used internally by the provided helper functions to reserve space in the program space. An overview of how sbrk
works is in the picture below. Note that sbrk(0)
returns the current program break.
Note: Be aware that sbrk
may work differently on different UNIX systems, so do consult the man pages for your target system to figure out specifics. For instance, on Mac OS X, sbrk
called with a negative increment will not decrease the program break, but only mark the freed memory as unused.
To set your system up without Zig and use only C please see the instructions at the end of the description. You will anyway work exclusively (unless you want to add more unit tests) in C, in one file!
/usr/local/cs/EDAF35/setup
first.lab5_mm
directory, as ll-mm.c
.lab5_mm/malloc-tester
is a zig project dedicated to unit testing your c, as well as setting up, building and checking your implementation with gawk
. Check your available options or steps by running zig build -h
in lab5_mm/malloc-tester
. The most important are listed below:
zig build test
- will run unit tests on your ll-mm.c
file functions; this step does not need gawk, but will help you identify some errors in your implementation. These tests are found in src/main.zig
and you may add more tests, split these or add more printouts if you wish. This will produce no output if all tests succeed. You may run zig build test --summary all
for more information.zig build setup
- will automatically download and configure gawk
sources; run this firstzig build install
- after the setup above, this will compile and install gawk
using your ll-mm.c
filezig build check
- will run the gawk
tests with the executable produced with installIn the ll-mm.c
you will find a number of functions and a data structure already implemented. Study these carefully. A brief description follows:
block
- is your linked list structure, that makes use of the fact that successive blocks lie in memory after each othernew_block
- creates a new block by increasing the brk limitfind_block
- takes in a pointer and figures out whether it is a an address previously allocated with malloc/realloc
, in which case it returns the block. Use this to validate input addresses.split_block
- if the size of the given block is large enough, splits the block in two, still linked in the same list. Returns a negative number if it failed for some reason.merge_blocks
- merges all free adjacent blocks starting with the given one.There are a number of other functions used either for unit testing or inside the functions above. None of these need modifications!
Task Have a look at the calloc
implementation. What is the point of builtin_umull_overflow
function?
There are a number of unit tests already implemented in malloc-tester/src/main.zig
. These assume the functions work properly, so if you run zig build test
before implemeting anything you should get a failure. Once you start implementing and your functions behave properly you will get fewer failed tests. For more details you can also try runnig zig build test --summary all
. As you implement stuff, do run the unit tests now and then, to see what works and what does not yet.
free
, malloc
, realloc
Task Implement free
in ll-mm.c
. Think about handling also bad input, such as a random address or null. Write more unit tests if you deem it necessary. What do you do with free consecutive blocks?
Task Implement malloc
in a simple way first, by creating new blocks. Then try to reuse free blocks that would fit the new allocation. Then try to split off the remaining space, to reduce fragmentation. All the functions you need to use for these tasks are already given.
Task Implement realloc
. Identify the cases you need to handle and add them gradually. Pick easy solutions first, but do not allocate new blocks unnecessarily! Shrinking an allocation should not result in increasing the sbrk(0)
.
Optional Task Try to add tests for calloc
by looking at how the other unit tests are written and checking the man page for calloc
. Of course, if the test is correct, it will fail, since calloc
uses malloc
, which still needs to be implemented.
gawk
If your unit tests pass, you are ready to test your memory manager with a real application. gawk
is a UNIX tool that has a test suite which stresses memory allocation a lot, which is why we will use it to test your implementation.
This will require you to run zig build setup
, zig build install
, and zig build check
, which will then run the gawk
(version 3.1.8) tests with your implementation of heap manager. The goal is to have all of these tests pass.
If zig build check
fails for any reason or simply does not finish, something is wrong with your implementation. This may happen even if all the unit tests you ran with zig build test
passed! Sometimes you will get core dumps that may help you identify what went wrong. Please check Lab 2 to see how you can use core dumps with gdb
.
Once you have implemented all of the above page replacement algorithms, add a README file to the repository describing the differences between them. Ensure that your code build and then push the changes to a branch with a name starting with lab5
.
git push origin HEAD:lab5
malloc
and kmalloc
? (see man pages). What is the advantage of knowing the size of all objects in your program - e.g. in a kernel - in the case of a memory manager?The approximate size of the code changes is:
lab5_mm/ll-mm.c | 91 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------------------
lab5_mm/malloc-tester/src/main.zig | 18 +++++++++++-------
2 files changed, 83 insertions(+), 26 deletions(-)
If you choose not to use Zig, you can always write your own unit tests in C and use existing tools such as make
to build gawk
and associated tests. However, you will need to do a number of steps manually.
First make sure your CFLAGS are unset otherwise the patch will break, i.e. export CFLAGS=''
To download, configure gawk you will need to carry out the following steps:
wget https://ftp.gnu.org/gnu/gawk/gawk-3.1.8.tar.gz
gunzip gawk-3.1.8.tar.gz
tar xvf gawk-3.1.8.tar
rm gawk-3.1.8.tar
cd gawk-3.1.8/
./configure
touch .deps/ll-mm.Po
You will only need to do this once.
First, make sure your functions are actually called malloc
, free
, calloc
, realloc
. You might have changed these names for carrying out unit testing.
Then copy your ll-mm.c
file in the gawk-3.1.8
directory: this needs to be here, to compile with the rest.
Now you will need to patch the Makefile so it will compile with your ll-mm.c
. For this create a file named gawk.patch
with the following contents:
--- Makefile 2023-10-02 16:48:57.001338786 +0200
+++ ../Makefile.2 2023-10-02 16:48:41.081451959 +0200
@@ -38,7 +38,7 @@
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA
#
-
+mms = ll-mm
pkgincludedir = $(includedir)/gawk
pkglibdir = $(libdir)/gawk
pkglibexecdir = $(libexecdir)/gawk
@@ -90,7 +90,7 @@
floatcomp.$(OBJEXT) gawkmisc.$(OBJEXT) getopt.$(OBJEXT) \
getopt1.$(OBJEXT) io.$(OBJEXT) main.$(OBJEXT) msg.$(OBJEXT) \
node.$(OBJEXT) random.$(OBJEXT) re.$(OBJEXT) regex.$(OBJEXT) \
- replace.$(OBJEXT) version.$(OBJEXT)
+ replace.$(OBJEXT) version.$(OBJEXT) $(mms).$(OBJEXT)
am_gawk_OBJECTS = $(am__objects_1) eval.$(OBJEXT) profile.$(OBJEXT)
gawk_OBJECTS = $(am_gawk_OBJECTS)
gawk_LDADD = $(LDADD)
@@ -177,7 +177,7 @@
AWK = gawk
CC = gcc
CCDEPMODE = depmode=gcc3
-CFLAGS = -g -O2
+CFLAGS = -g -O2 -fno-optimize-strlen
CPP = gcc -E
CPPFLAGS =
CYGPATH_W = echo
@@ -376,6 +376,7 @@
regex.h \
replace.c \
version.c \
+ $(mms).c \
xalloc.h
gawk_SOURCES = $(base_sources) eval.c profile.c
@@ -522,6 +523,7 @@
include ./$(DEPDIR)/regex.Po
include ./$(DEPDIR)/replace.Po
include ./$(DEPDIR)/version.Po
+include ./$(DEPDIR)/$(mms).Po
.c.o:
$(COMPILE) -MT $@ -MD -MP -MF $(DEPDIR)/$*.Tpo -c -o $@ $<
Then use this file to fix the Makefile with patch -F 3 Makefile < gawk.patch
You should now be able to do make
to compile gawk
with your heap manager. Also you should be able to run make check
in order to run the gawk
tests with your implementation.