CS 5460 Homework 1

Due: Tuesday, September 8th, 2009 12:25pm

Submit a single C file using handin on a CADE machine:

handin cs5460 hw1 <file>

Part 1 – Learning About Syscalls

There are six “?”s below. Include an answer to each question as a comment in the C file that you hand in with part 2.

Start with the following simple C program:

  #include <stdlib.h>
  
  int main(int argc, char **argv) {
    write(1, "hello\n", 6);
    write(1, "goodbye\n", 8);
    exit(0);
  }

It prints “hello” and “goodbye” to stdout, and then it exits.

Now add this function to your C file:

  int write(int fd, char *str, int len) {
    /* do nothing */
    return 0;
  }

When you compile, link, and run, nothing will print out. Why?

Change the write line for “goodbye” to

  syscall(1, 1, "goodbye\n", 8);

Compile and run on an x86_64 machine, such as a Linux machine in the CADE lab. Now, “goodbye” is printed, while “hello” still isn't. Why?

Without changing the program, compile using the -m32 flag to gcc, so that you compile for 32-bit mode. Then, when you run the program, nothing is printed. Furthermore, the exit status of the program is 1 instead of 0. Why?

Hint: On a CADE machine, compare /usr/include/asm-x86_64/unistd.h and /usr/include/asm-i386/unistd.h.

Add this function to your C file, too:

  int syscall(int n, ...) {
    /* do nothing */
    return 0;
  }

The program doesn't print output in either 32-bit or 64-bit mode. Why?

Replace the syscall call with

    asm("movl %0, %%ecx\n"
        "movl $1, %%ebx\n"
        "movl $8, %%edx\n"
        "movl $4, %%eax\n"
        "int $0x80\n"
        :
        : "r" ("goodbye\n")
        : "eax", "ebx", "ecx", "edx"  /* registers overwritten by the code above */
        );

Note: If you're not familiar with inline assembly in gcc, see GCC-Inline-Assembly-HOWTO.

Compile in 32-bit mode with -m32. The “goodbye” printout is back. Why? Can any C function declaration interfere with this variant?

Part 2 – Generating Code

Your main programming task for this assignment is to implement a gen function that works in 32-bit mode and matches the prototype

  we_proc gen(char *s, int n);

where we_proc is defined as

  typedef void (*we_proc)();

The function generated by gen should write s to stdout n times and then exit. The generated function must not rely on write or syscall bindings, in case they are replaced. Also, multiple calls to gen should not affect each other. For example,

  int main(int argc, char **argv) {
    we_proc a, b;
    a = gen("hello\n", 3);
    b = gen("goodbye\n", 4);
    a();
    write(1, "oops\n", 5);
    return 0;
  }

should print “hello” three times and exit (without printing “oops”).

The constraint that different calls to gen do not interfere means that you can't just use global variables and pointers to C-implemented functions. Instead, you must generate the code at run time. You will only need a few x86 instructions, so you should be able to work out the machine code to generate.
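To see why a global-variable design fails, here is a minimal sketch of that naive approach. The names naive_gen, run_saved, saved_s, saved_n, and out are hypothetical and not part of the assignment, and the out buffer stands in for stdout so that the effect can be inspected. Every call returns the same C function, so the second call silently overwrites the arguments of the first.

```c
#include <assert.h>
#include <string.h>

typedef void (*we_proc)();

/* Hypothetical naive design: a single shared slot holds the arguments. */
static const char *saved_s;
static int saved_n;

/* Stand-in for stdout, so the result can be examined afterwards. */
static char out[256];

/* The one C function that every call to naive_gen returns. */
static void run_saved(void) {
  int i;
  out[0] = '\0';
  for (i = 0; i < saved_n; i++)
    strncat(out, saved_s, sizeof(out) - strlen(out) - 1);
}

we_proc naive_gen(const char *s, int n) {
  assert(s != NULL);
  saved_s = s;        /* overwrites whatever an earlier call stored */
  saved_n = n;
  return run_saved;   /* same pointer every time */
}
```

With this sketch, calling naive_gen("hello\n", 3) and then naive_gen("goodbye\n", 4) yields two identical pointers, and invoking the first one produces "goodbye" four times. Generating separate code for each call avoids that interference.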

Note: Here's a reference for x86/x86_64: AMD64 Architecture Programmer's Manual Volume 3: General-Purpose and System Instructions.

Hint: The five bytes 0xB8, 1, 0, 0, and 0 form an x86 machine-code sequence that moves the value 1 into the EAX register. The first byte is the instruction, and the last four bytes are the constant 1 in little-endian format.
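As an illustration of the hint, the following sketch writes that five-byte sequence for an arbitrary 32-bit constant. The helper name emit_mov_eax_imm32 is an assumption made for this example and is not part of the assignment. It stores the bytes one at a time, so the little-endian layout does not depend on the byte order of the machine that runs the generator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: append the 5-byte encoding of "mov eax, imm32"
   to buf and return a pointer to the next free byte. */
unsigned char *emit_mov_eax_imm32(unsigned char *buf, uint32_t value) {
  assert(buf != NULL);
  *buf++ = 0xB8;                                    /* opcode byte */
  *buf++ = (unsigned char)(value & 0xFF);           /* lowest byte first */
  *buf++ = (unsigned char)((value >> 8) & 0xFF);
  *buf++ = (unsigned char)((value >> 16) & 0xFF);
  *buf++ = (unsigned char)((value >> 24) & 0xFF);   /* highest byte last */
  return buf;
}
```

Calling emit_mov_eax_imm32(code, 1) fills code with 0xB8, 1, 0, 0, 0, which matches the sequence in the hint.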

If you just call malloc to allocate space for the code, calling the allocated pointer as a function will lead to a segmentation fault. That's because memory allocated by malloc has execute permission disabled for security reasons. To allocate executable memory, use this allocator:

  #include <stdlib.h>
  #include <unistd.h>
  #include <malloc.h>
  #include <sys/mman.h>
  
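  /* Allocate sz bytes of page-aligned memory that is readable,
     writable, and executable. Aborts if allocation or mprotect fails. */
  void *malloc_exec(size_t sz) {
    long pagesize = sysconf(_SC_PAGE_SIZE);
    void *p = memalign(pagesize, sz);
    if (!p || mprotect(p, sz, PROT_READ | PROT_WRITE | PROT_EXEC))
      abort();
    return p;
  }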

Don't forget to compile your program in 32-bit mode.
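As a quick check that the allocator works, the sketch below copies a tiny hand-assembled function into executable memory and calls it. It is only a demonstration of the mechanism, not a solution to gen. The names int_proc and run_ret42 are hypothetical, the allocator is repeated (with a NULL check) so that the example compiles on its own, and the code assumes an x86 or x86_64 Linux system. The six bytes encode "mov eax, 42" followed by "ret", which happen to be valid in both 32-bit and 64-bit mode.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <malloc.h>
#include <sys/mman.h>

/* Same allocator as in the assignment text, plus a NULL check. */
void *malloc_exec(size_t sz) {
  long pagesize = sysconf(_SC_PAGE_SIZE);
  void *p = memalign(pagesize, sz);
  if (!p || mprotect(p, sz, PROT_READ | PROT_WRITE | PROT_EXEC))
    abort();
  return p;
}

typedef int (*int_proc)(void);  /* hypothetical type, used only in this demo */

/* Hypothetical demo: build a function that returns 42 and call it.
   Returns -1 without generating any code on non-x86 targets. */
int run_ret42(void) {
#if defined(__i386__) || defined(__x86_64__)
  /* mov eax, 42 ; ret */
  static const unsigned char code[] = { 0xB8, 42, 0, 0, 0, 0xC3 };
  void *mem = malloc_exec(sizeof(code));
  int_proc f;
  assert(mem != NULL);            /* malloc_exec aborts on failure */
  memcpy(mem, code, sizeof(code));
  f = (int_proc)mem;              /* data-to-function pointer cast */
  return f();                     /* memory is not freed in this demo */
#else
  return -1;
#endif
}
```

The generated function hands its result back in EAX, which is where the C caller expects an int return value on x86.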


Last update: Tuesday, December 8th, 2009
mflatt@cs.utah.edu