====== Writing Applications using LIBL4 ======

LIBL4 is the generic userspace library that provides a unified interface to the Codezero microkernel. Codezero pagers and applications use this library to issue system calls. LIBL4 is a plain, abstract library: it implements only a thin layer of glue logic for every system call. Since the microkernel does not define any policy for the IPC protocol, LIBL4 also provides helpers for building a userspace IPC protocol. Finally, LIBL4 includes a small set of API helpers for creating threads and managing capabilities. The library has been kept small so that it does not impose complicated API conventions on userspace applications.

Note that the [[:codezero_api|Codezero API Reference]] chapter already lists the exact API elements that LIBL4 provides for each system call or structure. This guide builds on the API reference and provides examples of how to use various LIBL4 interfaces.

==== Codezero standalone application and library examples ====

  - [[application_development:#Hello_World_application|Hello World application]]
  - [[application_development:#Thread_library_demo|Thread library demo]]
  - [[application_development:#Mutex_library_demo|Mutex library demo]]
  - [[application_development:#other_examples|Other examples]]

==== Codezero pager application examples ====

  - [[:application_development:#building_a_service_using_ipc|Building a service using IPC]]
  - [[:application_development:#management_of_children_threads|Management of children threads]]
  - [[:application_development:#manipulating_children_address_spaces|Manipulating children address spaces]]

==== Codezero standalone application and library examples ====

=== 1.) Hello World application ===

The Codezero system configuration provides two baremetal container types named **hello_world** and **empty**.
These container types produce new project directories under //conts/<name>//, where //<name>// denotes the name chosen as part of the configuration. This directory contains all the source code, linker scripts, and build scripts necessary to build a complete standalone container. The **empty** container contains the bare minimum of sources, and the **hello_world** container adds a few lines so that the container prints a //Hello World!// message on the console.

For demonstration purposes, the snippet below shows the bare minimum of source code required to create a new container. For details on building a new container, extending it, and integrating it into the Codezero build system, please refer to the [[Getting Started]] chapter.

<code c>
/*
 * Main function for this container
 */
#include <...>
#include L4LIB_INC_ARCH(syslib.h)
#include L4LIB_INC_ARCH(syscalls.h)
#include <...>

/* Declared up front because __container_init() calls it */
int main(void);

void __container_init(void)
{
	/* Generic L4 initialization */
	__l4_init();

	/* Entry to main */
	main();
}

int print_hello_world(void)
{
	printf("%s: Hello world from %s!\n",
	       __CONTAINER__, __CONTAINER_NAME__);
	return 0;
}

int main(void)
{
	print_hello_world();

	return 0;
}
</code>

  * Typically, the LIBL4 startup code calls the _ _container_init() function, which in turn calls **main()**.

=== 2.) Thread library demo ===

LIBL4 provides a small multithreading library with a handful of API calls. Using this library, a microkernel application may create multiple threads in its own address space. Most LIBL4 helper functions are direct wrappers around system calls. Unlike those helpers, this library includes a small, plain runtime that dynamically manages the allocation of thread structures, thread stacks, and UTCB areas. Note that the API functions of this library are not listed in the **L4_USERSPACE_LIBRARY** sections of the microkernel system call API pages. The main elements of the API are listed below. The complete list may be found at: //codezero/conts/userlibs/libl4/include/l4lib/lib/thread.h//.
<code c>
struct l4_thread {
	struct task_ids ids;		/* Thread IDs */
	struct l4_mutex lock;		/* Lock for thread struct */
	struct link list;		/* Link to list of threads */
	unsigned long *stack;		/* Stack (grows downwards) */
	struct utcb *utcb;		/* UTCB address */
};

/*
 * These are thread calls that are meant to be
 * called by library users.
 */
int thread_create(int (*func)(void *), void *args, unsigned int flags,
		  struct l4_thread **tptr);
int thread_wait(struct l4_thread *t);
void thread_exit(int exitcode);

/*
 * This is to be called only if the to-be-destroyed thread is in a
 * sane condition for destruction.
 */
int thread_destroy(struct l4_thread *thread);

/* Library init function called by __container_init */
void __l4_threadlib_init(void);
</code>

  * Typically, **_ _l4_threadlib_init()** is called to initialize the library before **main()** is called.
  * Parent threads may use the **thread_create()** function to create new threads in the current address space.
  * Parent threads may wait for a child thread to exit by calling **thread_wait()**.
  * Parent threads may optionally destroy children by calling **thread_destroy()**. This option should be exercised with caution, as it may cause problems if the thread is manipulating shared data at the time it is destroyed.
  * Alternatively, child threads may exit by calling **thread_exit()**. This call should be paired with a **thread_wait()** call by the parent thread.
Below is an example of how the library may be used:

<code c>
#include <...>
#include L4LIB_INC_ARCH(syslib.h)
#include L4LIB_INC_ARCH(syscalls.h)
#include <...>

#define NTHREADS	6
#define dbg_printf	printf

int thread_test_func1(void *arg)
{
	l4id_t tid = self_tid();

	/* Wait for a while before exiting */
	int j = 0x400000;
	while (j--)
		;

	/* Returning from the thread function exits the thread */
	return tid;
}

int thread_demo(void)
{
	struct l4_thread *thread[NTHREADS];
	int err;

	/* Create threads */
	for (int i = 0; i < NTHREADS; i++)
		if ((err = thread_create(thread_test_func1, 0,
					 TC_SHARE_SPACE, &thread[i])) < 0)
			return err;

	/*
	 * Wait for all threads to exit successfully
	 */
	for (int i = 0; i < NTHREADS; i++)
		if ((err = thread_wait(thread[i])) < 0)
			return err;

	return 0;
}
</code>

In the above demonstration, the main thread creates **NTHREADS** child threads. Each child implicitly exits via the **thread_exit()** library call as soon as it returns from its thread function. The parent thread waits for each child to exit using the **thread_wait()** library call. The full source code of the demo is located at //codezero/conts/baremetal/threads_demo//. The demo may be built by selecting it as the baremetal container type during system configuration.

=== 3.) Mutex library demo ===

Codezero provides userspace mutexes that multithreaded applications can use to synchronize. Userspace mutexes consist of architecture-specific synchronization primitives backed by kernel-based mutex wait queues. They have been designed so that threads block inside the kernel only upon contention. In the noncontending case, threads simply continue execution without any intervention by the microkernel. When one or more threads contend, they go to sleep inside the kernel, waiting for a rendezvous to wake them up. This design allows for fast userspace locking, while reducing the mutex-related load on the kernel.
Below is an example of how userspace mutexes may be used in a multithreaded application:

<code c>
/* Shared test data: a mutex (lock) and the counter it protects (val) */
static struct mutex_test_data tdata;

int mutex_thread_contending(void *arg)
{
	struct mutex_test_data *data = (struct mutex_test_data *)arg;
	int err;

	for (int i = 0; i < MUTEX_INCREMENTS; i++) {
		/* Lock the data structure */
		if ((err = l4_mutex_lock(&data->lock)) < 0)
			return -err;

		/* Yield a few times so that other threads block on the mutex */
		for (int j = 0; j < 3; j++)
			l4_thread_switch(0);

		/* Increment the protected counter */
		data->val++;

		/* Unlock the data structure */
		if ((err = l4_mutex_unlock(&data->lock)) < 0)
			return -err;
	}
	return 0;
}

int test_mutex(int (*mutex_thread)(void *))
{
	struct l4_thread *thread[MUTEX_NTHREADS];
	int err;

	/* Init mutex data */
	init_test_data(&tdata);

	/* Lock the mutex so nobody starts working */
	if ((err = l4_mutex_lock(&tdata.lock)) < 0)
		return err;

	/* Create threads */
	for (int i = 0; i < MUTEX_NTHREADS; i++) {
		if ((err = thread_create(mutex_thread, &tdata,
					 TC_SHARE_SPACE, &thread[i])) < 0)
			return err;
	}

	/* Unlock the mutex and initiate all workers */
	if ((err = l4_mutex_unlock(&tdata.lock)) < 0)
		return err;

	return 0;
}
</code>

== Operational model ==

  * In this demonstration, the main thread creates multiple threads that contend on the mutex, namely the **mutex_thread_contending** threads.
  * The main thread holds the mutex until all threads are created and started.
  * Each child tries to acquire the mutex.
  * Once one of them acquires the mutex, it makes **l4_thread_switch()** calls, effectively handing execution over to the other threads.
  * The other threads, finding the mutex locked, declare contention and issue the **l4_mutex_control** system call for a rendezvous.
  * When the lock holder releases the lock, it detects the contention and calls **l4_mutex_control** to wake up the contending threads.
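The contention handling listed above can be modeled on a host machine without the microkernel. The sketch below is not LIBL4 code: the ''demo_mutex'' type, its state values, and the ''kernel_calls'' counter are hypothetical names introduced purely for illustration, and the counter only marks the points where a real implementation would issue **l4_mutex_control**. It assumes a C11 compiler that provides ''<stdatomic.h>'', and it is meant for a single-threaded walk-through rather than real locking, since it reports contention instead of sleeping and retrying.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock states; these are not the LIBL4 definitions. */
enum { DEMO_UNLOCKED = 0, DEMO_LOCKED = 1, DEMO_CONTENDED = 2 };

struct demo_mutex {
	atomic_int state;	/* One of the DEMO_* states above */
	int kernel_calls;	/* Counts simulated kernel entries */
};

void demo_mutex_init(struct demo_mutex *m)
{
	assert(m != NULL);
	atomic_init(&m->state, DEMO_UNLOCKED);
	m->kernel_calls = 0;
}

/*
 * Try to take the lock. Returns true if it was acquired on the
 * userspace fast path, false if the caller found it held and
 * would have to sleep inside the kernel.
 */
bool demo_mutex_lock_fast(struct demo_mutex *m)
{
	int expected = DEMO_UNLOCKED;

	/* Fast path: an uncontended lock is taken entirely in userspace */
	if (atomic_compare_exchange_strong(&m->state, &expected, DEMO_LOCKED))
		return true;

	/* Slow path: record contention so the unlocker knows to wake us */
	expected = DEMO_LOCKED;
	atomic_compare_exchange_strong(&m->state, &expected, DEMO_CONTENDED);

	/* A real lock would enter the kernel and sleep here */
	m->kernel_calls++;
	return false;
}

/*
 * Release the lock. Returns true if contention was detected and a
 * (simulated) kernel call was needed to wake up waiters.
 */
bool demo_mutex_unlock(struct demo_mutex *m)
{
	int prev = atomic_exchange(&m->state, DEMO_UNLOCKED);

	if (prev == DEMO_CONTENDED) {
		/* A real unlock would enter the kernel to wake waiters */
		m->kernel_calls++;
		return true;
	}
	return false;
}
```

In this model, a lock and unlock pair with no competing caller leaves ''kernel_calls'' at zero, which mirrors the noncontending case described earlier. Only a second lock attempt on a held mutex, and the unlock that follows it, increment the counter.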
For further insight into userspace mutexes, the //codezero/conts/baremetal/mutex_demo// project can be selected in the Codezero configuration system under baremetal containers.

==== Codezero pager application examples ====

=== 1.) Building a service using IPC ===

IPC is the core method of communication in a virtualization system based on microkernels. In such a system, IPC often takes place between two parties in a client-server relationship. In Codezero, client-server communication is kept simple and lightweight. There is no mechanism for autogenerating client and server stubs, as that approach is known to produce notoriously complicated and heavyweight implementations. Instead, client-server communication is written by hand using the IPC messaging protocol provided by the LIBL4 helper functions.

This section contains fictional examples of how to create services that serve requests from applications using IPC. Typically, a virtualized operating system kernel serves requests from its client applications this way.

== Handling requests ==

On a typical Codezero service, request handling follows the pattern shown in the snippet below.
<code c>
void handle_requests(void)
{
	/* Generic IPC data */
	u32 mr[MR_UNUSED_TOTAL];
	l4id_t senderid;
	struct tcb *sender;
	u32 tag;
	int ret;

	/* Receive request from any thread */
	if ((ret = l4_receive(L4_ANYTHREAD)) < 0)
		goto out_err;

	/* Read the tag that identifies a request */
	tag = l4_get_tag();

	/* Read the sender ID, set by the microkernel */
	senderid = l4_get_sender();

	/* Retrieve the information stored on the service about the sender */
	if (!(sender = find_task(senderid))) {
		l4_ipc_return(-ESRCH);
		return;
	}

	/* Read message registers */
	for (int i = 0; i < MR_UNUSED_TOTAL; i++)
		mr[i] = read_mr(MR_UNUSED_START + i);

	/* Handle the request according to the given tag */
	switch (tag) {
	case L4_IPC_REQUEST_NO_RETURN:
		ret = handle_no_return_request(sender, (char *)mr[0],
					       mr[1], mr[2]);
		if (ret < 0)
			break;	/* We only return for errors. */
		else
			return;	/* Otherwise, we don't return; a one way request. */
	case L4_IPC_REQUEST_WITH_RETURN:
		ret = handle_returning_request(sender, (void *)mr[0]);
		break;
	default:
		/* Unknown request tag: report an error to the client */
		ret = -ENOSYS;
		break;
	}

	/* Send return message back to the client. */
	if ((ret = l4_ipc_return(ret)) < 0) {
		printf("%s: L4 IPC Error: %d.\n", __FUNCTION__, ret);
		BUG();
	}
	return;

out_err:
	printf("IPC Error occurred: %d\n", ret);
}

void main(void)
{
	/* Initialize service */
	initialise();

	while (1) {
		handle_requests();
	}
}
</code>

== Operational model ==

  * After initialization, the server continuously waits for requests.
  * A general server typically accepts requests from any thread on the system, using the **L4_ANYTHREAD** special value.
  * Each IPC contains crucial information about the request, such as the **request tag** and the **sender ID**.
  * The **sender ID** is set by the microkernel, because the receiving service could not trust this value if the sender supplied it. The microkernel otherwise imposes no structure on the content of message fields; the sender ID is the only exception, made for security reasons.
  * The **tag** in the message identifies the type of request, and it is only relevant to the IPC parties in userspace. It bears no significance for the microkernel.
  * Both the **tag** and the **sender ID** are received in preallocated message slots, defined by the protocol between the service and the client. Typically, these message slots are defined by **LIBL4**.
  * The rest of the message registers contain the actual arguments of the request. On the ARM architecture, LIBL4 defines four user-defined slots for sending system call arguments. These are defined by the **L4SYS_ARG0** - **L4SYS_ARG3** symbols, which denote message-register offset values. These slots are typically located after the **tag** and **senderid** slots.
  * A typical request involves a **send** and a **receive** phase. For instance, a client makes a system call with a **send** operation and receives the return value of the system call through a **receive**. The return value is also passed in a preallocated message register, typically defined by **LIBL4**.
  * On requests that do not require a return value (e.g., an **exit** system call that does not return), the service moves on to the next request after the first **receive**, without a **send** phase.

== Blocking IPC ==

It is worth noting that the communication in this example is synchronous. In other words, both the client and the server tasks block during IPC. This may create complications when one of the parties involved in the IPC is buggy. For example, a service in its return phase may block indefinitely if the client does not adhere to the protocol and never issues a receive. This problem may be solved by using multithreaded pagers. The problem is avoided completely when the virtualized Linux kernel communicates with Linux userspace applications. Since Linux userspace is binary compatible with a native Linux kernel environment, application system calls are converted to IPC inside the microkernel.
Since a microkernel-generated IPC never blocks its pager indefinitely, the Linux kernel is protected from being stalled by a client IPC that blocks forever.

=== 2.) Management of children threads ===

Pagers are responsible for creating, destroying, and managing the execution of the threads that they are associated with. As a general rule, each pager is responsible for the set of all threads inside a particular container. Threads may be created in an existing address space, in a brand new, clean address space, or in an address space that has been created as a copy of an existing address space. Below are example code snippets that perform various thread manipulation operations.

== Thread creation ==

  * Creating a brand new thread:

<code c>
/* Create a new thread in a new address space */
void thread_new(void)
{
	struct task_ids ids;
	int err;

	ids.tid = TASK_ID_INVALID;
	ids.spid = TASK_ID_INVALID;
	ids.tgid = TASK_ID_INVALID;

	if ((err = l4_thread_control(THREAD_CREATE |
				     THREAD_NEW_SPACE, &ids)) < 0) {
		printf("l4_thread_control failed: %d\n", err);
	}
}
</code>

  * Creating a new thread in an existing address space (e.g., a virtualized Linux kernel handling a clone() system call):

<code c>
/* Create a new thread in an existing address space */
void thread_new(struct task_ids *parent)
{
	struct task_ids ids;
	int err;

	/* Specify parent ids */
	ids.tid = parent->tid;
	ids.spid = parent->spid;
	ids.tgid = TASK_ID_INVALID;

	if ((err = l4_thread_control(THREAD_CREATE |
				     THREAD_SAME_SPACE, &ids)) < 0) {
		printf("l4_thread_control failed: %d\n", err);
	}
}
</code>

  * Creating a new thread in a new address space that is a copy of another address space (e.g., a virtualized Linux kernel handling a fork() system call):

<code c>
/* Create a new thread in a new, copied space */
void thread_new(struct task_ids *parent)
{
	struct task_ids ids;
	int err;

	/* Specify parent ids */
	ids.tid = parent->tid;
	ids.spid = parent->spid;
	ids.tgid = TASK_ID_INVALID;

	if ((err = l4_thread_control(THREAD_CREATE |
				     THREAD_COPY_SPACE, &ids)) < 0) {
		printf("l4_thread_control failed: %d\n", err);
	}
}
</code>

== Thread context manipulation ==

  * Thread context is manipulated via the **l4_exchange_registers()** system call. For example, the context of a thread newly created in an existing address space may be modified so that the thread starts executing in the intended context. Typically, any of a thread's registers, or its UTCB address, may be read or changed by this system call. Below is a fictional example of how this call may be used:

<code c>
void thread_manipulate(struct task_ids *new_ids,
		       unsigned long new_stack,
		       unsigned long utcb_address)
{
	struct exregs_data exregs;
	int err;

	memset(&exregs, 0, sizeof(exregs));

	/* Set new stack for child */
	exregs_set_stack(&exregs, new_stack);

	/* Set child return value to 0 */
	exregs_set_mr(&exregs, MR_RETURN, 0);

	/* Set child utcb */
	exregs_set_utcb(&exregs, utcb_address);

	/* Do the actual exchange registers call to microkernel */
	if ((err = l4_exchange_registers(&exregs, new_ids->tid)) < 0)
		printf("Exchange registers error: %d\n", err);
}
</code>

== Operational model ==

The Codezero capability system is flexible enough to grant any particular task the privilege to make any system call. However, by convention, **l4_thread_control** and **l4_exchange_registers()** are privileged system calls that are only meant to be executed by pagers. As an example, the virtualized Linux kernel may receive a system call request that involves one of these operations. The operational model for the above system calls is as follows:

  * A client thread requests, via IPC, that a new thread be created in its own address space.
  * The pager handles the request by creating a new thread via **l4_thread_control** and modifying its context as requested via **l4_exchange_registers**.
  * The pager replies to the client that the new thread is ready for execution.
  * Depending on the implementation, the pager may start the new thread's execution if requested by the client.
In conclusion, the above code snippets are usually used as part of an IPC request/reply pair between the pager on the system and its client.

=== 3.) Manipulating children address spaces ===

Pagers manipulate the address spaces of their children using privileged address space manipulation functions. Address spaces are created, cleared, and destroyed by the **l4_thread_control** system call during thread creation. Existing address spaces, however, are modified by the **l4_map** and **l4_unmap** system calls. Below are code snippets for typical address space manipulation operations performed by pagers.

== Mapping a new page ==

Below is the microkernel's architecture-specific structure for describing a page fault:

<code c>
/* Kernel's data about the fault */
typedef struct fault_kdata {
	u32 faulty_pc;	/* In DABT: Aborting PC, In PABT: Same as FAR */
	u32 fsr;	/* In DABT: DFSR, In PABT: IFSR */
	u32 far;	/* In DABT: DFAR, in PABT: IFAR */
	pte_t pte;	/* Faulty page table entry */
} __attribute__ ((__packed__)) fault_kdata_t;
</code>

Below is a code snippet taken from a pager as it handles a page fault from a task:

<code c>
	...
	/* Map the new page to faulting task */
	l4_map((void *)page_to_phys(page),
	       (void *)page_align(fault->address), 1,
	       (reason & VM_READ) ? MAP_USR_RO : MAP_USR_RW,
	       fault->task->tid);

	dprintf("%s: Mapped 0x%x as writable to tid %d.\n",
		__TASKNAME__, page_align(fault->address),
		fault->task->tid);

	return 0;
}
</code>

== Operational model ==

  * In the above example, a pager (namely a virtualized kernel) is handling a page fault IPC generated by the microkernel in response to a memory exception (e.g., a data abort or a prefetch abort exception on ARM).
  * The exception data is delivered in the form of an IPC, using the **fault_kdata_t** structure in the case of the ARM architecture.
  * Using the information provided by the IPC, the pager determines which physical page to map to its client and with what permission flags.
  * Usually, pagers manipulate client address spaces as a result of page faults. On other occasions, pagers may proactively prefault pages in a client's address space if necessary.

== Unmapping a range of pages ==

Below is a code snippet taken from a pager as it unmaps a virtual memory address range from a client task:

<code c>
	...
	/*
	 * Unmap the whole VMA address range. Note that this
	 * may return -1 if the area was never faulted in, which
	 * means the area was unmapped before being touched.
	 */
	l4_unmap((void *)__pfn_to_addr(vma->pfn_start),
		 vma->pfn_end - vma->pfn_start, task->tid);

	return 0;
}
</code>

== Operational model ==

  * In the above example, the pager removes the mapping for a virtual address range from a task in response to a memory unmap IPC request.

=== 4.) Other examples ===

For other API usage examples, such as IPC, thread management, or capability management, please refer to the baremetal container sources provided under //codezero/conts/baremetal//. Each individual project under this directory is a good starting point for understanding how the API may be used in L4 applications.

==== ====

If you have questions about other application scenarios, or if a concept is not described clearly enough, please notify us by direct [[mailto:info@l4dev.org?subject=Documentation%20Update%20Request|email]] or ask on our [[http://lists.l4dev.org/mailman/listinfo/codezero-devel|mailing list]].