
Log#55 : More about the IPC instructions

A project log for YASEP News archives

A backup of the blog before the host erases it. A good opportunity to review the development, twists and turns of this ISA !

Yann Guidon / YGDES 04/11/2020 at 03:12

By whygee on Sunday 14 July 2013, 23:47 - Architecture


Jean-Christophe sent me an interesting email loaded with questions about the IPC instructions. Before I address them, I felt I should provide some background in the previous post about "limiting YASEP32's threads code sizes to 16 bits". Go read it now !

Done ? OK. And now, the (translated) email.

> Concerning the 3 instructions IPC, IPE and IPR of the YASEP, I have read that you designed them with the HURD's needs in mind. However, I'm not sure I see how it solves the problem.

This story started long ago, in the F-CPU era, and the encounter with the HURD team at RMLL2002 in Bordeaux. They were trying to solve the problem of slow inter-server calls that crippled the efficiency of their system. That's more than 10 years ago...

Since then, many things have evolved and the question is quite a bit different, now that I can redesign the WHOLE computing platform, without even being forced to "run Linux" or "run the HURD". I make the YASEP run whatever I need or want and I don't care as much about others' whims. But the question remained.

The IPC instructions solve one part of the problem of switching fast to another thread. These instructions make sense when the YASEP is implemented with a large register bank that can hold 8, 16 or 32 thread contexts. In this case, and if the called thread is preloaded, the switch is almost instantaneous.

Of course, you can't limit a system to run only 8, 16 or 32 contexts. That is only suitable for an SMT architecture (see : "barrel processor"), but the number of software threads can grow well beyond that. My Linux laptop, as it runs right now, has about 177 tasks, and only a few of them actually use the CPU. So, for large implementations, the YASEP must store the full, actual thread ID as well as a smaller, 5-bit ID for the cache. Or 6 bits, if you want to emulate Ubicom.

Add some associative memory and you get the cache mechanism. And if your code calls a thread that is not already loaded in the CPU register bank, you "get hit by the miss", but "this should not happen too often". Embedded systems that don't need 32 simultaneous threads can just use the 5-bit ID directly (and trap if the thread ID is too large).
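
To make that mechanism more concrete, here is a minimal C sketch of the lookup, assuming 32 hardware slots ; the names (ipc_lookup, slots, HW_SLOTS) are purely illustrative and not part of the YASEP :

#include <stdint.h>
#include <stdbool.h>

#define HW_SLOTS 32              /* 5-bit slot ID : up to 32 preloaded contexts */

typedef struct {
    bool     valid;              /* is a thread context loaded in this slot ?   */
    uint32_t thread_id;          /* full, system-wide thread ID                 */
} slot_t;

static slot_t slots[HW_SLOTS];

/* Returns the 5-bit slot of a resident thread, or -1 on a miss
   (the OS would then evict one slot and load the missing context). */
int ipc_lookup(uint32_t target_thread_id)
{
    for (int s = 0; s < HW_SLOTS; s++)                 /* associative search    */
        if (slots[s].valid && slots[s].thread_id == target_thread_id)
            return s;                                  /* hit : switch at once  */
    return -1;                                         /* miss : trap to the OS */
}

The hardware would of course do this search in parallel ; the loop only models the behaviour.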

For the rest of this post, bear in mind that I am not a microkernel specialist. There are even several points of view about them and I will only speak about mine, despite having never created a full operating system. At least I know what features and behaviour my OS will have.

> If these instructions are meant to call routines in safe code sections (TCB), it might be a good, flexible solution.

That was the initial purpose : calling shared libraries and fast system calls. Later, it evolved with the addition of some extra security checks. This was possible because the YASEP is not designed to use classic paged memory only.

> But I understand that the HURD's problem was to provide a platform where same-level users (less privileged than the machine or the admin) could run their own servers and exchange services safely. One typical example being User A creating a USB file system and letting User B access files on the USB flash dongle. B must call functions (readdir, read, write, etc.) in A's server.

The IPC instructions do not totally solve that problem, but they help by making the context switch faster. The big problem, however, is the transfer of data across memory addressing spaces, and that requires a later, deeper analysis and smart design. It's way too early for this now.

> What happens when the server does not return to the client ? This could happen if an error or a bug occurs in the server, or if it is malicious. There is a need for a mechanism that lets the caller get control back, but it complicates the design of the servers, which could be interrupted at any moment (which is neither impossible nor desirable).

Reliable code MUST be resilient. And code, by definition, may be interrupted at any point. Exception handling is an integral feature of high-level languages, and low-level systems are naturally prone to failures : flash memory errors (worn out ?), a saturated file system, the network down for any reason, a USB cable removed without notice... And bugs happen.

Even critical sections can fail. This is why I am considering an instruction design (for http://yasep.org/#!ISM/CRIT) where you can check whether the critical section has been interrupted. If it has, you simply restart the critical section. This should be more resilient and safer than blindly disabling interrupts : expect things to fail instead of relying on assumptions.
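
As an illustration only, here is what that retry pattern could look like from software, with crit_enter() and crit_commit() as hypothetical C stand-ins for what the CRIT check could provide (in hardware, the check and the final store would be a single atomic step) :

#include <stdbool.h>

/* Software stand-ins : the scheduler (not shown) would set 'preempted'
   whenever it interrupts the running thread inside the section. */
static volatile bool preempted;

static void crit_enter(void) { preempted = false; }

static bool crit_commit(int *addr, int value)
{
    if (preempted)
        return false;            /* the section was interrupted : abort         */
    *addr = value;               /* otherwise the final store commits the work  */
    return true;                 /* (in hardware, check + store are one step)   */
}

void add_to_counter(int *counter, int delta)
{
    int snapshot;
    do {
        crit_enter();                           /* (re)start the section        */
        snapshot = *counter + delta;            /* speculative work              */
    } while (!crit_commit(counter, snapshot));  /* interrupted ? then start over */
}

The point is the loop : the code expects to be interrupted and recovers by redoing the work, instead of pretending interrupts can't happen.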

> One of the underlying questions is : who "pays" for the resources that are necessary to run the request ? The client, who lends his resources (and who must get them back from the server in case of failure), or the user who provides the server (who then risks being DDOSed)...

My opinion is that the requester must pay for the resources needed to run the request, for example by providing the CPU time, access rights and memory that are necessary to complete it. Of course, certain protections are necessary to prevent abuse or to limit the effects of bugs.

How the resources are accounted for is another important thing to define, along with how to easily transfer data blocks or credentials between servers or their instances. For example, if a server is called by User 1, it should not be able to share this information with the instance that services User 2. I'm interested in hardware-based solutions that speed up and simplify software design, without turning into a mess like the iAPX432 :-)
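
Just to illustrate the "requester pays" idea (none of this maps to actual YASEP registers or instructions), a grant attached to each request could look like this, with the server charging its work against the caller's budget :

#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint32_t client_id;          /* who pays, and who gets the leftovers back    */
    uint32_t cpu_budget;         /* time slices the server may consume           */
    uint32_t mem_budget;         /* bytes the server may allocate                */
    uint32_t capabilities;       /* bit mask of access rights lent by the client */
} grant_t;

/* The server charges its work against the caller's grant ; when the budget
   runs out, the request fails instead of draining the server owner's resources. */
bool charge_cpu(grant_t *g, uint32_t slices)
{
    if (g->cpu_budget < slices)
        return false;            /* budget exhausted : return control to caller  */
    g->cpu_budget -= slices;
    return true;
}

Keeping one grant per client is also what prevents the instance serving User 1 from leaking anything to the instance serving User 2.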


20200411:

This discussion is still not settled... I'll try to find time to work on #NPM - New Programming Model.
