micro HTTP server in C

If you need a very high throughput or a multithreaded web server to host data for everyone on the web, look at Apache (with CGI), NGINX or https://www.gnu.org/software/libmicrohttpd/

However, they are limited by CGI's basic inability to preserve a context across several HTTP requests and to implement efficient sessions: each request must parse the headers again and check the cookies again, even though the connection with the server was not closed ! In fact, with a multithreaded server you can't even know the order of the requests, and race conditions are to be expected if you do anything more than serve static files.

This project solves the problem of implicit sessions with a slower but inherently safer approach : a single-threaded server ensures that the requests are received and answered in order. No risk of race conditions from that angle, no need for funky programming techniques to avoid them. A session is congruent with an open TCP/IP connection, which also helps with safety (security remains dubious, but that is not the purpose).

-o-O-0-O-o-

The server's code can contain two modules, that each serve a domain and specific purpose:

Static files server
HTTaP manager
(an optional loopback with POST is also planned)

They can be individually disabled but they are usually integrated in the same server because HTTP/JS makes it much more difficult to connect to resources located on a different IP address or IP port.

The client first points the browser to IPaddress:IPport which provides the gateway to the application, with its code, images, text... It looks like a normal web server, probably like yasep.org.

The code is downloaded by the client then JavaScript is executed. The server can focus on real-time, low-level operations while the client performs the high-level application logic, user interaction, computations and display.

This workflow is ideal when you want to control, configure and interact with embedded devices, over a wire or a radio link if you need. Your client can be any brand or model as long as it abides by simple web standards. You can code once in JS (with ygwm.org) and run everywhere !

-o-O-0-O-o-

Since this is a software project, it's hard to create a diagram/illustration but the diagram showing #HTTaP works well because it shows how this server is intended to be used. It's the "HTTaP server" boxes:

The server is not a competitor to Apache and others, but a piece of code that is embedded in other programs to make them web-enabled and work with real time constraints. You can thus control software, hardware, or even both... This is another reason why it is single-threaded : only one user is expected at a given time.

You can adapt the server to recognise SCPI commands if you are not a HTTaP fan.

Here is an example of use : in 2017/03, the server runs on the Pi to drive the "Remote Controlled Car" extension board, for the workshop at IESA. You wouldn't want somebody else to "drive your car" at the same time as you, right ?

The noise you hear is generated by "bit banging" the loudspeaker pin in the polling loop that drives the server's FSM. It can get really fast, showing that:

- The web interface doesn't interfere with proper real-time reactions
- The application can do heavy lifting without race conditions and still remain responsive

In this case, the server is used along with the #C GPIO library for Raspberry Pi and I am preparing other libraries to interface with even more peripherals !

See also the #Simon Says learn Pi and IoT project where these building blocks are put together.

-o-O-0-O-o-

Logs:
1. Timeout
2. MIME type handling
3. Licensing
4. New version
5. Security and sandboxing
6. Overview of the code
7. Timeout and persistence
8. When to enable CORS
9. Files are served
10. Handling user-provided routines
11. Potential use case: The API of DOOM
12. Respawn
13. Back to this project !
14. A more polished version
15. A much better timing system
16. New version : Firefox vs Chromium-browser
17. version...

File cache considerations
Yann Guidon / YGDES • 05/18/2023 at 19:36 • 0 comments

As I return to this (now quite old) project and remember how it is structured, the hack of the attributes and MIME types now strikes me as adapted only for very low traffic. If any significant traffic happens, this would increase the number of system calls (open, read, close...) which is not desired. A sort of file cache, at least for the small ones, would somehow reduce the system's load, at least during the construction of the response header.

My first idea was to keep the last X file descriptors open, to at least save on the open() calls and related kernel operations. However this does not reduce the amount of calls because there will still be a seek() and a read(). Oh and there must be a cache lookup and update system as well... So a small cache area, with a dozen of 1KB bins, is required to save on the seek()s and read()s.
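
A minimal sketch of such a cache, assuming a dozen 1 KB bins that are looked up by path and replaced in round-robin order. The names (`cache_get`, `CACHE_BINS`, `CACHE_PATH_MAX`...) and the replacement policy are illustrative assumptions, not code from the server:

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define CACHE_BINS     12    /* "a dozen" bins (assumed value) */
#define CACHE_BIN_SIZE 1024  /* 1 KB per bin */
#define CACHE_PATH_MAX 128   /* hypothetical limit on the path length */

struct cache_bin {
  char   path[CACHE_PATH_MAX]; /* empty string = unused bin */
  size_t len;                  /* number of valid bytes in data[] */
  char   data[CACHE_BIN_SIZE];
};

static struct cache_bin cache[CACHE_BINS];
static unsigned cache_next;    /* next bin to overwrite (round robin) */

/* Returns a pointer to the file's contents and stores its size in *len.
   Returns NULL if the file can't be read or doesn't fit in one bin. */
const char *cache_get(const char *path, size_t *len) {
  if (strlen(path) >= CACHE_PATH_MAX) return NULL;

  /* lookup: a hit costs no system call at all */
  for (unsigned i = 0; i < CACHE_BINS; i++) {
    if (cache[i].path[0] && strcmp(cache[i].path, path) == 0) {
      *len = cache[i].len;
      return cache[i].data;
    }
  }

  /* miss: one open/read/close, then the next bin is overwritten */
  int fd = open(path, O_RDONLY);
  if (fd < 0) return NULL;
  char tmp[CACHE_BIN_SIZE + 1]; /* one extra byte to detect big files */
  ssize_t n = read(fd, tmp, sizeof tmp);
  close(fd);
  if (n < 0 || n > CACHE_BIN_SIZE) return NULL;

  struct cache_bin *bin = &cache[cache_next];
  cache_next = (cache_next + 1) % CACHE_BINS;
  strcpy(bin->path, path);
  memcpy(bin->data, tmp, (size_t)n);
  bin->len = (size_t)n;
  *len = bin->len;
  return bin->data;
}
```

Files larger than one bin are simply refused here, so the caller would fall back to the usual open/read path for them.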

File access is far from being a bottleneck. Files get read when a new session starts and this is not timing-critical. However, slowly, this brings us closer to the original intended architecture where the server manages a small filesystem by itself, free from OS considerations and easily embedded in a tiny computer or large microcontroller.
Refactoring with aligned strings
Yann Guidon / YGDES • 05/18/2023 at 12:54 • 0 comments

The server relies a lot on strings, particularly concatenations. To keep the code small and fast, it involves a lot of manual handling, which requires care to prevent bugs or worse. It is one of the weak/sore points of the code base, which inspired the development of the #Aligned Strings format. Now that this small library is functional, it is time to use it for real !
I'm still very annoyed by the limitation of the C preprocessor that prevents me from further streamlining and optimising the declarations of flexible strings inside functions, as I can't #define or alias words from inside a macro. But in this project, it's not a significant roadblock, so no need to resort to m4 or other dirty tricks : the system works, not as efficiently as I would love, but it's still a good progress compared to the overly "micromanaged" strings of the existing version. Moreover, this refactoring is a great opportunity to put the aligned strings library to the test of real life and even enhance it.

A poll() problem...

Yann Guidon / YGDES • 01/06/2021 at 18:58 • 0 comments

I got something unexpected while looking for easy/simple/light ways to probe if the server is up with a bash script.

> ./serv_simple
  === And now, browse to 127.0.0.1:60075 ===
Port: 60075, Keepalive: 10s
Path: files  Root page: index.html

Warning: chroot() failure (are you root ?) : No such file or directory

Server socket ready
H_BLOCKED

On the script side:

root@pi:/home/pi# echo "GET /" | telnet 127.0.0.1 60075
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
Connection closed by foreign host.

and then:

* Connected to 127.0.0.1:39298
HTTaP-Session: 2l1cvf10
H_BLOCKING
0
received 7
---------------------------
GET /

*
got root
trying to read file: index.html
Extra Header: 25 bytes
Content-Type: text/html
sending 95 bytes header + 12387 bytes payload : 12482
1

poll() problem on client socket: No such file or directory

Meanwhile I get a kernel message :

[ 2038.025046] TCP: request_sock_TCP: Possible SYN flooding on port 60075. Sending cookies.  Check SNMP counters.

OTOH when I change the URL to an invalid one, the server closes the connection itself and it works.

root@pi:/home/pi# echo "GET /plop" | telnet 127.0.0.1 60075
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
Connection closed by foreign host.

I can thus check the server with this command:

( wget 127.0.0.1:60075/plop -O - 2>&1 | grep 404 ) && echo OK

Or better :

wget -q '127.0.0.1:60075/?' -O - | grep HTTaP

However the poll() failure behaviour is inappropriate and it is now fixed with HTTaP_src.20210106.tgz
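
The exact fix is in the archive. As an illustration only, here is a sketch of one way to treat a poll() problem on the client socket: the client is dropped and the server keeps running. `check_client`, `drop_client` and the `client_state` values are hypothetical names, not the server's API:

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

enum client_state { CLIENT_IDLE, CLIENT_READABLE, CLIENT_GONE };

/* Hypothetical helper: forget the client instead of exiting the program. */
static enum client_state drop_client(int *client_fd) {
  close(*client_fd);
  *client_fd = -1;
  return CLIENT_GONE;
}

/* Checks one client socket, waiting at most timeout_ms milliseconds. */
enum client_state check_client(int *client_fd, int timeout_ms) {
  struct pollfd pfd = { .fd = *client_fd, .events = POLLIN, .revents = 0 };
  int r;
  do {
    r = poll(&pfd, 1, timeout_ms);
  } while (r < 0 && errno == EINTR); /* interrupted by a signal: retry */

  if (r < 0) return drop_client(client_fd);
  if (pfd.revents & (POLLERR | POLLNVAL)) return drop_client(client_fd);
  if (pfd.revents & POLLIN) return CLIENT_READABLE; /* data, or EOF to read */
  if (pfd.revents & POLLHUP) return drop_client(client_fd);
  return CLIENT_IDLE;
}
```

With this structure, a peer that disappears right after the reply only costs one `close()`, and the listening socket is untouched.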

72 !
Yann Guidon / YGDES • 05/16/2020 at 00:15 • 0 comments
I created a little test in the sandbox page...

This result is encouraging !

It means that the rate of ping-pong between the HTTaP server and the browser can reach 72 per second.

Of course I cheated :
- There was no real workload, I just requested /?ping and didn't even bother checking the reply.
- The test is on the same computer, on a multi-processor system, so there is no network latency at all.
But it's always good to check the higher bound, right ? It confirms that it's possible to create useful interactive systems despite the serialised link.

Note however some of the untold features :
- The test can run without interference from other clients, even in another tab.
- The test can run along with other operations (such as the PING button) thanks to the queueing semaphore.
Not bad...

As usual : check the latest version in the files section.
The rate drops to 25/s on a DSL line with a ping time of 20ms so this is coherent...
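
A quick way to check that coherence, assuming that the ping-pongs are strictly serialised and that each one costs the local processing time plus one network round trip. `expected_rate` is a hypothetical helper written for this estimate:

```c
/* Estimated number of ping-pongs per second when every exchange costs
   the local turnaround time plus the network round trip (ping). */
double expected_rate(double local_rate_per_s, double ping_ms) {
  double local_ms = 1000.0 / local_rate_per_s; /* time per local exchange */
  return 1000.0 / (local_ms + ping_ms);
}
```

With 72 exchanges per second locally (about 13.9 ms each) and a 20 ms ping, `expected_rate(72, 20)` gives roughly 29.5 per second, which is in the same range as the 25/s that was measured.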
And now, JavaScript !
Yann Guidon / YGDES • 05/09/2020 at 00:34 • 0 comments

The latest revision is working, at least from the C side.

Now is the time to evolve the client side, with the creation of a JavaScript client.

So far the keepalive mechanism is working well but there is just one issue : the page can be opened in two tabs of the same web browser...

I need to find a trick to prevent this !

20200511:
The root HTTaP object includes a new element : "HTTaP_open" is 0 when /? is first accessed, and 1 subsequently.

It is up to the client to detect this situation and avoid loading anything else to prevent disruption of the already established connection.
20200514:
I have also solved the problem of dangling/open connections when the page is closed or reloaded !
v20200504
Yann Guidon / YGDES • 05/04/2020 at 01:02 • 0 comments
The new version is here ! HTTaP_src.20200504.tgz
I start to remove the s(n)printf calls that look like more trouble than they're worth. Convenience vs safety, right ? I still have to convert some places to the new system that gives me more control, and fewer sources of potential errors (or injections).
I also implement the new /?ping definition to help the client manage its own timers.
I still have the plans in sight, to demonstrate the "flexible" program (to dynamically switch from polled to blocking mode) but this required some tweaking here and there... At least now I have a new API to expose to the user, if/when there is a suitable #define : the default is below.
```
// define our own HTTaP parser :
#define HTTAP_PARSE my_HTTaP_parser

int my_HTTaP_parser(
    char *request, // pointer to the request (receive buffer)
    int recv_len,  // byte count of the request
    int ReqType,   // 1 for GET, 2 for POST
    char *b,       // pointer to the send buffer
    int *len) {    // length of the send buffer

  return 0;
} 
```
You can now define your own keywords without touching the other files :-)
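
For illustration, here is a sketch of a parser that recognises one keyword. It keeps the signature shown above, but several points are assumptions: the `/?hello` keyword is made up, a non-zero return value is taken to mean "request handled", `*len` is taken to receive the number of bytes written, and the send buffer is taken to be large enough for the short reply:

```c
#include <string.h>

/* Hypothetical keyword handler: answers "GET /?hello" with a short text. */
int my_HTTaP_parser(
    char *request, /* pointer to the request (receive buffer) */
    int recv_len,  /* byte count of the request */
    int ReqType,   /* 1 for GET, 2 for POST */
    char *b,       /* pointer to the send buffer */
    int *len) {    /* length of the send buffer */
  static const char key[]   = "GET /?hello ";
  static const char reply[] = "Hello from the host program\n";
  const int key_len   = (int)sizeof key - 1;
  const int reply_len = (int)sizeof reply - 1;

  if (ReqType != 1) return 0;       /* only GET is handled here */
  if (recv_len < key_len) return 0; /* too short to contain the keyword */
  if (memcmp(request, key, (size_t)key_len) != 0) return 0;

  memcpy(b, reply, (size_t)reply_len); /* assumed: buffer is big enough */
  *len = reply_len;                    /* assumed: bytes written */
  return 1;                            /* assumed: non-zero = handled */
}
```

Checking `recv_len` before the comparison avoids reading past the end of a truncated request.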
More bugs
Yann Guidon / YGDES • 05/03/2020 at 16:49 • 0 comments

The last iteration of the rewrite uncovered a dirty, ugly bug that should remain secret, unless you dare to diff the recent versions. But now it's fixed and the system is more stable. I didn't even encounter the problem in real life but it sure would have been easy to understand because all system calls have explicit error messages.
TL;DR : don't use versions of the code before HTTaP_src.20200502.2.tgz !
Now I start to work on a second example code that switches dynamically between polling mode and blocking mode, under control of both the workload and the user. This implies giving orders in HTTaP form from the user interface, hence more JS and C code.
The C code is not really developed on the HTTaP side, I have something ambitious in mind but I prefer to develop it later. But I need HTTaP now. So I have found a solution : split the HTTaP request parsing code away from the server. This will be a separate file. And I can provide a preliminary/quick-and-dirty version for the iteration that I start now, through a HTTaP_parse() function.
version 2020/05/01
Yann Guidon / YGDES • 05/01/2020 at 20:59 • 0 comments
I uploaded a newer version that seems to solve some issues I discovered these last days. I found a condition that triggered a double close, for example, and a more worrying endless loop when the peer closes the socket before the server... POSIX sockets are really as painful as they seem !

I reworked some functions and it changed the behaviour of the browsers, I got the following effects :
Local/127.0.0.1:
- Firefox 57 desktop: works well. No extra socket is opened apparently now. Load score : 24/24 elements, favicon loaded.
- Chrome still has problems : load score = 17/23, 6 extra sockets opened close to the end, no favicon.
Remote (over the internet) :
- Firefox (desktop Linux as well as Android) : 51 extra connections, 19 elements loaded, no favicon...
- Chrome : 8 elements loaded, 17 extra sockets...
The local scores are those that matter because HTTaP is not intended to be used over the WWW (though of course it would be better if it could but then multi-threading would become necessary, and it's out of scope). Normally, HTTaP is used over a simple LAN.

The change is that serving the pages is done in priority, while checking the incoming connections has lost its precedence.

Another problem is when receiving zero-sized packets. TCP/IP seems to allow this, and it might help with sending keepalive packets. However the BSD sockets API signals the closure of a socket by a read of 0 bytes. It is not obvious/simple to verify this with a different method (I considered testing with poll() and checking for POLLOUT in the .revents field...) so I have chosen to simply leave this case alone, and close the socket.
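
The rule "a read of 0 bytes means closed" can be sketched as follows. `recv_or_close` is a hypothetical helper, and the sketch assumes that it is only called after poll() has reported the socket as readable, so any error is also treated as the end of the connection:

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Reads from the client socket.
   Returns the number of bytes received (> 0), 0 when the peer has closed
   the connection, or -1 on error. In the last two cases the socket is
   closed and *fd is set to -1. */
ssize_t recv_or_close(int *fd, char *buf, size_t size) {
  ssize_t n = recv(*fd, buf, size, 0);
  if (n <= 0) {   /* 0 = orderly shutdown by the peer, < 0 = error */
    close(*fd);
    *fd = -1;
  }
  return n;
}
```

Setting the descriptor to -1 makes a later double close visible immediately, since close(-1) fails without touching any other socket.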

Due to interferences with Facebook's URL hijacks, I added a special HTTaP 400 error message with a link to the main page.

For example http://httap.org:60075/?invalid will return an error 400 with the following clear text message:
```
HTTaP key not found.
main site
```
The above links work well with Chrome (to the extent that it displays the page). However Firefox will fire 10 sockets and send no data on the first one... and then that connection times out after all the others have been rejected. WTF ???

On the server side, there are two big things to code now :
- the HTTaP custom code / tree build system
- creating the example with both blocking and polling
The second part is partially written so it will be done first, but having the HTTaP keys build API would help too...

Then I'll start to write more JS to be included in HTML, to help with loading the extra external elements (a for-loop to sequentially load the images from reading the HTML source code)
New version : Firefox vs Chromium-browser
Yann Guidon / YGDES • 04/26/2020 at 19:15 • 0 comments
The latest version is out and solves quite some problems !

However things are a bit weird when it is used by "modern browsers". I have found no problem with links or lynx but these are text-based browsers that display only one page and don't load external resources, so they get one element and the connection times out... Since they don't really support JavaScript, they are not considered, anyway.

Firefox

This one is quite good but even if it is the recommended one, there are still wrinkles here and there.
- The good: It loads the test HTML page completely and completes more than a dozen of GET, up to loading the favicon.ico at the end, all with one socket. And it keeps the favicon in cache, so that part of the protocol is ok.
- The bad: it sends maybe 3 parallel requests after the first successful one. It also seems to limit the refresh rate for PING and the like (not more than 1 or 2 per second ?).
- The offense: I can't understand WHY it sends null-sized requests (which are so far welcome with a close) some time (a minute or less ?) after the page has finished loading. Is it a method to "keep alive" ? Fortunately it doesn't interfere with an open/working socket.
Conclusion: Apparently, Firefox is smart enough to see that if one connection/socket is slammed on its face, the pending requests can/should go to the other working socket(s). Simple HTTaP/? requests seem to work rather smoothly.

So at least there is something that works, even though some behaviours need more investigations.

Chrome/Chromium-browser

It can do simple things right but it tries so much to optimise things that it sometimes feels like a fanatic or a lunatic. Maybe it was really too focused on working with the websites of the Alphabet Group.

Let's start with an easy case : GET /?PING is fine, with the little detail that apparently, Cache-Control: max-age=200 is not understood. So WHAT is required by Chrome to keep that data in cache ? Anyway at least the function is performed (though at least one parallel connection is opened and slammed) but wait for the rest.

Let's try to load a web page with about a dozen of external links.

10 parallel sockets are opened and slammed, 5 resources (including index.html) are loaded, in a seemingly random order, probably because of the extra sockets that contain the requests. The main socket closes from timeout without getting the missing ten requests. What the... ???

Setting the HTTAP_TCP_BACKLOG #define to 0 or 1 makes the situation even worse, the page loads slooooowly and incompletely... Chrome won't get the clue !

There is still the possibility to limit the number of concurrent open sockets to a webserver but it requires a user manipulation on the browser and it's too specific and intrusive. Meanwhile HTTP/1.1 doesn't seem to specify anything in the headers to explicitly limit that number.

And then, there is the question : WHY does Firefox get the clue but Chrome won't ? I suspect it's because they use different queuing algorithms but this does not help me so far.

Remediation : this problem seems to affect webpages while the HTTaP parts don't seem to be affected.
- Keep pages small with few external elements.
- inline elements (using "data" URI in base64 encoding)
- use JS to serialise and download extra resources
I hope that I can solve this problem soon anyway...
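
For the inlining option, the payload of a "data" URI is plain base64. A minimal encoder sketch, with `base64_encode` as a hypothetical helper name (in practice the encoding can just as well be done once, offline, when the page is written):

```c
#include <stddef.h>

/* Encodes n bytes as base64 (standard alphabet, with '=' padding).
   out must hold at least 4*((n+2)/3)+1 bytes.
   Returns the length of the encoded string, without the final NUL. */
size_t base64_encode(const unsigned char *in, size_t n, char *out) {
  static const char tbl[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  size_t o = 0;
  for (size_t i = 0; i < n; i += 3) {
    /* pack up to 3 input bytes into 24 bits */
    unsigned long v = (unsigned long)in[i] << 16;
    if (i + 1 < n) v |= (unsigned long)in[i + 1] << 8;
    if (i + 2 < n) v |= in[i + 2];
    /* emit 4 characters of 6 bits each, padding the missing ones */
    out[o++] = tbl[(v >> 18) & 63];
    out[o++] = tbl[(v >> 12) & 63];
    out[o++] = (i + 1 < n) ? tbl[(v >> 6) & 63] : '=';
    out[o++] = (i + 2 < n) ? tbl[v & 63] : '=';
  }
  out[o] = '\0';
  return o;
}
```

The result is then placed after a prefix such as `data:image/png;base64,` in the `src` attribute, which removes one request (and one potential parallel socket) per inlined element.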
Edit :
I just found why FF sends empty packets !!!
It happens when you hover the cursor over a local link...
Firefox speculates that you will click on it and prepares the connection !
A much better timing system
Yann Guidon / YGDES • 04/24/2020 at 05:38 • 0 comments
The project has hit a wall when I found that more than one socket needed to be read, because browsers tend to get webpages from multiple connections.
TL;DR : I have re-designed the system around poll() and the timeouts and other peripheral details are now handled in a wrapper function.
The latest archive doesn't contain a working system but the foundations seem to work well. I still need to modify the server but the surrounding code is pretty good now, and the conversion will be easy.
The new code exposes a number of global variables and functions :
- timeout_counter (flag that signals the expiration of a timeslice in polling mode)
- abort_program (flag that is set when the program must quit)
- poll_HTTaP() (the HTTaP code that polls and serves files)
- init_HTTaP() (run before the main loop)
- HTTaP_mode_block() (signals that the host program has finished its heavy lifting)
- HTTaP_mode_poll() (signals that the host program has work to do and needs the CPU for itself)
Overall it looks like collaborative multitasking, with timeout management sprinkled all over the code.
The constraint is that each slice of work must be reasonably short (100ms ?) to keep the system responsive.
- When all the work is done, call HTTaP_mode_block() to go into blocking mode and save/yield CPU while waiting for the next command.
- When a new command is received that requests more work, call HTTaP_mode_poll()
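
The two rules above can be sketched as one step of the host program's main loop. The function names come from the list above, but their bodies here are placeholder stand-ins so that the sketch compiles on its own, their signatures are assumed, and `start_job`, `do_one_slice` and `main_loop_step` are hypothetical:

```c
/* Stand-ins for the HTTaP functions and flags (assumed signatures). */
static int blocking_mode = 1;  /* 1 = wait for requests, 0 = polled */
static int abort_program = 0;  /* set when the program must quit */
static void HTTaP_mode_block(void) { blocking_mode = 1; }
static void HTTaP_mode_poll(void)  { blocking_mode = 0; }
static void poll_HTTaP(void) { /* the real one serves files and requests */ }

static int pending_slices = 0; /* hypothetical amount of work left */

/* One short slice of work (well under 100 ms in a real program). */
static void do_one_slice(void) { pending_slices--; }

/* Hypothetical: called when a received command requests more work. */
void start_job(int slices) {
  pending_slices += slices;
  HTTaP_mode_poll();           /* the host needs the CPU: stop blocking */
}

/* One iteration of the main loop; returns 0 when the program must stop. */
int main_loop_step(void) {
  if (pending_slices > 0) {
    do_one_slice();
    if (pending_slices == 0)
      HTTaP_mode_block();      /* all done: yield the CPU while waiting */
  }
  poll_HTTaP();                /* give the server its turn */
  return !abort_program;
}
```

The host program would call `init_HTTaP()` once, then loop with `while (main_loop_step()) {}`.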
Behind the scene, it's not trivial but it seems to work well. The functions are rather well layered and the timeouts seem to do their work.
I have so far provided a "dumb" application that reads orders from the command line and wastes time writing to the screen, it will soon include the real socket server instead.
The cool part is that the host program has very few constraints and it should work nicely with GHDL !


Step 1

Create your user program. Include HTTaP.h at the beginning : since it contains many inclusions, you don't need to include them as well.

Like an Arduino sketch, it must contain an infinite loop. This loop must run at least once per second. Inside the loop, call the server's routine.

Here is an example :

#include "HTTaP.h"

int main(int argc, char *argv[]) {
  const char spinner[4]="-\\|/";
  int phase=0;
  int timeout=0; // flag for the server, see the note below
  while (1) {
    // do something, dance, sing, whatever,
    // but set timeout to 1 once in a while.
    printf("%c%c", spinner[(phase++)&3],0xd);
    fflush(NULL);
    // run the server:
    HTTaP_server(timeout);
    // assumption: clear the flag once the server has seen it
    timeout=0;
  }
}

Note : the server must be called with an integer argument that is set to 1 once every second, so that it can increment its internal counter.
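
One simple way to produce that flag is to compare the wall clock between two iterations of the loop. `second_elapsed` is a hypothetical helper, not part of HTTaP.h:

```c
#include <time.h>

/* Returns 1 the first time it is called after the clock has moved to a
   new second, 0 otherwise. *last keeps the last second that was seen. */
int second_elapsed(time_t now, time_t *last) {
  if (now != *last) {
    *last = now;
    return 1;
  }
  return 0;
}
```

In the loop above, this would be used as `timeout = second_elapsed(time(NULL), &last);` just before the call to the server, with `last` initialised to `time(NULL)` before the loop. This only works if the loop really runs at least once per second.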

Step 2

The server is a FSM (Finite State Machine) that contains all the necessary initialisation code. You don't even have to call an "init()" function, the FSM takes care of it.

However if you use the HTTaP protocol, you must modify the "request method" parsing routine and add your own "words", "commands" and their effects.

The code in question is there :

    case ETAT2_attente_donnee_entrante :
      /* receive the request */
      .....
      /* analyse the request */
      if ((recv_len >= 7) && (strncmp("GET /1 ", buffer, 7) == 0)) {
         // do something here because we received the URI "/1"
      }

Be careful with the reply buffer (the pointer b) : don't forget to update it, to avoid sending garbage.
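
A sketch of both operations, matching a request and appending to the reply while keeping the pointer up to date. `request_is_get` and `reply_append` are hypothetical helpers, and the explicit `end` pointer is an assumption about how the size of the buffer is known:

```c
#include <string.h>

/* Returns 1 if the request starts with "GET <uri> ", 0 otherwise. */
int request_is_get(const char *buffer, int recv_len, const char *uri) {
  size_t ulen = strlen(uri);
  size_t need = 4 + ulen + 1;  /* "GET " + uri + trailing space */
  if (recv_len < 0 || (size_t)recv_len < need) return 0;
  return memcmp(buffer, "GET ", 4) == 0
      && memcmp(buffer + 4, uri, ulen) == 0
      && buffer[4 + ulen] == ' ';
}

/* Appends text at *b and advances *b past it.
   Returns the number of bytes appended, or 0 if there is not enough room
   before end (in which case nothing is written). */
size_t reply_append(char **b, const char *end, const char *text) {
  size_t n = strlen(text);
  if ((size_t)(end - *b) < n) return 0;
  memcpy(*b, text, n);
  *b += n;                     /* the update that must not be forgotten */
  return n;
}
```

The number of bytes to send is then the difference between the updated pointer and the start of the buffer, so nothing past the last append is transmitted.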

Step 3

Update all the #defines you can. For example, which features are allowed (is it blocking, polled or both? is file-serving enabled ?)

There are several predefined values in HTTaP.h

HTTaP_src.20210106.tgz (x-compressed-tar, 51.47 kB, 01/06/2021 at 21:28) : fixed poll() race condition that would exit the server, see https://hackaday.io/project/20042/log/187866
HTTaP_src.20200515.tgz (x-compressed-tar, 51.37 kB, 05/16/2020 at 00:19) : speed test
HTTaP_src.20200514.tgz (x-compressed-tar, 51.00 kB, 05/15/2020 at 03:30) : Better graceful close, user call wrapper !
HTTaP_src.20200513.tgz (x-compressed-tar, 50.69 kB, 05/13/2020 at 03:03) : more JS wizardry, and graceful close of the page, among other things !
HTTaP_src.20200511.tgz (x-compressed-tar, 49.86 kB, 05/11/2020 at 01:01) : JS client sustains an exclusive connection and other later connections fail as they should. Progressive load of extra images is ok.
