HTTaP

Test Access Port over HTTP

HTTaP is a sub-protocol of HTTP/1.1 designed to access hardware resources (and more) with a browser-friendly interface. It's a good bet for your IoT project when Apache+CGI doesn't fit.

It was initially designed to provide a connection (over a trusted link) to a device (either hardware or software, real, emulated or virtual), on the same computer, or next to it, or on the other side of the planet.

Contrary to other ad hoc protocols or WebSockets, HTTaP can work directly with a HTML/JavaScript page, using only plain GET and POST messages (unlike other lower-level protocols that require system programming). This enables rich and portable interfaces that work on most browser-enabled devices.

HTTaP is designed for use in a lab, in a controlled environment with no outside connection. Safety, scheduling, encryption and authentication are not part of this protocol. Tunnelling over TLS (with OpenSSL for example, instead of raw TCP) might solve this.

HTTaP was first published in the French GNU/Linux Magazine n° 173 (July 2014) "HTTaP : Un protocole de contrôle basé sur HTTP" as a simpler alternative to WebSockets.

The project #micro HTTP server in C is designed to implement this protocol. This is where you'll find the discussions of the low-level details.

This project documents the protocol itself, its definitions and evolutions, to help other clients and servers interoperate.


HTTaP could be described as an attempt to formalise requests and replies between a HTTP-capable client and a HTTaP server, as well as all the surrounding parameters.

Think of HTTaP as a WebAPI for hardware and logic circuits.

For example, it can embed/encapsulate SCPI commands over Ethernet or Wi-Fi instead of RS232 or USB. No need to install stupid Windows drivers or lousy (binary, non-free and obfuscated) applications!

The client is usually a web browser running JavaScript code to perform high-level work. The code can come from the HTTaP server or any other source such as the local filesystem, Internet... One client can talk simultaneously to different servers but one server (at a given pair of TCP/IP address and port) can serve only one client at a time, to prevent race conditions.

HTTaP messages are very simple: just GET or POST values to certain places, using JSON notation. This is intentionally simple but limited, so actual work is achieved through convoluted sequences of small atomic messages.
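
As a rough illustration of how small such a message can be, here is a minimal sketch that builds the raw bytes of a read and of a write request. It assumes the GET and POST methods listed in the Overview log and the /?REG/R1 address from the Vocabulary log. The function name, the host value and the choice of sending the written value as a bare JSON body are assumptions made for this example, not part of the published protocol.

```python
import json

def build_request(method, uri, host, value=None):
    """Return the raw bytes of a minimal HTTP/1.1 request (hypothetical helper)."""
    if method not in ("GET", "POST"):
        raise ValueError("only GET and POST are used in this sketch")
    # Assumption: a written value travels as a JSON-encoded request body.
    body = b"" if value is None else json.dumps(value).encode("utf-8")
    lines = [f"{method} {uri} HTTP/1.1", f"Host: {host}"]
    if method == "POST":
        lines.append("Content-Type: application/json")
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body

# Read register R1, then write the value 42 to it.
read_r1 = build_request("GET", "/?REG/R1", "localhost")
write_r1 = build_request("POST", "/?REG/R1", "localhost", 42)
```

Each request fits in well under a hundred bytes, which is why the round-trip time matters more than the bandwidth.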

Standard addresses are well-known points that provide enough information to discover/explore the system, its hierarchy and capabilities, through individual client requests.

This is why the server needs to exchange many small packets "in order", in lock-step sequences, and fast. To that end the HTTaP server disables TCP's Nagle algorithm, which otherwise delays small packets in order to coalesce them (this saves about 400 ms per round-trip).
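
On most TCP stacks, Nagle's algorithm is switched off per socket with the standard TCP_NODELAY option. The sketch below shows the call with Python's socket module. The helper name is made up for this example, and the actual HTTaP server is written in C, where the equivalent call is setsockopt() with the same option.

```python
import socket

def make_low_latency_socket():
    """Create a TCP socket with Nagle's algorithm disabled (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY = 1 sends small segments immediately instead of coalescing them.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock
```

The same option can be set on the client side when the client is a script rather than a browser.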

The user is normally directly connected to the server, so the latency is usually low and HTTaP doesn't implement elaborate bandwidth-saving algorithms. Real-time latency matters more because the client usually drives a GUI.


Logs:
1. Overview
2. Compression
3. Loopback server
4. Vocabulary
5. HTTaP root object
6. Security and HTTP protocol

  • Security and HTTP protocol

    Yann Guidon / YGDES, 05/08/2017 at 11:29

    I have already addressed security in a different post but on a different, related project: Security and sandboxing

    Today I address "antipatterns" as described in https://blog.cloudflare.com/iot-security-anti-patterns/


    Let's review the 4 points that are raised :

    HTTP Pub/Sub

    There is no such thing in the basic HTTaP protocol. There is no redirection or even mention of a third-party URL in the server because the whole thing is meant to be self-contained and autonomous.

    DoS is prevented through several passive means:

    • Only one client can be connected at a time
    • The connection protocol requires exchange of several messages, ensuring that there is no (basic) IP spoofing
    • All resources/services that could allow arbitrary "traffic amplification" must be unlocked by a small "are you a fast real-time computer" challenge (to prevent traffic replay)

    IoT Device as TLS Server

    Encryption is a difficult thing to do, particularly for this class of devices where most corners are cut.

    Encryption is not required for now and I don't think I'll use a library, at least for the final version. This is a server-side requirement, since most HTTP clients transparently manage HTTPS.

    The suggestion to use a 3rd party server for authentication actually helps a lot, to separate the authentication nightmare from the protocol itself. The HTTaP server can inquire or refresh a key once it starts, which helps a lot when several HTTaP servers are running in parallel (easier and dynamic key management, no more one-configuration-file-per-server nightmare).

    But that's for a v2 of the protocol.

    Unencrypted Bootloader

    This is out of the scope of the protocol.

    Database-as-IPC

    The HTTaP server can be seen as a sort of database, in a way... but the protocol itself shields against backend implementation variations and their effects.


    Note: for security over unreliable networks, whitequark suggests using an nginx layer in this minimalist Python server. I'm not fluent with cryptography and security protocols so I can't devise the best approach...

  • HTTaP root object

    Yann Guidon / YGDES, 03/31/2017 at 09:29

    The first thing that a HTTaP client does, upon first connection to a server, is to check its configuration and characteristics. They are provided by a JSON object with these properties:

    • HTTaP_version : returns an integer that describes the date in YYYYMMDD format (when displayed in decimal). Reading this property indicates that the server is HTTaP-compatible so this property is required.
    • Type : returns a string that describes the server.
    • ID : returns a string that describes the name or serial number of the server.
    • Services : lists the available features that the server implements. For now it's a string but will become an object for a better (and hierarchical) description. Available services : Loopback, Files, ...
    • Signals : lists the available signals that can be queried. Each can be a complex object.
    • SessionID : integer number, generated randomly at each new connection. Used as a token for the persistent connection. For example, helps the client detect that the connection was interrupted.

    More will appear as the protocol grows...
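
    A client-side check of the root object could look like the sketch below. The property names come from the list above. The sample values, the function names and the added version_date field are invented for illustration only.

```python
import json
from datetime import date

# Hypothetical reply to "GET /?"; every value here is made up for the example.
SAMPLE_ROOT = """{
  "HTTaP_version": 20170331,
  "Type": "demo board",
  "ID": "SN-0001",
  "Services": "Loopback, Files",
  "Signals": {},
  "SessionID": 123456
}"""

def parse_root(text):
    """Decode the root object and check the one required property."""
    root = json.loads(text)
    if "HTTaP_version" not in root:
        raise ValueError("not an HTTaP server: HTTaP_version is missing")
    # The version is an integer that reads as YYYYMMDD in decimal.
    year, rest = divmod(int(root["HTTaP_version"]), 10000)
    month, day = divmod(rest, 100)
    root["version_date"] = date(year, month, day)
    return root

def session_changed(old_root, new_root):
    """A different SessionID suggests that the connection was interrupted."""
    return old_root.get("SessionID") != new_root.get("SessionID")

root = parse_root(SAMPLE_ROOT)
```

    Comparing the SessionID of a fresh root object with the one stored at connection time is one simple way for a client to notice that it has been reconnected.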

  • Vocabulary

    Yann Guidon / YGDES, 03/28/2017 at 00:57

    This page contains a draft for a uniform/universal URI layout and "vocabulary" that all applications/clients can use.

    HTTaP defines two "domains" :

    • The static domain contains files that do not change. They usually contain all the HTML/JS interface and all the support files. They are cacheable.
    • The dynamic domain is introduced with a « URI with query component(s) », starting with "/?", to prevent them from being stored in cache.

    The dynamic domain is split further:

    • lowercase names are usually reserved for standard HTTaP features (such as listings or loopback)
    • uppercase names describe application-specific resources, signals, like memory, registers, sensors...

    Signals use the object notation (with a dot) to represent hierarchy. As a rule of thumb, if it uses a copy of the circuit or an instance of the code, it will be dot-represented. They will be treated similarly by the handler of the server, for example memory spaces...

    URI             Comments                                                                     GET/POST
    /path/filename  Static file to serve.                                                        *
    /?              Root of HTTaP, serves a JSON-formatted list of valid dynamic resource names  *
    /?loopback      Loopback for JS save/restore (optional)                                      *
    /?list          Return the list of the signals and their hierarchy                           *
    /?changes       Same as /?list but only includes signals that have changed                   *
    /?REG           Read the value of all the registers                                          *
    /?REG/R1        Read the value of register R1                                                *
    /?REG/R1,R2     Read the values of registers R1 and R2 (optional?)                           *
    /?MEM           Dump the contents of the memory                                              *
    /?MEM/123-456   Dump the contents of the memory from locations 123 to 456                    *
    /?IO.SPI.1/     Read the full status of the first SPI port                                   *

    I'm not sure about the representation of ranges but they are pretty useful to reduce the bandwidth and CPU load.
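
    To make the draft layout more concrete, here is a sketch of how a handler might split a dynamic URI into its parts: the dot-separated hierarchy, the lowercase/uppercase distinction, and the comma-separated items or numeric ranges. The function name and the returned dictionary are assumptions, and reading range bounds as decimal integers is a guess, since the range notation is still undecided.

```python
def parse_dynamic_uri(uri):
    """Split a URI such as /?MEM/123-456 into hierarchy and items (hypothetical helper)."""
    if not uri.startswith("/?"):
        raise ValueError("not a dynamic HTTaP URI")
    name, _, selector = uri[2:].partition("/")
    items = []
    for part in filter(None, selector.split(",")):
        first, dash, last = part.partition("-")
        if dash and first.isdigit() and last.isdigit():
            # Assumption: range bounds are decimal addresses.
            items.append((int(first), int(last)))
        else:
            items.append(part)
    return {
        "path": name.split(".") if name else [],  # dot notation = hierarchy
        "standard": name.islower(),               # lowercase = standard HTTaP feature
        "items": items,
    }
```

    For instance, /?REG/R1,R2 yields the path ["REG"] with the items ["R1", "R2"], while /?IO.SPI.1/ yields the path ["IO", "SPI", "1"] with no items.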

    Memory ranges should use the #HYX file format to save bandwidth, compared to plain JSON syntax.

  • Loopback server

    Yann Guidon / YGDES, 03/28/2017 at 00:18

    HTTaP is meant to help integrate #YGWM with a web-based system, and there is more to it than serving files or sending commands to TCP widgets.

    Any editor needs to read and write files from the local disk/storage and the chosen language/framework (HTML/JS) does not allow that. The web browser prevents scripts from doing it, for obvious security reasons !

    Yet there is a solution. Or more precisely, it's a hack ! I have described it in an article in French:

    "Accéder aux fichiers en JavaScript (ou le Cross-Site Scripting utile)" in GLMF#105 pp.42-54

    This describes a dirty trick, using a PHP script along with a specially crafted HTML/JS page :

    • To read a local file into the JS framework, tell the browser that you want to upload said file to the server. The JS part cannot access the file contents at all, but the server will reply with a JS-formatted string that is the exact copy of the file. Bazinga!
    • To write a file to the local file system, the JS script sends the properly formatted data as a POST form. The server then transcodes the data and replies with it advertised as a binary blob of unknown type, which the browser will understand as a file to save. Bingo!

    Note : these manipulations usually require user interaction and are not inherently more unsafe than usual methods.
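
    The two server replies described above can be sketched as follows. This is not the original PHP script: the function names, the callback name and the default file name are invented, and using Content-Disposition to trigger the browser's "save" dialog is an assumption about how the "binary blob of unknown type" is advertised.

```python
import json

def load_reply(file_text, callback="fileLoaded"):
    """Echo an uploaded file back as a JS string passed to a (hypothetical) callback."""
    # json.dumps produces a quoted, escaped literal that is also valid JavaScript.
    return f"{callback}({json.dumps(file_text)});"

def save_response(data, filename="download.bin"):
    """Echo posted data back as an attachment so that the browser offers to save it."""
    # Assumption: the file name contains no quotes or control characters.
    head = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: application/octet-stream\r\n"
        f'Content-Disposition: attachment; filename="{filename}"\r\n'
        f"Content-Length: {len(data)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + data
```

    In both directions the server only reflects data back to the client that sent it, which is where the name "loopback" comes from.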

    The PHP script was a pain to write and I'm glad to have a totally controlled environment (the HTTaP server) where I can process the data without layers of gotchas and poisonous sugaring...

    The JS framework has evolved too and "binary blobs" now solve many of the encoding problems I have !

    The "Loopback" feature should be a standard, user-configurable, option in the HTTaP protocol, with its own access key.


    20170429:

    Now, it seems that interactive websites use a technique similar to the loopback server. An example is the circuit simulator at http://www.falstad.com/circuit/circuitjs.html

    I recently spotted an addition to the HTML5 standard at https://developer.mozilla.org/en-US/docs/Using_files_from_web_applications which looks promising but it doesn't seem to write files, and support is yet untested and unknown. I hope that my system works with HTML4 clients.

  • Compression

    Yann Guidon / YGDES, 03/27/2017 at 23:50

    HTTaP should not be used in public-facing networks, where bandwidth is a concern (latency often is more important). Compression is not a priority but it IS possible to implement it.

    The inherent cost is that the server needs to parse the request headers and find the line that declares that compressed files are allowed. It's possible but this uselessly increases the coding effort...


    20170429:

    A couple of interesting things to notice.

    First, compression is interesting for large files, not the small requests and answers. This happens in two cases:

    • For large HTML or JS files that are served to the browser. Images are typically already compressed. Selected files can be pre-compressed and served if the client supports the chosen algorithm (usually gzip/deflate). But don't forget that proper minification and cleanup already contribute to smaller files and shorter downloads. Removing the tabs, whitespace and comments shrinks data by 20 to 60% (depending on the source code's style).
    • Large data chunks, such as memory dumps, are expected to be exchanged in both directions and can't be pre-compressed. One easy way to reduce the chunk sizes is to use raw binary encoding instead of ASCII-encoded strings. Another is to use the #HYX file format that supports repetitions of the last character, as well as address ranges. It's not as good as proper compression but is easily supported in C, JavaScript, bash...

    Second, the client is not expected to change its capabilities during a TCP session. This means that the headers can be parsed just once, when the server receives the first request. The next requests can just skip the headers.
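
    A minimal sketch of that idea: scan the headers of the first request for Accept-Encoding, remember the answer for the rest of the session, and serve the pre-compressed copy only when it is allowed and actually smaller. The function names are made up for this example, and the q-value handling is deliberately simplistic.

```python
import gzip

def client_accepts_gzip(header_lines):
    """Look for gzip in the Accept-Encoding header of the first request."""
    for line in header_lines:
        name, _, value = line.partition(":")
        if name.strip().lower() != "accept-encoding":
            continue
        for token in value.split(","):
            coding, _, params = token.partition(";")
            if coding.strip().lower() != "gzip":
                continue
            params = params.strip().lower()
            if params.startswith("q="):
                try:
                    return float(params[2:]) > 0  # q=0 means "not acceptable"
                except ValueError:
                    return False
            return True
    return False

def pick_body(plain, gzipped, accepts_gzip):
    """Return (body, content_encoding) for a static file."""
    if accepts_gzip and gzipped is not None and len(gzipped) < len(plain):
        return gzipped, "gzip"
    return plain, None

# Parsed once per TCP session, then reused for every following request.
accepts = client_accepts_gzip(["Host: localhost", "Accept-Encoding: gzip, deflate"])
page = b"<p>hello</p>" * 100
body, encoding = pick_body(page, gzip.compress(page), accepts)
```

    When the compressed copy is chosen, the reply also needs a "Content-Encoding: gzip" header so that the browser knows to decompress it.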

  • Overview

    Yann Guidon / YGDES, 03/21/2017 at 21:42

    HTTaP uses a reduced subset of HTTP, keeping only a few essential features.

    • Any HTTP compliant server must be able to understand HTTaP requests even though it can't fulfill all the requests (at worst, it replies with a 404 status)
    • Any HTTP client can send a HTTaP compliant request with minimal effort. Normally, the HTTaP client is a classic web browser but wget or curl must work too.

    HTTaP servers work mostly like classic HTTP servers but differ in a few ways, such as

    • resource reference (naming conventions)
    • caching
    • no cookies
    • timeout
    • persistent connections
    • serialisation (no simultaneous/multiple accesses)
    • headers

    These implementation choices come from constraints in size, speed and complexity: HTTaP must run in "barebone systems" with limited code and data memory, reduced CPU resources and lax security.


    Low-level development and support of HTTaP happen on the server side, because all clients are assumed to be HTTP-compliant already. High-level development (the application's intelligence) happens on the client side, which assembles the requests and interprets the responses in JavaScript or any other capable dynamic language (Python is quite popular, for example, and a browser is not required).

    The HTTaP server must be as lean and simple as possible.

    • One source of complexity is removed by not interpreting the client's request headers. Actually, none of these headers are relevant to most of HTTaP's use cases (except compression). This cuts a lot of work but also means there is no support for cookies. Standard authentication is impossible so HTTaP is unsecured. Any client can connect and use the resources at will. Use HTTaP only on air-gapped networks.
    • Another source of complexity comes from the HTTP "vocabulary" : the only supported methods are GET, POST and HEAD.
      * GET reads resources (files or dynamic variables) like any usual request.
      * POST writes these variables (file upload is only an option)
      * HEAD is a requirement of HTTP/1.1 and only minimal support is provided (because it is barely used in local links, since there is no proxy).
    • The server is single-threaded and serves only one client at a time
      * This ensures by design that there can be no race condition.
      * The server is typically used by only one client at a time anyway.
      * This reduces code complexity and timing issues
      * Raw performance and throughput are not critical, since the client and server are usually located next to each other and the server is minimalist, reducing processing and transfer overhead.

    A HTTaP server typically provides two separate domains:

    1. a static files server (a very basic HTTP server)
    2. a dynamic server, like an embedded CGI inside the server program.

    The URL defines which domain is accessed with a very simple method: static files use standard URLs while dynamic ones have the "?" character right after the leading "/".

    The question mark is a common indicator and a good heuristic for dynamic contents, and it should not be tampered with by any proxies along the way.

    1. When the requested URI starts with "/?" then the dynamic mode is selected and an embedded program parses the URI.
    2. Otherwise, this is a standard file, with a direct mapping to the file system (often a sub-directory). There is no support of automatic index.html generation or "open directories".
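
    The two rules above can be sketched as a small dispatch function. The function name, the returned tuples and the /srv/httap document root are placeholders, and the path normalisation step is an extra precaution added for this example rather than something the protocol specifies.

```python
import posixpath

def route(uri, doc_root="/srv/httap"):
    """Decide between the dynamic and the static domain (hypothetical helper)."""
    if uri.startswith("/?"):
        # Dynamic mode: hand the rest of the URI to the embedded parser.
        return "dynamic", uri[2:]
    if not uri.startswith("/") or uri.endswith("/"):
        # No automatic index and no open directories.
        return "error", None
    # Collapse "." and ".." so that the result stays below the document root.
    clean = posixpath.normpath(uri)
    return "static", doc_root + clean
```

    With this sketch, route("/?REG/R1") selects the dynamic domain with "REG/R1" left to parse, while route("/index.html") maps to /srv/httap/index.html.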

    No access control is provided for the static files, which usually contain the HTML/JS web application and all the required supporting files. Access rights must be correctly set on the filesystem by the developer to prevent 403 errors or unwanted access to unrelated files.


Discussions

Danielchristan wrote 04/01/2017 at 12:19

Nice idea bro!!


Yann Guidon / YGDES wrote 04/01/2017 at 18:32

The needs leads to the deeds ;-)

