05/08/2017 at 11:29 •
I have already addressed security in another post, written for a different but related project: Security and sandboxing
Today I address "antipatterns" as described in https://blog.cloudflare.com/iot-security-anti-patterns/
Let's review the 4 points that are raised :
There is no such thing in the basic HTTaP protocol. There is no redirection or even mention of a third-party URL in the server because the whole thing is meant to be self-contained and autonomous.
DOS is prevented through several passive means:
- Only one client can be connected at a time
- The connection protocol requires exchange of several messages, ensuring that there is no (basic) IP spoofing
- All resources/services that could allow arbitrary "traffic amplification" must be unlocked by a small "are you a fast real-time computer" challenge (to prevent traffic replay)
IoT Device as TLS Server
Encryption is a difficult thing to do, particularly for this class of devices where most corners are cut.
Encryption is not required for now and I don't think I'll use a library, at least for the final version. This is a server-side requirement, since most HTTP clients transparently manage HTTPS.
The suggestion to use a 3rd party server for authentication actually helps a lot, to separate the authentication nightmare from the protocol itself. The HTTaP server can inquire or refresh a key once it starts, which helps a lot when several HTTaP servers are running in parallel (easier and dynamic key management, no more one-configuration-file-per-server nightmare).
But that's for a v2 of the protocol.
This is out of the scope of the protocol.
The HTTaP server can be seen as a sort of database, in a way... but the protocol itself shields against backend implementation variations and their effects.
Note : for security over unreliable networks, whitequark suggests using a nginx layer in this minimalist Python server. I'm not fluent with cryptography and security protocols so I can't devise the best approach...
03/31/2017 at 09:29 •
The first thing that a HTTaP client does, upon first connection to a server, is to check its configuration and characteristics. They are provided by a JSON object with these properties:
- HTTaP_version : returns an integer that describes the date in YYYYMMDD format (when displayed in decimal). Reading this property indicates that the server is HTTaP-compatible so this property is required.
- Type : returns a string that describes the server.
- ID : returns a string that describes the name or serial number of the server.
- Services : lists the available features that the server implements. For now it's a string but will become an object for a better (and hierarchical) description. Available services : Loopback, Files, ...
- Signals : lists the available signals that can be queried. Each can be a complex object.
- SessionID : integer number, generated randomly at each new connection. Used as a token for the persistent connection. For example, helps the client detect that the connection was interrupted.
More will appear as the protocol grows...
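As a rough illustration, here is how a client might check such an object. The property names come from the list above; the sample values, the shape of `Signals` and the helper names `check_config` and `session_changed` are assumptions, since the draft does not fix them.

```python
import json
from datetime import datetime

# Hypothetical reply from a server; every value here is made up.
SAMPLE = '''{
  "HTTaP_version": 20170331,
  "Type": "example simulator",
  "ID": "unit-0001",
  "Services": "Loopback, Files",
  "Signals": {"REG": ["R1", "R2"]},
  "SessionID": 123456789
}'''

def check_config(text):
    """Parse the configuration object and check the required property."""
    cfg = json.loads(text)
    version = cfg.get("HTTaP_version")
    if not isinstance(version, int):
        raise ValueError("not an HTTaP server: HTTaP_version is missing")
    # The integer reads as YYYYMMDD in decimal; strptime rejects invalid dates.
    datetime.strptime(str(version), "%Y%m%d")
    return cfg

def session_changed(previous_cfg, current_cfg):
    """A different SessionID suggests that the connection was re-established."""
    return previous_cfg["SessionID"] != current_cfg["SessionID"]

cfg = check_config(SAMPLE)
```

A client could keep the first object around and compare its `SessionID` with later ones to notice an interrupted connection.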
03/28/2017 at 00:57 •
This page contains a draft for a uniform/universal URI layout and "vocabulary" that all applications/clients can use.
HTTaP defines two "domains" :
- The static domain contains files that do not change. They usually contain all the HTML/JS interface and all the support files. They are cacheable.
- The dynamic domain is introduced with a « URI with query component(s) », starting with "/?", to prevent them from being stored in cache.
The dynamic domain is split further:
- lowercase names are usually reserved for standard HTTaP features (such as listings or loopback)
- uppercase names describe application-specific resources, signals, like memory, registers, sensors...
Signals use the object notation (with a dot) to represent hierarchy. As a rule of thumb, anything that is a copy of a circuit or an instance of the code is addressed with a dot. Such instances are treated the same way by the server's handler, for example memory spaces...
| URI | Comments | GET | POST |
|---|---|---|---|
| /path/filename | Static file to serve. | * | |
| /? | Root of HTTaP, serve a JSON-formatted list of the valid dynamic resource names | | |
| /?loopback | Loopback for JS save/restore (optional) | | |
| /?list | Return the list of the signals and their hierarchy | * | |
| /?changes | same as /?list but only include signals that have changed | * | |
| /?REG | read the value of all the registers | * | |
| /?REG/R1 | read the value of register R1 | * | |
| /?REG/R1,R2 | read the values of registers R1 and R2 (optional?) | * | |
| /?MEM | Dump the contents of the memory | * | |
| /?MEM/123-456 | Dump the contents of the memory from locations 123 to 456 | * | |
| /?IO.SPI.1/ | Read the full status of the first SPI port | | |
I'm not sure about the representation of ranges but they are pretty useful to reduce the bandwidth and CPU load.
Memory ranges should use the #HYX file format to save bandwidth, compared to plain JSON syntax.
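To make the layout more concrete, here is a minimal sketch of how a server could split such a URI. It is only one possible reading of the draft: the function name `parse_dynamic`, the returned dictionary and the choice of decimal addresses for ranges are assumptions, and the range syntax itself is not settled.

```python
def parse_dynamic(uri):
    """Split a dynamic URI such as '/?REG/R1,R2' into its parts."""
    if not uri.startswith("/?"):
        raise ValueError("not a dynamic URI")
    resource, _, selector = uri[2:].partition("/")
    request = {
        "resource": resource,
        # Dots express hierarchy, e.g. IO.SPI.1 -> ['IO', 'SPI', '1'].
        "levels": resource.split(".") if resource else [],
        # Lowercase: standard HTTaP feature; uppercase: application-specific.
        "standard": resource.islower(),
        "items": [],
        "range": None,
    }
    if "-" in selector:
        # Assumption: ranges are two decimal numbers separated by a dash.
        first, last = selector.split("-", 1)
        request["range"] = (int(first), int(last))
    elif selector:
        request["items"] = selector.split(",")
    return request
```

With this sketch, `/?MEM/123-456` yields the range `(123, 456)` and `/?REG/R1,R2` yields the items `R1` and `R2`, while `/?` alone gives an empty resource name.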
03/28/2017 at 00:18 •
HTTaP is meant to help integrate #YGWM with a web-based system, and there is more to it than serving files or sending commands to TCP widgets.
Any editor needs to read and write files from the local disk/storage and the chosen language/framework (HTML/JS) does not allow that. The web browser prevents scripts from doing it, for obvious security reasons !
Yet there is a solution. Or more precisely, it's a hack ! I have described it in an article in French:
This describes a dirty trick, using a PHP script along with a specially crafted HTML/JS page :
- To read a local file into the JS framework, tell the browser that you want to upload said file to the server. The JS part can't access the file contents at all but the server will reply with a JS-formatted string that is the exact copy of the file. Bazinga !
- To write a file to the local file system, the JS script will send the properly formatted data as a POST form. The server then transcodes the data and replies with them advertised as a binary blob of unknown type, which the browser will understand as a file to save. Bingo !
Note : these manipulations usually require user interaction and are not inherently more unsafe than usual methods.
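The two replies can be sketched as follows. This is not the original PHP script: the function names are hypothetical, and the exact headers (`application/octet-stream` for the "unknown type", `Content-Disposition: attachment` to trigger the save dialog) are my assumptions about how a server would advertise the data.

```python
import json

def build_response(content_type, body, extra=""):
    """Assemble a minimal HTTP/1.1 response around a body of bytes."""
    head = (f"HTTP/1.1 200 OK\r\n"
            f"Content-Type: {content_type}\r\n"
            f"Content-Length: {len(body)}\r\n"
            f"{extra}\r\n")
    return head.encode("ascii") + body

def read_reply(uploaded: bytes) -> bytes:
    """Echo an uploaded file back as a JS string literal."""
    # Assumes the file is UTF-8 text; json.dumps escapes quotes, newlines
    # and non-ASCII characters, which gives a literal that JS can parse.
    body = json.dumps(uploaded.decode("utf-8")).encode("ascii")
    return build_response("application/javascript", body)

def write_reply(posted: bytes, filename: str = "download.bin") -> bytes:
    """Echo posted data back so that the browser offers to save it."""
    return build_response(
        "application/octet-stream", posted,
        extra=f'Content-Disposition: attachment; filename="{filename}"\r\n')
```

Binary files would need another encoding on the read path (base64 for example); the sketch only covers text.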
The PHP script was a pain to write and I'm glad to have a totally controlled environment (the HTTaP server) where I can process the data without layers of gotchas and poisonous sugaring...
The JS framework has evolved too and "binary blobs" now solve many of the encoding problems I have !
The "Loopback" feature should be a standard, user-configurable, option in the HTTaP protocol, with its own access key.
Now, it seems that interactive websites use a technique similar to the loopback server. An example is the circuit simulator at http://www.falstad.com/circuit/circuitjs.html
I recently spotted an addition to the HTML5 standard at https://developer.mozilla.org/en-US/docs/Using_files_from_web_applications which looks promising but it doesn't seem to write files, and browser support is still untested and unknown. I hope that my system works with HTML4 clients.
03/27/2017 at 23:50 •
HTTaP should not be used on public-facing networks, where bandwidth is a concern (on local links, latency often matters more). Compression is therefore not a priority but it IS possible to implement it.
The inherent cost is that the server needs to parse the request headers and find the line that declares that compressed files are allowed. It's possible but this uselessly increases the coding effort...
A couple of interesting things to notice.
First, compression is interesting for large files, not the small requests and answers. This happens in two cases:
- For large HTML or JS files that are served to the browser. Images are typically already compressed. Selected files can be pre-compressed and served if the client supports the chosen algorithm (usually gzip/deflate). But don't forget that proper minification and cleanup already contribute to smaller files and shorter downloads. Removing the tabs, whitespaces and comments shrinks data by 20 to 60% (depending on the source code's style).
Second, the client is not expected to change its capabilities during a TCP session. This means that the headers can be parsed just once, when the server receives the first request. The next requests can just skip the headers.
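A minimal sketch of this "parse once" idea is shown below. The `Session` class and the function names are hypothetical; `Accept-Encoding` is the standard HTTP header that declares the supported codings. For simplicity the sketch ignores quality values such as `gzip;q=0`.

```python
def accepts_gzip(raw_headers: bytes) -> bool:
    """Scan the header lines of a request for gzip support."""
    for line in raw_headers.split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"accept-encoding":
            codings = [c.split(b";")[0].strip().lower()
                       for c in value.split(b",")]
            return b"gzip" in codings
    return False

class Session:
    """Remembers the client's capabilities for the whole TCP session."""

    def __init__(self):
        self.gzip = None  # unknown until the first request is seen

    def on_request(self, raw_headers: bytes) -> bool:
        if self.gzip is None:  # the headers are parsed only once
            self.gzip = accepts_gzip(raw_headers)
        return self.gzip
```

When `on_request` returns True, the server could send the pre-compressed copy of a file instead of the plain one.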
03/21/2017 at 21:42 •
HTTaP uses a reduced subset of HTTP, keeping only a few essential features.
- Any HTTP compliant server must be able to understand HTTaP requests even though it can't fulfill all the requests (at worst, it replies with a 404 status)
- Any HTTP client can send a HTTaP compliant request with minimal effort. Normally, the HTTaP client is a classic web browser but wget or curl must work too.
HTTaP servers work mostly like classic HTTP servers but differ in a few ways, such as
- resource reference (naming conventions)
- no cookies
- persistent connections
- serialisation (no simultaneous/multiple accesses)
These implementation choices come from constraints in size, speed, complexity : HTTaP must run in "barebone systems" with limited code and data memory, reduced CPU resources and lax security.
The HTTaP server must be as lean and simple as possible.
- One source of complexity is removed by not interpreting the client's request headers. Actually, none of these headers are pertinent or relevant to most of HTTaP's use cases (except Compression). This cuts a lot of work but also means there is no support for cookies. Standard authentication is impossible so HTTaP is unsecured. Any client can connect and use the resources at will. Use HTTaP only on air-gapped networks.
- Another source of complexity comes from the HTTP "vocabulary" : the only supported methods are GET, POST and HEAD.
* GET reads resources (files or dynamic variables) like any usual request.
* POST writes these variables (file upload is only an option)
* HEAD is a requirement of HTTP/1.1 and only minimal support is provided (because it is barely used in local links, since there is no proxy).
- The server is single-threaded and serves only one client at a time
* This ensures by design that there can be no race condition.
* The server is typically used by only one client at a time anyway.
* This reduces code complexity and timing issues
* Raw performance and throughput are not critical, since the client and server are usually located next to each other and the server is minimalist, reducing processing and transfer overhead.
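The reduced vocabulary can be sketched as a parser that looks only at the request line and skips the headers. The function name and the status codes returned for bad input (400 for a malformed line, 501 for an unsupported method) are assumptions, not something this log specifies.

```python
ALLOWED = ("GET", "POST", "HEAD")

def parse_request_line(request: bytes):
    """Return (status, method, uri), looking only at the first line."""
    line = request.split(b"\r\n", 1)[0].decode("ascii", "replace")
    parts = line.split(" ")
    if len(parts) != 3 or not parts[2].startswith("HTTP/"):
        return 400, None, None  # malformed request line
    method, uri, _version = parts
    if method not in ALLOWED:
        return 501, method, uri  # method not implemented
    # Everything after the first line (the headers) is ignored here.
    return 200, method, uri
```

Because only one client is served at a time, a function like this can be called from a plain sequential loop, without locks.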
A HTTaP server typically provides two separate domains:
- a static files server (a very basic HTTP server)
- a dynamic server, like an embedded CGI inside the server program.
The URL defines which domain is accessed with a very simple method : static files use standard URLs while dynamic ones start with the "?" character.
The question mark is a common indicator and good heuristic for dynamic contents and would not be messed with by any proxies along the way.
- When the requested URI starts with "/?" then the dynamic mode is selected and an embedded program parses the URI.
- Otherwise, this is a standard file, with a direct mapping to the file system (often a sub-directory). There is no support of automatic index.html generation or "open directories".
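The selection between the two domains can be sketched in a few lines. The function name `route`, the `www` sub-directory and the use of `posixpath.normpath` to collapse `..` segments are assumptions made for the example.

```python
import posixpath

def route(uri, root="www"):
    """Classify a request URI as dynamic or static."""
    if uri.startswith("/?"):
        # Dynamic mode: the embedded program parses the rest of the URI.
        return "dynamic", uri[2:]
    if not uri.startswith("/") or uri.endswith("/"):
        # No index.html generation and no "open directories".
        return "error", None
    # Static file: direct mapping into a sub-directory of the file system.
    # normpath collapses "." and ".." so the path stays below the root.
    path = posixpath.normpath(uri)
    return "static", root + path
```

For example `/?list` is handed to the dynamic handler, while `/index.html` maps to `www/index.html`.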
No access control is provided for the static files, which usually contain the HTML/JS web application and all the required supporting files. Access rights must be correctly set on the filesystem by the developer to prevent 403 errors or unwanted access to unrelated files.