Adventures in Rust

I decided to find out what the deal is with the Rust programming language

A personal journey to learn Rust by jumping in the deep end and writing code

I decided to find out what the deal is with the Rust programming language. It's said to be as performant as C/C++ (henceforth just C for short) but also safer. So I took the plunge off the deep end and started rewriting in Rust a small non-spooling printer server of mine from many years ago. This is by no means a tutorial in Rust; there are resources for that and I'm still learning. It will be a collection of factlets I glean that give insights into the design philosophy. I will update this page, correcting mistakes as I become cognizant of new facts, unlike so many public figures who just double down. 😉 I may also reorganise sections when OCD strikes.

Rust is not like C

Don't believe the people who tell you Rust is like C. It doesn't look like C. It looks more like ML.

Rust is like C

However, Rust is still an imperative language. Many of the constructs map onto a small number of machine instructions, like C. Rust gives you access to low-level operations. Integer types of various widths are supported, and conversions between them have to be written out explicitly.

Rust is a typed language with type inference

Often an explicit declaration is not required, though it is allowed, because Rust can infer the type of many objects. Even numeric constants are typed and checked in expressions, unlike C where promotion happens and may or may not provoke a warning from the compiler.
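A minimal sketch of inference and typed constants. The helper widen_and_add and all the values are made up for illustration; they are not from my project.

```rust
// Hypothetical helper, not from the project: adds a u16 offset to a u64 base.
fn widen_and_add(base: u64, offset: u16) -> u64 {
    // `base + offset` alone would be rejected: u64 and u16 are different types.
    // The widening conversion has to be written out.
    base + u64::from(offset)
}

fn main() {
    let port = 9100;        // no annotation: inferred as i32, the default integer type
    let retry: u64 = 4000;  // explicit annotation, allowed but often unnecessary
    let digit = 9u16;       // a literal suffix also fixes the type

    println!("{} {}", port, widen_and_add(retry, digit));
}
```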

Mutables are discouraged

Part of the safety of Rust comes from discouraging variables which are altered after being set. Instead one elaborates the computation by deriving new values as execution progresses. So a program ought to look more like a sequence of mathematical statements rather than instructions to poke and peek pigeon-holes in RAM. This mindset permeates programming style in Rust.

Does this affect compiler optimisation? It might even improve it since the compiler doesn't have to map intermediate values to memory locations but can store them in registers and other temporary locations as it sees fit.

But in the real world some variables must change, think I/O, so mutables are available.
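As a rough illustration of this style (the function checksum and the values are hypothetical), new values can be derived by shadowing or by iterator adaptors, and mutation has to be requested with mut:

```rust
// Hypothetical example: sum the bytes modulo 256 without reassigning anything.
fn checksum(data: &[u8]) -> u8 {
    // fold threads the running sum through as a fresh value on each step.
    data.iter().fold(0u8, |acc, byte| acc.wrapping_add(*byte))
}

fn main() {
    let line = "  p910nd  ";
    let line = line.trim(); // shadowing: a new binding, nothing is modified in place

    let mut total = 0;      // mutation must be opted into with `mut`
    total += line.len();
    total += 1;

    println!("{} {} {}", line, total, checksum(line.as_bytes()));
}
```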

Functions can return complex objects

It's not well-known but C does support returning complex objects like structs. This isn't used much, as programmers prefer to work with pointers or the safer references. In the beginning only objects that could fit into a register were returned, so this led to overloading system and library calls to return -1 or 0 for failure. In Rust this isn't done. Functions can return a complex object. Typically this is an enum. Which brings us to:

Enums are not enums as you know them in C

They are more like tagged unions, which some of you may remember from Pascal, Ada and Modula-2 as discriminated unions, variant records, or a similar name. So typically a function returns an enum that holds either the result on success or details of the error on failure. As the return type is checked strictly at assignment and all variants of the enum must be covered in a match, this hinders programmers from ignoring the 6th Commandment of C Programming by Henry Spencer.
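Here is a small sketch of such an enum. ParseOutcome and parse_printer are names invented for this example; in practice the standard Result type plays this role.

```rust
// Hypothetical tagged union: each variant carries its own payload.
#[derive(Debug, PartialEq)]
enum ParseOutcome {
    Number(u32),     // success, with the parsed value
    Invalid(String), // failure, with a description of what went wrong
}

fn parse_printer(arg: &str) -> ParseOutcome {
    // str::parse itself returns the standard Result enum.
    match arg.parse::<u32>() {
        Ok(n) => ParseOutcome::Number(n),
        Err(e) => ParseOutcome::Invalid(format!("{}: {}", arg, e)),
    }
}

fn main() {
    // The match must cover every variant or the program does not compile.
    match parse_printer("2") {
        ParseOutcome::Number(n) => println!("printer {}", n),
        ParseOutcome::Invalid(why) => println!("bad argument: {}", why),
    }
}
```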

Rust is primarily an expression language

Even constructs that are statements in C, like loops, are expressions that yield a value. This neatly solves some problems. An example is where you search in a loop for the index of a match in an array. In C you have to note the index of the match then break out of the loop. So you have to make the scope of the index variable wider than the loop, mutate its value, and also test for no match. In Rust you combine a let with a loop, and the break returns the index of the match. (Only loop can break with a value; while and for always evaluate to the unit value ().) Failing to find a match would return the failure variant of the enum. No scoping problem, and the result needs no mutable variable.
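The array search described above might be sketched like this. find_index is a made-up name, and the sketch still uses a mutable counter inside the function; it is the result binding that is set exactly once. For real code the iterator method position does the same job.

```rust
// Hypothetical helper: index of the first element equal to `needle`, if any.
fn find_index(haystack: &[u8], needle: u8) -> Option<usize> {
    let mut i = 0; // the counter is still mutable, but confined to this function
    // `found` is bound once, to whatever value the loop breaks with.
    let found = loop {
        if i == haystack.len() {
            break None; // ran off the end: the failure variant
        }
        if haystack[i] == needle {
            break Some(i); // the success variant carries the index
        }
        i += 1;
    };
    found
}

fn main() {
    let data = [5u8, 7, 9];
    println!("{:?} {:?}", find_index(&data, 9), find_index(&data, 1));
    // The iterator form gives the same answer.
    println!("{:?}", data.iter().position(|&b| b == 9));
}
```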

Another example is the replacement for the ternary operator ?: in C. In Rust this is a let combined with an if expression returning a value.
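For instance, where C would write bidir ? "read-write" : "write-only", a Rust version could look like this sketch (open_mode is a hypothetical function):

```rust
// Hypothetical helper: picks a description based on a flag.
fn open_mode(bidir: bool) -> &'static str {
    // `if` is an expression, so its value can be bound directly with `let`.
    // Both branches must have the same type.
    let mode = if bidir { "read-write" } else { "write-only" };
    mode
}

fn main() {
    println!("{} {}", open_mode(true), open_mode(false));
}
```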

A semicolon matters

In Rust the last expression evaluated before exiting the function is the return value....
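A minimal sketch of that rule, using made-up functions: leaving the semicolon off the last expression makes it the return value, while adding one turns it into a statement whose value is discarded.

```rust
// Hypothetical helper: no semicolon on the last line, so it is the return value.
fn port_for(pnumber: u32) -> u32 {
    9100 + pnumber
}

// Every line ends in a semicolon here, so the function returns the unit value ().
fn log_port(pnumber: u32) {
    println!("port {}", port_for(pnumber));
}

// This variant would not compile: the trailing semicolon discards the sum,
// leaving () where a u32 was promised.
// fn broken(pnumber: u32) -> u32 { 9100 + pnumber; }

fn main() {
    log_port(3);
}
```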


  • Heart of the program

    Ken Yap • 02/06/2024 at 07:37 • 0 comments

    Now that we have presented the command line parsing, logging, and daemonization code, we come to the code that does the copying from the network to the printer in src/lib.rs:

    extern crate lockfile;
    use lockfile::Lockfile;
    
    use std::str::FromStr;
    use std::net::{IpAddr,SocketAddr,TcpListener,TcpStream};
    use std::fs::{File,OpenOptions};
    use std::io::prelude::*;
    use std::{thread, time};
    
    extern crate log;
    use log::{trace,debug,info,warn,error};
    extern crate syslog;
    
    pub mod logger;
    
    macro_rules! lockpathformat {
    //    () => ("/var/lock/subsys/p910{}d")
          () => ("/tmp/p910{}d")
    }
    
    const BASEPORT:u32 = 9100;
    const PRINTER_RETRY:u64 = 4000; // milliseconds
    const BUFFERSIZE:usize = 8192;
    
    // Copy data from conn (network) to pfile (printer) until end of stream.
    // Copying from printer to network (bidir) is not implemented yet.
    fn copy_stream(mut conn: &TcpStream, mut pfile: &File) -> Result<(), std::io::Error>
    {
            info!("copy_stream");
            let mut buffer = [0u8; BUFFERSIZE];
            loop {
                    debug!("reading...");
                    let bytes_read = conn.read(&mut buffer)?;
                    debug!("{} bytes read", bytes_read);
                    if bytes_read == 0 {
                            break;
                    }
                    pfile.write_all(&buffer[0..bytes_read])?;
            }
            Ok(())
    }
    
    fn handle_client(stream: &TcpStream, device: &String, bidir: bool) -> Result<(), std::io::Error>
    {
            // wait until printer is available
            let pfile = loop {
                    match OpenOptions::new().read(bidir).write(true).create(false).truncate(true).open(device) {
                    Ok(f) => {
                            break f;
                            },
                    Err(_) => {
                            thread::sleep(time::Duration::from_millis(PRINTER_RETRY));
                            },
                    }
            };
            copy_stream(&stream, &pfile)?;
            pfile.sync_all()?;
            Ok(())
    }
    
    pub fn server(pnumber: u32, device: &String, bidir: bool, ba: &String) -> Result<(), std::io::Error>
    {
            let bindaddr = IpAddr::from_str(ba).expect(format!("{} not valid bind IP address", ba).as_str());
            let lockfilepath = format!(lockpathformat!(), pnumber);
            let lockfile = Lockfile::create(&lockfilepath).expect(format!("Lockfile {} already present", lockfilepath).as_str());
            let sockaddr = SocketAddr::new(bindaddr, (BASEPORT + pnumber) as u16);
            let listener = TcpListener::bind(sockaddr)?;
            info!("Server listening");
            loop {
                    if let Ok((stream, addr)) = listener.accept() {
                            info!("new client: {addr:?}");
                            if let Err(e) = handle_client(&stream, device, bidir) {
                                    info!("handle_client: {}", e);
                                    break;
                            }
                    } else {
                            break;
                    }
            };
            lockfile.release()              // or just let it autorelease
    }

    A few comments. You can see that even constants can be typed, for example the time constant passed to the sleep routine, which needs to be an unsigned 64-bit int. The port argument to the SocketAddr constructor has to be an unsigned 16-bit int, because TCP port numbers are 16 bits wide. The as u16 cast truncates silently, but we know this is safe because BASEPORT is 9100 and pnumber is constrained by the argument parser to a single digit.
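    If the range were not known in advance, a checked conversion would be the safer choice. This is only a sketch of the difference; checked_port is a hypothetical helper, not part of my code.

```rust
// Hypothetical helper: returns None instead of silently wrapping.
fn checked_port(base: u32, pnumber: u32) -> Option<u16> {
    base.checked_add(pnumber)               // guard against u32 overflow
        .and_then(|p| u16::try_from(p).ok()) // guard against values above 65535
}

fn main() {
    // `as` keeps only the low 16 bits: fine for 9109, wrong for 70000.
    println!("{}", (9100u32 + 9) as u16);
    println!("{}", 70000u32 as u16); // 70000 - 65536

    println!("{:?} {:?}", checked_port(9100, 9), checked_port(65535, 1));
}
```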

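    The buffer is passed as a mutable reference because read fills it. The stream and the printer File are shared references held in mutable bindings: Read and Write are implemented for &TcpStream and &File, and their methods take &mut self, so the bindings themselves must be declared mut.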
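    The same loop can be tried out without a network or a printer by making it generic over the Read and Write traits. This is a sketch for experimenting, not the code in my project; copy_until_eof is a made-up name, and the standard library's std::io::copy does much the same thing.

```rust
use std::io::{self, Read, Write};

// Hypothetical generic version of the copy loop: works for any reader and writer.
fn copy_until_eof<R: Read, W: Write>(mut src: R, mut dst: W) -> io::Result<u64> {
    let mut buffer = [0u8; 8192];
    let mut total = 0u64;
    loop {
        // read fills the buffer and reports how many bytes it stored.
        let n = src.read(&mut buffer)?;
        if n == 0 {
            break; // zero bytes means end of stream
        }
        dst.write_all(&buffer[..n])?;
        total += n as u64;
    }
    Ok(total)
}

fn main() -> io::Result<()> {
    // A byte slice stands in for the TcpStream, a Vec for the printer.
    let input: &[u8] = b"hello printer";
    let mut output = Vec::new();
    let copied = copy_until_eof(input, &mut output)?;
    println!("{} bytes: {}", copied, String::from_utf8_lossy(&output));
    Ok(())
}
```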

    Currently bidirectional copying isn't supported; this would require polling for incoming data available and printer File ready, using a crate that wraps around the epoll or select APIs. Signals are not handled, so the daemon could leave a lockfile behind if interrupted. And the original p910nd's elaborate buffer management to maximise throughput hasn't been reimplemented.

  • Daemonization

    Ken Yap • 02/05/2024 at 02:03 • 0 comments

    Daemonization is the process of detaching from the invocation environment. For Linux processes that are to run as a system service, the most important thing to do is to detach the standard file descriptors stdin, stdout and stderr (and optionally reattach them to files). Then the process has to fork; the parent exits to the invoker, while the child continues on its own. (Things are slightly different when running under systemd, but that's a story for another log.)

    In addition other actions are commonly taken, as this snippet from an example of the use of the daemonize crate shows:

        let daemonize = Daemonize::new()
            .pid_file("/tmp/test.pid") // Every method except `new` and `start`
            .chown_pid_file(true)      // is optional, see `Daemonize` documentation
            .working_directory("/tmp") // for default behaviour.
            .user("nobody")
            .group("daemon") // Group name
            .group(2)        // or group id.
            .umask(0o777)    // Set umask, `0o027` by default.
            .stdout(stdout)  // Redirect stdout to `/tmp/daemon.out`.
            .stderr(stderr)  // Redirect stderr to `/tmp/daemon.err`.
            .privileged_action(|| "Executed before drop privileges");

    Yes, the cat is out of the bag, we let Daemonize do the heavy lifting. Even back in C days, the 7th commandment enjoined programmers to make use of the provided libraries instead of reinventing from scratch. In the case of Rust, one uses the crate ecosystem instead of libraries.

    So here's the snippet from main.rs that does it all:

            if !debug {
                    match Daemonize::new().start() {
                            Ok(_) => { },
                            Err(e) => { error!("Error {}", e); },
                    };
            };

    That's it. If we are debugging then we don't daemonize, which makes it easier to run under strace and doesn't require watching the system log. With none of the options set, we take all the defaults. Naturally we need the imports at the top of the file.

    extern crate daemonize;
    use daemonize::Daemonize;
    

  • Logging

    Ken Yap • 02/05/2024 at 01:41 • 0 comments

    For a daemon like p910nd, it's important to be able to send log messages to the system log facility, as the process will be detached from the standard 3 file descriptors in operation. But we still want to be able to send messages to stderr for debugging purposes.

    In main.rs we import the log crate, which defines macros for various log levels. For example, just before the server starts we output an informative message, and if the server exits abnormally we output an error message.

    The logger code is placed in logger.rs. It could have been part of lib.rs but we want to make a module out of it, for practice. Even if the code is textually in lib.rs, it can still be an inline module. This is logger.rs:

    use log::LevelFilter;
    use syslog::{Formatter3164,BasicLogger};
    
    pub fn log_init(debug: bool) -> () {
            if debug {
                    stderrlog::new()
                            .module(module_path!())
                            .verbosity(LevelFilter::Info)
                            .init()
                            .expect("Stderrlog not initialised");
            } else {
                    let logger = syslog::unix(Formatter3164::default()).unwrap();
                    log::set_boxed_logger(Box::new(BasicLogger::new(logger)))
                            .map(|()| log::set_max_level(LevelFilter::Info)).unwrap_or(())
            }
    }

    We need to import entities from the log crate, and also from the syslog crate which deals with the system logger. If debug is true then we set the global logger to send to stderr, otherwise we instantiate a connection to the system logger. This function returns (), the unit value, so it cannot communicate errors. If setting the logger fails, it simply returns to the main program, which still goes on to start a server. This is something we might want to improve on in future.

    In lib.rs we have this line:

    pub mod logger;
    

    and as you see in main.rs, the public function log_init is invoked as logger::log_init(bool).
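    The inline form of a module mentioned above can be sketched in a single file. The function names here are invented for the example and are not the ones in my logger.rs.

```rust
// An inline module: the body sits in braces instead of a separate logger.rs.
mod logger {
    // Private by default: only code inside the module can call this.
    fn sink_name(debug: bool) -> &'static str {
        if debug { "stderr" } else { "syslog" }
    }

    // `pub` makes it reachable from outside as logger::describe.
    pub fn describe(debug: bool) -> String {
        format!("logging to {}", sink_name(debug))
    }
}

fn main() {
    println!("{}", logger::describe(true));
    // logger::sink_name(true) would not compile here: the function is private.
}
```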

  • Embedded Rust

    Ken Yap • 02/04/2024 at 10:26 • 0 comments

    As you might guess, the strong guarantees that Rust provides can improve the reliability of bare-metal software. In fact this was one of my original motivations, to use Rust for embedded projects.

    Embedded support exists for various platforms. Your 8 or 16-bitter won't be supported by Rust*, but 32-bitters are not a problem, provided the toolchain exists. STM32, Cortex and RISC-V have support, as does the classic ESP32 using the Espressif tools. The newer ESP32-C3 is RISC-V based. Apparently an upstream merge of the Espressif compiler to the mainstream is expected. Since Rust uses LLVM compiler technology, these are the architectures supported: https://docs.rust-embedded.org/embedonomicon/compiler-support.html

    * But there appears to be an AVR support project.

    Rust has the #![no_std] crate attribute, which indicates that the crate links against the core crate instead of the std crate; std assumes an OS underneath. This rules out a lot of very convenient crates, so some searching and experimentation is required.

    Here are some links:

    From the horse's mouth: https://www.rust-lang.org/what/embedded

    An online book: https://docs.rust-embedded.org/book/

    About #![no_std]: https://docs.rust-embedded.org/book/intro/no-std.html

    Also have a look at Wokwi which supports Rust for their simulated IoT platforms: https://wokwi.com/

    A page with useful tips for getting started with Rust on ESP-32: https://nereux.blog/posts/getting-started-esp32-nostd/

  • A practice project

    Ken Yap • 02/04/2024 at 10:24 • 0 comments

    I have taken the code for my p910nd daemon and am rewriting it in Rust. One thing I discovered is that trying to edit the C code into Rust is the wrong way to go. The languages differ so much in feel and in ways of achieving results. For example a huge chunk of command line option handling in C can be simplified in Rust using the clap crate. Not only that, but it autogenerates help responses for invocations.

    However I also found it frustrating because I need to discover the equivalent for many C and Linux features. So not all the features have been ported. It's also more difficult because p910nd is a system utility, and has to do things like run as a service and interact with system services like the logger. If it were only a data manipulation program, it would have been much easier.

    Delving into the code

    To make concrete the observations above, I present parts of my Rust code. First the main program, which is usually named src/main.rs in the project tree.

    First I show the output of the help text if --help is given to the program. This shows what the options are:

    $ cargo run -- --help
    Non-spooling printer daemon
    
    Usage: p910nd-rust [OPTIONS] [PRINTER_NUMBER]
    
    Arguments:
      [PRINTER_NUMBER]  Printer number [default: 0]
    
    Options:
      -b, --bidir                Bidirectional communication
      -d, --debug                Log to stderr
      -f, --device <DEVICE>      Device to spool to [default: /dev/usb/lp0]
      -i, --bindaddr <BINDADDR>  IP address to bind to [default: 0.0.0.0]
      -h, --help                 Print help
      -V, --version              Print version

    Now the main program, which is mostly an example of the use of the clap (command line argument parsing) crate. This is a popular crate; other languages also have modules that encapsulate argument parsing.

    extern crate clap;
    use clap::{Arg,Command};
    extern crate log;
    use log::{info,error};
    
    use std::process;
    
    extern crate p910nd;
    use p910nd::logger;
    
    fn main()
    {
            let matches = Command::new("P910nd")
                    .version(env!("CARGO_PKG_VERSION"))
                    .author("https://github.com/kenyapcomau/p910nd-rust")
                    .about("Non-spooling printer daemon")
                    .arg(
                            Arg::new("bidir")
                                    .short('b')
                                    .long("bidir")
                                    .action(clap::ArgAction::SetTrue)
                                    .help("Bidirectional communication"),
                    )
                    .arg(
                            Arg::new("debug")
                                    .short('d')
                                    .long("debug")
                                    .action(clap::ArgAction::SetTrue)
                                    .help("Log to stderr"),
                    )
                    .arg(
                            Arg::new("device")
                                    .short('f')
                                    .long("device")
                                    .value_parser(clap::builder::NonEmptyStringValueParser::new())
                                    .action(clap::ArgAction::Set)
                                    .default_value("/dev/usb/lp0")
                                    .value_name("DEVICE")
                                    .help("Device to spool to"),
                    )
                    .arg(
                            Arg::new("bindaddr")
                                    .short('i')
                                    .long("bindaddr")
                                    .value_parser(clap::builder::NonEmptyStringValueParser::new())
                                    .action(clap::ArgAction::Set)
                                    .default_value("0.0.0.0")
                                    .value_name("BINDADDR")
                                    .help("IP address to bind to"),
                    )
                    .arg(
                            Arg::new("printer")
                                    .value_parser(clap::value_parser!(u32).range(0..9))
                                    .default_value("0")
                                    .value_name("PRINTER_NUMBER")
                                    .help("Printer number"),
                    )
                    .get_matches();
    
            let bidir = matches.get_flag("bidir");
            let debug = matches.get_flag("debug");
            let device = matches.get_one("device").unwrap();
            let bindaddr = matches.get_one("bindaddr").unwrap();
            let pnumber: u32 = *matches.get_one("printer").expect("required");
    
            logger::log_init(debug);
    
            info!("Run as server");
            if let Err(e) = p910nd::server(pnumber, &device, bidir, bindaddr) {
                    error!("{}", e);
                    process::exit(1);
            }
    }
    

    Instead of the #include mechanism of C, Rust uses safer declarations (extern crate and use) to import modules. The crate p910nd in fact contains the body of the daemon code, which is examined in the Heart of the program log.

    Most of the work is to understand the options of clap to use it effectively. Also note that method chaining is heavily used. This technique is also used in C++ and Java, where each method returns a reference to the object. In Rust builders such as clap's, each method typically takes self by value and returns it, which allows the result to be used to call the next method.
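    To show how chaining works in the builder style, here is a toy builder. The Options struct and its methods are invented for this sketch and are not part of clap.

```rust
// Hypothetical options struct with a chainable builder interface.
#[derive(Debug, Default, PartialEq)]
struct Options {
    bidir: bool,
    device: String,
    printer: u32,
}

impl Options {
    fn new() -> Self {
        Self::default()
    }

    // Each method takes `self` by value, changes one field and hands the
    // object back, so the next call can be chained onto the result.
    fn bidir(mut self, on: bool) -> Self {
        self.bidir = on;
        self
    }

    fn device(mut self, path: &str) -> Self {
        self.device = path.to_string();
        self
    }

    fn printer(mut self, number: u32) -> Self {
        self.printer = number;
        self
    }
}

fn main() {
    let opts = Options::new().bidir(true).device("/dev/usb/lp0").printer(1);
    println!("{:?}", opts);
}
```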

    Finally note the use of the if let idiom to handle the case where the server function returns an Err.
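    In isolation the idiom looks like this sketch. serve and exit_code are stand-ins invented for the example; the real server function is different.

```rust
// Hypothetical stand-in for a server function that can fail.
fn serve(port: u32) -> Result<(), String> {
    if port > 65535 {
        return Err(format!("port {} out of range", port));
    }
    Ok(())
}

// `if let` matches only the variant of interest, here Err,
// and binds its payload to `e`.
fn exit_code(port: u32) -> i32 {
    if let Err(e) = serve(port) {
        eprintln!("{}", e);
        1
    } else {
        0
    }
}

fn main() {
    println!("{} {}", exit_code(9100), exit_code(70000));
}
```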
