Dockerizing the FTP Daemon

A project log for Dockerize All the Things

hard drive crash! recover, revive, and re-engineer server using docker contained services this time.

ziggurat29, 11/20/2020 at 16:27 (2 Comments)

Summary

I add another dockerized service to the collection -- this one based on pure-ftpd.

Deets

Usually I prefer SCP to old-school FTP, but I still find FTP handy for sharing things in a pinch with others without having to create a real system account or walk folks through installing additional software.

FTP is an archaic and quirky protocol.  Hey, it's ancient -- from *1980* https://tools.ietf.org/html/rfc765.  Here, I'm going to support PASV, virtual users, and then also TLS for kicks as well.

The ftp daemon I am using here is 'pure-ftpd', which has been around for a while and is respected.  There does not seem to be a curated docker image for this, so as with fossil SCM, I will be cooking up a Dockerfile for it.  Unlike fossil, this will be an independent service running via systemd.

Most of the work in this exercise is understanding pure-ftpd, and it took me about 3+ days of work to get to this point.  What follows is the distillation of that, so I will cut to the chase and just explain some of the rationale rather than walking through the learning experience here.

First, I make a Dockerfile.  This will be a multistage build.

Dockerfile:

########################
#build stage for creating pure-ftpd installation
FROM alpine:latest AS buildstage

#this was the latest at the time of authorship
ARG PUREFTPD_VERSION="1.0.49"

WORKDIR /build

RUN set -x && \
#get needed dependencies
    apk add --update alpine-sdk build-base libsodium-dev mariadb-connector-c-dev openldap-dev postgresql-dev openssl-dev && \
#fetch the code and extract it
    wget https://download.pureftpd.org/pub/pure-ftpd/releases/pure-ftpd-${PUREFTPD_VERSION}.tar.gz && \
    tar xzf pure-ftpd-${PUREFTPD_VERSION}.tar.gz && \
    cd pure-ftpd-${PUREFTPD_VERSION} && \
    ./configure \
#we deploy into /pure-ftpd to make it easier to pluck out the needed stuff
        --prefix=/pure-ftpd \
#humour is a deeply embedded joke no-one would see anyway, and boring makes the server look more ordinary
        --without-humor \
        --with-boring \
#we will never be running from a superserver
        --without-inetd \
        --without-pam \
        --with-altlog \
        --with-cookie \
        --with-ftpwho \
#we put in support for various authenticator options (except pam; we have no plugins anyway)
        --with-ldap \
        --with-mysql \
        --with-pgsql \
        --with-puredb \
        --with-extauth \
#various ftp features
        --with-quotas \
        --with-ratios \
        --with-throttling \
        --with-tls \
        --with-uploadscript \
        --with-brokenrealpath \
#we will have separate cert and key file (default is combined); certbot emits separate ones
        --with-certfile=/etc/ssl/certs/fullchain.pem \
        --with-keyfile=/etc/ssl/private/privkey.pem && \
    make && \
    make install-strip

#now the entire built installation and support files will be in /pure-ftpd

########################
#production stage just has the built pure-ftpd things
FROM alpine:latest AS production

COPY --from=buildstage /pure-ftpd /pure-ftpd

RUN apk --update --no-cache add \
    bind-tools \
    libldap \
    libpq \
    libsodium \
    mariadb-connector-c \
    mysql-client \
    openldap-clients \
    openssl \
    postgresql-client \
    tzdata \
    zlib \
    && rm -f /etc/socklog.rules/* \
    && rm -rf /tmp/* /var/cache/apk/* \
#forward log to docker log collector
    && mkdir -p /var/log/pure-ftpd \
    && ln -sf /dev/stdout /var/log/pure-ftpd/pureftpd.log \
# setup ftpgroup and ftpuser; explicitly use 1001 just to match what I'm using on the host (can I do this in config instead?)
    && addgroup --gid 1001 -S ftpgroup \
    && adduser --uid 1001 -G ftpgroup -S ftpuser -h /home/ftpusers -s /sbin/nologin

#these avoid having to specify long command lines for tools like pure-pw
ENV PATH="/pure-ftpd/bin:/pure-ftpd/sbin:${PATH}" \
    PURE_PASSWDFILE=/pure-ftpd/etc/pureftpd.passwd \
    PURE_DBFILE=/pure-ftpd/etc/pureftpd.pdb

#from time-to-time you will want to shell in to update the virtual users; e.g.:
#    pure-pw useradd easy123 -m -u ftpuser -d /srv/ftp/virtual/easy123
#(unless you have the pure-pw tool on your host, and can do it from there)

#the 30000-30009 are for PASV; remember to specify a value for ForcePassiveIP in the .conf if you are behind NAT
EXPOSE 21 30000-30009

# startup
CMD pure-ftpd /pure-ftpd/etc/pure-ftpd.conf

This one builds quickly -- just a few minutes.  The build stage image is about 450 MiB, but the production stage is about 60 MiB.  Better!

I documented some of the build options in the 'configure' step, but I do want to point out that --with-certfile and --with-keyfile are done the way they are because the default for pure-ftpd is to assume that all the certs and also private key are contained in one file.  This is a hassle for us, because our certbot (from earlier) will be creating separate files.  So either I would need some step added to concatenate those, or rather I can explicitly tell pure-ftpd that they are separate.  The latter seemed the easier way to go.  The paths aren't critical -- it's just what an openssl installation would use by convention, and I have the flexibility to mount wherever I want, anyway.

There is a wart in this spec:  the 'addgroup' and 'adduser' specify specific IDs to use.  This is because of the vexing permissions issue with docker and bound filesystem objects:  The ID space on the host is disjoint from the ID space in the container.  Many times this doesn't matter if the program in the container is running as root (the default), but here the process is running as a well-known user.  I also had this problem with the PHP-FPM processor, but there it was easier to solve since the config allowed specifying the user:group under which to run (and numerically, at that).  I have not found such an option with pure-ftpd, alas.  I'll try to revisit this in the future; maybe I can do some magic with env vars from the host, and a script in the container to diddle those values at startup.
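The startup-script idea floated above might look something like the following. This is an untested sketch under stated assumptions: the environment variable names FTP_UID and FTP_GID and the function name are hypothetical, and it relies on the BusyBox deluser/delgroup/addgroup/adduser applets that Alpine ships. It would also have to run as root, before the daemon starts.

```shell
#!/bin/sh
# Sketch of an entrypoint helper (hypothetical, not tested against the image above).
# Recreates ftpuser/ftpgroup with numeric IDs taken from the environment so they
# can match the owner of the bind-mounted data directory on the host.

remap_ftp_ids() {
    uid="${FTP_UID:-1001}"   # hypothetical env var; falls back to the baked-in ID
    gid="${FTP_GID:-1001}"   # hypothetical env var; falls back to the baked-in ID

    # Drop the account and group created at image build time...
    deluser ftpuser
    delgroup ftpgroup
    # ...and recreate them with the requested IDs (BusyBox short options).
    addgroup -g "$gid" -S ftpgroup
    adduser -u "$uid" -G ftpgroup -S -h /home/ftpusers -s /sbin/nologin ftpuser
}

# In an entrypoint script this would be followed by something like:
#   remap_ftp_ids && exec pure-ftpd /pure-ftpd/etc/pure-ftpd.conf
```

The values would then be supplied when creating the container, for example with '-e FTP_UID=... -e FTP_GID=...' on the 'docker run' line.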

Pure-ftpd supports 'virtual users' and that's really the only way I plan on using it.  These virtual users are separate entities from the conventional systems users, and their identity is maintained in a separate database.  Two, actually.  There's a text-based passwd-esque file conventionally named 'pureftpd.passwd' and a binary equivalent named 'pureftpd.pdb'.  You use a separate tool 'pure-pw' to add/remove/update details of users.  This tool will update the text version, then you are meant to issue a separate command to transfer that data into the binary version.

pure-pw mkdb pureftpd.pdb -f pureftpd.passwd

The daemon only uses the binary version.  Tedious!  It must have been considered tedious by the authors as well, because I eventually found a switch '-m' that causes the binary version to be updated when performing add/remove/update operations on the text version.
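For reference, the two ways of keeping the binary database current could be wrapped as below. The function names are hypothetical; the pure-pw subcommands (mkdb, passwd, userdel) and the '-m' switch are the tool's own, and the default paths are the ones used in this image.

```shell
# Hypothetical helpers around pure-pw; defaults match the paths used in this image.
: "${PURE_PASSWDFILE:=/pure-ftpd/etc/pureftpd.passwd}"
: "${PURE_DBFILE:=/pure-ftpd/etc/pureftpd.pdb}"

# Explicit rebuild of the binary database, e.g. after hand-editing the text file.
rebuild_puredb() {
    pure-pw mkdb "$PURE_DBFILE" -f "$PURE_PASSWDFILE"
}

# Change a password and commit to the binary database in the same step (-m).
change_ftp_password() {
    pure-pw passwd "$1" -m
}

# Remove a virtual user, again committing immediately.
remove_ftp_user() {
    pure-pw userdel "$1" -m
}
```

The explicit rebuild is mainly useful if the text file was changed by something other than pure-pw; otherwise '-m' covers it.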

The build stage builds the daemon with the '--prefix' option.  This causes all the built artifacts to be installed into a sub-directory of your choosing, rather than into system locations such as /bin and /sbin.  Using this makes copying into the production stage much easier, but it also makes specifying the path to the components a bit more tedious.  A few environment variables are modified/added to help with this.  It's useful to note that the ENV directive uses the multi-value setting form, instead of one ENV per variable.  This is considered a 'best practice', because the ENV creates another 'layer'.  Well, so says the docs https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#env , at least.  Updating the path is welcome, and the two variables PURE_PASSWDFILE and PURE_DBFILE seem to at least sometimes be used by the tools and thereby avoid needing to explicitly specify those on the command line.

This image has a bunch of ports to expose on the host, thanks to the vagaries of FTP.  Note that the EXPOSE directive is purely for documentation; it doesn't cause anything to actually be exposed.  You still have to explicitly do that when creating the container on the host.

Lastly, the CMD directive causes the pure-ftpd daemon to be launched in accordance with a config file.  The pure-ftpd docs seem to suggest that its authors prefer to do this via the command line, but I much prefer a config file here.  I can just edit the config file on the host and restart, rather than edit the specification that starts the container.  But you could still do that if you really want, because the arguments to the 'docker run …' command will override whatever is in the CMD directive.

So, first I build:

docker image build -t pure-ftpd .

Now it's time to config!

pure-ftpd.conf and virtual users

As mentioned, I want PASV mode support, TLS support, and some virtual users.  There is an initial pure-ftpd.conf that was installed in /pure-ftpd/etc.  I'll pull that out of the image and start editing:

docker cp pure-ftpd:/pure-ftpd/etc/pure-ftpd.conf pure-ftpd.conf
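Note that 'docker cp' copies from a container rather than directly from an image, so the name before the colon has to refer to a container. One way to pull a file out of a freshly built image is to create a throwaway, never-started container first. A minimal sketch, with a hypothetical function name:

```shell
# Copy a single file out of an image via a temporary, never-started container.
# extract_from_image is a hypothetical helper name, not part of docker.
extract_from_image() {
    image="$1"; src="$2"; dst="$3"
    cid=$(docker create "$image") || return 1
    docker cp "$cid:$src" "$dst"
    rc=$?
    # Always clean up the temporary container, even if the copy failed.
    docker rm "$cid" >/dev/null
    return "$rc"
}

# Usage for the image built above:
#   extract_from_image pure-ftpd /pure-ftpd/etc/pure-ftpd.conf pure-ftpd.conf
```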

This is a pretty big file, so I'm just going to show the diffs I made:

*** pure-ftpd.orig.conf
--- pure-ftpd.conf
***************
*** 45,47 ****
  
! Daemonize                    yes
  
--- 45,47 ----
  
! Daemonize                    no

I definitely don't want to daemonize here, because that would cause the container to immediately exit.  Remember, the container stays live so long as the first process started therein has not exited, and daemonizing would spawn (and detach) a child, while the shell that started it (the main process) would then exit.

*** pure-ftpd.orig.conf
--- pure-ftpd.conf
***************
*** 124,126 ****
  
! # PureDB                       /etc/pureftpd.pdb
  
--- 124,126 ----
  
! PureDB                       /pure-ftpd/etc/pureftpd.pdb

Specifying the PureDB authentication method is what enables our virtual users.  Also, specify the path into our sub-tree for the user database rather than the default in the system tree.

*** pure-ftpd.orig.conf
--- pure-ftpd.conf
***************
*** 178,180 ****
  
! # PassivePortRange             30000 50000
  
--- 178,180 ----
  
! PassivePortRange             30000 30009
  
***************
*** 186,188 ****
  
! # ForcePassiveIP               192.168.0.1
  
--- 186,189 ----
  
! ForcePassiveIP               example.com
! #192.168.173.43

I want to support PASV mode for the benefit of those outside the NAT firewall.  The PassivePortRange is a pool of ports to be used for PASV, and is suggested to be as broad as possible.  However, I don't really want to forward 20,001 ports on my firewall, and have docker publish just as many, so I reduce that number to 10.  For this ad-hoc server that is expected to be rarely used when in-a-pinch, this should be quite sufficient.
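The port counts here are just inclusive-range arithmetic, which is easy to double-check in the shell (the function name is only for illustration):

```shell
# Number of ports in an inclusive range: last - first + 1
port_count() {
    echo $(( $2 - $1 + 1 ))
}

port_count 30000 50000   # sample range from the stock config; prints 20001
port_count 30000 30009   # reduced range used here; prints 10
```

Each simultaneous PASV data connection needs its own port from the pool, so 10 ports puts a small cap on concurrent transfers, which suits an occasional-use server.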

When PASV mode is in effect, the server tells the client where it should connect, and by default it does this by reporting the IP address on which the client connected.  However, I am behind NAT, so that's not going to be reachable!  So I use the ForcePassiveIP option to have pure-ftpd say something different.  If I had a static public IP, that would be suitable, but I am on dynamic DNS, so the DNS name is more appropriate.  Pure-ftpd will look up that name and report its IP.

This has a consequence:  if you are on the local network, PASV will probably not work!  That's because your client will be trying to connect to the public IP, and when coming from inside the network, that connection will probably not be NAT'ed back into the network.  I believe there are some router shenanigans you can do if you have that much control over your router, but I prefer just to remember:  'only use active mode inside the network'.  However, for testing, you can temporarily supply the internal IP address, and use the hosts file on your client machine to direct the DNS name to the internal network address, and PASV will work there.  This is just for testing, because PASV will then /not/ work for clients outside the network!  Testing only!

*** pure-ftpd.orig.conf
--- pure-ftpd.conf
***************
*** 368,370 ****
  
! MaxDiskUsage                   99
  
--- 369,371 ----
  
! MaxDiskUsage                   75

This is to-taste.  It makes the daemon start refusing uploads once the volume holding the (virtual) user's home directory is more than the given percentage full.  Since this volume is shared with a bunch of unrelated services, I cranked this down quite a bit so that it is less likely to cause other services to fail.
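Since MaxDiskUsage is a percentage, it helps to know how full the volume already is before picking a number. A small sketch using POSIX 'df' output (the function name is hypothetical; point it at whatever path holds the FTP data):

```shell
# Print the used percentage (digits only) of the filesystem containing a path.
# Relies on the portable "df -P" layout, where column 5 is the capacity.
disk_usage_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Example on the host:
#   disk_usage_pct /mnt/datadrive/srv/data/ftp/virtual
```

If the number printed is already above the chosen MaxDiskUsage, uploads would be refused from the start.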

*** pure-ftpd.orig.conf
--- pure-ftpd.conf
***************
*** 418,420 ****
  
! # TLS                          1
  
--- 419,421 ----
  
! TLS                          1
  
***************
*** 439,441 ****
  # CertFile                     /etc/ssl/private/pure-ftpd.pem
! # CertFileAndKey               "/etc/pure-ftpd.pem" "/etc/pure-ftpd.key"
  
--- 440,442 ----
  # CertFile                     /etc/ssl/private/pure-ftpd.pem
! CertFileAndKey               "/etc/letsencrypt/live/example.com/fullchain.pem" "/etc/letsencrypt/live/example.com/privkey.pem"

With this installation I am going to support FTP-over-TLS, so I selected option '1'.  This means 'do TLS or plaintext -- whatever the client requests'.  Option '2' is TLS only, and option '0' is plaintext only.

The CertFileAndKey option allows us to specify /separate/ certificate and key files, which is what certbot is going to automatically manage for us.  Much like with the other services, I am going to mount the certbot tree in the conventional location, hence the paths chosen here.
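Because the certificate and key now live in separate files, a quick sanity check that they actually belong together can save some head-scratching later. The sketch below compares the public key embedded in the (first) certificate with the one derived from the private key. The function name is made up for illustration, and it needs the openssl command-line tool wherever it is run.

```shell
# Succeeds if the private key in $2 matches the first certificate in $1.
# cert_matches_key is a hypothetical helper name; requires the openssl CLI.
cert_matches_key() {
    # Public key as recorded in the certificate (leaf cert of a fullchain file)
    cert_pub=$(openssl x509 -in "$1" -noout -pubkey) || return 2
    # Public key derived from the private key
    key_pub=$(openssl pkey -in "$2" -pubout) || return 2
    [ -n "$cert_pub" ] && [ "$cert_pub" = "$key_pub" ]
}

# Usage with the paths from the config above:
#   cert_matches_key /etc/letsencrypt/live/example.com/fullchain.pem \
#                    /etc/letsencrypt/live/example.com/privkey.pem && echo "pair OK"
```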

Sticking Stuff in Places

As with the other services, I created a directory:

/mnt/datadrive/srv/config/pure-ftpd

That contains config-related files.  In this directory I made a sub-directory 'etc' that contains the stuff I will mount onto /pure-ftpd/etc.  It will contain three files: pure-ftpd.conf, pureftpd.passwd, and pureftpd.pdb.

At this juncture I will just have pure-ftpd.conf.  Time to make some virtual users.

Making (l)users

Before I get to testing, I need to have a couple virtual users in existence so I can log in.  I don't have the pure-ftpd tools to do that on the host machine, so I do that from inside the container.  For this purpose the container will be launched interactively like this:

docker run -it --rm --name ftptest \
    --mount 'type=bind,src=/mnt/datadrive/srv/data/ftp/virtual,dst=/srv/ftp/virtual' \
    --mount 'type=bind,src=/mnt/datadrive/srv/config/pure-ftpd/etc,dst=/pure-ftpd/etc' \
    pure-ftpd sh

This is a 'minimal' launch in that I haven't mounted stuff or published ports necessary to actually run the daemon (so I can do this even if another one is running that /does/ expose those ports), but I do just enough to run some tools.

As mentioned, pure-ftpd has two user databases that must be in sync, but I found the special option '-m' that will allow you to keep them in sync when making changes.  So I can create our first user:

pure-pw useradd easy123 -m -u ftpuser -d /srv/ftp/virtual/easy123

Because I set the PURE_PASSWDFILE and PURE_DBFILE variables, I don't have to specify those filenames on the command line.

On the host machine there was already a directory for this user:

/mnt/datadrive/srv/data/ftp/virtual/easy123

When running pure-pw useradd, I specify the password twice at the prompts.  No, this isn't especially scriptable, and the pure-ftpd docs explain the rationale for that.  They also go into a couple of shenanigans one can pull if you /really/ need to do it.  For my rarely-used-except-when-in-a-pinch server, I will bite the bullet and do it interactively on the few occasions I need to do so.
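For completeness, one workaround in common use (treat this as an assumption, not something tested in this log) is to feed the password on standard input twice, since pure-pw appears to read from stdin when it is not attached to a terminal. A sketch with a hypothetical wrapper name:

```shell
# Hypothetical non-interactive wrapper: answers both password prompts via stdin.
# Caveat: the password ends up in shell history and the process list this way.
add_ftp_user() {
    user="$1"; pass="$2"; home="$3"
    printf '%s\n%s\n' "$pass" "$pass" |
        pure-pw useradd "$user" -m -u ftpuser -d "$home"
}

# Usage inside the container:
#   add_ftp_user easy123 'some-password' /srv/ftp/virtual/easy123
```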

Now I exit and can test!

Testing

I am going to test both active and passive mode internally, so first I will edit the /mnt/datadrive/srv/config/pure-ftpd/etc/pure-ftpd.conf to temporarily specify ForcePassiveIP to be the machine's internal IP address '192.168.173.43'.  I won't leave it this way for production, but this way PASV will work from within the network.

I also modified my client machine's hosts file to point my 'example.com' domain to the internal IP address, so that I can reach it from the local network, and test out TLS.  My certbot has already been run some time back, so I already have the "/etc/letsencrypt/live/example.com/fullchain.pem" and "/etc/letsencrypt/live/example.com/privkey.pem".  Since this domain is on dynamic DNS, I can't have subdomains with my provider, so it is fine for me to use the same cert/key that I use for WWW.

Now I can launch fer real!

docker run -d --rm -p 21:21 -p 30000-30009:30000-30009 --name ftptest \
    --mount 'type=bind,src=/mnt/datadrive/srv/config/certbot/etc/letsencrypt,dst=/etc/letsencrypt' \
    --mount 'type=bind,src=/mnt/datadrive/srv/config/nginx/dhparam.pem,dst=/etc/ssl/private/pure-ftpd-dhparams.pem' \
    --mount 'type=bind,src=/mnt/datadrive/srv/data/ftp/virtual,dst=/srv/ftp/virtual' \
    --mount 'type=bind,src=/mnt/datadrive/srv/config/pure-ftpd/etc,dst=/pure-ftpd/etc' \
    pure-ftpd

This detaches from the tty (i.e. returns control to the user and effectively runs in the background), self-cleans-up on main process exit, and publishes the ports I need.  Note that here I can use port-range syntax for the PASV stuff.  The cert/key stuff is mounted in the typical (for certbot) location, much as before.  Pure-ftpd has a hard-coded path for Diffie-Hellman parameters, so I mount those at the expected path.  Strictly, you don't need to supply this at all since there is a baked-in default in the pure-ftpd source code.  I find it amusing that pure-ftpd seems to know that DH params are non-secret, yet the hard-coded path seems to indicate that they are private things.  Whatever.  The baked-in default is 2048 bits, I believe.  But since I have some 4096 already cooked up from before, why not use them?  My data tree for ftp on the datadrive is mounted into the expected place.  And finally the /pure-ftpd/etc is mounted.  I should talk about this briefly.

I found through a fair amount of pain that you need to mount the /pure-ftpd/etc as a directory which contains the files therein, rather than mounting those files directly.  In particular, for the user database stuff.  If you mount the three files directly, then pure-pw will not work!  That tool deletes and re-creates files, rather than opening and modifying.  This scenario apparently doesn't work with docker mounted files.  But mounting the directory as a whole enables such deletion/recreation activities.  Caveat configurator!
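A likely explanation (my reading, not something confirmed in the pure-ftpd docs) is that a bind-mounted file is pinned to one particular inode, while a tool that replaces the file, whether by deleting and re-creating it or by writing a new one and renaming it over the old, ends up with a different inode under the same name. The rename variant is easy to see with plain files in a scratch directory, no docker involved:

```shell
# Demonstrate that replace-by-rename yields a different inode for the same name.
workdir=$(mktemp -d)
inode_of() { ls -i "$1" | awk '{ print $1 }'; }

echo "old contents" > "$workdir/pureftpd.passwd"
before=$(inode_of "$workdir/pureftpd.passwd")

# What an update tool may effectively do: write a new file, then rename it over.
echo "new contents" > "$workdir/pureftpd.passwd.tmp"
mv "$workdir/pureftpd.passwd.tmp" "$workdir/pureftpd.passwd"
after=$(inode_of "$workdir/pureftpd.passwd")

echo "inode before: $before, after: $after"
rm -rf "$workdir"
```

The two numbers differ even though the file name is unchanged. Mounting the containing directory sidesteps this, because names are then resolved inside the mounted directory each time.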

With this now running, I can finally test.  I am using WinSCP on the client machine.  There are a couple caveats with that:

I created a 'site' definition for the 'easy123' user I set up earlier, and tried connecting and uploading, downloading, and deleting files.  I tried that under TLS and unencrypted, and I tried that under active and PASV modes.  Yay!  For fun, I also tried in a separate session to the host machine copying files into and out of /mnt/datadrive/srv/data/ftp/virtual/easy123, and saw that those showed up and were similarly accessible through the ftp site.
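For anyone without a GUI client handy, a similar smoke test can be approximated from a command line with curl. This is a sketch rather than what was actually used here: the function name is hypothetical, 'example.com' stands in for the real host as elsewhere in this log, and curl will prompt for the password because only the user name is given.

```shell
# Hypothetical smoke test: list the user's home directory over explicit TLS.
# mode is "pasv" (the default) or "active".
ftp_smoke_test() {
    host="$1"; user="$2"; mode="${3:-pasv}"
    if [ "$mode" = active ]; then
        # "--ftp-port -" asks for active mode using the client's own address
        curl --ssl-reqd --ftp-port - --user "$user" "ftp://$host/"
    else
        curl --ssl-reqd --ftp-pasv --user "$user" "ftp://$host/"
    fi
}

# Usage:
#   ftp_smoke_test example.com easy123          # passive
#   ftp_smoke_test example.com easy123 active   # active
```

Dropping '--ssl-reqd' would exercise the plaintext path that the 'TLS 1' setting still allows.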

OK, testing is finished!  I use:

docker container stop ftptest

to stop and auto-remove that container.  Now it's time to make it into a real service.

Since we're going to production, I first undo the ForcePassiveIP hack I made in pure-ftpd.conf for testing PASV, and return it to my dynamic DNS name 'example.com'.

systemd

This service doesn't need to be part of the web-related suite of services, so I am going to make a separate configuration for it.  This way I can start/stop/enable/disable independently of the web stuff.

My docker-compose@.service that I created way back does not need to be changed.  That rather delegated the service-specific activities to a docker-compose.yml file in a named subdirectory of /etc/docker/compose.  Here it will be called 'ftp':

/etc/docker/compose/ftp/docker-compose.yml:

version: '3'
services:

  #pure-ftpd service
  ftp:
    image: pure-ftpd
    container_name: ftp
    restart: unless-stopped
    tty: true
    ports:
      - "21:21"
      - "30000-30009:30000-30009"
    volumes:
      - /mnt/datadrive/srv/config/certbot/etc/letsencrypt:/etc/letsencrypt
      - /mnt/datadrive/srv/config/nginx/dhparam.pem:/etc/ssl/private/pure-ftpd-dhparams.pem
      - /mnt/datadrive/srv/data/ftp/virtual:/srv/ftp/virtual
      - /mnt/datadrive/srv/config/pure-ftpd/etc:/pure-ftpd/etc

This is pretty much just a transcoding of the command-line options we used when testing.

Then the yoosh of:

sudo systemctl enable docker-compose@ftp

and for this session, manually start:

sudo systemctl start docker-compose@ftp

And do another final test cycle (except for PASV).

Finis!

Next

A quicky for MQTT via stock eclipse-mosquitto.

Discussions

Gerben wrote 11/21/2020 at 19:35

Just a heads up. Browsers are starting to remove FTP support. So it may become less convenient for sharing files.


ziggurat29 wrote 11/22/2020 at 21:05

yes, alas; all things come to an end
