
Dicker with Docker

A project log for Dockerize All the Things

Hard drive crash! Recover, revive, and re-engineer the server, using Docker-contained services this time.

ziggurat29, 11/10/2020 at 17:11

Summary

Just exploring some Docker basics, with a bent towards Raspberry Pi when relevant.

Deets

The path to here is long and tortuous.  Multics. Unix. VAX. 386. V86 mode. chroot. Solaris Zones. cgroups. LXC.  And now, Docker.

The concept is similar to a chroot jail, which partitions off part of the filesystem namespace, except that much more is partitioned as well -- things like PIDs, network sockets, etc.  The low-level technology on Linux is kernel namespaces together with 'cgroups'.  Windows also now has a containerization capability, though it is based on completely different technology.
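If you want a small taste of the underlying namespace machinery (this is just an aside; Docker does all of this for you), util-linux's 'unshare' can drop a shell into a fresh PID namespace:

sudo unshare --fork --pid --mount-proc sh
ps -ef      # only this shell and ps itself are visible
exit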

'Docker' is a product built upon that low-level containerization technology that simplifies the use thereof.  (lol, 'simplifies'. It's still a bit complex.)  When you use it, you create a logical view of a system installation that has one purpose -- e.g. a web server.  This logical view of the system is packaged in an 'image' that represents the filesystem, and is then reconstituted into a 'container' that represents the running instance.  The Docker program also helps with creating logical networks to which these containers are connected, and logical volumes that represent their persistent storage.  The result is similar to a virtual machine, but differs in that the contained applications still run natively on the host machine.  As such, those applications need to be built for the same CPU architecture and the same class of operating system -- Windows apps in Windows containers running on a Windows host, and Linux on Linux.  With Linux, though, you can run a different distribution in the container than that of the host.

Containers are much more resource friendly than full virtualization, and part of keeping that advantage is selecting a small-sized distribution for the container's OS image.  Alpine Linux is very popular as a base for containerized applications, and results in about a 5 MB image to start with.

For my host OS, I chose Ubuntu Server (18.04).  Note that Docker requires a 64-bit host system, so that is the build I installed.  Initial system update:

sudo apt update -y && sudo apt-get update -y && sudo apt-get upgrade -y && \
sudo apt dist-upgrade -y && sudo apt-get autoremove -y && \
sudo apt-get clean -y && sudo apt-get autoclean -y

For historic reasons, I created a separate partition called 'datadrive' and set it up to mount via fstab.  This is an artifact of migrating the system over the years -- originally it was a separate, large drive.  It contains application data files, such as databases, www, ftp, source control, etc.  This is not a required setup, and I don't know that I even recommend it.

sudo bash -c 'echo "LABEL=datadrive  /mnt/datadrive  ext4  noatime,nodiratime,errors=remount-ro  0  1" >> /etc/fstab'

Then I make a swap file:

sudo fallocate -l 2G /var/swapfile
sudo chmod 600 /var/swapfile
sudo mkswap /var/swapfile
sudo swapon /var/swapfile
sudo bash -c 'echo "/var/swapfile swap swap defaults 0 0" >> /etc/fstab'

It's useful to note that a swap file has compatibility issues with Kubernetes (aka 'k8s'); by default the kubelet refuses to run with swap enabled, so if you eventually want to go that route you'll probably wind up turning it back off.  But I'm not planning on doing k8s on this machine, so I'll leave swap on for now.
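To confirm the swap is live now, and for reference, how it would be backed out later:

swapon --show
free -h
# later, to back it out:  sudo swapoff /var/swapfile, then remove its line from /etc/fstab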

Then it's time to do some installing:

# Install some required packages first
sudo apt update
sudo apt install -y \
     apt-transport-https \
     ca-certificates \
     curl \
     gnupg2 \
     software-properties-common

# Get the Docker signing key for packages
curl -fsSL https://download.docker.com/linux/$(. /etc/os-release; echo "$ID")/gpg | sudo apt-key add -

# Add the Docker official repos
echo "deb [arch=$(dpkg --print-architecture)] https://download.docker.com/linux/$(. /etc/os-release; echo "$ID") \
     $(lsb_release -cs) stable" | \
    sudo tee /etc/apt/sources.list.d/docker.list

# Install Docker
sudo apt update
sudo apt install -y --no-install-recommends \
    docker-ce \
    cgroupfs-mount

#set dockerd to run on boot (and get it running now)
sudo systemctl enable docker
sudo systemctl start docker

# Install required packages
sudo apt update
sudo apt install -y python3-pip libffi-dev
sudo apt install -y libssl-dev libxml2-dev libxslt1-dev libjpeg8-dev zlib1g-dev

# Install Docker Compose from pip (using Python3)
# This might take a while
sudo pip3 install docker-compose

# run docker without sudo (will not be effective until log out/in)
sudo usermod -aG docker $(id -u -n)

#change boot mode to non-gui

sudo systemctl set-default multi-user.target
# to go back, set to 'graphical.target'
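With all that in place (and after logging out and back in so the group change takes effect), a quick smoke test:

docker version
docker run --rm hello-world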

Some Basic Docker Concepts

Images

Docker uses a virtual filesystem backed by an 'image'.  These images are composed of 'layers' which make up an overlay filesystem.  That is, the effective view of the filesystem is that of all the layers merged together.  The top layer is read/write; the others are read-only.  This will make more sense when I discuss building images, because that's where the layers get created.

If you issue:

docker info

You'll get a lot of details about your installation.  In particular:

...
Docker Root Dir: /var/lib/docker
...

which is where the images will be stored (amongst other things).  Where specifically depends on other aspects of the configuration.  In my case, the 'Storage Driver' is 'overlay2', and so the images are in '/var/lib/docker/overlay2'.  You don't mess with these directly, though.  Rather, you use the main 'docker' tool to execute various commands; e.g.:

docker image ls

will list all the images and present the data in a more friendly way.  There are a boatload of commands, q.v. docs at http://docs.docker.com, but I'll walk through a couple of the common ones here.

Registry

Docker supports the notion of a 'registry' containing images.  There is a local registry on your system at /var/lib/docker, and there are external ones hosted by others.  The most common is 'Dockerhub' at http://hub.docker.com.  This is also the default that is built into the docker tool.  The act of fetching an image from the remote repository into your local repository is called a 'pull', and sensibly the act of sending an image to a remote repository is called a 'push'.  You can register an account on Dockerhub and push your images there, though there are some restrictions on the free account for how long they will keep them for you.
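For completeness, the push direction looks roughly like the following; 'youruser' is just a stand-in for an actual Docker Hub account name:

docker login
docker tag alpine:latest youruser/my-alpine:latest
docker push youruser/my-alpine:latest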

As I mentioned, Alpine Linux is a popular base image, so we can pull it:

docker pull alpine:latest

The second part of the name, 'alpine:latest', is called the 'tag'.  It is free-form, but by convention it is typically a version number.  Also by convention, 'latest' means the latest release.  There is no magic to this -- only convention (with the possible exception that if you run 'docker pull alpine' without a tag, the tool assumes 'latest').
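So, for example, you can pin a particular release instead of taking 'latest' (assuming that tag is actually published on Docker Hub):

docker pull alpine:3.12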

Pulling brings the image into your local registry.  But it's just a virtual filesystem image -- it's not alive.  You can list the images with 'docker image ls' and see stuff:

REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
alpine              latest              2e77e061c27f        2 weeks ago         5.32MB

Containers

To get it running, you need a 'container'.  The container is the execution environment instantiated on the image.  Because the filesystem is overlaid, and because the top layer is the only read/write one, you can start multiple containers on the same image and these will not interfere with each other, because they will have distinct top layers.

docker container run -it alpine sh

and you'll be put in 'sh' inside the container, as root:

/ #

You can poke around and see that you are definitely not on your host system.  To exit, just 'exit' as per usual and you will be put back on your host system.  Since the main application 'sh' has now exited, the container is also stopped.  Stopped containers still take up resources, though.  You can see them:

docker container ls -a

CONTAINER ID  IMAGE    COMMAND   CREATED         STATUS                     PORTS  NAMES
8ecf6e60bc1a  alpine   "sh"      9 minutes ago   Exited (0) 8 seconds ago          youthful_elion

You need the -a switch so that even the stopped ones will be shown.  They will hang around forever, so you need to remove them explicitly:

docker container rm 8ecf

You can specify either the 'container id' or the 'name'.  When we created the container, we didn't specify a name, so the docker tool made up a random one, 'youthful_elion'.  If you use the container id, you only need to specify enough leading hex digits to disambiguate it from the others -- I usually just use the first four.

I guess it became somewhat of an annoyance to manually remove stopped containers, because there is an option to have the container auto-remove whenever the main application has exited; e.g.:

docker container run -it --rm alpine sh

The '--rm' means 'auto remove on stop'.  I should mention at this point that the '-it' option means 'interactive' plus 'tty'; it connects the container to the host's terminal, which is what let us interact with the program running inside, 'sh'.  In most Docker containers you would be running something other than the shell, and the image typically has a command baked in to run, rather than your having to provide one when starting the container.  This Alpine image is meant as a minimal base, so there's little beyond the shell to run in it anyway.  More on this when we get to building a new image.
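If you're curious what an image is set to run by default, you can inspect it; the exact output depends on the image and tag:

docker image inspect alpine --format '{{.Config.Cmd}}'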

Now you can see that when you exit the 'sh' in the container, the container will have auto-removed itself, per 'docker container ls -a'.  The auto-remove feature is handy when doing development and testing.
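Incidentally, the earlier point about multiple containers on one image not interfering is easy to see directly, since each 'run' gets its own writable top layer (the names 'c1' and 'c2' here are arbitrary):

docker container run --rm --name c1 alpine touch /only-in-c1
docker container run --rm --name c2 alpine ls /only-in-c1    # fails; the file existed only in c1's layer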

Another useful trick is starting an additional process in a running container; in particular, it is handy to launch an interactive shell so you can poke around from the container's viewpoint of its filesystem.  To demonstrate, we'll get a container running, return to the host while leaving it running, and then start a separate shell process inside it to poke around with.

docker container run -it --rm alpine sh

OK, now you're in the container.  You can leave the container without stopping it by using the key sequence 'Ctrl-P,Ctrl-Q'.  You can see the container is still running:

docker container ls

CONTAINER ID  IMAGE   COMMAND  CREATED         STATUS         PORTS  NAMES
66e39b791db0  alpine  "sh"     50 seconds ago  Up 47 seconds         hardcore_pike

Now you can exec a new shell in it to poke around:

docker container exec -it hardcore_pike sh
/ # 

You can run 'ps' and see the first shell (still running) and the new one we are in now:

ps -ef
PID   USER     TIME  COMMAND
    1 root      0:00 sh
    6 root      0:00 sh
   12 root      0:00 ps -ef

Networks

It's useful to notice that docker creates a network adapter in these containers:

/ # ifconfig
eth0      Link encap:Ethernet  HWaddr 02:42:AC:11:00:02
          inet addr:172.17.0.2  Bcast:172.17.255.255  Mask:255.255.0.0

From within the container you can reach the host, and other systems on your network; however, those systems cannot reach your container.  For that, you need to 'publish' some port from your container.  We'll cover this properly later when we build some more interesting images.
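As a quick preview, though, something like this publishes container port 80 on host port 8080 using the stock nginx image (purely a throwaway example):

docker container run -d --rm -p 8080:80 --name webtest nginx:alpine
curl http://localhost:8080      # the nginx welcome page
docker container stop webtest   # --rm cleans it up on stop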

If you 'exit' the (second) shell, you will be back at the host, with the container still running.  Another way of getting back into the container is with an 'attach' command.  This does not start a new process in the container, but rather re-establishes the link with the terminal session therein.

docker container attach hardcore_pike

Then you'll be back in.  You can see with 'ps -ef' that this is the original shell (PID 1).  And now if you 'exit', the container will be 'stopped'.  If it was started with '--rm', it will also be auto-removed.

Volumes

The containerized environment does not have access to the filesystem outside of the container.  However, you can make that available as needed.  There are two modes of doing so:  'volumes' and 'binds' (bind mounts).  'Volumes' are virtualized filesystems much like the docker image itself, and 'binds' are like a mount point, making a file or directory on the host visible within the container.  I won't do much with these yet, but they will be used in later examples to make data on the 'datadrive' partition visible to the containerized application; a minimal taste of the syntax follows.
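A rough sketch of each, just to show the shape of the syntax (the volume name and paths here are only placeholders):

# named volume, managed by docker under /var/lib/docker
docker volume create mydata
docker container run -it --rm -v mydata:/data alpine sh

# bind mount, exposing a host directory inside the container
docker container run -it --rm -v /mnt/datadrive/www:/var/www alpine sh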

Next

Using an existing image.
