
Logging Protocol: Part 1

A project log for Raspberry Pi Sensor Network

A scalable, expandable home monitoring system designed for ease of use.

staticdet5, 03/15/2016 at 21:08

If you've read previous logs, you'll know that one of the goals is to log many different kinds of sensors, events, and activities. To do this, and to use these logs to drive future decisions, the logging protocol needs to be flexible. To let a wide range of users work with the system as a whole, the logging protocol also needs to be relatively open ended.

For these reasons, I chose to use Python dictionaries to store data (either cPickled or as a standard Python output file). This is going to allow me to store values with verbal keys that define what the value is. The log file (after it is unPickled) is going to be human readable (allowing for easy use by others).
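As a rough sketch of what a human-readable logline could look like as flat text: the snippet below writes a dictionary out with repr() and reads it back with ast.literal_eval(). The key names and values are made up for illustration, and the one-dictionary-per-line text layout is an assumption, not necessarily the project's actual file format.

```python
import ast

# Hypothetical logline; the keys and values are illustrative only.
logline = {'Sensor': 'garage', 'Temp': 12.5}

# repr() gives the flat-text form that a person can read directly.
text = repr(logline)

# ast.literal_eval() safely turns that text back into a dictionary
# (it only accepts Python literals, unlike eval()).
parsed = ast.literal_eval(text)
```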

In Python, dictionaries are considered an "Indexed" data type. An easy "Indexed" data type is a numbered list (in Python, called a "List", curiously enough). For a numbered list, you have a value associated with a whole number. Each value is assigned a whole number, starting with zero. If you add a new value to the list, it automatically gets the next number in line.
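A small sketch of the numbered-list idea (the sensor names are invented for the example):

```python
# A list gives each value a whole-number index, starting at zero.
sensors = ['garage', 'attic', 'furnace room']  # hypothetical sensor names

first = sensors[0]  # index 0 is the first value

# A newly appended value automatically gets the next index in line.
sensors.append('porch')
new_index = len(sensors) - 1
last = sensors[new_index]
```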

Dictionaries are like that, but instead of a number, you have... something else. We call that something else a "Key". Each dictionary can have a large number of key/value pairs, but each key must be unique. The values can be anything. I can use a Python dictionary to describe a car: car = {'Wheels' : 4, 'Color' : 'Blue', 'Speed' : '6', 'Seats' : '5', 'Roof': 'Convertible'}. I can also use it to describe a motorcycle: motorcycle = {'Wheels' : 2, 'Color' : 'Blue', 'Speed' : '6', 'Seats' : '1.5'}

When I need a value back, I can easily get it. car['Color'] returns 'Blue'. It gets a little cooler, because I can do things like: 'Roof' in car, and get back True. But if I did that with motorcycle, it would report back False. (Keys are case-sensitive, so 'roof' would not match 'Roof'.)
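Here is the car and motorcycle example as a runnable sketch. The .get() call at the end is an addition for illustration; it is one common way to ask for a key that might be missing without raising an error.

```python
car = {'Wheels': 4, 'Color': 'Blue', 'Speed': '6', 'Seats': '5', 'Roof': 'Convertible'}
motorcycle = {'Wheels': 2, 'Color': 'Blue', 'Speed': '6', 'Seats': '1.5'}

# Look a value up by its key.
car_color = car['Color']

# Membership tests check the keys, and keys are case-sensitive.
car_has_roof = 'Roof' in car
bike_has_roof = 'Roof' in motorcycle
lowercase_found = 'roof' in car

# .get() returns a default instead of raising KeyError for a missing key.
bike_roof = motorcycle.get('Roof', 'none')
```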

Looking at the logging function of the project, this gets really useful. I can easily write code that grabs log files, and only pulls data that meets specific criteria. If I'm looking to check temperatures for the house, between certain times, I can immediately cull any loglines that don't fall within my time frame. I can then cull any loglines that don't contain the "Temp" key in the dictionary. Then I can parse all of the remaining loglines and pull the temperatures that I need. Further, because the dictionary can have almost any arbitrary length, I can include a fair amount of detail there. I can further limit the logline pull so that it only checks the garage temperature sensor. Or all the temp sensors except for the one in the furnace room. This is going to give tremendous capability to the user, down the line.
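The culling steps described above might look something like this. It is only a sketch: the 'Time' and 'Sensor' key names, the sample loglines, and the temps_between() helper are all hypothetical (only the "Temp" key comes from this log), and the real loglines may be laid out differently.

```python
from datetime import datetime

# Hypothetical loglines; only the 'Temp' key name comes from the post.
loglines = [
    {'Time': datetime(2016, 3, 15, 8, 0), 'Sensor': 'garage', 'Temp': 12.5},
    {'Time': datetime(2016, 3, 15, 9, 0), 'Sensor': 'furnace room', 'Temp': 27.0},
    {'Time': datetime(2016, 3, 15, 9, 30), 'Sensor': 'front door', 'Event': 'opened'},
    {'Time': datetime(2016, 3, 15, 23, 0), 'Sensor': 'garage', 'Temp': 9.0},
]

def temps_between(lines, start, end, exclude=()):
    """Return (sensor, temp) pairs logged between start and end, inclusive."""
    results = []
    for line in lines:
        if not (start <= line['Time'] <= end):
            continue  # cull loglines outside the time frame
        if 'Temp' not in line:
            continue  # cull loglines that carry no temperature
        if line.get('Sensor') in exclude:
            continue  # cull sensors the caller asked to skip
        results.append((line.get('Sensor'), line['Temp']))
    return results

start = datetime(2016, 3, 15, 7, 0)
end = datetime(2016, 3, 15, 12, 0)
morning = temps_between(loglines, start, end)
no_furnace = temps_between(loglines, start, end, exclude=('furnace room',))
```

Because each check is just a key test on a dictionary, adding another criterion (a particular room, a minimum temperature) is one more if statement.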

The drawback is that this isn't a tiny file format. It's human readable, and if it isn't Pickled (cPickled), then it is flat text. Users could configure the system to use flags and tags to designate certain values and locations, but it would lose the easy human readability. They'd probably have to use a chart or a computer program to glance at logs, and I want to avoid that.

With that bulk comes a slower system. So far, this hasn't been an issue. But the plan is to Pickle the loglines whenever possible to keep them compact.

Pickling is how Python "serializes" objects, turning them into a byte stream. This byte stream can be used for a lot of things, but is typically (in my experience) written to disk. The opposite of Pickling is unPickling, which "unpacks" the byte stream back into the original object, allowing Python to easily use it. Pickle is written in Python, and you can take it apart and see how it works. Pickle is kinda slow, so someone wrote cPickle, which is written in C. The Python documentation says this can make it up to a thousand times faster.
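A minimal sketch of a Pickle round trip. The logline is made up, and the try/except import is an assumption added so the same snippet runs on Python 2 (cPickle) and on Python 3, where the plain pickle module uses its C implementation automatically when it is available.

```python
try:
    import cPickle as pickle  # Python 2: the C implementation
except ImportError:
    import pickle             # Python 3: C accelerator is used automatically

logline = {'Sensor': 'garage', 'Temp': 12.5}  # hypothetical logline

# Pickling: turn the object into a byte stream (protocol 2 is a binary format
# supported by both Python 2 and Python 3).
data = pickle.dumps(logline, 2)

# unPickling: turn the byte stream back into an equal object.
restored = pickle.loads(data)
```

One thing worth keeping in mind: unPickling can run arbitrary code, so pickle.loads() should only be fed data from sources you trust.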

Pickle helps us in three ways (so far). First, it makes it really easy to save things. You can Pickle almost any object. Once an object is Pickled, you can save it to a file or send it over the network (among other things). Next, Pickled objects can be more compact than their text form, particularly with the binary protocols, which saves space on the drive. Finally, since the loglines are Pickled before being sent over the network, a smaller Pickled object means less network traffic.
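To see how the first two points play out for a given logline, something like the sketch below can save a Pickled dictionary to disk and compare its size against the flat-text form. The logline and file name are hypothetical, and the size difference depends heavily on the data and the protocol, so it is worth measuring with real loglines rather than assuming.

```python
import os
import tempfile

try:
    import cPickle as pickle  # Python 2
except ImportError:
    import pickle             # Python 3

# Hypothetical logline; real ones may carry more keys.
logline = {'Sensor': 'garage', 'Temp': 12.5, 'Units': 'C'}

# Size of the flat-text form versus the Pickled form (binary protocol 2).
text_size = len(repr(logline))
pickled = pickle.dumps(logline, 2)
pickled_size = len(pickled)

# Save the Pickled logline to a file, then load it back.
path = os.path.join(tempfile.mkdtemp(), 'logline.pkl')  # hypothetical file name
with open(path, 'wb') as f:
    pickle.dump(logline, f, 2)
with open(path, 'rb') as f:
    from_disk = pickle.load(f)

print('text: %d bytes, pickled: %d bytes' % (text_size, pickled_size))
```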

This log got a little big. We'll cover WHAT is already being logged in the next one.
