
Hack Chat Transcript, Page 1

An event log for the Software for Satellites Hack Chat

It's rocket AND computer science

Dan Maloney, 07/12/2023 at 20:40

Dan Maloney12:00 PM
Okie doke, top of the hour, let's get going! I'm Dan, I'll be modding today along with Dusan as we welcome Jacob Killelea to the Software for Satellites Hack Chat!

Jacob, how about you kick us off with a little about yourself?

Jacob Killelea12:00 PM
Sure thing!

homer.sajonia.ii joined  the room.12:02 PM

Jacob Killelea12:02 PM
Hi all, my name is Jacob Killelea! I’m an embedded and aerospace engineer currently living in the San Francisco Bay Area. My background and schooling was in Aerospace Engineering at CU Boulder, and I’ve been interested in just about everything relating to aerospace. I hold a private pilot certificate and a commercial small UAS certificate, and I spend my spare time hacking on microcontrollers, designing PCBs, and avidly watching what the Hackaday community comes up with!

My technical background covers aero and thermodynamics, control systems, life support, systems engineering, electrical and RF systems, and lots of software. From 2020 to 2022, I worked for Intuitive Machines in Houston, where I wrote flight software for the Nova-C lunar lander. I worked on payloads from high power laser rangefinders, to missile gyros, to space based camera systems. I got to get hands-on with flight hardware, test instruments in-flight, and contribute some pretty neat code to this mission.

Today, I'm here to tell you just about everything I remember about building, programming, and operating a spacecraft! A lot of stuff is similar to what you might be used to on the ground, but there's also some stark differences, both in how things are done and what the domain forces you to do. It's a fun and inventive domain, with a lot of technology developed for special purposes. Hopefully I can pass on some of what I know and fill in the gaps that people might have in what they know about space systems.

Jacob Killelea12:03 PM
So please, ask me anything and everything and I'll do my best to answer.

hkurz12:03 PM
wow!

thischillhome12:04 PM
Quite the resume. 😮

Dan Maloney12:04 PM
What sort of OSs might one encounter when coding for space systems?

Dan Maloney12:05 PM
I imagine some sort of real-time OS?

hkurz12:05 PM
Are you using an off-the-shelf OS or some special "space grade" OS for such jobs?

Dan Maloney12:05 PM
Great minds think alike...

hkurz12:05 PM
hehe

stu12:05 PM
What coding standards, certifications, or guidelines are used for pico satellites? I'm thinking MISRA, DO-178, or similar?

Jacob Killelea12:06 PM
Great question, and the answer is all over the place! There are the systems you would absolutely expect to find, such as Linux and FreeRTOS, but spacecraft have historically had a wide range of solutions. Nova-C uses two computers, with one running Linux and the other running a proprietary RTOS called VxWorks.

thischillhome12:06 PM
Tips and tricks to keeping stuff within power budget? 😅

stu12:07 PM
Which VxWorks? I haven't used them since 6.9?

Jacob Killelea12:08 PM
Other RTOSes include my personal favorite, RTEMS, and lots of specialty systems from aerospace vendors like Green Hills

Andrew Elbert Wilson12:08 PM
What are some ways flight software monitors, mitigates, and recovers from errors & failures?

Thomas Shaddack12:08 PM
What are the advantages and disadvantages of using linux? Any special variants against "stock"?

stu12:08 PM
What does the UAS certificate enable? Is it easy to obtain? Also, how much (approximately, in total) for private pilot's license. I'm thinking like $5,000-$10,000 all said and done. (practice flights, instructor time, etc).

hkurz12:09 PM
Which programming language is used most often in space?

stu12:09 PM
C!

Jacob Killelea12:09 PM
@homer.sajonia.ii, Power budgets are really hard in space! If nothing else works, your spacecraft has to be able to keep itself thermally stable and power positive! Those are the core responsibilities of features like a spacecraft's safe mode. When things break, you need to be able to get the system back into a state that will keep it safe.

Jacob Killelea12:10 PM
@stu I think we were still on 6.9! Things move slowly and carefully in space. When things are well proven, we call that flight heritage, and generally that system won't be updated/touched again unless it really needs to be

Thomas Shaddack12:11 PM
What's the range of temperatures the spacecraft can encounter, the "healthy" all-ok range, the "probably should survive" range, the "can't survive" range?

Dan Maloney12:12 PM
They ran a rad-hardened CDP1802 on Galileo, I think.

Jacob Killelea12:12 PM
@Andrew Elbert Wilson That falls under the broad umbrella of FDIR (Fault Detection, Isolation, and Recovery). Every space system should report some amount of telemetry, usually called housekeeping, that indicates if things are working correctly. Faults might be software errors or hardware problems, and spacecraft generally recover by restarting software or power cycling the hardware.
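
As a rough illustration of the housekeeping idea above, here is a minimal C sketch of limit checking plus a simple recovery policy. Everything in it is an assumption made for illustration: the `hk_channel_t` type, the function names, the example limits, and the escalation thresholds do not come from Nova-C or from any particular flight framework.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical housekeeping channel: one measured value and its allowed range. */
typedef struct {
    const char *name;
    double value;
    double min_ok;
    double max_ok;
} hk_channel_t;

/* Hypothetical recovery actions, from mildest to most drastic. */
typedef enum {
    ACTION_NONE,
    ACTION_RESTART_APP,
    ACTION_POWER_CYCLE
} fdir_action_t;

/* Detection: true if the value is outside its limits.
 * Written as a negated in-range test so that NaN also counts as a fault. */
bool hk_out_of_limits(const hk_channel_t *ch)
{
    assert(ch != NULL);
    return !(ch->value >= ch->min_ok && ch->value <= ch->max_ok);
}

/* Count how many channels in one housekeeping snapshot are out of limits. */
size_t fdir_count_faults(const hk_channel_t *channels, size_t n)
{
    size_t faults = 0;
    for (size_t i = 0; i < n; i++) {
        if (hk_out_of_limits(&channels[i])) {
            faults++;
        }
    }
    return faults;
}

/* Recovery policy (assumed thresholds): restart software first, and only
 * power cycle hardware if the fault persists for three or more checks. */
fdir_action_t fdir_choose_action(unsigned consecutive_faulty_checks)
{
    if (consecutive_faulty_checks == 0) {
        return ACTION_NONE;
    }
    if (consecutive_faulty_checks < 3) {
        return ACTION_RESTART_APP;
    }
    return ACTION_POWER_CYCLE;
}
```

A caller would sample its channels each cycle, track how many checks in a row reported a fault, and pass that count to `fdir_choose_action`.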

hkurz12:12 PM
Can the software be updated in-flight? How is this done securely?

Dan Maloney12:12 PM
Hardware, but the same conservative principle

Jacob Killelea12:13 PM
@Thomas Shaddack Linux doesn't provide the hard realtime guarantees that other systems do. People don't like to include it in the core spacecraft controls for that reason, but it's really nice and flexible if you have the computer resources to run it

Andrew Elbert Wilson12:13 PM
@Jacob Killelea Thanks, I am designing a soft RISC-V in an RTG4 FPGA for work, trying to think of low-level methods for FDIR.

Jacob Killelea12:14 PM
@stu, I just took an online class. Since I already had a PPL, the FAA made it easy for me. The PPL itself was a huge endeavor! I don't recommend it unless you can commit to flying at least twice a week in order to stay sharp, and twice a month once you have your cert.

Jacob Killelea12:15 PM
@hkurz, yep, C is the most common language. I wrote exclusively C software using the cFS framework from NASA

Jacob Killelea12:15 PM

https://github.com/nasa/cFS

Jacob Killelea12:16 PM
This is an amazing framework that runs on a few RTOSes and Linux, and supports a lot of the common tasks that spacecraft computers need, like process management, telemetry, error reporting, command ingest, scheduling, table driven configuration, and so on

daniel.fryling joined  the room.12:16 PM

Jacob Killelea12:16 PM
It also supports the CCSDS telemetry standards and has add-ons for multi-computer distributed computing

Jacob Killelea12:17 PM
@Thomas Shaddack it really depends on the component and where it's placed on the spacecraft. A lot of things are rated for -40C/+80C, but actually keeping your components in that range is hard! You need to carefully insulate against both cold and heat, and be mindful of where the sun is hitting your spacecraft

Jacob Killelea12:18 PM
@Dan Maloney We ran AiTech SP0-s computers based around old MCP750s. Similar to the RAD750 from BAE that is running on Mars, IIRC

hkurz12:18 PM
How do you test your software? Do you have a replica spacecraft, load the software there and do end-to-end tests or is it done mostly with unit tests?

Jacob Killelea12:19 PM
@Andrew Elbert Wilson, that reminds me of a payload I worked on that actually implemented a watchdog message at the FPGA level. It could tell you if the soft CPU was running at all

stu12:19 PM
What happens if you have a fault in your FDIR housekeeping? Space brick?

Jacob Killelea12:19 PM
@Andrew Elbert Wilson check out the LEON 3 soft CPU design, it's a SPARC architecture written in VHDL that ESA uses, along with RTEMS (version 4.11, which is quite old) on a lot of their systems

Thomas Shaddack12:20 PM
Any difference between terrestrial testing environment and the space? What can/did bite due to such differences?

Thomas Shaddack12:20 PM
How to recover from a space brick?

daniel.fryling12:20 PM
Do most of the modern satellites have a way to maintain their orbit indefinitely?

Jacob Killelea12:21 PM
@hkurz the motto is always to "Test like you fly, fly like you test." This is hard though! How do you simulate microgravity for your IMUs? How do you simulate three-axis rotational motion for your star tracker? The answer is to test each section as thoroughly as you can, and your whole composite of tests should cover operational cases of your spacecraft

Andrew Elbert Wilson12:21 PM
You can probably have a watchdog power system for a reboot or a cold spare as a last resort for FDIR.

Jacob Killelea12:23 PM
@stu, if all else fails, your spacecraft should be able to go into a recovery mode called safe mode. The goal of a safe mode is to keep your spacecraft thermally stable so it won't overheat, power positive so it won't drain the batteries, and to be able to phone home so ground operators can try and recover it. This can be tripped by any kind of reboot, watchdog, or software situation that doesn't have another method of recovery.

Andrew Elbert Wilson12:24 PM
I do lots of fault injection for Xilinx SRAM-based FPGAs to measure the configuration memory bit sensitivity. Can software fault injection into flight software be helpful as well?

Jacob Killelea12:25 PM
@Thomas Shaddack IIRC, there was a hacker-led recovery effort for an old NASA probe that was abandoned. Its batteries had been destroyed by thermal conditions, but it was spinning slowly in space, and every once in a while, its solar panels would provide enough power for it to boot and begin sending tones. I don't know if it ultimately succeeded, but people have tried!

daniel.fryling12:25 PM
Mostly written in C, is it a basic "while" loop?

Thomas Shaddack12:26 PM
Can radiation-induced faults be simulated on the ground? How are they detected/mitigated?

stu12:26 PM
Radiation-induced faults are like bit flips from gamma rays?

Josh Graff12:27 PM
You can take your chips to cyclotrons and hit them with strange particles until they stop working.

kb4oam joined  the room.12:27 PM

Jacob Killelea12:27 PM
@daniel.fryling no, basically every spacecraft relies on some kind of expendable propellant. Low earth orbit (LEO) sats have lots of drag from the fringes of the atmosphere, and their orbits will decay back into the planet in a matter of weeks to years without station keeping. Above that, there's not a lot of atmosphere, but perturbations from the earth's non-ideal gravity, the moon, other planets, solar radiation pressure, etc, will all serve to push a satellite off course. Even if it could hover in place, it would need to counteract various torques that develop on the spacecraft. These can be mostly handled by reaction wheels, but eventually those need to be de-saturated and spun down once they reach a limit. The only way to do that is with reaction mass

Jacob Killelea12:28 PM
@Andrew Elbert Wilson exactly, usually a multi-level watchdog with gradually increasing levels of action taken to recover the spacecraft.
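
One way to picture a multi-level watchdog is the C sketch below, in which each missed heartbeat deadline escalates to a stronger recovery action and safe mode is the last resort. This is only a sketch under stated assumptions: the level names, the `watchdog_t` structure, the millisecond timer, and the rule that a single heartbeat clears the escalation are all hypothetical choices, not a description of any flown system.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical escalation ladder, mildest action first. */
typedef enum {
    WD_OK = 0,        /* heartbeat seen recently, nothing to do */
    WD_RESTART_TASK,  /* level 1: restart the software task */
    WD_REBOOT_CPU,    /* level 2: reboot the computer */
    WD_POWER_CYCLE,   /* level 3: power cycle the hardware */
    WD_SAFE_MODE      /* level 4: last resort, drop into safe mode */
} wd_level_t;

typedef struct {
    uint32_t last_kick_ms;  /* time of last heartbeat or last escalation */
    uint32_t timeout_ms;    /* allowed silence before escalating */
    wd_level_t level;       /* current escalation level */
} watchdog_t;

void wd_init(watchdog_t *wd, uint32_t now_ms, uint32_t timeout_ms)
{
    assert(wd != 0);
    wd->last_kick_ms = now_ms;
    wd->timeout_ms = timeout_ms;
    wd->level = WD_OK;
}

/* Called by the monitored software to prove it is alive.
 * Assumption: one good heartbeat clears the escalation entirely. */
void wd_kick(watchdog_t *wd, uint32_t now_ms)
{
    wd->last_kick_ms = now_ms;
    wd->level = WD_OK;
}

/* Called periodically by the supervisor. Returns the action level to apply. */
wd_level_t wd_poll(watchdog_t *wd, uint32_t now_ms)
{
    /* Unsigned subtraction stays correct across timer wraparound. */
    if ((uint32_t)(now_ms - wd->last_kick_ms) >= wd->timeout_ms) {
        if (wd->level < WD_SAFE_MODE) {
            wd->level = (wd_level_t)(wd->level + 1);
        }
        /* Restart the timer so the recovery action gets a full timeout
         * to take effect before the next escalation. */
        wd->last_kick_ms = now_ms;
    }
    return wd->level;
}
```

With a 1000 ms timeout and no heartbeats, successive polls at 1000, 2000, 3000, and 4000 ms step through task restart, reboot, power cycle, and safe mode, and the level stays at safe mode after that.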

Jacob Killelea12:28 PM
@daniel.fryling usually only one

Jacob Killelea12:28 PM
only one while loop*

stu12:28 PM
Jacob, where are you working now? What are you working on these days? I had an offer from capella space back in 2018. Regret not joining them.

Jacob Killelea12:29 PM
You want _everything_ else on the spacecraft to execute in bounded time, so you know that it will eventually finish and hand control back to a main process.
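
The "one while loop, everything else bounded" pattern might look roughly like the C sketch below. The task table, `cmd_queue_t`, and the per-cycle limit are hypothetical names and values chosen for illustration, and `wait_for_next_tick()` in the comment is an assumed timer wait rather than a real API.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CMDS_PER_CYCLE 4  /* assumed cap on work done per cycle */

/* Hypothetical command queue, reduced to counters for the sketch. */
typedef struct {
    int pending;  /* commands still waiting */
    int handled;  /* commands processed so far */
} cmd_queue_t;

/* A bounded step: handles at most MAX_CMDS_PER_CYCLE commands and then
 * returns, no matter how many are queued. Leftovers wait for the next cycle. */
void cmd_task_step(void *ctx)
{
    cmd_queue_t *q = (cmd_queue_t *)ctx;
    assert(q != NULL);
    for (int i = 0; i < MAX_CMDS_PER_CYCLE && q->pending > 0; i++) {
        q->pending--;
        q->handled++;
    }
}

/* One entry in the task table: a step function and its private state. */
typedef struct {
    void (*step)(void *ctx);
    void *ctx;
} task_t;

/* Run every task once. Each step is bounded, so the whole cycle is bounded. */
void run_one_cycle(const task_t *tasks, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        tasks[i].step(tasks[i].ctx);
    }
}

/* In flight code, the only unbounded loop would then be something like:
 *
 *     while (1) {
 *         run_one_cycle(tasks, n_tasks);
 *         wait_for_next_tick();   // hypothetical timer wait
 *     }
 */
```

Capping the work per step trades latency for predictability: a burst of ten commands takes three cycles to drain, but no single cycle can overrun its time slot because of it.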

stu12:30 PM
What does safe mode look like. Is it the first to have executive control after POR? Is it VxWorks 6.9, or RTEMS? Does it then wait patiently in the background while operating quiescently?

Jacob Killelea12:31 PM
@Thomas Shaddack and @stu, we tested radiation induced faults with a cyclotron at Texas A&M. It fired a proton beam into a bunch of our chips and produced a variety of faults. Some of these were recoverable and some were not. However, this information needs to be extrapolated into the radiation environment that you're operating in. LEO spacecraft are mostly shielded by the Earth's magnetic field, but beyond that, there's a whole spectrum of charged particles eager to destroy your transistors! These range from gamma rays and electrons to heavy iron nuclei created in supernovas a billion years ago

stu12:31 PM
Super neat!

Andrew Elbert Wilson12:32 PM
It's a lot of fun using CREME96 and SPENVIS to calculate the possible radiation TID and SEE effects based on your orbit!

Jacob Killelea12:32 PM
@stu, I'm actually (f)unemployed at the moment. I was working for a Google X project called Everyday Robots, but we got shut down in the layoffs that Google did earlier this year. I should talk to Capella again, I live within walking distance of their office.

Jacob Killelea12:33 PM
Safe mode is hard! It needs to be the most validated thing on your spacecraft. Often times, it's a heritage component, meaning it's something that operated successfully on a previous mission.

Liam Kennedy12:33 PM
This reminds me of that time when LightSail-A got locked up (non-responsive) - and they were hoping for a cosmic-ray hit to force a system reboot. It eventually happened - and the mission continued to give good data/telemetry that helps with the full mission of LightSail-2 in 2019.

Andrew Elbert Wilson12:34 PM
@liam

Andrew Elbert Wilson12:34 PM
@Liam Kennedy That's a cool story!

John Vaccaro12:36 PM
Have you ever encountered flight software written in Forth?

Jacob Killelea12:36 PM
However, that's not always a shortcut around your own engineering. I remember one mission I was taught about in school where a safe mode controller was borrowed from a previous mission and re-flown. This safe mode controller was designed to point the solar panels of the spacecraft at the sun and induce a slight rotation to keep it stable. However, the mission that borrowed this controller put the solar panels around the intermediate moment of inertia axis of the spacecraft. When it spun, this didn't keep it stable, it actually caused it to tumble! The spacecraft lost the ability to communicate since its antennas were no longer pointed in the right direction and lost its power since its solar panels were no longer pointed at the sun. This was a case where simply borrowing proven hardware was a fatal error.

Liam Kennedy12:36 PM
LightSail-A (or 1) launched with a code error (introduced before spacecraft handover for flight integration) and they discovered it had a fault before launch - but could not fix it on the ground. It was the dreaded "oops I missed that ";"

Jacob Killelea12:36 PM
In short, there is no one size fits all method to do fault recovery or safe mode.

Thomas Shaddack12:36 PM
Is it possible (and a good idea) to handle the most basic thermal/attitude control and main system watchdogs with a microcontroller? Something low-complexity, bullet-resistant, radiation-resistant, mistake-resistant, that watches over the high-complexity high-integration "brain"?

Jacob Killelea12:37 PM
@Liam Kennedy Yikes! I had no idea. We are certainly not relying on that for the Nova-C mission. We have seriously dynamic and time critical maneuvers, such as landing on the surface of the moon.

Jacob Killelea12:38 PM
@Liam Kennedy, thankfully cFS actually includes the ability to upload new versions of software from the ground

Thomas Shaddack12:38 PM
How much is time-critical? Microseconds, milliseconds, seconds?

Dan Maloney12:38 PM
Jacob, any insights into the differences between software standards for LEO/MEO/GEO satellites and what's needed for deep-space missions?

Jacob Killelea12:39 PM
Things are generally not microsecond level critical. Physics just doesn't happen that fast. However, being a second late at several thousand m/s can be a big problem...

Paul Hitchcock joined  the room.12:39 PM

Jacob Killelea12:41 PM
@Dan Maloney back on track eh? Sure! A lot of it has to do with the mission budget and lifetime. LEO sats are often scientific, mid to low budget systems with modest lifetimes. The focus is on simple systems that collect data and beam it down. You have regular contact with your ground stations, so autonomy isn't as important.

Jacob Killelea12:42 PM
In MEO/GEO, you actually have more communication, since you're loitering over a certain part of the planet, possibly forever! A GEO comm sat is a huge bird, up to the size of a school bus. It has a big power budget, and therefore lots of processing power! They often will co-host a special ARINC certified industrial OS with an application OS to do their data processing

Jacob Killelea12:43 PM
These systems are built by the big players, such as Lockheed and Boeing, and will have accommodations for customers to write their own software payload that is hosted in a virtualized environment onboard.

Louis12:43 PM
What was the OS on Everyday Robots?

daniel.fryling12:43 PM
Thank you for the response, Jacob, however, just a couple of additional questions... "orbits will decay back into the planet in a matter of weeks to years without station keeping." What is "station keeping"?

Jacob Killelea12:43 PM
Actually, James Webb is like that! It has a core OS, and then a hosted environment for guest science. Guest science applications are actually written in JavaScript over there, IIRC...

Jacob Killelea12:45 PM
When you get to deep space, everything is harder and your spacecraft has to be able to handle itself. You'll be out of comms for a lot of the time, so data needs to be accumulated onboard and then exchanged in short windows. Bandwidth is constrained, and so is power once you start moving away from the sun! Commands are uplinked in tables that are scheduled in advance. The spacecraft might have to maneuver in a way that points its main antenna away from Earth in order to make its observations.

Jacob Killelea12:47 PM
@daniel.fryling station keeping is the process of keeping a satellite in the orbit you want it to be in! This might be a certain altitude, inclination, or phase. For geostationary satellites, it's probably a certain longitude around the equator

Carl12:48 PM
But an orbit that decays significantly within weeks is more on the low end of LEO.

Jacob Killelea12:49 PM
If people haven't seen them before, I totally recommend checking out cFS and running it yourself (https://github.com/nasa/cFS). It's a pretty friendly C programming framework with a core set of services for spacecraft such as process management, communications, logging, and application hosting. On top of that, it comes with a set of applications that talk over a pub/sub bus to support things like data storage, communications, and housekeeping. It also, of course, has provisions for you to write your own applications!
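
To give a feel for what "applications that talk over a pub/sub bus" means, here is a toy in-process message bus in C. It is deliberately not the cFS Software Bus API: `bus_t`, `bus_subscribe`, `bus_publish`, and the message IDs are all invented for this sketch, and a real bus would add queues, per-task pipes, and CCSDS packet headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SUBS 8  /* fixed-size table: no dynamic allocation */

/* Callback invoked for each message a subscriber asked for. */
typedef void (*handler_t)(uint16_t msg_id, const void *data, size_t len, void *ctx);

typedef struct {
    uint16_t msg_id;
    handler_t handler;
    void *ctx;
} sub_t;

typedef struct {
    sub_t subs[MAX_SUBS];
    size_t count;
} bus_t;

/* Register interest in one message ID. Returns 0 on success, -1 if full. */
int bus_subscribe(bus_t *bus, uint16_t msg_id, handler_t handler, void *ctx)
{
    assert(bus != NULL && handler != NULL);
    if (bus->count >= MAX_SUBS) {
        return -1;
    }
    bus->subs[bus->count].msg_id = msg_id;
    bus->subs[bus->count].handler = handler;
    bus->subs[bus->count].ctx = ctx;
    bus->count++;
    return 0;
}

/* Deliver a message to every subscriber of msg_id.
 * Returns the number of deliveries; the publisher never needs to know
 * who, if anyone, is listening. */
size_t bus_publish(const bus_t *bus, uint16_t msg_id, const void *data, size_t len)
{
    size_t delivered = 0;
    assert(bus != NULL);
    for (size_t i = 0; i < bus->count; i++) {
        if (bus->subs[i].msg_id == msg_id) {
            bus->subs[i].handler(msg_id, data, len, bus->subs[i].ctx);
            delivered++;
        }
    }
    return delivered;
}

/* Example subscriber: a stand-in "data storage" app that just tallies
 * how many bytes it was asked to record. */
void storage_handler(uint16_t msg_id, const void *data, size_t len, void *ctx)
{
    (void)msg_id;
    (void)data;
    *(size_t *)ctx += len;
}
```

The appeal of this style is decoupling: a housekeeping app can publish its packet without knowing whether storage, telemetry downlink, or both have subscribed to it.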

Jacob Killelea12:50 PM
@Carl totally true! The lowest of LEO sats will fall out of the sky in a short while due to drag, but the time it takes an orbit to decay is exponentially related to altitude. Once you're at MEO/GEO, those things are staying up there for thousands of years past their useful lifetime.

Jacob Killelea12:52 PM
That's why it's become increasingly important to manage and track not only all the active satellites in space, but also all the debris! If things hit each other (and they have), they produce even more debris. This could, hypothetically, cascade into something called the Kessler (?) Syndrome, where our local space environment becomes inaccessible because of the amount of debris whipping around the planet.

Dan Maloney12:52 PM
Transcript coming up at the e
