What is the Mars Curiosity Rover's software built in?

501

212

The Mars Curiosity rover has landed successfully, and one of the promo videos, "7 Minutes of Terror", brags about there being 500,000 lines of code. It's a complicated problem, no doubt, but that is a lot of code, so surely there was a pretty big programming effort behind it. Does anyone know anything about this project? I can only imagine it's some kind of embedded C.

InfinitiesLoop

Posted 2012-08-06T04:04:48.947

Reputation: 1 743

88Why would one assume there is only one language involved in the project.Rig 2012-08-06T04:11:13.010

5Good point, sure, it's probably got a breadth of technology associated with it. I want to know more about all of that :)InfinitiesLoop 2012-08-06T04:12:30.713

3Which part? The spacecraft? The rover? Instruments? The ground system? As other comments indicate, there are probably several languages used in the different components. It's not out of the question that assembler was used for some of the time critical components.GreenMatt 2012-08-06T04:27:44.410

1Since it's a government project I am guessing Forth, MUMPS 2011, and RPG V, with management interfaces built in Object COBOL, and motor control in Postscript.joshp 2012-08-06T05:44:05.817

I think Roverbasic, a new language purportedly designed by the JPL, but it will turn out to be actually written by Microsoft.Mr Lister 2012-08-06T05:46:57.493

61To be honest, when I saw the 500kloc figure I caught myself thinking "Only?" It could have been realistic had it been Haskell, but having read a bit about previous projects and their low level languages, this seemed way too low. The 2.5mio loc C code cited below are more believable.Philip Kamenarsky 2012-08-06T06:02:38.950

@PhilipK - I think the 500kloc figure might only cover the EDL software.JohannesD 2012-08-06T14:43:25.147

Some of the sub-questions you asked were not answered in the other question before. That has been fixed :)Nate Parsons 2012-08-06T15:01:47.930

1@Philip K It might be the 500kloc is for the descent software only. The keynotes in the answer of drhorrible divides the MSL into 3 different stages, running different software, 1. the flight (earth to mars) 2. The descent and landing 3. The rover itself, roving around.nos 2012-08-06T16:38:47.133

1@PhilipK I'm thinking 500k LOC is with the comments and extra blank lines stripped out - so, 500k functional LOC, but 2.5m lines total in the codebase. ;)Izkata 2012-08-06T16:54:24.523

16A more interesting question than "in what language?" is "with what process?". It's the process that makes the difference, and NASA has been using a rigorous one for decades now.dmckee 2012-08-06T17:07:39.797

It was all written in LISP. Nasa is trusting LISP's backtracking to infer all the correct decisions to make.JustinDanielson 2012-08-06T21:16:38.893

2@dmckee: I agree. Why don't you ask that question?Jim G. 2012-08-06T22:29:17.543

This overview was really interesting, talking about the tech behind its software and instruments: http://www.extremetech.com/extreme/134041-inside-nasas-curiosity-its-an-apple-airport-extreme-with-wheels

Matt 2012-08-08T13:50:02.067

I asked a question on ITSec.se regarding the security on the rover: <http://security.stackexchange.com/questions/18225/mars-curiosity-rover-security>

pasawaya 2012-10-09T07:27:15.993

For an insight into NASA software engineering culture and practices, there's a great article from 1996: http://www.fastcompany.com/28121/they-write-right-stuff

akatkinson 2013-10-05T21:35:30.950

For sure, there is absolutely no .net code. :DSamuel 2015-03-18T10:21:26.207

Answers

459

It's running 2.5 million lines of C on a RAD750 processor manufactured by BAE. The JPL has a bit more information but I do suspect many of the details are not publicized. It does appear that the testing scripts were written in Python.

The underlying operating system is Wind River's VxWorks RTOS. The RTOS in question can be programmed in C, C++, Ada or Java. However, only C and C++ are standard to the OS; Ada and Java are supported by extensions. Wind River supplies a tremendous amount of detail as to the hows and whys of VxWorks.

The underlying chipset is almost absurdly robust. Its specs may not seem like much at first but it is allowed to have one and only one "bluescreen" every 15 years. Bear in mind, this is under bombardment from radiation that would kill a human many times over. In space, robustness wins out over speed. Of course, robustness like that comes at a cost. In this case, it's a cool $200,000 to $500,000.

An Erlang programmer talks about the features of the computers and codebase on Curiosity.

World Engineer

Posted 2012-08-06T04:04:48.947

Reputation: 24 073

Hmmm... it still is a surprise to me that such an important mission isn't running closer to machine code, e.g. assembler...Dynamic 2012-08-06T04:37:13.533

42

JPL C language coding standards, specifically for embedded environments instead of "ground software" as they call it. http://lars-lab.jpl.nasa.gov/JPL_Coding_Standard_C.pdf

Patrick Hughes 2012-08-06T06:21:50.633

71@Dynamic: It's such an important mission that NASA wouldn't risk it. Humans writing assembly make more errors, that's a measured fact.MSalters 2012-08-06T08:16:45.310

6It's really all done in C? I thought NASA tended to avoid C on the grounds that its performance comes at the cost of being far too easy to shoot yourself in the foot with it, and preferred higher level languages with more robust error detection.GordonM 2012-08-06T08:27:21.867

14@GordonM: I guess NASA makes a lot of re-use of existing, mature code, developed in the last decades for previous missions. So it is more amazing the code is not written in Fortran.Doc Brown 2012-08-06T09:37:10.053

21Compiled C code is machine code, assembly language is machine code, I don't see the difference. There isn't a huge performance difference when you get down to it.Ramhound 2012-08-06T11:43:28.133

20NASA are extremely careful with their code. Everything (EVERYTHING) is done in the spec first and is repeatedly reviewed, checked and refined. When it is put into the live code stream it is almost a cut and paste of the spec's reference. The test scripts are given at least as much attention as the code is and no 'flashy' or clever code tricks are allowed unless they are critically needed.Stefan 2012-08-06T13:02:05.327

14Whoa, C. Colour me surprised. I would have assumed some strictly checked language without such things as pointers or undefined behaviour.Konrad Rudolph 2012-08-06T13:50:44.753

11One shouldn't judge a language only by what a minimal compiler allows. Have you seen any of today's "static" code analyzers? (They aren't exactly static anymore.)Jeanne Pindar 2012-08-06T14:06:04.587

3As for why they need to largely stick with C: if you look at the RAD750's datasheet, the fastest version on offer is only a 200 MHz chip. Since it's not mentioned, I assume it's also only single core. That's not much hardware to give room for higher level languages' overhead.Dan Neely 2012-08-06T14:56:22.743

256MB of DRAM.. Seriously? My phone has double that!Amarghosh 2012-08-06T14:57:28.197

93@Amarghosh: yeah, and see how well your cell phone works when it goes through a high-radiation environment such as outer space :)whatsisname 2012-08-06T15:02:29.093

9I'm surprised it's not Ada.Tom O'Connor 2012-08-06T15:08:54.703

18

@TomO'Connor: I'm not.

TMN 2012-08-06T15:16:16.377

I'm surprised it's running on VxWorks TBHtMC 2012-08-06T18:27:06.320

11@KonradRudolph check the JPL Coding standard. No dynamic allocation is one of the rules.Ólafur Waage 2012-08-06T18:49:19.643

6@Dan That’s a fallacy. Higher-level languages don’t necessarily come with an overhead. Consider OCaml. More importantly, the coding standards used here (JPL …) mandate many redundant checks which take a lot out of the speed advantage that C otherwise has. As a result, C’s performance advantage vanishes in a puff of smoke. The real reason for using C is most probably the potential of fine-grained control over allocation but again I’d have expected specialised higher-level languages to offer this.Konrad Rudolph 2012-08-06T19:22:56.803

6

@Stefan No, not everything at NASA has the same set of code standards. But every project involving software engineering has the same set of process standards. See NPR7150. http://nodis3.gsfc.nasa.gov/displayDir.cfm?t=NPR&c=7150&s=2 and even then this depends on the class of the software. Class A software usually involved keeping Humans alive in space. But class H software is general purpose desktop software. Class H software does not require verification and validation, but class A does.

Sean McCauliff 2012-08-06T22:58:02.030

@SeanMcCauliff, thanks. I read a doc about their software standards, I guess it only referred to a certain class but I assumed it was all of their software.Stefan 2012-08-07T08:15:22.740

3Looks like much of that C code (at least all the FSM parts) had been generated from a high level DSL.SK-logic 2012-08-07T08:51:32.763

1@Dynamic especially for an important mission you want all the extra help that higher languages can give you. – None – 2012-08-07T09:37:00.363

6@TMN If you had any experience with Ada you'd find it is a very safe language and leaves very little to chance, encouraging engineers to actually think about the code they write; hence it being used often in safety critical systems along with formal notation (Z is quite popular). To bypass the design intentions of Ada is not easy, yet the developers went out of their way and did just this.R4D4 2012-08-07T10:29:01.227

11Why all the implicit assumption that performance is paramount? Consider the speed at which the rover travels and the speed of its various servos, it doesn't have to be lightning, real-time, nanosecond fast. Nor is concurrency a major issue, it can pretty much do most of its tasks serially (run motor 1 second, turn camera, turn wheel, run motor 0,5 seconds etc). Stability and precision, I would think, are far more important.pap 2012-08-07T14:06:45.233

9@pap:nano-second speed isn't necessarily the issue, but real-time is. Stuff has to happen pretty much exactly when it is supposed to happen. This is why VxWorks is a popular choice for embedded real-time systems. VxWorks has great support for C, and ok support for C++. I never used it with Java, but suspect that to make that real-time they'd have to make it non-standard. Anyways, my point is that VxWorks probably drove the language decision.Dunk 2012-08-07T14:45:15.403

3@KonradRudolph : if they ban dynamic allocation in C (for various reasons) then you'll never get it written in .NET or Java, as those systems use dynamic allocation almost exclusively. Java for example has licence restrictions for using it to write critical system. The point of C is that you can guarantee exactly what is happening at any given point in execution, something expensive to do, but necessary if sending a field engineer to debug is impractical.gbjbaanb 2012-08-08T17:26:50.573

@gbjbaanb And you may notice that I suggested neither of those languages. What makes you think I did?Konrad Rudolph 2012-08-08T18:09:59.573

@ThorbjørnRavnAndersen: Sorry, but high level languages suck for anything performance and safety oriented. They abstract away many things, but all of them have problems, every solution adds new layers of problems, and getting critical system to work is to remove problems, so any high language is an exact opposite of what you want to do. Good read - http://www.joelonsoftware.com/articles/LeakyAbstractions.html

Coder 2012-08-09T02:14:31.393

2@Coder So you essentially say that you need to write in machine code to have anything reliable in a critical system? Or is your conclusion something else? – None – 2012-08-09T07:45:10.780

1@Coder False. You've ignored decades of progress in programming languages. There is actually the opposite argument: that if mission-critical software started getting written in higher level, functional languages, there would be fewer failures.Andres F. 2012-08-10T01:41:26.083

1@Coder: The latest Air Traffic Control systems are written in Java for one example. Don't mix up system reliability with software reliability: the most reliable systems are made up of unreliable parts. Because the most reliable parts (hardware and software) can fail, the system is designed to work when they do. Because it works when they fail, they no longer need to be reliable. Today's highly reliable systems are made out of consumer grade components. This does not apply to spacecraft where you tend to only have 1 of each system, and only duplicate the most critical.mattnz 2012-08-10T05:12:37.317

@ThorbjørnRavnAndersen: Almost, you need nop slides and stuff like that to fail when a RAM error or cosmic ray wreaks havoc in the memory. You really don't want to do any garbage collection in mission critical systems. Everything has to be super simple and easily checkable. Thinking that some high level gives you better reliability is very wrong. There are a few things that might help when used sparingly, say templates, and things that downright suck, like exceptions.Coder 2012-08-12T23:58:22.847

1

@Coder I do not know if you program space crafts for a living. I don't so I found this small historic piece about Lisp on space crafts - Debugging a program running on a $100M piece of hardware that is 100 million miles away is an interesting experience - http://www.flownet.com/gat/jpl-lisp.html

– None – 2012-08-13T00:44:32.557

2@ThorbjørnRavnAndersen: If you need to debug an app 100 million miles away, it's already a problem. And don't tell me you want to debug a Java program 100 million miles away. Once, under certain circumstances, some weird JRE bug causes some weird behavior in the garbage collector, which causes a chain reaction in all dynamic memory accounting, addressing and deletion. You won't even be able to re-flash the thing. And the article had one thing right - "one thing you would do - get rid of lisp". Shame that the author is a fanboy and doesn't get the core of the problem.Coder 2012-08-13T14:27:18.880

@Coder read the article. They found the bug and fixed it - and Lisp is not exactly assembly language. – None – 2012-08-13T14:55:07.583

@Ramhound "There isn't a huge performance difference when you get down to it.": I admit it is a while ago, but the last time I wrote the same program in C and assembly to compare for speed, the assembly program turned out to be twice as fast.Giorgio 2012-09-15T08:57:10.713

Can we add a link to this talk into the answer? It's an hour-long look into the coding process for the Curiosity rover. Really fascinating.

Matt 2015-06-19T21:04:20.590

165

The code is based on that of MER (Spirit and Opportunity), which was in turn based on that of their first lander, MPF (Sojourner). It's 3.5 million lines of C (much of it autogenerated), running on a RAD750 processor manufactured by BAE and the VxWorks operating system. Over a million lines were hand coded.

The code is implemented as 150 separate modules, each performing a different function. Highly coupled modules are organized into Components that abstract the modules they contain, and "specify either a specific function, activity, or behavior." These components are further organized into layers, and there are "no more than 10 top-level components."

Source: Keynote talk by Benjamin Cichy at 2010 Workshop on Spacecraft Flight Software (FSW-10), slides, audio, and video (starts with mission overview, architecture discussion at slide 80).


Someone on Hacker News asked "Not sure what means that most of the C code is auto generated. From what?"

I'm not 100% sure, although there probably is a separate presentation in that year or a different year that describes their auto-generation process. I know that it was a popular topic in general at the FSW-11 conference.

Simulink is a possibility. It's a MATLAB component popular among mechanical engineers, and therefore most navigation & control engineers, and allows them to 'code' and simulate things without thinking they're coding.

Model-based programming is definitely a thing that the industry is slowly becoming aware of, but I don't know how well it's catching on at JPL or if they would have chosen to use it when the project started.

The third and most likely possibility is for the communication code. With all space systems, you need to send commands to the flight software from the ground software, and receive telemetry from the flight software and process it with the ground software. Each command/telemetry packet is a heterogeneous data structure, and it is necessary that both sides work from the exact same packet definition, so that the packet is correctly formatted on one side and correctly parsed on the other. This involves getting a whole lot of things right, including data type, size, and endianness (although the latter is usually a global thing, you could have multiple processors onboard with different endianness).
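As a minimal sketch of the "same packet definition on both sides" idea, here is how data type, size, and endianness can be pinned down in one place with Python's standard `struct` module. The packet layout, the field names, and the choice of big-endian are all made up for illustration; this is not JPL's actual format.

```python
import struct

# Hypothetical packet layout, shared by the "flight" and "ground" code.
# ">" = big-endian with no padding; H = uint16, I = uint32, f = float32.
PACKET_FORMAT = ">HIf"
PACKET_FIELDS = ("packet_id", "timestamp_s", "bus_current_a")


def format_packet(packet_id, timestamp_s, bus_current_a):
    """Flight side: serialize the field values into bytes."""
    return struct.pack(PACKET_FORMAT, packet_id, timestamp_s, bus_current_a)


def parse_packet(raw):
    """Ground side: parse bytes back into a dict using the same definition."""
    if len(raw) != struct.calcsize(PACKET_FORMAT):
        raise ValueError("unexpected packet length")
    return dict(zip(PACKET_FIELDS, struct.unpack(PACKET_FORMAT, raw)))


raw = format_packet(0x0101, 1344225888, 2.5)
decoded = parse_packet(raw)
```

Because both functions read the same `PACKET_FORMAT`, changing a field's type or the byte order changes both sides at once, which is the property the paragraph above is after.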

But that's just the surface. You need lots of repetitive code on both sides to handle things like logging, command/telemetry validation, limit checking, and error handling. And then you can do more sophisticated things. Say you have a command to set a hardware register value, and that value is sent back in telemetry in a particular packet. You could generate ground software that monitors that telemetry point to ensure that when this register value is set, eventually the telemetry changes to reflect the change. And of course, some telemetry points are more important than others (e.g. main bus current), and are designated to come down in multiple packets, which involves extra copying on the flight side and data de-duplication on the ground side.
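To make the limit checking and the "did the register change show up in telemetry?" check concrete, here is a rough sketch of what such ground-side code might look like. The telemetry point names, the limits, and the packet-count timeout are hypothetical; a real system would generate tables like this from the packet definitions.

```python
# Hypothetical limit table: telemetry point -> (low, high) allowed range.
LIMITS = {
    "bus_current_a": (0.0, 10.0),
    "battery_temp_c": (-20.0, 40.0),
}


def check_limits(sample):
    """Return the names of telemetry points that are outside their limits."""
    violations = []
    for name, value in sample.items():
        # Points without an entry in the table are treated as unlimited.
        low, high = LIMITS.get(name, (float("-inf"), float("inf")))
        if not low <= value <= high:
            violations.append(name)
    return violations


def command_confirmed(expected_value, telemetry_values, max_packets):
    """True if the commanded value appears within the first max_packets samples."""
    return expected_value in telemetry_values[:max_packets]


out_of_limits = check_limits({"bus_current_a": 12.0, "battery_temp_c": 20.0})
confirmed = command_confirmed(0x2A, [0x00, 0x00, 0x2A], max_packets=3)
```

None of this is hard to write by hand for one telemetry point; the repetition across hundreds of points is what makes it a natural target for generation.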

With all that, it's much easier (in my opinion) to write one collection of static text files (in XML, csv, or some DSL/what-have-you), run them through a perl/python script, and presto! Code!
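For illustration, a toy version of that pipeline might look like the following. This is only a sketch under my own assumptions: the CSV columns, the field names, and the `telemetry_packet_t` struct name are invented, and the definition is inlined as a string where a real project would read a file.

```python
import csv
import io

# Hypothetical packet definition; in practice this would be a file on disk.
DEFINITION_CSV = """name,c_type
packet_id,uint16_t
timestamp_s,uint32_t
bus_current_a,float
"""


def generate_c_struct(struct_name, definition_text):
    """Turn CSV rows of (name, c_type) into the text of a C struct."""
    rows = csv.DictReader(io.StringIO(definition_text))
    lines = ["typedef struct {"]
    for row in rows:
        lines.append(f"    {row['c_type']} {row['name']};")
    lines.append(f"}} {struct_name};")
    return "\n".join(lines)


c_code = generate_c_struct("telemetry_packet_t", DEFINITION_CSV)
print(c_code)
```

The same rows could just as well drive a second generator for the ground-side parser, so that the flight and ground code never disagree about a field.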

I do not work at JPL, so I cannot provide any detail that is not in the video, with one exception. I've heard that the autogenerated C code is written by Python scripts, and the amount of autocoding in a project varies greatly depending on who the FSW lead is.

Nate Parsons

Posted 2012-08-06T04:04:48.947

Reputation: 1 231

8

This might shed some light on Wind River, the contractor who makes VxWorks: http://www.windriver.com/news/press/pr.html?ID=10901

I've read that NASA has a team of people whose job is to find as many bugs as they can in the control system code written by another team. The bug-finding team is rewarded for bugs they find and they are really quite good in finding arcane bugs. When a bug is found, a 5Y-type analysis is done to find out how the software dev process could be improved to eliminate the possibility of similar bugs in the future. A very painstaking and expensive process.

Jim Raden 2012-08-06T17:47:58.237

15@JimRaden When the direct cost of failure for a probe runs from several hundred million to several billion dollars and several years (if at all) for a redo attempt extreme paranoia in QA is justified. The indirect costs in the form of dozens/hundreds of grad students losing years of work and having to restart on their phd work and various new professors who were counting on data from it to supply their tenure track research is another major hit but much harder to quantify than the line items in the NASA budget.Dan Neely 2012-08-06T18:41:32.493

1What was the C auto-generated from? Please tell me that it was not Simulink. :-)William Payne 2012-08-06T20:54:20.350

William: It's possible some navigation/control code was generated from Simulink, because it's fairly common in this industry, but I can't say for sure on this project. I believe most of it is generated from some sort of Interface Control Document spec and has to do with parsing + formatting of commands and telemetry. DRY, and all that.Nate Parsons 2012-08-06T21:30:13.393

2@William Payne The keynote states that some of it are autogenerated protocol encoding/decoding routines (for communication with earth), generated by python programs from XML descriptions.nos 2012-08-07T09:36:35.450

1Automagically generating code from ICDs is kinda cool. I like the idea! I would have used YAML rather than XML, though. :-)William Payne 2012-08-07T13:41:00.587

William: I also prefer YAML (or even JSON) over XML, and if I ever lead a project, it will be my choice. The project I'm on now uses .xls "because non-programmers have to edit the files" :(Nate Parsons 2012-08-07T15:04:14.917

Code generation is nothing new. http://en.wikipedia.org/wiki/Model-driven_software_development http://en.wikipedia.org/wiki/Model-driven_architecture http://en.wikipedia.org/wiki/Model-driven_engineering -- the goal is to express big parts of the system in a formal modelling language that lets you mathematically prove certain properties of your models. This should remind you of state machines and petri nets. These models are then transformed into code.

sleeplessnerd 2013-01-26T18:01:49.217

@NateParsons: Ever heard of proper schema validation and xml binding? - And the .xls argument is valid to some degree considering the existing very good APIs (http://poi.apache.org/spreadsheet/index.html)

sleeplessnerd 2013-01-29T16:03:55.463

@sleeplessnerd My apologies if I ever gave the impression that I thought code generation was anything new. Re: .xls, my beef is not with accessing them programmatically, but their QMS and change-tracking. The simple act of opening a spreadsheet in excel changes the file, and even 'real' changes have essentially unpredictable effects. On multi-engineer teams, conflicts are inevitable, and sorting them out in xls files is a pain. I think an ideal solution would basically allow well-formed YAML files to be edited via excelNate Parsons 2013-02-05T03:35:05.567

@NateParsons: No apologies needed, I was the one with the rough tone :) -- You are completely right though with the excel thing, I can feel your pain, I was assuming it was for something more non critical. -- The discussion of different markup dialects threw me off though, kind of the least important detail in relation to the post - Like choosing the type of wallpaper to start your house construction planning. (sounding condescending again) Seriously though, Schemas are the best.sleeplessnerd 2013-02-06T04:39:36.310

I would argue that model-based programming isn't always the best choice especially when developing a system.Andrew Larsson 2014-05-25T15:02:37.480

Unfortunately, Caltech is (apparently) no longer hosting/archiving any of the material from that conference, so your links are all broken. That's a real shame. If anyone knows of an alternate source for this material, it would be very much appreciated!kmote 2015-10-22T20:57:23.640