Adventures in Science: Arduino Literals, Variables and Data Types

We present another set of concepts in the computer science series as they relate to Arduino.


Previously on “Adventures in Science,” I covered Arduino programming syntax. Now, we take a look at how to store numbers and characters in “containers” known as variables and how to access that data.

Before digging into literals and variables, we have to understand data types. A data type is a classification of information that tells the compiler how the programmer intends to use the information. In C, the fundamental data types fall into three families: integer (int), floating point (float and double) and character (char). You will also sometimes see void to indicate “nothing” or “no type.”

Arduino supports more data types built from these, such as long int, which is an integer stored in 4 bytes, and unsigned int, which is an integer that can only be 0 or positive.
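To make the size and sign distinctions concrete, here is a minimal sketch. It assumes a C++11 compiler (which current Arduino toolchains provide) and uses the fixed-width names from stdint.h; the names kMaxU16, kPlainCharIsSigned and wrapBelowZero are made up for this illustration.

```cpp
#include <limits.h>   // CHAR_MIN
#include <stdint.h>   // int8_t, uint16_t, int32_t, UINT16_MAX

// Fixed-width types have the same size on every board.
static_assert(sizeof(int8_t) == 1, "int8_t is 1 byte");
static_assert(sizeof(uint16_t) == 2, "uint16_t is 2 bytes");
static_assert(sizeof(int32_t) == 4, "int32_t is 4 bytes");

// The classic names only guarantee a minimum size:
// int is at least 2 bytes and long is at least 4 bytes.
static_assert(sizeof(int) >= 2, "int is at least 2 bytes");
static_assert(sizeof(long) >= 4, "long is at least 4 bytes");

// Largest value an unsigned 16-bit integer can hold.
constexpr uint16_t kMaxU16 = UINT16_MAX;

// Whether plain char is signed depends on the compiler and target,
// so ask the compiler instead of assuming.
constexpr bool kPlainCharIsSigned = (CHAR_MIN < 0);

// Unsigned values wrap around instead of going negative.
uint8_t wrapBelowZero() {
    uint8_t count = 0;
    count = static_cast<uint8_t>(count - 1);  // 0 - 1 wraps to 255
    return count;
}
```

If one of the static_assert lines fails on a given board, the compiler reports it at build time, which is a cheap way to check an assumption about type sizes before uploading.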

Literals are fixed values that do not change throughout the program. For example, if you write the number 13 or 500 in your program, that’s a literal. Literals for characters can be expressed between single quotes as any ASCII-encoded character, such as 'g'.
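As a rough illustration of the literal forms described above, the following sketch writes the same kinds of values a few different ways. The constant and function names are hypothetical, chosen only for this example.

```cpp
// Integer literals: the value 13 written in decimal and in hexadecimal.
const int kDecimal = 13;
const int kHex = 0x0D;

// Suffixes select the literal's type: L for long, F for float.
const long kBig = 500L;
const float kRatio = 0.5F;

// A character literal is a small integer holding the character's code.
const char kLetter = 'g';

// Returns the numeric code stored for 'g' (103 in ASCII).
int letterCode() {
    return kLetter;
}

// The character '5' is not the number 5: it holds the code for the digit.
// Subtracting '0' converts a digit character to its numeric value.
int digitValue(char digit) {
    return digit - '0';
}
```

Calling digitValue('5') gives the number 5, while the character '5' by itself holds the ASCII code 53.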

Variables work like containers with labels. You can store information with the specified data type in a variable and then refer to the label later in the code when you need to retrieve or change the data. This can be extremely handy for manipulating data later in the code (with, for example, arithmetic operators) or setting a constant value once in the code (e.g., setting led = 13; and then using led instead of writing 13 several times).
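The container idea might look like this in code. This is only a sketch: led comes from the example above, while blinkCount and recordBlink are hypothetical names. On a real board you would pass led to Arduino functions such as pinMode and digitalWrite instead of just returning it.

```cpp
// Name the pin once. Every later use refers to the name, so changing
// the pin means editing this single line.
const int led = 13;

// A variable whose stored value changes as the program runs.
int blinkCount = 0;

// Updates the stored count with an arithmetic operator and returns
// the pin number that a real sketch would toggle.
int recordBlink() {
    blinkCount = blinkCount + 1;
    return led;
}
```

Because led is declared const, the compiler rejects any later attempt to assign to it, which suits a value that is set once and only read afterwards.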

Like your tutorials in written form? Here are a couple of guides that go over the basics of data types in Arduino and the American Standard Code for Information Interchange (ASCII):

Data Types in Arduino

Learn about the common data types and what they signify in the Arduino programming environment.


A brief history of how ASCII came to be, how it's useful to computers, and some helpful tables to convert numbers to characters.

Question for all you programmer types out there: Which do you prefer — strongly typed languages or weakly typed languages? Why? Please respond in the comments below.

Comments

  • Programming for 8, 32, and 64 bit systems, I personally prefer to use names such as “uint8_t”, “int64_t”, etc. so that it’s clear if it’s signed or unsigned, and the number of bits is consistent regardless of the CPU/compiler. Not all compilers support this naming scheme, but most C/C++ compilers seem to support it.

    • Good call. I usually forget to do this (unless I’m working with uint8_t), since I’m so used to “int,” but specifically calling out the byte size is super helpful.

  • I’m not sure if I spotted an error, or if I learned something wrong. I could have sworn that the ‘char’ datatype was signed. The table in the video shows it as unsigned. (Otherwise, what is the point of ‘unsigned char’ or ‘byte’, and what would be a signed 8-bit variable?)

    • According to Arduino’s reference page, char is signed (-128 to 127), so you are correct in that it is in error on the table in the video. Thanks!

    • char is implementation defined. It can be signed or unsigned by default based on the compiler writers' whims. That’s why it’s always safest to specify the option you want.

      • Right, but in the context of that table it is the Arduino platform. I’m not sure if the Arduino pre-processing stuff defines it as int8_t or if it is the gpp compiler (the actual compiler used in the Arduino environment). Shawn has already weighed in though.

        I’ll ferret away into my memory that char is truly implementation specific, especially when I eventually move to a different implementation. I keep going back and forth in my mind whether it is best to use the low level int##_t and uint##_t data types or the higher level byte, int, long. Especially since for me I want to write code that works on both an 8-bit AVR Arduino through to a 32-bit ARM M0 Arduino, and all the points in between, with very few changes needed. I feel I should use the higher level datatypes for portability, but I really want to conserve memory on my little UNO… Having lots of #ifdef clauses for different platforms starts making code look really ugly and hard to actually read and figure out what the intent is.

        Or do I have it completely backwards and int##_t and uint##_t datatypes are actually higher level?

  • A couple of “nits” this week: First (since I’m a compiler designer) “label” has a very specific meaning (i.e., the “target” of a “goto”), so it’s much better to use the term “name”.

    Second nit: When talking about floats, it’s also important to note that there is a “smallest non-zero” number that can be stored. Off on a tangent, when a math operation on a float leads to a number that’s too small to represent in a float but is clearly non-zero, it causes an “underflow” error. This leads to something of a joke: I often say “my cup runneth under” (a play on Psalm 23:5, and meant to irritate the Bible-thumpers).

    On the “strongly-typed” versus “loosely-typed”, IM[NS]HO, it is FAR better to catch potential errors at compile time than having to waste hours/days/weeks (and, if you’re trying to control something physical, destroyed hardware) trying to track down that you passed a char ‘5’ when you should have passed a byte 5.

    (One of my pet peeves is that K&R [and many other language manual authors] didn’t include a table of ASCII as an “appendix”. With Google and Wikipedia, it’s not as critical today, but 20 years ago it was VERY frustrating.) BTW, I remember the days of complications with EBCDIC, Baudot, and some other more obscure codings for characters…

    • (One of my pet peeves is that K&R [and many other language manual authors] didn’t include a table of ASCII as an “appendix”.)

      Don’t know if you’re on Windows, but “man ascii” works on any Unix box.

      • Goes to show that even an “old dog can learn new tricks”. Thanks! (I’ve been using *nix since about 1977, and had never run across this.) BTW, it also works in a “Terminal” window on OS-X (since OS-X is based on BSD Unix).

    • One other minor thing: You can guarantee that the C compiler will not use up data space for a named value if you use the #define capability, e.g.,

      #define led 13
      pinMode(led, OUTPUT);
      digitalWrite(led, HIGH);

      For those interested, what happens is that the C compiler has a “pre-processor” phase which deals with all the lines beginning with an octothorpe (the correct name for ‘#’). A #define tells the preprocessor to simply replace any (and every) occurrence of the name with whatever is on the rest of the line before the (actual) compiler sees it. (Note that the preprocessor is “linear” in that a #define only affects code after the #define.)

      • The pre-processor in general, and the #define of a constant value, is evil!

        It pollutes the namespace. Once you #define a value, you can never use a different definition without #undefining it first. Keeping track of that kind of monkeyshines is an invitation to catastrophe.

        If this is not clear, consider the actual problem I ran into 3 hours ago. I had a class which contained a constant like this:

        class foo {
        public:
            static const int FLOAT = 1;
        };

        Then when I tried to use it:

        int dataType = foo::FLOAT;

        It gave me an error saying that “float” was an illegal token, even though I had named my constant “FLOAT”, not “float”.

        After some searching, I found that a vendor-provided header file contained:

        #define FLOAT float

        Since they used a pre-processor definition instead of a namespaced declaration, that meant that I could never use the constant “FLOAT” anywhere in my program except as an alias for “float”, even if I fully-qualified the name with a namespace. My namespace is polluted with an alias I don’t need or want. I had to change my constant to “FLOATING”.

        Not only does this kind of thing happen all the time, but they tend to grab all the good constant names. WindRiver stole “DEBUG” in a VxWorks header file this way. There are too many examples to list, just keep in mind that the preprocessor is evil and use namespaced constants instead.

        And to answer Shawn’s question, strongly-typed languages rule. Bugs generally get 10x more expensive to find and fix every stage they pass through. The earlier you can find one, the cheaper the process (or faster if you’re a hobbyist who does this for free).

        • Actually, you should be snarling at the vendor for not having used the name “_FLOAT” (or even “__FLOAT”), or, better yet, prefix (or postfix) the name with the name of the package (or the vendor), e.g. “GOOD_FLOAT”.

          Having written literally hundreds of thousands of lines of C and C++ code (as well as some other languages), the use of #defines is excellent practice. You just have to be wary of the actual naming. (IMHO, “FLOATING” is a slightly better name, though “Floating” is probably even better. Remember, the C name space is case sensitive, so “PHRED”, “Phred”, “phred”, and “pHRED” are all different names as far as the compiler is concerned.)

          • We’ll have to agree to disagree on this.

            The examples you’re citing are fine for C. In fact, they’re absolutely necessary for C since C has one global namespace. For C++ (as on the Arduino) they are crutches used by old C programmers who never made the conversion to thinking in the new language. It’s the same as all those folks who persist in using char arrays instead of templated container classes, void pointers instead of templates, printf instead of stream I/O, C-style casts instead of C++ casts, etc. It’s like a person wandering through 21st century Rome trying to get by speaking Latin instead of Italian. They probably won’t starve and they may even have a good time (the program works) but it’ll be inefficient and annoying.

            The pre-processor is the ultimate untyped language - you can define anything to be anything else. It’s a true programming language with a lot of power. I’ve written macros that solve the Towers of Hanoi problem when evaluated in the pre-processor, without ever invoking the compiler. And that’s the key - without ever invoking the compiler. It works completely outside the type-checking and error detection protections offered by the C++ compiler. It follows none of the language rules.

            If you have a strongly-typed, object-oriented language why would you ever want to overlay that with a disorganized rogue layer that can hide errors under many layers of definitions and includes? That’s why the pre-processor is evil.

            To be fair, there is one legitimate use for the pre-processor - including files. But #include doesn’t override any of the compiler and language features. It’s just a patch on the build system (which actually is no longer necessary with modern IDEs).

            Finally, I too have written hundreds of thousands of lines of code in C/C++ and other languages. That’s not evidence one way or the other about the proper use of the pre-processor, but you seem to care so there it is :-)

  • Some more FYI, the processor instruction set size (8-bit, 16-bit, 32-bit, 64-bit) determines the initial datatype size.

    This means that some larger types (float, double) may be smaller than they typically are (float = 32-bit, double = 64-bit), so keep that in mind if floating point resolution matters to you.
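The #define discussion in the comments above can be reproduced in a few lines. This is a sketch under assumptions: the #define FLOAT line stands in for the vendor header one commenter described, and bar, vendorValue and kLedPin are hypothetical names added for illustration.

```cpp
#include <stdint.h>

// Stand-in for the vendor header: from here on, the preprocessor
// replaces every FLOAT token with float before the compiler runs.
#define FLOAT float

// While the macro is active, FLOAT is just another spelling of float.
FLOAT vendorValue = 1.5f;

namespace foo {
    // static const int FLOAT = 1;  // would not compile here: it becomes
    //                              // "static const int float = 1;"
    constexpr int FLOATING = 1;     // the renamed constant works
}

// Removing the macro frees the name for code that follows.
#undef FLOAT

namespace bar {
    constexpr int FLOAT = 2;        // compiles now that the macro is gone
}

// A typed, scoped alternative to "#define led 13". The compiler checks
// its type, and a compile-time constant like this normally takes no
// data space either.
constexpr uint8_t kLedPin = 13;
```

The namespace does not protect foo from the macro because text replacement happens before the compiler ever sees namespaces, which is the core of the objection raised in the thread.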
