Data Types
Author: Darren Yao
The data types needed for competitive programming.
Prerequisites
C++
Resources | ||||
---|---|---|---|---|
CPPR | sizes + ranges | |||
IUSACO | module is based off this | |||
CPH | Integers, Modular arithmetic, Floating point numbers | |||
PAPS | plenty of exercises |
C++: Common Fundamental Data Types
Note: These numbers may vary depending on your machine and/or compiler. For more fundamental data types, check the first resource in the table above.
Types | int | long long | double | bool | char |
---|---|---|---|---|---|
Description | 32-bit integer | 64-bit integer | Double-precision float | True/False value | 8-bit character |
Size (bytes) | 4 | 8 | 8 | 1 | 1 |
Range | to | to | -1.7E+308 to +1.7E+308 | or (true or false ) | to |
Java
Resources | ||||
---|---|---|---|---|
JavaDocs | ||||
IUSACO | module is based off this |
Java: Common Primitive Data Types
For more primitive data types, check the the first resource in the table above.
Types | int | long | double | boolean | char |
---|---|---|---|---|---|
Description | 32-bit integer | 64-bit integer | Double-precision float | True/False value | 16-bit Unicode character |
Size (bytes) | 4 | 8 | 8 | 1 bit(*) | 2 |
Range | to | to | -1.7E+308 to +1.7E+308 | true/false | \u0000 to \uffff () |
*Note: It's unlikely that booleans will actually use only 1 bit of memory, as in most cases data types must be aligned to bytes. However, only one bit of information can be stored in them.
Python
Resources | ||||
---|---|---|---|---|
IUSACO | module is based off this | |||
Python |
Types | int | float | bool | str |
---|---|---|---|---|
Description | Arbitrary-size integer | Double-precision (64 bit) IEEE 754 float | True/False value | String |
Values | Any integer | -1.7E+308 to +1.7E+308 | true/false | Any length of text |
There are several main data types that are used in contests: integers, floating point numbers, booleans, characters, and strings. Assuming that you are familiar with the language you are using, this should be mostly review.
The normal 32-bit integer data type (int
in C++ and Java) supports values
between and , which is roughly equal to
.
Some problems require you to use 64-bit integers (long long
in C++ and
long
in Java) instead of 32-bit integers (int
). 64-bit integers are less
likely to have overflow issues, since they can store any number between
and which
is roughly equal to . In Python,
int
s
have unlimited size.
Sometimes (but not always) a USACO problem statement (ex. Haircut) will contain a warning such as the following:
Note that the large size of integers involved in this problem may require the use of 64-bit integer data types (e.g., a "long long" in C/C++).
Contest problems are usually set such that the 64-bit integer is sufficient, so for lower divisions
it might be a good idea to use 64-bit integers in place of 32-bit integers
everywhere. Of course, you shouldn't do this when time and/or memory limits are
tight, which may be the case in higher divisions of USACO. Also note that in
Java, you will need to cast long
back to int
when accessing array indices.
Additionally, there exist 16-bit integers (short
in C++ and Java). However,
these are generally not useful as the extra memory saved by using them is
usually negligible. Unsigned integers (unsigned int
, unsigned long long
, etc.)
also exist. They aren't used as frequently, though the 2-fold increase in size is sometimes the difference between overflowing and not overflowing.
Floating point numbers are used to store decimal values. It is important to
know that floating point numbers are not exact, because the binary architecture
of computers can only store decimals to a certain precision. Hence, we should
always expect that floating point numbers are slightly off, so it's generally a
bad idea to compare two floating-point numbers for exact equality (==
).
Contest problems will usually accommodate the inaccuracy of floating point numbers by checking if the absolute or relative difference between your output and the answer is less than some small constant like .
- If your output is and the answer is , the absolute difference is .
- If your output is and the answer is , the relative difference is .
This is not the case for USACO, where problems generally have a unique correct output. So when floating point is necessary, the output format will be something along the lines of "Print times the maximum probability of receiving exactly one accepted invitation, rounded down to the nearest integer." (ex. Cow Dating).
Boolean variables have two possible states: true
and false
. We'll
usually use booleans to mark whether a certain process is done, and arrays of
booleans to mark which components of an algorithm have finished. Booleans
require 1 byte (8 bits) of storage, not 1 bit, wasting the other 7 bits of
storage. To use less memory, one can use bitsets (std::bitset
in C++ /
BitSet
in Java). Unfortunately, bitsets are not available in Python.
Character variables represent a single character. They are returned when you
access the character at a certain index within a string. Characters are
represented using the ASCII standard, which assigns each character to a
corresponding integer. This allows us to do arithmetic with them; for example,
both cout << ('f' - 'a');
in C++ and System.out.print('f' - 'a');
in Java
will print 5
. In Java, characters are 16 bits, while in C/C++, characters are
8 bits.
Strings are effectively arrays of characters. You can easily access the character at a certain
index and take substrings of the string (charAt()
and substring()
in Java).
Module Progress:
Join the USACO Forum!
Stuck on a problem, or don't understand a module? Join the USACO Forum and get help from other competitive programmers!