Python Variables

Overview

Python variables work much like variables in R. The primary difference is that instead of R's arrow operators (such as <-), Python uses a single = for assignment.

Python also differs from R in a few key ways with regard to variable behavior. Information on variable assignment in Python can be found in the "Variable Assignment" section below.

Similar to other programming languages, Python has several core variable types. An overview of each type is included in the sections below.

A variable lets you assign a helpful name (like "predictions" or "amount_of_icecream") to data that you read or create, so that you can reference it later.


Variable Assignment

my_var = 4

This declares a variable with a value of 4.

Actually, this is technically not true. In CPython (the standard Python interpreter), the integers from -5 to 256 (inclusive) are created ahead of time and already exist in memory before you assign the value to my_var. The = operator simply makes the name my_var point to the value that already exists. In other words, my_var is a reference to an object rather than a container that holds the value.
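The idea that a name points to an object can be checked with the is operator and the built-in id() function. The sketch below is illustrative only: the variable names are hypothetical, and the reuse of small integers is a CPython implementation detail rather than a language guarantee, so the two comparisons marked "typically" may differ on other interpreters.

```python
my_var = 4
alias = my_var  # binds a second name to the same object; nothing is copied

print(alias is my_var)          # True: both names reference one object
print(id(alias) == id(my_var))  # True: same identity

# Build the integers at runtime so the comparison is not affected by
# constants being shared at compile time.
small_a, small_b = int("4"), int("4")
large_a, large_b = int("1000"), int("1000")

print(small_a is small_b)  # typically True on CPython (cached small integer)
print(large_a is large_b)  # typically False on CPython (two separate objects)
print(large_a == large_b)  # True: equal values regardless of identity
```

In everyday code, compare numbers with == and reserve is for identity checks.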

One of the most important differences between variables in R and Python is what is happening in the background. Take the code example below:

my_var = 4
new_variable = my_var
my_var = my_var + 1
print(f"my_var: {my_var}\nnew_variable: {new_variable}")
my_var: 5
new_variable: 4

my_var = [4]
new_variable = my_var
my_var[0] = my_var[0] + 1
print(f"my_var: {my_var}\nnew_variable: {new_variable}")
my_var: [5]
new_variable: [5]

The first chunk of code behaves as you’d expect because int values are immutable, meaning an int object cannot be changed once it exists. As a result, when we assign my_var = my_var + 1, the value 4 itself isn’t changing. Instead, my_var is just being pointed to a different value (in this case 5), while new_variable still points to the value of 4.

The second chunk of code is different because it is dealing with a mutable list. We first create a list whose first element (index 0) is 4 and assign it to my_var. We then assign my_var to new_variable. Unlike the first example, no new value is created afterwards: both my_var and new_variable point to the same mutable list object. When we then increase the first element of the list by 1, the change is visible through each variable since they are pointing to the same object.
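If the goal is an independent list rather than a second name for the same one, the list can be copied explicitly. This is a minimal sketch using the built-in list.copy() method; the names independent_copy and alias are hypothetical and chosen only for illustration.

```python
my_var = [4]
independent_copy = my_var.copy()  # a new list object with the same elements
alias = my_var                    # just another name for the same list

my_var[0] = my_var[0] + 1

print(my_var)            # [5]
print(alias)             # [5]
print(independent_copy)  # [4]
print(alias is my_var, independent_copy is my_var)  # True False
```

Note that copy() is shallow: a list that contains other lists would still share those inner lists, in which case copy.deepcopy() from the standard library is the usual alternative.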

An excellent article goes into more detail and can be found here.


Null Values

None

None is a keyword used to define a null value. This would be the Python equivalent to R’s NULL. If used in an if-statement, None is treated as false. This does not mean None == False; in fact:

print(None == False)
False

Even though None is treated as false in an if-statement, Python does not evaluate the two as equal.
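Because several other values are also treated as false, the usual way to test for None specifically is the is operator. The sketch below uses hypothetical names (result, status) to show the difference between an identity check and a plain truthiness check.

```python
result = None

if result is None:
    status = "no value"
else:
    status = "has value"
print(status)  # no value

# Truthiness alone cannot tell None apart from other "empty" values:
# every value below is falsy, but only the first one is None.
for value in (None, 0, "", []):
    print(repr(value), bool(value), value is None)
```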

NaN

The difference between NaN and None in Python can be somewhat confusing. NaN stands for "not a number" and is commonly used to represent missing data. Libraries such as pandas will often convert None values to NaN automatically, especially when working with numbers. There are lots of methods to identify and remove or fill NaN values, but it is worth noting that many operations will process NaN values without raising an error.
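A few properties of NaN can be seen with only the standard library. This is a small sketch; the name missing is hypothetical.

```python
import math

missing = float("nan")

print(missing == missing)       # False: NaN is not equal to anything, itself included
print(math.isnan(missing))      # True: the reliable way to test for NaN
print(math.isnan(missing + 1))  # True: arithmetic with NaN yields NaN, not an error
print(missing is None)          # False: NaN is a float value, not None
print(type(missing))            # <class 'float'>
```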

For example, if we wanted to sum each row below and then remove any rows with NaN values, we could try the following code snippet.

import numpy as np
import pandas as pd

col_1 = [np.nan, 50, 100]
col_2 = [np.nan, 100, 50]

example_dataframe = pd.DataFrame(list(zip(col_1, col_2)), columns=['Value 1', 'Value 2'])
# Row-wise sum; NaN values are skipped by default
example_dataframe['example_sum'] = np.sum(example_dataframe, axis=1)

However, pandas would evaluate this as the table below:

Value 1    Value 2    example_sum
NaN        NaN        0
50         100        150
100        50         150

If we were to try to remove NaN values based on the example_sum column, no rows would be removed, because the first row sums to 0 rather than NaN. In this case we’d want to remove or fill the NaN values prior to the aggregation (sum).
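One way to handle the missing values before aggregating might look like the sketch below. It assumes pandas and NumPy are installed; the names cleaned and summed are hypothetical. DataFrame.dropna() and the min_count argument of DataFrame.sum() are standard pandas features, but which option is appropriate depends on the data.

```python
import numpy as np
import pandas as pd

example_dataframe = pd.DataFrame(
    {"Value 1": [np.nan, 50, 100], "Value 2": [np.nan, 100, 50]}
)

# Option 1: drop rows that are entirely NaN before aggregating.
cleaned = example_dataframe.dropna(how="all").copy()
cleaned["example_sum"] = cleaned.sum(axis=1)
print(cleaned)

# Option 2: keep every row, but require at least one real value per row,
# so that an all-NaN row sums to NaN instead of 0.
summed = example_dataframe.sum(axis=1, min_count=1)
print(summed)
```

With the second option, the NaN survives the aggregation, so a later filter on the summed values would remove the first row as intended.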


bool

A bool has two possible values: True and False. It is important to understand that bool is a subclass of int, so these values also correspond to integers:

print(True == 1)
True
print(False == 0)
True

True and False are equal only to 1 and 0 respectively. They do not compare equal to any other number:

print(True == 2)
False
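Because True and False behave as the integers 1 and 0, they can take part in arithmetic. The sketch below (with a hypothetical list named flags) shows a common use: counting how many values in a collection are True.

```python
print(isinstance(True, int))  # True: bool is a subclass of int
print(True + True)            # 2

flags = [True, False, True, True]
print(sum(flags))             # 3: counts the True values
```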

However, if used in an if-statement, numbers other than 1 and 0 are still evaluated for truthiness: any nonzero number is treated as true, and only zero is treated as false. Think of the if-statement below as asking the question "Is this value nonzero?" rather than "Does this value equal True?".

if 3:
    print("3 evaluates to True")
3 evaluates to True
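The same truthiness rules apply to other built-in types: zero and empty containers are treated as false, and most other values are treated as true. The following sketch filters a hypothetical tuple of sample values down to the ones an if-statement would accept.

```python
# Keep only the values that an if-statement would treat as true.
truthy = [value for value in (3, -1, 0, 0.0, "", "text", [], [0]) if value]
print(truthy)  # [3, -1, 'text', [0]]

print(bool(3) == True)  # True: converting 3 to a bool gives True
print(3 == True)        # False: 3 is truthy but is not equal to True
```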


str

str indicates a string in Python, an immutable sequence of Unicode characters. Strings can be surrounded by single quotes, double quotes, or triple quotes (using either single or double quote characters):

print(f"Single quoted text is type: {type('test')}")
Single quoted text is type: <class 'str'>
print(f'Double quoted text is type: {type("test")}')
Double quoted text is type: <class 'str'>
print(f"Triple quoted with single quotes is type: {type('''test''')}")
Triple quoted with single quotes is type: <class 'str'>
print(f'Triple quoted with double quotes is type: {type("""test""")}')
Triple quoted with double quotes is type: <class 'str'>
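Immutability means a string cannot be modified in place; string methods return a new string instead. A short sketch with hypothetical names (greeting, shouted):

```python
greeting = "hello"
shouted = greeting.upper()  # returns a new string; greeting is unchanged

print(greeting)  # hello
print(shouted)   # HELLO

try:
    greeting[0] = "J"  # item assignment is not supported on str
except TypeError as error:
    print(f"TypeError: {error}")
```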

The benefit of triple quoting a string is that it can span multiple lines in the code, whereas a single- or double-quoted string that spans lines will raise a SyntaxError. The resulting string includes the line breaks and any other whitespace between the text:

my_string = """This text
spans multiple
lines."""
print(my_string)
This text
spans multiple
lines.

We can use \ to indicate that a single or double-quoted string carries on, but this is only useful for keeping a line of code under a certain length, as it is not the same as a newline:

my_string = "This text won\
't throw an error"
print(my_string)
This text won't throw an error
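An alternative to the backslash is to place adjacent string literals inside parentheses, which Python joins into a single string. The sketch below contrasts that with a triple-quoted string; the names joined and multi_line are hypothetical.

```python
joined = (
    "This text is split across "
    "two lines of code."
)
multi_line = """This text
keeps its line break."""

print(joined)            # This text is split across two lines of code.
print(repr(multi_line))  # 'This text\nkeeps its line break.'
print("\n" in joined, "\n" in multi_line)  # False True
```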


int

int values are whole numbers. For instance:

my_var = 5
print(type(my_var))
<class 'int'>

int values can be added, subtracted, or multiplied without changing the variable type. Unlike some other languages, however, division of int values with / always produces a float, meaning truncation does not happen:

print(type(6+2-2*2))
<class 'int'>
print(type(6/2))
<class 'float'>

Similarly, any calculation between an int and a float results in a float:

print(type(6+2.0)) ## 2.0 is a float
<class 'float'>
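When an integer result is wanted from division, Python provides the floor division operator // and the remainder operator %. The lines below are a brief sketch of how they behave.

```python
print(7 / 2)         # 3.5
print(7 // 2)        # 3
print(type(7 // 2))  # <class 'int'>
print(-7 // 2)       # -4: floors toward negative infinity, not toward zero
print(7 % 2)         # 1
print(7.0 // 2)      # 3.0: a float operand still produces a float
```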


float

float values represent numbers with a decimal point.

my_var = 5.0
print(type(my_var))
<class 'float'>

float values can be converted back to int using the int() function. This coercion causes the float value to be truncated, regardless of how close to the "next" number the float is.

Keep in mind that truncating and rounding are different things: int() cuts off anything past the decimal point, while round() returns the nearest integer.

print(int(5.5))
5
print(int(5.9999))
5
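The difference between truncating, flooring, and rounding is easiest to see side by side. This sketch uses the built-in round() together with math.floor() and math.ceil() from the standard library.

```python
import math

print(int(5.9999))          # 5
print(int(-5.9999))         # -5: truncation moves toward zero
print(math.floor(-5.9999))  # -6: floor moves toward negative infinity
print(math.ceil(5.0001))    # 6
print(round(5.9999))        # 6
print(round(2.5), round(3.5))  # 2 4: exact ties go to the even neighbor
```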


complex

complex values represent complex numbers. The suffix j marks the imaginary part, but it must be preceded by a number for Python to understand it (say, 1j); a bare j would be read as a variable name.

my_var = 1j
print(my_var)
1j
print(type(my_var))
<class 'complex'>

Arithmetic with a complex value always results in a complex:

print(type(1j * 2))
<class 'complex'>

Unlike the other types mentioned above, you cannot convert a complex value to an int or float, even when its imaginary part is zero. Each call below raises a TypeError (the exact wording of the message varies between Python versions).

print(int(1j*1j))
TypeError: can't convert complex to int
print(float(1j*1j))
TypeError: can't convert complex to float
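If only the real component is needed, it can be read from the real attribute, which is a float and converts normally. A minimal sketch, with product as a hypothetical name:

```python
product = 1j * 1j

print(product)            # (-1+0j)
print(product.real)       # -1.0
print(product.imag)       # 0.0
print(int(product.real))  # -1
print(abs(3 + 4j))        # 5.0: the magnitude of a complex number is a float
```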