3  Control flow – conditions, loops, functions, etc

Coding starts to become interesting when we can control how variables change, and repeat operations.

3.1 Conditions (if ... elif ... else)

As a simple (mathematical) example, let us compute the minimum of two numbers using an if-condition:

a = 2
b = 5
if a < b:
    res = a
else:
    res = b
print(f"The minimum of {a} and {b} is:", res)
The minimum of 2 and 5 is: 2

In general, the structure of an if-statement reads:

if _condition1_:
    case1
elif _condition2_:
    case2
elif _condition3_:
    case3
...
else:
    case_else

Here, read elif as else if: the code enters the first case whose condition holds and skips all later ones. The else case applies only if neither the if condition nor any of the elif conditions holds.
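To see that only the first matching case is executed, consider the following small sketch. The names temperature and label are chosen purely for illustration.

```python
# Only the first condition that holds is executed, even if later ones hold too.
temperature = 25  # illustrative value
if temperature > 30:
    label = "hot"
elif temperature > 20:   # first condition that holds for 25
    label = "warm"
elif temperature > 10:   # also true for 25, but never reached
    label = "mild"
else:
    label = "cold"
print(label)
```

Although 25 > 10 holds as well, the result is "warm", since the elif for > 20 comes first.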

Python also offers some very handy abbreviations. In particular, the if ... else from the first example can be written as a single expression (a conditional expression):

res = a if a < b else b
print(f"The minimum of {a} and {b} is:", res)
The minimum of 2 and 5 is: 2

Let us look at another example, this time using elif. Given some n, the task is to print Fizz if n is a multiple of 3, Buzz if it is a multiple of 5, and FizzBuzz if both conditions are satisfied.

n = 165
res = ""
if n % 3 == 0 and n % 5 == 0:
    res = "FizzBuzz"
elif n % 3 == 0:
    res = "Fizz"
elif n % 5 == 0:
    res = "Buzz"
print(res)
FizzBuzz

3.2 Loops with fixed length (for)

Python has for-loops, discussed in this section, and while-loops, discussed in the next section. The basic use case of a for-loop is repeating a task several times. As a simple example, starting from 0 and adding b a total of a times results in a * b.

a = 3
b = 7
res = 0
for i in range(a):
    res = res + b
print(f"{a} * {b} = {res}")
3 * 7 = 21

Importantly, range(a) is a synonym for the numbers 0,..., a-1. (In particular, note that Python usually starts counting at 0 and not at 1, which will also be important in the next chapter.)
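As a small illustration of how range counts, the following sketch prints the numbers produced by a few calls. Here, list() is used only to make the numbers visible, and the variable names are arbitrary.

```python
# range(stop) starts at 0 and ends at stop - 1
first = list(range(5))
print(first)        # [0, 1, 2, 3, 4]

# range(start, stop) starts at start and ends at stop - 1
from_one = list(range(1, 6))
print(from_one)     # [1, 2, 3, 4, 5]

# range(start, stop, step) moves in steps of size step
stepped = list(range(0, 10, 2))
print(stepped)      # [0, 2, 4, 6, 8]
```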

The general structure of a for-loop is as follows:

for _var_ in _iterable_:
    _sequence of instructions_
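The iterable does not have to be a range. As a minimal sketch, a string can be iterated character by character; the example string and the variable names are chosen for illustration only.

```python
# A string is an iterable: the loop variable takes each character in turn.
vowels = 0
for ch in "control flow":
    if ch in "aeiou":
        vowels = vowels + 1
print(vowels)  # 3
```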

3.2.1 Greatest common divisor I

Let us look at a more interesting example: computing the greatest common divisor (gcd) of two numbers a and b. A very simple algorithm goes through all numbers 1,...,a in order and updates the result each time we encounter a number that divides both a and b. (Here, range(1,a+1) is a synonym for the numbers 1,...,a.)

# Basic algorithm for finding the gcd of two numbers
a = 105
b = 33

res = 1
for i in range(1,a+1):
    if a % i == 0 and b % i == 0:
        res = i
print(f"The gcd of {a} and {b} is {res}.")
The gcd of 105 and 33 is 3.

Let us make some remarks on for-loops:

  • In the last example, note that the instructions within the for-loop depend on i.
  • Within a for-loop, you can use continue to stop the execution of the current iteration and immediately jump to the next i.
  • Within a for-loop, you can use break to stop the execution of the whole for-loop.

# continue skips the rest of the current iteration
for i in range(5):
    if i == 2:
        continue
    print(i, end=" ")
print()
0 1 3 4 
# break stops the loop entirely
for i in range(10):
    if i == 5:
        break
    print(i, end=" ")
print()
0 1 2 3 4 

3.3 Loops with unknown length (while)

The for-loop iterates directly over the values of some variable (i above). The while-loop instead comes with a condition, which is checked before every iteration; the loop keeps running as long as the condition holds. The general structure is:

while _condition_:
    instructions which might change the value of _condition_
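As a first small sketch of this structure, the following loop counts how often a number can be halved before it drops below 1. The starting value 100 and the name steps are arbitrary choices for illustration.

```python
# Count how often n can be halved before it drops below 1.
n = 100
steps = 0
while n >= 1:
    n = n / 2            # this changes the value of the condition
    steps = steps + 1
print(steps)  # 7
```

Note that the number of iterations is not known in advance, which is the typical situation for a while-loop.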

Let us make two examples: finding \(\sqrt{2}\) and computing the gcd using Euclid’s algorithm.

3.3.1 Finding \(\sqrt{2}\)

For finding \(\sqrt{2}\), note that this is a fixed point of the iteration \[ x_{n+1} = \frac{x_n}{2} + \frac{1}{x_n}.\] (In order to see this, assume that \(x = x_n = x_{n+1}\), and multiply the recursion by \(x\).) We can use this as follows. Here, abs() is a built-in function returning the absolute value of a number.

x = 1
eps = 1e-10
x_prev = 0

while abs(x - x_prev) > eps:
    x_prev = x
    x = x/2 + 1 / x
print(x)
1.414213562373095
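The update used in the loop above is x/2 + 1/x. Spelled out, the fixed-point argument reads as follows, assuming \(x \neq 0\):

\[
x = \frac{x}{2} + \frac{1}{x}
\;\Longrightarrow\;
x^2 = \frac{x^2}{2} + 1
\;\Longrightarrow\;
\frac{x^2}{2} = 1
\;\Longrightarrow\;
x^2 = 2.
\]

Since the iteration starts at the positive value x = 1 and the update keeps x positive, the fixed point that is reached is the positive solution \(\sqrt{2}\).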

3.3.2 Finding the greatest common divisor II

The above algorithm (using a for-loop) for finding the gcd is not particularly efficient. (You might want to test it with some large numbers.) Euclid’s algorithm is much more efficient. It is based on the observation that a % c == 0 (c divides a) and b % c == 0 iff a % c == 0 and (b % a) % c == 0.

In order to see this, note that a % c == 0 and b % c == 0 iff a % c == 0 and (b - d * a) % c == 0 for any integer d. Then, the result follows from choosing d = b // a, since b - (b // a) * a == b % a. Moreover, every number divides 0, so if b % a == 0, the gcd of a and b is a.
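The observation behind Euclid's algorithm can be checked numerically for one pair of numbers. The following is only a sanity check and not a proof; the variable names r and same are chosen for illustration, and "c divides a" is written as a % c == 0.

```python
# Check: c divides a and b  <=>  c divides a and b % a
a = 33
b = 105
r = b % a            # 105 % 33 == 6
same = True
for c in range(1, b + 1):
    divides_a_and_b = (a % c == 0) and (b % c == 0)
    divides_a_and_r = (a % c == 0) and (r % c == 0)
    if divides_a_and_b != divides_a_and_r:
        same = False
print(r, same)  # 6 True
```

So the pairs (a, b) and (a, b % a) have the same common divisors, and in particular the same gcd.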

We can use these insights, taking a < b for simplicity. Starting with a and b, we check whether b % a == 0. If yes, we are done, and return a. If not, we compute the gcd of b % a and a instead. (If a > b, the first step merely swaps the two numbers, as the output below shows.)

# Euclid's algorithm for finding the gcd of two numbers
a_input = 105
b_input = 33
# Copy the input to new variables
a = a_input
b = b_input
# Run Euclid's algorithm
while a != 0:
    r = b % a
    print(f"b = {b}, a = {a}, remainder = {r}")
    b = a
    a = r
print(f"The gcd of {a_input} and {b_input} is {b}.")
b = 33, a = 105, remainder = 33
b = 105, a = 33, remainder = 6
b = 33, a = 6, remainder = 3
b = 6, a = 3, remainder = 0
The gcd of 105 and 33 is 3.

3.4 Functions (def)

Once we have implemented something, we want to reuse it without rewriting the code. This means we want to pack our code into its own function. For Euclid’s algorithm, this looks as follows:

a = 105
b = 33

def gcd(a,b):
    """ Compute the greatest common divisor of a and b."""
    while a != 0:
        r = b % a
        b = a
        a = r
    return b

print(f"The gcd of {a} and {b} is {gcd(a,b)}.")
The gcd of 105 and 33 is 3.

The general structure is

def name_of_the_function(arg1, arg2, arg3=default3, arg4=default4):
    some instructions
    return something
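As a concrete instance of this structure, here is a minimal sketch of a function with one default argument. The function power is a made-up example and not part of Python itself.

```python
def power(base, exponent=2):
    """Return base raised to exponent (exponent defaults to 2)."""
    return base ** exponent

squared = power(3)      # uses the default exponent=2
cubed = power(3, 3)     # overrides the default
print(squared, cubed)   # 9 27
```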

Not all functions return something explicitly. (Such functions implicitly return None.) Those that don’t usually have some other effect, such as printing output. Here is an example:

def greeting(name, prefix="Hello"):
    """ Print a greeting for name."""
    print(f"{prefix}, {name}!")

greeting("Alice")
greeting("Bob", prefix="Hi")
greeting(prefix="Welcome", name="Charlie")
Hello, Alice!
Hi, Bob!
Welcome, Charlie!

Here are some important remarks on functions:

  • Variables within functions are private to them. (There will be exceptions to this.) In the gcd example, the variable names a and b occur both within and outside of the function. Although the variables a and b that are private to the function are changed within it, the print-statement still sees the values assigned before the function definition.
  • Arguments of functions might have default values, such as prefix="Hello" in the greeting example.
  • For passing arguments, there are two options:
    • You can rely on the position they have in the definition of the function. (Example: greeting("Bob", prefix="Hi"), where you rely on "Bob" being in the name position.) These are called positional arguments.
    • You can give the explicit name the argument has in the function definition. These are called keyword arguments. In this case, the order of the arguments does not matter. (Example: In greeting(prefix="Welcome", name="Charlie"), the output was correct although the order of the two arguments differs from the function definition.)
  • All positional arguments must come before keyword arguments.
  • Both functions from above have a docstring, which is the string directly after the def-line (often written as a multiline string). It indicates what the function is doing and is displayed in various places, e.g. when you hover over a function in your Jupyter notebook.
  • Sometimes, defining a separate function for a very simple task is too much. For this case, Python comes with lambda-functions, which are very quick to write: lambda x: x * x. Most use cases involve container data types, and we will see one in Section 4.1.
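Some of these remarks can be illustrated with a short sketch. The names double_it and square are made up for this example.

```python
def double_it(x):
    """Double the local variable x and return it."""
    x = 2 * x      # changes only the function's private x
    return x

x = 10
y = double_it(x)
print(x, y)                # 10 20: the outer x is unchanged

square = lambda t: t * t   # a lambda-function, bound to a name here
print(square(7))           # 49

print(double_it.__doc__)   # the docstring is stored with the function
```

Binding a lambda-function to a name, as done here, is only for demonstration; typically it is passed directly to another function.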

3.5 Exception handling (raise, try ... except)

In code, various things happen, and some go wrong. Raising (throwing) errors, catching them, and dealing with them is the topic of this section. Python comes with many built-in error types (exceptions). Let us look at some of them:

| Exception | When it happens | Example |
|---|---|---|
| ValueError | Right type, wrong value | int("abc") |
| TypeError | Wrong type used | "2" + 3 |
| IndexError | Index out of range | [1,2][5] |
| KeyError | Dictionary key missing | {"a":1}["b"] |
| ZeroDivisionError | Division by zero | 1 / 0 |
| FileNotFoundError | File does not exist | open("x.txt") |
| PermissionError | No access rights | opening protected file |
| AttributeError | Object has no attribute | "abc".foo |
| NameError | Variable not defined | print(x) |
| ImportError | Import fails | from math import nonexisting |
| ModuleNotFoundError | Module not found | import xyz |
| AssertionError | assert fails | assert False |

We obtain an error here:

print("This will not work")
print(f"1/0 = {1/0}")
print("If this print works, the program has continued past 1/0")
This will not work
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
Cell In[12], line 2
      1 print("This will not work")
----> 2 print(f"1/0 = {1/0}")
      3 print("If this print works, the program has continued past 1/0")

ZeroDivisionError: division by zero

However, if we catch the error, the code can run past it:

try:
    print("This will not work")
    print(f"1/0 = {1/0}")
    print("This never runs")
except ZeroDivisionError as e:
    print("You must not divide by zero!")
    print("This is the error message:", e)
print("The program continues!")
This will not work
You must not divide by zero!
This is the error message: division by zero
The program continues!
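A single try can also be followed by several except clauses, one per error type. The following is a minimal sketch; the helper describe_failure is a made-up name for illustration.

```python
def describe_failure(text, divisor):
    """Try to compute int(text) / divisor and report what went wrong."""
    try:
        return int(text) / divisor
    except ValueError:           # int(text) failed
        return "not a number"
    except ZeroDivisionError:    # divisor was 0
        return "division by zero"

print(describe_failure("10", 4))    # 2.5
print(describe_failure("abc", 4))   # not a number
print(describe_failure("10", 0))    # division by zero
```

Only the except clause matching the raised error type is executed.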

Now we know how to catch errors, but often we need to throw them ourselves. This is simple using raise:

def compute_sth(x):
    if not isinstance(x, int):
        raise TypeError("x must be an int!")
    y = x ** x
    return y

try:
    print(compute_sth(2))
    print(compute_sth("abc"))
except TypeError:
    print("Calculation not possible, sorry")
4
Calculation not possible, sorry
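The message passed to the error when raising it is available to whoever catches it. As a small sketch, assuming a made-up function checked_root that rejects negative input:

```python
def checked_root(x):
    """Return the square root of x; raise ValueError for negative x."""
    if x < 0:
        raise ValueError(f"x must be non-negative, got {x}")
    return x ** 0.5

try:
    checked_root(-4)
except ValueError as e:
    message = str(e)     # the text given to ValueError(...)
print(message)           # x must be non-negative, got -4
```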

3.6 File input and output, context management (open, read, write, with)

Reading and writing files is one of the most common tasks in data analysis. Python’s built-in open function, combined with the with statement, provides a clean way to handle files. The with context manager handles opening and closing files for you. More generally, our Python file (or notebook) must interact with other files or resources (databases, other connections). In order to do this safely, we use with, which guarantees that the resource is entered (opened) and exited (closed) correctly. The general structure is as follows:

with something as name:
    do_something()

Specifically for input and output from and to files, let f be an open file object. Then, the following are useful:

  • open(path, mode): open a file. See below for some details on mode.
  • f.read(): read the entire file as a single string.
  • f.readline(): read a single line.
  • f.readlines(): read all lines into a list of strings.
  • f.write(s): write string s to the file.
  • f.writelines(lines): write a list of strings to the file.
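The following sketch shows writelines, readline and readlines in action. It writes into a temporary directory (via the standard modules os and tempfile) so that no existing file is touched; the file name lines.txt is an arbitrary choice.

```python
import os
import tempfile

# Write into a temporary directory so that no existing file is touched.
path = os.path.join(tempfile.mkdtemp(), "lines.txt")

with open(path, "w") as f:
    # writelines does not add newlines, so they are part of the strings
    f.writelines(["first\n", "second\n", "third\n"])

with open(path, "r") as f:
    head = f.readline()    # reads up to and including the first newline
    rest = f.readlines()   # the remaining lines, as a list of strings

print(repr(head))   # 'first\n'
print(rest)         # ['second\n', 'third\n']
```

Note that readlines continues where readline stopped, since the file object remembers its current position.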

with open("hello.txt", "w") as f:
    f.write("Hello")
    f.write("This is a new line in the file!\n")
    print("File written.")
File written.
with open("hello.txt", "r") as f:
    text = f.read()
print("This is the content of the file:", text)
This is the content of the file: HelloThis is a new line in the file!
# reading line by line (useful for large files)
with open("hello.txt", "r") as f:
    for line in f:
        print(line.strip())
with open("hello.txt", "r") as f:
    print(f.read())
HelloThis is a new line in the file!
HelloThis is a new line in the file!

Here, "hello.txt" is closed at the end of the with block. Note that f.write does not add a newline by itself, which is why Hello and the text of the second write end up on the same line of the file. The second argument of open comes with the following modes:

  • r: read (text mode)
  • w: (over-)write (text mode)
  • a: append (text mode)
  • x: create new (text mode; throws an error if the file already exists)
  • rb: read (binary mode)
  • wb: (over-)write (binary mode)
  • ab: append (binary mode)
  • xb: create new (binary mode; throws an error if the file already exists)
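A short sketch of the modes x and a, and of the fact that the file is closed after the with block. Again, a temporary directory is used, and the file name modes.txt is an arbitrary choice.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "modes.txt")

with open(path, "x") as f:       # "x": create the file, fail if it exists
    f.write("one\n")
print(f.closed)                  # True: the with block closed the file

with open(path, "a") as f:       # "a": append to the end of the file
    f.write("two\n")

exists_error = False
try:
    open(path, "x")              # the file exists by now
except FileExistsError:
    exists_error = True
print(exists_error)              # True

with open(path, "r") as f:
    content = f.read()
print(repr(content))             # 'one\ntwo\n'
```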

3.7 Exercises

Exercise 1 Do newlines and tabs \n, \t count in str.isspace()? How do numbers interact with str.islower()? If s is the string of a number, what is s.lower()? Play around a bit with these functions and edge cases you might see in data. Then write a function classify_char(c) that takes a single character and returns "letter", "digit", "whitespace", or "other".

# Exercise 1

Exercise 2 Compute \(\sum_{i=1}^{100} i\) using a for-loop.

# Exercise 2

Exercise 3 Find the largest \(n\) such that \(\sum_{i=1}^n i < 1000\).

# Exercise 3

Exercise 4 for-loops can also become nested. Compute \(\sum_{i=1}^{100} \sum_{j=1}^i j\). (Note the double indentation for the inner for-loop!)

# Exercise 4

Exercise 5 Write a function (without using imports) which computes the sum of all digits (the digit sum) of an int.

# Exercise 5

Exercise 6 Can you code the FizzBuzz example in a single line of code? That is, for a given n, write one line of code which gives Fizz if n is a multiple of 3, Buzz if it is a multiple of 5, and FizzBuzz if both conditions are satisfied.

# Exercise 6

Exercise 7 Python (since version 3.10) comes with match, which is useful for branching on specific values. Write a function season(month) that takes a month number (1–12) and returns the season ("Winter", "Spring", "Summer", "Fall"). Use match with grouped patterns (e.g. case 3 | 4 | 5:) and a wildcard _ for invalid inputs. Find out how match differs from a chain of elifs.

# Exercise 7

Exercise 8 The following function is supposed to compute the average of a list of numbers, but it contains a subtle bug. Find and fix it. (Hint: try it with a few inputs of different lengths and compare the result to what you expect. Debugging with print statements or the built-in debugger can help — see Section 11.7.)

def average(numbers):
    total = 0
    for i in range(1, len(numbers)):
        total += numbers[i]
    return total / len(numbers)

print(average([10, 20, 30]))    # expected: 20.0
print(average([1, 2, 3, 4]))    # expected: 2.5
print(average([7, 8]))          # expected: 7.5
16.666666666666668
2.25
4.0
# Exercise 8

Exercise 9 Write a function collatz(n) that returns the number of steps it takes for the Collatz sequence starting at n to reach 1. (The rule is: if n is even, divide by 2; if odd, compute 3n+1.)

# Exercise 9

Exercise 10 Write a function newton_sqrt(a, tol=1e-10) that computes \(\sqrt{a}\) using Newton’s method, i.e. the iteration \(x_{k+1} = \frac{1}{2}\left(x_k + \frac{a}{x_k}\right)\), starting from \(x_0 = a\), and stopping when \(|x_{k+1} - x_k| < \text{tol}\).

# Exercise 10

Exercise 11 Write a function primes_up_to(n) that returns a list of all prime numbers up to n, using the Sieve of Eratosthenes.

# Exercise 11

Exercise 12 Write a function bisection(f, a, b, tol=1e-10) that finds a root of a continuous function \(f\) on \([a, b]\) using the bisection method. The function should raise a ValueError if \(f(a)\) and \(f(b)\) have the same sign. Test it on \(f(x) = x^3 - 2\).

# Exercise 12

Exercise 13 Write a function caesar(s, k) that shifts every letter in the string s by k positions in the alphabet (wrapping around from z to a). Non-letter characters should remain unchanged. (Hint: ord(c) returns the integer code of a character, e.g. ord("a") == 97, and chr(n) converts an integer back to a character, e.g. chr(97) == "a".)

# Exercise 13