5 Loops
for i in range(2):
print("Hello, World!")
## Hello, World!
## Hello, World!
Very much unlike human beings, computers are good at executing repetitive tasks without making errors. It just so happens that boring, repetitive tasks are an inescapable part of many exciting things. Take psychological research as an example. In conducting psychological research, you may come up with a theory about human beings, about how things work between humans, or maybe between a human operator and a machine. Or, alternatively, you may have a research question, a question about how things work, how human beings function, why they make the mistakes they make. Now, what do psychologists do to find an answer to their scientific questions? Right, psychologists conduct empirical research.
Let’s say, you have a research question and you try to answer your question by setting up an empirical experiment. You invite others to participate in your research. You repeat your experimental procedure with each and every participant. Depending on the kind of research you conducted, you repeat the qualitative inspection of the generated data for each participant. And depending on the statistical tool you use, you may end up filling in the data of each and every participant by hand into your statistical software. Now, here is the good news. Programming can help you unbore (yes, that is a word) the boring and repetitive bits of your exciting research.
At the root of boring tasks often lies repetition. You repeat an experimental procedure, you repeat the same quality checks for the data of each participant, you perform the same data cleaning on each single row in a data frame, and so on. Human beings tend to make mistakes when they repeat the same task over and over again. At the same time, this is exactly where the strength of computers lies. Computers do not make mistakes when they repeat a task. When a computer makes a mistake, then it is because it was falsely instructed in the first place.
Chapter 5 introduces you to the mechanism that lies at the heart of a computer’s paramount strength, repetition; or as it is called in computer jargon: loops. The Python language has a number of features that make it easier for you as a programmer to make your computer execute those boring, repetitive tasks for you.
5.1 The for
loop
for letter in "Hello":
print(letter)
## H
## e
## l
## l
## o
The for
loop is one possibility to create repetition in a computer program. In computer science, the repeated execution of a set of statements is called iteration. The for
loop iterates over a sequence of objects. Remember from chapter 4 Lists, that lists and strings are both sequences. They are ordered collections of elements, where strings are sequences of characters and lists can contain any kind of object. So what does it mean for the for
loop to iterate over a sequence? It means that it visits each element in the sequence one by one, in an ordered fashion.
Regarding the syntax of for
loops; a for
loop begins with the for
keyword, followed by the so-called loop variable. In the example above, the variable letter
serves as the loop variable. The in
keyword, which you encountered in chapter 4 follows on the loop variable. Then a sequence which can (amongst other things) be a list or a string is included. Finally, a colon :
demarcates the end of the for
statement. Semantically, the for
statement in the above example says nothing else but
For each letter that exists in the word “Hello”
On each iteration of the for
loop, the loop variable takes on the value of the next element in the sequence. This means that on the first iteration, the loop variable letter
has the value H
, on the second iteration, it has the value e
, on the third iteration, its value is l
, and so on. The loop variable ceases to take on a new value once it is assigned the last element of the designated sequence, in this case theo
of "Hello"
.
Syntactically, after the for
statement, the so-called loop body follows. The body of a loop specifies the statements that are to be executed on each iteration. In the example above, printing the current value of the loop variable letter
is executed on each iteration of the loop.
The body of a loop needs to be indented. The end of a for
loop is demarcated by indentation. Once the value of the last element in the sequence has been assigned to the loop variable, and the body of the loop has been executed, the loop terminates. The execution of the computer program is continued on the first line after the loop body, and indentation (or rather the lack thereof) demarcates this line, like so
for letter in "Hello":
print(letter)
print("World")
Here, print("Hello")
constitutes the first line after the for
loop.
Instead of including a manually defined sequence in the for
statement, such as the string "Hello"
, you can use the range
method to create a list of integers. For example, you may want to repeat a certain block of statements twice. However, the block of statements may not involve a suitable sequence over which you can make a for
loop iterate. In such cases, you can use the range
method to create a looping mechanism, like so
for i in range(2):
print("Hello, World!")
## Hello, World!
## Hello, World!
Unless specified otherwise, the range
method initiates a list that contains each integer starting at 0
until, but excluding the specified integer between the brackets ()
.
print(range(2))
## range(0, 2)
In the example above, the loop variable i
will thus have two values before the loop terminates, 0
and 1
. In total, the statement print("Hello, World")
is therefore executed twice.
Read the documentation of the range
method for further information on how to customize the sequence of integers it produces.
help(range)
In summary, from the above it follows that there are at least two ways for iterating over ordered sequences of objects, such as lists and strings, using for
loops. In the first strategy, you assign each element of the sequence iteratively to the loop variable, like so
= ['a', 'b']
a for i in a:
print(i)
## a
## b
Following the second strategy, you access the elements of the sequence through their indices, like so
for i in range(len(a)):
print(a[i])
## a
## b
Which way of iterating over ordered sequences you prefer is up to you. Sometimes, the problem you try to solve with your code will require one way, sometimes the other. However, when constructing loops, remember that you cannot access the elements of an unordered collection such as dictionaries using the index operator.
= {'a': 1,
d 'b': 2}
for i in range(len(d)):
print(d[i])
## Error in py_call_impl(callable, dots$args, dots$keywords): KeyError: 0
##
## Detailed traceback:
## File "<string>", line 2, in <module>
You will need to think of another way to iterate over the contents of a dictionary. You could use the keys()
method to iterate over the dictionary’s keys, or the values()
method for its values respectively, like so
for i in d.keys():
print(i)
## a
## b
for j in d.values():
print(j)
## 1
## 2
Alternatively, the items()
method can be used to iterate over the key-value pairs contained in a dictionary.
for z in d.items():
print(z)
## ('a', 1)
## ('b', 2)
As usual, which iteration strategy you use depends on what you try to achieve. There is not one golden solution that fits all problems.
5.2 The while
loop
= 0
a = 1
b
while b < 5:
+= b
a += b
b print("Current sum is:", a, "current value is:", b)
## Current sum is: 1 current value is: 2
## Current sum is: 3 current value is: 4
## Current sum is: 7 current value is: 8
print("This is the first line of code after the loop, the sum is:", a)
## This is the first line of code after the loop, the sum is: 7
A while
loop is centered around a condition formulated in the first line of the loop, the while
statement. It allows you to repeatedly execute a number of body statements as long as the condition in the while
statement remains True
. At the same time, this means that the body of a while
loop is never executed if the loop condition is False
at its initial execution.
= 0
a = 5
b
while b < 5:
+= b
a += b
b print("Current sum is:", a, "current value is:", b)
print("This is the first line of code after the loop, the sum is:", a)
## This is the first line of code after the loop, the sum is: 0
The strategy behind while
loops is that the body of the loop changes one or more (loop) variables so that the loop condition eventually becomes False
. Otherwise, the loop will continue forever and we speak of an infinte loop. Infinite loops can manoeuver your program into a dead end if started unintentionally.
Note how the basic structure of the Stroop Task program is a large while loop. Unless the required number of trials has been executed or the user manually closes the program, it will keep running no matter what.
5.2.1 for
or while
?
At times, it may be confusing whether either a for
or a while
loop will suit your needs best. If you are unsure which kind of loop you should use, ask yourself the following:
- Do I know in advance how many times I need the loop body to be executed? Problems like “searching a collection for a specified element” or “executing code a specified number of times” suggest that a
for
loop will suit your needs best. - Do I require the computations to stop when some condition is met, but I cannot tell in advance when (or if) this will happen? Chances are high that a
while
loop will fit your needs best in this case.
Still, as a programmer you have great freedom in choosing either for or while loops. In many cases, using either kind of loop is but a difference in perspective on the problem you try to solve with your code. In other cases, using one kind of loop will be more efficient than using the other.
5.3 The break
statement
= 17
myNumber while True:
= input("Enter a number:")
guess if guess > 17:
print("My number is smaller than {}".format(guess))
elif guess < 17:
print("My number is greater than {}".format(guess))
else:
break
print("Ding, ding! Correct, my number is {}".format(myNumber))
The break
statement allows you to immediately exit any loop, be it a for
or a while
loop. The next statement that is executed is the first line after the loop body. Interactive programs that process user input often require to exit their loops prematurely. Depending on the input given, the program may need to transition to the next state or terminate altogether.
Also note the usage of String formatting. The .format()
method can be applied to any String and allows you to specify value and variable substitutions within a given String. While a detailed discussion of formatters is beyond the scope of the current chapter, a tutorial on String formatting can be found [here] (https://www.digitalocean.com/community/tutorials/how-to-use-string-formatters-in-python-3). String formatting is highly useful, no matter whether you want to export some information out of your program by writing text or Excel files, or whether you simply want to print something to the console.
The basic idea behind String formatters is that you substitute a value, a variable, or an expression that you want to insert in your String by curly brackets {}
. You specify the to-be-inserted value within the method brackets, like so
print("{} participants took part in the conducted research".format(10))
## 10 participants took part in the conducted research
Formatters can manage multiple placeholders, variables and perform cool transformations on the to-be-substituted values, like rounding
= 10
n print("{0} participants took part in the conducted research. Their average score was {1:.2f}.".format(n, 25.7777))
## 10 participants took part in the conducted research. Their average score was 25.78.
Notice the usage of the positional arguments 0
and 1
to specify the order in which the to-be-substituted values n
and 25.7777
are inserted in the String. Positional arguments specify the index of the to-be-inserted values within the brackets of the format
method.
Here, rounding is achieved by some minor usage of format code syntax, which first specifies the position of the transformation and then the intended conversion {position:transformation}
. In this case we transform 25.7777
inhabiting index 1
into a floating point with two decimals, thus the format code syntax {1:.2f}
. We could as well have exchanged the two values, but that would not have made much sense, would it now?
= 10
n print("{1:.2f} participants took part in the conducted research. Their average score was {0}.".format(n, 25.7777))
## 25.78 participants took part in the conducted research. Their average score was 10.
Check out the tutorial for more information on String formatters!
5.4 Controlling the flow of your program with loops
By now you will have noticed that by default, Python faithfully executes any program in a top-down order. But what if you want to change the order in which specific parts of your program are being executed, possibly depending on external conditions such as the time of the day, or the input of a user?
The basic top-down execution order can be alterated using control flow statements. Python knows three control flow statements: if
,for
, and while
statements. We already encountered if
statements in Chapter 3 Conditionals. Control flow statements are readily used in interactive software such as the Stroop Task experiment
= "welcome"
STATE while True:
# Changing states by user input
for event in pygame.event.get():
if STATE == "welcome":
if event.type == KEYDOWN and event.key == K_SPACE:
= "prepare_next_trial"
STATE print(STATE)
if event.type == QUIT:
= "quit" STATE
In principle, the above code can run infinitely, unless the user triggers the QUIT
event.type
. Note that in the original Stroop experiment, there is another condition that prevents the program from running infinitely, namely a maximum number of trials.
Within the main while
loop, a for
loop manages the changing of program states as a consequence of user input. To this end, the loop variable event
iterates over event objects stored in an event queue, which is essentially a special sequence devised by the Pygame library. The pygame.event
module knowns a number of event types, among which the KEYDOWN
event, referring to the event that the user presses a key on the keyboard.
In the example above, several conditions have to be fulfilled so that the program state changes to "prepare next trial"
: The current program state has to be "welcome"
, the event has to be a KEYDOWN
and the key pressed has to be SPACE
. Note that in the program above, program termination can be achieved at any time through the "quit"
state.
5.5 Common errors
- Easily the most common error surrounding loops is trying to make them iterate over an integer instead of a sequence. Very often, you will want your program to modify the elements of a given sequence according to a set of statements. Logically, you will need as many iterations as elements in the sequence. This often leads to
for
statement definitions such as:
= [1,2,3,4,5]
myList for i in len(myList):
+= 1 myList[i]
## Error in py_call_impl(callable, dots$args, dots$keywords): TypeError: 'int' object is not iterable
##
## Detailed traceback:
## File "<string>", line 1, in <module>
Probably, your intention was to make the loop repeat as many times as there are elements in myList
. However, len(myList)
returns an integer, not a sequence object! Instead, you will have to convert the length of myList
into a sequence, for instance by using range()
.
= [5,2,3,4,5]
myList for i in range(len(myList)):
+= 1
myList[i]
print(myList)
## [6, 3, 4, 5, 6]
Alternatively, you can assign the values of myList
iteratively to the looping variable i
to avoid working with the length
of the sequence.
= [5,2,3,4,5]
myList for i in myList:
+= 1
myList[myList.index(i)]
print(myList)
## [6, 6, 3, 4, 5]
But wait, this result seems slightly different. You do get an alterated version of myList
where each element is enhanced by 1
(as in the first example), but the order is messed up. That is exactly because a different strategy is devised by accessing each element directly in the for
statement. Furthermore, the index()
method is used to find the index of the first occurrence of the value specified within the brackets. It is because the method returns the index of the first occurence that the two strategies return different results in this case.
When you opt for either of the two looping strategies, keep in mind that you may need to go to some length to get the exact same result using both strategies. Usually, either one strategy quickly turns out to be the better, less complicated means for your end. In the above example, you could for instance have used a second list object to obtain the same result as in the first example, like so
= [5,2,3,4,5]
myList = []
a for i in myList:
+ 1)
a.append(i
print(a)
## [6, 3, 4, 5, 6]
Again, whether using a second list is a viable option to solve the problem at hand is totally up to what you try to achieve with your code.
-
index out of range
errors tend to slip in when working withfor
loops.
= [1, 'a']
a = [1, 'a', 'b']
b
for i in range(len(b)):
= b[i] a[i]
## Error in py_call_impl(callable, dots$args, dots$keywords): IndexError: list assignment index out of range
##
## Detailed traceback:
## File "<string>", line 2, in <module>
An index out of range error
is thrown when you try to access a non-existing index in a sequence. In the example above, a
has two elements, thus two valid indices (0
and 1
). b
on the contrary has three elements and thereby three valid indices (0
, 1
and 2
). The code snippet throws an error when a third index is accessed in a
. Always make sure that the sequences you iterate over or intend to modify in the body of a for
loop have the right amount of indices at their disposal.
- Letting a
while
loop run endlessly. Unless you actively construct awhile
loop to terminate, it will not. This can result in all sorts of errors, but usually you will notice that your program does not enter its later states. When you construct yourwhile
loops, think well about how to terminate them. Thebreak
statement can also be an option to intentionally terminate the loop when another condition is met that is not defined in thewhile
statement, like so
= 5
i = 13
myNumber while i > 0:
= input("Enter a number:")
guess if guess == myNumber:
print("Ding, ding! Correct, my number is {}".format(myNumber))
break
elif guess > myNumber:
-= 1
i print("My number is smaller than {}".format(guess))
print("You may guess another {} times".format(i))
elif guess < myNumber:
-= 1
i print("My number is greater than {}".format(guess))
if i == 0:
print("Too bad, you lost")
else:
print("You may guess another {} times".format(i))
In the above alteration of the guessing game, there are two ways to make the program terminate. First, you guess the number. Second, you run out of trials. The easiest way of terminating a while
loop in response to user input remains using the break
statement.
5.6 Debugging
Loops can (in theory) continue for an infinite amount of iterations. Imagine looping over a data set of more than 1000 experimental participants. Loops with a large amount of iterations may seem cumbersome to debug at first sight. Computers make mistakes when they are falsely instructed and they will make the same mistake over and over again, infinitely if they must. You can imagine that debugging a loop that runs over hundreds or thousands of iterations is a pain when each time you change a small piece of code you need to run all iterations before you can inspect the result of the change in code. Here, the break
statement can be used to inspect the output your loop generates at a single, or a fixed amount of iterations. By doing so, you do not only save time because your program does not need to execute each iteration of the program, your output will also be less cluttered. Picture a piece of code that produces one printed line for each iteration of code, like so
for i in range(1000):
print("Hello, World!")
Now you change a small part of the code and want to inspect its effect.
for i in range(1000):
print("Hello " + "World" + "!")
It is unneccessary to print this one-liner a thousand times to see what is going on. Instead, you can use the break statement to print a single line.
for i in range(1000):
print("Hello " + "World" + "!")
break
## Hello World!
This comes in handy when you suspect a bug in one of your loops. Remember that if there is a bug in your loop, your computer will make the same mistake over and over again. So if you identify the bug in one iteration or a small amount of iterations, it will also be present in the exact same way in the remaining iterations. Usually, you can resolve bugs in your loops based on what you inspect in a subset of iterations.
5.7 Exercises
5.7.1 Exercise 1. Following the control flow I
What is the output of the following code snippets?
for i in range(len([1,2,3])):
print(i)
for i in range(len([])):
print(i)
= 0
a for i in range(3):
+= i
a print(a + i)
= 0
a = 30
b for i in range(5):
+= i
a -= i
b
print(a + b)
for i in range(1,3):
print(i)
= 10
i while i > 5:
print(i - 5)
-= 2 i
= 2
a = (1,2)
b
while a != 0:
for i in b:
print(a + i)
-= 1 a
5.7.2 Exercise 2. Debugging I
Do the following code snippets throw an error? If they do, what goes wrong? For each exercise, only one answer is correct.
a = [4,6,8]
for i in len(a):
print(a[i])
- No error is thrown, the code snippet prints
4
on its first iteration, then6
, and finally8
. - No error is thwon, the code snippet prints
0
on its first iteration, then1
and finally2
. - A
TypeError
is thrown because thefor
statement does not include a sequence to iterate over.
myNumber = 43
while True:
guess = input("Enter a number:")
if guess > 43:
print("My number is smaller than {0}".format(guess))
elif guess < 43:
print("My number is greater than {0}".format(guess))
else:
print("Ding, ding! Correct, my number is {0}".format(myNumber))
- Nothing goes wrong, once the user guesses
43
,Ding, ding! Correct, my number is 43
will be printed and the program terminates. - No error is thrown, once the user guesses
43
,Ding, ding! Correct, my number is 43
will be printed, but the program does not terminate. - An
IndexError
will be thrown by the positioning argument in the curly brackets because only one value is provided as substitution in which case using indexing is wrong.
a = [4,6]
b = [0, 1.5]
for i in a:
for j in range(len(b)):
print(i * b[j])
- A
TypeError
is thrown because the firstfor
statement involving the looping variablei
does not include a sequence to iterate over. - No error is thrown, the output is of the mini program is
0
,6.0
,0
,9.0
because each element ina
is multiplied by each element inb
. - A
NameError
is thrown becausei
is an undefined variable in the body of the secondfor
loop that starts withfor j in range(len(b))
. - No error is thrown, the output of the mini program is
0
,0
,0
,1.5
because each index ina
is multiplied by each element inb
.
5.7.3 Exercise 3. A guessing game
Make a Python script guessing.py
that implements the following behaviour:
- At the beginning of the script, a random number between 0 and 1000 is generated.
- The program continuously asks for user input prompting the user with the request to enter a number.
- If the user guesses the number correctly, the program gives feedback that the number was guessed correctly. The program also shows the number of guesses needed. Finally, the program terminates.
- If the guessed number is too large, the program gives feedback that the number is smaller.
- If the guessed number is too small, the program gives feedback that the number is larger.
- If no input is given, the program should terminate. In this case, the program should terminate if the given input is empty. Hint: Use the internet to figure out how to handle empty input.
Hint: Prompting and feedback via the console is sufficient.
5.7.4 Exercise 4. Controlling the flow
Programming small interactive programs such as the guessing game helps you understand how to use control flow statements to bring about a desired behaviour in your programs. For this exercise, you will be programming an extension to the guessing game, a three digit number lock!
5.7.5 Exercise 5. Following the control flow II
Read the script below carefully. Indicate the values of the following expressions after the script has finished executing:
dataset.keys()0]]
dataset[dataset.keys()[0]
condA[0][0]
condA[len(condA)
len(condB)
meanA meanB
You can check your answers afterwards by running the script.
import numpy as np
= {'p1':(21,"Female","Dutch","B-Psychology"),
dataset 'p2':(20,"Female","Dutch","B-Psychology"),
'p3':(21,"Female","Dutch","B-Applied_Mathematics"),
'p4':(23,"Male", "German","B-Communication_Science"),
'p5':(20,"n/a","Dutch","M-Business_Administration"),
'p6':(19,"Male","Swedish","B-Computer_Science"),
'p7':(19,"Male","German","B-Communication_Science"),
'p8':(24,"Female","Italian","M-Psychology"),
'p9':(23,"Female","Italian","M-Communication_Science"),
'p10':(18,"Male","Dutch","B-Computer_Science")
}
= []
condA = []
condB
= []
ageA = []
ageB
for participant in dataset.keys():
if dataset[participant][1] == "Female":
condA.append(dataset[participant])elif dataset[participant][1] == "Male":
condB.append(dataset[participant])
for participant in condA:
0])
ageA.append(participant[
= 0
count while count < len(condB):
0])
ageB.append(condB[count][+= 1
count
= np.mean(ageA)
meanA = np.mean(ageB) meanB
5.7.6 Exercise 6. Debugging II
The script below calculates some descriptives from data
. The code is still somewhat buggy. Debug the script and indicate the values of fastest
, slowest
, and all_mean
.
data = {'p1':(21,"Female","Condition B",[0.675,0.777,0.778,0.62,0.869]),
'p2':(20,"Female","Condition A",[0.599,0.674,0.698,0.569,0.7]),
'p3':(21,"Female","Condition A",[0.655,0.645,0.633,0.788,0.866]),
'p4':(23,"Male", "Condition A",[0.721,0.701,0.743,0.682,0.654]),
'p5':(20,"Male","Condition B",[0.721,0.701,0.743,0.682,0.654]),
'p6':(19,"Male","Condition B",[0.711,0.534,0.637,0.702,0.633]),
'p7':(19,"Male","Condition B",[0.687,0.657,0.766,0.788,0.621]),
'p8':(24,"Female","Condition A",[0.666,0.591,0.607,0.704,0.59]),
'p9':(23,"Female","Condition B",[0.728,0.544,0.671,0.689,0.644]),
'p10':(18,"Male","Condition A",[0.788,0.599,0.621,0.599,0.623])
}
fastest = ("initialization", 100)
for participant in data:
RTsum = 0
for i in len(data[participant][3]):
RTsum += data[participant][3][i]
RTmean = RTsum/len(data[participant][3])
if RTmean < fastest[1]:
fastest = (participant,RTmean)
slowest = ("initialization", 0)
for participant in data:
RTsum = 0
for RT in participant[3]:
RTsum += RT
RTmean = RTsum/len(data[participant][3])
if RTmean > slowest[1]:
slowest = (participant,RTmean)
all_mean = 0
all_sum = 0
number_of_trials = 0
for participant in data:
counter = 0
while counter < len(data[participant][3]):
all_sum += data[participant][3][counter]
all_mean = all_sum/number_of_trials*len(data)
5.7.7 Exercise 7. Nested loops
Nested data often requires nested loops for answering certain questions about the data. Let’s have a look at the list
data set below. each participant has a participant number paired with a list of additional experiment information.
= [
participants "p1", ["Condition 1","Location A","withdrew","no audio"]),
("p2", ["Condition 1","Location A","no audio","no video"]),
("p3", ["Condition 2","Location B"]),
("p4", ["Condition 1","Location A","withdrew"]),
("p5", ["Condition 2","Location A"]),
("p6", ["Condition 2","Location B","withdrew"]),
("p7", ["Condition 1","Location A","no video"]),
("p8", ["Condition 1","Location B"]),
("p9", ["Condition 2","Location B","withdrew"]),
("p10",["Condition 2","Location A","withdrew"])] (
If we want to know how many participants withdrew their participation we need a counter, and for each participant we need a second loop that tests each of the experiment information in turn on the "withdrew"
keyword:
= 0
counter for (participant, expInfo) in participants:
pass
# to be implemented by you
print("The number of participants who withdrew their participation is", counter)
Implement the second for
loop that iterates over the experiment information and count the number of participants who withdrew their participation.
5.7.8 Exercise 8. Data transformation using loops
- Write a script that transforms the data set
participants
from Exercise 3 into a dictonary data set. - Extend the script with a
counter
mechanism that counts the number of participants who were tested at location A and location B respectively.
5.7.9 Exercise 9. Calculating a mean
For the sequence seq
calculate the arithmetic mean using a while
loop. Check your implementation using the mean()
method of the NumPy
library.
import numpy as np
= range(1000) seq
5.8 Think Further
- Tutorial on String formatting. [Lisa Tagliaferri (2016). How To Use String Formatters in Python 3] (https://www.digitalocean.com/community/tutorials/how-to-use-string-formatters-in-python-3) Though the heading says ‘Python 3’, all Lisa discusses in her tutorial also applies to Python 2.7.