Python Tip #4 – Find an element in a list satisfying condition

What if you want to find the first item that matches the condition instead of getting a list of items?

selected = None
for i in items:
if condition:
selected = i

# Simpler version using next()
selected = next((i for i in items if condition), None)

next() is a built in function which is not that well known.

Sensible Test Data

I am currently working on a project called Peer Feedback, where we are trying to build a nice peer feedback system for college students. We use Canvas Learning Management System (CanvasLMS) API as the data source for our application. All data about the students, courses, assignments, submissions are all fetched from CanvasLMS. The application is written in Python Flask.

Current Setup

We are mostly getting data from API, realying it to frontend or storing it in the DB. So most of our testing is just mocking network calls and asserting response codes. There are only a few functions that contain original logic. So our test suite is focused on those functions and endpoints for the most part.

We recently ran into a situation where we needed to test something that involved the fetching and filtering data from API and retriving data from DB based on the result.

Faker Library and issues

The problem we ran into is, we can’t test the function without first initalizing the database. The code we had for initializing the CanvaLMS used the Faker library, which provides nice fake data to create real world feel for us. But it came with its own set of problems:

Painful Manual Testing

While we had the feel of testing realworld information, it came with real world problems. For e.g., I cannot log in as a user without first looking up the username in the output generated during initialization. So I had to maintian a postit on my desktop, use search functionality to find the user I want to test and copy his email and login with it.


Inconsistency across test cycles

When we write our tests, there is no assurity that we could reference a particular user in the test with the id and expect the parameters like email or user name to match. With the test data being generated with different fake data, any referencing or associaation of values held true only for that cycle. For e.g, a function called get_user_by_email couldn’t be tested because we didn’t know what to expect in the resulting user object.

Complex Test Suite

To compensate for the inconsistency in the data across cycles, we increased the complexity of the test suite, we saved test data in JSON files and used it for validation. It became a multi step process and almost an application on its own. For e.g, the get_user_by_email function would first initialize the DB, then read a json file containing the test data and get a user with email and validate the function, then find the user without an email and validate it throws the right error, find the user with a malformed email… you get the idea. The test function itself not had enough logic warranting a test of test suites.

Realworld problems

With the realworld like data came the realworld problems. The emails generated by faker are not really fake. There is a high chance a number of them are used by real people. So guess what would have happened when we decided to test our email program 🙂

Sensible Test Data

We finally are switching to a more sensible test data for our testing. We are dropping the faker data for user generation, and shifiting to the sequencial user generation system with usernames user001 and emails This solves the above mentioned issues:

  1. Now I can login without having to first look it up in a table. All I need to do is append a integer to the word user
  2. I can be sure that user001 would have the email and that these associations will be consistent across test cycles.
  3. I no longer have to read a JSON file to get a user object and test related information. I can simply pick one using the userXXX template, reducing complexity of the test suite.
  4. And we won’t be getting emails for random people asking us to remove them from mailing lists and probably we are saving ourselves being blacklisted as a spam domain.


Faker provided us with data which helped us test a number of things in the frontend, like the different length in names, multi part names, unique names for testing filtering and searching etc., while also added a set of problems that makes our work difficult and slow.

Our solution to having a sensible dataset for testing is plain numerically sequenced dataset.


Using the generic name tags like user was still causing friction as we have multiple roles like teacher, TAs, students …etc., So I improved it further by creating users like student0000, ta000, `teacher00“.

Python Tip #1 – Setting flags without using if statements

When you have to check for the presence of a value in a list and set a flag based on it, we can avoid typical:

set default => check => update 

routine in Python and condense it to a single line like this.

orders = ['pizza', 'coke', 'fries']
order_book = {}

# Setting an yes or no flag in another dictionary or object
order_book['pizza'] = False
if 'pizza' in orders:
    order_book['pizza'] = True

# Simpler Version
order_book['pizza'] = 'pizza' in orders

Python – Extended Assignment Operator and Lists

Extended Assignment Operators – we use them all the time as shorthands for assigning values.
Lets see how this works when using multiple identifiers (variables) in terms of simple data like a number.

>>> a = 5
>>> b = a
>>> b += 3
>>> print(a)
>>> print(b)  # only the value of b is changed even though we have assigned a to b
>>> b = b+4  # the long form assignment also doesn't affect the original a
>>> print(b)
>>> print(a)  


Let us apply a similar set of operations on list

>>> a = [1,2,3]
>>> b = a
>>> b += [4,5]
>>> print(a)  # surprise! - changing the value of b also affect the value of a
[1, 2, 3, 4, 5]
>>> print(b)
[1, 2, 3, 4, 5]
>>> b = b+[6,7]
>>> print(b)
[1, 2, 3, 4, 5, 6, 7]
>>> print(a)
[1, 2, 3, 4, 5]  # surprise again !! - but assigning a new value to b doesn't


Now the question is, why does the value of a change when adding elements to b?
And furthermore why doesn’t it change when we do it using the long form?

The answer is: Extended assignment operators act differently on mutable and immutable values.
So when we say b += [4,5] we are basically saying add 4 and 5 to the list b, but when we say
b = b + [6,7], we are saying “take the list b, add 6, 7 to it and create a new list from it, then assign it to the identifier b”.

This subtle difference will come to bite us when we least expect it. So be aware and take precautions 🙂

Peer Pairing Algorithm

Problem Statement

A class has N number of students. When the students submit an assignment, they are assigned K peers to review their assignments and provide feedback such that 0 < K = N


Input: N=3, K=1


    0: [1],
    1: [2],
    2: [0]

Input: N=3, K=2

    0: [1,2],
    1: [2,0],
    2: [0,1]


def generate_peer_pairs(N, K):
    """A function that provides unique peers for reviewing.
    The generated matches follow the following rules:
        1. The no.of reviews/rounds of review is less than total no.of users.
        2. The user won't be reviewing his/her own work
        3. All the reviewers assigned would be unique.
        4. Each user will have equal no.of reviews to give and receive.

    :param N: Total no.of student for whom the matching is to be done.
    :param K: The no.of reviews each student is supposed to receive.
    :returns: a dictionary of graders and their peers { grader_id: [peer_1, peer_2, ...]}
    if K >= N:
        return False

    available_offsets = list(range(1, N))
    offsets = []

    while len(offsets) < rounds:
        offset = random.choice(available_offsets)

    students = list(range(N))

    allocations = {}
    for idx, student in enumerate(students):
        allocations[student] = [students[(idx + offset) % N] for offset in offsets])

    return allocations


The problem is an interesting one. I started out with the idea that the peers should be arranged in a random way and wrote an algorithm by selecting a random recipient while looping over each grader. It was completely non-roboust and failed to make the right pairs more times that it worked.

The solution if you had noticed, is completely a position shifting algorithm. What appears a non-repeating random order for the grader and the recipient is not really that random.