Lesson 9. Sets in Python

Introduction to sets

Sets in Python: structural data types

A set in Python is a built-in collection that stores unique elements. Like lists and dictionaries, sets are one of the main tools for storing and processing data.

Basic properties of sets

Unordered

The elements of a set have no defined order, so you cannot access an element by index as you can in a list. The order in which a set is printed is not guaranteed and may differ from one run of the program to the next.
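
As a minimal sketch of what "no access by index" means in practice (the variable names here are illustrative), indexing a set raises a TypeError, and one common workaround is to build a sorted list first:

```python
letters = {'c', 'a', 'b'}

# Sets do not support indexing, so letters[0] raises TypeError.
try:
    first = letters[0]
except TypeError as error:
    first = None
    print(f"Indexing failed: {error}")

# If a position is needed, build an ordered sequence first.
ordered = sorted(letters)
print(ordered[0])  # a
```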

Uniqueness of the elements

This is a key property of sets. Within a single set, each element can occur only once. If you try to add an element that is already present, the set simply won't change.

Mutability

Standard sets (set) are mutable: you can add and remove elements after creation. However, the elements themselves must be hashable, which in practice means immutable (for example, numbers, strings, tuples of immutable values). You cannot add a list or another set as an element.

How sets differ from lists and dictionaries

  • From lists (list): Sets are unordered and contain only unique elements, while lists are ordered and may contain duplicates.
  • From dictionaries (dict): Dictionaries store key-value pairs, while sets store only individual elements. You can think of a set as a dictionary that has only keys.

Main scenarios for the use of sets

Sets are ideal for tasks where uniqueness and fast membership testing are important.

  • Removing duplicates from collections.
  • Quickly checking whether an item belongs to a collection.
  • Performing mathematical operations: union, intersection, difference.
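
The membership check mentioned in the list above can be sketched as follows. The name allowed_users and its contents are made up for illustration; the point is only that the in operator works directly on a set:

```python
allowed_users = {'alice', 'bob', 'carol'}

# For a set, `in` is a hash lookup rather than a scan through every element.
print('bob' in allowed_users)          # True
print('mallory' in allowed_users)      # False
print('mallory' not in allowed_users)  # True
```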

Creating sets in Python

Empty set via set()

To create an empty set, use the set() function. Note that using empty curly braces {} will create an empty dictionary, not a set.

# The right way to create an empty set
empty_s = set()
print(type(empty_s))  # <class 'set'>

# This will create an empty dictionary!
empty_dict = {}
print(type(empty_dict))  # <class 'dict'>
                        

A set with elements through {}

You can create a set by listing its elements in curly braces. Duplicates are discarded automatically.

# Creating a set with elements
fruits = {'apple', 'banana', 'orange', 'apple'}
print(fruits) # {'orange', 'banana', 'apple'} - the order is not guaranteed
                        

Transformation of other structures into a set

You can easily create a set from any iterable data structure (list, string, tuple) using the set() function.

From the list

This is the most popular way to remove duplicates from the list.

numbers_list = [1, 2, 3, 2, 4, 1, 5]
unique_numbers = set(numbers_list)
print(unique_numbers)  # {1, 2, 3, 4, 5}
                        

Extracting unique characters from a string

greeting = "hello world"
unique_chars = set(greeting)
print(unique_chars)  # {'h', 'w', 'r', 'l', 'd', ' ', 'o', 'e'}
                        

Extracting elements from a tuple

colors_tuple = ('red', 'green', 'blue', 'red')
unique_colors = set(colors_tuple)
print(unique_colors)  # {'blue', 'green', 'red'}
                        

Extracting keys from a dictionary

When a dictionary is converted to a set, only its keys are included.

user_data = {'name': 'Alice', 'age': 30, 'city': 'New York'}
keys_set = set(user_data)
print(keys_set)  # {'name', 'age', 'city'}
                        

Adding elements

add() method

The add() method adds one element to a set. If the element already exists, the set will not change.

numbers = {1, 2, 3}
numbers.add(4)
print(numbers)  # {1, 2, 3, 4}

numbers.add(2) # Attempt to add an existing element
print(numbers) # {1, 2, 3, 4} - nothing has changed
                        

update() method

The update() method allows you to add multiple elements from any iterable structure (another set, list, string) at once.

s1 = {'a', 'b'}
s2 = ['b', 'c', 'd']
s1.update(s2)
print(s1)  # {'a', 'c', 'b', 'd'}

s1.update('xyz')
print(s1)  # {'a', 'd', 'x', 'y', 'z', 'b', 'c'}
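
A short sketch of a common pitfall with these two methods, using illustrative variable names: add() stores a string as one element, while update() iterates over it and adds each character separately.

```python
tags = {'python'}

# add() treats the string as a single element.
tags.add('sets')
print(tags == {'python', 'sets'})  # True

# update() iterates over its argument, so a string is split into characters.
letters = set()
letters.update('abc')
print(letters == {'a', 'b', 'c'})  # True

# update() also accepts several iterables in one call.
letters.update(['d'], ('e', 'f'))
print(len(letters))  # 6
```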
                        

Deleting elements

remove() method

The remove() method deletes the specified element. If the element is not in the set, it raises a KeyError.

items = {'pen', 'pencil', 'eraser'}
items.remove('pencil')
print(items)  # {'pen', 'eraser'}

# items.remove('ruler')  # This would raise a KeyError, because 'ruler' is not in the set.
                        

discard() method

The discard() method also deletes the element, but unlike remove(), it does not raise an error if the element does not exist. This makes it the safer choice when the element may be absent.

items = {'pen', 'pencil', 'eraser'}
items.discard('pen')
print(items)  # {'pencil', 'eraser'}

items.discard('ruler') # There will be no error
print(items) # {'pencil', 'eraser'} - nothing has changed
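
To make the difference concrete, here is a small sketch (variable names are illustrative) that handles the KeyError from remove() explicitly and then shows the equivalent call to discard():

```python
items = {'pen', 'pencil'}

# remove() raises KeyError for a missing element, so guard it
# when absence is a real possibility.
try:
    items.remove('ruler')
    removed = True
except KeyError:
    removed = False
print(removed)  # False

# discard() expresses "delete if present" without the try/except.
items.discard('ruler')
print(items == {'pen', 'pencil'})  # True
```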
                        

pop() method

The pop() method removes and returns an arbitrary element from the set. Since sets are unordered, you cannot predict which element will be removed. If the set is empty, pop() raises a KeyError.

data = {10, 20, 30, 40}
removed_element = data.pop()
print(f"Deleted element: {removed_element}")
print(f"Remaining set: {data}")
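
One way to use pop() safely is to drain a set in a while loop. This is a sketch with made-up task names; it relies on the fact that an empty set is falsy, so pop() is never called on an empty set:

```python
queue = {'task-a', 'task-b', 'task-c'}
processed = []

# An empty set is falsy, so the loop stops before pop() could raise KeyError.
while queue:
    processed.append(queue.pop())

# The processing order is arbitrary, so sort before displaying.
print(sorted(processed))  # ['task-a', 'task-b', 'task-c']
print(queue)              # set()
```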
                        

clear() method

The clear() method removes all elements from the set, making it empty.

items = {'pen', 'pencil', 'eraser'}
items.clear()
print(items)  # set()
                        

Operations on sets

Union

Operator | (logical OR)

Returns a new set containing all the unique elements from both sets.

set_a = {1, 2, 3, 4}
set_b = {3, 4, 5, 6}
union_set = set_a | set_b
print(union_set)  # {1, 2, 3, 4, 5, 6}
                        

union() method

Works similarly to the | operator, but it can take any iterable structure as an argument.

set_a = {1, 2, 3}
list_b = [3, 4, 5]
union_set = set_a.union(list_b)
print(union_set)  # {1, 2, 3, 4, 5}
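
The difference between the operator and the method can be sketched like this: the | operator needs a set on both sides, while union() accepts other iterables, including several at once. The extra arguments below are illustrative.

```python
set_a = {1, 2, 3}
list_b = [3, 4, 5]

# The | operator requires a set (or frozenset) on both sides.
try:
    result = set_a | list_b
except TypeError:
    result = None
print(result)  # None

# union() accepts any iterables, and more than one at a time.
combined = set_a.union(list_b, (6, 7), 'x')
print(combined == {1, 2, 3, 4, 5, 6, 7, 'x'})  # True
```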
                        

Intersection

Operator & (logical AND)

Returns a new set containing only the elements that are in both sets.

set_a = {1, 2, 3, 4}
set_b = {3, 4, 5, 6}
intersection_set = set_a & set_b
print(intersection_set)  # {3, 4}
                        

intersection() method

An analog of the operator &.

set_a = {1, 2, 3, 4}
set_b = {3, 4, 5, 6}
intersection_set = set_a.intersection(set_b)
print(intersection_set)  # {3, 4}
                        

Difference

Operator -

Returns a new set containing the elements that are in the first set, but are missing from the second.

set_a = {1, 2, 3, 4}
set_b = {3, 4, 5, 6}
difference_set = set_a - set_b
print(difference_set)  # {1, 2}

difference_set_2 = set_b - set_a
print(difference_set_2) # {5, 6}
                        

difference() method

An analog of the operator -.

set_a = {1, 2, 3, 4}
set_b = {3, 4, 5, 6}
difference_set = set_a.difference(set_b)
print(difference_set)  # {1, 2}
                        

Symmetric difference

Operator ^ (logical exclusive OR)

Returns a new set containing elements that are in one of the sets, but not both at once.

set_a = {1, 2, 3, 4}
set_b = {3, 4, 5, 6}
sym_diff_set = set_a ^ set_b
print(sym_diff_set)  # {1, 2, 5, 6}
                        

The symmetric_difference() method

An analog of the operator ^.

set_a = {1, 2, 3, 4}
set_b = {3, 4, 5, 6}
sym_diff_set = set_a.symmetric_difference(set_b)
print(sym_diff_set)  # {1, 2, 5, 6}
                        

Checking relationships between sets

Subset — the issubset() method

Checks whether all the elements of one set are contained in another. A.issubset(B) returns True if A is a subset of B. The <= and < operators can also be used: <= is the non-strict check (subset or equal), and < is the strict check (proper subset).

set_a = {1, 2}
set_b = {1, 2, 3, 4}
print(set_a.issubset(set_b))  # True
print(set_a <= set_b)         # True
print(set_a < set_b)          # True (because A is not equal to B)
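
The strict and non-strict checks differ only when the two sets are equal. A minimal sketch with illustrative values:

```python
set_a = {1, 2, 3}
same = {3, 2, 1}

print(set_a <= same)  # True  (every element of set_a is in same)
print(set_a < same)   # False (the sets are equal, so not a proper subset)
print(set_a == same)  # True

# The method form also accepts any iterable, not only a set.
print(set_a.issubset([1, 2, 3, 4]))  # True

# The empty set is a subset of every set.
print(set() <= set_a)  # True
```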
                        

Superset — issuperset() method

Checks whether one set contains all the elements of the other. A.issuperset(B) returns True if A is a superset of B. The >= and > operators also work.

set_a = {1, 2, 3, 4}
set_b = {1, 2}
print(set_a.issuperset(set_b)) # True
print(set_a >= set_b)          # True
print(set_a > set_b)           # True (because A is not equal to B)
                        

No intersection — isdisjoint() method

Returns True if the sets do not have a single element in common.

set_a = {1, 2}
set_b = {3, 4}
set_c = {2, 5}
print(set_a.isdisjoint(set_b)) # True
print(set_a.isdisjoint(set_c)) # False (there is a common element 2)
                        

Iterating through the elements of a set

Iterating with a for loop

You can iterate through the elements of a set in the same way as the elements of a list, but the order of iteration is not guaranteed.

my_set = {'cat', 'dog', 'fish'}
for animal in my_set:
    print(animal.capitalize())
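
When a repeatable order matters (for example, in output that is compared between runs), one option is to iterate over sorted(my_set) instead of the set itself. A minimal sketch; the list names is only there to collect the results:

```python
my_set = {'cat', 'dog', 'fish'}
names = []

# sorted() returns a new list, so the iteration order is deterministic.
for animal in sorted(my_set):
    names.append(animal.capitalize())

print(names)  # ['Cat', 'Dog', 'Fish']
```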
                        

Iteration with a condition or filtering

Inside the for loop, you can use any conditions to process or filter elements.

numbers = {1, 2, 3, 4, 5, 6, 7, 8, 9}
for num in numbers:
    if num % 2 == 0:
        print(f"Even number: {num}")
                        

Set comprehensions

This is a compact and efficient way to create sets, similar to list comprehensions.

Creating a set based on a loop

Syntax: {expression for element in iterable}

# Create a set of squares of numbers from 0 to 9
squares = {x**2 for x in range(10)}
print(squares) # {0, 1, 64, 4, 36, 9, 16, 49, 81, 25}
                        

With a condition inside the comprehension

Syntax: {expression for element in iterable if condition}

# Create a set of squares of only even numbers from 0 to 9
even_squares = {x**2 for x in range(10) if x % 2 == 0}
print(even_squares) # {0, 64, 4, 36, 16}
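
Comprehensions are not limited to numbers. As a sketch with an illustrative word list, the same syntax can normalize strings and remove duplicates in one step:

```python
words = ['Apple', 'banana', 'APPLE', 'Cherry', 'banana']

# Lowercase every word; duplicates collapse automatically.
unique_lower = {word.lower() for word in words}
print(sorted(unique_lower))  # ['apple', 'banana', 'cherry']

# The same, but keeping only words longer than five characters.
long_words = {word.lower() for word in words if len(word) > 5}
print(sorted(long_words))  # ['banana', 'cherry']
```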
                        

Conversion between sets and other types

Converting a set to a list

Use list() to convert a set into a list. Keep in mind that the order of the items in the final list will be arbitrary.

my_set = {10, 20, 30}
my_list = list(my_set)
print(my_list) # [10, 20, 30] (the order may vary)
                        

Set ⇄ string

Converting a string to a set creates a set of its unique characters. The reverse conversion can be performed using "".join().

unique_chars = set('abracadabra')
print(unique_chars)  # {'b', 'a', 'r', 'c', 'd'}

sorted_string = "".join(sorted(unique_chars))  # Sorting for a predictable result
print(sorted_string)  # abcdr
                        

Set ⇄ tuple

The conversion works both ways using tuple() and set().

my_set = {1, 2, 3}
my_tuple = tuple(my_set)
print(my_tuple) # (1, 2, 3) (the order may vary)

back_to_set = set(my_tuple)
print(back_to_set) # {1, 2, 3}
                        

The set of dictionary keys

As shown earlier, set() takes its keys from the dictionary.

my_dict = {'a': 1, 'b': 2}
keys_set = set(my_dict.keys()) # .keys() can be omitted, the result will be the same
print(keys_set) # {'a', 'b'}
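
For comparing the keys of two dictionaries, converting to set() is not strictly required: the view returned by .keys() supports the set operators directly and returns a regular set. A sketch with two hypothetical configuration dictionaries:

```python
old_config = {'host': 'localhost', 'port': 8080, 'debug': True}
new_config = {'host': 'example.org', 'port': 8080, 'timeout': 30}

# Key views support &, |, - and ^ and return ordinary sets.
shared = old_config.keys() & new_config.keys()
removed = old_config.keys() - new_config.keys()

print(sorted(shared))   # ['host', 'port']
print(sorted(removed))  # ['debug']
```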
                        

Useful tips and tricks

Removing duplicates from the list

The classic way is to convert the list to a set and back to a list. It is short and fast, but it does not preserve the original order and requires every element to be hashable.

data = [1, 2, 5, 'a', 3, 2, 1, 'a', 'b']
unique_data = list(set(data))
print(unique_data) # ['a', 1, 2, 3, 5, 'b'] (the order is not preserved)
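
If the original order must be kept, a different technique is needed. One common option, shown here as a sketch rather than as part of the set-based approach, is dict.fromkeys(), which relies on dictionaries preserving insertion order (Python 3.7 and later):

```python
data = [1, 2, 5, 'a', 3, 2, 1, 'a', 'b']

# Dictionary keys are unique and keep the order of first appearance.
unique_in_order = list(dict.fromkeys(data))
print(unique_in_order)  # [1, 2, 5, 'a', 3, 'b']
```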
                        

Finding duplicates in the list

You can use a set to track items you've already encountered.

data = [1, 2, 5, 'a', 3, 2, 1, 'a', 'b']
seen = set()
duplicates = set()

for item in data:
    if item in seen:
        duplicates.add(item)
    else:
        seen.add(item)

print(f"Duplicates found: {duplicates}") # {'a', 1, 2}
                        

Removing all items of one list from another

Using the set difference is a fast way to do this, but the result loses the original order and any repeated items.

main_list = [1, 2, 3, 4, 5, 6, 7]
to_remove = [2, 4, 6, 8]  # 8 is not in main_list; this does not cause an error

result = list(set(main_list) - set(to_remove))
print(result) # [1, 3, 5, 7] (the order is not guaranteed)
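
When the order and the repeated items of the main list need to survive, a variation is to keep the list and use a set only for the lookup. This is a sketch; the extra 3 at the end of main_list is added here to show that duplicates are retained:

```python
main_list = [1, 2, 3, 4, 5, 6, 7, 3]
to_remove = [2, 4, 6, 8]

# Build the lookup set once, then filter the list in its original order.
remove_set = set(to_remove)
result = [item for item in main_list if item not in remove_set]
print(result)  # [1, 3, 5, 7, 3]
```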
                        

Quick comparison of two collections (without regard to order)

If you need to check whether two collections consist of the same elements, regardless of their order and of how many times each element occurs, converting them to sets is an ideal option.

list1 = [1, 2, 3, 2, 1]
list2 = [3, 1, 2]
list3 = [1, 2, 4]

print(set(list1) == set(list2)) # True
print(set(list1) == set(list3)) # False
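
Because sets discard repetitions, the comparison above treats [1, 2, 3, 2, 1] and [3, 1, 2] as equal. If the number of occurrences should count as well, one alternative (a different tool, not a set) is collections.Counter from the standard library:

```python
from collections import Counter

list1 = [1, 2, 3, 2, 1]
list2 = [3, 1, 2]

# Sets ignore how many times each element occurs.
print(set(list1) == set(list2))          # True

# Counter compares the elements together with their counts.
print(Counter(list1) == Counter(list2))  # False
print(Counter(list1) == Counter([2, 1, 3, 1, 2]))  # True
```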
                        

Quick check for the intersection of two structures

The isdisjoint() method is very effective for checking whether two collections have at least one element in common.

user1_permissions = ['read', 'write', 'execute']
user2_permissions = {'comment', 'delete'}
user3_permissions = ('read', 'edit')

# Check whether the users have at least one permission in common
print(not set(user1_permissions).isdisjoint(user2_permissions)) # False
print(not set(user1_permissions).isdisjoint(user3_permissions)) # True
                        

Extended structures

Sets of tuples

Since the elements of a set must be immutable, you cannot add a list to it. But you can add a tuple. This is useful for storing pairs or triples of unique values.

# Set of coordinates
coordinates = {(10, 20), (30, 40), (10, 20)}
print(coordinates) # {(10, 20), (30, 40)}

# coordinates.add([50, 60]) # TypeError: unhashable type: 'list'
                        

Frozen sets (frozenset)

Immutable sets

frozenset is a version of a set that cannot be changed after creation. It does not have the methods add(), remove(), update(), etc. However, all operations that return a new set (union, intersection) work with it.

fs = frozenset([1, 2, 3, 2])
print(fs) # frozenset({1, 2, 3})

# fs.add(4) # AttributeError: 'frozenset' object has no attribute 'add'
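
A short sketch of how the non-mutating operations behave with a frozenset. When a frozenset and a regular set are mixed in a binary operation, the result has the type of the first operand; the variable names are illustrative:

```python
fs = frozenset([1, 2, 3])
regular = {3, 4}

# frozenset on the left: the result is a new frozenset.
union_fs = fs | regular
print(type(union_fs).__name__)    # frozenset
print(union_fs == {1, 2, 3, 4})   # True

# Regular set on the left: the result is a regular set.
mixed = regular | fs
print(type(mixed).__name__)       # set

# The original frozenset is unchanged.
print(fs == frozenset([1, 2, 3]))  # True
```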
                        

Usage as keys in dictionaries

The main advantage of frozenset is that it can be used as a key in a dictionary or as an element of another set, since it is immutable and hashable.

# Using frozenset as a dictionary key
group1 = frozenset(['read', 'write'])
group2 = frozenset(['read', 'execute'])

access_levels = {
    group1: 'Editor',
    group2: 'Auditor'
}

print(access_levels[frozenset(['write', 'read'])]) # 'Editor' - the order is not important

# Using frozenset as an element of another set
set_of_frozensets = {frozenset([1, 2]), frozenset(['a', 'b'])}
print(set_of_frozensets)