1.7 sets#

Important

This lesson is still under development.

A set is a collection of objects, like a list, except that it is unordered, does not contain the same object more than once, and cannot contain mutable (unhashable) objects such as lists.

A set can be created from an existing sequence object such as a string, list or tuple.

urdu = set("National language of Pakistan")

print(type(urdu))

print(urdu)
<class 'set'>
{' ', 'g', 't', 'e', 'l', 's', 'u', 'n', 'P', 'i', 'o', 'a', 'N', 'k', 'f'}
pak_langs = set(["Balochi", "Barohi", "Sindhi", "Balti"])
print(pak_langs)
{'Sindhi', 'Balochi', 'Barohi', 'Balti'}

If our sequence contains repeating objects, only one instance of each repeated object will be included in the set.

pak_langs = set(("Balochi", "Barohi", "Sindhi", "Balti", "Balochi"))
print(pak_langs)
{'Sindhi', 'Balochi', 'Barohi', 'Balti'}

Although we can create a set from a list, a set cannot contain a list as an element.

pak_langs = set((("Balochi", "Barohi"), ("punjabi", "siraiki")))
print(pak_langs)
{('punjabi', 'siraiki'), ('Balochi', 'Barohi')}
# uncomment the following line to see the TypeError
# pak_langs = set((["Balochi", "Barohi"], ["punjabi", "siraiki"]))
print(pak_langs)
{('punjabi', 'siraiki'), ('Balochi', 'Barohi')}
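
The failure caused by the commented-out line can be observed without stopping the script by catching the exception. The following is a minimal sketch that relies only on built-in behaviour; the names `pairs` and `error_message` are illustrative and not part of the lesson.

```python
# Tuples are hashable, so they can be elements of a set.
pairs = set((("Balochi", "Barohi"), ("Punjabi", "Siraiki")))
print(len(pairs))  # 2

# Lists are mutable and therefore unhashable, so a set of lists cannot be built.
error_message = None
try:
    set((["Balochi", "Barohi"], ["Punjabi", "Siraiki"]))
except TypeError as err:
    error_message = str(err)
    print("TypeError:", error_message)
```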

In the second case above (the commented-out line), the set would have to hold two lists as elements, so Python raises a TypeError. Sets are mutable, i.e. they can be changed. We can add new objects to a set as follows:

pak_langs = set(["Balochi", "Barohi", "Sindhi"])
pak_langs.add("Pashto")
print(pak_langs)
{'Pashto', 'Sindhi', 'Balochi', 'Barohi'}

There is also an immutable set type, named frozenset.

balochistan_langs = frozenset(["Balochi", "Barohi", "Pashto"])

# uncomment following line
# balochistan_langs.add("punjabi")
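
What happens when the commented-out line runs, and what immutability buys in return, can be sketched as follows. This is an illustrative example only; the dictionary `provinces` is a hypothetical name introduced here.

```python
balochistan_langs = frozenset(["Balochi", "Barohi", "Pashto"])

# A frozenset has no mutating methods such as add, so this raises AttributeError.
try:
    balochistan_langs.add("Punjabi")
except AttributeError as err:
    print("AttributeError:", err)

# Because it is hashable, a frozenset can serve as a dictionary key
# (or as an element of another set). Element order does not matter.
provinces = {balochistan_langs: "Balochistan"}
print(provinces[frozenset(["Pashto", "Balochi", "Barohi"])])  # Balochistan
```
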

Operations on sets#

adding elements#

We saw how to add objects to a set with the add method. The add method cannot be used to violate the rules mentioned above. The update method, in contrast, adds each element of an iterable.

imperialists = {"bbc", "cnn"}

# uncomment following line
# imperialists.add(["voa","dw"])  # TypeError
imperialists.add('bbc')
print(imperialists)
{'bbc', 'cnn'}
imperialists.update(["voa","dw"])
print(imperialists)
{'bbc', 'cnn', 'voa', 'dw'}
imperialists = {"bbc", "cnn"}

# uncomment following line
# imperialists.update([["voa","dw"]])  # TypeError
print(imperialists)
{'bbc', 'cnn'}
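
The difference between add and update is easy to overlook when the argument is a string. The sketch below, with the illustrative names `channels` and `letters`, shows that add inserts its argument as a single element while update iterates over it.

```python
channels = {"bbc", "cnn"}

# add inserts its argument as one element.
channels.add("voa")

# update iterates over its argument; a string is an iterable of characters.
letters = set()
letters.update("dw")

print(sorted(channels))  # ['bbc', 'cnn', 'voa']
print(sorted(letters))   # ['d', 'w']
```

To add a whole string with update, wrap it in a list or tuple, as in `update(["dw"])`.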

The | operator can also be used to combine two sets into their union.

imperialists = {"bbc", "cnn"}

imperialists | {"voa", "dw"}
{'bbc', 'cnn', 'voa', 'dw'}
imperialists = {"bbc", "cnn"}

imperialists |= {"voa", "dw"}

print(imperialists)
{'bbc', 'cnn', 'voa', 'dw'}
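
The two forms behave differently with respect to the original object: | builds a new set, whereas |= updates the existing set in place. A small sketch, using the illustrative names `original`, `alias` and `combined`:

```python
original = {"bbc", "cnn"}
alias = original                # a second name for the same set object

combined = original | {"voa"}   # new set; original is left untouched
print(sorted(original))         # ['bbc', 'cnn']
print(sorted(combined))         # ['bbc', 'cnn', 'voa']

original |= {"dw"}              # in-place update; alias sees the change
print(sorted(alias))            # ['bbc', 'cnn', 'dw']
print(alias is original)        # True
```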

clear#

We can clear the contents of a set by using the method clear on a set.

dakus = {"musharaf", "nawaz", "benazir"}

dakus.clear()  # after NRO (https://en.wikipedia.org/wiki/National_Reconciliation_Ordinance)
print(dakus)
set()

Copy#

The assignment operator = does not create a new set; both names refer to the same set object.

more_dakus = {"pervaiz elahi", "altaf husain"}
dakus_backup = more_dakus
more_dakus.clear()
print(dakus_backup)
set()

The copy method creates a shallow copy.

more_dakus = {"pervaiz elahi", "altaf husain"}
dakus_backup = more_dakus.copy()
more_dakus.clear()
print(dakus_backup)
{'altaf husain', 'pervaiz elahi'}
imperialists = {"BBC", "CNN", "VOA"}

more_imperialists = imperialists.copy()

more_imperialists.add("DW")

print(imperialists)

print(more_imperialists)
{'BBC', 'CNN', 'VOA'}
{'BBC', 'DW', 'CNN', 'VOA'}
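
"Shallow" means that the set object is new but its elements are not duplicated. The following sketch illustrates this with a single tuple element; `inner`, `outer` and `clone` are names chosen for the example.

```python
inner = ("BBC", "CNN")   # a tuple is hashable, so it can be a set element
outer = {inner}
clone = outer.copy()

print(clone is outer)              # False: the copy is a new set object
print(clone == outer)              # True: both hold the same elements
print(next(iter(clone)) is inner)  # True: the element itself is not copied
```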

difference#

pml_q = {"zafrullah jamali", "fawad hussain", "pervaiz elahi", "umar ayyub"}
pml_n = {"choi Nisar" , "umar ayyub", "khawaja Asif"}
pti = {"firdows ashiq", "umar ayyub", "asad umar", "fawad hussain"}

pml_q.difference(pti)
{'pervaiz elahi', 'zafrullah jamali'}
lotas_2013 = pml_q.difference(pml_q.difference(pml_n))
print(lotas_2013)
{'umar ayyub'}

We can also use the - operator.

print(pml_q - pti)
{'pervaiz elahi', 'zafrullah jamali'}
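
Two properties of difference used above can be checked directly. The sketch below shows that the nested call behind lotas_2013 yields the common members of both sets, and that difference accepts more than one argument; `only_q` is an illustrative name.

```python
pml_q = {"zafrullah jamali", "fawad hussain", "pervaiz elahi", "umar ayyub"}
pml_n = {"choi Nisar", "umar ayyub", "khawaja Asif"}
pti = {"firdows ashiq", "umar ayyub", "asad umar", "fawad hussain"}

# a - (a - b) keeps exactly the members of a that are also in b.
nested = pml_q.difference(pml_q.difference(pml_n))
print(nested == pml_q.intersection(pml_n))  # True

# difference accepts several iterables and removes the members of all of them.
only_q = pml_q.difference(pml_n, pti)
print(sorted(only_q))  # ['pervaiz elahi', 'zafrullah jamali']
```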

difference_update#

This method changes the original set. It is similar to x - y, except that x itself is modified.

pml_q = {"zafrullah jamali", "fawad hussain", "pervaiz elahi", "umar ayyub"}
pml_n = {"choi Nisar" , "umar ayyub", "khawaja Asif"}
pti = {"firdows ashiq", "umar ayyub", "asad umar", "fawad hussain"}

pml_q.difference_update(pml_n)

print(pml_q)
{'fawad hussain', 'pervaiz elahi', 'zafrullah jamali'}
pml_q.difference_update(pti)

print(pml_q)
{'pervaiz elahi', 'zafrullah jamali'}

discard#

Removes an element from the set if it is present.

pml_q = {"zafrullah jamali", "fawad hussain", "pervaiz elahi", "umar ayyub"}
pml_q.discard("zafrullah jamali")
print(pml_q)
{'fawad hussain', 'pervaiz elahi', 'umar ayyub'}
pml_q.discard("choi nisar")
print(pml_q)
{'fawad hussain', 'pervaiz elahi', 'umar ayyub'}

choi nisar is not present in the set pml_q, but using discard did not raise an error.

remove#

Same as discard with the exception that an error is raised if the object is not present in the set.

pml_q = {"zafrullah jamali", "fawad hussain", "pervaiz elahi", "umar ayyub"}
pml_q.remove("zafrullah jamali")
print(pml_q)
{'fawad hussain', 'pervaiz elahi', 'umar ayyub'}
# uncomment following line
# pml_q.remove("choi nisar")  # KeyError
print(pml_q)
{'fawad hussain', 'pervaiz elahi', 'umar ayyub'}
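
The contrast between discard and remove can be run safely by catching the exception. A minimal sketch; the variable `missing` is introduced only for illustration.

```python
pml_q = {"fawad hussain", "pervaiz elahi", "umar ayyub"}

# discard silently ignores an element that is not in the set.
pml_q.discard("choi nisar")

# remove raises KeyError for the same missing element.
missing = None
try:
    pml_q.remove("choi nisar")
except KeyError as err:
    missing = err.args[0]
    print("KeyError:", missing)

print(len(pml_q))  # 3
```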

pop#

pml_q = {"firdows ashiq", "fawad hussain", "pervaiz elahi", "umar ayyub"}
pml_q.pop()
print(pml_q)
{'pervaiz elahi', 'umar ayyub', 'firdows ashiq'}
pml_q.pop()
print(pml_q)
{'umar ayyub', 'firdows ashiq'}

pop removes and returns an arbitrary element of the set. Running the above cell repeatedly will eventually raise a KeyError once the set is empty.
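
One way to drain a set without hitting that error is to loop while the set is non-empty, since an empty set is falsy. The sketch below is illustrative; `removed` is a hypothetical helper list.

```python
pml_q = {"firdows ashiq", "fawad hussain"}
removed = []

# Keep popping until the set is empty; the order of removal is arbitrary.
while pml_q:
    removed.append(pml_q.pop())

print(sorted(removed))  # ['fawad hussain', 'firdows ashiq']

# One more pop on the now empty set raises KeyError.
try:
    pml_q.pop()
except KeyError as err:
    print("KeyError:", err)
```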

union#

pml_q = {"firdows ashiq", "fawad hussain", "pervaiz elahi", "umar ayyub"}
pml_n = {"choi Nisar" , "umar ayyub", "khawaja Asif"}

pml_q.union(pml_n)
{'fawad hussain', 'pervaiz elahi', 'umar ayyub', 'firdows ashiq', 'khawaja Asif', 'choi Nisar'}
print(pml_q | pml_n)
{'fawad hussain', 'pervaiz elahi', 'umar ayyub', 'firdows ashiq', 'khawaja Asif', 'choi Nisar'}
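
The method and the operator are not fully interchangeable: union accepts any iterables, while | requires both operands to be sets. A small sketch with the illustrative names `merged` and `operator_failed`:

```python
pml_q = {"pervaiz elahi", "umar ayyub"}

# union accepts one or more arbitrary iterables, such as a list and a tuple.
merged = pml_q.union(["choi Nisar"], ("khawaja Asif",))
print(sorted(merged))
# ['choi Nisar', 'khawaja Asif', 'pervaiz elahi', 'umar ayyub']

# The | operator only works between sets, so a list operand raises TypeError.
operator_failed = False
try:
    pml_q | ["choi Nisar"]
except TypeError as err:
    operator_failed = True
    print("TypeError:", err)
```
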

intersection#

pml_q = {"firdows ashiq", "fawad hussain", "pervaiz elahi", "umar ayyub"}
pti = {"firdows ashiq", "umar ayyub", "asad umar", "fawad hussain"}

pml_q.intersection(pti)
{'umar ayyub', 'firdows ashiq', 'fawad hussain'}

We can also use the & operator.

print(pml_q & pti)
{'umar ayyub', 'firdows ashiq', 'fawad hussain'}

The original set pml_q remains unchanged.

print(pml_q)
{'fawad hussain', 'pervaiz elahi', 'umar ayyub', 'firdows ashiq'}

However, if we use intersection_update, the original set is changed.

pml_q.intersection_update(pti)
print(pml_q)
{'umar ayyub', 'firdows ashiq', 'fawad hussain'}

If we want to find the intersection of multiple sets, we can do it as follows.

pml_q = {"firdows ashiq", "fawad hussain", "pervaiz elahi", "umar ayyub"}
pti = {"firdows ashiq", "umar ayyub", "asad umar", "fawad hussain"}
pml_n = {"choi Nisar" , "umar ayyub", "khawaja Asif"}

sets = [pml_q, pml_n, pti]
set.intersection(*sets)
{'umar ayyub'}

or

sets = [pml_n, pti]
pml_q.intersection(*sets)
{'umar ayyub'}

So we can say that [umar ayyub](https://en.wikipedia.org/wiki/Omar_Ayub_Khan) is the most consistent lota.

isdisjoint#

Returns True if the intersection of two sets is empty, i.e. the sets have no elements in common.

ppp = {"firdows ashiq", "fawad hussain", "Amin Faheem", "umar ayyub"}
pti = {"firdows ashiq", "umar ayyub", "asad umar", "fawad hussain"}
ji = {"liaquat baloch", "siraj ul haq", "munawar hasan"}

ppp.isdisjoint(ji)
True
ppp.isdisjoint(pti)
False
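
The relationship between isdisjoint and the intersection can be checked directly. In the sketch below, `results` is an illustrative list that pairs each isdisjoint answer with an emptiness test on the intersection.

```python
ppp = {"firdows ashiq", "fawad hussain", "Amin Faheem", "umar ayyub"}
pti = {"firdows ashiq", "umar ayyub", "asad umar", "fawad hussain"}
ji = {"liaquat baloch", "siraj ul haq", "munawar hasan"}

results = []
for other in (ji, pti):
    common = ppp & other
    # isdisjoint is True exactly when the intersection is empty.
    results.append((ppp.isdisjoint(other), len(common) == 0))

print(results)  # [(True, True), (False, False)]
```
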

issubset#

< is used for proper subset and <= is used for subset checking.

pml_n = {"nawaz", "shahbaz", "pervaiz elahi", "mushahid husain"}
pml_q = {"pervaiz elahi", "mushahid husain"}

pml_q.issubset(pml_n)
True
print(pml_q <= pml_n)
True
print(pml_q < pml_q)
False
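
A proper subset is a subset that is not equal to the other set. The sketch below, with the illustrative names `small` and `big`, compares the two operators side by side.

```python
small = {"pervaiz elahi", "mushahid husain"}
big = {"nawaz", "shahbaz", "pervaiz elahi", "mushahid husain"}

# small is contained in big and differs from it: subset and proper subset.
print(small <= big, small < big)      # True True

# Every set is a subset of itself, but never a proper subset of itself.
print(small <= small, small < small)  # True False

# a < b is equivalent to (a <= b and a != b).
print((small < big) == (small <= big and small != big))  # True
```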

issuperset#

> is used for proper superset and >= is used for superset checking.

pml_n = {"nawaz", "shahbaz", "pervaiz elahi", "mushahid husain"}
pml_q = {"pervaiz elahi", "mushahid husain"}

pml_n.issuperset(pml_q)
True
print(pml_n >= pml_q)
True
print(pml_n > pml_n)
False

Membership tests with in are generally faster on sets than on lists, because sets are implemented as hash tables while a list has to be scanned element by element.

print("nawaz" in pml_n)
True
print("nawaz" not in pml_q)
True
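
The speed difference can be measured with the standard timeit module. This is a rough sketch under assumed parameters: the collection size `n` and the repetition count are arbitrary choices, and the timings vary between machines, so no expected numbers are shown.

```python
import timeit

n = 100_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1  # last element: the list has to scan every item to find it

# Time 100 membership tests on each container.
list_time = timeit.timeit(lambda: target in as_list, number=100)
set_time = timeit.timeit(lambda: target in as_set, number=100)

print(f"list: {list_time:.6f} s, set: {set_time:.6f} s")
```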
