Problem
Given
nnodes labeled from0ton - 1and a list of undirected edges (each edge is a pair of nodes), write a function to find the number of connected components in an undirected graph.Example 1:
0 -- 1 -- 2 3 -- 4 Given
n = 5andedges = [[0, 1], [1, 2], [3, 4]], return2.Example 2:
0 -- 1 -- 2 -- 3 -- 4 Given
n = 5andedges = [[0, 1], [1, 2], [2, 3], [3, 4]], return1.Note: You can assume that no duplicate edges will appear in edges. Since all edges are undirected,
[0, 1]is the same as[1, 0]and thus will not appear together in edges.Source: https://leetcode.com/problems/number-of-connected-components-in-an-undirected-graph/
Idea
- Construct the graph from
edges - Initialize
components = 0 - Start at some unvisited node, traverse the connected component that contains this node and mark every node in this component at visited
- Increment
componentsby 1 - Repeat 3 and 4 until every node has been visited.
- Return
components
Before Implementing ...
How do we construct/represent a graph?
We want to represent a graph in such a way that supports fast neighbor lookup (like a getNeighbors() method). So we can either use a HashMap to map a node to its neighbors or utilize a GraphNode class:
class GraphNode():
def __init__(self, val):
self.val = val
self.children = []
This is a good opportunity to discuss with your interviewer about the trade-offs between the two implementations.
HashMap/dictionary:
- Pros: Easy to implement
- Cons: Code is not self-explanatory
GraphNode:
- Pros: Good design, self-explanatory code
- Cons: Extra overhead
An important note about using a GraphNode class: after initializing GraphNode(0), GraphNode(1), ..., GraphNode(n-1), when we see [2, 3], we want to add GraphNode(3) to GraphNode(2)'s children and vice versa. GraphNode(2) and GraphNode(3) have been created -- how do we find them? I.e, given some i, how do we store the GraphNode's so we can quickly obtain the GraphNode(i) that has been created?
Solution 1: When creating GraphNode(0) - GraphNode(n-1), we store them in an array and access GraphNode(i) by nodes[i].
nodes = [GraphNode(i) for i in xrange(n)]
Solution 2: Use a dictionary to map i to GraphNode(i) and access GraphNode(i) by children[i].
children = {i: GraphNode(i) for i in xrange(n)}
In this case when the i's range from 0 to n-1, solution 1 is slightly better since the dictionary solution uses more space. In other cases, however, when i is a string or i is an integer but all i's are sparsely distributed within some range, using an array to store all the nodes will be either impossible or a huge waste of space. (For more practice on HashMap, try Copy List with Random Pointer, Implement a Trie, and Implement LRU Cache)
Seeing that you understand how to use a GraphNode class to construct the graph, the interviewer will likely suggest you use a HashMap to represent the graph to save some time.
How do we traverse a connected component?
Both BFS and DFS will work.
How do we store the visited nodes?
In our algorithm, we need a way to retrieve some node that hasn't been visited yet. So instead of keeping track the visited nodes, we use a set to store the unvisited nodes so we can get some unvisited node using unvisited.pop(). (It's a good exercise to implement yourself such a Set class with a popRandom method in addition to the standard O(1) methods such as add, remove, contains, isEmpty.)
Implementation
Main code
def countComponents(n, edges):
children = {i: [] for i in xrange(n)}
for a, b in edges:
children[a].append(b)
children[b].append(a)
components = 0
unvisited = set(range(n))
while unvisited:
somenode = unvisited.pop()
dfs(somenode, children, unvisited) # dfs can be replaced by bfs
components += 1
return components
DFS
def dfs(root, children, unvisited):
stack = [root]
while stack:
current = stack.pop()
for x in children[current]:
if x in unvisited:
unvisited.discard(x)
stack.append(x)
Note: set.discard(x) can be replaced by set.remove(x). The difference between discard(x) and remove(x) is that remove will raise KeyError if x is not present.
BFS
from Queue import Queue
def bfs(root, children, unvisited):
q = Queue([root])
while q:
current = q.get()
for x in children[current]:
if x in unvisited:
unvisited.discard(x)
q.put(x)
Note: Popping the first item in an array is O(n) where n is the length of the array. Popping the last item, however, is O(1). That's why we need to specifically ask for a Queue for BFS but not for DFS.