Web scraping involves fetching data from websites, which is typically I/O-bound because most of the time is spent waiting on the network. By using threading, a scraper can keep several download requests in flight at once, so the waiting periods overlap and the data collection process finishes sooner.

import threading
import requests

class WebScraper(threading.Thread):
    def __init__(self, url):
        super().__init__()
        self.url = url

    def run(self):
        # The timeout keeps a stalled server from blocking this thread forever.
        response = requests.get(self.url, timeout=10)
        print(f"Downloaded {self.url}: {len(response.text)} characters")

urls = [
    "http://www.example.com",
    "http://www.google.com",
    "http://www.python.org"
]

# Start one thread per URL, then wait for all of them to finish.
threads = [WebScraper(url) for url in urls]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print("Finished downloading all websites.")

In this example, multiple websites are downloaded concurrently, each in a separate thread. Because the downloads overlap, the total time tends toward that of the slowest request rather than the sum of all of them, as it would be when downloading each site sequentially.
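
One thread per URL works for a short list, but a long list would start a very large number of threads. A possible variation, sketched here under the assumption that a bounded pool is acceptable, uses `concurrent.futures.ThreadPoolExecutor` from the standard library. The names `download_all`, `fetch`, and `max_workers` are illustrative and not part of the original example; `fetch` is any callable that takes a URL.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def download_all(urls, fetch, max_workers=5):
    """Run fetch(url) for every URL on a bounded pool of worker threads.

    Returns a dict mapping each URL to fetch's return value, or to the
    exception it raised, so one failed download does not hide the others.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its URL so results can be labeled.
        future_to_url = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(future_to_url):
            url = future_to_url[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                results[url] = exc
    return results


# Possible usage with the requests library (needs network access):
#   import requests
#   sizes = download_all(urls, lambda u: len(requests.get(u, timeout=10).text))
```

Passing the fetch function in as a parameter keeps the pooling logic separate from the HTTP library, which also makes it easy to try out with a stand-in function that does not touch the network.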
Graphical User Interface (GUI) applications need to remain responsive to user input while performing other tasks. Threading can be used to perform background tasks such as data loading or processing without freezing the UI.

import tkinter as tk
from threading import Thread
import time

def long_running_task():
    time.sleep(5)
    print("Task completed.")

def start_thread():
    Thread(target=long_running_task).start()

app = tk.Tk()
app.geometry("200x100")
start_button = tk.Button(app, text="Start Task", command=start_thread)
start_button.pack(pady=20)
app.mainloop()

This example shows a simple GUI that remains interactive while a long-running background task is executed in a separate thread.
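
The background task above only prints to the console. If the result needs to appear in the window, it is generally safer to update Tkinter widgets only from the thread running the main loop. A minimal sketch of one common approach follows: the worker puts its result on a `queue.Queue`, and the GUI thread polls that queue with the widget's `after` method. The names `poll_results`, `on_result`, and `interval_ms` are hypothetical, and the short sleep stands in for real work.

```python
import queue
import threading
import time


def long_running_task(results):
    """Worker thread: do the slow work, then hand the result to the GUI thread."""
    time.sleep(0.1)  # placeholder for real work
    results.put("Task completed.")


def poll_results(widget, results, on_result, interval_ms=100):
    """GUI thread: deliver any finished results, then schedule the next poll."""
    try:
        while True:
            on_result(results.get_nowait())
    except queue.Empty:
        pass  # nothing more to deliver right now
    # after() asks the main loop to call poll_results again later.
    widget.after(interval_ms, poll_results, widget, results, on_result, interval_ms)


# Possible wiring inside the Tkinter example (assumes a tk.Label named status):
#   results = queue.Queue()
#   poll_results(app, results, lambda text: status.config(text=text))
#   threading.Thread(target=long_running_task, args=(results,), daemon=True).start()
#   app.mainloop()
```

The worker never touches a widget directly; it only writes to the queue, which is safe to share between threads.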
Network servers often need to handle multiple client connections simultaneously. Threading allows each connection to be handled in a separate thread, improving the server’s ability to manage concurrent connections efficiently.

import socket
import threading

def handle_client(client_socket):
    # The with block closes the socket even if handling raises an error.
    with client_socket:
        request = client_socket.recv(1024)
        print(f"Received: {request.decode()}")
        # sendall keeps sending until the whole reply has been written.
        client_socket.sendall("ACK!".encode())

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("localhost", 9999))
server.listen(5)
print("Server listening on localhost:9999")

while True:
    # accept() blocks until a client connects; each client gets its own thread.
    client, addr = server.accept()
    print(f"Accepted connection from {addr}")
    client_handler = threading.Thread(target=handle_client, args=(client,))
    client_handler.start()

This server listens for incoming connections and spawns a new thread to handle each client, ensuring that the main server can continue accepting new connections.
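
The same thread-per-connection pattern is also available in the standard library's `socketserver` module, which takes care of the accept loop and thread creation. The sketch below is one possible equivalent, not a drop-in replacement for the server above: `AckHandler`, `start_server`, and `send_message` are illustrative names, and port 0 is an assumption that lets the operating system pick a free port.

```python
import socket
import socketserver
import threading


class AckHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # self.request is the connected client socket.
        data = self.request.recv(1024)
        print(f"Received: {data.decode()}")
        self.request.sendall(b"ACK!")


def start_server(host="localhost", port=0):
    """Start a threaded TCP server in the background and return it."""
    server = socketserver.ThreadingTCPServer((host, port), AckHandler)
    server.daemon_threads = True  # handler threads will not block interpreter exit
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def send_message(host, port, message):
    """Small client helper: send one message and return the server's reply."""
    with socket.create_connection((host, port), timeout=5) as client:
        client.sendall(message.encode())
        return client.recv(1024).decode()


# Possible usage:
#   server = start_server()
#   host, port = server.server_address
#   print(send_message(host, port, "hello"))
#   server.shutdown()
#   server.server_close()
```

Calling `shutdown()` stops the accept loop cleanly, which the `while True` loop in the hand-written server does not offer.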