Python Language => Processes and Threads

Introduction

Most programs execute line by line, following a single flow of execution at a time. Threads allow several flows of execution to proceed independently of one another. Threading on a machine with multiple processors allows a program to run several flows simultaneously. This topic documents the implementation and usage of threads and processes in Python.

Global Interpreter Lock

The performance of multithreaded Python code is often limited by the Global Interpreter Lock (GIL). In short, even though a Python program can have multiple threads, in standard CPython only one thread can execute Python bytecode at any one time, regardless of the number of CPUs.

As such, multithreading can be quite effective when the operations spend their time blocked on external events, such as network access:

import threading
import time


def process():
    time.sleep(2)


start = time.time()
process()
print("One run took %.2fs" % (time.time() - start))


start = time.time()
threads = [threading.Thread(target=process) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("Four runs took %.2fs" % (time.time() - start))

# Out: One run took 2.00s
# Out: Four runs took 2.00s

Note that even though each call to process took 2 seconds, the four threads were able to wait in parallel, so all four calls together took only 2 seconds in total.
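The same I/O-bound pattern can also be written with the standard library's concurrent.futures.ThreadPoolExecutor, which starts and joins the threads itself. The following is a minimal sketch, not part of the original example: fetch and run_all are hypothetical names, and time.sleep stands in for a real blocking call such as a network request.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(delay):
    # Hypothetical stand-in for a blocking call such as a network request.
    time.sleep(delay)
    return delay


def run_all(delays):
    # The executor creates the worker threads and waits for them on exit.
    with ThreadPoolExecutor(max_workers=len(delays)) as pool:
        # map() returns the results in the same order as the inputs.
        return list(pool.map(fetch, delays))


start = time.time()
results = run_all([0.2, 0.2, 0.2, 0.2])
elapsed = time.time() - start
print("Results: %s" % results)
print("Four waits took %.2fs" % elapsed)
```

Because the threads spend their time sleeping rather than executing bytecode, the total should be close to a single wait rather than four times as long.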

However, multithreading brings little improvement when the work consists of intensive computation in Python code, and it can even be slower than running the same work sequentially:

import threading
import time


def somefunc(i):
    return i * i

def otherfunc(m, i):
    return m + i

def process():
    for j in range(100):
        result = 0
        for i in range(100000):
            result = otherfunc(result, somefunc(i))


start = time.time()
process()
print("One run took %.2fs" % (time.time() - start))


start = time.time()
threads = [threading.Thread(target=process) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("Four runs took %.2fs" % (time.time() - start))

# Out: One run took 2.05s
# Out: Four runs took 14.42s

In this latter case, multiprocessing can be effective, because separate processes each have their own interpreter and can execute instructions simultaneously:

import multiprocessing
import time


def somefunc(i):
    return i * i

def otherfunc(m, i):
    return m + i

def process():
    for j in range(100):
        result = 0
        for i in range(100000):
            result = otherfunc(result, somefunc(i))


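if __name__ == "__main__":
    # The guard keeps child processes from re-running this block when the
    # module is re-imported (as happens with the "spawn" start method).
    start = time.time()
    process()
    print("One run took %.2fs" % (time.time() - start))

    start = time.time()
    processes = [multiprocessing.Process(target=process) for _ in range(4)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    print("Four runs took %.2fs" % (time.time() - start))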

# Out: One run took 2.07s
# Out: Four runs took 2.30s
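When the CPU-bound work can be split into independent inputs, multiprocessing.Pool offers a shorter way to spread it across processes and collect the return values. This is only a sketch under assumptions: sum_of_squares and parallel_sums are hypothetical names, and the worker count of 2 is arbitrary.

```python
import multiprocessing


def sum_of_squares(n):
    # CPU-bound work done entirely in Python code.
    return sum(i * i for i in range(n))


def parallel_sums(sizes, workers=2):
    # Each worker is a separate process with its own interpreter.
    with multiprocessing.Pool(processes=workers) as pool:
        # map() splits the inputs among the workers and keeps input order.
        return pool.map(sum_of_squares, sizes)


if __name__ == "__main__":
    print(parallel_sums([10, 100, 1000]))
    # Out: [285, 328350, 332833500]
```

The function passed to the pool must be defined at module level so that the worker processes can import it.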

Running in multiple threads

Use threading.Thread to run a function in another thread.

import threading
import os

def process():
    print("Pid is %s, thread id is %s" % (os.getpid(), threading.current_thread().name))

threads = [threading.Thread(target=process) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
    
# Out: Pid is 11240, thread id is Thread-1
# Out: Pid is 11240, thread id is Thread-2
# Out: Pid is 11240, thread id is Thread-3
# Out: Pid is 11240, thread id is Thread-4
# (Python 3.10 and later append the target name, e.g. "Thread-1 (process)")
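A Thread does not hand back the return value of its target, so results are usually written to a shared object. The sketch below passes arguments through args and collects results in a dict guarded by a lock; square and squares are hypothetical names chosen for illustration.

```python
import threading


def square(n, results, lock):
    value = n * n
    # Hold the lock while writing to the dict shared by all threads.
    with lock:
        results[n] = value


def squares(numbers):
    results = {}
    lock = threading.Lock()
    # args supplies the positional arguments for each thread's target.
    threads = [threading.Thread(target=square, args=(n, results, lock))
               for n in numbers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(squares([1, 2, 3, 4]))
```

The dict always ends up with the same four entries, but the order in which they are inserted depends on thread scheduling.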

Running in multiple processes

Use multiprocessing.Process to run a function in another process. The interface is similar to threading.Thread:

import multiprocessing
import os

def process():
    print("Pid is %s" % (os.getpid(),))

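if __name__ == "__main__":
    # Guard needed where child processes re-import this module (e.g. on Windows)
    processes = [multiprocessing.Process(target=process) for _ in range(4)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()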
    
# Out: Pid is 11206
# Out: Pid is 11207
# Out: Pid is 11208
# Out: Pid is 11209

Sharing state between threads

Because all threads run in the same process, they all have access to the same data.

However, concurrent access to shared data should be protected with a lock to avoid synchronization problems.

import pprint
import threading

obj = {}
obj_lock = threading.Lock()

def objify(key, val):
    # The len() reads happen outside the lock, so the lines printed by
    # different threads can interleave and vary between runs
    print("Obj has %d values" % len(obj))
    with obj_lock:
        # Only one thread at a time may modify the shared dict
        obj[key] = val
    print("Obj now has %d values" % len(obj))

ts = [threading.Thread(target=objify, args=(str(n), n)) for n in range(4)]
for t in ts:
    t.start()
for t in ts:
    t.join()
print("Obj final result:")
pprint.pprint(obj)

# Out: Obj has 0 values
# Out:  Obj has 0 values
# Out: Obj now has 1 values
# Out: Obj now has 2 valuesObj has 2 values
# Out: Obj now has 3 values
# Out: 
# Out:  Obj has 3 values
# Out: Obj now has 4 values
# Out: Obj final result:
# Out: {'0': 0, '1': 1, '2': 2, '3': 3}
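A lock matters most for read-modify-write operations such as incrementing a counter, where two threads could otherwise read the same old value. The following sketch uses hypothetical names (counter, add_many) and holds the lock across the whole increment so that the final total is deterministic.

```python
import threading

counter = 0
counter_lock = threading.Lock()


def add_many(times):
    global counter
    for _ in range(times):
        # "counter += 1" reads, adds and writes back; keep all three steps
        # inside the lock so no other thread can slip in between them.
        with counter_lock:
            counter += 1


threads = [threading.Thread(target=add_many, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("counter is %d" % counter)
# Out: counter is 40000
```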

Sharing state between processes

Code running in different processes does not, by default, share the same data. However, the multiprocessing module contains primitives to help share values across multiple processes.

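import multiprocessing

plain_num = 0

def increment(shared_num, lock):
    global plain_num
    with lock:
        # ordinary variable modifications are not visible across processes
        plain_num += 1
        # multiprocessing.Value modifications are
        shared_num.value += 1

if __name__ == "__main__":
    # 'i' is the typecode for a signed integer ('d' would store a double)
    shared_num = multiprocessing.Value('i', 0)
    lock = multiprocessing.Lock()
    # Pass the shared objects as arguments so that every child uses the same
    # ones, even when children re-import this module ("spawn" start method)
    ps = [multiprocessing.Process(target=increment, args=(shared_num, lock))
          for _ in range(4)]
    for p in ps:
        p.start()
    for p in ps:
        p.join()

    print("plain_num is %d, shared_num is %d" % (plain_num, shared_num.value))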

# Out: plain_num is 0, shared_num is 4
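Besides shared values, the multiprocessing module also provides multiprocessing.Queue, which lets child processes send results back to the parent. This is a minimal sketch rather than part of the original example; worker and collect_squares are hypothetical names.

```python
import multiprocessing


def worker(n, queue):
    # Send the result back to the parent; the queue pickles it for transfer.
    queue.put((n, n * n))


def collect_squares(numbers):
    queue = multiprocessing.Queue()
    ps = [multiprocessing.Process(target=worker, args=(n, queue))
          for n in numbers]
    for p in ps:
        p.start()
    # Read one result per worker before joining the processes.
    results = dict(queue.get() for _ in ps)
    for p in ps:
        p.join()
    return results


if __name__ == "__main__":
    print(sorted(collect_squares([1, 2, 3, 4]).items()))
    # Out: [(1, 1), (2, 4), (3, 9), (4, 16)]
```

Results arrive in whatever order the workers finish, which is why the example sorts them before printing.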

Modified text is an extract of the original Stack Overflow Documentation

Licensed under CC BY-SA 3.0

Not affiliated with Stack Overflow
