Python Language => Sorting, Minimum and Maximum

Getting the minimum or maximum of several values

min(7,2,1,5)
# Output: 1

max(7,2,1,5)
# Output: 7

Using the key argument

Finding the minimum/maximum of a sequence of sequences is possible:

list_of_tuples = [(0, 10), (1, 15), (2, 8)]
min(list_of_tuples)
# Output: (0, 10)

but if you want to sort by a specific element in each sequence, use the key argument:

min(list_of_tuples, key=lambda x: x[0])         # Sorting by first element
# Output: (0, 10)

min(list_of_tuples, key=lambda x: x[1])         # Sorting by second element
# Output: (2, 8)

sorted(list_of_tuples, key=lambda x: x[0])      # Sorting by first element (increasing)
# Output: [(0, 10), (1, 15), (2, 8)]

sorted(list_of_tuples, key=lambda x: x[1])      # Sorting by second element (increasing)
# Output: [(2, 8), (0, 10), (1, 15)]

import operator   
# The operator module contains efficient alternatives to the lambda function
max(list_of_tuples, key=operator.itemgetter(0)) # Sorting by first element
# Output: (2, 8)

max(list_of_tuples, key=operator.itemgetter(1)) # Sorting by second element
# Output: (1, 15)

sorted(list_of_tuples, key=operator.itemgetter(0), reverse=True) # Reversed (decreasing)
# Output: [(2, 8), (1, 15), (0, 10)]

sorted(list_of_tuples, key=operator.itemgetter(1), reverse=True) # Reversed(decreasing)
# Output: [(1, 15), (0, 10), (2, 8)]
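
A key function may also return a tuple, which lets one call order by several criteria at once. The following is a minimal sketch; the `people` data and the variable names are hypothetical and only serve as illustration.

```python
# Hypothetical sample data: (name, age) pairs.
people = [('Ana', 30), ('Bo', 25), ('Cy', 30), ('Di', 25)]

# Tuples compare element by element, so the key (-age, name) orders by
# age descending and breaks ties by name ascending.
by_age_then_name = sorted(people, key=lambda person: (-person[1], person[0]))
print(by_age_then_name)
# [('Ana', 30), ('Cy', 30), ('Bo', 25), ('Di', 25)]

# min and max accept the same kind of key.
first_of_oldest = min(people, key=lambda person: (-person[1], person[0]))
print(first_of_oldest)
# ('Ana', 30)
```

Negating the age only works because it is numeric; for non-numeric fields, sorting twice (least significant criterion first) is a common alternative that relies on sorted being stable.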

Default argument to max, min

You cannot pass an empty sequence to max or min:

min([])

ValueError: min() arg is an empty sequence

However, with Python 3.4 and later, you can pass the keyword argument default with a value that is returned if the sequence is empty, instead of raising an exception:

max([], default=42)        
# Output: 42
max([], default=0)        
# Output: 0
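
As a rough illustration of the two ways to cope with an empty iterable, the sketch below wraps both in small helper functions. The helper names `largest_or_none` and `largest_or_none_compat` are made up for this example.

```python
def largest_or_none(values):
    """Return the largest item, or None when `values` is empty."""
    # `default` is only consulted when the iterable yields no items.
    return max(values, default=None)


def largest_or_none_compat(values):
    """Same result without `default`, for interpreters that lack it."""
    try:
        return max(values)
    except ValueError:
        # max() raises ValueError for an empty iterable.
        return None


print(largest_or_none([3, 9, 4]))   # 9
print(largest_or_none([]))          # None
print(largest_or_none_compat([]))   # None
```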

Special case: dictionaries

Getting the minimum or maximum, or using sorted, depends on iterating over the object. In the case of dict, iteration is only over the keys:

adict = {'a': 3, 'b': 5, 'c': 1}
min(adict)
# Output: 'a'
max(adict)
# Output: 'c'
sorted(adict)
# Output: ['a', 'b', 'c']

To keep the dictionary structure, you have to iterate over .items():

min(adict.items())
# Output: ('a', 3)
max(adict.items())
# Output: ('c', 1)
sorted(adict.items())
# Output: [('a', 3), ('b', 5), ('c', 1)]

For sorted, you could create an OrderedDict to keep the sorting while having a dict-like structure:

from collections import OrderedDict
OrderedDict(sorted(adict.items()))
# Output: OrderedDict([('a', 3), ('b', 5), ('c', 1)])
res = OrderedDict(sorted(adict.items()))
res['a']
# Output: 3
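
On Python 3.7 and later, a plain dict also preserves insertion order, so building a dict from the sorted items keeps the sorting as well. The sketch below assumes such an interpreter; the input dict is deliberately written out of key order to make the effect visible.

```python
from collections import OrderedDict

adict = {'b': 5, 'c': 1, 'a': 3}   # deliberately not in key order

# Assumes Python 3.7+, where dict preserves insertion order.
plain = dict(sorted(adict.items()))
ordered = OrderedDict(sorted(adict.items()))

print(list(plain))                      # ['a', 'b', 'c']
print(plain == adict)                   # True: dict equality ignores order
print(ordered == OrderedDict(adict))    # False: OrderedDict equality is order-sensitive
```

One remaining difference is visible in the last two lines: two OrderedDict instances compare equal only if their order matches, whereas plain dicts ignore order when compared.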

By value

Again, this is possible using the key argument:

min(adict.items(), key=lambda x: x[1])
# Output: ('c', 1)
max(adict.items(), key=operator.itemgetter(1))
# Output: ('b', 5)
sorted(adict.items(), key=operator.itemgetter(1), reverse=True)
# Output: [('b', 5), ('a', 3), ('c', 1)]
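
When only the key with the smallest or largest value is needed, one common idiom is to iterate over the dict itself and use its get method as the key function. A small sketch, with a hypothetical `scores` dict shaped like adict:

```python
scores = {'a': 3, 'b': 5, 'c': 1}   # hypothetical data, same shape as adict

# Iterating a dict yields its keys; dict.get maps each key to its value,
# so the comparison is made on values while a key is returned.
key_of_max = max(scores, key=scores.get)
key_of_min = min(scores, key=scores.get)
keys_by_value = sorted(scores, key=scores.get)

print(key_of_max)      # b
print(key_of_min)      # c
print(keys_by_value)   # ['c', 'a', 'b']
```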

Getting a sorted sequence

Using one sequence:

sorted((7, 2, 1, 5))                 # tuple
# Output: [1, 2, 5, 7]

sorted(['c', 'A', 'b'])              # list
# Output: ['A', 'b', 'c']

sorted({11, 8, 1})                   # set
# Output: [1, 8, 11]

sorted({'11': 5, '3': 2, '10': 15})  # dict
# Output: ['10', '11', '3']          # only iterates over the keys

sorted('bdca')                       # string
# Output: ['a','b','c','d']

The result is always a new list; the original data remains unchanged.

Minimum and maximum of a sequence

Getting the minimum of a sequence (iterable) is equivalent to accessing the first element of a sorted sequence:
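
To make the "new list" behaviour concrete, the following sketch contrasts sorted with the in-place list.sort method and also shows a case-insensitive key. The `letters` list is just sample data.

```python
letters = ['c', 'B', 'a']

ordered = sorted(letters)                      # new list, code-point order
ordered_ci = sorted(letters, key=str.lower)    # new list, case-insensitive

print(ordered)       # ['B', 'a', 'c']  (uppercase sorts before lowercase)
print(ordered_ci)    # ['a', 'B', 'c']
print(letters)       # ['c', 'B', 'a']  (the input is untouched)

# list.sort() sorts in place and returns None.
result = letters.sort()
print(result)        # None
print(letters)       # ['B', 'a', 'c']
```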

min([2, 7, 5])
# Output: 2
sorted([2, 7, 5])[0]
# Output: 2

The maximum is a bit more complicated, because sorted keeps the original order of equal elements and max returns the first maximal value it encounters. If there are no duplicates, the maximum is the same as the last element of the sorted result:

max([2, 7, 5])
# Output: 7
sorted([2, 7, 5])[-1]
# Output: 7

But not if there are multiple elements that evaluate as having the maximum value:

class MyClass(object):
    def __init__(self, value, name):
        self.value = value
        self.name = name
        
    def __lt__(self, other):
        return self.value < other.value
    
    def __repr__(self):
        return str(self.name)

sorted([MyClass(4, 'first'), MyClass(1, 'second'), MyClass(4, 'third')])
# Output: [second, first, third]
max([MyClass(4, 'first'), MyClass(1, 'second'), MyClass(4, 'third')])
# Output: first

Any iterable containing elements that support < or > operations is allowed.
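
The same tie-breaking behaviour can be reproduced without a custom class by comparing tuples through a key. This is only a sketch with made-up records; it relies on sorted being stable, including with reverse=True.

```python
from operator import itemgetter

# Hypothetical records: (value, label). Two records share the maximum value.
records = [(4, 'first'), (1, 'second'), (4, 'third')]
by_value = itemgetter(0)

first_max = max(records, key=by_value)
last_of_sorted = sorted(records, key=by_value)[-1]
first_of_reversed = sorted(records, key=by_value, reverse=True)[0]

print(first_max)           # (4, 'first')  first maximal item encountered
print(last_of_sorted)      # (4, 'third')  stable sort keeps 'third' last
print(first_of_reversed)   # (4, 'first')  reverse=True preserves stability too
```

So when ties are possible, sorted(..., reverse=True)[0] matches max, while sorted(...)[-1] may not.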

Making custom classes orderable

min, max, and sorted all need the objects to be orderable. To be properly orderable, the class needs to define all six methods __lt__, __gt__, __ge__, __le__, __ne__ and __eq__:

class IntegerContainer(object):
    def __init__(self, value):
        self.value = value
        
    def __repr__(self):
        return "{}({})".format(self.__class__.__name__, self.value)
    
    def __lt__(self, other):
        print('{!r} - Test less than {!r}'.format(self, other))
        return self.value < other.value
    
    def __le__(self, other):
        print('{!r} - Test less than or equal to {!r}'.format(self, other))
        return self.value <= other.value

    def __gt__(self, other):
        print('{!r} - Test greater than {!r}'.format(self, other))
        return self.value > other.value

    def __ge__(self, other):
        print('{!r} - Test greater than or equal to {!r}'.format(self, other))
        return self.value >= other.value

    def __eq__(self, other):
        print('{!r} - Test equal to {!r}'.format(self, other))
        return self.value == other.value

    def __ne__(self, other):
        print('{!r} - Test not equal to {!r}'.format(self, other))
        return self.value != other.value

Though implementing all these methods may seem unnecessary, omitting some of them will make your code prone to bugs.

Examples:

alist = [IntegerContainer(5), IntegerContainer(3),
         IntegerContainer(10), IntegerContainer(7)
        ]

res = max(alist)
# Out: IntegerContainer(3) - Test greater than IntegerContainer(5)
#      IntegerContainer(10) - Test greater than IntegerContainer(5)
#      IntegerContainer(7) - Test greater than IntegerContainer(10)
print(res)
# Out: IntegerContainer(10)

res = min(alist)   
# Out: IntegerContainer(3) - Test less than IntegerContainer(5)
#      IntegerContainer(10) - Test less than IntegerContainer(3)
#      IntegerContainer(7) - Test less than IntegerContainer(3)
print(res)
# Out: IntegerContainer(3)

res = sorted(alist)
# Out: IntegerContainer(3) - Test less than IntegerContainer(5)
#      IntegerContainer(10) - Test less than IntegerContainer(3)
#      IntegerContainer(10) - Test less than IntegerContainer(5)
#      IntegerContainer(7) - Test less than IntegerContainer(5)
#      IntegerContainer(7) - Test less than IntegerContainer(10)
print(res)
# Out: [IntegerContainer(3), IntegerContainer(5), IntegerContainer(7), IntegerContainer(10)]

sorted with reverse=True also uses __lt__:

res = sorted(alist, reverse=True)
# Out: IntegerContainer(10) - Test less than IntegerContainer(7)
#      IntegerContainer(3) - Test less than IntegerContainer(10)
#      IntegerContainer(3) - Test less than IntegerContainer(10)
#      IntegerContainer(3) - Test less than IntegerContainer(7)
#      IntegerContainer(5) - Test less than IntegerContainer(7)
#      IntegerContainer(5) - Test less than IntegerContainer(3)
print(res)
# Out: [IntegerContainer(10), IntegerContainer(7), IntegerContainer(5), IntegerContainer(3)]

But min, max, and sorted can fall back to the reflected method (__gt__ in place of __lt__) if the default one is not implemented:

del IntegerContainer.__lt__   # The IntegerContainer no longer implements "less than"

res = min(alist) 
# Out: IntegerContainer(5) - Test greater than IntegerContainer(3)
#      IntegerContainer(3) - Test greater than IntegerContainer(10)
#      IntegerContainer(3) - Test greater than IntegerContainer(7)
print(res)
# Out: IntegerContainer(3)

The sorting functions will raise a TypeError if neither __lt__ nor __gt__ is implemented:

del IntegerContainer.__gt__   # The IntegerContainer no longer implements "greater than"

res = min(alist)

TypeError: unorderable types: IntegerContainer() < IntegerContainer()

The functools.total_ordering decorator can be used to simplify the effort of writing these rich comparison methods. If you decorate your class with total_ordering, you need to implement __eq__, __ne__ and only one of __lt__, __le__, __ge__ or __gt__, and the decorator will fill in the rest (in Python 3, __ne__ may be omitted, because it defaults to the inverse of __eq__):

import functools

@functools.total_ordering
class IntegerContainer(object):
    def __init__(self, value):
        self.value = value
        
    def __repr__(self):
        return "{}({})".format(self.__class__.__name__, self.value)
    
    def __lt__(self, other):
        print('{!r} - Test less than {!r}'.format(self, other))
        return self.value < other.value
    
    def __eq__(self, other):
        print('{!r} - Test equal to {!r}'.format(self, other))
        return self.value == other.value
    
    def __ne__(self, other):
        print('{!r} - Test not equal to {!r}'.format(self, other))
        return self.value != other.value


IntegerContainer(5) > IntegerContainer(6)
# Output: IntegerContainer(5) - Test less than IntegerContainer(6)
# Returns: False

IntegerContainer(6) > IntegerContainer(5)
# Output: IntegerContainer(6) - Test less than IntegerContainer(5)
# Output: IntegerContainer(6) - Test equal to IntegerContainer(5)
# Returns True

Note how > (greater than) now ends up calling the less than method, and in some cases even an equality method such as __eq__ (which extra method is called can vary between Python versions). This also means that if speed is of great importance, you should implement each rich comparison method yourself.
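
For comparison, here is a compact sketch of total_ordering on a class without the tracing print calls. The `Version` class and its `_key` helper are hypothetical and exist only for this example; it assumes Python 3, where __ne__ is derived from __eq__.

```python
import functools


@functools.total_ordering
class Version:
    """Hypothetical example class ordered by (major, minor)."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def _key(self):
        # Single place that defines what the comparisons are based on.
        return (self.major, self.minor)

    def __repr__(self):
        return "Version({}, {})".format(self.major, self.minor)

    def __eq__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() == other._key()

    def __lt__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        return self._key() < other._key()


versions = [Version(1, 10), Version(1, 2), Version(0, 9)]
print(sorted(versions))                  # [Version(0, 9), Version(1, 2), Version(1, 10)]
print(max(versions))                     # Version(1, 10)
print(Version(1, 2) >= Version(1, 2))    # True  (__ge__ supplied by the decorator)
print(Version(1, 2) > Version(1, 10))    # False (__gt__ supplied by the decorator)
```

Returning NotImplemented for foreign types lets Python try the other operand's methods and raise TypeError if none applies, rather than failing with an AttributeError inside the comparison.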

Extracting N largest or N smallest items from an iterable

To find some number (more than one) of the largest or smallest values of an iterable, you can use nlargest and nsmallest from the heapq module:

import heapq

# get 5 largest items from the range

heapq.nlargest(5, range(10))
# Output: [9, 8, 7, 6, 5]

heapq.nsmallest(5, range(10))
# Output: [0, 1, 2, 3, 4]

This is much more efficient than sorting the whole iterable and then slicing from the end or the beginning. Internally these functions use the binary heap priority queue data structure, which is very efficient for this use case.

Like min, max and sorted, these functions accept the optional key keyword argument, which must be a function that, given an element, returns its sort key.

Here is a program that extracts the 1000 longest lines from a file:

import heapq
with open(filename) as f:
    longest_lines = heapq.nlargest(1000, f, key=len)

Here we open the file and pass the file handle f to nlargest. Iterating the file yields each line of the file as a separate string; nlargest then passes each element (or line) to the function len to determine its sort key. len, given a string, returns the length of the line in characters.

This only needs storage for a list of the 1000 largest lines so far, which can be contrasted with

longest_lines = sorted(f, key=len, reverse=True)[:1000]

which will have to hold the whole file in memory.
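
The relationship between the heapq functions and sorted can be checked on a small in-memory list. The sketch below uses a made-up `words` list with distinct lengths, so ties do not affect the result.

```python
import heapq

# Hypothetical sample data; every word has a different length.
words = ['fig', 'banana', 'kiwi', 'cherries', 'plums']

longest = heapq.nlargest(3, words, key=len)
shortest = heapq.nsmallest(2, words, key=len)

print(longest)    # ['cherries', 'banana', 'plums']
print(shortest)   # ['fig', 'kiwi']

# Same results as sorting everything and slicing, but without
# building the fully sorted list first.
print(longest == sorted(words, key=len, reverse=True)[:3])   # True
print(shortest == sorted(words, key=len)[:2])                # True
```

For n equal to 1, min and max are the simpler choice, and when n is close to the size of the input, sorting and slicing is usually just as good.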

Modified text is an extract of the original Stack Overflow Documentation

Licensed under CC BY-SA 3.0

Not affiliated with Stack Overflow
