Python Language => 検索

備考

n要素を含むiterables上のすべての探索アルゴリズムは、複雑さがO(n) 。 bisect.bisect_left()ような特殊なアルゴリズムだけが、 O(log(n))複雑さにより高速になりO(log(n)) 。

ストリングの索引の取得：str.index（）、str.rindex（）およびstr.find（）、str.rfind（）

Stringにはindexメソッドだけでなく、より高度なオプションと追加のstr.findます。これらの両方に補完的な逆の方法があります。

astring = 'Hello on StackOverflow'
astring.index('o')  # 4
astring.rindex('o') # 20

astring.find('o')   # 4
astring.rfind('o')  # 20

index / rindexとfind / rfindは、文字列に部分文字列が見つからない場合です。

astring.index('q') # ValueError: substring not found
astring.find('q')  # -1

これらのメソッドはすべて、開始インデックスと終了インデックスを許可します。

astring.index('o', 5)    # 6
astring.index('o', 6)    # 6 - start is inclusive
astring.index('o', 5, 7) # 6
astring.index('o', 5, 6) #  - end is not inclusive

ValueError：部分文字列が見つかりません

astring.rindex('o', 20) # 20 
astring.rindex('o', 19) # 20 - still from left to right

astring.rindex('o', 4, 7) # 6

要素の検索

Pythonの組み込みコレクションはすべて、inを使っin要素メンバシップをチェックする方法を実装しinます。

タプル

atuple = ('0', '1', '2', '3', '4')
4 in atuple    # False
'4' in atuple  # True

文字列

astring = 'i am a string'
'a' in astring   # True
'am' in astring  # True
'I' in astring   # False

セット

aset = {(10, 10), (20, 20), (30, 30)}
(10, 10) in aset  # True
10 in aset        # False

Dict

dictはちょっと特殊です。通常のinはキーをチェックするだけです。値を検索する場合は、値を指定する必要があります。 キーと値のペアを検索する場合も同じです。

adict = {0: 'a', 1: 'b', 2: 'c', 3: 'd'}
1 in adict                 # True   - implicitly searches in keys
'a' in adict               # False
2 in adict.keys()          # True   - explicitly searches in keys
'a' in adict.values()      # True   - explicitly searches in values
(0, 'a') in adict.items()  # True   - explicitly searches key/value pairs

インデックスリストとタプルの取得：list.index（）、tuple.index（）

listとtupleは、要素の位置を取得するindex -methodがあります。

alist = [10, 16, 26, 5, 2, 19, 105, 26]
# search for 16 in the list
alist.index(16) # 1
alist[1]        # 16

alist.index(15)

ValueError：15がリストにありません

しかし、最初に見つかった要素の位置だけを返します。

atuple = (10, 16, 26, 5, 2, 19, 105, 26)
atuple.index(26)   # 2
atuple[2]          # 26
atuple[7]          # 26 - is also 26!

dictの値のキーを検索する

dict 辞書は順不同であるため、値またはキーを検索するための組み込みメソッドがありません。指定した値のキー（またはキー）を取得する関数を作成できます。

def getKeysForValue(dictionary, value):
    foundkeys = []
    for keys in dictionary:
        if dictionary[key] == value:
            foundkeys.append(key)
    return foundkeys

これは、同等のリスト内包としても書くことができます：

def getKeysForValueComp(dictionary, value): 
    return [key for key in dictionary if dictionary[key] == value]

見つかったキーが1つだけ気になる場合：

def getOneKeyForValue(dictionary, value):
    return next(key for key in dictionary if dictionary[key] == value)

最初の2つの関数は、指定された値を持つすべてのkeys listを返します。

adict = {'a': 10, 'b': 20, 'c': 10}
getKeysForValue(adict, 10)     # ['c', 'a'] - order is random could as well be ['a', 'c']
getKeysForValueComp(adict, 10) # ['c', 'a'] - dito
getKeysForValueComp(adict, 20) # ['b']
getKeysForValueComp(adict, 25) # []

もう一方のキーは1つのキーのみを返します。

getOneKeyForValue(adict, 10)   # 'c'  - depending on the circumstances this could also be 'a'
getOneKeyForValue(adict, 20)   # 'b'

StopIteration - 値がdictない場合はException ：

getOneKeyForValue(adict, 25)

StopIteration

ソートされたシーケンスのインデックスを取得する：bisect.bisect_left（）

ソートされたシーケンスでは、より高速な検索アルゴリズムを使用できますbisect.bisect_left() ¹ ：

import bisect

def index_sorted(sorted_seq, value):
    """Locate the leftmost value exactly equal to x or raise a ValueError"""
    i = bisect.bisect_left(sorted_seq, value)
    if i != len(sorted_seq) and sorted_seq[i] == value:
        return i
    raise ValueError

alist = [i for i in range(1, 100000, 3)] # Sorted list from 1 to 100000 with step 3
index_sorted(alist, 97285) # 32428
index_sorted(alist, 4)     # 1
index_sorted(alist, 97286)

ValueError

非常に大きな分類されたシーケンスの場合 、速度の増加はかなり高くなる可能性があります。近似的に500倍速い第1の探索の場合：

%timeit index_sorted(alist, 97285)
# 100000 loops, best of 3: 3 µs per loop
%timeit alist.index(97285)
# 1000 loops, best of 3: 1.58 ms per loop

要素が最初の要素の1つである場合は少し遅くなりますが、

%timeit index_sorted(alist, 4)
# 100000 loops, best of 3: 2.98 µs per loop
%timeit alist.index(4)
# 1000000 loops, best of 3: 580 ns per loop

ネストされたシーケンスの検索

tuple listのようなネストしたシーケンスを検索するには、 dict値をキーで検索するなどの方法が必要ですが、カスタマイズされた関数が必要です。

値がシーケンス内に見つかった場合は、最も外側のシーケンスのインデックス。

def outer_index(nested_sequence, value):
    return next(index for index, inner in enumerate(nested_sequence) 
                      for item in inner 
                      if item == value)

alist_of_tuples = [(4, 5, 6), (3, 1, 'a'), (7, 0, 4.3)]
outer_index(alist_of_tuples, 'a')  # 1
outer_index(alist_of_tuples, 4.3)  # 2

または外側と内側の配列のインデックス：

def outer_inner_index(nested_sequence, value):
    return next((oindex, iindex) for oindex, inner in enumerate(nested_sequence) 
                                 for iindex, item in enumerate(inner) 
                                 if item == value)

outer_inner_index(alist_of_tuples, 'a') # (1, 2)
alist_of_tuples[1][2]  # 'a'

outer_inner_index(alist_of_tuples, 7)   # (2, 0)
alist_of_tuples[2][0]  # 7

一般的に（ 必ずしもそうではない ） nextと、検索された値が最初に出現する条件を持つジェネレータ式が最も効率的なアプローチです。

カスタムクラスでの検索：containsおよびiter

使用できるようにするにはinカスタムクラスのためのクラスはマジックメソッドを提供しなければならないのいずれか__contains__または、その失敗を、 __iter__ -methodを。

listのlistを含むクラスがあるとします。

class ListList:
    def __init__(self, value):
        self.value = value
        # Create a set of all values for fast access
        self.setofvalues = set(item for sublist in self.value for item in sublist)
        
    def __iter__(self):
        print('Using __iter__.')
        # A generator over all sublist elements
        return (item for sublist in self.value for item in sublist)
        
    def __contains__(self, value):
        print('Using __contains__.')
        # Just lookup if the value is in the set
        return value in self.setofvalues

        # Even without the set you could use the iter method for the contains-check:
        # return any(item == value for item in iter(self))

in ：を使用してメンバーシップテストを使用することができます。

a = ListList([[1,1,1],[0,1,1],[1,5,1]])
10 in a    # False
# Prints: Using __contains__.
5 in a     # True
# Prints: Using __contains__.

__contains__メソッドを削除した後でも：

del ListList.__contains__
5 in a     # True
# Prints: Using __iter__.

注：（ for i in a ）ループinは、クラスが__contains__メソッドを実装していても常に__iter__を使用します。

Modified text is an extract of the original Stack Overflow Documentation

ライセンスを受けた CC BY-SA 3.0

所属していない Stack Overflow

Python Language
検索

サーチ…

備考

ストリングの索引の取得：str.index（）、str.rindex（）およびstr.find（）、str.rfind（）

要素の検索

リスト

タプル

文字列

セット

Dict

インデックスリストとタプルの取得：list.index（）、tuple.index（）

dictの値のキーを検索する

ソートされたシーケンスのインデックスを取得する：bisect.bisect_left（）

ネストされたシーケンスの検索

カスタムクラスでの検索：containsおよびiter

備考

ストリングの索引の取得：str.index（）、str.rindex（）およびstr.find（）、str.rfind（）

要素の検索

リスト

タプル

文字列

セット

Dict

インデックスリストとタプルの取得：list.index（）、tuple.index（）

dictの値のキーを検索する

ソートされたシーケンスのインデックスを取得する：bisect.bisect_left（）

ネストされたシーケンスの検索

カスタムクラスでの検索：__contains__および__iter__

カスタムクラスでの検索：containsおよびiter