How can I use threading in Python?

1281

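I am trying to understand threading in Python. I've looked at the documentation and examples, but quite frankly, many examples are overly sophisticated and I'm having trouble understanding them.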

How do you clearly show tasks being divided for multithreading?

— albruno

31

A good general discussion of this topic can be found in Python's Hardest Problem by Jeff Knupp. In summary, it seems threading is not for beginners.

— Matthew Walker

112

Haha, I tend to think that threading is for everyone, but beginners are not for threading :)))))

— Bohdan

42

Just flagging that people should read all the answers, as the later ones are arguably better because they take advantage of new language features...

— Gwyn Evans

5

Remember to write your core logic in C and call it via ctypes to really take advantage of Python threading.

— aaa90210 2015

4

I just wanted to add that PyPubSub is a great way to send and receive messages to control thread flow.

— ytpillai

1418

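Since this question was asked in 2010, simple multithreading in Python has become much easier thanks to map and pool.

The code below comes from an article/blog post that you should definitely check out (no affiliation): Parallelism in one line: A Better Model for Day to Day Threading Tasks. I'll summarize it below; it ends up being just a few lines of code.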

from multiprocessing.dummy import Pool as ThreadPool
pool = ThreadPool(4)
results = pool.map(my_function, my_array)

Which is the multithreaded version of:

results = []
for item in my_array:
    results.append(my_function(item))

Explanation

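Map is a cool little function, and the key to easily injecting parallelism into your Python code. For those unfamiliar, map is something lifted from functional languages like Lisp. It is a function which maps another function over a sequence.

Map handles the iteration over the sequence for us, applies the function, and stores all of the results in a handy list at the end.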


Implementation

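Parallel versions of the map function are provided by two libraries: multiprocessing, and its little-known but equally fantastic stepchild, multiprocessing.dummy.

multiprocessing.dummy is exactly the same as the multiprocessing module, but uses threads instead. (An important distinction: use multiple processes for CPU-intensive tasks, and threads for I/O, including time spent waiting on I/O.)

multiprocessing.dummy replicates the API of multiprocessing, but is no more than a wrapper around the threading module.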
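One way to see that for yourself is the small sketch below, which is not from the original post. Every worker in a multiprocessing.dummy pool reports the same process ID as the caller, and only the thread names differ. The helper name where_am_i is made up for illustration.

```python
import os
import threading
from multiprocessing.dummy import Pool as ThreadPool


def where_am_i(item):
    # Hypothetical helper: report which process and thread handled the item.
    return item, os.getpid(), threading.current_thread().name


pool = ThreadPool(4)
reports = pool.map(where_am_i, range(8))
pool.close()
pool.join()

# Every worker ran inside the calling process, so only one PID shows up.
pids = {pid for _, pid, _ in reports}
print(pids == {os.getpid()})
```

Swapping the import for `from multiprocessing import Pool` would be expected to produce several different PIDs instead, since that pool starts real processes.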

import urllib2
from multiprocessing.dummy import Pool as ThreadPool

urls = [
  'http://www.python.org',
  'http://www.python.org/about/',
  'http://www.onlamp.com/pub/a/python/2003/04/17/metaclasses.html',
  'http://www.python.org/doc/',
  'http://www.python.org/download/',
  'http://www.python.org/getit/',
  'http://www.python.org/community/',
  'https://wiki.python.org/moin/',
]

# Make the Pool of workers
pool = ThreadPool(4)

# Open the URLs in their own threads
# and return the results
results = pool.map(urllib2.urlopen, urls)

# Close the pool and wait for the work to finish
pool.close()
pool.join()

And the timing results:

Single thread:   14.4 seconds
       4 Pool:   3.1 seconds
       8 Pool:   1.4 seconds
      13 Pool:   1.3 seconds

Passing multiple arguments (works like this only in Python 3.3 and later):

To pass multiple arrays:

results = pool.starmap(function, zip(list_a, list_b))

Or to pass a constant and an array:

results = pool.starmap(function, zip(itertools.repeat(constant), list_a))

If you are using an earlier version of Python, you can pass multiple arguments via this workaround.
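The link to that workaround did not survive here, so the following is only a sketch of one common approach, not necessarily the one the answer pointed to: wrap the function so that each worker receives a single tuple and unpacks it. The names add and add_packed are hypothetical, and the example runs starmap next to the wrapper to show that they agree.

```python
import itertools
from multiprocessing.dummy import Pool as ThreadPool


def add(x, y):
    # Hypothetical two-argument function used only for illustration.
    return x + y


def add_packed(args):
    # Workaround for versions without starmap: unpack one tuple argument.
    return add(*args)


list_a = [1, 2, 3]
list_b = [10, 20, 30]

pool = ThreadPool(2)
# Python 3.3+: starmap unpacks each tuple into positional arguments.
paired = pool.starmap(add, zip(list_a, list_b))
with_constant = pool.starmap(add, zip(itertools.repeat(100), list_a))
# Older versions: plain map over tuples, unpacked by the wrapper.
paired_old_style = pool.map(add_packed, zip(list_a, list_b))
pool.close()
pool.join()

print(paired, with_constant, paired_old_style)
```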

(Thanks to user136036 for the helpful comment.)

— philshem

90

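This was only just posted, so it doesn't have many votes yet. This answer works beautifully and demonstrates the "map" functionality, which gives a much easier-to-understand syntax than the other answers here.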

— 2015

25

Isn't this using processes rather than threads? It looks like it's attempting multiprocessing, which != multithreading.

— AturSams

72

By the way, guys, you can also write with Pool(8) as p: p.map( *whatever* ) and get rid of the bookkeeping lines.

11

@BarafuAlbino: Useful as that is, it's probably worth noting that this only works in Python 3.3 and later.

— fuglede, 2015

9

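How can you leave this answer up without mentioning that it is only useful for I/O operations? This only runs on a single thread, which is useless in most cases, and is actually slower than just doing it the normal way.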

— Frobot

714

Here's a simple example: you need to try a few alternative URLs and return the contents of the first one to respond.

import Queue
import threading
import urllib2

# Called by each thread
def get_url(q, url):
    q.put(urllib2.urlopen(url).read())

theurls = ["http://google.com", "http://yahoo.com"]

q = Queue.Queue()

for u in theurls:
    t = threading.Thread(target=get_url, args = (q,u))
    t.daemon = True
    t.start()

s = q.get()
print s

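This is a case where threading is used as a simple optimization: each subthread waits for a URL to resolve and respond, in order to put its contents on the queue. Each thread is a daemon (it won't keep the process up if the main thread ends, which is more common than not). The main thread starts all subthreads, does a get on the queue to wait until one of them has done a put, then emits the result and terminates (which takes down any subthreads that might still be running, since they're daemon threads).

Proper use of threads in Python is invariably connected to I/O operations (since CPython doesn't use multiple cores to run CPU-bound tasks anyway, the only reason for threading is not blocking the process while there's a wait for some I/O). Queues are almost invariably the best way to farm out work to threads and/or collect the work's results, by the way, and they're intrinsically thread-safe, so they save you from worrying about locks, conditions, events, semaphores, and other inter-thread coordination/communication concepts.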
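As a rough illustration of collecting results through a queue, here is a minimal sketch, assuming Python 3 (where the module is named queue). It gathers every result rather than just the first one. The function fake_fetch and its delays are made up, with time.sleep standing in for a slow network call.

```python
import queue
import threading
import time


def fake_fetch(results, name, delay):
    # Stand-in for a slow network call (assumption: sleep simulates I/O wait).
    time.sleep(delay)
    results.put((name, delay))


jobs = [("slow", 0.3), ("fast", 0.05), ("medium", 0.15)]
results = queue.Queue()

for name, delay in jobs:
    t = threading.Thread(target=fake_fetch, args=(results, name, delay))
    t.daemon = True
    t.start()

# One blocking get() per started thread collects every result,
# in completion order rather than start order.
finished = [results.get() for _ in jobs]
print([name for name, _ in finished])
```

Because get() blocks until an item is available, no explicit lock or join() is needed just to collect the results.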

— Alex Martelli

10

Thanks, MartelliBot. I've updated the example to wait for all the URLs to respond: import Queue, threading, urllib2 q = Queue.Queue() urls = '''a.com b.com c.com'''.split() urls_received = 0 def get_url(q, url): req = urllib2.Request(url) resp = urllib2.urlopen(req) q.put(resp.read()) global urls_received urls_received += 1 print urls_received for u in urls: t = threading.Thread(target=get_url, args=(q, u)) t.daemon = True t.start() while q.empty() and urls_received < len(urls): s = q.get() print s

— htmldrum, 2013

3

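@JRM: If you look at the next answer below, I think a better way to wait until the threads are finished is to use the join() method, since that makes the main thread wait until they're done without consuming processor time by constantly checking a value. @Alex: thanks, this is exactly what I needed to understand how to use threads.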

— krs013, 2013

6

For Python 3, replace "import urllib2" with "import urllib.request as urllib2", and put parentheses in the print statement.

— Harvey

5

For Python 3, replace the Queue module name with queue. The method names are the same.

— JSmyth, 2014

2

I note that the solution only prints one of the pages. To print both pages from the queue, simply run the command again: s = q.get() print s. @krs013 You don't need the join, because Queue.get() is blocking.

— Tom Anderson

256

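NOTE: For actual parallelization in Python, you should use the multiprocessing module to fork multiple processes that execute in parallel (due to the global interpreter lock, Python threads provide interleaving, but they are in fact executed serially, not in parallel, and are only useful when interleaving I/O operations).

However, if you are merely looking for interleaving (or are doing I/O operations that can be parallelized despite the global interpreter lock), then the threading module is the place to start. As a really simple example, let's consider the problem of summing a large range by summing subranges in parallel: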

import threading

class SummingThread(threading.Thread):
     def __init__(self,low,high):
         super(SummingThread, self).__init__()
         self.low=low
         self.high=high
         self.total=0

     def run(self):
         for i in range(self.low,self.high):
             self.total+=i


thread1 = SummingThread(0,500000)
thread2 = SummingThread(500000,1000000)
thread1.start() # This actually causes the thread to run
thread2.start()
thread1.join()  # This waits until the thread has completed
thread2.join()
# At this point, both threads have completed
result = thread1.total + thread2.total
print result

Note that the above is a very stupid example, as it does absolutely no I/O and will be executed serially, albeit interleaved (with the added overhead of context switching), in CPython due to the global interpreter lock.

— Michael Aaron Safyan

16

@Alex, I didn't say it was practical, but it does demonstrate how to define and spawn threads, which I think is what the OP wants.

— Michael Aaron Safyan

6

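While this does show how to define and spawn threads, it doesn't actually sum the subranges in parallel. thread1 runs until it has completed while the main thread blocks, then the same thing happens with thread2, then the main thread resumes and prints out the values they accumulated.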

— martineau 14

Shouldn't that be super(SummingThread, self).__init__()? As in stackoverflow.com/a/2197625/806988

— James Andres

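@JamesAndres, assuming that no one inherits from "SummingThread", either one works fine. In that case super(SummingThread, self) is just a fancy way to look up the next class in the method resolution order (MRO), which is threading.Thread (and then to call __init__ on it, as happens in both cases). You are right, though, that using super() is better style for current Python. super was relatively new at the time I wrote this answer, hence the direct call to the superclass rather than super(). I'll update this to use super.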

— Michael Aaron Safyan, 2014

14

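WARNING: Don't use multithreading for tasks like this! As Dave Beazley showed (dabeaz.com/python/NewGIL.pdf), 2 Python threads on 2 CPUs carry out a CPU-heavy task 2 times SLOWER than 1 thread on 1 CPU, and 1.5 times SLOWER than 2 threads on 1 CPU. This bizarre behavior is due to miscoordination between the OS and Python. A real-life use case for threads is an I/O-heavy task. For example, when you read or write over a network, it makes sense to put a thread that is waiting for data to be read or written into the background and switch the CPU to another thread that needs to process data.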

— Boris Burkov, 2014

98

As others have mentioned, CPython can use threads only for I/O waits, because of the GIL.

If you want to benefit from multiple cores for CPU-bound tasks, use multiprocessing:

from multiprocessing import Process

def f(name):
    print 'hello', name

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()

— Kai

33

Could you explain a little what this does?

— pandita, 2013

5

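@pandita: The code creates a process, then starts it. So now two things are happening at once: the main line of the program, and the process that starts with the target function, f. In parallel, the main program just waits for the process to exit, joining up with it. If the main part simply exited, the subprocess might or might not run to completion, so doing a join is always recommended.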

— johntellsall 14

1

An expanded answer that includes the map function is here: stackoverflow.com/a/28463266/2327328

— philshem

2

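@philshem Be careful, b/c the link you posted uses a pool of threads (not processes), as mentioned here: stackoverflow.com/questions/26432411/.... This answer, however, uses a process. I'm new to this stuff, but it seems that (due to the GIL) you only get performance gains in specific situations when using multithreading in Python. A pool of processes, however, can take advantage of a multicore processor by having more than one core work on the problem.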

— user3731622 2015

3

This is the best answer for actually doing something useful and taking advantage of multiple CPU cores.

— Frobot

92

NOTE: A queue is not required for threading.

This is the simplest example I could imagine that shows 10 threads running concurrently.

import threading
from random import randint
from time import sleep


def print_number(number):

    # Sleeps a random 1 to 10 seconds
    rand_int_var = randint(1, 10)
    sleep(rand_int_var)
    print "Thread " + str(number) + " slept for " + str(rand_int_var) + " seconds"

thread_list = []

for i in range(1, 11):  # threads numbered 1 through 10

    # Instantiates the thread
    # (i) does not make a sequence, so (i,)
    t = threading.Thread(target=print_number, args=(i,))
    # Sticks the thread in a list so that it remains accessible
    thread_list.append(t)

# Starts threads
for thread in thread_list:
    thread.start()

# This blocks the calling thread until the thread whose join() method is called is terminated.
# From http://docs.python.org/2/library/threading.html#thread-objects
for thread in thread_list:
    thread.join()

# Demonstrates that the main process waited for threads to complete
print "Done"

— Douglas Adams

3

Add the closing quote to "Done" so that it prints "Done".

— iChux

1

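I like this example better than Martelli's; it's easier to play with. However, to make it a little clearer what's going on, I'd recommend that printNumber do the following: save the randint to a variable before sleeping on it, and change the print to say "Thread" + str(number) + " slept for " + theRandintVariable + " seconds"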

— Nickolai 14

Is there a way to know when each thread has finished, as it finishes?

— Matt

1

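@Matt There are a few ways to do something like that, but it depends on your needs. One way is to update a singleton or some other publicly accessible variable that is watched in a while loop and updated at the end of each thread.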

— Douglas Adams

2

You don't need the second for loop; you can call thread.start() in the first loop.

— Mark Mishyn

49

I found the answer from Alex Martelli helpful. However, here is a modified version that I thought was more useful (at least to me).

Updated: works in both Python 2 and Python 3

try:
    # For Python 3
    import queue
    from urllib.request import urlopen
except ImportError:
    # For Python 2 
    import Queue as queue
    from urllib2 import urlopen

import threading

worker_data = ['http://google.com', 'http://yahoo.com', 'http://bing.com']

# Load up a queue with your data. This will handle locking
q = queue.Queue()
for url in worker_data:
    q.put(url)

# Define a worker function
def worker(url_queue):
    queue_full = True
    while queue_full:
        try:
            # Get your data off the queue, and do some work
            url = url_queue.get(False)
            data = urlopen(url).read()
            print(len(data))

        except queue.Empty:
            queue_full = False

# Create as many threads as you want
thread_count = 5
for i in range(thread_count):
    t = threading.Thread(target=worker, args = (q,))
    t.start()

— JimJty

6

Why not just break on the exception?

— Stavros Korokithakis 14

1

You could; it's just personal preference.

— JimJty

1

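I haven't run the code, but don't you need to daemonize the threads? I think that after that last for loop your program might exit; at least it should, because that's how threads are supposed to work. I think a better approach is to put the output into a queue rather than the worker data, because then you can have a main loop that not only handles information coming into the queue from the workers but also isn't itself threaded, and you know it won't exit prematurely.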

— dylnmc, 2016

1

@dylnmc, that's outside my use case (my input queue is predefined). If you want to go your route, I would suggest Celery

— JimJty

@JimJty Do you know why I'm getting this error: import Queue ModuleNotFoundError: No module named 'Queue'? I'm running Python 3.6.5. Some posts say that in Python 3.6.5 the module is called queue, but it still doesn't work even after I change it.

— user9371654

25

Given a function, f, thread it like this:

import threading
threading.Thread(target=f).start()

To pass arguments to f:

threading.Thread(target=f, args=(a,b,c)).start()
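One caveat with the one-liners above is that start() returns None, so chaining it discards the Thread object. If you need to wait for the thread or check on it later, a minimal sketch looks like this. The body of f and its arguments are placeholders, with the sleep standing in for real work.

```python
import threading
import time


def f(a, b, c):
    # Hypothetical work function; the sleep stands in for real work.
    time.sleep(0.1)


# Keep the Thread object; start() itself returns None.
worker = threading.Thread(target=f, args=(1, 2, 3))
worker.start()
alive_while_running = worker.is_alive()

worker.join()  # Block until f has returned.
alive_after_join = worker.is_alive()
print(alive_while_running, alive_after_join)
```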

— starfry

This is very straightforward. How do you make sure the threads are closed when you are done with them?

— cameronroytaylor 2017

As far as I understand it, the Thread object is cleaned up when the function exits. See the docs. There is an is_alive() method you can use to check on a thread if you need to.

— starfry

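I saw the is_alive method, but I couldn't figure out how to apply it to the thread. I tried assigning thread1=threading.Thread(target=f).start() and then checking it with thread1.is_alive(), but thread1 is populated with None, so no luck there. Do you know if there is any other way to access the thread?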

— cameronroytaylor

4

You need to assign the thread object to a variable and then start it using that variable: thread1=threading.Thread(target=f) followed by thread1.start(). Then you can do thread1.is_alive().

— starfry, 2017

1

That worked. And yes, testing with thread1.is_alive() returns False as soon as the function exits.

— cameronroytaylor

25

I found this very useful: create as many threads as there are cores and let them execute a (large) number of tasks (in this case, calling a shell program):

import Queue
import threading
import multiprocessing
import subprocess

q = Queue.Queue()
for i in range(30): # Put 30 tasks in the queue
    q.put(i)

def worker():
    while True:
        item = q.get()
        # Execute a task: call a shell program and wait until it completes
        subprocess.call("echo " + str(item), shell=True)
        q.task_done()

cpus = multiprocessing.cpu_count() # Detect number of cores
print("Creating %d threads" % cpus)
for i in range(cpus):
     t = threading.Thread(target=worker)
     t.daemon = True
     t.start()

q.join() # Block until all tasks are done

— dolphin

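@shavenwarthog Sure, you can adjust the "cpus" variable to suit your needs. In any case, the subprocess call spawns subprocesses, and the OS allocates CPUs to them (Python's "parent process" does not mean "same CPU" for the subprocesses).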

— dolphin, 14

2

You're correct; my comment that "threads are started on the same CPU as the parent process" was wrong. Thanks for the reply!

— johntellsall 14

1

It may be worth noting that, unlike multithreading, which uses the same memory space, multiprocessing cannot share variables/data as easily. +1 though.

— 2014

22

Python 3 has a facility for launching parallel tasks, which makes our work easier.

It offers thread pooling and process pooling.

The following gives an insight:

ThreadPoolExecutor example (source)

import concurrent.futures
import urllib.request

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

# Retrieve a single page and report the URL and contents
def load_url(url, timeout):
    with urllib.request.urlopen(url, timeout=timeout) as conn:
        return conn.read()

# We can use a with statement to ensure threads are cleaned up promptly
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
    # Start the load operations and mark each future with its URL
    future_to_url = {executor.submit(load_url, url, 60): url for url in URLS}
    for future in concurrent.futures.as_completed(future_to_url):
        url = future_to_url[future]
        try:
            data = future.result()
        except Exception as exc:
            print('%r generated an exception: %s' % (url, exc))
        else:
            print('%r page is %d bytes' % (url, len(data)))

ProcessPoolExecutor (source)

import concurrent.futures
import math

PRIMES = [
    112272535095293,
    112582705942171,
    112272535095293,
    115280095190773,
    115797848077099,
    1099726899285419]

def is_prime(n):
    if n % 2 == 0:
        return False

    sqrt_n = int(math.floor(math.sqrt(n)))
    for i in range(3, sqrt_n + 1, 2):
        if n % i == 0:
            return False
    return True

def main():
    with concurrent.futures.ProcessPoolExecutor() as executor:
        for number, prime in zip(PRIMES, executor.map(is_prime, PRIMES)):
            print('%d is prime: %s' % (number, prime))

if __name__ == '__main__':
    main()

— Jeril

18

Using the blazing new concurrent.futures module

def sqr(val):
    import time
    time.sleep(0.1)
    return val * val

def process_result(result):
    print(result)

def process_these_asap(tasks):
    import concurrent.futures

    with concurrent.futures.ProcessPoolExecutor() as executor:
        futures = []
        for task in tasks:
            futures.append(executor.submit(sqr, task))

        for future in concurrent.futures.as_completed(futures):
            process_result(future.result())
        # Or instead of all this just do:
        # results = executor.map(sqr, tasks)
        # list(map(process_result, results))

def main():
    tasks = list(range(10))
    print('Processing {} tasks'.format(len(tasks)))
    process_these_asap(tasks)
    print('Done')
    return 0

if __name__ == '__main__':
    import sys
    sys.exit(main())

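The executor approach might seem familiar to anyone who has gotten their hands dirty with Java before.

Also, on a side note: to keep the universe sane, don't forget to close your pools/executors if you don't use a with context (which is so awesome that it does it for you).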
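For the case without a with block, a minimal sketch of the explicit cleanup might look like the following. It uses ThreadPoolExecutor to keep the example small; ProcessPoolExecutor exposes the same shutdown() method.

```python
from concurrent.futures import ThreadPoolExecutor


def sqr(val):
    return val * val


executor = ThreadPoolExecutor(max_workers=4)
try:
    squares = list(executor.map(sqr, range(5)))
finally:
    # Without a with block, shut the executor down explicitly.
    executor.shutdown(wait=True)

print(squares)
```

The try/finally makes sure the worker threads are released even if one of the tasks raises.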

— Shubham Chaudhary

17

For me, the perfect example of threading is monitoring asynchronous events. Look at this code.

# thread_test.py
import threading
import time

class Monitor(threading.Thread):
    def __init__(self, mon):
        threading.Thread.__init__(self)
        self.mon = mon

    def run(self):
        while True:
            if self.mon[0] == 2:
                print "Mon = 2"
                self.mon[0] = 3;

You can play with this code by opening an IPython session and doing something like:

>>> from thread_test import Monitor
>>> a = [0]
>>> mon = Monitor(a)
>>> mon.start()
>>> a[0] = 2
Mon = 2
>>> a[0] = 2
Mon = 2

Wait a few minutes

>>> a[0] = 2
Mon = 2
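The run() loop above checks self.mon[0] continuously, which keeps one core busy the whole time. As a rough sketch of an alternative (assuming Python 3), threading.Event lets the thread block until it is signalled. The class name EventMonitor and its attributes are made up for illustration and are not part of the original answer.

```python
import threading


class EventMonitor(threading.Thread):
    """Hypothetical variant of Monitor that sleeps until it is signalled."""

    def __init__(self):
        super().__init__(daemon=True)
        self.triggered = threading.Event()
        self.handled = threading.Event()
        self.count = 0

    def run(self):
        while True:
            self.triggered.wait()   # Blocks without burning CPU.
            self.triggered.clear()
            self.count += 1
            self.handled.set()      # Tell the caller the event was processed.


mon = EventMonitor()
mon.start()
for _ in range(3):
    mon.handled.clear()
    mon.triggered.set()
    mon.handled.wait(timeout=5)
print(mon.count)
```

The second event, handled, exists only so the caller can tell when each signal has been processed. A real monitor might not need it.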

— dvreed77

1

AttributeError: 'Monitor' object has no attribute 'stop'?

— pandita, 2013

5

Aren't you burning CPU cycles while waiting for your event to happen? That's not always a very practical thing to do.

— mogul, 2013

3

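As mogul says, this will be executing constantly. At a minimum you could add a short sleep, say sleep(0.1), which would probably reduce CPU usage significantly in a simple example like this.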

— 2014

3

This is a horrible example that wastes one core. At least add a sleep, but the proper solution is to use some signaling mechanism.

— PureW

16

Most documentation and tutorials use Python's threading and Queue modules, which can seem overwhelming for beginners.

Perhaps consider concurrent.futures.ThreadPoolExecutor in Python 3.

Combined with a with clause and a list comprehension, it can be a real charm.

from concurrent.futures import ThreadPoolExecutor, as_completed

def get_url(url):
    # Your actual program here. Using threading.Lock() if necessary
    return ""

# List of URLs to fetch
urls = ["url1", "url2"]

with ThreadPoolExecutor(max_workers = 5) as executor:

    # Create threads
    futures = {executor.submit(get_url, url) for url in urls}

    # as_completed() gives you the threads once finished
    for f in as_completed(futures):
        # Get the results
        rs = f.result()

— Yibo

15

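I saw a lot of examples here where no real work was being performed, and they were mostly CPU-bound. Here is an example of a CPU-bound task that computes all prime numbers between 10 million and 10.5 million. I have used all four methods here: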

import math
import timeit
import threading
import multiprocessing
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor


def time_stuff(fn):
    """
    Measure time of execution of a function
    """
    def wrapper(*args, **kwargs):
        t0 = timeit.default_timer()
        fn(*args, **kwargs)
        t1 = timeit.default_timer()
        print("{} seconds".format(t1 - t0))
    return wrapper

def find_primes_in(nmin, nmax):
    """
    Compute a list of prime numbers between the given minimum and maximum arguments
    """
    primes = []

    # Loop from minimum to maximum
    for current in range(nmin, nmax + 1):

        # Take the square root of the current number
        sqrt_n = int(math.sqrt(current))
        found = False

        # Check whether any number from 2 up to and including the square root divides the current number under consideration
        for number in range(2, sqrt_n + 1):

            # If divisible we have found a factor, hence this is not a prime number, lets move to the next one
            if current % number == 0:
                found = True
                break

        # If not divisible, add this number to the list of primes that we have found so far
        if not found:
            primes.append(current)

    # I am merely printing the length of the array containing all the primes, but feel free to do what you want
    print(len(primes))

@time_stuff
def sequential_prime_finder(nmin, nmax):
    """
    Use the main process and main thread to compute everything in this case
    """
    find_primes_in(nmin, nmax)

@time_stuff
def threading_prime_finder(nmin, nmax):
    """
    If the minimum is 1000 and the maximum is 2000 and we have four workers,
    1000 - 1250 to worker 1
    1250 - 1500 to worker 2
    1500 - 1750 to worker 3
    1750 - 2000 to worker 4
    so let’s split the minimum and maximum values according to the number of workers
    """
    nrange = nmax - nmin
    threads = []
    for i in range(8):
        start = int(nmin + i * nrange/8)
        end = int(nmin + (i + 1) * nrange/8)

        # Start the thread with the minimum and maximum split up to compute
        # Parallel computation will not work here due to the GIL since this is a CPU-bound task
        t = threading.Thread(target = find_primes_in, args = (start, end))
        threads.append(t)
        t.start()

    # Don’t forget to wait for the threads to finish
    for t in threads:
        t.join()

@time_stuff
def processing_prime_finder(nmin, nmax):
    """
    Split the minimum, maximum interval similar to the threading method above, but use processes this time
    """
    nrange = nmax - nmin
    processes = []
    for i in range(8):
        start = int(nmin + i * nrange/8)
        end = int(nmin + (i + 1) * nrange/8)
        p = multiprocessing.Process(target = find_primes_in, args = (start, end))
        processes.append(p)
        p.start()

    for p in processes:
        p.join()

@time_stuff
def thread_executor_prime_finder(nmin, nmax):
    """
    Split the min max interval similar to the threading method, but use a thread pool executor this time.
    This method is slightly faster than using pure threading as the pools manage threads more efficiently.
    This method is still slow due to the GIL limitations since we are doing a CPU-bound task.
    """
    nrange = nmax - nmin
    with ThreadPoolExecutor(max_workers = 8) as e:
        for i in range(8):
            start = int(nmin + i * nrange/8)
            end = int(nmin + (i + 1) * nrange/8)
            e.submit(find_primes_in, start, end)

@time_stuff
def process_executor_prime_finder(nmin, nmax):
    """
    Split the min max interval similar to the threading method, but use the process pool executor.
    This is the fastest method recorded so far as it manages process efficiently + overcomes GIL limitations.
    RECOMMENDED METHOD FOR CPU-BOUND TASKS
    """
    nrange = nmax - nmin
    with ProcessPoolExecutor(max_workers = 8) as e:
        for i in range(8):
            start = int(nmin + i * nrange/8)
            end = int(nmin + (i + 1) * nrange/8)
            e.submit(find_primes_in, start, end)

def main():
    nmin = int(1e7)
    nmax = int(1.05e7)
    print("Sequential Prime Finder Starting")
    sequential_prime_finder(nmin, nmax)
    print("Threading Prime Finder Starting")
    threading_prime_finder(nmin, nmax)
    print("Processing Prime Finder Starting")
    processing_prime_finder(nmin, nmax)
    print("Thread Executor Prime Finder Starting")
    thread_executor_prime_finder(nmin, nmax)
    print("Process Executor Finder Starting")
    process_executor_prime_finder(nmin, nmax)

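if __name__ == '__main__':
    # Guard the entry point so that child processes can import this module safely
    main()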

Here are the results on my four-core Mac OS X machine:

Sequential Prime Finder Starting
9.708213827005238 seconds
Threading Prime Finder Starting
9.81836523200036 seconds
Processing Prime Finder Starting
3.2467174359990167 seconds
Thread Executor Prime Finder Starting
10.228896902000997 seconds
Process Executor Finder Starting
2.656402041000547 seconds

— PirateApp

1

@TheUnfunCat No, the process executor is far better than threading for CPU-bound tasks.

— PirateApp

1

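Great answer. I can confirm that in Python 3.6 on Windows (at least), ThreadPoolExecutor does nothing useful for CPU-heavy tasks; it doesn't use multiple cores for the computation. ProcessPoolExecutor, on the other hand, copies the data into EVERY process it spawns, which is deadly for large matrices.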

— Anatoly Alekseev

1

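Very useful example, but I don't understand how it ever worked. We need an if __name__ == '__main__': before the main call; otherwise the measurement spawns itself and prints "An attempt has been made to start a new process before...".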

— Stein

1

@Stein I believe that is only an issue on Windows, though.

— AMC

12

This is a very simple example of a CSV import using threading. (The libraries you include may differ depending on your purpose.)

Helper functions:

from threading import Thread
from project import app
import csv


def import_handler(csv_file_name):
    thr = Thread(target=dump_async_csv_data, args=[csv_file_name])
    thr.start()

def dump_async_csv_data(csv_file_name):
    with app.app_context():
        with open(csv_file_name) as File:
            reader = csv.DictReader(File)
            for row in reader:
                # DB operation/query goes here
                pass

Driver function:

import_handler(csv_file_name)

— Chirag Vora

9

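I would like to contribute a simple example and the explanations I found useful when I had to tackle this problem myself.

In this answer you will find some information about Python's GIL (global interpreter lock) and a simple day-to-day example written using multiprocessing.dummy, plus some simple benchmarks.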

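Global Interpreter Lock (GIL)

Python doesn't allow multithreading in the truest sense of the word. It has a multithreading package, but if you want to multithread to speed your code up, then it's usually not a good idea to use it.

Python has a construct called the global interpreter lock (GIL). The GIL makes sure that only one of your "threads" can execute at any one time. A thread acquires the GIL, does a little work, then passes the GIL on to the next thread.

This happens very quickly, so to the human eye it may seem like your threads are executing in parallel, but they are really just taking turns using the same CPU core.

All this GIL passing adds overhead to execution. This means that if you want to make your code run faster, then using the threading package often isn't a good idea.

There are reasons to use Python's threading package. If you want to run several things simultaneously and efficiency is not a concern, then it's totally fine and convenient. Or if you are running code that needs to wait for something (like some I/O), then it can make a lot of sense. But the threading library won't let you use extra CPU cores.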
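To make the "waiting for something" case concrete, here is a small timing sketch that is not part of the original answer. The function names are made up, and time.sleep stands in for a blocking I/O call; like real blocking I/O, it releases the GIL while it waits.

```python
import threading
import time


def wait_for_io(seconds):
    # time.sleep releases the GIL, much like a blocking I/O call would.
    time.sleep(seconds)


def run_sequential(n, seconds):
    start = time.perf_counter()
    for _ in range(n):
        wait_for_io(seconds)
    return time.perf_counter() - start


def run_threaded(n, seconds):
    start = time.perf_counter()
    threads = [threading.Thread(target=wait_for_io, args=(seconds,)) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


sequential = run_sequential(4, 0.2)
threaded = run_threaded(4, 0.2)
print("sequential: %.2f s, threaded: %.2f s" % (sequential, threaded))
```

The four waits overlap in the threaded run, so it should take roughly as long as one wait, while the sequential run takes about four times that. Replacing the sleep with a pure-Python computation would not be expected to show the same gain on CPython.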

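Multithreading can be outsourced to the operating system (by doing multiprocessing), to some external application that calls your Python code (for example, Spark or Hadoop), or to some code that your Python code calls (for example, you could have your Python code call a C function that does the expensive multithreaded work).

Why This Matters

Because lots of people spend a lot of time trying to find bottlenecks in their fancy Python multithreaded code before they learn what the GIL is.

Once this information is clear, here's my code: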

#!/bin/python
from multiprocessing.dummy import Pool
from subprocess import PIPE,Popen
import time
import os

# In the variable pool_size we define the "parallelness".
# For CPU-bound tasks, it doesn't make sense to create more Pool processes
# than you have cores to run them on.
#
# On the other hand, if you are using I/O-bound tasks, it may make sense
# to create quite a few more Pool processes than cores, since the processes
# will probably spend most of their time blocked (waiting for I/O to complete).
pool_size = 8

def do_ping(ip):
    if os.name == 'nt':
        print ("Using Windows Ping to " + ip)
        proc = Popen(['ping', ip], stdout=PIPE)
        return proc.communicate()[0]
    else:
        print ("Using Linux / Unix Ping to " + ip)
        proc = Popen(['ping', ip, '-c', '4'], stdout=PIPE)
        return proc.communicate()[0]


os.system('cls' if os.name=='nt' else 'clear')
print ("Running using threads\n")
start_time = time.time()
pool = Pool(pool_size)
website_names = ["www.google.com","www.facebook.com","www.pinterest.com","www.microsoft.com"]
result = {}
for website_name in website_names:
    result[website_name] = pool.apply_async(do_ping, args=(website_name,))
pool.close()
pool.join()
print ("\n--- Execution took {} seconds ---".format((time.time() - start_time)))

# Now we do the same without threading, just to compare time
print ("\nRunning NOT using threads\n")
start_time = time.time()
for website_name in website_names:
    do_ping(website_name)
print ("\n--- Execution took {} seconds ---".format((time.time() - start_time)))

# Here's one way to print the final output from the threads
output = {}
for key, value in result.items():
    output[key] = value.get()
print ("\nOutput aggregated in a Dictionary:")
print (output)
print ("\n")

print ("\nPretty printed output: ")
for key, value in output.items():
    print (key + "\n")
    print (value)

— Pitto

7

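Here is a very simple multithreading example that you may find helpful. You can run it and easily see how multithreading works in Python. I used a lock to prevent other threads from getting access until the previous threads had finished their work. With this line of code,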

tLock = threading.BoundedSemaphore(value=4)

you can allow a fixed number of threads to run at a time (four here) and hold back the remaining threads, which run later, once earlier ones have finished.

import threading
import time

#tLock = threading.Lock()
tLock = threading.BoundedSemaphore(value=4)
def timer(name, delay, repeat):
    print  "\r\nTimer: ", name, " Started"
    tLock.acquire()
    print "\r\n", name, " has acquired the lock"
    while repeat > 0:
        time.sleep(delay)
        print "\r\n", name, ": ", str(time.ctime(time.time()))
        repeat -= 1

    print "\r\n", name, " is releasing the lock"
    tLock.release()
    print "\r\nTimer: ", name, " Completed"

def Main():
    t1 = threading.Thread(target=timer, args=("Timer1", 2, 5))
    t2 = threading.Thread(target=timer, args=("Timer2", 3, 5))
    t3 = threading.Thread(target=timer, args=("Timer3", 4, 5))
    t4 = threading.Thread(target=timer, args=("Timer4", 5, 5))
    t5 = threading.Thread(target=timer, args=("Timer5", 0.1, 5))

    t1.start()
    t2.start()
    t3.start()
    t4.start()
    t5.start()

    print "\r\nMain Complete"

if __name__ == "__main__":
    Main()
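The code above is written for Python 2 and pairs acquire() with release() by hand. As a rough Python 3 sketch of the same idea, a semaphore can also be used as a context manager. The limit of 2, the task function, and the counters below are made up for illustration; the counters exist only to show that the limit is respected.

```python
import threading
import time

slots = threading.BoundedSemaphore(value=2)
state_lock = threading.Lock()
active = 0
max_active = 0


def limited_task(delay):
    global active, max_active
    with slots:  # acquire() on entry, release() on exit, even on error
        with state_lock:
            active += 1
            max_active = max(max_active, active)
        time.sleep(delay)
        with state_lock:
            active -= 1


threads = [threading.Thread(target=limited_task, args=(0.05,)) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # Unlike Main() above, wait for the workers before reporting.

print("most tasks running at once:", max_active)
```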

— cSharma

5

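Borrowing from this post, we know how to choose between multithreading, multiprocessing, and async/asyncio, and how each is used.

Python 3 has a new built-in library for concurrency and parallelism: concurrent.futures

So I'll demonstrate, through an experiment, running four tasks (i.e. calls to the .sleep() method) in a thread pool: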

from concurrent.futures import ThreadPoolExecutor, as_completed
from time import sleep, time

def concurrent(max_worker=1):
    futures = []

    tick = time()
    with ThreadPoolExecutor(max_workers=max_worker) as executor:
        futures.append(executor.submit(sleep, 2))  # Two seconds sleep
        futures.append(executor.submit(sleep, 1))
        futures.append(executor.submit(sleep, 7))
        futures.append(executor.submit(sleep, 3))

        for future in as_completed(futures):
            if future.result() is not None:
                print(future.result())

    print('Total elapsed time by {} workers:'.format(max_worker), time()-tick)

concurrent(5)
concurrent(4)
concurrent(3)
concurrent(2)
concurrent(1)

Output:

Total elapsed time by 5 workers: 7.007831811904907
Total elapsed time by 4 workers: 7.007944107055664
Total elapsed time by 3 workers: 7.003149509429932
Total elapsed time by 2 workers: 8.004627466201782
Total elapsed time by 1 workers: 13.013478994369507

[NOTE]:

As you can see from the results above, three workers are already the best case for these four tasks.
If your tasks are CPU-bound rather than I/O-bound or blocking (multiprocessing vs. threading), you can change ThreadPoolExecutor to ProcessPoolExecutor.
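The reason three workers suffice can be worked out by hand. With three workers the 2, 1 and 7 second sleeps start together, the worker that finishes after 1 second picks up the 3 second sleep, and everything is done when the 7 second sleep ends. The sketch below simulates that scheduling. It assumes an idealized pool in which tasks start in submission order and there is no overhead, and the function name is made up.

```python
import heapq


def simulated_elapsed(durations, workers):
    """Idealized pool: each task goes to whichever worker frees up first."""
    free_at = [0] * workers          # Time at which each worker becomes idle.
    heapq.heapify(free_at)
    finish = 0
    for d in durations:              # Tasks are taken in submission order.
        start = heapq.heappop(free_at)
        end = start + d
        finish = max(finish, end)
        heapq.heappush(free_at, end)
    return finish


for n in (5, 4, 3, 2, 1):
    print(n, "workers:", simulated_elapsed([2, 1, 7, 3], n), "seconds")
```

This gives 7, 7, 7, 8 and 13 seconds for 5, 4, 3, 2 and 1 workers, which matches the measured times above once they are rounded.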

— Benyamin Jafari

4

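None of the previous solutions actually used multiple cores on my GNU/Linux server (where I don't have administrator rights). They just ran on a single core.

I used the lower-level os.fork interface to spawn multiple processes. This is the code that worked for me: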

from os import fork

values = ['different', 'values', 'for', 'threads']

for i in range(len(values)):
    p = fork()
    if p == 0:
        my_function(values[i])
        break
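In the snippet above the parent never waits for its children, and each child carries on with whatever code follows the loop. A slightly more defensive sketch, for Unix-like systems only since os.fork does not exist on Windows, might look like this. Here my_function is a hypothetical placeholder for the real work.

```python
import os


def my_function(value):
    # Hypothetical placeholder for the real per-process work.
    return len(value)


values = ['different', 'values', 'for', 'threads']
child_pids = []

for value in values:
    pid = os.fork()
    if pid == 0:
        # Child process: do the work, then exit without running parent code.
        my_function(value)
        os._exit(0)
    child_pids.append(pid)

# Parent process: wait for every child so none is left as a zombie.
statuses = [os.waitpid(pid, 0)[1] for pid in child_pids]
print(statuses)
```

Return values computed in a child are not visible to the parent. Getting results back would need a pipe, a file, or one of the higher-level multiprocessing tools from the other answers.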

— David Schumann

2

import threading
import requests

def send():

  r = requests.get('https://www.stackoverlow.com')

thread = []
t = threading.Thread(target=send)  # pass the function itself; send() would run it in the main thread
thread.append(t)
t.start()

— Skiller Dz

1

@sP_ I'm guessing it's because you then have thread objects, so you can wait for them to finish.

— AleksandarMakragić, 18

1

t = threading.Thread(target=send()) should be t = threading.Thread(target=send)

— TRiNE

I'm downvoting this answer because it doesn't explain how it improves on the existing answers, in addition to containing a serious inaccuracy.

— Jules