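How to stream completions
By default, when you send a prompt to the OpenAI Completions endpoint, it computes the entire completion and sends it back in a single response.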
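If you are generating very long completions from a davinci-level model, waiting for the response can take many seconds. As of August 2022, responses from text-davinci-002 typically take about 1 second plus about 2 seconds per 100 completion tokens.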
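The rule of thumb above can be turned into a quick estimate. The sketch below is only an illustration of that arithmetic: the function name and its default parameters are made up here, and the constants are the August 2022 figures quoted in the text, not a guarantee about any current model.

```python
def estimate_completion_seconds(completion_tokens: int,
                                base_seconds: float = 1.0,
                                seconds_per_100_tokens: float = 2.0) -> float:
    """Rule-of-thumb wait time for a non-streamed completion.

    Defaults follow the figures quoted in the text (about 1 s of overhead
    plus about 2 s per 100 completion tokens); treat them as rough guesses.
    """
    if completion_tokens < 0:
        raise ValueError("completion_tokens must be non-negative")
    return base_seconds + seconds_per_100_tokens * completion_tokens / 100


if __name__ == "__main__":
    for n in (100, 500, 2000):
        print(f"{n} tokens -> roughly {estimate_completion_seconds(n):.0f} s")
```

Under these assumptions a 2,000-token completion would keep the caller waiting for roughly 40 seconds, which is the situation where streaming helps most.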
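If you want to get the response sooner, you can "stream" the completion as it is being generated. This lets you start printing or otherwise processing the beginning of the completion before the whole completion has finished.
To stream completions, set stream=True when calling the Completions endpoint. This returns an object that streams back the text as data-only server-sent events.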
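To make "data-only server-sent events" concrete, here is a minimal sketch of how such a stream could be parsed by hand. It assumes each event arrives as a line of the form `data: <JSON>` and that the stream ends with a `data: [DONE]` line; the sample lines are hand-written for illustration and the helper name is hypothetical. The Python library does this parsing for you, so this is only meant to show what is on the wire.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each `data:` line until the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream marker
            break
        yield json.loads(payload)


# Hand-written sample of what a stream body could look like (illustrative only)
SAMPLE_LINES = [
    'data: {"choices": [{"text": "4", "index": 0}]}',
    '',
    'data: {"choices": [{"text": ",", "index": 0}]}',
    '',
    'data: [DONE]',
]

if __name__ == "__main__":
    text = "".join(event["choices"][0]["text"] for event in iter_sse_data(SAMPLE_LINES))
    print(text)
```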
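Downsides
Note that using stream=True in a production application makes it more difficult to moderate the content of the completions, which has implications for approved usage.
Another small drawback of streaming responses is that the response no longer includes the usage field to tell you how many tokens were consumed. After receiving and combining all of the responses, you can calculate this yourself using tiktoken.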
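One way to recover the completion token count is sketched below. The helper names are hypothetical; the only tiktoken call used is `tiktoken.encoding_for_model(...).encode`, and it is imported lazily so the counting helper itself has no dependency. The chunks are joined before encoding on the assumption that tokenizing the full text is the more faithful count.

```python
from typing import Callable, Iterable, Sequence


def count_completion_tokens(chunks: Iterable[str],
                            encode: Callable[[str], Sequence]) -> int:
    """Join the streamed text chunks and count tokens with the given encoder."""
    full_text = "".join(chunks)
    return len(encode(full_text))


def tiktoken_encoder(model: str = "text-davinci-002") -> Callable[[str], Sequence]:
    """Return tiktoken's encode function for `model` (requires `pip install tiktoken`)."""
    import tiktoken  # lazy import: only needed when this helper is called

    return tiktoken.encoding_for_model(model).encode
```

With tiktoken installed, `count_completion_tokens(collected_chunks, tiktoken_encoder())` gives the completion side of the count, where `collected_chunks` stands for whatever list of text pieces you accumulated. The prompt would need to be encoded separately to estimate the total usage.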
Example code
Below is a Python code example of how to receive streaming completions.
# imports
import openai  # for OpenAI API calls
import time  # for measuring time savings
A typical completion request
With a typical Completions API call, the text is first computed and then returned all at once.
# Example of an OpenAI Completion request
# https://beta.openai.com/docs/api-reference/completions/create

# record the time before the request is sent
start_time = time.time()

# send a Completion request to count to 100
response = openai.Completion.create(
    model='text-davinci-002',
    prompt='1,2,3,',
    max_tokens=193,
    temperature=0,
)

# calculate the time it took to receive the response
response_time = time.time() - start_time

# extract the text from the response
completion_text = response['choices'][0]['text']

# print the time delay and text received
print(f"Full response received {response_time:.2f} seconds after request")
print(f"Full text received: {completion_text}")
Full response received 7.32 seconds after request
Full text received: 4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100
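A streaming completion request
With a streaming Completions API call, the text is sent back as a series of events. In Python, you can iterate over these events with a for loop.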
# Example of an OpenAI Completion request, using the stream=True option
# https://beta.openai.com/docs/api-reference/completions/create

# record the time before the request is sent
start_time = time.time()

# send a Completion request to count to 100
response = openai.Completion.create(
    model='text-davinci-002',
    prompt='1,2,3,',
    max_tokens=193,
    temperature=0,
    stream=True,  # this time, we set stream=True
)

# create variables to collect the stream of events
collected_events = []
completion_text = ''

# iterate through the stream of events
for event in response:
    event_time = time.time() - start_time  # calculate the time delay of the event
    collected_events.append(event)  # save the event response
    event_text = event['choices'][0]['text']  # extract the text
    completion_text += event_text  # append the text
    print(f"Text received: {event_text} ({event_time:.2f} seconds after request)")  # print the delay and text

# print the time delay and text received
print(f"Full response received {event_time:.2f} seconds after request")
print(f"Full text received: {completion_text}")
Text received: 4 (0.16 seconds after request)
Text received: , (0.19 seconds after request)
Text received: 5 (0.21 seconds after request)
Text received: , (0.24 seconds after request)
Text received: 6 (0.27 seconds after request)
Text received: , (0.29 seconds after request)
Text received: 7 (0.32 seconds after request)
Text received: , (0.35 seconds after request)
Text received: 8 (0.37 seconds after request)
Text received: , (0.40 seconds after request)
Text received: 9 (0.43 seconds after request)
Text received: , (0.46 seconds after request)
Text received: 10 (0.48 seconds after request)
Text received: , (0.51 seconds after request)
Text received: 11 (0.54 seconds after request)
Text received: , (0.56 seconds after request)
Text received: 12 (0.59 seconds after request)
Text received: , (0.62 seconds after request)
Text received: 13 (0.64 seconds after request)
Text received: , (0.67 seconds after request)
Text received: 14 (0.70 seconds after request)
Text received: , (0.72 seconds after request)
Text received: 15 (0.75 seconds after request)
Text received: , (0.78 seconds after request)
Text received: 16 (0.84 seconds after request)
Text received: , (0.84 seconds after request)
Text received: 17 (0.86 seconds after request)
Text received: , (0.89 seconds after request)
Text received: 18 (0.91 seconds after request)
Text received: , (0.94 seconds after request)
Text received: 19 (1.41 seconds after request)
Text received: , (1.41 seconds after request)
Text received: 20 (1.41 seconds after request)
Text received: , (1.41 seconds after request)
Text received: 21 (1.41 seconds after request)
Text received: , (1.41 seconds after request)
Text received: 22 (1.41 seconds after request)
Text received: , (1.41 seconds after request)
Text received: 23 (1.41 seconds after request)
Text received: , (1.41 seconds after request)
Text received: 24 (1.46 seconds after request)
Text received: , (1.46 seconds after request)
Text received: 25 (1.46 seconds after request)
Text received: , (1.55 seconds after request)
Text received: 26 (1.61 seconds after request)
Text received: , (1.65 seconds after request)
Text received: 27 (1.66 seconds after request)
Text received: , (1.70 seconds after request)
Text received: 28 (1.72 seconds after request)
Text received: , (1.75 seconds after request)
Text received: 29 (1.78 seconds after request)
Text received: , (2.05 seconds after request)
Text received: 30 (2.08 seconds after request)
Text received: , (2.13 seconds after request)
Text received: 31 (2.16 seconds after request)
Text received: , (2.20 seconds after request)
Text received: 32 (2.26 seconds after request)
Text received: , (2.28 seconds after request)
Text received: 33 (2.31 seconds after request)
Text received: , (2.35 seconds after request)
Text received: 34 (2.38 seconds after request)
Text received: , (2.54 seconds after request)
Text received: 35 (2.55 seconds after request)
Text received: , (2.59 seconds after request)
Text received: 36 (2.61 seconds after request)
Text received: , (2.64 seconds after request)
Text received: 37 (2.67 seconds after request)
Text received: , (2.71 seconds after request)
Text received: 38 (2.86 seconds after request)
Text received: , (2.89 seconds after request)
Text received: 39 (2.92 seconds after request)
Text received: , (2.95 seconds after request)
Text received: 40 (2.99 seconds after request)
Text received: , (3.01 seconds after request)
Text received: 41 (3.04 seconds after request)
Text received: , (3.08 seconds after request)
Text received: 42 (3.15 seconds after request)
Text received: , (3.33 seconds after request)
Text received: 43 (3.36 seconds after request)
Text received: , (3.43 seconds after request)
Text received: 44 (3.47 seconds after request)
Text received: , (3.50 seconds after request)
Text received: 45 (3.53 seconds after request)
Text received: , (3.56 seconds after request)
Text received: 46 (3.59 seconds after request)
Text received: , (3.63 seconds after request)
Text received: 47 (3.65 seconds after request)
Text received: , (3.68 seconds after request)
Text received: 48 (3.71 seconds after request)
Text received: , (3.77 seconds after request)
Text received: 49 (3.77 seconds after request)
Text received: , (3.79 seconds after request)
Text received: 50 (3.82 seconds after request)
Text received: , (3.85 seconds after request)
Text received: 51 (3.89 seconds after request)
Text received: , (3.91 seconds after request)
Text received: 52 (3.93 seconds after request)
Text received: , (3.96 seconds after request)
Text received: 53 (3.98 seconds after request)
Text received: , (4.04 seconds after request)
Text received: 54 (4.05 seconds after request)
Text received: , (4.07 seconds after request)
Text received: 55 (4.10 seconds after request)
Text received: , (4.13 seconds after request)
Text received: 56 (4.19 seconds after request)
Text received: , (4.20 seconds after request)
Text received: 57 (4.20 seconds after request)
Text received: , (4.23 seconds after request)
Text received: 58 (4.26 seconds after request)
Text received: , (4.30 seconds after request)
Text received: 59 (4.31 seconds after request)
Text received: , (4.59 seconds after request)
Text received: 60 (4.61 seconds after request)
Text received: , (4.64 seconds after request)
Text received: 61 (4.67 seconds after request)
Text received: , (4.72 seconds after request)
Text received: 62 (4.73 seconds after request)
Text received: , (4.76 seconds after request)
Text received: 63 (4.80 seconds after request)
Text received: , (4.83 seconds after request)
Text received: 64 (4.86 seconds after request)
Text received: , (4.89 seconds after request)
Text received: 65 (4.92 seconds after request)
Text received: , (4.94 seconds after request)
Text received: 66 (4.97 seconds after request)
Text received: , (5.00 seconds after request)
Text received: 67 (5.03 seconds after request)
Text received: , (5.06 seconds after request)
Text received: 68 (5.09 seconds after request)
Text received: , (5.14 seconds after request)
Text received: 69 (5.16 seconds after request)
Text received: , (5.19 seconds after request)
Text received: 70 (5.22 seconds after request)
Text received: , (5.28 seconds after request)
Text received: 71 (5.30 seconds after request)
Text received: , (5.33 seconds after request)
Text received: 72 (5.36 seconds after request)
Text received: , (5.38 seconds after request)
Text received: 73 (5.41 seconds after request)
Text received: , (5.44 seconds after request)
Text received: 74 (5.48 seconds after request)
Text received: , (5.51 seconds after request)
Text received: 75 (5.53 seconds after request)
Text received: , (5.56 seconds after request)
Text received: 76 (5.60 seconds after request)
Text received: , (5.62 seconds after request)
Text received: 77 (5.65 seconds after request)
Text received: , (5.68 seconds after request)
Text received: 78 (5.71 seconds after request)
Text received: , (5.77 seconds after request)
Text received: 79 (5.77 seconds after request)
Text received: , (5.79 seconds after request)
Text received: 80 (5.82 seconds after request)
Text received: , (5.85 seconds after request)
Text received: 81 (5.88 seconds after request)
Text received: , (5.92 seconds after request)
Text received: 82 (5.93 seconds after request)
Text received: , (5.97 seconds after request)
Text received: 83 (5.98 seconds after request)
Text received: , (6.01 seconds after request)
Text received: 84 (6.04 seconds after request)
Text received: , (6.07 seconds after request)
Text received: 85 (6.09 seconds after request)
Text received: , (6.11 seconds after request)
Text received: 86 (6.14 seconds after request)
Text received: , (6.17 seconds after request)
Text received: 87 (6.19 seconds after request)
Text received: , (6.22 seconds after request)
Text received: 88 (6.24 seconds after request)
Text received: , (6.27 seconds after request)
Text received: 89 (6.30 seconds after request)
Text received: , (6.31 seconds after request)
Text received: 90 (6.35 seconds after request)
Text received: , (6.36 seconds after request)
Text received: 91 (6.40 seconds after request)
Text received: , (6.44 seconds after request)
Text received: 92 (6.46 seconds after request)
Text received: , (6.49 seconds after request)
Text received: 93 (6.51 seconds after request)
Text received: , (6.54 seconds after request)
Text received: 94 (6.56 seconds after request)
Text received: , (6.59 seconds after request)
Text received: 95 (6.62 seconds after request)
Text received: , (6.64 seconds after request)
Text received: 96 (6.68 seconds after request)
Text received: , (6.68 seconds after request)
Text received: 97 (6.70 seconds after request)
Text received: , (6.73 seconds after request)
Text received: 98 (6.75 seconds after request)
Text received: , (6.78 seconds after request)
Text received: 99 (6.90 seconds after request)
Text received: , (6.92 seconds after request)
Text received: 100 (7.25 seconds after request)
Full response received 7.25 seconds after request
Full text received: 4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100
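Time comparison
In the example above, both requests took about 7 seconds to fully complete.
However, with the streaming request, you would have received the first token after 0.16 seconds, and each subsequent token about 0.035 seconds after the previous one.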
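The two figures in this comparison (time to first token and the typical gap between tokens) can be computed from recorded event delays. The sketch below is one way to do it; the function name is hypothetical, and the sample delays are the first five values from the streaming output shown earlier.

```python
from statistics import mean
from typing import Sequence, Tuple


def summarize_event_times(event_times: Sequence[float]) -> Tuple[float, float]:
    """Return (time to first event, mean gap between consecutive events)."""
    if len(event_times) < 2:
        raise ValueError("need at least two event timestamps")
    gaps = [later - earlier for earlier, later in zip(event_times, event_times[1:])]
    return event_times[0], mean(gaps)


if __name__ == "__main__":
    # first five delays from the streaming run above, in seconds after the request
    first, gap = summarize_event_times([0.16, 0.19, 0.21, 0.24, 0.27])
    print(f"first event after {first:.2f} s, then one about every {gap:.3f} s")
```

Over the whole run, the 193 events arrived between 0.16 and 7.25 seconds, so the average gap is (7.25 - 0.16) / 192, or about 0.037 seconds, in line with the rough figure quoted above.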