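OpenAI | How to stream completions

When you use the OpenAI Completions endpoint, streaming delivers the start of a response sooner, which can make an application feel more responsive. This article walks through Python examples of receiving and handling a streamed completion, so that you can begin printing or otherwise processing the beginning of a completion before the whole completion has finished.

How to stream completions

By default, when you send a prompt to the OpenAI Completions endpoint, it computes the entire completion and sends it back in a single response.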

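If you are generating very long completions from a davinci-level model, waiting for the response can take many seconds. As of August 2022, responses from text-davinci-002 typically take about 1 second plus about 2 seconds per 100 completion tokens.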
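As a rough worked example of the rule of thumb above, the sketch below turns it into a small helper. The function name and its default parameters are illustrative and not part of the OpenAI API, and the constants are simply the August 2022 figures quoted above; real latency varies.

```python
def estimate_latency_seconds(completion_tokens, base_seconds=1.0, seconds_per_100_tokens=2.0):
    """Rough latency estimate: a fixed overhead plus a per-token cost."""
    return base_seconds + seconds_per_100_tokens * completion_tokens / 100


# 193 is the max_tokens value used in the examples later in this article
print(f"{estimate_latency_seconds(193):.2f} seconds")
```

For 193 tokens this heuristic gives just under 5 seconds, while the measured run later in this article took about 7 seconds, so treat it as an order-of-magnitude guide only.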

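If you want to get the response sooner, you can "stream" the completion as it is being generated. This lets you start printing or otherwise processing the beginning of the completion before the entire completion has finished.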

To stream completions, set stream=True when calling the Completions endpoint. The call then returns an object that streams the text back as data-only server-sent events.
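To make "data-only server-sent events" concrete, here is a minimal sketch of how such a stream could be decoded by hand. The openai Python library does this for you. The helper name parse_sse_lines and the sample payloads are made up for illustration, and the sketch assumes each event arrives as a single `data: <json>` line, with the stream ending in a `data: [DONE]` line.

```python
import json


def parse_sse_lines(lines):
    """Yield the decoded JSON payload of each data-only server-sent event."""
    for raw_line in lines:
        line = raw_line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data field
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream sentinel
        yield json.loads(payload)


# Hypothetical sample of what the raw stream might look like on the wire
sample_stream = [
    'data: {"choices": [{"text": "4", "index": 0}]}',
    "",
    'data: {"choices": [{"text": ",", "index": 0}]}',
    "",
    "data: [DONE]",
]

for event in parse_sse_lines(sample_stream):
    print(repr(event["choices"][0]["text"]))
```

Because the function is a generator, each event is available to the caller as soon as its line has been read, which is what makes incremental processing possible.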

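Downsides

Note that using stream=True in a production application makes it harder to moderate the content of the completions, which has implications for approved usage.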

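Another small drawback of streamed responses is that the response no longer includes the usage field that tells you how many tokens were consumed. After receiving and combining all of the events, you can calculate this yourself using tiktoken.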
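Since the streamed events carry no usage field, one way to recover the completion token count is to join the streamed text and tokenize it yourself. The sketch below is one assumed way to structure this: count_completion_tokens and tiktoken_encoder are hypothetical helper names, and the tokenizer is passed in as a plain callable so that the counting logic can be tried without tiktoken installed.

```python
def count_completion_tokens(event_texts, encode):
    """Join streamed text fragments and count tokens with the given encoder.

    `encode` is any callable that maps a string to a list of tokens.
    """
    full_text = "".join(event_texts)
    return full_text, len(encode(full_text))


def tiktoken_encoder(model="text-davinci-002"):
    """Return tiktoken's encode function for a model (requires `pip install tiktoken`)."""
    import tiktoken  # third-party; imported lazily so the rest of the sketch runs without it

    return tiktoken.encoding_for_model(model).encode


# Stand-in tokenizer for demonstration only: one "token" per character.
# Real token counts will differ; pass tiktoken_encoder() for an actual count.
text, n_tokens = count_completion_tokens(["4", ",", "5", ",", "6"], encode=list)
print(text, n_tokens)
```

This counts completion tokens only. The prompt would need to be tokenized separately to approximate the full usage figure.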

Example code

Below is a Python code example of how to receive a streamed completion.

# imports
import openai  # for OpenAI API calls
import time  # for measuring time savings

A typical completion request

With a typical Completions API call, the text is first computed and then returned all at once.

# Example of an OpenAI Completion request
# https://beta.openai.com/docs/api-reference/completions/create

# record the time before the request is sent
start_time = time.time()

# send a Completion request to count to 100
response = openai.Completion.create(
    model='text-davinci-002',
    prompt='1,2,3,',
    max_tokens=193,
    temperature=0,
)

# calculate the time it took to receive the response
response_time = time.time() - start_time

# extract the text from the response
completion_text = response['choices'][0]['text']

# print the time delay and text received
print(f"Full response received {response_time:.2f} seconds after request")
print(f"Full text received: {completion_text}")
Full response received 7.32 seconds after request
Full text received: 4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100

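A streaming completion request

With a streaming Completions API call, the text is sent back as a series of events. In Python, you can iterate over these events with a for loop.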

# Example of an OpenAI Completion request, using the stream=True option
# https://beta.openai.com/docs/api-reference/completions/create

# record the time before the request is sent
start_time = time.time()

# send a Completion request to count to 100
response = openai.Completion.create(
    model='text-davinci-002',
    prompt='1,2,3,',
    max_tokens=193,
    temperature=0,
    stream=True,  # this time, we set stream=True
)

# create variables to collect the stream of events
collected_events = []
completion_text = ''
# iterate through the stream of events
for event in response:
    event_time = time.time() - start_time  # calculate the time delay of the event
    collected_events.append(event)  # save the event response
    event_text = event['choices'][0]['text']  # extract the text
    completion_text += event_text  # append the text
    print(f"Text received: {event_text} ({event_time:.2f} seconds after request)")  # print the delay and text

# print the time delay and text received
print(f"Full response received {event_time:.2f} seconds after request")
print(f"Full text received: {completion_text}")
Text received: 4 (0.16 seconds after request)
Text received: , (0.19 seconds after request)
Text received: 5 (0.21 seconds after request)
Text received: , (0.24 seconds after request)
Text received: 6 (0.27 seconds after request)
Text received: , (0.29 seconds after request)
Text received: 7 (0.32 seconds after request)
Text received: , (0.35 seconds after request)
Text received: 8 (0.37 seconds after request)
Text received: , (0.40 seconds after request)
Text received: 9 (0.43 seconds after request)
Text received: , (0.46 seconds after request)
Text received: 10 (0.48 seconds after request)
Text received: , (0.51 seconds after request)
Text received: 11 (0.54 seconds after request)
Text received: , (0.56 seconds after request)
Text received: 12 (0.59 seconds after request)
Text received: , (0.62 seconds after request)
Text received: 13 (0.64 seconds after request)
Text received: , (0.67 seconds after request)
Text received: 14 (0.70 seconds after request)
Text received: , (0.72 seconds after request)
Text received: 15 (0.75 seconds after request)
Text received: , (0.78 seconds after request)
Text received: 16 (0.84 seconds after request)
Text received: , (0.84 seconds after request)
Text received: 17 (0.86 seconds after request)
Text received: , (0.89 seconds after request)
Text received: 18 (0.91 seconds after request)
Text received: , (0.94 seconds after request)
Text received: 19 (1.41 seconds after request)
Text received: , (1.41 seconds after request)
Text received: 20 (1.41 seconds after request)
Text received: , (1.41 seconds after request)
Text received: 21 (1.41 seconds after request)
Text received: , (1.41 seconds after request)
Text received: 22 (1.41 seconds after request)
Text received: , (1.41 seconds after request)
Text received: 23 (1.41 seconds after request)
Text received: , (1.41 seconds after request)
Text received: 24 (1.46 seconds after request)
Text received: , (1.46 seconds after request)
Text received: 25 (1.46 seconds after request)
Text received: , (1.55 seconds after request)
Text received: 26 (1.61 seconds after request)
Text received: , (1.65 seconds after request)
Text received: 27 (1.66 seconds after request)
Text received: , (1.70 seconds after request)
Text received: 28 (1.72 seconds after request)
Text received: , (1.75 seconds after request)
Text received: 29 (1.78 seconds after request)
Text received: , (2.05 seconds after request)
Text received: 30 (2.08 seconds after request)
Text received: , (2.13 seconds after request)
Text received: 31 (2.16 seconds after request)
Text received: , (2.20 seconds after request)
Text received: 32 (2.26 seconds after request)
Text received: , (2.28 seconds after request)
Text received: 33 (2.31 seconds after request)
Text received: , (2.35 seconds after request)
Text received: 34 (2.38 seconds after request)
Text received: , (2.54 seconds after request)
Text received: 35 (2.55 seconds after request)
Text received: , (2.59 seconds after request)
Text received: 36 (2.61 seconds after request)
Text received: , (2.64 seconds after request)
Text received: 37 (2.67 seconds after request)
Text received: , (2.71 seconds after request)
Text received: 38 (2.86 seconds after request)
Text received: , (2.89 seconds after request)
Text received: 39 (2.92 seconds after request)
Text received: , (2.95 seconds after request)
Text received: 40 (2.99 seconds after request)
Text received: , (3.01 seconds after request)
Text received: 41 (3.04 seconds after request)
Text received: , (3.08 seconds after request)
Text received: 42 (3.15 seconds after request)
Text received: , (3.33 seconds after request)
Text received: 43 (3.36 seconds after request)
Text received: , (3.43 seconds after request)
Text received: 44 (3.47 seconds after request)
Text received: , (3.50 seconds after request)
Text received: 45 (3.53 seconds after request)
Text received: , (3.56 seconds after request)
Text received: 46 (3.59 seconds after request)
Text received: , (3.63 seconds after request)
Text received: 47 (3.65 seconds after request)
Text received: , (3.68 seconds after request)
Text received: 48 (3.71 seconds after request)
Text received: , (3.77 seconds after request)
Text received: 49 (3.77 seconds after request)
Text received: , (3.79 seconds after request)
Text received: 50 (3.82 seconds after request)
Text received: , (3.85 seconds after request)
Text received: 51 (3.89 seconds after request)
Text received: , (3.91 seconds after request)
Text received: 52 (3.93 seconds after request)
Text received: , (3.96 seconds after request)
Text received: 53 (3.98 seconds after request)
Text received: , (4.04 seconds after request)
Text received: 54 (4.05 seconds after request)
Text received: , (4.07 seconds after request)
Text received: 55 (4.10 seconds after request)
Text received: , (4.13 seconds after request)
Text received: 56 (4.19 seconds after request)
Text received: , (4.20 seconds after request)
Text received: 57 (4.20 seconds after request)
Text received: , (4.23 seconds after request)
Text received: 58 (4.26 seconds after request)
Text received: , (4.30 seconds after request)
Text received: 59 (4.31 seconds after request)
Text received: , (4.59 seconds after request)
Text received: 60 (4.61 seconds after request)
Text received: , (4.64 seconds after request)
Text received: 61 (4.67 seconds after request)
Text received: , (4.72 seconds after request)
Text received: 62 (4.73 seconds after request)
Text received: , (4.76 seconds after request)
Text received: 63 (4.80 seconds after request)
Text received: , (4.83 seconds after request)
Text received: 64 (4.86 seconds after request)
Text received: , (4.89 seconds after request)
Text received: 65 (4.92 seconds after request)
Text received: , (4.94 seconds after request)
Text received: 66 (4.97 seconds after request)
Text received: , (5.00 seconds after request)
Text received: 67 (5.03 seconds after request)
Text received: , (5.06 seconds after request)
Text received: 68 (5.09 seconds after request)
Text received: , (5.14 seconds after request)
Text received: 69 (5.16 seconds after request)
Text received: , (5.19 seconds after request)
Text received: 70 (5.22 seconds after request)
Text received: , (5.28 seconds after request)
Text received: 71 (5.30 seconds after request)
Text received: , (5.33 seconds after request)
Text received: 72 (5.36 seconds after request)
Text received: , (5.38 seconds after request)
Text received: 73 (5.41 seconds after request)
Text received: , (5.44 seconds after request)
Text received: 74 (5.48 seconds after request)
Text received: , (5.51 seconds after request)
Text received: 75 (5.53 seconds after request)
Text received: , (5.56 seconds after request)
Text received: 76 (5.60 seconds after request)
Text received: , (5.62 seconds after request)
Text received: 77 (5.65 seconds after request)
Text received: , (5.68 seconds after request)
Text received: 78 (5.71 seconds after request)
Text received: , (5.77 seconds after request)
Text received: 79 (5.77 seconds after request)
Text received: , (5.79 seconds after request)
Text received: 80 (5.82 seconds after request)
Text received: , (5.85 seconds after request)
Text received: 81 (5.88 seconds after request)
Text received: , (5.92 seconds after request)
Text received: 82 (5.93 seconds after request)
Text received: , (5.97 seconds after request)
Text received: 83 (5.98 seconds after request)
Text received: , (6.01 seconds after request)
Text received: 84 (6.04 seconds after request)
Text received: , (6.07 seconds after request)
Text received: 85 (6.09 seconds after request)
Text received: , (6.11 seconds after request)
Text received: 86 (6.14 seconds after request)
Text received: , (6.17 seconds after request)
Text received: 87 (6.19 seconds after request)
Text received: , (6.22 seconds after request)
Text received: 88 (6.24 seconds after request)
Text received: , (6.27 seconds after request)
Text received: 89 (6.30 seconds after request)
Text received: , (6.31 seconds after request)
Text received: 90 (6.35 seconds after request)
Text received: , (6.36 seconds after request)
Text received: 91 (6.40 seconds after request)
Text received: , (6.44 seconds after request)
Text received: 92 (6.46 seconds after request)
Text received: , (6.49 seconds after request)
Text received: 93 (6.51 seconds after request)
Text received: , (6.54 seconds after request)
Text received: 94 (6.56 seconds after request)
Text received: , (6.59 seconds after request)
Text received: 95 (6.62 seconds after request)
Text received: , (6.64 seconds after request)
Text received: 96 (6.68 seconds after request)
Text received: , (6.68 seconds after request)
Text received: 97 (6.70 seconds after request)
Text received: , (6.73 seconds after request)
Text received: 98 (6.75 seconds after request)
Text received: , (6.78 seconds after request)
Text received: 99 (6.90 seconds after request)
Text received: , (6.92 seconds after request)
Text received: 100 (7.25 seconds after request)
Full response received 7.25 seconds after request
Full text received: 4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100
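If you want to reuse the loop above elsewhere, one option is to wrap it in a generator that yields each text fragment as soon as it arrives. This is a sketch rather than part of the original example: stream_text is a hypothetical helper, and fake_events stands in for the object returned by openai.Completion.create(..., stream=True) so that the snippet runs without an API key.

```python
import time


def stream_text(events):
    """Yield (text, seconds since iteration began) for each event in a stream."""
    start_time = time.time()
    for event in events:
        yield event["choices"][0]["text"], time.time() - start_time


def fake_events(fragments, delay=0.01):
    """Simulate a stream: events shaped like the ones above, arriving over time."""
    for fragment in fragments:
        time.sleep(delay)
        yield {"choices": [{"text": fragment}]}


completion_text = ""
for text, elapsed in stream_text(fake_events(["4", ",", "5", ",", "6"])):
    completion_text += text  # each fragment is usable as soon as it arrives
    print(f"Text received: {text} ({elapsed:.2f} seconds after start)")

print(f"Full text received: {completion_text}")
```

Note that the timer here starts when iteration begins, not when the request is sent, so with a real API call you would record the start time before the request, as the example above does.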

Time comparison

In the example above, both requests took about 7 seconds to fully complete.

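However, with the streaming request, you receive the first token after 0.16 seconds, and each subsequent token about 0.035 seconds after the previous one.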
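The two figures above can be derived from the recorded event times. The sketch below is illustrative: latency_stats is a hypothetical helper, and for brevity it is fed only the first ten timestamps from the streamed output above rather than all of them.

```python
def latency_stats(event_times):
    """Return (time to first event, mean gap between consecutive events) in seconds."""
    first = event_times[0]
    gaps = [later - earlier for earlier, later in zip(event_times, event_times[1:])]
    mean_gap = sum(gaps) / len(gaps) if gaps else 0.0
    return first, mean_gap


# First ten event times (seconds after request) from the streamed output above
event_times = [0.16, 0.19, 0.21, 0.24, 0.27, 0.29, 0.32, 0.35, 0.37, 0.40]
first, mean_gap = latency_stats(event_times)
print(f"First token after {first:.2f} s, then about {mean_gap:.3f} s per token")
```

Over this early window the mean gap is a little under 0.03 seconds. Across the whole run of 193 events, (7.25 - 0.16) / 192 gives about 0.037 seconds per event, in line with the rough figure quoted above.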

This article was originally published by openai.wiki (OpenAI开源维基百科). If you repost it, please credit the source: https://openai.wiki/how_to_stream_completions.html
