# Language Models

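Model capability notice: this API calls Microsoft's GPT-3.5/GPT-4 model capabilities. Pay attention to information security when calling it. GPT-4 is comparatively expensive, so apply for it only when you need it.
Access application: you must apply for the 「调用微软相关模型」 (call Microsoft-related models) permission and also submit an approval request (Boss Hi 工作台 > 流程中心 > 大模型资源申请, that is, Workbench > Process Center > Large Model Resource Application). Model calls are possible only after the request has been approved.
Rate limits: to protect against your code calling the API repeatedly under abnormal conditions, each application receives by default 50 QPM and 5 million tokens per day for GPT-3.5, and 50 QPM and 1 million tokens per day for GPT-4. If your business needs a larger volume, adjust the quota when you submit the approval request.
Basic data security policy: to keep data safe, the platform applies the basic restrictions listed below. If you need stricter policies, apply them on your own server before calling this API.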

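(1) Personal user information: ID card numbers, passport numbers, bank card numbers, Alipay accounts, GPS information (handled by regular-expression matching)
(2) Personal contact details: mobile phone numbers, email addresses, and so on
(3) Company business information: 蓝领, 牛炸, 首善, 达成, 机构号, pv, and so on
(4) Company organizational structure information: 产品*组, 研发*组, 算法*组 (product, R&D, and algorithm team names)
(5) Other: for example, boss直聘 + 股票 (stock)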
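
Because the note above recommends applying any stricter policy on your own server before calling the API, a pre-screening step could look like the following minimal sketch. The `PATTERNS` table and the `redact` helper are hypothetical, and the regular expressions are illustrative assumptions, not the platform's actual rules.

```python
import re

# Illustrative patterns only (assumptions); the platform's real rules are not documented here.
PATTERNS = {
    "phone": re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)"),    # 11-digit mainland China mobile number
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "id_card": re.compile(r"(?<!\d)\d{17}[\dXx](?!\d)"),  # 18-character resident ID number
}


def redact(text: str) -> str:
    """Replace every match of the patterns above with a [LABEL] placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text


print(redact("call 13800138000 or mail a.b@example.com"))
```

Running the redaction before building the request keeps the sensitive values out of the payload entirely, rather than relying on the platform to intercept them.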

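# Request

Basics

| Item | Value |
| --- | --- |
| HTTP URL | https://hi-gw.weizhipin.com/open-apis/ai/open/api/send/message |
| HTTP Method | POST |
| Supported application types | Self-built applications |
| Required permissions (any one is sufficient) | 调用微软相关模型 (call Microsoft-related models) |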

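# Request headers

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| Authorization | string | Yes | The tenant_access_token. Format: "Bearer `access_token`". Example: "Bearer t-7f1bcd13fc57d46bac21793a18e560" |
| Content-Type | string | Yes | Fixed value: "application/json; charset=utf-8" |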
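
As a rough illustration of how the URL, method, and headers above fit together, here is a minimal sketch using only the Python standard library. The `build_request` helper and the `API_URL` constant name are hypothetical; the function only constructs the request and does not send it.

```python
import json
import urllib.request

API_URL = "https://hi-gw.weizhipin.com/open-apis/ai/open/api/send/message"


def build_request(tenant_access_token: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request with the two required headers."""
    return urllib.request.Request(
        API_URL,
        # Keep non-ASCII characters readable and encode the payload as UTF-8.
        data=json.dumps(body, ensure_ascii=False).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {tenant_access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


req = build_request(
    "t-7f1bcd13fc57d46bac21793a18e560",
    {"model": "2", "messages": [{"role": "user", "content": "hello"}]},
)
print(req.get_method(), req.full_url)
```

Sending it would then be a matter of passing the result to `urllib.request.urlopen` (ideally with a timeout) and decoding the JSON reply.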

# Request body

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | ID of the model to call; see the list of values below this table. |
| messages | list | Yes | A list of messages describing the conversation so far. Each message contains `role` (string, required): the role of the author of the message, one of `system`, `user`, or `assistant`; `content` (string, required): the contents of the message; `name` (string, optional): the name of the author of the message, which may contain a-z, A-Z, 0-9, and underscores, with a maximum length of 64 characters. |
| temperature | number | No | The sampling temperature to use, between 0 and 2. Higher values such as 0.8 make the output more random, while lower values such as 0.2 make it more focused and deterministic. We generally recommend altering this or `top_p`, but not both. |
| top_p | number | No | An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with `top_p` probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered. We generally recommend altering this or `temperature`, but not both. |
| n | int | No | How many chat completion choices to generate for each input message. |
| stream | boolean | No | If set, partial message deltas are sent, as in ChatGPT. Tokens are sent as data-only server-sent events as they become available, and the stream is terminated by a `data: [DONE]` message. |
| stop | string/array | No | Up to 4 sequences where the API will stop generating further tokens. |
| max_tokens | number | No | The maximum number of tokens allowed for the generated answer. By default, the number of tokens the model can return is (4096 - prompt tokens). Required when the model is Claude. |
| max_completion_tokens | number | No | The maximum number of tokens allowed for the generated answer. By default, the number of tokens the model can return is (4096 - prompt tokens). |
| presence_penalty | number | No | Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. |
| frequency_penalty | number | No | Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. |
| logit_bias | object | No | Modify the likelihood of specified tokens appearing in the completion. Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect varies per model, but values between -1 and 1 should decrease or increase the likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token. |
| user | number | No | A unique identifier representing your end user, which can help Azure OpenAI to monitor and detect abuse. |

Values for `model`:
- `2`: GPT-3.5
- `6`: GPT-4-turbo (Microsoft)
- `7`: GPT-4-turbo (OpenAI)
- `8`: GPT-4-8k
- `10`: GPT-4V (OpenAI; discontinued, the platform automatically forwards requests to `17`)
- `14`: gpt-4-0409 (OpenAI)
- `15`: claude-3-opus-20240229 (Claude)
- `16`: claude-3-sonnet-20240229 (Claude)
- `17`: gpt-4o (OpenAI); always points to the latest version, currently gpt-4o-2024-05-13; QPM and token quota shared with `7`
- `18`: gpt-4o (Microsoft); QPM and token quota shared with `4`, `5`, `6`
- `19`: gpt-4o-mini (OpenAI); QPM and token quota shared with `7`
- `21`: gpt-4o-mini-0718 (OpenAI); QPM and token quota shared with `7`
- `22`: gpt-4o-mini-2024-0718 (Microsoft); QPM and token quota shared with `4`, `5`, `6`
- `23`: chatgpt-4o-latest; QPM and token quota shared with `7`
- `24`: gpt-4o-2024-05-13 (OpenAI); QPM and token quota shared with `7`
- `25`: gpt-4o-2024-08-06 (OpenAI); QPM and token quota shared with `7`
- `40`: o1-mini (OpenAI)
- `41`: o1-mini-2024-09-12 (OpenAI); QPM and token quota shared with `40`
- `42`: o1-preview (OpenAI); QPM and token quota shared with `40`
- `43`: o1-preview-2024-09-12 (OpenAI); QPM and token quota shared with `40`
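
Several constraints in the request body parameters above (the three allowed roles, a temperature between 0 and 2, at most 4 stop sequences, `max_tokens` being mandatory for Claude) can be checked on the client before a request is sent. The sketch below is one way to do that; `build_body` and `CLAUDE_MODELS` are hypothetical names, and treating IDs `15` and `16` as the Claude models is read off the model list rather than confirmed by the platform.

```python
from typing import Optional, Sequence, Union

CLAUDE_MODELS = {"15", "16"}  # assumption: the two IDs labelled "claude" in the model list
ALLOWED_ROLES = {"system", "user", "assistant"}


def build_body(model: str, messages: list, *,
               temperature: Optional[float] = None,
               stop: Union[str, Sequence[str], None] = None,
               max_tokens: Optional[int] = None) -> dict:
    """Validate a few documented constraints and return the request body."""
    for message in messages:
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {message.get('role')!r}")
        if not isinstance(message.get("content"), str):
            raise ValueError("content must be a string")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    if stop is not None and not isinstance(stop, str) and len(stop) > 4:
        raise ValueError("at most 4 stop sequences are allowed")
    if model in CLAUDE_MODELS and max_tokens is None:
        raise ValueError("max_tokens is required for Claude models")

    body = {"model": model, "messages": messages}
    # Only include optional parameters that were actually provided.
    for key, value in (("temperature", temperature), ("stop", stop), ("max_tokens", max_tokens)):
        if value is not None:
            body[key] = value
    return body


print(build_body("2", [{"role": "user", "content": "hello"}], temperature=0.2))
```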

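# Request body examples

# OpenAI o1

Note: for o1, the `max_tokens` parameter (which caps the number of tokens in the answer) is replaced by `max_completion_tokens` (optional, int). OpenAI's official documentation states that `max_tokens` is deprecated, and o1 is the first model that officially does not support it. At present, `max_completion_tokens` works for all models on the OpenAI channel, while `max_tokens` still works for every model except o1. The Microsoft channel still uses `max_tokens` and does not support `max_completion_tokens`.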

{
  "messages": [
    {"role": "user", "content": "hello"}
  ],
  "max_completion_tokens":1024,
  "model": "25"
}
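
The rule in the note above can be reduced to a small helper: use `max_completion_tokens` for the o1 models and `max_tokens` everywhere else, which also covers the Microsoft channel. This is only a sketch; `with_token_limit` and `O1_MODELS` are hypothetical names, and the set of o1 IDs (`40` to `43`) is taken from the model list.

```python
O1_MODELS = {"40", "41", "42", "43"}  # IDs listed as o1-mini / o1-preview in the model list


def with_token_limit(body: dict, limit: int) -> dict:
    """Return a copy of the body with the answer-length cap under the right field name."""
    field = "max_completion_tokens" if body["model"] in O1_MODELS else "max_tokens"
    return {**body, field: limit}


messages = [{"role": "user", "content": "hello"}]
print(with_token_limit({"model": "40", "messages": messages}, 1024))
print(with_token_limit({"model": "18", "messages": messages}, 1024))
```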

# General request body

{
  "messages": [
    {"role": "system", "content": "我需要中文"},
    {"role": "user", "content": "@RestControllerAdvice 处理完成之后，怎么拦截最终的接口响应数据"}
  ],
  "model": "2"
}
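
If `stream` is set on a request like the one above, the parameter description says tokens arrive as data-only server-sent events ending with `data: [DONE]`. The following sketch shows one way such a stream could be consumed. It assumes the gateway relays OpenAI-style chunks unchanged, with text under `choices[].delta.content`; the source does not confirm this, and the gateway may wrap chunks differently. `iter_stream_content` is a hypothetical helper.

```python
import json
from typing import Iterable, Iterator


def iter_stream_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from data-only SSE lines until the [DONE] marker."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data field
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        # Assumed chunk shape: {"choices": [{"delta": {"content": "..."}}]}
        for choice in chunk.get("choices", []):
            piece = choice.get("delta", {}).get("content")
            if piece:
                yield piece


sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_content(sample)))
```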

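# Response

# Response body

| Name | Type | Description |
| --- | --- | --- |
| code | int | Error code; a non-zero value indicates failure |
| msg | string | Error description |
| data | data | - |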

# Response body example

{
  "code": 0,
  "msg": "",
  "gpt_data": {
    "id": "chatcmpl-7JgIKQA8RR6xHwfmrkCNypKK7LtIj",
    "object": "chat.completion",
    "created": 1684925168,
    "model": "gpt-35-turbo",
    "choices": [{
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "在使用`@RestControllerAdvice`处理完请求后，可以通过实现`ResponseBodyAdvice`接口对最终的接口响应数据进行拦截和处理。\n\n```java\n@RestControllerAdvice\npublic class CustomResponseAdvice implements ResponseBodyAdvice<Object> {\n\n    @Override\n    public boolean supports(MethodParameter returnType, Class<? extends HttpMessageConverter<?>> converterType) {\n        \/\/ 判断当前请求对应的返回类型是否需要拦截返回值\n        return true;\n    }\n\n    @Override\n    public Object beforeBodyWrite(Object body, MethodParameter returnType, MediaType selectedContentType,\n                                  Class<? extends HttpMessageConverter<?>> selectedConverterType,\n                                  ServerHttpRequest request, ServerHttpResponse response) {\n        \/\/ 对返回值进行处理\n        return body;\n    }\n}\n```\n\n在`supports`方法中可以根据`MethodParameter`和`HttpMessageConverter`等参数确定需要拦截的响应类型。在`beforeBodyWrite`方法中可以对响应数据进行处理，并返回最终的响应数据。\n\n需要注意的是，这里的拦截器仅能对响应体进行拦截和处理，对响应头信息的处理需要使用其他的方式。"
      }
    }],
    "usage": {
      "completion_tokens": 272,
      "prompt_tokens": 43,
      "total_tokens": 315
    },
    "forbbiden": { // code == 80008: interception details for sensitive information
      
    }
  },
  "trace_id": "45ed534a1bf5446b95d76c6fa5c40034"
}
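
A response shaped like the example above could be unpacked as in the sketch below. `extract_reply` is a hypothetical helper. It reads the payload from `gpt_data`, as in the example, and falls back to `data`, the name used in the response body table; which of the two the gateway actually returns is not confirmed here.

```python
def extract_reply(response: dict) -> str:
    """Return the assistant's text, or raise if the gateway reported a failure."""
    if response.get("code") != 0:
        # A non-zero code indicates failure; trace_id helps when reporting the problem.
        raise RuntimeError(
            f"code={response.get('code')} msg={response.get('msg')!r} "
            f"trace_id={response.get('trace_id')}"
        )
    payload = response.get("gpt_data") or response.get("data") or {}
    choices = payload.get("choices") or []
    if not choices:
        raise RuntimeError("response contains no choices")
    return choices[0]["message"]["content"]


example = {
    "code": 0,
    "msg": "",
    "gpt_data": {
        "choices": [
            {"index": 0, "finish_reason": "stop",
             "message": {"role": "assistant", "content": "hi there"}}
        ],
        "usage": {"completion_tokens": 2, "prompt_tokens": 1, "total_tokens": 3},
    },
    "trace_id": "45ed534a1bf5446b95d76c6fa5c40034",
}
print(extract_reply(example))
```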

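# forbbiden interception example

# forbbiden parsing rules

First-level fields
- person: personal information (contains the second-level fields word/type/position)
- contact: contact details (contains the second-level fields word/type/position)
- sensitive: sensitive words (contains the second-level fields word/position)
- hit_xxx: whether the corresponding detection capability was hit
Second- and third-level fields
- position
  - index: the position (subscript) of the hit content within its container; meaningful only when the submitted content is in a list structure, as with the chat API
  - start: start position of the hit content (inclusive)
  - end: end position of the hit content (exclusive)
- word: the matched term
- type: the type of the term
Any other unrecognized fields are used for investigating individual cases.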

    "forbbiden": {
        "hit_person": true,
        "hit_contact": true,
        "hit_sensitive": true,
        "person": [
            {
                "word": "17610677733",
                "type": "aliPayNo",
                "typeCode": 5,
                "position": {
                    "index": 1,
                    "start": 6,
                    "end": 16
                }
            }
        ],
        "contact": [
            {
                "word": "533224599",
                "type": "QQ",
                "position": {
                    "index": 1,
                    "start": 22,
                    "end": 31
                }
            },
            {
                "word": "17610677733",
                "type": "Phone",
                "position": {
                    "index": 1,
                    "start": 6,
                    "end": 17
                }
            }
        ],
        "sensitive": [
            {
                "word": "股票",
                "strategy": "公司状况-方案词-拦截",
                "tag": "公司经营-B",
                "risk": "NO_SUBMIT",
                "position": {
                    "index": 1,
                    "start": 43,
                    "end": 45
                }
            },
            {
                "word": "boss直聘",
                "strategy": "公司状况-方案词-拦截",
                "tag": "公司经营-A",
                "risk": "NO_SUBMIT",
                "position": {
                    "index": 1,
                    "start": 36,
                    "end": 42
                }
            }
        ]
    }
}
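
Given the parsing rules above (`start` inclusive, `end` exclusive), the hit positions could be used to mask the offending spans in the submitted messages, for example before logging them. The sketch below assumes that `index` is a 0-based index into the `messages` list and that `start`/`end` are character offsets into that message's `content`; neither is confirmed by the source. `mask_hits` is a hypothetical helper.

```python
import copy


def mask_hits(messages: list, forbbiden: dict) -> list:
    """Return a copy of messages with every reported hit replaced by asterisks."""
    masked = copy.deepcopy(messages)
    for category in ("person", "contact", "sensitive"):
        for item in forbbiden.get(category, []):
            pos = item["position"]
            content = masked[pos["index"]]["content"]
            start = pos["start"]
            end = min(pos["end"], len(content))  # end is exclusive; clamp defensively
            # Same-length replacement keeps the offsets of the other hits valid.
            masked[pos["index"]]["content"] = (
                content[:start] + "*" * max(end - start, 0) + content[end:]
            )
    return masked


messages = [
    {"role": "system", "content": "be brief"},
    {"role": "user", "content": "phone 13800138000 ok"},
]
report = {
    "hit_contact": True,
    "contact": [
        {"word": "13800138000", "type": "Phone",
         "position": {"index": 1, "start": 6, "end": 17}}
    ],
}
print(mask_hits(messages, report)[1]["content"])
```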
