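[AI Writes Code] Model-generated code: producing or completing code with a code generation model
1. Project introduction
In recent years, artificial intelligence has developed rapidly, and AI tools aimed at developers keep appearing. These tools in particular show the potential of AI to write code.
A code generation model has also been opened up recently, offering one-click code generation.
Let's explore the fun of automatically generated code together!
2. Start the AI coding journey
Since the model has not yet been released in the pip package, we need to pull the development code and install the latest development version.
Clone the latest repository and enter it to install the required packages. A zipped copy of the repository is provided here; simply unzip it to use it.
Preparation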
!unzip PaddleNLP
!pip install --upgrade paddlenlp -i https://mirror.baidu.com/pypi/simple
!pip uninstall -y paddlenlp
%cd PaddleNLP
!python setup.py install
%cd ../
!pip install regex
Restart the kernel
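After the restart, it can help to confirm that the freshly installed packages are visible to the interpreter. The following is a minimal sketch that uses only the standard library; `is_installed` is a hypothetical helper name, not part of PaddleNLP.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if a top-level package can be found on the import path."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    # The two packages installed in the preparation step above.
    for name in ("paddlenlp", "regex"):
        status = "found" if is_installed(name) else "missing"
        print(f"{name}: {status}")
```

If either package is reported as missing, rerunning the installation cell and restarting the kernel again is a reasonable first step.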
Parameter overview
Supported pretrained models (the default model is Salesforce/codegen-350M-mono):
Salesforce/codegen-350M-mono
Salesforce/codegen-2B-mono
Salesforce/codegen-6B-mono
Salesforce/codegen-350M-nl
Salesforce/codegen-2B-nl
Salesforce/codegen-6B-nl
Salesforce/codegen-350M-multi
Salesforce/codegen-2B-multi
Salesforce/codegen-6B-multi
Supports single and batch prediction
>>> from paddlenlp import Taskflow
>>> codegen = Taskflow("code_generation")
# Single input
>>> codegen("def hello_world():")
['\n print("Hello World")']
# Multiple inputs
>>> codegen(["Get the length of array", "def hello_world():"])
['\n n = len(a)\n\n #', '\n print("Hello World!")']
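As the calls above show, the result is a list with one continuation per prompt, and each continuation contains only the newly generated text. A minimal sketch of stitching prompts and continuations back together follows. `complete_prompts` and `fake_generate` are hypothetical names introduced for illustration; `fake_generate` stands in for the model so the sketch runs without PaddleNLP.

```python
from typing import Callable, List, Sequence, Union


def complete_prompts(
    generate: Callable[[List[str]], List[str]],
    prompts: Union[str, Sequence[str]],
) -> List[str]:
    """Join each prompt with the continuation generated for it."""
    # Accept a single string or a batch, mirroring the two call styles above.
    batch = [prompts] if isinstance(prompts, str) else list(prompts)
    completions = generate(batch)
    if len(completions) != len(batch):
        raise ValueError("expected one completion per prompt")
    return [prompt + completion for prompt, completion in zip(batch, completions)]


def fake_generate(batch: List[str]) -> List[str]:
    # Stand-in for the real model: always returns the same function body.
    return ['\n    print("Hello World")' for _ in batch]


if __name__ == "__main__":
    print(complete_prompts(fake_generate, "def hello_world():")[0])
```

Since the `codegen` object is called with a list of prompts in the example above, it could presumably be passed as `generate` in place of the stand-in.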
How to use different pretrained models
Take a look at the parameter description
class Taskflow(object):
    """
    The Taskflow is the end2end interface that could convert the raw text to model result, and decode the model result to task result. The main functions as follows:
        1) Convert the raw text to task result.
        2) Convert the model to the inference model.
        3) Offer the usage and help message.
    Args:
        task (str): The task name for the Taskflow, and get the task class from the name.
        model (str, optional): The model name in the task, if set None, will use the default model.
        mode (str, optional): Select the mode of the task, only used in the tasks of word_segmentation and ner.
            If set None, will use the default mode.
        device_id (int, optional): The device id for the gpu, xpu and other devices, the default value is 0.
        kwargs (dict, optional): Additional keyword arguments passed along to the specific task.
    """
    def __init__(self, task, model=None, mode=None, device_id=0, **kwargs):
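So to instantiate the task with a specific model, we only need to pass the name of the pretrained model:
codegen = Taskflow("code_generation", "<model name>", <other parameters>)
For example: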
codegen = Taskflow("code_generation", "Salesforce/codegen-6B-mono", min_length=1024)
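To avoid typos when switching models, the name can be assembled from a size and a variant. This is a minimal sketch that assumes every supported name follows the pattern of the example above (`Salesforce/codegen-<size>-<variant>`); `codegen_model_name`, `SIZES` and `VARIANTS` are hypothetical names, not part of PaddleNLP.

```python
# Sizes and variants as they appear in the supported model list.
SIZES = ("350M", "2B", "6B")
VARIANTS = ("mono", "nl", "multi")


def codegen_model_name(size: str, variant: str) -> str:
    """Build a model name such as 'Salesforce/codegen-6B-mono'."""
    if size not in SIZES:
        raise ValueError(f"unknown size {size!r}, expected one of {SIZES}")
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant {variant!r}, expected one of {VARIANTS}")
    return f"Salesforce/codegen-{size}-{variant}"


if __name__ == "__main__":
    for size in SIZES:
        for variant in VARIANTS:
            print(codegen_model_name(size, variant))
```

The resulting string would then be passed as the second argument to `Taskflow`, as in the example above.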
Start coding with AI
from paddlenlp import Taskflow
codegen = Taskflow("code_generation", min_length=256)
Let's start with something simple and generate a sum function. It looks great!
c1 = "Calculate the sum of two numbers"
p1 = codegen([c1])
print(c1)
for code in p1:
    print(code)
Calculate the sum of two numbers
.
"""
def add(self, num1: int, num2: int) -> int:
return num1 + num2
import os
from pathlib import Path
import sys
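The output above keeps going after the function is finished and trails off into unrelated imports. One possible post-processing step is to cut the continuation at the first stop sequence. The sketch below is a simple heuristic, not something PaddleNLP does for you as far as this article shows; `truncate_completion` and `DEFAULT_STOPS` are hypothetical names.

```python
from typing import Sequence

# Assumed stop markers: a new top-level import usually means the model
# has moved on to unrelated code.
DEFAULT_STOPS = ("\nimport ", "\nfrom ")


def truncate_completion(completion: str, stops: Sequence[str] = DEFAULT_STOPS) -> str:
    """Cut the completion at the earliest occurrence of any stop sequence."""
    cut = len(completion)
    for stop in stops:
        index = completion.find(stop)
        if index != -1:
            cut = min(cut, index)
    return completion[:cut]


if __name__ == "__main__":
    raw = (
        '.\n"""\ndef add(self, num1: int, num2: int) -> int:\n'
        "    return num1 + num2\nimport os\nfrom pathlib import Path\nimport sys"
    )
    print(truncate_completion(raw))
```

The right stop sequences depend on the prompt: when the prompt is a function header, adding `"\ndef "` and `"\nclass "` to `stops` would also drop any extra definitions that follow the body.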
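Next, let's raise the difficulty and have it continue code that we started
Taking the first problem as an example, we write part of the code and leave the rest to the AI.
Two Sum
Given an array of integers nums and an integer target, find the two integers in the array that add up to the target and return their array indices.
You may assume that each input has exactly one answer. However, the same element of the array may not be used twice in the answer.
You may return the answer in any order.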
c2 = "def twoSum(nums, target):\n    hashmap={}\n    for ind,num in enumerate(nums):\n        hashmap[num] = ind\n    for i,num in enumerate(nums):"
p2 = codegen([c2])
print(c2)
for code in p2:
    print(code)
def twoSum(nums, target):
    hashmap={}
    for ind,num in enumerate(nums):
        hashmap[num] = ind
    for i,num in enumerate(nums):
        if hashmap.get(target-num)!= None: return [i,hashmap.get(target-num)]
        else: return []
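Testing the result shows that the common test cases pass. The failures that remain occur because the AI does not know that the same element cannot be used twice; so the overall continuation logic looks sound!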
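For comparison, here is a minimal hand-written sketch of how the continuation could respect the "no element twice" rule. It keeps the two-pass hash map from the prompt and only adds an index check; `two_sum` is a name chosen for this sketch, and this is not model output.

```python
from typing import List


def two_sum(nums: List[int], target: int) -> List[int]:
    """Return indices of two distinct elements that add up to target."""
    # First pass: remember the last index at which each value occurs.
    hashmap = {}
    for ind, num in enumerate(nums):
        hashmap[num] = ind
    # Second pass: look up the complement, skipping the element itself.
    for i, num in enumerate(nums):
        j = hashmap.get(target - num)
        if j is not None and j != i:
            return [i, j]
    return []


if __name__ == "__main__":
    print(two_sum([2, 7, 11, 15], 9))
    print(two_sum([3, 2, 4], 6))
```

The second call is the kind of input that exposes the reuse problem: without the `j != i` check, the value 3 would be paired with itself.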
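Try again, this time writing code from a specification
This is again a test on a programming problem. The first step is to feed in the problem description and the test examples.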
from paddlenlp import Taskflow
codegen = Taskflow("code_generation")
c3 = """
# In a deck of cards, each card has an integer written on it.
# Return true if and only if you can choose X >= 2 such that it is possible to split the entire deck into 1 or more groups of cards, where:
# Each group has exactly X cards.
# All the cards in each group have the same integer.
#
# Example 1:
# Input: deck = [1,2,3,4,4,3,2,1]
# Output: true
# Explanation: Possible partition [1,1],[2,2],[3,3],[4,4].
# Example 2:
# Input: deck = [1,1,1,2,2,2,3,3]
# Output: false
# Explanation: No possible partition.
#
# Constraints:
# 1 <= deck.length <= 104
# 0 <= deck[i] < 104
def hasGroupsSizeX(self, deck: List[int]) -> bool:
"""
p3 = codegen(c3)
print(c3)
for code in p3:
    print(code)
# In a deck of cards, each card has an integer written on it.
# Return true if and only if you can choose X >= 2 such that it is possible to split the entire deck into 1 or more groups of cards, where:
# Each group has exactly X cards.
# All the cards in each group have the same integer.
#
# Example 1:
# Input: deck = [1,2,3,4,4,3,2,1]
# Output: true
# Explanation: Possible partition [1,1],[2,2],[3,3],[4,4].
# Example 2:
# Input: deck = [1,1,1,2,2,2,3,3]
# Output: false
# Explanation: No possible partition.
#
# Constraints:
# 1 <= deck.length <= 104
# 0 <= deck[i] < 104
def hasGroupsSizeX(self, deck: List[int]) -> bool:
    if len(deck) == 1: return True
    for i in range(len(deck)-1):
        curr_sum = self.getSum(deck[i],deck[-1]-1) + deck[i+1] - 1 # curr_sum = sum(deck[0:i])+sum(deck[-1:i+1])
#
Then append the output above to the input and generate the second half of the code
c4 = """
# In a deck of cards, each card has an integer written on it.
# Return true if and only if you can choose X >= 2 such that it is possible to split the entire deck into 1 or more groups of cards, where:
# Each group has exactly X cards.
# All the cards in each group have the same integer.
#
# Example 1:
# Input: deck = [1,2,3,4,4,3,2,1]
# Output: true
# Explanation: Possible partition [1,1],[2,2],[3,3],[4,4].
# Example 2:
# Input: deck = [1,1,1,2,2,2,3,3]
# Output: false
# Explanation: No possible partition.
#
# Constraints:
# 1 <= deck.length <= 104
# 0 <= deck[i] < 104
def hasGroupsSizeX(self, deck: List[int]) -> bool:
    if len(deck) == 1: return True
    for i in range(len(deck)-1):
        curr_sum = self.getSum(deck[i],deck[-1]-1) + deck[i+1] - 1
"""
p4 = codegen(c4)
print(c4)
for code in p4:
    print(code)
# In a deck of cards, each card has an integer written on it.
# Return true if and only if you can choose X >= 2 such that it is possible to split the entire deck into 1 or more groups of cards, where:
# Each group has exactly X cards.
# All the cards in each group have the same integer.
#
# Example 1:
# Input: deck = [1,2,3,4,4,3,2,1]
# Output: true
# Explanation: Possible partition [1,1],[2,2],[3,3],[4,4].
# Example 2:
# Input: deck = [1,1,1,2,2,2,3,3]
# Output: false
# Explanation: No possible partition.
#
# Constraints:
# 1 <= deck.length <= 104
# 0 <= deck[i] < 104
def hasGroupsSizeX(self, deck: List[int]) -> bool:
    if len(deck) == 1: return True
    for i in range(len(deck)-1):
        curr_sum = self.getSum(deck[i],deck[-1]-1) + deck[i+1] - 1
return False
"""
def hasGroupsSizeX(self, deck: List[int]) -> bool:
    sums = []
    for d in deck: sums.append(d)
    for j in range(len(sums)):
        for k in range(j+1,len(sums)):
            if sums[j]+sums[k] > sums[0]:
                continue
            elif sums[j]+sums[k]==sums[0]: return False
    return True
import os
from typing import
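The two-step procedure used here (generate, append the output to the input, generate again) can be written as a small loop. This is a minimal sketch under the assumption that the generator takes a list of prompts and returns one continuation per prompt, as in the calls above; `continue_in_rounds` and `make_scripted_generator` are hypothetical names, and the scripted generator only stands in for the model.

```python
from typing import Callable, Iterable, List


def continue_in_rounds(
    generate: Callable[[List[str]], List[str]],
    prompt: str,
    rounds: int = 2,
) -> str:
    """Feed the growing text back to the generator for a number of rounds."""
    text = prompt
    for _ in range(rounds):
        completion = generate([text])[0]
        if not completion:
            # Nothing new was produced, so further rounds would not help.
            break
        text += completion
    return text


def make_scripted_generator(chunks: Iterable[str]) -> Callable[[List[str]], List[str]]:
    # Stand-in for the model: returns the scripted chunks one per call.
    iterator = iter(chunks)

    def generate(batch: List[str]) -> List[str]:
        return [next(iterator, "") for _ in batch]

    return generate


if __name__ == "__main__":
    fake = make_scripted_generator(["\n    total = a + b", "\n    return total"])
    print(continue_in_rounds(fake, "def add(a, b):", rounds=2))
```

In practice the intermediate text is usually edited by hand between rounds, as was done above by dropping the trailing comment before the second call.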
Finally, testing the result shows that it also passes some of the sample tests.
As you can see, the generated code is still usable!
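When judging generated attempts, it helps to have a reference to compare against. The sketch below is a hand-written solution using a well-known approach to this problem: count how often each value occurs, then check whether the greatest common divisor of the counts is at least 2. It is not model output, and `has_groups_size_x` is a name chosen for this sketch.

```python
from collections import Counter
from functools import reduce
from math import gcd
from typing import List


def has_groups_size_x(deck: List[int]) -> bool:
    """Return True if the deck splits into equal groups of X >= 2 same-valued cards."""
    # A group size X works exactly when X divides every value's count,
    # so a valid X >= 2 exists if and only if the gcd of the counts is >= 2.
    counts = Counter(deck).values()
    return reduce(gcd, counts) >= 2


if __name__ == "__main__":
    # The two examples from the problem statement in the prompt.
    print(has_groups_size_x([1, 2, 3, 4, 4, 3, 2, 1]))
    print(has_groups_size_x([1, 1, 1, 2, 2, 2, 3, 3]))
```

Running a generated function and this reference on the same inputs gives a quick way to see which sample tests the generated version actually passes.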
Finally, let it write on its own
Pass in model code and comments and see what it can write!
c5 = """
class CodeGenerationTask(Task):
    '''
    The text generation model to predict the code.
    Args:
        task(string): The name of task.
        model(string): The model name in the task.
        kwargs (dict, optional): Additional keyword arguments passed along to the specific task.
    '''
    def __init__(self, task, model, **kwargs):
"""
p5 = codegen([c5])
print(c5)
for code in p5:
    print(code)
class CodeGenerationTask(Task):
    '''
    The text generation model to predict the code.
    Args:
        task(string): The name of task.
        model(string): The model name in the task.
        kwargs (dict, optional): Additional keyword arguments passed along to the specific task.
    '''
    def __init__(self, task, model, **kwargs):
        super().__init__('CodeGeneration', task, model, kwargs)
    @staticmethod
    def get_default_config():
        return {
            'task': 'code_generation',
            'model': 'bert-base-uncased',
            'kwargs': None,
        }
#
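Several of the outputs in this article stop mid-statement or run two statements together, so a quick syntax check before using generated code can save time. The following is a minimal sketch using the standard library `ast` module; `parses_as_python` is a hypothetical helper name. It only checks syntax and says nothing about whether the code is correct.

```python
import ast


def parses_as_python(source: str) -> bool:
    """Return True if the text is syntactically valid Python."""
    try:
        ast.parse(source)
    except (SyntaxError, ValueError):
        # SyntaxError covers malformed code; ValueError covers inputs
        # such as text containing null bytes on some Python versions.
        return False
    return True


if __name__ == "__main__":
    print(parses_as_python("def add(a, b):\n    return a + b\n"))
    print(parses_as_python("from typing import"))
```

A truncated tail such as `from typing import` fails this check, which is a hint to trim the output or generate again.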
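3. Summary
Generating code purely from a description is still somewhat weak, but the ability to continue existing code is quite good. The model is therefore better suited as a tool that writes code alongside a programmer. By repeatedly adjusting the input and fine-tuning, you can eventually get the code you want.
We can also use and develop it mainly as a code completion or code suggestion tool, and try building it into a VS Code or JetBrains (JB) plugin for code completion.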
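The core of such a completion plugin can be pictured as follows: take the text before the cursor as the prompt, generate a continuation, and splice it in at the cursor position. This is only a rough sketch of the idea and says nothing about the actual VS Code or JetBrains extension APIs; `complete_at_cursor` and `fake_generate` are hypothetical names, and `fake_generate` stands in for the model.

```python
from typing import Callable, List


def complete_at_cursor(
    generate: Callable[[List[str]], List[str]],
    buffer: str,
    cursor: int,
) -> str:
    """Insert a generated continuation at the cursor offset of an editor buffer."""
    if not 0 <= cursor <= len(buffer):
        raise ValueError("cursor is outside the buffer")
    # Only the text before the cursor is used as the prompt.
    prefix, suffix = buffer[:cursor], buffer[cursor:]
    completion = generate([prefix])[0]
    return prefix + completion + suffix


def fake_generate(batch: List[str]) -> List[str]:
    # Stand-in for the real model so the sketch runs on its own.
    return ["\n    return a + b" for _ in batch]


if __name__ == "__main__":
    text = "def add(a, b):\n\nprint(add(1, 2))"
    cursor = len("def add(a, b):")
    print(complete_at_cursor(fake_generate, text, cursor))
```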
Copyright: this article is original content from 8BDU软件分享博客-8BDU软件园. Please keep this link when reposting: /zhuceji/4381.html