Deep Learning Model Training: TensorFlow Edition
- Drag in the deep learning component and build the following inside it: public dataset + custom network model.
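The canvas is shown in the figure. You can pick one of the provided public datasets, which has four outputs: out1 and out2 are the training set's data (train) and labels (label), and out3 and out4 are the test set's data and labels. Write the model code in the Python script editor; its output is a tf model. The model evaluation component outputs the loss and accuracy metrics.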
# The script MUST contain a function named run
# which is the entry point for this module.
# The entry point function can contain several input arguments:
# Param<in1>: a pandas.DataFrame
# Param<in2>: a pandas.DataFrame
def run(in1=None, in2=None):
    # import required packages
    import datetime
    import tensorflow as tf
    from suanpan.log import logger

    # in1 carries the training data, in2 the training labels
    train, label = in1, in2
    # train = train.reshape(len(label),28,28)
    # logger.info(train.shape)

    # A small fully connected classifier for 28x28 inputs and 10 classes
    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    model.fit(train, label, epochs=1)
    print(datetime.datetime.now())
    # The output of this script is the trained tf model
    return model
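To make the two numbers reported by the model evaluation component more concrete, here is a minimal pure-Python sketch of how sparse categorical cross-entropy loss and accuracy can be computed from softmax outputs and integer labels. The function name `evaluate_predictions` and the sample values are invented for illustration; this is not the platform's implementation.

```python
import math

def evaluate_predictions(probs, labels):
    """Return (loss, accuracy) for softmax outputs and integer class labels.

    probs:  list of per-sample probability lists.
    labels: list of integer class indices, one per sample.
    """
    eps = 1e-7  # guard against log(0)
    total_loss = 0.0
    correct = 0
    for p, y in zip(probs, labels):
        # Sparse categorical cross-entropy: -log of the true-class probability.
        total_loss += -math.log(max(p[y], eps))
        # A prediction counts as correct when the arg-max class equals the label.
        if max(range(len(p)), key=p.__getitem__) == y:
            correct += 1
    n = len(labels)
    return total_loss / n, correct / n

# Three samples, three classes; the last sample is misclassified.
loss, accuracy = evaluate_predictions(
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.4, 0.3]],
    [0, 1, 2],
)
```

The loss is averaged over samples, so a single confident mistake can raise it noticeably even when accuracy stays high.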
- Drag in the deep learning component and build the following inside it: your own uploaded dataset + custom network model.
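The canvas is shown in the figure. Custom datasets currently support zip files only. The archive is laid out as follows:
images.zip, where each folder name denotes a class.
Each class folder contains the following: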
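As a rough illustration of the archive layout described above, the sketch below indexes a zip file by class folder using only the standard library. It assumes the class folders sit at the top level of the archive; the helper name `index_dataset_zip` and the file names are invented, and the platform's own loader may work differently.

```python
import io
import zipfile
from collections import defaultdict

def index_dataset_zip(zip_source):
    """Map each top-level folder (class name) to the files it contains."""
    files_by_class = defaultdict(list)
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            parts = name.split("/")
            if len(parts) < 2:
                continue  # skip files that are not inside a class folder
            files_by_class[parts[0]].append(name)
    return {cls: sorted(paths) for cls, paths in sorted(files_by_class.items())}

# Build a tiny in-memory archive to demonstrate (file names are invented).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("cat/001.jpg", b"")
    archive.writestr("cat/002.jpg", b"")
    archive.writestr("dog/001.jpg", b"")
buffer.seek(0)

index = index_dataset_zip(buffer)
class_names = list(index)  # folder names in sorted order
```

The position of a folder name in `class_names` can then serve as the integer label for every image in that folder.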
A demo of the training code follows:
# The script MUST contain a function named run
# which is the entry point for this module.
# The entry point function can contain several input arguments:
# Param<in1>: a pandas.DataFrame
# Param<in2>: a pandas.DataFrame
def run(in1 = None, in2 = None):
    # import required packages (public Keras import paths)
    from keras.models import Sequential
    from keras.layers import (Activation, BatchNormalization, Conv2D, Dense,
                              Dropout, Flatten, MaxPooling2D)
    from keras.initializers import TruncatedNormal
    from keras.optimizers import SGD

    INIT_LR = 0.01
    inputShape = (64, 64, 3)

    # One convolution block followed by a fully connected classifier
    model = Sequential()
    model.add(Conv2D(32, (3, 3), padding="same", input_shape=inputShape))
    model.add(Activation("relu"))
    model.add(BatchNormalization(axis=-1))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Flatten())
    model.add(Dense(512, kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.01)))
    model.add(Activation("relu"))
    model.add(BatchNormalization())
    model.add(Dropout(0.6))
    # 10 output classes
    model.add(Dense(10, kernel_initializer=TruncatedNormal(mean=0.0, stddev=0.01)))
    model.add(Activation("softmax"))

    opt = SGD(learning_rate=INIT_LR, decay=INIT_LR / 10)
    model.compile(loss="categorical_crossentropy", optimizer=opt, metrics=["accuracy"])
    # in1 carries the images, in2 the labels
    model.fit(in1, in2, epochs=5, batch_size=1)
    model.summary()
    return model
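The demo compiles the model with `categorical_crossentropy` and ends in a 10-unit softmax, so the labels passed in as in2 would normally be one-hot vectors of length 10. Whether the platform performs this encoding is not stated here. The following is a minimal pure-Python sketch of the encoding, with a hypothetical helper name `one_hot`:

```python
def one_hot(label_indices, num_classes):
    """Turn integer class indices into one-hot rows of length num_classes."""
    rows = []
    for idx in label_indices:
        if not 0 <= idx < num_classes:
            raise ValueError(f"label {idx} is outside 0..{num_classes - 1}")
        row = [0.0] * num_classes
        row[idx] = 1.0  # mark the position of the true class
        rows.append(row)
    return rows

# Three samples belonging to classes 0, 2 and 1 out of 3 classes.
encoded = one_hot([0, 2, 1], num_classes=3)
```

In Keras itself, `keras.utils.to_categorical` does the same job on arrays.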
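- Drag in the deep learning component and build the following inside it: custom dataset + Retrain pre-trained model.
The component ships with several pre-trained deep models, and users can continue training on top of them.
The canvas is shown in the figure. In the Retrain pre-trained model component, the parameter settings let you choose the optimizer, epoch, batch size, and so on. You can also start training from a specific layer of the pre-trained model: just enter that layer's name in the "intermediate layer name" field. The default is the layer closest to the fully connected layer.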
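One common way to "start training from a given layer" is to freeze every layer before it and leave the rest trainable. Whether the Retrain component does exactly this is an assumption. The sketch below uses stand-in objects with `name` and `trainable` attributes (Keras layers expose the same two attributes); the helper name `freeze_before` and the layer names are invented.

```python
from types import SimpleNamespace

def freeze_before(layers, start_layer_name):
    """Freeze every layer that comes before start_layer_name.

    Works on any objects with `name` and `trainable` attributes.
    """
    names = [layer.name for layer in layers]
    if start_layer_name not in names:
        raise ValueError(f"unknown layer: {start_layer_name}")
    start = names.index(start_layer_name)
    for position, layer in enumerate(layers):
        # Layers from the named one onward stay trainable.
        layer.trainable = position >= start
    return layers

# Stand-in layers; the names are invented for the demonstration.
layers = [SimpleNamespace(name=n, trainable=True)
          for n in ("conv1", "conv2", "pool", "fc")]
freeze_before(layers, "pool")
trainable_flags = [(layer.name, layer.trainable) for layer in layers]
```

Choosing a layer nearer the input retrains more of the network, which generally needs more data and more epochs than retraining only the last layers.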
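- Upload a file through the CSV component and train the model in a Suanpan streaming workflow. Taking an LSTM prediction model as an example:
The components to drag in are shown in the figure above. First adapt the input port of the deep learning component: change the input type to CSV, and set the output type to CSV as well.
The internal structure of the deep learning component is shown in the figure: drag in the input data and write your own model in the Python script editor. The resulting model is saved to a global variable.
Drag in in2 (the prediction dataset) and read the global variable (the saved deep model); a second Python script editor then evaluates or predicts with the model and outputs the result.
Example code and data: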
list_to_csv.csv, list_to_csv1.csv
# Training script: builds and fits the LSTM, then returns the model
# The script MUST contain a function named run
# which is the entry point for this module.
# The entry point function can contain several input arguments:
# Param<in1>: a pandas.DataFrame with "data" and "label" columns
def run(in1 = None):
    # import required packages
    from keras.models import Sequential
    from keras.layers import LSTM, Dense, Dropout

    X, y = in1["data"], in1["label"]

    # Two stacked LSTM layers; each sample is one time step with one feature
    model = Sequential()
    model.add(LSTM(128, return_sequences=True, input_shape=(1,1), activation="tanh"))
    model.add(Dropout(0.01))
    model.add(LSTM(128, return_sequences=True, activation="tanh"))
    model.add(Dropout(0.01))
    model.add(Dense(1, input_dim=1))
    model.compile(loss="mse", optimizer="adam")
    model.fit(X, y)
    model.summary()
    return model
# Prediction script: takes the saved model and predicts on in2
# The script MUST contain a function named run
# which is the entry point for this module.
# The entry point function can contain several input arguments:
# Param<in1>: the trained model read from the global variable
# Param<in2>: a pandas.DataFrame with a "data" column
def run(in1 = None, in2 = None):
    # import required packages
    import numpy as np
    from pandas import DataFrame
    from suanpan.log import logger as log

    testdata = in2["data"]
    # testdata = np.array(in2["data"].tolist())
    # log.info(str(testdata))
    # testdata = testdata[:, np.newaxis, np.newaxis]
    model = in1
    Y_pred = model.predict(testdata)
    # Flatten the predictions to one value per input row
    dic = {"data": testdata.tolist(), "predict": Y_pred.reshape(len(Y_pred)).tolist()}
    result = DataFrame(dic)
    log.info(result)
    return result
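The LSTM in the training script is declared with `input_shape=(1,1)`, that is, one time step with one feature per sample. Because both LSTM layers use `return_sequences=True`, the prediction has shape (n, 1, 1), which is why the prediction script flattens it with `reshape`. A pure-Python sketch of these two shape conversions, using nested lists instead of NumPy arrays and hypothetical helper names:

```python
def to_lstm_input(values):
    """Wrap each scalar as [[v]], giving shape (n_samples, 1, 1)."""
    return [[[float(v)]] for v in values]

def flatten_predictions(predictions):
    """Collapse (n_samples, 1, 1) nested predictions to a flat list."""
    return [row[0][0] for row in predictions]

# Three scalar inputs become three samples of one time step and one feature.
inputs = to_lstm_input([1, 2, 3])
# Nested predictions of the same shape collapse to one value per sample.
flat = flatten_predictions([[[0.5]], [[1.5]], [[2.5]]])
```

With NumPy, the first conversion corresponds to the commented-out `testdata[:, np.newaxis, np.newaxis]` line in the prediction script.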
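- Besides CSV input ports, the deep learning component also supports string and json types. For string, the input port can be configured as "all types"; for json, the input port can be configured as "object".
Users can pass the OSS path of their own dataset through an input box or a json component. A concrete example is shown below:
The front panel is shown in the figure:
(Note: when the deep learning component is used together with other Suanpan components, such as buttons 4 and 5 above, you must click the deploy button after editing the deep learning graph, and only then operate the front panel.)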
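As a rough illustration of the string and json input ports described above, a script could normalize whatever arrives on the port into a single dataset path. Everything specific here is an assumption: the helper name `resolve_dataset_path`, the field name `dataset_path`, and the example bucket are placeholders, not names defined by the platform.

```python
import json

def resolve_dataset_path(port_value, key="dataset_path"):
    """Extract a dataset path from a string or json-style input.

    `key` is a hypothetical field name, not one defined by the platform.
    """
    if isinstance(port_value, dict):
        return port_value[key]  # json port delivered as an object
    text = port_value.strip()
    if text.startswith("{"):
        # A json object delivered as text.
        return json.loads(text)[key]
    return text  # a plain string is treated as the path itself

# The bucket and file names below are placeholders.
from_string = resolve_dataset_path(" oss://example-bucket/images.zip ")
from_json = resolve_dataset_path('{"dataset_path": "oss://example-bucket/images.zip"}')
from_object = resolve_dataset_path({"dataset_path": "oss://example-bucket/images.zip"})
```

Normalizing the input in one place keeps the rest of the training script the same regardless of which port type is configured.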