Zheng Chu's Blog

Keep hope alive



Tensorflow2-highLevel

Posted on 2020-10-01 | Edited on 2020-12-06 | In Tensorflow
  • GAN training loop from scratch
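import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Discriminator: maps a 28x28x1 image to a single real/fake logit.
discriminator = keras.Sequential(
    [
        keras.Input(shape=(28, 28, 1)),
        layers.Conv2D(64, (3, 3), strides=(2, 2), padding="same"),
        layers.LeakyReLU(alpha=0.2),
        layers.Conv2D(128, (3, 3), strides=(2, 2), padding="same"),
        layers.LeakyReLU(alpha=0.2),
        layers.GlobalMaxPooling2D(),
        layers.Dense(1),
    ],
    name="discriminator",
)
discriminator.summary()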


latent_dim = 128

generator = keras.Sequential(
    [
        keras.Input(shape=(latent_dim,)),
        # Project to 7 * 7 * 128 coefficients so they can be reshaped into a 7x7x128 map
        layers.Dense(7 * 7 * 128),
        layers.LeakyReLU(alpha=0.2),
        layers.Reshape((7, 7, 128)),
        layers.Conv2DTranspose(128, (4, 4), strides=(2, 2), padding="same"),
        layers.LeakyReLU(alpha=0.2),
        layers.Conv2DTranspose(128, (4, 4), strides=(2, 2), padding="same"),
        layers.LeakyReLU(alpha=0.2),
        layers.Conv2D(1, (7, 7), padding="same", activation="sigmoid"),
    ],
    name="generator",
)


# Instantiate one optimizer for the discriminator and another for the generator.
d_optimizer = keras.optimizers.Adam(learning_rate=0.0003)
g_optimizer = keras.optimizers.Adam(learning_rate=0.0004)

# Instantiate a loss function.
loss_fn = keras.losses.BinaryCrossentropy(from_logits=True)


@tf.function
def train_step(real_images):
    # batch_size is a module-level global; it is set before the training loop runs.
    # Sample random points in the latent space
    random_latent_vectors = tf.random.normal(shape=(batch_size, latent_dim))
    # Decode them to fake images
    generated_images = generator(random_latent_vectors)
    # Combine them with real images
    combined_images = tf.concat([generated_images, real_images], axis=0)

    # Assemble labels discriminating real from fake images
    # (in this convention generated images are labeled 1 and real images 0)
    labels = tf.concat(
        [tf.ones((batch_size, 1)), tf.zeros((real_images.shape[0], 1))], axis=0
    )
    # Add random noise to the labels - important trick!
    labels += 0.05 * tf.random.uniform(labels.shape)

    # Train the discriminator
    with tf.GradientTape() as tape:
        predictions = discriminator(combined_images)
        d_loss = loss_fn(labels, predictions)
    grads = tape.gradient(d_loss, discriminator.trainable_weights)
    d_optimizer.apply_gradients(zip(grads, discriminator.trainable_weights))

    # Sample random points in the latent space
    random_latent_vectors = tf.random.normal(shape=(batch_size, latent_dim))
    # Assemble labels that say "all real images" (real = 0 in this convention)
    misleading_labels = tf.zeros((batch_size, 1))

    # Train the generator (note that we should *not* update the weights
    # of the discriminator)!
    with tf.GradientTape() as tape:
        predictions = discriminator(generator(random_latent_vectors))
        g_loss = loss_fn(misleading_labels, predictions)
    grads = tape.gradient(g_loss, generator.trainable_weights)
    g_optimizer.apply_gradients(zip(grads, generator.trainable_weights))
    return d_loss, g_loss, generated_images



import os

# Prepare the dataset. We use both the training & test MNIST digits.
batch_size = 64
(x_train, _), (x_test, _) = keras.datasets.mnist.load_data()
all_digits = np.concatenate([x_train, x_test])
all_digits = all_digits.astype("float32") / 255.0
all_digits = np.reshape(all_digits, (-1, 28, 28, 1))
dataset = tf.data.Dataset.from_tensor_slices(all_digits)
dataset = dataset.shuffle(buffer_size=1024).batch(batch_size)

epochs = 1  # In practice you need at least 20 epochs to generate nice digits.
save_dir = "./"

for epoch in range(epochs):
    print("\nStart epoch", epoch)

    for step, real_images in enumerate(dataset):
        # Train the discriminator & generator on one batch of real images.
        d_loss, g_loss, generated_images = train_step(real_images)

        # Logging.
        if step % 200 == 0:
            # Print metrics
            print("discriminator loss at step %d: %.2f" % (step, d_loss))
            print("adversarial loss at step %d: %.2f" % (step, g_loss))

            # Save one generated image
            img = tf.keras.preprocessing.image.array_to_img(
                generated_images[0] * 255.0, scale=False
            )
            img.save(os.path.join(save_dir, "generated_img" + str(step) + ".png"))

        # To limit execution time we stop after 10 steps.
        # Remove the lines below to actually train the model!
        if step > 10:
            break
  • REF: https://keras.io/guides/writing_a_training_loop_from_scratch/
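
The discriminator ends in Dense(1) with no activation, so it emits raw logits, which is why the loss is created with from_logits=True. As a rough, framework-free sketch of what such a loss computes over a batch (the function name bce_from_logits is made up for this illustration, and the Keras implementation may differ in details such as reduction options):

```python
import numpy as np


def bce_from_logits(labels, logits):
    """Mean binary cross-entropy computed directly from raw logits."""
    labels = np.asarray(labels, dtype=np.float64)
    logits = np.asarray(logits, dtype=np.float64)
    # Numerically stable form of
    # -[z * log(sigmoid(x)) + (1 - z) * log(1 - sigmoid(x))]
    per_example = (
        np.maximum(logits, 0.0)
        - logits * labels
        + np.log1p(np.exp(-np.abs(logits)))
    )
    return float(per_example.mean())


# A logit of 0 means "50/50", so the loss is log(2) whatever the label is.
loss_at_zero = bce_from_logits([[1.0], [0.0]], [[0.0], [0.0]])
# Confident and correct predictions give a loss close to 0.
loss_confident = bce_from_logits([[1.0], [0.0]], [[20.0], [-20.0]])
print(loss_at_zero, loss_confident)
```

Working on logits avoids taking the log of a sigmoid that has already saturated to 0 or 1, which is the usual reason for leaving the activation out of the last layer.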

keras.utils.plot_model

Save a diagram of the model:

model = tf.keras.Model(inputs=[input], outputs=[output])
dot_img_file = '/tmp/model_1.png'
tf.keras.utils.plot_model(model, to_file=dot_img_file, show_shapes=True)

keras.save

Call model.save() to save the entire model as a single file. You can later recreate the same model from this file, even if the code that built the model is no longer available.

This saved file includes the:

  • model architecture
  • model weight values (that were learned during training)
  • model training config, if any (as passed to compile)
  • optimizer and its state, if any (to restart training where you left off)
model.save("path_to_my_model")
del model
# Recreate the exact same model purely from the file:
model = keras.models.load_model("path_to_my_model")
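
A quick way to convince yourself that the restored model is the same one is to compare its outputs before and after the round trip. The sketch below uses a tiny throwaway model and a temporary path, both invented for this example; the ".keras" suffix is an assumption, since recent Keras releases expect an explicit file extension.

```python
import os
import tempfile

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Tiny throwaway model, only here so there is something to save.
inputs = keras.Input(shape=(4,))
outputs = layers.Dense(2)(inputs)
model = keras.Model(inputs, outputs)

x = np.ones((3, 4), dtype="float32")
before = np.array(model(x))

# Assumption: a ".keras" suffix (hypothetical file name in a temp directory).
path = os.path.join(tempfile.mkdtemp(), "tiny_model.keras")
model.save(path)
del model

# Recreate the model purely from the file and run the same input through it.
restored = keras.models.load_model(path)
after = np.array(restored(x))
print(np.allclose(before, after))
```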

keras.Model as layer

Example: an autoencoder

All models are callable, just like layers

You can treat any model as if it were a layer by invoking it on an Input or on the output of another layer. By calling a model you aren’t just reusing the architecture of the model, you’re also reusing its weights.

(When a Model is used as a layer, you reuse not only its architecture but also its weights.)

# model: encoder
encoder_input = keras.Input(shape=(28, 28, 1), name="original_img")
x = layers.Conv2D(16, 3, activation="relu")(encoder_input)
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.MaxPooling2D(3)(x)
x = layers.Conv2D(32, 3, activation="relu")(x)
x = layers.Conv2D(16, 3, activation="relu")(x)
encoder_output = layers.GlobalMaxPooling2D()(x)

encoder = keras.Model(encoder_input, encoder_output, name="encoder")
encoder.summary()

# model: decoder
decoder_input = keras.Input(shape=(16,), name="encoded_img")
x = layers.Reshape((4, 4, 1))(decoder_input)
x = layers.Conv2DTranspose(16, 3, activation="relu")(x)
x = layers.Conv2DTranspose(32, 3, activation="relu")(x)
x = layers.UpSampling2D(3)(x)
x = layers.Conv2DTranspose(16, 3, activation="relu")(x)
decoder_output = layers.Conv2DTranspose(1, 3, activation="relu")(x)

decoder = keras.Model(decoder_input, decoder_output, name="decoder")
decoder.summary()

# final model: autoencoder
autoencoder_input = keras.Input(shape=(28, 28, 1), name="img")
encoded_img = encoder(autoencoder_input)
decoded_img = decoder(encoded_img)
autoencoder = keras.Model(autoencoder_input, decoded_img, name="autoencoder")
autoencoder.summary()


Ensembling: combine the outputs of several models into a single prediction.

# Ensembling: average the predictions of three independently initialized models.

def get_model():
    inputs = keras.Input(shape=(128,))
    outputs = layers.Dense(1)(inputs)
    return keras.Model(inputs, outputs)


model1 = get_model()
model2 = get_model()
model3 = get_model()

inputs = keras.Input(shape=(128,))
y1 = model1(inputs)
y2 = model2(inputs)
y3 = model3(inputs)
outputs = layers.average([y1, y2, y3])
ensemble_model = keras.Model(inputs=inputs, outputs=outputs)

multiple input & output

The Sequential API cannot express models with multiple inputs or multiple outputs.
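
The functional API handles this case. Below is a minimal sketch with two inputs and two outputs; the input/output names and all layer sizes are illustrative choices for this example, not taken from the notes above.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Two separate inputs (names and sizes are made up for this sketch).
title_input = keras.Input(shape=(16,), name="title")
body_input = keras.Input(shape=(32,), name="body")

# Merge them, then branch into two separate output heads.
x = layers.concatenate([title_input, body_input])
x = layers.Dense(8, activation="relu")(x)
priority = layers.Dense(1, name="priority")(x)
department = layers.Dense(4, name="department")(x)

model = keras.Model(
    inputs=[title_input, body_input],
    outputs=[priority, department],
)

# Calling the model takes one array per input and returns one tensor per output.
priority_out, department_out = model(
    [np.zeros((2, 16), dtype="float32"), np.zeros((2, 32), dtype="float32")]
)
print(priority_out.shape, department_out.shape)
```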

Shared layers

# Embedding for 1000 unique words mapped to 128-dimensional vectors
shared_embedding = layers.Embedding(1000, 128)

# Variable-length sequence of integers
text_input_a = keras.Input(shape=(None,), dtype="int32")

# Variable-length sequence of integers
text_input_b = keras.Input(shape=(None,), dtype="int32")

# Reuse the same layer to encode both inputs
encoded_input_a = shared_embedding(text_input_a)
encoded_input_b = shared_embedding(text_input_b)
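
To check that the two branches really share parameters, one option is to wrap them in a model and look at its weights. This is a small sketch under that assumption; the token ids are arbitrary example values.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

shared_embedding = layers.Embedding(1000, 128)
text_input_a = keras.Input(shape=(None,), dtype="int32")
text_input_b = keras.Input(shape=(None,), dtype="int32")

model = keras.Model(
    [text_input_a, text_input_b],
    [shared_embedding(text_input_a), shared_embedding(text_input_b)],
)

# Arbitrary token ids, fed to both inputs.
token_ids = np.array([[1, 2, 3]], dtype="int32")
encoded_a, encoded_b = model([token_ids, token_ids])

# One Embedding layer means one weight matrix, however many inputs use it.
print(len(model.trainable_weights))
# The same ids therefore map to the same vectors on both branches.
print(np.allclose(np.array(encoded_a), np.array(encoded_b)))
```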

functional API

  • A functional model can be serialized or cloned
    • Because a functional model is a data structure rather than a piece of code, it is safely serializable and can be saved as a single file that allows you to recreate the exact same model without having access to any of the original code.
  • Functional API weakness
    • The functional API treats models as DAGs of layers. This is true for most deep learning architectures, but not all — for example, recursive networks or Tree RNNs do not follow this assumption and cannot be implemented in the functional API.

Functional API form:

inputs = keras.Input(shape=(32,))
x = layers.Dense(64, activation='relu')(inputs)
outputs = layers.Dense(10)(x)
mlp = keras.Model(inputs, outputs)

Equivalent subclassed form:

class MLP(keras.Model):

    def __init__(self, **kwargs):
        super(MLP, self).__init__(**kwargs)
        self.dense_1 = layers.Dense(64, activation='relu')
        self.dense_2 = layers.Dense(10)

    def call(self, inputs):
        x = self.dense_1(inputs)
        return self.dense_2(x)

# Instantiate the model.
mlp = MLP()
# Necessary to create the model's state.
# The model doesn't have a state until it's called at least once.
_ = mlp(tf.zeros((1, 32)))

Whether to compile before predict

  • https://stackoverflow.com/questions/58378374/why-does-keras-model-predict-slower-after-compile/58385156#58385156
  • https://github.com/tensorflow/tensorflow/issues/33340

Speed considerations

Whether to compile the model


def get_uncompiled_model():
    inputs = keras.Input(shape=(784,), name="digits")
    x = layers.Dense(64, activation="relu", name="dense_1")(inputs)
    x = layers.Dense(64, activation="relu", name="dense_2")(x)
    outputs = layers.Dense(10, activation="softmax", name="predictions")(x)
    model = keras.Model(inputs=inputs, outputs=outputs)
    return model


def get_compiled_model():
    model = get_uncompiled_model()
    model.compile(
        optimizer="rmsprop",
        loss="sparse_categorical_crossentropy",
        metrics=["sparse_categorical_accuracy"],
    )
    return model

Custom loss function: it can take hyperparameters

class CustomMSE(keras.losses.Loss):
    def __init__(self, regularization_factor=0.1, name="custom_mse"):
        super().__init__(name=name)
        # Hyperparameter: weight of the penalty that pulls predictions toward 0.5.
        self.regularization_factor = regularization_factor

    def call(self, y_true, y_pred):
        mse = tf.math.reduce_mean(tf.square(y_true - y_pred))
        reg = tf.math.reduce_mean(tf.square(0.5 - y_pred))
        return mse + reg * self.regularization_factor


model = get_uncompiled_model()
model.compile(optimizer=keras.optimizers.Adam(), loss=CustomMSE())

# Assumes x_train with shape (n, 784) and integer labels y_train are already defined.
y_train_one_hot = tf.one_hot(y_train, depth=10)
model.fit(x_train, y_train_one_hot, batch_size=64, epochs=1)
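
To see what the regularization_factor hyperparameter does, it can help to evaluate the same formula by hand on a couple of tiny inputs. The NumPy function below is a hypothetical mirror of CustomMSE.call written for this illustration; it is not part of Keras.

```python
import numpy as np


def custom_mse_numpy(y_true, y_pred, regularization_factor=0.1):
    """NumPy mirror of CustomMSE.call, for checking values by hand."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    mse = np.mean(np.square(y_true - y_pred))
    reg = np.mean(np.square(0.5 - y_pred))
    return float(mse + reg * regularization_factor)


# Predicting 0.5 everywhere: the penalty vanishes, only the MSE term (0.25) remains.
loss_uncertain = custom_mse_numpy([[0.0, 1.0]], [[0.5, 0.5]])
# A perfect, fully confident prediction: zero MSE, penalty of 0.1 * 0.25.
loss_confident = custom_mse_numpy([[0.0, 1.0]], [[0.0, 1.0]])
print(loss_uncertain, loss_confident)
```

The second case shows the design trade-off: even a perfect prediction pays a small price for being far from 0.5, so a larger regularization_factor discourages over-confident outputs.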

Custom metrics: subclass keras.metrics.Metric and implement four methods:

  • __init__(self), in which you will create state variables for your metric.
  • update_state(self, y_true, y_pred, sample_weight=None), which uses the targets y_true and the model predictions y_pred to update the state variables.
  • result(self), which uses the state variables to compute the final results.
  • reset_states(self), which reinitializes the state of the metric.

State update and results computation are kept separate (in update_state() and result(), respectively) because in some cases, results computation might be very expensive, and would only be done periodically.


class CategoricalTruePositives(keras.metrics.Metric):
    def __init__(self, name="categorical_true_positives", **kwargs):
        super(CategoricalTruePositives, self).__init__(name=name, **kwargs)
        self.true_positives = self.add_weight(name="ctp", initializer="zeros")

    def update_state(self, y_true, y_pred, sample_weight=None):
        y_pred = tf.reshape(tf.argmax(y_pred, axis=1), shape=(-1, 1))
        values = tf.cast(y_true, "int32") == tf.cast(y_pred, "int32")
        values = tf.cast(values, "float32")
        if sample_weight is not None:
            sample_weight = tf.cast(sample_weight, "float32")
            values = tf.multiply(values, sample_weight)
        self.true_positives.assign_add(tf.reduce_sum(values))

    def result(self):
        return self.true_positives

    def reset_states(self):
        # The state of the metric will be reset at the start of each epoch.
        self.true_positives.assign(0.0)


model = get_uncompiled_model()
model.compile(
    optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),
    loss=keras.losses.SparseCategoricalCrossentropy(),
    metrics=[CategoricalTruePositives()],
)
model.fit(x_train, y_train, batch_size=64, epochs=3)
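
The update_state / result / reset_states protocol is easier to follow without the Keras machinery around it. The class below is a plain-Python stand-in written only for this illustration (RunningTruePositives is a hypothetical name, and sample weights are left out); it accumulates state over batches in the same way.

```python
import numpy as np


class RunningTruePositives:
    """Framework-free stand-in for a stateful metric (hypothetical helper)."""

    def __init__(self):
        self.true_positives = 0.0  # state carried across batches

    def update_state(self, y_true, y_pred):
        # Cheap per-batch step: count predictions whose argmax matches the label.
        predicted = np.argmax(np.asarray(y_pred), axis=1)
        matches = predicted == np.asarray(y_true).reshape(-1)
        self.true_positives += float(np.sum(matches))

    def result(self):
        # Read the accumulated value; can be called only when it is needed.
        return self.true_positives

    def reset_states(self):
        self.true_positives = 0.0


metric = RunningTruePositives()
metric.update_state([1, 2], [[0.1, 0.8, 0.1], [0.7, 0.2, 0.1]])  # first batch
after_first = metric.result()
metric.update_state([0], [[0.9, 0.05, 0.05]])  # second batch
after_second = metric.result()
metric.reset_states()
after_reset = metric.result()
print(after_first, after_second, after_reset)
```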

Adding a loss with add_loss:


class ActivityRegularizationLayer(layers.Layer):
    def call(self, inputs):
        self.add_loss(tf.reduce_sum(inputs) * 0.1)
        return inputs  # Pass-through layer.


inputs = keras.Input(shape=(784,), name="digits")
x = layers.Dense(64, activation="relu", name="dense_1")(inputs)

# Insert activity regularization as a layer
x = ActivityRegularizationLayer()(x)

x = layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = layers.Dense(10, name="predictions")(x)

model = keras.Model(inputs=inputs, outputs=outputs)
model.compile(
optimizer=keras.optimizers.RMSprop(learning_rate=1e-3),
loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)

# The displayed loss will be much higher than before
# due to the regularization component.
model.fit(x_train, y_train, batch_size=64, epochs=1)

Loading mechanics

Input of shape [64, 100] (64 nodes, 100 features each), fanouts=[5, 10]:

[320, 100] --> [64, 5, 100]

[3200, 100] --> [64, 50, 100]
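
The shape bookkeeping above can be reproduced with plain reshapes. This is only a sketch: it assumes the sampler returns neighbors as a flat array in which each source node's neighbors are stored contiguously, and the variable names are invented for the example.

```python
import numpy as np

batch_size, feature_dim = 64, 100
fanouts = [5, 10]

# Assumption: flat [num_sampled_nodes, feature_dim] arrays, neighbors of each
# source node stored next to each other.
hop1_flat = np.arange(batch_size * fanouts[0] * feature_dim, dtype=np.float32)
hop1_flat = hop1_flat.reshape(batch_size * fanouts[0], feature_dim)  # [320, 100]
hop2_flat = np.zeros(
    (batch_size * fanouts[0] * fanouts[1], feature_dim), dtype=np.float32
)  # [3200, 100]

# Group the sampled neighbors back under the batch node they belong to.
hop1 = hop1_flat.reshape(batch_size, fanouts[0], feature_dim)  # [64, 5, 100]
hop2 = hop2_flat.reshape(
    batch_size, fanouts[0] * fanouts[1], feature_dim
)  # [64, 50, 100]

print(hop1.shape, hop2.shape)
# Rows 5..9 of the flat array are the 5 first-hop neighbors of batch node 1.
print(np.array_equal(hop1[1], hop1_flat[5:10]))
```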

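GAT model 1:

1. Use the data of 8-10 (2020-8-10) as the training set and 8-14 as the prediction set.

Threshold: orders whose predicted value exceeds the threshold are treated as fake orders (brushing).

With the incremental negative-review rate constrained below 4 per 10,000, the incremental negative-review rate is 2.844, the threshold is 0.45, and the increment is 31644;

2. Again use 8-10 as the training set, but filter the labels using another model's predictions: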

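# The other model's predicted probability is greater than 0.2 and the order is labeled as a fake order;
# The other model's predicted probability is less than 0.05 and the order is labeled as a genuine order;
[x._2.toDouble >= 0.2] --- join --- judged by the total count at each threshold;

With the incremental negative-review rate constrained below 4 per 10,000, the incremental negative-review rate is 1.570, the threshold is 0.55, and the increment is 25470;

9-14: incremental negative-review rate, 23000 orders; threshold 0.6;

What explains this result?

  • The data comes from a different day;

3. Number of orders after filter-metapath: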

[ads_9ncloud@zhengchu-gpu-qz0io 09-14-2020]$ wc -l filter_order_id.txt
372137 filter_order_id.txt
[ads_9ncloud@zhengchu-gpu-qz0io 09-14-2020]$ wc -l order_id.txt
2859795 order_id.txt

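Q1: In the user-behavior-result table (embeddingTable := "aoftest3.aof_user_behavior_result"), what do label and prob correspond to?

  • prob is the predicted probability that the order is a fake order; prob and label are linked through the upstream model

Q2: Which model produces the sku and shop features?

  • sku, shop: statistical features;
  • Are the sku features the same as the ones in the database?

Q3: Without using neighbor-node information, run a neural network directly on each node to get a predicted value, then set a threshold; orders above the threshold are treated as fake orders, and the negative reviews of those orders are not considered. Why?

Orders below 0.2 do not need to be identified;

1. On the heterogeneous order-sku-shop graph, compute similarity along the order-sku-order metapath:

  1. GAT model 1 on the Order-Sku-Order metapath inherits from a custom base class, so later models can inherit from it to tune the regularization, initialization, and activation of any parameter. Model 1 is trained on the 2020-8-10 data and produces CSV statistics on the prediction set at threshold intervals of 0.05;

  2. Currently, for the 2020-8-14 predictions, with the incremental negative-review rate constrained below 4 per 10,000, the incremental negative-review rate is 2.844, the threshold is 0.45, and the increment is 31644; for the 2020-9-14 predictions the incremental negative-review rate is 1.570, the threshold is 0.55, and the increment is 25470. User-behavior information could be used later for further filtering, which should raise the increment.

  3. Implemented GAT model 2, which concatenates features along a metapath of arbitrary length. It uses the statistical information on sku, shop, and other nodes, and could later be trained on several stacked metapaths, but the current F1 score when training on the Order-Sku-Order metapath is unsatisfactory; the data and the model still need to be verified.

  4. Next steps: train on several days of data and validate the increment results. Galileo has just been updated to version 1.0, so the existing models need to be renamed and the input file format adjusted, along with other debugging.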
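
The threshold sweep at 0.05 intervals mentioned in item 1 can be sketched as follows. The helper name and the scores are made up for illustration, and only the count of flagged orders is computed, because the notes do not define how the incremental negative-review rate is calculated.

```python
import numpy as np


def sweep_thresholds(probabilities, step=0.05):
    """Count orders flagged at each threshold (hypothetical helper)."""
    probabilities = np.asarray(probabilities, dtype=np.float64)
    num_steps = int(round(1.0 / step))
    rows = []
    for k in range(1, num_steps):
        threshold = round(k * step, 2)
        # An order is flagged when its predicted value exceeds the threshold.
        flagged = int(np.sum(probabilities > threshold))
        rows.append((threshold, flagged))
    return rows


# Made-up scores, only to show the shape of the output.
scores = [0.10, 0.50, 0.90]
table = dict(sweep_thresholds(scores))
print(table[0.45], table[0.55])
```

Each (threshold, count) row could then be written out with the csv module to get the kind of per-threshold statistics file the notes describe.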
