Build a Neural Network¶
In the article Recognition of MNIST Handwritten Digits, we have used "operator" in
oneflow.nn and "layer" in
oneflow.layers to build a LeNet neural network. Now we will use this simple neural network to introduce the core element for network construction in OneFlow: operator and layer.
LeNet is constructed by convolution layer, pooling layer and fully connected layer.
There are two types of elements in the above diagram, one is the computing units represented by boxes which including
layer such as
max_pool2d and etc. The other is the data represented by arrows. It corresponds to the following code:
def lenet(data, train=False): initializer = flow.truncated_normal(0.1) conv1 = flow.layers.conv2d( data, 32, 5, padding="SAME", activation=flow.nn.relu, name="conv1", kernel_initializer=initializer, ) pool1 = flow.nn.max_pool2d( conv1, ksize=2, strides=2, padding="SAME", name="pool1", data_format="NCHW" ) conv2 = flow.layers.conv2d( pool1, 64, 5, padding="SAME", activation=flow.nn.relu, name="conv2", kernel_initializer=initializer, ) pool2 = flow.nn.max_pool2d( conv2, ksize=2, strides=2, padding="SAME", name="pool2", data_format="NCHW" ) reshape = flow.reshape(pool2, [pool2.shape, -1]) hidden = flow.layers.dense( reshape, 512, activation=flow.nn.relu, kernel_initializer=initializer, name="dense1", ) if train: hidden = flow.nn.dropout(hidden, rate=0.5, name="dropout") return flow.layers.dense(hidden, 10, kernel_initializer=initializer, name="dense2")
Datais firstly used as input in
conv2dto participate in the convolution calculation and the result from
conv1is passed to
Operator and Layer¶
Operator is a common concept. It is the basic calculation unit in OneFlow.
nn.max_pool2d used in LeNet code are two kinds of operators.
layers.dense are not operator. They are layers which constructed by specific operators. The existence of layers makes it easier to build neural networks. For more details please refer to oneflow.layers API
By reading oneflow.layers source code, you can learn the details of building a layer of calculations from basic operators.
Data block in neural network¶
OneFlow's default mode is a static graph mechanism and the network is actually built and run separately. As a result, when defining the network, there is no real data in each variable, which means they are just placeholders. The computation of the real data occurs during the call of the job function.
When building the network by defining job function, we only describe the attributes and shapes(such as
dtype) of the nodes in network. There is no data in the node, we call the node as PlaceHolder, OneFlow can compile and infer according to these placeholders to get the computation graph.
The placeholders are usually called
Blob in OneFlow. There is a corresponding base class
BlobDef in OneFlow.
When we build network, we can print the information of
Blob. For example, we can print data
dtype as follow.
BlobDef class implements operator overloading which means
BlobDef supports math operators such as addition, subtraction, multiplication and division and so on.
Like '+' in the following code:
output = output + fc2_biases
output = flow.broadcast_add(output, fc2_biases)
Neural network builds by OneFlow require the operators or layers provided by OneFlow as computing units. The placeholder
Blob serves as input and output for the operators and layers. Also operator reloading helps simplify some of the statements.