How exactly does tf.estimator.DNNClassifier calculate the output/logits?

I am new to machine learning and am experimenting with TensorFlow to understand some fundamental concepts.

I want to ask how exactly TensorFlow calculates the output/logits layer, especially in pre-made models like tf.estimator.DNNClassifier. When I inspect the logits value from the estimator and try to reproduce it manually with the usual Y = W*X + B plus ReLU activation, the result differs as soon as I increase the node size in hidden_units beyond 2.

This is my code:

import pandas as pd
import io
import tensorflow as tf 
import numpy as np
from sklearn.model_selection import train_test_split

# set Numpy print option
np.set_printoptions(suppress=True)
np.set_printoptions(threshold=np.inf)  # print full arrays

# prepare data ('uploaded' is the dict returned by a Colab-style file upload)
raw_data = pd.read_csv(io.StringIO(uploaded['dataset.csv'].decode('utf-8')), sep=';')
x_data = raw_data[['index0','index1','index2','index3','index4']]
y_label = raw_data['ClassType']
x_train, x_test, y_train, y_test = train_test_split(x_data, y_label, test_size=0.3, random_state=101)

# Print x_data and y_label head
x_data.head().as_matrix()
""" array([[0.19475628, 0.12746904, 0.04672323, 0.07501275, 0.07693333],
       [0.66347056, 0.70985919, 0.70651323, 0.67363084, 0.67045463],
       [0.54878968, 0.63825475, 0.63908431, 0.70264373, 0.71397793],
       [0.43528058, 0.40889643, 0.40793766, 0.29906271, 0.41115305],
       [0.44894346, 0.44471657, 0.42760782, 0.42890334, 0.43528058]])"""

y_label.head().as_matrix()
# array([1, 0, 0, 1, 0])

# Create Feature Columns
IDX0 = tf.feature_column.numeric_column("index0")
IDX1 = tf.feature_column.numeric_column("index1")
IDX2 = tf.feature_column.numeric_column("index2")
IDX3 = tf.feature_column.numeric_column("index3")
IDX4 = tf.feature_column.numeric_column("index4")
feat_cols = [IDX0,IDX1,IDX2,IDX3,IDX4]

# Declare Input Function
input_func = tf.estimator.inputs.pandas_input_fn(x=x_train,y=y_train,batch_size=100,num_epochs=100,shuffle=True)

# FIRST ATTEMPT, DNNClassifier with hidden_units=[2,2]
model = tf.estimator.DNNClassifier(hidden_units = [2,2], feature_columns=feat_cols,model_dir='MODEL',n_classes=2)

model.train(input_fn=input_func,steps=10000)

# predict a single row
x1 = x_data.iloc[[15]]
test = x1.as_matrix()
# array([[0.44360469, 0.44796062, 0.50455245, 0.50898942, 0.41743509]])
y1 = y_label.iloc[[15]]
# array([0])

predOneFN = tf.estimator.inputs.pandas_input_fn(x=x1,batch_size=1,num_epochs=1,shuffle=False)
predOne_gen = model.predict(predOneFN)
list(predOne_gen)
"""[{'class_ids': array([0]),
  'classes': array([b'0'], dtype=object),
  'logistic': array([0.4911849], dtype=float32),
  'logits': array([-0.03526414], dtype=float32),
  'probabilities': array([0.5088151, 0.4911849], dtype=float32)}]"""
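To make sense of these fields, I checked (a quick sketch, assuming the binary head works this way) that `logistic` is just the sigmoid of `logits` and `probabilities` is `[1 - logistic, logistic]`:

```python
import numpy as np

# 'logistic' should be sigmoid('logits'), and 'probabilities'
# should be [1 - logistic, logistic] for the 2-class case.
logits = np.float32(-0.03526414)          # value from the output above
logistic = 1.0 / (1.0 + np.exp(-logits))  # sigmoid
print(logistic, 1.0 - logistic)           # ~0.4911849, ~0.5088151
```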

After that, I try to manually calculate the logits/output using Y = (weight)*(input) + (bias) for each layer:

# fetch the trained variables from the checkpoint
a = model.get_variable_value('dnn/hiddenlayer_0/bias')
b = model.get_variable_value('dnn/hiddenlayer_0/kernel')
c = model.get_variable_value('dnn/hiddenlayer_1/bias')
d = model.get_variable_value('dnn/hiddenlayer_1/kernel')
k = model.get_variable_value('dnn/logits/bias')
l = model.get_variable_value('dnn/logits/kernel')

# replay the forward pass: hidden layer 0 = ReLU(x·W0 + b0)
mx = np.matmul(test, b)
mx = mx + a
mx = np.maximum(0, mx)   # ReLU

# hidden layer 1 = ReLU(h0·W1 + b1)
layer2 = np.matmul(mx, d)
layer2 = layer2 + c
layer2 = np.maximum(0, layer2)   # ReLU

# logits = h1·W_logits + b_logits (no activation)
out = np.matmul(layer2, l)
out = out + k

# out = array([[-0.03526414]])

This method (y = m*x + b) matches the logits output from predOne_gen EXACTLY with hidden_units=[2,2] (-0.03526414 vs -0.03526414).
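To avoid repeating the three blocks by hand, the same arithmetic can be wrapped in a small helper (`forward` is just my own name for it; it assumes the variables are passed as `(kernel, bias)` pairs like the ones fetched above):

```python
import numpy as np

# 'forward' replays the same matmul/ReLU steps as the blocks above,
# for any number of hidden layers.
def forward(x, hidden, logits_kernel, logits_bias):
    """hidden: list of (kernel, bias) pairs, one per hidden layer."""
    h = np.asarray(x)
    for kernel, bias in hidden:
        h = np.maximum(0, np.matmul(h, kernel) + bias)  # dense + ReLU
    return np.matmul(h, logits_kernel) + logits_bias    # linear logits

# With the variables fetched above this would be:
# out = forward(test, [(b, a), (d, c)], l, k)
```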

However, when I change the hidden layers to [3,3] and repeat the training and the single-value prediction, I get a different result:

list(predOne_gen)
"""[{'class_ids': array([0]),
  'classes': array([b'0'], dtype=object),
  'logistic': array([0.45599243], dtype=float32),
  'logits': array([-0.17648707], dtype=float32),
  'probabilities': array([0.5440076, 0.4559924], dtype=float32)}]"""

# same manual forward pass, with the retrained [3,3] variables
mx = np.matmul(test, b)
mx = mx + a
mx = np.maximum(0, mx)   # ReLU

layer2 = np.matmul(mx, d)
layer2 = layer2 + c
layer2 = np.maximum(0, layer2)   # ReLU

out = np.matmul(layer2, l)
out = out + k

# out = array([[-0.1764868]])

As you can see, the results are DIFFERENT (-0.17648707 vs -0.1764868). The difference is small here, but when I increase the data dimension it grows larger and larger.
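One guess I have (not verified): TensorFlow runs this graph in float32, while my NumPy replay accumulates in NumPy's default float64, so the drift could be pure rounding. Re-running the replay entirely in float32 would test that:

```python
import numpy as np

# Hypothetical replay of the forward pass entirely in float32,
# mirroring the dtype TensorFlow uses internally.
def forward32(x, hidden, logits_kernel, logits_bias):
    f32 = lambda v: np.asarray(v, dtype=np.float32)
    h = f32(x)
    for kernel, bias in hidden:
        h = np.maximum(np.float32(0), np.matmul(h, f32(kernel)) + f32(bias))
    return np.matmul(h, f32(logits_kernel)) + f32(logits_bias)

# out32 = forward32(test, [(b, a), (d, c)], l, k)
# If out32 matches the estimator's logits digit for digit, the
# difference was float64-vs-float32 rounding, not a different formula.
```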

So, how can I find the exact formula TensorFlow uses to calculate the logits/output?

Thanks for your help. NB: sorry for my bad English and rough ML code.
