Questions tagged [recurrent-neural-network]

1 vote, 1 answer, 2.2k views

What is the principle of readout and teacher forcing?

These days I am studying RNNs and teacher forcing, but there is one point that I can't figure out: what is the principle of readout and teacher forcing? How can we feed the output (or ground truth) of an RNN from the previous time step back into the current time step, by using the output as fe...
slkingxr
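As a rough illustration of the idea behind this question (not from the thread itself): during training, teacher forcing feeds the ground-truth token from the previous step back into the RNN, while readout (free running) feeds back the model's own prediction. A minimal pure-Python sketch, with a toy `step` function standing in for the RNN cell; all names here are hypothetical:

```python
def step(state, token):
    # Toy "RNN cell": derives the next state and a prediction from the input.
    new_state = (state + token) % 10
    prediction = (new_state + 1) % 10
    return new_state, prediction

def decode(targets, teacher_forcing):
    state, token = 0, 0          # initial state and start token
    outputs = []
    for t in range(len(targets)):
        state, pred = step(state, token)
        outputs.append(pred)
        # Teacher forcing feeds the ground-truth target back in;
        # readout (free running) feeds the model's own prediction back in.
        token = targets[t] if teacher_forcing else pred
    return outputs

print(decode([3, 1, 4], teacher_forcing=True))
print(decode([3, 1, 4], teacher_forcing=False))
```

Teacher forcing is used to stabilize training; free-running readout is what happens at inference time, where no ground truth is available.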
0 votes, 0 answers, 3 views

Blas GEMM launch failed [Tensorflow-GPU]

I'm trying to train an LSTM network for a sequence-to-sequence task using tensorflow-gpu, with no other library like Keras on top of it. Whenever I begin the training process I get this annoying error: Blas GEMM launch failed : a.shape=(128, 532), b.shape=(532, 1024), m=128, n=1024, k=532 [[{{node rnn/while/...
deepak nandwani
1 vote, 0 answers, 10 views

Expected input to torch Embedding layer with pre_trained vectors from gensim

I would like to use pre-trained embeddings in my neural network architecture. The pre-trained embeddings are trained by gensim. I found this informative answer which indicates that we can load pre_trained models like so: import gensim from torch import nn model = gensim.models.KeyedVectors.load_word...
Bram Vanroy
1 vote, 0 answers, 8 views

Why is the MSE on the test set very low and not evolving (not increasing after increasing the number of epochs)?

I am working on a problem of predicting stock values using LSTMs. My work is based on the following project. I use a data set (a time series of stock prices) of total length 12075 that I split into train and test sets (almost 10%). It is the same one used in the linked project. train_data.shape (11000,) te...
Othmane
1 vote, 2 answers, 1.2k views

How to generate sequence using LSTM?

I want to generate a sequence when a particular input is activated: an odd or an even sequence, according to which input neuron is activated. I am trying to create a model using an LSTM because it can remember short-term order. I tried it this way: import numpy as np from keras.mo...
Eka
0 votes, 0 answers, 4 views

How to make a trainable initial state for a RNN in Tensorflow?

I am going to write a bi-RNN myself, but I have run into a problem: I don't know how to make a trainable initial state. Part of my code follows. # Inputs self.input_X = tf.placeholder(tf.float32, [batch_size, None, embedding_size]) self.input_Y = tf.placeholder(tf.float32, [batch_size, classes...
Jack Wang
9 votes, 1 answer, 1.5k views

String Matching Using Recurrent Neural Networks

I have recently started exploring Recurrent Neural Networks. So far I have trained a character-level language model in TensorFlow following Andrej Karpathy's blog, and it works great. However, I couldn't find any study on using RNNs for string matching or keyword spotting. For one of my projects I require OCR of...
Fahad Sarfraz
1 vote, 1 answer, 402 views

Multiple RNN in tensorflow

I'm trying to use a 2-layer deep RNN without MultiRNNCell in TensorFlow, i.e. using the output of the first layer as the input of the second layer: cell1 = tf.contrib.rnn.LSTMCell(num_filters, state_is_tuple=True) rnn_outputs1, _ = tf.nn.dynamic_rnn(cell1, inputs, dtype = tf.float32) cell2 = tf.contrib.rn...
RobinHood
2 votes, 0 answers, 151 views

I would like to have an example of using Tensorflow ConvLSTMCell

I would like to have a small example of building an encoder-decoder network using Tensorflow ConvLSTMCell. Thanks
3 votes, 2 answers, 35 views

Is there a standard approach to counting repetitions in an oscillating signal?

I am collecting sensor data from a repetitive physical process (think of an elevator moving up and down). This is an example of what the signal looks like. The y-axis reflects our equivalent of 'height' and the x-axis is just time. Perhaps not surprisingly, this particular image reflects 5 repetitions...
migsvult
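One common non-neural baseline for this kind of question is threshold crossing with hysteresis: count a repetition each time the signal rises above an upper threshold after having fallen below a lower one, which keeps noise near a single threshold from producing spurious counts. A small sketch (the thresholds and trace below are made up):

```python
def count_repetitions(signal, low, high):
    """Count full cycles with hysteresis: a repetition is counted each time
    the signal rises above `high` after having dropped below `low`."""
    count, armed = 0, True   # armed: ready to count the next rise
    for x in signal:
        if armed and x > high:
            count += 1
            armed = False
        elif not armed and x < low:
            armed = True
    return count

# A noisy "elevator" trace with 3 full up-down cycles:
trace = [0, 1, 5, 9, 8, 9, 2, 0, 6, 9, 1, 0, 7, 9, 3]
print(count_repetitions(trace, low=2, high=7))  # 3
```

The same idea generalizes to autocorrelation or peak-detection approaches when the amplitude varies between repetitions.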
17 votes, 3 answers, 10.4k views

What's the difference between a bidirectional LSTM and an LSTM?

Can someone please explain this? I know bidirectional LSTMs have a forward and backward pass but what is the advantage of this over a unidirectional LSTM? What is each of them better suited for?
shekit
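A toy sketch of the difference (with a hypothetical `cell`, not a real LSTM): a unidirectional pass only sees the past at each step, while a bidirectional network runs a second pass right-to-left and concatenates the two, so each per-step output can depend on the whole sequence. That makes bidirectional LSTMs better suited to tagging, speech, and other offline tasks, and unidirectional ones necessary for online prediction where the future is unavailable.

```python
def run(cell, xs):
    # Unidirectional pass: each output only depends on inputs seen so far.
    state, outs = 0, []
    for x in xs:
        state = cell(state, x)
        outs.append(state)
    return outs

def bidirectional(cell, xs):
    fwd = run(cell, xs)                  # left-to-right pass
    bwd = run(cell, xs[::-1])[::-1]      # right-to-left pass, re-aligned
    # Concatenated per step: every output now sees the whole sequence.
    return list(zip(fwd, bwd))

toy_cell = lambda state, x: state + x    # running sum as a toy "hidden state"
print(bidirectional(toy_cell, [1, 2, 3]))
```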
18 votes, 3 answers, 8.9k views

Time Series Prediction via Neural Networks

I have been working on Neural Networks for various purposes lately. I have had great success in digit recognition, XOR, and various other easy/hello world'ish applications. I would like to tackle the domain of time series estimation. I do not have a University account at the moment to read all the I...
digitalfoo
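A common first step for time-series estimation with any network, recurrent or not, is to frame the series as supervised learning using a sliding window of lagged values. A minimal sketch (the window size here is arbitrary):

```python
def make_windows(series, n_lags):
    """Frame a series as supervised learning: n_lags past values -> next value."""
    X, y = [], []
    for i in range(len(series) - n_lags):
        X.append(series[i:i + n_lags])   # inputs: the last n_lags observations
        y.append(series[i + n_lags])     # target: the value that follows them
    return X, y

X, y = make_windows([10, 20, 30, 40, 50], n_lags=2)
print(X)  # [[10, 20], [20, 30], [30, 40]]
print(y)  # [30, 40, 50]
```

For an RNN the same windows would be fed as sequences; for a plain feed-forward network each window is just a fixed-size input vector.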
2 votes, 1 answer, 42 views

Should the Gradients for the Output Layer of an RNN Be Clipped?

I am currently training an LSTM RNN for time-series forecasting. I understand that it is common practice to clip the gradients of the RNN when they cross a certain threshold. However, I am not completely clear on whether or not this includes the output layer. If we call the hidden layer of an RNN h...
Rehaan Ahmad
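For context, the usual scheme (what `tf.clip_by_global_norm` and `torch.nn.utils.clip_grad_norm_` implement) clips by the global norm taken over *all* trainable parameters, output layer included, so the direction of the update is preserved while its size is bounded. A pure-Python sketch of that rescaling:

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Rescale all gradients jointly so their global L2 norm is <= max_norm."""
    global_norm = math.sqrt(sum(g * g for g in grads))
    if global_norm <= max_norm:
        return grads                      # small enough: leave untouched
    scale = max_norm / global_norm
    return [g * scale for g in grads]     # same direction, bounded length

# Clipping the whole gradient list at once keeps the relative proportions:
print(clip_by_global_norm([3.0, 4.0], max_norm=1.0))
```

Clipping only the recurrent weights and leaving the output layer unclipped would change the update direction, which is why per-layer clipping is less common.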
2 votes, 1 answer, 1.1k views

Recurrent Neural network classification [closed]

I want to use the R recurrent neural network package rnn to classify the polarity of aspect and sentiment pairs. For example, the inputs are the pre-trained embeddings of the words "speed" and "fast", and I expect to get a class label for this pair from RNN classification. Could you give me some instruction about using the...
An Zhao
3 votes, 0 answers, 275 views

Initializing Decoder States in Sequence To Sequence Models

I'm writing my first neural machine translator in tensorflow. I am using an encoder/decoder mechanism with attention. My encoder and decoder are LSTM stacks with residual connections, but the encoder has an initial bidirectional layer; the decoder does not. It is common practice in the code that I h...
AlexDelPiero
2 votes, 1 answer, 3.2k views

How to get the output shape of a layer in Keras?

I have the following code in Keras (Basically I am modifying this code for my use) and I get this error: 'ValueError: Error when checking target: expected conv3d_3 to have 5 dimensions, but got array with shape (10, 4096)' Code: from keras.models import Sequential from keras.layers.convolutional imp...
3 votes, 1 answer, 721 views

Multidimensional RNN on Tensorflow

I'm trying to implement a 2D RNN in the context of human action classification (joints on one axis of the RNN and time on the other) and have been searching high and low for something in Tensorflow that could do the job. I heard of GridLSTMCell (internally and externally contributed) but couldn't g...
mercurial
4 votes, 3 answers, 4.1k views

How to determine maximum batch size for a seq2seq tensorflow RNN training model

Currently, I am using the default 64 as the batch size for the seq2seq tensorflow model. What are the maximum batch size, layer size, etc. I can use with a single Titan X GPU with 12 GB RAM and a Haswell-E Xeon with 128 GB RAM? The input data is converted to embeddings. Following are some helpful parameters I...
stackit
2 votes, 0 answers, 1.2k views

Python: Identify patterns in time series based data

I have a dataset of following format: 11/26/2015 14:00:00 N234 11004066 27.05 11/26/2015 14:01:00 N234 11004066 27.07 11/26/2015 14:02:00 N234 11004066 27.04 11/26/2015 14:03:00 N234 11004066 27.02 11/26/2015 14:04:00 N234 11004066 27.00 11/26/2015 14:05:00 N234 1100...
Nisarg Shastri
3 votes, 0 answers, 196 views

Tensorflow, encode variable length sentences by RNN, without padding

I am dealing with a sentence modeling problem where I have variable-length sentences as input. I want to encode sentences with an RNN (e.g. LSTM or GRU). All examples that I found use some sort of padding or bucketing for encoding the sentences, to make sure all sentences in the batch are of the same le...
CentAu
6 votes, 2 answers, 10.5k views

ValueError: Input 0 is incompatible with layer lstm_13: expected ndim=3, found ndim=4

I am trying for multi-class classification and here are the details of my training input and output: train_input.shape= (1, 95000, 360) (95000 length input array with each element being an array of 360 length) train_output.shape = (1, 95000, 22) (22 Classes are there) model = Sequential() model.add(...
Urja Pawar
2 votes, 1 answer, 2.9k views

What is the structure of ATIS (Airline Travel Information System) dataset

When I use the ATIS (Airline Travel Information System) dataset (http://lisaweb.iro.umontreal.ca/transfert/lisa/users/mesnilgr/atis/) for research on recurrent neural networks, I am confused by its structure. For example, after using data = pickle.load(open("./dataset/atis.fold0.pkl", "rb"),encoding...
Nils Cao
19 votes, 1 answer, 11.3k views

Soft attention vs. hard attention

In this blog post, The Unreasonable Effectiveness of Recurrent Neural Networks, Andrej Karpathy mentions future directions for neural networks based machine learning: The concept of attention is the most interesting recent architectural innovation in neural networks. [...] soft attention scheme for...
dimid
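A back-of-the-envelope sketch of the distinction Karpathy draws (the scores and values below are toy numbers, nothing from the post itself): soft attention mixes all values with differentiable softmax weights, while hard attention commits to a single one (argmax here; in practice it is sampled, which is why hard attention needs reinforcement-style training rather than plain backpropagation):

```python
import math

def soft_attention(scores, values):
    """Soft attention: a differentiable softmax-weighted average of all values."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return sum(w * v for w, v in zip(weights, values))

def hard_attention(scores, values):
    """Hard attention: select exactly one value (argmax stand-in for sampling)."""
    return values[max(range(len(scores)), key=lambda i: scores[i])]

scores, values = [0.1, 2.0, 0.3], [10.0, 20.0, 30.0]
print(hard_attention(scores, values))   # 20.0
print(soft_attention(scores, values))   # a blend of all three values
```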
5 votes, 6 answers, 892 views

Why do we take the derivative of the transfer function in calculating back propagation algorithm?

What is the concept behind taking the derivative? It's interesting that to somehow teach a system, we have to adjust its weights. But why are we doing this using the derivative of the transfer function? What is it about the derivative that helps us? I know the derivative is the slope of a continuous function at...
auryndb
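The short answer is the chain rule: the derivative of the transfer function tells us how sensitive the unit's output is to its pre-activation, which is the link between the error and the weight we want to adjust. A toy single-weight sketch (all numbers made up) where the sigmoid's derivative y·(1−y) appears as one factor of dE/dw = dE/dy · dy/dz · dz/dw:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# One weight, one input, squared-error loss E = (y - target)^2 / 2.
w, x, target, lr = 0.5, 1.0, 1.0, 0.1
for _ in range(1000):
    z = w * x                  # pre-activation
    y = sigmoid(z)             # output via the transfer function
    # Chain rule, term by term: dE/dy = (y - target),
    # dy/dz = y * (1 - y)  (the sigmoid's derivative),  dz/dw = x.
    grad = (y - target) * y * (1.0 - y) * x
    w -= lr * grad             # descend along the slope

print(sigmoid(w * x))          # moves toward the target over training
```

Without the dy/dz factor we could not attribute the output error back to the weight, which is exactly what backpropagation does layer by layer.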
5 votes, 1 answer, 1.5k views

What's the difference between two implementations of RNN in tensorflow?

I found two kinds of implementations of RNNs in tensorflow. The first implementation is this (from line 124 to 129). It uses a loop to define each input step of the RNN. with tf.variable_scope("RNN"): for time_step in range(num_steps): if time_step > 0: tf.get_variable_scope().reuse_variables() (cell_...
Nils Cao
5 votes, 1 answer, 2.3k views

Explanation of GRU cell in Tensorflow?

The following code from Tensorflow's GRUCell unit shows the typical operations performed to get an updated hidden state, when the previous hidden state is provided along with the current input in the sequence: def __call__(self, inputs, state, scope=None): """Gated recurrent unit (GRU) with nunits cells.""" with vs.variable_sco...
Sangram
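For readers skimming: the computation in that `__call__` boils down to two sigmoid gates and a tanh candidate. A scalar pure-Python paraphrase (toy weights, biases omitted; this is a sketch of the equations, not the actual Tensorflow code):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gru_cell(x, h_prev, W):
    """One GRU step for scalar input/state:
       r  = sigmoid(W_rx * x + W_rh * h)       # reset gate
       u  = sigmoid(W_ux * x + W_uh * h)       # update gate
       c  = tanh(W_cx * x + W_ch * (r * h))    # candidate state
       h' = u * h + (1 - u) * c                # interpolate old and new
    """
    r = sigmoid(W["rx"] * x + W["rh"] * h_prev)
    u = sigmoid(W["ux"] * x + W["uh"] * h_prev)
    c = math.tanh(W["cx"] * x + W["ch"] * (r * h_prev))
    return u * h_prev + (1.0 - u) * c

W = {"rx": 0.5, "rh": 0.5, "ux": 0.5, "uh": 0.5, "cx": 1.0, "ch": 1.0}
h = 0.0
for x in [1.0, -1.0, 0.5]:
    h = gru_cell(x, h, W)
print(h)   # the state stays bounded in (-1, 1)
```

The reset gate r decides how much of the old state feeds the candidate, and the update gate u decides how much of the old state survives into the new one.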
5 votes, 1 answer, 701 views

swap_memory in dynamic_rnn allows quasi-infinite sequences?

I am trying to tag letters in long char-sequences. The inherent structure of the data requires me to use a bidirectional approach. Furthermore, based on this idea, I need access to the hidden state at each timestep, not just the final one. To try the idea out, I used a fixed-length approach. I currently us...
Phillip Bock
5 votes, 1 answer, 1.7k views

Keras simple RNN implementation

I ran into problems when trying to compile a network with one recurrent layer. There seems to be an issue with the dimensionality of the first layer, and thus with my understanding of how RNN layers work in Keras. My code sample is: model.add(Dense(8, input_dim = 2, activation = "tanh", use_bias = False))...
Seraph
7 votes, 1 answer, 3.7k views

ValueError: The two structures don't have the same number of elements

with tf.variable_scope('forward'): cell_img_fwd = tf.nn.rnn_cell.GRUCell(hidden_state_size, hidden_state_size) img_init_state_fwd = rnn_img_mapped[:, 0, :] img_init_state_fwd = tf.multiply( img_init_state_fwd, tf.zeros([batch_size, hidden_state_size])) rnn_outputs2, final_state2 = tf.nn.dynamic_rnn...
user3640928
5 votes, 1 answer, 679 views

How to generate a sentence from feature vector or words?

I used the VGG 16-layer Caffe model for image captions, and I have several captions per image. Now, I want to generate a sentence from those captions (words). I read in a paper on LSTMs that I should remove the SoftMax layer from the training network and provide the 4096-dimensional feature vector from the fc7 layer dire...
foo
5 votes, 1 answer, 1.2k views

Stream Output of Predictions in Keras

I have an LSTM in Keras that I am training to predict on time series data. I want the network to output predictions on each timestep, as it will receive a new input every 15 seconds. So what I am struggling with is the proper way to train it so that it will output h_0, h_1, ..., h_t, as a constant...
Rob
5 votes, 1 answer, 1.8k views

Where to add dropout in neural network?

I have seen descriptions of dropout in different parts of a neural network: dropout in the weight matrix, dropout in the hidden layer after the matrix multiplication and before the ReLU, dropout in the hidden layer after the ReLU, and dropout in the output score prior to the softmax function...
DiveIntoML
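Wherever it is placed, the mechanics are the same; the most common placement in practice is on the activations after the nonlinearity. A sketch of inverted dropout, the variant modern frameworks typically use, which rescales the survivors at training time so nothing needs to change at test time (the rate and values below are made up):

```python
import random

def dropout(activations, rate, rng):
    """Inverted dropout: zero each unit with probability `rate` and scale
    the survivors by 1/(1-rate) so the expected activation is unchanged."""
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)
h = [1.0, 2.0, 3.0, 4.0]
print(dropout(h, rate=0.5, rng=rng))   # each unit is either zeroed or doubled
```

For RNNs specifically, dropout is usually applied to the non-recurrent (input/output) connections rather than the recurrent ones, since repeatedly masking the recurrent path destroys the memory.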
5 votes, 1 answer, 1.4k views

One to many LSTM in Keras

Is it possible to implement a one-to-many LSTM in Keras? If yes, can you please provide me with a simple example?
Omar Samir
5 votes, 1 answer, 1.1k views

Data Shape / Format for RNNs with Multiple Features

I'm trying to build an RNN using python / keras. I understand how it's done with one feature (with t+1 being the output), but how is it done with multiple features? What if I had a regression problem and a dataset with a few different features, one expected output, and I wanted to have the time steps /...
Zach
4 votes, 1 answer, 1.5k views

Sequence labeling in Keras

I'm working on sentence labeling problem. I've done embedding and padding by myself and my inputs look like: X_i = [[0,1,1,0,2,3...], [0,1,1,0,2,3...], ..., [0,0,0,0,0...], [0,0,0,0,0...], ....] For every word in sentence I want to predict one of four classes, so my desired output should look like:...
Rachnog
3 votes, 1 answer, 806 views

What's the difference between a greedy decoder RNN and a beam decoder with k=1?

Given a state vector we can recursively decode a sequence in a greedy manner by generating each output successively, where each prediction is conditioned on the previous output. I read a paper recently that described using beam search during decoding with a beam size of 1 (k=1). If we are only retai...
jstaker7
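The intuition can be checked directly: with k=1 the beam keeps only the single best hypothesis and extends it with its argmax token, which is exactly greedy decoding. A toy sketch with fixed per-step distributions (in a real RNN each distribution would be conditioned on the previously chosen token, but the k=1 equivalence still holds, since only one prefix ever survives):

```python
def greedy_decode(step_probs):
    """Pick the argmax token at each step (step_probs: per-step token->prob)."""
    return [max(dist, key=dist.get) for dist in step_probs]

def beam_decode(step_probs, k):
    """Keep the k best partial sequences by total (product) probability."""
    beams = [([], 1.0)]
    for dist in step_probs:
        candidates = [(seq + [tok], p * q) for seq, p in beams
                      for tok, q in dist.items()]
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:k]
    return beams[0][0]

steps = [{"a": 0.6, "b": 0.4}, {"a": 0.1, "b": 0.9}]
print(greedy_decode(steps))      # ['a', 'b']
print(beam_decode(steps, k=1))   # identical to greedy when k=1
```

So the paper's "beam search with k=1" is just a uniform way of describing the decoder; the interesting differences only appear at k>1.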
11 votes, 1 answer, 3.5k views

Shuffling training data with LSTM RNN

Since an LSTM RNN uses previous events to predict current sequences, why do we shuffle the training data? Don't we lose the temporal ordering of the training data? How is it still effective at making predictions after being trained on shuffled training data?
hellowill89
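A sketch of why shuffling does not destroy the temporal ordering (hypothetical windowed setup): the training set is usually a collection of windows or sequences, and shuffling permutes those examples, not the timesteps inside them, so every sample the LSTM sees is still internally ordered:

```python
import random

def make_windows(series, n_lags):
    # Each training example is one window plus its target; the order
    # *inside* each window is left intact.
    return [(series[i:i + n_lags], series[i + n_lags])
            for i in range(len(series) - n_lags)]

series = list(range(10))
examples = make_windows(series, n_lags=3)
random.Random(42).shuffle(examples)

# After shuffling, every window is still strictly increasing in time:
assert all(list(x) == sorted(x) for x, _ in examples)
print(len(examples))  # 7
```

Shuffling between examples decorrelates consecutive gradient updates; only when state is carried across batches (stateful training) must the batch order itself be preserved.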
4 votes, 1 answer, 1k views

Tensorflow dynamic_rnn TypeError: 'Tensor' object is not iterable

I'm trying to get a basic LSTM working in TensorFlow. I'm receiving the following error: TypeError: 'Tensor' object is not iterable. The offending line is: rnn_outputs, final_state = tf.nn.dynamic_rnn(cell, x, sequence_length=seqlen, initial_state=init_state,)` I'm using version 1.0.1 on windo...
アンド
4 votes, 1 answer, 689 views

How to make the weights of an RNN cell untrainable in Tensorflow?

I'm trying to make a Tensorflow graph where part of the graph is already pre-trained and running in prediction mode, while the rest trains. I've defined my pre-trained cell like so: rnn_cell = tf.contrib.rnn.BasicLSTMCell(100) state0 = tf.Variable(pretrained_state0,trainable=False) state1 = tf.Varia...
AlexR
6 votes, 1 answer, 4.3k views

Simple Recurrent Neural Network input shape

I am trying to code a very simple RNN example with keras but the results are not as expected. My X_train is a repeated list with length 6000 like: 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, ... I formatted this to shape: (6000, 1, 1) My y_train is a repeated list with length 6000 like: 1, 0.8, 0.6, 0, 0, 0, 1, 0...
user3406687