# Questions tagged [recurrent-neural-network]

381 questions

1 vote · 1 answer · 2.2k views

### What is the principle of readout and teacher forcing?

These days I am studying RNNs and teacher forcing, but there is one point I can't figure out. What is the principle of readout and teacher forcing? How can we feed the output (or ground truth) of the RNN from the previous time step back into the current time step, using the output as fe...
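The mechanics behind the question can be shown in a few lines. Below is a minimal numpy sketch (illustrative weights, not from any real model): it unrolls a toy RNN once with teacher forcing, feeding the ground-truth output of step t-1 as the input at step t, and once in free-running (readout) mode, feeding the model's own previous state back instead.

```python
import numpy as np

def rnn_step(x, h, Wx, Wh):
    # one vanilla RNN step: new hidden state from input x and previous state h
    return np.tanh(x @ Wx + h @ Wh)

def unroll(targets, h0, Wx, Wh, teacher_forcing=True):
    """With teacher forcing, the ground truth from step t-1 is the input at
    step t; in free-running (readout) mode, the model's own previous state
    is fed back instead."""
    h, x = h0, targets[0]
    states = []
    for t in range(1, len(targets)):
        h = rnn_step(x, h, Wx, Wh)
        states.append(h)
        x = targets[t] if teacher_forcing else h  # the only difference
    return np.stack(states)

rng = np.random.default_rng(0)
d = 4  # toy setting: input and hidden sizes match so the state can be fed back
Wx, Wh = rng.normal(size=(d, d)), rng.normal(size=(d, d))
targets = rng.normal(size=(5, d))
tf_states = unroll(targets, np.zeros(d), Wx, Wh, teacher_forcing=True)
fr_states = unroll(targets, np.zeros(d), Wx, Wh, teacher_forcing=False)
```

The two trajectories agree at the first step and then diverge, which is exactly the exposure-bias gap that teacher forcing trades away during training.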

0 votes · 0 answers · 3 views

### Blas GEMM launch failed [Tensorflow-GPU]

I'm trying to train an LSTM network for a sequence-to-sequence task using tensorflow-gpu, with no other library such as Keras on top of it. Whenever I begin the training process I get this error.
Blas GEMM launch failed : a.shape=(128, 532), b.shape=(532, 1024), m=128, n=1024, k=532
[[{{node rnn/while/...

1 vote · 0 answers · 10 views

### Expected input to torch Embedding layer with pre-trained vectors from gensim

I would like to use pre-trained embeddings in my neural network architecture. The pre-trained embeddings are trained by gensim. I found this informative answer, which indicates that we can load pre-trained models like so:
import gensim
from torch import nn
model = gensim.models.KeyedVectors.load_word...

1 vote · 0 answers · 8 views

### Why is the MSE on the test set very low and not evolving (not increasing after increasing epochs)?

I am working on a problem of predicting stock values using LSTMs.
My work is based on the following project.
I use a data set (a time series of stock prices) of total length 12075 that I split into a train set and a test set (the test set is roughly 10%). It is the same data used in the linked project.
train_data.shape
(11000,)
te...

1 vote · 2 answers · 1.2k views

### How to generate a sequence using an LSTM?

I want to generate a sequence when a particular input is activated: an odd or an even sequence according to which input neuron is activated. I am trying to create a model using an LSTM because it can remember short-term order.
I tried it this way:
import numpy as np
from keras.mo...

0 votes · 0 answers · 4 views

### How to make a trainable initial state for an RNN in Tensorflow?

I am going to write a bi-RNN myself, but I have run into a problem: I don't know how to make a trainable initial state. Part of my code follows.
# Inputs
self.input_X = tf.placeholder(tf.float32, [batch_size, None, embedding_size])
self.input_Y = tf.placeholder(tf.float32, [batch_size, classes...

9 votes · 1 answer · 1.5k views

### String Matching Using Recurrent Neural Networks

I have recently started exploring recurrent neural networks. So far I have trained a character-level language model in TensorFlow, following Andrej Karpathy's blog. It works great.
I couldn't, however, find any study on using RNNs for string matching or keyword spotting. For one of my projects I require OCR of...

1 vote · 1 answer · 402 views

### Multiple RNNs in TensorFlow

I'm trying to use a 2-layer deep RNN without MultiRNNCell in TensorFlow, i.e. using the output of the first layer as the input of the second layer, as:
cell1 = tf.contrib.rnn.LSTMCell(num_filters, state_is_tuple=True)
rnn_outputs1, _ = tf.nn.dynamic_rnn(cell1, inputs, dtype = tf.float32)
cell2 = tf.contrib.rn...
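For reference, the stacking pattern described here (the first layer's outputs become the second layer's inputs) can be sketched without any framework; all weights below are illustrative:

```python
import numpy as np

def run_rnn_layer(inputs, Wx, Wh):
    # run a vanilla RNN over a (time, features) sequence, return all hidden states
    h = np.zeros(Wh.shape[0])
    outputs = []
    for x in inputs:
        h = np.tanh(x @ Wx + h @ Wh)
        outputs.append(h)
    return np.stack(outputs)

rng = np.random.default_rng(1)
T, d_in, d_hid = 6, 3, 5
X = rng.normal(size=(T, d_in))
# layer 1 maps d_in -> d_hid; layer 2 consumes layer 1's outputs (d_hid -> d_hid)
out1 = run_rnn_layer(X, rng.normal(size=(d_in, d_hid)), rng.normal(size=(d_hid, d_hid)))
out2 = run_rnn_layer(out1, rng.normal(size=(d_hid, d_hid)), rng.normal(size=(d_hid, d_hid)))
```

The only wiring decision is that the second layer's input dimension must equal the first layer's hidden size.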

2 votes · 0 answers · 151 views

### I would like to have an example of using Tensorflow ConvLSTMCell

I would like to have a small example of building an encoder-decoder network using Tensorflow ConvLSTMCell.
Thanks

3 votes · 2 answers · 35 views

### Is there a standard approach to counting repetitions in an oscillating signal?

I am collecting sensor data from a repetitive physical process (think an elevator moving up and down). This is an example of what the signal looks like. The y-axis reflects our equivalent of 'height' and the x-axis is just time. Perhaps not surprising, this particular image reflects 5 repetitions...
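A common baseline for this kind of signal, before trying an RNN, is to count upward crossings of the signal's midline. The sketch below is a hedged numpy illustration; the midline threshold is one simple choice among many:

```python
import numpy as np

def count_cycles(signal, threshold=None):
    """Count repetitions as upward crossings of the signal's midline."""
    if threshold is None:
        threshold = (signal.max() + signal.min()) / 2.0
    above = signal > threshold
    # an upward crossing is a False -> True transition between samples
    return int(np.count_nonzero(~above[:-1] & above[1:]))

t = np.linspace(0, 1, 1000)
signal = -np.cos(2 * np.pi * 5 * t)  # 5 full cycles, starting at a trough
```

Real sensor data usually needs smoothing (e.g. a moving average) before counting, so noise does not produce spurious crossings.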

17 votes · 3 answers · 10.4k views

### What's the difference between a bidirectional LSTM and an LSTM?

Can someone please explain this? I know bidirectional LSTMs have a forward and a backward pass, but what is the advantage of this over a unidirectional LSTM?
What is each of them better suited for?
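Mechanically, a bidirectional RNN is two unidirectional RNNs: one reads the sequence forward, the other reads it reversed, and their per-timestep states are concatenated so each output sees both past and future context. A numpy sketch of that wiring (illustrative weights):

```python
import numpy as np

def run_rnn(inputs, Wx, Wh):
    h = np.zeros(Wh.shape[0])
    states = []
    for x in inputs:
        h = np.tanh(x @ Wx + h @ Wh)
        states.append(h)
    return np.stack(states)

def bidirectional(inputs, fwd_params, bwd_params):
    # forward pass over the sequence, backward pass over the reversed sequence,
    # then re-align the backward states and concatenate per timestep
    fwd = run_rnn(inputs, *fwd_params)
    bwd = run_rnn(inputs[::-1], *bwd_params)[::-1]
    return np.concatenate([fwd, bwd], axis=1)

rng = np.random.default_rng(2)
T, d_in, d_hid = 4, 3, 5
X = rng.normal(size=(T, d_in))

def make_params():
    return rng.normal(size=(d_in, d_hid)), rng.normal(size=(d_hid, d_hid))

out = bidirectional(X, make_params(), make_params())
```

The cost is that the whole sequence must be available before any output is produced, which is why unidirectional models remain the choice for streaming prediction.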

18 votes · 3 answers · 8.9k views

### Time Series Prediction via Neural Networks

I have been working on Neural Networks for various purposes lately. I have had great success in digit recognition, XOR, and various other easy/hello world'ish applications.
I would like to tackle the domain of time series estimation. I do not have a University account at the moment to read all the I...

2 votes · 1 answer · 42 views

### Should the gradients for the output layer of an RNN be clipped?

I am currently training an LSTM RNN for time-series forecasting. I understand that it is common practice to clip the gradients of an RNN when they cross a certain threshold. However, I am not completely clear on whether or not this includes the output layer.
If we call the hidden layer of an RNN h...
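For context, global-norm clipping treats every gradient array in the supplied list as part of one long vector, so whether the output layer is clipped is simply whether its gradients are included in that list. A numpy sketch of the rescaling (it mirrors the usual clip-by-global-norm behavior, not any specific framework's source):

```python
import numpy as np

def clip_by_global_norm(grads, clip_norm):
    """Rescale a list of gradient arrays so their joint L2 norm is at most clip_norm."""
    global_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    if global_norm <= clip_norm:
        return grads, global_norm
    scale = clip_norm / global_norm
    return [g * scale for g in grads], global_norm

# two "layers" of gradients; joint norm = sqrt(4*9 + 4*16) = 10
grads = [np.ones((2, 2)) * 3.0, np.ones(4) * 4.0]
clipped, norm = clip_by_global_norm(grads, 5.0)
```

Because all arrays share one scale factor, including or excluding the output layer changes both the norm and how much every other gradient is shrunk.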

2 votes · 1 answer · 1.1k views

### Recurrent Neural network classification [closed]

I want to use the R recurrent neural network package rnn to classify the polarity of aspect and sentiment pairs. For example, the inputs are the pre-trained embeddings of the words "speed" and "fast", and I expect to get a class label for this pair from RNN classification.
Could you give me some instruction about using the...

3 votes · 0 answers · 275 views

### Initializing Decoder States in Sequence To Sequence Models

I'm writing my first neural machine translator in TensorFlow. I am using an encoder/decoder mechanism with attention. My encoder and decoder are LSTM stacks with residual connections, but the encoder has an initial bidirectional layer; the decoder does not.
It is common practice in the code that I h...

2 votes · 1 answer · 3.2k views

### How to get the output shape of a layer in Keras?

I have the following code in Keras (basically, I am modifying this code for my own use) and I get this error:
'ValueError: Error when checking target: expected conv3d_3 to have 5 dimensions, but got array with shape (10, 4096)'
Code:
from keras.models import Sequential
from keras.layers.convolutional imp...

3 votes · 1 answer · 721 views

### Multidimensional RNN on Tensorflow

I'm trying to implement a 2D RNN in the context of human action classification (joints on one axis of the RNN and time on the other) and have been searching high and low for something in Tensorflow that could do the job.
I heard of GridLSTMCell (internally and externally contributed) but couldn't g...

4 votes · 3 answers · 4.1k views

### How to determine maximum batch size for a seq2seq tensorflow RNN training model

Currently, I am using the default 64 as the batch size for the seq2seq TensorFlow model. What are the maximum batch size, layer size, etc. I can use with a single Titan X GPU (12 GB RAM) and a Haswell-E Xeon with 128 GB RAM? The input data is converted to embeddings. Following are some helpful parameters I...
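There is no exact answer without profiling, but a back-of-envelope estimate of activation memory helps bound the search. The sketch below is purely illustrative arithmetic; the `overhead` factor that lumps together gates and backprop buffers is an assumption, not a measured constant:

```python
def activation_bytes(batch, seq_len, hidden, layers, bytes_per_float=4, overhead=4):
    """Very rough activation-memory estimate for an unrolled RNN.

    `overhead` is a made-up fudge factor covering gate activations and
    backprop buffers; it is no substitute for actually profiling the model.
    """
    return batch * seq_len * hidden * layers * bytes_per_float * overhead

# e.g. batch 64, 50 unrolled steps, hidden size 1024, 2 layers
est = activation_bytes(64, 50, 1024, 2)
est_gb = est / 1024 ** 3
```

With these illustrative numbers the estimate is about 0.1 GiB, suggesting that on a 12 GB card the parameters, optimizer state, and embedding tables usually become the binding constraint before batch size does.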

2 votes · 0 answers · 1.2k views

### Python: Identify patterns in time series based data

I have a dataset of the following format:
11/26/2015 14:00:00 N234 11004066 27.05
11/26/2015 14:01:00 N234 11004066 27.07
11/26/2015 14:02:00 N234 11004066 27.04
11/26/2015 14:03:00 N234 11004066 27.02
11/26/2015 14:04:00 N234 11004066 27.00
11/26/2015 14:05:00 N234 1100...

3 votes · 0 answers · 196 views

### Tensorflow, encode variable length sentences by RNN, without padding

I am dealing with a sentence modeling problem where I have variable-length sentences as input. I want to encode the sentences with an RNN (e.g. LSTM or GRU). All the examples I found use some sort of padding or bucketing to make sure all sentences in a batch are of the same le...

6 votes · 2 answers · 10.5k views

### ValueError: Input 0 is incompatible with layer lstm_13: expected ndim=3, found ndim=4

I am attempting multi-class classification; here are the details of my training input and output:
train_input.shape= (1, 95000, 360) (95000 length input array with each
element being an array of 360 length)
train_output.shape = (1, 95000, 22) (22 Classes are there)
model = Sequential()
model.add(...

2 votes · 1 answer · 2.9k views

### What is the structure of ATIS (Airline Travel Information System) dataset

I am using the ATIS (Airline Travel Information System) dataset (http://lisaweb.iro.umontreal.ca/transfert/lisa/users/mesnilgr/atis/) for research on recurrent neural networks, and I am confused by its structure.
For example, after using data = pickle.load(open("./dataset/atis.fold0.pkl", "rb"),encoding...

19 votes · 1 answer · 11.3k views

### Soft attention vs. hard attention

In this blog post, The Unreasonable Effectiveness of Recurrent Neural Networks, Andrej Karpathy mentions future directions for neural-network-based machine learning:
The concept of attention is the most interesting recent architectural innovation in neural networks. [...] soft attention scheme for...
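In brief: soft attention computes a differentiable weighted average over all positions, so it trains with ordinary backprop, while hard attention samples one position and typically needs REINFORCE-style training. A numpy sketch of the soft variant (dot-product scoring is one common choice):

```python
import numpy as np

def soft_attention(query, keys, values):
    """Differentiable soft attention: softmax over dot-product scores,
    then a weighted average of the values."""
    scores = keys @ query
    weights = np.exp(scores - scores.max())  # shifted for numerical stability
    weights /= weights.sum()
    return weights @ values, weights

rng = np.random.default_rng(3)
T, d = 6, 4
keys = rng.normal(size=(T, d))
values = rng.normal(size=(T, d))
query = rng.normal(size=d)
context, weights = soft_attention(query, keys, values)
```

Hard attention would replace the weighted average with `values[np.argmax(weights)]` or a sample from `weights`, which is where the differentiability is lost.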

5 votes · 6 answers · 892 views

### Why do we take the derivative of the transfer function in calculating back propagation algorithm?

What is the concept behind taking the derivative? It's interesting that in order to somehow teach a system, we have to adjust its weights. But why do we do this using the derivative of the transfer function? What is it about the derivative that helps us? I know the derivative is the slope of a continuous function at...
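The derivative enters through the chain rule: to know how the loss changes when a weight changes, the gradient must pass through the activation, and the activation's local slope is exactly its derivative. A minimal example for one sigmoid neuron, where gradient steps built from that derivative visibly shrink the loss:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# one neuron: y_hat = sigmoid(w * x), squared-error loss against `target`
x, target, w = 1.5, 0.2, 0.8
for _ in range(100):
    y_hat = sigmoid(w * x)
    # chain rule: dL/dw = dL/dy_hat * dy_hat/dz * dz/dw
    #                   = 2(y_hat - target) * y_hat(1 - y_hat) * x
    grad = 2 * (y_hat - target) * y_hat * (1 - y_hat) * x
    w -= 0.5 * grad  # gradient descent step
loss = (sigmoid(w * x) - target) ** 2
```

Without the `y_hat(1 - y_hat)` term (the sigmoid's derivative) the update would ignore how sensitive the output actually is to the weight at its current operating point.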

5 votes · 1 answer · 1.5k views

### What's the difference between two implementations of RNN in tensorflow?

I found two kinds of implementations of RNNs in tensorflow.
The first implementation is this (from line 124 to 129). It uses a loop to define each step of input to the RNN.
with tf.variable_scope("RNN"):
    for time_step in range(num_steps):
        if time_step > 0: tf.get_variable_scope().reuse_variables()
        (cell_...

5 votes · 1 answer · 2.3k views

### Explanation of GRU cell in Tensorflow?

The following code from TensorFlow's GRUCell unit shows the typical operations used to get an updated hidden state when the previous hidden state is provided along with the current input in the sequence.
def __call__(self, inputs, state, scope=None):
    """Gated recurrent unit (GRU) with nunits cells."""
    with vs.variable_sco...
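Stripped of TensorFlow's variable scoping, the arithmetic inside a GRU cell fits in a few lines. The sketch below uses one common gating convention (implementations differ in how the update gate is applied); weights are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, params):
    """One GRU step: gates decide how much of the old state to keep."""
    Wz, Uz, Wr, Ur, Wh, Uh = params
    z = sigmoid(x @ Wz + h @ Uz)               # update gate
    r = sigmoid(x @ Wr + h @ Ur)               # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)   # candidate state
    return (1 - z) * h + z * h_tilde           # interpolate old and candidate

rng = np.random.default_rng(4)
d_in, d_hid = 3, 5
params = tuple(rng.normal(size=s) for s in
               [(d_in, d_hid), (d_hid, d_hid)] * 3)
h = np.zeros(d_hid)
h = gru_step(rng.normal(size=d_in), h, params)
```

The reset gate r controls how much past state feeds the candidate, while the update gate z interpolates between keeping the old state and adopting the candidate.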

5 votes · 1 answer · 701 views

### swap_memory in dynamic_rnn allows quasi-infinite sequences?

I am trying to tag letters in long character sequences. The inherent structure of the data requires me to use a bidirectional approach.
Furthermore, based on this idea, I need access to the hidden state at each timestep, not just the final one.
To try the idea out I used a fixed-length approach. I currently us...

5 votes · 1 answer · 1.7k views

### Keras simple RNN implementation

I ran into problems when trying to compile a network with one recurrent layer. It seems there is some issue with the dimensionality of the first layer, and thus with my understanding of how RNN layers work in Keras.
My code sample is:
model.add(Dense(8,
                input_dim=2,
                activation="tanh",
                use_bias=False))...

7 votes · 1 answer · 3.7k views

### ValueError: The two structures don't have the same number of elements

with tf.variable_scope('forward'):
    cell_img_fwd = tf.nn.rnn_cell.GRUCell(hidden_state_size, hidden_state_size)
    img_init_state_fwd = rnn_img_mapped[:, 0, :]
    img_init_state_fwd = tf.multiply(
        img_init_state_fwd,
        tf.zeros([batch_size, hidden_state_size]))
    rnn_outputs2, final_state2 = tf.nn.dynamic_rnn...

5 votes · 1 answer · 679 views

### How to generate a sentence from feature vector or words?

I used the VGG 16-layer Caffe model for image captioning and I have several captions per image. Now I want to generate a sentence from those captions (words).
I read in a paper on LSTMs that I should remove the SoftMax layer from the training network and provide the 4096-dimensional feature vector from the fc7 layer dire...

5 votes · 1 answer · 1.2k views

### Stream Output of Predictions in Keras

I have an LSTM in Keras that I am training to predict on time series data. I want the network to output predictions on each timestep, as it will receive a new input every 15 seconds. So what I am struggling with is the proper way to train it so that it will output h_0, h_1, ..., h_t, as a constant...

5 votes · 1 answer · 1.8k views

### Where to add dropout in neural network?

I have seen dropout described in different parts of the neural network:
dropout in the weight matrix,
dropout in the hidden layer after the matrix multiplication and before relu,
dropout in the hidden layer after the relu,
and dropout in the output score prior to the softmax function...
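Wherever it is placed, dropout during training zeroes a random subset of activations and, in the 'inverted' formulation most libraries use, rescales the survivors so expected activations are unchanged at test time. A numpy sketch:

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero a fraction `rate` of units and scale the
    survivors by 1/(1-rate), so expectations match at test time."""
    if not training or rate == 0.0:
        return activations
    keep = rng.random(activations.shape) >= rate
    return activations * keep / (1.0 - rate)

rng = np.random.default_rng(5)
acts = np.ones(10000)
dropped = dropout(acts, 0.5, rng)
```

The same operator is applied at any of the four positions the question lists; only the tensor it masks changes.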

5 votes · 1 answer · 1.4k views

### One to many LSTM in Keras

Is it possible to implement a one-to-many LSTM in Keras?
If yes, can you please provide me with a simple example?

5 votes · 1 answer · 1.1k views

### Data Shape / Format for RNNs with Multiple Features

I'm trying to build an RNN using Python/Keras. I understand how it's done with one feature (with t+1 being the output), but how is it done with multiple features?
What if I had a regression problem and a dataset with a few different features, one expected output, and I wanted to have the time steps /...
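With multiple features, the RNN input tensor is shaped (samples, timesteps, features): each sample is a window of consecutive rows from the feature table. A numpy sketch of building that tensor (predicting the first feature at t+1 is an illustrative choice):

```python
import numpy as np

def make_windows(table, timesteps):
    """Turn a (rows, features) table into (samples, timesteps, features)
    windows, each sample predicting the row right after its window."""
    X = np.stack([table[i:i + timesteps]
                  for i in range(len(table) - timesteps)])
    y = table[timesteps:, 0]   # e.g. predict the first feature at t+1
    return X, y

rows, features, timesteps = 100, 3, 10
table = np.arange(rows * features, dtype=float).reshape(rows, features)
X, y = make_windows(table, timesteps)
```

The extra features simply widen the last axis; the recurrent layer itself does not change.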

4 votes · 1 answer · 1.5k views

### Sequence labeling in Keras

I'm working on a sentence labeling problem. I've done the embedding and padding myself, and my inputs look like:
X_i = [[0,1,1,0,2,3...], [0,1,1,0,2,3...], ..., [0,0,0,0,0...], [0,0,0,0,0...], ....]
For every word in a sentence I want to predict one of four classes, so my desired output should look like:...

3 votes · 1 answer · 806 views

### What's the difference between a greedy decoder RNN and a beam decoder with k=1?

Given a state vector we can recursively decode a sequence in a greedy manner by generating each output successively, where each prediction is conditioned on the previous output. I read a paper recently that described using beam search during decoding with a beam size of 1 (k=1). If we are only retai...
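They are indeed equivalent: with k=1 the beam keeps only the single best extension at each step, which is exactly greedy argmax decoding. A toy sketch over a fixed transition-score table shows the two decoders agree:

```python
import numpy as np

def greedy_decode(scores, start, steps):
    # pick the single highest-scoring next token at every step
    seq = [start]
    for _ in range(steps):
        seq.append(int(np.argmax(scores[seq[-1]])))
    return seq

def beam_decode(scores, start, steps, k=1):
    # keep the k best partial sequences by cumulative score
    beams = [(0.0, [start])]
    for _ in range(steps):
        candidates = []
        for total, seq in beams:
            for nxt, s in enumerate(scores[seq[-1]]):
                candidates.append((total + s, seq + [nxt]))
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:k]
    return beams[0][1]

rng = np.random.default_rng(6)
scores = rng.normal(size=(5, 5))  # scores[i][j]: score of emitting j after i
```

With k > 1 the beam can hold back a locally worse token that leads to a better overall sequence, which is the only behavioral difference from greedy decoding.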

11 votes · 1 answer · 3.5k views

### Shuffling training data with LSTM RNN

Since an LSTM RNN uses previous events to predict current sequences, why do we shuffle the training data? Don't we lose the temporal ordering of the training data? How is it still effective at making predictions after being trained on shuffled training data?
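The usual resolution: shuffling happens across training samples (whole windows or sequences), never inside one, so each sample keeps its internal temporal order while gradient updates arrive in decorrelated order. A numpy sketch:

```python
import numpy as np

series = np.arange(20, dtype=float)
window = 5
# each training sample is one intact window; order *inside* it is preserved
samples = np.stack([series[i:i + window]
                    for i in range(len(series) - window)])

rng = np.random.default_rng(7)
order = rng.permutation(len(samples))
shuffled = samples[order]  # shuffle across samples only
```

Each shuffled row is still a strictly consecutive slice of the original series, so the LSTM always sees valid temporal context within a sample.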

4 votes · 1 answer · 1k views

### Tensorflow dynamic_rnn TypeError: 'Tensor' object is not iterable

I'm trying to get a basic LSTM working in TensorFlow. I'm receiving the following error:
TypeError: 'Tensor' object is not iterable.
The offending line is:
rnn_outputs, final_state = tf.nn.dynamic_rnn(cell, x, sequence_length=seqlen,
                                             initial_state=init_state)
I'm using version 1.0.1 on windo...

4 votes · 1 answer · 689 views

### How to make the weights of an RNN cell untrainable in Tensorflow?

I'm trying to make a TensorFlow graph where part of the graph is already pre-trained and runs in prediction mode, while the rest trains. I've defined my pre-trained cell like so:
rnn_cell = tf.contrib.rnn.BasicLSTMCell(100)
state0 = tf.Variable(pretrained_state0,trainable=False)
state1 = tf.Varia...

6 votes · 1 answer · 4.3k views

### Simple Recurrent Neural Network input shape

I am trying to code a very simple RNN example with Keras, but the results are not as expected.
My X_train is a repeated list of length 6000, like: 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, ...
I formatted this to shape (6000, 1, 1).
My y_train is a repeated list of length 6000, like: 1, 0.8, 0.6, 0, 0, 0, 1, 0...