# Questions tagged [lstm]

1228 questions

1

votes

1

answer

216

Views

### LSTM with Numpy, can't find a definitive algorithm

I am implementing an LSTM from scratch and am following this guide, but the loss is increasing rather than decreasing. This is the best guide I have found so far, but that is not saying much, as even this one is incomplete. Aside from spotting the problem specific to my code, I would appreciate any sources showing...

1

votes

1

answer

693

Views

### Time series classification using LSTM - How to approach?

I am working on an experiment with LSTM for time series classification and I have been going through several HOWTOs, but still, I am struggling with some very basic questions:
Is the main idea for training the LSTM to take the same sample from every time series?
E.g. if I have time series A (with sam...

1

votes

0

answer

57

Views

### Error using Keras 2.0 in R

I am trying to replicate Siraj's code for predicting stock prices in R (https://github.com/llSourcell/How-to-Predict-Stock-Prices-Easily-Demo).
This is my code:
url

1

votes

0

answer

315

Views

### Attention mechanism in spelling correction model

I'm trying to test attention mechanism in this code (based on the work of MajorTal):
def generate_model(output_len, chars=None):
'''Generate the model'''
print('Build model...')
chars = chars or CHARS
model = Sequential()
# 'Encode' the input sequence using an RNN, producing an output of hidden_size...

1

votes

0

answer

32

Views

### Keras RNN - ValueError when checking Input

I recently got an NVIDIA card and wanted to try LSTM models with the new GPU support. Sadly, I do not know much about LSTMs, so I built this little model to test it:
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense, LSTM, Dropout
from sklearn.model_selection impo...

1

votes

1

answer

93

Views

### Error: Value rank (4) should be larger than the Variable rank (1) at most by number of dynamic axes (2)

I would like to know how to provide input variables to a CNTK network for modelling and training, as shown here.
Since I have two independent variables and three target variables, X and Y have the following shapes:
X['train'].shape
Out[60]: (34567, 10, 2, 1)
Y['train'].shape
Out[61]: (34567, 3, 1)
I gave time_...

1

votes

0

answer

531

Views

### Keras LSTM model has very low accuracy

I am trying to build a model that takes a sentence as input, takes each word, and tries to predict the next word. My input and output are both 3D matrices with shape (number of sentences, number of words per sentence, dimension of word embedding). The input is, e.g., 'I like green apples', while the output is 'l...

1

votes

2

answer

346

Views

### How to modify the initial value in BasicLSTMCell in TensorFlow

I want to initialize the weights and biases of BasicLSTMCell in TensorFlow with my pre-trained values (which I load from .npy files). But when I use get_tensor_by_name to get the tensor, it seems to return only a copy, and the underlying value is never changed. I need your help!

1

votes

0

answer

35

Views

### Python: LSTM-based RNN error in input?

I am trying to build a deep learning network based on an LSTM RNN.
Here is what I tried:
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.layers import Embedding
from keras.layers import LSTM
import numpy as np
train = np.loadtxt('TrainDatasetFinal.txt...

1

votes

1

answer

130

Views

### Why do TensorFlow variables in the outputs of a dynamic (or static) RNN with an LSTM cell have plus-one dimensionality?

My question is about the dimensionality of the outputs of an elementary dynamic or static RNN.
nlu_input = tf.placeholder(tf.float32, shape=[4,1607,1])
cell = tf.nn.rnn_cell.BasicLSTMCell(80)
outts, states = tf.nn.dynamic_rnn(cell=cell, inputs=nlu_input, dtype=tf.float32)
Then tf.global_variables() returns the...

1

votes

1

answer

160

Views

### LSTM dimensions do not match in TensorFlow

I constructed an LSTM network whose input has dimensions 100*100*83 (batch_size = 100, steps = 100, char_vector = 83). I built two LSTM layers with 512 hidden units.
# coding: utf-8
from __future__ import print_function
import tensorflow as tf
import numpy as np
import time
class CharRNN:
def...

1

votes

0

answer

273

Views

### Confusion about many-to-one, many-to-many LSTM architectures in Keras

My task is the following: I have a (black box) method which computes a sequence starting from an initial element. At each step, my method reads an input from an external source of memory and outputs an action which potentially changes this memory. You can think of this method as a function f: (exter...

1

votes

1

answer

183

Views

### LSTM parity generator

I am trying to learn deep learning, and I have stumbled on one exercise here.
It is the first warm-up exercise, and I am stuck. For constant sequences of small lengths (2, 3), the model solves it with no problem. However, when I try the whole sequence of 50, it stops at 50% accuracy, which is basically a random guess.
According to he...

1

votes

1

answer

458

Views

### How to reuse LSTM layer and variables in variable scope (attention mechanism)

I have an issue in my code where I would like to share weights in my lstm_decoder (so essentially just use one LSTM). I know there are a few resources online on that but I am still unable to understand why the following does not share weights:
initial_input = tf.unstack(tf.zeros(shape=(1,1,hidden_si...

1

votes

1

answer

2.1k

Views

### PyTorch tutorial LSTM

I was trying to implement the exercise from the tutorial on Sequence Models and Long Short-Term Memory Networks with PyTorch. The idea is to augment an LSTM part-of-speech tagger with character-level features, but I can't seem to work it out. The hint given is that there should be two LSTMs involved, one that will output...

1

votes

0

answer

151

Views

### Input dimension mismatch in Keras

Can anyone help me with this error? I have searched through the documentation, but to no avail.
The aim is to predict a time series. I have used dummy data with shape (N, timesteps, features). I wish to predict x_2 from x_1, x_3 from x_2, and so on up to x_11 from x_10 using an LSTM. (Any sug...

1

votes

0

answer

371

Views

### Training on multiple time-series of various length using recurrent layers in Keras

TL;DR - I have a couple of thousand speed-profiles (time-series where the speed of a car has been sampled) and I am unsure how to configure my models such that I can perform arbitrary forecasting (i.e. predict t+n samples given a sample t).
I have read numerous explanations (1, 2, 3, 4, 5) about how...

1

votes

1

answer

477

Views

### Inputting both Word-level and Char-level embedding in LSTM for PoS tagging

I am referring to this research paper, 'Learning Character-level Representations for Part-of-Speech Tagging', where the author says: 'The proposed neural network uses a convolutional layer that allows effective feature extraction from words of any size. At tagging time, the convolutional layer gener...

1

votes

0

answer

158

Views

### How to train an LSTM Neural Network to forecast a full “cycle” of a Time Series?

I have the below data whose 'cycles' grow in period over time:
My goal is to feed some fraction of this into an LSTM network in order to predict or forecast the remaining points (at least enough to make up one full cycle).
I've been trying to follow this tutorial: https://machinelearningmastery.com/...

1

votes

0

answer

185

Views

### Keras LSTM Error: ValueError: setting an array element with a sequence

I'm loading sentence vector embeddings (Facebook's InferSent) into my neural network along with other metadata. Here is a glimpse of my dataset:
My dataset
Endpoint Vector: Sentence embedding ndarray. Shape: (4096,1)
Description Vector: Sentence embedding ndarray. Shape: (4096,1)
HTTP Path: Categoric...

1

votes

0

answer

403

Views

### LSTM tensorflow shape error?

I have an RNN class that classifies sentences as positive or negative for sentiment analysis. Because the sentences have different lengths, I used a placeholder named text_len to hold a vector of the sentence lengths in one batch when feeding the training batch data:
class TextRNN:
def __init...

1

votes

0

answer

173

Views

### Examples on how to declare Caffe LSTM?

Caffe seems to have an LSTM layer now:
http://caffe.berkeleyvision.org/tutorial/layers/lstm.html
However, there is no documentation on how to use it. How would I declare the following LSTM layer with the new API?
layer {
name: 'lstm1'
type: 'Lstm'
bottom: 'data'
bottom: 'clip'
top: 'lstm1'
Recurrent...

1

votes

0

answer

232

Views

### Increasing Label Error Rate (Edit Distance) and Fluctuating Loss?

I am training a handwriting recognition model of this architecture:
{
'network': [
{
'layer_type': 'l2_normalize'
},
{
'layer_type': 'conv2d',
'num_filters': 16,
'kernel_size': 5,
'stride': 1,
'padding': 'same'
},
{
'layer_type': 'max_pool2d',
'pool_size': 2,
'stride': 2,
'padding': 'same'
},
{
'la...

1

votes

0

answer

20

Views

### Implemented LSTM inference code shows different test loss compared to using Lasagne.

I am having difficulty implementing LSTM inference code in MATLAB. I extracted the weights from a model trained using Theano and Lasagne, and then I coded the LSTM inference (test) code using the extracted weights and biases.
My inference code gives a loss of about 5, but the test loss of Theano and Lasagn...

1

votes

1

answer

132

Views

### Time series prediction with LSTM

I'm currently learning about LSTMs and time series prediction with LSTMs, and I am trying to predict speeds for a road segment.
from sklearn.preprocessing import MinMaxScaler
sc = MinMaxScaler()
training_set = sc.fit_transform(training_set)
X_train = training_set[0:1257]  # speed at (t)
y_train = training_set...

1

votes

1

answer

698

Views

### LSTM overfitting but validation accuracy not improving

The task I am trying to do is to classify EEG signals into four possible classes. The data is divided up into trials. Subjects were asked to think about doing one of four actions, and the classification task is to predict what they were thinking based on the EEG signals.
I have ~2500 trials. For each tr...

1

votes

0

answer

42

Views

### Multiple Sequence to Sequence RNN with Various Time Scales

I'm trying to forecast a time series based on several influencing factors.
For example, forecasting how many shoppers per hour there will be in my shop tomorrow will depend on three things:
Yesterday's weather and shoppers/hour
Tomorrow's weather
How many tourists there are in town
The first two are...

1

votes

1

answer

127

Views

### How to use LSTM to make prediction with both feature from the past and the current ones?

Suppose I have a data frame like the following. Discount is calculated as selling price/list price, and unit_sales is the number of items sold on that day.
If I am going to use an LSTM to predict sales for the next day (data in the green box), based on the past 3 days of sales and discount (data...

1

votes

0

answer

29

Views

### Predicting the next value of data points using LSTM and Keras having tensorflow as a backend

I have data with 65 samples. Example: 2697,2825,2136,2824,3473,2513,2538,3051,2737.9805,3133.849,2350.8695,6000,3121.225
I've split the data into training and testing sets (train and test), scaled it, and reframed it as a supervised learning problem. I have to predict what the 66th sample value would be.
2697,2825,2136,2824,347...

1

votes

1

answer

31

Views

### How can we define an RNN - LSTM neural network with multiple outputs for the input at time “t”?

I am trying to construct an RNN to predict the probability of a player playing the match, along with the runs scored and wickets taken by the player. I would use an LSTM so that performance in the current match would influence the player's future selection.
Architecture summary:
Input features: Match details -...

1

votes

1

answer

251

Views

### Tensorflow: gradients are zero for LSTM and GradientDescentOptimizer

The gradients computed by GradientDescentOptimizer for an LSTM network are always zero. They are zero even on the first step, so I think it is not a vanishing gradient problem. The same issue happens with AdamOptimizer.
I have reduced input to one point of time series and label (expected output) to...

1

votes

1

answer

550

Views

### Keras shows MemoryError at model.fit()

I keep getting a MemoryError without additional explanation from Keras at model.fit(), no matter how small the number of neurons or the batch size. Does anyone have any idea what this error refers to or how to fix it?
Error:
Using TensorFlow backend.
Traceback (most recent call last):
File 'C:...

1

votes

1

answer

234

Views

### Error when fitting LSTM on Time Series Data

I have daily time series data with 14 features. I am interested in producing a one-step-ahead forecast for feature 1 (binary) using all features 1 to 14. To do so, I implemented an LSTM model in R using Keras.
To work with the LSTM layer, I convert my train and validation data matrix into a 3D arr...

1

votes

1

answer

78

Views

### Implement modified architecture of LSTM cell in Python

I wish to modify the internal cell structure and equations of the standard LSTM cell (for example, in the Keras implementation). How do I do this (which functions/modules should I override, and in which file of the Keras implementation)?
Any other suggestions in terms of libraries or fr...

1

votes

1

answer

369

Views

### Overfitting, underfitting or good fit?

I'm training an LSTM RNN for a binary text classification task, and I am having some issues understanding the loss results. The training set is roughly 700,000 examples, and the validation set is around 35,000 examples.
The validation set and training set are independent, so I am not training on th...

1

votes

1

answer

126

Views

### How can I change the tensor shape from 4 dim to 5 dim in Keras?

I have 70 images of size 200x200x1 (one channel only).
I converted the shape of this training data set from (70,200,200,1) to (1,70,200,200,1) before calling model.fit (my model is actually an RNN).
So my RNN model starts as below.
def get_RNN
input1 = Input((img_rows, img_cols, 1), name='input1') # (...

1

votes

0

answer

216

Views

### AttributeError: 'LSTMStateTuple' object has no attribute 'get_shape' in tf.contrib.seq2seq.dynamic_decode(decoder)

I don't know why I am getting this error.
I saw some posts suggesting setting state_is_tuple=False, but that gave me a different error. I think the error is in the way I defined the LSTM cell, but I am not sure what I should change. I followed this link, which has a similar code structure.
Here is my code:
Require...

1

votes

0

answer

187

Views

### What is the difference between the 'call' and '__call__' RNN methods in TensorFlow?

I know what __call__ is, but what confuses me is that some classes, like BasicRNNCell or tf.nn.rnn_cell.MultiRNNCell, have this 'call' method instead of __call__. What is this plain call method? It seems like the same thing; if it is not, then I didn't see it being called.
I found this explanation somewh...

1

votes

0

answer

494

Views

### LSTM cell input matrix dimensions

I'm trying to build an LSTM using just NumPy to get a feel for what's going on, but I'm running into an issue with my understanding of how the LSTM matrices work. I found this image of an RNN at http://colah.github.io/posts/2015-08-Understanding-LSTMs/.
From my understanding of an RNN, xt is...

1

votes

0

answer

40

Views

### How to avoid dying weights/gradients in a custom LSTM cell in TensorFlow? What would be the ideal loss function?

I am trying to train a name generation LSTM network. I am not using pre-defined TensorFlow cells (like tf.contrib.rnn.BasicLSTMCell, etc.); I have created the LSTM cell myself. But the error does not fall below a certain limit. It only decreases by 30% from its initial value (when random weights were used in...