Questions tagged [tidyverse]

1

votes
1

answer
231

Views

Using predict function for new data along with tidyverse

I want to use the predict function on new data along with tidyverse, as in the following example. However, I could not figure out how to use it with new data for wt = 4.0 and 4.2. Any hints, please? library(tidyverse) mtcars %>% dplyr::mutate(cyl1 = factor(cyl)) %>% tidyr::nest(-cyl) %>% dplyr::mutate...
MYaseen208
1

votes
1

answer
41

Views

R Find the Distance between Two US Zipcode columns

I was wondering what the most efficient method of calculating the distance in miles between two US zipcode columns would be using R. I have heard of the geosphere package for computing the difference between zipcodes but do not fully understand it and was wondering if there were alternative methods...
mrsquid
1

votes
1

answer
25

Views

How can I use gather function to manipulate my data frame? [duplicate]

This question already has an answer here: Collapse / concatenate / aggregate a column to a single comma separated string within each group 3 answers I have a data frame as follows: df
yas.f
1

votes
2

answer
24

Views

How to find opening and closing balances

Could someone please help me find opening_bal and closing_bal? I have all the transaction aggregates that happened in the month (new/transfers/exits etc.) and I also have the closing balance for the last month. Using this data, I need to work backwards. library(tidyverse) library(lubridate) # this is th...
cephalopod
1

votes
2

answer
14

Views

Trying to combine dates and times

I am trying to combine dates and times. These are from a file that, when imported, looks like this: library(tidyverse) library(lubridate) bookings
wl1234
0

votes
1

answer
24

Views

Visualise differences between factor levels using ggplot

I have a plot in my mind that I would like to create, but I don't know how to successfully achieve this goal. I have 2 dataframes, one containing the mean value for each factor level, and the other, pairwise differences between these levels. contrasts
0

votes
3

answer
24

Views

How do you find if a value is found in specific columns?

ID  Pred1    Pred2   Pred3   Obs1     Obs2    Obs3    FP
1   Boston   Tokyo   London  Boston   London  Other   0
2   Tokyo    London  Paris   Seattle  Paris   Other   0
3   London   Berlin  Paris   Paris    Berlin  London  0
4   Seattle  Berlin  London  Tokyo    Paris   Boston  1
This is my dataset. What I am tryi...
Molly
6

votes
3

answer
104

Views

Creating one variable from a list of variables in R?

I have a sequence of variables in a dataframe (over 100) and I would like to create an indicator variable for if particular text patterns are present in any of the variables. Below is an example with three variables. One solution I've found is using tidyr::unite() followed by dplyr::mutate(), but I'...
patward5656
1

votes
1

answer
387

Views

Equivalent of Stata tab command in R

I'm trying to find out what the Stata command tab x y if z>1 would be in R. Other than d %>% filter (z>1).
Rafael Sr
1

votes
2

answer
67

Views

`dplyr::case_when` doesn't give me correct results

case_when doesn't produce the expected results: My list: library(tidyverse) 1:6%>% str_c('var',.)%>% map(~assign(.,runif(30,20,100),envir=globalenv())) tibble
user108927
1

votes
1

answer
47

Views

Modify column value based on another column value

This is causing me trouble. I am using dplyr and I want to change the value of each Week (W1 to W3) based on the value of CP: if < CP then 0 CP W1 W2 W3 W4 1 50 0 60 0 0 4 10...
3nomis
1

votes
1

answer
35

Views

dplyr::starts_with and ends_with not subsetting based on arguments

I want to select a number of variables based on their names to transform them. The variable names all start with inq and end with 7, 8, 10, 13:15. This is not working for me... Apologies if this is obvious, but I cannot get it to work. Am I using the wrong functions, putting my functions and argumen...
Atanas Janackovski
1

votes
2

answer
30

Views

How do I prevent interpolation between values where there are more than X number of missing rows of data?

I would like to interpolate missing data, but skip scenarios where there are more than X number (e.g., 3) missing rows of data. I have code below, but the final step does not work. I previously posted a question and got a great answer (How do I prevent interpolation between values where there are mo...
D Kincaid
1

votes
1

answer
33

Views

How to unnest a list containing data frames

I'm trying to expand a nested column that contains a list of data frames. They are either NULL or 1 row by n columns, so the goal is to just add n columns to the tibble. (NULL list items would preferably expand to NAs). I've tried several solutions including those from this answer. The goal for th...
jzadra
1

votes
1

answer
28

Views

Is there a limit for columns created within one `mutate` call?

I'm currently restructuring an application, which provides data for a certain subject. At the moment I'm designing the structure of the new scripts for the shiny app and it works well. Before I go on and finalize things, I wanted to ask if anybody encountered problems when creating new columns with...
huan
1

votes
0

answer
129

Views
1

votes
1

answer
488

Views

How to Use forcats::fct_collapse in a Function Across Different Dataframes with Different Factor Levels

library(tidyverse) library(forcats) I have two simple dataframes (code at bottom) and I want to create a new recoded variable by collapsing the 'Animal' column. I usually do this with forcats::fct_collapse. However, I want to make a function to apply fct_collapse to many different dataframes that ha...
Mike
1

votes
1

answer
27

Views

Assigning a list of lists as a nested column

I want to use purrr to generate some data based on some parameters. Shown below is a script that will generate a beta density on 0 to 1 parameterized by a and b (the columns of the dataframe params). library(tidyverse) a = c(2,4,6) b = c(10,12,14) params = expand.grid(a = a, b = b) gen_den = functi...
Demetri Pananos
1

votes
1

answer
61

Views

Filter Start Date with Greater Than or Equal To and End Date that Contains Months as Strings [closed]

library(tidyverse) library(lubridate) I'm new to working with dates in the tidyverse and I'm attempting to filter by Start_Date that is greater than or equal to 08-MAY-2017, and an End_Date that contains the months of AUG or JUL. I attempted this with the code below. I first used lubridate::mdy...
Mike
1

votes
1

answer
83

Views

Spreading keys/values over multiple data frames stored in a list using a for loop

I have a bunch of data frames stored in a single list. My goal is to format each data frame in the list such that values in a specific column turn into column names. Since I would like every data frame in the list to be transformed, I tried to apply the spread function in tidyverse over all elements...
mochi
1

votes
0

answer
67

Views

fuzzy matching in DNA seqs

For the purposes of the reprex I've generated a tibble called random_DNA_tbl that is a random selection of 10 DNA sequences (of 100 bases). I've got a separate tibble called subseq_tbl, with 3 shorter sequences that match 100% to 3 of the sequences in random_DNA_tbl, but I'd also like to use fuzzy m...
biomiha
1

votes
1

answer
85

Views

Creating a factor: error using the cut() function

I am receiving this Error in mutate_impl(.data, dots) : Evaluation error: lengths of 'breaks' and 'labels' differ. error when attempting to create a new variable that indicates if the Air Quality Index is greater than 50 for over 100 days. Basically, I want to create a 'yes' or 'no' and label. I wa...
Josh
1

votes
1

answer
79

Views

Error: could not find function “lang_unnamespace”

I am getting the error here in this Travis build, and I cannot reproduce it locally. Yes, I realize that I do not have a minimal reproducible example, but I do know that it happens within tidyselect::vars_select(). Has anyone else encountered this before? I cannot find any mention of lang_unnamespac...
landau
1

votes
0

answer
63

Views

What is the correct way to reference variables when using tidyverse with other functions?

Say I would like to use reporttools with tidyverse. I first make sure the packages are loaded, #install.packages(c('tidyverse', 'reporttools')) #Use this to install it, do this only once library(reporttools); library(tidyverse) Second I test it with a basic reporttools tableNominal, i.e., data(CO2)...
Eric Fail
1

votes
1

answer
1.3k

Views

pmap_df: Error in bind_rows_(x, .id) : Argument 1 must have names

I thought the map_df family can fully replace plyr::ldply, as the release note in purrr package claimed a long time ago. However, I'm quite frustrated to realize that I cannot find a simple and elegant solution in this case. params % pmap_dfr(rnorm, n = 5) An error message will be returned: Error...
Novus
1

votes
1

answer
102

Views

Using purrr to convert list of vectors to list of matrices

EDITED: Based on suggestion by user @useR I have the following reprex for my required question (see end of post). # This is the source list i.e. list of vectors all_list [[1]] #> [1] 1 10 19 28 37 #> #> [[2]] #> [1] 4 13 22 31 40 #> #> [[3]] #> [1] 7 16 25 34 43 #> #> [[4]] #> [1] 2 11 20 29...
user4687531
1

votes
1

answer
250

Views

Converting time-series results to dates

I use fpp2 for forecasting. My workflow involves importing data, converting to a time series, then forecasting. One pain-point is that after forecasting I am left with data that is an extension of my current data, but no longer retains the same date column. For example, if I am working with weeks...
Alex
1

votes
2

answer
67

Views

gather multiple columns with nested, repeated measures

I have a dataset of people (pid) of different types (type2=c('dad', 'mom', 'kid'); and for ease, type=c('a', 'b', 'c')) nested in households (hid) with repeated measurements (time). Some variables like v1_ are asked to everyone, but the values are spread across three columns. For instance, v1_a cont...
Eric Green
1

votes
1

answer
76

Views

Grouped tibbletime and using collapse_index, getting weird results

I have a file (appx 9K records) that I want to aggregate based on the group first, and then on dates that are within seven days of each other. However, I'm not understanding why the results look the way they do. I realize there are other ways I could achieve the same results with this particular exa...
Knachman
1

votes
1

answer
515

Views

dplyr mutating multiple columns by prefix and suffix

I have a problem that I can replicate using the iris dataset, where there are many groups of variables (same prefix in name) with two different suffixes. I want to take a ratio for all these groups but can't find a tidyverse solution. I would have thought mutate_at() might have been able to help. In the i...
Gareth Netto
1

votes
2

answer
44

Views

R efficiency iterating through dataframes

I am working with a large data set, let's call it data, and want to create a new column, let's call it data$results, based on some column data$input. The results are based on some conditional if/then logic, so my original approach was something like: for (rows in data) { data$results
niccalis
1

votes
0

answer
62

Views

How to make tibble saved with write_tsv readable by read_tsv

I have quite a large tibble() (data.frame) which I save with write_tsv() and would like to read with read_tsv(). I am using all default options. However, read_tsv() emits a bunch of warnings (see example below). What strategy could I use to make it work? (also tried write_csv() -> read_csv() but same...
witek
1

votes
2

answer
1.6k

Views

Package installation in R fail on MacOS

I tried to install two packages in RStudio: tidyverse and quantmod. However, both give me errors and I can't understand why (googling doesn't help to understand the problem). For tidyverse I get: > install.packages('tidyverse') also installing the dependency ‘xml2’ There are binary versions ava...
Olivier
1

votes
1

answer
27

Views

facet_grid with multiple line colours

I have the following data frame resulting from simulations of ODEs with different parameter sets, e.g. df % gather(p, pval, -t, -x, -xval) %>% distinct() df.1$pval
Pascal
1

votes
2

answer
136

Views

write_csv Scientific notation depending on trailing “000”?

Writing a csv with the write_csv() function from package readr seems to treat numbers differently depending on trailing zeros. 4001705344 is saved as is, but 4100738000 is saved as 4100738e3 in the csv. This causes problems when I reopen the csv (e.g. in Excel). For a reproducible example s. library...
gpaul
1

votes
1

answer
106

Views

How to stop dplyr mutate from silently rounding the division operation [duplicate]

This question already has an answer here: Why does as_tibble() round floats to the nearest integer? 1 answer I have the following data frame: library(tidyverse) dat # A tibble: 3 x 3 #> sid nof_reads mapped_reads #> #> 1 MK1 19786677 19785168 #> 2 MK2 29531664 295...
scamander
1

votes
1

answer
157

Views

Format numeric data table rowwise in R using tidyverse for kable output

I have a table of values that I want to save as a kable() table. Each row of the table is a variable and each column is a value of that variable (e.g., a mean, minimum, maximum, etc.). You can apply the format() function to columns of a data frame but applying it across rows seems very awkward. I fi...
Simon Woodward
1

votes
1

answer
21

Views

How do I add common rows together?

This is an example of the data frame I am using UserID
Brandon Jablon
1

votes
0

answer
171

Views

R's purrr package shortcut dot meaning: .f, .p

My question is, I do not quite understand the meaning of the . when writing a function such as implementing a self-designed every() function of purrr predicate functions: every2 1}) #> [1] FALSE every2(1:3, function(x) {x > 0}) #> [1] TRUE What does the . mean when you type .f? I have tried it by m...
1

votes
1

answer
241

Views

ggplot2 (version 3) incompatibility with ggmap for geom_density_2d

ggplot2 version 3 seems to have an incompatibility with ggmap when using the geom_density2d() function to add a layer. The following code returns an error (though worked with ggplot2 version 2): # Create a data frame df
mike
