kungphil
10 questions
0
votes
0
answer
7
views
specify dtypes when saving pandas dataframe to a binary file
I have a pandas DataFrame that I want to write to a binary file, but the df contains mixed dtypes (floats and ints). If I use df.values.tofile() I cannot specify different dtypes (even when specifying astype('f4, f4, i4, i4').tofile() in the example below). My workaround at the moment is to use struct, but it is very s...
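One possible approach, sketched here under assumptions rather than taken from the question: copy each column into a NumPy structured array whose fields carry the per-column dtypes, then call tofile() on that. The column names, the sample values, and the output path are all hypothetical.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical data: two float columns and two int columns.
df = pd.DataFrame({"a": [1.5, 2.5], "b": [0.5, 0.25], "c": [1, 2], "d": [3, 4]})

# One field per column, each with its own on-disk type (little-endian).
record_dtype = np.dtype([("a", "<f4"), ("b", "<f4"), ("c", "<i4"), ("d", "<i4")])

# Copy column by column so every field is cast independently.
records = np.empty(len(df), dtype=record_dtype)
for name in record_dtype.names:
    records[name] = df[name].to_numpy()

path = os.path.join(tempfile.mkdtemp(), "frame.bin")  # hypothetical path
records.tofile(path)

# Reading back needs the same dtype, because the file carries no header.
restored = np.fromfile(path, dtype=record_dtype)
```

Each row then occupies 16 bytes on disk (4 bytes per field), and the whole write happens in one call instead of a per-row struct loop.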
2
votes
1
answer
232
views
non-conformable arrays when passing numpy array to R via rpy2
I am trying to pass a numpy array to the GAMLSS package in R.
import numpy as np
import rpy2.robjects as robjects
from rpy2.robjects import numpy2ri
numpy2ri.activate()
r = robjects.r
r.library("gamlss")
r.library("gamlss.mx")
L = r['data.frame'](np.array(np.random.normal(size=1000),
dtype=([('x',...
30
votes
8
answer
31.5k
views
Fitting a Weibull distribution using Scipy
I am trying to recreate maximum likelihood distribution fitting. I can already do this in Matlab and R, but now I want to use SciPy. In particular, I would like to estimate the Weibull distribution parameters for my data set.
I have tried this:
import scipy.stats as s
import numpy as np
import mat...
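A minimal sketch of a two-parameter Weibull maximum likelihood fit with scipy.stats.weibull_min, assuming the location should be pinned at zero via floc=0. The synthetic data, the seed, and the "true" parameter values are assumptions made for illustration and are not from the question.

```python
import numpy as np
from scipy import stats

# Synthetic sample with a known shape and scale (assumed values).
rng = np.random.default_rng(0)
true_shape, true_scale = 1.5, 2.0
data = true_scale * rng.weibull(true_shape, size=5000)

# floc=0 fixes the location, leaving shape and scale to be estimated.
shape, loc, scale = stats.weibull_min.fit(data, floc=0)
```

If loc is left free, SciPy fits a three-parameter distribution, so the estimates may not be directly comparable with a two-parameter fit from another package.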
3
votes
1
answer
483
views
Sharing a ctypes numpy array without lock when using multiprocessing
I have a large array (~500k rows x 9 columns) which I would like to share when running a number of parallel processes using Python's multiprocessing module. I am using this SO answer to create my shared array and I understand from this SO answer that the array is locked. However, in my case, as I neve...
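A small sketch of allocating the shared buffer without a lock, assuming the standard library's multiprocessing.Array with lock=False and a NumPy view created with np.frombuffer. The array size is reduced for illustration and no worker processes are started here.

```python
import ctypes
import multiprocessing as mp

import numpy as np

n_rows, n_cols = 1000, 9  # far smaller than ~500k rows, for the sketch only

# lock=False returns the raw ctypes array with no synchronisation wrapper.
shared = mp.Array(ctypes.c_double, n_rows * n_cols, lock=False)

# View the shared buffer as a 2-D NumPy array without copying.
view = np.frombuffer(shared, dtype=np.float64).reshape(n_rows, n_cols)
view[:] = 1.0  # writes go straight into the shared memory
```

Dropping the lock is only safe when the access pattern rules out concurrent writes to the same elements, so this is a judgment call that depends on what the workers do.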
13
votes
2
answer
36.9k
views
What does %*% mean in R [duplicate]
This question already has an answer here:
The R %*% operator
3 answers
I am following some code and I can apply everything until I get to the command:
s1 %*% cc1$xcoef
This line does not work for me and I can't find documentation to explain its purpose. I get this error:
Error in s1 %*% cc1$xcoef...
2
votes
2
answer
108
views
Data assimilation to correct imagery
I am attempting to correct some imagery.
The image is a composite of different aerial images which were collected under less-than-ideal lighting conditions, so when they are mosaicked there is a noticeable difference between them, i.e. a dark stripe. To resolve this I have simulated how the...
2
votes
4
answer
71
views
pd.to_csv set float_format with list
I need to write a df to a text file. To save some space on disk, I would like to set the number of decimal places for each column, i.e. give each column a different width.
I have tried:
df = pd.DataFrame(np.random.random(size=(10, 4)))
df.to_csv(path, float_format=['%.3f', '%.3f', '%.3f', '%.10f'])
B...
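One workaround, sketched under assumptions: format each column to strings with its own format before calling to_csv, so that a single float_format is not needed. The seed and the column_formats list are illustrative choices, not part of the question.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random(size=(10, 4)))

# One printf-style format per column, in column order (assumed mapping).
column_formats = ["%.3f", "%.3f", "%.3f", "%.10f"]

# Work on a copy so the numeric frame stays untouched.
formatted = df.copy()
for col, fmt in zip(formatted.columns, column_formats):
    formatted[col] = formatted[col].map(lambda value, fmt=fmt: fmt % value)

# With no path given, to_csv returns the CSV text as a string.
csv_text = formatted.to_csv(index=False)
```

The trade-off is that the copy holds strings rather than floats, so it is only useful as a final step before writing.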
2
votes
2
answer
823
views
Rpy2 and Pandas: join output from predict to pandas dataframe
I am using the randomForest library in R via RPy2. I would like to pass back the values calculated using the caret predict method and join them to the original pandas dataframe. See example below.
import pandas as pd
import numpy as np
import rpy2.robjects as robjects
from rpy2.robjects import pand...
2
votes
2
answer
68
views
Quickest way to remove mirror opposites from a list
Say I have a list of tuples [(0, 1, 2, 3), (4, 5, 6, 7), (3, 2, 1, 0)]. I would like to remove all instances where a tuple is reversed, e.g. removing (3, 2, 1, 0) from the above list.
My current (rudimentary) method is:
L = list(itertools.permutations(np.arange(x), 4))
for ll in L:
    if ll[::-1] in L:...
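A sketch of a faster alternative, assuming the tuples are hashable and that the first occurrence of each mirrored pair should be kept: track the tuples already kept in a set, so each membership test is constant time on average instead of a scan of the list. The function name drop_mirrors is hypothetical.

```python
import itertools


def drop_mirrors(tuples):
    """Keep each tuple unless its reverse was already kept earlier."""
    seen = set()
    kept = []
    for t in tuples:
        if t[::-1] in seen:
            continue  # the mirror image is already in the result
        seen.add(t)
        kept.append(t)
    return kept


example = drop_mirrors([(0, 1, 2, 3), (4, 5, 6, 7), (3, 2, 1, 0)])
perms = drop_mirrors(itertools.permutations(range(4), 4))
```

This makes a single pass over the input, whereas testing `ll[::-1] in L` against a list rescans the list for every element.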
2
votes
0
answer
1.8k
views
Writing to PostgreSQL from pandas: AttributeError: 'Engine' object has no attribute 'cursor'
I am trying to write a table to a PostgreSQL database from a Pandas data frame (following this answer) but I am getting the error AttributeError: 'Engine' object has no attribute 'cursor'
My code is:
import pandas as pd
from sqlalchemy import create_engine
import numpy as np
df = pd.DataFrame(index=...