python - Populate a Pandas SparseDataFrame from a SciPy Sparse Matrix


I noticed that pandas now has support for sparse matrices and arrays. Currently, I create DataFrame()s like this:

return DataFrame(matrix.toarray(), columns=features, index=observations)

Is there a way to create a SparseDataFrame() from a scipy.sparse.csc_matrix() or csr_matrix()? Converting to dense format kills RAM badly. Thanks!

A direct conversion is not supported at the moment. Contributions are welcome!

Try this. It should be OK on memory, since a SparseSeries is much like a csc_matrix (for 1 column) and is pretty space-efficient:

# `row` is not defined in the session as posted; the output below is consistent with row = np.array([0,2,2,0,1,2])
In [37]: col = np.array([0,0,1,2,2,2])

In [38]: data = np.array([1,2,3,4,5,6],dtype='float64')

In [39]: m = csc_matrix( (data,(row,col)), shape=(3,3) )

In [40]: m
Out[40]:
<3x3 sparse matrix of type '<type 'numpy.float64'>'
        with 6 stored elements in Compressed Sparse Column format>

In [46]: pd.SparseDataFrame([ pd.SparseSeries(m[i].toarray().ravel())
                              for i in np.arange(m.shape[0]) ])
Out[46]:
   0  1  2
0  1  0  4
1  0  0  5
2  2  3  6

In [47]: df = pd.SparseDataFrame([ pd.SparseSeries(m[i].toarray().ravel())
                                   for i in np.arange(m.shape[0]) ])

In [48]: type(df)
Out[48]: pandas.sparse.frame.SparseDataFrame
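For readers on a newer pandas: SparseDataFrame and SparseSeries were removed in pandas 1.0, and pandas 0.25 and later provide DataFrame.sparse.from_spmatrix for this conversion. The following is only a minimal sketch under that assumption, not part of the original answer. The `row` array is inferred from the output printed above, and the `features` / `observations` labels are hypothetical stand-ins for the ones in the question.

```python
import numpy as np
import pandas as pd
from scipy.sparse import csc_matrix

# Same 3x3 matrix as in the session above.
# Assumption: `row` is inferred from the printed output, not taken from the post.
row = np.array([0, 2, 2, 0, 1, 2])
col = np.array([0, 0, 1, 2, 2, 2])
data = np.array([1, 2, 3, 4, 5, 6], dtype="float64")
m = csc_matrix((data, (row, col)), shape=(3, 3))

# Hypothetical labels standing in for the question's `features` / `observations`.
features = ["f0", "f1", "f2"]
observations = ["obs0", "obs1", "obs2"]

# Build a DataFrame whose columns are sparse arrays, without calling m.toarray().
df = pd.DataFrame.sparse.from_spmatrix(m, index=observations, columns=features)

print(df.dtypes)          # each column has a Sparse dtype
print(df.sparse.density)  # fraction of entries that are actually stored
```

Unlike the row-by-row approach above, this never materializes a dense row, so peak memory should stay close to the size of the sparse matrix itself.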
