python - pandas - rank elements of dataframe -

    import numpy
    import pandas

    data = pandas.DataFrame(numpy.random.randn(4,3))
    print(data)

    Out[4]:
              0         1         2
    0 -1.122880 -2.662009  1.180418
    1 -0.335768  0.162640  0.105928
    2 -1.282813  0.049638  1.532208
    3 -0.422884 -1.110049  0.031648

I'm working with a huge dataset and trying to efficiently return tuples that rank the elements of a DataFrame. I've tried a few awkward sequences of apply(), rank(), and such, but I want something nicer.

I'm looking for a function get_ranks(data) that would return an ordered set of (row, col) tuples, largest value first. For the above: (2,2), (0,2), (1,1), (1,2), ...

I searched around a bunch but haven't found any commentary on applying this in particular. Should I concatenate the rows or columns and rank there? Or is there a more direct path?

Here is what you can do:

    >>> import pandas as pd
    >>> import numpy as np
    >>> df = pd.DataFrame(np.random.randn(4,3))
    >>> df
              0         1         2
    0  1.644294  1.476467 -0.137539
    1 -0.448040 -0.329539 -0.996425
    2 -1.015308 -1.397746  0.369095
    3 -0.570194 -0.989716 -1.489257
    >>> df2 = pd.DataFrame(df.values.flatten())
    >>> df2
               0
    0   1.644294
    1   1.476467
    2  -0.137539
    3  -0.448040
    4  -0.329539
    5  -0.996425
    6  -1.015308
    7  -1.397746
    8   0.369095
    9  -0.570194
    10 -0.989716
    11 -1.489257
    >>> df3 = df2.rank()
    >>> n_cols = df.shape[1]
    >>> df3['row'] = df3.index // n_cols    # flatten() goes row by row
    >>> df3['column'] = df3.index % n_cols
    >>> df3
           0  row  column
    0   12.0    0       0
    1   11.0    0       1
    2    9.0    0       2
    3    7.0    1       0
    4    8.0    1       1
    5    4.0    1       2
    6    3.0    2       0
    7    2.0    2       1
    8   10.0    2       2
    9    6.0    3       0
    10   5.0    3       1
    11   1.0    3       2

Some explanations:

I flatten the original DataFrame and use rank() to get the rank of each value in the flattened array. Then I use modulo and division operations to recover the original position of each value.
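The position arithmetic depends on the order in which flatten() walks the array. A minimal sketch, assuming NumPy's default row-major (C) order and a hypothetical 4x3 grid used only for illustration: the row is the flat position integer-divided by the number of columns, and the column is the remainder. np.unravel_index should give the same mapping without hand-written arithmetic.

```python
import numpy as np

# Hypothetical 4x3 grid, used only to illustrate the index mapping.
n_rows, n_cols = 4, 3
grid = np.arange(n_rows * n_cols).reshape(n_rows, n_cols)

# flatten() uses row-major (C) order by default: row 0 first, then row 1, ...
flat = grid.flatten()

# For row-major order, flat position -> (row, col) is a divmod by the column count.
for pos in range(flat.size):
    row, col = divmod(pos, n_cols)
    assert grid[row, col] == flat[pos]

# np.unravel_index performs the same conversion for a whole array of positions.
rows, cols = np.unravel_index(np.arange(flat.size), grid.shape)
```

If the array were flattened column by column instead (flatten('F')), the roles would swap and the divisor would be the number of rows.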

The resulting DataFrame has 3 columns: the first one is the rank of the value (12 -> max, 1 -> min), the second one is the index of the original row of the value, and the third is the index of the original column of the value.
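If only the ordered (row, col) tuples are needed, one possible shortcut is to sort the flat positions directly rather than build an intermediate rank column. The sketch below is an assumption about what get_ranks(data) from the question could look like, not a pandas built-in; it relies on np.argsort and np.unravel_index and returns labels ordered from the largest value to the smallest.

```python
import numpy as np
import pandas as pd


def get_ranks(data):
    """Return (row_label, col_label) tuples, largest value first.

    Hypothetical helper named after the function requested in the question.
    """
    values = data.to_numpy()
    # With axis=None, argsort works on the row-major flattened array and
    # returns flat positions ordered from smallest to largest value.
    order = np.argsort(values, axis=None, kind="stable")[::-1]
    # Convert flat positions back to (row, col) positions.
    rows, cols = np.unravel_index(order, values.shape)
    # Map positions to the DataFrame's own labels as plain Python objects.
    return list(zip(data.index[rows].tolist(), data.columns[cols].tolist()))


# The sample values printed in the question.
data = pd.DataFrame(
    [
        [-1.122880, -2.662009, 1.180418],
        [-0.335768, 0.162640, 0.105928],
        [-1.282813, 0.049638, 1.532208],
        [-0.422884, -1.110049, 0.031648],
    ]
)
ranked = get_ranks(data)
```

This avoids the extra DataFrames, which may matter on a huge dataset, though it does not handle ties or NaN values in any special way.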

Hope it'll be helpful, and please let me know if it's not entirely clear.
