python - Pivot Pandas Dataframe and calculate 'columns' parameter -
say have following dataframe:
import pandas pd df = pd.dataframe() df['id'] = [1, 1, 1, 2, 2] df['type'] = ['a', 'b', 'q', 'b', 'r'] df['status'] = [0, 0, 1, 0, 1] >>> df id type status 0 1 0 1 1 b 0 2 1 q 1 3 2 b 0 4 2 r 1 >>>
i want group dataframe 'id' , reshape have "type" variable , "status" variable each item within group. see below:
type1 type2 type3 status1 status2 status3 id 1 b q 0 0 1 2 b r nan 0 1 nan
the number of rows in output dataframe depend on max number of records in 1 group of ids.
i believe pivot function want use here. however, calls "columns" parameter believe should id of each item within each group. have clunky way of calculating this, appreciate advice on better way of doing this:
>>> g=df.groupby('id') >>> df['idingroup']=[item sublist in [range(1,len(i[1])+1) in g] item in sublist] >>> df id type status idingroup 0 1 0 1 1 1 b 0 2 2 1 q 1 3 3 2 b 0 1 4 2 r 1 2 >>>
then can loop through 'type' , 'status' variables , pivot each , merge them together:
>>> listofvalues=[] >>> valuecol in ['type','status']: ... f=df.pivot(index='id',columns='idingroup',values=valuecol) ... f.columns=[valuecol+str(col) col in f.columns] ... f.columns.name=none ... listofvalues.append(f) ... >>> pd.concat(listofvalues,1) type1 type2 type3 status1 status2 status3 id 1 b q 0 0 1 2 b r nan 0 1 nan >>>
am taking correct approach here? , if so, better way calculate "columns" parameter pivot function? (the id of each item within each group)
try this:
import pandas pd df = pd.dataframe() df['id'] = [1, 1, 1, 2, 2] df['type'] = ['a', 'b', 'q', 'b', 'r'] df['status'] = [0, 0, 1, 0, 1] g = df.groupby("id")[["type","status"]] df2 = g.apply(pd.dataframe.reset_index, drop=true).unstack()
here result:
type status 0 1 2 0 1 2 id 1 b q 0 0 1 2 b r nan 0 1 nan
the columns multiindex, if want flat it:
df2.columns = [l0 + str(l1+1) l0, l1 in df2.columns]
the output:
type1 type2 type3 status1 status2 status3 id 1 b q 0 0 1 2 b r nan 0 1 nan
Comments
Post a Comment