python - How to plot values for multiple factors from one column across several dates in pandas + matplotlib?


I've got a data frame that looks like this:

    df = pd.DataFrame({
        'animal': {
            Timestamp('2014-11-12 00:00:00'): 'dog', Timestamp('2014-11-13 00:00:00'): 'rabbit',
            Timestamp('2014-11-14 00:00:00'): 'rabbit', Timestamp('2014-11-15 00:00:00'): 'rabbit',
            Timestamp('2014-11-16 00:00:00'): 'rabbit', Timestamp('2014-11-17 00:00:00'): 'rabbit',
            Timestamp('2014-11-18 00:00:00'): 'dog', Timestamp('2014-11-19 00:00:00'): 'rabbit',
            Timestamp('2014-11-20 00:00:00'): 'dog', Timestamp('2014-11-21 00:00:00'): 'dog',
            Timestamp('2014-12-01 00:00:00'): 'rabbit', Timestamp('2014-12-02 00:00:00'): 'dog',
            Timestamp('2014-12-03 00:00:00'): 'dog', Timestamp('2014-12-04 00:00:00'): 'rabbit',
            Timestamp('2014-12-05 00:00:00'): 'rabbit', Timestamp('2014-12-06 00:00:00'): 'dog',
            Timestamp('2014-12-07 00:00:00'): 'dog', Timestamp('2014-12-08 00:00:00'): 'rabbit',
            Timestamp('2014-12-09 00:00:00'): 'rabbit', Timestamp('2014-12-10 00:00:00'): 'rabbit',
            Timestamp('2014-12-11 00:00:00'): 'rabbit', Timestamp('2014-12-12 00:00:00'): 'rabbit',
            Timestamp('2014-12-13 00:00:00'): 'rabbit', Timestamp('2014-12-14 00:00:00'): 'rabbit',
            Timestamp('2014-12-15 00:00:00'): 'dog', Timestamp('2014-12-16 00:00:00'): 'dog',
            Timestamp('2014-12-17 00:00:00'): 'dog', Timestamp('2014-12-18 00:00:00'): 'rabbit',
            Timestamp('2014-12-19 00:00:00'): 'rabbit', Timestamp('2014-12-20 00:00:00'): 'dog'},
        'count': {
            Timestamp('2014-11-12 00:00:00'): 6136, Timestamp('2014-11-13 00:00:00'): 14620,
            Timestamp('2014-11-14 00:00:00'): 16437, Timestamp('2014-11-15 00:00:00'): 17273,
            Timestamp('2014-11-16 00:00:00'): 15302, Timestamp('2014-11-17 00:00:00'): 15180,
            Timestamp('2014-11-18 00:00:00'): 7177, Timestamp('2014-11-19 00:00:00'): 16193,
            Timestamp('2014-11-20 00:00:00'): 8226, Timestamp('2014-11-21 00:00:00'): 9741,
            Timestamp('2014-12-01 00:00:00'): 26237, Timestamp('2014-12-02 00:00:00'): 12146,
            Timestamp('2014-12-03 00:00:00'): 12910, Timestamp('2014-12-04 00:00:00'): 25820,
            Timestamp('2014-12-05 00:00:00'): 29323, Timestamp('2014-12-06 00:00:00'): 17294,
            Timestamp('2014-12-07 00:00:00'): 15219, Timestamp('2014-12-08 00:00:00'): 26174,
            Timestamp('2014-12-09 00:00:00'): 27112, Timestamp('2014-12-10 00:00:00'): 27131,
            Timestamp('2014-12-11 00:00:00'): 28268, Timestamp('2014-12-12 00:00:00'): 34059,
            Timestamp('2014-12-13 00:00:00'): 39162, Timestamp('2014-12-14 00:00:00'): 38314,
            Timestamp('2014-12-15 00:00:00'): 19807, Timestamp('2014-12-16 00:00:00'): 20606,
            Timestamp('2014-12-17 00:00:00'): 21552, Timestamp('2014-12-18 00:00:00'): 36499,
            Timestamp('2014-12-19 00:00:00'): 42163, Timestamp('2014-12-20 00:00:00'): 30301},
        'day': {
            Timestamp('2014-11-12 00:00:00'): 12, Timestamp('2014-11-13 00:00:00'): 13,
            Timestamp('2014-11-14 00:00:00'): 14, Timestamp('2014-11-15 00:00:00'): 15,
            Timestamp('2014-11-16 00:00:00'): 16, Timestamp('2014-11-17 00:00:00'): 17,
            Timestamp('2014-11-18 00:00:00'): 18, Timestamp('2014-11-19 00:00:00'): 19,
            Timestamp('2014-11-20 00:00:00'): 20, Timestamp('2014-11-21 00:00:00'): 21,
            Timestamp('2014-12-01 00:00:00'): 1, Timestamp('2014-12-02 00:00:00'): 2,
            Timestamp('2014-12-03 00:00:00'): 3, Timestamp('2014-12-04 00:00:00'): 4,
            Timestamp('2014-12-05 00:00:00'): 5, Timestamp('2014-12-06 00:00:00'): 6,
            Timestamp('2014-12-07 00:00:00'): 7, Timestamp('2014-12-08 00:00:00'): 8,
            Timestamp('2014-12-09 00:00:00'): 9, Timestamp('2014-12-10 00:00:00'): 10,
            Timestamp('2014-12-11 00:00:00'): 11, Timestamp('2014-12-12 00:00:00'): 12,
            Timestamp('2014-12-13 00:00:00'): 13, Timestamp('2014-12-14 00:00:00'): 14,
            Timestamp('2014-12-15 00:00:00'): 15, Timestamp('2014-12-16 00:00:00'): 16,
            Timestamp('2014-12-17 00:00:00'): 17, Timestamp('2014-12-18 00:00:00'): 18,
            Timestamp('2014-12-19 00:00:00'): 19, Timestamp('2014-12-20 00:00:00'): 20},
        'month': {
            Timestamp('2014-11-12 00:00:00'): 11, Timestamp('2014-11-13 00:00:00'): 11,
            Timestamp('2014-11-14 00:00:00'): 11, Timestamp('2014-11-15 00:00:00'): 11,
            Timestamp('2014-11-16 00:00:00'): 11, Timestamp('2014-11-17 00:00:00'): 11,
            Timestamp('2014-11-18 00:00:00'): 11, Timestamp('2014-11-19 00:00:00'): 11,
            Timestamp('2014-11-20 00:00:00'): 11, Timestamp('2014-11-21 00:00:00'): 11,
            Timestamp('2014-12-01 00:00:00'): 12, Timestamp('2014-12-02 00:00:00'): 12,
            Timestamp('2014-12-03 00:00:00'): 12, Timestamp('2014-12-04 00:00:00'): 12,
            Timestamp('2014-12-05 00:00:00'): 12, Timestamp('2014-12-06 00:00:00'): 12,
            Timestamp('2014-12-07 00:00:00'): 12, Timestamp('2014-12-08 00:00:00'): 12,
            Timestamp('2014-12-09 00:00:00'): 12, Timestamp('2014-12-10 00:00:00'): 12,
            Timestamp('2014-12-11 00:00:00'): 12, Timestamp('2014-12-12 00:00:00'): 12,
            Timestamp('2014-12-13 00:00:00'): 12, Timestamp('2014-12-14 00:00:00'): 12,
            Timestamp('2014-12-15 00:00:00'): 12, Timestamp('2014-12-16 00:00:00'): 12,
            Timestamp('2014-12-17 00:00:00'): 12, Timestamp('2014-12-18 00:00:00'): 12,
            Timestamp('2014-12-19 00:00:00'): 12, Timestamp('2014-12-20 00:00:00'): 12}})

I'm trying to plot a line graph of the counts for each of the 2 animals across a 7-day period; essentially, I'm aiming for a time series for each animal, displayed on the same graph.

Here's my code:

    df['date'] = pd.to_datetime(df['date'], dayfirst=True, infer_datetime_format=True)
    df['animal'] = df['animal'].astype('category')
    df = df.set_index('date')

    grouped = df.groupby('animal')
    for key, group in grouped:
        data = group.groupby(lambda x: x.day)
        data['count'].plot(label=key)

    plt.legend()
    plt.show()

Instead of the counts for both animals being displayed, I get this: [plot image]

The closest I've got is the following: [plot image]

I feel I'm missing an obvious chunk here, but I can't quite figure it out.

Edit: I couldn't quite grasp how to order by both month and day, so I've appended more data to the data frame.

Make a day column to store the day numbers:

df['day'] = df.index.day 

And since you want the days ordered along the x-axis, sort by that column too:

df = df.sort_values(by='day') 
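As a small illustration of these two steps, here is a sketch on a made-up three-row frame. The name `toy`, the dates, and the counts are invented for the example and are not taken from the question; the rows are deliberately out of order so the effect of the sort is visible.

```python
import pandas as pd

# Hypothetical toy data: three dates, deliberately out of order.
toy = pd.DataFrame(
    {"count": [30, 10, 20]},
    index=pd.to_datetime(["2014-12-27", "2014-12-25", "2014-12-26"]),
)

# DatetimeIndex.day gives the day of the month for each row.
toy["day"] = toy.index.day

# Sorting by that column puts the rows in left-to-right x-axis order.
toy = toy.sort_values(by="day")

print(toy["day"].tolist())    # [25, 26, 27]
print(toy["count"].tolist())  # [10, 20, 30]
```

Without the sort, a line plot would connect the points in row order, so the line could double back on itself.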

Then you can group by animal and plot each subgroup:

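    # Group by the column name (not a one-element list) so each key is a
    # plain label such as 'dog' rather than a tuple.
    grouped = df.groupby('animal')
    fig, ax = plt.subplots()
    for key, group in grouped:
        group.plot('day', 'count', label=key, ax=ax)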

Note that group.plot calls DataFrame.plot, which allows you to specify the columns used for the x- and y-axes. In contrast, group['count'].plot calls Series.plot, which assumes the x-axis is the index and the y-axis is the Series's values.
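To make the difference concrete, here is a minimal sketch that draws the same made-up data both ways. The frame `toy` and its values are hypothetical, and the Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data, indexed by date.
toy = pd.DataFrame(
    {"day": [25, 26, 27], "count": [10, 20, 30]},
    index=pd.to_datetime(["2014-12-25", "2014-12-26", "2014-12-27"]),
)

fig, (ax1, ax2) = plt.subplots(1, 2)

# DataFrame.plot: x and y are chosen by column name.
toy.plot(x="day", y="count", ax=ax1)

# Series.plot: x is the index (here the dates), y is the Series's values.
toy["count"].plot(ax=ax2)

# The line drawn by DataFrame.plot uses the 'day' column as its x data.
day_axis_x = list(ax1.lines[0].get_xdata())
counts_left = list(ax1.lines[0].get_ydata())
counts_right = list(ax2.lines[0].get_ydata())
```

Both panels show the same counts; only the quantity on the x-axis differs.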


    import matplotlib.pyplot as plt
    import pandas as pd
    from pandas import Timestamp

    df = pd.DataFrame({
        'animal': {12: 'dog', 44: 'dog', 47: 'dog', 69: 'rabbit', 76: 'rabbit', 84: 'dog', 122: 'rabbit', 162: 'rabbit', 177: 'rabbit', 190: 'rabbit', 217: 'dog', 219: 'dog', 220: 'dog', 226: 'rabbit'},
        'count': {12: 34573, 44: 30676, 47: 41821, 69: 56880, 76: 73172, 84: 30581, 122: 52895, 162: 58430, 177: 57132, 190: 53903, 217: 32001, 219: 35776, 220: 31095, 226: 53809},
        'date': {12: Timestamp('2014-12-29 00:00:00'), 44: Timestamp('2014-12-28 00:00:00'), 47: Timestamp('2014-12-31 00:00:00'), 69: Timestamp('2014-12-29 00:00:00'), 76: Timestamp('2014-12-31 00:00:00'), 84: Timestamp('2014-12-26 00:00:00'), 122: Timestamp('2014-12-25 00:00:00'), 162: Timestamp('2014-12-30 00:00:00'), 177: Timestamp('2014-12-27 00:00:00'), 190: Timestamp('2014-12-28 00:00:00'), 217: Timestamp('2014-12-27 00:00:00'), 219: Timestamp('2014-12-30 00:00:00'), 220: Timestamp('2014-12-25 00:00:00'), 226: Timestamp('2014-12-26 00:00:00')}})

    df['animal'] = df['animal'].astype('category')
    df = df.set_index('date')

    # Store the day of the month and sort by it so the x-axis is in order.
    df['day'] = df.index.day
    df = df.sort_values(by='day')

    # Group by the column name so each key is a plain label such as 'dog'.
    grouped = df.groupby('animal')
    fig, ax = plt.subplots()
    for key, group in grouped:
        group.plot('day', 'count', label=key, ax=ax)

    plt.legend(loc='best')
    plt.show()



For the revised question, if you want the entire date along the x-axis, it is perhaps easiest to use Series.plot (as you were doing in your original code):

    import matplotlib.pyplot as plt
    import pandas as pd
    from pandas import Timestamp

    df = pd.DataFrame({
        'animal': ['dog', 'rabbit', 'rabbit', 'rabbit', 'rabbit', 'rabbit', 'dog', 'rabbit', 'dog', 'dog', 'rabbit', 'dog', 'dog', 'rabbit', 'rabbit', 'dog', 'dog', 'rabbit', 'rabbit', 'rabbit', 'rabbit', 'rabbit', 'rabbit', 'rabbit', 'dog', 'dog', 'dog', 'rabbit', 'rabbit', 'dog'],
        'count': [6136, 14620, 16437, 17273, 15302, 15180, 7177, 16193, 8226, 9741, 26237, 12146, 12910, 25820, 29323, 17294, 15219, 26174, 27112, 27131, 28268, 34059, 39162, 38314, 19807, 20606, 21552, 36499, 42163, 30301],
        'date': [Timestamp('2014-11-12 00:00:00'), Timestamp('2014-11-13 00:00:00'), Timestamp('2014-11-14 00:00:00'), Timestamp('2014-11-15 00:00:00'), Timestamp('2014-11-16 00:00:00'), Timestamp('2014-11-17 00:00:00'), Timestamp('2014-11-18 00:00:00'), Timestamp('2014-11-19 00:00:00'), Timestamp('2014-11-20 00:00:00'), Timestamp('2014-11-21 00:00:00'), Timestamp('2014-12-01 00:00:00'), Timestamp('2014-12-02 00:00:00'), Timestamp('2014-12-03 00:00:00'), Timestamp('2014-12-04 00:00:00'), Timestamp('2014-12-05 00:00:00'), Timestamp('2014-12-06 00:00:00'), Timestamp('2014-12-07 00:00:00'), Timestamp('2014-12-08 00:00:00'), Timestamp('2014-12-09 00:00:00'), Timestamp('2014-12-10 00:00:00'), Timestamp('2014-12-11 00:00:00'), Timestamp('2014-12-12 00:00:00'), Timestamp('2014-12-13 00:00:00'), Timestamp('2014-12-14 00:00:00'), Timestamp('2014-12-15 00:00:00'), Timestamp('2014-12-16 00:00:00'), Timestamp('2014-12-17 00:00:00'), Timestamp('2014-12-18 00:00:00'), Timestamp('2014-12-19 00:00:00'), Timestamp('2014-12-20 00:00:00')]})
    df = df.set_index('date')

    # Group by the column name so each key is a plain label such as 'dog'.
    grouped = df.groupby('animal')
    fig, ax = plt.subplots()
    for key, group in grouped:
        # Series.plot uses the DatetimeIndex as the x-axis.
        group['count'].plot(label=key, ax=ax)

    plt.legend(loc='best')
    plt.show()


