python - How do I combine Pandas dataframes by looking at dates in one dataframe that fall within a date range in another dataframe? -


i have 2 dataframes have employee data below. 1 data file has employee data including dates on employees sick, , other data file has dates on employees worked (i.e. presented date ranges). combine 2 files (hopefully in pandas) looking @ "sick day" particular employee falls in "work range". example, in image/data below, employee 1 sick on 11/25/2015, 12/23/2015, , 10/12/2015. these fall in "work ranges" 11/21/2015 - 11/29/2015, 12/21/2015 - 12/29/2015, , 10/9/2015 - 10/17/2015, respectively.

employee work dates data:

╔══════════╦════════════╦════════════╗  ║ employee ║   datein   ║  dateout   ║  ╠══════════╬════════════╬════════════╣  ║        1 ║ 11/21/2015 ║ 11/29/2015 ║  ║        2 ║ 12/9/2015  ║ 12/14/2015 ║  ║        3 ║ 11/10/2015 ║ 11/19/2015 ║  ║        4 ║ 11/11/2015 ║ 11/17/2015 ║  ║        5 ║ 11/30/2015 ║ 12/8/2015  ║  ║        1 ║ 12/21/2015 ║ 12/29/2015 ║  ║        2 ║ 1/7/2016   ║ 1/12/2016  ║  ║        3 ║ 12/10/2015 ║ 12/19/2015 ║  ║        4 ║ 12/10/2015 ║ 12/16/2015 ║  ║        5 ║ 12/30/2015 ║ 1/7/2016   ║  ║        1 ║ 10/9/2015  ║ 10/17/2015 ║  ║        2 ║ 10/27/2015 ║ 11/1/2015  ║  ║        3 ║ 9/28/2015  ║ 10/7/2015  ║  ║        4 ║ 9/29/2015  ║ 10/5/2015  ║  ╚══════════╩════════════╩════════════╝

employee sick dates data:

╔══════════╦════════════╦═══════════╗  ║ employee ║  sickdate  ║ sickness  ║  ╠══════════╬════════════╬═══════════╣  ║        1 ║ 11/25/2015 ║ flu       ║  ║       10 ║ 11/21/2015 ║ hd        ║  ║       21 ║ 9/20/2015  ║ other     ║  ║        1 ║ 12/23/2015 ║ other     ║  ║        4 ║ 12/13/2015 ║ vacationx ║  ║        7 ║ 7/21/2015  ║ cough     ║  ║        3 ║ 10/1/2015  ║ rash      ║  ║        4 ║ 10/5/2015  ║ other     ║  ║        5 ║ 1/7/2016   ║ eyex      ║  ║        2 ║ 12/12/2015 ║ tanx      ║  ║        1 ║ 10/12/2015 ║ fatiguex  ║  ╚══════════╩════════════╩═══════════╝

consolidated data:

╔══════════╦════════════╦════════════╦════════════╦═══════════╗  ║ employee ║   datein   ║  dateout   ║  sickdate  ║ sickness  ║  ╠══════════╬════════════╬════════════╬════════════╬═══════════╣  ║        1 ║ 11/21/2015 ║ 11/29/2015 ║ 11/25/2015 ║ flu       ║  ║        2 ║ 12/9/2015  ║ 12/14/2015 ║ 12/12/2015 ║ tanx      ║  ║        3 ║ 11/10/2015 ║ 11/19/2015 ║            ║           ║  ║        4 ║ 11/11/2015 ║ 11/17/2015 ║            ║           ║  ║        5 ║ 11/30/2015 ║ 12/8/2015  ║            ║           ║  ║        1 ║ 12/21/2015 ║ 12/29/2015 ║ 12/23/2015 ║ other     ║  ║        2 ║ 1/7/2016   ║ 1/12/2016  ║            ║           ║  ║        3 ║ 12/10/2015 ║ 12/19/2015 ║            ║           ║  ║        4 ║ 12/10/2015 ║ 12/16/2015 ║ 12/13/2015 ║ vacationx ║  ║        5 ║ 12/30/2015 ║ 1/7/2016   ║ 1/7/2016   ║ eyex      ║  ║        1 ║ 10/9/2015  ║ 10/17/2015 ║ 10/12/2015 ║ fatiguex  ║  ║        2 ║ 10/27/2015 ║ 11/1/2015  ║            ║           ║  ║        3 ║ 9/28/2015  ║ 10/7/2015  ║ 10/1/2015  ║ rash      ║  ║        4 ║ 9/29/2015  ║ 10/5/2015  ║ 10/5/2015  ║ other     ║  ╚══════════╩════════════╩════════════╩════════════╩═══════════╝


how do in pandas or python? (thank help!)

you need put data pd.dataframe( ... ) df1 , set_index('employee')

╔══════════╦════════════╦════════════╗  ║ employee ║   datein   ║  dateout   ║  ╠══════════╬════════════╬════════════╣  ║        1 ║ 11/21/2015 ║ 11/29/2015 ║  ║        2 ║ 12/9/2015  ║ 12/14/2015 ║  ║        3 ║ 11/10/2015 ║ 11/19/2015 ║  ║        4 ║ 11/11/2015 ║ 11/17/2015 ║  ║        5 ║ 11/30/2015 ║ 12/8/2015  ║  ║        1 ║ 12/21/2015 ║ 12/29/2015 ║  ║        2 ║ 1/7/2016   ║ 1/12/2016  ║  ║        3 ║ 12/10/2015 ║ 12/19/2015 ║  ║        4 ║ 12/10/2015 ║ 12/16/2015 ║  ║        5 ║ 12/30/2015 ║ 1/7/2016   ║  ║        1 ║ 10/9/2015  ║ 10/17/2015 ║  ║        2 ║ 10/27/2015 ║ 11/1/2015  ║  ║        3 ║ 9/28/2015  ║ 10/7/2015  ║  ║        4 ║ 9/29/2015  ║ 10/5/2015  ║  ╚══════════╩════════════╩════════════╝

then put data pd.dataframe( ... ) df2 , set_index('employee')

╔══════════╦════════════╦═══════════╗  ║ employee ║  sickdate  ║ sickness  ║  ╠══════════╬════════════╬═══════════╣  ║        1 ║ 11/25/2015 ║ flu       ║  ║       10 ║ 11/21/2015 ║ hd        ║  ║       21 ║ 9/20/2015  ║ other     ║  ║        1 ║ 12/23/2015 ║ other     ║  ║        4 ║ 12/13/2015 ║ vacationx ║  ║        7 ║ 7/21/2015  ║ cough     ║  ║        3 ║ 10/1/2015  ║ rash      ║  ║        4 ║ 10/5/2015  ║ other     ║  ║        5 ║ 1/7/2016   ║ eyex      ║  ║        2 ║ 12/12/2015 ║ tanx      ║  ║        1 ║ 10/12/2015 ║ fatiguex  ║  ╚══════════╩════════════╩═══════════╝

finally, df = df1.join(df2).reset_index()


Comments

Popular posts from this blog

get url and add instance to a model with prefilled foreign key :django admin -

css - Make div keyboard-scrollable in jQuery Mobile? -

ruby on rails - Seeing duplicate requests handled with Unicorn -