DataFrame combine_first() - Combining Two DataFrames

Memory! 2022. 8. 30. 20:49

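While preprocessing data, you occasionally need to fill missing values in one table from another table, or you find several rows that share the same key within a single table.
combine_first() combines two DataFrames by keeping the values of the calling DataFrame as they are and filling only its NA (missing) values with values taken from the other DataFrame.

Let's look at the code.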

import pandas as pd

df1 = pd.DataFrame({'A': [None, 0], 'B': [4, None], 'C': [1, None]})
df2 = pd.DataFrame({'B': [3, 3], 'C': [1, 1]})

df1

	A	B	C
0	NaN	4.0	1.0
1	0.0	NaN	NaN

df2

	B	C
0	3	1
1	3	1

df1.combine_first(df2)

	A	B	C
0	NaN	4.0	1.0
1	0.0	3.0	1.0

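You can see that the NaN values in columns B and C of the second row of df1 have been filled with the values from the second row of df2.
※ The key point of combine_first() is that values already present in df1 are not changed: in the first row, B stays 4 even though df2 has 3 there. Column A, which df2 does not have, is also carried over as it is.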
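
The behavior described above can be checked with a short self-contained script. This is only a minimal sketch: the names `result`, `left`, `right`, and `aligned` are introduced here for illustration and do not appear in the original post. The second half uses two hypothetical frames indexed by a key label to show that the fill is matched on index labels rather than on row position, and that labels found only in the other frame are added to the result.

```python
import pandas as pd

# Same frames as in the post.
df1 = pd.DataFrame({'A': [None, 0], 'B': [4, None], 'C': [1, None]})
df2 = pd.DataFrame({'B': [3, 3], 'C': [1, 1]})

# Values present in df1 win; only its NaN cells are filled from df2.
result = df1.combine_first(df2)
print(result)

# Hypothetical frames indexed by a key label (not from the original post).
left = pd.DataFrame({'B': [None, 4.0]}, index=['a', 'b'])
right = pd.DataFrame({'B': [3.0, 5.0]}, index=['a', 'c'])

# 'a' is filled from right, 'b' is kept from left, 'c' exists only in right.
aligned = left.combine_first(right)
print(aligned)
```

If the key lives in an ordinary column rather than in the index, one option is to call set_index() on that column in both frames before combine_first(), so that rows are matched by key.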

- To append two DataFrames one after the other, use the pd.concat() function.
- To join two DataFrames on a single key, use merge().
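
For comparison, the two alternatives in the list above might look like the following. This is a rough sketch with made-up data: the frames `upper`, `lower`, and `extra` and the column name `key` are assumptions chosen for illustration, not part of the original post.

```python
import pandas as pd

# Hypothetical frames for illustration only.
upper = pd.DataFrame({'key': [1, 2], 'B': [4, 3]})
lower = pd.DataFrame({'key': [3, 4], 'B': [7, 8]})
extra = pd.DataFrame({'key': [1, 2], 'C': [10, 20]})

# concat(): stack the rows of one frame under the other and renumber the index.
stacked = pd.concat([upper, lower], ignore_index=True)
print(stacked)

# merge(): join two frames on a shared key column, keeping every row of upper.
joined = upper.merge(extra, on='key', how='left')
print(joined)
```

Unlike combine_first(), neither call fills NaN cells in place: concat() only adds rows (or columns), and merge() adds the other frame's columns next to the matching rows.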
