Big Data Self-Study - Tacademy Lectures on Data Processing with Python

Jupyter

df = pd.read_csv('gapminder.csv', index_col='Unnamed: 0')  # needs: import pandas as pd

DataFrame: a table built by combining Series, one per column, that share the same index.

fillna: decides how missing values are handled, that is, what they get replaced with.
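As a small illustration of the choices fillna offers, here is a sketch on a made-up DataFrame. The data and the variable names (toy, filled_const, and so on) are hypothetical and not from the lecture.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one missing value per column.
toy = pd.DataFrame({'income': [100.0, np.nan, 300.0],
                    'continent': ['Asia', None, 'Europe']})

# Option 1: replace missing values with a constant.
filled_const = toy['income'].fillna(0)

# Option 2: replace missing values with the column mean
# (mean() skips missing values by default).
filled_mean = toy['income'].fillna(toy['income'].mean())

# Option 3: choose a different fill value per column with a dict.
filled_all = toy.fillna({'income': 0, 'continent': 'unknown'})
```

Which option fits depends on the data: a constant is simple, while the mean keeps the column average unchanged.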

by_year = df.groupby('year')

df['continent'].unique() => array([' ', ' ', ' ', ' '], dtype=object), the distinct continent values

df['continent'].nunique() => the number of distinct continent values
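The three calls above can be tried on a tiny stand-in table. The column names follow the notes, but the rows are invented for illustration and are not the gapminder data.

```python
import pandas as pd

# Made-up rows; only the column names come from the notes.
toy = pd.DataFrame({
    'year': [2000, 2000, 2010, 2010],
    'continent': ['Asia', 'Europe', 'Asia', 'Europe'],
    'life_exp': [60.0, 70.0, 65.0, 75.0],
})

# groupby only defines the groups; an aggregation such as mean() produces values.
by_year = toy.groupby('year')
mean_life_exp = by_year['life_exp'].mean()   # one value per year

continents = toy['continent'].unique()       # NumPy array of the distinct values
n_continents = toy['continent'].nunique()    # count of distinct values
```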

The merge function
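The notes only name merge, so the following is a minimal sketch of what it does, using two invented tables joined on a shared 'country' column. The tables and column values are assumptions for illustration.

```python
import pandas as pd

# Two hypothetical tables that share a 'country' key.
left = pd.DataFrame({'country': ['Korea', 'Japan', 'France'],
                     'continent': ['Asia', 'Asia', 'Europe']})
right = pd.DataFrame({'country': ['Korea', 'France', 'Brazil'],
                      'income': [30000, 40000, 15000]})

# Default how='inner': keep only keys present in both tables.
inner = pd.merge(left, right, on='country')

# how='outer': keep every key; cells with no match become NaN.
outer = pd.merge(left, right, on='country', how='outer')
```

The how parameter also accepts 'left' and 'right', which keep all keys from one side only.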

Sorting

df.sort_values(by='life_exp', ascending=False).head(10)

ascending=False sorts from the highest value to the lowest; the default, ascending=True, sorts from lowest to highest.
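To make the direction of ascending concrete, here is a short sketch on invented values. The 'country' column and its entries are hypothetical.

```python
import pandas as pd

# Made-up life expectancy values for three placeholder countries.
toy = pd.DataFrame({'country': ['A', 'B', 'C'],
                    'life_exp': [70.0, 82.5, 55.1]})

# ascending=False: largest value first.
desc = toy.sort_values(by='life_exp', ascending=False)

# Default (ascending=True): smallest value first.
asc = toy.sort_values(by='life_exp')
```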

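Income by year and by continent (pivot_table averages the values by default)

In [ ]: df.pivot_table(values='income', index=['year', 'continent'])

Pivot table: shows the data in a merged, summarized form, with one row per combination of the index values.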
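A small sketch of what the pivot table computes, on invented rows that reuse the column names from the notes. It also shows that, with the default aggregation, the result matches a groupby followed by mean().

```python
import pandas as pd

# Made-up rows; two of them share the same (year, continent) pair.
toy = pd.DataFrame({
    'year': [2000, 2000, 2000, 2010],
    'continent': ['Asia', 'Asia', 'Europe', 'Asia'],
    'income': [100.0, 300.0, 500.0, 400.0],
})

# One row per (year, continent); duplicates are averaged (default aggfunc is the mean).
table = toy.pivot_table(values='income', index=['year', 'continent'])

# The same numbers via groupby, for comparison.
same = toy.groupby(['year', 'continent'])['income'].mean()
```

Passing aggfunc='sum' (or another aggregation) to pivot_table changes how rows in the same group are combined.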

In [ ]: csv_df = pd.read_csv('example')

csv_df

How to save the loaded data to a file

In [ ]: csv_df.to_csv('example2', index=False)
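The effect of index=False is easiest to see in a round trip. The sketch below writes a made-up DataFrame to a temporary directory both ways and reads it back; the file names and data are hypothetical.

```python
import os
import tempfile

import pandas as pd

toy = pd.DataFrame({'year': [2000, 2010], 'income': [100, 200]})

with tempfile.TemporaryDirectory() as tmp:
    with_index = os.path.join(tmp, 'with_index.csv')
    without_index = os.path.join(tmp, 'without_index.csv')

    toy.to_csv(with_index)                  # index is written as a header-less first column
    toy.to_csv(without_index, index=False)  # index is left out

    # pandas names a header-less column 'Unnamed: 0' when reading it back.
    cols_with = pd.read_csv(with_index).columns.tolist()
    cols_without = pd.read_csv(without_index).columns.tolist()

    # index_col turns that column back into the index.
    restored = pd.read_csv(with_index, index_col='Unnamed: 0')
```

This is why a file saved with the index included is typically read with index_col='Unnamed: 0'.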

How to read an Excel file

: load the data from the sheet named Sheet1

xlsx_df = pd.read_excel('Excel_Sample.xlsx', sheet_name='Sheet1')

How to save the loaded data

xlsx_df.to_excel('Excel_Sample2.xlsx', sheet_name='Sheet1')

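html_df[0].head()  => first rows of the first table (html_df is a list of DataFrames, presumably the result of pd.read_html)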

Python visualization

To do the matplotlib exercises, matplotlib has to be installed first.

In Anaconda: conda install matplotlib

To use it, import it with: import matplotlib.pyplot as plt

%matplotlib inline

Generating data values

import numpy as np

x = np.linspace(0, 5, 11)  : 11 evenly spaced values from 0 to 5

y = x ** 2  # assumed here; the notes never define y
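A quick check of what linspace returns: both endpoints are included, so 11 points over 0 to 5 give a spacing of 0.5. The names step and same_as_arange are only for illustration.

```python
import numpy as np

# 11 evenly spaced values from 0 to 5, endpoints included.
x = np.linspace(0, 5, 11)

# Spacing between neighbouring points: (5 - 0) / (11 - 1).
step = x[1] - x[0]

# The same grid expressed with arange, which takes a step instead of a count.
same_as_arange = np.allclose(x, np.arange(0, 5.5, 0.5))
```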

# Functional method (drawing a graph)

plt.plot(x, y)

plt.xlabel('X Label')

plt.ylabel('Y Label')

Drawing several plots in one figure.

In [ ]: plt.subplot(1, 2, 1)

The third argument selects which plot to draw on: (1, 2, 1) means 1 row, 2 columns, first plot.

plt.plot(x, y)

plt.subplot(1, 2, 2)

plt.plot(y, x)
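Put together, the functional-style subplot steps might look like the sketch below. It assumes y = x ** 2, since the notes do not define y, and it selects the non-interactive Agg backend so that it also runs outside Jupyter; both are assumptions.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 11)
y = x ** 2  # assumed definition of y

plt.figure()

# 1 row, 2 columns, draw on the first plot.
plt.subplot(1, 2, 1)
plt.plot(x, y)
plt.xlabel('X Label')

# 1 row, 2 columns, draw on the second plot.
plt.subplot(1, 2, 2)
plt.plot(y, x)

n_axes = len(plt.gcf().axes)  # number of plots in the current figure
plt.close()
```

In the functional style each plt call acts on whichever subplot was selected last, which is why the order of the calls matters.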

Drawing with the object-oriented method

First create a figure object.

# Object-oriented method

fig = plt.figure()

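add_axes takes [left, bottom, width, height], each as a fraction of the figure size (0 to 1)

ax1 = fig.add_axes([0.1, 0.1, 0.8, 0.8])  # the notes list only three values; the 0.8 height is assumed

ax2 = fig.add_axes([0.2, 0.5, 0.4, 0.3])

ax1.plot(x, y)

ax2.set_xlabel('x')

ax2.set_ylabel('y')  # the notes write axes here; ax2 is assumed

ax2.set_title('Title')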

The figure can be saved to a file.

fig.savefig('my_pic.png', dpi=200)

: the file extension sets the output format (png), and dpi sets the resolution (200)
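The object-oriented steps, from creating the figure to saving it, can be sketched end to end as follows. The second axes position comes from the notes; y = x ** 2, the 0.8 height of the first axes, the Agg backend, and the temporary output directory are assumptions for illustration.

```python
import os
import tempfile

import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 11)
y = x ** 2  # assumed definition of y

fig = plt.figure()

# [left, bottom, width, height] as fractions of the figure size.
ax1 = fig.add_axes([0.1, 0.1, 0.8, 0.8])   # main axes (0.8 height assumed)
ax2 = fig.add_axes([0.2, 0.5, 0.4, 0.3])   # smaller axes drawn inside the main one

ax1.plot(x, y)
ax1.set_xlabel('x')
ax1.set_ylabel('y')
ax1.set_title('Title')
ax2.plot(y, x)

with tempfile.TemporaryDirectory() as tmp:
    out_path = os.path.join(tmp, 'my_pic.png')
    fig.savefig(out_path, dpi=200)          # format from the extension, resolution from dpi
    saved = os.path.exists(out_path)

n_axes = len(fig.axes)
plt.close(fig)
```

Because each axes is its own object, labels and titles are set with set_xlabel, set_ylabel, and set_title on that object rather than through plt.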

fig = plt.figure()

ax = fig.add_axes([0, 0, 1, 1])

ax.plot(x, x ** 2 , label = 'X squared')

ax.plot(x, x ** 3)

ax.legend()
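One detail worth noting: legend() only lists lines that were given a label, so a line plotted without one does not appear in it. The sketch below labels both lines; the 'X cubed' label, the loc value, and the Agg backend are assumptions added for illustration.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: non-interactive backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 5, 11)

fig = plt.figure()
ax = fig.add_axes([0, 0, 1, 1])

# Each labelled line becomes one legend entry.
ax.plot(x, x ** 2, label='X squared')
ax.plot(x, x ** 3, label='X cubed')   # hypothetical label, not in the notes

ax.legend(loc='upper left')           # loc chooses where the legend box is placed

labels = [text.get_text() for text in ax.get_legend().get_texts()]
plt.close(fig)
```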
