word2vec¶

word2vec preserves the contextual meaning of a word when turning it into a vector

from konlpy.tag import Kkma
from konlpy.utils import pprint
from gensim.models.word2vec import Word2Vec

import warnings
warnings.filterwarnings("ignore")

input data
Load the Naver news articles

data = [["KB금융지주가 KB증권, KB캐피탈, KB부동산신탁 등 3개 계열사 대표를 새로 선정했다. 가장 큰 관심이 쏠렸던 KB증권 대표이사 후보에는 박정림 KB증권 부사장 겸 KB국민은행 부행장을 선정했다. 박 후보가 주주총회 등을 거쳐 최종 선임되면 금융투자업계 최초 여성 최고경영자(CEO)가 등장한다. KB금융은 19일 계열사 대표이사후보 추천위원회(이하 대추위)를 열고 KB증권·KB캐피탈·KB부동산신탁 등 7개 계열사 대표 후보를 선정했다고 밝혔다. KB증권 신임 대표이사 후보에는 박정림 KB증권 부사장 겸 KB국민은행 부행장과 김성현 KB증권 부사장을 추천했다. KB증권은 기존의 복수 대표체제를 유지한다. KB캐피탈에는 황수남 KB캐피탈 전무, KB부동산신탁에는 김청겸 KB국민은행 영등포 지역영업그룹대표가 각각 대표이사 후보로 선정됐다. 양종희 KB손해보험 대표, 조재민·이현승 KB자산운용 대표, 김해경 KB신용정보 대표는 재선정됐다. KB데이타시스템은 이른 시일 내에 적합한 인사를 찾아 추후 추천할 계획이다. 박정림 부사장이 KB증권 대표에 취임하면 증권사 최초로 여성 CEO가 탄생한다. 박 후보는 KB금융지주에서 WM(자산관리)과 리스크, 여신 등 요직을 두루 거쳤다. 그룹 WM 부문 시너지영업을 이끌며 리더십을 발휘하고 있다는 점을 높게 평가 받는다. 김성현 부사장은 대표적인 투자은행(IB) 전문가다. 투자자산 다변화를 통해 시장 지위를 바꿀 수 있는 리더십을 갖췄다는 평가를 받는다. 신임 대표는 20∼21일 계열사 대표이사후보 추천위원회의 최종 심사와 추천을 거쳐 주주총회에서 확정할 계획이다. 신임 대표 임기는 2년, 재선정 대표 임기는 1년이다."],
["서울 여의도에서 ‘카카오 카풀’에 반대하는 전국 택시업계 관계자들이 20일 대규모 집회를 벌인다. 택시기사 최 모 씨의 분신 등을 계기로 업계가 ‘총력투쟁’을 예고한 가운데 집회‧시위 시간이 출‧퇴근 시간과 겹쳐 이 시각 여의도 주변에 극심한 교통체증이 예상된다. 19일 경찰과 택시업계 등에 따르면 20일 오후 2시 전국택시노동조합연맹, 전국민주택시노동조합연맹, 전국개인택시운송사업조합연합회 등 4개 단체가 서울 여의도 국회 앞 의사당대로에서 3차 집회를 연다. 강신표 전국택시노동조합연맹 위원장은 집회를 하루 앞두고 열린 기자회견에서 “죽든지 살든지 총력 투쟁을 할 것”이라고 말했다. ‘국회를 포위하겠다던 기존 계획은 그대로 진행되느냐’는 질문에 강 위원장은 “그렇다”면서도 “만약 (경찰이) 막으면 할 수 없겠지만, 하는 데까지 최선을 다해 적폐 1호인 국회를 반드시 심판할 것”이라고 강조했다. 강 위원장은 “내일은 제주도를 포함한 전국의 택시가 운행을 중지한다”며 “앞으로 4차, 5차 집회 일정이 잡히면 그 날마다 택시 운행이 정지될 것”이라고 말했다. 이어 “자꾸 시민에게 불편을 드려죄송하지만 생존권을 지키기 위해 여의도 국회 앞에 모일 수밖에 없는 절박한 상황을 헤아려 주시길 바란다”고 덧붙였다."],
       ["'내보험 찾아줌' 홈페이지가 접속자 폭주로 인해 접속 대기시간이 길어지고 있다. 19일 오후 9시20분 현재 '내보험 찾아줌' 홈페이지에는 접속자 수가 몰리며 서비스 이용이 불가능한 상태다. 현재 사이트 접속자 수는 4641여명에 달한다. 금감원에 따르면 11월 말 기준 소비자가 찾아가지 않은 숨은 보험금은 약 9조8130억원인 것으로 나타났다.지난해 12월부터 지난달까지 숨은보험금 찾아주기 안내 활동을 통해 약 3조125억원(240만5000건)이 주인을 찾았다. 업권별로는 생명보험회사가 약 2조7907억원(222만건), 손해보험회사가 2218억원(18만 5000건)을 찾아줬다. 금감원은 20일부터 기존 '내보험 찾아줌' 서비스를 개선해 찾은 숨은보험금을 각 보험회사 온라인 청구시스템에 바로 접속할 수있도록 링크를 제공한다고 밝혔다. 기존 숨은 보험금 청구 시에는 소비자가 개별적으로 해당 보험회사 홈페이지, 콜센터, 계약 유지·관리 담당 설계사 등을 찾아 별도로 진행해야 하는 불편이 있었다. 앞으로는 '내보험 찾아줌' 홈페이지에 접속해 이름, 휴대폰 번호, 주민등록번호를 입력 후 휴대폰 인증을 거치면 생명보험 25개사, 손해보험 16개사 등 모두 41개 보험회사를 대상으로 숨은 보험금을 조회할 수 있다. 숨은 보험금이 있는 경우 해당 보험사에 보험금 지급청구를 하면 영업일 3일 이내 금액을 지급한다. 단 이미 보험금을 청구해 심사 중이거나 지급정지 등으로 청구할 수 없는 보험금은 조회되지 않는다."],
       ["KB증권은 김성현 KB증권 IB총괄 부사장과 박정림 KB증권 WM 부문 부사장을 신임 대표로 각각 선임했다고 19일 밝혔다. 윤경은, 전병조 대표이사가 자리에서 물러났지만 각자 대표이사 체제는 유지된다. 이는 WM과 IB의 부문을 각각 집중하기 위함으로 풀이된다. 특히 박 신임 대표는 증권업계 첫 여성 최고경영자(CEO)이다. 박 신임 대표는 서울대 경영학과·경영대학원 출신으로 1986년 체이스맨해튼 서울지점, 조흥은행, 삼성화재 등을 거쳐 2004년 처음으로 KB국민은행에 들어왔다. 당시 시장운영리스크 부장을 시작으로 2012년엔 WM본부장, 2014년 리스크관리그룹 부행장, 2015년 KB금융지주 리스크관리책임자 부사장 겸 리스크관리그룹 부행장을 맡았고 2016년엔 여신그룹 부행장을 맡았다. 작년부턴 KB금융 WM총괄 부사장 겸 은행 WM그룹 부행장 겸 KB증권 WM부문 부사장을 맡고 있다. KB금융지주는 박 신임 대표에 대해 WM, 리스크, 여신 등 폭넓은 업무 경험을 바탕으로 수익 확대에 대한 실행역량을 보유하고 있다고 밝혔다. 그룹 WM 부문의 시너지영업을 진두지휘하며 리더십을 발휘했다는 평가다. 현 IB총괄 부사장인 김성현 신임 대표는 IB부문을 총괄한다. 김 신임 대표이사는 연세대 경제학과를 졸업하고 1988년 대신증권에 입사한 이후 한누리투자증권을 거쳐 2008년 KB투자증권 기업금융본부장으로 임명됐다. 이후 2015년부터 KB투자증권 IB부문에서 일한 전문가다. KB금융지주는 김 신임 대표에 대해 IB 전문가로 투자자산 다변화 등을 통해 시장 지위를 개선시킬 수 있는 검증된 리더십을 보유했다고 평가했다."], 
        ["""서민금융진흥원은 지난 18일 서울 청계천로 본원에서 제2차 서민금융 전문가 간담회를 개최했다소 19일 밝혔다.

이번 간담회는 서민금융, 복지, 자활사업 등 각 분야 전문가들이 참석한 가운데, 정책서민금융 지원의 방향성에 대해서 의견을 청취하기 위해 마련됐다. 이날 이 원장은 "소득양극화와 고용부진 심화 등으로 서민·취약계층, 자영업자들의 경제적 어려움이 커지는 가운데 사회안전망으로서 서민금융의 역할이 중요한 시점"이라며, "현재 8등급 이하자가 263만명이고 이들중 74%가 연체중인 상황에서 정상적인 금융 이용이 어려운 취약계층에게 꼭 필요한 서민금융 지원을 위해 노력해야 한다"고 강조했다.

이어서 이 원장은 "현장 전문가의 의견을 반영하여 취약계층을 위한 금융과 함께 금융교육, 컨설팅, 종합상담 등 자활기반을 구축하도록 힘쓰겠다"고 밝혔다. 이날 참석자들은 '정책서민금융지원에 대한 방향성'에 대하여 다양한 의견을 제시했다.

진흥원은 이날 간담회의 다양한 제언들을 바탕으로 수요자가 체감할 수 있는 실질적인 방안 마련을 위해 더욱 노력하고, 지속적으로 서민금융 현장의 폭넓은 의견을 청취할 계획이다.
"""],
       ["""JB금융지주는 차기 회장 후보자로 김기홍 JB자산운용 대표(사진)를 선정했다.

19일 JB금융지주 임원후보추천위원회는 최종 후보군에 대해 PT발표와 심층면접을 진행한 후, 김 대표를 최종 후보자로 선정했다.

이날 PT발표와 심층면접에선 후보자의 JB금융그룹의 성장 비전과 전문성, 리더십, 기업의 사회적 책임 등 후보자의 역량에 대해 평가했으며, 김 대표는 은행을 비롯 보험사, 자산운용사 등 금융권 임원 경험을 바탕으로 금융 전반에 대한 전문적인 지식과 넓은 식견을 갖추고 있다는 점이 높이 평가됐다.

JB금융지주 임추위 관계자는 "김 후보자가 20년 이상 금융산업에 종사한 경험을 바탕으로 금융에 대한 전문적인 식견 뿐 만 아니라 리더십과 소통능력도 탁월하다"며 "급변하는 금융환경에 대응하고 계열사 간 시너지 창출을 통해 기업가치를 극대화하는 등 JB금융그룹을 최고의 소매전문 금융그룹으로 발전시킬 적임자"라고 밝혔다. 이에 따라 김 내정자는 내년 3월 정기주주총회와 이사회를 거쳐 대표이사 회장으로 선임 될 예정이다.
"""], 
        ["""1800만 근로자의 2018년 귀속 근로소득에 대한 연말정산 신고기간이 한 달여 앞으로 다가왔다.

올해 연말정산에는 중소기업 취업 청년에 대한 소득세 감면이 확대되고 도서·공연비 지출액에 대한 신용카드 사용액에 소득공제가 적용되는 등 새로운 기준이 적용되기 때문에 바뀐 공제 기준을 꼼꼼히 챙기는 것이 중요하다.

국세청은 올해 근로소득이 발생한 근로자는 내년 2월분 급여를 지급받을 때까지 연말정산을 신고해야 한다고 20일 밝혔다.

◇올해부터 달라지는 주요 공제 항목

올해 연말정산부터는 중소기업 취업 청년에 대한 소득세 감면을 받을 수 있는 대상 연령이 기존 29세에서 34세로 확대된다. 감면율도 70%에서 90%로 확대되고 감면 적용기간도 3년에서 5년으로 확대된다.

총급여액 7000만원 이하 근로자는 도서·공연비를 신용카드로 결제한 경우 해당 비용을 최대 100만원까지 추가 소득공제 받을 수 있다. 올 7월1일 이후 도서공연비로 지출한 금액의 소득공제율 30%가 적용되기 때문이다.

건강보험 산정특례 대상자로 등록된 부양가족을 위해 지출한 의료비는 기존 700만원 한도가 폐지되고 올해부터 전액공제를 받을 수 있게 됐다.

총급여액이 5500만원이거나 종합소득금액이 4000만원 초과 근로자의 경우 월세액 세액공제율이 10%에서 12%로 인상된다. 월세액 세액공제 한도는 750만원이며 임대차 계약서상 주소지와 계약기간 등 내역을 정확히 기재해야 공제를 받을 수 있다.

임차보증금 3억원 이하의 주택 임차보증금 반환 보증 보험료도 올해 연말정산부터 보험료 세액공제를 받을 수 있으며, 생산직 근로자의 초과근로수당 비과세 적용 시 기준이 되는 월정액 급여액은 150만원 이하에서 190만원 이하로 상향된다.

6세 이하 자녀 세액공제는 아동수당 지급에 따라 올해부터 폐지된다. 올 연말정산부터는 종교단체가 종교인에게 지급한 소득도 신고대상에 포함된다."""]
       ]

Apply the Korean natural language processing class

kkma = Kkma()

# kkma.sentences(data[0][0])

# sentences = [kkma.sentences(da[0]) for da in data]
# word_list = [[kkma.nouns(w) for w in sentence] for sentence in sentences]

# word_list

# word_list = []
# sentences = []
# for da in data:
# #     print(da)
#     sentences.append(kkma.sentences(da[0]))
#     for s in sentences:
#         for w in s:
#             for t in kkma.nouns(w):
#                 if len(t) >= 2:
#                     word_list.append(t)
# #             word_list.append(kkma.nouns(w))

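Preprocess the data for word2vec training

sentences = []
list_vec = []
for da in data:
    # split each article into sentences
    sentences.append(kkma.sentences(da[0]))
    # NOTE: this loop sits inside the article loop and walks over every
    # article collected so far, so earlier articles contribute their
    # sentences again on each pass; dedent it one level to count each once
    for s in sentences:
        for w in s:
            # extract the nouns of one sentence
            list_vec.append(kkma.nouns(w))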

# keep only nouns that are at least two characters long
word_list = [[w for w in l if len(w) >= 2] for l in list_vec]

modeling

# word_list

# word_list_tot = word_list[0]
# for i in range(len(word_list)-1):
#     word_list_tot = word_list_tot + word_list[i+1]

# word_list_tot

# sg : {0, 1}, optional
# Training algorithm: 1 for skip-gram; otherwise CBOW.
# size : embedding dimension (gensim 4 renamed size to vector_size and iter to epochs)
embedding_model = Word2Vec(word_list, size=100, window=5, min_count=2, workers=3, iter=1000, sg=1, sample=1e-3)

# word_count = ["증권"]
# word_top = embedding_model.wv.most_similar(positive=["선임", "대표", "증권"], topn=10)
# word_top
# [w[0] for w in word_top]

# embedding_model.wv.distances("증권")

# embedding_model.wv.vectors.shape

K-means clustering on the word vectors

from sklearn.cluster import KMeans

word_vectors = embedding_model.wv.vectors  # feature vectors of the vocabulary (syn0 is the deprecated alias)
num_clusters = int(word_vectors.shape[0] / 50)  # 1/50 of the vocabulary size, i.e. about 50 words per cluster
print(num_clusters)

9

kmeans_clustering = KMeans(n_clusters=num_clusters)
idx = kmeans_clustering.fit_predict(word_vectors)

idx = list(idx)
names = embedding_model.wv.index2word
word_centroid_map = {names[i]: idx[i] for i in range(len(names))}

Check the results

for c in range(num_clusters):
    # print the cluster number
    print("\ncluster {}".format(c))

    # collect the words assigned to this cluster
    words = [w for w, cluster_id in word_centroid_map.items() if cluster_id == c]
    print(words)

cluster 0
['지급', '소득', '신용', '기준', '대상', '경우', '금액', '확대', '공제', '연말', '정산', '올해', '지급청구', '3일', '이내', '근로자', '적용', '세액', '내년', '근로', '기간', '감면', '도서', '공연비', '급여액', '근로소득', '신고', '중소기업', '취업', '청년', '소득세', '신용카드', '카드', '때문', '지출', '공제율', '한도', '폐지', '초과', '수당']

cluster 1
['택시', '집회', '전국', '위원장', '시너지', '택시업계', '총력', '투쟁', '경찰', '택시노동조합연맹', '노동', '조합', '연맹', '발휘', '오후', '이후', '주택', '포함', '대규모', '택시기사', '기사', '분신', '계기', '2시', '전국민', '전국민주택', '강신', '하루', '열린', '기자', '기자회견', '회견', '질문', '만약', '최선', '적폐', '1호', '심판', '내일', '제주', '제주도', '운행', '중지', '4차', '5차', '일정', '입력', '인증', '생명보험', '25', '25개', '집중', '풀이', '진두지휘', '1988', '1988년', '대신', '대신증권', '입사', '누리', '누리투자증권', '2008', '2008년', '급변', '금융환경', '환경', '대응', '창출', '기업가치', '가치', '극대화', '소매', '소매전문', '발전', '적임자']

cluster 2
['대표', '증권', '후보', '신임', '이사', '선정', '은행', '부사장', '투자', '20', '대표이사', '부문', '추천', '그룹', '업계', '19', '19일', '국민', '국민은행', '계열사', '영업', '부행장', '캐피탈', '부동산', '부동산신탁', '신탁', '박정', '여성', '주주', '총회', '최종', '추천위원회', '위원회', '주주총회', '최초', '대표이사후보', '선임', '최고', '이하', '심사', '총괄', '경영자', '체제', '3개', '관심', '금융투자업계', '등장', '대추', '7개', '복수', '영등포', '지역', '영업그룹', '박정림', '정림', '부사', '취임', '증권사', '탄생', '21', '확정', '임기', '2년', '1년', '전병', '자리', '각자', '증권업계', '서울대', '경영학과', '경영', '경영대학원', '연세대', '경제', '경제학과', '학과', '졸업', '회장', '내정자', '3월', '정기', '정기주주총회', '이사회', '예정']

cluster 3
['보험', '보험금', '기존', '내보험', '홈페이지', '회사', '청구', '20일', '손해', '유지', '관리', '접속자', '접속', '보험회사', '손해보험', '시스템', '해당', '서비스', '금감원', '소비자', '생명', '조회', '개선', '이용', '종희', '조재', '계약', '등록', '폭주', '대기', '9시', '20분', '수가', '불가능', '상태', '생명보험회사', '2조', '7907', '7907억원', '222', '만건', '온라인', '청구시스템', '링크', '제공', '개별적', '센터', '담당', '설계사', '별도', '이름', '휴대폰', '번호', '주민', '주민등록번호', '16', '개사', '41', '41개', '지급정지']

cluster 4
['가운데', '시간', '상황', '현재', '총력투쟁', '예고', '퇴근', '시각', '주변', '교통', '체증', '예상', '사이트', '4641', '여명', '중요', '자영업자', '경제적', '어려움', '사회', '안전망', '역할', '시점', '등급', '자가', '263', '263만명', '74', '연체', '중인', '정상적', '필요']

cluster 5
['금융', '전문가', '서민', '서민금융', '의견', '이날', '간담회', '지원', '18', '본부장', '2015', '2015년', '투자증권', '대표적', '진흥원', '자활', '정책', '방향성', '청취', '마련', '원장', '취약', '취약계층', '계층', '노력', '현장', '다양', '관리그룹', '관리책임자', '책임자', '2016', '2016년', '작년', '기업금융본부장', '임명', '종합', '18일', '청계', '천로', '본원', '2차', '개최', '이번', '복지', '자활사업', '분야', '참석', '양극화', '고용', '고용부진', '부진', '심화', '반영', '금융교육', '교육', '컨설팅', '종합상담', '상담', '참석자', '서민금융지원', '제시', '제언', '수요자', '체감', '실질적', '방안', '지속적']

cluster 6
['정지', '5000', '5000건', '수남', '전무', '12', '11', '11월', '9조', '8130', '8130억원', '지난해', '12월', '지난달', '주기', '안내', '활동', '3조', '125', '125억원', '240', '주인', '손해보험회사', '2218', '2218억원', '삼성', '화재', '2004', '2004년', '처음', '자활기반', '기반', '구축']

cluster 7
['여의도', '국회', '계획', '서울', '진행', '불편', '사업', '데이타', '데이타시스템', '시일', '적합', '인사', '추후', '단체', '카카오', '카풀', '반대', '노동조합연맹', '개인', '운송', '운송사업조합', '연합회', '4개', '의사당', '대로', '3차', '포위', '시민', '생존권', '대학원', '출신', '1986', '1986년', '체이스', '맨해튼', '서울지점', '지점', '조흥', '조흥은행', '당시', '시장운영', '운영', '부장', '시작', '2012', '2012년', '2014']

cluster 8
['지주', '자산', '리더십', '평가', '리스크', '여신', '시장', '자산운용', '운용', '투자자산', '다변화', '지위', '바탕', '관계자', '경험', '보유', '기업', '후보자', '김해경', '신용정보', '정보', '재선', '자산관리', '요직', '보험사', '역량', '업무', '수익', '실행', '검증', '임원', '발표', '심층', '면접', '금융그룹', '전문', '전문적', '식견', '차기', '사진', '임원후보', '성장', '비전', '사회적', '책임', '금융권', '전반', '지식', '추위', '20년', '이상', '금융산업', '산업', '종사', '소통', '소통능력', '능력']
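
The grouping loop above can also be written by inverting the word-to-cluster map once. A minimal sketch, assuming a small hypothetical `word_centroid_map` (the words and cluster ids below are made up for illustration, not taken from the trained model) and a hypothetical helper name `group_by_cluster`:

```python
from collections import defaultdict

# hypothetical word -> cluster id map, standing in for word_centroid_map
word_centroid_map = {"택시": 1, "집회": 1, "보험": 3, "보험금": 3, "증권": 2}


def group_by_cluster(word_to_cluster):
    """Invert a word -> cluster map into cluster -> list of words."""
    clusters = defaultdict(list)
    for word, cluster_id in word_to_cluster.items():
        clusters[cluster_id].append(word)
    return dict(clusters)


clusters = group_by_cluster(word_centroid_map)
for cluster_id in sorted(clusters):
    print("cluster {}: {}".format(cluster_id, clusters[cluster_id]))
```

This walks over the map once instead of once per cluster, which matters only when the vocabulary is large.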

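The word2vec vectors have 100 dimensions, so they are reduced to 2 dimensions for visualization,
using t-SNE so that the relationships between words are preserved during the reduction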

from sklearn.manifold import TSNE
import matplotlib.font_manager as fm
import matplotlib.pyplot as plt
import matplotlib

path_gothic = "/home/ururu/fonts/NanumGothic.ttf"
prop = fm.FontProperties(fname=path_gothic)
matplotlib.rcParams["axes.unicode_minus"] = False

vocab = list(embedding_model.wv.vocab)
# look the vectors up through wv; indexing the model object directly is deprecated
X = embedding_model.wv[vocab]

tsne = TSNE(n_components=2)
X_tsne = tsne.fit_transform(X)

import pandas as pd

df = pd.DataFrame(X_tsne, index=vocab, columns=["x", "y"])

df.head()

%matplotlib inline

fig = plt.figure()
fig.set_size_inches(40, 20)
ax = fig.add_subplot(1, 1, 1)
ax.scatter(df["x"], df["y"])

for word, pos in list(df.iterrows()):
    ax.annotate(word, pos, fontsize=12, fontproperties=prop)
plt.show()

classification¶

# embedding_model.wv.save_word2vec_format("word2vec.txt")
vectors = embedding_model.wv.vectors
names = embedding_model.wv.index2word

distance matrix

from scipy.spatial import distance_matrix
distance = distance_matrix(vectors, vectors)

distance_df = pd.DataFrame(distance, columns=names, index=names)

# 금융, 부동산, 보험
# distance_df.loc[vecs, :]

Term Document Matrix

import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

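# flatten the per-sentence noun lists into one list of words
word_data = []
for word in word_list:
    for argu in word:
        word_data.append(argu)

# treat every distinct word as its own one-word document
vec = CountVectorizer()
X = vec.fit_transform(set(word_data))

# rows: vocabulary terms, columns: documents
TDM_DF = pd.DataFrame(X.toarray(), columns=vec.get_feature_names()).T

# collapse all documents into a single count column
TDM_DF = TDM_DF.sum(axis=1).to_frame()
TDM_DF.rename(columns={0:"doc1"}, inplace=True)
TDM_DF.head()

# binarize: 1 if the term occurs at all
idx_bool = TDM_DF["doc1"] >= 1
TDM_DF[idx_bool] = 1

# align the rows with the word2vec vocabulary order
TDM_matrix = TDM_DF.loc[names].values
distance_df_matrix = distance_df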

labeling

classification_word = ["금융", "부동산", "보험", "은행", "카드", "증권"]
# wanted_word = ["금융", "부동산", "보험"]

target_matrix = distance_df.loc[classification_word,:].values

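Classify by taking the maximum of the vector inner products

import numpy as np

# (6, V) label-to-word distances times the (V, 1) term indicator -> one score per label word
result = np.matmul(target_matrix, TDM_matrix)
# result has the shape (6, 1), so the flat argmax is the label index
maxpool = np.argmax(result)
classification_rlt = classification_word[maxpool]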

print("classification result: {}".format(classification_rlt))

classification result: 은행
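
The scoring step above reduces to a matrix product followed by an argmax. A minimal plain-Python sketch with hypothetical numbers (the two labels, the 2x3 `target_matrix`, the indicator column, and the helper `score_labels` are illustrative assumptions, not values from the notebook):

```python
# hypothetical label words and their rows of per-word scores
labels = ["금융", "보험"]
target_matrix = [
    [0.0, 2.0, 1.0],   # scores of "금융" against three vocabulary words
    [3.0, 0.0, 2.5],   # scores of "보험" against the same words
]
# indicator column: 1 if the vocabulary word occurs in the document
tdm_column = [1, 0, 1]


def score_labels(matrix, indicator):
    """Inner product of every label row with the indicator column."""
    return [sum(m * t for m, t in zip(row, indicator)) for row in matrix]


scores = score_labels(target_matrix, tdm_column)
best = max(range(len(scores)), key=scores.__getitem__)  # argmax over labels
print(scores, labels[best])
```

When the rows hold distances rather than similarities, the largest score marks the label that lies farthest from the document's words, so the minimum may be the more natural choice; that is an observation about the arithmetic, not something the notebook states.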

from IPython.core.display import display, HTML

display(HTML("<style> .container{width:100% !important;}</style>"))

word2vec¶

import tensorflow as tf
import matplotlib.pyplot as plt
import matplotlib
import numpy as np

# arbitrary sentences whose word vectors will be analyzed
sentences = ["나 우루루다",
             "나 강아지 좋다",
             "나 동물 좋다",
             "강아지 고양이 동물",
             "나 고양이 싫다",
             "강아지 여자친구 좋다", 
             "강아지 생선 우유 싫다",
             "고양이 생선 싫다 우유 좋다",
             "강아지 고양이 눈 좋다",
             "나 여자친구 좋다",
             "여자친구 나 좋다",
             "여자친구 나 영화 책 게임 좋다",
             "나 게임 만화 애니 좋다",
             "고양이 강아지 싫다",
             "강아지 고양이 좋다"]

Join all sentences, split on whitespace, and build a list of the unique words

word_sequence = " ".join(sentences).split()
word_list = " ".join(sentences).split()
word_list = list(set(word_list))

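Numbers are much easier to analyze than strings,
so each word is referred to by its index in the list
For that, build an associative array (dict) that maps each word to its index; the word list itself serves as the array from index back to word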

# word: index
word_dict = {w: i for i, w in enumerate(word_list)}

Build skip-gram pairs with a window size of 1
e.g.) 나 게임 만화 애니 좋다
-> ([나, 만화], 게임), ([게임, 애니], 만화), ([만화, 좋다], 애니)
-> (게임, 나), (게임, 만화), (만화, 게임), (만화, 애니), (애니, 만화), (애니, 좋다)

skip_grams = []

for i in range(1, len(word_sequence) - 1):
    # (context, target) : ([target index - 1, target index + 1], target)
    # after building the skip-grams, store them as the words' unique indices
    target = word_dict[word_sequence[i]]
    context = [word_dict[word_sequence[i - 1]], word_dict[word_sequence[i + 1]]]

    # (target, context[0]), (target, context[1])..
    for w in context:
        skip_grams.append([target, w])
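
Applied to the single example sentence from the explanation above, the same loop yields the six (target, context) pairs listed there. A small self-contained sketch (the helper name `make_skip_grams` is an illustrative assumption, and it keeps the words themselves instead of their indices):

```python
def make_skip_grams(words):
    """Return (target, context) pairs for a window size of 1."""
    pairs = []
    # the first and last word are never used as targets, as in the loop above
    for i in range(1, len(words) - 1):
        target = words[i]
        for context in (words[i - 1], words[i + 1]):
            pairs.append((target, context))
    return pairs


pairs = make_skip_grams("나 게임 만화 애니 좋다".split())
for target, context in pairs:
    print(target, context)
```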


# build a batch of inputs and labels by sampling skip-gram pairs at random (without replacement)
def random_batch(data, size):
    random_inputs = []
    random_labels = []
    random_index = np.random.choice(range(len(data)), size, replace=False)

    for i in random_index:
        random_inputs.append(data[i][0])  # target
        random_labels.append([data[i][1]])  # context word

    return random_inputs, random_labels

training_epoch = 1000
learning_rate = 0.01
batch_size = 20

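# size of the embedding dimension that makes up each word vector
# this example uses only 2 values so the result is easy to plot on x, y axes
embedding_size = 2

# number of negative samples drawn by nce_loss when training the word2vec model
# kept smaller than batch_size here; the default sampler also requires num_sampled <= voc_size
num_sampled = 15

# total number of distinct words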
voc_size = len(word_list)

inputs = tf.placeholder(tf.int32, shape=[batch_size])

# tf.nn.nce_loss expects the labels in the shape [batch_size, 1]
labels = tf.placeholder(tf.int32, shape=[batch_size, 1])

# variable holding the embedding vectors, i.e. the result of the word2vec model
# a 2-D tensor of shape [total number of words, embedding size]
embeddings = tf.Variable(tf.random_uniform([voc_size, embedding_size], -1.0, 1.0))

# pick the rows of the embedding matrix that correspond to the input indices (0-based)
# e.g.) embeddings     inputs    selected
#    [[1, 2, 3]  -> [1, 2] -> [[2, 3, 4]
#     [2, 3, 4]                [3, 4, 5]]
#     [3, 4, 5]
#     [4, 5, 6]]
selected_embed = tf.nn.embedding_lookup(embeddings, inputs)
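
What `tf.nn.embedding_lookup` does here is select rows of the embedding matrix by index. A plain-Python sketch of that row selection, using 0-based indices and a made-up matrix (`lookup` is a hypothetical helper, not a TensorFlow function):

```python
def lookup(embeddings, indices):
    """Select the rows of `embeddings` named by `indices` (0-based)."""
    return [embeddings[i] for i in indices]


embeddings = [[1, 2, 3],
              [2, 3, 4],
              [3, 4, 5],
              [4, 5, 6]]

selected = lookup(embeddings, [1, 2])
print(selected)  # [[2, 3, 4], [3, 4, 5]]
```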

# define the variables used by nce_loss
nce_weights = tf.Variable(tf.random_uniform([voc_size, embedding_size], -1.0, 1.0))
nce_biases = tf.Variable(tf.zeros([voc_size]))

# implementing nce_loss by hand is quite involved,
# but TensorFlow provides it, so simply use tf.nn.nce_loss
loss = tf.reduce_mean(
            tf.nn.nce_loss(nce_weights, nce_biases, labels, selected_embed, num_sampled, voc_size))

train_op = tf.train.AdamOptimizer(learning_rate).minimize(loss)

init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

loss_val_list = []
for step in range(1, training_epoch+1):
    batch_inputs, batch_labels = random_batch(skip_grams, batch_size)
    _, loss_val = sess.run([train_op, loss],
                           feed_dict={inputs: batch_inputs,
                                      labels: batch_labels})
    
    loss_val_list.append(loss_val)

    if step % 200 == 0:
        print("loss at step, step: {}: {}".format(step, loss_val))

# fetch the trained embedding vectors so they can be plotted with matplotlib;
# embeddings is a variable, so no feed_dict is needed
trained_embeddings = sess.run(embeddings)

loss at step, step: 200: 3.1223583221435547
loss at step, step: 400: 3.23138165473938
loss at step, step: 600: 3.411635637283325
loss at step, step: 800: 2.716653347015381
loss at step, step: 1000: 2.981886625289917

plt.figure(figsize=(20, 7))
plt.title("cost")
plt.plot(loss_val_list, linewidth=0.7)
plt.show()

# font setup for showing Korean text in matplotlib
import matplotlib.font_manager as fm

path_gothic = "/home/ururu/fonts/NanumGothic.ttf"
prop = fm.FontProperties(fname=path_gothic)

# check the resulting Word2Vec embedding
# the plot shows how close each word lies to the other words
plt.figure(figsize=(20, 7))
for i, label in enumerate(word_list):
    x, y = trained_embeddings[i]
    plt.scatter(x, y)
    plt.annotate(label, xy=(x, y), xytext=(5, 2),
                 textcoords='offset points', ha='right', va='bottom', fontproperties=prop)


plt.show()

sequence to sequence¶

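seq2seq is a model that combines two RNNs: one that reads the input and one that produces the output
It is widely used in programs that take a sentence as input and output another sentence, such as translation or chatbots

A seq2seq model consists of an encoder and a decoder
- the encoder receives the source text; the decoder receives the encoder's result
- the decoder's output is then compared with the reference translation to train the model

symbol:
S: marks the start of the decoder input
E: marks the end of the decoder output
P: a meaningless padding symbol used to fill empty positions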

import tensorflow as tf
import numpy as np

char_arr = [c for c in "SEPabcdefghijklmnopqrstuvwxyz단어나무놀이소녀키스사랑봉구우루"]
num_dic = {n: i for i, n in enumerate(char_arr)}
dic_len = len(num_dic)

seq_data = [['word', "단어"], ["wood", "나무"], ["game", "놀이"], ["girl", "소녀"], 
            ["kiss", "키스"], ["love", "사랑"], ["bong", "봉구"], ["uruu", "우루"]]

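def make_batch(seq_data):
    input_batch = []
    output_batch = []
    target_batch = []

    for seq in seq_data:
        # encoder input: indices of the source characters
        input_idx = [num_dic[n] for n in seq[0]]
        # decoder input: the target characters with the start symbol in front
        output_idx = [num_dic[n] for n in ("S" + seq[1])]
        # decoder target: the target characters with the end symbol at the back
        target = [num_dic[n] for n in (seq[1] + "E")]

        # inputs are one-hot encoded; targets stay as plain indices
        input_batch.append(np.eye(dic_len)[input_idx])
        output_batch.append(np.eye(dic_len)[output_idx])
        target_batch.append(target)

    return input_batch, output_batch, target_batch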
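
To make the three batches concrete, the sketch below computes only the index sequences for one pair and leaves out the one-hot step. It rebuilds the same character table as above; `encode_pair` is a hypothetical helper name.

```python
char_arr = list("SEPabcdefghijklmnopqrstuvwxyz단어나무놀이소녀키스사랑봉구우루")
num_dic = {ch: i for i, ch in enumerate(char_arr)}


def encode_pair(source, target):
    """Index sequences for encoder input, decoder input, and decoder target."""
    enc_in = [num_dic[ch] for ch in source]
    dec_in = [num_dic[ch] for ch in "S" + target]   # start symbol first
    dec_out = [num_dic[ch] for ch in target + "E"]  # end symbol last
    return enc_in, dec_in, dec_out


enc_in, dec_in, dec_out = encode_pair("word", "단어")
print(enc_in)   # [25, 17, 20, 6]
print(dec_in)   # [0, 29, 30]
print(dec_out)  # [29, 30, 1]
```

The decoder input and the decoder target are the same characters shifted by one position, which is what lets the decoder learn to predict the next character at every step.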

learning_rate = 0.001
n_hidden = 128
total_epoch = 1000

n_class = n_input = dic_len

enc_input = tf.placeholder(tf.float32, [None, None, n_input])
dec_input = tf.placeholder(tf.float32, [None, None, n_input])
targets = tf.placeholder(tf.int64, [None, None])

# encoder: [batch size, time steps, input size]
# decoder: [batch size, time steps]

with tf.variable_scope("encode"):
    enc_cell = tf.nn.rnn_cell.BasicRNNCell(n_hidden)
    enc_cell = tf.nn.rnn_cell.DropoutWrapper(enc_cell, output_keep_prob=0.5)
    
    outputs, enc_states = tf.nn.dynamic_rnn(enc_cell, enc_input, dtype=tf.float32)
    
with tf.variable_scope("decode"):
    dec_cell = tf.nn.rnn_cell.BasicRNNCell(n_hidden)
    # wrap the decoder cell (not enc_cell) with dropout
    dec_cell = tf.nn.rnn_cell.DropoutWrapper(dec_cell, output_keep_prob=0.5)

    # the encoder's final state initializes the decoder
    outputs, dec_states = tf.nn.dynamic_rnn(dec_cell, dec_input,
                                            initial_state=enc_states, dtype=tf.float32)

WARNING:tensorflow:From <ipython-input-5-2da500f4b7bd>:5: BasicRNNCell.__init__ (from tensorflow.python.ops.rnn_cell_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This class is equivalent as tf.keras.layers.SimpleRNNCell, and will be replaced by that in Tensorflow 2.0.

model = tf.layers.dense(outputs, n_class, activation=None)
cost = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=model, labels=targets
    )
)
opt = tf.train.AdamOptimizer(learning_rate).minimize(cost)

init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

input_batch, output_batch, target_batch = make_batch(seq_data)

cost_val = []
for epoch in range(total_epoch):
    _, loss = sess.run([opt, cost], feed_dict={enc_input: input_batch,
                                               dec_input: output_batch,
                                               targets: target_batch})
    cost_val.append(loss)
    
    if (epoch+1) % 200 ==0:
        print("Epoch: {:04d}, cost: {}".format(epoch+1, loss))
    
    
print("\noptimization complete")

Epoch: 0200, cost: 0.05466647818684578
Epoch: 0400, cost: 0.013100271113216877
Epoch: 0600, cost: 0.0049271308816969395
Epoch: 0800, cost: 0.002698158845305443
Epoch: 1000, cost: 0.0030308999121189117

optimization complete

import matplotlib.pyplot as plt
plt.rcParams["axes.unicode_minus"] = False

plt.figure(figsize=(20, 10))
plt.title("cost")
plt.plot(cost_val, linewidth=1, alpha=0.8)
plt.show()

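If the input is word, seq_data is built as ["word", "PPPP"]
input_batch is the one-hot encoding of the indices of ["w", "o", "r", "d"]; output_batch is the one-hot encoding of ["S", "P", "P", "P", "P"], because make_batch prepends the start symbol
target_batch holds the plain indices [2, 2, 2, 2, 1]: four padding symbols followed by the end symbol that make_batch appends
The model output has the shape [batch size, time steps, input size], so take the argmax over the third dimension

The prediction is a sequence of character indices, so look up the character for each index to build an array
Then drop the end symbol "E" and everything after it, and join the rest into a string
The decoder emits one output per input time step, so the raw result looks something like ["사", "랑", "E", "E"]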
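
The trimming step described above can be sketched on its own, without the model. `trim_at_end_symbol` is a hypothetical helper, and the character lists are made-up decoder outputs:

```python
def trim_at_end_symbol(chars, end_symbol="E"):
    """Join decoded characters, dropping the end symbol and everything after it."""
    if end_symbol in chars:
        chars = chars[:chars.index(end_symbol)]
    return "".join(chars)


first = trim_at_end_symbol(["사", "랑", "E", "E"])
second = trim_at_end_symbol(["이", "E", "우", "루"])
third = trim_at_end_symbol(["단", "어"])  # no end symbol: keep everything
print(first, second, third)
```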

def translate(word):
    # the target is unknown at prediction time, so fill it with the padding symbol
    seq_data = [word, "P" * len(word)]

    input_batch, output_batch, target_batch = make_batch([seq_data])
    # model output is [batch size, time steps, n_class]; take the argmax over classes
    prediction = tf.argmax(model, 2)

    result = sess.run(prediction, feed_dict={enc_input: input_batch,
                                             dec_input: output_batch,
                                             targets: target_batch})
    decoded = [char_arr[i] for i in result[0]]

    # cut the output at the first end symbol; keep everything if none was produced
    if "E" in decoded:
        decoded = decoded[:decoded.index("E")]
    return "".join(decoded)

print("\n ==== translate test ====")

print("word -> {}".format(translate("word")))
print("wodr -> {}".format(translate("wodr")))
print("love -> {}".format(translate("love")))
print("loev -> {}".format(translate("loev")))
print("bogn -> {}".format(translate("bogn")))
print("uruu -> {}".format(translate("uruu")))
print("abcd -> {}".format(translate("abcd")))

 ==== translate test ====
word -> 단어
wodr -> 나무
love -> 사랑
loev -> 사랑
bogn -> 봉구
uruu -> 우루
abcd -> 이

word auto complete¶

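Train on four-letter English words so that, given only the first three letters, the model predicts the remaining letter and completes the word
With the sequence_length option of dynamic_rnn, words of varying length can be trained as well
Pad each shorter word with zeros up to the length of the longest word, compute the true length of every word, and pass those lengths to sequence_length as an array of size batch_size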
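
A sketch of the padding described above, assuming index sequences rather than one-hot rows and 0 as the fill value; `pad_batch` and the three words are hypothetical. The returned lengths are what would be passed as `sequence_length`.

```python
def pad_batch(sequences, pad_value=0):
    """Pad every sequence to the longest length and report the true lengths."""
    max_len = max(len(seq) for seq in sequences)
    lengths = [len(seq) for seq in sequences]
    padded = [seq + [pad_value] * (max_len - len(seq)) for seq in sequences]
    return padded, lengths


# character indices (a=0, b=1, ...) of three words of different length
words = ["go", "deep", "sky"]
sequences = [[ord(ch) - ord("a") for ch in word] for word in words]

padded, lengths = pad_batch(sequences)
print(padded)   # [[6, 14, 0, 0], [3, 4, 4, 15], [18, 10, 24, 0]]
print(lengths)  # [2, 4, 3]
```

The fill value 0 coincides with the index of "a" in this toy encoding; the lengths are what tell the RNN which positions are real, so the collision is harmless as long as they are supplied.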

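The training data are arbitrary words made of English letters, and each letter counts as one step
One letter is the input of one step, and the number of letters is the total number of steps
Each input is the one-hot encoding of the letter's index in the alphabet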

import tensorflow as tf
import numpy as np

char_arr = ["a", "b", "c", "d", "e", "f", "g",
            "h", "i", "j", "k", "l", "m", "n",
            "o", "p", "q", "r", "s", "t", "u",
            "v", "w", "x", "y", "z"]

num_dic = {n: i for i, n in enumerate(char_arr)}
dic_len = len(num_dic)

seq_data = ["word", "wood", "deep", "dive", "cold", "cool", "load", "love", "kiss", "kind"]

utility function¶

"deep" takes "d", "e", "e" as input; looking up each letter's index gives the array [3, 4, 4]
This array is then one-hot encoded

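def make_batch(seq_data):
    input_batch = []
    target_batch = []

    for seq in seq_data:
        # indices of every letter except the last one (the model input)
        input_idx = [num_dic[n] for n in seq[:-1]]
        # index of the last letter (the label), kept as a plain integer
        target = num_dic[seq[-1]]
        # one-hot encode the input letters
        input_batch.append(np.eye(dic_len)[input_idx])
        target_batch.append(target)

    return input_batch, target_batch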
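
The "deep" example can be traced by hand with a few lines of plain Python. This sketch rebuilds the alphabet table and uses a hypothetical `one_hot` helper in place of `np.eye`:

```python
import string

char_arr = list(string.ascii_lowercase)
num_dic = {ch: i for i, ch in enumerate(char_arr)}


def one_hot(index, size):
    """Vector of zeros with a single 1 at `index`."""
    return [1 if i == index else 0 for i in range(size)]


word = "deep"
indices = [num_dic[ch] for ch in word[:-1]]   # input letters: d, e, e
target = num_dic[word[-1]]                    # label letter: p
encoded = [one_hot(i, len(char_arr)) for i in indices]

print(indices, target)   # [3, 4, 4] 15
print(encoded[0][:5])    # [0, 0, 0, 1, 0]
```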

hyper parameter setting¶

Only the first three letters of each word are fed in step by step, so n_step=3
The inputs are one-hot encoded and the outputs have one value per class, so both sizes equal dic_len
With sparse_softmax_cross_entropy_with_logits, the model output still has one value per class (n_class logits), as it would for one-hot labels
Only the ground-truth labels differ: they are passed as plain integer indices instead of one-hot vectors
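
The relationship between the two label formats can be checked numerically. The sketch below is plain Python, not the TensorFlow implementation; the function names and the logits are illustrative assumptions. It shows that an index label and the matching one-hot label give the same cross-entropy.

```python
import math


def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_one_hot(logits, one_hot_label):
    """Cross-entropy against a one-hot label vector."""
    probs = softmax(logits)
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probs))


def cross_entropy_sparse(logits, label_index):
    """Cross-entropy against a plain class index."""
    return -math.log(softmax(logits)[label_index])


logits = [2.0, 0.5, -1.0]          # one score per class
dense = cross_entropy_one_hot(logits, [0, 1, 0])
sparse = cross_entropy_sparse(logits, 1)
print(dense, sparse)
```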

learning_rate = 0.001
n_hidden = 128
total_epoch = 10000

n_step = 3
n_input = n_class = dic_len

variable setting¶

The placeholder for Y has a single dimension, corresponding to batch_size

X = tf.placeholder(tf.float32, [None, n_step, n_input], name="input_X")
Y = tf.placeholder(tf.int32, [None])

W = tf.Variable(tf.random_normal([n_hidden, n_class]))
b = tf.Variable(tf.random_normal([n_class]))

model setting¶

Create two RNN cells
Apply DropoutWrapper so that the RNN is guarded against overfitting as well

cell1 = tf.nn.rnn_cell.BasicLSTMCell(n_hidden)
cell1 = tf.nn.rnn_cell.DropoutWrapper(cell1, output_keep_prob=0.5)
cell2 = tf.nn.rnn_cell.BasicLSTMCell(n_hidden)

# combine the cells with MultiRNNCell
multi_cell = tf.nn.rnn_cell.MultiRNNCell([cell1, cell2])
outputs, states = tf.nn.dynamic_rnn(multi_cell, X, dtype=tf.float32)

WARNING:tensorflow:From <ipython-input-7-06df5d391133>:1: BasicLSTMCell.__init__ (from tensorflow.python.ops.rnn_cell_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This class is deprecated, please use tf.nn.rnn_cell.LSTMCell, which supports all the feature this cell currently has. Please replace the existing code with tf.nn.rnn_cell.LSTMCell(name='basic_lstm_cell').

outputs = tf.transpose(outputs, [1, 0, 2])
outputs = outputs[-1]
model = tf.matmul(outputs, W) + b

modeling¶

cost = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(logits=model, labels=Y)   
)
opt = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

input_batch, output_batch = make_batch(seq_data)

cost_epoch = []
for epoch in range(total_epoch):
    _, loss = sess.run([opt, cost], feed_dict={X: input_batch, Y: output_batch})
    cost_epoch.append(loss)
    
    if (epoch+1) % 2000 ==0:
        print("Epoch: {}, cost= {}".format(epoch+1, loss))
        
print("\noptimization complete")

Epoch: 2000, cost= 1.4388267118192744e-05
Epoch: 4000, cost= 8.106222821879783e-07
Epoch: 6000, cost= 3.695482178045495e-07
Epoch: 8000, cost= 1.323218612014898e-06
Epoch: 10000, cost= 1.4305105366929638e-07

optimization complete

import matplotlib.pyplot as plt

plt.rcParams["axes.unicode_minus"] = False
plt.figure(figsize=(20,6))
plt.title("cost")
plt.plot(cost_epoch, linewidth=1)
plt.show()

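The ground-truth labels are plain indices rather than one-hot vectors, so the argmax of the model output is compared with Y directly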

prediction = tf.cast(tf.argmax(model, 1), tf.int32)
prediction_check = tf.equal(prediction, Y)
accuracy = tf.reduce_mean(tf.cast(prediction_check, tf.float32))

prediction¶

prediction model

input_batch, target_batch = make_batch(seq_data)

predict, accuracy_val = sess.run([prediction, accuracy], 
                                 feed_dict={X: input_batch, Y: target_batch})

predict

predict_word = []
for idx, val in enumerate(seq_data):
    last_char = char_arr[predict[idx]]
    predict_word.append(val[:3] + last_char)
    
print("\n==== prediction ====")
print("input_value: \t\t{}".format([w[:3] for w in seq_data]))
print("prediction_value: \t{}".format(predict_word))
print("accuracy: {:.3f}".format(accuracy_val))

==== prediction ====
input_value: 		['wor', 'woo', 'dee', 'div', 'col', 'coo', 'loa', 'lov', 'kis', 'kin']
prediction_value: 	['word', 'wood', 'deep', 'dive', 'cold', 'cool', 'load', 'love', 'kiss', 'kind']
accuracy: 1.000

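In an RNN, the single block of network in the middle of the figure is called a cell
Stacking several cells gives a deep network
The result learned at one step is used when learning the next step
The training data is therefore fed in step by step

People tend to write from top to bottom,
so one horizontal row of 28 pixels is taken as the input of one step,
and since there are 28 rows in total, the data is read in 28 steps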

library load¶

import tensorflow as tf
import numpy as np

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("./mnist/data/", one_hot=True)

Extracting ./mnist/data/train-images-idx3-ubyte.gz
Extracting ./mnist/data/train-labels-idx1-ubyte.gz
Extracting ./mnist/data/t10k-images-idx3-ubyte.gz
Extracting ./mnist/data/t10k-labels-idx1-ubyte.gz

hyper parameter¶

Add one more dimension, n_step, to the input X
An RNN handles ordered data, so it needs to know how many values arrive at once and how many steps the data spans
The number of pixels per row is n_input and the number of rows is the number of input steps, n_step
As in the earlier examples, the output is the MNIST class, the ten digits 0-9, in one-hot encoding
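
The step layout described above amounts to cutting a flat image of 784 pixels into 28 rows of 28. A plain-Python sketch, with a hypothetical helper `to_steps` and a stand-in image made of consecutive integers:

```python
def to_steps(flat_pixels, n_step=28, n_input=28):
    """Split a flat image into n_step rows of n_input pixels each."""
    if len(flat_pixels) != n_step * n_input:
        raise ValueError("expected {} pixels".format(n_step * n_input))
    return [flat_pixels[r * n_input:(r + 1) * n_input] for r in range(n_step)]


flat = list(range(28 * 28))        # stand-in for one flattened MNIST image
steps = to_steps(flat)
print(len(steps), len(steps[0]))   # 28 28
print(steps[1][0])                 # 28, the first pixel of the second row
```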

learning_rate = 0.001
total_epoch = 30
batch_size = 128

n_input = 28
n_step = 28
n_hidden = 128
n_class = 10

X = tf.placeholder(tf.float32, [None, n_step, n_input], name="input_X")
Y = tf.placeholder(tf.float32, [None, n_class], name="output_Y")
# pass name to tf.Variable so that the variable itself, not the random op, is named
W = tf.Variable(tf.random_normal([n_hidden, n_class]), name="weight_W")
b = tf.Variable(tf.random_normal([n_class]), name="bias_b")

Create an RNN cell with n_hidden outputs¶

cell = tf.nn.rnn_cell.BasicRNNCell(n_hidden)

WARNING:tensorflow:From <ipython-input-5-e006f918b220>:1: BasicRNNCell.__init__ (from tensorflow.python.ops.rnn_cell_impl) is deprecated and will be removed in a future version.
Instructions for updating:
This class is equivalent as tf.keras.layers.SimpleRNNCell, and will be replaced by that in Tensorflow 2.0.

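Other cell types such as BasicLSTMCell and GRUCell can be used as well
A basic RNN has trouble remembering information from the first steps by the time it reaches the last steps of a long sequence
LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Unit) were introduced to address this
GRU is similar to LSTM but has a somewhat simpler architecture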

complete RNN¶

outputs, states = tf.nn.dynamic_rnn(cell, X, dtype=tf.float32)

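The result is produced in one-hot form, so tf.nn.softmax_cross_entropy_with_logits_v2 is used as the loss function
That function needs the final result to have the shape [batch_size, n_class]
The output of the RNN, however, has the shape [batch_size, n_step, n_hidden], so only the last step is kept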

# outputs : [batch_size, n_step, n_hidden]
outputs = tf.transpose(outputs, [1, 0, 2])  # -> [n_step, batch_size, n_hidden]
outputs = outputs[-1]
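
The transpose-then-index step above keeps, for every sequence in the batch, the output of its final time step. A nested-list sketch of the same operation (the helper `last_step` and the numbers are made up for illustration):

```python
def last_step(outputs):
    """outputs: [batch_size][n_step][n_hidden] -> [batch_size][n_hidden]."""
    # transpose to [n_step][batch_size][n_hidden], then keep the final step
    by_step = [list(step) for step in zip(*outputs)]
    return by_step[-1]


# batch of 2 sequences, 3 steps each, hidden size 2
outputs = [
    [[1, 1], [2, 2], [3, 3]],
    [[4, 4], [5, 5], [6, 6]],
]
final = last_step(outputs)
print(final)  # [[3, 3], [6, 6]]
```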

modeling¶

$y = (X \times W) +b$

model = tf.matmul(outputs, W) + b
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(logits=model, labels=Y))
opt = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

variable initializer¶

init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

x_train = mnist.train.images
y_train = mnist.train.labels

class Dataset:
    def __init__(self, x, y):
        self.index_in_epoch = 0
        self.epoch_completed = 0
        self.x_train = x
        self.y_train = y
        self.num_examples = x.shape[0]
        
    def data(self):
        return self.x_train, self.y_train
    
    def next_batch(self, batch_size):
        start = self.index_in_epoch
        self.batch_size = batch_size
        self.index_in_epoch += self.batch_size
        
        if start==0 and self.epoch_completed==0:
            idx = np.arange(self.num_examples)
            np.random.shuffle(idx)
            self.x_train = self.x_train[idx]
            self.y_train = self.y_train[idx]
            
        if start + batch_size > self.num_examples:            
            self.epoch_completed += 1
            
            perm = np.arange(self.num_examples)
            np.random.shuffle(perm)
            self.x_train = self.x_train[perm]
            self.y_train = self.y_train[perm]

            start = 0
            self.index_in_epoch = self.batch_size

        end = self.index_in_epoch
        return self.x_train[start:end], self.y_train[start:end]

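total_batch = int(x_train.shape[0]/batch_size)
epoch_cost_val_list = []
cost_val_list = []
# Build the Dataset once so that next_batch keeps its position across iterations;
# creating it inside the loop would reset the index on every step.
dataset = Dataset(x=x_train, y=y_train)
for epoch in range(total_epoch):
    epoch_cost = 0
    for i in range(total_batch):
        batch_xs, batch_ys = dataset.next_batch(batch_size=batch_size)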
        batch_xs = batch_xs.reshape([batch_size, n_step, n_input])
        
        _, cost_val = sess.run([opt, cost], feed_dict={
            X: batch_xs, Y: batch_ys
        })
        
        epoch_cost += cost_val
        cost_val_list.append(cost_val)        
        
    epoch_cost_val_list.append(epoch_cost)   
    
    if (epoch+1) %5 == 0:
        print("Epoch: %04d" % (epoch+1),
              "Avg.cost = {}".format(epoch_cost/total_batch))
    
print("\noptimization complete")

is_correct = tf.equal(tf.argmax(model, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))

test_batch_size = len(mnist.test.images)
test_xs = mnist.test.images.reshape(test_batch_size, n_step, n_input)
test_ys = mnist.test.labels

print("\naccuracy {:.3f}%".format(
    sess.run(accuracy*100, feed_dict={X: test_xs, Y: test_ys})
))

Epoch: 0005 Avg.cost = 0.13626715096716696
Epoch: 0010 Avg.cost = 0.0966348738251102
Epoch: 0015 Avg.cost = 0.08380235770244003
Epoch: 0020 Avg.cost = 0.07091071723204861
Epoch: 0025 Avg.cost = 0.07897876071068513
Epoch: 0030 Avg.cost = 0.05630091918466313

optimization complete

accuracy 97.660%

import matplotlib.pyplot as plt

plt.rcParams["axes.unicode_minus"] = False

_, ax = plt.subplots(1, 2, figsize=(20, 5))
ax[0].set_title("cost_epoch")
ax[0].plot(epoch_cost_val_list, linewidth=0.3)
ax[1].set_title("cost_value")
ax[1].plot(cost_val_list, linewidth=0.3)
plt.show()

from IPython.core.display import HTML, display

display(HTML("<style> .container{width:100% !important;}</style>"))

GAN¶

GAN^{Generative Adversarial Network}

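Two adversarial networks are made to compete, and through that competition the model learns how to generate its outputs.

The analogy below is the one given in the paper by Ian Goodfellow, who proposed GANs.

Split the players into a counterfeiter (the generator) and the police (the discriminator).
The counterfeiter tries to fool the police as well as it can, and the police try to detect counterfeit bills as well as they can.

Through this competition between making and detecting counterfeit bills, both sides improve, and in the end the counterfeiter produces bills that are almost impossible to tell apart from real ones.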

import tensorflow as tf
import numpy as np

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("./mnist/data/", one_hot=True)

Extracting ./mnist/data/train-images-idx3-ubyte.gz
Extracting ./mnist/data/train-labels-idx1-ubyte.gz
Extracting ./mnist/data/t10k-images-idx3-ubyte.gz
Extracting ./mnist/data/t10k-labels-idx1-ubyte.gz

setting hyper-parameter¶

total_epoch   = 100
batch_size    = 100
learning_rate = 0.0001
n_hidden      = 256
n_input       = 28*28
n_noise       = 128

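A GAN is also trained without supervision, so no label placeholder Y is used.
The discriminator receives two kinds of images, real ones and generated fakes,
and the fakes are generated from noise, so a placeholder Z is added for the noise input.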

X = tf.placeholder(tf.float32, [None, n_input])
Z = tf.placeholder(tf.float32, [None, n_noise])

setting Generator¶

The first weight and bias produce the hidden layer.
The second weight and bias are used for the output layer.
The output dimension of the second weight must equal the size of a real image (n_input).

with tf.name_scope("Generator_W1"):
    G_W1 = tf.Variable(tf.random_normal([n_noise, n_hidden], stddev=0.01))
    
with tf.name_scope("Generator_b1"):
    G_b1 = tf.Variable(tf.zeros([n_hidden]))
    
with tf.name_scope("Generator_W2"):
    G_W2 = tf.Variable(tf.random_normal([n_hidden, n_input], stddev=0.01))
    
with tf.name_scope("Generator_b2"):
    G_b2 = tf.Variable(tf.zeros([n_input]))

setting discriminator¶

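The hidden layer is built the same way as in the generator.
The discriminator outputs a value between 0 and 1 that expresses how close the input is to a real image.
The output is a single scalar.
- The discriminator applied to real images and the discriminator applied to generated images must share the same variables.
- Only when the same network judges both can it pick up the features that separate real images from fake ones.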

with tf.name_scope("discriminator_W1"):
    D_W1 = tf.Variable(tf.random_normal([n_input, n_hidden], stddev=0.01))
    
with tf.name_scope("discriminator_b1"):
    D_b1 = tf.Variable(tf.zeros([n_hidden]))
    
with tf.name_scope("discriminator_W2"):
    D_W2 = tf.Variable(tf.random_normal([n_hidden, 1], stddev=0.01))
    
with tf.name_scope("discriminator_b2"):
    D_b2 = tf.Variable(tf.zeros([1]))

setting neural network¶

def generator(noise_z):
    hidden = tf.nn.relu(tf.add(tf.matmul(noise_z, G_W1), G_b1))
    output = tf.nn.sigmoid(tf.add(tf.matmul(hidden, G_W2), G_b2))
    
    return output

def discriminator(inputs):
    hidden = tf.nn.relu(tf.add(tf.matmul(inputs, D_W1), D_b1))
    output = tf.nn.sigmoid(tf.add(tf.matmul(hidden, D_W2), D_b2))
    
    return output

def get_noise(batch_size, n_noise):
    output = np.random.normal(size=(batch_size, n_noise))
    return output

Build a generator G that turns the noise Z into fake images, then feed both the fake images from G and the real images X to the discriminator, which judges whether each input is real.

G = generator(Z)
D_gene = discriminator(G)
D_real = discriminator(X)

cost¶

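Two costs are needed.
One trains the discriminator (the police) to judge generated images as fake.
The other trains the generator (the counterfeiter) so that its images are judged as real.
To train the police, the score for real images, D_real, should be close to 1 (judged real)
and the score for fake images, D_gene, should be close to 0 (judged fake).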

with tf.name_scope("cost"):
    loss_D = tf.reduce_mean(tf.log(D_real) + tf.log(1-D_gene)) # police (discriminator): to be maximized
    loss_G = tf.reduce_mean(tf.log(D_gene)) # counterfeiter (generator): to be maximized
    
    tf.summary.scalar("loss_D", loss_D)
    tf.summary.scalar("loss_G", loss_G)
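To get a feel for how the two values move, the same formulas can be evaluated in NumPy on made-up discriminator scores. This is only a sketch: gan_losses is a hypothetical helper, and the score arrays are invented for illustration rather than taken from a trained model.

```python
import numpy as np

def gan_losses(d_real, d_gene):
    """Same formulas as loss_D and loss_G, written with NumPy."""
    loss_d = np.mean(np.log(d_real) + np.log(1 - d_gene))
    loss_g = np.mean(np.log(d_gene))
    return loss_d, loss_g

# Hypothetical discriminator scores, chosen only for illustration.
strong_d = gan_losses(np.array([0.9, 0.95]), np.array([0.1, 0.05]))
fooled_d = gan_losses(np.array([0.6, 0.5]), np.array([0.5, 0.6]))

print("discriminator winning:", strong_d)
print("generator winning:    ", fooled_d)
```

Both values are logs of probabilities, so they are never positive. loss_D moves toward 0 as the discriminator gets better, and loss_G moves toward 0 as the generator fools it more often.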

training¶

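When optimizing loss_D, only the discriminator's variables are updated.
When optimizing loss_G, only the generator's variables are updated.

This is because the generator must stay fixed while loss_D is trained, and the discriminator must stay fixed while loss_G is trained.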

D_var_list = [D_W1, D_b1, D_W2, D_b2]
G_var_list = [G_W1, G_b1, G_W2, G_b2]

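The paper maximizes these losses; the optimizer only provides minimize, so the losses are negated to turn the maximization into a minimization.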

train_D = tf.train.AdamOptimizer(learning_rate).minimize(-loss_D, var_list=D_var_list)
train_G = tf.train.AdamOptimizer(learning_rate).minimize(-loss_G, var_list=G_var_list)
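The negation trick can be seen on a one-variable toy problem. The sketch below is plain gradient descent on -f(x), which is the same as gradient ascent on f(x). The function f, the step size, and the iteration count are all assumptions chosen for illustration, and plain gradient descent is used here rather than Adam.

```python
def f(x):
    # Toy objective with its maximum at x = 3
    return -(x - 3.0) ** 2

def grad_neg_f(x):
    # Derivative of -f(x) = (x - 3)^2
    return 2.0 * (x - 3.0)

x = 0.0
step = 0.1
for _ in range(200):
    x -= step * grad_neg_f(x)  # descending on -f climbs f

print(x, f(x))
```

Minimizing the negated objective drives x to the point where f is largest, which is what minimize(-loss_D) and minimize(-loss_G) rely on.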

init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

merge = tf.summary.merge_all()
writer = tf.summary.FileWriter("./logs/mnist_gan", sess.graph)

total_batch = int(mnist.train.num_examples / batch_size)
loss_val_D, loss_val_G = 0, 0
tot_loss_val_D, tot_loss_val_G = 0, 0

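Set up the session and the mini-batch bookkeeping, and declare variables to hold the values of loss_D and loss_G.
The discriminator takes X and the generator takes the noise Z, so get_noise creates a batch of noise that is fed in as Z.
Then the discriminator and the generator are each trained.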

import matplotlib.pyplot as plt
plt.rcParams["axes.unicode_minus"] = False

loss_val_D_list = []
loss_val_G_list = []

for epoch in range(total_epoch):
    for i in range(total_batch):
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        noise = get_noise(batch_size, n_noise)
        
        _, loss_val_D = sess.run([train_D, loss_D], feed_dict={X: batch_xs, Z: noise})
        _, loss_val_G = sess.run([train_G, loss_G], feed_dict={Z: noise})
        
        tot_loss_val_D += -loss_val_D
        tot_loss_val_G += -loss_val_G
        loss_val_D_list.append(tot_loss_val_D)
        loss_val_G_list.append(tot_loss_val_G)
    
    if epoch == 0 or epoch % 20 == 19:
        print("Epoch: {}\tlossD: {:0.4f}\tlossG: {:0.4f}".format(epoch+1, 
                                                                     loss_val_D, 
                                                                     loss_val_G))
    
    if epoch ==0 or (epoch+1) % 20 == 0:
        sample_size = 10
        noise = get_noise(sample_size, n_noise)
        samples = sess.run(G, feed_dict={Z:noise})
        
        fig, ax = plt.subplots(1, sample_size, figsize=(sample_size, 1))
        
        for i in range(sample_size):
            ax[i].set_axis_off()
            ax[i].imshow(np.reshape(samples[i], (28, 28))) 
        plt.show()
        
# plt.show()        
print("\n optimization complete!")

Epoch: 1	lossD: -0.8597	lossG: -1.5412

Epoch: 20	lossD: -0.3039	lossG: -2.4034

Epoch: 40	lossD: -0.5556	lossG: -2.1424

Epoch: 60	lossD: -0.6952	lossG: -2.2227

Epoch: 80	lossD: -0.4060	lossG: -2.4146

Epoch: 100	lossD: -0.4189	lossG: -2.3728

 optimization complete!

_, axe = plt.subplots(1, 2, figsize=(15, 5))
axe[0].set_title("loss_D")
axe[1].set_title("loss_G")
axe[0].plot(loss_val_D_list)
axe[1].plot(loss_val_G_list)

import jptensor as jp
tf_graph = tf.get_default_graph().as_graph_def()
jp.show_graph(tf_graph)

from IPython.core.display import HTML, display
display(HTML("<style> .container{width:100% !important;}</style>"))

auto-encoder¶

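An auto-encoder is a neural network trained so that its output equals its input.
The hidden layer in the middle is smaller than the input layer, which has the effect of compressing the data.
Because of this compression, it is effective at removing noise.
Training finds the weights that make the output match the input.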
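Before building the graph, the shapes involved can be checked with a small NumPy forward pass. This is a sketch with untrained weights: the random input stands in for MNIST images, and the 0.01 weight scale is an assumption made only for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_input, n_hidden = 28 * 28, 256   # 784 inputs squeezed into 256 hidden units
x = rng.random((5, n_input))       # 5 fake "images" with values in [0, 1)

# Untrained weights with a small scale (an assumption for this sketch).
W_enc = rng.normal(scale=0.01, size=(n_input, n_hidden))
W_dec = rng.normal(scale=0.01, size=(n_hidden, n_input))

code = sigmoid(x @ W_enc)          # compressed representation
recon = sigmoid(code @ W_dec)      # reconstruction, same shape as x
mse = np.mean((x - recon) ** 2)    # the quantity training drives down

print(code.shape, recon.shape, mse)
```

The hidden code is narrower than the input, and the reconstruction error is the only training signal, so no labels are needed.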

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

mnist dataload¶

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("./mnist/data/", one_hot=True)

Extracting ./mnist/data/train-images-idx3-ubyte.gz
Extracting ./mnist/data/train-labels-idx1-ubyte.gz
Extracting ./mnist/data/t10k-images-idx3-ubyte.gz
Extracting ./mnist/data/t10k-labels-idx1-ubyte.gz

hyper parameter¶

learning_rate = 0.001 # learning rate
training_epoch = 20 # number of passes over the training data
batch_size = 256 # number of images per training step
n_hidden = 256 # number of units in the hidden layer
n_input = 28*28 # size of the input layer

setting place holder¶

X = tf.placeholder(tf.float32, [None, n_input], name="X")
# Different ways of building the encoder and decoder give different kinds of auto-encoders

setting encoder¶

W_encode = tf.Variable(tf.random_normal([n_input, n_hidden]))
b_encode = tf.Variable(tf.random_normal([n_hidden]))

with tf.name_scope("encoder"):
    encoder = tf.add(tf.matmul(X, W_encode), b_encode)
    encoder = tf.nn.sigmoid(encoder)

setting decoder¶

W_decode = tf.Variable(tf.random_normal([n_hidden, n_input]))
b_decode = tf.Variable(tf.random_normal([n_input]))

with tf.name_scope("decoder"):
    decoder = tf.add(tf.matmul(encoder, W_decode), b_decode)
    decoder = tf.nn.sigmoid(decoder)

cost¶

# tf.pow()
# x = tf.constant([[2, 2], [3, 3]])
# y = tf.constant([[8, 16], [2, 3]])
# tf.pow(x, y)  # [[256, 65536], [9, 27]]

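# The input X itself serves as the ground truth; the cost is the difference between X and the decoder output
# That difference is measured with a distance function (mean squared error here)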
with tf.name_scope("cost"):
    cost = tf.reduce_mean(tf.pow(X-decoder, 2))
    opt = tf.train.RMSPropOptimizer(learning_rate).minimize(cost)
    
    tf.summary.scalar("cost", cost)

init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
cost_epoch = []

merged = tf.summary.merge_all()
writer = tf.summary.FileWriter("./logs/mnist_autoencoder", sess.graph)

total_batch = int(mnist.train.num_examples / batch_size)

for epoch in range(training_epoch):
    total_cost = 0
    
    for i in range(total_batch):
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        _, cost_val = sess.run([opt, cost], feed_dict={X: batch_xs})
        
        total_cost += cost_val
        cost_epoch.append(total_cost)
        
        summary = sess.run(merged, feed_dict={X: batch_xs})
        writer.add_summary(summary)
    print("Epoch: %4d,  Avg.cost %.4f " % (epoch+1, total_cost/total_batch))
print("opt complete")

Epoch:    1,  Avg.cost 0.4240 
Epoch:    2,  Avg.cost 0.1718 
Epoch:    3,  Avg.cost 0.1135 
Epoch:    4,  Avg.cost 0.1032 
Epoch:    5,  Avg.cost 0.0961 
Epoch:    6,  Avg.cost 0.0911 
Epoch:    7,  Avg.cost 0.0870 
Epoch:    8,  Avg.cost 0.0833 
Epoch:    9,  Avg.cost 0.0805 
Epoch:   10,  Avg.cost 0.0773 
Epoch:   11,  Avg.cost 0.0749 
Epoch:   12,  Avg.cost 0.0725 
Epoch:   13,  Avg.cost 0.0709 
Epoch:   14,  Avg.cost 0.0695 
Epoch:   15,  Avg.cost 0.0682 
Epoch:   16,  Avg.cost 0.0670 
Epoch:   17,  Avg.cost 0.0657 
Epoch:   18,  Avg.cost 0.0637 
Epoch:   19,  Avg.cost 0.0627 
Epoch:   20,  Avg.cost 0.0619 
opt complete

%matplotlib inline

plt.figure(figsize=(20, 8))
plt.plot(cost_epoch, "g")
plt.title("cost_value")
plt.show()

import jptensor as jp

tf_graph = tf.get_default_graph().as_graph_def()
jp.show_graph(tf_graph)

sample_size = 10

samples = sess.run(decoder, feed_dict={X: mnist.test.images[:sample_size]})

%matplotlib inline

fig, ax = plt.subplots(2, sample_size, figsize=(sample_size, 2))

for i in range(sample_size):
    ax[0][i].set_axis_off()
    ax[1][i].set_axis_off()
    ax[0][i].imshow(np.reshape(mnist.test.images[i], (28, 28)))
    ax[1][i].imshow(np.reshape(samples[i], (28, 28)))
plt.show()

from IPython.core.display import HTML, display
display(HTML("<style> .container{width:100% !important;}</style>"))

CNN¶

CNN^{Convolutional Neural Network}

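A CNN is built from convolution layers and pooling layers.

A region of a chosen size is called a window; the hidden layer is built by sliding this window to the right and downward.
The number of cells the window moves at each step is called the stride.

When one window of the input layer is compressed into a single neuron of the hidden layer, the convolution layer needs one weight per cell of the window (9 weights for a 3x3 window) plus 1 bias.
These weights and the bias together are called a kernel or filter, and the same kernel is applied to every window used to build that hidden layer.
With a fully connected layer, each hidden neuron would need 784 weights for a 28x28 input, whereas a convolution only has to learn the 9 weights of the 3x3 kernel, so far fewer parameters are trained and training is faster.
Settings that are tuned by hand rather than learned, such as the window size and stride, are called hyperparameters.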
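The sliding-window idea can be written out directly in NumPy. The sketch below applies a single 3x3 kernel without padding; conv2d_single is a hypothetical helper, not a TensorFlow function, and the averaging kernel and the test image are assumptions chosen for illustration.

```python
import numpy as np

def conv2d_single(image, kernel, bias, stride=1):
    """Slide one square kernel over a 2-D image (no padding)."""
    k = kernel.shape[0]
    out_h = (image.shape[0] - k) // stride + 1
    out_w = (image.shape[1] - k) // stride + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            # The same 9 weights and 1 bias are reused for every window.
            window = image[r * stride:r * stride + k, c * stride:c * stride + k]
            out[r, c] = np.sum(window * kernel) + bias
    return out

image = np.arange(28 * 28, dtype=float).reshape(28, 28)
kernel = np.ones((3, 3)) / 9.0   # 9 shared weights (a simple averaging kernel)
bias = 0.0

out_s1 = conv2d_single(image, kernel, bias, stride=1)
out_s2 = conv2d_single(image, kernel, bias, stride=2)
print(out_s1.shape, out_s2.shape, kernel.size + 1)
```

However many windows the image contains, the layer only learns kernel.size + 1 values per kernel, which is where the saving over a fully connected layer comes from.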

import tensorflow as tf
import pandas as pd
import numpy as np

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("./mnist/data/", one_hot=True)

Extracting ./mnist/data/train-images-idx3-ubyte.gz
Extracting ./mnist/data/train-labels-idx1-ubyte.gz
Extracting ./mnist/data/t10k-images-idx3-ubyte.gz
Extracting ./mnist/data/t10k-labels-idx1-ubyte.gz

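Because a CNN works on 2-D planes, the input can be laid out in a more intuitive shape.

The first dimension of X, None, is the number of input examples.
The last dimension is 1: MNIST images are grayscale and have a single color channel, so depth=1.
Y for the 10 output classes and keep_prob for dropout are also defined.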

global_step = tf.Variable(0, trainable=False, name="global_step")
X = tf.placeholder(tf.float32, shape=[None, 28, 28, 1], name="X")
Y = tf.placeholder(tf.float32, shape=[None, 10], name="Y")
keep_prob = tf.placeholder(tf.float32, name="KEEP_PROB")

Build the first CNN layer.

It is a convolution layer with 3x3 kernels.
It uses a weight variable for the kernels and the tf.nn.conv2d() function provided by tensorflow.

# 32 kernels of size 3x3x1 (the 1 is the single input channel)
with tf.name_scope("layer1"):
    W1 = tf.Variable(tf.random_normal([3, 3, 1, 32], stddev=0.01))

    # convolve the input X with the first-layer weights W1, moving 1 cell at a time
    # padding="SAME" -> zero-pads the border so that, with stride 1, the output keeps the input's height and width
    # strides=[1, rows (down), cols (right), 1]; the first and last entries must be 1
    # X is laid out as [batch_size, image_rows, image_cols, number_of_colors]
    L1 = tf.nn.conv2d(X, W1, strides=[1, 1, 1 ,1], padding="SAME")
    L1 = tf.nn.relu(L1)
    # ksize and strides follow the same [batch, rows, cols, channels] layout: 2x2 window, stride 2
    L1 = tf.nn.max_pool(L1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")

with tf.name_scope("layer2"):
    # 3x3 kernels that take the 32 channels from layer1 and produce 64 channels
    W2 = tf.Variable(tf.random_normal([3, 3, 32, 64], stddev=0.01))
    L2 = tf.nn.conv2d(L1, W2, strides=[1, 1, 1, 1], padding="SAME")
    L2 = tf.nn.relu(L2)
    L2 = tf.nn.max_pool(L2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding="SAME")

with tf.name_scope("layer3"):
    # flatten the 7x7x64 feature map into one dimension and connect it to an intermediate layer of 256 neurons
    # fully connected layer
    W3 = tf.Variable(tf.random_normal([7 * 7 * 64, 256], stddev=0.01))
    L3 = tf.reshape(L2, [-1, 7 * 7 *64])
    L3 = tf.matmul(L3, W3)
    L3 = tf.nn.relu(L3)
    L3 = tf.nn.dropout(L3, keep_prob)

with tf.name_scope("layer4"):
    W4 = tf.Variable(tf.random_normal([256, 10], stddev=0.01))
    model = tf.matmul(L3, W4)
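The 7 * 7 * 64 used in layer3 follows from how padding="SAME" sizes its output: each spatial dimension becomes ceil(input / stride). The short sketch below traces the 28x28 input through the two convolution and pooling steps; same_out is a hypothetical helper written only for this check.

```python
import math

def same_out(size, stride):
    # Output size of one spatial dimension under padding="SAME"
    return math.ceil(size / stride)

size = 28
for name, stride in [("conv1", 1), ("pool1", 2), ("conv2", 1), ("pool2", 2)]:
    size = same_out(size, stride)
    print(name, size)

n_channels = 64                  # output channels of the second convolution
flat = size * size * n_channels  # length of the flattened vector fed to layer3
print("flattened size:", flat)
```

The convolutions keep the size at stride 1, and each 2x2 pooling with stride 2 halves it, giving 28 -> 14 -> 7.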

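with tf.name_scope("cost"):
    cost = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits_v2(logits=model, labels=Y))
    # pass global_step so that it is incremented on every training step;
    # without it the summaries written below would all be logged at step 0
    opt = tf.train.AdamOptimizer(0.001).minimize(cost, global_step=global_step)
    # opt = tf.train.RMSPropOptimizer(0.001, 0.9).minimize(cost)
    tf.summary.scalar("cost", cost)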

# batch_xs.reshape(-1, 28, 28, 1)
# mnist.test.images.reshape(-1, 28, 28, 1)

init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

merged = tf.summary.merge_all()
writer = tf.summary.FileWriter("./logs/mnist_cnn", sess.graph)
cost_epoch = []

modeling¶

%%time
batch_size = 100
total_batch = int(mnist.train.num_examples / batch_size)

for epoch in range(15):
    total_cost = 0
    
    for i in range(total_batch):
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        batch_xs = batch_xs.reshape(-1, 28, 28, 1)
        
        _, cost_val = sess.run([opt, cost], feed_dict={X: batch_xs,
                                                       Y: batch_ys,
                                                       keep_prob: 0.8})
        
        total_cost += cost_val
        cost_epoch.append(total_cost)
        
        summary = sess.run(merged, feed_dict={X:batch_xs, Y: batch_ys, keep_prob:0.8})
        writer.add_summary(summary, global_step=sess.run(global_step))
        
    print("Epoch:", "%4d" % (epoch+1), 
          "Avg.Cost:", "%.4f" % (total_cost / total_batch))
    
print("optimization completed")

Epoch:    1 Avg.Cost: 0.3136
Epoch:    2 Avg.Cost: 0.0977
Epoch:    3 Avg.Cost: 0.0681
Epoch:    4 Avg.Cost: 0.0531
Epoch:    5 Avg.Cost: 0.0438
Epoch:    6 Avg.Cost: 0.0356
Epoch:    7 Avg.Cost: 0.0310
Epoch:    8 Avg.Cost: 0.0275
Epoch:    9 Avg.Cost: 0.0233
Epoch:   10 Avg.Cost: 0.0193
Epoch:   11 Avg.Cost: 0.0198
Epoch:   12 Avg.Cost: 0.0155
Epoch:   13 Avg.Cost: 0.0151
Epoch:   14 Avg.Cost: 0.0130
Epoch:   15 Avg.Cost: 0.0120
optimization completed
CPU times: user 38min 12s, sys: 1min 17s, total: 39min 29s
Wall time: 14min 52s

cost function¶

%matplotlib inline
import matplotlib.pyplot as plt

plt.figure(figsize=(20, 8))
plt.plot(cost_epoch, "g")
plt.title("cost")
plt.show()

tensor graph¶

## jptensor.py must be present in the working directory for this import
import jptensor as jp

tf_graph = tf.get_default_graph().as_graph_def()
jp.show_graph(tf_graph)

is_correct = tf.equal(tf.argmax(model, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))
print("accuracy %.4f" % (sess.run(accuracy, feed_dict={
    X:mnist.test.images.reshape(-1, 28, 28, 1),
    Y:mnist.test.labels,
    keep_prob: 1
})))

accuracy 0.9909

labels¶

%matplotlib inline
labels = sess.run(model, feed_dict={X: mnist.test.images.reshape(-1, 28, 28, 1),
                                    Y: mnist.test.labels,
                                    keep_prob: 1})
fig = plt.figure()
for i in range(10):
    # 2x5 grid of plots; draw the (i+1)-th digit image
    subplot = fig.add_subplot(2, 5, i+1)
    
    # remove the x and y axis ticks
    subplot.set_xticks([])
    subplot.set_yticks([])
    
    # print the predicted digit above the image
    # np.argmax does the same job as tf.argmax
    # the i-th row of labels holds the model's 10 class scores,
    # so the index of the largest value is the predicted digit
    subplot.set_title("%d" % np.argmax(labels[i]))
    
    # the i-th image is stored as a 1-D array;
    # reshape it into a 28 x 28 2-D array
    subplot.imshow(mnist.test.images[i].reshape(28, 28))
plt.show()

from IPython.core.display import display, HTML

display(HTML("<style> .container{width:100% !important;}</style>"))

import pandas as pd
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

plt.rcParams["axes.unicode_minus"] = False
plt.rcParams["figure.figsize"] = (12, 8)

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("./mnist/data/", one_hot=True)

Extracting ./mnist/data/train-images-idx3-ubyte.gz
Extracting ./mnist/data/train-labels-idx1-ubyte.gz
Extracting ./mnist/data/t10k-images-idx3-ubyte.gz
Extracting ./mnist/data/t10k-labels-idx1-ubyte.gz

variable setting¶

global_step = tf.Variable(0, trainable=False, name="global_step")
X = tf.placeholder(tf.float32, shape=[None, 784], name="X")
Y = tf.placeholder(tf.float32, shape=[None,  10], name="Y")

W1 = tf.Variable(tf.random_normal([784, 256], mean=0, stddev=0.01), name="W1")
W2 = tf.Variable(tf.random_normal([256, 256], mean=0, stddev=0.01), name="W2")
W3 = tf.Variable(tf.random_normal([256,  10], mean=0, stddev=0.01), name="W3")

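# wrap the biases in tf.Variable so they are trained; a bare tf.zeros tensor is a constant
b1 = tf.Variable(tf.zeros([256]), name="bias1")
b2 = tf.Variable(tf.zeros([256]), name="bias2")
b3 = tf.Variable(tf.zeros([10]) , name="bias3")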

model setting¶

keep_prob = tf.placeholder(tf.float32)

with tf.name_scope("layer1"):
    L1 = tf.add(tf.matmul(X, W1), b1)
    L1 = tf.nn.relu(L1)
    L1 = tf.nn.dropout(L1, keep_prob)
    
with tf.name_scope("layer2"):
    L2 = tf.add(tf.matmul(L1, W2), b2)
    L2 = tf.nn.relu(L2)
    L2 = tf.nn.dropout(L2, keep_prob)
    
with tf.name_scope("layer3"):
    model = tf.add(tf.matmul(L2, W3), b3)
    
with tf.name_scope("cost"):
    cost = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits_v2(labels=Y, logits=model))
    opt = tf.train.AdamOptimizer(0.001).minimize(cost, global_step=global_step)
    
    tf.summary.scalar("cost", cost)

model initialization¶

init = tf.global_variables_initializer()
sess = tf.Session()

sess.run(init)

merged = tf.summary.merge_all()
writer = tf.summary.FileWriter("./logs/mnist_matplotlib", sess.graph)

batch_size = 50
total_batch = int(mnist.train.num_examples / batch_size)
cost_epoch = []

model training¶

%%time
for epoch in range(20):
    total_cost = 0
    
    for i in range(total_batch):
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        
        _, cost_val = sess.run([opt, cost], feed_dict={X:batch_xs, Y: batch_ys, keep_prob:0.8})
        total_cost += cost_val
        cost_epoch.append(total_cost)
        
        summary = sess.run(merged, feed_dict={X:batch_xs, Y: batch_ys, keep_prob:0.8})
        writer.add_summary(summary, global_step=sess.run(global_step))
        
    print("epoch: %d, Avg.cost: %.4f" % (
        epoch+1, total_cost / total_batch
    ))

epoch: 1, Avg.cost: 0.3481
epoch: 2, Avg.cost: 0.1395
epoch: 3, Avg.cost: 0.1000
epoch: 4, Avg.cost: 0.0806
epoch: 5, Avg.cost: 0.0697
epoch: 6, Avg.cost: 0.0591
epoch: 7, Avg.cost: 0.0507
epoch: 8, Avg.cost: 0.0455
epoch: 9, Avg.cost: 0.0417
epoch: 10, Avg.cost: 0.0394
epoch: 11, Avg.cost: 0.0362
epoch: 12, Avg.cost: 0.0361
epoch: 13, Avg.cost: 0.0305
epoch: 14, Avg.cost: 0.0303
epoch: 15, Avg.cost: 0.0271
epoch: 16, Avg.cost: 0.0282
epoch: 17, Avg.cost: 0.0267
epoch: 18, Avg.cost: 0.0267
epoch: 19, Avg.cost: 0.0219
epoch: 20, Avg.cost: 0.0238
CPU times: user 3min 22s, sys: 43.7 s, total: 4min 6s
Wall time: 2min 27s

cost function¶

plt.figure(figsize=(20, 8))
plt.plot(cost_epoch, "g")
plt.title("cost_epoch")
plt.show()

tensor graph¶

## jptensor.py must be present in the working directory for this import
import jptensor as jp

tf_graph = tf.get_default_graph().as_graph_def()
jp.show_graph(tf_graph)

test¶

is_correct = tf.equal(tf.argmax(model, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))

accuracy_val = sess.run(accuracy, feed_dict={X: mnist.test.images, 
                                             Y: mnist.test.labels,
                                             keep_prob: 1})

print("accuracy: %.3f" % (accuracy_val))

accuracy: 0.980

labels¶

labels = sess.run(model, feed_dict={X: mnist.test.images,
                                    Y: mnist.test.labels,
                                    keep_prob: 1})

%matplotlib inline
fig = plt.figure()
for i in range(10):
    # 2x5 grid of plots; draw the (i+1)-th digit image
    subplot = fig.add_subplot(2, 5, i+1)
    
    # remove the x and y axis ticks
    subplot.set_xticks([])
    subplot.set_yticks([])
    
    # print the predicted digit above the image
    # np.argmax does the same job as tf.argmax
    # the i-th row of labels holds the model's 10 class scores,
    # so the index of the largest value is the predicted digit
    subplot.set_title("%d" % (np.argmax(labels[i])))
    
    # the i-th image is stored as a 1-D array;
    # reshape it into a 28 x 28 2-D array
    subplot.imshow(mnist.test.images[i].reshape((28, 28)))
plt.show()

from IPython.core.display import HTML, display

display(HTML("<style> .container{width:100% !important;}</style>"))

from IPython.core.display import display, HTML
display(HTML("<style> .container{width:100% !important;}</style>"))

import tensorflow as tf
import warnings
import numpy as np
import matplotlib.pyplot as plt
plt.rcParams["figure.figsize"] = (12, 8)

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("./mnist/data/", one_hot=True)

WARNING:tensorflow:From <ipython-input-3-4dcbd946c02b>:2: read_data_sets (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.
Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.
WARNING:tensorflow:From /anaconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/mnist.py:260: maybe_download (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Please write your own downloading logic.
WARNING:tensorflow:From /anaconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/mnist.py:262: extract_images (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting ./mnist/data/train-images-idx3-ubyte.gz
WARNING:tensorflow:From /anaconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/mnist.py:267: extract_labels (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting ./mnist/data/train-labels-idx1-ubyte.gz
WARNING:tensorflow:From /anaconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/mnist.py:110: dense_to_one_hot (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.
Instructions for updating:
Please use tf.one_hot on tensors.
Extracting ./mnist/data/t10k-images-idx3-ubyte.gz
Extracting ./mnist/data/t10k-labels-idx1-ubyte.gz
WARNING:tensorflow:From /anaconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/mnist.py:290: DataSet.__init__ (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.
Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.

global_step = tf.Variable(0, trainable=False, name="global_step")
X = tf.placeholder(tf.float32, shape=[None, 784], name="X") # None, 784
Y = tf.placeholder(tf.float32, shape=[None, 10], name="Y")

W1 = tf.Variable(tf.random_normal([784, 256], mean=0, stddev=0.01), name="W1")
W2 = tf.Variable(tf.random_normal([256, 256], mean=0, stddev=0.01), name="W2")
W3 = tf.Variable(tf.random_normal([256,  10], mean=0, stddev=0.011), name="W3")

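# wrap the biases in tf.Variable so they are trained; a bare tf.zeros tensor is a constant
b1 = tf.Variable(tf.zeros([256]), name="bias1")
b2 = tf.Variable(tf.zeros([256]), name="bias2")
b3 = tf.Variable(tf.zeros([10]),  name="bias3")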

dropout¶

During training only a random part of the network is used at each step -> prevents overfitting
Training tends to take longer as a result
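What keep_prob does can be sketched in NumPy. The function below mimics the usual "inverted dropout" behaviour, in which kept units are scaled by 1 / keep_prob so that the expected activation stays the same. It is a hypothetical stand-in written for illustration, not the TensorFlow implementation itself.

```python
import numpy as np

def dropout(x, keep_prob, rng):
    # Keep each unit with probability keep_prob and rescale the survivors.
    mask = rng.random(x.shape) < keep_prob
    return np.where(mask, x / keep_prob, 0.0)

rng = np.random.default_rng(0)
x = np.ones((1000, 256))

dropped = dropout(x, 0.8, rng)      # training: about 80% of units survive
kept_fraction = np.mean(dropped != 0)
untouched = dropout(x, 1.0, rng)    # evaluation: keep_prob = 1 keeps everything

print(kept_fraction, dropped.mean(), np.array_equal(untouched, x))
```

This is why the notebook feeds keep_prob 0.8 while training and keep_prob 1 when measuring accuracy.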

keep_prob = tf.placeholder(tf.float32)

with tf.name_scope("layer1"):
    L1 = tf.add(tf.matmul(X, W1), b1)
    L1 = tf.nn.relu(L1)
    L1 = tf.nn.dropout(L1, keep_prob)
    
with tf.name_scope("layer2"):
    L2 = tf.add(tf.matmul(L1, W2), b2)
    L2 = tf.nn.relu(L2)
    L2 = tf.nn.dropout(L2, keep_prob)
    
with tf.name_scope("layer3"):
    model = tf.add(tf.matmul(L2, W3), b3)

with tf.name_scope("optimizer"):
    cost = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits_v2(labels=Y, logits=model))
    opt = tf.train.AdamOptimizer(learning_rate=0.001).minimize(cost, global_step=global_step)
    tf.summary.scalar("cost", cost)

init = tf.global_variables_initializer()
sess = tf.Session()
# saver = tf.train.Saver(tf.global_variables())
sess.run(init)

merged = tf.summary.merge_all()
writer = tf.summary.FileWriter("./logs/mnist_dropout", sess.graph)

batch_size = 50
total_batch = int(mnist.train.num_examples/batch_size)
cost_epoch = []

%%time
for epoch in range(30):
    total_cost = 0
    
    for i in range(total_batch):
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        
        _, cost_val = sess.run([opt, cost], feed_dict={X:batch_xs, Y:batch_ys, keep_prob: 0.8})
        total_cost += cost_val
        cost_epoch.append(total_cost)
        
        summary = sess.run(merged, feed_dict={X:batch_xs, Y:batch_ys, keep_prob: 0.8})
        writer.add_summary(summary, global_step=sess.run(global_step))
        
    print("epoch: {}, Avg.cost: {}".format(epoch+1, total_cost / total_batch))

epoch: 1, Avg.cost: 0.3532606958614832
epoch: 2, Avg.cost: 0.1427451760597019
epoch: 3, Avg.cost: 0.10226461782120168
epoch: 4, Avg.cost: 0.08699281872800467
epoch: 5, Avg.cost: 0.06855666186279533
epoch: 6, Avg.cost: 0.05921383855340537
epoch: 7, Avg.cost: 0.0536975436815357
epoch: 8, Avg.cost: 0.04580990582659565
epoch: 9, Avg.cost: 0.04084625356087186
epoch: 10, Avg.cost: 0.040573723167723404
epoch: 11, Avg.cost: 0.035842695366584604
epoch: 12, Avg.cost: 0.03263294939398871
epoch: 13, Avg.cost: 0.03360669748346316
epoch: 14, Avg.cost: 0.030501310914848794
epoch: 15, Avg.cost: 0.028370174647349235
epoch: 16, Avg.cost: 0.02699218331828392
epoch: 17, Avg.cost: 0.026614617982999005
epoch: 18, Avg.cost: 0.027732884158863685
epoch: 19, Avg.cost: 0.02651893331764189
epoch: 20, Avg.cost: 0.024510102366322662
epoch: 21, Avg.cost: 0.024103802091576653
epoch: 22, Avg.cost: 0.021529521410293455
epoch: 23, Avg.cost: 0.024205624244715927
epoch: 24, Avg.cost: 0.021746395409784947
epoch: 25, Avg.cost: 0.02059082699589949
epoch: 26, Avg.cost: 0.02283201359495644
epoch: 27, Avg.cost: 0.021406652638101233
epoch: 28, Avg.cost: 0.022226517286706812
epoch: 29, Avg.cost: 0.019306987923368567
epoch: 30, Avg.cost: 0.020735127189004873
CPU times: user 4min 42s, sys: 1min, total: 5min 43s
Wall time: 3min 23s

plt.plot(cost_epoch, "g")
plt.title("cost_epoch")
plt.show()

is_correct = tf.equal(tf.argmax(model, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))
print("accuracy: {}".format(sess.run(accuracy, feed_dict={X: mnist.test.images,
                                                        Y: mnist.test.labels,
                                                        keep_prob: 1})))

accuracy: 0.9835000038146973

### tensorboard graph

from IPython.display import clear_output, Image, display, HTML

def strip_consts(graph_def, max_const_size=32):
    """Strip large constant values from graph_def."""
    strip_def = tf.GraphDef()
    for n0 in graph_def.node:
        n = strip_def.node.add() 
        n.MergeFrom(n0)
        if n.op == 'Const':
            tensor = n.attr['value'].tensor
            size = len(tensor.tensor_content)
            if size > max_const_size:
                tensor.tensor_content = b"<stripped %d bytes>" % size  # tensor_content is a bytes field in Python 3
    return strip_def

def show_graph(graph_def, max_const_size=32):
    """Visualize TensorFlow graph."""
    if hasattr(graph_def, 'as_graph_def'):
        graph_def = graph_def.as_graph_def()
    strip_def = strip_consts(graph_def, max_const_size=max_const_size)
    code = """
        <script>
          function load() {{
            document.getElementById("{id}").pbtxt = {data};
          }}
        </script>
        <link rel="import" href="https://tensorboard.appspot.com/tf-graph-basic.build.html" onload=load()>
        <div style="height:600px">
          <tf-graph-basic id="{id}"></tf-graph-basic>
        </div>
    """.format(data=repr(str(strip_def)), id='graph'+str(np.random.rand()))

    iframe = """
        <iframe seamless style="width:1200px;height:620px;border:0" srcdoc="{}"></iframe>
    """.format(code.replace('"', '&quot;'))
    display(HTML(iframe))

show_graph(tf.get_default_graph().as_graph_def())

# ### tensorboard

# def TB(cleanup=False):
#     import webbrowser
#     webbrowser.open('http://127.0.0.1:6006')

#     !tensorboard --logdir="./logs/mnist_dropout/"

# TB()

import pandas as pd
import numpy as np

# hair, wing, etc, mammals, bird
col_list = ["hair", "wing", "etc", "mammals", "bird"]
classification_ex = pd.DataFrame({
    "hair": [0, 1, 1, 0, 0, 0],
    "wing": [0, 0, 1, 0 ,0 ,1],
    "bird": [1, 0, 0, 1, 1, 0],
    "etc" : [0, 1, 0, 0, 0, 0],
    "mammals": [0, 0, 1 ,0 ,0 ,1]   
}, columns=col_list)

classification_ex.to_csv("./datas/classification_ex1.csv", 
                         encoding="utf-8",
                        index=False) # header=False

## iris_data
from sklearn.datasets import load_iris

iris = load_iris()
iris_values = np.hstack([iris.data, iris.target.reshape(-1, 1)])

col_names = ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"]
iris_data = pd.DataFrame(data=iris_values, columns=col_names)

# map the numeric targets to species names in a single assignment
iris_data["species"] = iris_data["species"].map(
    {0.0: "setosa", 1.0: "versicolor", 2.0: "virginica"})

iris_data.to_csv("./datas/iris.csv", encoding="utf-8", index=False)

## breast_cancer
from sklearn.datasets import load_breast_cancer

cancer = load_breast_cancer()
cancer_values = np.hstack([cancer.data, cancer.target.reshape(-1, 1)])

col_names = np.hstack([cancer.feature_names, "result"])
cancer_data = pd.DataFrame(data=cancer_values, columns=col_names)

# In load_breast_cancer, target 0 is malignant and target 1 is benign.
cancer_data["result"] = cancer_data["result"].map({0.0: "malignant", 1.0: "benign"})

cancer_data.to_csv("./datas/cancer.csv", encoding="utf-8", index=False)

from IPython.core.display import display, HTML
display(HTML("<style> .container{width:100% !important;}</style>"))

import tensorflow as tf
import warnings
warnings.filterwarnings("ignore")

Preparing the MNIST data¶

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("./mnist/data/", one_hot=True)

WARNING:tensorflow:From <ipython-input-2-4dcbd946c02b>:2: read_data_sets (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.
Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.
WARNING:tensorflow:From /anaconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/mnist.py:260: maybe_download (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Please write your own downloading logic.
WARNING:tensorflow:From /anaconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/mnist.py:262: extract_images (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting ./mnist/data/train-images-idx3-ubyte.gz
WARNING:tensorflow:From /anaconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/mnist.py:267: extract_labels (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.
Instructions for updating:
Please use tf.data to implement this functionality.
Extracting ./mnist/data/train-labels-idx1-ubyte.gz
WARNING:tensorflow:From /anaconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/mnist.py:110: dense_to_one_hot (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.
Instructions for updating:
Please use tf.one_hot on tensors.
Extracting ./mnist/data/t10k-images-idx3-ubyte.gz
Extracting ./mnist/data/t10k-labels-idx1-ubyte.gz
WARNING:tensorflow:From /anaconda3/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/mnist.py:290: DataSet.__init__ (from tensorflow.contrib.learn.python.learn.datasets.mnist) is deprecated and will be removed in a future version.
Instructions for updating:
Please use alternatives such as official/mnist/dataset.py from tensorflow/models.

model setting¶

image = 28 x 28 pixels, 784 attributes
label = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

global_step = tf.Variable(0, trainable=False, name="global_step")
X = tf.placeholder(tf.float32, [None, 784], name="X")
Y = tf.placeholder(tf.float32, [None, 10], name="Y")

minibatch¶

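Training on several images at once is more effective than training on one image at a time, provided enough computing resources are available.
For that reason the data is usually cut into chunks of a suitable size for training; each chunk is called a minibatch.

In the placeholder shape [None, 784], the first dimension is the number of images fed in at once, i.e. the minibatch size.
It can be fixed to a specific size, but when the number of samples per step varies, setting it to None lets TensorFlow infer it from the data that is fed in.

784 (input features) -> 256 (first hidden layer) -> 256 (second hidden layer) -> 10 (output: one unit per digit class, 0-9)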
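
To make the shapes concrete, here is a small NumPy sketch (not the TensorFlow graph itself) that pushes batches of different sizes through a 784 -> 256 -> 256 -> 10 stack. The random weights and the helper name `forward` are illustrative assumptions; the only point is that the first dimension, which the placeholder leaves as `None`, can vary freely.

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer sizes from the text: 784 inputs -> 256 -> 256 -> 10 classes.
W1 = rng.normal(size=(784, 256))
W2 = rng.normal(size=(256, 256))
W3 = rng.normal(size=(256, 10))

def forward(x):
    """Illustrative forward pass: two ReLU hidden layers, then raw logits."""
    h1 = np.maximum(x @ W1, 0.0)
    h2 = np.maximum(h1 @ W2, 0.0)
    return h2 @ W3

# The batch dimension plays the role of `None` in the placeholder shape.
for batch_size in (1, 100, 256):
    x = rng.random((batch_size, 784))
    print(batch_size, forward(x).shape)
```

Whatever the batch size, the output keeps one row per input image and ten columns, one per digit class.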

W1 = tf.Variable(tf.random_normal([784, 256], mean=0, stddev=1), name="var1")
W2 = tf.Variable(tf.random_normal([256, 256], mean=0, stddev=1), name="var2")
W3 = tf.Variable(tf.random_normal([256, 10],  mean=0, stddev=1), name="var3")

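# Note: tf.zeros creates constant tensors, so these biases stay at zero and are not trained.
# Wrapping them in tf.Variable, e.g. tf.Variable(tf.zeros([256])), would make them learnable.
b1 = tf.zeros([256], name="bias1")
b2 = tf.zeros([256], name="bias2")
b3 = tf.zeros([10],  name="bias3")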

with tf.name_scope("layer1"):
    L1 = tf.add(tf.matmul(X, W1), b1)
    L1 = tf.nn.relu(L1)

with tf.name_scope("layer2"):
    L2 = tf.add(tf.matmul(L1, W2), b2)
    L2 = tf.nn.relu(L2)
    
with tf.name_scope("layer3"):
    model = tf.add(tf.matmul(L2, W3), b3)

with tf.name_scope("opt"):
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(labels=Y, logits=model))
    optimizer = tf.train.AdamOptimizer(learning_rate=0.01).minimize(cost, global_step=global_step)
    
    tf.summary.scalar("cost", cost)
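
`tf.nn.softmax_cross_entropy_with_logits_v2` applies softmax to the raw logits and computes the cross-entropy against the one-hot labels in a single step; `tf.reduce_mean` then averages over the batch. A NumPy sketch of the same computation follows. The function name `softmax_cross_entropy` and the sample values are illustrative, not part of TensorFlow.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Mean softmax cross-entropy over a batch, computed from raw logits."""
    # Subtract the row-wise max before exponentiating for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Per-sample loss is -sum(labels * log_softmax); average over the batch.
    return -(labels * log_probs).sum(axis=1).mean()

# Hypothetical logits and one-hot labels for two samples and three classes.
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3]])
labels = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
print(softmax_cross_entropy(logits, labels))
```

This is also why the last layer in the model above has no activation: the loss expects unnormalized logits.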

init = tf.global_variables_initializer()
sess = tf.Session()
saver = tf.train.Saver(tf.global_variables())
sess.run(init)


merged = tf.summary.merge_all()
writer = tf.summary.FileWriter("./logs/mnist_01", sess.graph)

batch_size = 100
total_batch = int(mnist.train.num_examples / batch_size)

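MNIST contains tens of thousands of samples, so training uses minibatches.

The minibatch size is set to 100.
Dividing mnist.train.num_examples by the batch size gives the number of minibatches per pass, stored in total_batch.

Training over the entire MNIST training set is then repeated 15 times.
One full pass over the training data is called an epoch.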
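
The relationship between batch size, batches per epoch, and epochs can be sketched without TensorFlow. The helper `iterate_minibatches` and the random stand-in data below are assumptions for illustration; the notebook itself relies on `mnist.train.next_batch`.

```python
import numpy as np

def iterate_minibatches(features, labels, batch_size, rng):
    """Yield shuffled (features, labels) minibatches covering the data once."""
    order = rng.permutation(len(features))
    for start in range(0, len(features) - batch_size + 1, batch_size):
        idx = order[start:start + batch_size]
        yield features[idx], labels[idx]

rng = np.random.default_rng(0)
features = rng.random((1000, 784)).astype(np.float32)                   # stand-in images
labels = np.eye(10, dtype=np.float32)[rng.integers(0, 10, size=1000)]   # one-hot labels

batch_size = 100
total_batch = len(features) // batch_size   # minibatches per epoch

for epoch in range(3):
    seen = sum(len(x) for x, _ in
               iterate_minibatches(features, labels, batch_size, rng))
    print("epoch", epoch + 1, "batches", total_batch, "samples seen", seen)
```

Each epoch visits every sample once, split into `total_batch` chunks of `batch_size` rows.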

for epoch in range(15):
    total_cost = 0
    
    for i in range(total_batch):
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        
        _, cost_val = sess.run([optimizer, cost], feed_dict={X:batch_xs, Y:batch_ys})
        total_cost += cost_val
        
        summary = sess.run(merged, feed_dict={X:batch_xs, Y:batch_ys})
        writer.add_summary(summary, global_step=sess.run(global_step))
        
    print("Epoch: {}, Avg.cost = {:.3f}".format(epoch+1, total_cost / total_batch))

Epoch: 1, Avg.cost = 53.018
Epoch: 2, Avg.cost = 8.887
Epoch: 3, Avg.cost = 4.864
Epoch: 4, Avg.cost = 3.215
Epoch: 5, Avg.cost = 2.577
Epoch: 6, Avg.cost = 2.208
Epoch: 7, Avg.cost = 2.240
Epoch: 8, Avg.cost = 1.909
Epoch: 9, Avg.cost = 1.490
Epoch: 10, Avg.cost = 1.574
Epoch: 11, Avg.cost = 1.440
Epoch: 12, Avg.cost = 1.100
Epoch: 13, Avg.cost = 0.913
Epoch: 14, Avg.cost = 1.022
Epoch: 15, Avg.cost = 0.742

# import os

# if not os.path.isdir("./model/mnist_01"):
#     os.mkdir("./model/mnist_01")

# saver.save(sess, "./model/mnist_01/dnn/ckpt", global_step=global_step)
# print("optimize complete!")

is_correct = tf.equal(tf.argmax(model, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))
print("accuracy: {:.4f}".format(sess.run(accuracy, feed_dict={X:mnist.test.images, 
                                                     Y:mnist.test.labels})))

accuracy: 0.9612
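
The accuracy computation compares the index of the largest logit with the index of the 1 in the one-hot label. A NumPy sketch with made-up values (the arrays are hypothetical, chosen only to show the steps):

```python
import numpy as np

# Hypothetical logits for 4 samples and 3 classes.
logits = np.array([[2.0, 0.1, 0.3],
                   [0.2, 1.5, 0.1],
                   [0.3, 0.2, 0.9],
                   [1.2, 0.4, 0.8]])
# One-hot targets for classes 0, 1, 2, 2.
targets = np.eye(3)[[0, 1, 2, 2]]

predicted = logits.argmax(axis=1)                 # like tf.argmax(model, 1)
actual = targets.argmax(axis=1)                   # like tf.argmax(Y, 1)
is_correct = predicted == actual                  # like tf.equal
accuracy = is_correct.astype(np.float32).mean()   # like tf.reduce_mean(tf.cast(...))
print(accuracy)
```

Three of the four predictions match here, so the mean of the 0/1 correctness flags is the fraction of correct samples.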

from IPython.core.display import display, HTML
display(HTML("<style> .container{width:100% !important;}</style>"))

import tensorflow as tf
import pandas as pd
import numpy as np

import warnings
warnings.filterwarnings("ignore")

import os

# set working directory
HOME = os.getenv("HOME")
WORKDIR = os.path.join(HOME, "python", "deep_learning", "tensorflow")
os.chdir(WORKDIR)

# load_data
cancer = pd.read_csv("./datas/cancer.csv")

# cancer_target_names = np.unique(cancer.result.values)
# cancer_target_names = cancer_target_names[[1, 0]] 

# for i, n in enumerate(cancer_target_names):
#     cancer.replace(to_replace=cancer_target_names[i], value=i, inplace=True)

# data split
from sklearn.model_selection import train_test_split

train_set, test_set = train_test_split(cancer, test_size=0.2, random_state=0)
train_labels = train_set["result"].values
test_labels  = test_set["result"].values

# preprocessing
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.impute import SimpleImputer
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline, FeatureUnion

class DataframeSelector(BaseEstimator, TransformerMixin):
    def __init__(self, attr_list):
        self.attr_list = attr_list
        
    def fit(self, X, y=None):
        return self
    
    def transform(self, X):
        return X.iloc[:, self.attr_list].values

n_list = range(train_set.shape[1]-1)
c_list = [train_set.shape[1]-1]

num_pipeline = Pipeline([
    ["selector", DataframeSelector(n_list)],
    ["imputer", SimpleImputer(strategy="median")],
    ["scaler", StandardScaler()]
])

cat_pipeline = Pipeline([
    ["selector", DataframeSelector(c_list)],
    ["encoder", OneHotEncoder(sparse=False)]
])

full_pipeline = FeatureUnion(transformer_list=[
    ["nums", num_pipeline],
    ["cats", cat_pipeline]
])

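scaled_train = full_pipeline.fit_transform(train_set)
# Apply the scaling and encoding fitted on the training set;
# refitting on the test set would scale it with its own statistics.
scaled_test  = full_pipeline.transform(test_set)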

x_train, y_train = scaled_train[:, :30], scaled_train[:, 30:] 
x_test, y_test = scaled_test[:, :30], scaled_test[:, 30:]

x_train = x_train.astype("float32")
y_train = y_train.astype("float32")
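
A scaler is normally fitted on the training split and then only applied to the test split, so that both are transformed with the same statistics. The NumPy sketch below shows that idea with random stand-in data; it mirrors what a standard scaler does (subtract the mean, divide by the standard deviation) but is not the pipeline used in this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(loc=5.0, scale=2.0, size=(80, 3))   # stand-in training features
test = rng.normal(loc=5.0, scale=2.0, size=(20, 3))    # stand-in test features

# "fit": estimate the statistics from the training split only.
mean = train.mean(axis=0)
std = train.std(axis=0)

# "transform": apply the same statistics to both splits.
train_scaled = (train - mean) / std
test_scaled = (test - mean) / std

print(train_scaled.mean(axis=0).round(6))   # approximately 0 per column
print(test_scaled.mean(axis=0).round(3))    # near 0, but generally not exactly 0
```

The test split ends up close to, but not exactly, zero mean and unit variance, which is expected when it is scaled with training statistics.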

global_step = tf.Variable(0, trainable=False, name="global_step")

X = tf.placeholder(tf.float32, name="X")
Y = tf.placeholder(tf.float32, name="Y")

# 30, 2
W1 = tf.Variable(tf.random_normal([30, 10], mean=0, stddev=1))
W2 = tf.Variable(tf.random_normal([10, 100], mean=0, stddev=1))
W3 = tf.Variable(tf.random_normal([100, 500], mean=0, stddev=1))
W4 = tf.Variable(tf.random_normal([500, 2], mean=0, stddev=1))

b1 = tf.zeros([10])
b2 = tf.zeros([100])
b3 = tf.zeros([500])
b4 = tf.zeros([2])

with tf.name_scope("layer1"):
    L1 = tf.add(tf.matmul(X, W1), b1)
    L1 = tf.nn.sigmoid(L1)

with tf.name_scope("layer2"):
    L2 = tf.add(tf.matmul(L1, W2), b2)
    L2 = tf.nn.sigmoid(L2)

with tf.name_scope("layer3"):
    L3 = tf.add(tf.matmul(L2, W3), b3)
    L3 = tf.nn.sigmoid(L3)   
    
with tf.name_scope("layer4"):
    model = tf.add(tf.matmul(L3, W4), b4)

with tf.name_scope("optimizer"):
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(labels=Y, logits=model))
    optimizer = tf.train.AdamOptimizer(learning_rate=0.01)
    train_op = optimizer.minimize(cost, global_step=global_step)
    
    tf.summary.scalar("cost", cost)
#     tf.summary.scalar("W1", W1)
#     tf.summary.scalar("W2", W2)
#     tf.summary.scalar("W3", W3)
#     tf.summary.scalar("W4", W4)

#     tf.summary.scalar("b1", b1)
#     tf.summary.scalar("b2", b2)
#     tf.summary.scalar("b3", b3)
#     tf.summary.scalar("b4", b4)

# tf.reset_default_graph()
sess = tf.Session()
saver = tf.train.Saver(tf.global_variables())

ckpt = tf.train.get_checkpoint_state("./model/cancer")
if ckpt and tf.train.checkpoint_exists(ckpt.model_checkpoint_path):
    saver.restore(sess, ckpt.model_checkpoint_path)
    
else:
    init = tf.global_variables_initializer()
    sess.run(init)

INFO:tensorflow:Restoring parameters from ./model/cancer/dnn.ckpt-5000

merged = tf.summary.merge_all()
writer = tf.summary.FileWriter("./logs/cancer", sess.graph)

for step in range(5000):
    sess.run(train_op, feed_dict={X:x_train, Y:y_train})
    
    if (step+1) % 200 == 0:
        print("step: {}, cost: {}".\
             format(sess.run(global_step),
                    sess.run(cost, feed_dict={X:x_train, Y:y_train})))
        
    summary = sess.run(merged, feed_dict={X:x_train, Y:y_train})   
    writer.add_summary(summary, global_step=sess.run(global_step))
    
saver.save(sess, "./model/cancer/dnn.ckpt", global_step=global_step)

step: 5200, cost: 2.7666922619573597e-07
step: 5400, cost: 2.444436688620044e-07
step: 5600, cost: 2.1798199156819464e-07
step: 5800, cost: 1.9518827798492566e-07
step: 6000, cost: 1.747525146811313e-07
step: 6200, cost: 1.5667471586766624e-07
step: 6400, cost: 1.4069287601614633e-07
step: 6600, cost: 1.247110219537717e-07
step: 6800, cost: 1.1213514028440841e-07
step: 7000, cost: 1.0191723021080179e-07
step: 7200, cost: 9.196132566557935e-08
step: 7400, cost: 8.200541401492956e-08
step: 7600, cost: 7.51934692289069e-08
step: 7800, cost: 6.759553627944115e-08
step: 8000, cost: 6.052158596503432e-08
step: 8200, cost: 5.554362658699574e-08
step: 8400, cost: 4.925567154145938e-08
step: 8600, cost: 4.427770861070712e-08
step: 8800, cost: 4.139572951089576e-08
step: 9000, cost: 3.694175632062979e-08
step: 9200, cost: 3.353578392761847e-08
step: 9400, cost: 3.065380482780711e-08
step: 9600, cost: 2.7509825528682086e-08
step: 9800, cost: 2.541383992138435e-08
step: 10000, cost: 2.3317852537729777e-08

'./model/cancer/dnn.ckpt-10000'

prediction = tf.argmax(model, 1)
target = tf.argmax(Y, 1)
is_correct = tf.equal(prediction, target)
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))

print("=================================")
print("train_prediction: \n{}".format(sess.run(prediction, feed_dict={X:x_train, Y:y_train})))
print("train_target: \n{}".format(sess.run(target, feed_dict={X:x_train, Y:y_train})))
print("train_accuracy: \n{:.3f}".format(sess.run(accuracy*100, feed_dict={X:x_train, Y:y_train})))

print("\n=================================")
print("test_prediction: \n{}".format(sess.run(prediction, feed_dict={X:x_test, Y:y_test})))
print("test_target: \n{}".format(sess.run(target, feed_dict={X:x_test, Y:y_test})))
print("test_accuracy: \n{:.3f}".format(sess.run(accuracy*100, feed_dict={X:x_test, Y:y_test})))

=================================
train_prediction: 
[0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 1 0 0 0 1 0 0 1 0 1 0 0 0
 0 0 0 0 1 0 1 0 1 1 0 0 1 0 1 1 1 0 0 0 0 0 0 1 0 0 0 0 1 1 0 0 1 0 0 0 0
 1 0 0 1 1 0 0 1 1 0 0 1 0 0 1 1 1 0 0 0 1 0 0 0 0 0 1 0 1 0 1 0 1 0 1 0 0
 0 0 1 0 1 0 0 0 1 0 0 0 1 0 0 1 1 0 1 0 1 0 1 1 1 1 0 1 0 1 0 1 0 1 0 0 1
 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 1 0 1 0 0 1 0 1 1 0 1 1 0
 0 1 0 0 0 1 0 0 0 0 0 1 1 0 0 0 0 1 1 0 0 0 0 1 0 0 1 0 0 1 0 1 0 0 0 0 1
 1 1 1 0 1 0 1 1 0 0 0 0 0 1 0 0 1 0 0 1 1 0 0 0 1 1 0 0 1 0 0 0 1 0 1 0 1
 1 1 1 0 0 0 0 1 1 0 0 0 0 0 1 0 0 1 0 0 1 1 1 1 0 0 1 0 0 0 1 1 0 0 0 0 0
 1 1 1 0 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 1 0 1 0 1 1 0 0 0 0 0 1 1 1 0 0 1
 1 0 0 1 0 1 1 0 1 0 0 0 0 0 0 0 1 1 0 0 0 1 0 0 0 0 1 0 1 1 1 0 1 0 1 0 1
 1 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 1 1 1 0 1 0 0 1
 1 1 1 0 1 1 1 0 1 0 1 0 0 1 1 0 1 0 0 0 0 1 0 0 1 0 0 0 1 1 0 0 0 1 0 0 1
 0 0 0 0 0 1 1 1 0 0 0]
train_target: 
[0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 0 0 1 1 0 0 0 1 0 0 1 0 1 0 0 0
 0 0 0 0 1 0 1 0 1 1 0 0 1 0 1 1 1 0 0 0 0 0 0 1 0 0 0 0 1 1 0 0 1 0 0 0 0
 1 0 0 1 1 0 0 1 1 0 0 1 0 0 1 1 1 0 0 0 1 0 0 0 0 0 1 0 1 0 1 0 1 0 1 0 0
 0 0 1 0 1 0 0 0 1 0 0 0 1 0 0 1 1 0 1 0 1 0 1 1 1 1 0 1 0 1 0 1 0 1 0 0 1
 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 0 1 0 1 0 0 1 0 1 1 0 1 1 0
 0 1 0 0 0 1 0 0 0 0 0 1 1 0 0 0 0 1 1 0 0 0 0 1 0 0 1 0 0 1 0 1 0 0 0 0 1
 1 1 1 0 1 0 1 1 0 0 0 0 0 1 0 0 1 0 0 1 1 0 0 0 1 1 0 0 1 0 0 0 1 0 1 0 1
 1 1 1 0 0 0 0 1 1 0 0 0 0 0 1 0 0 1 0 0 1 1 1 1 0 0 1 0 0 0 1 1 0 0 0 0 0
 1 1 1 0 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 1 0 1 0 1 1 0 0 0 0 0 1 1 1 0 0 1
 1 0 0 1 0 1 1 0 1 0 0 0 0 0 0 0 1 1 0 0 0 1 0 0 0 0 1 0 1 1 1 0 1 0 1 0 1
 1 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 1 1 1 0 1 0 0 1
 1 1 1 0 1 1 1 0 1 0 1 0 0 1 1 0 1 0 0 0 0 1 0 0 1 0 0 0 1 1 0 0 0 1 0 0 1
 0 0 0 0 0 1 1 1 0 0 0]
train_accuracy: 
100.000

=================================
test_prediction: 
[1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 1 1 1 1 1 0 0 1 0 0 1 0 1 0 1 0 1 0 1 0
 1 0 1 1 0 1 0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 1 1 1 0 0 1 0 1 1 1 0 0 1 0 1
 1 0 0 0 0 0 1 1 1 0 1 0 0 0 1 1 0 1 0 1 0 0 1 0 0 0 0 0 0 0 1 0 1 0 1 1 0
 1 1 0]
test_target: 
[1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 1 1 1 1 0 0 1 0 0 1 0 1 0 1 0 1 0 1 0
 1 0 1 1 0 1 0 0 1 0 0 0 1 1 1 1 0 0 0 0 0 0 1 1 1 0 0 1 0 1 1 1 0 0 1 0 1
 1 0 0 0 0 0 1 1 1 0 1 0 0 0 1 1 0 1 0 1 0 0 1 0 0 0 0 0 0 0 1 0 1 0 1 1 0
 1 1 0]
test_accuracy: 
98.246

# tensorboard --logdir=./logs/cancer

from IPython.core.display import display, HTML

display(HTML("<style> .container{width:100% !important;}</style>"))

#!/anaconda3/envs/py36/bin/python3
import numpy as np
import pandas as pd
import tensorflow as tf

import warnings
warnings.filterwarnings("ignore")

iris = pd.read_csv("./datas/iris.csv")

iris_target_names = np.unique(iris["species"].values)

for i, n in enumerate(iris_target_names):
    iris.replace(to_replace=iris_target_names[i], value=i, inplace=True)

from sklearn.model_selection import train_test_split

train_set, test_set = train_test_split(iris, test_size=0.2, random_state=42)

train_labels = train_set["species"].values
test_labels = test_set["species"].values

import matplotlib.pyplot as plt

%matplotlib inline
train_set.plot(kind="scatter", x="sepal_length", y="sepal_width", alpha=0.4, figsize=(10, 8), c="species", cmap=plt.get_cmap("jet"), colorbar=True, sharex=False)
plt.show()

train_set.hist(bins=50, figsize=(20, 15))
plt.show()

from sklearn.preprocessing import OneHotEncoder, StandardScaler, LabelEncoder
from sklearn.impute import SimpleImputer
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline, FeatureUnion

class DataFrameSelector(BaseEstimator, TransformerMixin):
    def __init__(self, lists):
        self.lists = lists
        
    def fit(self, X, y=None):
        return self
    
    def transform(self, X):
        return X.iloc[:, self.lists].values
    
# class LabelEncoders(BaseEstimator, TransformerMixin):
#     def __init__(self):
#         self.encoder = LabelEncoder()
        
#     def fit(self, X, y=None):
#         self.encoder.fit(X)
#         return self
    
#     def transform(self, X, y=None):
#         return self.encoder.transform(X)

n_list = [0, 1, 2, 3]
e_list = [4]

train_set["species"] = train_set["species"].astype(str)

num_pipeline = Pipeline([
    ["selector", DataFrameSelector(lists=n_list)],
    ["imputer" , SimpleImputer(strategy="median")],
    ["scaler"  , StandardScaler()]
])

encoding_pipeline = Pipeline([
    ["selector", DataFrameSelector(lists=e_list)],
    ["encoder", OneHotEncoder(sparse=False, categories="auto")]
])

full_pipeline = FeatureUnion(transformer_list=[
    ["nums", num_pipeline],
    ["encoding", encoding_pipeline]
])

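scaled_train = full_pipeline.fit_transform(train_set)
# Reuse the scaler and encoder fitted on the training set. The training labels were
# cast to str above, so the test labels are cast the same way before encoding.
scaled_test  = full_pipeline.transform(
    test_set.assign(species=test_set["species"].astype(str)))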

x_train, y_train = scaled_train[:, :4], scaled_train[:, 4:]
x_test, y_test = scaled_test[:, :4], scaled_test[:, 4:]

global_step = tf.Variable(0, trainable=False, name="global_step")

X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)

# 4, 1
W1 = tf.Variable(tf.random_normal([4, 10], mean=0, stddev=1))
W2 = tf.Variable(tf.random_normal([10, 100], mean=0, stddev=1))
W3 = tf.Variable(tf.random_normal([100, 3], mean=0, stddev=1))

b1 = tf.zeros([10])
b2 = tf.zeros([100])
b3 = tf.zeros([3])

# Layer 1
with tf.name_scope("Layer1"):
    L1 = tf.add(tf.matmul(X, W1), b1)
    L1 = tf.nn.sigmoid(L1)
    
with tf.name_scope("Layer2"):
    L2 = tf.add(tf.matmul(L1, W2), b2)
    L2 = tf.nn.sigmoid(L2)
    
with tf.name_scope("Layer3"):
    model = tf.add(tf.matmul(L2, W3), b3)
    model = tf.nn.sigmoid(model)

with tf.name_scope("optimizer"):
    cost = tf.reduce_mean(-tf.reduce_sum(Y*tf.log(model) + (1-Y)*tf.log(1-model)))
    optimizer = tf.train.AdadeltaOptimizer(learning_rate=0.01)
    train_op = optimizer.minimize(cost, global_step=global_step)
    
    tf.summary.scalar("cost", cost)
    tf.summary.histogram("Weight1", W1)
    tf.summary.histogram("Weight2", W2)
    tf.summary.histogram("Weight3", W3)
    tf.summary.histogram("bias1", b1)
    tf.summary.histogram("bias2", b2)
    tf.summary.histogram("bias3", b3)
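
Note that the cost above applies `tf.reduce_sum` over every sample and class before `tf.reduce_mean`, so `reduce_mean` receives a single scalar and the value grows with the number of training rows. That is consistent with cost values in the hundreds in the training log. The NumPy sketch below illustrates the scaling with hypothetical labels and predictions; `summed_bce` is an illustrative helper name.

```python
import numpy as np

def summed_bce(y, p):
    """Binary cross-entropy summed over every sample and class."""
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

rng = np.random.default_rng(0)
y_small = np.eye(3)[rng.integers(0, 3, size=10)]   # 10 one-hot labels
y_large = np.tile(y_small, (12, 1))                # same labels repeated: 120 rows

# Hypothetical sigmoid outputs that are equally uncertain everywhere.
p_small = np.full_like(y_small, 0.5)
p_large = np.full_like(y_large, 0.5)

print(summed_bce(y_small, p_small))                  # 10 rows
print(summed_bce(y_large, p_large))                  # 120 rows: 12 times larger
print(summed_bce(y_large, p_large) / len(y_large))   # per-sample value
```

Dividing by the number of rows gives a per-sample value that can be compared across datasets of different sizes.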

sess = tf.Session()
saver = tf.train.Saver(tf.global_variables())

ckpt = tf.train.get_checkpoint_state("./model/iris")
if ckpt and tf.train.checkpoint_exists(ckpt.model_checkpoint_path):
    saver.restore(sess, ckpt.model_checkpoint_path)
    
else:
    sess.run(tf.global_variables_initializer())

INFO:tensorflow:Restoring parameters from ./model/iris/dnn.ckpt-5000

merged = tf.summary.merge_all()
writer = tf.summary.FileWriter("./logs/iris", sess.graph)

for step in range(5000):
    sess.run(train_op, feed_dict={X:x_train, Y:y_train})
    
    if (step+1) % 200 == 0:
        print("step: {}, cost: {:.5f}".\
             format(sess.run(global_step),
                    sess.run(cost, feed_dict={X:x_train, Y:y_train})))
        
    summary = sess.run(merged, feed_dict={X:x_train, Y:y_train})
    writer.add_summary(summary, global_step=sess.run(global_step))
    
saver.save(sess, "./model/iris/dnn.ckpt", global_step=global_step)

step: 5200, cost: 550.79474
step: 5400, cost: 511.44778
step: 5600, cost: 472.62866
step: 5800, cost: 434.81265
step: 6000, cost: 398.32742
step: 6200, cost: 362.83173
step: 6400, cost: 326.22653
step: 6600, cost: 287.97693
step: 6800, cost: 256.19949
step: 7000, cost: 231.45538
step: 7200, cost: 211.00574
step: 7400, cost: 193.48444
step: 7600, cost: 178.29463
step: 7800, cost: 165.20464
step: 8000, cost: 154.06615
step: 8200, cost: 144.64468
step: 8400, cost: 136.65207
step: 8600, cost: 129.83101
step: 8800, cost: 123.97330
step: 9000, cost: 118.91176
step: 9200, cost: 114.51005
step: 9400, cost: 110.65558
step: 9600, cost: 107.25186
step: 9800, cost: 104.21504
step: 10000, cost: 101.47087

'./model/iris/dnn.ckpt-10000'

# print("{}".format(sess.run(model, feed_dict={X:x_scaled_test, Y:y_test})))

prediction = tf.argmax(model, 1)
target = tf.argmax(Y, 1)
is_correct = tf.equal(prediction, target)
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))


print("====================================")
print("train_prediction: \n{}".format(sess.run(prediction, feed_dict={X: x_train})))
print("train_target: \n{}\n".format(sess.run(target, feed_dict={Y:y_train})))
print("\naccuracy: \n{:.3f}%".format(sess.run(accuracy*100, 
                                              feed_dict={X: x_train, Y:y_train})))

print("\n====================================")
print("test_prediction: \n{}".format(sess.run(prediction, feed_dict={X: x_test})))
print("test_target: \n{}".format(sess.run(target, feed_dict={Y:y_test})))


print("\naccuracy: \n{:.3f}%".format(sess.run(accuracy*100, 
                                              feed_dict={X: x_test, Y:y_test})))

====================================
train_prediction: 
[0 0 2 0 0 2 2 0 0 0 2 2 2 0 0 1 2 2 2 2 1 2 1 0 2 1 0 0 0 1 2 0 0 0 1 0 1
 2 0 1 2 0 2 2 1 1 2 1 0 1 2 0 0 1 2 0 2 0 0 2 1 2 1 1 2 1 0 0 1 2 0 0 0 1
 2 0 2 2 0 1 2 2 2 2 0 2 1 2 1 1 2 1 2 1 0 1 2 2 0 1 2 2 0 2 0 2 2 2 1 2 1
 1 1 2 0 1 1 0 1 2]
train_target: 
[0 0 1 0 0 2 1 0 0 0 2 1 1 0 0 1 2 2 1 2 1 2 1 0 2 1 0 0 0 1 2 0 0 0 1 0 1
 2 0 1 2 0 2 2 1 1 2 1 0 1 2 0 0 1 1 0 2 0 0 1 1 2 1 2 2 1 0 0 2 2 0 0 0 1
 2 0 2 2 0 1 1 2 1 2 0 2 1 2 1 1 1 0 1 1 0 1 2 2 0 1 2 2 0 2 0 1 2 2 1 2 1
 1 2 2 0 1 2 0 1 2]


accuracy: 
85.833%

====================================
test_prediction: 
[1 0 2 1 1 0 1 2 1 1 2 0 0 0 0 2 2 1 1 2 0 2 0 2 2 2 2 2 0 0]
test_target: 
[1 0 2 1 1 0 1 2 1 1 2 0 0 0 0 1 2 1 1 2 0 2 0 2 2 2 2 2 0 0]

accuracy: 
96.667%

# tensorboard --logdir=./logs/iris

from IPython.core.display import display, HTML

display(HTML("<style> .container{width:100% !important;}</style>"))

import numpy as np
import pandas as pd
import tensorflow as tf

data = pd.read_csv("./datas/classification_ex.csv", dtype="float32").values

x_data = data[:, 0:2]
y_data = data[:, 2:]

model save¶

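global_step counts the training steps; it is created with trainable=False so that the optimizer does not treat it as a model parameter.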

global_step = tf.Variable(0, trainable=False, name="global_step")

model setting¶

# ground truth
X = tf.placeholder(tf.float32)
Y = tf.placeholder(tf.float32)

# weight, bias
W1 = tf.Variable(tf.random_normal([2, 10],  mean=0, stddev=1))
W2 = tf.Variable(tf.random_normal([10, 20], mean=0, stddev=1))
W3 = tf.Variable(tf.random_normal([20, 3],  mean=0, stddev=1))

b1 = tf.zeros([10])
b2 = tf.zeros([20])
b3 = tf.zeros([3])

# Layer 1
with tf.name_scope("Layer_1"):
    L1 = tf.add(tf.matmul(X, W1), b1)
    L1 = tf.nn.sigmoid(L1)

# Layer 2
with tf.name_scope("Layer_2"):
    L2 = tf.add(tf.matmul(L1, W2), b2)
    L2 = tf.nn.sigmoid(L2)

# Layer 3
with tf.name_scope("Layer_3"):
    model = tf.add(tf.matmul(L2, W3), b3)
    model = tf.nn.softmax(model)

    
# cost function
with tf.name_scope("optimizer"):
    cost = tf.reduce_mean(-tf.reduce_sum(Y*tf.log(model) + (1-Y)*tf.log(1-model)))
    optimizer = tf.train.AdamOptimizer(learning_rate=0.01)
    train_op = optimizer.minimize(cost, global_step=global_step)
    tf.summary.scalar("cost", cost)
    tf.summary.histogram("Weight1", W1)
    tf.summary.histogram("Weight2", W2)
    tf.summary.histogram("Weight3", W3)
    tf.summary.histogram("bias1", b1)
    tf.summary.histogram("bias2", b2)
    tf.summary.histogram("bias3", b3)

create session¶

sess = tf.Session()
saver = tf.train.Saver(tf.global_variables())

tf.global_variables() returns the variables defined so far.

make checkpoint¶

ckpt = tf.train.get_checkpoint_state("./model")
if ckpt and tf.train.checkpoint_exists(ckpt.model_checkpoint_path):
    saver.restore(sess, ckpt.model_checkpoint_path)
    
else:
    sess.run(tf.global_variables_initializer())
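
The restore-or-initialize pattern above can be illustrated without TensorFlow. The sketch below swaps in `np.savez` and `np.load` for `tf.train.Saver`, so it is only an analogy; the helper `load_or_init` and the file name are hypothetical.

```python
import os
import tempfile
import numpy as np

def load_or_init(path, shape, rng):
    """Return (weights, step): restored from `path` if it exists, else fresh."""
    if os.path.exists(path):
        saved = np.load(path)
        return saved["weights"], int(saved["step"])
    return rng.normal(size=shape), 0

rng = np.random.default_rng(0)
ckpt_path = os.path.join(tempfile.mkdtemp(), "toy_ckpt.npz")   # hypothetical location

weights, step = load_or_init(ckpt_path, (2, 3), rng)   # first run: nothing to restore
step += 1000                                           # pretend 1000 training steps ran
np.savez(ckpt_path, weights=weights, step=step)

restored, restored_step = load_or_init(ckpt_path, (2, 3), rng)   # second run: restored
print(restored_step, np.array_equal(weights, restored))
```

Because the step counter is stored with the weights, a run that resumes from a saved state continues counting from where the previous run stopped rather than from zero.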

tf.summary.merge_all collects the tensors registered for summaries above, and tf.summary.FileWriter sets the directory where the graph and the tensor values are written.

merged = tf.summary.merge_all()
writer = tf.summary.FileWriter("./logs", sess.graph)

running model¶

for step in range(1000):
    sess.run(train_op, feed_dict={X:x_data, Y:y_data})
    
    if (step+1) % 50 == 0:    
        print("step: {}, cost: {:.5f}".\
              format(sess.run(global_step), 
                     sess.run(cost, feed_dict={X:x_data, Y:y_data})))
        
    summary = sess.run(merged, feed_dict={X:x_data, Y:y_data})
    writer.add_summary(summary, global_step=sess.run(global_step))

step: 50, cost: 6.33244
step: 100, cost: 2.69545
step: 150, cost: 0.85424
step: 200, cost: 0.35161
step: 250, cost: 0.19246
step: 300, cost: 0.12306
step: 350, cost: 0.08628
step: 400, cost: 0.06428
step: 450, cost: 0.04998
step: 500, cost: 0.04012
step: 550, cost: 0.03300
step: 600, cost: 0.02768
step: 650, cost: 0.02359
step: 700, cost: 0.02037
step: 750, cost: 0.01779
step: 800, cost: 0.01568
step: 850, cost: 0.01393
step: 900, cost: 0.01246
step: 950, cost: 0.01122
step: 1000, cost: 0.01016

saver.save(sess, "./model/dnn.ckpt", global_step=global_step)

'./model/dnn.ckpt-1000'

prediction = tf.argmax(model, 1)
target = tf.argmax(Y, 1)
print("prediction: \t{}".format(sess.run(prediction, feed_dict={X: x_data})))
print("target: \t{}".format(sess.run(target, feed_dict={Y:y_data})))

is_correct = tf.equal(prediction, target)
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))
print("\naccuracy: \t{:.3f}%".format(sess.run(accuracy*100, feed_dict={X: x_data, Y: y_data})))

prediction: 	[2 0 1 2 2 1]
target: 	[2 0 1 2 2 1]

accuracy: 	100.000%

# tensorboard --logdir=./logs

from IPython.core.display import display, HTML

display(HTML("<style> .container{width:100% !important;}</style>"))

	x	y
금융	-8.060934	-1.692813
지주	-11.529280	-6.778675
증권	-9.415570	-0.928613
캐피탈	-10.257173	1.668560
부동산	-10.782980	1.529749
