KT AIVLE/Daily Review

241105

bestone888 2024. 11. 5. 18:33

1. Understanding Language Models

API

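API: an interface that lets programs interact by exchanging data
A client's request reaches the server through the API, and the resulting data is returned to the client through the API as well
request & response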
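The request/response flow above can be sketched with Python's standard library alone. This is only a minimal illustration, not something from the lecture: `EchoHandler`, the `/greeting` path, and the JSON payload are made-up names, and the server listens on a local loopback port just for the demo.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical API endpoint: answers every GET with a small JSON body."""

    def do_GET(self):
        body = json.dumps({"path": self.path, "message": "hello"}).encode("utf-8")
        self.send_response(200)  # status code of the response
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)  # response payload

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def call_api(path="/greeting"):
    """Start a local server, send one GET request, return (status, parsed JSON)."""
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)  # port 0: the OS picks a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", path)       # client -> server: request
        response = conn.getresponse()   # server -> client: response
        result = response.status, json.loads(response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


status, data = call_api()
print(status, data)
```

The client never touches the server's internals; it only sees the status code and the data that come back through the interface.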

NLP(Natural Language Processing)

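Natural language processing (NLP): technology that enables computers to understand and generate human language; it analyzes text data to work out meaning (understanding context)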

RNN(Recurrent Neural Network)
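- Processes a sequence one step at a time, so computation slows down as the sequence gets longer
- Long-term dependencies: hard to connect information from positions that are far apart
Transformer
- Addresses the shortcomings of RNN models
- Complex architecture
- Good at capturing context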
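A toy comparison may make the long-term dependency point concrete. The sketch below is an assumption-heavy simplification (scalar inputs, hand-picked weights `w_in` and `w_rec`, no learned parameters or positional information), not a real RNN or Transformer. It checks how much the first token still affects the result at the last position, 20 steps later.

```python
import math


def rnn_final_state(inputs, w_in=0.5, w_rec=0.5):
    """Toy scalar RNN: each step needs the previous hidden state, so tokens are processed in order."""
    h = 0.0
    for x in inputs:
        h = math.tanh(w_in * x + w_rec * h)
    return h


def attention_last_output(inputs):
    """Toy scalar self-attention for the last position: it scores every position directly."""
    query = inputs[-1]
    scores = [query * key for key in inputs]         # dot products (scalars here)
    peak = max(scores)                               # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]   # unnormalized softmax weights
    total = sum(weights)
    return sum(w / total * value for w, value in zip(weights, inputs))


# Two sequences that differ only in the first token.
with_signal = [1.0] + [0.0] * 19 + [1.0]
without_signal = [0.0] * 20 + [1.0]

rnn_gap = abs(rnn_final_state(with_signal) - rnn_final_state(without_signal))
attention_gap = abs(attention_last_output(with_signal) - attention_last_output(without_signal))
print(f"RNN gap: {rnn_gap:.2e}, attention gap: {attention_gap:.2e}")
```

In this toy setup the first token's effect on the RNN state has almost vanished by the last step, while the attention output still changes noticeably because the last position reads the first one in a single hop.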

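Uses of the Transformer
1. Classifying a sentence as positive or negative (sentiment analysis)
2. Classifying into classes that were not seen during training (zero-shot classification)
3. Summarization
4. Translation
5. Text generation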

In [ ]:

from transformers import pipeline

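# With no model argument, pipeline() falls back to its default sentiment-analysis
# checkpoint, which is fine-tuned on English text. A base checkpoint such as
# 'bert-base-multilingual-cased' has no fine-tuned sentiment head.
classifier = pipeline("sentiment-analysis")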

text = ["I really enjoyed the life in Germany because I did nothing.",
        "I should have studied German hard",
        "Since I am in korea, my German account have been blocked",
        "I'm stuck in the middle with you",
        "You tell me that you need me then you walk away",

        "Ich kann nich Deutsch sprechen",
        "Ich denke es ist zu kalt hier",
        "Ich liebe dich",
        "Ich war froh, dich kennenzulernen",
        'WTF']    # ?

classifier(text)

Out[ ]:

[{'label': 'POSITIVE', 'score': 0.9649511575698853},
 {'label': 'NEGATIVE', 'score': 0.9987061023712158},
 {'label': 'NEGATIVE', 'score': 0.9995985627174377},
 {'label': 'NEGATIVE', 'score': 0.9912315011024475},
 {'label': 'NEGATIVE', 'score': 0.9942225813865662},
 {'label': 'NEGATIVE', 'score': 0.9729437232017517},
 {'label': 'NEGATIVE', 'score': 0.9292004704475403},
 {'label': 'NEGATIVE', 'score': 0.9590762257575989},
 {'label': 'NEGATIVE', 'score': 0.9924020767211914},
 {'label': 'POSITIVE', 'score': 0.6635804176330566}]
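One way to post-process this output is to pair each prediction with its input and flag the low-confidence ones. The sketch below reuses the texts and results above (scores rounded to four decimals); the helper name `flag_uncertain` and the 0.9 threshold are arbitrary choices for illustration.

```python
from collections import Counter

texts = [
    "I really enjoyed the life in Germany because I did nothing.",
    "I should have studied German hard",
    "Since I am in korea, my German account have been blocked",
    "I'm stuck in the middle with you",
    "You tell me that you need me then you walk away",
    "Ich kann nich Deutsch sprechen",
    "Ich denke es ist zu kalt hier",
    "Ich liebe dich",
    "Ich war froh, dich kennenzulernen",
    "WTF",
]

# Pipeline output from the cell above, scores rounded to 4 decimals.
predictions = [
    {"label": "POSITIVE", "score": 0.9650},
    {"label": "NEGATIVE", "score": 0.9987},
    {"label": "NEGATIVE", "score": 0.9996},
    {"label": "NEGATIVE", "score": 0.9912},
    {"label": "NEGATIVE", "score": 0.9942},
    {"label": "NEGATIVE", "score": 0.9729},
    {"label": "NEGATIVE", "score": 0.9292},
    {"label": "NEGATIVE", "score": 0.9591},
    {"label": "NEGATIVE", "score": 0.9924},
    {"label": "POSITIVE", "score": 0.6636},
]


def flag_uncertain(texts, predictions, threshold=0.9):
    """Return (text, label, score) for predictions whose score is below the threshold."""
    return [
        (text, pred["label"], pred["score"])
        for text, pred in zip(texts, predictions)
        if pred["score"] < threshold
    ]


label_counts = Counter(pred["label"] for pred in predictions)
uncertain = flag_uncertain(texts, predictions)
print(label_counts)
print(uncertain)
```

A high score is not the same as a correct label: "Ich liebe dich" is marked NEGATIVE with a score of about 0.96, which probably reflects German input being given to a model fine-tuned on English.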

In [ ]:

from transformers import pipeline

classifier = pipeline(task="zero-shot-classification", model="facebook/bart-large-mnli")
candidate_labels = ["tech", "politics", "business", "finance"]

# Text to classify
text = "I'm stuck in the middle with you."

# Run classification
result = classifier(text, candidate_labels)

# Print the results (labels come back sorted by descending score)
print(f"Labels: {result['labels']}")
print(f"Scores: {result['scores']}")

Labels: ['business', 'tech', 'politics', 'finance']
Scores: [0.3821035921573639, 0.2603168785572052, 0.19367429614067078, 0.16390523314476013]
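The result above can be read as a ranking over the candidate labels. The sketch below copies the printed values into a plain dict and picks the top label; `accept_prediction` and its 0.5 cutoff are hypothetical, added only to show one way to handle a text that fits none of the labels well.

```python
# Values copied from the output above.
result = {
    "sequence": "I'm stuck in the middle with you.",
    "labels": ["business", "tech", "politics", "finance"],
    "scores": [0.3821035921573639, 0.2603168785572052,
               0.19367429614067078, 0.16390523314476013],
}


def accept_prediction(result, min_score=0.5):
    """Return the best label, or None when even the best score is below min_score."""
    best_label, best_score = max(zip(result["labels"], result["scores"]),
                                 key=lambda pair: pair[1])
    return best_label if best_score >= min_score else None


top_label = result["labels"][0]          # labels are sorted by descending score
score_total = sum(result["scores"])      # single-label mode: scores are normalized across labels
decision = accept_prediction(result)
print(top_label, round(score_total, 6), decision)
```

Because the scores are spread over the four labels, "business" wins with only about 0.38. A song lyric does not really belong to any of these topics, so treating it as "no confident label" seems more reasonable than accepting the top one.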

Hugging Face

Provides NLP libraries
Provides the transformers library

Tokenizing & Embedding

Language modeling procedure
- Data preprocessing: tokenize
- Model: embedding vectors, Transformer
- Post-processing of the results

Tokenize
- Splits a sentence into tokens, the smallest units used for analysis
- What counts as a unit is decided by a person
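Since the unit is a design decision, the same text can be tokenized at different granularities. The sketch below shows three toy options: word-level, character-level, and a greedy longest-match subword split in the spirit of WordPiece. The function names and the tiny vocabulary are made up for illustration; real tokenizers use learned vocabularies with tens of thousands of entries.

```python
import re


def word_tokens(text):
    """Word-level: runs of word characters, with punctuation as separate tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def char_tokens(text):
    """Character-level: every character is a token."""
    return list(text)


def subword_tokens(word, vocab):
    """Greedy longest-match split; non-initial pieces carry a '##' prefix."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        while end > start:
            piece = word[start:end] if start == 0 else "##" + word[start:end]
            if piece in vocab:
                break
            end -= 1
        if end == start:          # no vocabulary entry matches at this position
            return ["[UNK]"]
        pieces.append(piece)
        start = end
    return pieces


toy_vocab = {"token", "##izing", "##ize", "embed", "##ding"}  # hypothetical vocabulary

words = word_tokens("Ich liebe dich!")
chars = char_tokens("dich")
subwords = subword_tokens("tokenizing", toy_vocab)
unknown = subword_tokens("xyz", toy_vocab)
print(words, chars, subwords, unknown)
```

Word-level tokens are easy to read but cannot handle unseen words, character-level tokens never fail but make sequences long, and subword schemes sit in between.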

Embedding
- Converts natural language into vectors that a machine can process
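At its simplest, an embedding is a lookup table: each token gets an integer ID, and the ID selects a row of numbers. The sketch below uses random vectors only to show the mechanics; in an actual model the table is learned during training, and the names `build_vocab`, `build_embedding_table`, and `embed` are illustrative rather than any library's API.

```python
import random


def build_vocab(tokens):
    """Assign each distinct token an integer ID in order of first appearance."""
    vocab = {}
    for token in tokens:
        vocab.setdefault(token, len(vocab))
    return vocab


def build_embedding_table(vocab_size, dim, seed=0):
    """One vector per token ID; random here, learned in a real model."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]


def embed(tokens, vocab, table):
    """Token -> ID -> vector."""
    return [table[vocab[token]] for token in tokens]


tokens = ["ich", "liebe", "dich", "ich"]
vocab = build_vocab(tokens)
table = build_embedding_table(len(vocab), dim=4)
vectors = embed(tokens, vocab, table)

print(vocab)
print(len(vectors), "vectors of dimension", len(vectors[0]))
```

The same token always maps to the same vector, so both occurrences of "ich" get identical rows. Context-dependent meaning comes from the Transformer layers applied after this lookup.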
