Model Details
- Model Description: Speech style converter model based on gogamza/kobart-base-v2
- Developed by: Juhwan Lee, Jisu Kim, TakSung Heo, and Minsu Jeong
- Model Type: Text-generation
- Language: Korean
- License: CC-BY-4.0
Dataset
- Korean SmileStyle Dataset
- Randomly split train/valid dataset (9:1)
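A random 9:1 split like the one described above could be sketched as follows. This is only an illustration: the function name `split_train_valid`, the fixed seed, and the rounding rule are assumptions, and the procedure the authors actually used is not documented here.

```python
import random


def split_train_valid(rows, valid_ratio=0.1, seed=42):
    """Shuffle rows and split them into (train, valid) lists.

    valid_ratio=0.1 gives the 9:1 split; the seed is an arbitrary
    assumption, used only to make the shuffle reproducible.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_valid = round(len(shuffled) * valid_ratio)
    # The first n_valid shuffled rows become the validation set.
    return shuffled[n_valid:], shuffled[:n_valid]


if __name__ == "__main__":
    train, valid = split_train_valid(range(100))
    print(len(train), len(valid))
```

Seeding a local `random.Random` instance keeps the split reproducible without touching the global random state.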
BLEU Score
- 25.35
Uses
This model converts Korean text into one of the following speech styles:
- formal: 문어체
- informal: 구어체
- android: 안드로이드
- azae: 아재
- chat: 채팅
- choding: 초등학생
- emoticon: 이모티콘
- enfp: enfp
- gentle: 신사
- halbae: 할아버지
- halmae: 할머니
- joongding: 중학생
- king: 왕
- naruto: 나루토
- seonbi: 선비
- sosim: 소심한
- translator: 번역기
from transformers import AutoTokenizer, pipeline
# Load the tokenizer and build a text2text-generation pipeline for the model.
model = "KoJLabs/bart-speech-style-converter"
tokenizer = AutoTokenizer.from_pretrained(model)
nlg_pipeline = pipeline("text2text-generation", model=model, tokenizer=tokenizer)
styles = ["문어체", "구어체", "안드로이드", "아재", "채팅", "초등학생", "이모티콘", "enfp", "신사", "할아버지", "할머니", "중학생", "왕", "나루토", "선비", "소심한", "번역기"]
# Prompt format: "<style> 형식으로 변환:<text>" ("convert to <style> format:<text>").
# Sample input: "Today I ate dakbokkeumtang. It was delicious."
for style in styles:
    text = f"{style} 형식으로 변환:오늘은 닭볶음탕을 먹었다. 맛있었다."
    out = nlg_pipeline(text, max_length=100)
    print(style, out[0]["generated_text"])
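The English keys in the style list can be mapped to the Korean style names that the prompt expects. Below is a minimal sketch, assuming the prompt format used in the example above (`<style> 형식으로 변환:<text>`). The names `STYLE_NAMES` and `build_prompt` are hypothetical helpers for illustration and are not part of the model or of any package.

```python
# Mapping from the English keys in the style list to the Korean style names.
STYLE_NAMES = {
    "formal": "문어체",
    "informal": "구어체",
    "android": "안드로이드",
    "azae": "아재",
    "chat": "채팅",
    "choding": "초등학생",
    "emoticon": "이모티콘",
    "enfp": "enfp",
    "gentle": "신사",
    "halbae": "할아버지",
    "halmae": "할머니",
    "joongding": "중학생",
    "king": "왕",
    "naruto": "나루토",
    "seonbi": "선비",
    "sosim": "소심한",
    "translator": "번역기",
}


def build_prompt(style_key: str, text: str) -> str:
    """Build the model input for a style given by its English key."""
    if style_key not in STYLE_NAMES:
        raise ValueError(f"Unknown style: {style_key!r}")
    # Same prompt format as the usage example: "<style> 형식으로 변환:<text>"
    return f"{STYLE_NAMES[style_key]} 형식으로 변환:{text}"


if __name__ == "__main__":
    print(build_prompt("king", "오늘은 닭볶음탕을 먹었다. 맛있었다."))
```

The resulting string can be passed to `nlg_pipeline` in place of the hand-written f-string. Rejecting unknown keys up front avoids sending the model a style name it was never trained on.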
Model Source
https://github.com/KoJLabs/speech-style/tree/main
Speech style conversion package
You can try the Korean speech style conversion task with the Python package KoTAN.