This model is a binary classifier developed to analyze comment authorship patterns on Korean news articles. For further details, see our paper in Journalism: "News comment sections and online echo chambers: The ideological alignment between partisan news stories and their user comments".

How to use

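# KorBertTokenizer is not part of transformers; the module must be importable (e.g. on the Python path).
from KorBertTokenizer import KorBertTokenizer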
from transformers import BertForSequenceClassification
import torch

tokenizer = KorBertTokenizer.from_pretrained('conviette/korPolBERT')
model = BertForSequenceClassification.from_pretrained('conviette/korPolBERT')

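def classify(text):
    """Return the predicted label for a single comment string."""
    # Pad every input to a fixed length of 70 tokens and return PyTorch tensors.
    inputs = tokenizer(text, padding='max_length', max_length=70, return_tensors='pt')

    # Inference only, so gradient tracking is disabled.
    with torch.no_grad():
        logits = model(**inputs).logits
        # logits has shape (1, num_labels); take the highest-scoring class.
        predicted_class_id = logits.argmax(dim=-1).item()
        return model.config.id2label[predicted_class_id]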


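# Sample inputs (Korean). Rough English glosses:
#   "The left is wrecking the country's economy and security"
#   "Did the far right sell the country out to Japan?"
input_strings = ['좌파가 나라 경제 안보 말아먹는다',
                 '수꼴들은 나라 일본한테 팔아먹었냐']

for input_string in input_strings:
    print('===\nInput text: {}\nClassification result: {}\n==='.format(input_string, classify(input_string)))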
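
The classify function above returns only the top label. If a confidence score is also useful, the logits can be passed through a softmax first. The following is a minimal sketch in plain Python, not part of the released code: the helper names (softmax, label_with_confidence), the example logits, and the placeholder labels are all assumptions made for illustration and are not actual model output. In practice the logits would come from model(**inputs).logits and the label names from model.config.id2label.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_with_confidence(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Hypothetical values for illustration only; not real model output.
example_logits = [-1.2, 2.3]
example_id2label = {0: "LABEL_0", 1: "LABEL_1"}

label, confidence = label_with_confidence(example_logits, example_id2label)
print(f"{label} ({confidence:.3f})")
```

Reporting the probability alongside the label makes it possible to set aside low-confidence predictions for manual review, which may matter when comments are ambiguous.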

Model performance