IteraTeR RoBERTa model

This model was obtained by fine-tuning roberta-large on IteraTeR-human-sent dataset.

Paper: Understanding Iterative Revision from Human-Written Text <br> Authors: Wanyu Du, Vipul Raheja, Dhruv Kumar, Zae Myung Kim, Melissa Lopez, Dongyeop Kang

Edit Intention Prediction Task

Given a pair of original sentence and revised sentence, our model can predict the edit intention for this revision pair.<br> More specifically, the model will predict the probability of the following edit intentions: <table> <tr> <th>Edit Intention</th> <th>Definition</th> <th>Example</th> </tr> <tr> <td>clarity</td> <td>Make the text more formal, concise, readable and understandable.</td> <td> Original: It's like a house which anyone can enter in it. <br> Revised: It's like a house which anyone can enter. </td> </tr> <tr> <td>fluency</td> <td>Fix grammatical errors in the text.</td> <td> Original: In the same year he became the Fellow of the Royal Society. <br> Revised: In the same year, he became the Fellow of the Royal Society. </td> </tr> <tr> <td>coherence</td> <td>Make the text more cohesive, logically linked and consistent as a whole.</td> <td> Original: Achievements and awards Among his other activities, he founded the Karachi Film Guild and Pakistan Film and TV Academy. <br> Revised: Among his other activities, he founded the Karachi Film Guild and Pakistan Film and TV Academy. </td> </tr> <tr> <td>style</td> <td>Convey the writer’s writing preferences, including emotions, tone, voice, etc..</td> <td> Original: She was last seen on 2005-10-22. <br> Revised: She was last seen on October 22, 2005. </td> </tr> <tr> <td>meaning-changed</td> <td>Update or add new information to the text.</td> <td> Original: This method improves the model accuracy from 64% to 78%. <br> Revised: This method improves the model accuracy from 64% to 83%. </td> </tr> </table>

Usage

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("wanyu/IteraTeR-ROBERTA-Intention-Classifier")
model = AutoModelForSequenceClassification.from_pretrained("wanyu/IteraTeR-ROBERTA-Intention-Classifier")

id2label = {0: "clarity", 1: "fluency", 2: "coherence", 3: "style", 4: "meaning-changed"}

before_text = 'I likes coffee.'
after_text = 'I like coffee.'
model_input = tokenizer(before_text, after_text, return_tensors='pt')
model_output = model(**model_input)
softmax_scores = torch.softmax(model_output.logits, dim=-1)
pred_id = torch.argmax(softmax_scores)
pred_label = id2label[pred_id.int()]