This model is a multi-class classifier fine-tuned from 'bert-base-uncased'.

It was fine-tuned on a large corpus of Twitter users' metadata.

It classifies the data into 3 main categories: (1) Non-ExpertUser, (2) ExpertUser, and (3) Other. The aim of this project was to find out whether a tweet belongs to an individual and, if it does, whether that person is an expert in the field of security and privacy.
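
As a rough illustration of how the three categories might be read from the classifier's output, here is a minimal sketch. It assumes the label ids follow the order given in the validation table (ExpertUser = 0, Non-ExpertUser = 1, Other = 2), that the model loads through the standard `transformers` sequence-classification classes, and that the input is a piece of user metadata text. The `model_id` argument and the helper names `label_from_logits` and `classify` are hypothetical and not part of the original release.

```python
# Assumed id-to-label mapping, taken from the ids shown in the validation table.
ID2LABEL = {0: "ExpertUser", 1: "Non-ExpertUser", 2: "Other"}


def label_from_logits(logits):
    """Return the label whose logit is largest (hypothetical helper)."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return ID2LABEL[best]


def classify(text, model_id):
    """Classify one text with a fine-tuned checkpoint.

    `model_id` is a placeholder for the checkpoint's hub name or a local path;
    the expected input format (e.g. a profile description) is an assumption.
    """
    # Imported lazily so the mapping helper above works without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return label_from_logits(logits)
```

A call would look like `classify("some user metadata text", "path-or-hub-id-of-this-model")`, with both arguments supplied by the reader. If the checkpoint's own config defines `id2label`, that mapping should take precedence over the one assumed here.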

Originally, the model had 4 classes, with the 'Other' class split into 'Non-Person' (denoting accounts such as organizations) and 'Unknown'.

Since the main aim was to find out whether a user is a non-expert user, the number of classes was reduced to 3 in this version (version 2).
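
The reduction from 4 classes to 3 amounts to a simple relabelling. The sketch below shows one way it could be expressed; the dictionary and function names are hypothetical, and the authors' actual preprocessing may differ.

```python
# Hypothetical mapping from the original 4 classes to the 3 classes of version 2.
FOUR_TO_THREE = {
    "ExpertUser": "ExpertUser",
    "Non-ExpertUser": "Non-ExpertUser",
    "Non-Person": "Other",  # e.g. organization accounts
    "Unknown": "Other",
}


def collapse(labels):
    """Map a list of 4-class labels onto the 3-class scheme."""
    return [FOUR_TO_THREE[label] for label in labels]


print(collapse(["Non-Person", "Unknown", "ExpertUser"]))
```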

The validation scores for the model were as follows:

Accuracy = 0.93

<table>
  <tr> <th>Class</th> <th>Precision</th> <th>Recall</th> <th>F1-Score</th> </tr>
  <tr> <td>ExpertUser (0)</td> <td>0.88</td> <td>0.90</td> <td>0.89</td> </tr>
  <tr> <td><b>Non-ExpertUser (1)</b></td> <td><b>0.95</b></td> <td><b>0.97</b></td> <td><b>0.96</b></td> </tr>
  <tr> <td>Other (2)</td> <td>0.85</td> <td>0.78</td> <td>0.81</td> </tr>
</table>
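
The F1-scores in the table are consistent with the usual harmonic mean of precision and recall, F1 = 2PR / (P + R). The short check below recomputes them from the reported precision and recall values; the variable and function names are chosen here for illustration only.

```python
# Precision and recall per class, copied from the validation table.
scores = {
    "ExpertUser": (0.88, 0.90),
    "Non-ExpertUser": (0.95, 0.97),
    "Other": (0.85, 0.78),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Round to two decimals, matching the precision used in the table.
f1_scores = {name: round(f1(p, r), 2) for name, (p, r) in scores.items()}
print(f1_scores)
```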

<b>Paper:</b> The paper detailing how the model was designed can be found here: <a href="https://www.sciencedirect.com/science/article/pii/S016740482200400X">Perspectives of non-expert users on cyber security and privacy: An analysis of online discussions on twitter</a>

<b>Please cite the paper if you use this model:</b>

Nandita Pattnaik, Shujun Li, and Jason R.C. Nurse. 2023.<br> Perspectives of non-expert users on cyber security and privacy: An analysis of online discussions on Twitter.<br> Computers & Security 125 (2023), 103008. https://doi.org/10.1016/j.cose.2022.103008