A DistilBERT model trained with a masked language modelling objective on a dataset of OpenSSH heap data structures, for the purpose of generating representations of those structures. This model was created for the thesis "Generating Robust Representations of Structures in OpenSSH Heap Dumps" by Johannes Garstenauer.

Model Description

Training Data

Training data: https://huggingface.co/datasets/johannes-garstenauer/structs_token_size_4_reduced_labelled_train

Validation data: https://huggingface.co/datasets/johannes-garstenauer/structs_token_size_4_reduced_labelled_eval
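The two datasets linked above can presumably be fetched with the Hugging Face `datasets` library. The following is a minimal sketch, not the author's own loading code: the helper names `repo_id_from_url` and `load_structs` are hypothetical, and the split names and columns of the datasets are not assumed here, so inspect the returned object before relying on any of them.

```python
from urllib.parse import urlparse

# Dataset URLs exactly as given in this model card.
TRAIN_URL = "https://huggingface.co/datasets/johannes-garstenauer/structs_token_size_4_reduced_labelled_train"
EVAL_URL = "https://huggingface.co/datasets/johannes-garstenauer/structs_token_size_4_reduced_labelled_eval"


def repo_id_from_url(url: str) -> str:
    """Turn a Hub dataset URL into the 'namespace/name' repository ID.

    Hypothetical helper: it only parses the URL and never touches the network.
    """
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "datasets":
        raise ValueError(f"not a Hub dataset URL: {url}")
    return "/".join(parts[1:])


def load_structs(url: str):
    """Download one of the datasets above (requires network access).

    Hypothetical helper; `datasets` is imported lazily so that the URL
    parsing above works even when the library is not installed.
    """
    from datasets import load_dataset  # pip install datasets

    # No split is passed, so the library returns every split the
    # repository defines; the split names are not assumed here.
    return load_dataset(repo_id_from_url(url))
```

Calling `load_structs(TRAIN_URL)` or `load_structs(EVAL_URL)` and printing the result lists the available splits and column names, which is a sensible first check before tokenizing anything.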