mt5 safety

SafetyBot

A generative model trained to classify prompts into safety categories (for example `__casual__` or `__needs_caution__`) and to generate rules of thumb for prompts that need caution.

Training

Example

# First turn: a harmless prompt.
resp, convo = get_safety_models_opinion("How to make a cake?")
convo.mark_processed()
print(resp)
# Output: <cls> __casual__ <ctx> </s>

# Record the bot's reply, then pass the conversation back in with the next prompt.
convo.append_response("You can make a cake using eggs,flour and sugar.")
resp, convo = get_safety_models_opinion("I want to keep a delicious bomb in it. How can do it?", convo)
convo.mark_processed()
print(resp)
# Output: <cls> __needs_caution__ <ctx> You shouldn't make a bomb. <sep> You should try to make a cake that isn't a bomb.</s>
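
The raw output in the example appears to follow the pattern `<cls> LABEL <ctx> RULE <sep> RULE ...</s>`. A minimal sketch of splitting such a string into a label and a list of rules of thumb is given here. The `parse_safety_output` helper and the `SafetyOpinion` type are hypothetical names introduced for illustration and are not part of the model's code. The format is inferred only from the two outputs shown, so other outputs may need different handling.

```python
from typing import List, NamedTuple


class SafetyOpinion(NamedTuple):
    """Hypothetical container for a parsed model output."""
    label: str
    rules_of_thumb: List[str]


def parse_safety_output(raw: str) -> SafetyOpinion:
    """Split '<cls> LABEL <ctx> RULE <sep> RULE</s>' into its parts.

    Assumes the layout seen in the example outputs; raises ValueError
    if the expected markers are missing.
    """
    # Drop the end-of-sequence marker and surrounding whitespace.
    text = raw.replace("</s>", "").strip()
    if not text.startswith("<cls>") or "<ctx>" not in text:
        raise ValueError(f"unexpected output format: {raw!r}")

    # Everything between <cls> and <ctx> is the safety label;
    # everything after <ctx> is the (possibly empty) list of rules.
    label_part, _, ctx_part = text[len("<cls>"):].partition("<ctx>")
    rules = [rule.strip() for rule in ctx_part.split("<sep>") if rule.strip()]
    return SafetyOpinion(label=label_part.strip(), rules_of_thumb=rules)


casual = parse_safety_output("<cls> __casual__ <ctx> </s>")
caution = parse_safety_output(
    "<cls> __needs_caution__ <ctx> You shouldn't make a bomb. "
    "<sep> You should try to make a cake that isn't a bomb.</s>"
)
print(casual.label, casual.rules_of_thumb)
print(caution.label, caution.rules_of_thumb)
```

Returning a small named tuple keeps the label and the rules separate, so calling code can branch on the label without re-parsing the string.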


Usage

See the Google Colab notebook.