<h1 style='text-align: center '>SPANISH NSFW</h1>
<h3 style='text-align: center '>Model Card</h3>
Version 1.0 / 18.Jul.2023
Table of Contents
- Model Details
- Uses
- Training Data
- Risks and Limitations
- Evaluation
- Recommendations
- Glossary and Calculations
- More Information
- Model Card Authors
Model Details
Basics
This section provides information for anyone who wants to know about the model.
<details> <summary>Click to expand</summary> <br/>
Model Type: Transformer-based Language Model
Version: 1.0.0
Languages: Multiple; see training data
Release Date Estimate: 30.July.2023
</details>
Technical Specifications
This section provides information for people who work on model development.
<details> <summary>Click to expand</summary><br/>
Model Architecture: Megatron-LM GPT2:
- Decoder-only architecture
- Layer normalization applied to the word embeddings layer (StableEmbedding)
- 1,065,314,304 parameters:
  - 385,351,680 embedding parameters
  - 24 layers, 16 attention heads
  - Hidden layers are 1536-dimensional
  - Sequence length of 2048 tokens
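The parameter breakdown above can be sanity-checked with some arithmetic. Assuming a standard GPT-2-style decoder block (fused QKV projection, 4× MLP expansion, two LayerNorms per layer) plus the embedding LayerNorm (StableEmbedding) and a final LayerNorm, the stated totals are internally consistent. The vocabulary size below is inferred from the embedding count, not stated in this card.

```python
# Sanity-check the stated parameter counts. The layer composition and the
# vocabulary size are assumptions inferred from the figures in this card.

hidden = 1536
layers = 24
vocab = 385_351_680 // hidden           # inferred vocabulary size: 250,880

embedding = vocab * hidden              # token embedding matrix
per_layer = (
    3 * hidden * hidden + 3 * hidden    # fused Q/K/V projection + bias
    + hidden * hidden + hidden          # attention output projection
    + hidden * 4 * hidden + 4 * hidden  # MLP up-projection
    + 4 * hidden * hidden + hidden      # MLP down-projection
    + 2 * 2 * hidden                    # two LayerNorms (weight + bias)
)
extra_norms = 2 * 2 * hidden            # embedding LayerNorm + final LayerNorm

total = embedding + layers * per_layer + extra_norms
print(embedding)  # 385351680, matching the stated embedding parameters
print(total)      # 1065314304, matching the stated total
```

Under these assumptions the numbers reproduce exactly, which suggests the card's counts cover weights, biases, and all LayerNorm parameters.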
Training
Tokenization
</details>
Environmental Impact
Uses
This section addresses questions around how the model is intended to be used, discusses the foreseeable users of the model (including those affected by the model), and describes uses that are considered out of scope or misuse of the model. It provides information for anyone considering using the model or who is affected by the model.
<details> <summary>Click to expand</summary><br/>
Intended Use
This model was created to enable public research on large language models (LLMs). LLMs are intended to be used for language generation or as a pretrained base model that can be further fine-tuned for specific tasks. The use cases below are not exhaustive.
Direct Use
- Text generation
- Exploring characteristics of language generated by a language model
  - Examples: Cloze tests, counterfactuals, generations with reframings
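As an illustration of the cloze-test probe mentioned above, the sketch below fills a blanked sentence with each candidate, scores the result, and counts the item as correct when the true fill scores highest. The `toy_score` function is a hypothetical stand-in; a real probe would score filled sentences with the model's log-likelihood.

```python
def cloze_accuracy(items, score_fn):
    """items: (template_with_blank, candidate_fills, index_of_correct_fill).

    Fills the blank with each candidate, scores the resulting text, and
    counts the item as correct when the true fill scores highest.
    """
    correct = 0
    for template, candidates, answer in items:
        scores = [score_fn(template.replace("___", c)) for c in candidates]
        if scores.index(max(scores)) == answer:
            correct += 1
    return correct / len(items)

# Stand-in scorer for illustration only: favours words seen in a tiny
# reference sentence. A real run would use the LM's log-likelihood.
reference = set("the cat sat on the mat".split())
def toy_score(text):
    return sum(1 for w in text.split() if w in reference)

items = [("the cat sat on the ___", ["mat", "moon"], 0)]
print(cloze_accuracy(items, toy_score))  # -> 1.0 with the toy scorer
```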
Downstream Use
- Tasks that leverage language models include: Information Extraction, Question Answering, Summarization
Misuse
Intentionally using the model for harm, violating human rights, or engaging in other kinds of malicious activity is a misuse of this model. This includes:
- Spam generation
- Disinformation and influence operations
- Disparagement and defamation
- Harassment and abuse
- Unconsented impersonation and imitation
- Unconsented surveillance
Intended Users
Direct Users
- General Public
- Non-commercial entities
</details>
Training Data
This section provides a high-level overview of the training data. It is relevant for anyone who wants to know the basics of what the model is learning.
<details> <summary>Click to expand</summary><br/>
Details for each dataset are provided in individual Data Cards.
Training data includes:
- 45 natural languages
- 140MB of pre-processed text, converted into 350M unique tokens (see the tokenizer section for more)