seo llm

Model Card for israelNwokedi/Llama2_Finetuned_SEO_Instruction_Set

Attempts to extract metadata from web pages: keywords, description, and header counts.

Model Details

Model Description

Direct Use

Expediting offline SEO analysis

Bias, Risks, and Limitations

Currently the model does not respond to site or metadata prompts and may need a more refined dataset to work.

How to Get Started with the Model

!pip install -q -U trl transformers accelerate git+
!pip install -q datasets bitsandbytes einops

Import AutoModelForCausalLM from transformers and call AutoModelForCausalLM.from_pretrained to load the model from "israelNwokedi/Llama2_Finetuned_SEO_Instruction_Set".
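A minimal loading sketch follows. It assumes the repository can be loaded directly with from_pretrained, that the tokenizer comes from the base model named under Training Procedure, and that accelerate is installed for device_map="auto". The helper name generate_seo_metadata, the token budget, and the raw-text prompt format are illustrative assumptions, not something this card specifies.

```python
MODEL_ID = "israelNwokedi/Llama2_Finetuned_SEO_Instruction_Set"
# Assumption: the tokenizer is taken from the base model used for fine-tuning.
TOKENIZER_ID = "TinyPixel/Llama-2-7B-bf16-sharded"


def generate_seo_metadata(prompt, max_new_tokens=128):
    """Hypothetical helper: load the model and generate text for one prompt."""
    # Imported lazily so the module can be imported without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    # Tokenize the prompt and move the tensors to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling generate_seo_metadata with the text of a scraped page returns the decoded generation, which includes the prompt itself followed by whatever the model appends.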

Training Details

Training Data

Prompts: Entire sites and backlinks scraped from the web.

Outputs: Keywords, description, and header counts (h1-h6).

These are the main components of the dataset. Additional samples use ChatGPT-generated metadata as prompts, paired with the relevant outputs.
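To make the target outputs concrete, here is one way the keywords, description, and h1-h6 counts could be computed from a scraped page using only the Python standard library. This is an illustrative sketch, not the pipeline used to build the dataset; the names SeoMetadataParser and extract_seo_metadata are hypothetical.

```python
from html.parser import HTMLParser


class SeoMetadataParser(HTMLParser):
    """Collects meta keywords, meta description, and h1-h6 tag counts."""

    def __init__(self):
        super().__init__()
        self.keywords = []
        self.description = ""
        self.header_counts = {f"h{i}": 0 for i in range(1, 7)}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag in self.header_counts:
            self.header_counts[tag] += 1
        elif tag == "meta":
            attr = dict(attrs)
            name = (attr.get("name") or "").lower()
            content = attr.get("content") or ""
            if name == "keywords":
                self.keywords = [k.strip() for k in content.split(",") if k.strip()]
            elif name == "description":
                self.description = content.strip()


def extract_seo_metadata(html):
    """Return the three output fields for one HTML document."""
    parser = SeoMetadataParser()
    parser.feed(html)
    return {
        "keywords": parser.keywords,
        "description": parser.description,
        "header_counts": parser.header_counts,
    }
```

Outputs computed this way could also serve as reference values when checking what the model generates for a given page.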

Training Procedure

Fine-tuning of the pre-trained "TinyPixel/Llama-2-7B-bf16-sharded" Hugging Face model using LoRA and QLoRA.
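A LoRA adapter configuration for a causal language model might look like the sketch below. It assumes the peft library, which this card does not name explicitly, and every value in LORA_KWARGS is a placeholder chosen for illustration; the hyperparameters actually used for this model are not recorded here.

```python
# Placeholder values for illustration only; not the settings used for this model.
LORA_KWARGS = {
    "r": 16,               # rank of the low-rank update matrices
    "lora_alpha": 32,      # scaling factor applied to the update
    "lora_dropout": 0.05,  # dropout on the LoRA layers during training
    "bias": "none",        # leave bias terms frozen
    "task_type": "CAUSAL_LM",
}


def build_lora_config():
    """Hypothetical helper: build a peft LoraConfig from the placeholder values."""
    # Imported lazily; assumes the peft package is installed.
    from peft import LoraConfig

    return LoraConfig(**LORA_KWARGS)
```

The resulting config would typically be passed to a trainer such as trl's SFTTrainer or applied with peft's get_peft_model.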

Preprocessing [optional]

Used Transformers' BitsAndBytesConfig to load the model quantized for lightweight training, and the "TinyPixel/Llama-2-7B-bf16-sharded" tokenizer for encoding and decoding.
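The quantized loading step could be sketched as follows. The 4-bit NF4 settings are common QLoRA defaults and are an assumption here, since this card does not list the exact quantization options; load_quantized_base is a hypothetical helper name.

```python
BASE_MODEL_ID = "TinyPixel/Llama-2-7B-bf16-sharded"

# Assumed settings (typical for QLoRA); the card does not state the real ones.
QUANT_KWARGS = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
    "bnb_4bit_use_double_quant": True,
}


def load_quantized_base(model_id=BASE_MODEL_ID):
    """Hypothetical helper: load the base model in 4-bit along with its tokenizer."""
    # Imported lazily; requires torch, transformers, accelerate, and bitsandbytes.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        bnb_4bit_compute_dtype=torch.bfloat16, **QUANT_KWARGS
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, quantization_config=quant_config, device_map="auto"
    )
    return model, tokenizer
```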

Training Hyperparameters

Testing Data, Factors & Metrics

Testing Data

Sampled from training data.


Not yet computed.

In an initial test, the model attempted to reconstruct another piece of artificial metadata as part of its text generation; however, this was not the intended use case.

Environmental Impact

