llama vicuna text-generation-inference

Complete Vicuna model weights, with the delta patch already applied to the LLaMA base weights.
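
For context, Vicuna was originally distributed as delta weights that are added element-wise to the LLaMA base weights to recover the full model. Below is a minimal sketch of that idea in plain Python, with toy lists standing in for real tensors. The function name `apply_delta` and the parameter names are illustrative assumptions, not the actual conversion script used for this model.

```python
# Toy illustration only: real checkpoints hold tensors, not Python lists.
# apply_delta and the parameter names below are hypothetical.

def apply_delta(base, delta):
    """Return merged weights: merged[name] = base[name] + delta[name], element-wise."""
    if base.keys() != delta.keys():
        raise ValueError("base and delta must contain the same parameter names")
    merged = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # The delta stores (target - base), so adding it back recovers the target.
        merged[name] = [b + d for b, d in zip(base_values, delta_values)]
    return merged


# Made-up example values, chosen to be exact in floating point.
base = {"layer.weight": [1.0, 2.0, 3.0], "layer.bias": [0.5]}
delta = {"layer.weight": [0.25, -0.5, 0.0], "layer.bias": [0.25]}
merged = apply_delta(base, delta)
```

Because the weights here are already merged, no such step should be needed before loading them.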