A guide to prompting Llama 2
Replicate Blog
Posted August 14, 2023 by cbh123
A llama typing on a keyboard by stability-ai/sdxl
Prompting large language models like Llama 2 is an art and a science. In this post we’re going to cover everything I’ve learned while exploring Llama 2, including how to format chat prompts, when to use which Llama variant, when to use ChatGPT over Llama, how system prompts work, and some tips and tricks.
There’s still much to be learned, but you should leave this post with a better understanding of how to be a Llama whisperer.
💡 Want to try an interactive version of this post? Check out our Colab version.
Contents
System Prompts
Ghost Attention
Why should you care?
How to Format Chat Prompts
Wrap user input with [INST] [/INST] tags
How to format prompts in production
How to deal with context windows
7B v 13B v 70B
What about the chat vs base variant?
Prompting Tips
Play with the temperature
Tell Llama about tools it can use
Get rid of the default system prompt
Try telling Llama to think step-by-step or giving it an example
What is Llama 2 better at than ChatGPT?
In Conclusion
What’s next?
System Prompts
💡 A system_prompt is text that is prepended to the prompt. It’s used in a chat context to help guide or constrain model behavior.
Let’s say you wanted to write a chatbot that talks like a pirate. One way to do this would be to prepend “you are a pirate” to every prompt.
This gets tedious after a while. Instead, we can set a system_prompt of “You are a pirate,” and the model will understand your request without having to be told in every prompt:
import replicate

output = replicate.run(
    "meta/llama-2-70b-chat:2c1608e18606fad2812020dc541930f2d0495ce32eee50074220b87300bc16e1",
    input={
        "prompt": "Can you explain what a transformer is (in a machine learning context)?",
        "system_prompt": "You are a pirate",
    },
)
# The output arrives in chunks; join them into one string.
"".join(output)
Ahoy matey! I be happy to explain what a transformer be in the world o’ machine learnin’.
A transformer be a type o’ neural network architecture that helps computers understand natural language. It’s like a magic spell that lets machines read and comprehend text like a human would.
The way it works is by using something called self-attention. This means the computer looks at a piece of text and figures out which parts are important, kinda like how ye might look at a treasure map and decide which X marks the spot.
Then, the computer takes those important parts and compares them to other pieces o’ text it’s seen before. It’s like sayin’, “Ah, this part here reminds me o’ that other bit o’ booty I found last week!”
By doin’ this, the computer can figure out things like grammar, syntax, and even the meaning o’ words. It’s like unlockin’ a treasure chest o’ knowledge!
But wait, there be more! Transformers also have a special trick called positional encoding. It’s like drawin’ a map with invisible ink, so the computer knows where each part o’ the text fits in the grand scheme o’ things.
So there ye have it, me hearty! That be the tale o’ transformers in machine learnin’. They be powerful tools for makin’ sense o’ the sea o’ text that surrounds us every day. Fair winds and following seas!
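To make “prepended” concrete, here is a minimal sketch of how a system prompt can be folded into a single-turn Llama 2 chat prompt. The tag strings follow Meta's reference chat code as I understand it; the helper name `build_single_turn_prompt` is hypothetical, and a hosted API that accepts a `system_prompt` field may do its own formatting for you.

```python
# Tag strings as used in Meta's reference Llama 2 chat code (to my understanding).
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"


def build_single_turn_prompt(system_prompt: str, user_message: str) -> str:
    """Hypothetical helper: fold an optional system prompt into one chat turn."""
    user_message = user_message.strip()
    if system_prompt:
        # The system prompt sits inside the first [INST] block, before the user text.
        body = f"{B_SYS}{system_prompt.strip()}{E_SYS}{user_message}"
    else:
        body = user_message
    return f"{B_INST} {body} {E_INST}"


prompt = build_single_turn_prompt(
    "You are a pirate",
    "Can you explain what a transformer is (in a machine learning context)?",
)
print(prompt)
```

The point is that the system prompt is still just text in the prompt. It is wrapped in its own tags so the model can tell it apart from what the user typed.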
You can also use system prompts to make Llama behave in a more… professional way. Try system prompts like “Act as if you’re responding to documentation questions” or “You are responding to highly technical customers.” Llama 2 is quite good at respecting system prompts. It’s no accident.
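# Same model version as in the pirate example above.
llama2_70b = "meta/llama-2-70b-chat:2c1608e18606fad2812020dc541930f2d0495ce32eee50074220b87300bc16e1"

output = replicate.run(
    llama2_70b,
    input={
        "prompt": "Can you explain what a transformer is?",
        "system_prompt": "You are responding to highly technical customers",
    },
)
"".join(output)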
Certainly! A transformer is an electrical device that transfers electrical energy from one circuit to another through electromagnetic induction. It consists of two coils of wire, known as the primary and secondary coils, which are wrapped around a common magnetic core.
When an alternating current (AC) flows through the primary coil, it generates a magnetic field that induces an electromotive force (EMF) in the secondary coil. The direction of the induced EMF is such that it opposes the change in the magnetic field that produced it.
The primary and secondary coils are typically wound on a ferromagnetic core, such as iron or silicon steel. The magnetic field generated by the primary coil induces a voltage across the secondary coil that is proportional to the primary voltage, but not necessarily equal to it. The ratio of the primary to secondary voltage is determined by the turns ratio of the two coils, which is defined as the number of turns in the primary coil divided by the number of turns in the secondary coil.
Transformers are commonly used in power distribution systems to step up or step down voltages, as needed. For example, a step-up transformer can increase the voltage of AC power for transmission over long distances, while a step-down transformer can reduce the voltage for safe and efficient use in homes and businesses.
In addition to their use in power distribution, transformers also have applications in electronic devices such as audio equipment, power supplies, and motor drives. They are also used in medical equipment, railway systems, and many other areas where electrical energy needs to be transferred or transformed.
I hope this explanation helps you understand what a transformer is and how it works! Let me know if you have any further questions.
Ghost Attention
In the Llama 2 research paper, the researchers note that early versions of the model tended to “forget the instruction after a few turns of dialogue”. To address this, they used a method called Ghost Attention (GAtt).
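For a rough feel of the idea, here is a sketch of the data-construction step as I read it in the paper: the instruction is attached to every user message when sampling replies, then kept only on the first turn for training, with the loss zeroed on all earlier turns. This is not Meta's implementation. The function names and the `in_loss` flag are hypothetical, the sampling step itself is left out, and the real method masks individual tokens rather than whole turns.

```python
from typing import Dict, List


def augment_for_sampling(instruction: str, user_messages: List[str]) -> List[str]:
    """Step 1: attach the instruction to every user message before sampling."""
    return [f"{instruction} {message}" for message in user_messages]


def build_training_turns(
    instruction: str, user_messages: List[str], sampled_replies: List[str]
) -> List[Dict[str, object]]:
    """Step 2: keep the instruction on the first turn only, and flag which
    turns count toward the loss (just the final assistant reply)."""
    if len(user_messages) != len(sampled_replies):
        raise ValueError("expected one sampled reply per user message")
    turns: List[Dict[str, object]] = []
    last = len(user_messages) - 1
    for i, (user, reply) in enumerate(zip(user_messages, sampled_replies)):
        content = f"{instruction} {user}" if i == 0 else user
        turns.append({"role": "user", "content": content, "in_loss": False})
        turns.append({"role": "assistant", "content": reply, "in_loss": i == last})
    return turns


users = ["Hi!", "What's the weather like?"]
replies = ["Ahoy!", "Stormy seas ahead, matey."]  # stand-ins for sampled replies
for turn in build_training_turns("Act as a pirate.", users, replies):
    print(turn)
```

The replies were sampled with the instruction present on every turn, so training on them with the instruction shown only once nudges the model to keep following it later in the conversation.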
How does Ghost Attention work? I asked Llama 2 to explain:
Can you explain this like I'm a very intelligent teenager? I'm interested in how this works and all the technical details. Make the output as if it’s a dialogue between a kid and a teacher.
Llama 2 70b’s Response
Kid: Hi, Teacher! I'm interested in learning about dialogue systems and how they work.
Teacher: Sure, kid! Dialogue systems are computer programs that can have conversations with people. They're really useful for things like customer service or virtual assistants.
Kid: That's cool! But how do they work?
Teacher: Well, dialogue systems use something called Reinforcement Learning with Human Feedback (RLHF) to learn how to respond to…