A guide to prompting Llama 2

Replicate Blog

Posted August 14, 2023 by cbh123

A llama typing on a keyboard by stability-ai/sdxl

Prompting large language models like Llama 2 is an art and a science. In this post we’re going to cover everything I’ve learned while exploring Llama 2, including how to format chat prompts, when to use which Llama variant, when to use ChatGPT over Llama, how system prompts work, and some tips and tricks.

There’s still much to be learned, but you should leave this post with a better understanding of how to be a Llama whisperer.

💡 Want to try an interactive version of this post? Check out our Colab version.

Contents

System Prompts

Ghost Attention

Why should you care?

How to Format Chat Prompts

Wrap user input with [INST] [/INST] tags

How to format prompts in production

How to deal with context windows

7B v 13B v 70B

What about the chat vs base variant?

Prompting Tips

Play with the temperature

Tell Llama about tools it can use

Get rid of the default system prompt

Try telling Llama to think step-by-step or giving it an example

What is Llama 2 better at than ChatGPT?

In Conclusion

What’s next?

System Prompts

💡 A system_prompt is text that is prepended to the prompt. It’s used in a chat context to help guide or constrain model behavior.

Let’s say you wanted to write a chatbot that talks like a pirate. One way to do this would be to prepend “you are a pirate” to every prompt.
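The manual approach can be sketched in a few lines. This is only an illustration: `with_persona` is a hypothetical helper, not part of the Replicate client, and it simply glues the persona text onto the front of each prompt.

```python
def with_persona(persona: str, prompt: str) -> str:
    """Prepend a persona instruction to a single user prompt (hypothetical helper)."""
    return f"{persona.strip()}\n\n{prompt.strip()}"


questions = [
    "What is a transformer?",
    "How does self-attention work?",
]

# Every prompt has to carry the instruction itself.
prompts = [with_persona("You are a pirate.", q) for q in questions]

for p in prompts:
    print(p)
    print("---")
```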

This gets tedious after a while. Instead, we can set a system_prompt of “You are a pirate,” and the model will understand your request without having to be told in every prompt:

    import replicate

    # Run the 70B chat model with a system prompt.
    output = replicate.run(
        "meta/llama-2-70b-chat:2c1608e18606fad2812020dc541930f2d0495ce32eee50074220b87300bc16e1",
        input={
            "prompt": "Can you explain what a transformer is (in a machine learning context)?",
            "system_prompt": "You are a pirate",
        },
    )
    # Join the returned chunks into a single string.
    "".join(output)

Ahoy matey! I be happy to explain what a transformer be in the world o’ machine learnin’.

A transformer be a type o’ neural network architecture that helps computers understand natural language. It’s like a magic spell that lets machines read and comprehend text like a human would.

The way it works is by using something called self-attention. This means the computer looks at a piece of text and figures out which parts are important, kinda like how ye might look at a treasure map and decide which X marks the spot.

Then, the computer takes those important parts and compares them to other pieces o’ text it’s seen before. It’s like sayin’, “Ah, this part here reminds me o’ that other bit o’ booty I found last week!”

By doin’ this, the computer can figure out things like grammar, syntax, and even the meaning o’ words. It’s like unlockin’ a treasure chest o’ knowledge!

But wait, there be more! Transformers also have a special trick called positional encoding. It’s like drawin’ a map with invisible ink, so the computer knows where each part o’ the text fits in the grand scheme o’ things.

So there ye have it, me hearty! That be the tale o’ transformers in machine learnin’. They be powerful tools for makin’ sense o’ the sea o’ text that surrounds us every day. Fair winds and following seas!

You can also use system prompts to make Llama behave in a more… professional way. Try system prompts like “Act as if you’re responding to documentation questions” or “You are responding to highly technical customers.” Llama 2 is quite good at respecting system prompts. It’s no accident.

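    # llama2_70b is assumed to hold the same model identifier string as in the previous example.
    output = replicate.run(
        llama2_70b,
        input={
            "prompt": "Can you explain what a transformer is?",
            "system_prompt": "You are responding to highly technical customers",
        },
    )
    # Join the returned chunks into a single string.
    "".join(output)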

Certainly! A transformer is an electrical device that transfers electrical energy from one circuit to another through electromagnetic induction. It consists of two coils of wire, known as the primary and secondary coils, which are wrapped around a common magnetic core.

When an alternating current (AC) flows through the primary coil, it generates a magnetic field that induces an electromotive force (EMF) in the secondary coil. The direction of the induced EMF is such that it opposes the change in the magnetic field that produced it.

The primary and secondary coils are typically wound on a ferromagnetic core, such as iron or silicon steel. The magnetic field generated by the primary coil induces a voltage across the secondary coil that is proportional to the primary voltage, but not necessarily equal to it. The ratio of the primary to secondary voltage is determined by the turns ratio of the two coils, which is defined as the number of turns in the primary coil divided by the number of turns in the secondary coil.

Transformers are commonly used in power distribution systems to step up or step down voltages, as needed. For example, a step-up transformer can increase the voltage of AC power for transmission over long distances, while a step-down transformer can reduce the voltage for safe and efficient use in homes and businesses.

In addition to their use in power distribution, transformers also have applications in electronic devices such as audio equipment, power supplies, and motor drives. They are also used in medical equipment, railway systems, and many other areas where electrical energy needs to be transferred or transformed.

I hope this explanation helps you understand what a transformer is and how it works! Let me know if you have any further questions.

Ghost Attention

In the Llama 2 research paper, the researchers note that initial trained versions tended to “forget the instruction after a few turns of dialogue”. To address this, they used a method called Ghost Attention (GAtt).
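As a rough sketch of the data-construction step the paper describes: the instruction is attached to every user message when sampling responses, then kept only on the first turn for fine-tuning, with the loss set to zero on tokens from earlier turns. The class, function, and field names below are hypothetical, the dialogue is a toy example with hand-written replies, and real training masks at the token level rather than per message.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Turn:
    role: str  # "user" or "assistant"
    text: str


def attach_instruction(dialogue: List[Turn], instruction: str) -> List[Turn]:
    """Sampling view: prepend the instruction to every user message."""
    return [
        Turn(t.role, f"{instruction} {t.text}") if t.role == "user" else t
        for t in dialogue
    ]


def build_training_example(
    dialogue: List[Turn], instruction: str
) -> Tuple[List[Turn], List[int]]:
    """Training view: keep the instruction on the first user turn only,
    and give a nonzero loss weight to the final assistant turn alone."""
    turns: List[Turn] = []
    seen_first_user = False
    for t in dialogue:
        if t.role == "user" and not seen_first_user:
            turns.append(Turn(t.role, f"{instruction} {t.text}"))
            seen_first_user = True
        else:
            turns.append(t)
    last = len(turns) - 1
    loss_mask = [
        1 if i == last and t.role == "assistant" else 0
        for i, t in enumerate(turns)
    ]
    return turns, loss_mask


# Toy dialogue; in the paper the assistant replies would be sampled from the model.
dialogue = [
    Turn("user", "Hi!"),
    Turn("assistant", "Ahoy!"),
    Turn("user", "What's 2+2?"),
    Turn("assistant", "Four, matey!"),
]
instruction = "Always answer as a pirate."

sampling_view = attach_instruction(dialogue, instruction)
training_turns, loss_mask = build_training_example(dialogue, instruction)
print(loss_mask)
```

The mask is what lets the instruction appear once while the model is still trained on a reply that respects it several turns later.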

How does Ghost Attention work? I asked Llama 2 to explain:

Can you explain this like I'm a very intelligent teenager? I'm interested in how this works and all the technical details. Make the output as if it’s a dialogue between a kid and a teacher.

Llama 2 70b’s response:

Kid: Hi, Teacher! I'm interested in learning about dialogue systems and how they work.

Teacher: Sure, kid! Dialogue systems are computer programs that can have conversations with people. They're really useful for things like customer service or virtual assistants.

Kid: That's cool! But how do they work?

Teacher: Well, dialogue systems use something called Reinforcement Learning with Human Feedback (RLHF) to learn how to respond to…
