3.3.8 Run length encoding

This resource is built around AQA GCSE Computer Science 3.3.8 Data compression, specifically the run length encoding content, and stays tightly focused on what students need for that exact specification point. The topic sits within Fundamentals of data representation, where students are expected to understand why data is compressed and explain how run length encoding (RLE) works. In practice, this is the part of the course where repeated data stops being boring and starts becoming useful.

For teachers, the challenge is not usually introducing the idea. It is helping students encode data accurately, explain why the method works, and recognise when it is effective and when it is not. This page is designed to make that easier, with classroom-ready explanations, marking guidance, example responses, and exam-style practice.

At a Glance

🎯 Specification context

AQA GCSE Computer Science, 3.3.8 Data compression

This page focuses on the run length encoding part of the specification

Students may be asked to explain the method or represent data as frequency/data pairs

Students need to know

RLE is a form of lossless compression

Repeated values are stored as a count followed by the value

It works best when data contains long runs of the same character, bit, or pixel value

In bitmap-style questions, students may need to encode the data row by row

Key exam focus

explaining how repeated data is compressed

writing the correct sequence of frequency/data pairs

recognising when RLE reduces file size and when it may not

Common sticking points

reversing the order of the count and the value

missing where one run ends and the next begins

assuming RLE always makes data smaller

Understanding the Topic

Where this sits in the curriculum

In AQA GCSE Computer Science, run length encoding appears in 3.3.8 Data compression. Students are not expected to give a grand tour of every compression method under the sun. They need to understand how RLE works, why compression may be useful, and how to represent suitable data in the required compressed format.

This means answers should stay close to the specification wording. If students drift into vague comments such as "it squashes the file a bit", the mark scheme is unlikely to be charmed.

What run length encoding actually does

Run length encoding compresses data by replacing a sequence of repeated values with:

the number of times the value appears in a run
the value itself

So instead of storing every symbol separately, the file stores each run once, as a count followed by the value.

For example, the binary string 0000011100000011 can be written in RLE as:

5 0 3 1 6 0 2 1

That means:

five 0s
three 1s
six 0s
two 1s
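
For teachers who like to show the process in code, the steps above can be sketched in Python as follows. This is a minimal illustration rather than anything the specification requires: the function names `rle_encode` and `format_pairs` are invented for this example, and the data is assumed to arrive as a string of symbols.

```python
def rle_encode(data: str) -> list[tuple[int, str]]:
    """Return the runs in data as (count, value) pairs, in order."""
    pairs = []
    for symbol in data:
        if pairs and pairs[-1][1] == symbol:
            # Same value as the current run: extend its count.
            count, value = pairs[-1]
            pairs[-1] = (count + 1, value)
        else:
            # First symbol, or the value has changed: start a new run.
            pairs.append((1, symbol))
    return pairs


def format_pairs(pairs: list[tuple[int, str]]) -> str:
    """Write the pairs in count-then-value order, separated by spaces."""
    return " ".join(f"{count} {value}" for count, value in pairs)


print(format_pairs(rle_encode("0000011100000011")))  # 5 0 3 1 6 0 2 1
```

The loop mirrors what students do by hand: keep counting while the value stays the same, and start a new pair the moment it changes.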

Why this counts as lossless compression

RLE is lossless because no original data is thrown away. The compressed version can be expanded back to the exact original sequence. That matters because some data, such as text or program code, must come back exactly as it went in: a single changed value would make it wrong.
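
One way to make the lossless idea concrete is a round trip: encode, decode, and compare with the original. The sketch below is illustrative only. The names `rle_encode` and `rle_decode` are made up for this example, and it leans on Python's standard `itertools.groupby` to find the runs.

```python
from itertools import groupby


def rle_encode(data: str) -> list[tuple[int, str]]:
    """Group consecutive equal symbols into (count, value) pairs."""
    return [(len(list(run)), value) for value, run in groupby(data)]


def rle_decode(pairs: list[tuple[int, str]]) -> str:
    """Expand each (count, value) pair back into its run."""
    return "".join(value * count for count, value in pairs)


original = "0000011100000011"
pairs = rle_encode(original)
restored = rle_decode(pairs)

print(pairs)                 # [(5, '0'), (3, '1'), (6, '0'), (2, '1')]
print(restored == original)  # True
```

Because the decoded string matches the original exactly, nothing has been lost along the way.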

When RLE works well

RLE works best when data contains long runs of repeated values. That is why it is commonly linked to:

simple bitmap images with repeated colours
binary patterns with long sections of the same digit
any data where repetition is high

The longer the runs, the more useful the method becomes.

When RLE works badly

RLE is not equally effective for all data. If the data changes value constantly, the compressed version may be no smaller and can even become larger.

For example, a short sequence such as 10101010 has no run longer than a single symbol. Encoding it gives 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 0: eight pairs, or sixteen values, to describe eight bits, which is not much of a bargain.
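
A rough way to compare a good case with a poor one is to count how many values each version has to store. The sketch below assumes, purely for illustration, that every count and every value costs the same to store. Real file sizes depend on how many bits each count and value uses. The helper names are hypothetical.

```python
from itertools import groupby


def count_runs(data: str) -> int:
    """Count how many runs of consecutive equal symbols data contains."""
    return sum(1 for _ in groupby(data))


def stored_values(data: str) -> int:
    """Each run is stored as two values: a count and the value itself."""
    return 2 * count_runs(data)


for sample in ("0000011100000011", "10101010"):
    print(sample, "original:", len(sample), "encoded:", stored_values(sample))
# 0000011100000011 original: 16 encoded: 8
# 10101010 original: 8 encoded: 16
```

Under that simplifying assumption, the first string halves in size and the second doubles, which is a useful pair of examples to put side by side.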

💡 Teacher tip
Push students to say why RLE helps: it reduces repeated storage by recording a run once with a count. That is stronger than simply saying it compresses the data.

What AQA is really testing here

For this specification point, students should be able to:

explain how data can be compressed using RLE
recognise that the method stores frequency/data pairs
convert a simple sequence into the correct encoded form
apply the method to binary data or bitmap-style examples
understand that RLE is most effective where there is repetition

Key Terms and Concepts

Term	Explanation
Compression	Reducing the amount of storage needed for data.
Lossless compression	A compression method where the original data can be reconstructed exactly.
Run	A sequence of the same value appearing consecutively.
Run length encoding (RLE)	A lossless method that stores repeated data as a count followed by the value.
Frequency/data pair	The count of a repeated value and the value itself, such as `5 0`.
Bitmap image	An image made from individual pixels. RLE works best when many neighbouring pixels are the same.
Redundancy	Repeated information that can be stored more efficiently.
Negative compression	When the compressed version ends up larger than the original because the data has too little repetition.

How to Teach This Topic

Teaching moves that work well

Start with an everyday pattern such as AAAAABB before moving to binary data.
Model the idea of a run first, then introduce the frequency/data pair format.
Move from text examples to binary strings, then to bitmap-style rows.
Ask students to point to where each run starts and ends before they encode anything.
Use paired examples where one sequence compresses well and another does not.

What to watch for in class

Students often swap the order and write value/count instead of count/value.
Some students split one long run into several smaller ones.
Others forget that any change of value starts a new run, even when that value has already appeared earlier in the sequence.
Many assume compression always means a smaller file, which is not guaranteed with RLE.
Bitmap questions can trip students up if they do not follow the data in the correct order.

A practical teaching sequence

Introduce repeated patterns using letters.
Translate the same idea into binary.
Show a specification-style example such as 0000011100000011 → 5 0 3 1 6 0 2 1.
Move to simple bitmap rows and ask students to encode them carefully.
Compare a good RLE example with a poor one so students see that suitability depends on the data.

Discussion prompts

Why is RLE called lossless?
Why does a long run help compression more than a short run?
What would happen if the data alternated constantly between 0 and 1?
Why might RLE be useful for a simple bitmap image but less useful for highly detailed image data?

Scaffolding ideas

Give students colour-coded runs before removing the visual support.
Use sentence stems such as: RLE compresses data by... and This works well when...
Provide partially completed frequency/data pairs and ask students to finish them.
Ask students to decode as well as encode. Decoding often reveals whether the original concept is secure.

Extension activities

Ask students to compare RLE with another lossless method such as Huffman coding at a high level.
Set a challenge where students decide whether a sequence is worth compressing using RLE and justify the answer.
Use simple bitmap grids and ask students to explain why some images compress better than others.

🧠 Helpful classroom reminder
Students usually improve faster when they say the run aloud before writing it. For example: five 0s, three 1s, six 0s, two 1s. It slows down the guesswork and reduces careless reversals.

How to Mark This Topic Effectively

✅ Reward answers that:

identify runs accurately and in the correct order

write complete frequency/data pairs with no missing sections

use the language of lossless compression correctly

explain that RLE is effective when data contains repeated values

apply the method to the actual sequence given, rather than describing it vaguely

Answer feature	What a stronger answer does	What a weaker answer does
Method	Explains that consecutive repeated values are replaced by a count and the value.	Says only that the data is made shorter.
Accuracy	Captures every run in the correct order.	Misses runs, merges runs incorrectly, or reverses count and value.
Exam language	Uses terms such as lossless, run, and frequency/data pair appropriately.	Uses everyday language only, with little technical precision.
Evaluation	Notes that RLE works best when there is lots of repetition.	Claims it always reduces file size.

Common marking traps

giving credit for a description of compression that never actually explains RLE
overlooking a missed run in the middle of an otherwise tidy answer
accepting lossy where the student clearly means lossless
giving full credit to an answer that explains the idea reasonably well but encodes the sequence incorrectly, instead of awarding method marks and accuracy marks separately

📝 Examiner lens
The best responses are specific. They do not just say RLE stores repeated data more efficiently. They show how by using the sequence in front of them.

Example Student Responses

Example question

Use run length encoding to compress the binary string 0000011100000011 and explain one reason why run length encoding is suitable for this data.

Marks: 4

Marking guidance

1 mark for identifying the first run correctly as 5 0
1 mark for completing the full sequence as 5 0 3 1 6 0 2 1
1 mark for explaining that repeated values are stored as a count and a value
1 mark for explaining that the data contains long runs of repeated values, so RLE can reduce storage

Strong response

The compressed version is 5 0 3 1 6 0 2 1. Run length encoding is suitable because the binary string has long runs of repeated values, so instead of storing each bit separately, it stores how many times each value appears.

Why this is strong

The sequence is fully correct.
The response explains the method, not just the final answer.
The explanation links suitability to the repeated pattern in the data.

Weak response

The answer is 0 5 1 3 0 6 1 2. RLE is good because it makes files smaller.

Why this is weak

The student has reversed the order and written value/count instead of count/value.
The explanation is too general.
It does not explain why this particular data is suitable for RLE.
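
The reversal in the weak response is mechanical enough to check automatically. The sketch below is a hypothetical helper, not part of any mark scheme: the names `expected_tokens` and `diagnose` are invented here, and it assumes the answer is written as space-separated tokens.

```python
from itertools import groupby


def expected_tokens(original: str) -> list[str]:
    """Build the correct answer as a flat list: count, value, count, value..."""
    tokens = []
    for value, run in groupby(original):
        tokens += [str(len(list(run))), value]
    return tokens


def diagnose(answer: str, original: str) -> str:
    """Classify a written answer as correct, reversed pairs, or incorrect."""
    given = answer.split()
    expected = expected_tokens(original)
    if given == expected:
        return "correct"
    if len(given) % 2 == 0:
        # Swap the two tokens in every pair and compare again.
        swapped = []
        for i in range(0, len(given), 2):
            swapped += [given[i + 1], given[i]]
        if swapped == expected:
            return "reversed pairs"
    return "incorrect"


print(diagnose("5 0 3 1 6 0 2 1", "0000011100000011"))  # correct
print(diagnose("0 5 1 3 0 6 1 2", "0000011100000011"))  # reversed pairs
```

A check like this only separates accuracy errors from one another. The explanation marks still need a human read.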

Practice Questions

Question 1

Compress the binary string 11110000 using run length encoding.

Marks: 2

Marking guidance:

1 mark for 4 1
1 mark for 4 0

Question 2

Explain why run length encoding is described as lossless compression.

Marks: 2

Marking guidance:

1 mark for stating that no data is lost
1 mark for explaining that the original sequence can be reconstructed exactly

Question 3

A student says that run length encoding always reduces file size. Explain why this is not always true.

Marks: 3

Marking guidance:

1 mark for recognising that some data has little repetition
1 mark for explaining that many short runs produce many pairs
1 mark for concluding that the compressed data may stay the same size or become larger

Question 4

A bitmap row contains the values 0000000011111111. Write the RLE frequency/data pairs and explain why this row compresses well.

Marks: 4

Marking guidance:

2 marks for 8 0 8 1
1 mark for identifying long runs
1 mark for explaining that repeated values need fewer stored entries

Common Misconceptions

Misconception	Quick correction
RLE always reduces file size.	No. It works well only when the data contains long runs of repeated values.
The pair should be written as value then count.	For AQA-style RLE questions, students should use count then value.
If a value appears again later, it stays in the same run.	A run ends as soon as the value changes. A later repeat starts a new run.
Lossless means the file becomes tiny.	Lossless means the original data can be recovered exactly.
RLE is only for text.	It can also be used for binary data and bitmap-style image data.

FAQ

Do students need to memorise a special formula for run length encoding?

No. Students need to understand the process. They should be able to spot each run, count it accurately, and write the frequency/data pairs in order.

What is the most common exam error in this topic?

The most common error is reversing the pair and writing the value before the count. The next most common issue is missing where one run ends and the next begins.

Should I teach decoding as well as encoding?

Yes. Decoding helps students check whether they really understand the structure of the pairs. It is also a very effective way to spot who is guessing.

Do students need to know when RLE is a poor choice?

Yes. They should understand that RLE is most effective when the data contains repeated values and much less effective when the pattern changes constantly.

How can I make bitmap questions easier to teach?

Use simple rows first. Ask students to trace the pixels from left to right, count each run, and say the pair aloud before writing it down.
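
The left-to-right tracing described above can be demonstrated with a small grid. In this sketch the bitmap is made up for illustration, `encode_row` is a hypothetical helper, and each row is assumed to be encoded on its own, so a run never carries over from the end of one row to the start of the next.

```python
from itertools import groupby


def encode_row(row: str) -> str:
    """Encode one row of pixels, left to right, as count-then-value pairs."""
    return " ".join(f"{len(list(run))} {pixel}" for pixel, run in groupby(row))


# A made-up 8 x 4 black-and-white bitmap: 0 and 1 are the two pixel values.
bitmap = [
    "00000000",
    "00111100",
    "01100110",
    "11111111",
]

for row in bitmap:
    print(row, "->", encode_row(row))
# 00000000 -> 8 0
# 00111100 -> 2 0 4 1 2 0
# 01100110 -> 1 0 2 1 2 0 2 1 1 0
# 11111111 -> 8 1
```

The first and last rows collapse to a single pair each, while the third row needs five pairs, which gives students a ready-made comparison of rows that compress well and rows that do not.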
