3.1.1 The differences and relationships between data, information and knowledge.
1. Data
What it is:
Data is the raw facts and figures. On its own, it doesn’t have much meaning until it’s organised or processed.
Example (Digital Support & Security):
Imagine a server log that records every login attempt. A single line might look like this: 2025-09-01 09:45:12 | User: JamesF | Login: Failed
On its own, that’s just one piece of data.
2. Information
What it is:
When data is processed, organised, or put into context so it makes sense, it becomes information. Information answers questions like “who?”, “what?”, “where?”, “when?”.
Example (Digital Support & Security):
If you take the server log data and count how many failed login attempts happened in the last 24 hours, you might discover: “There were 45 failed login attempts from 5 different IP addresses on the college’s network.”
This is information because it’s structured and tells you something meaningful.
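To make the step from data to information concrete, here is a minimal Python sketch that counts failed logins over the last 24 hours, assuming log lines in the pipe-separated format shown above (the sample lines and the fixed "now" are illustrative):

from datetime import datetime, timedelta

# Sample log lines in the format shown above (normally read from a file).
log_lines = [
    "2025-09-01 09:45:12 | User: JamesF | Login: Failed",
    "2025-09-01 10:02:33 | User: PriyaK | Login: Success",
    "2025-09-01 22:15:40 | User: JamesF | Login: Failed",
]

now = datetime(2025, 9, 2, 9, 0, 0)   # fixed "now" so the example is repeatable
cutoff = now - timedelta(hours=24)

failed = 0
for line in log_lines:
    timestamp_text, _user, login_status = [part.strip() for part in line.split("|")]
    timestamp = datetime.strptime(timestamp_text, "%Y-%m-%d %H:%M:%S")
    if timestamp >= cutoff and login_status.endswith("Failed"):
        failed += 1

print(f"{failed} failed login attempts in the last 24 hours")

The raw lines are data; the single counted summary is information.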
3. Knowledge
What it is:
Knowledge is when you analyse and interpret the information to make decisions or take action. It answers “why?” and “how?”.
Example (Digital Support & Security):
From the information about 45 failed login attempts from 5 IPs, you recognise a possible brute-force attack on student accounts. You know this because of your training in cybersecurity, which tells you that multiple failed logins from a small set of IP addresses is a common threat indicator.
Using this knowledge, you might (a minimal sketch follows this list):
Block those IP addresses in the firewall.
Alert the IT security team.
Review authentication logs for suspicious activity.
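As a rough illustration of turning that information into an actionable decision, the sketch below groups failed logins by source IP and flags candidates for blocking. The pairs and the threshold are hypothetical, not a real detection rule:

from collections import Counter

# Hypothetical (username, source IP) pairs taken from failed-login records.
failed_attempts = [
    ("JamesF", "203.0.113.9"), ("JamesF", "203.0.113.9"),
    ("AishaB", "203.0.113.9"), ("AishaB", "203.0.113.9"),
    ("TomW", "198.51.100.4"), ("TomW", "203.0.113.9"),
]

THRESHOLD = 3  # illustrative cut-off for "suspicious"

# Count failures per source IP; many failures from one IP across several
# accounts is the brute-force indicator described above.
by_ip = Counter(ip for _user, ip in failed_attempts)
block_candidates = [ip for ip, count in by_ip.items() if count >= THRESHOLD]

print("Candidate IPs to block in the firewall:", block_candidates)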
3.1.2 The sources for generating data:
Human (Surveys, Forms)
Humans generate data whenever they give information directly – for example, filling in a form, survey, or feedback questionnaire. This is usually self-reported data (what a person chooses to share).
(Digital Support & Security):
A college IT support team might send students a survey about how secure they feel when using online learning platforms. The answers (Yes/No, ratings out of 5, written comments) are data that can be collected and analysed.
AI and Machine Learning (Feedback Loops)
AI and machine learning systems create data as they learn from user behaviour. A feedback loop happens when the AI uses its own output as new input, which can lead to bias or errors being reinforced.
(Digital Support & Security):
A cybersecurity monitoring tool that uses machine learning to detect suspicious logins could wrongly flag normal student behaviour (like logging in late at night) as a threat. If those false alarms are fed back into the system as “evidence,” it may become overly strict and block real students from logging in.
Task 1
Complete the AI and ML worksheet.
Task 2
Create a presentation in your group around AI and ML.
Sensors
Sensors collect data from the environment. They measure physical things like heat, movement, sound, or light.
(Digital Support & Security):
In a server room at college, temperature sensors monitor if equipment is overheating. If the temperature goes above a safe level, the system can trigger an alert to the IT support team before the servers shut down.
Internet of Things (IoT) – Smart Objects
IoT devices are everyday objects connected to the internet (e.g., smart lights, thermostats, security cameras). They collect and send data automatically.
(Digital Support & Security):
A college might use smart security cameras that detect movement and send alerts to the IT team’s dashboard. This data helps keep the campus safe, but IT staff must also secure the devices to stop hackers from gaining access.
Transactions (Purchases, Sign-ups)
Every time someone buys something, signs up for a service, or logs into a system, data is generated. Transactions create a digital footprint.
(Digital Support & Security):
When a student pays for a college esports event online, the system records:
The student’s name
Payment method
Date & time
Items purchased (e.g., entry ticket + team jersey)
This transaction data must be stored securely to comply with data protection laws (like GDPR) and to prevent cybercriminals from stealing card details.
"Data Detectives"
Scenario:
You’re part of a college IT support team. The college wants to improve security and gather data from different sources. Below are some situations.
Task (10 mins):
For each scenario, identify:
What is the raw data?
What information can you get from that data?
What knowledge or decisions could the IT team make?
Scenarios:
1 - A survey sent to 200 students asks: “Do you feel safe using the college Wi-Fi?” 120 say Yes, 80 say No.
2 - A machine learning tool notices that a student logs into the network at 2 a.m. every night and flags it as unusual.
3 - A server room temperature sensor records 35°C at 3:00 p.m. (normal temperature should be under 25°C).
4 - The college installs smart locks on computer labs that record every time someone enters or leaves.
5 - The esports society’s online shop records that most students buy merchandise around payday (the 28th of each month).
Extension (5 mins):
Find an example of an IoT device that could be used in a school or esports setting.
Describe what data it collects, what information it provides, and what knowledge the IT team could gain from it.
3.1.3 Ethical data practices and the metrics to determine the value of data:
Before we dive into the metrics, remember: Ethical data practice means collecting, storing, and using data responsibly.
This includes:
Getting permission (consent) from users.
Protecting data from cyberattacks.
Not misusing personal information.
Following laws like GDPR (General Data Protection Regulation) in the UK/EU.
Now, let’s explore the metrics used to decide how valuable data is.
Quantity
Quantity refers to the amount of data collected. More data can help identify patterns more accurately.
(Digital Support & Security):
A college IT team collects data from 10 login attempts vs 10,000 login attempts. The larger dataset is more valuable because it shows broader patterns (e.g., which times of day attacks are most common).
Don’t collect more data than necessary – only gather what’s useful (“data minimisation” under GDPR).
Timeframe
Timeframe is about when the data was collected and how long it remains relevant. Recent data is often more valuable than old data.
(Digital Support & Security):
A log of failed Wi-Fi logins from yesterday is more useful for spotting a live cyberattack than logs from 2019.
Don’t keep data longer than necessary. For example, student support tickets might be deleted after a year once resolved.
Source
The value of data depends on where it comes from and how trustworthy the source is.
(Digital Support & Security):
Login data from the college’s own servers = reliable source.
A random spreadsheet emailed by an unknown user = unreliable (could be fake or manipulated).
Always check sources and avoid using stolen or illegally obtained data.
Veracity
Veracity means the accuracy and truthfulness of data. Data full of errors or lies is less valuable.
(Digital Support & Security):
If students fill in a survey about cyber safety and many joke by giving fake answers (“My password is 123456”), the veracity of the data is low, so the results can’t be trusted.
Organisations should clean and validate data, and not mislead people by presenting false or incomplete results.
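A minimal sketch of what "clean and validate" can mean in practice, assuming survey rows with a 1–5 rating and a free-text comment (the field names and rules are illustrative):

# Hypothetical survey responses; some fail basic validity checks.
responses = [
    {"rating": 4, "comment": "Clear training session"},
    {"rating": 99, "comment": "lol"},                    # out-of-range rating
    {"rating": 5, "comment": "My password is 123456"},   # joke answer
]

def is_valid(row):
    # Keep only plausible ratings and discard obvious joke answers.
    if not 1 <= row["rating"] <= 5:
        return False
    if "password is" in row["comment"].lower():
        return False
    return True

clean = [row for row in responses if is_valid(row)]
print(f"Kept {len(clean)} of {len(responses)} responses")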
3.1.4 How organisations use data and information:
Analysis to Identify Patterns
Organisations look at large sets of data to find trends, behaviours, or repeated issues. Patterns help predict future events and improve decision-making.
The IT support team analyses helpdesk tickets and notices that every Monday morning, many students report Wi-Fi login problems. The pattern suggests that systems might need restarting after the weekend.
Google analyses search trends (e.g., millions of people suddenly searching for the same issue). This helps them detect outbreaks of cyberattacks or bugs spreading online.
System Performance Analysis (Load, Outage, Throughput, Status)
Organisations monitor how well their systems are running:
Load – how much demand is placed on the system (e.g., number of users).
Outage – when systems go down or stop working.
Throughput – how much data or traffic can pass through the system.
Status – current health of servers, networks, or applications.
An esports tournament hosted at a college requires fast servers. The IT team monitors server load and bandwidth usage during live matches. If the system slows down, they can add more resources to avoid crashes.
Amazon Web Services (AWS) constantly monitors its cloud servers. If a data centre goes down, traffic is automatically re-routed to another server to prevent downtime for customers.
User Monitoring (Login/Logout, Resources Accessed)
Organisations track user activity to ensure systems are being used correctly and securely.
A college IT team monitors who logs into the Virtual Learning Environment (VLE). If a student logs in from two countries within the same hour, it may indicate a hacked account.
Microsoft 365 monitors user logins across the world. If an account logs in from London and then five minutes later from New York, it may block the login and alert security teams.
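A simplified sketch of that "impossible travel" check, assuming login events have already been geolocated to a city (the names and the one-hour rule are illustrative):

from datetime import datetime

# Hypothetical login events: (username, location, time), in time order.
logins = [
    ("student1", "London", datetime(2025, 9, 2, 10, 0)),
    ("student1", "New York", datetime(2025, 9, 2, 10, 5)),
]

MIN_GAP_HOURS = 1  # faster than this between distant locations is suspicious

# Compare each login with the next one for the same user.
for (user_a, loc_a, t_a), (user_b, loc_b, t_b) in zip(logins, logins[1:]):
    if user_a == user_b and loc_a != loc_b:
        gap = (t_b - t_a).total_seconds() / 3600
        if gap < MIN_GAP_HOURS:
            print(f"ALERT: {user_a} logged in from {loc_a} then {loc_b} "
                  f"within {gap:.2f} hours")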
Targeted Marketing (Discounts, Upselling)
Organisations use data about customer behaviour to send personalised offers, suggest upgrades, or advertise products people are likely to buy.
A college esports society collects data on what students buy in the online shop. If a student buys a gaming jersey, they might get an email offering a discount on a matching mousepad.
Steam (Valve) analyses what games you play and recommends new titles you’re likely to enjoy. They also send personalised sale notifications to encourage more purchases.
Threat/Opportunity Assessment
Organisations analyse data to spot risks (threats) or advantages (opportunities). This can relate to cybersecurity, business competition, or legal compliance.
The IT security team compares data about phishing attempts with government alerts from the NCSC (National Cyber Security Centre). If a new type of phishing attack is targeting colleges, they can prepare staff with updated training – turning a threat into an opportunity to strengthen security.
NCSC (UK) collects data on cyber incidents across the country. They publish reports on new cyber threats, which organisations use to improve security and stay compliant with regulations like GDPR.
"Data in Action"
Scenario:
You are working in the IT support and security team for a college esports club. You have access to the following datasets:
1 - Login records: Show that some students are logging in at 3 a.m. from outside the UK.
2 - Server stats: During last Friday’s tournament, the main game server slowed down when 200 players connected at once.
3 - Shop sales: Jerseys sell out every time there’s a big tournament, but headsets don’t sell as well.
4 - Competitor data: Another nearby college just announced a new gaming lab with high-spec PCs.
Task:
1 - Analysis to Identify Patterns:
Which dataset shows a repeated trend?
What pattern do you see?
2 - System Performance:
Which dataset shows a system issue?
What actions should IT take to prevent it happening again?
3- User Monitoring:
What do the login records tell you?
What security risks do they suggest?
4 - Targeted Marketing:
How could the esports club use the shop sales data to increase revenue?
5 - Threat/Opportunity Assessment:
How should the club respond to the competitor’s new gaming lab?
Extension:
Research how a company like Netflix or Amazon uses data to recommend products or detect suspicious activity.
Share your findings with the group.
3.1.5 Interrelationships between data, information and the way it is generated and make judgements about the suitability of data, information and the way it is generated in digital support and security.
What this means
Data = raw facts or figures (numbers, logs, text, clicks, etc.) without context.
Information = processed, organised, and meaningful data that helps people make decisions.
Way it is generated = how the data is collected (e.g. login records, surveys, sensors, monitoring tools).
These three parts are linked together:
The way data is generated determines the type and quality of the data you get.
That raw data needs to be processed and organised.
Once processed, the data becomes information that can be used to make decisions.
If the data is incomplete, biased, or collected in the wrong way, the information may not be suitable for decision-making.
"A College Cybersecurity Incident Response"
Scenario:
A UK college notices that some students’ accounts have been logging in at unusual times. The IT security team collects data from three different sources:
1 - Login/Logout Records (system generated data)
2 - Firewall Logs (network traffic data, showing unusual connections from overseas IPs)
3 - Incident Reports (manually generated by staff when they notice suspicious behaviour)
How the interrelationships work:
Data:
Login records show timestamps, usernames, and IP addresses.
Firewall logs capture packet traffic and potential intrusion attempts.
Staff reports note suspicious emails and students complaining about locked accounts.
Information (processed data):
Combining the login timestamps with IP addresses shows multiple students logging in from a single overseas location at odd hours.
Staff reports confirm phishing emails were sent to many accounts the day before.
Suitability of Data:
Login data: Useful and reliable, but could be misleading if students use VPNs.
Firewall logs: Provide technical detail, but require expertise to interpret.
Staff reports: Subjective, but add valuable context about user behaviour.
Judgement:
The most suitable data in this case is the combination of automated system logs (objective, timestamped evidence) and user-reported incidents (human context). Relying on only one source could lead to misinterpretation (e.g. mistaking a VPN for a hacker).
Real-World Industry Example
NHS Digital (UK Health Service) collects data from hospital IT systems about cyber incidents.
In 2017’s WannaCry ransomware attack, logs showed unusual traffic patterns while staff reported being locked out of systems.
By combining both machine data (network logs, malware signatures) and human-reported issues, NHS Digital was able to coordinate with cybersecurity agencies to restore services and improve future protections.
This demonstrates how data, information, and generation methods must work together to make correct security decisions.
"Data to Information Detective"
1 - Work in pairs or small groups.
2 - Read the case study above about the college cybersecurity incident.
3 - Answer the following questions together (10 minutes):
Data: List two types of raw data the IT team collected. Why is each useful?
Information: How did the IT team turn the raw data into useful information?
Suitability: Which source of data (login logs, firewall logs, or staff reports) do you think is most reliable for making security decisions? Why?
Judgement: If you were the IT manager, what actions would you take based on the information gathered? (E.g., resetting passwords, training, blocking IP addresses.)
Extension:
(Optional challenge if time allows, 5 minutes):
Think of a real organisation (like a bank, online shop, or gaming company).
What kind of data do they collect?
How do they turn it into information?
What threats or opportunities might this create?
Output:
Each group should share one key insight with the class about why it’s important to think about both the data itself and how it’s generated when making digital support or security decisions.
Files that support this week
English:
Reading & comprehension of technical text
Students must read definitions and explanations of data, information and knowledge, and the differences and relationships between them.
They must understand the sections on “sources for generating data,” “ethical data practices / metrics,” “how organisations use data,” and the interrelationships.
Summarising / paraphrasing
In the “Data Detectives” scenarios, students are asked to state: What is the raw data? What information can you get? What knowledge or decisions could be made? Summarising these in their own words is an exercise in paraphrase / condensation.
In the “Data to Information Detective” questions: summarisation of case studies and articulating the relationships in their own phrasing.
Explanation / justification / argumentation
Students must explain which data sources are useful, how raw data is processed into information, and justify choices (e.g. which data source is most reliable, what decisions to take).
They are asked to make judgments (e.g. “If you were the IT manager, what actions would you take?”) requiring reasoned argumentation.
The extension task: research a real organisation’s data use and explain how data → information → knowledge in that context. That requires structuring and explaining in prose.
Use of precise vocabulary / technical terms
Terms like data, information, knowledge, ethical practices, metrics, veracity, quantity, timeframe, source, system performance, user monitoring, targeted marketing, threat/opportunity assessment, etc., are introduced and must be used in explanations.
In students’ writeups, they must use these terms appropriately and integrate them into their reasoning.
Oral / group discussion / sharing
Students work in pairs or groups to discuss scenarios, answer the detective style questions, and then groups share key insights with the class. That involves oral communication skills.
The presentation tasks (from other weeks) are implied to continue through the module, so this week’s groundwork supports those.
Assessment:
Learning Outcomes:
Awarding Organisation Criteria:
Maths:
Quantitative reasoning & interpretation
In the “Data Detectives” scenarios, students deal with numeric data (e.g. “120 say Yes, 80 say No”) and must interpret what that raw data tells them as information / patterns.
They must reason about how much data (quantity), timeframes (when), veracity, source reliability — these metrics involve comparing numeric or relative magnitudes.
Comparisons, proportions, and trends
The scenario about transaction timings (“most students buy merchandise around payday”) suggests patterns over temporal cycles — students can investigate proportions, frequencies over time.
The lesson invites pattern detection: e.g. repeated peaks or anomalies in data across datasets.
Classification / mapping raw → processed
Students map raw data (logs, sensor readings, survey responses) into processed information (counts, summarised statements). That mapping is a kind of data transformation, which is a numeracy skill.
They decide which sources are more “suitable” (in terms of reliability, veracity) — which entails comparing numeric attributes (accuracy, error rates).
Logical / structured decision-making
Determining which data sources to trust and which actions to take is a decision process that depends on quantitative judgments (e.g. weighting data, considering potential error) — combining logic and numeracy.
Recognising interrelationships (data → information → knowledge) implies structural / logical ordering as well as understanding dependencies among quantities.
Metrics & measurement concepts
The module introduces metrics for value of data: quantity, timeframe, source, veracity. Each metric can be considered as a measurable attribute. Students must think about how to measure / compare them.
The students may (in extension) examine real organisations’ data metrics and compare different measures (e.g. volume of data vs freshness vs reliability).
Stretch and Challenge:
E&D / BV
Homework / Extension:
ILT
Week 2
T&L Activities:
3.2 Methods of transforming data
3.2.1 Methods of transforming data:
When organisations collect data, it is often raw and not immediately useful. To make it valuable, it must be transformed. The main methods are:
Manipulating
Analysing
Processing
Manipulating Data
Changing or reorganising data to make it more understandable or useful. This might include filtering, sorting, or combining data from different sources.
A college IT support team exports login data from the network. At first, it’s just thousands of rows of timestamps and usernames. By manipulating the data (sorting by user, filtering failed attempts), they quickly see which accounts have repeated login failures.
Splunk and Elastic (ELK Stack) are widely used in cybersecurity to manipulate and search through huge log files, making it easier to spot patterns of suspicious behaviour.
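A minimal Python sketch of this kind of manipulation (the rows are illustrative, standing in for the exported network data):

# Raw export: one dictionary per login attempt.
rows = [
    {"user": "meganL", "time": "2025-09-01 00:14", "status": "Failed"},
    {"user": "jamesF", "time": "2025-09-01 09:45", "status": "Failed"},
    {"user": "meganL", "time": "2025-09-01 00:16", "status": "Failed"},
    {"user": "priyaK", "time": "2025-09-01 10:02", "status": "Success"},
]

# Manipulating: filter to failed attempts only, then sort by username.
failed = sorted((row for row in rows if row["status"] == "Failed"),
                key=lambda row: row["user"])
for row in failed:
    print(row["user"], row["time"])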
Analysing Data
Looking at data in depth to identify patterns, trends, or relationships. Analysing moves beyond just reorganising – it’s about making sense of the information.
After manipulating login records, the IT team analyses them and notices that 80% of failed logins happen between midnight and 3 a.m. This unusual pattern suggests a brute-force attack.
IBM Security QRadar analyses logs from multiple systems (firewalls, servers, apps) to detect cyber threats by identifying unusual traffic patterns.
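A minimal sketch of that analysis step: what share of failed logins fall in the midnight-to-3 a.m. window? The timestamps are illustrative:

from datetime import datetime

# Timestamps of failed logins (illustrative).
failed_times = [
    datetime(2025, 9, 1, 0, 14), datetime(2025, 9, 1, 1, 2),
    datetime(2025, 9, 1, 2, 47), datetime(2025, 9, 1, 9, 45),
    datetime(2025, 9, 1, 1, 30),
]

# Analysing: count how many fall between 00:00 and 03:00.
night = sum(1 for t in failed_times if 0 <= t.hour < 3)
share = night / len(failed_times) * 100
print(f"{share:.0f}% of failed logins occurred between 00:00 and 03:00")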
Processing Data
Converting raw data into a different format or structure so it can be used by systems, applications, or people. Processing often involves automation.
A system collects sensor data from a server room (temperature, humidity). This raw data is processed into a dashboard that shows “green, amber, red” warnings. IT staff don’t need to read every number – the processed data tells them instantly if action is needed.
SIEM (Security Information and Event Management) tools like Azure Sentinel automatically process logs from thousands of endpoints and generate alerts for IT teams.
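A minimal sketch of that processing step, turning raw temperature readings into the green/amber/red statuses a dashboard would show (the thresholds are illustrative):

# Processing: map a raw reading to a traffic-light status.
def status(temp_c):
    if temp_c < 25:
        return "green"
    if temp_c < 30:
        return "amber"
    return "red"

readings = {"rack-1": 22.4, "rack-2": 27.9, "rack-3": 35.0}
for rack, temp in readings.items():
    print(f"{rack}: {temp}°C -> {status(temp)}")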
You are part of a college IT security team. Below is some raw login data:
Task:
Manipulating:
Sort the data by username. What do you notice?
Analysing:
Which accounts show suspicious behaviour? Why?
Processing:
Imagine you are designing a dashboard. How would you present this data (e.g., traffic light system, charts, alerts)?
Extension:
Research one industry tool (Splunk, ELK Stack, QRadar, or Azure Sentinel).
Explain: Does it mainly manipulate, analyse, or process data – or all three?
Files that support this week
English:
Reading & comprehension of technical prose
Students must read and understand the descriptions of manipulating, analysing, processing data (how raw data is transformed).
They must interpret a sample raw dataset (usernames, timestamps, status) and understand the implied narrative.
Explanation / description writing
In tasks, students will explain their observations: e.g. “Sort the data by username. What do you notice?” requires them to describe patterns in their own words.
“Which accounts show suspicious behaviour? Why?” demands reasoning and justification in prose.
“Imagine you are designing a dashboard. How would you present this data?” asks them to describe the design and rationale (textually).
Inquiry / research & reporting
The extension task: “Research one industry tool (Splunk, ELK, QRadar, or Azure Sentinel). Explain: Does it mainly manipulate, analyse, or process data – or all three?” This requires gathering information from external sources and then writing an explanation in structured form.
Use of technical vocabulary
Terms like manipulating, analysing, processing, dashboard, pattern, suspicious behaviour etc. must be used correctly in answers.
Students will need to communicate clearly with precise vocabulary when describing what transformations or analyses do.
Logical / sequential narrative
The tasks follow a logical progression (manipulate → analyse → process). Students’ answers can mirror that sequence in writing, helping them practise structuring written argumentation in a logical order.
Assessment:
Learning Outcomes:
Awarding Organisation Criteria:
Maths:
Sorting / ordering
“Sort the data by username.” That is an ordering / sorting operation (alphanumeric sorting). It encourages thinking about ordering rules (alphabetic, timestamp) and how datasets can be reorganised.
Pattern detection / trend identification
When students analyse the data to find suspicious accounts, they must look for patterns (e.g. multiple failures by one user, clustering in time). That is numerical / logical pattern recognition.
Data filtering / selection
The notion of filtering (selecting subsets of data that meet criteria) is itself a numeracy / data operation (e.g. "only failed logins", "only those with > n failures").
When designing dashboard output (e.g. traffic light, chart, alerts), students decide how to map numeric data to visuals (thresholds, ranges). That involves thinking about scales, cut-offs, representation of numeric values visually.
Converting detailed raw numeric logs to more digestible summary forms is a form of data aggregation / summarisation (though implicitly).
Classification / categorisation of behaviour
Deciding which accounts are “suspicious” vs “normal” is a classification exercise based on numeric criteria (e.g. number of failed attempts, clustering). This involves thresholding, comparison, logical testing.
Understanding data transformation hierarchy
The underpinning conceptual structure (manipulate → analyse → process) implicitly involves mathematical thinking about stages of transforming data (e.g. reorganising, aggregating, mapping).
Stretch and Challenge:
E&D / BV
Homework / Extension:
ILT
Week 3
T&L Activities:
3.3 Data taxonomy
What is a Taxonomy?
Think of a taxonomy like a family tree, but for data. It’s a way of splitting things into groups so we know what type of data we’re dealing with.
3.3.1 Definition of qualitative and quantitative, its purpose, and how data is categorised
Quantitative
Quantitative data basically means numbers. If you can count it or measure it, it’s quantitative.
Quantitative data can be one of two types:
Discrete Data
Discrete means things you can count in whole numbers.
You can’t have half of one: it’s either 1, 2, 3… but not 2.5.
In IT support/security:
How many times a student typed the wrong password.
The number of emails flagged as spam.
How many viruses an antivirus tool finds.
If you ask “How many login attempts failed this morning?” and the answer is “7”, that’s discrete data.
Continuous Data
Continuous means measurements – and you can have decimals.
In IT support/security:
The server room temperature (22.3°C, 22.4°C, etc.).
Bandwidth speed during an esports match (245.6 Mbps).
CPU load (%) on a computer.
If you ask “What’s the server temperature right now?” and the system says “23.5°C”, that’s continuous data.
Both are useful, but in different ways:
Discrete data is great for counting events – like how many people tried to hack into your system.
Continuous data is better for monitoring performance – like spotting if your server is overheating or slowing down.
Take Amazon Web Services (AWS): running thousands of servers worldwide, they use discrete data to count login attempts and block suspicious ones. At the same time, they use continuous data to monitor server performance. If both types spike at once, they know something is wrong.
Qualitative
What is Qualitative Data?
Qualitative data is about descriptions, opinions, and categories rather than numbers.
Types of Qualitative Data:
Categorical (or Nominal) Data
Data that can be sorted into groups, but the groups don’t have a natural order.
In Digital Support & Security:
Type of cyberattack: phishing, malware, ransomware, brute force.
Operating system: Windows, macOS, Linux.
User role: student, staff, admin.
It’s like labels – they tell you what “type” something is, but not which one is bigger or better.
Ordinal Data
Data that can be put in a ranked order, but the gaps between them aren’t necessarily equal.
In Digital Support & Security:
Student feedback on password security training (Poor, Okay, Good, Excellent).
Incident risk ratings (Low, Medium, High).
So ordinal data has a sense of order, but it’s not really about numbers. “High risk” is more serious than “Low risk,” but we can’t say it’s exactly “two times” more serious.
Quantitative data is great for spotting patterns in numbers – but qualitative data adds the human side:
What people think
How people feel
Why something is happening
NCSC (National Cyber Security Centre, UK):
They collect quantitative data about how many phishing emails are reported, but they also collect qualitative data from feedback surveys asking staff how confident they feel spotting phishing emails. By combining the two, they can judge not just how many phishing attempts are happening, but also how well people are prepared to deal with them.
Case Study: College Cybersecurity Awareness
Your college has recently run a campaign to improve cybersecurity awareness among students and staff. The IT support and security team collected both quantitative and qualitative data to see if it worked.
Data Collected:
• Quantitative (numbers):
- 1,200 phishing emails reported in Term 1.
- Only 450 phishing emails reported in Term 2.
- 95% of students logged in successfully without needing password resets.
• Qualitative (opinions/descriptions):
- “I feel more confident spotting phishing emails now.”
- “The password rules are still too complicated.”
- “Training was useful but too short.”
- Risk ratings given by IT staff: Low, Medium, High.
Task Part 1 – Analysis (20 mins, group work)
Work in small groups and:
1. Identify the quantitative data in the case study.
2. Identify the qualitative data in the case study.
3. Explain how each type of data helps the IT team understand the effectiveness of the campaign.
4. Make a judgement: Do the numbers and opinions show the campaign was successful? Why or why not?
Task Part 2 – Research (Homework or 30 mins independent task)
Each group must research a real-world cybersecurity awareness campaign. Examples:
- NCSC “Cyber Aware” (UK)
- Google Security Checkup
- StaySafeOnline.org (US)
- OR another campaign you find.
For your chosen case:
- Find one example of quantitative data they collected.
- Find one example of qualitative data they used.
- Explain how combining both types of data made their campaign stronger.
Task Part 3 – Group Presentation (15 mins prep + delivery in next lesson)
Prepare a 5-minute presentation to share with the class. Your presentation should include:
1. A short explanation of the difference between quantitative and qualitative data.
2. An analysis of the college case study – was the awareness campaign effective?
3. Findings from your research case study.
4. A recommendation: If you were the IT manager, what would you do next to improve cybersecurity awareness?
Tip: Use visuals like graphs (for quantitative data) and word clouds or quotes (for qualitative data).
Extension / Stretch Task
Design your own mini research survey that collects both quantitative and qualitative data about how safe students feel online. Share 3–5 questions (mix of numerical scales and open-ended questions).
3.3.2 Know the definition for structured data, understand its purpose, and understand that quantitative data is structured.
Structured data is data that is organised and stored in a defined format, usually within tables, rows, and columns, such as in databases or spreadsheets. It follows strict rules, which make it easier to enter, store, search, and analyse. Because it is predictable and consistent, structured data can be processed quickly by machines and used to support decision-making.
The purpose of structured data is to:
Enable fast access and retrieval – information is easily searchable with SQL queries or filters.
Support accurate analysis – data can be aggregated, compared, and visualised (charts, dashboards, reports).
Improve reliability – stored in databases with validation rules, ensuring accuracy and reducing errors.
Aid security and compliance – structured systems can apply access controls and encryption consistently.
Quantitative data is numerical data that can be measured and counted. It is structured by:
Discrete values – whole numbers, e.g. number of employees.
Continuous values – measured data, e.g. temperature, sales revenue.
Categorical values – numerical codes representing groups, e.g. “1 = Male, 2 = Female” in an HR database.
This data fits neatly into tables where each record (row) contains values across defined fields (columns).
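A minimal sketch of structured, typed storage using Python's built-in SQLite module (the table and column names are illustrative). Each column declares a data type, and SQL can then filter and aggregate quickly:

import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE logins (
        id        INTEGER PRIMARY KEY,   -- discrete identifier
        username  TEXT    NOT NULL,      -- string field
        attempts  INTEGER NOT NULL,      -- discrete value
        cpu_load  REAL                   -- continuous value
    )
""")
con.executemany(
    "INSERT INTO logins (username, attempts, cpu_load) VALUES (?, ?, ?)",
    [("jamesF", 7, 72.5), ("priyaK", 1, 15.0)],
)

# Fast, structured retrieval: filter with a typed comparison.
for row in con.execute("SELECT username, attempts FROM logins WHERE attempts > 3"):
    print(row)
con.close()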
Case Studies
Tesco (Retail)
Tesco uses structured data in their loyalty programme (Clubcard). Customer transactions are stored in databases: product IDs, time of purchase, cost, and store location. Structured quantitative data allows Tesco to identify buying patterns, target promotions, and forecast stock demand.
NHS (Healthcare)
The NHS uses structured patient data – age, blood pressure readings, appointment times – stored in Electronic Health Records. This ensures doctors can quickly retrieve accurate medical histories, track quantitative health measures, and comply with legal standards such as GDPR.
Airlines (British Airways)
Airlines store structured data for bookings: passenger details, flight numbers, seat allocations, ticket prices. Quantitative data (ticket sales, baggage weight, passenger counts) helps them optimise scheduling, revenue management, and compliance with aviation regulations.
Spot the Structure (15 minutes – group task)
Your Challenge
In this task, you will work in small groups to explore different types of data and decide which ones are structured and which are not. Then, you’ll look at how organisations use numbers (quantitative data) to make decisions.
Step 1 – Sort the Data (5 minutes)
You will be given a sheet with different examples of data:
- A shopping list with prices
- A short blog post
- Patient heart rate readings
- A set of photos
- Flight booking details
With your group, sort the data into two piles:
- Structured data (fits into a table, rows, or columns)
- Unstructured data (free text, images, videos, anything without a clear format)
Step 2 – Find the Numbers (5 minutes)
From your structured data pile, highlight or circle the quantitative values (numbers, measurements, statistics).
Example: prices on the shopping list, heart rate readings, ticket sales.
Then, discuss:
How could an organisation use these numbers?
What decisions could they make based on them?
Step 3 – Share Your Findings (5 minutes)
Choose one example from your group
Be ready to tell the class:
1. Is it structured or unstructured?
2. What numbers did you find?
3. How could a business or organisation use that information?
What You’ll Learn
By the end of this activity, you should be able to:
- Spot the difference between structured and unstructured data.
- Identify where numbers (quantitative data) appear in structured data.
- Explain how organisations can use structured data to make decisions.
3.3.3 Know the definition for unstructured data, understand its purpose, and understand that qualitative data is unstructured.
Unstructured data is information that does not have a predefined format or structure. It does not fit neatly into tables of rows and columns, and it is often text-heavy, image-based, or multimedia. Examples include emails, social media posts, documents, photos, audio, and video files.
The purpose of unstructured data is to:
Capture rich, descriptive detail – allows organisations to understand opinions, behaviours, and context.
Support decision-making beyond numbers – text, images, and speech can provide meaning that numbers alone cannot.
Enable qualitative analysis – helps to identify themes, trends, or insights in customer feedback, medical notes, or research interviews.
Drive innovation – unstructured data can reveal opportunities for product design, marketing, or service improvement.
Qualitative Data and Unstructured Data
Qualitative data is descriptive, non-numerical data – such as feelings, opinions, and experiences. It is usually unstructured because it cannot be easily measured or placed into rows and columns.
Example: A customer saying “The product was too difficult to set up” in a feedback survey.
Unlike quantitative data (numbers), qualitative data focuses on meaning, reasons, and motivations.
Case Studies
BBC (Media)
The BBC analyses unstructured social media comments, audience feedback emails, and video views to understand what viewers like or dislike. This qualitative data helps shape programme schedules and digital content.
Amazon (E-commerce)
Amazon uses unstructured product reviews and customer questions to improve product recommendations. Sentiment analysis (positive/negative reviews) gives insight into customer satisfaction beyond raw sales numbers.
NHS (Healthcare)
Doctors’ notes, medical scans, and patient feedback are unstructured but essential for care. Analysing this qualitative data helps identify patterns in patient experiences and improve treatment plans.
Supporting Activity (15 minutes – Small Groups)
Title: “Unpack the Unstructured”
Your Challenge
In this task, you will explore different types of unstructured data and think about how organisations can use them to understand people’s experiences and opinions.
Step 1 – Identify the Unstructured Data (5 minutes)
You will be given a sheet with examples of data:
- A tweet from a customer about poor service
- A product review from Amazon
- A doctor’s note about a patient’s symptoms
- A video clip description from YouTube
- A company sales report
With your group, decide which examples are unstructured and which (if any) are structured.
Step 2 – Spot the Qualitative Information (5 minutes)
From the unstructured examples, highlight or underline the qualitative details (opinions, descriptions, experiences).
Example: “The app keeps crashing and is frustrating to use.”
Then discuss:
- How could an organisation use this type of feedback?
- What changes or improvements could it lead to?
Step 3 – Share Your Insights (5 minutes)
Pick one example and be ready to share:
1. Why is it unstructured?
2. What qualitative information did you find?
3. How could an organisation act on this information?
What You’ll Learn
By the end of this activity, you should be able to:
Recognise examples of unstructured data.
Understand how qualitative data provides meaning and context.
Explain how organisations use unstructured data to improve services or products.
3.3.4 Know the definition for each representation and understand the representations of quantitative data:
When working with data, it’s important to understand how numbers can be represented and organised. Quantitative data is data that deals with numbers and measurements it tells us how many, how much, or how often.
However, not all numbers behave in the same way. Some numbers are easy to count, some are measured on a scale, and others are used to represent categories or groups. To make sense of this, quantitative data is usually represented in three main forms:
Discrete values
Values you count which can only take certain distinct (usually whole number) values. There are gaps between possible values. Examples: number of students in a class; number of defects in a product; count of hospital visits.
Continuous values
Values you measure, which can take any value within a (possibly infinite) range, including decimals/fractions. There are no gaps between possible values in theory. Examples: height, weight, temperature, time, distance
Categorical values.
Values that represent categories or groups rather than numerical amounts. Sometimes further divided into nominal (no inherent order) and ordinal (order matters, but distances between categories are not necessarily equal). Examples: blood type; customer rating (poor / fair / good / excellent); brand; gender.
Type: Discrete values
Benefits: Easy to count and understand; good for summarising how many or how often something happens (counts); often simpler to work with – fewer possible values, often integers.
Drawbacks / Limitations: Cannot capture very fine-scale variation (no halves, decimals); sometimes artificially coarse – treating continuous phenomena as discrete (e.g. rounding) can lose information; may have many possible categories, which makes some analyses harder.
Good settings / less good settings: Good for attendance counts, inventory, surveys with count questions, defect counts. Less good for measurements where precision matters (e.g. in science/engineering: length, weight).

Type: Continuous values
Benefits: Can capture fine-grained variation and more precise measurements; allows for more sophisticated analysis (regression, modelling, detecting small differences); more flexibility in representation (histograms, density plots, etc.).
Drawbacks / Limitations: Greater measurement error possible (precision issues, instrument limits); sometimes overkill when only broad categories are needed; can be harder to interpret meaningfully if decimals dominate or the data is noisy.
Good settings / less good settings: Good for scientific measurement (physics, biology), health data (blood pressure, cholesterol), environmental monitoring. Less good when only broad categories are needed (e.g. in a large survey, age bands may matter more than exact age to the nearest day).

Type: Categorical values
Benefits: Useful for grouping, classification, and segmentation.
Drawbacks / Limitations: Cannot always be ordered (if nominal); if ordinal, the spacing between categories is ambiguous; statistical tests and visualisations are more limited (you can’t do arithmetic on nominal categories); too many categories can be unwieldy (e.g. too many brands or types).
Good settings / less good settings: Good for survey data (preferences, satisfaction levels), branding, demographic classification, marketing. Less good when precision/quantity matters more, or where categories are too broad or ambiguous.
Examples
To clarify, here are some concrete examples in organisational / real-world settings, showing each type in action, plus mixed use, and evaluation.
Retail company / inventory management
Discrete: Number of units of each product in stock (e.g. 15 chairs, 200 mugs).
Continuous: The weight of a shipment in kg; the size of product packaging (volume).
Categorical: Type of product (furniture vs kitchenware vs electronics); category of suppliers; product colour.
Benefit: Using discrete counts allows quick decisions about restocking. Continuous values help with logistics (weight, volume) for shipping. Categorical helps in analysing patterns (which product categories sell best).
Drawback: If continuous measures are too precise (e.g. milligrams) that don’t affect business decisions, they add complexity for little benefit. If categories are too many or poorly defined, comparisons become messy.
Healthcare / Hospitals
Discrete: Number of patients admitted per day; number of surgeries done.
Continuous: Patients' temperature; blood pressure; time taken in surgery; length of hospital stay (in hours).
Categorical: Disease or condition type; severity/risk grouping (Low / Medium / High).
Benefit: Continuous values allow detecting small changes in vital signs; discrete counts help with capacity planning; categorical allows grouping by disease or risk for policy decisions.
Drawback: Measurement error in continuous values can mislead (e.g. inaccurate blood pressure readings). Discrete counts fluctuate daily and may be influenced by external factors. Some categorical groupings (severity) are subjective.
Education / Schools
Discrete: Number of students in a class; number of books borrowed; number of discipline incidents.
Continuous: Test scores (if measured on continuous scale); time students spend on tasks; percentage marks.
Categorical: Grades (A / B / C / Fail); subject area (Math, English, Science); level of satisfaction in surveys.
Benefit: Discrete and continuous data enable quantitative tracking (progress, comparison). Categorical help with grouping and reporting to stakeholders.
Drawback: Grades (categorical) may hide wide variation in actual performance. Continuous scores may vary slightly due to test difficulty but may not represent real learning. Also, privacy/ethical issues when dealing with precise student data.
3.3.5 Know and understand the properties of qualitative data:
• stored and retrieved only as a single object
• codified into structured data.
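To illustrate the second property, here is a minimal sketch of codifying free-text feedback into structured categories. The keyword-to-code mapping is illustrative, not a real coding scheme:

# Free-text (qualitative) comments to be codified.
comments = [
    "I feel more confident spotting phishing emails now.",
    "The password rules are still too complicated.",
    "Training was useful but too short.",
]

# Hypothetical coding scheme: keyword -> structured category code.
codes = {"phishing": "AWARENESS", "password": "POLICY", "training": "TRAINING"}

for text in comments:
    label = next((code for keyword, code in codes.items()
                  if keyword in text.lower()), "OTHER")
    print(label, "-", text)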
3.3.6 Understand the interrelationships between data categories data structure and transformation and make judgements about the suitability of data categories, data structure and transformation in digital support and security.
Files that support this week
English:
Explanation / definitions in prose
Students will need to read and understand definitions of qualitative vs quantitative, structured vs unstructured, categorical, ordinal etc.
They may be asked to rephrase or summarise definitions in their own words (to check understanding).
Written reasoning / justification
In Task Part 1, students identify which data in the case study are quantitative vs qualitative, and then explain how each type helps the IT team. That involves constructing reasoned sentences.
Also, the judgment task: “Do the numbers and opinions show the campaign was successful? Why or why not?” — this is argumentative / evaluative writing.
In the homework / research component, students are asked to find examples and explain how combining both types of data made their campaign stronger — again, explanation in writing.
Oral / presentation skills
Students prepare a 5-minute presentation of their findings (from the case study and their research) to share with the class.
They will need to present definitions, their analysis, and recommendations — communicating to peers, possibly using visual aids (graphs, quotes).
Use of technical / subject-specific vocabulary
Terms like “quantitative data”, “qualitative data”, “structured data”, “unstructured data”, “categorical”, “ordinal”, “representation” etc. appear, and students will have to use and understand them in context.
In writing or speaking, correct usage of these terms helps precision and clarity.
Comparative / evaluative writing
Students are asked to make judgments (e.g. success of campaign, suitability of data types) which require comparing evidence, weighing pros and cons, and writing a persuasive or evaluative argument.
Also, in describing the limitations / benefits of discrete, continuous, categorical data, students may be asked to contrast them in prose.
Question design / survey writing (extension task)
The stretch task invites designing a mini research survey mixing numerical scales and open-ended questions. That requires crafting good question wording in English (clear, unbiased, well framed).
Assessment:
Learning Outcomes:
Awarding Organisation Criteria:
Maths:
Understanding data types / measurement types
Distinguish between discrete (countable whole numbers) and continuous (measurable with decimals) data.
Understanding that quantitative data can take various representations (discrete, continuous, categorical) and the properties / limitations of each.
Classification / sorting tasks
In group activity “Spot the Structure”: students sort data examples into structured vs unstructured, then highlight quantitative values in structured data. That is a numeracy task (identifying where numbers occur) and classifying numeric vs non-numeric.
In another task, “Unpack the Unstructured”: students pick unstructured examples and identify the qualitative (non-numeric) parts, which helps cement understanding of numeric vs descriptive data.
Quantitative interpretation / reasoning
In the case study about the cybersecurity awareness campaign: students will interpret numeric data (e.g. 1,200 phishing emails reported in Term 1, 450 in Term 2, 95% logins without resets) — draw conclusions and compare trends.
They will reason about what the numeric changes imply and whether the campaign was effective.
Linking quantitative and qualitative data
Evaluating how numbers and opinions / descriptive feedback interact to give a fuller picture: combining numeric trends and narrative insights to make judgments.
This encourages thinking about triangulation of data: numbers and words.
In the stretch task (creating a survey), students will choose numerical scales (e.g. 1–5, percent, counts) and think about how to measure perceptions / feelings quantitatively. That is a numeracy design decision.
Understanding limitations / trade-offs of measurement
The content asks students to consider benefits / drawbacks of discrete, continuous, and categorical data representations (e.g. precision, interpretability, number of categories). That involves mathematical reasoning about error, granularity, and usefulness.
Stretch and Challenge:
E&D / BV
Homework / Extension:
ILT
Week 4
T&L Activities:
Learning Aims and Objectives:
Aim:
Objectives:
1. By the end of this week's page students will be able to demonstrate where to apply specific data types for data that appears in databases, tables or datasets.
2. By the end of the week's page students will be able to explain and identify where specific data can be decomposed and extracted from given scenarios into appropriate data tables.
3. By the end of the week's page students will be able to demonstrate the process of normalisation and explain its purpose in given scenarios.
4. By the end of the week's page students will be able to reflect on the interrelationships between data type and data transformation.
3.4 Data types
3.4.1 The definition of common data types, their purpose, and when each is used:
Integer (Whole numbers)
What/why: Whole numbers (no decimal point). Efficient for counting, indexing, quantities, IDs. When to use: Anything that can’t have fractions: number of users, attempts, port numbers, stock counts. Gotchas: Watch out for range limits (e.g., 32-bit vs 64-bit) and accidental division that produces decimals.
Example | Suitable Uses | Not Suitable For
0 | Counter start | Currency with pennies
7 | Login attempts | Temperatures needing decimals
65535 | Network ports (unsigned) | Precise measurements (e.g. cm)
-12 | Temperature differences | —
Real (Floating-point / Decimal)
What/why: Numbers with fractional parts. When to use: Measurements (temperature, CPU load), ratios, scientific values. Gotchas: Floating-point rounding error (binary floating point). For money, prefer fixed-point/decimal types.
Example | Suitable Uses | Notes
3.14 | Maths/geometry | Stored as float/double
-0.75 | Signal values | Rounding errors possible
72.5 | CPU temperature °C | Use DECIMAL for money (not float)
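The money gotcha is easy to demonstrate in Python: binary floating point cannot represent 0.1 exactly, which is why fixed-point/decimal types are preferred for currency:

from decimal import Decimal

print(0.1 + 0.2)                          # 0.30000000000000004 (float error)
print(Decimal("0.10") + Decimal("0.20"))  # 0.30 exactly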
Character (Char)
What/why: A single textual symbol (one character). When to use: Fixed-width codes (Y/N flags), single-letter grades, check digits. Gotchas: In Unicode, a “character” users see may be multiple code points (accents/emoji). Many systems still treat CHAR as a single byte/letter in a given encoding.
Example | Suitable Uses | Notes
'Y' | Yes/No flag | Case sensitivity may matter
'A' | Grade | Encoding/locale may affect storage
'#' | Delimiter symbol | —
String (Text)
What/why: Ordered sequence of characters (words, sentences, IDs). When to use: Names, emails, file paths, JSON blobs-as-text, logs. Gotchas: Validate length and content; normalise case; be mindful of Unicode, whitespace, and injection risks.
Boolean (True/False)
What/why: Logical truth value with two states. When to use: Feature flags, on/off, pass/fail, access granted/denied. Gotchas: In databases and CSVs, Booleans are often stored as 1/0, TRUE/FALSE, Y/N—be consistent when importing/exporting.
Example | Suitable Uses | Storage Variants
TRUE | MFA enabled? | TRUE/FALSE, 1/0, or Y/N
FALSE | Account locked? | Keep consistent across DBs
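A minimal sketch of keeping Boolean imports consistent, normalising the storage variants named above (TRUE/FALSE, 1/0, Y/N) into real Booleans:

TRUTHY = {"true", "1", "y", "yes"}
FALSY = {"false", "0", "n", "no"}

def to_bool(value):
    # Normalise mixed CSV/database representations to True/False.
    text = str(value).strip().lower()
    if text in TRUTHY:
        return True
    if text in FALSY:
        return False
    raise ValueError(f"Unrecognised Boolean value: {value!r}")

print([to_bool(v) for v in ["TRUE", 0, "Y", "false"]])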
Date (and Date/Time)
What/why: Calendar date (optionally time and timezone). When to use: Timestamps for logs, booking dates, certificate expiry, backups. Gotchas: Time zones and daylight saving; choose UTC for servers, localise only for display. Use proper date types, not strings, for comparisons and indexing.
Example | Suitable Uses | Notes
2025-09-02 | Report date | Use ISO 8601 format
2025-09-02T10:30:00Z | Audit timestamp (UTC) | Store UTC, display in local timezone
2025-12-31T23:59:59+01:00 | Regional display | Avoid treating dates as strings
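A minimal sketch of the "store UTC, localise for display" advice, using Python's standard datetime and zoneinfo modules (Python 3.9+; the timezone choice is illustrative):

from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Store the timestamp as a real datetime in UTC, not as a string.
stamp = datetime(2025, 9, 2, 10, 30, tzinfo=timezone.utc)

print(stamp.isoformat())                            # ISO 8601: 2025-09-02T10:30:00+00:00
print(stamp.astimezone(ZoneInfo("Europe/London")))  # localised only for display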
BLOB (Binary Large Object)
What/why: Arbitrary binary data (files) stored as a single value. When to use: Images, PDFs, compressed archives, firmware, encrypted payloads—when you must keep the bytes intact. Gotchas: Large size affects backups and query speed; consider storing large files in object storage (S3, Azure Blob) and keep only a URL/metadata in the database.
Example | Suitable Uses | Notes
PNG logo bytes | Small media in DB | Mind database size limits
PDF policy document | Immutable file storage | Often better in file/object storage
Encrypted payload | Secure binary storage | Store MIME type, size, checksum for integrity
"Mia's Sandwich Shop"
Task 1.
Using the above video, in small groups of no larger than 3 discuss the issues that the company are having.
Identify what data is being recorded
Suggest/Agree a solution for them.
Task 2.
In your groups identify the tables that might need to appear in a database, use the process of Normalisation as well as the Computational Thinking principles of Decomposition, Abstractions and Pattern Recognition.
Task 3.
Present in your groups the findings from your normalisation. Explain/justify your reasoning around the choices made.
Create an informative presentation that discusses and explains the following areas of databases (a minimal key example follows this list):
What a Primary key is and its function, use examples to further show your understanding
What a Foreign key is and its function, use examples to further show your understanding
What a Composite key is and its function, use examples to further show your understanding
What a relational database is, and why you would use one.
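A minimal sketch of primary and foreign keys in a relational database, using Python's built-in SQLite module and hypothetical tables for Mia's shop:

import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # enforce the relationship
con.execute("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,               -- primary key
        name        TEXT NOT NULL
    )
""")
con.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL
                    REFERENCES customers(customer_id),  -- foreign key
        item        TEXT NOT NULL
    )
""")
con.execute("INSERT INTO customers VALUES (1, 'Mia')")
con.execute("INSERT INTO orders VALUES (1, 1, 'BLT sandwich')")

# Joining on the key relationship links the two tables back together.
for row in con.execute("""
    SELECT customers.name, orders.item
    FROM orders JOIN customers
      ON customers.customer_id = orders.customer_id
"""):
    print(row)
con.close()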
3.4.2 The interrelationships between structured data, unstructured data and data type.
In today’s digital world, organisations gather information in many different forms – from neatly organised spreadsheets of customer transactions to complex streams of emails, images, and social media posts. To make sense of this, we look at three key concepts: structured data, unstructured data, and data types.
Structured data is highly organised, stored in predefined formats such as rows and columns within a spreadsheet or database. This makes it straightforward to search, filter, and analyse. Examples include account numbers, dates of birth, and purchase amounts.
Structured Data
Organised in a predefined format (rows, columns, fields).
Easily stored in databases (SQL, relational systems).
By contrast, unstructured data has no fixed format or schema, making it harder to process. It includes content such as emails, audio recordings, images, videos, or free-text survey responses. While it carries rich insights, it requires more advanced tools and techniques to interpret.
Unstructured Data
No fixed schema or easily searchable structure.
Stored in raw formats like documents, images, videos, social media posts.
Examples: customer service call recordings, CCTV footage, email bodies.
At the foundation of both lies the concept of data types. A data type defines how a particular piece of information is stored and used – for instance, an integer for whole numbers, a string for text, or a blob for multimedia. Structured systems rely on data types to keep information consistent, while unstructured data is often stored in broader types like text fields or binary objects to preserve its form.
Together, these three elements form the backbone of how data is represented, stored, and ultimately transformed into meaningful information.
Examples in Practice
Scenario
Structured Data
Unstructured Data
Data Types in Play
Banking Transactions
Account ID, amount, timestamp
Call centre audio logs
Integer, DateTime, Blob
Healthcare
Patient ID, diagnosis code, prescription dosage
MRI scans, doctor notes
String, Decimal, Blob
Social Media
Username, post date, likes count
Image posts, videos, captions
String, Integer, Blob, Text
Cybersecurity
Login/logout logs, IP addresses
Suspicious emails, attached files
String, Boolean, Blob
Case Studies
Case Study 1: Healthcare – NHS Patient Records
Structured: Patient demographic data (NHS number, date of birth, appointment dates).
Unstructured: Doctor notes, x-ray images, voice dictations.
Interrelationship: Structured records (like appointment schedules) link to unstructured evidence (x-rays stored as BLOBs). The combination provides a holistic medical history.
Application: AI systems analyse unstructured scans, while SQL systems schedule appointments. Both need data types (integer IDs, date, blob images).
Case Study 2: Cybersecurity – Network Monitoring
Structured: Login/logout logs, IP addresses, timestamps.
Unstructured: Email attachments, phishing attempts, PDF exploits.
Interrelationship: Structured logs identify when and where data entered; unstructured payloads (attachments) must be analysed with ML tools. Data types (IP as string, timestamp as date, file as blob) define how each element is stored and processed.
Application: SIEM (Security Information and Event Management) platforms like Splunk combine both data types to detect anomalies.
Case Study 3: Retail – Amazon Recommendations
Structured: Order history (user ID, product ID, purchase date).
Unstructured: Customer reviews, product images.
Interrelationship: Data types underpin storage (strings for reviews, integers for quantities, blobs for images). Machine learning models merge structured purchase histories with unstructured reviews to improve recommendations.
Linked to: Core Paper 2 – Data (1.2.1, 1.2.2, 1.2.3)
Topic focus: Understanding the interrelationship between structured data, unstructured data, and data types
By the end of this 25-minute activity, you will be able to:
1. Differentiate between structured and unstructured data.
2. Identify how data types exist within both forms of data.
3. Explain how these three concepts (structured, unstructured, and data types) interrelate in real-world digital systems.
In digital support and cyber security environments, you’ll often manage both structured and unstructured data.
Understanding how data types fit into these categories helps professionals make decisions about:
- Storage (e.g., database vs cloud object store),
- Processing (e.g., SQL query vs machine learning model),
- Security and access control (structured tables vs open media files).
These ideas are interconnected:
- Structured data relies heavily on defined data types (e.g., Integer, Boolean, Date).
- Unstructured data often contains or implies data types inside its content (e.g., text or images may include embedded timestamps or numbers).
- Effective data transformation or classification depends on identifying and linking these types together.
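A minimal Python sketch of that last point: unstructured text with no schema can still contain a value with an implied data type, which a pattern match can recover as a typed value. The email text is invented for illustration.

import re
from datetime import datetime

# A fragment of unstructured data: the body of a support email.
email_body = "Hi team, I couldn't log in at 2025-09-01 09:45 from the library PC."

# The text has no fixed schema, but it contains an embedded date/time.
match = re.search(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}", email_body)
if match:
    timestamp = datetime.strptime(match.group(), "%Y-%m-%d %H:%M")
    print(timestamp)  # 2025-09-01 09:45:00 - now a genuine Date/Time value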
Discussion Starter
Ask:
“If you were the IT support technician for a hospital, what kinds of data would you need to store?”
Then ask:
“Which of those are structured and which are unstructured?”
Step 2 – The Sorting Challenge (10 minutes)
In pairs or small groups (2–3 students):
Use the provided mixed dataset samples, either printed or on screen.
Task Instructions
Each group should:
Categorise each example as: Structured data, Unstructured data, (or Semi-structured data, if appropriate).
Identify the data type(s) found or implied in each example
(e.g., text/string, integer, Boolean, date/time, float).
Draw or describe how structured/unstructured data and data types connect.
Sketch a small diagram showing arrows between:
Structured data → relies on → defined data types
Unstructured data → contains → mixed/hidden data types
Allow 10 minutes.
Each group should explain one example to the class.
Step 3 – Reflection Discussion (7 minutes)
Questions for Reflection
Step 4 – Mini Summary Task (3 minutes)
Write a short paragraph in your own words to answer:
“Explain how structured data, unstructured data, and data types interrelate in digital systems. Give an example from a real-world situation.”
Example student response:
“Structured data, like a customer database, uses fixed data types such as integers and dates to ensure consistency. Unstructured data, such as customer emails, still contains text and time stamps but lacks a fixed schema. Both can be linked — for instance, a support system may combine structured ticket records with unstructured message logs to identify issues faster.”
Reflection prompts:
Why do structured data systems (like databases) need strict data types?
How might unstructured data still contain data types?
How does this relationship affect security?
How could a cyber analyst make use of both?
3.4.3 Understand the interrelationships between data type and data transformation.
In digital support and cyber security roles, you’ll often manage data that comes from multiple sources — databases, websites, sensors, and even user input forms.
For that data to be useful, reliable, and secure, it must be stored in the correct data type and transformed into the right structure or format for use.
The interrelationship between these two ideas — data types and data transformation — is crucial to maintaining accuracy, preventing data corruption, and securing systems from attack.
Understanding Data Types
A data type defines what kind of data can be stored and how it’s processed.
Computers must know how to interpret the information they are given.
String/Text – Letters, numbers, and symbols treated as text. Examples: "James", "Password123". Typical uses: names, postcodes, usernames.
Integer – Whole numbers only. Examples: 27, 2001. Typical uses: counting logins, age, quantities.
Float/Real – Numbers with decimals. Examples: 3.14, 75.5. Typical uses: percentages, prices, CPU usage.
Boolean – True/False values. Examples: TRUE, FALSE. Typical uses: security flags, on/off states.
Date/Time – Stores time and date data. Example: 08/10/2025 13:45. Typical uses: logging, timestamps.
Think of data types as the “containers” that hold different kinds of information.
Just as you wouldn’t pour soup into a paper bag, you wouldn’t store a date as text if you plan to sort by time later.
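A short Python sketch of why this matters (the dates are invented):

from datetime import datetime

# Dates stored as text sort alphabetically, not chronologically.
as_text = ["09/01/2025", "10/12/2024", "02/03/2025"]
print(sorted(as_text))   # puts 02/03/2025 first, even though 10/12/2024 is earliest

# Stored with a proper Date/Time type, sorting is genuinely chronological.
as_dates = [datetime.strptime(d, "%d/%m/%Y") for d in as_text]
print(sorted(as_dates))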
What Is Data Transformation?
Data transformation means converting, cleaning, or reshaping data so that it becomes usable, accurate, or compatible with another system.
Transformations can include:
Changing data from one type to another (e.g. String → Integer).
Reformatting dates (MM/DD/YYYY → DD/MM/YYYY).
Cleaning messy data (Y, Yes, TRUE → TRUE).
Combining or splitting fields (e.g. First name + Surname → Full Name).
These transformations make data usable, comparable, and secure.
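A minimal Python sketch of these four transformations, using invented values:

from datetime import datetime

# An invented raw record, exactly as it might arrive from a web form.
raw = {"age": "18", "joined": "12/31/2024", "active": "Yes",
       "first_name": "James", "surname": "Ford"}

# Changing data from one type to another: String -> Integer.
age = int(raw["age"])

# Reformatting dates: MM/DD/YYYY -> DD/MM/YYYY.
joined = datetime.strptime(raw["joined"], "%m/%d/%Y").strftime("%d/%m/%Y")

# Cleaning messy data: Y / Yes / TRUE (any case) -> Boolean TRUE.
active = raw["active"].strip().lower() in ("y", "yes", "true")

# Combining fields: first name + surname -> full name.
full_name = f"{raw['first_name']} {raw['surname']}"

print(age, joined, active, full_name)   # 18 31/12/2024 True James Ford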
How Are Data Types and Data Transformation Connected?
These two concepts constantly interact:
Example 1: Importing survey results from a website where every answer is stored as text.
What happens: You need to transform "18" into an integer to do calculations (like averages).
Why the relationship matters: The transformation depends on knowing the target data type.

Example 2: A user enters their name in a field meant for a number.
What happens: Without correct data type validation, this could break the system or cause a security flaw.
Why the relationship matters: The data type restricts what transformations or inputs are accepted.

Example 3: Merging datasets from two departments with different date formats.
What happens: You must transform the date strings to one consistent date/time format.
Why the relationship matters: Correct data typing ensures the merge works accurately.
In cyber security, knowing data types helps prevent:
SQL Injection: A hacker could enter malicious text in a numeric field.
Buffer Overflow: Supplying too much text to a field expecting a smaller data type.
Data Leakage: Incorrect transformations might expose sensitive data.
Proper transformations with correct data typing protect systems from these risks.
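A minimal Python sketch of how type validation and a parameterised query protect a numeric field. The table, column and input values are invented for illustration.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

user_supplied = "1 OR 1=1"   # malicious text aimed at a numeric field

# Type validation: reject anything that is not a whole number.
try:
    user_id = int(user_supplied)
except ValueError:
    user_id = None   # refuse the input rather than passing it on

# Parameterised query: the driver treats the value as data, never as SQL code.
if user_id is not None:
    rows = conn.execute("SELECT name FROM users WHERE id = ?", (user_id,)).fetchall()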
You are helping your college’s IT department combine attendance data from two different systems.
One system exports CSV data like this:

Student_ID,Present,Hours
00123,Yes,6.5
Before the data can be analysed, you must:
Convert "00123" → Integer (to remove text formatting and leading zeros).
Convert "Yes" → Boolean TRUE.
Convert "6.5" → Float (so you can calculate averages).
The transformations are only possible if you understand the data types involved.
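A short Python sketch of these conversions, using the sample row above:

import csv
import io

# The exported attendance data from the scenario above.
export = "Student_ID,Present,Hours\n00123,Yes,6.5\n"

for row in csv.DictReader(io.StringIO(export)):
    student_id = int(row["Student_ID"])                 # "00123" -> 123 (Integer)
    present = row["Present"].strip().lower() == "yes"   # "Yes" -> True (Boolean)
    hours = float(row["Hours"])                         # "6.5" -> 6.5 (Float)
    print(student_id, present, hours)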
3.4.4 Be able to make judgements about the suitability of using structured data, unstructured data, data types, and data transformations in digital support and security.
Data Decisions in Digital Support and Security
Duration: 30 minutes
Level: Pearson T-Level in Digital Support and Security
Format: Small group task (3–4 learners)
Final Output: Short presentation (3–5 minutes per group)
Learning Objective
By the end of this session, you will be able to:
Make reasoned judgements about the suitability of structured vs unstructured data.
Evaluate how different data types (e.g., integer, string, Boolean, date/time) and data transformations (e.g., normalisation, aggregation, cleaning) impact digital support and security decisions.
Communicate your findings effectively to both a technical and non-technical audience.
Stage 1 - Scenario Briefing (5 mins)
You are part of a digital support and security team at a college that manages:
A ticketing system for IT support requests (structured data).
Incident reports written by users and technicians (unstructured data).
Network logs collected from servers and routers (semi-structured data).
Your manager has asked you to decide which type of data and data transformations are most suitable for improving the college’s cyber-incident response system.
Stage 2 - Research & Discussion (10 mins)
As a group:
1. Identify examples of structured, unstructured, and semi-structured data in the scenario.
2. Discuss how data types (e.g., integers, text, Boolean) influence how the information is stored and analysed.
3. Explore what data transformations (e.g., cleaning, filtering, converting formats, normalising) could make the data more useful.
4. Evaluate the benefits and drawbacks of using each data form in the context of the scenario.
Use this guiding question:
“Which data type and transformation process gives us the most secure and useful insight for decision-making?”
Stage 3 – Judgement and Decision (10 mins)
Create a short decision table or mind map comparing your options.
Include columns for: Data Type / Structure, Example, Transformation Used, Pros for Security, Cons / Risks, Your Judgement. For example:

Structured – Example: ticketing database. Transformation used: normalisation. Pros for security: easy to query; consistent. Cons / risks: rigid; may miss details. Judgement: suitable for trend analysis.
Unstructured – Example: incident text logs. Transformation used: keyword extraction. Pros for security: rich detail. Cons / risks: hard to automate. Judgement: supplementary use.
Use your table to justify your final judgement about which type(s) of data and transformations are most suitable for the college’s digital support and security needs.
Stage 4 - Mini Presentation (5 mins per group)
Each group presents:
Their chosen data type(s) and transformation(s)
The judgements made and the reasoning behind them
How their approach supports security operations (e.g., faster response, data reliability, GDPR compliance)
Presentation audience: The class (acting as the IT management team).
Extension / Differentiation
Stretch: Ask students to link their decision to real-world tools (e.g., Splunk, Wireshark, SQL Server, Power BI).
Support: Provide example datasets and a glossary of data types and transformation methods.
Students are asked to write a short paragraph in their own words:
“Explain how structured data, unstructured data, and data types interrelate in digital systems…”
This encourages them to summarise technical content in accessible language.
Explanation / justification
In group tasks, students present their normalisation decisions and justify their reasoning for the choices made.
In the decision-making task (3.4.4), they produce a short presentation, communicating technical decisions to a non-technical audience.
They must explain and identify relationships between concepts (structured vs unstructured, transformations, etc.) in the content.
Technical vocabulary use
The content introduces specific technical terms (“data type”, “decimal / floating point”, “BLOB”, “normalisation”, “transformation”) and students must use them correctly.
They must use terms like “structured / unstructured data”, “Boolean”, “date/time”, etc., in discussions and writing.
Oral communication / presentation skills
In the group work, students deliver a mini presentation (3–5 minutes) of their findings.
They must tailor explanations for both technical and non-technical audiences.
Reflection / metacognitive writing
There is a “reflection discussion” built in: students must reflect on the relationships between data types and transformations.
The “mini summary task” is also reflective: summarising learning in their own words.
Assessment:
Learning Outcomes:
Awarding Organisation Criteria:
Maths:
Maths opportunities
Understanding number types and properties
The lesson covers Integer (whole number) and Real/floating-point (decimal) types, discussing where to use them and their limitations (e.g. rounding).
Students must recognise when a value should be an integer vs a decimal, and understand fractional parts, rounding error, etc.
Conversions between types / transformations
Converting between string → integer or decimal (e.g. from text data to numeric) is a type-transformation task.
Reformatting dates, cleaning messy data, and combining/splitting fields (e.g. splitting a full name into parts) are forms of data transformation with structural and numerical aspects.
Data normalisation / structuring
The normalisation process involves decomposing data into tables, removing redundancy, and deciding how to structure numeric and non-numeric attributes. This involves logical structuring and thinking about dependencies, relationships and cardinalities (database theory, but mathematically informed).
Recognising which values should be stored as numeric or as text, and how that affects aggregations, comparisons, sorting etc.
Quantitative reasoning / comparisons
In judging which transformations or data types are most “suitable”, students implicitly compare options based on numeric criteria (precision, error risk, storage cost, performance) – e.g. floating vs fixed vs integer precision trade-offs.
They must reason about the pros and cons (trade-offs) of different representations, which involves quantitative thinking: which method gives more precise numeric behaviour, which is more efficient, and so on.
Logical / Boolean reasoning
The Boolean data type (true/false) is itself a mathematical/logical concept; using it in system flags, comparisons, conditional logic.
Students must reason about when to use Boolean vs other types, and how Boolean logic underlies many system decisions (on/off, pass/fail).
Stretch and Challenge:
E&D / BV
Homework / Extension:
ILT
Week 5
T&L Activities:
3.5 Data formats
3.5.1 Know the definition of common data formats and understand their purpose and when each is used
In digital systems, data must be stored, exchanged, and interpreted in ways that both humans and machines can understand. To achieve this, information is organised using data formats. A data format defines how data is represented, encoded, and structured. Some formats focus on being lightweight and easy to parse by machines, while others are more human-readable or better suited for specific applications.
Choosing the correct data format is essential: it affects compatibility, performance, storage requirements, and security. For example, structured formats like JSON and XML are ideal for web communication, while simple formats like CSV or text files are better for raw storage or simple data transfer. Encodings like UTF-8 and ASCII ensure that text is represented consistently across devices and platforms.
Definitions, Purposes, and Uses
1. JSON (JavaScript Object Notation)
Definition: A lightweight text-based format for representing structured data using key–value pairs and arrays.
Purpose & Use: Commonly used for web APIs, configuration files, and data interchange between client and server. Easy to read by humans and parse by machines.
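For illustration, a small JSON document and one way to parse it in Python's standard library (the field names are invented):

import json

# A small JSON document: key-value pairs and an array.
raw = '{"user": "JamesF", "age": 22, "roles": ["student", "helpdesk"]}'

data = json.loads(raw)      # parse the text into a Python dictionary
print(data["user"])         # JamesF
print(data["roles"][0])     # student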
2. Text Files (TXT)
Definition: A file containing unformatted plain text, typically encoded in ASCII or UTF-8.
Purpose & Use: Used for notes, documentation, log files, or lightweight storage where structure isn’t required.
Examples: A .txt file storing error logs from a program.
Compatible Software: Notepad, WordPad, VS Code, Notepad++, Linux nano/vim.
3. CSV (Comma-Separated Values)
Definition: A plain text format where rows represent records and columns are separated by commas (or semicolons).
Purpose & Use: Ideal for tabular data (spreadsheets, databases) and for exporting/importing between systems.
Examples:
Name, Age, Department
John, 25, IT
Sarah, 30, HR
Compatible Software: Microsoft Excel, Google Sheets, LibreOffice Calc, Python (Pandas library), SQL import/export tools.
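One way to read such a file with Python's standard library (the file name staff.csv is an assumption):

import csv

# Reading the example table above from a file.
with open("staff.csv", newline="") as f:
    for row in csv.DictReader(f, skipinitialspace=True):
        print(row["Name"], row["Age"], row["Department"])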
4. UTF-8 (Unicode Transformation Format – 8-bit)
Definition: A character encoding capable of representing every character in the Unicode standard using 1–4 bytes.
Purpose & Use: Global standard for web and modern applications; supports multiple languages and symbols.
Examples: A UTF-8 file can contain English, Arabic, Chinese, and emojis in the same document.
Compatible Software: Modern browsers, Linux/Windows/Mac OS systems, text editors (VS Code, Sublime), databases (MySQL, PostgreSQL).
5. ASCII (American Standard Code for Information Interchange)
Definition: An older encoding system representing characters using 7-bit binary (128 possible characters).
Purpose & Use: Used for basic text files, programming, and communication protocols where extended character sets are unnecessary.
Examples: ASCII encodes ‘A’ as 65.
Compatible Software: Legacy systems, early internet protocols (SMTP, FTP), C/C++ compilers, terminal applications.
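A small Python sketch illustrating both encodings discussed above:

# ASCII and UTF-8 agree on basic Latin characters...
print(ord("A"))              # 65 - the ASCII code for 'A'
print("A".encode("utf-8"))   # b'A' - a single byte, identical to ASCII

# ...but UTF-8 extends to the full Unicode range using 1-4 bytes.
print("€".encode("utf-8"))   # b'\xe2\x82\xac' - three bytes
print("€".encode("ascii", errors="replace"))   # b'?' - ASCII cannot represent it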
6. XML (eXtensible Markup Language)
Definition: A markup language that uses custom tags to define and store structured data in a hierarchical tree format.
Purpose & Use: Common for configuration files, data interchange, and web services (SOAP, RSS feeds). More verbose than JSON but supports complex structures.
Examples:
<student>
    <name>Alex</name>
    <age>22</age>
    <role>student</role>
</student>
Compatible Software: Web browsers, Microsoft Excel (XML data maps), Apache web services, Java DOM parsers, .NET applications.
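One way to parse the example record with Python's standard library:

import xml.etree.ElementTree as ET

# Parsing the example student record above into a navigable tree.
doc = "<student><name>Alex</name><age>22</age><role>student</role></student>"
root = ET.fromstring(doc)
print(root.find("name").text)       # Alex
print(int(root.find("age").text))   # 22 - tag text converted to an integer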
3.5.2 Understand the interrelationships between data format and data transformation, and make judgements about the suitability of using data formats in digital support and security.
Files that support this week
English:
Assessment:
Learning Outcomes:
Awarding Organisation Criteria:
Maths:
Stretch and Challenge:
E&D / BV
Homework / Extension:
ILT
Week 6
T&L Activities:
3.6 Structures for storing data
3.6.1 Understand the role of metadata in providing descriptions and contexts for data.
When data is created, stored, or transmitted, it often needs additional information to make it meaningful and useful. This is where metadata comes in. Metadata is often described as “data about data.” It provides descriptions, context, and structure that help people and systems understand, manage, and organise the main data.
Without metadata, a file, dataset, or digital object would just be raw content with no clear meaning. For example, a photo file would only contain pixel data, but metadata can add context such as when it was taken, who took it, the camera settings, and even GPS location. This descriptive information makes data easier to search, retrieve, interpret, and manage.
Definition and Purpose of Metadata
Definition: Metadata is information that describes the characteristics, properties, or context of data. It does not alter the data itself but provides supporting details that enhance understanding and usability.
Purpose:
To give context (e.g., who created the data, when, and why).
To aid organisation and retrieval (e.g., library catalogues, search engines).
To support data governance and security (e.g., permissions, classification).
To provide interoperability across systems (e.g., file sharing between applications).
Roles, Uses, and Examples of Metadata
1. Descriptive Metadata
Role: Provides information about the content.
Use: Used in catalogues, search engines, and digital libraries to help users find resources.
Example: A library entry describing a book’s title, author, and ISBN.
Part 1 – Explore Metadata (10 mins)
In small groups (2–3 students), open different types of files on your computer (e.g., Word document, PDF, photo, or MP3 file).
Right-click the file and check Properties (Windows) or Get Info (Mac).
Record the metadata you can find, such as:
- Author/creator
- Date created/modified
- File size
- Keywords/tags
- Technical details (resolution, encoding, etc.)
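The same kind of metadata can also be read programmatically; a short Python sketch (the file name is an assumption, so point it at any real file):

import os
from datetime import datetime

info = os.stat("report.docx")
print(info.st_size)                            # file size in bytes
print(datetime.fromtimestamp(info.st_mtime))   # date/time last modified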
Part 2 – Research Case Studies (10–15 mins)
Research one real-world case study where metadata is essential. Examples could include:
- Photography – how EXIF metadata (camera settings, GPS location) is used in photo management or digital forensics.
- Music/Film – how metadata in MP3s/MP4s allows Spotify or Netflix to categorise and recommend content.
- Cybersecurity – how hidden metadata in documents (e.g., author names in leaked Word/PDF files) has exposed sensitive information.
- Libraries & Archives – how descriptive metadata helps catalogues and digital archives stay searchable.
Prepare 2–3 key points from your chosen case study to share.
Part 3 – Present Your Findings (10–15 mins)
Each group should prepare a short presentation (3–4 minutes) covering:
- Definition: What metadata is in your own words.
- Examples: Metadata you found in your own files.
- Case Study: The real-world use of metadata you researched.
- Impact: Why metadata is valuable in making data more useful and reliable.
Stretch / Challenge Task
Discuss as a group: Can metadata ever be a risk? (e.g., GPS location data in photos uploaded online, exposing personal info).
Suggest one security measure organisations can use to manage metadata safely.
3.6.2 Know the definition of file-based and directory-based structures and understand their purposes and when they are used.
All digital systems must store and organise data in ways that make it easy to access, manage, and retrieve. Two of the most common organisational models are file-based structures and directory-based structures.
A file-based structure focuses on storing data in individual, stand-alone files. Each file is independent and may not directly connect with other files, meaning data can be duplicated or difficult to share between systems.
A directory-based structure is more organised, using folders (directories) and subfolders (subdirectories) to group related files. This hierarchy makes it easier to navigate and manage large sets of data.
Both approaches are still used today, and the choice depends on data complexity, collaboration needs, and the scale of storage required.
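A short Python sketch of the contrast (file and folder names are invented for illustration):

from pathlib import Path

# File-based: independent, stand-alone files side by side.
flat = [Path("Jan_sales.xlsx"), Path("Feb_sales.xlsx")]

# Directory-based: a hierarchy of folders grouping related files.
report = Path("Projects") / "2025" / "ClientX" / "Reports" / "summary.pdf"
print(report.parent)   # Projects/2025/ClientX/Reports
print(report.name)     # summary.pdf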
File-Based Structures
Definition
A storage model where data is stored in independent files, often with no enforced relationships between them. Each file is self-contained.
Purpose & Use
Simple and low-cost way to store and access data.
Common for personal use, small systems, or applications where data doesn’t need to be shared widely.
Used when performance and simplicity are more important than complex data relationships.
Examples & Case Studies
Case Study 1 – Small Business Accounting:
A local shop saves all its sales records in Excel spreadsheets and stores them as individual files (e.g., Jan_sales.xlsx, Feb_sales.xlsx). This is easy to set up but leads to duplication of customer details and makes cross-checking totals more time-consuming.
Case Study 2 – Medical Practice (Legacy Systems):
An older clinic database saves each patient’s record in a separate file. This makes searching slow and creates issues when patients have multiple files across departments.
Software Examples
Microsoft Excel / Access (file-based storage)
CSV or text files in data logging systems
Legacy business systems
Directory-Based Structures
Definition
A hierarchical storage model where files are grouped into directories (folders) and subdirectories, providing a structured way to organise information.
Purpose & Use
Provides a clear hierarchy and reduces duplication.
Easier navigation and searching across large datasets.
Common in operating systems, enterprise systems, and cloud storage where data is shared and must be controlled.
Examples & Case Studies
Case Study 1 – Corporate File Server:
An IT company uses a shared drive with directories like Projects > 2025 > ClientX > Reports. This makes it simple for teams to collaborate while keeping data well organised. Metadata (permissions, timestamps) helps manage access.
Case Study 2 – University Learning Platform:
A university stores student submissions in directories by course and module (Course > Module > StudentID). This ensures work is easy to locate and secure.
Case Study 3 – Cloud Collaboration (Google Drive/SharePoint):
Teams working remotely store documents in shared directories, ensuring all members see the same updated files without creating multiple versions.
You are going to investigate the difference between file-based and directory-based structures, using the case studies provided. Your task is to show your understanding by applying real-world reasoning and producing a short written or visual response.
Instructions
Part 1 – Compare the Structures (10 mins)
1. Write down two key features of file-based structures.
2. Write down two key features of directory-based structures.
3. Explain in your own words why a small business (e.g., local shop with sales spreadsheets) might choose a file-based structure instead of a directory-based one.
4. Explain why a university or IT company would prefer directory-based storage instead of file-based.
Part 2 – Case Study Scenarios (10 mins)
For each scenario below, decide whether a file-based structure or a directory-based structure would be best. Write 2–3 sentences explaining your choice.
Scenario A: A freelance photographer saves all their client photos. Each photoshoot needs to be kept separate but easy to find later.
Scenario B: A multinational corporation needs to share HR records across several countries, with access restrictions for different teams.
Scenario C: A student keeps lecture notes on their personal laptop. Each week’s notes are saved in Word files.
Part 3 – Reflection (5 mins)
In one short paragraph, explain which structure you personally use most often (on your own computer, cloud storage, or phone).
Why does that structure suit your needs?
Output Options
You can present your work as:
A written response (1–2 pages).
A diagram or mind map comparing file vs directory structures with examples.
3.6.3 Know the definition of hierarchy-based structure and understand its purpose and when it is used.
3.6.4 Understand the interrelationships between storage structures and data transformation.
Files that support this week
English:
Assessment:
Learning Outcomes:
Awarding Organisation Criteria:
Maths:
Stretch and Challenge:
E&D / BV
Homework / Extension:
ILT
Week 7
T&L Activities:
3.7 Data dimensions and maintenance
3.7.1 Know the definitions of the six Vs (dimensions) and understand the six Vs (dimensions) of Big Data and their impact on gathering, storing, maintaining and processing:
• volume
• variety
• variability
• velocity
• veracity
• value.
3.7.2 Know the definition of Big Data and understand that it has multiple dimensions.
3.7.3 Understand the impact of each dimension on how data is gathered and maintained.
3.7.4 Know the definitions of data quality assurance methods and understand their purpose and when each is used:
• validation
• verification
• reliability
• consistency
• integrity
• redundancy.
3.7.5 Know and understand factors that affect how data is maintained:
• time
• skills
• cost.
3.7.6 Understand the interrelationships between the dimensions of data, quality assurance methods and factors that impact how data is maintained and make judgements about the suitability of maintaining, transforming and quality assuring data in digital support and security.
Files that support this week
English:
Assessment:
Learning Outcomes:
Awarding Organisation Criteria:
Maths:
Stretch and Challenge:
E&D / BV
Homework / Extension:
ILT
Week 8
T&L Activities:
3.8 Data systems
3.8.1 Know the definition of data wrangling and understand its purpose and when it is used.
3.8.2 Know and understand the purpose of each step of data wrangling:
• structure
• clean
• validate
• enrich
• output.
3.8.3 Know and understand the purpose of each core function of a data system:
• input
• search
• save
• integrate
• organise (index)
• output
• feedback loop.
3.8.4 Know the types of data entry errors and understand how and why they occur:
• transcription errors
• transposition errors.
3.8.5 Know and understand methods to reduce data entry errors:
• validation of user input
• verification of user input by double entry
• drop-down menus
• pre-filled data entry boxes.
3.8.6 Know and understand the factors that impact implementation of data entry:
• time needed to create the screens
• expertise needed to create screens
• time needed to enter the data.
3.8.7 Understand the relationship between factors that impact data entry and data quality and make judgements about the suitability of methods to reduce data entry errors in digital support and security.
3.8.8 Understand the relationship between factors that impact implementation of data entry and make judgements about the suitability of implementing data entry in digital support and security.
Files that support this week
English:
Assessment:
Learning Outcomes:
Awarding Organisation Criteria:
Maths:
Stretch and Challenge:
E&D / BV
Homework / Extension:
ILT
Week 9
T&L Activities:
3.9 Data visualisation
3.9.1 Know and understand data visualisation formats and when they are used:
• graphs
• charts
• tables
• reports
• dashboards
• infographics.
3.9.2 Know and understand the benefits and drawbacks of data visualisation formats based on:
• type of data
• intended audience
• brief.
Files that support this week
English:
Assessment:
Learning Outcomes:
Awarding Organisation Criteria:
Maths:
Stretch and Challenge:
E&D / BV
Homework / Extension:
ILT
Week 10
T&L Activities:
3.10 Data models
3.10.1 Know the types of data models and understand how they organise data into structures:
• hierarchical
• network
• relational.
3.10.2 Know and understand the factors that impact the selection of data model for organising data:
• efficiency of accessing individual items of data
• efficiency of data storage
• level of complexity in implementation.
3.10.3 Understand the benefits and drawbacks of different data models and make judgements about the suitability of data models based on efficiency and complexity.
3.10.4 Be able to draw and represent data models:
• hierarchical models with blocks, arrows and labels
• network models with blocks, arrows and labels
• relational models with tables, rows, columns and labels.
Files that support this week
English:
Assessment:
Learning Outcomes:
Awarding Organisation Criteria:
Maths:
Stretch and Challenge:
E&D / BV
Homework / Extension:
ILT
Week 11
T&L Activities:
3.11 Data access across platforms
3.11.1 Understand the features, purposes, benefits and drawbacks of accessing data across platforms:
• permissions
o authorisation
o privileges
o access rights
o rules
• access mechanisms:
o role-based access (RBAC)
o rule-based access control (RuBAC)
o Application Programming Interfaces (API).
3.11.2 Know and understand the benefits and drawbacks of methods to access data across platforms.
3.11.3 Understand the interrelationships between data access requirements and data access methods and make judgements about the suitability of accessing data in digital support and security.
Files that support this week
English:
Assessment:
Learning Outcomes:
Awarding Organisation Criteria:
Maths:
Stretch and Challenge:
E&D / BV
Homework / Extension:
ILT
Week 12
T&L Activities:
3.12 Data analysis tools
3.12.1 Know data analysis tools and understand their purpose and when they are used:
• storing Big Data for analysis:
o data warehouse
o data lake
o data mart
• analysis of data:
o data mining
o reporting
• use of business intelligence gained through analysis:
o financial planning and analysis
o customer relationship management (CRM):
– customer data analytics
– communications.
3.12.2 Understand the interrelationships between data analysis tools and the scale of data