Week 1

3.1 Data, information and knowledge

3.1.1 The differences and relationships between data, information and knowledge.

1. Data

What it is:
Data is the raw facts and figures. On its own, it doesn’t have much meaning until it’s organised or processed.
Example (Digital Support & Security):
Imagine a server log that records every login attempt. A single line might look like this:
2025-09-01 09:45:12 | User: JamesF | Login: Failed
On its own, that’s just one piece of data.

2. Information

What it is:
When data is processed, organised, or put into context so it makes sense, it becomes information. Information answers questions like “who?”, “what?”, “where?”, “when?”.
Example (Digital Support & Security):
If you take the server log data and count how many failed login attempts happened in the last 24 hours, you might discover:
“There were 45 failed login attempts from 5 different IP addresses on the college’s network.”
This is information because it’s structured and tells you something meaningful.
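As a rough sketch of how data becomes information, the Python below counts the failed logins in a few log lines. The log format extends the example line above with a hypothetical IP field, and the sample entries and the function name summarise_failed_logins are made up for illustration.

```python
# Hypothetical log lines: the "IP" field is an assumption added for this example.
SAMPLE_LOG = [
    "2025-09-01 09:45:12 | User: JamesF | IP: 10.0.0.5 | Login: Failed",
    "2025-09-01 09:45:20 | User: JamesF | IP: 10.0.0.5 | Login: Failed",
    "2025-09-01 09:46:02 | User: AishaK | IP: 10.0.0.9 | Login: Success",
    "2025-09-01 09:47:31 | User: TomB | IP: 10.0.0.7 | Login: Failed",
]


def summarise_failed_logins(lines):
    """Return (number of failed logins, number of distinct source IPs)."""
    failed = 0
    ips = set()
    for line in lines:
        # Split "a | b | c" into fields, then each "Key: value" into a pair.
        fields = [part.strip() for part in line.split("|")]
        record = dict(field.split(": ", 1) for field in fields[1:])
        if record["Login"] == "Failed":
            failed += 1
            ips.add(record["IP"])
    return failed, len(ips)


failed, ip_count = summarise_failed_logins(SAMPLE_LOG)
print(f"There were {failed} failed login attempts from {ip_count} different IP addresses.")
```

Each line on its own is data; the single summary sentence printed at the end is information.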

3. Knowledge

What it is:
Knowledge is when you analyse and interpret the information to make decisions or take action. It answers “why?” and “how?”.
Example (Digital Support & Security):
From the information about 45 failed login attempts from 5 IPs, you recognise a possible brute-force attack on student accounts. You know this because of your training in cybersecurity, which tells you that multiple failed logins from a small set of IP addresses are a common threat indicator.
Using this knowledge, you might:
- Block those IP addresses in the firewall.
- Alert the IT security team.
- Review authentication logs for suspicious activity.
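Knowledge like this can sometimes be written down as a simple rule. The sketch below is one possible version, not a recommended policy: the threshold MAX_FAILURES_PER_IP, the function name suspicious_ips and the sample IP addresses are all assumptions for illustration.

```python
from collections import Counter

# Hypothetical threshold: a real team would tune this to their own network.
MAX_FAILURES_PER_IP = 5


def suspicious_ips(failed_login_ips, max_failures=MAX_FAILURES_PER_IP):
    """Return the IPs with more failed logins than the threshold, sorted."""
    counts = Counter(failed_login_ips)
    return sorted(ip for ip, n in counts.items() if n > max_failures)


# Made-up data: one source IP per failed login attempt.
failed_ips = ["203.0.113.4"] * 9 + ["203.0.113.8"] * 7 + ["10.0.0.5"] * 2
for ip in suspicious_ips(failed_ips):
    print(f"Possible brute-force source: {ip} (consider blocking and alerting security)")
```

The rule only suggests where to look. A person still decides whether to block the address or investigate further.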

3.1.2 The sources for generating data:

Human (Surveys, Forms)

Humans generate data whenever they give information directly – for example, filling in a form, survey, or feedback questionnaire. This is usually self-reported data (what a person chooses to share).

(Digital Support & Security):
A college IT support team might send students a survey about how secure they feel when using online learning platforms. The answers (Yes/No, ratings out of 5, written comments) are data that can be collected and analysed.

Artificial Intelligence (AI) / Machine Learning – Dangers of Feedback Loops

AI and machine learning systems create data as they learn from user behaviour. A feedback loop happens when the AI uses its own output as new input, which can lead to bias or errors being reinforced.

(Digital Support & Security):
A cybersecurity monitoring tool that uses machine learning to detect suspicious logins could wrongly flag normal student behaviour (like logging in late at night) as a threat. If those false alarms are fed back into the system as “evidence,” it may become overly strict and block real students from logging in.
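The toy model below is one way to picture a feedback loop. It is not how any real security product works: the rule (flag a login hour that has been seen as normal fewer than min_seen times), the baseline counts and the name run_detector are all assumptions made up for this sketch.

```python
from collections import Counter


def run_detector(logins, review_false_alarms, min_seen=3):
    """Count how many logins get flagged.

    A login hour is flagged when it has been seen as 'normal' fewer than
    min_seen times. If review_false_alarms is False, flagged logins are never
    added to the 'normal' history, so the model's own output shapes its input.
    """
    normal_history = Counter({9: 50, 13: 40})  # made-up daytime baseline
    flagged = 0
    for hour in logins:
        if normal_history[hour] < min_seen:
            flagged += 1
            if review_false_alarms:
                # A person checks the alert, confirms it is a real student,
                # and the login is recorded as normal.
                normal_history[hour] += 1
        else:
            normal_history[hour] += 1
    return flagged


late_night_logins = [23] * 10  # one student logging in at 11 p.m. ten times
print("Flagged without review:", run_detector(late_night_logins, review_false_alarms=False))
print("Flagged with review:   ", run_detector(late_night_logins, review_false_alarms=True))
```

Without review, every late-night login stays flagged because the system only ever learns from its own alarms. With a person checking the alerts, the flags stop once the behaviour has been confirmed as normal a few times.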

Sensors (Temperature, Accelerometer, Vibration, Sound, Light, Pressure)

Sensors collect data from the environment. They measure physical things like heat, movement, sound, or light.

(Digital Support & Security):
In a server room at college, temperature sensors monitor if equipment is overheating. If the temperature goes above a safe level, the system can trigger an alert to the IT support team before the servers shut down.
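A threshold check like this can be sketched in a few lines of Python. The 25°C limit, the sample readings and the function name check_temperature are assumptions for illustration; a real monitoring system would read from the sensor and send the alert somewhere useful.

```python
# Assumed safe limit for this example; a real server room would set its own.
SAFE_LIMIT_C = 25.0


def check_temperature(reading_c, limit_c=SAFE_LIMIT_C):
    """Return an alert message if the reading is above the limit, else None."""
    if reading_c > limit_c:
        return f"ALERT: server room at {reading_c:.1f}°C (limit {limit_c:.1f}°C)"
    return None


for reading in [21.5, 24.9, 35.0]:  # made-up sensor readings
    message = check_temperature(reading)
    if message:
        print(message)
```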

Internet of Things (IoT) – Smart Objects

IoT devices are everyday objects connected to the internet (e.g., smart lights, thermostats, security cameras). They collect and send data automatically.

(Digital Support & Security):
A college might use smart security cameras that detect movement and send alerts to the IT team’s dashboard. This data helps keep the campus safe, but IT staff must also secure the devices to stop hackers from gaining access.

Transactions (Customer Data, Membership, Timing, Basket)

Every time someone buys something, signs up for a service, or logs into a system, data is generated. Transactions create a digital footprint.

(Digital Support & Security):
When a student pays for a college esports event online, the system records:

- The student’s name
- Payment method
- Date & time
- Items purchased (e.g., entry ticket + team jersey)

This transaction data must be stored securely to comply with data protection laws (like GDPR) and to prevent cybercriminals from stealing card details.

"Data Detectives"
Scenario:
You’re part of a college IT support team. The college wants to improve security and gather data from different sources. Below are some situations.

Task (10 mins):
For each scenario, identify:
What is the raw data?
What information can you get from that data?
What knowledge or decisions could the IT team make?

Scenarios:
1 - A survey sent to 200 students asks: “Do you feel safe using the college Wi-Fi?” 120 say Yes, 80 say No.
2 - A machine learning tool notices that a student logs into the network at 2 a.m. every night and flags it as unusual.
3 - A server room temperature sensor records 35°C at 3:00 p.m. (normal temperature should be under 25°C).
4 - The college installs smart locks on computer labs that record every time someone enters or leaves.
5 - The esports society’s online shop records that most students buy merchandise around payday (the 28th of each month).

Extension (5 mins):
Find an example of an IoT device that could be used in a school or esports setting.
Describe what data it collects, what information it provides, and what knowledge the IT team could gain from it.

3.1.3 Ethical data practices and the metrics to determine the value of data:

Ethical Data Practices & Metrics to Determine Data Value

Before we dive into the metrics, remember:
Ethical data practice means collecting, storing, and using data responsibly.
This includes:

Getting permission (consent) from users.
Protecting data from cyberattacks.
Not misusing personal information.
Following laws like GDPR (General Data Protection Regulation), which applies in the EU and, as the UK GDPR alongside the Data Protection Act 2018, in the UK.

Now, let’s explore the metrics used to decide how valuable data is.

Quantity

Quantity refers to the amount of data collected. More data can help identify patterns more accurately.

(Digital Support & Security):
Compare a dataset of 10 login attempts with one of 10,000. The larger dataset is more valuable because it shows broader patterns (e.g., which times of day attacks are most common).

Don’t collect more data than necessary – only gather what’s useful (“data minimisation” under GDPR).

Timeframe

Timeframe is about when the data was collected and how long it remains relevant. Recent data is often more valuable than old data.

(Digital Support & Security):
A log of failed Wi-Fi logins from yesterday is more useful for spotting a live cyberattack than logs from 2019.

Don’t keep data longer than necessary. For example, student support tickets might be deleted after a year once resolved.
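A retention rule like this could be checked automatically. The sketch below assumes a one-year retention period and a made-up ticket structure (an id and a resolved_on date); the name tickets_to_delete is hypothetical too.

```python
from datetime import date, timedelta

RETENTION = timedelta(days=365)  # assumed policy: keep resolved tickets for one year


def tickets_to_delete(tickets, today):
    """Return IDs of resolved tickets whose resolution date is past the retention period."""
    return [
        t["id"]
        for t in tickets
        if t["resolved_on"] is not None and today - t["resolved_on"] > RETENTION
    ]


tickets = [  # made-up tickets
    {"id": 101, "resolved_on": date(2024, 3, 1)},
    {"id": 102, "resolved_on": date(2025, 6, 15)},
    {"id": 103, "resolved_on": None},  # still open
]
print(tickets_to_delete(tickets, today=date(2025, 9, 1)))
```

Open tickets are never selected, because the retention period only starts once a ticket has been resolved.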

Source

The value of data depends on where it comes from and how trustworthy the source is.

(Digital Support & Security):
A random spreadsheet emailed by an unknown user is unreliable (it could be fake or manipulated).

Always check sources and avoid using stolen or illegally obtained data.

Veracity

Veracity means the accuracy and truthfulness of data. Data full of errors or lies is less valuable.

(Digital Support & Security):
If students fill in a survey about cyber safety and many joke by giving fake answers (“My password is 123456”), the veracity of the data is low, so the results can’t be trusted.

Organisations should clean and validate data, and not mislead people by presenting false or incomplete results.
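One simple form of validation is to throw out answers that cannot be real. The sketch below assumes a survey question answered with a rating from 1 to 5; the function name validate_rating and the sample answers are made up for illustration.

```python
def validate_rating(raw):
    """Return the rating as an int if it is a whole number from 1 to 5, else None."""
    try:
        rating = int(str(raw).strip())
    except ValueError:
        return None
    return rating if 1 <= rating <= 5 else None


raw_answers = ["4", " 5 ", "11", "lol", "3", ""]  # made-up survey responses
valid = [r for r in map(validate_rating, raw_answers) if r is not None]
print(f"{len(valid)} of {len(raw_answers)} answers are usable; average rating {sum(valid) / len(valid):.1f}")
```

Reporting how many answers were rejected, as well as the average, helps avoid presenting the results as more reliable than they are.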

3.1.4 How organisations use data and information:

Analysis to Identify Patterns

Organisations look at large sets of data to find trends, behaviours, or repeated issues. Patterns help predict future events and improve decision-making.

The IT support team analyses helpdesk tickets and notices that every Monday morning, many students report Wi-Fi login problems. The pattern suggests that systems might need restarting after the weekend.
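Spotting a pattern like the Monday one can be as simple as counting tickets by day of the week. The ticket dates below are made up for illustration.

```python
from collections import Counter
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Made-up dates on which Wi-Fi login tickets were raised.
ticket_dates = [
    date(2025, 9, 1), date(2025, 9, 1), date(2025, 9, 3),
    date(2025, 9, 8), date(2025, 9, 8), date(2025, 9, 8), date(2025, 9, 12),
]

# date.weekday() returns 0 for Monday through 6 for Sunday.
by_weekday = Counter(DAY_NAMES[d.weekday()] for d in ticket_dates)
busiest_day, ticket_count = by_weekday.most_common(1)[0]
print(f"Busiest day: {busiest_day} ({ticket_count} of {len(ticket_dates)} tickets)")
```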

Google analyses search trends (e.g., millions of people suddenly searching for the same issue). This helps them detect outbreaks of cyberattacks or bugs spreading online.

System Performance Analysis (Load, Outage, Throughput, Status)

Organisations monitor how well their systems are running:

Load – how much demand is placed on the system (e.g., number of users).
Outage – when systems go down or stop working.
Throughput – how much data or traffic can pass through the system.
Status – current health of servers, networks, or applications.

An esports tournament hosted at a college requires fast servers. The IT team monitors server load and bandwidth usage during live matches. If the system slows down, they can add more resources to avoid crashes.
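Two of these measures can be worked out with basic arithmetic. The sketch below uses made-up figures for one minute of a tournament, and the capacity of 200 users is an assumption; the function names are hypothetical.

```python
def throughput_mbps(bytes_transferred, seconds):
    """Average throughput in megabits per second over the interval."""
    return bytes_transferred * 8 / seconds / 1_000_000


def load_percent(connected_users, capacity):
    """Current users as a percentage of the (assumed) maximum the server supports."""
    return 100 * connected_users / capacity


# Made-up figures for one minute of a tournament.
print(f"Throughput: {throughput_mbps(750_000_000, 60):.0f} Mbps")
print(f"Load: {load_percent(180, 200):.0f}%")
```

Note the multiplication by 8: file sizes are usually given in bytes, but network speeds are usually given in bits per second.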

Amazon Web Services (AWS) constantly monitors its cloud servers. If a data centre goes down, traffic is automatically re-routed to another server to prevent downtime for customers.

User Monitoring (Login/Logout, Resources Accessed)

Organisations track user activity to ensure systems are being used correctly and securely.

A college IT team monitors who logs into the Virtual Learning Environment (VLE). If a student logs in from two countries within the same hour, it may indicate a hacked account.

Microsoft 365 monitors user logins across the world. If an account logs in from London and then five minutes later from New York, it may block the login and alert security teams.
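This kind of check is often called "impossible travel". The sketch below shows the general idea only and is not how Microsoft 365 implements it: the speed limit MAX_PLAUSIBLE_KMH, the function names and the approximate city coordinates are assumptions for illustration.

```python
from math import asin, cos, radians, sin, sqrt

MAX_PLAUSIBLE_KMH = 900  # assumption: roughly the cruising speed of a passenger jet


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points using the haversine formula."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))  # 6371 km is the Earth's mean radius


def is_impossible_travel(loc_a, loc_b, minutes_apart):
    """True if moving between the two login locations would need an implausible speed."""
    km = distance_km(*loc_a, *loc_b)
    if minutes_apart <= 0:
        return km > 0  # two different places at the same moment
    return km / (minutes_apart / 60) > MAX_PLAUSIBLE_KMH


LONDON = (51.5074, -0.1278)      # approximate (latitude, longitude)
NEW_YORK = (40.7128, -74.0060)
print(is_impossible_travel(LONDON, NEW_YORK, minutes_apart=5))
```

A result of True is a warning sign, not proof of a hacked account. A student using a VPN can also appear to jump between countries.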

Targeted Marketing (Discounts, Upselling)

Organisations use data about customer behaviour to send personalised offers, suggest upgrades, or advertise products people are likely to buy.

A college esports society collects data on what students buy in the online shop. If a student buys a gaming jersey, they might get an email offering a discount on a matching mousepad.

Steam (Valve) analyses what games you play and recommends new titles you’re likely to enjoy. They also send personalised sale notifications to encourage more purchases.

Threat/Opportunity Assessment (Competitors, Security, Compliance)

Organisations analyse data to spot risks (threats) or advantages (opportunities). This can relate to cybersecurity, business competition, or legal compliance.

The IT security team compares data about phishing attempts with government alerts from the NCSC (National Cyber Security Centre). If a new type of phishing attack is targeting colleges, they can prepare staff with updated training – turning a threat into an opportunity to strengthen security.

NCSC (UK) collects data on cyber incidents across the country. They publish reports on new cyber threats, which organisations use to improve security and stay compliant with regulations like GDPR.

"Data in Action"

Scenario:
You are working in the IT support and security team for a college esports club. You have access to the following datasets:

1 - Login records: Show that some students are logging in at 3 a.m. from outside the UK.
2 - Server stats: During last Friday’s tournament, the main game server slowed down when 200 players connected at once.
3 - Shop sales: Jerseys sell out every time there’s a big tournament, but headsets don’t sell as well.
4 - Competitor data: Another nearby college just announced a new gaming lab with high-spec PCs.

Task:
1 - Analysis to Identify Patterns:
Which dataset shows a repeated trend?
What pattern do you see?

2 - System Performance:
Which dataset shows a system issue?
What actions should IT take to prevent it happening again?

3 - User Monitoring:
What do the login records tell you?
What security risks do they suggest?

4 - Targeted Marketing:
How could the esports club use the shop sales data to increase revenue?

5 - Threat/Opportunity Assessment:
How should the club respond to the competitor’s new gaming lab?

Extension:
Research how a company like Netflix or Amazon uses data to recommend products or detect suspicious activity.
Share your findings with the group.

3.1.5 Interrelationships between data, information and the way it is generated and make judgements about the suitability of data, information and the way it is generated in digital support and security.

What this means

Data = raw facts or figures (numbers, logs, text, clicks, etc.) without context.
Information = processed, organised, and meaningful data that helps people make decisions.
Way it is generated = how the data is collected (e.g. login records, surveys, sensors, monitoring tools).

These three parts are linked together:

The way data is generated determines the type and quality of the data you get.
That raw data needs to be processed and organised.
Once processed, the data becomes information that can be used to make decisions.

If the data is incomplete, biased, or collected in the wrong way, the information may not be suitable for decision-making.

"A College Cybersecurity Incident Response"

Scenario:
A UK college notices that some students’ accounts have been logging in at unusual times. The IT security team collects data from three different sources:

1 - Login/Logout Records (system generated data)
2 - Firewall Logs (network traffic data, showing unusual connections from overseas IPs)
3 - Incident Reports (manually generated by staff when they notice suspicious behaviour)

How the interrelationships work:

Data:
- Login records show timestamps, usernames, and IP addresses.
- Firewall logs capture packet traffic and potential intrusion attempts.
- Staff reports note suspicious emails and students complaining about locked accounts.
Information (processed data):
- Combining the login timestamps with IP addresses shows multiple students logging in from a single overseas location at odd hours.
- Staff reports confirm phishing emails were sent to many accounts the day before.
Suitability of Data:
- Login data: Useful and reliable, but could be misleading if students use VPNs.
- Firewall logs: Provide technical detail, but require expertise to interpret.
- Staff reports: Subjective, but add valuable context about user behaviour.
Judgement:
The most suitable data in this case is the combination of automated system logs (objective, timestamped evidence) and user-reported incidents (human context). Relying on only one source could lead to misinterpretation (e.g. mistaking a VPN for a hacker).
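The step of combining login timestamps with IP addresses could look something like the sketch below. The login records, the "odd hours" window of midnight to 5 a.m. and the function name shared_odd_hour_ips are all assumptions made up for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Made-up login records: (timestamp, username, source IP).
logins = [
    ("2025-09-01 02:14:00", "student01", "198.51.100.7"),
    ("2025-09-01 02:31:00", "student02", "198.51.100.7"),
    ("2025-09-01 03:05:00", "student03", "198.51.100.7"),
    ("2025-09-01 10:20:00", "student04", "10.0.0.12"),
]


def shared_odd_hour_ips(records, start_hour=0, end_hour=5, min_users=2):
    """Map each IP to the users who logged in from it between start_hour and end_hour."""
    users_by_ip = defaultdict(set)
    for timestamp, user, ip in records:
        hour = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S").hour
        if start_hour <= hour < end_hour:
            users_by_ip[ip].add(user)
    # Keep only IPs shared by several different accounts.
    return {ip: sorted(users) for ip, users in users_by_ip.items() if len(users) >= min_users}


print(shared_odd_hour_ips(logins))
```

The output is only as trustworthy as the logs behind it, which is why the staff reports still matter for context.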

Real-World Industry Example

NHS Digital (at the time, the national IT body for the health service in England) collected data from hospital IT systems about cyber incidents.

In 2017’s WannaCry ransomware attack, logs showed unusual traffic patterns while staff reported being locked out of systems.

By combining both machine data (network logs, malware signatures) and human-reported issues, NHS Digital was able to coordinate with cybersecurity agencies to restore services and improve future protections.

This demonstrates how data, information, and generation methods must work together to make correct security decisions.

"Data to Information Detective"

1 - Work in pairs or small groups.
2 - Read the case study above about the college cybersecurity incident.
3 - Answer the following questions together (10 minutes):

Data: List two types of raw data the IT team collected. Why is each useful?
Information: How did the IT team turn the raw data into useful information?
Suitability: Which source of data (login logs, firewall logs, or staff reports) do you think is most reliable for making security decisions? Why?
Judgement: If you were the IT manager, what actions would you take based on the information gathered? (E.g., resetting passwords, training, blocking IP addresses.)

Extension:
(Optional challenge if time allows, 5 minutes):
Think of a real organisation (like a bank, online shop, or gaming company).

What kind of data do they collect?
How do they turn it into information?
What threats or opportunities might this create?

Output:
Each group should share one key insight with the class about why it’s important to think about both the data itself and how it’s generated when making digital support or security decisions.

Last Updated
2025-09-01 12:45:40
