Week 2
3.2 Methods of transforming data
3.2.1 Methods of transforming data:
When organisations collect data, it is often raw and not immediately useful. To make it valuable, it must be transformed. The main methods are:
- Manipulating
- Analysing
- Processing
Manipulating Data
Changing or reorganising data to make it more understandable or useful. This might include filtering, sorting, or combining data from different sources.
A college IT support team exports login data from the network. At first, it’s just thousands of rows of timestamps and usernames. By manipulating the data (sorting by user, filtering failed attempts), they quickly see which accounts have repeated login failures.
Splunk and Elastic (ELK Stack) are widely used in cybersecurity to manipulate and search through huge log files, making it easier to spot patterns of suspicious behaviour.
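The sorting and filtering steps from the login example can be sketched in a few lines of Python. This is a minimal illustration, not how Splunk or the ELK Stack work internally; the `logins` rows are hypothetical sample data invented for this sketch.

```python
from datetime import datetime

# Hypothetical sample rows: (username, timestamp, status). Not real log data.
logins = [
    ("sam02", datetime(2025, 9, 2, 8, 30, 5), "Success"),
    ("alex01", datetime(2025, 9, 2, 0, 15, 12), "Failed"),
    ("mia03", datetime(2025, 9, 2, 9, 1, 44), "Failed"),
    ("alex01", datetime(2025, 9, 2, 0, 15, 40), "Failed"),
    ("alex01", datetime(2025, 9, 2, 0, 16, 3), "Failed"),
]

# Filtering: keep only the failed attempts.
failed = [row for row in logins if row[2] == "Failed"]

# Sorting: order by username, then by time, so each user's attempts sit together.
failed_sorted = sorted(failed, key=lambda row: (row[0], row[1]))

for username, timestamp, status in failed_sorted:
    print(username, timestamp.isoformat(sep=" "), status)
```

Once the rows are grouped like this, repeated failures for a single account are much easier to see than in the unsorted export.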
Analysing Data
Looking at data in depth to identify patterns, trends, or relationships. Analysing moves beyond just reorganising – it’s about making sense of the information.
After manipulating login records, the IT team analyses them and notices that 80% of failed logins happen between midnight and 3 a.m. This unusual pattern suggests a brute-force attack.
IBM Security QRadar analyses logs from multiple systems (firewalls, servers, apps) to detect cyber threats by identifying unusual traffic patterns.
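The kind of analysis described in the login example, working out what share of failed logins falls inside a time window, might look like the sketch below. The timestamps and the helper name `share_in_window` are assumptions made for illustration; tools such as QRadar use far more sophisticated detection.

```python
from datetime import datetime

# Hypothetical failed-login timestamps, for illustration only.
failed_times = [
    datetime(2025, 9, 2, 0, 15),
    datetime(2025, 9, 2, 1, 40),
    datetime(2025, 9, 2, 2, 5),
    datetime(2025, 9, 2, 2, 59),
    datetime(2025, 9, 2, 14, 20),
]

def share_in_window(times, start_hour, end_hour):
    """Return the fraction of timestamps whose hour is in [start_hour, end_hour)."""
    if not times:
        return 0.0
    in_window = [t for t in times if start_hour <= t.hour < end_hour]
    return len(in_window) / len(times)

night_share = share_in_window(failed_times, 0, 3)
print(f"{night_share:.0%} of failed logins fall between 00:00 and 03:00")
```

A high share in an unusual window is a prompt for further investigation rather than proof of an attack.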
Processing Data
Converting raw data into a different format or structure so it can be used by systems, applications, or people. Processing often involves automation.
A system collects sensor data from a server room (temperature, humidity). This raw data is processed into a dashboard that shows “green, amber, red” warnings. IT staff don’t need to read every number – the processed data tells them instantly if action is needed.
SIEM (Security Information and Event Management) tools like Microsoft Sentinel (formerly Azure Sentinel) automatically process logs from thousands of endpoints and generate alerts for IT teams.
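As a rough sketch of the server-room example, the code below processes raw temperature readings into "green, amber, red" statuses. The threshold values and the function name `temperature_status` are hypothetical and chosen only for illustration.

```python
# Hypothetical thresholds in degrees Celsius; a real server room would
# use limits agreed with the hardware vendor or facilities team.
AMBER_FROM_C = 27.0
RED_FROM_C = 32.0

def temperature_status(celsius):
    """Map a raw temperature reading to a traffic-light status."""
    if celsius >= RED_FROM_C:
        return "red"
    if celsius >= AMBER_FROM_C:
        return "amber"
    return "green"

# Hypothetical raw readings from one sensor.
readings = [21.5, 26.9, 27.0, 30.2, 33.1]
statuses = [temperature_status(r) for r in readings]
print(statuses)
```

The raw numbers are still available if staff need them, but the dashboard only has to show the status.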
You are part of a college IT security team. Below is some raw login data:
+----------+---------------------+---------+
| Username | Timestamp           | Status  |
+----------+---------------------+---------+
| Alex01   | 02/09/2025 00:15:12 | Failed  |
+----------+---------------------+---------+
| Alex01   | 02/09/2025 00:15:12 | Failed  |
+----------+---------------------+---------+
| Alex01   | 02/09/2025 00:15:12 | Failed  |
+----------+---------------------+---------+
| Sam02    | 02/09/2025 00:15:12 | Success |
+----------+---------------------+---------+
| Mia03    | 02/09/2025 00:15:12 | Failed  |
+----------+---------------------+---------+
| Mia03    | 02/09/2025 00:15:12 | Failed  |
+----------+---------------------+---------+
| Mia03    | 02/09/2025 00:15:12 | Success |
+----------+---------------------+---------+
Task:
Manipulating: Sort the data by username. What do you notice?
Analysing: Which accounts show suspicious behaviour? Why?
Processing: Imagine you are designing a dashboard. How would you present this data (e.g., traffic light system, charts, alerts)?
Extension:
Research one industry tool (Splunk, ELK Stack, QRadar, or Azure Sentinel).
Explain: Does it mainly manipulate, analyse, or process data – or all three?
Last Updated
2025-10-09 21:45:36
English and Maths
English
- Reading & comprehension of technical prose
  - Students must read and understand the descriptions of manipulating, analysing and processing data (how raw data is transformed).
  - They must interpret a sample raw dataset (usernames, timestamps, status) and understand the implied narrative.
- Explanation / description writing
  - In the tasks, students explain their observations: e.g. “Sort the data by username. What do you notice?” requires them to describe patterns in their own words.
  - “Which accounts show suspicious behaviour? Why?” demands reasoning and justification in prose.
  - “Imagine you are designing a dashboard. How would you present this data?” asks them to describe the design and its rationale in writing.
- Inquiry / research & reporting
  - The extension task: “Research one industry tool (Splunk, ELK Stack, QRadar, or Azure Sentinel). Explain: Does it mainly manipulate, analyse, or process data – or all three?” This requires gathering information from external sources and then writing a structured explanation.
- Use of technical vocabulary
  - Terms like manipulating, analysing, processing, dashboard, pattern and suspicious behaviour must be used correctly in answers.
  - Students will need to communicate clearly, using precise vocabulary when describing what transformations or analyses do.
- Logical / sequential narrative
  - The tasks follow a logical progression (manipulate → analyse → process). Students’ answers can mirror that sequence, helping them practise structuring a written argument in a logical order.
Maths
- Sorting / ordering
  - “Sort the data by username.” This is an ordering operation (alphanumeric sorting). It encourages thinking about ordering rules (alphabetic, timestamp) and how datasets can be reorganised.
- Pattern detection / trend identification
  - When students analyse the data to find suspicious accounts, they must look for patterns (e.g. multiple failures by one user, clustering in time). This is numerical and logical pattern recognition.
- Data filtering / selection
  - Filtering (selecting subsets of data that meet criteria) is itself a numeracy / data operation (e.g. "only failed logins", "only those with > n failures").
- Visual / dashboard design & quantitative representation
  - When designing dashboard output (e.g. traffic light, chart, alerts), students decide how to map numeric data to visuals (thresholds, ranges). This involves thinking about scales, cut-offs and the visual representation of numeric values.
  - Converting detailed raw numeric logs into more digestible summaries is a form of data aggregation / summarisation, even though the task does not name it as such.
- Classification / categorisation of behaviour
  - Deciding which accounts are “suspicious” vs “normal” is a classification exercise based on numeric criteria (e.g. number of failed attempts, clustering). This involves thresholding, comparison and logical testing.
- Understanding data transformation hierarchy
  - The underlying structure (manipulate → analyse → process) implicitly involves mathematical thinking about stages of transforming data (e.g. reorganising, aggregating, mapping).
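The thresholding idea behind classifying accounts as “suspicious” or “normal” can be sketched as follows. The usernames, the attempt data and the cut-off `MAX_FAILURES` are all hypothetical; a real threshold would come from the organisation's security policy.

```python
from collections import Counter

# Hypothetical (username, status) pairs, for illustration only.
attempts = [
    ("userA", "Failed"), ("userA", "Failed"), ("userA", "Failed"),
    ("userA", "Failed"), ("userB", "Success"), ("userC", "Failed"),
    ("userC", "Success"),
]

# Hypothetical cut-off: more than this many failures is flagged.
MAX_FAILURES = 2

# Aggregation: count failed attempts per user.
failures = Counter(user for user, status in attempts if status == "Failed")

# Classification: compare each user's count with the threshold.
users = {user for user, _ in attempts}
labels = {
    user: "suspicious" if failures[user] > MAX_FAILURES else "normal"
    for user in sorted(users)
}
print(labels)
```

Changing `MAX_FAILURES` changes which accounts are flagged, which is a useful way to discuss how cut-offs affect a classification.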
Stretch and Challenge
Homework