Week 4
3.4 Data types
3.4.1 The definition of common data types, their purpose, and when each is used:
Integer (Whole numbers)
What/why: Whole numbers (no decimal point). Efficient for counting, indexing, quantities, IDs.
When to use: Anything that can’t have fractions: number of users, attempts, port numbers, stock counts.
Gotchas: Watch out for range limits (e.g., 32-bit vs 64-bit) and accidental division that produces decimals.
Example | Suitable Uses | Not Suitable For |
---|---|---|
0 | Counter start | Currency with pennies |
7 | Login attempts | Temperatures needing decimals |
65535 | Network ports (unsigned) | Precise measurements (e.g. cm) |
-12 | Temperature differences | |
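The two gotchas above can be sketched in a few lines of Python. This is an illustration only: Python's own integers have no fixed width, so the 32-bit and 16-bit limits are calculated by hand, and the helper name `fits_in_int32` is made up for this example.

```python
# Range limits for common fixed-width integers. Python's int is unbounded,
# so these limits are computed here purely for illustration.
INT32_MAX = 2**31 - 1   # 2147483647
UINT16_MAX = 2**16 - 1  # 65535, the highest network port number


def fits_in_int32(value: int) -> bool:
    """Return True if value fits in a signed 32-bit integer (hypothetical helper)."""
    return -2**31 <= value <= INT32_MAX


attempts = 7
users = 2

print(fits_in_int32(3_000_000_000))  # False: too large for 32 bits
print(attempts / users)              # 3.5 (true division produces a float)
print(attempts // users)             # 3   (floor division stays an integer)
```

In a language or database with fixed-width integers, the first case would overflow or be rejected rather than simply return False, so it is worth checking the column or variable size before storing large counts.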
Real (Floating-point / Decimal)
What/why: Numbers with fractional parts.
When to use: Measurements (temperature, CPU load), ratios, scientific values.
Gotchas: Binary floating point cannot represent most decimal fractions exactly, so small rounding errors appear (for example, 0.1 + 0.2 is not exactly 0.3). For money, prefer fixed-point/decimal types.
Example | Suitable Uses | Notes |
---|---|---|
3.14 | Maths/geometry | Stored as float/double |
-0.75 | Signal values | Rounding errors possible |
72.5 | CPU temperature °C | Use DECIMAL for money (not float) |
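A short sketch of the rounding gotcha, using Python's built-in `float` and the standard-library `decimal` module as a stand-in for a database DECIMAL column. The variable names are illustrative.

```python
from decimal import Decimal

# Binary floating point cannot store 0.1 or 0.2 exactly.
float_total = 0.1 + 0.2

# Decimal values built from strings keep the exact decimal digits.
decimal_total = Decimal("0.10") + Decimal("0.20")

print(float_total == 0.3)                # False
print(decimal_total == Decimal("0.30"))  # True
print(decimal_total)                     # 0.30
```

Note that the Decimal values are created from strings. Creating them from floats would carry the float's rounding error into the Decimal.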
Character (Char)
What/why: A single textual symbol (one character).
When to use: Fixed-width codes (Y/N flags), single-letter grades, check digits.
Gotchas: In Unicode, a “character” users see may be multiple code points (accents/emoji). Many systems still treat CHAR as a single byte/letter in a given encoding.
Example | Suitable Uses | Notes |
---|---|---|
'Y' | Yes/No flag | Case sensitivity may matter |
'A' | Grade | Encoding/locale may affect storage |
'#' | Delimiter symbol | |
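The Unicode gotcha can be seen directly in Python, where `len()` counts code points rather than what a user would call a character. This is a minimal sketch using the standard-library `unicodedata` module. The variable names are illustrative.

```python
import unicodedata

single = "\u00e9"     # precomposed e-acute: one code point
combined = "e\u0301"  # letter e followed by a combining acute accent: two code points

print(len(single))    # 1
print(len(combined))  # 2, although it displays as one character

# NFC normalisation composes the pair into the single precomposed code point.
normalised = unicodedata.normalize("NFC", combined)
print(normalised == single)  # True

# Storage size depends on the encoding: 'Y' takes 1 byte in UTF-8, e-acute takes 2.
print(len("Y".encode("utf-8")), len(single.encode("utf-8")))  # 1 2
```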
String (Text)
What/why: Ordered sequence of characters (words, sentences, IDs).
When to use: Names, emails, file paths, JSON blobs-as-text, logs.
Gotchas: Validate length and content; normalise case; be mindful of Unicode, whitespace, and injection risks.
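As a rough sketch of "validate length and content; normalise case", the function below cleans and checks a reference code such as BE-UK-2025-00017. The pattern is an assumption inferred from that single example, and `clean_reference` is a hypothetical helper name, not part of any library.

```python
import re
from typing import Optional

# Assumed format: two letters, two letters, four digits, five digits.
REFERENCE_PATTERN = re.compile(r"[A-Z]{2}-[A-Z]{2}-[0-9]{4}-[0-9]{5}")


def clean_reference(raw: str) -> Optional[str]:
    """Trim whitespace, normalise case, and return the code only if it is valid."""
    candidate = raw.strip().upper()
    # fullmatch rejects any extra text before or after the code,
    # which also blocks simple injection attempts.
    if REFERENCE_PATTERN.fullmatch(candidate):
        return candidate
    return None


print(clean_reference("  be-uk-2025-00017 "))            # BE-UK-2025-00017
print(clean_reference("BE-UK-2025-17"))                  # None
print(clean_reference("BE-UK-2025-00017; DROP TABLE x")) # None
```

Checking against a whole-string pattern is stricter than searching for the pattern somewhere inside the input, which is why `fullmatch` is used here.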
Example | Suitable Uses | Validation Ideas |
---|---|---|
"James Farrington" | Person name | Trim spaces; allow accents |
"[email protected]" | Email address | Regex format check |
"/var/www/index" | File path | Disallow control chars |
"BE-UK-2025-00017" | Reference code | Check length & pattern |
Boolean (True/False)
What/why: Logical truth value with two states.
When to use: Feature flags, on/off, pass/fail, access granted/denied.
Gotchas: In databases and CSVs, Booleans are often stored as 1/0, TRUE/FALSE, Y/N—be consistent when importing/exporting.
Example | Suitable Uses | Storage Variants |
---|---|---|
TRUE | MFA enabled? | TRUE/FALSE, 1/0, or Y/N |
FALSE | Account locked? | Keep consistent across DBs |
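When importing from a CSV or another database, one way to stay consistent is to convert every storage variant into a real Boolean at the point of import. The sketch below assumes the accepted spellings are the ones listed in the table, plus "yes" and "no". `parse_bool` is a hypothetical helper.

```python
# Accepted spellings are an assumption for this example; adjust to the real data.
TRUE_VALUES = {"true", "1", "y", "yes"}
FALSE_VALUES = {"false", "0", "n", "no"}


def parse_bool(raw: str) -> bool:
    """Convert a stored text flag to a Boolean, rejecting anything unrecognised."""
    token = raw.strip().lower()
    if token in TRUE_VALUES:
        return True
    if token in FALSE_VALUES:
        return False
    raise ValueError(f"Not a recognised Boolean value: {raw!r}")


print(parse_bool("Y"))        # True
print(parse_bool(" FALSE "))  # False
print(parse_bool("1"))        # True
```

Raising an error for unknown values is a deliberate choice: silently treating an unexpected value as FALSE could, for example, mark a locked account as unlocked.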
Date (and Date/Time)
What/why: Calendar date (optionally time and timezone).
When to use: Timestamps for logs, booking dates, certificate expiry, backups.
Gotchas: Time zones and daylight saving; choose UTC for servers, localise only for display. Use proper date types, not strings, for comparisons and indexing.
Example | Suitable Uses | Notes |
---|---|---|
2025-09-02 | Report date | Use ISO 8601 format |
2025-09-02T10:30:00Z | Audit timestamp (UTC) | Store UTC, display in local timezone |
2025-12-31T23:59:59+01:00 | Regional display | Avoid treating dates as strings
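A minimal sketch of "store UTC, display locally" using Python's standard `datetime` module. The fixed +01:00 display offset is an assumption for illustration; a real system would normally use a named time zone so that daylight saving is handled.

```python
from datetime import date, datetime, timedelta, timezone

# Store the moment in UTC.
stored = datetime(2025, 9, 2, 10, 30, 0, tzinfo=timezone.utc)
print(stored.isoformat())  # 2025-09-02T10:30:00+00:00

# Localise only for display (fixed +01:00 offset assumed here).
display_zone = timezone(timedelta(hours=1))
shown = stored.astimezone(display_zone)
print(shown.isoformat())   # 2025-09-02T11:30:00+01:00

# An ISO 8601 string with an offset converts cleanly back to UTC.
parsed = datetime.fromisoformat("2025-12-31T23:59:59+01:00")
print(parsed.astimezone(timezone.utc).isoformat())  # 2025-12-31T22:59:59+00:00

# Comparing dates as text goes wrong: "9" sorts after "1".
print("9/1/2025" < "10/1/2025")              # False, although 9 Jan is before 10 Jan
print(date(2025, 1, 9) < date(2025, 1, 10))  # True
```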
BLOB (Binary Large Object)
What/why: Arbitrary binary data (files) stored as a single value.
When to use: Images, PDFs, compressed archives, firmware, encrypted payloads—when you must keep the bytes intact.
Gotchas: Large size affects backups and query speed; consider storing large files in object storage (S3, Azure Blob) and keep only a URL/metadata in the database.
Example | Suitable Uses | Notes |
---|---|---|
PNG logo bytes | Small media in DB | Mind database size limits |
PDF policy document | Immutable file storage | Often better in file/object storage |
Encrypted payload | Secure binary storage | Store MIME type, size, checksum for integrity |
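The note about storing MIME type, size, and checksum alongside a BLOB could look something like this. It is a sketch only: `describe_blob` and the dictionary keys are hypothetical, and the 8-byte PNG signature stands in for a real file.

```python
import hashlib


def describe_blob(data: bytes, mime_type: str) -> dict:
    """Build the metadata to keep next to a stored binary object."""
    return {
        "mime_type": mime_type,
        "size_bytes": len(data),
        # A SHA-256 checksum lets you confirm later that the bytes are unchanged.
        "sha256": hashlib.sha256(data).hexdigest(),
    }


payload = b"\x89PNG\r\n\x1a\n"  # PNG file signature, used here as sample bytes
meta = describe_blob(payload, "image/png")

print(meta["mime_type"], meta["size_bytes"])  # image/png 8

# Integrity check: recompute the checksum and compare with the stored one.
print(hashlib.sha256(payload).hexdigest() == meta["sha256"])  # True
```

The same metadata is useful whether the bytes live in a database BLOB column or in object storage with only a URL kept in the database.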
3.4.2 The interrelationships between structured data, unstructured data and data type.
In today’s digital world, organisations gather information in many different forms – from neatly organised spreadsheets of customer transactions to complex streams of emails, images, and social media posts. To make sense of this, we look at three key concepts: structured data, unstructured data, and data types.
Structured data is highly organised, stored in predefined formats such as rows and columns within a spreadsheet or database. This makes it straightforward to search, filter, and analyse. Examples include account numbers, dates of birth, and purchase amounts.
Structured Data
- Organised in a predefined format (rows, columns, fields).
- Easily stored in databases (SQL, relational systems).
- Examples: customer IDs, transaction amounts, dates, sensor readings.
By contrast, unstructured data has no fixed format or schema, making it harder to process. It includes content such as emails, audio recordings, images, videos, or free-text survey responses. While it carries rich insights, it requires more advanced tools and techniques to interpret.
Unstructured Data
- No fixed schema or easily searchable structure.
- Stored in raw formats like documents, images, videos, social media posts.
- Examples: customer service call recordings, CCTV footage, email bodies.
At the foundation of both lies the concept of data types. A data type defines how a particular piece of information is stored and used – for instance, an integer for whole numbers, a string for text, or a blob for multimedia. Structured systems rely on data types to keep information consistent, while unstructured data is often stored in broader types like text fields or binary objects to preserve its form.
Together, these three elements form the backbone of how data is represented, stored, and ultimately transformed into meaningful information.
Examples in Practice
Scenario | Structured Data | Unstructured Data | Data Types in Play |
---|---|---|---|
Banking Transactions | Account ID, amount, timestamp | Call centre audio logs | Integer, DateTime, Blob |
Healthcare | Patient ID, diagnosis code, prescription dosage | MRI scans, doctor notes | String, Decimal, Blob |
Social Media | Username, post date, likes count | Image posts, videos, captions | String, Integer, Blob, Text |
Cybersecurity | Login/logout logs, IP addresses | Suspicious emails, attached files | String, Boolean, Blob |
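To make the banking row concrete, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The table and column names are hypothetical, the audio bytes are fake, and the amount is held as whole pence in an INTEGER to avoid floating-point money.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Structured fields get specific types; the unstructured recording goes in a BLOB.
# SQLite has no dedicated date type, so the timestamp is kept as ISO 8601 text.
conn.execute(
    """
    CREATE TABLE transactions (
        account_id   INTEGER NOT NULL,
        amount_pence INTEGER NOT NULL,
        created_at   TEXT    NOT NULL,
        call_audio   BLOB
    )
    """
)

rows = [
    (1001, 2599, "2025-09-02T10:30:00Z", b"\x00\x01fake-audio-bytes"),
    (1002, 120000, "2025-09-02T11:00:00Z", None),
]
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", rows)

# Structured columns are easy to filter on.
large = conn.execute(
    "SELECT account_id FROM transactions WHERE amount_pence > ?", (10000,)
).fetchall()
print(large)  # [(1002,)]

# The BLOB comes back as raw bytes; the database cannot search inside it.
audio = conn.execute(
    "SELECT call_audio FROM transactions WHERE account_id = ?", (1001,)
).fetchone()[0]
print(type(audio).__name__)  # bytes
```

The structured columns answer questions like "which transactions were over £100?" directly, while the recording would need a separate tool (such as speech-to-text) before its content could be analysed.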
Case Studies
Case Study 1: Healthcare – NHS Patient Records
- Structured: Patient demographic data (NHS number, date of birth, appointment dates).
- Unstructured: Doctor notes, x-ray images, voice dictations.
- Interrelationship: Structured records (like appointment schedules) link to unstructured evidence (x-rays stored as BLOBs). The combination provides a holistic medical history.
- Application: AI systems analyse unstructured scans, while SQL systems schedule appointments. Both need data types (integer IDs, date, blob images).
Case Study 2: Cybersecurity – Threat Detection
- Structured: Firewall logs (IP addresses, timestamps, action taken).
- Unstructured: Email attachments, phishing attempts, PDF exploits.
- Interrelationship: Structured logs identify when and where data entered; unstructured payloads (attachments) must be analysed with ML tools. Data types (IP as string, timestamp as date, file as blob) define how each element is stored and processed.
- Application: SIEM (Security Information and Event Management) platforms like Splunk combine both kinds of data to detect anomalies.
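The data types named in this case study (IP as string, timestamp as date, file as blob) can be sketched as a single typed record. The comma-separated log format, the `FirewallEvent` class, and `parse_event` are all assumptions made for this example; real firewall and SIEM formats differ. The IP address comes from a range reserved for documentation.

```python
import ipaddress
from dataclasses import dataclass
from datetime import datetime


@dataclass
class FirewallEvent:
    timestamp: datetime  # date/time type, so events can be sorted and compared
    source_ip: str       # stored as a string, but validated before storing
    action: str          # short text code such as ALLOW or DENY
    attachment: bytes    # unstructured payload kept as raw bytes (blob)


def parse_event(line: str, attachment: bytes = b"") -> FirewallEvent:
    """Parse one 'timestamp,ip,action' line (hypothetical format)."""
    raw_time, raw_ip, raw_action = line.split(",")
    # ip_address raises ValueError if the text is not a valid IP address.
    valid_ip = ipaddress.ip_address(raw_ip.strip())
    return FirewallEvent(
        timestamp=datetime.fromisoformat(raw_time.strip()),
        source_ip=str(valid_ip),
        action=raw_action.strip().upper(),
        attachment=attachment,
    )


event = parse_event("2025-09-02T10:30:00+00:00,203.0.113.7,deny", b"%PDF-1.7")
print(event.source_ip, event.action)  # 203.0.113.7 DENY
print(event.timestamp.year)           # 2025
print(len(event.attachment))          # 8
```

The structured fields can be filtered and correlated immediately, while the attachment bytes would be passed to a separate analysis tool.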
Case Study 3: Retail – Amazon Recommendations
- Structured: Order history (user ID, product ID, purchase date).
- Unstructured: Customer reviews, product images, Q&A responses.
- Interrelationship: Data types underpin storage (strings for reviews, integers for quantities, blobs for images). Machine learning models merge structured purchase histories with unstructured reviews to improve recommendations.
3.4.3 Understand the interrelationships between data type and data transformation.
3.4.4 Be able to make judgements about the suitability of using structured data, unstructured data, data types, and data transformations in digital support and security.
Last Updated
2025-09-02 15:51:06