Digging Deeper Into Information & Data
How do you start working with Information & Data (I)?
And drawing on prior experience and hindsight, you should expect that answer to be a bit… recursive. Or perhaps the better adjective is connected? Or even dependent? Working with information requires Organization (O) and Process (P). There is plenty of Decision (D) and Story (S), too. This article will focus on the first two, but we will circle back at the end.
Stated a little differently, your data has a model (O) and a syntax (P). It may seem odd to jump right to the model (O), but if it makes you feel better — there is a process (P) required to view that model (O). So it is not really skipping, more of a quick trip.
Anyone who is familiar with databases has likely seen an entity-relationship diagram (ERD). The one pictured here is quite simple and reflects the organizational structure for a photo album containing photos of members. It includes tags, comments, and locations. Makes perfect sense, right? Perhaps. Or not? I have a lot of questions…
I am also making a lot of assumptions. I assume the tables are populated. I assume the field names are descriptive. I assume the connections and keys are meaningful. I assume this entity/table structure is a model of reality and that it accurately reflects that reality. I also assumed the pictures are “of members”. Perhaps they are owned by members? We will need a process (P) to verify ALL of those assumptions.
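As a rough illustration of what such a verification process might look like, here is a minimal Python sketch using the standard-library sqlite3 module. It checks two of the assumptions above: whether each table is populated, and what its fields are actually called. The table and column names (member, photo, id, name, title) are hypothetical stand-ins, since the real schema lives only in the ERD, and the helper profile_tables is invented for this example.

```python
import sqlite3

def profile_tables(conn):
    """Return {table_name: (row_count, [column names])} for every user table."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name")]
    report = {}
    for table in tables:
        # PRAGMA table_info yields one row per column; index 1 is the column name
        columns = [col[1] for col in conn.execute(f'PRAGMA table_info("{table}")')]
        row_count = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        report[table] = (row_count, columns)
    return report

# Hypothetical stand-in for the photo album database (names are assumptions)
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE member (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE photo (id INTEGER PRIMARY KEY, title TEXT);
    INSERT INTO member (id, name) VALUES (1, 'Ana'), (2, 'Ben');
""")

report = profile_tables(conn)
for table, (row_count, columns) in report.items():
    status = "populated" if row_count else "EMPTY"
    print(table, status, row_count, columns)
```

In this toy database the photo table turns out to be empty, which is exactly the kind of surprise the "I assume the tables are populated" assumption can hide.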
Those more experienced with data may be thinking: I am not sure I needed a model for that. They are right. After just a little experience, most analysts internalize the basic model concepts. Structure becomes process. Where have we heard that before? But first-timers will catch on much faster this way.
A quick insight — learning analytics is harder because so many practitioners have lost sight of the steps. Humans don’t like to show their work in any discipline. Humans chunk — a scientific concept.
We do it to steps, to information, and even to other people. It is inherent in our language. Reading this on a phone? Think of what that little word implies (the word is clipped, the concept is chunked). Is it lazy? Perhaps. It is definitely human. Good luck to our coming AI overlords!
Back to our model and its connections, of which there are recursively many. Our ERD visualizes how our data is connected, but it also clues you in to the process. Notice the little key 🔑 icons? They are next to the fields named ID. IDs, the clipped form of identifiers and a very (I) concept, provide our database with primary keys. This syntax (P) means that ID is a unique identifier in most of these tables: no two rows in such a table share the same ID.
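To make the primary key idea concrete, here is a small sketch in Python with the standard-library sqlite3 module. The photo table and its columns are assumptions made for illustration, not the actual schema behind the ERD. It shows that a database will refuse a second row with an ID that is already taken.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical photo table; declaring id as PRIMARY KEY makes it unique
conn.execute("CREATE TABLE photo (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO photo (id, title) VALUES (1, 'Beach')")

try:
    # Reusing id 1 violates the primary key constraint
    conn.execute("INSERT INTO photo (id, title) VALUES (1, 'Mountains')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

photo_count = conn.execute("SELECT COUNT(*) FROM photo").fetchone()[0]
print(duplicate_rejected, photo_count)
```

The second insert fails and the table still holds a single photo, which is what "unique identifier" means in practice.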
Unique identifiers, or primary keys, allow the user to connect tables in a relational database (more syntax). They create organization through a clear process. If we match the ID of any photo to the album table, we will learn what album the photo is in (or so we assume). The concept is called a merge or join. Various query languages use varying syntax to do that. Conveniently, SQL uses the keyword JOIN.
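Here is a minimal sketch of such a join, written in Python against an in-memory SQLite database. The schema is an assumption: it supposes each photo row carries an album_id column pointing at the album table, which the ERD may or may not do in exactly this way. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data, for illustration only
conn.executescript("""
    CREATE TABLE album (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE photo (id INTEGER PRIMARY KEY, album_id INTEGER, title TEXT);
    INSERT INTO album VALUES (1, 'Summer'), (2, 'Winter');
    INSERT INTO photo VALUES (10, 1, 'Beach'), (11, 1, 'Pier'), (12, 2, 'Snow');
""")

# Match each photo's album_id to the album's primary key
rows = conn.execute("""
    SELECT photo.title, album.name
    FROM photo
    JOIN album ON album.id = photo.album_id
    ORDER BY photo.id
""").fetchall()

for title, album_name in rows:
    print(title, "is in", album_name)
```

Each photo comes back paired with the name of its album, even though the album name is never stored on the photo itself.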
Back to our ERD. You may have noted that two of the connecting lines are dashed. This likely indicates that the ID on those tables is not a primary key, meaning it is not unique there. On these tables, ID is still an identifier, but it is referred to as a foreign key. That means it is unique in another table, where it is the primary key, but it can repeat in this one. Or put a different way, every unique photo may match to several members and several tags.
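A short sketch can show both halves of the foreign key idea: the same photo ID repeating across several tag rows, and the database rejecting a tag that points at a photo that does not exist. This uses Python's standard-library sqlite3 module. The tag table, its photo_id column, and the sample data are assumptions for illustration. Note that SQLite only enforces foreign keys once the foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
# Hypothetical schema: tag.photo_id is a foreign key pointing at photo.id
conn.executescript("""
    CREATE TABLE photo (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tag (photo_id INTEGER REFERENCES photo(id), label TEXT);
    INSERT INTO photo VALUES (1, 'Beach');
    INSERT INTO tag VALUES (1, 'sunset'), (1, 'family');
""")

# photo_id 1 appears twice in tag: unique in photo, repeated here
labels = [row[0] for row in conn.execute(
    "SELECT label FROM tag WHERE photo_id = 1 ORDER BY label")]

try:
    # There is no photo with id 99, so this tag would be an orphan
    conn.execute("INSERT INTO tag VALUES (99, 'orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(labels, orphan_rejected)
```

One photo matches two tags, and the orphan tag is refused. Whether the real database enforces its foreign keys this strictly is one more assumption worth checking.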
This last many-to-one concept for members and photos is what led me to assume that the photos are “of” members rather than owned by them… but there is no certainty. Analysts must connect the ERD to reality. Are there processes for that? Certainly, but they fall in the realm of common sense. Unfortunately, while the process is common enough, actually engaging in it is a step many novice analysts overlook.
Analytics is a connected discipline. You can’t just go around breaking everything down. That is only one half of the challenge (analysis). You need to employ synthesis, as well. Data only becomes information, and information only becomes knowledge, through the process (P) of organizing and connecting it (O). Avoid Blue Steel moments: get up from your desk and go talk to the people who help create the data. It is a decision (D) you need to make to understand your data’s story (S). Our circle is complete.
Looking forward: most concepts, tools, and technologies involved in the analytic process have both models (O) and syntax (P). They are key to the understanding and synthesis of data into true insight. Experience will make you comfortable with these concepts. Until then, just try to recognize them when you see them. Thanks for reading!
Gurupriyan is a software engineer and technology enthusiast who has been working in the field for the last 6 years. He is currently focusing on mobile app development and IoT.