I’m writing this post in very difficult times. Russia has declared war on Ukraine and annexed Crimea, a significant part of Ukrainian territory. God bless Ukraine!
This post melds together several topics that were partially described in my previous posts. For details, there are references to those posts, topic by topic.
From Data to Information
We need information. Information is “knowledge communicated or received concerning a particular fact or circumstance”, to quote Wikipedia. The modern technique for getting useful information [in business] is called analytics. There is descriptive analytics, an AS-IS snapshot of things. There is predictive analytics, or WHAT-WILL. Predictive analytics is already on Gartner’s Plateau of Productivity. Prescriptive analytics goes further, answering WHAT & WHY. Decision makers [in business and life] need information in the Inverted Pyramid manner: the most important information on top, then the major facts & reasons, and so on down the pyramid…
But we have data at the beginning: tons of data bits generated by a wide variety of data sources. There are big volumes of classical enterprise data in ERPs, CRMs and legacy apps. That data is primarily relational and SQL friendly. There are big volumes of relatively new social data, such as public and private user profiles, user timelines in social networks, and mixed content of text, imagery, video, locations, emotions, relations, statuses and so forth. There are growing volumes of machine data, from access control systems with turnstiles in the office or parking lot to M2M sensors on transport fleets or on quantified-self individuals. Social and machine data are not necessarily SQL friendly. Check out Five Sources of Big Data for more details.
Everything starts from proper abstraction & design. Old-school methods still work, but modern methods unlock even more potential for creating information out of raw data. Abstraction [of business models or life models] leads to the design of data models, which are often graphs of some kind. It is absolutely normal to have multiple graphs within a solution/product. E.g. relations between people are straightforwardly abstracted into a Social Graph, while machine data might be represented as a Network Graph or a Mobile Graph. There are other common abstractions, such as a Logistic Graph, a Recommendations Graph and so on. More details can be found in Six Graphs of Big Data.
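To make the idea of several graphs living in one solution a bit more tangible, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Graph` class, the `friends_of_friends` helper, and the people and device names are hypothetical and do not come from any particular product.

```python
from collections import defaultdict


class Graph:
    """A tiny adjacency-set graph; hypothetical helper for illustration only."""

    def __init__(self, directed=False):
        self.directed = directed
        self.edges = defaultdict(set)

    def add_edge(self, a, b):
        self.edges[a].add(b)
        if not self.directed:
            # Undirected relation: store the edge in both directions.
            self.edges[b].add(a)

    def neighbors(self, node):
        # .get() avoids creating empty entries for unknown nodes.
        return set(self.edges.get(node, ()))


# Social Graph: undirected relations between people (made-up names).
social = Graph()
social.add_edge("alice", "bob")
social.add_edge("bob", "carol")

# Network Graph: directed links from devices to gateways (made-up names).
network = Graph(directed=True)
network.add_edge("sensor-1", "gateway-a")
network.add_edge("sensor-2", "gateway-a")


def friends_of_friends(graph, person):
    """People two hops away who are not already direct connections."""
    direct = graph.neighbors(person)
    result = set()
    for friend in direct:
        result |= graph.neighbors(friend)
    return result - direct - {person}


print(friends_of_friends(social, "alice"))
print(network.neighbors("sensor-1"))
```

The same small structure serves both graphs; only the meaning of nodes and edges differs, which is the point of choosing the abstraction first.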
The key concept of the processing can be abstracted as a funnel. On the left you have raw data; you feed it into the funnel and get some kind of information on the right. This is depicted at a high level in the diagram.
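The funnel can also be sketched in a few lines of Python. This is only a toy under stated assumptions: the event records, their field names, and the `clean`, `aggregate` and `funnel` functions are all hypothetical, and a real pipeline would have many more stages.

```python
# Raw data from mixed sources (made-up records for illustration).
raw_events = [
    {"source": "erp", "type": "order", "amount": 120.0},
    {"source": "social", "type": "mention", "amount": None},
    {"source": "machine", "type": "turnstile", "amount": None},
    {"source": "erp", "type": "order", "amount": 80.0},
    {"source": "erp", "type": "order", "amount": None},  # incomplete record
]


def clean(events):
    """Wide end of the funnel: keep only complete order records."""
    return [
        e for e in events
        if e["type"] == "order" and e["amount"] is not None
    ]


def aggregate(orders):
    """Narrow end of the funnel: condense records into a few facts."""
    return {
        "order_count": len(orders),
        "total_amount": sum(e["amount"] for e in orders),
    }


def funnel(events):
    """Feed raw data in on the left, get information out on the right."""
    return aggregate(clean(events))


info = funnel(raw_events)
print(info)
```

Five raw records go in and two numbers come out, which is roughly what a decision maker wants to see at the top of the inverted pyramid.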
What makes it Advanced?
An interesting question… Modern does not always mean advanced. What makes it advanced is another technology, related to the user experience: mobiles and wearables. As soon as predictive and prescriptive analytics are delivered in real time at your fingertips, it can be considered advanced.
There are several technological limitations and challenges. Let’s start with mobiles and wearables. The biggest issue is the screen size. The entire visualization must be designed from scratch. Reusing what was built for big screens does not work, even though we do browse full-blown web sites on those small screens… The issue with wearables is related to how recently they emerged. Simply nobody knows enough yet about how to design for them. The paradigms will emerge as soon as the adoption rate starts to decelerate. Right now we are observing the boom of wearables. There is more insight on wearables in Wearable Technology and Wearable Technology, Part II. There is a lot to change there!
The requirement of real-time or near-real-time information delivery assumes high-performance computing at the backend. Some data massaging and pre-processing must be done in advance; then the bits must be served out of memory. It is a client-cloud architecture, where the client is a mobile or wearable gadget and the cloud is a backend with plenty of RAM holding ready-made bits. This is depicted in the diagram; read it from left to right.
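A rough sketch of the "pre-process in advance, serve from memory" idea follows. It is an assumption-laden toy, not a real backend: the `InMemoryStore` class, its `precompute` and `serve` methods, and the truck readings are hypothetical names and data chosen for illustration.

```python
class InMemoryStore:
    """Hypothetical backend that keeps ready-made answers in RAM."""

    def __init__(self):
        self._cache = {}

    def precompute(self, raw_readings):
        """Heavy step, done in advance: average the readings per device."""
        sums = {}
        counts = {}
        for device, value in raw_readings:
            sums[device] = sums.get(device, 0.0) + value
            counts[device] = counts.get(device, 0) + 1
        self._cache = {d: sums[d] / counts[d] for d in sums}

    def serve(self, device):
        """Cheap step, done per request: a plain dictionary lookup."""
        return self._cache.get(device)


# Made-up machine data: (device id, speed reading).
readings = [("truck-1", 60.0), ("truck-1", 80.0), ("truck-2", 50.0)]

store = InMemoryStore()
store.precompute(readings)       # runs on the cloud side, ahead of time
print(store.serve("truck-1"))    # what the mobile or wearable client asks for
```

The client never triggers the expensive computation; it only reads a value that is already waiting in memory, which is what keeps the response time low on a small device.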
This is a new era of tools and technologies that enable and accelerate the processing pipeline from data to information in your pocket/hand/eyes. There is a lack of tools and frameworks to meld old and modern data together. Hadoop is doing well there, but things are not as smooth as install & run. There is a lack of data platform tools. There is a lack of integration and aggregation tools. Visualization is totally absent: there are still no good executive dashboards even for PC screens, not to mention smartphones. I will address those opportunities in more detail in upcoming posts.