Tag Archives: Big Data

Building AI: Another Intelligence


How AI tools can be combined with the latest Big Data concepts to increase people’s productivity and build more human-like interactions with end users. The Second Machine Age is coming. We’re now building thinking tools and machines to help us with mental tasks, in the same way that mechanical robots already help us with physical work. Older technologies are being combined with newly created smart ones to meet the demands of the emerging experience economy. We are now in between two computing ages: the older, transactional computing era and a new cognitive one.

In this new world, Big Data is a must-have resource for any cutting-edge enterprise project. And this Big Data serves as an excellent resource for building intelligence of all kinds: artificial smartness, intelligence as a service, emotional intelligence, invisible interfaces, and attempts at true general AI. However, new projects often have no data to begin with. So the challenge is: how do you acquire or produce data? During this session, Vasyl will discuss the process of creating new technology to solve business problems, and the strategies for approaching the “No Data Challenge”, including:

  • Using software and hardware agents capable of recording new types of data;
  • The Five Sources of Big Data;
  • The Six Graphs of Big Data as strategies for modern solutions; and
  • The Eight Exponential Technologies.

This new era of computing is all about the end user or professional user, and these new AI tools will help to improve their lifestyle and solve their problems.




Consumerism via IoT @ IT2I in Munich

We buy things we don’t need
with money we don’t have
to impress people we don’t like

Tyler Durden
Fight Club


Consumption sucks

That sucks. That has got to be changed. Fight Club changed it the violent way… Thank God it was in a book/movie only. We are changing it a different way, peacefully, via consumerism. We are powerful consumers in a new economy – the Experience Economy. Consumers don’t need only goods, they need experiences, staged from goods, services and something else.


Staging experience is difficult. Staging personal experience is a challenge for this decade. We have to gather data, calculate and predict for literally each customer. The situation gets more complicated with the growing Do It Yourself attitude of consumers. They want to make it, not just to buy it…

If you do not have many customers, then the staging of experience could be done by people, e.g. Vitsoe. They write letter cards exclusively for you, to establish a realistic human-to-human interface from the very beginning. You, as a consumer, do the making, by shooting pictures of your rooms and describing the concept of your shelving system. The sneaker maker New Balance directly provides a “Make button”, not a buy button, for a number of custom models. You are involved in the making process, it takes 2 days, you are informed about the facilities [in the USA] and so on; though you are just changing the colors of the sneaker pieces, not a big deal for one man but a big deal for all consumers.

There are great benchmarks in the Experience Economy to look up to: Starbucks, Walt Disney. Hey, old school guys, to increase revenue and profit, think of the price going up, and the cost too; think of staging great experiences instead of cutting costs.


Webization of life

Computers disrupted our lives, lifestyle, work. Computers changed the world and still continue to change it. The Internet transformed our lives tremendously. It was about connected machines, then connected people, then connected everything. The user used to sit in front of a computer. Today the user sits within a big computer [smart house, smart ambient environment, ICU room] and wears tiny computers [wristbands, clasps, pills]. Let’s recall the six kinds of human-machine interaction, as Bill Joy named them – Six Webs – Near, Here, Far, Weird, B2B, D2D. http://video.mit.edu/embed/9110/

Nowadays we see a boost for Here, Weird, D2D. A reminder of what they are: Here is your smartphone, smartwatch [strange phablets too], wearables; Weird is the voice interface [automotive infotainment, Amazon Echo]; D2D is device to device, or machine to machine [aka M2M]. Wearables work well as anatomical digital gadgets, while pseudo-anatomical ones like Google Glass are questionable. Mobile-first strategies prevail. Voice interfaces are available on all new smartphones and cars. M2M is rolling out, connecting “dumb” machines via small agents, which are connected to the cloud with some intelligent services there.

At the end of 2007 we passed 5000 days of the Web. Check out what Kevin Kelly predicts for the next 5000 days [actually less than 3000 days from now]. There will be only One machine, its OS is the web, it encodes trillions of things, all screens look into the One, no bits live outside, to share is to gain, One reads it all, One is us…


Webization of things

Well, the next 3000 days are still to come, but where are we today? At two slightly overlapping stages: Identification of Everything and Miniaturization & Connecting of Everything. Identification difficulties delay the connectivity of more things. Visual identification is especially difficult. Deep Neural Networks have not solved the problem; they reached about 80% accuracy. That is better than the old perceptrons but not sufficient for wide generic application. Combinations with other approaches, such as Random Forests, bring hope for higher accuracy of visual recognition.

A huge problem with neural networks is training, while a breakthrough is needed for ad hoc recognition via a creepy web camera. Intel released OpenCV, a software library for computer vision, to engage the community to innovate. Then the most useful features are observed, improved and transferred from the software library into hardware chips by Intel. Sooner or later they are going to ship small chips [for smartphones for sure] with ready-made special object recognition processing, so that users could identify objects via a small phone camera in disconnected mode with accuracy better than 85-90%, which is more or less applicable for business cases.


As soon as those two IoT stages [Identification and Miniaturization] are passed, we will have ubiquitous identification of everything and everyone, and everything and everyone will be digitized and connected – in other words we will create a digital copy of us and our world. It is going to be completed somewhere by 2020-2025.

Then we will augment ourselves and our world. Then I don’t know how it will unfold… My personal vision is that humanity was a foundation for other, more intelligent and capable species to complete the old human dream of reverse engineering this world. It’s interesting what will start to evolve after 2030-2040. You could think about the Singularity. A phase shift.


Hot industries in IoT era

Well, back to today. Today we are still comfortable on the Earth and we are doing business and looking for lucrative industries. Which industries are ready to pilot and roll out IoT opportunities right away? Here is a list by Morgan Stanley from April 2014:

Utilities (smart metering and distribution)
Insurance (behavior tracking and biometrics)
Capital Goods (factory automation, autonomous mining)
Agriculture (yield improvement)
Pharma (clinical trial monitoring)
Healthcare (leveraging human capital and clinical trials)
Medtech (patient monitoring)
Automotive (safety, autonomous driving)


What is IoT indeed?

Time to draw a baseline. Everybody is sure they have a true understanding of IoT. But usually people have biased models… Let’s figure out what IoT really is. IoT is a synergistic phenomenon. It emerged at the intersection of Semiconductors, Telecoms and Software. There was tremendous acceleration with chips and their computing power. Moore’s Law still has not reached its limit [neither molecular, nor atomic, nor economic]. There was huge synergy from widespread connectivity. It’s Metcalfe’s Law, and it’s still in place, initially for people, now for machines too. Software scaled globally [for the entire planet, for all 7 billion people], got Big Data and reached the Law of Large Numbers.
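To make the Metcalfe’s Law remark a bit more tangible: the law is usually stated as the value of a network growing roughly with the square of the number of connected nodes. Here is a minimal sketch, assuming (as an illustrative simplification) that value is simply proportional to the number of possible pairwise links. The function name is made up for this example.

```python
def pairwise_links(n):
    """Number of distinct pairs among n connected nodes: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Ten times more nodes gives roughly a hundred times more possible links.
for nodes in (10, 100, 1000):
    print(nodes, "nodes ->", pairwise_links(nodes), "possible links")
```

That quadratic growth is why connecting machines, and not only people, changes the picture so much.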


As a result of accelerated evolution of those three domains – we created capability to go even further – to create Internet of Things at their intersection, and to try to benefit from it.


Reference architecture for IoT

If the global and economic description is too high-level for you, then here you go – 7 levels of IoT – called the IoT Reference Architecture, presented by Cisco, Intel and IBM in October 2014 at the IoT World Forum. The canonical model sounds like this: devices send/receive data, interacting with the network where the data is transmitted, normalized and filtered using edge computing before landing in databases/data storage, accessible by applications and services, which process it [the data] and provide it to people, who will act and collaborate.
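The flow in that canonical model can be sketched in a few lines. This is only a toy illustration under my own assumptions, not code from the reference architecture: the Reading type, the Fahrenheit-to-Celsius normalization, the plausibility range and all function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    device_id: str
    metric: str
    value: float
    unit: str


def normalize(reading):
    """Edge step: convert Fahrenheit to Celsius so storage sees a single unit."""
    if reading.unit == "F":
        celsius = round((reading.value - 32) * 5 / 9, 2)
        return Reading(reading.device_id, reading.metric, celsius, "C")
    return reading


def plausible(reading):
    """Edge step: drop obviously broken sensor values (assumed range)."""
    return -50.0 <= reading.value <= 60.0


def edge(readings):
    """Normalize first, then filter, before anything reaches storage."""
    return [r for r in map(normalize, readings) if plausible(r)]


def average(storage, metric):
    """Application step: a simple aggregate offered to people."""
    values = [r.value for r in storage if r.metric == metric]
    return sum(values) / len(values) if values else None


# Devices send raw data; the third reading is a sensor glitch.
raw = [
    Reading("t1", "temp", 68.0, "F"),
    Reading("t2", "temp", 22.0, "C"),
    Reading("t3", "temp", 999.0, "C"),
]
storage = edge(raw)              # data lands in storage after edge computing
print(average(storage, "temp"))  # what the application shows to a person
```

The point of the sketch is the order of the steps: the cleaning happens at the edge, so applications work only with normalized data.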



Who is IoT?

You could ask which company is an IoT one. This is a very useful question, because your next question could be about the criteria, a classifier for IoT and non-IoT. Let me ask you first: is Uber IoT or not?

Today Uber is not, but as soon as the cars are self-driving, Uber will be. The only missing piece is a direct connection to the car. Check out the recent essay by Tim O’Reilly. Another important aspect is society, as a whole and each individual, so it is not the Internet of Things, but the Internet of Things & Humans. Check out those ruminations http://radar.oreilly.com/2014/04/ioth-the-internet-of-things-and-humans.html

Humans are consumers, just a reminder. Humans are an integral part of IoT; we are creating IoT ourselves, especially via networks, from wide social to niche professional ones.


Software is eating the world

Chips and networks are good, but let’s look at booming software, because technological progress now depends vastly on software, and it’s accelerating. Each industry opens more and more software engineering jobs. It started with office automation, then came all those classical enterprise suites: PLM, ERP, SCADA, CRM, SCM etc. Then everyone built a web site, then added a customer portal, web store, mobile apps. Then they integrated with others, business app to business app, aka B2B. Then they logged huge clickstreams and other logs such as search and mobile data. Now everybody is massaging the data to distill more information on how to meet business goals, including consumerism-shaped goals.

  1. Several examples to confirm that digitization of the world is real.
    Starting from the easiest example to understand – newspapers, music, books, photography, movies went digital. Some of you have never seen film and film cameras, but google it, they were non-digital not so long ago. Well, the last example from this category is the Tesla car. It is electric and has plenty of chips with software & firmware on them.
  2. The next example is more advanced – intellectual property shifts to digital models of goods. The 3D model with all related details does matter, while implementing that model as a hard good is trivial. You have to pay for the digital thing, then 3D print it at home or in a store. As soon as fabrication technology gets cheaper, the shift towards digital property will be complete. Follow Formula One, their technologies are transferred to our simpler lives. There is digital modeling and simulation, 3D printing in carbon, connected cars producing tons of telemetry data. As soon as the consumer can’t distinguish 3D printed hard goods from those produced with today’s traditional methods, and as soon as the technology is cheap enough, it is possible to produce as late as possible and as adequately as possible for each individual customer.
  3. All set with hard goods. What about others? Food is also 3D printed. The first 3D printed burger from Modern Meadow was printed more than a year ago, BTW funded by googler Sergey Brin. The price was high, about $300K, exactly the amount of his investment. Whether food will be printed or produced via biotech goo, the control and modeling will be software. You know recipes and processes, they are digital. They are applied to produce real food.
  4. Drugs and vaccines. Similar to food and hard goods. It simply unlocks a great opportunity to get quick access to brand new medications. The vaccine could be designed in Australia and transferred as a digital model to your [or a nearby] 3D printer or synthesizer; your instance will be composed from the solutions and molecules exclusively for you, and in time.

So whatever your industry is, think about more software coding and data massage. Plenty of data, global scale, 7 billion people and 30 billion Internet devices. Think of traditional and novel data; augmented reality and augmented virtuality are also digitizers of our lives, moving towards real virtuality.


How to design for IoT?

If you know how, then don’t read further, just go ahead with your vision, I will learn from you. For others, my advice is to design for personal experience. Just continue to ride the wave of the growing share of software in the industries, and handle new software problems to deliver personal experience to consumers.

First of all, start recognizing novel data sources, such as Search, Social, Crowdsourced, Machine. It is different from Traditional CRM, ERP data. Record data from them, filter noise, recognize motifs, find intelligence origins, build data intelligence, bind to existing business intelligence models to improve them. Check out Five Sources of Big Data.

Second, build information graphs, such as Interest, Intention, Consumption, Mobile, Social, Knowledge. The consumer has her interests, why not count on them? Despite the interests, the consumer’s intentions could be different, why not count on them? Despite the intentions, the consumer’s consumption could be different, why not count on it? And so on. Build a mobility graph, a communication graph and other graphs specific to your industry. Try to build a knowledge graph around every individual. Then use it to meet that individual’s expectations or bring individualized unexpected innovations to her. Check out Six Graphs of Big Data.

As soon as you grasp this, your next problem will be handling multi-modality. Make sure you have mathematicians in your software engineering teams, because the problem is not trivial, quite the opposite. The good thing is that for each industry some graph may prevail, hence everything else could be converted into attributes attached to the primary graph.



PLM in IoT era

Taking simplified PLM as BEFORE –> DURING –> AFTER…

Design of the product should start as early as possible, and it should not be isolated; instead, foster co-creation and co-invention with your customers. There is no secret number for how much of your IP to share publicly, but the criterion is simple – if you share too little, you will not reach the critical mass needed to trigger consumer interest; and if you share too much, your competitors could take it all. The rule of thumb is about technological innovativeness. If you are very innovative, let’s say a leader, then you could share less. Examples of technologically innovative businesses are Google and Apple. If you are technologically not so innovative, then you might need to share more.

The production or assembly should be as optimal as possible. It’s all about transaction optimization via new ways of doing the same things. Here you could think about Coase’s Law upside down – outsource to external partners, don’t try to do everything in-house. Shrink until the internal transaction cost equals the external one. Specialization of work brings [external] costs down. Your organizational structure should shrink while the network of partners should grow. On the modern Internet the cost of external transactions could be significantly lower than the cost of the same internal transactions, while the quality remains high, up to the standards. It’s the known phenomenon of outsourcing. Just Coase upside down, as Eric Schmidt mentioned recently.
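The “shrink until internal cost equals external” rule can be written down as a tiny decision sketch. The activity names and cost numbers below are invented for illustration, and real transaction costs are of course much harder to measure than a single number per activity.

```python
def split_activities(costs):
    """Split activities into (keep in-house, outsource).

    costs maps an activity name to (internal_cost, external_cost).
    An activity is outsourced only while the external transaction cost
    is strictly lower; at equality we stop shrinking and keep it in-house.
    """
    keep, outsource = [], []
    for name, (internal, external) in costs.items():
        if external < internal:
            outsource.append(name)
        else:
            keep.append(name)
    return keep, outsource


# Hypothetical per-unit transaction costs: (internal, external).
example_costs = {
    "design": (10, 25),
    "assembly": (40, 15),
    "logistics": (30, 30),
}
keep, outsource = split_activities(example_costs)
print("keep in-house:", keep)
print("outsource:", outsource)
```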

Think about individual customization. There could be mass customization too, by segments of consumers… but it’s not as exciting as individual. Even if it is such a simple thing as selecting the colors for your phone or sneakers or furniture or car trim. It should take place as late as possible, because it’s difficult to forecast far ahead with high confidence. So try to squeeze useful information from your data graphs as close to the production/assembly/customization moment as possible, to be sure you make the most adequate decisions that could be made at that time. Optimize inventory and supply chains to have the right parts for customized products.

Then try to keep the customer within the experience you created. Customers will return to you to repeat the experience. You should not sit and wait until the customer comes back. Instead you need to evolve the experience, think about the ecosystem. Invent more; costs may rise, but the price will rise even more, so don’t push for cost reduction, instead push for innovativeness towards better personal experiences. We all live within experiences [BTW more and more digitized products, services and experiences]. The longer the consumer stays within the ecosystem, the more she pays. It’s the experience economy now, and it’s powered by the Internet of Things. Maybe it will rock… and we will avoid Fight Club.





Big Data Graphs Revisited

Some time ago I outlined Six Graphs of Big Data as a pathway to the individual user experience. Then I did the same for Five Sources of Big Data. But what’s between them remained untold. Today I am going to give my vision of how different data sources allow us to build different data graphs. To make it less dependent on those older posts, let’s start from a real-life situation and business needs, then bind them to data streams and data graphs.


Context is King

The same data in different contexts has different value. When you are late for a flight and you get a message that your flight was delayed, it is valuable, in comparison to receiving the same message two days ahead, when you are not late at all. Such a message might be useless if you are not traveling, but the airline has your contacts and sends you a message about a flight you don’t care about. There was only one dimension here – time to flight. That was a friendly description of context, to warm you up.

Some professional contexts are difficult for the unprepared to grasp. Let’s take a situation from the office of some corporation. A department manager intensified his email communication with the CFO, started to use the phone more frequently (also calling the CFO and other department managers), went to the CFO’s office multiple times, skipped a few lunches, and stayed at work till 10PM on several days. Here we have multiple dimensions (five), which could be analyzed together to define the context. Most probably that department manager and the CFO were doing some budgeting: planning or analysis/reporting. Knowing that, it is possible to build and deliver individual prescriptive analytics to the department manager, focused on helping to handle the budget. That department may have other escalated issues, such as the release schedule, but the severity of the budgeting is much higher right now, hence the context belongs to the budgeting for now.

With data streams for each dimension we are able to build a run-time individual/personal context. Data streams for that department manager were a kind of time series, events with attributes. Email is a dimension we are tracking; peers, timestamps, type of the letter, size of the letter, types and number of attachments are attributes. Phone is a dimension; names, times, durations, number of people etc. are attributes. Location is a dimension; own office, CFO’s office, lunch place, timestamps, durations, sequence are attributes. And so on. We defined potentially useful data streams. It is possible to build an exclusive context out of them, from their dynamics and patterns. That was a more complicated description of context.
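To show what “events with attributes” could look like in practice, here is a minimal sketch. The Event type, the attribute keys and the sample records are all invented for illustration; a real stream would come from the plugins and sensors and carry many more attributes.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Event:
    dimension: str            # e.g. "email", "phone", "location"
    timestamp: datetime
    attributes: dict = field(default_factory=dict)


def daily_counts(events):
    """Count events per (dimension, calendar day): a crude time series."""
    return Counter((e.dimension, e.timestamp.date()) for e in events)


# Made-up sample of the department manager's day.
events = [
    Event("email", datetime(2015, 3, 2, 9, 0), {"peer": "CFO", "attachments": 2}),
    Event("email", datetime(2015, 3, 2, 15, 30), {"peer": "CFO", "attachments": 0}),
    Event("phone", datetime(2015, 3, 2, 11, 0), {"peer": "CFO", "duration_min": 12}),
    Event("location", datetime(2015, 3, 2, 16, 0), {"place": "CFO office"}),
]

for (dimension, day), count in sorted(daily_counts(events).items()):
    print(day, dimension, count)
```

Once each dimension is reduced to such counts per day, the dynamics (intensified email, more calls, longer hours) become something you can compare against a baseline.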


Interpreting Context

Well, well, but how to interpret those data streams, how to interpret the context? What we have: multiple data streams. What we need: identify the run-time context. So, the pipeline is straightforward.

First, we have to log the Data from each dimension of interest. It could be done via software or hardware sensors. Software sensors are usually plugins, but could be more sophisticated, such as object recognition from surveillance cameras. Hardware sensors are GPS, Wi-Fi, turnstiles. There could be combinations, like a check-in somewhere. So keep in mind that a lot can be done with software sensors. For the department manager case, it’s a plugin to Exchange Server or Outlook to listen to emails, a plugin to the PBX to listen to the phone calls and so on.

Second, it’s time for low-level analysis of the data. It’s Statistics, then Data Science. Brute force to establish what is credible and what is not, then looking for emerging patterns. The bottleneck with Data Science is the human factor. Somebody has to look at the patterns to decrease false positives or false negatives. This step is more about discovery, probing and trying to prepare the foundation for the more intelligent next step. More or less everything is clear with this step. Businesses have already started to build up their data science teams, but they still don’t have enough data for the science:)

Third, it’s Data Intelligence. As MS said some time ago, “Data Intelligence is creating the path from data to information to knowledge”. This should be described in more detail, to avoid ambiguity. From Techopedia: “Data intelligence is the analysis of various forms of data in such a way that it can be used by companies to expand their services or investments. Data intelligence can also refer to companies’ use of internal data to analyze their own operations or workforce to make better decisions in the future. Business performance, data mining, online analytics, and event processing are all types of data that companies gather and use for data intelligence purposes.” Some data models need to be designed, calibrated and used at this level. Those models should work almost in real-time.

Fourth is Business Intelligence. Probably the first step familiar to the reader:) But we look further here: past data and real-time data meet together. Past data is individual for the business entity. Real-time data is individual for the person. Of course there could be something in the middle. Go find a comparison between stats, data science and business intelligence.

Fifth, finally, it is Analytics. Here we are within the individual context of the person. There should be a snapshot of ‘AS-IS’ and recommendations of ‘TODO’; if the individual wants, there should be reasoning, ‘WHY’ and ‘HOW’. Final destination is the individual context. I’ve described it in detail in the series of Advanced Analytics posts, link for Part I.
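The pipeline above can be compressed into a toy sketch for the department manager case. Everything here is an assumption made for illustration: the stream names, the sample numbers, the threshold of two standard deviations, and the single hand-written rule that maps intensified dimensions to the “budgeting” context. A real Data Intelligence model would be designed and calibrated, not hard-coded.

```python
from statistics import mean, stdev


def intensified(baseline, current, threshold=2.0):
    """Statistics step: is today's value unusually high versus the baseline?"""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        return current > mu
    return (current - mu) / sigma > threshold


def detect_context(history, today, rules):
    """Intelligence step: map the set of intensified dimensions to a context."""
    flagged = {dim for dim, value in today.items()
               if intensified(history[dim], value)}
    for context, required in rules.items():
        if required <= flagged:      # all required dimensions are flagged
            return context
    return None


# Hypothetical daily values for the previous five working days.
history = {
    "emails_to_cfo": [2, 3, 2, 3, 2],
    "calls_to_cfo": [0, 1, 0, 1, 1],
    "hours_at_work": [8, 9, 8, 8, 9],
}
rules = {"budgeting": {"emails_to_cfo", "calls_to_cfo", "hours_at_work"}}

busy_day = {"emails_to_cfo": 12, "calls_to_cfo": 6, "hours_at_work": 13}
normal_day = {"emails_to_cfo": 2, "calls_to_cfo": 1, "hours_at_work": 8}

print(detect_context(history, busy_day, rules))
print(detect_context(history, normal_day, rules))
```

The first call should report the budgeting context and the second should report nothing, which is the AS-IS snapshot that the analytics step would then turn into TODO recommendations.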

Data Streams

Data streams come from data sources. Same source could produce multiple streams. Some ideas below, the list is unordered. Remember that special Data Intelligence must be put on top of the data from those streams.

Indoor positioning via Wi-Fi hotspots contributes to the mobile/mobility/motion data stream. Where the person spent most time (at the workplace, in meeting rooms, in the kitchen, in the smoking room), when the person changed location frequently, directions, durations, sequence etc.

Corporate communication via email, phone, chat, meeting rooms, peer to peer, source control, process tools, productivity tools. It all makes sense for analysis, e.g. because at the time of release there should be no creation of new user stories. Or the volumes and frequency of check-ins to source control…

Biometric wearable gadgets like BodyMedia to log the intensity of mental (or physical) work. If there is low calorie burn during long bad meetings, then that could be revealed. If there is not enough physical workload, then for the sake of better emotional productivity, it could be suggested to take a walk.


Data Graphs from Data Streams

Ok, but how to build something tangible from all those data streams? The relation between Data Graphs and Data Streams is many to many. Look, it is possible to build the Mobile Graph from very different data sources, such as face recognition from a camera, authentication at an access point, IP address, GPS, Wi-Fi, Bluetooth, check-in, post etc. Hence when designing the data streams for some graph, you should think about one to many relations. One graph can use multiple data streams from the corresponding data sources.

To bring more clarity into relations between graphs and streams, here is another example: the Intention Graph. How could we build the Intention Graph? The intentions of somebody could be totally different in different contexts. Is it a weekday or a weekend? Is the person static in the office or driving a car? Who are the peers that the person has communicated with a lot recently? What is the type of communication? What is the time of the day? What are the person’s interests? What were the previous intentions? As you see, there could be data logged from machines, devices, comms, people, profiles etc. As a result we will build the Intention Graph and will be able to predict or prescribe what to do next.
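The many-to-many relation is easy to see if you write it down as a plain mapping and then invert it. The stream lists below are only an assumption for illustration: the Mobile Graph entries follow the sources named above, while the Intention Graph entries are my own hypothetical pick.

```python
from collections import defaultdict

# One graph -> many streams (hypothetical assignment).
GRAPH_STREAMS = {
    "mobile": ["gps", "wifi", "bluetooth", "check_in", "camera"],
    "intention": ["gps", "email", "phone", "profile"],
}


def streams_to_graphs(graph_streams):
    """Invert the mapping: one stream -> every graph that consumes it."""
    inverse = defaultdict(set)
    for graph, streams in graph_streams.items():
        for stream in streams:
            inverse[stream].add(graph)
    return dict(inverse)


STREAM_GRAPHS = streams_to_graphs(GRAPH_STREAMS)
for stream in sorted(STREAM_GRAPHS):
    print(stream, "->", sorted(STREAM_GRAPHS[stream]))
```

Here the GPS stream feeds both graphs, so logging it once pays off twice; that is the practical meaning of many to many.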


Context from Data Graphs

Finally, having multiple data graphs, we could work on the individual context, the personal UX. Technically, it is hardly possible to deal with all those graphs easily. It’s not possible to simply overlay two graphs. It is called modality (as one PhD taught me). Hence you must split them and work with a single modality. Select which graph is most important for your needs and use it as a skeleton. Convert relations from other graphs into attributes, which you could apply to the primary graph. Build an intelligence model for the single-modality graph with plenty of attributes from other graphs. Obtain the personal/individual UX at the end.
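A minimal sketch of “use one graph as the skeleton and turn the others into attributes”, under my own simplifying assumptions: graphs are plain lists of (source, target) pairs, the Social Graph is chosen as the primary one, and relations of people outside the primary graph are simply dropped. The names and sample data are hypothetical.

```python
def flatten_onto_primary(primary_edges, secondary_graphs):
    """Attach relations from secondary graphs as attributes of primary nodes.

    primary_edges: list of (node, node) pairs forming the skeleton graph.
    secondary_graphs: {graph_name: [(node, target), ...]}.
    Returns {node: {graph_name: set_of_targets}} for primary nodes only.
    """
    nodes = {node for edge in primary_edges for node in edge}
    attributes = {node: {} for node in nodes}
    for graph_name, edges in secondary_graphs.items():
        for source, target in edges:
            if source in attributes:     # ignore nodes outside the skeleton
                attributes[source].setdefault(graph_name, set()).add(target)
    return attributes


social = [("ann", "bob")]                       # primary graph (skeleton)
others = {
    "interest": [("ann", "budgeting"), ("carl", "golf")],
    "mobile": [("bob", "cfo_office")],
}
node_attributes = flatten_onto_primary(social, others)
print(node_attributes)
```

After this step there is one graph with rich node attributes, which is a much friendlier input for a single intelligence model than several graphs of different modality.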


Five Sources of Big Data

Some time ago I described how to think when you build solutions from Big Data in the post Six Graphs of Big Data. Today I am going to look in the opposite direction: where does Big Data come from? I see five distinct sources of data: Transactional, Crowdsourced, Social, Search and Machine. All details are below.

Transactional Data

This is good old data, the most familiar and usual for geeks and managers. It’s plenty of RDBMSes, running or archived, on premise and in the cloud. The majority of transactional data belongs to corporations, because the data was authored/created mainly by businesses. It was the golden era of Oracle and SQL Server (and some others). At some point RDBMS technology appeared to be incapable of handling more transactional data, thus we got Teradata (and others) to fix the problem. But there was no significant shift in the way we work with those data sources. Data warehouses and analytic cubes are trending, but they have been used for years already. Financial systems/modules of enterprise architectures will continue to rely on transactional data solutions from Oracle or IBM.

Crowdsourced Data

This data source has emerged from an activity rather than from a type of technology. The phenomenon of Wikipedia confirmed that crowdsourcing really works. Much time has passed since the adoption of Wikipedia by the masses… We got other fine data sources built by the crowds, for example OpenStreetMap, Flickr, Picasa, Instagram.

Interesting things happen with the rise of personal genetic testing (checking DNA for a million known markers via 23andMe). This leads to public crowdsourced databases. There are more examples, e.g. amateur astronomy. Volunteers do author useful data. The size of crowdsourced data is increasing.

What differentiates it from transactional/enterprise data? It’s the price. Usually crowdsourced data is free to use, under one of the Creative Commons licenses. Often, the motivation for creating such a data set is the digitization of our world or making a free alternative to paid content. With the rise of nanofactories, we will see the growth of 3D models of every physical product. By using crowdsourced models we will print the goods at home (or elsewhere).

Social Data

With the rise of Friendster–>MySpace–>Facebook and then others (LinkedIn, Twitter etc.) we got a new type of data: Social. It should not be confused with Crowdsourced data, because its nature is completely different. Social data is a digitization of ourselves as persons and of our behavior. Social data complements Crowdsourced data very well. Eventually there will be a digital representation of everyone… So far social profiles are good enough for meaningful use. Social data is dynamic, it is possible to analyze it in real-time. E.g. put Tweets or Facebook posts through the Google Prediction API to grab emotions. I’m sure everybody intuitively understands this type of data source.

Search Data

This is my favourite. It is not obvious to many of you, yet it is a really strong data source. Just recall how much you search on Amazon or eBay. How do you search on wikis (not to be confused with Wikipedia)? Quora gets plenty of search requests. StackOverflow is a good source of search data within Information Technology. There are intranet searches within Confluence and SharePoint. If those search logs are analyzed properly, their potential usefulness and business application become clear. E.g. the Intention Graph and Interest Graph are related to the search data.

There is a problem of “walled gardens” for search data… This problem is big, bigger than for social data, because public profiles are fully or partially available, while searches are kept behind the walls.

Machine Data

This is also my favourite. In the Internet of Things every physical thing will be connected. New things are designed to be connectable. Old things get connected via M2M. Consumers adopted wearable technology. I’ve posted about it earlier. Go to Wearable Technology and Wearable Technology, Part II.

The cost of data gathering is decreasing. The cost of wireless data transfer is decreasing. The bandwidth of wireless transfer is increasing dramatically. Fraunhofer and KIT completed a 100Gbps transmission. It’s fourteen times faster than the most robust 802.11ac. The moral is: measure everything, just gather data until it becomes Big Data, then analyze it properly and operate proactively. Machine data is probably the most important data source for Big Data during the next years. We will digitize the world and ourselves via devices. OpenStreetMap got competitors: a fleet of eBees mapped the Matterhorn with millions of spatial points. More to expect from machines.


Six Graphs of Big Data

This post is about Big Data. We will talk about the value and economic benefits of Big Data, not the atoms that constitute it [Big Data]. For the atoms you can refer to Wearable Technology or Getting Ready for the Internet of Things by Alex Sukholeyster, or just log the click stream… and you will get plenty of data, but it will be low-level, atom-level, not very useful.

The value starts at the higher levels, when we use social connections of the people, understand their interests and consumptions, know their movement, predict their intentions, and link it all together semantically. In other words, we are talking about six graphs: Social, Interest, Consumption, Intention, Mobile and Knowledge. Forbes mentions five of them in Strategic Big Data insight. Gartner provided report “The Competitive Dynamics of the Consumer Web: Five Graphs Deliver a Sustainable Advantage”, it is paid resource unfortunately. It would be fine to look inside, but we can move forward with our vision, then compare to Gartner’s and analyze the commonality and variability. I foresee that our vision is wider and more consistent!

Social Graph

This is mostly analyzed and discussed graph. It is about connections between people. There are fundamental researches about it, like Six degrees of separation. Since LiveJournal times (since 1999), the Social Graph concept has been widely adopted and implemented. Facebook and its predecessors for non-professionals, LinkedIn mainly for professionals, and then others such as Twitter, Pinterest. There is a good overview about Social Graph Concepts and Issues on ReadWrite. There is good practical review of social graph by one of its pioneers, Brad Fitzpatrick, called Thoughts on the Social Graph. Mainly he reports a problem of absence of a single graph that is comprehensive and decentralized. It is a pain for integrations because of all those heterogeneous authentications and “walled garden” related issues.

Regarding implementation of the Social Graph, there is advice from successful implementers, such as Pinterest. The official Pinterest engineering blog revealed how to Build a Follower Model from scratch. We can look at the same thing from a totally different perspective: technology. Redis, a modern technology provider, features a tutorial on how to Build a Twitter clone in PHP and (of course) Redis. So the situation with the Social Graph is more or less established. Many build it, but nobody has solved the problem of having a single, consistent, independent graph (probably built from other graphs).
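
To make the follower idea concrete, here is a minimal in-memory sketch in Python. It is not the Pinterest or Redis implementation; the class name FollowerGraph and its methods are my own hypothetical names, and a real system would keep these sets in a database or key-value store instead of process memory.

```python
from collections import defaultdict


class FollowerGraph:
    """Directed social graph kept as two adjacency sets per user (illustrative only)."""

    def __init__(self):
        self.following = defaultdict(set)  # user -> users she follows
        self.followers = defaultdict(set)  # user -> users following her

    def follow(self, src, dst):
        """Record that src follows dst; following yourself is ignored."""
        if src == dst:
            return
        self.following[src].add(dst)
        self.followers[dst].add(src)

    def unfollow(self, src, dst):
        """Remove the edge src -> dst if it exists."""
        self.following[src].discard(dst)
        self.followers[dst].discard(src)

    def mutual(self, a, b):
        """True when a and b follow each other."""
        return b in self.following.get(a, set()) and a in self.following.get(b, set())


graph = FollowerGraph()
graph.follow("alice", "bob")
graph.follow("bob", "alice")
graph.follow("carol", "bob")
```

Keeping both directions of every edge costs twice the memory, but it answers "who do I follow" and "who follows me" with a single lookup each.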

Interest Graph

It is a representation of the specific things in which an individual is interested. Read more about the Interest Graph on Wikipedia. This is the next hot graph after the social one. Indeed, the Interest Graph complements the Social one. Social Commerce sees the Interest and Social Graphs together. People provide the raw data in their public and private profiles. Crawling and parsing that data, plus special analysis, makes it possible to build the Interest Graph for each of you. Gravity Labs created a special technology for building the Interest Graph. They call it Interest Graph Builder. There is an overview (follow the previous link) and a demo. There are ontologies, entities, entity matching etc. An interesting insight about the Future of Interest Graph is authored by Pinterest’s head of engineering. The idea is to improve on Amazon’s recommendation engine, based on classifiers (via pins). Pinterest knows the reasoning, the “why” behind what users pinned, while Amazon doesn’t. We are approaching the Intention Graph.

Intention Graph

Not much can be said about intentions yet. The Intention Graph is about what we do and why we do it. Social and Interest are static in comparison to Intention. This is related to prescriptive analytics, because it deals with reasoning and motivation: “why” something happens or will happen. It seems that the other graphs together could reveal much more about intentions than trying to figure them out separately.

The Intention Graph is tightly bound to personal experience, or personal UX. It was foreseen back in 1999, by Harvard Business Review, as the Experience Economy. Many years have passed, but not much has been implemented towards personal UX. We still don’t stage a personal ad hoc experience from goods and services exclusively for each user. I predict that the Social + Interest + Consumption + Mobile graphs will allow us to build a useful Intention Graph and gain the capability to build and deliver individual experiences. When the individual is within the service, we are ready to predict some intentions, but only if the Service Design was done properly.

Consumption Graph

One of the most important graphs of Big Data. Some call it the Payment Graph. But Consumption is a better name, because we can consume without payment. The Consumption Graph is relatively easy for e-commerce giants like Amazon and eBay, but tricky for third parties like you. What if you want to know what a user consumes? There are no sources of such information. Both Amazon and eBay are “walled gardens”. Each tracks what you do (browse, buy, put into a wish list etc.) and how you do it (when you log in, how long you stay, the sequence of your activities etc.). They send you notifications and suggestions and measure how you react, and they use many other tricks for descriptive, predictive and prescriptive analytics. But what if the user buys from other e-stores? It is the same problem as with the Social Graph. IMHO there should be a mechanism to assemble a user’s Consumption Graph from sub-graphs (if the user identifies herself).

Well, but there is still a big portion of retail consumption. How do they build your Consumption Graph? Very easily, via loyalty cards. You think about the discounts you get with those cards, while retailers think about your Consumption Graph and predict what to do with all of their clients together and even individually. There is the same problem of disconnected Consumption Graphs as in e-commerce, because each store has its own card. There are aggregators like Key Ring. Theoretically, they simplify the consumer’s life by shielding her from all those cards. But in reality, the back-end logic is able to build a bigger Consumption Graph for retail consumption! Another aspect: consumption of goods vs. consumption of services and experiences, is there a difference? What is the difference between hard goods and digital goods? There are other cool things about retail, like tracking clients and detecting their sex and age. It all becomes part of the Consumption Graph. Think about that yourself:)

Anyway, the Consumption Graph is very interesting, because we are digitizing this World. We are printing digital goods on 3D printers. So far the shape and the look & feel are identical to the cloned product (e.g. a cup), but the internals are different. As soon as a 3D printer is able to reconstruct the crystal structure, it will be a brand new way of consumption. It is a thrilling and wide topic, hence I am going to discuss it separately. Stay in touch so you don’t miss it.

Mobile Graph

This graph is built from mobile data. That does not mean the data comes only from mobile phones. Today the majority of the data may still be generated by smartphones, but tomorrow that will not be true. Check out Wearable Technology to figure out why. The second important point is that there are different views of what the Mobile Graph is. The marketing-based view described on Floatpoint is indeed about smartphone usage. In that view the Mobile Graph is a map of interactions (with the contexts in which people interact) such as Web, social apps/bookmarks/sharing, native apps, GPS and location/check-ins, NFC, digital wallets, media authoring, pull/push notifications. I would view the Mobile Graph as a user-in-motion: where the user resides at each moment (home, office, on the way, school, hospital, store etc.), how the user relocates (fast by car, slow by bike, very slow on foot; uniformly or not, e.g. via public transport), how the user behaves at each location (static, dynamic, mixed), what other users’ motions take place around her (who else traveled the same route, or who else resides at the same location in that time slot) and so on. I look at the Mobile Graph more as a Mesh Network.

Why does the dynamic networking view make more sense? Consider users as both people and machines. Recall IoT and M2M. Recall the initiatives by Ford and Nokia for resolving gridlock problems in real time. The Mobile Graph is better related to motion and mobility, i.e. to the essence of the word “mobile”. If we consider it from the motion point of view and extend it with the marketing point of view, we will get a pretty useful model for the user and for society. The Mobile Graph is not for oneself. At least, it is more efficient for many than for one.

Knowledge Graph

This is a monster one. It is about the semantics between all digital and physical things. Why does Google still rock? Because they built the Knowledge Graph. You can see it in action here. Check out interesting tips & tricks here. Google’s Knowledge Graph is a tool to find the UnGoogleable. There is a post on Blumenthals saying that Google’s Local Graph is much better than the Knowledge Graph, but this gap will probably be eliminated with time. IMHO their Knowledge Graph is being taught iteratively.

As Larry Page has said many times, Google is not a search engine or an ads engine, but a company that is building Artificial Intelligence. Ray Kurzweil joined Google to simulate the human brain and recreate a kind of intelligence. Here is a nice article, How Larry Page and Knowledge Graph helped to seduce Ray Kurzweil to join Google. “The Knowledge Graph knows that Santa Cruz is a place, and that this list of places are related to Santa Cruz”.

We can look at those graphs together. Social will be in the middle, because we (people) like to be in the center of the Universe:) The Knowledge Graph could be considered a meta-graph, penetrating all other graphs, or a super-graph, including multiple parts from other graphs. Even now, the Knowledge Graph is capable of handling dynamics (e.g. flight status).

Other Graphs

There are other graphs in the world of Big Data. Technology ecosystems are emerging around those graphs. A boost is expected from Biotech. There is plenty of gene data, but a lack of structured information on top of it. Brand new models (graphs) will emerge to make those terabytes of data easier to understand. Circos was invented in the field of genomic data, to simplify understanding of data via visualization. More experiments can be found on the Visual Complexity web site. We are living in a different World than a decade ago. And it is exciting. Just plan your strategies correspondingly. Consider Big Data strategically.

The Power of Paper

Here are ruminations on the real power of paper and other “two dimensional” surfaces we use to present data or information. Inspired by respected scientists of visualization since the 1960s…

The typical answer from many (all?) of you about the number of dimensions of paper is 2 (two). Not a big surprise.

Paper sheet, two dimensions

You see the horizontal and vertical axes, which are called width and height. Below is a typical sheet to confirm you are right.

Width, Height

Let’s look at the same sheet more carefully. There is a light/shadow gradient on it. Consider the strength of the light or shadow as a value. It is a true third dimension. We’ve got a sheet with three dimensions: width, height and value.

Width, Height, Value

Well, so what? We squeezed in three dimensions. What else?
Of course there is an opportunity for a fourth dimension:) Let’s pay attention to the surface of the paper and represent it as a texture, which makes it applicable to digital visualizations. Textures can differ. Don’t confuse texture with pattern. The same texture can be scaled in and out, but it is still the same texture. Below is the same sheet with texture as the fourth dimension.

Width, Height, Value, Texture

What else? Is our piece of paper done? No! There is a fifth dimension: color. Encode something into color and you use five dimensions. Don’t confuse value and color, they are different things. Hence, below is the same sheet with five dimensions.

Width, Height, Value, Texture, Color

At this point I am sure you are confident that we can use even more dimensions. Here is the 6th: size. The sheet can be of different sizes, smaller or bigger. Size also encodes; size does matter.

6 dimensions

Very good. What else on that piece of paper (or digital picture) is capable of encoding? The shape. Different shapes encode different things. Below are samples of shapes, all with 6 dimensions. Together with shape encoding we get 7 dimensions.

7 dimensions

OK, we are not finished yet. There is still a dimension to use. Guess what? Orientation. The sheet (or digital image) can be oriented differently, independent of shape or size. Below is an example.

8 dimensions

So here you go: 8 dimensions to use in information visualization. All 8 are applicable to both paper and digital designs, and they are very important for efficient BI designs. How do you fit Big Data into a single widget? Make information from data. But what if the information is big too? Use efficient information modeling to get much, much more from the same piece of space. This is design wisdom. Use it. Start with 4-5 dimensions. Continue to 8.
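
As a rough illustration of treating the eight dimensions as encoding channels, here is a small Python sketch. The Mark class, the encode function and the sample record are hypothetical names and values of mine, not part of any charting library; the sketch only shows how data fields could be assigned to channels, starting with four of them.

```python
from dataclasses import dataclass, fields


@dataclass
class Mark:
    """One visual mark with the eight dimensions discussed above."""
    x: float = 0.0             # width axis
    y: float = 0.0             # height axis
    value: float = 0.5         # lightness, 0 = dark, 1 = light
    texture: str = "plain"
    color: str = "gray"
    size: float = 1.0
    shape: str = "square"
    orientation: float = 0.0   # rotation in degrees


CHANNELS = tuple(f.name for f in fields(Mark))


def encode(record, mapping):
    """Copy record fields into the channels named in mapping (channel -> field name)."""
    unknown = set(mapping) - set(CHANNELS)
    if unknown:
        raise ValueError(f"unknown channels: {sorted(unknown)}")
    return Mark(**{channel: record[source] for channel, source in mapping.items()})


# Illustrative record: four data fields mapped onto four of the eight channels.
record = {"day": 3, "pulse": 72, "risk": "red", "dose": 2.0}
mark = encode(record, {"x": "day", "y": "pulse", "color": "risk", "size": "dose"})
```

Channels that are not mapped keep their defaults, so the same structure works for a 4-dimension design and can grow towards 8.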

Mobile EMR, Part IV

This is a continuation of Mobile EMR, Part III.

It turned out to be possible to fit more information onto the single pager! We’ve extended the EKG slightly, reworked the lab results, reworked the measurements (charts) and inserted a genogram. The genogram probably brings the majority of the new information in comparison to the other updates.

v4 of mEMR concept

Right now the concept of mobile EMR looks this way…

Mobile EMR v4

New ‘All Data’ charts

Initially the charts of measured values were made of dots. Recent analysis and reviews leaned towards connecting the dots, but things are not so straightforward… There could be a kind of sparkline for the current period (7-10 days). Applying the sparkline technique to data from the entire last year is questionable. Furthermore, if more data is available from the past, it will be a mess rather than a visualization, because the space allocated for old data is so narrow. Sure, that section of the chart could be wider, but is it worth it?

What is most informative from the past periods? Anomalies, such as low and high values, especially in comparison with current values. Hence we’ve left old data as dots, previous-year data as dots, and made the current short period a line chart. We’ve added min/max points to ease the analysis of the data for the MD.
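
The split described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name split_for_chart, the 10-day window and the sample readings are hypothetical, not taken from the mEMR code.

```python
from datetime import date, timedelta


def split_for_chart(readings, today, recent_days=10):
    """Split (date, value) readings into old dots, a recent line, and min/max points."""
    ordered = sorted(readings)
    if not ordered:
        return [], [], None, None
    cutoff = today - timedelta(days=recent_days)
    old_points = [r for r in ordered if r[0] < cutoff]    # drawn as separate dots
    recent_line = [r for r in ordered if r[0] >= cutoff]  # drawn as a connected line
    low = min(ordered, key=lambda r: r[1])                # highlighted minimum
    high = max(ordered, key=lambda r: r[1])               # highlighted maximum
    return old_points, recent_line, low, high


# Illustrative readings only, not real patient data.
readings = [
    (date(2012, 6, 1), 140),
    (date(2013, 1, 25), 120),
    (date(2013, 1, 30), 118),
    (date(2012, 11, 15), 95),
]
old_points, recent_line, low, high = split_for_chart(readings, today=date(2013, 1, 31))
```

The min and max are taken over the whole history, so an old anomaly stays visible even when the recent line looks calm.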


Having the genogram on the default screen seems very useful. User testing is needed to try the concept on real genograms and to check the sizes of the genograms used most frequently. Anyhow, it is always possible to show part of the genogram as an expanded diagram while keeping other parts collapsed. The genogram could be interactive. When the MD clicks on it, she gets a new screen totally devoted to the genogram, with all detailed attributes present. Editing could be possible too. The default screen, meanwhile, should show the view of the genogram that relates to the patient’s current or potential diagnosis.

In the future the space allocated for the genogram could be increased, depending on how fast genetic-based treatments evolve. Maybe a visualization of personal genotyping will be put onto the home screen very soon. There are companies providing such a service and keeping such data (e.g. 23andme). Eventually all electronic data will be integrated, hence MDs will be able to see a patient’s genotype data from the EMR app on the tablet.

DNA Sequence

This is the mid-term future. DNA sequencing is still a long process today. But we’ve got the technology to deliver DNA sequence information onto the tablet. The technology is similar to BigImage(tm). Predefined levels of information delivery could be defined, such as genes, exomes and finally the entire genotype. Additional overlay layers will certainly be needed to simplify visual perception and navigation through the genetic information. So the technology should be advanced in that respect.

Mobile EMR, Part II

On the 27th of August I published Mobile EMR, Part I. This post is a continuation.

The main output from the initial implementation was feedback from users. They simply needed even more information. We initially considered mEMR and All Information vs. Big Data. But it turned out that some important information was missing from the concept, which relied on Powsner/Tufte research. Hence we have added more information and are now ready to show the results of our research.

First of all, the data is still slightly out of sync, so please be tolerant. It is a mechanical piece of work and will be resolved as soon as we integrate with the hospital’s backend. The charts on the default screen will show the data most suitable for each diagnosis. This will be covered in Part III when we are ready with the data.

Second, a quick introduction to the redesign of the initial concept. Vital Signs had to go together, because they synergistically deliver more information when seen relative to each other. Vital Signs are required for every diagnosis. Hence we have designed a special kind of chart for vital signs and placed it at the top of the iPad screen. Medications turned out to be extremely important, so the physician instantly sees which meds are used right now, the reaction of the vital signs, the diagnosis and allergies, and significant events. All other charts are specific to the diagnosis, and the physician should be able to drag’n’drop them as she needs. It is obvious that diabetes is treated differently than Alzheimer’s. Only one chart has its dedicated place there: the EKG. The EKG is partially connected to the vital signs, but historically (and technically too) the EKG chart is completely different and should be rendered separately. Below is a snapshot of the new default screen:

Default Screen (with Notes)

The most important notes are filtered as Significant Events and can be viewed exclusively. Actually, the default screen can start with Significant Events. We just don’t have much data for today’s demo. Below is a screenshot with Significant Events for the same patient.

Default Screen (with Significant Events)

Charts are configurable like apps on the iPad. You tap and hold one, then move it to the desired place and release it. All other charts are reordered automatically around it. This lets the physician work as she prefers. It’s a good opportunity to configure the sets according to the diagnosis. Actually we embedded presets, because it is obvious that hypertension is treated differently than a cut wound. The screenshot below shows some basic charts, but we are still working on their usability. More about that in Part III some time.

Charts Configuration

According to the Inverted Pyramid, the default screen is the cap of the information mountain. While many are hyping Big Data, we move forward with All Information. Data is low-level atoms. Users need information from the data. Our mEMR default screen delivers much information. It can deliver all information. It is up to the MD to configure the charts that are most informative in her context. The MD can dig for additional information on demand. Labs are available on a separate view, grouped into panels. Images (x-rays) are available on a separate view too. The MD can click on the IMAGERY tab and switch to the view with image thumbnails, which correspond to MRIs, radiology/x-ray and other types of medical imagery. Clicking on any thumbnail zooms the image to the entire iPad screen. The image becomes zoomable and draggable. We use our BigImage(tm) IP to deliver images of any size to any front end. Interaction with the image follows the Apple HIG standard.

Imagery (empowered by BigImage)

I don’t put a snapshot of the scan here, because it looks like a standard full-screen picture. Additional description and a demo of the BigImage(tm) technology are available at the SoftServe site http://bigimage.softserveinc.com. If new labs or new PACS images are available, they are pushed to the home screen as red notifications on the tab label (like on the MEASUREMENTS tab above) so that the physician can notice them and click to see them. This is a common scenario when a complicated lab is required, e.g. tissue research for cancer.

Labs are shown in tabular form. This was confirmed by user testing. We have grouped the labs by the corresponding panels (logical sets of measurements). It is possible to order labs by date in ascending (chronological) or descending (most recent result first) order. The snapshot below shows labs in chronological order. The physician can swipe the table to the left (and then right) to see older results.
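
A minimal Python sketch of the grouping and ordering described above. The function name group_labs, the tuple layout and the sample rows are assumptions for illustration; the actual mEMR data model is not shown in this post.

```python
from collections import defaultdict
from datetime import date


def group_labs(results, newest_first=False):
    """Group (panel, test_name, taken_on, value) rows by panel and sort each panel by date."""
    panels = defaultdict(list)
    for panel, test_name, taken_on, value in results:
        panels[panel].append((taken_on, test_name, value))
    return {
        panel: sorted(rows, key=lambda row: row[0], reverse=newest_first)
        for panel, rows in sorted(panels.items())
    }


# Illustrative rows only, not real lab data.
results = [
    ("Lipid", "LDL", date(2013, 1, 5), 130),
    ("CBC", "WBC", date(2013, 1, 7), 6.1),
    ("Lipid", "LDL", date(2012, 12, 1), 150),
]
chronological = group_labs(results)
most_recent_first = group_labs(results, newest_first=True)
```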


Editing is possible via a long tap on the widget, held until the widget goes into edit mode. A quick single tap returns the widget to preview mode. The MD can edit medications (change existing, delete existing and assign new ones) and enter significant signs and notes. Audit is automatic: in line with HIPAA, the time and identity are captured and stored together with the edited data.
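
To illustrate the automatic audit, here is a small Python sketch that stores who changed what and when alongside every edit. The names AuditEntry and apply_edit and the sample record are hypothetical; this shows the idea only and is not a statement of what HIPAA requires or of how mEMR implements it.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEntry:
    """Immutable record of a single edit."""
    user_id: str
    edited_at: datetime
    field: str
    old_value: object
    new_value: object


def apply_edit(record, field, new_value, user_id, audit_log):
    """Apply the edit to record and append a matching entry to audit_log."""
    entry = AuditEntry(
        user_id=user_id,
        edited_at=datetime.now(timezone.utc),  # timezone-aware capture time
        field=field,
        old_value=record.get(field),
        new_value=new_value,
    )
    audit_log.append(entry)
    record[field] = new_value
    return entry


# Illustrative values only.
record = {"medication": "drug A"}
audit_log = []
entry = apply_edit(record, "medication", "drug B", user_id="md-42", audit_log=audit_log)
```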

Continued in Mobile EMR, Part III.
