Hans-Jürgen Kugler is Chief Scientist of Kugler Maag Cie, an independent, international consulting company. The firm advises clients on how to improve business and product development processes, especially clients working on critical systems in domains such as automotive and transport.
In addition to advising clients, Kugler has discussed the benefits of the "open source" organization model as a roadmap for successful enterprises.
As organizations struggle with Big Data and the Internet of Things, Actuate got a chance to ask Mr. Kugler for his thoughts.
Actuate: Big Data is the buzzword of the day. How should developers approach Big Data visualization? Do you expect a second wave of open source development, with new open source projects on the way to enable more analysis?
Kugler: Dealing with Big Data is quite an old challenge. When I studied Computer Science (and Astronomy) – and that was a long time ago – it was said that for every computer, however powerful, there is at least one physicist whose requirements will exceed the computational power and the storage capacity of that computer. That hasn’t changed. Think of CERN (the European Organization for Nuclear Research) and its processing and visualization requirements.
By saying that the problem is old, I didn’t mean to imply it is solved. Actually there are significant new challenges, and I expect these to give rise to a new wave of open source development.
One driver is the complexity, size and growth of the technical systems being built. In the past, data analysis and visualization were used as a means of confirming that the system under construction or in operation was performing the way it should. The developers and operators had an understanding of what data would have what importance, and when. This could be characterized as a reductionist, or constructivist, perspective.
However, many of the technical systems we are constructing are complex adaptive systems (think of online trading networks), which may produce tons of data – in total, roughly 2.5×10^18 bytes every day since 2012. And the developers and operators are now looking for patterns in the data that tell them what is going on – and to what level of predictability. These people are now in the same boat as the physicists at CERN: trying to understand what is happening based on data patterns. At CERN they have the predictive power of the laws of quantum mechanics. What scientific framework can the NSA (US National Security Agency) rely on to master its data? …And that is really BIG!
I used the NSA here deliberately, because their inherent challenge is that the data sources are very diverse and not always causally connected. And this leads to the second driver that will change Big Data, and the way it needs to be looked at.
The emergence of the Internet of Things and services will connect “horizontally” across what we now recognize as different industry domains. Indeed, it may actually be this new and untried connection that leads us to identify emergent behavior or a new business model, where the data becomes more important than the products or services. Google has shown this. Think of combining automotive vehicle sensor data, traffic data and environmental data to derive routing and driving strategies, as in the sketch below.
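To make that horizontal data fusion concrete, here is a minimal sketch. Everything in it – the sensor fields, the feeds, the thresholds, the rules – is invented for illustration and is not any real automotive platform:

```python
from dataclasses import dataclass

# Hypothetical readings from three otherwise unconnected domains.
@dataclass
class VehicleSensors:
    wiper_on: bool           # from the car itself
    avg_wheel_slip: float    # 0.0 (full grip) .. 1.0 (no grip)

@dataclass
class TrafficFeed:
    congestion_level: float  # from a traffic provider: 0.0 (free) .. 1.0 (standstill)

@dataclass
class Environment:
    temperature_c: float     # from a weather/road service
    precipitation_mm_h: float

def driving_strategy(car: VehicleSensors, traffic: TrafficFeed,
                     env: Environment) -> str:
    """Derive a routing/driving strategy by combining all three sources.

    The point is not the (deliberately naive) rules, but that the decision
    only emerges once data from different industry domains is connected.
    """
    icy = env.temperature_c < 1.0 and env.precipitation_mm_h > 0.0
    slippery = car.wiper_on or car.avg_wheel_slip > 0.2 or icy
    if slippery and traffic.congestion_level > 0.6:
        return "reroute: avoid congested arterials, prefer treated roads"
    if slippery:
        return "keep route, issue reduced-speed advisory"
    if traffic.congestion_level > 0.8:
        return "reroute: heavy congestion ahead"
    return "keep route"

print(driving_strategy(VehicleSensors(wiper_on=True, avg_wheel_slip=0.25),
                       TrafficFeed(congestion_level=0.7),
                       Environment(temperature_c=-2.0, precipitation_mm_h=1.5)))
```

No single source would justify the rerouting decision on its own; it falls out of the combination.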
Actuate: What is the best way to approach building analytics systems that are open and connected without compromising features or functionality?
Kugler: What are the users of these analytical tools looking for? Think of novel combinations of applications from different disciplines. Innovative ways of looking at a combination of datasets might lead to new services. For example, Google in Germany combined anonymized Vodafone mobile movement data with Google Maps. The visualization of that data created a real-time indicator of traffic congestion as it was forming.
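The actual Google/Vodafone pipeline is not public, but the principle can be sketched in a few lines. The segment IDs, speed samples, and free-flow baseline below are invented for illustration:

```python
from collections import defaultdict

# Hypothetical anonymized movement samples: (road_segment_id, speed_kmh).
# No identities are needed -- aggregate speed per segment is enough.
samples = [
    ("A8-km42", 110), ("A8-km42", 95), ("A8-km42", 102),
    ("A8-km43", 18),  ("A8-km43", 22), ("A8-km43", 15),
]

FREE_FLOW_KMH = 100.0  # assumed free-flow baseline for these segments

def congestion_index(samples):
    """Map each segment to 0.0 (free flow) .. 1.0 (standstill)."""
    by_segment = defaultdict(list)
    for segment, speed in samples:
        by_segment[segment].append(speed)
    return {
        seg: max(0.0, 1.0 - (sum(speeds) / len(speeds)) / FREE_FLOW_KMH)
        for seg, speeds in by_segment.items()
    }

for segment, index in congestion_index(samples).items():
    flag = "CONGESTION FORMING" if index > 0.5 else "free flow"
    print(f"{segment}: {index:.2f} ({flag})")
```

Overlaying such indices on a map segment by segment is what turns two unrelated datasets into a live congestion service.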
The analytic system must allow users to look at all the dimensions independently, but also in various combinations. The principle must be to preserve the possible ways of looking at the data – to keep as many options open as long as possible, because you may not really know what you are looking for. You may want to pursue several analytical models concurrently.
Community efforts should co-create open interfaces to plug data into the analysis, and build open platforms for visualization practices, with experiential learning of what patterns provide what insight. In my opinion this would lead to an ecosystem that also gives smaller players a chance to develop niches within it. This increases diversity – and the chances of finding the needle in the haystack.
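What such an open interface could look like is sketched below; the class and method names are invented for illustration and do not belong to any existing platform API:

```python
from abc import ABC, abstractmethod
from typing import Iterable

class DataSource(ABC):
    """Open interface: any provider can plug a dataset into the platform."""
    @abstractmethod
    def records(self) -> Iterable[dict]: ...

class Analysis(ABC):
    """Open interface: any player, however small, can contribute an analysis."""
    @abstractmethod
    def run(self, records: list[dict]) -> dict: ...

class InMemorySource(DataSource):       # one possible niche contribution
    def __init__(self, rows: list[dict]):
        self.rows = rows
    def records(self) -> Iterable[dict]:
        return iter(self.rows)

class CountByField(Analysis):           # another, independent contribution
    def __init__(self, field: str):
        self.field = field
    def run(self, records: list[dict]) -> dict:
        counts: dict = {}
        for r in records:
            counts[r[self.field]] = counts.get(r[self.field], 0) + 1
        return counts

def run_pipeline(source: DataSource, analyses: list[Analysis]) -> list[dict]:
    data = list(source.records())       # materialize once, analyze many ways
    return [a.run(data) for a in analyses]

results = run_pipeline(
    InMemorySource([{"region": "north"}, {"region": "north"}, {"region": "south"}]),
    [CountByField("region")],
)
print(results)  # [{'north': 2, 'south': 1}]
```

The design choice that matters is the narrow, stable contract: a contributor only has to implement one method to plug a new data source or a new analysis into the ecosystem, which is what gives smaller players their chance.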
The biggest challenge is to build the analytical system so that it can be used to identify emergent behavior early on. The analytical system needs to “keep its options open” and not break the emergent behavior down too early into what could be its constituent parts.
Actuate: Why should developers look for Big Data analytics tools that are easy to embed, easy to deploy, and easy to scale?
Kugler: The challenge for the user is to begin to understand the data. The degree of “ease of use” of the tools and the underlying methods will have a great impact on the ability of the user to see patterns. The second wave of open source analytics and visualization projects requires committed participation by these users.
Developers are professionals in software and systems development – and they must continue to be. Additionally, they need to learn more about the application domains and their future connectivity, and about analysis and visualization – including a better understanding of human cognition: the human brain may simply filter out patterns it hasn’t experienced in the past. We see with our brain, not our eyes. Will something like “open proof” from the safety and security area be usable as “open analysis”?