A better way to tackle all that data
August 21, 2013 | 9:54 am
The single biggest challenge any organization faces in a world awash in data is the time it takes to make a decision. We can amass all of the data in the world, but if it doesn’t help us allocate resources better or avoid a crisis, what good is it? Hampered by a shortage of qualified data scientists, our ability to perform analysis and reach conclusions quickly is being outstripped by big data’s rise.
At the root of this problem is our concept of what constitutes data. Given that the boundaries of what we can digitize and analyze are moving outward every day, our ability to process the right data and perform the right analysis is headed for serious trouble.
The measure of how long it takes analytics to reach a conclusion is often called “time to decision.” If we accept that big data’s holy grail is better, faster decision-making, we have to believe that as data continue to grow in volume, velocity and variety, making management more complex and potentially slowing time to decision, something has to give.
This is a problem crying out for a solution that has long been in development but has only recently become effective and economically feasible enough for widespread adoption – machine learning.
As the term suggests, machine learning is a branch of computer science where algorithms learn from and react to data just as humans do. Machine-learning software identifies hidden patterns in data and uses those patterns both to group similar data and to make predictions. Each time new data are added and analyzed, the software gains a clearer view of data patterns and gets closer to making the optimal prediction or reaching a meaningful understanding.
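To make the idea above concrete, here is a minimal sketch (hypothetical, not from the article) of a learner that groups labelled examples around running centroids and predicts by proximity; its view of the data sharpens as each new example is folded in:

```python
# Toy nearest-centroid learner: groups similar data by label and
# predicts the label whose centroid is closest to a new point.
class NearestCentroid:
    def __init__(self):
        self.sums = {}    # label -> running sum of feature values
        self.counts = {}  # label -> number of examples seen

    def learn(self, features, label):
        # Fold one labelled example into that label's running centroid.
        s = self.sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            s[i] += x
        self.counts[label] = self.counts.get(label, 0) + 1

    def predict(self, features):
        # Choose the label with the nearest centroid (squared distance).
        def dist(label):
            centroid = [v / self.counts[label] for v in self.sums[label]]
            return sum((a - b) ** 2 for a, b in zip(features, centroid))
        return min(self.sums, key=dist)

model = NearestCentroid()
for feats, label in [([1.0, 1.0], "low"), ([1.2, 0.8], "low"),
                     ([5.0, 5.2], "high"), ([4.8, 5.1], "high")]:
    model.learn(feats, label)

print(model.predict([1.1, 0.9]))  # closest to the "low" centroid
```

Each call to `learn` nudges a centroid, so predictions improve as data accumulate, which is the pattern the paragraph describes.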
Machine learning does this by turning the conventional data-mining practice on its head. Rather than scientists beginning with a (possibly biased) hypothesis that they then seek to confirm or disprove using a body of data, the machine starts with a definition of an ideal outcome that it uses to decide which data matter and how they should factor into solving problems. The idea is that if we know the optimal way for something to operate, we can figure out exactly what to change in a suboptimal situation.
Thus, for example, a complex system like commuter train service has targets for the on-time, safe delivery of passengers that present an optimization problem in real time based on a variety of fluctuating variables, ranging from the weather to load size. Machine-learning software on board the trains themselves can take these factors into account, running hundreds of calculations per second in order to direct an engineer to operate at the proper speed.
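The train example can be sketched as a tiny real-time optimization: each tick, score candidate speeds against an objective that balances schedule adherence with a safety ceiling that shrinks in bad weather and under heavy load. The function, its parameters, and all numbers here are illustrative assumptions, not anything from an actual rail system:

```python
# Hypothetical on-board optimizer: pick the speed that best trades off
# lateness against a condition-dependent safety limit.
def recommend_speed(distance_km, minutes_left, weather_grip, load_tons):
    # Speed needed to arrive exactly on time.
    required = distance_km / (minutes_left / 60.0)
    # Safety ceiling drops as grip worsens and load grows (toy model).
    ceiling = 120.0 * weather_grip - 0.05 * load_tons
    candidates = range(40, 121, 5)  # speeds (km/h) the engineer can hold

    def cost(v):
        lateness = max(0.0, required - v)   # too slow -> behind schedule
        overshoot = max(0.0, v - ceiling)   # too fast -> unsafe
        return lateness + 10.0 * overshoot  # safety weighted heavily

    return min(candidates, key=cost)

# 30 km to cover in 20 minutes, decent grip, 200-ton load.
print(recommend_speed(distance_km=30, minutes_left=20,
                      weather_grip=0.9, load_tons=200))  # 90
```

A production system would rerun this kind of evaluation many times per second as the variables fluctuate; the structure stays the same even when the model gets richer.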
The Nest thermostat is a well-known example of machine learning applied to very local data. As people turn the dial on the Nest thermostat, it learns their temperature preferences and begins to manage the heating and cooling of their home automatically, regardless of the time of day or day of the week. The system never stops learning, allowing people to continuously define the optimum.
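As a rough sketch of that learning loop (an assumption for illustration, not Nest's actual algorithm), a thermostat could keep a running average of the user's manual adjustments per hour of the day and use that average as the automatic setpoint:

```python
from collections import defaultdict

# Toy schedule learner: each manual adjustment updates the average
# preferred temperature for that hour, which becomes the setpoint.
class LearningThermostat:
    def __init__(self, default=20.0):
        self.default = default
        self.totals = defaultdict(float)  # hour -> sum of chosen temps
        self.counts = defaultdict(int)    # hour -> adjustments seen

    def user_adjusts(self, hour, temperature):
        self.totals[hour] += temperature
        self.counts[hour] += 1

    def setpoint(self, hour):
        # Fall back to the default until the user has taught this hour.
        if self.counts[hour] == 0:
            return self.default
        return self.totals[hour] / self.counts[hour]

nest = LearningThermostat()
nest.user_adjusts(7, 21.0)   # warmer mornings
nest.user_adjusts(7, 23.0)
nest.user_adjusts(23, 17.0)  # cooler nights
print(nest.setpoint(7))      # 22.0, the learned morning preference
print(nest.setpoint(12))     # 20.0, no data yet, so the default
```

Because every new adjustment keeps updating the averages, the "optimum" is continuously redefined by the user, as the paragraph notes.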
With the rise of off-the-shelf software such as solver, winner of a recent crowdsourcing contest to find better ways to recognize Parkinson’s disease, machine learning is at last entering the mainstream, available to many more businesses than just the likes of Google and Facebook.
More and more companies may now see it as a viable way to address the rapid proliferation of data. Expect to see machine learning used to train supply-chain systems, predict the weather and spot fraud, and especially in customer experience management, where it can help decide which variables and context matter for customer responses to marketing strategies.
By: CHRIS TAYLOR