INFORMATION OVERLOAD: Why Solving The Last Mile Problem In Data Analytics Will Start A Revolution

Ron Bodkin, Teradata

To understand the role of open source in data analytics, it is helpful to think back to a concept from telecommunications: the last mile. The last mile is the distance from the distribution points of telecom and cable operators to the house and it was always seen as a bottleneck.

But there was good reason for that bottleneck: 95% or more of all the wiring in the system was in the last mile.

Something similar is happening in the world of open source data analytics. 

Hadoop is rightly celebrated as a game changer, but people forget about the “last mile”: the process of converting that power into something that most people can use. 

In my view, the revolution in big data will come from solving the “last mile” problem, that is, making big data useful for the masses.

Business and technology executives who are building data analytics capabilities to take advantage of big data can avoid many problems by understanding the relationship between open source and the last mile of data analytics.

The Awesome Power of Open Source

Open source should not be ignored by anyone seeking to create a best-in-class analytics system. 

Hadoop and its ecosystem don’t solve every challenge, but they do solve many. 

A wealth of data never before available has led to deeper insights and richer models that support advanced discovery and automated execution through predictive analytics and recommendation engines.

Hadoop has also led to transformative innovations in how data is processed and stored. 

Think about the way data lakes enable storage of massive amounts of data without first having to create a schema or even know what you might do with that data later. 
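
To make the schema-on-read idea concrete, here is a minimal sketch in plain Python. It is an illustration only: the in-memory list `raw_lake` stands in for files in a data lake, and the helper `read_with_schema` and the sample records are hypothetical, not part of Hadoop or any of its tools. The point it shows is that records are stored as they arrive, and each analysis picks the fields it needs only when it reads them.

```python
import json

# Hypothetical stand-in for raw files in a data lake: records are stored
# exactly as they arrive, with no schema enforced at write time.
raw_lake = [
    '{"user": "u1", "event": "click", "page": "/home"}',
    '{"user": "u2", "event": "purchase", "amount": 19.99}',
    '{"user": "u1", "event": "purchase", "amount": 5.0, "coupon": "X1"}',
]


def read_with_schema(lines, fields):
    """Apply a schema at read time by projecting each record onto `fields`.

    Fields missing from a record come back as None instead of failing.
    """
    for line in lines:
        record = json.loads(line)
        yield {name: record.get(name) for name in fields}


# One analysis decides, after the fact, that it cares about purchases.
purchases = [
    row
    for row in read_with_schema(raw_lake, ["user", "event", "amount"])
    if row["event"] == "purchase"
]
total = sum(row["amount"] for row in purchases)
```

A different analysis could read the same raw records with a different field list, which is the flexibility the data lake approach is after.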

Consider how sessionized schemas help aggregate data related to a user so you can analyze and attribute behavior across channels.
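
As a rough illustration of sessionization, the sketch below groups one user's events from several channels into sessions. It is plain Python rather than any Hadoop tool, and the 30-minute gap, the event tuples, and the function names `sessionize` and `_summarize` are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed cutoff: a gap longer than this starts a new session.
SESSION_GAP = timedelta(minutes=30)


def _summarize(user_id, items):
    """Collapse a run of (timestamp, channel) pairs into one session record."""
    return {
        "user_id": user_id,
        "start": items[0][0],
        "end": items[-1][0],
        "channels": [channel for _, channel in items],
    }


def sessionize(events, gap=SESSION_GAP):
    """Group (user_id, channel, timestamp) events into per-user sessions."""
    by_user = defaultdict(list)
    for user_id, channel, ts in events:
        by_user[user_id].append((ts, channel))

    sessions = []
    for user_id, items in by_user.items():
        items.sort()  # order each user's events by time
        current = [items[0]]
        for prev, item in zip(items, items[1:]):
            if item[0] - prev[0] > gap:
                # The gap is too long: close this session and start another.
                sessions.append(_summarize(user_id, current))
                current = []
            current.append(item)
        sessions.append(_summarize(user_id, current))
    return sessions


# Hypothetical events for two users across three channels.
events = [
    ("u1", "email", datetime(2016, 1, 1, 9, 0)),
    ("u1", "web", datetime(2016, 1, 1, 9, 10)),
    ("u2", "mobile", datetime(2016, 1, 1, 9, 5)),
    ("u1", "web", datetime(2016, 1, 1, 13, 0)),
]
sessions = sessionize(events)
```

Each session record keeps the ordered list of channels a user touched, which is the kind of structure that makes cross-channel attribution possible.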

Hadoop has answered an important question about the power of open source communities. 

Is it possible for open source to innovate and create an entirely new product? 

The answer is yes. Hadoop’s open tool chain allows data to be stored in HDFS and then processed with Hive, Spark, Pig, and Storm, or served through HBase.

The value, of course, does not come out of raw analytical power, but out of:

** Applying that analytical power to data

** Creating efficient, optimized approaches for analysis and modeling

** Creating a secure, manageable environment for data

** Building richer models of behavior and business activity

** Extracting valuable signals

** Delivering the ability to use and explore data to as many people as possible

The last mile problem for data analytics involves making these steps as easy as possible. This is where open source needs some support.

The Limits of Open Source for Applications

Most great open source projects created platforms for developers and gave rise to a thriving economy of applications and components, many of which are commercial.

It’s often a case of “open source plus.” Here are three examples.

1. The best systems leverage open source, but are customized. Facebook, Amazon, and Google all created systems relying heavily on open source, but their most valuable analytics are custom-built. I suspect that systems that enable thousands of users to do relevant analytics will be customized, even if they use open source.

2. Open source communities meet the most common needs, leaving niches for others to fill. Open source communities typically limit productization to a subset of common needs. Problem areas such as metadata management, integration with enterprise applications and datasets, and application security will likely be solved by commercial solutions because the knowledge required is too specialized and the market too small to nurture an open source community.

3. Commercial products offer a better user experience. When it comes to user experience, commercial products have a much better track record. The open source world created R and the awesome D3 JavaScript libraries for displaying data, but Tableau and Qlik created environments that broke new ground in offering a pleasing user experience for data discovery.

For all of these reasons, it seems likely that the best solutions will leverage open source but be enhanced by commercial or customized solutions that meet the needs of the last mile.

This blog was previously published on Teradata Data Points.
