5 things CIOs must watch out for in big data

In today’s world, we leave traces of information all the time, often without being aware of it. Traditionally, companies collected customer data when customers signed up for services. Today that is merely one of many data sources. Emails, blog posts, social media interactions, credit card spending and transaction logs are just a few of the sources that companies can tap into. Many companies have taken raw data, converted it into usable information and presented it in a format that customers are willing to pay for. While monetizing existing and new data is one of the top priorities for an organization, CIOs need to be cautious to ensure data privacy is not violated when monetizing this data, especially external data.

Here is a list of five things CIOs must watch out for before deploying big data:

  1. Choosing big data technology: Most organizations have some form of data warehouse for reporting and analytics. It typically contains data from ERP, CRM and other internal applications, plus some external data feeds. Almost all of this data is structured and stored in a traditional RDBMS or appliance. To bring in additional external data, which is very high in volume and can be unstructured, organizations must choose between expanding within the current infrastructure and extending it with newer technologies. Given the cost pressures from an investment standpoint and the need to monetize data, it becomes imperative to consider Hadoop-based technologies. A Hadoop-based architecture can co-exist with the traditional data architecture to ensure there is no disruption. Hadoop-based technologies, including HDFS and HBase, scale out horizontally, which results in lower upfront capital expenditure. The idea is to start small, prove that the technology works in the context of the business, and build upon it.
  2. Cloud computing: While making the decision to move to Hadoop, it is worth considering a cloud-based platform. The benefits are a quick start-up, no administrative costs (and therefore lower resource needs), on-demand scale-out that provides the flexibility of “pay as you need”, and no upgrade costs, all of which lower the overall TCO. While this model lowers capital expenditure, it increases operational expenditure, so planning becomes an important component of this model.
  3. Asset rationalization strategy: Organizations should have an overall asset rationalization strategy in place as they embark on these initiatives. This strategy should identify the strategic applications, their expiry dates and their need in the future. It should also encompass the cloud strategy for the organization, including how it wants to approach SaaS, PaaS and IaaS. Estimates indicate that the percentage of on-premise spending migrating to the cloud will rise from around 2.34% in 2014 to 14.39% in 2020. Building out the Hadoop infrastructure on the cloud should be in alignment with the overall strategy. It makes sense to consider a hybrid architecture, which helps with agility without compromising security, and to go about it in a phased manner.
  4. Social media analytics: Given the widespread adoption of social media, it pays to have a strategy around social media analytics. It is estimated that there are 2.07 billion active social media accounts, used by 29% of the world population, with YOY growth of 12%. Given this, it is impossible to ignore the data from this segment. The downstream analytics derived from it can be used for various purposes, including Social Media Marketing (SMM) and Search Engine Marketing (SEM). The data can be used for better customer profiling and segmentation, targeted campaigns and a better customer experience. A cloud-based Hadoop architecture lends itself well to collecting, storing and processing the vast volumes of social data.
  5. Leveraging sensors and GPS: Big data is not restricted to social media or unstructured data from websites. It also includes the billions of data points that are collected by sensors. In the IoT age, we are seeing an explosion of sensors, and the resulting data is available for use by downstream systems. For example, sensors placed in an HVAC system send data continuously, which, when combined with motion sensor data, can be used to optimize an air conditioning system and thereby save energy. Data from the GPS of a pharmaceutical sales representative who visits multiple physicians in a day can be combined with sales data to optimize the representative's route, minimizing travel time while maximizing face time with high-value physicians. This type of analytics is termed prescriptive analytics, where a course of action is recommended based on the data and analytics. Organizations should be prepared to tap into this by understanding the various data sources that can impact their business, and by understanding business drivers and processes well enough to build models, in conjunction with their business SMEs, whose output can be embedded in business processes. Collecting the right data and building the right model to provide meaningful output is the key to streamlining business processes.
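To make the HVAC example in the last point concrete, here is a minimal sketch in Python of how temperature readings might be combined with motion sensor data to recommend an action. Everything in it is an illustrative assumption rather than part of any real HVAC product: the `Reading` structure, the zone names, and the 22 °C and 26 °C setpoints are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One combined sample per zone (hypothetical structure)."""
    zone: str
    temperature_c: float   # from the HVAC temperature sensor
    motion_detected: bool  # from the motion sensor in the same zone


def recommend_setpoint(reading, occupied_c=22.0, idle_c=26.0):
    """Comfort target when the zone is occupied, relaxed target when empty.

    The two setpoints are assumed values chosen for illustration.
    """
    return occupied_c if reading.motion_detected else idle_c


def zones_to_cool(readings, occupied_c=22.0, idle_c=26.0):
    """Return the zones whose temperature exceeds their recommended setpoint."""
    return [
        r.zone
        for r in readings
        if r.temperature_c > recommend_setpoint(r, occupied_c, idle_c)
    ]


readings = [
    Reading("lobby", 24.5, True),             # occupied and above 22.0
    Reading("conference_room", 24.5, False),  # empty and below 26.0
    Reading("server_closet", 27.0, False),    # empty but above 26.0
]

print(zones_to_cool(readings))  # ['lobby', 'server_closet']
```

The empty conference room is left alone even though it is as warm as the lobby, which is where the energy saving comes from. A production model would be built with the business SMEs and would likely account for schedules, weather and equipment limits.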

Most organizations have been leveraging business intelligence tools for many years to help make the right decisions. With the advent of new data sources and technologies to process the data, traditional business intelligence has taken a giant leap. Outputs like customer segmentation have become far more sophisticated, given the additional data points and our ability to process them comprehensively, taking multiple variables into account. In a highly competitive landscape, the ability to monetize data directly and indirectly becomes a key differentiator.

The article was originally published on CXOToday.com on June 23, 2015 and is re-posted here by permission. 

Arvind Purushothaman

Arvind Purushothaman leads the Data & Analytics practice at Virtusa. He has over 22 years of experience in this space, and focuses on Data consulting, Data Engineering, Analytics and AI/ML.
