4. Success Factors and Challenges

4.2. Challenges of Big Data and Business Analytics

Although big data can be useful, like any resource it poses implementation challenges if it is not handled properly. It is important to understand that big data does not equal good data: data collected from an imperfect world carries imperfections of its own. The principle of garbage in, garbage out still holds for data visualization and data analytics. The level of imperfection in the collected data must therefore be understood so that the outputs can be calibrated and interpreted with respect to the cleanliness or quality of the data. Some of these challenges are outlined below:

(1) A false sense of security

Big data gives a false sense of security, because having a huge amount of data does not necessarily mean that the results drawn from it are true. Big data may not capture the information needed to answer a particular business question, and sometimes small data is adequate or even better. On occasion the data needed to support a question may not exist, which should prompt either data collection or data acquisition efforts. However, collected data should not be used outside the sample it was drawn from to answer the wrong business problems.

(2) May waste resources

Unnecessary use of big data ties up computing resources, so money and time should not be committed to big data unless it is needed. It is wasteful for an organization to spend time computing on big data when small data can answer its questions. A case in point is the Google Flu Trends big data (engineering) project, in which Google attempted to predict flu outbreaks by measuring millions of Google search terms related to the flu, such as flu shots and flu symptoms. The proposed big data analytics approach failed because it overestimated results. The failure stemmed from the wrong choice of measurement: people who search for flu may not actually have the flu. Such errors can be avoided when the source of the data and its degree of cleanliness or quality are understood from a modeling or analytics perspective. Understand the assumptions in your models and verify your data to identify and eliminate bad data, outliers, and so on. If a big data system is already in place, do not be afraid to capture more data, even data that you think may be irrelevant, and verify model results over time.
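The advice to verify data and identify outliers can be illustrated with a minimal sketch. It assumes Tukey's interquartile-range rule as the screening method, which is only one of several possible choices and is not prescribed by the text; the function name `iqr_outliers` and the weekly counts are hypothetical.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Split values into (kept, flagged) using Tukey's IQR fences.

    Assumption: a point is suspicious when it lies more than k * IQR
    below the first quartile or above the third quartile.
    """
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    kept = [v for v in values if low <= v <= high]
    flagged = [v for v in values if v < low or v > high]
    return kept, flagged


# Hypothetical weekly counts; 950 stands in for a recording error.
weekly_counts = [102, 98, 110, 105, 99, 950, 101, 97, 108, 103]
kept, flagged = iqr_outliers(weekly_counts)
print("flagged for review:", flagged)
```

Flagged values are candidates for review rather than automatic deletion, since an extreme value may be a genuine signal and not bad data.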

(3) Physical challenges to Big Data

Big data poses challenges beyond volume, velocity, and variety. It also questions fundamental beliefs about the relationship between data and knowledge, and it strains current IT architecture, networks, servers, and software. Considering various reports and Cisco estimates by 2014 on the exponential growth of business data, the yearly doubling of Internet traffic will leave experts with the significant challenge of how these data will be collected and analyzed. Will all collected data be analyzed and stored? How do we determine what should be stored, and for how long? Will there be enough physical space for storage? The volume of data on internal networks will far exceed most networks' capacity for data transmission, so moving data onto higher-bandwidth networks becomes a requirement. Datacenter infrastructure to support big data, and storage for both online and archival data, are further problems to deal with. Even if the cost of hardware and software is made affordable, finding the people and time to make these changes while keeping the current enterprise fully operational is another challenge.
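To make the storage question concrete, the following sketch estimates how quickly a fixed capacity is exhausted under yearly doubling. The function name `years_until_exceeded` and the figures of 50 TB and 1000 TB are hypothetical illustrations, not values taken from the reports cited above.

```python
def years_until_exceeded(current_tb, capacity_tb, growth_factor=2.0):
    """Return the number of whole years until volume first exceeds capacity.

    Assumption: data volume is multiplied by growth_factor once per year
    (2.0 models the yearly doubling mentioned in the text).
    """
    if current_tb <= 0 or growth_factor <= 1:
        raise ValueError("need a positive volume and a growth factor above 1")
    years = 0
    volume = current_tb
    while volume <= capacity_tb:
        volume *= growth_factor
        years += 1
    return years


# Hypothetical figures: 50 TB stored today, 1000 TB of installed capacity.
print(years_until_exceeded(50, 1000))
```

Under these assumed figures, a capacity twenty times the current volume is exceeded within five years, which is why retention policy matters as much as hardware.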

(4) Management challenges

Management challenges concern security, privacy and civil liberties, regulation, and compliance. The line between good and bad in every technology is determined by the people who use it and how they use it. Just as big data has many remarkable benefits, it also has many possible harmful and criminal uses, and it can be very destructive in the wrong hands. The original architecture of big data was not built with enough emphasis on security, and some users take advantage of this design oversight, which gives rise to the following management-related challenges.

  • Security and privacy: The digital world is under threat from criminal attacks. Because big data encourages the collection and analysis of everything, people's privacy and civil liberties are at high risk. Organized crime now uses big data technology to run cyber-scams. The criminals use the platform to identify victims, normally elderly people, and their relationship to relatives who are traveling to foreign countries. They then call the victim and impersonate foreign officials, asking for immediate payment to post bail or to pay for urgent medical care. The collected data gives them enough information to make the scam work and to intrude upon the victim's privacy. Vast collections of data can also be used to attack the economy, infrastructure, and personnel of an opponent, and today there are real threats of cyber blackmail being used to bend an enemy to one's will. For example, in Nigeria, political parties use this platform to attack one another and to win members to their group. In the worst cases, world leaders have been insulted by this means, which seems to create a society void of respect and dignity. Whether big data is the end of privacy is an extremely controversial privacy and civil liberties issue. There is no doubt that in today's world people leave an increasingly detailed and complete digital footprint, and a number of companies make revenue by tracking every click and every second spent on the Internet. The number of companies, government agencies, and research organizations that track and use telephony data from mobile phones is growing rapidly. They track every movement of a switched-on mobile phone and store all of it in a big data solution. Perhaps broad public education and discussion, aimed at determining the right standards, policies, regulations, and laws, could create a new version of the community and resolve the perception of big data as the end of privacy.
  • Regulatory and compliance: Deeper knowledge of big data technology has led to an increase in regulatory requirements. Europe is taking the lead in setting rules around the capture and use of various sources of data such as e-mails, instant messages, web forms, mobile records, and mobile data. The tools and practices for ensuring compliance with these new regulations are immature or do not yet exist, so managing compliance will require continuous attention to detail and to new tool offerings. This concern calls for enhancements to the Hadoop cluster, where encrypting data is a known difficulty. Currently, Kerberos is one of the most common security technologies deployed with a Hadoop cluster. Kerberos is an open source project that originated at MIT. It is fundamentally a network authentication protocol, designed on a client-server model, that uses strong cryptography to provide mutual authentication between the user and the server.