Big data and business analytics methods for improved business decision-making, technological approaches, applications, and open research challenges. Big data has brought companies in developed countries many positive effects, which those in emerging and developing nations may replicate. However, big data poses many challenges, including data security, management, characteristics, compliance, and regulation. This paper presents a structured breakdown of the architecture, components, and tools that provide effective and efficient processing in the Hadoop ecosystem.
6. Summary and Open Research Directions
The potential benefits of big data are tremendous. For a business, technology is primarily a means of staying close to its customers. As shown in this work, enterprises that embarked on big data projects have experienced substantial business growth. Big data has helped organizations achieve cost reductions, make faster and better decisions, and provide new offerings to customers, as discussed in this paper. Hadoop and cloud-based analytics have contributed substantially to reducing the cost of the technology compared with traditional architectures (data warehouses and data marts in particular). However, big data does not replace data warehouses; it augments them. Rather than processing and storing vast quantities of new data in a data warehouse, for example, companies use Hadoop clusters for that purpose and move data to enterprise warehouses as needed for production analytical applications. Analytics has always helped to improve decision-making, but big data leverages the speed of Hadoop and in-memory analytics to generate faster and better decisions. For example, the health insurance giant United Healthcare uses natural language processing tools from SAS to better understand customer satisfaction and when to intervene to improve it. The most interesting use of big data analytics is to create new products and services for customers, as seen in the case of BancaCarige and other organizations mentioned in this work. Big data is characterized by volume, variety, and velocity. Understanding customer demand requires an excellent grasp and analysis of business data; this is the key to developing successful new products and services.
Big data has some limitations. Collecting big data does not by itself guarantee a good result. Big data encourages the collection and analysis of everything. The collected data contain some level of imperfection which, if not properly cleaned, can yield bad results that lead to wrong business decisions. Data generated via big data sometimes fall into the wrong hands and are used to perpetrate crime. Unnecessary use of the technology can also waste computing resources. Organizations should therefore be familiar with the analytics offerings of distribution (distro) companies, clearly understand their own business requirements, and apply a matching solution that fits their business environment in order to avoid wasting resources. In addition, a big data solution should be used only when it is needed. The technology is complex and runs on a supercomputing platform, which has resulted in new roles for supercomputing experts. Organizations should bear in mind that it takes time to build complex, sophisticated, and intensive technology skills. They should therefore invest in their teams to achieve good results.
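The effect of uncleaned data on a simple business metric can be illustrated with a minimal Python sketch. The records, the field names (`order_id`, `amount`), and the validity rule that amounts must be positive are all hypothetical assumptions made for illustration; they do not come from the systems discussed in this paper.

```python
from statistics import mean

# Hypothetical raw records: one duplicate, one missing value, one implausible value.
raw_orders = [
    {"order_id": 1, "amount": 120.0},
    {"order_id": 2, "amount": 80.0},
    {"order_id": 2, "amount": 80.0},      # duplicate record
    {"order_id": 3, "amount": None},      # missing value
    {"order_id": 4, "amount": -5000.0},   # implausible negative amount
    {"order_id": 5, "amount": 100.0},
]


def clean_orders(records):
    """Drop duplicate orders, missing amounts, and non-positive amounts."""
    seen_ids = set()
    cleaned = []
    for record in records:
        amount = record["amount"]
        if record["order_id"] in seen_ids:
            continue  # keep only the first occurrence of each order
        if amount is None or amount <= 0:
            continue  # assumed validity rule: amounts must be positive
        seen_ids.add(record["order_id"])
        cleaned.append(record)
    return cleaned


def average_amount(records):
    """Average order amount over the records that have an amount at all."""
    return mean(r["amount"] for r in records if r["amount"] is not None)


naive_average = average_amount(raw_orders)
clean_average = average_amount(clean_orders(raw_orders))
print(naive_average, clean_average)
```

With these illustrative records, the average computed on the raw data is negative, while the average on the cleaned data is 100.0. A decision based on the first figure would be wrong even though the underlying business is healthy.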
Big data has a wide range of applications, including business, government, health, education, and finance. Finally, every organization should bear in mind that determining which data are relevant is key to delivering value from massive amounts of data.
Although this paper has discussed various big data and business analytics approaches that can be deployed for enhanced business acceleration and development, several open research directions have appeared consistently in the recent literature. These directions cut across the analysis of heterogeneous data, data privacy and security, a unified framework for data cleaning, and deep learning techniques for big data processing.
- Data privacy and security: In our opinion, one of the major challenges in developing effective big data and business analytics is how to develop a security mechanism that protects users. With such a mechanism, business owners will be confident in sharing their user data to develop the next generation of big data analytics protocols that take the security challenges into account. This may involve a dynamic security mechanism that handles the changing nature of big data, especially mobile big data; big data algorithms that preserve privacy during data extraction; and filtering techniques that reduce the consumption of scarce bandwidth in mobile networks through computational offloading. In addition, ways to generate the right metadata for analysis with scalable data mining also require further research.
- Effective techniques for heterogeneous data analysis: Techniques and frameworks for analyzing heterogeneous big data are highly required for various economic enhancements and for applications such as disease control, transportation network scheduling, and modeling the dynamic distribution of population for human mobility. Moreover, further techniques are needed for cleaning and aggregating the recent explosion of mobile big data, and for analyzing these data for targeted advertising, behavioral analysis, detection of crime hotspot zones, and disaster management.
- Deep learning methods for big data and business analytics: Deep learning techniques are automatic feature representation approaches for big data analysis and have been widely applied in image classification, medical diagnosis, natural language processing, and human activity identification using data from smartphones and other cyber-physical systems. Various deep learning approaches have been proposed for the analysis of a variety of data models. These include convolutional neural networks, deep autoencoders, restricted Boltzmann machines, and recurrent neural networks. However, some areas of deep learning for big data analytics remain scarcely explored. These include the evaluation of deep learning methods on a variety of datasets, hyperparameter tuning for improved results, handling class imbalance, and efficient, real-time analysis of big data using deep learning approaches.
- Data fusion for big data and business analytics: Another area that has received less attention and requires further research is data fusion for effective data analysis. Data fusion methods integrate heterogeneous or homogeneous data in order to increase the reliability, robustness, and generalizability of big data analytics algorithms. In addition, big data fusion approaches are necessary to reduce the uncertainty and the impact of indirect capture that are common during big data generation. Areas that require further research include cyber-physical implementation for Internet of Things applications, improved decision fusion for enhanced generalization and diversity, reliable approaches to combining heterogeneous data, and identifying the importance of each individual data modality for big data and business analytics before fusion is performed.
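As a concrete illustration of how fusion can reduce uncertainty, the following minimal Python sketch applies inverse-variance weighting, one standard fusion rule for independent, unbiased estimates of the same quantity. It is not a method proposed in this paper. The function name `fuse_inverse_variance`, the two sources, and their readings and variances are hypothetical and chosen only for illustration.

```python
def fuse_inverse_variance(estimates):
    """Fuse (value, variance) pairs into a single (value, variance) estimate.

    Each source is weighted by 1/variance, so noisier sources count less.
    Assumes independent, unbiased sources with known positive variances.
    """
    if not estimates:
        raise ValueError("need at least one estimate")
    weights = []
    for _, variance in estimates:
        if variance <= 0:
            raise ValueError("variances must be positive")
        weights.append(1.0 / variance)
    total_weight = sum(weights)
    fused_value = sum(
        w * value for w, (value, _) in zip(weights, estimates)
    ) / total_weight
    fused_variance = 1.0 / total_weight
    return fused_value, fused_variance


# Hypothetical readings of the same quantity from two sources.
sensor_reading = (20.0, 4.0)   # (value, variance): the noisier source
survey_reading = (24.0, 1.0)   # (value, variance): the more reliable source

fused_value, fused_variance = fuse_inverse_variance([sensor_reading, survey_reading])
print(fused_value, fused_variance)
```

Under the stated assumptions, the fused variance is never larger than the smallest input variance, which is one simple sense in which fusion increases reliability. The weights also give a rough indication of how much each modality contributes before fusion is performed.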