Read this paper for an overview and examples of how big data is used in specific areas of business and industry, such as supply chain management, risk management, and logistics. One of the biggest issues analysts face with big data is knowing how to separate the valuable data from data that does not help meet their requirements.
Sometimes people describe intelligence as "connecting the dots", but it is rarely simple like a "paint-by-numbers" art project. The dots are not just lying around waiting to be connected. More appropriately, it has been described as filtering out the right radio signals from the fray in a huge city. You have to be carefully tuned to your requirements, which will be discussed at length in Unit 2 and again in Unit 8, as these are the guide stars that keep you on track to finding the right data to answer the questions you need to focus on.
5. Conclusions
5.1. Hurdles
Despite all the benefits of big data for businesses, applying big data has not yet been accepted at the management level of many companies; this may be because using big data requires a high initial investment. Management support has a crucial role in the successful implementation of a big data analysis system. An initial cost-and-benefit assessment of big data in terms of how it will be utilized long-term is a very difficult task. It is not easy to determine whether applying big data analysis is beneficial for businesses that make fewer than a certain number of transactions every day. In some companies, such as Amazon, Walmart, and Google, traditional systems cannot be used to analyze the data because the volume is so enormous. However, in some other companies, the V characteristics of big data are more questionable, and there is no rule of thumb to tell the manager whether the available data is suitable for establishing a big data analytics framework.
One of the problems with big data applications is knowing how, where, and by which means to collect useful data. Another issue is the risk of inadvertently filtering valuable information out of the available data. The analyst should know which information he/she wants to exclude from the data; additionally, the available data should be able to answer the analyst's questions. Yet another problem is finding the methods that provide the most accurate answer within a reasonable amount of time and financial cost. There is an increasing demand for employees who are qualified to analyze big data as companies respond to the rapid pace of technology developments. It is also a challenge to create trust between data analysts and managers. Most systems initially resist change, so it is vital to have the higher-level managers' support in order to use the results of data analysis to change a system.
A computer's central processing unit (CPU) can be an obstacle for big data analysis because traditional computers have limited capability to store and effectively process a big data set. In addition, not all of the available raw data is complete and consistent; therefore, effective cleaning and integration methods are required to make the dataset ready for analysis.
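As an illustration of the cleaning and integration step mentioned above, the following is a minimal sketch in Python. The record layout (`order_id`, `city`, `quantity`) and the function name `clean_records` are hypothetical and chosen only for this example; real pipelines would apply rules specific to their own data sources.

```python
def clean_records(records: list[dict]) -> list[dict]:
    """Drop incomplete rows, normalize fields, and remove duplicates."""
    required = ("order_id", "city", "quantity")  # hypothetical schema
    seen_ids = set()
    cleaned = []
    for row in records:
        # Completeness: skip rows with a missing or empty required field.
        if any(row.get(key) in (None, "") for key in required):
            continue
        # Consistency: quantity must be convertible to an integer.
        try:
            quantity = int(row["quantity"])
        except (TypeError, ValueError):
            continue
        # Consistency: unify whitespace and capitalization of text fields.
        order_id = str(row["order_id"]).strip()
        city = str(row["city"]).strip().title()
        # Integration: the same order may arrive from two source systems,
        # so keep only the first occurrence of each order_id.
        if order_id in seen_ids:
            continue
        seen_ids.add(order_id)
        cleaned.append({"order_id": order_id, "city": city, "quantity": quantity})
    return cleaned


raw = [
    {"order_id": "A1", "city": " boston ", "quantity": "3"},
    {"order_id": "A1", "city": "Boston", "quantity": 3},        # duplicate order
    {"order_id": "A2", "city": "", "quantity": 5},               # missing city
    {"order_id": "A3", "city": "NEW YORK", "quantity": "two"},   # invalid quantity
    {"order_id": "A4", "city": "new york", "quantity": "7"},
]
cleaned = clean_records(raw)
print(cleaned)
```

Of the five raw rows in this toy input, only the first `A1` record and the `A4` record survive. Discarding rows is the simplest policy; depending on the analysis, imputing missing values or flagging records for review may be more appropriate.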
The 5th "V" added to the definition of big data refers to the value that can be obtained from a big data analysis. Unfortunately, wherever there is value, hacking can crop up as well. Information security can therefore be a hurdle to applying big data analysis in companies. A huge data volume increases the probability of having confidential and valuable information in the system, and this may increase data vulnerability and the chance of attacks by cybercriminals.
Another issue can be selecting the appropriate data for decision-making in a system. More explicitly, not all of the available data in a system is used for making each and every decision. The knowledge and experience of the data analyst determine which part of a dataset should be used. Moreover, it is an unfortunate fact that the available big data may not necessarily be created by the target population. For example, there is a huge volume of information on Twitter, but not everyone in a community will have a Twitter account. Thus, one part of the community creates a lot of data, while the other part is not involved in creating any of the available information in a dataset. This introduces statistical uncertainties such as biased statistics.
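The effect of data being generated by only part of the target population can be shown with a small numerical sketch. The community below, its `has_account` flag, and the weekly spending figures are invented purely for illustration; they are not taken from the paper.

```python
from statistics import mean

# Hypothetical community: (has_account, weekly_spend) for each resident.
community = [
    (True, 120), (True, 110), (True, 130),
    (False, 60), (False, 50), (False, 70), (False, 40), (False, 60),
]


def observed_mean(people):
    """Mean computed only from residents who generate data."""
    return mean(spend for has_account, spend in people if has_account)


def population_mean(people):
    """Mean over the whole target population."""
    return mean(spend for _, spend in people)


observed = observed_mean(community)
actual = population_mean(community)
bias = observed - actual
print(f"observed={observed}, actual={actual}, bias={bias}")
```

In this made-up example, the residents with accounts average 120, while the whole community averages 80, so an estimate built only on the available data overstates the true value by 40. Collecting more data from the same users would not reduce this gap, because the error comes from who produces the data rather than from how much of it there is.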
Applying the results of a non-real-time data analysis can be misleading, because an analysis of historical data may differ significantly from one based on real-time data; for example, the basic assumption of forecasting analysis that "the future follows the past" may not hold. As an additional example, when the data shows the efficiency of a transportation path in terms of cost and time (many companies have access to this data), other companies may start to use this same path. As demand for that path increases, it may lose its attractiveness in terms of both cost and time.
Another long-standing and common challenge is sharing data between the echelons of a supply chain, or even between the various departments of a plant. It is challenging to ensure that all the stakeholders who share data receive some benefit from this cooperation. Moreover, it is vital to have a supportive information technology department that provides both the hardware and software required for working with big data. Data analysts who can work with big data should be hired, or knowledgeable instructors should be readily available to train the plant's big data analysts.