The terms "big data" and "open data" are now in vogue, and they matter because, in the long run, every aspect of our society will be profoundly affected by both.
Big data refers to extremely large or complex data sets which, when analysed, can reveal previously unknown correlations, patterns, associations and insights, particularly about human interactions and behaviour.
Big data is mostly generated in real time and comes from multiple sources such as devices, transactional applications, the Internet, video, audio, networks, log files and sensors. Analysing big data can provide information on market trends, disease outbreaks, customer preferences and much else.
Open data can be defined as data that is readily available and accessible, and that anyone can freely use, re-use and redistribute.
Interoperability is a key characteristic of open data since it allows different data sets to be combined with little difficulty, thereby producing better knowledge products and services enriched with multiple data sources.
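To make the idea of combining data sets concrete, here is a minimal sketch in Python. The two data sets, the district codes and the column names are all hypothetical, invented purely for illustration; the only assumption that matters is that both sets identify districts by the same code, which is what makes them interoperable.

```python
import csv
import io

# Hypothetical sample data: both sets identify districts by the same code.
POPULATION_CSV = """district_code,population
D01,120000
D02,85000
D03,40000
"""

CLINICS_CSV = """district_code,clinics
D01,12
D02,5
D04,3
"""


def load_by_key(csv_text, key):
    """Read CSV text into a dict keyed by the shared identifier column."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(csv_text))}


def combine(left, right):
    """Merge records whose keys appear in both data sets (an inner join)."""
    shared_keys = sorted(left.keys() & right.keys())
    return {k: {**left[k], **right[k]} for k in shared_keys}


def people_per_clinic(combined):
    """Derive a figure that neither data set could provide on its own."""
    return {
        k: int(row["population"]) // int(row["clinics"])
        for k, row in combined.items()
    }


population = load_by_key(POPULATION_CSV, "district_code")
clinics = load_by_key(CLINICS_CSV, "district_code")
combined = combine(population, clinics)
ratios = people_per_clinic(combined)
print(ratios)
```

Districts that appear in only one of the two sets are left out of the result. If the two publishers had used different codes for the same district, the join would need a manual mapping first, which is exactly the friction interoperability removes.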
Open and big data in action
Although big data and open data are similar in some ways, they are not the same. Open data unlocks big data to make it more relevant, democratic, useful and user-friendly. By definition, big data is characterised by its size, whereas open data is characterised by how it may be used.
It is important to note that what we consider big data today may not seem as big in future, as improvements in computing power and data analysis capability reduce the complexity of handling data sets previously classified as big.
Big data that is not open is invariably undemocratic, since it confers a lot of power on those who have access to it, to the disadvantage of the rest of the population. Such data can include proprietary commercial data, military or national security data, sensitive research data, etc.
It must be pointed out that open data does not have to be big to be important and relevant; a small amount of data can still have far-reaching impact. For example, making local government budget data available to citizens in an easy-to-understand format can go a long way to empower them to hold duty bearers to account.
Big data applications can be seen in diverse fields, including retail and service businesses, where they help in understanding consumer needs. Big data analytics is now being applied to improve healthcare delivery by helping us understand diseases, thereby contributing to research and the development of better cures, as well as to the monitoring and prediction of disease outbreaks and the mapping of epidemics.
National security and law enforcement agencies are also making use of big data analytics to thwart terror plots, detect and prevent cyber-attacks, arrest criminals and even predict the occurrence of some types of criminal activity.
Big data analytics can also help forecast the volatility of political uprisings in real time. Elsewhere, politicians use big data analytics in their election campaigns so that they can target prospective voters with the right messages. Furthermore, big and open data can help increase citizen engagement, thereby improving the democratic process and good governance.
To this end, the government has an important role to play in turning big data into open data, making it a powerful tool for citizens to participate fully in the governance process.
For example, http://data.gov.gh, maintained by the National Information Technology Agency (NITA), is a very important resource; however, the portal can be improved by adding more datasets and, more importantly, by ensuring that the information published is up to date.
To make these applications possible, a number of capabilities need to be developed, including the use of advanced analytics techniques such as natural language processing, data mining, text analytics, machine learning, predictive analytics and statistical analysis to make sense of the extremely high volumes of data that are increasingly becoming available.
Frameworks such as Hadoop and related tools such as YARN, MapReduce, Spark, Hive and Pig, together with NoSQL databases, are some examples of technologies that support the processing of large datasets.
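For readers curious about how a model like MapReduce splits up work, the following is a minimal single-machine sketch in Python. It does not use Hadoop or any of its APIs; the function names and sample documents are hypothetical, and the sketch only illustrates the three conceptual steps (map, shuffle, reduce) that such frameworks distribute across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values under their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents):
    """Run the three phases in sequence over a list of documents."""
    pairs = []
    for document in documents:
        # In a real cluster, each document (or chunk) would be mapped
        # on a different machine; here the loop runs them one by one.
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle_phase(pairs))


# Hypothetical input, standing in for a very large collection of text.
documents = ["open data big data", "big data analytics"]
counts = word_count(documents)
print(counts)
```

Because each document is mapped independently and each word is reduced independently, both steps can be spread across many machines, which is what allows this style of processing to scale to very large datasets.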
The age of big and open data is upon us. Their combination can lead to significant changes in our society: it can help organisations operate more effectively, expand the frontiers of science and research, and improve emergency preparedness, stock market trading and sports performance.
Although working with big data comes with challenges in capturing, storing, updating, querying, curating, analysing, searching, sharing and visualising it, as well as privacy issues, big and open data provide unparalleled power to understand our environment and to undertake better analysis that supports evidence-based decision making.
This can produce tangible dividends in the lives of citizens while helping to solve the complex problems Ghanaians face, especially in the fight against poverty, disease and hunger.
The writer is the Executive Director of Penplusbytes.org.