The Time for Big Data is Now
Marylyn D Ritchie, PhD
Professor, Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, USA
The term ‘Big Data’ refers to any collection of data that is too large or complex for traditional computer software, database tools, and data processing methods to handle. The challenges that emerge from the generation of big data are vast and span data transfer, storage, curation, searching, analysis, and visualization. We have seen tremendous advances in data generation in all areas of science and technology, which have driven this emergence of ‘big data’. For example, the collection of data in astronomy, finance, genomics, meteorology, social science (networking), and internet search, among others, leads to an enormous wealth of data ready to be processed, analyzed, visualized, and interpreted. While this abundance of data has led to significant challenges in many scientific disciplines, it has also created tremendous opportunity for advancement and innovation, bridging the gap between computer science, engineering, and mathematics and the other scientific disciplines (life, physical, and social sciences). In many areas, research has shifted from sparseness or scarcity of data to data abundance; however, the theory and methodology to deal with the increasing complexity that emerges from large, comprehensive datasets are still in development. Nonetheless, ‘Big Data’ is everywhere and has gained significant attention. Here, we will discuss the realities, challenges, and future of ‘Big Data’ as it pertains broadly to the scientific community.
Background Review Article:
Geoffrey West, "Big Data Needs a Big Theory to Go with It," Scientific American, Apr 16, 2013.