Organizations are collecting and using more data than ever before, but the process of gathering and analyzing information isn’t new. From the abacus to the computer, using technology to identify and understand trends is a common thread we can trace throughout our history. It’s no surprise that big data has grown into a $57 billion industry and is projected to grow at a steady rate over the next four years.1 In this blog post we’ll look at the many definitions of big data and what it means in the managed print services (MPS) space.
What is big data?
Let’s define big data. What is it and what differentiates it from other types of data, like the information on your phone or computer? As the name suggests, quantity plays a significant role in what makes data ‘big,’ but it isn’t the only measure. There are three criteria that define big data, often referred to as the three Vs: volume, velocity and variety.
Volume refers to how much data is collected from various sources, which can include everything from business transaction information to marketing and social media metrics. With the rise of the Internet of Things (IoT) and machine-to-machine (M2M) communication, it can also include sensor information. Smart device providers like Nest, for example, can collect data from their learning thermostats and from the app that controls them. They can, in turn, use that information to better understand who uses their product, which helps refine sales and marketing strategies, and how consumers use it, which helps improve usability and design.
Velocity refers to how fast data is collected and how quickly it is turned into actionable information. We can share information so quickly that it often feels instantaneous, but a lot of data processing happens during the seconds or minutes it takes to send a text or email. In the world of data analytics, this can present a significant challenge. Depending on the type of data being collected, some organizations can store it for a period of time before analyzing it. Others may need near real-time insights that are only made possible with advanced analytics technologies capable of processing large quantities of information at remarkable speeds.
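To make the difference between storing data for later analysis and near real-time analysis a little more concrete, here is a minimal Python sketch. The readings, the function names and the three-reading window are all made up for illustration; real analytics platforms work at a very different scale, so treat this as a toy model only.

```python
from collections import deque
from statistics import mean

# Hypothetical sensor readings (for example, temperatures), in arrival order.
readings = [20.5, 21.0, 21.5, 23.0, 22.0, 24.0]

def batch_average(stored):
    """Store first, analyze later: one answer once all the data is in."""
    return mean(stored)

def rolling_averages(stream, window=3):
    """Near real-time: produce an updated average as each reading arrives."""
    recent = deque(maxlen=window)  # keeps only the latest `window` readings
    for value in stream:
        recent.append(value)
        yield mean(recent)

batch_result = batch_average(readings)
live_results = list(rolling_averages(readings))
```

The batch approach gives a single figure after the fact, while the rolling approach gives a fresh figure with every new reading, which is closer to what organizations with near real-time needs are after.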
Variety, the third criterion that separates big data from ‘regular’ datasets, refers to the formats the data comes in. You might derive everyday insights from a handful of sources based on your role and your organization, but big data involves collecting information from a wide variety of sources, including structured, numerical data from a database as well as unstructured data from text documents, email, video, audio, financial transactions and more. Powerful analytic tools enable us to not only collect data in various formats but also combine and cross-reference it for valuable insights that we might otherwise miss. Consider how much data e-commerce titans like Amazon collect on a daily basis and how varied the datasets are, from information about transactions to ad-buying behaviour to social media and search engine results. All of this data helps retailers like Amazon forecast trends, identify changes in buying habits, understand what and how marketers are advertising and more, but they need the resources to be able to collect and process the raw data.
Despite the three Vs, the definition of big data varies depending on who you ask. Here are some definitions that show just how varied the concept is:
Big data is an evolving term that describes any voluminous amount of structured, semistructured and unstructured data that has the potential to be mined for information.2
Big data is a term that describes the large volume of data – both structured and unstructured – that inundates a business on a day-to-day basis.3
A comprehensive post from datascience@berkeley lists over 40 definitions from industry thought leaders. Here are some interesting ones that highlight key differences in how we understand big data:
Big Data is the result of collecting information at its most granular level.4
Big data refers to the approach to data of “collect now, sort out later”…meaning you capture and store data on a very large volume of actions and transactions of different types, on a continuous basis, in order to make sense of it later.5
Big data describes datasets that are so large, complex, or rapidly changing that they push the very limits of our analytical capability.6
Big Data is nothing more than a tool for capturing reality.7
Most definitions point to the quantity of data as a key characteristic, but a number of them also refer to the tools used or types of insights gathered from the data. There’s no single definition and the ambiguity around what is and what isn’t considered big data brings us to our next question: can we call print data big data? The short answer is, it depends.
Is print data big data?
In the world of print management, big data may still seem like an abstract concept with little relevance. How much data do you collect? On a daily basis, maybe not a lot. On an annual basis, maybe a significant amount. As the definitions above show, volume isn’t everything when it comes to big data. The better question might be, what are you doing with the data you collect? One of the common themes in most conversations about big data is the impact it has on an organization. Collecting large amounts of data is pointless unless you eventually analyze it and develop insights that help your business be more successful.
It’s a bit of a stretch to suggest that all device data is big data if it helps you drive growth, but there is something to be said for innovative data collection processes, new analytic tools and new ways of using data to make your MPS program successful. If we look at managed print as more than just device data and combine it with transaction data, customer information and social media insights, we get a much bigger and much more useful picture of who is using our managed print solution(s), which solution(s) they are using and why, and how we as MPS providers can better market, sell and support our solutions.
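As a rough illustration of what combining these datasets might look like, here is a small Python sketch. Every record, field name and the build_customer_view function are hypothetical and invented for this example; they don't reflect any particular MPS product or data format.

```python
from collections import defaultdict

# All records below are made-up examples, not real MPS data.
device_data = [
    {"device_id": "D1", "customer_id": "C1", "pages_printed": 1200},
    {"device_id": "D2", "customer_id": "C1", "pages_printed": 300},
    {"device_id": "D3", "customer_id": "C2", "pages_printed": 4500},
]
customers = {
    "C1": {"name": "Customer A", "industry": "legal"},
    "C2": {"name": "Customer B", "industry": "healthcare"},
}
transactions = [
    {"customer_id": "C1", "amount": 250.0},
    {"customer_id": "C2", "amount": 900.0},
    {"customer_id": "C2", "amount": 100.0},
]

def build_customer_view(devices, customers, transactions):
    """Combine device, customer and transaction data into one view per customer."""
    pages = defaultdict(int)
    for device in devices:
        pages[device["customer_id"]] += device["pages_printed"]

    spend = defaultdict(float)
    for sale in transactions:
        spend[sale["customer_id"]] += sale["amount"]

    # Attach the totals to each customer's profile.
    return {
        customer_id: {
            **profile,
            "pages_printed": pages[customer_id],
            "total_spend": spend[customer_id],
        }
        for customer_id, profile in customers.items()
    }

customer_view = build_customer_view(device_data, customers, transactions)
```

On its own, each dataset answers one question. Joined on a shared customer ID, they start to show who is printing, how much they print and what they spend in a single place.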
I wouldn’t say that device data itself is big data given the limited variety of datasets, but I certainly think there is a place for big data in the managed print industry. As with most verticals, I think the opportunity lies in examining the broader scope of information available to better understand the who, what, where, when and why of MPS and strategically using that information to transform and grow our businesses.
Looking for more data on big data? Check out our Big Infographic of Big Data for facts, stats and trends. Stay tuned for next week’s blog post on the key differences between structured and unstructured data.
How do you define big data? Do you think the device data you collect could be considered ‘big’? Let us know in the comments below!
1 PR Newswire. “Global Big Data Market 2017-2030 - $76 Billion Opportunities.” September 2017.
2 Margaret Rouse. SearchCloudComputing, “big data.”
3 SAS. “What is big data?”
4 Jon Bruner. datascience@berkeley, “What Is Big Data?”
5 Rohan Deuskar. datascience@berkeley, “What Is Big Data?”
6 Joel Gurin. datascience@berkeley, “What Is Big Data?”
7 David Leonhardt. datascience@berkeley, “What Is Big Data?”