Anyone who deals with digitalization can hardly ignore Big Data. Media reports mostly focus on collecting and analyzing data on a large scale, on “data bites” and on the (in)security of personal data. There are also plenty of articles on the right big data strategy for companies. But do you, as a small or medium-sized company, need a big data strategy at all? Isn’t that a topic for large corporations and research institutions?

Big Data – What is it?

According to Wikipedia, big data is a neutral term for data that cannot be processed with traditional data processing methods because it is too large, too complex, too fast-moving, or too weakly structured. The term is also defined even more generally: “Big Data is the mass of all digitally available data, for example from the Internet, mobile communications and finance.”

So, first of all, it’s not about collecting and using data, but about coming to grips with steadily increasing amounts of data. And that is a task familiar to most companies: coping with the flood of information.

That answers the initial question: yes, even small and medium-sized companies need a strategy for dealing with huge amounts of data. Whether you want to call it big data or not is a matter of taste.

Big Data is and remains one of the most important topics of digital transformation for companies. It is expected to play an even more central role in the long term, because the future of companies lies in digitization concepts such as the Internet of Things and Industry 4.0. These visions have one thing in common: they produce huge amounts of data every day.

Collecting, processing, analyzing and maintaining this data are among the big challenges of our time. Only those who fully exploit the potential of this data will be fit for the future in global competition. However, there are always several ways to turn data into information, and big data is not always the solution. Several criteria determine which type of data processing is appropriate.

Dealing with Huge Amounts of Data

Thanks to the Internet, e-mail and the like, the amount of digitally available information is increasing exponentially. Companies themselves are also producing more and more electronic data. For example, invoices and other documents are created in PDF format and no longer printed on paper, production machines continuously send information about their maintenance status or exchange data with other machines, and the ERP system communicates with another ERP system and, in turn, with the document management system. All of this data is stored, and employees need to be able to find it and, of course, use it.

So that employees can spend their time on their actual tasks rather than chasing documents and information, the following tasks have to be solved when dealing with large volumes of data:

  • Providing information: ideally, each employee automatically receives all the information they need for their work. This sounds complicated, but with a good document management system it’s easier than you might think.
  • Creating an overview: with dashboards, ticket systems and electronic files, you archive all information in a structured manner and offer your employees the opportunity to get an overview of the status of a project at a glance or to collect all the information required to complete a task in one ticket.
  • Managing knowledge and making it available in the long term: even today, some of the knowledge available in the company exists only in the minds of individual employees or in locally stored e-mail mailboxes to which other employees have no access. Harnessing such “lost” information is the task of systematic knowledge management.

When Does Big Data Make Sense?

If you are looking for a kind of checklist for whether the use of big data technologies makes sense, the so-called “three V’s” are a good starting point: Volume, Velocity and Variety. From a purely technical point of view, these are the aspects in which conventional technologies, such as relational databases, reach their limits. The amount of data that needs to be kept is a crucial point. If the data volume is in the multi-digit terabyte range, big data technologies must be taken into account in the selection. If it approaches the petabyte range, their use becomes mandatory.
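The volume rule of thumb can be written down as a small check. The following Python sketch is only an illustration: the function name and the exact thresholds (10 TB for the start of the “multi-digit terabyte range”, 1000 TB for roughly one petabyte) are assumptions, since the text gives only orders of magnitude. Velocity and Variety are left out because no figures are given for them.

```python
# Assumed thresholds; the article names only rough orders of magnitude.
CONSIDER_TB = 10.0      # assumed start of the "multi-digit terabyte range"
MANDATORY_TB = 1000.0   # roughly one petabyte (decimal units)


def volume_recommendation(volume_tb: float) -> str:
    """Return a rough recommendation based on data volume alone."""
    if volume_tb < 0:
        raise ValueError("volume_tb must be non-negative")
    if volume_tb >= MANDATORY_TB:
        return "mandatory"      # petabyte range: big data technology needed
    if volume_tb >= CONSIDER_TB:
        return "consider"       # multi-digit terabytes: include in selection
    return "conventional"       # below that: traditional systems may suffice


if __name__ == "__main__":
    for tb in (0.5, 40, 2500):
        print(f"{tb} TB -> {volume_recommendation(tb)}")
```

A check like this only covers the first “V”; in practice, a small but very fast-changing or very heterogeneous data set can still argue for big data technologies.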

If you are dealing not only with huge “mountains” of data that have to be made available, but also with a very large number of users accessing this data, a cluster solution is the obvious choice. A cluster-enabled document management system is worth its weight in gold in this case because it is inherently highly scalable, so it can handle millions of users as well as billions of documents.