We’re consumed by data, but how can we make use of it?
As part of our focus on data for this month, Yannick Mermet looks at the importance of data, and how publishers in particular can benefit from the intelligent use of information.
You can find Yannick on LinkedIn.
What is data?
Data: a four-letter word that carries a powerful meaning. There are several definitions for it, such as “a collection of facts and/or statistics”, “things known used for the basis of concluding an answer” or “it allows you to surf the Internet on your phone to watch videos of cats yawning”. OK, that last one is not really a definition, but your phone does retrieve lots of different kinds of data to enable us to view said video. So the question arises: what is data?
After some hard researching (or, as some would call it, “several quick Google searches”), I found that the word data entered English in the mid-17th century as the plural of the Latin datum, literally “something given”, which we now use to mean a piece of information. After this brief historical nugget, we can ask what forms data can take. The answer is almost anything, from numbers or text on a piece of paper, to bits and bytes stored in a computer, or even the facts within the mind of a person.
In this day and age, data surrounds us all the time. The problem is that we have too much of it, and we need to differentiate between the data we want and the data that is just noise. In this article, I intend to discuss Big and Small Data, and how this wonderful information is helping publishers in particular.
Big Data vs Small Data
Small Data vs Big Data is a theme that has arisen over the past few years, and there has been much debate over which one is more pertinent to modern times. But what are they exactly? Both phrases have a profound and practical meaning. The Small Data Group offers the following definition:
Small data connects people with timely, meaningful insights (derived from Big Data and/or “local” sources), organized and packaged – often visually – to be accessible, understandable, and actionable for everyday tasks.
What this means is that the data is constructed to be understandable at first glance, enabling people to take quick and decisive action on a day-to-day basis. Big Data can also be turned into actionable Small Data. The diagram shows “Cost Effectiveness” against “Volume of Data to Analyse”.
As you can see, closer to the middle point the data becomes “too close to call”. A great quote from the interview with Martin Lindstrom explains the difference: “Big Data is all about finding correlations, but Small Data is all about finding the causation, the reason why.”
Big Data is good for finding correlations because it looks into the past. Normally measured in petabytes or exabytes, it is usually a combination of structured and unstructured data. An article on DataFloq says that Big Data can be characterised by the 3Vs: the volume of data, the variety of types of data and the velocity at which it is processed. These three Vs combined are what make Big Data very difficult to manage, unlike its counterpart Small Data, which is made up of small, usable sections.
Big Data is an exciting venture in which one can find the next undiscovered pattern that could propel a company into the limelight. But the company would need a mass of data scientists to comb through the data and build a solid, scalable reporting application, which then generates significant value by making these important discoveries available every day. People are scared of Big Data purely because there is too much of it: they don’t know where to put it, how to access it or how to extract knowledge from all that information.
Lego is an example of Big Data being used incorrectly and almost killing a company. Back in 2003, Lego were using Big Data to decide on their next products, which took away the creativity that the company was built on. They almost went bankrupt, so they changed their game plan.
They decided to go into the homes of consumers in Europe and talk to young kids, asking them “What are you most proud of?” These conversations revealed that the insights gained from talking to individuals were invaluable. They worked these methods into the company, changed the bricks back to the tiny size, produced The Lego Movie and the rest, as they say, is history.
For a more in-depth discussion, I would recommend Lindstrom’s interview as it has some great insights on the debate between Big and Small Data.
So now we know the difference between Big Data and Small Data, right? Here’s a table which will hopefully clear things up once and for all.
Both tables featured above are from DataFloq.
Now that we know what Big and Small Data are, we need to know when to use them and what tools we can utilise. A fantastic picture-diagram from IBM best explains the “what to use” and “when”.
As shown in that diagram, some emerging platforms such as Hadoop and NoSQL have taken the data world by storm, making the management of Big Data a less impossible task. But even with these powerful tools, the general consensus is that Small Data truly holds the important information. Remember that old saying, “don’t sweat the small things”? Well, we actually should, as they are the most important bits.
How data can help publishers
In the last five years, data has swelled into a huge “Data Tsunami”, forcing many companies to adapt and either learn to surf the wave or get washed away by its awe-inspiring, powerful swells. In today’s market, data is an essential tool that helps businesses understand their customers and clients, as well as how well their products and employees are performing.
It helps football clubs choose players, for example by showing how many miles a player runs in a match. It helps stockbrokers choose the right stocks to buy and sell. It helps travellers know when the next flight to their destination is, how much it will cost and which airline and airport to use. Data is everywhere! But how can it help publishers?
Well, publishers have usually relied on unit sales data, which indicates how well books are selling. Since the release of e-books, e-journals and e-magazines in particular, we have been able to track data with a lot more ease. Now, with Big Data, there is the possibility of seeing and understanding more than just sales. For instance, we could see how many customers buy books of a particular genre during a seasonal period, or what sort of customers (adults, children, students, professors, etc.) are searching for which books on a regular basis.
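To make the genre-by-season idea a little more concrete, here is a minimal sketch of how such a count might be produced. It is only an illustration: the sales records, the `season_of` helper and the season boundaries are all made up for this example and do not come from any real publisher's system.

```python
from collections import Counter
from datetime import date

# Hypothetical sales records: (date of sale, genre, customer type).
SALES = [
    (date(2016, 12, 3), "Fiction", "Adult"),
    (date(2016, 12, 18), "Fiction", "Student"),
    (date(2017, 1, 9), "Cookery", "Adult"),
    (date(2017, 7, 21), "Travel", "Adult"),
    (date(2017, 7, 30), "Fiction", "Children"),
]


def season_of(day):
    """Map a date to a season (assumed: northern-hemisphere, whole months)."""
    if day.month in (12, 1, 2):
        return "winter"
    if day.month in (3, 4, 5):
        return "spring"
    if day.month in (6, 7, 8):
        return "summer"
    return "autumn"


def sales_by_season_and_genre(sales):
    """Count how many sales fall into each (season, genre) pair."""
    counts = Counter()
    for day, genre, _customer in sales:
        counts[(season_of(day), genre)] += 1
    return counts


counts = sales_by_season_and_genre(SALES)
for (season, genre), total in sorted(counts.items()):
    print(season, genre, total)
```

The same grouping could be done by customer type instead of genre; the point is simply that once each sale carries a date and a category, seasonal patterns become a matter of counting.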
Here’s an example of a publisher utilising Big Data: “Springer Made Scientific Journals Easy Accessible”. Springer is well known as a leading publisher of scientific journals and books, with over 6000 employees in 20+ countries. Springer realised that there was an enormous shift from print to digital, and decided to carry out a survey to find out why. The results suggested that most scientific journals (88%) would become digital and only a small share (12%) would remain in print.
They needed to adapt to their new surroundings, and so came the birth of Springer Link: a site which made all their content searchable and accessible on all devices. Not only did they digitise their journals, but they also made all their data, structured and unstructured, searchable through different channels. They created two applications to track their journals. The first, AuthorMapper, provides search results based on geographical location. The second, Realtime Springer, shows which scientific articles and journals are being read in real time and which topics are trending.
For more case studies on publishers making positive changes thanks to Big Data, check out this article from DataFloq.
Big Data can also inform how a publisher should invest. Before the time of digital data, publishers would often rely on gut feeling or intuition when deciding whether or not to invest in an author. Now, with Big Data, they can predict upcoming trends, see how well that author’s work has done in the past, and then make an informed decision on whether or not to invest in the author’s future works.
A final piece of food for thought on Big Data helping publishers concerns digital reading. It is possible to see who opens a book, who finishes it and what stops readers from finishing it. Knowing in real time what engages readers and what doesn’t is an absolute game-changer. Publishers are then able to unlock some hidden equity within their publishing lists and inform decisions on which authors and franchises to invest in.
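As a rough illustration of the “who opens it, who finishes it” idea, the sketch below works out a completion rate per book and the chapter where readers most often stop. Everything in it is assumed for the sake of the example: the reading log, its fields and both function names are invented, and real e-reading platforms will record this differently.

```python
from collections import Counter, defaultdict

# Hypothetical reading log:
# (reader id, book title, furthest chapter reached, total chapters in the book).
READING_LOG = [
    ("r1", "Book A", 12, 12),
    ("r2", "Book A", 3, 12),
    ("r3", "Book A", 3, 12),
    ("r4", "Book B", 8, 8),
]


def completion_rate(log):
    """Return, per title, the share of readers who reached the last chapter."""
    opened = defaultdict(int)
    finished = defaultdict(int)
    for _reader, title, reached, total in log:
        opened[title] += 1
        if reached >= total:
            finished[title] += 1
    return {title: finished[title] / opened[title] for title in opened}


def most_common_drop_off(log, title):
    """Return the chapter where unfinished readers of a title most often stopped."""
    stops = Counter(
        reached
        for _reader, book, reached, total in log
        if book == title and reached < total
    )
    return stops.most_common(1)[0][0] if stops else None


print(completion_rate(READING_LOG))
print(most_common_drop_off(READING_LOG, "Book A"))
```

A low completion rate combined with a consistent drop-off chapter is the kind of small, actionable signal that could be passed on to an editor.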
For a more in-depth understanding, I would recommend checking out this Kobo Document.
Well, that’s the beauty of the future, isn’t it? I don’t know. Technological advancements have changed the industry so many times, from the first iPhone and selfies to mixed reality projects and space ventures!
It is difficult to say how these will affect the publishing industry, but if I had to make a guess, I would say that print will become a thing of the past. All books, journals, magazines, etc. will be digitised. No longer will you see people on public transport with their hard-copy books or their newspapers, but with their phones or tablets. Whether you want it or not, digital is the way forward.
Surprisingly, though, physical book sales have been on the rise while the number of e-books sold has declined. According to the Association of American Publishers, e-book sales, which constitute about 20% of the book-buying market, have plateaued, and Pew’s newest data, collected in March and April this year, also indicates that e-book readership has steadied over the past year. You can find more on this in the “Are paper books really disappearing?” piece on the BBC website.
In my opinion, physical books will eventually become another attraction that we see in museums. This is because the younger generation, the one that will be making its mark in the coming years, reads almost anything and everything on tablets and phones.
Yannick Mermet is a Junior Support Developer at Ribbonfish.