The explosion of data, or 'big data' as it is affectionately known, is already happening. Big-data analytics and storage systems are looking at ways of storing data outside the traditional database, because keeping data loosely coupled from the applications that use it is what gives them their flexibility. When SQL and relational databases were first introduced, flexibility and interactivity were at the forefront of the design thinking.
Before that, data was stored statically, in ways that depended on how programs were coded. Any change to the data format, to an attribute, or to the length of a field had to be reflected across the whole system, and every affected program recompiled to accept it. That was the state of affairs before relational database systems were introduced.
Two things then happened: data management became hugely flexible, so data could be managed outside programs, and access to data was handled by the database rather than depending on programming code. But this new flexibility and ease of execution came at two fundamental costs. First, execution always involved interpretation: the structure had to be analysed before each request could be executed. Second, the database had limits on scalability and performance compared with the file systems that preceded it.
From the mid-eighties until now, with data volumes growing from megabytes to gigabytes, this was all quite manageable; but as volumes grow to terabytes and petabytes, the model is not sustainable. Ideally you want a mix of both models at this point: every tag is a point of search and a point of reference, yet at the same time the data has to be stored and searched very fast, often through billions or trillions of records, something relational databases are not good at.
There are new technologies, such as NoSQL, where there is no relational model at all. From an application viewpoint, around 80 per cent of data access is static and only 5 to 10 per cent is dynamic; for that minority you are creating a whole layer and adding a substantial cost overhead. With the advent of service-oriented architecture (SOA), the external world interacts only with the services that, in the past, spoke directly to the database. In today's architectural context you don't have to go to a table to access data; you can refer to a service to get it. That being the case, why do we need flexibility in both places? It is only needed at the service layer, which provides a much higher level of intelligence as an integration point, thus eliminating the need for a database at all.
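The service-layer argument above can be sketched in a few lines. This is an illustrative example only (all names are hypothetical, not taken from any particular SOA stack): callers speak to the service's interface, and the backing store, here a plain in-memory dict, could be swapped for a file, a key-value store, or a relational table without the callers noticing.

```python
class CustomerService:
    """Exposes customer data as a service; storage is an internal detail."""

    def __init__(self):
        # The store is private to the service. Replace this dict with a
        # file-backed or NoSQL store and the public interface is unchanged.
        self._store = {}

    def put(self, customer_id, record):
        self._store[customer_id] = record

    def get(self, customer_id):
        return self._store.get(customer_id)


service = CustomerService()
service.put("c42", {"name": "Acme Ltd", "tier": "gold"})
print(service.get("c42")["tier"])  # caller never touches a table
```

The point is that flexibility lives in one place, the service interface, rather than being duplicated in both the application and the database schema.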
If we need the kind of flexibility a database offers at the point where programs meet the data store, we can move up into object-oriented programming in an enterprise SOA environment. The points of reference are then only objects or services: each manages its entity attributes at an atomic level, along with its methods or services, and how it stores or keeps data should depend entirely on the flexibility and scalability that service requires. If you optimise from the operating-system layer all the way up to the service layer, it may be better to avoid all the intermediate layers and have 'self-contained' services that can interact with the rest of the world, providing any kind of information from that service.
Considering the kind of data explosion now happening, it might be preferable to go with a file system or NoSQL data-storage methodology (the likes of Hadoop and Cassandra). With today's application servers deploying multiple applications, we are now seeing platform or process containers, especially for large-enterprise, high-performance transaction processing.
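To make the file-system storage idea concrete, here is a minimal sketch, in the spirit of (though far simpler than) systems like Cassandra: records are appended to a plain file as JSON lines and looked up by key, with no relational schema to migrate when attributes change. The class and file layout are this sketch's own assumptions, not any real engine's format.

```python
import json
import os
import tempfile


class JsonLineStore:
    """Toy key-value store on the bare file system: append-only JSON lines."""

    def __init__(self, path):
        self.path = path

    def put(self, key, value):
        # Appending avoids rewriting existing data when the 'schema' changes.
        with open(self.path, "a") as f:
            f.write(json.dumps({"k": key, "v": value}) + "\n")

    def get(self, key):
        # Last write wins; a real engine would index rather than scan.
        result = None
        if os.path.exists(self.path):
            with open(self.path) as f:
                for line in f:
                    rec = json.loads(line)
                    if rec["k"] == key:
                        result = rec["v"]
        return result


path = os.path.join(tempfile.mkdtemp(), "events.jsonl")
store = JsonLineStore(path)
store.put("user:1", {"plan": "basic"})
store.put("user:1", {"plan": "pro"})  # new shape needs no migration
print(store.get("user:1")["plan"])    # -> pro
```

Note the trade-off the article describes: writes are fast and schema-free, but reads require the kind of indexing and distribution that Hadoop- and Cassandra-class systems exist to provide.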
The biggest business benefit in this context is a dramatic shrinking of cost. In hardware, the need for big database servers goes away, as you are talking directly from the file system to the services. You may still need databases to export data to other systems that require access, but that work could be pushed to a data warehouse; in time even this will prove unnecessary, bringing great cost savings in software as well.
As we move closer to a 'real-time' world, we will move to in-memory data objects for speedy analytics, and to an appliance model with arrays of boxes in the cloud dedicated exclusively to applications. This probably explains why the big database vendors have been making a conscious move towards the applications end of the market, with acquisitions and restructuring targeted at applications and solution sales.
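The 'in-memory data objects' idea can be illustrated with a short, hedged sketch (the trade records below are invented for the example): keep records as plain objects in RAM and aggregate over them directly, with no database round-trip in the analytics path.

```python
from collections import defaultdict

# Hypothetical in-memory records standing in for a live transaction feed.
trades = [
    {"symbol": "ABC", "qty": 100},
    {"symbol": "XYZ", "qty": 50},
    {"symbol": "ABC", "qty": 25},
]

# Aggregate in memory: no query layer, no interpretation overhead.
totals = defaultdict(int)
for trade in trades:
    totals[trade["symbol"]] += trade["qty"]

print(totals["ABC"])  # 125
```

This is the essence of the appliance model: the working set lives in memory next to the application, and the database is no longer on the critical path.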
This has left a nasty taste in the mouths of software solutions vendors, particularly in the telecoms and finance space, who have loyally stuck by database vendors over the years and now find themselves competing head-on with entities they once considered partners. Is the end of the database era in sight? It seems so!