Data and databases are essential for startups at every stage of their growth. Businesses increasingly prefer data-driven decision-making to intuition-based decision-making — and the data analytics market is growing at an annual rate of nearly 30%.
New technologies such as cloud-based database management and AI have shaken up the sector in recent times, but what are the key trends to watch in data management — and what changes can we expect to see in the near future?
SurrealDB World Conference 2023 brought together visionaries, innovators and thought leaders to answer these very questions. But if you missed it, here are some of our key takeaways plus some extra expert insights.
1/ Data on the cloud
Data is at the core of the tech stack in most businesses, says Tobie Morgan Hitchcock, co-founder and CEO of SurrealDB, a startup that offers a modern cloud-native database platform. He adds that many companies have been moving their data onto cloud-based systems in recent years.
“If you can move to a system in the cloud, it enables you to focus on building your application rather than on the tech stack, which then enables you to get to revenue quicker, increase your profit margins, reduce your costs and reduce development time,” says Morgan Hitchcock.
The only downside is that using cloud services can be expensive — but these costs are justified, according to Morgan Hitchcock.
“People forget the fact that these data centres are doing a lot for you that you would normally have to do yourself. You have to include a lot of these services, technologies and architectures, in addition to your other costs, and not just how much the server costs.”
He adds that many companies are still in the process of moving their databases into cloud systems — and many are yet to start. “People are now becoming more used to and accepting of data being stored in the cloud. Earlier, data needed to be managed and hosted by the organisation, and now they’re actually handing it off to companies like Amazon and Google.”
2/ Concerns around privacy
“More and more people around the world are becoming savvy to how their data is being used — and that’s a good thing,” says Morgan Hitchcock. “Organisations and the people building the databases need to accept that and become aware of it.”
Dr Caroline Morton, a GP, clinical researcher and software developer, says that concerns around data privacy are huge in healthcare — where patient data is crucial to research and innovation.
“Just removing people’s name and date of birth and address is not sufficient — that is not anonymising. If you combine that data with other data that’s available, say, on social media, you can find something,” she says.
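The linkage attack Dr Morton describes is easy to sketch. In this toy Python example (all names, fields and records are invented for illustration), a "de-identified" medical dataset is joined to a public profile dataset on shared quasi-identifiers (postcode, birth year and sex), and the names come straight back:

```python
# Toy illustration of re-identification by linkage. A "de-identified"
# medical record still carries quasi-identifiers that can be matched
# against publicly available data to recover identities.

medical_records = [  # names removed, but quasi-identifiers remain
    {"postcode": "SW1A", "birth_year": 1984, "sex": "F", "diagnosis": "asthma"},
    {"postcode": "EC2V", "birth_year": 1990, "sex": "M", "diagnosis": "diabetes"},
]

social_profiles = [  # e.g. details shared on a public social-media profile
    {"name": "Alice Example", "postcode": "SW1A", "birth_year": 1984, "sex": "F"},
    {"name": "Bob Example", "postcode": "EC2V", "birth_year": 1990, "sex": "M"},
]

QUASI_IDENTIFIERS = ("postcode", "birth_year", "sex")

def link(records, profiles):
    """Join the two datasets on their shared quasi-identifiers."""
    index = {tuple(p[k] for k in QUASI_IDENTIFIERS): p["name"] for p in profiles}
    return [
        {**r, "name": index[key]}
        for r in records
        if (key := tuple(r[k] for k in QUASI_IDENTIFIERS)) in index
    ]

for row in link(medical_records, social_profiles):
    print(row["name"], "->", row["diagnosis"])
```

With only three overlapping fields, every "anonymised" record in this toy dataset is re-identified, which is exactly why trusted research environments restrict what leaves the system.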
To tackle this, she says there’s a move towards “trusted research environments”: secure virtual workspaces in which researchers can access data, analyse it and conduct research. The platform is kept secure by blocking all internet access from the moment you log in.
Beyond this, Dr Morton says it’s key for users themselves to be aware of what data they’re sharing and where — and how it could be combined with other data they’ve shared elsewhere on the internet. She adds that the move towards more data being stored on devices could also strengthen data protection and ownership for users.
The volume of data in the world is also growing, straining traditional models of computing in which everything is controlled and analysed centrally. This is leading many companies to gravitate towards a more decentralised computing model, where analytics and intelligence are built into edge applications closer to users.
“You can see it with Apple in its latest releases — it’s about moving data; moving machine learning right onto the devices, so the devices are becoming more powerful and more integral to this global architecture in the tech space,” says Morgan Hitchcock.
3/ Merging of transactional and analytical data stores
For Morgan Hitchcock, the biggest trend to watch in data and databases is the merging of transactional and analytical data stores. Where analytical databases are designed for data analysis, transactional databases are optimised for day-to-day operations such as running production systems.
As organisations try to improve data efficiency, consolidating database operations by combining transactional and analytical capabilities in a single database management system is expected to become the norm.
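The appeal of a blended system can be shown with a small sketch. The example below uses Python's built-in SQLite purely as a stand-in (a real combined engine would manage row- and column-oriented storage internally): a single store answers both a transactional write from a production system and an analytical aggregate over the same, always-fresh data, with no separate warehouse or data pipeline in between.

```python
import sqlite3

# Sketch of the single-interface idea behind combined transactional/
# analytical stores: the same database handles an operational write
# (recording an order) and an analytical read (revenue per product).

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (product TEXT, quantity INT, price REAL)")

# Transactional workload: small, frequent, atomic writes.
with conn:  # the block commits as one transaction
    conn.execute("INSERT INTO orders VALUES ('widget', 2, 9.99)")
    conn.execute("INSERT INTO orders VALUES ('gadget', 1, 24.50)")
    conn.execute("INSERT INTO orders VALUES ('widget', 5, 9.99)")

# Analytical workload: an aggregate scan over the same live data.
for product, revenue in conn.execute(
    "SELECT product, SUM(quantity * price) FROM orders GROUP BY product"
):
    print(product, round(revenue, 2))
```

In a traditional architecture, the analytical query would run against a separate warehouse fed by a periodic export; merging the two stores removes that lag and the pipeline that creates it.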
“Over the next few years, we’re going to see those two different or disparate pieces of data space technology coming together,” says Morgan Hitchcock.
He adds that there will also be further simplification of how databases can be created. “Currently, you’ve got lots of different tools trying to simplify the space and improved ways of querying data and creating your database — but nothing has really come and simplified it enough, or as much as it can do.”
4/ Automation at early stages
AI is expected to enable companies to analyse large amounts of unstructured data efficiently.
Morgan Hitchcock says that while there are some exciting developments taking place in the world of AI, many enterprises are still trying to do “basic things” — and it might take them a while to start using AI.
“We’re using applications on a daily basis and if the organisations that are building the applications can use small incremental improvements to add intelligence in some way — whether that’s through custom machine learning models, or whether it’s through large language models — all of those small improvements add up in the long run.
“So it’s exciting — but we’re just at the beginning; it’s like being at the beginning of the cloud, and 20 years later, we’re still moving to the cloud.”
Dr Morton agrees that while AI in data is still just beginning, it could help process and find patterns in large amounts of data, especially to find healthcare patterns in large populations.
“The benefit is when the dataset is too big and the data points are too many to fit into the human working memory, and to be able to process and move in any way.”
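Her point about scale is easy to demonstrate with a toy example. The sketch below (conditions and probabilities are invented, and real analyses would use proper statistical or ML tooling) generates 10,000 synthetic patient records with a hidden correlation baked in; no human could eyeball the pattern across that many records, but a simple co-occurrence count surfaces it immediately:

```python
from collections import Counter
from itertools import combinations
import random

# Synthetic population with a hidden pattern: diabetes frequently
# brings hypertension with it. The dataset is far too large to scan
# by eye, but a co-occurrence count finds the link at once.

random.seed(0)
CONDITIONS = ["asthma", "diabetes", "hypertension", "eczema", "migraine"]

def synthetic_patient():
    conditions = {c for c in CONDITIONS if random.random() < 0.2}
    # Bake in the hidden correlation.
    if "diabetes" in conditions and random.random() < 0.7:
        conditions.add("hypertension")
    return conditions

population = [synthetic_patient() for _ in range(10_000)]

# Count every pair of conditions that appears in the same patient.
pairs = Counter()
for patient in population:
    pairs.update(combinations(sorted(patient), 2))

for pair, count in pairs.most_common(3):
    print(pair, count)
```

The diabetes/hypertension pair dominates the counts, illustrating the kind of population-level pattern Dr Morton describes as beyond human working memory but trivial for a machine.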