Onwards to data-driven world, part 3: IT industry and jobs of the future
In the previous blog post I described how a data-driven approach will change the software development landscape. Here I continue up that path and look at what it means for the software industry, and especially for programming jobs.
When you start your career as a software developer, you are probably not going to work as the lead developer of an operating system kernel or a 3D graphics engine. Instead, the traditional role is to work with a database-backed system for tracking something quite specific, like the orders of a particular cattle feed supplement provider or water quality measurements made with a single type of optical instrument by one research institution. As an entry-level programmer, you are not great at what you are doing, but given the small world you are operating in, you are good enough. Your employer is not paying your salary because of your unique skills, but because your adequate skills are put to serve their unique needs.
My guess is that the scene is going to change for entry-level programmer positions. Currently millions of people are employed in positions where they carve program code to the demands of fairly small use cases. This kind of work is probably not going to be needed as much in the future, as it will be replaced by more generic data-driven applications. However, even if data-driven software is going to eat many traditional applications, it also introduces a new dependency: without high-quality data there are no data-driven systems. I believe this will create a new profession for millions of people who are getting into IT: entry-level data scientist or data engineer.
What used to be millions of people carving software to match some specific needs will be replaced with millions of people carving data to match those same needs. From a technological point of view, we have taken a step forward. Most of the time, data is a better medium than program code for expressing ideas about the mundane world around us.
Of course this is just a suggestion, but I find it plausible and worth considering. In any case, the data-driven future cannot mean that a handful of extremely well-paid data science PhDs working for Google and others are running the whole show. No, instead the new big thing is, I truly hope, the democratization of data and data science. This will create new opportunities and new jobs for (almost) everyone and (almost) everywhere.
In the previous post I described how incredibly expensive software development is and how the software industry can exist only because, once written, software is cheap to run and copy. Having moved from software projects to analytics projects, I have seen that the same idea holds. As a practical example, we have been working with a sports application where you need to infer the position of an athlete’s limbs and joints from video data. The knee and elbow are easy and cheap, because there are many freely available pretrained machine learning models for the task. The ankle, however, is very hard and expensive, because the existing models don’t include the instep, so you need to build your own dataset and train the model yourself. A layperson might be surprised to find out that a knee detection algorithm might cost 100 euros, while for ankle detection the price goes up to 100 000 euros. In the software business this has of course always been the reality. Common functionalities are free or at least very cheap, but the costs grow rapidly when you start tailoring and customising.
The price structure of software has, among other things, led to open source software. Major open source applications began as hobbies, but nowadays they are largely developed by paid professionals. Companies have different motives for developing open source, ranging from supporting their consulting business to lowering the maintenance cost of non-core code bases to disrupting markets they otherwise could not compete in. These and other commercial motives, combined with personal and academic efforts, produce something that benefits us all. The growing assortment of free and open source projects forces development to move on, as a company cannot keep on milking the same database engine or the same text processing software for all eternity.
With data-driven software we will hopefully witness an open data boom similar to the open source movement of the last three decades. The motives behind open data could be very familiar. We have people investing their spare time to create valuable datasets, such as OpenStreetMap, and academics putting out open data as part of their long tradition of open scientific research. But I guess that in this arena, too, things will really get moving when companies discover reasons to open their datasets. As an example, being the fourth or fifth most popular social network might be hard to monetise. However, if you have a brilliant business idea that goes beyond the advertisement targeting provided by the social platforms of today, you might level the playing field by opening your dataset. It will be interesting to see how much of the open source movement carries over to the open data side and in what ways it will be different.
In my view open source has both democratized software and kept the market economy truly a market economy. I hope open data will do the same for data-driven software. Obviously it is not good for humanity as a whole if the first few real internet giants can keep on selling their unique datasets to the rest of us, over and over again.
I would like to close by pointing out that even though I have painted the data-driven revolution as something that is happening to the world of IT, in actuality it is happening to the whole world. Information technology has been a major driving force for business, science, culture and everyday life, so IT revolutions on the scale of the internet and mobile technology are not going to stay inside the tech bubble. Whatever you are doing and whoever you are, it is good to give some thought to this change and be prepared. And if you are working in IT, you should already have started your journey towards the future of data-driven software.
The author is development manager of data analytics at CSC