2. Feb 2021
The Porsche Holding Data Lake: Making our solutions even smarter

With the goal of enabling data-driven operations throughout the organisation, the Data Lake forms an important component of Porsche Holding’s data strategy. Laura König, Head of our Data-Driven Business Solutions team, talks about how we use the Data Lake and the technology behind it.

Laura, why does Porsche Holding need the Data Lake?

Our aim with the Data Lake is to make Porsche Holding and Porsche Informatik’s products even smarter. To achieve this, we are working with methods from the fields of data science and artificial intelligence. The Data Lake allows us to store, link and analyse both structured and unstructured data from a variety of sources. This enables us, for instance, to make global and automatic forecasts and to make recommendations that will further improve our business.

And, of course, the idea is for all of this to not only be available to individual users on their PCs, but to also be freely scalable to a large number of users within our international organisation. The Data Lake acts as a platform for this.

What sort of technology is the Data Lake based on?

The Porsche Holding Data Lake is based on first-class Big Data technologies, such as Spark, Airflow and HDFS, which allow large amounts of data from various sources to be combined, analysed and processed. It offers a highly available, scalable and 100 per cent GDPR-compliant solution involving strict authorisation and access control.

Laura König in an online interview

And where exactly do you apply the Data Lake?

The data science use cases we have implemented so far are very diverse. They range from business topics, such as financial planning support, to product improvements, such as clever data sorting or the proactive detection of possible system limitations, so that we can provide even better customer service.

So it’s similar to the data-driven support for CROSS 2, which led to a Porsche Informatik Change Award last year?

Yes, exactly! This use case involved predictive maintenance: the goal was to detect errors in CROSS 2 proactively. The support team can now resolve issues in the system before customers even notice them.

What is your team currently working on?

Right now, we are working on putting image recognition into production for the Das WeltAuto used car exchange. Here, we are using computer vision to detect the orientation of vehicles in images and to sort the images automatically. The module crops the images so that the vehicles are displayed at an ideal size. It also checks the brightness, and the idea is that, in the future, it will automatically replace the background as well.
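To make the brightness check and the cropping step more concrete, here is a minimal sketch in plain Python. It is an illustration only and not the Das WeltAuto implementation: the function names, the brightness thresholds, the margin and the assumption that a vehicle bounding box is already available from a detection step are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical types for this sketch: a grayscale image is a list of rows of
# 0-255 integers, and a box is (left, top, right, bottom) with right/bottom exclusive.
Image = List[List[int]]
Box = Tuple[int, int, int, int]


def mean_brightness(gray: Image) -> float:
    """Return the average pixel value of a non-empty grayscale image."""
    count = sum(len(row) for row in gray)
    if count == 0:
        raise ValueError("image must contain at least one pixel")
    return sum(sum(row) for row in gray) / count


def is_well_lit(gray: Image, low: float = 60.0, high: float = 200.0) -> bool:
    """Check whether the mean brightness lies in an assumed acceptable range."""
    return low <= mean_brightness(gray) <= high


def crop_box(vehicle: Box, image_size: Tuple[int, int], margin: float = 0.25) -> Box:
    """Grow the vehicle box by `margin` of its size on every side,
    clamped to the image bounds given as (width, height)."""
    left, top, right, bottom = vehicle
    width, height = image_size
    pad_x = int((right - left) * margin)
    pad_y = int((bottom - top) * margin)
    return (
        max(0, left - pad_x),
        max(0, top - pad_y),
        min(width, right + pad_x),
        min(height, bottom + pad_y),
    )


def crop(gray: Image, box: Box) -> Image:
    """Cut the given box out of the image."""
    left, top, right, bottom = box
    return [row[left:right] for row in gray[top:bottom]]
```

In a real pipeline the bounding box would come from a trained detection model, and the thresholds would be tuned on actual listing photos rather than fixed by hand.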

Currently, Porsche Holding Salzburg is also involved in a multi-year collaboration with the Department of Computer Science and Mathematics at the University of Salzburg. The aim of this collaboration is to develop new scientific methods in the field of data science with a view to continuously improving our use cases in the Data Lake.

Thank you for this interesting conversation! We look forward to presenting further use cases of our Data Lake here in the future.


Barbara Klein is responsible for communications and social media at Porsche Informatik. Even after two decades with the company, she enjoys learning something new every day.