From laptops to self-driving cars, ML deployment has exploded beyond the cloud. Add LLMs to the mix, and you’re not just monitoring model drift – you’re preventing AI from suggesting glue as a pizza topping. Welcome to the wild new world of ML in production.
The job of data scientists and AI devs is pretty tough. It often starts from a paper with almost no runnable code, or from an off-the-shelf model that has to be optimized or fine-tuned.
But that’s not enough. Models might need to be trained or deployed on specific hardware (GPUs, IoT, and mobile devices), and performance might drift over time.
During this talk I’ll share my experience of leading platform and AI service teams, and how we tackled some of the major problems you might encounter along your AI journey, such as training, serving, and monitoring ML models.
We’ll start from a data scientist’s laptop and discuss how to make her successful, covering both the capabilities you should provide and the tools that might help.
The last part of the talk is dedicated to LLMs: how they differ from traditional ML models from a serving and monitoring perspective, and how to start using them in production.
Christian Barra is a Software Engineer, Tech Lead and international speaker living in Lisbon. He’s the co-founder of ZeroBang, a cloud consulting company. He is an active member of the tech community in Berlin, a conference organiser and a Python Software Foundation Fellow. You can follow him on Twitter at @christianbarra.