Alessandro Di Stefano, PhD
Cloud-Native & Distributed Systems Engineer - DevOps, MLOps, Platform Engineering, SRE
+44 (0) 747 64 386 43 | [email protected] | aleskandro | aleskandro
I am a Senior Software Engineer in the OpenShift on Multi-arch and systems enablement team at Red Hat.
Previously, I pursued a PhD in distributed computing, specializing in AIOps in Kubernetes, where I explored how AI-driven automation can enhance decision-making and infrastructure resilience. This research shaped my approach to building intelligent, adaptable, cloud-native systems that leverage data, automation, and real-time insights to improve performance and efficiency. My work spans container scheduling, SLA optimization, and large-scale system performance, always focusing on the bigger picture: how distributed systems can be improved, automated, and made more intelligent. Alongside research, I have also mentored students in the Distributed Computing lab at the University of Catania, guiding them through the microservices pattern, from design to deployment on multi-node Kubernetes clusters.
My journey in technology began with a Commodore 64, which first sparked my curiosity as a child. Fascinated by the cryptic beauty of command-line interfaces, I was drawn to the idea that, with the proper instructions, I could shape a machine’s behavior and make it respond to my imagination. I learned Pascal, and my curiosity turned into a passion for problem-solving and creative expression through math and code. Over time, this evolved into a deep interest in programming languages, from C/C++ to Python, Go, and Rust, alongside GNU/Linux operating systems and networks.
Since my early years in the FOSS and hacking communities of Southern Italy, I have been driven by a belief in a world where open collaboration, education and research, and the free exchange of knowledge are fundamental rights rather than privileges.
I strongly believe in a decentralized internet driven by people and in a borderless world, free from censorship and from artificial divisions imposed by those in power, where people can share, learn, and build together without barriers. To me, open-source software embodies this ideal, offering not only a technical solution but a philosophy of transparency, empowerment, and collective progress. These principles continue to guide my work and my advocacy for Free and Open Source Software.
When I’m not engineering systems at work or experimenting on my own, I’m hiking, climbing, immersed in music, reading sci-fi, or exploring non-fiction.
Experience
- Senior Software Engineer
- Red Hat Inc.
- 08/2021 - Now
- UK (Remote)
- Research and develop multi-architecture compute node features for OpenShift and Kubernetes, focusing on scheduling and autoscaling workloads across different architectures. Lead architect and engineer for the Multiarch Tuning Operator.
- Led the QE team responsible for OpenShift on arm64 and multi-arch, migrating upstream and downstream QE test suites and CI automation to support arm64 offerings on major platforms (AWS, Azure, on-prem bare metal) and multi-arch OpenShift clusters.
- Designed and maintained an internal network and automation software infrastructure to deploy ephemeral, parallel OpenShift clusters on Bare Metal, enabling efficient large-scale testing in on-prem data centers.
- Mentor and support colleagues, sharing expertise in multi-arch enablement, Kubernetes, OpenShift, and automation, fostering a culture of collaboration, learning, and technical excellence.
- Active contributor to the OKD and Kubernetes community projects, currently focusing on enabling OKD Streams, OKD on CentOS Stream CoreOS 9, and the Massachusetts Open Cloud (MOC) Alliance OKD infrastructure.
- Technologies: GNU/Linux, Docker, Kubernetes, OpenShift, Golang, Python, Rust.
- Research Engineer (Contractor)
- Aucta Cognitio srl
- 08/2020 - 07/2021
- Italy (Remote)
- Designed a prototype decision support system for wind farm management to enhance operational efficiency, predictive maintenance, and resilience.
- Conducted in-depth research on KPIs and data visualization strategies, architecting a scalable microservices-based system using Golang, Python, and JS/Angular.
- Developed a data pipeline enabling Prometheus as a time-series warehouse, facilitating ingestion, monitoring, and long-term storage of SCADA system telemetry from wind turbines.
- Designed, trained, and deployed machine learning models for anomaly detection and predictive analytics, leveraging time-series forecasting, LSTMs, ARIMA, and statistical regression models to preemptively identify performance degradation, equipment failures, and operational inefficiencies.
- Developed and open-sourced a Prometheus backfiller, enabling historical data reconstruction to support time-series analysis, anomaly detection, and trend forecasting.
- Technologies: Linux, Kubernetes, GitLab, Golang, Python, Angular, Kafka, Prometheus, Machine Learning (LSTMs, ARIMA, Regression Models), Time-Series Analysis, SCADA Systems.
- Self-Employed
- Software Architecture & DevOps Consulting
- 01/2012 - 07/2021
- Italy (Remote)
- While pursuing and after completing my studies in computer engineering and distributed computing, I worked as a self-employed consultant for small businesses and local technology firms.
- Designed and implemented software architectures, DevOps workflows, and IT infrastructure for end-user companies and local technology firms, supporting both on-premises and cloud-native environments.
- Architected and modernized distributed systems, leading projects such as migrating monolithic platforms to Kubernetes (Google Cloud, OKD) and adopting microservices patterns to improve scalability and reliability.
- Provided DevOps consulting for businesses, implementing CI/CD pipelines, infrastructure as code, and containerized deployment strategies, ensuring streamlined development and operational efficiency.
- Led system and network administration efforts, including LAN design, virtualization, security hardening, and hybrid cloud integration, deploying technologies like Proxmox, pfSense, Azure AD Connect, VMware, and Active Directory.
- Developed and deployed scalable solutions for real-time streaming services, medical IT infrastructures, and production networks, integrating Kafka, Elasticsearch, MinIO, and Prometheus for monitoring and data processing.
- Mentored development teams on best practices in software architecture, cloud computing, and distributed systems, fostering innovation and technical growth in client organizations.
- Technologies: GNU/Linux, Docker, Kubernetes, Golang, Python, Rust, Ruby on Rails, Ansible, Terraform, Proxmox, pfSense, Active Directory, Kafka, Elasticsearch, MinIO, Prometheus.
Education
- PhD in Distributed and Parallel Computing
- University of Catania
- 10/2018 - 11/2021
- Italy
- Conducted advanced research in distributed computing, focusing on AI-driven automation (AIOps) in Kubernetes to optimize workload orchestration, performance tuning, and resource efficiency in large-scale infrastructure.
- Developed machine learning models and time-series forecasting techniques for anomaly detection and automated failure remediation in cloud-native environments.
- Designed and implemented experimental frameworks to analyze the impact of AI-driven scheduling and scaling strategies on containerized workloads.
- Published research in peer-reviewed conferences and journals, contributing to the broader scientific community in cloud computing, AIOps, and Kubernetes-based architectures.
- Mentored and supported undergraduate and master's students on their theses as co-supervisor, and served as teaching assistant in the distributed computing class, providing technical guidance on distributed computing, distributed transactions, microservices design, DevOps, and Kubernetes.
- MSc in Computer Engineering
- University of Catania
- 10/2016 - 10/2018
- Italy
- BSc in Computer Engineering
- University of Catania
- 01/2011 - 07/2016
- Italy
- Research Engineer
- Queen Mary University of London
- 03/2018 - 10/2018
- UK
- Worked on the team responsible for Raphtory, an open-source distributed streaming graph processing middleware.
- Designed and developed several Raphtory components in Scala: interfaces (traits) for the Spouts; implementations of the distributed partition manager, with a focus on concurrent data structures and their behavior; and implementations of the live analysis manager, based on the bulk synchronous parallel pattern.
Publications
-
MORA on the Edge: a testbed of Multiple Option Resource Allocation
Wendlasida Ouedraogo, A. Araldo, Al. Di Stefano, An. Di Stefano
2022 IEEE 11th International Conference on Cloud Networking (CloudNet)
-
Improving QoS through network isolation in PaaS
Al. Di Stefano, An. Di Stefano, G. Morana
Elsevier - Future Generation Computer Systems - Volume 131, June 2022, Pages 91-105
-
Prometheus and AIOps for the orchestration of Cloud-native applications in Ananke
Al. Di Stefano, An. Di Stefano, G. Morana
2021 IEEE 30th International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE)
-
EdgeMORE: Improving Resource Allocation with Multiple Options from Tenants
A. Araldo, Al. Di Stefano, An. Di Stefano
IEEE Consumer Communications & Networking Conference (CCNC), Las Vegas (USA) 2020
-
Resource Allocation for Edge Computing with Multiple Tenant Configurations
A. Araldo, Al. Di Stefano, An. Di Stefano
Proceedings of the 35th ACM/SIGAPP Symposium on Applied Computing, Brno, Czech Republic 2020
-
Ananke: A framework for Cloud-Native Applications smart orchestration
Al. Di Stefano, An. Di Stefano, G. Morana
2020 IEEE 29th International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE)
-
Scheduling communication-intensive applications on Mesos
Al. Di Stefano, An. Di Stefano, G. Morana
International Journal of Grid and Utility Computing (IJGUC), 2019
-
Coope4M: A Deployment Framework for Communication-Intensive Applications on Mesos
Al. Di Stefano, An. Di Stefano, G. Morana, D. Zito
2018 IEEE 27th International Conference on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE)
Certifications
- 3rd International Summer School on Deep Learning, Warsaw, Poland
- 01/2019
- VI Mediterranean School of Complex Networks, Salina, Italy
- 07/2019
- Lipari School on Network and Computer Sciences, Lipari, Italy
- 07/2017
- AngularJS certificate, University of Catania
- 01/2016
- Degree in Music Theory, Conservatory of Music "Vincenzo Bellini", Catania, Italy
- 01/2009
Societies
- Scout in the Italian Scout Association "Agesci", Italy
- 01/2000 - 01/2010
- Co-founder of the Scordia Linux User Group, Italy
- 01/2008 - 01/2012
- Hacktivist at Catania GNU/Linux User Group, Italy
- 01/2008 - 01/2017
- Hacktivist at Freaknet Medialab, Italy
- 01/2008 - 01/2017
- Scoutmaster in the Italian Scout Association "Agesci", Italy
- 01/2012 - 01/2020
- Volunteer at MOCA Olografix Camp Hackmeeting, Italy
- 08/2016 - 08/2016
Skills
Programming and Systems:
Python, Golang, Rust, C/C++, GNU/Linux, Docker, Kubernetes, OpenShift
DevOps and Cloud:
Ansible, Terraform, Jenkins, GitLab, AWS, Azure, Google Cloud, Bare Metal
Databases and Observability:
PostgreSQL, MySQL, MongoDB, Redis, Prometheus, Grafana, ELK Stack, Jaeger
Soft Skills:
Team Leadership, Mentoring, Public Speaking, Technical Writing