Data Science: Behind the buzzword

05-08-2025

What an EPFL data science course taught me about judgment, messiness, and why clean data is rarer than you think

The course

Before I took the Foundations of Data Science course offered by EPFL in spring 2025, data science seemed like a buzzword: an opaque but essential skill, and somehow not for me. I vaguely knew about data analysis and coding in Python, but had assumed it was meant for other kinds of professionals. When this opportunity came up, I decided it was time to stop being intimidated and deepen my ability to analyze the data I already work with. This online, self-paced program gave me a solid grounding in the entire data science workflow, from collecting and cleaning data to analyzing it with R, creating visualizations, and communicating insights effectively. To pass the course, I submitted four final reports, each based on a different type of data source: a SQL database, an Excel file, an API, and scraped HTML. Below you'll find the links to the final project reports I submitted to obtain my Certificate in Foundations of Data Science.

Reflections

I gradually learned that coding is like another language, with its own syntax and grammar. Once you understand the underlying logic, it starts to feel familiar, and that understanding helps you get a grasp of other programming languages. I also learned that code is the invisible infrastructure that supports most of our daily activities. For instance, machine learning has only recently become a trending topic in the media, yet the technology has been powering systems we use every day for years. It just wasn't visible to those of us outside the field.

In parallel, I realized how important data management and analysis are, and how many organizations may lack the skills and tools to guarantee consistent and reliable data. The nitty-gritty details also matter: to clean a dataset, we are forced to make decisions, for example whether to impute missing values or exclude them. These decisions affect the overall methodology and, ultimately, the interpretation of the data, so we have to be aware of them. In other words, the work resembles that of a detective: understanding the context in which the data was collected and the purpose of the analysis, making decisions, interpreting the data, and finally translating it for the audience in a meaningful way.
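
To make the impute-or-exclude decision concrete, here is a minimal sketch in Python. It is an illustration only: the course itself used R, the `ages` values are invented, and the helper names `drop_missing` and `impute_mean` are hypothetical. Mean imputation is just one of several possible strategies, chosen here because it is the simplest to read.

```python
from statistics import mean, stdev

# Invented example data: None marks a missing answer.
ages = [23, 31, None, 45, None, 29]


def drop_missing(values):
    """Exclude missing entries entirely (fewer rows, nothing invented)."""
    return [v for v in values if v is not None]


def impute_mean(values):
    """Replace each missing entry with the mean of the observed values."""
    fill = mean(drop_missing(values))
    return [fill if v is None else v for v in values]


dropped = drop_missing(ages)
imputed = impute_mean(ages)

# Compare sample size, average, and spread under each choice.
for label, data in (("excluded", dropped), ("imputed", imputed)):
    print(label, len(data), mean(data), round(stdev(data), 2))
```

In this small example both choices give the same average, but imputing keeps all six rows while shrinking the standard deviation, because the filled-in values sit exactly at the mean. That is the kind of quiet consequence a cleaning decision can have on later interpretation.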
