On Attending Machine Learning Summer School

Over the last two weeks, I attended the Machine Learning Summer School (MLSS), hosted by Columbia University’s Center for Financial Engineering and sponsored by Bloomberg. Having just finished the second year of my PhD, this was a fantastic opportunity for me to learn about the broader scope of modern machine learning and meet other PhD students in the field. In this blog post, I’ll share some of my key learnings and experiences!

MLSS 2026 cohort. Source: MLSS.

Key Learnings

This was a very intensive 2-week program. Distinguished professionals and academics were invited to give us dense lectures on various topics in machine learning (causal ML, reinforcement learning, agentic AI, ML for finance, probabilistic ML, neurosymbolic AI, mechanistic interpretability, diffusion models, etc.) and related meta topics, like AI governance, AI safety and alignment, multilingual and multicultural foundation models, AI ethics, AI for scientific discovery and reasoning, and more. Below I include some very condensed, high-level ideas and takeaways in no particular order:

A slide from Elias Bareinboim's talk on Causal ML. Source: author.
  • Foundation models are very powerful for solving big problems.
  • Agentic AI holds potential to automate away the tedious parts of research, but this is a nuanced issue. The key guiding question is really, ‘Under what constraints can the agent behave reliably?’
  • Diffusion models offer a new paradigm for generative vision and language.
  • Interpretability and explainability are still unsolved problems: these models are getting too deep and complex for us to reliably interpret the black box.
    • Mechanistic interpretability is a step towards solving this problem.
  • Causal ML and neurosymbolic AI offer alternative paradigms to traditional ML, which is primarily probabilistic.
  • In the fintech sector, most analysis is classical ML-based, but there is a push towards using foundation models for highly detailed predictions.
  • System design at all levels matters - and so does evaluation of system behavior.
  • Speaking of evaluation - this is a research area in its own right, and current evaluation metrics don’t capture the full behavior.
    • Furthermore, current leaderboards and benchmarks enable ‘benchmaxxing’, i.e. overfitting to the task, rather than holistic evaluation.
  • For model safety, alignment is the goal and steering is the method.
    • Model and data bias remain a core, fundamental issue. Some talks focused on very interesting discussions on this issue, such as coloniality in alignment, embedded cultural and linguistic biases, engineering ethics vs AI ethics, etc.
An excerpt from Andrew Wilson's talk on probabilistic ML. We derived probability-driven loss functions from first principles. Source: author.

Some memorable/hilarious quotes by speakers that stayed with me:

We’re still at the mercy of gradient descent - Michael Ivanitskiy

With great power comes great ability to evade accountability - Steve Casper

When you create risks, you have obligations to manage those risks - Enrico Santus

I know it seems like it, but I don’t actually know everything - Michael Ivanitskiy

‘What have we learned so far?’ Good question, I ask myself that every day - Elias Bareinboim

Generally when a paper title is a question, the answer is no - Andrew Wilson

What’s amazing about LLMs is they have no ego - Dean Foster

I filled up quite a few pages in my notebook trying to capture all these different aspects of modern machine learning (and ended up buying another to capture yet more). I have a ton of new ideas for my own research now!

An excerpt of author's notes, plus the two notebooks she filled out with her ideas. Source: author.

Other than the lectures, we also had the opportunity to present our research at a poster showcase, take part in a hackathon, and visit the Bloomberg Headquarters!

Author's poster presentation on her research, and author at Bloomberg Headquarters. Source: author.

My Thoughts

In the opening keynote, Gary Kazantsev and Ali Hirsa stated the goal of MLSS was for participants to develop judgement about the types of questions that are worth asking, what systems work, and what don’t. This resonated with me a lot, and helped frame my perspective at MLSS.

“Good judgement lasts a lifetime” - Gary Kazantsev

Gary Kazantsev (speaking) and Ali Hirsa (left) at the opening keynote. Source: MLSS.

My PhD research thus far has focused on security for classical computer vision models from the last decade (traditional CNNs), which is still an unsolved problem. Prior to attending MLSS, I had been wanting to understand the rapidly evolving landscape of modern LLM-driven models, and how it might apply to my research. MLSS did indeed help me understand the broad landscape and the potential security gaps that I could explore in my work, especially across different domains.

I walked in skeptical of using AI to automate away my work, for fear of not learning the ‘under-the-hood’ behaviors of these models. Through many of the talks, discussions and conversations, I now have a slightly better appreciation for using AI to automate away some of the tedious tasks of research (such as writing matplotlib code, or refactoring my codebase) so that I can focus on the real science and discovery part, although I would never use AI for the ‘real’ science work and writing. As I evolve in my PhD research, I know my relationship with AI will also evolve, but MLSS was very useful for me to see and understand how others are using AI for various aspects of their research.

A slide from Brendan Hogan's talk on AlphaLab, an agentic AI harness system for a research workflow. Source: author.

Although the technical talks were very useful, I found the ‘meta-AI’ talks to be especially valuable. As a computer vision security researcher, I am constantly thinking about the biases in my data and models, and how to make my models more trustworthy through explainability, interpretability, robustness, and more. Through some of the talks, I gained new perspectives on how to move towards model alignment and make inherent biases explicit.

A slide from Steve Casper's talk on AI governance. Source: author.

I also loved that the speakers in the program supported the guiding goal that Gary Kazantsev mentioned, in that they provided overviews of their domains, backed by literature references and specific technical discussion. I found this teaching method to be especially effective for me and the way I’m being trained to think as a PhD researcher - what’s the source of your information, what’s the overall landscape of where your work sits, who are the foundational people and works, etc. I also appreciated how willing the speakers were to engage with our curiosity, and that my peers were also willing to ask the speakers a lot of questions - I know I asked way too many questions, but it really did help clarify a lot of the novel concepts being presented to me!

Author (yellow striped T-shirt) asking lots of questions after Elias Bareinboim's talk on Causal ML. Source: MLSS.

I also loved that everyone was authentically themselves. What I mean by that is, we were all talking about and representing what’s important to us, whether that’s security in AI (me), Moo-Deng memes, using internet slang words as a metaphor for multi-lingual/cultural NLP, or highlighting colonial biases in LLMs, etc. Often a misconception about computer scientists (or STEM in general) is that we’re too “in the weeds”, or that we’re too serious, and that can sometimes create an environment of isolation, but at MLSS, I observed a warm, thriving, creative community of AI scientists from all backgrounds, here to learn together as a community.

(a) A slide from Steve Casper's talk on AI governance. (b) A slide from a hackathon team's presentation. (c) A slide from Wei Xu's talk on LLMs and NLP. (d) A slide from Bálint Gyevnár's talk on pitfalls of AI scientists. Source: author.

As a lowkey House, M.D. fan, I was also delighted to see two different speakers independently include House references in their talks. I love it when House is the metaphor 😂

A slide from Andrew Wilson's talk on probabilistic ML with an obscure House, M.D. reference. Source: author.

About The Program

About 200 PhD students (and postdocs, some Master’s students and undergrads) from all over the US and around the world were selected to attend this program. This year there were about 1,200 applicants. The organizers have been working for the last 10 years to bring this program to NYC, and thanks to their perseverance, they brought this amazing program to life! Since the program was hosted at Columbia University, we got to stay in on-campus housing, not too far from the lecture hall. We were assigned roommates, and I might be biased, but I think I lucked out with the best housemates ever! This was also sort of my first time living in a dorm (since I lived at home during undergrad, and still do). I really did enjoy the on-campus apartment they put us in, and just how easily accessible the rest of campus was!

(left) Author in her dorm, (right) author and her awesome housemates: Ruth, Taqiya, Estela. Source: author.

As intensive as the program was, I can say I still had a blast outside of lectures. I definitely took advantage of the city that never sleeps for many adventures!

Author and friends (Zach, Emily, Stacy) eating NY pizza (the best kind of pizza - fight me!); author enjoying delicious Blue Bottle coffee (who knew adding chicory to coffee makes it taste so good? not me before visiting Blue Bottle!). Source: author.

The Columbia University campus itself was gorgeous, and to be honest, quite a sharp contrast from my home campus (UConn)! Of course, being located in Manhattan, the campus was lively most of the time, which surprised me, since UConn is pretty quiet, especially during the summer, owing to its rural/agricultural roots (which are under-appreciated in themselves). I was also surprised to learn that Columbia is a ‘gated’ campus, i.e. you have to badge in when returning to campus from the ‘outside world’. I wasn’t able to visit Columbia’s gorgeous libraries since my visitor badge didn’t have access, but I did admire them from outside. There was also a lot of construction happening, which cracked me up because as an academic, you know it’s summer when they start construction on campus (can confirm, they’re renovating Gampel Pavilion back home at UConn right now).

Columbia's main library, with construction in the foreground. Source: author.

Closing Thoughts

Overall, I learned a lot from this program - not only from the many lectures, but also from the people I met and the conversations we had. Being with 200 other passionate academics-in-training truly made me realize that being a scholar means digging deeper and integrating holistic perspectives to aid in the search for meaning. In other words, science is messy but rewarding! It may be cliche, but I truly think attending MLSS was a great privilege and honor, and has made a lifelong impact on me, not only in terms of how I intend to scope my research now, but also in terms of the peer connections I made in this supportive and curiosity-driven environment! As scientists, we are shaped not only by our passion, but also by our community, and I am so lucky to be part of such a great community!

A quote from Kush Varshney's talk on an AI safety framework. Source: author.

No generative AI was used to write this blog article. This is content straight from my brain to your screen, 100% aayushi!