👋 Welcome to this issue of RM-RF, where you get to hear about crazy dev war stories from inspiring tech leaders.
This week I interviewed Charles De Groote, co-founder and CTO of Sortlist, a matchmaking platform for marketing agencies. Charles is a very pragmatic tech leader who’s learned a ton on the job as his team grew to 100+ people. Read on for exciting anecdotes!
🕰 For how long have you been working on Sortlist?
We launched the company 7 years ago together with my 3 co-founders. I was initially the only founder with a technical background. Interestingly, Michael, the founder who originally headed Product Management, decided to learn to code from scratch after we’d launched the company. He ended up working in the dev team and has been managing developers ever since.
🏉 How big is your team today?
The whole Sortlist family is 100+ people strong today. Of these, we’ve got 15 people in the engineering team and 5 people in the product team (3 PM + 2 designers), which is headed by Robin, our Head of Product.
The engineering team is composed of a dozen versatile devs, along with 1 QA, 1 infra manager, and 2 line managers who handle all the 1-1s etc. These folks in particular give me a lot of leverage, as I was getting overwhelmed by the 1-1s, especially in the beginning when we hired mostly junior talent. Nowadays I even find time to continue coding myself 😋.
We work as 3 independent product squads and run quarterly “mercatos” where junior and mid-level developers get to change teams to tackle different challenges. Senior people mostly remain in their areas of expertise.
🛠 Can you describe your tech stack in a nutshell?
Well, it has evolved a lot over these 7 years, along with our knowledge and skills. It originally started with a Ruby on Rails monolith and an Angular front-end.
2 years ago we started splitting the back-end into microservices. After many experiments (Node.js + Express, NestJS, Python + Flask, Ruby on Rails), we settled on RoR as our go-to framework. We took the same journey in the front-end 1 year ago, picking Next.js this time.
In terms of infra, we mostly run on AWS with Kubernetes. We were initially on Heroku but had to move away, partly because of database costs. At €2K/month, the database represented 50% of our infra costs, which we were able to bring down using Amazon Aurora. Another issue with Heroku was inter-service communication, which required us to pass messages over the public internet. Clearly not an optimal setup.
It’s worth mentioning that we at Sortlist handle multiple apps corresponding to our multiple stakeholders: agencies, consumers, back-office… That’s 9 apps in total. We’ve therefore had fierce discussions over whether to group them all into a monorepo to leverage common modules. However, this would introduce more DevOps overhead (to isolate builds, etc.). Given that infra is currently our least staffed area, we decided to keep it simple with separate repos.
🐛 What is the most elusive bug you and your team ever had to fix?
You surely know this famous quote from Phil Karlton:
“There are only two hard things in Computer Science: cache invalidation and naming things.”
Well, we suffered badly from caching issues related to cookies 🍪. Our multi-app stack runs on multiple domains, which is made more complex because we use geographical TLDs such as .fr and .be. We made the painful mistake of sharing cookies between all these apps and domains, which in hindsight is a dangerous pattern.
What happened is that, for the short period of one release, we unknowingly broke authentication in one app. This resulted in corrupted cookies being released into the wild and stored by our users' browsers for all apps. As a consequence, we started getting more and more reports of users unable to log in at all. The problem came and went for several weeks, and we never managed to reproduce it. That is, until one of our employees actually faced the problem themselves. We of course got super excited and told them, “Don’t touch anything, a dev is on their way to check this out!” That’s how we finally diagnosed the cookie issue. We ultimately had to go nuclear to fix it, invalidating all cookies at once 🤯.
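For readers wondering what "invalidating all cookies at once" can look like in practice, here is a minimal Python sketch of one common approach: embedding an epoch number in every signed cookie and rejecting any cookie from an older epoch. This is an illustration only, not Sortlist's actual fix, which the interview does not describe. The names `SECRET_KEY`, `CURRENT_EPOCH`, `issue_cookie` and `read_cookie` are hypothetical, and the sketch assumes user ids contain no dots.

```python
import hashlib
import hmac
from typing import Optional

# Hypothetical values for illustration only; not taken from Sortlist's setup.
SECRET_KEY = b"replace-me-with-a-real-secret"
CURRENT_EPOCH = 1  # bumping this number rejects every cookie issued before


def _signature(payload: str) -> bytes:
    """Return the hex HMAC-SHA256 of the payload under the server secret."""
    digest = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return digest.encode()


def issue_cookie(user_id: str, epoch: int = CURRENT_EPOCH) -> str:
    """Build a cookie value of the form '<epoch>.<user_id>.<signature>'."""
    payload = f"{epoch}.{user_id}"
    return f"{payload}.{_signature(payload).decode()}"


def read_cookie(value: str, current_epoch: int = CURRENT_EPOCH) -> Optional[str]:
    """Return the user id, or None if the cookie is malformed, forged or stale."""
    parts = value.split(".")
    if len(parts) != 3:  # assumes user ids contain no dots
        return None
    epoch, user_id, signature = parts
    expected = _signature(f"{epoch}.{user_id}")
    # Constant-time comparison on bytes, so odd characters cannot raise.
    if not hmac.compare_digest(signature.encode(), expected):
        return None
    if epoch != str(current_epoch):
        return None  # issued before the last global invalidation
    return user_id
```

With this shape, a corrupted or stale cookie simply reads as "not logged in", and the user is sent back to the login page instead of being stuck. The trade-off of bumping the epoch is the one Charles describes: every user has to log in again.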
😱 What is the most stressful tech situation you ever faced?
It’s again something caused by our relative ignorance back in the day - 5 years ago to be precise. Back then we hadn’t mastered all the intricacies of DB management, which resulted in lengthy migrations that often required putting the website in maintenance mode.
One such migration began innocently one night at 7PM and rapidly morphed into a monster event that required a ton of manual operations on my side. I ended up having to work the whole night on it. I normally never drink coffee, but when my co-founders came to work that morning I had to ask them for one - probably the only time they ever got this request from me ☕!
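The interview does not say how Sortlist later tamed its migrations. As a hedged illustration of one widely used way to avoid a long maintenance window, the Python sketch below backfills a new column in small batches, so each transaction stays short. The `users` table, its `email` and `email_lower` columns, and the `backfill_in_batches` helper are all hypothetical, and SQLite stands in for a production database.

```python
import sqlite3


def backfill_in_batches(conn: sqlite3.Connection, batch_size: int = 1000) -> int:
    """Fill users.email_lower in small transactions; return the rows updated."""
    total = 0
    while True:
        with conn:  # commits (or rolls back) one short transaction per batch
            cursor = conn.execute(
                """
                UPDATE users
                SET email_lower = lower(email)
                WHERE id IN (
                    SELECT id FROM users
                    WHERE email_lower IS NULL AND email IS NOT NULL
                    LIMIT ?
                )
                """,
                (batch_size,),
            )
        if cursor.rowcount == 0:
            return total  # nothing left to backfill
        total += cursor.rowcount
```

Because each batch commits on its own, the job can be stopped and resumed, and the application keeps serving traffic between batches. The exact locking behaviour depends on the database engine, so treat this as a starting point rather than a recipe.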
✌ What's your best piece of learning on these topics?
We’ve learned a lot over the years about the pitfalls of over-engineering our systems. At some point, for example, we rolled out our own custom reverse proxies in Node, which led to unmanageable timeouts… Now we’ve wised up as a team and focus on pragmatic solutions to real business problems.
Personally as a leader, I’ve learned how valuable it is to surround yourself with the right people with the right experience for the job. An example is the hiring of our DevOps manager and that of our more senior developers, who’ve brought in expertise I could never have developed on my own.
⭐ Finally, are you hiring any developers these days?
We went through a temporary hiring freeze because of the Covid situation, but we are now fully back on track and hiring 1 additional DevOps engineer as well as 3 React and 3 RoR engineers.
Thanks a ton to Charles for this interview 🙏. It’s been very exciting to follow Sortlist’s story since its early days. Also cool to meet someone who’s worked with both Ruby on Rails and NestJS and enjoys the similarities between the two frameworks ❤️ If you’d like to work with Charles, join Sortlist’s team here. See you in our next issue of RM-RF with Laurent Perrin from Front.