Ask HN: How are Data Scientists keeping themselves updated?
I have been in Data Science for 8 years now. The latest developments in deep learning and LLMs are moving at such a high pace that it is getting difficult to keep myself updated on the latest trends while simultaneously delivering work at my current job.
How are others managing?
I don't try to keep myself updated. I don't listen to podcasts. I'm not subscribed to any data science newsletter. Neither do I go to meetups. I think this is all a distraction.
If I encounter a problem that I cannot solve sufficiently with the methods I already know, I start exploring and read material until I find something that does the job.
The other way around makes you try to apply your new and fancy method everywhere simply because you're excited about it and it's new.
There's a similar phenomenon in tech in general, when people suddenly start to adopt OOP everywhere or there's a new JavaScript framework around the corner without assessing what the benefit will be.
I 100% do exactly the same. I gave up following what's new in Artificial Intelligence (Machine Learning?) years ago. 99% of it is distraction, and it's not worth my time to find that last 1% of useful information. Instead, I focus on improving my foundations: statistical inference, linear algebra, calculus, classical machine learning (e.g., regression, boosting, component analysis, ...), programming, domain knowledge, social skills, ... I only learn a new technique if I cannot solve a problem with my usual toolbox (which is not very often).
I'm way more productive, have to work less hard, and I'm not distracted. Sure, I don't do that fancy new thing, but at the end of the day (or earlier) I get the job done. And I'm judged on what I do, and how it brings money into the company, not how I do it.
Another benefit of working mostly with a box of boring, old tools is that they will likely still be relevant in 30 years. You never know how long that new popular thing will remain popular and useful. But I'm pretty sure we'll still fit datasets with linear/logistic regressions, optimize processes with linear programming, or do straightforward A/B testing for the next few decades (if not centuries or millennia).
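To illustrate how little machinery the "boring old tools" need: here's a minimal sketch of a straightforward A/B test as a two-sided two-proportion z-test, in plain Python with only the standard library (the conversion counts are made-up numbers for illustration):

```python
import math

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided two-proportion z-test for a simple A/B comparison."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled conversion rate under the null hypothesis (no difference)
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF via math.erf
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical example: 120/2400 conversions in A, 150/2400 in B
z, p = two_proportion_ztest(120, 2400, 150, 2400)
```

The whole thing fits in a dozen lines and will work the same way in 30 years.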
I agree broadly, though I think it's important to distinguish between techniques, people, and religions. I'll follow certain people on LinkedIn who regularly post useful technical stuff in relatively plain language that I might not know about and which might come in handy. I've picked up some genuinely useful stuff this way. But then there are hordes of religious frequentists and bayesians having pseudo-intellectual knife fights and I avoid them about as vigilantly as I would people having actual knife fights.
I have an hour each day permanently carved out on my calendar just for reading https://arxiv.org/list/cs.LG/new. It satisfies the "continuing education" portion of my annual goals. At the end of last year, I was able to use my browser history to count the preprints I had opened (at least skimmed), and the total more than satisfied my goal.
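If you'd rather pull the listing programmatically than browse it, arXiv also exposes a public Atom API. A minimal sketch of building a query for the newest cs.LG preprints (the category and result count are placeholders you'd swap for your own field):

```python
import urllib.parse

def arxiv_query_url(category="cs.LG", max_results=50):
    """Build an arXiv API query URL for the newest preprints in a category.

    Uses arXiv's public Atom API at export.arxiv.org; the category
    and max_results here are illustrative defaults.
    """
    params = {
        "search_query": f"cat:{category}",
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "max_results": max_results,
    }
    return "http://export.arxiv.org/api/query?" + urllib.parse.urlencode(params)

url = arxiv_query_url()
# Fetch with urllib.request.urlopen(url) and parse the Atom XML
# (e.g. with xml.etree.ElementTree) to list the entry titles.
```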
This is only one source among the many you should consult, and I'm biased because I'm a co-founder, but cognitiveclass.ai constantly publishes new guided projects (and courses) on related topics. They are free and, in the case of guided projects, quick. Sort by new and have at it: https://cognitiveclass.ai/courses?type%5B%5D=all&sort%5B%5D=...
I try to regularly check my sources and discuss the news with my network.
Resources:
People (Karpathy, Andrew Ng), YouTube channels (AI breakdown, AI explained), websites/newsletters (The Batch!), conferences, Reddit (r/artificial, r/datascience), Discord servers (Hugging Face, LLMOps.space), podcasts (Last Week in AI, Super Data Science Podcast).
I am in a WFH setting, so most of the conversations in the network are online. I need to have more IRL interactions with other DS's to bounce off ideas.
I was just thinking last night about how I hadn't heard anything about 'data science' for a while.
It seems to be at the very tail end of the hype cycle, having gone from 'the new word for programming', the big growth area / easy way to get hired that every company had to have, to... the old word for 'AI' now?