Ask HN: Are deep learning theorists employable in industry?

I am an undergraduate student interested in deep learning, and I'm curious how applicable deep learning theory is in practical settings. Based on my very limited conversations with people at my university and at career fairs, plus what I've read online, my impression is that a lot of hyperparameter selection decisions come from the engineer's intuition and past experience. Combining my knowledge of statistics with some deep learning theory surveys that I've tried my best to understand, it seems possible to come up with a rigorous, procedural method (something that could be streamlined) for determining hyperparameters and candidate architectures if we know certain properties of the dataset and the intended task.

Is the theory good enough for practical applications? (By practical I don't mean the large-scale projects that Google or OpenAI do. I mean small companies that want to apply established methods to their own datasets.) If yes, do people actually do this? If no, I'd like to hear more of your thoughts on why.

7 points | by aaronli2003 13 days ago

6 comments

jononor 12 days ago
I do not understand what your current research thesis, and its validity, have to do with your employability. You will be employed to solve the problems that the organization has. These will generally be much more varied than any one piece of research can address. In almost all cases, your research will be evaluated as proof that you can do challenging things, and therefore as a sign that you are capable of doing the work they have. The work is rarely evaluated directly for its applicability.
psyklic 12 days ago
Many of the best engineers have a strong interest in the literature, even if things are not completely understood theoretically. Good engineers provide strong justification for tuning hyperparameters based on both intuition and theory. The same goes for suggesting architectures -- the best people don't guess, but they can debate the theoretical purpose of each element and how tuning it may affect the outcome. If something isn't completely understood, they may code small experiments and/or work through the math to better understand its potential impact.
Beyond this, optimizing models requires a strong understanding of the math behind them. This provides crucial insights, for example, that the attention key bias does not affect the attention weights. In industry, engineers might read a paper about a new activation function. They will probably wonder whether there is a theoretical justification for how it would affect training time or computational efficiency in the company's architecture. A theorist would be a great fit here.
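The key-bias point can be checked numerically. The sketch below assumes standard scaled dot-product attention with a softmax over key positions and a single bias vector b added to every key (as in a linear key projection). Under those assumptions q·(k_j + b) = q·k_j + q·b, so each score for a given query shifts by the same constant q·b, and softmax is unchanged by a constant shift. The function and variable names are illustrative, not from any particular library.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attention_weights(query, keys, key_bias=None):
    """Scaled dot-product attention weights for one query vector.

    key_bias (assumed shared across all key positions) is added to
    every key before the scores are computed.
    """
    d = len(query)
    if key_bias is not None:
        keys = [[k_i + b_i for k_i, b_i in zip(k, key_bias)] for k in keys]
    scores = [dot(query, k) / math.sqrt(d) for k in keys]
    return softmax(scores)


# Random toy inputs: one query, five keys, dimension eight.
rng = random.Random(0)
d, n = 8, 5
query = [rng.gauss(0, 1) for _ in range(d)]
keys = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
key_bias = [rng.gauss(0, 1) for _ in range(d)]

without_bias = attention_weights(query, keys)
with_bias = attention_weights(query, keys, key_bias)

# Largest elementwise gap between the two weight vectors;
# it should be zero up to floating-point rounding.
max_diff = max(abs(a - b) for a, b in zip(without_bias, with_bias))
print(max_diff)
```

The same argument does not apply to the query bias or the value bias, which is why this sketch only touches the keys.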
Some higher-level research could have commercial applications. For example, there are a few papers showing near-LLM performance (on certain tasks) attained purely by searching datasets.
Ultimately, you will be successful if you can find where your skillset overlaps with what the company needs. Sometimes there is a research division that needs exactly what you research. Other times, you may need to work on things that aren't an exact fit but still use your skills. But as of right now, it seems that a successful theorist with coding ability would be in demand (e.g. https://www.anthropic.com/research).
p1esk 11 days ago
Your question sounds like: “are power lifters employable as movers?” The answer is sure, if they want to be movers, and if they have other qualities needed to be a good mover.
But if your question is whether DL theorists are employable as DL theorists, the answer is again - sure, but you have to be really good at it.
talldayo 13 days ago
> Is the theory good enough for the practical applications?
Unless you are implementing your theories yourself, I'm going to go ahead and guess "no" on this one.
yorwba 13 days ago
I think you might want to read the Tuning Playbook if you haven't already. https://github.com/google-research/tuning_playbook