How students can impress hiring managers

How can a student without any experience outside the classroom pique the interest of a hiring manager?

Have a project that:
1. Is eye-catching when boiled down to a few bullet points on a resume
2. Leads to an engaging 20-minute conversation

As a hiring manager looking for machine learning researchers, I’ve reviewed thousands of resumes and conducted hundreds of interviews, and the toughest resumes for me to evaluate remain new grads without internships or publications.

Why? Let me compare how my initial conversations go with different types of candidates.

  • Experienced candidate with a prior job: “Let’s chat about when you deployed X network in production to achieve result Y”.
  • New grad with an internship: “Give me more details about what you did for company C during the summer.”
  • New grad with a paper: “I was reading your paper, and I had a question about ablation study A”.
  • Other new grads: “Let me look over your resume… um, yeah, I guess I’ll try to pay attention as you tell me about yet another class project that applied a pretrained ResNet model to MNIST.”

At this point, there are enough students doing ML that it is no longer sufficient to stand out by just having ML classes on your resume. But if you can get an internship or publication, that continues to stand out. So are you screwed without an internship or pub? Not at all! But you do need to do some work to spice up your class or capstone projects.

What can you do to make a project that stands out? Below are a few ideas biased towards my work on neural networks for real time embedded systems.

  1. Open source code. An estimated 10% of published papers have open-sourced code. So take a cool new paper and code it up! Here is a very impressive repo that goes well beyond a class project, but for a simpler project, code up one single paper.
  2. Faster. Most academic papers and leaderboards focus on performance, often to the detriment of runtime. I’m always impressed when someone can take a paper and speed it up with minimal drop in performance. Some ways to speed up networks include changing the architecture, pruning, and quantization.
  3. Smaller. Networks are massively overparameterized and much larger than they need to be. Grab a network, squeeze it down, and show you can fit it onto an edge device. Check out SqueezeNet and the Lottery Ticket Hypothesis for interesting papers in this area.
  4. Cheaper. Training state-of-the-art neural networks is extremely time-consuming and costly. Demonstrate how to train a network with a limited GPU-hour budget and still get reasonable performance. Check out some ideas from Fast.ai for fast training and this article for training on a single GPU.
  5. Multitask. Academic networks are usually single-task, but real-time networks are usually Frankensteins with a shared feature map supporting multiple tasks. I recommend this review paper as well as this more recent paper to get started.
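To make the “Faster/Smaller” ideas concrete, here is a minimal sketch of unstructured magnitude pruning, assuming NumPy. The helper name `magnitude_prune` is my own invention for illustration; a real project would use a framework’s pruning utilities on live model weights rather than a raw array.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights.

    A toy illustration of unstructured magnitude pruning: keep the
    largest weights, zero the rest. `sparsity` is the fraction to remove.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of weights to zero out
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4))       # stand-in for a layer's weight matrix
pruned = magnitude_prune(w, 0.5)  # half the entries become exactly zero
```

The surviving weights are untouched, so a pruned network often recovers most of its accuracy after a short fine-tuning run, which is the usual next step in the papers mentioned above.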

Hope that helps! I look forward to chatting with you about these cool projects!

NSF GRFP 2018-2019

Thanks to everyone who has been sending me new essays to host. It’s crazy that my advice page went from only my essays and thoughts to 93 different examples!

I have to give a disclaimer that I haven’t been following changes to the NSF GRFP as closely this past year, but I think that my general advice still holds. If something seems outdated or wrong, please let me know. I always highly recommend getting multiple opinions and I’m glad that many people who have shared their essays are also writing about their experiences. Also check out the Grad Cafe for useful discussions.

Good luck everyone!

Python on a Mac

I personally do most of my coding on my laptop, which is a Mac. Eventually that code gets run on a Linux server, but all initial coding, exploratory data analysis, etc. is done on my laptop. And since I advocate for Python, I thought I would lay out all the steps needed to set up my Mac in the easiest manner. (Note: the steps are probably similar on Windows, but I haven’t used a Windows computer in so long that I don’t know the potential differences.)


Unfortunately, the Python 2.x vs. 3.x divide exists, and so far I have been unable to commit completely to 3.x due to a few packages with legacy issues. Luckily, there is a pretty easy solution below. Note that your Mac has Python preinstalled (go to Terminal and type python to start coding…). However, if you want to update any packages, you can quickly run into issues, so it is easiest to install your own version of Python.

  1. Install Anaconda (I advocate version 2.7; Anaconda will call this environment root)
  2. I recommend using Anaconda Navigator with Spyder as an IDE
  3. Install version 3.5 and make an environment (in Anaconda Navigator or with the terminal command below):
    $ conda create -n python3.5 python=3.5 anaconda
  4. You can switch between Python environments (root, python3.5):
    $ source activate {insert environment name here}
  5. To add new Python packages, use conda or pip (Anaconda has made its own pip the default)
  6. WARNING: always close Spyder before running conda update or pip. I once got stuck in a weird state where Spyder would no longer launch; apparently this can happen if Spyder is open while underlying packages get changed.

To get around the 2.x vs. 3.x issue, go to your terminal and use pip install for the following packages: future, importlib, unittest2, and argparse. See each package’s website for details of any differences. Then start your Python code with the following two lines:

from __future__ import (absolute_import, division, print_function,
                        unicode_literals)
from builtins import *

For nearly all scientific computing applications, you are essentially writing Python 3 code. So make sure to read the correct documentation!
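As a quick illustration of what those two lines buy you, here is a small sketch. It assumes the future package is installed for Python 2 (as described above); on Python 3, the file runs unmodified because builtins is part of the standard library.

```python
from __future__ import (absolute_import, division, print_function,
                        unicode_literals)
from builtins import *  # noqa: F401,F403 -- Python 3 style builtins via the `future` package on 2.x

# With the __future__ import, / is true division even on Python 2,
# matching Python 3 semantics; use // when you want floor division.
print(5 / 2)   # 2.5
print(5 // 2)  # 2

# print is now a function, so keyword arguments like sep work.
print("a", "b", sep="-")  # a-b
```

Without the division import, Python 2 would print 2 for the first line, which is exactly the kind of silent difference that breaks ported code.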

Personally, I found Anaconda to be a lifesaver. Without it, I got stuck in a weird infinite update loop trying to install all the required packages for machine learning (specifically Theano).

Now you are ready to code! If you aren’t familiar with Python, my recommended tutorials will be in a future post.

 

Webmaster Buddy

Feel free to leave comments on posts. In order to stop spam, if you are a first-time commenter, your comment will be held and must be approved by the webmaster, Buddy:

[Photo: Buddy at the computer]

Whether your comment is deemed spam will be arbitrarily decided by the whims of Buddy. He can be bribed with treats such as peanut butter and Cesar’s. Don’t expect a prompt response, since Buddy’s usual state is this:

[Photo: Buddy asleep]

Initial Conditions

I (i.e., Alex Lang) am a physics PhD currently doing postdoctoral research at the Salk Institute in San Diego. I work on a variety of research topics such as physics, computational neuroscience, machine learning, and theoretical biophysics. To an outsider, that probably looks like a jumble of topics, but I swear, there is a theme! In my research, I apply techniques (both conceptual and mathematical) from statistical physics to a variety of problems. Statistical physics is the domain of physics that applies to large systems (number of “particles” N with N \gg 1). In many ways, large systems are simpler than small systems, so taking the extremely large system size limit (N \to \infty) often brings useful insights into a problem. So the blog name is inspired by statistical physics, my broad interests, and, of course, Buzz Lightyear.

The blog will focus on research topics of interest to me and hopefully others. I will also blog about research in general, what academia is like, and other science-like things (including science-fiction!).

I will focus on occasional but detailed posts. My personal goal for 2016 is 25 posts of substance, so one every two weeks. I’m hoping the journal club we are starting up at the Salk will provide plenty of material; more details on that soon.

If you are interested in following, I personally enjoy Feedly as an RSS reader. Or you can sign up for email alerts (see sidebar) or follow me on Twitter @n2infty.