A learning from a mistake is a learning iff it's not repeated.
 - a wise person

For someone like me, with no industry experience whatsoever, this summer was an eye-opening experience. Before the internship started, I would tell myself that I was so good at the Machine Learning side of Data Science (I’ve got references) that I would probably blow their minds. Little did I know that I wouldn’t spend more than 10% of my internship designing models. OK, I am getting ahead of myself.

Let’s start with some background about the internship. I worked for 10 weeks in the Watson Healthcare & Life Sciences sector of IBM. This division provides consulting services to pharmaceutical companies and other medical organizations. As in most consulting firms, there are different consultant levels. People with an MS usually join as senior consultants, which is why two years of work experience is the most basic filter for the position. How did I get through the filter with no prior experience?

IBM likes using the word “Cognitive” a lot. I recently got to know that there’s a whole department within IBM called IBM Cognitive (yep, that’s it, that’s the entire name). The official title for the MS interns they hired as senior consultants was “Cognitive Engineer Consultant Intern” (phew, I am glad I know this now). After working there this summer, I think a simpler and more appropriate title for the senior consultant position would be “Solutions Engineer/Architect” (I am not a big fan of the term “engineer”). If I had to describe my internship in one sentence, I’d say that I built solutions (for some definitions of the word) for a big medical company, and those solutions will help save lives! How? We helped them extract information about their products from medical literature. They have to report this information to regulatory authorities if they want to keep their products on the market.

Dig deeper, shall we? The solution is a packaged answer to a business problem the client put in front of IBM. By packaged I mean that it caters to their every need, from the business aspects to anything technical needed to solve the problem. My role on the project team was Data Scientist, and I was responsible for any machine learning the project required. Does this mean I was a code monkey? No. The best part about IBM consulting is that even though you spend about 70-75% of your time doing what you were hired for, the rest of the time is spent learning important soft skills that people rarely get a chance to learn so early in their careers.

I got a chance to sit in on sales meetings (thanks, Michele). I didn’t say or present anything because I was just an observer, and I am glad about that: those meetings are like a date with a very demanding and stringent person (and in the meeting I went to, there were 10 of them), and I am not good on such dates. The client usually sends its best people to sales meetings to evaluate the solution IBM proposes and decide whether it is worth spending millions of dollars on. Oh, before I forget: consulting involves so many meetings every day, and most of them are with entirely different people.

I think I may have lost track (seem a little out of practice, do I?). Coming back to what I did in the space of Data Science, the project was mainly related to NLP. The most significant parts of the project, however, were preprocessing the data and labeling it ourselves to create training and test sets for the supervised models. In ML courses, the data generally arrives in a clean, structured form, but believe me, real-life data is crude and unstructured. The source of our data was PDF documents, which, although aesthetically appealing, are a horrible way of storing text. So I also worked on extracting, processing, and manipulating the data from those files. The whole team, including the project manager, spent their weekends (even the GoT finale weekend) manually annotating the extracted data. Once we had the desired rows and columns, the model we used for information extraction was a Maximum Entropy Markov Model (MEMM). The outputs again had to be processed and converted into a form suitable for the next step (building a dashboard).
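To give a flavor of why PDF text needs so much cleanup, here is a minimal sketch. It is not the project code: the helper name `clean_pdf_text` and the sample text are made up for illustration, and it assumes the raw text has already been pulled out of the PDF by some extraction tool. It only handles two common layout problems, words hyphenated across line breaks and lines wrapped in the middle of a sentence.

```python
import re


def clean_pdf_text(raw: str) -> str:
    """Normalize text extracted from a PDF page (hypothetical helper)."""
    # Rejoin words split by a hyphen at a line break: "pa-\ntient" -> "patient".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Blank lines are treated as real paragraph breaks and kept.
    paragraphs = re.split(r"\n\s*\n", text)
    # Inside a paragraph, a newline is only a layout wrap, so collapse
    # every run of whitespace into a single space.
    cleaned = [re.sub(r"\s+", " ", p).strip() for p in paragraphs]
    return "\n\n".join(p for p in cleaned if p)


if __name__ == "__main__":
    # Invented sample, not real project data.
    sample = "The pa-\ntient reported\nmild nausea.\n\nNo other   events\nwere noted."
    print(clean_pdf_text(sample))
```

One caveat with this approach: the de-hyphenation rule also merges genuine compounds that happen to break at a line end ("well-\nknown" becomes "wellknown"), so a real pipeline would likely need a dictionary check or manual review on top of it.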

One of my key takeaways as a data scientist was that developing a solution is an iterative cycle. Once you have the data, you run it through the model, evaluate the output, and then go back to the first step of annotating the data to improve the output. You keep doing this until the output is good enough. An important thing to learn in a consulting environment is the definition of “good enough”, which is very subjective. The managers at IBM are amazing at this because they have a lot of experience delivering solutions in limited time.
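The "evaluate the output" step in that cycle can be sketched as a comparison between the hand-annotated labels and the model's predictions. This is only an illustration under assumptions: the function name, the `PRODUCT` label, and the sample labels are all hypothetical, and the metrics the team actually tracked are not described here. Precision, recall, and F1 are simply a common choice for information extraction.

```python
def precision_recall_f1(gold, predicted, positive="PRODUCT"):
    """Score predicted token labels against hand-annotated gold labels.

    `positive` is the label of interest; "PRODUCT" is a made-up example.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    # Guard against division by zero when a class never appears.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


if __name__ == "__main__":
    # Invented labels for five tokens, not real project output.
    gold = ["O", "PRODUCT", "PRODUCT", "O", "PRODUCT"]
    predicted = ["O", "PRODUCT", "O", "PRODUCT", "PRODUCT"]
    print(precision_recall_f1(gold, predicted))
```

Numbers like these make each pass through the cycle comparable with the previous one, but they do not settle what counts as good enough. That threshold is still a judgment call about what the client needs.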

Apart from work, I really enjoyed the intern group outings (every other Friday). I also went to dinner with the North America practice leader of IBM Watson Health (Michele) and got to hear about her experiences, shared a 3-hour drive with an associate partner (Brian), and had lunch every day with the project team. The project was in a deserted town. Manhattan spoils you, it really does. In Manhattan, I have to walk just one street to get to a Starbucks or Dunkin for coffee, but in that town the nearest Starbucks was 1.7 miles from the IBM office, and coffee for me is like a hug in a cup that I need every 3-4 hours.

I’d like to thank Katy, Marina, Michele, and everyone I worked with for making my internship so amazing. I got to work with some of the finest brains in business strategy and advanced analytics, and the quote at the top is advice given to me by one of those fine brains at IBM. One thing I didn’t like about IBM (and was quite vocal about) was that they hire massively every year from one particular school (let’s call it DNV). I don’t have a problem with DNV; I have a problem with hiring from the same school every year. It dilutes the diversity of thinking in meetings and brainstorming sessions. Anyway, it’s not my place to say much about it.

Thanks,
Ashwin