Timeline
Welcome to my timeline! On this page, you can find many of the experiences I have already listed on my resume. However, a few bullet points do not do justice to the projects I worked on and the skills I gained at these companies. I use this space to talk about my work in greater detail, covering the topics I find most impactful and reflecting on each experience as a whole.
Click on any one of my experiences to learn more!
Career
In progress!!! :D
Throughout my final quarter at UW, I worked on a research paper with Kyle Lo's team at the Allen Institute for Artificial Intelligence. This project falls under NLP for scientific literature, focusing on classification tasks involving scholarly papers. The purpose of the project was to build a better understanding of how citations are used in research papers, specifically in the field of computer science. After designing the different use cases of citations (e.g. providing background information or offering a counterpoint), we were interested in seeing whether a model could accurately classify how a citation is used in a paper.
The researchers on this team had previously found that context windows around citations are valuable for learning: a model will classify the intent of a citation more accurately if a window of surrounding context is provided. This window varies, since each citation can have multiple sentences (or none) before and after the citing sentence that provide context. To aid the citation intent classifier, I built a model that captures the optimal context window for each citation. I experimented with the BERT and Longformer models from HuggingFace, using their sequence tagging (token classification) variants to capture different windows. To achieve this, I devised a token tagging scheme that captures the citation context windows and modified the model's loss function so that the loss is calculated only on the citation window tokens.
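As a rough illustration of that last step, here is a minimal sketch using HuggingFace's token classification head. The label scheme and window indices are made up for the example; the key idea is that tokens labeled -100 are ignored by the loss, so only the tokens inside the (hypothetical) citation window contribute to training.

```python
# Minimal sketch (illustrative, not the exact setup from the project):
# tag each token as inside/outside a citation context window, and mask the
# loss so that only the tokens we care about contribute. HuggingFace's
# token classification loss ignores labels set to -100.
import torch
from transformers import BertTokenizerFast, BertForTokenClassification

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForTokenClassification.from_pretrained("bert-base-uncased", num_labels=2)

text = "Prior work explored citation intents [CITATION]. We build on this idea."
enc = tokenizer(text, return_tensors="pt")

# Hypothetical window: pretend tokens 5-12 form the citation context window.
labels = torch.full(enc["input_ids"].shape, -100, dtype=torch.long)  # -100 = ignored
labels[0, 5:13] = 1                                                   # 1 = in window

outputs = model(input_ids=enc["input_ids"],
                attention_mask=enc["attention_mask"],
                labels=labels)
print(outputs.loss)  # loss is computed only over the labeled window tokens
```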
I was able to greatly expand my technical knowledge in NLP throughout this experience. Here is a brief list of stuff I learned:
- Quality data matters most. It doesn't matter how fancy or sophisticated your model is; if your data is bad, your model won't be able to learn.
- Problem formulation and design are tricky. Sometimes you think you have a simple classification task when, in reality, you have a very complex task with multiple different scenarios. It is important to really think about what problem your model is trying to solve and how it should go about solving it.
- Transformers are quite powerful. By building contextualized sentence representations using attention, the model can "understand" the data as if it were reading through it, remembering what happened before and after. We exploited this aspect of the transformer to build a sequence tagger and classifier all in one.
- And a bunch of other technical details about training and debugging machine learning models!
Huge thanks to Noah Smith for connecting me with Kyle, which led to this research assistant opportunity.
I spent the summer of 2020 working as a research intern focusing on NLP at Giving Tech Labs. GTL is a nonprofit that works in public interest technology, addressing systemic social issues through AI, data science, and sustainable models.
My first project involved a binary classifier for labeling web pages, the purpose of which was to determine whether a given web page offers grants for projects. I refactored the code base for this classifier from Jupyter Notebooks into standalone Python classes, so that models could be trained, saved, and uploaded to Azure from the command line. I also determined the optimal model through various comparisons and tuned the hyperparameters of the final model.
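A rough sketch of the shape that refactor took is below. The class and flag names are my own for illustration, the model here is a simple TF-IDF + logistic regression stand-in, and the Azure upload step is left out; the point is moving training and persistence out of a notebook and behind a command-line entry point.

```python
# Illustrative sketch of the notebook-to-module refactor (names and model are
# placeholders, not the production classifier; Azure upload is omitted).
import argparse
import joblib
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


class GrantPageClassifier:
    """Binary classifier: does this web page offer grants?"""

    def __init__(self):
        self.pipeline = Pipeline([
            ("tfidf", TfidfVectorizer(max_features=50_000)),
            ("clf", LogisticRegression(max_iter=1000)),
        ])

    def train(self, texts, labels):
        self.pipeline.fit(texts, labels)

    def save(self, path):
        joblib.dump(self.pipeline, path)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Train the grant-page classifier")
    parser.add_argument("--data", required=True, help="CSV with 'text' and 'label' columns")
    parser.add_argument("--out", default="model.joblib")
    args = parser.parse_args()

    df = pd.read_csv(args.data)
    clf = GrantPageClassifier()
    clf.train(df["text"], df["label"])
    clf.save(args.out)
```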
The second project focused on topic modeling for websites containing information on the older workforce. I dove deep into data cleaning and text mining, using various vectorization and clustering techniques to build an understanding of my corpus. The final pipeline used Word2Vec with TF-IDF weights as the vectorizer and clustered the noun phrases of the sentences using spherical k-means. These topics were used to drive ideas for the dev team's next milestones.
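Here is a simplified sketch of that pipeline. The toy phrases are made up, the weighting uses IDF as a stand-in for the full TF-IDF weighting, and spherical k-means is approximated by L2-normalizing the vectors and running standard k-means (so Euclidean distance roughly tracks cosine distance).

```python
# Simplified sketch of the topic-modeling pipeline: TF-IDF-weighted Word2Vec
# vectors for each noun phrase, then clustering on the unit sphere.
import numpy as np
from gensim.models import Word2Vec
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import normalize

# Toy pre-tokenized noun phrases (placeholders for the real corpus).
phrases = [["older", "workforce"], ["job", "training"], ["retirement", "planning"],
           ["skills", "training"], ["workforce", "development"]]

w2v = Word2Vec(phrases, vector_size=50, min_count=1, epochs=50)

tfidf = TfidfVectorizer(analyzer=lambda tokens: tokens)  # phrases are pre-tokenized
tfidf.fit(phrases)
idf = dict(zip(tfidf.get_feature_names_out(), tfidf.idf_))

def embed(phrase):
    # IDF-weighted average of the word vectors in a phrase.
    vecs = [w2v.wv[w] * idf.get(w, 1.0) for w in phrase if w in w2v.wv]
    return np.mean(vecs, axis=0)

# Normalize to unit length so standard k-means behaves like spherical k-means.
X = normalize(np.vstack([embed(p) for p in phrases]))
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(clusters)
```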
I had the privilege of working with amazing people throughout this experience. Thank you Shelly Kurtz and Luis Salazar (co-founders of GTL) for making this opportunity available, and Dr. Ying Li (chief scientist) for being a great mentor. And last but not least, thank you to all the AI4PI interns I worked with.
Qualtrics is an experience management company that aims to close experience gaps with its technology. I joined the employee experience (EX) team at Qualtrics, which is responsible for building tools around their experience management application tailored to larger corporations. The EX platform works by sending out a company-wide survey every few months, collecting feedback for each employee. My specific team focused on the action planning tool on the EX platform. This tool is designed to help employees improve their behavior so that they can work seamlessly with their coworkers; improvements include recommended behavior adjustments (BAs) based on the survey feedback.
My project was a proof of concept for integrating AI into action planning. I created a notification bot that looks over an individual's action plan BAs and reminds them based on their calendar events. With smarter reminders, individuals can be more aware of their tasks at the right time of day. I built this tool by first creating a NodeJS app that connects to a Qualtrics XM user using OAuth through a Microsoft account. Once connected, the app is able to look through all calendar events and action plan BAs. Using Microsoft LUIS and a set of predetermined rules, it determines the optimal pairing of calendar events to BAs. I actually ended up writing my own classifier in PyTorch to classify calendar events into categories (to be paired with BAs), but there wasn't enough training data to build a decent-performing model, so we decided to stick with Microsoft LUIS and its default settings. Next, I set up a Microsoft Teams bot that notifies the user of their BAs prior to specific events on the calendar. The user can interact with the bot to configure notifications. Finally, other employees can give feedback directly to the user through this bot, so progress can be tracked through the evaluations of others. Prior to this, progress could not be tracked at all.
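To give a flavor of the pairing step, here is a hypothetical Python sketch of keyword-style rules mapping calendar events to BAs. The real tool was a NodeJS app backed by Microsoft LUIS; the event categories, keywords, and BAs below are invented purely for illustration.

```python
# Hypothetical sketch of rule-based event-to-BA pairing. The actual tool used
# Microsoft LUIS plus predetermined rules inside a NodeJS app; everything
# below (categories, keywords, BAs) is made up to illustrate the idea.
from typing import Optional

EVENT_RULES = {
    "one_on_one":   ["1:1", "one on one", "sync"],
    "team_meeting": ["standup", "retro", "planning"],
    "presentation": ["demo", "review", "presentation"],
}

# Behavior adjustments keyed by the kind of event they are most relevant to.
BA_BY_EVENT_KIND = {
    "one_on_one":   "Ask for direct feedback on recent work",
    "team_meeting": "Let quieter teammates speak first",
    "presentation": "Pause for questions every few slides",
}

def classify_event(title: str) -> Optional[str]:
    """Match an event title against the keyword rules."""
    title = title.lower()
    for kind, keywords in EVENT_RULES.items():
        if any(keyword in title for keyword in keywords):
            return kind
    return None

def pair_event_with_ba(title: str) -> Optional[str]:
    """Return the BA to surface as a reminder before this event, if any."""
    kind = classify_event(title)
    return BA_BY_EVENT_KIND.get(kind)

print(pair_event_with_ba("Weekly 1:1 with manager"))
# -> Ask for direct feedback on recent work
```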
Throughout this internship, I strengthened my experience in web development and learned how to use new tools such as Microsoft LUIS, as well as how to build a Microsoft Teams bot. I also learned how to write design documents for projects. Although these features were not implemented, I was asked about security, scalability, and robustness, all of which I had to address in my design document. I created optimal schemas for my database and covered different ways of launching the application so that more users could be supported. I was pushed to think outside of the box, covering all sorts of edge cases for my application. I greatly enjoyed my time at Qualtrics, and would like to thank my manager Zhongwei Wu and my mentor Ron Quan for guiding me through this experience.
During my first three months at Giving Tech Labs, I created a web application that controlled AWS instances across multiple accounts. GTL had many projects deployed on different AWS services, and the CTO found it tedious to log onto each account and navigate to the right console just to restart a server or check its status. We decided the solution was to create our own console tailored to our needs, controlling the instances through the AWS API. I built the application with a VueJS frontend and a NodeJS backend; it was able to monitor our AWS instances and send commands to virtual machines.
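The actual console was a VueJS + NodeJS app calling the AWS SDK, but the core idea translates to a short Python sketch with boto3: loop over per-account credential profiles, list each account's instances, and issue commands like a reboot on demand. The profile names and region below are placeholders.

```python
# Illustrative boto3 sketch of the cross-account console idea (the real app
# was NodeJS; profile names and region here are placeholders).
import boto3

ACCOUNT_PROFILES = ["project-a", "project-b"]  # hypothetical AWS CLI profiles

def list_instances(profile, region="us-west-2"):
    """Return (instance_id, state) pairs for one account."""
    ec2 = boto3.Session(profile_name=profile).client("ec2", region_name=region)
    reservations = ec2.describe_instances()["Reservations"]
    return [(i["InstanceId"], i["State"]["Name"])
            for r in reservations for i in r["Instances"]]

def reboot_instance(profile, instance_id, region="us-west-2"):
    """Restart a single instance in the given account."""
    ec2 = boto3.Session(profile_name=profile).client("ec2", region_name=region)
    ec2.reboot_instances(InstanceIds=[instance_id])

# Show the status of every instance across all accounts.
for profile in ACCOUNT_PROFILES:
    for instance_id, state in list_instances(profile):
        print(profile, instance_id, state)
```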
The next few months were spent writing technical documentation for the custom plugins in their WordPress website. They planned to branch this web app off into its own company and needed clear documentation so the application could be maintained in the future. I spent time going over the entire website from a developer's perspective and explained all of its functionality in detail. The web app is now its own company, known as Giving Compass!
This internship would not have been possible without my wonderful mentor, Joel Rosenberger, who gave me the opportunity to dive into technical work as an intern without any previous experience.
Personal
I worked as a teaching assistant for database courses throughout the 2020 - 2021 academic year. Below are the courses that I helped teach:
- CSE 414 - Introduction to Database Systems taught by Ryan Maas
- CSE 344 - Introduction to Data Management taught by Jonathan Leang
- CSE 544 - Principles of DBMS taught by Dan Suciu
I volunteered as a section leader for Stanford's Code-In-Place curriculum. This course was designed to provide an introduction to coding at no cost, allowing people from all sorts of backgrounds to attend. The program was very fun, and I really enjoyed teaching the basics of computer science to a diverse group of students.
My professional business fraternity, Alpha Kappa Psi, runs an entrepreneurship committee that creates a startup from scratch every year. During my senior year at UW, I led a group of 7 students to create an aromatherapy bracelet company called Petrichor. We sold essential oils and gemstone bracelets, and ended with a total revenue of $350. It was an experience full of growth, helping me build more confidence in my decision-making skills and teaching me how to empathize with my teammates as we struggled through thick and thin. Check out our website here!
Projects
I built an app to reduce selection bias for my professional business fraternity, Alpha Kappa Psi. Here's the code!
I built an app to help connect other students who are trying to find study groups. Check it out on GitHub!