About Me

My name is Tony and I’m a Forward Deployed Engineer at Palantir Technologies. My passions include writing code and building things that affect people's lives.

At Dartmouth College, I studied Computer Science with a focus on Machine Learning and Artificial Intelligence, as well as Italian language and culture. I combined my love of Computer Science and language in my research in Computational Sociolinguistics. This love of CS and language also complemented my internship at TomTom, where I used named entity recognition to augment maps with textual data sources.

I spend most of my time running, hiking, biking, swimming, and programming. I love side projects, collaboration, solving difficult problems, and having meaningful discussions. Currently, I'm working on a hiking/running/biking trip-planning app that combines my love of CS, maps, data, and outdoor adventures.

Projects

Automatic Rhoticity Encoding

Automated extraction methods are widely available for vowels, but automated methods for coding rhoticity have lagged far behind. R-fulness versus r-lessness (in words like park, store, etc.) is a classic and frequently cited variable, but it is still commonly coded by human analysts rather than by automated methods. Human coding requires extensive resources and lacks replicability, making it difficult to compare large datasets across research groups. Can reliable automated methods be developed to aid in coding rhoticity? In this study, we use neural networks (deep learning), training our model on 208 Boston-area speakers.

Multilingual Named Entity Recognition

Mapping companies face the difficult problem of extracting pertinent information from news and social media to automatically update their maps. Tweets about new restaurants or Facebook posts about road construction introduce novel information relevant to map updates. Named Entity Recognition (NER) is a key part of processing this information and judging its relevance. Additionally, given the global and interconnected nature of news and social media, this kind of information may be presented in different languages. This program implements a language-independent NER model using Facebook's Multilingual Unsupervised and Supervised Embeddings (MUSE word embeddings) as features for a bidirectional Long Short-Term Memory (Bi-LSTM) network with a Conditional Random Field (CRF) classifier.

Artificial Sommelier

Wine is often viewed as a sophisticated beverage, only to be understood by the wine connoisseurs of the upper class. When ordering wine at an upscale restaurant, it is not uncommon to ask the waiter for recommendations. Using the PyData stack, Steven Jiang and I aim to make wine more accessible to the general public. Our goal is to make the sommelier obsolete. By exploring a Kaggle wine dataset with features such as ratings, variety, region of origin, and description, we answer the following questions:

  • Do price and quality vary by region, variety, or winery? If so, to what extent?
  • Are price and quality correlated?

Additionally, we create models to predict the quality and price of a wine solely from its textual description.
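
As a rough illustration of the "are price and quality correlated?" question, here is a minimal sketch of a Pearson correlation coefficient in plain Python. The function name pearson and the sample numbers are made up for illustration; they are not taken from the Kaggle dataset or from the project's actual code, which may well use a library routine instead.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances (unnormalized; the 1/n factors cancel).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical prices (USD) and ratings (points), invented for this example.
prices = [12.0, 18.0, 25.0, 40.0, 65.0]
points = [84, 86, 88, 90, 93]
r = pearson(prices, points)
print(f"r = {r:.3f}")
```

A value of r near 1 would suggest that higher-priced wines tend to score higher, a value near 0 would suggest little linear relationship, and a value near -1 would suggest the opposite trend.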

Chess AI

Written in Python, the Chess AI uses an adversarial search algorithm to play chess against a human opponent. More specifically, it uses alpha-beta search with Zobrist hashing to efficiently choose the best next move based on the expected board state several moves ahead.
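
To show how alpha-beta search and Zobrist hashing fit together, here is a small sketch on tic-tac-toe rather than chess, so that it stays self-contained. It is not the Chess AI's actual code: the names (negamax, best_move, ZOBRIST, SIDE_KEY) are hypothetical, the search runs to the end of the game instead of stopping at a fixed depth, and the assumption that the hash keys a transposition table (a cache of already-searched positions) is mine.

```python
import random

EMPTY, X, O = 0, 1, 2
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

# Zobrist hashing: one random 64-bit key per (square, piece), plus one key
# for the side to move. A position's hash is the XOR of the relevant keys.
_rng = random.Random(0)
ZOBRIST = [[_rng.getrandbits(64) for _ in range(3)] for _ in range(9)]
SIDE_KEY = _rng.getrandbits(64)

EXACT, LOWER, UPPER = 0, 1, 2  # kind of bound stored in the table


def winner(board):
    for a, b, c in LINES:
        if board[a] != EMPTY and board[a] == board[b] == board[c]:
            return board[a]
    return EMPTY


def negamax(board, player, key, alpha, beta, table):
    """Score for `player` to move: +1 win, 0 draw, -1 loss (alpha-beta pruned)."""
    alpha_orig = alpha
    entry = table.get(key)
    if entry is not None:
        flag, value = entry
        if flag == EXACT:
            return value
        if flag == LOWER:
            alpha = max(alpha, value)
        else:
            beta = min(beta, value)
        if alpha >= beta:
            return value

    opponent = X + O - player
    if winner(board) == opponent:  # the previous move won the game
        return -1
    moves = [i for i in range(9) if board[i] == EMPTY]
    if not moves:
        return 0

    best = -2
    for m in moves:
        board[m] = player
        # Incremental hash update: XOR in the new piece, flip the side key.
        child_key = key ^ ZOBRIST[m][player] ^ SIDE_KEY
        score = -negamax(board, opponent, child_key, -beta, -alpha, table)
        board[m] = EMPTY
        best = max(best, score)
        alpha = max(alpha, best)
        if alpha >= beta:  # cutoff: the opponent will avoid this line
            break

    if best <= alpha_orig:
        flag = UPPER
    elif best >= beta:
        flag = LOWER
    else:
        flag = EXACT
    table[key] = (flag, best)
    return best


def best_move(board, player):
    """Return (move, score) for `player` on a 9-element board list."""
    key = SIDE_KEY if player == O else 0
    for square, piece in enumerate(board):
        if piece != EMPTY:
            key ^= ZOBRIST[square][piece]
    table, choice, best = {}, None, -2
    for m in [i for i in range(9) if board[i] == EMPTY]:
        board[m] = player
        child_key = key ^ ZOBRIST[m][player] ^ SIDE_KEY
        score = -negamax(board, X + O - player, child_key, -2, 2, table)
        board[m] = EMPTY
        if score > best:
            choice, best = m, score
    return choice, best


print(best_move([X, X, EMPTY, O, O, EMPTY, EMPTY, EMPTY, EMPTY], X))
```

Because XOR is its own inverse, the hash can be updated in constant time per move, and positions reached through different move orders map to the same table entry. A chess engine would typically add a depth limit and an evaluation function for non-terminal positions, and store the search depth alongside each table entry.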