M.S. Candidate in Data Science

I am a second-year Master's student in Data Science at the NYU Center for Data Science (CDS) and expect to graduate in May 2018. I am currently seeking full-time positions related to data science in various fields.

I graduated with honors with a B.A. in Finance and a B.S. in Statistics from Peking University. I am familiar with applying statistical analysis tools in Python or Stata to real-world problems, mainly, but not exclusively, in finance.

Projects


Predicting Donations on DonorsChoose.org

Oct 2016 - Dec 2016

In this project, we analyzed over 1 million data points from DonorsChoose.org. We trained logistic regression, decision tree, k-nearest neighbors (kNN), and random forest models to predict the probability that a project fails, and tuned their hyperparameters with cross-validation and grid search. We also analyzed project essays with NLTK in Python.

This is joint work with Lilin Qin, Yiran Xu and Xue Yang under the guidance of Professor Brian d’Alessandro in the Introduction to Data Science course.
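
The tuning step described above can be sketched roughly as follows. This is a minimal illustration only: it assumes scikit-learn (the project description does not name the library), uses synthetic data in place of the DonorsChoose.org data, and the parameter grid and scoring metric are hypothetical choices.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the project data (hypothetical; not the real dataset).
# The label 1 plays the role of "project failed".
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Hypothetical grid over the inverse regularization strength C.
param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}

# Grid search scores every candidate with 5-fold cross-validation.
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid,
    cv=5,
    scoring="roc_auc",
)
search.fit(X, y)

best_C = search.best_params_["C"]
# Predicted probability of the positive ("failed") class for a few rows.
failure_prob = search.predict_proba(X[:5])[:, 1]
```

The same pattern applies to the decision tree, kNN, and random forest models by swapping the estimator and its parameter grid.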

NYC Travel Helper

Oct 2016 - Dec 2016

In this project, we first wrote web scrapers in Python to gather data on NYC restaurants, hotels and travel attractions from Booking.com, Tripadvisor.com and Yelp.com. Then we built a user-friendly GUI that lets users search for nearby places of interest, view visualizations of summary statistics, and customize detailed travel plans according to their preferences, based on an adjusted k-means clustering algorithm.

This is joint work with Hezhi Wang and Storm Avery Ross under the guidance of Professor Greg Watson in the Programming for Data Science course.

Empirical Research on Determinants of Online Loan Transactions

May 2015 - Jun 2016

P2P lending platforms, based on the notion of a 'peer-to-peer network', have experienced remarkable growth in recent years. They have easy access to massive amounts of data, but it remains a challenge to ensure the authenticity of the collected information and to use it in decision-making and risk management. Here I applied logistic and linear regression models to analyze the factors that influence loan interest and the probability of successful funding, based on a unique sample of more than 40k loan listings on the online platform ppdai.com.

This is my undergraduate thesis, advised by Professor Frank M. Song and Professor Liuyan Zhao.

Mathematical Modeling for Density of Air Pollution

May 2014

In this project, we first built a model based on stochastic simulation and the Gaussian plume model to graphically display the density of pollution generated by a municipal solid waste incinerator in the surrounding residential area. Then we proposed a way to estimate the loss and provide monetary compensation to local residents, based on the notions of social welfare and utility.

This project received second prize in the 11th Mathematical Modeling Competition of Peking University. It is joint work with Yijie Guo and Junyang Li.
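
For readers unfamiliar with the Gaussian plume model mentioned above, here is a minimal sketch of its standard textbook form with ground reflection. It is not the project's actual model: the function name, parameter names, and example values are hypothetical, and the dispersion widths sigma_y and sigma_z (which in practice grow with downwind distance) are passed in directly rather than derived from any stability class.

```python
import math


def gaussian_plume(q, u, y, z, h, sigma_y, sigma_z):
    """Steady-state concentration from a continuous point source.

    q: emission rate, u: wind speed along the downwind axis,
    y: crosswind offset, z: height above ground,
    h: effective release height,
    sigma_y, sigma_z: dispersion widths at the downwind distance of interest.
    """
    crosswind = math.exp(-y ** 2 / (2 * sigma_y ** 2))
    # The second term mirrors the source below ground to model reflection.
    vertical = (
        math.exp(-(z - h) ** 2 / (2 * sigma_z ** 2))
        + math.exp(-(z + h) ** 2 / (2 * sigma_z ** 2))
    )
    return q / (2 * math.pi * u * sigma_y * sigma_z) * crosswind * vertical


# Hypothetical example values: ground-level concentration on the plume centerline.
ground_centerline = gaussian_plume(
    q=100.0, u=5.0, y=0.0, z=0.0, h=50.0, sigma_y=30.0, sigma_z=20.0
)
```

Evaluating this function over a grid of locations around the source gives the kind of density map described in the project.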

Education


M.S. in Data Science

May 2018

New York University, Center for Data Science

B.A. in Finance

July 2016

Peking University, School of Economics

Outstanding Graduate, top 2%

B.S. in Statistics

July 2016

Peking University, School of Mathematical Sciences

Exchange Student

Dec 2014

The University of Hong Kong, School of Business and Economics