Topic Modeling of Newspaper Text with an LDA Model
- Used spaCy, a Python NLP library, to preprocess raw text data for model training
- Implemented an LDA model with gensim to assign topics to newsgroup posts from a scikit-learn text dataset, and tested the model on unseen text data
Text Classification with Deep Learning Mar
- Wrote a classifier class in Python covering data preparation, modeling, evaluation, and prediction
- Tokenized text data with NLTK as preprocessing for the deep learning model
- Built a neural network with the Keras framework, trained the model on Google Cloud, and achieved 0.89 accuracy
The Hunger Game (Hackathon of Girls in Tech)
- Performed data engineering and modeled combined data from multiple online sources
- Developed a Shiny UI in R with multiple user inputs and interactive plots to explore an environmental problem, and found significant results
Packages for Meta-Analysis of R/Python Scripts
- Designed a package, in both R and Python versions, that summarizes input scripts by retrieving the packages, functions, and size characteristics contained in the files
- Developed three functions and wrote unit tests with 100% coverage for each function
Analysis of Worldwide Wine Prices and Ratings via a Shiny App
- Cleaned the wine dataset through data wrangling in R and resolved data quality issues
- Created a Shiny UI in R with interactive Plotly plots that let users explore details of their selected wines
Data Analysis of the Relationship between Students’ Performance and Romantic Relationships
- Researched the exploratory question of whether romantic relationships influence students’ academic performance
- Performed EDA on the dataset and conducted a two-sample Welch’s t-test in R