Research

Undergraduate Research:

  • Undergraduate data scientist in Data Lab @ TXST, datalab12.github.io
  • Processed and ingested several Twitter datasets using the Twitter API, the JSON data format, and graph transformations
  • Analyzed JSON data and ingested it into MongoDB, using MongoDB's flexible schemas for unstructured data
  • Tested and debugged Python package features for community detection, topic identification, graph analysis, and visualization
  • Research support: worked with a graduate student to complete topic analysis and visualization experiments, as well as social network analysis of communities in multiple large Twitter topic networks
  • Used Python notebooks, Python, and Python packages (pytwanalysis 0.0.6) on a Linux server at scale
  • Presented the work as part of the SURE program, a ten-week extensive research experience for first-generation and federal Pell Grant-eligible students, which was extended into the fall semester
  • Attended multiple professional development seminars
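
The graph transformation step mentioned above can be illustrated with a minimal sketch. It assumes tweets arrive as one JSON object per line with the `user.screen_name` and `entities.user_mentions` fields of the Twitter API v1.1 tweet object. The function name `mention_edges` and the sample records are hypothetical, and this is not the pytwanalysis implementation.

```python
import json
from collections import Counter


def mention_edges(json_lines):
    """Count (author, mentioned_user) pairs across tweets given as JSON strings."""
    edges = Counter()
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the input
        tweet = json.loads(line)
        author = tweet.get("user", {}).get("screen_name")
        mentions = tweet.get("entities", {}).get("user_mentions", [])
        for mention in mentions:
            target = mention.get("screen_name")
            # Ignore incomplete records and self-mentions.
            if author and target and author != target:
                edges[(author, target)] += 1
    return edges


# Hypothetical sample records, not real Twitter data.
sample = [
    '{"user": {"screen_name": "alice"}, "entities": {"user_mentions": [{"screen_name": "bob"}]}}',
    '{"user": {"screen_name": "alice"}, "entities": {"user_mentions": [{"screen_name": "bob"}, {"screen_name": "carol"}]}}',
    '{"user": {"screen_name": "bob"}, "entities": {"user_mentions": []}}',
]
edges = mention_edges(sample)
for (source, target), weight in sorted(edges.items()):
    print(source, target, weight)
```

The resulting weighted edge list is the kind of structure that could then be stored or passed to a graph library for community detection.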

Graduate Research:

  • Graduate data scientist in Data Lab @ TXST
  • Decisions about when and how to reopen schools were difficult for district administrations during COVID-19, as there was no consensus on the impact of school reopening on the spread of COVID-19. Learning loss was documented throughout the process. In this project, we attempt to identify the factors with the greatest impact on learning loss (or the absence of learning gain) at the Texas level and help policymakers make more informed decisions on learning recovery.
  • We analyze all Texas school districts with more than 2000 students. We pull STAAR test results by grade, with scores broken down by subject, district, grade, race, ethnicity, and free lunches, and use the NCES Common Core of Data to capture the in-school student population in 9/20, 10/20, and 01/21 as a predictor of the dominant mode of teaching and reopening policy. Our initial EDA of 12 school districts in Travis County showed that the negative impact of COVID-19 erased years of improvement in reading and math. Remote learning appeared to contribute to learning loss regardless of household income. Next, we used NCES, STAAR, and COVID-19 data to predict learning loss per district, subject, grade, and race in Travis County. This experiment will guide future data analysis and considerations for prediction and impact modeling. Accuracy scoring and cross-validation will be used as informal success measures.
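
As an illustration of the success measures named above, the following is a minimal sketch of k-fold cross-validation with accuracy scoring, written in plain Python. It assumes learning loss is framed as a two-class label ("loss" or "gain"), which the project description does not specify. The majority-class baseline, the function names, and the toy data are all hypothetical and stand in for whatever model and dataset the project actually uses.

```python
from collections import Counter


def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds whose sizes differ by at most one."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def fit_majority_baseline(train_X, train_y):
    """Hypothetical stand-in model: always predict the most common training label."""
    most_common = Counter(train_y).most_common(1)[0][0]
    return lambda features: most_common


def cross_val_accuracy(X, y, k, fit):
    """Return the held-out accuracy of each of the k folds."""
    scores = []
    for test_idx in k_fold_indices(len(y), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(model(X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return scores


# Toy data (assumption): one feature row per district-grade-subject group.
X = [[0.9], [0.8], [0.2], [0.7], [0.9], [0.6], [0.1], [0.8]]
y = ["loss", "loss", "gain", "loss", "loss", "loss", "gain", "loss"]

scores = cross_val_accuracy(X, y, k=4, fit=fit_majority_baseline)
mean_accuracy = sum(scores) / len(scores)
print(scores, mean_accuracy)
```

A baseline like this gives a floor against which any real predictive model's cross-validated accuracy can be compared.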