About Me
I am a data analyst for Lambeth Council and an MSc graduate in Health Data Analytics and Machine Learning from Imperial College London. My thesis, completed in collaboration with AstraZeneca, investigated machine learning methods for addressing competing risks in survival analysis. Prior to that, I earned my BS in biomedical engineering from the University of Missouri-Columbia, with emphases in bioinformatics and pre-med.
During two summers as an undergraduate, I conducted research in two labs: one at Georgia Institute of Technology/Emory University School of Medicine and one at the University of Missouri.
My interests lie in leveraging data in any setting, within or beyond healthcare. It is exciting to think about the endless ways I can apply my skillset to drive change in our Information Age!
When the laptop is closed, I am a traveller, tennis player, tutor, and model. I’ve been to 25 countries, which has inspired me to pick up Spanish and German. I hope to keep travel a central part of my life! :)
Projects
AIMO LLM (05/2024 - 06/2024): As a team of four, we created a multi-agent model using open-source LLMs (BERT, DeepSeekMath, OpenCodeInterpreter) and Autogen in an attempt to outperform Math Olympiad gold medalists on a set of 50 math competition questions. We prompted the models to communicate using chain-of-thought reasoning, which improved performance. Repo
Competing Risks Analysis with Machine Learning (05/2022 - 09/2022):
This methodology study compares the performance of machine learning methods for competing risks analysis against the established Fine & Gray and cause-specific hazards models. I used four open survival datasets available in R and synthesized a larger dataset to compare performance across dataset sizes. I ran five models on each dataset and scored their predictive performance with the c-index and Brier score.
Repo
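For readers unfamiliar with the two scores mentioned above, here is a minimal sketch of how they can be computed. It is not the code from the thesis: the function names and toy numbers are illustrative, it handles a single event type only (no competing risks), and the Brier score shown assumes no subject is censored before the chosen horizon, so it skips the inverse-probability-of-censoring weights a full implementation would need.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's c-index for right-censored data with a single event type.

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if censored
    risks  : predicted risk scores (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the
        # earlier time is an observed event rather than a censoring.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5  # ties in predicted risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def brier_score(times, predicted_survival, horizon):
    """Brier score at `horizon`.

    Simplifying assumption: nobody is censored before `horizon`,
    so every subject's status at the horizon is known.
    """
    total = 0.0
    for t, s in zip(times, predicted_survival):
        still_event_free = 1.0 if t > horizon else 0.0
        total += (still_event_free - s) ** 2
    return total / len(times)


# Toy data, made up for illustration only.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.4, 0.5, 0.1]
toy_survival_at_5 = [0.1, 0.3, 0.7, 0.9]

c_index = concordance_index(toy_times, toy_events, toy_risks)
brier = brier_score(toy_times, toy_survival_at_5, horizon=5)
print(f"c-index: {c_index:.2f}, Brier score at t=5: {brier:.3f}")
```

A c-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, while a lower Brier score indicates better-calibrated survival probabilities.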
ClimateHack: Predicting Geospatial Images (11/2021 - 03/2022): I produced the top-scoring results from Imperial College, with a mean absolute error of 0.69, by applying optical flow and convolutional neural network-based models in Python. I optimised satellite image sampling by using geocoordinates to calculate solar position with the pvlib package. I also used a dataloader and cloud computing to train on over 260 GB of EUMETSAT satellite data. Repo
Clinician Burnout Project (06/2020 - 05/2021):
We hypothesize that physicians' Electronic Medical Record (EMR) use metrics correlate with their burnout risk, and that remodeling the EMR workflow will be an interventional method to reduce burnout risk factors. I cleaned the raw data of over 5000 clinicians and used descriptive statistics, k-means clustering, and PCA as part of exploratory analysis.
Repo
SemNet (05/2019 - 08/2019):
The objective of this study is to identify and rank associative factors related to angiogenesis, fibrosis, and EF using an innovative literature mining approach, SemNet. I interpreted SemNet’s network analysis data and visualized node connectivity for presentation. I also drafted the first 20 pages of the project’s manuscript, which the team is still working on. It will soon be submitted for publication with me as an author.
Publications
Kartchner D, McCoy K, Dubey J, Zhang D, Zheng K, Umrani R, Kim JJ, Mitchell CS. Literature-Based Discovery to Elucidate the Biological Links between Resistant Hypertension and COVID-19. Biology. 2023; 12(9):1269. doi:10.3390
Presentations
Kim, J., Tonellato, P. J., & Wilkinson, K. (Nov. 2020). A Model of Physician Clinical Burnout based on Electronic Medical Record Use Metrics. e-Poster presentation at Annual Biomedical Research Conference for Minority Students (ABRCMS) The Virtual Experience.
Kim, J., Lee, B., Leftwich, D., & Mitchell, C. S. (Apr. 2020). Ranking Correlative Factors of Cardiovascular Disease with SemNet: an Augmented Literature Review. Poster presentation at Stanford Research Conference at Stanford University, Stanford, CA. [Cancelled due to COVID-19]
Kim, J., Lee, B., Leftwich, D., & Mitchell, C. S. (Mar. 2020). Machine Learning to Agnostically Rank Associative Factors in Cardiovascular Disease: an Augmented Literature Review. Poster presentation at National Conference of Undergraduate Research 2020 at Montana State University, Bozeman, MT. [Cancelled due to COVID-19]
Kim, J., Lee, B., Leftwich, D., Davis, M., & Mitchell, C. S. (Aug. 2019). Machine Learning to Agnostically Rank Associative Factors in Cardiovascular Disease. Oral presentation at SURE Symposium at Emory University, Atlanta, GA.
Kim, J. J., Ferguson, C. E., Kimmey, S., & Ferguson, K. (Apr. 2019). The Unstoppable Joins the Immutable: The Impact of Big Data and Blockchain on Healthcare. Paper presented at Institute of Biological Engineering (IBE) 2019 National Conference, St. Louis, MO. Won 2nd place in the Bioethics Essay contest.