diff --git a/about.md b/about.md
index 1f6f226..93b6818 100644
--- a/about.md
+++ b/about.md
@@ -5,7 +5,8 @@ permalink: /about/
 ---
 I am currently a senior researcher at [Microsoft Research New England](https://www.microsoft.com/en-us/research/lab/microsoft-research-new-england/). Previously, I was a machine learning scientist at [Generate Biomedicines](https://generatebiomedicines.com/), a [Flagship Pioneering](https://www.flagshippioneering.com/) company, where I used machine learning to optimize proteins.
-From 2014-2018, I was a PhD student in Chemical Engineering at Caltech. I worked in Frances Arnold's [lab](http://cheme.che.caltech.edu/groups/fha/). The Arnold lab is best known for its pioneering use of [directed evolution](https://en.wikipedia.org/wiki/Directed_evolution) to create useful proteins without requiring a deep understanding of the biophysical underpinnings of protein folding and function. Recently, they've been designing new [light-sensitive proteins](http://www.pnas.org/content/early/2017/03/09/1700269114.abstract) for applications in neuroscience and evolving an enzyme to make [carbon-silicon bonds](http://science.sciencemag.org/content/354/6315/1048.full?ijkey=mIJS6o5p4H63Y&keytype=ref&siteid=sci). They also pioneered the use of [machine learning for protein engineering](http://cheme.che.caltech.edu/groups/fha/publications/Romero_PNAS2012.pdf).
+From 2014-2018, I was a PhD student in Chemical Engineering at Caltech. I worked in Frances Arnold's [lab](http://cheme.che.caltech.edu/groups/fha/), where I helped pioneer the use of [machine learning for protein engineering](https://doi.org/10.1038/s41592-019-0496-6)
+
 Before moving to California, I completed my undergraduate degree at The Ohio State University, where I studied chemical engineering with a minor in piano performance. Between Ohio State and graduate school, I taught math and physics for three years at a high school in Inglewood, California through Teach for America. In those three years, I transformed from a struggling first-year teacher into an effective instructor and robotics coach with the help of the amazing staff at Animo Inglewood Charter High School and Green Dot Public Schools. In June 2017, I had the honor of watching my last class of freshmen graduate from high school, and I'm excited to see what they do with their futures.
diff --git a/index.md b/index.md
index 2582d2e..e38b20e 100644
--- a/index.md
+++ b/index.md
@@ -8,4 +8,7 @@ I'm a computational biologist working at the intersection of machine learning an
 Here's some more [about me](/about) and details about [my research](/research). My resume can be found [here](https://github.com/yangkky/resume/blob/master/KKY_cv.pdf).
-Please email me at yang dot kevin at microsoft dot com if you are interested in collaborating or a research internship.
\ No newline at end of file
+Please email me at yang dot kevin at microsoft dot com if you are interested in collaborating or a research internship. My previous interns include:
+
+- Amy Wang
+- Kevin Wu
\ No newline at end of file
diff --git a/research.md b/research.md
index c81e20d..170d07c 100644
--- a/research.md
+++ b/research.md
@@ -38,6 +38,24 @@ In the second half of my PhD, I focused on developing methods for the two key st
 # Publications
+**Protein structure generation via folding diffusion.**
+Kevin E. Wu, Kevin K. Yang, Rianne van den Berg, Sarah Alamdari, James Y. Zou, Alex X. Lu, Ava P. Amini. *Nature Communications*, 2024. [10.1038/s41467-024-45051-2](https://doi.org/10.1038/s41467-024-45051-2)
+
+
+**Masked inverse folding with sequence transfer for protein representation learning.**
+Kevin K. Yang, Niccolò Zanichelli, Hugh Yeh. *Protein Engineering, Design and Selection*, 2024. [10.1101/2022.05.25.493516](https://doi.org/10.1101/2022.05.25.493516)
+
+**Convolutions are competitive with transformers for protein sequence pretraining.** Kevin K. Yang, Nicolo Fusi, Alex X. Lu. *Cell Systems*, 2024. [10.1101/2022.05.19.492714](https://doi.org/10.1101/2022.05.19.492714)
+
+**Randomized gates eliminate bias in sort-seq assays.**
+Brian L. Trippe, Buwei Huang, Erika A. DeBenedictis, Brian Coventry, Nicholas Bhattacharya, Kevin K. Yang, David Baker, Lorin Crawford. *Protein Science*, 2022. [biorxiv](https://doi.org/10.1101/2022.02.17.480881)
+
+**Deep self-supervised learning for biosynthetic gene cluster detection and product classification.**
+Carolina Rios-Martinez, Nicholas Bhattacharya, Ava P Amini, Lorin Crawford, Kevin K. Yang. *PLoS Computational Biology*, 2023. [10.1371/journal.pcbi.1011162](https://doi.org/10.1371/journal.pcbi.1011162)
+
+**Exploring evolution-based &-free protein language models as protein function predictors.**
+Mingyang Hu, Fajie Yuan, Kevin K. Yang, Fusong Ju, Jin Su, Hui Wang, Fei Yang, Qiuyang Ding. [NeurIPS 2022](https://arxiv.org/abs/2206.06583)
+
 **Evolutionary velocity with protein language models.** Brian L. Hie, Kevin K. Yang, and Peter S. Kim. *Cell Systems*, 2022. [10.1016/j.cels.2022.01.003](https://doi.org/10.1016/j.cels.2022.01.003)
 **Machine learning modeling of family wide enzyme-substrate specificity screens.**
@@ -72,17 +90,18 @@ Samuel Goldman, Ria Das, Kevin K Yang, Connor W Coley. *PLoS computational biolo
 # Preprints
-**Exploring evolution-based &-free protein language models as protein function predictors.**
-Mingyang Hu, Fajie Yuan, Kevin K. Yang, Fusong Ju, Jin Su, Hui Wang, Fei Yang, Qiuyang Ding. [arxiv](https://arxiv.org/abs/2206.06583)
+**Feature Reuse and Scaling: Understanding Transfer Learning with Protein Language Models.**
+Francesca-Zhoufan Li, Ava P. Amini, Yisong Yue, Kevin K. Yang, Alex X. Lu.
+[10.1101/2024.02.05.578959](https://doi.org/10.1101/2024.02.05.578959)
-**Masked inverse folding with sequence transfer for protein representation learning.**
-Kevin K. Yang, Niccolò Zanichelli, Hugh Yeh. [biorxiv](https://doi.org/10.1101/2022.05.25.493516)
+**Protein generation with evolutionary diffusion: sequence is all you need.**
+Sarah Alamdari, Nitya Thakkar, Rianne van den Berg, Alex Xijie Lu, Nicolo Fusi, Ava Pardis Amini, Kevin K. Yang. [10.1101/2023.09.11.556673](https://doi.org/10.1101/2023.09.11.556673)
-**Convolutions are competitive with transformers for protein sequence pretraining.** Kevin K. Yang, Alex X. Lu, Nicolo Fusi. [biorxiv](https://doi.org/10.1101/2022.05.19.492714)
+**Computational Scoring and Experimental Evaluation of Enzymes Generated by Neural Networks.** Sean R Johnson, Xiaozhi Fu, Sandra Viknander, Clara Goldin, Sarah Monaco, Aleksej Zelezniak, Kevin K. Yang. [10.1101/2023.03.04.531015](https://doi.org/10.1101/2023.03.04.531015)
+
+**Benchmarking uncertainty quantification for protein engineering.** Kevin P. Greenman, Ava P. Amini, Kevin K. Yang. [10.1101/2023.04.17.536962](https://doi.org/10.1101/2023.04.17.536962)
-**Randomized gates eliminate bias in sort-seq assays.**
-Brian L. Trippe, Buwei Huang, Erika A. DeBenedictis, Brian Coventry, Nicholas Bhattacharya, Kevin K. Yang, David Baker, Lorin Crawford. [biorxiv](https://doi.org/10.1101/2022.02.17.480881)