Assistant Professor
Electrical & Computer Engineering
Computer Science
University of Virginia

P.O. Box 400743
Charlottesville, VA 22904

About me: I am an assistant professor at the University of Virginia with appointments in Electrical and Computer Engineering and Computer Science. Previously, I was a postdoctoral scholar at Caltech. I received my Ph.D. in ECE from UIUC in 2013. I also received an M.Sc. in mathematics from UIUC in 2012 and an M.Sc. in ECE from the University of Toronto in 2008. My Erdős number is 2, and you can see my academic genealogy here (courtesy of Anatoly Khina).

My interests include information theory, bioinformatics/computational biology, and machine learning. I particularly gravitate towards problems that lie in the intersections of these areas, such as data storage in DNA, compression of biological data, and probabilistic and information-theoretic modeling of DNA mutations.

Recent News:

• Jan. 2022: My proposal, CAREER: Model-based compression and probabilistic analysis of non-Markovian sequences, was funded by the NSF.
• Jan. 2022: Our paper, Adaptive Sampling for Heterogeneous Rank Aggregation from Noisy Pairwise Comparisons, was accepted to AISTATS 2022.
• Oct. 2021: Our paper, Error-correcting Codes for Short Tandem Duplication and Edit Errors, was published in the IEEE Trans. on Information Theory.
• July 2021: Three papers were presented at ISIT 2021.
• Apr. 2021: The first three chapters of my online course, Mathematics of Information, are available. The goal of the course is to provide an interactive learning experience on the mathematical foundations of defining, transforming, and communicating information (work-in-progress).
• Feb. 2021: Our paper, Error-correcting Codes for Noisy Duplication Channels, was published in the IEEE Trans. on Information Theory.
• Oct. 2020: The paper first-authored by Yiming Wang, an undergraduate researcher, was presented at IEEE BIBE.
• July 2020: Our paper, Single-Error Detection and Correction for Duplication and Substitution Channels, was accepted by the IEEE Trans. Information Theory.
• June 2020: Three papers were presented at ISIT 2020.
• May 2020: Our paper, Evolution of k-mer Frequencies and Entropy in Duplication and Substitution Mutation Systems, was published in the IEEE Trans. Information Theory.
• Feb. 2020: Our paper, Rank Aggregation via Heterogeneous Thurstone Preference Models, was presented at the AAAI Conference on Artificial Intelligence. (Acceptance Rate: 20.6%, Oral est. 4.5%)
• Nov. 2019: Our paper, Finite-time Behavior of k-mer Frequencies and Waiting Times in Noisy-Duplication Systems, was presented at the Asilomar Conference, Monterey, CA.
• Sep. 2019: Our paper, Error-correcting Codes for Noisy Duplication Channels, was presented at the Allerton Conference, Monticello, IL.
• July 2019: Our proposal, CIF: Small: Collaborative Research: Rank Aggregation with Heterogeneous Information Sources: Efficient Algorithms and Fundamental Limits, was funded by the NSF ($250,000).
• June 2019: Our paper, Single-Error Detection and Correction for Duplication and Substitution Channels, was presented at ISIT, Paris, France.
• Jan. 2019: Our paper, Estimation of duplication history under a stochastic model for tandem repeats, was published in BMC Bioinformatics.
• Oct. 2018: Our proposal, CIF: NSF-BSF: Characterization and Mitigation of Noise in a Live DNA Storage Channel, was funded by the NSF ($500,000, Co-PIs: Mete Civelek, Jehoshua Bruck, Moshe Schwartz).
• Sep. 2018: Our draft, Reconciling Similar Sets of Data, is available on arXiv.
• Aug. 2018: Our draft, The Capacity of Some Polya String Models, is available on arXiv.
• June 2018: Our paper, Evolution of N-Gram Frequencies Under Duplication and Substitution Mutations, was presented at ISIT, Vail, CO.
• May 2018: Our proposal, Predicting Antibiotic Resistance, was funded by the Global Infectious Diseases Institute ($70,000, Co-PI: Jason Papin).
• Apr. 2018: I received the 2017 Outstanding Teacher Award from the ECE department.
• Mar. 2018: My CRII proposal, CRII: CIF: Model-based Compression of Biological Sequences, was funded by the NSF ($175,000, Single PI).