PI : Qin Zhang
NSF CCF-1844234 (June 2019 - May 2024)
Through the massive use of mobile devices, data clouds, and the rise of the Internet of Things, large amounts of data have been generated, digitized, and analyzed for the benefit of society. As data are often collected and maintained at different sites, communication has become necessary for nearly every computational task. Moreover, decision makers naturally want to maintain a centralized view of all the data in a timely manner, which requires frequent queries on the distributed data and, in the extreme, continuous monitoring of the query output. The cost of communication has naturally become the bottleneck for such applications. This project aims to develop communication-efficient solutions for distributed computation and monitoring.
This project targets three fundamental aspects of distributed computation: (1) the tradeoffs between the communication cost and the number of rounds of the computation in distributed one-shot computation, (2) the power of data partitioning, and (3) the connections between distributed one-shot computation and continuous monitoring.
Collaborative Top Distribution Identifications with Limited Interaction (preliminary full version, 47 pages)
with N. Karpov and Y. Zhou
Proc. IEEE Symposium on Foundations of Computer Science (FOCS 20), to appear. Virtual conference, November 2020.
- We have shown a strong separation on the top-1 arm identification and top-k arm identifications in the collaborative learning model.
Multinomial Logit Bandit with Low Switching Cost (preliminary full version, 25 pages)
with K. Dong, Y. Li, and Y. Zhou
Proc. International Conference on Machine Learning (ICML 20), to appear. Virtual conference, July 2020.
Collaborative Learning with Limited Interaction:
Tight Bounds for Distributed Exploration in Multi-Armed Bandits (preliminary full version, 37 pages) [conf. talk]
with C. Tao and Y. Zhou
Proc. IEEE Symposium on Foundations of Computer Science (FOCS 19), pages 126-146. Baltimore, MD, U.S.A., November 2019.
- We conduct a systematic study of the best arm identification problem in the setting of collaborative learning with limited interaction.
- We obtain almost tight round-speedup tradeoffs for both fixed-time and fixed-confidence settings, and show a complexity separation for the two variants.
- We develop two new techniques for proving round lower bounds for multi-agent collaborative learning.
Smooth q-Gram, and Its Applications to Detection of Overlaps among Long, Error-Prone Sequencing Reads
with H. Zhang, Q. Zhang and H. Tang
MinJoin: Efficient Edit Similarity Joins via Local Hash Minima
with H. Zhang
Proc. ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 19) (oral presentation), pages 1093-1103. Anchorage, AK, U.S.A., August 2019.
Distributed and Streaming Linear Programming in Low Dimensions
with S. Assadi and N. Karpov
Proc. ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS 19), pages 236-253. Amsterdam, The Netherlands, June-July, 2019.
Invited to the special issue for PODS 2019 in ACM Transactions on Database Systems (TODS).
Educational and Other Development
A Trilogy of Courses:
Code library for distributed algorithms: