The ever-growing size and complexity of data have created scalability challenges for storage and processing. Ensuring the privacy of data owned by different entities across various locations has also become a crucial concern. In this context, distributed learning architectures have emerged as a viable solution. In this dissertation, we analyze the performance of distributed learning with streaming data in three settings.

In the first setting, we consider distributed single-task learning with heterogeneous data streams. Each node in the network of learners maintains a model that is updated with stochastic gradient estimates computed from a local data stream and with a network regularization gradient that promotes cohesion among the ensemble of models. We show that the ensemble average approximates a stationary point and quantify the deviation of individual models from it. We compare these results with federated learning and conclude that distributed networked learning is more robust to heterogeneity in data streams. We illustrate the findings through image recognition with convolutional neural networks.

In the second setting, we consider distributed single-task learning of a linear model with correlated data streams. Each node updates a model based on local data, subject to a network regularization that promotes consensus with neighboring models. We analyze the computation dynamics and information exchange and provide a finite-time characterization of the convergence of the weighted ensemble average estimate to the ground-truth parameter. We compare this result with federated learning to identify conditions that favor higher estimation precision. We illustrate the distributed learning scheme with three examples: estimating field temperature, modeling prey escape behavior, and predicting head movement.

In the final setting, we consider distributed multi-task learning with heterogeneous and correlated data streams. We assume that the nodes can be partitioned into groups corresponding to different learning tasks and are connected via a network. Each node estimates a model from local data, subject to local and global regularization terms that reduce noise and improve generalization performance. We provide a finite-time characterization of the convergence of the estimated models, along with the estimated task covariance, and illustrate distributed multi-task learning with two examples: random field temperature estimation and modeling the academic performance of students.
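
To make the update rule of the first setting concrete, the following is a minimal sketch of one synchronous round of networked learning under the assumptions described above; the names (grad_local, adjacency, mu, eta) and the specific scheduling are illustrative and are not taken from the dissertation.

```python
import numpy as np

def networked_sgd_round(models, streams, adjacency, grad_local, mu=0.01, eta=0.1):
    """One synchronous update for all nodes (illustrative sketch).

    models    : list of parameter vectors, one per node
    streams   : list of iterators yielding one local observation per round
    adjacency : symmetric 0/1 matrix describing the communication network
    grad_local: function (w, sample) -> stochastic gradient of the local loss
    mu        : step size; eta : network-regularization strength
    """
    new_models = []
    for k, (w_k, stream) in enumerate(zip(models, streams)):
        sample = next(stream)                 # one observation from the local data stream
        g_loss = grad_local(w_k, sample)      # stochastic gradient estimate of the local loss
        # network-regularization gradient: pulls w_k toward neighboring models
        g_net = sum(adjacency[k][l] * (w_k - w_l) for l, w_l in enumerate(models))
        new_models.append(w_k - mu * (g_loss + eta * g_net))
    return new_models

def ensemble_average(models):
    # the quantity analyzed in the first setting: the average of all node models
    return np.mean(models, axis=0)
```

In this sketch, setting eta to zero recovers purely local stochastic gradient descent, while larger eta strengthens the cohesion among the ensemble of models.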