Publication
 Assumption violations in causal discovery and the robustness of score matching. 37th Conference on Neural Information Processing Systems (NeurIPS 2023) (2023). at <https://proceedings.neurips.cc/paper_files/paper/2023/file/93ed74938a54a73b5e4c52bbaf42ca8e-Paper-Conference.pdf>
 Estimating Koopman operators with sketching to provably learn large scale dynamical systems. 37th Conference on Neural Information Processing Systems (NeurIPS 2023) (2023). at <https://proceedings.neurips.cc/paper_files/paper/2023/file/f3d1e34a15c0af0954ae36a7f811c754-Paper-Conference.pdf>
 Heteroscedastic Gaussian Processes and Random Features: Scalable Motion Primitives with Guarantees. 7th Conference on Robot Learning (CoRL 2023) (2023). at <https://proceedings.mlr.press/v229/caldarelli23a/caldarelli23a.pdf>
 The Janus effects of SGD vs GD: high noise and low rank. (2023).
 Updated with appendix showing empirically that the main results extend to deep nonlinear networks (2.95 MB)
 Small updates; typo fixes (616.82 KB)
 An Optimal Structured Zeroth-order Algorithm for Non-smooth Optimization. 37th Conference on Neural Information Processing Systems (NeurIPS 2023) (2023). at <https://proceedings.neurips.cc/paper_files/paper/2023/file/7429f4c1b267cf619f28c4d4f1532f99-Paper-Conference.pdf>
 Scalable Causal Discovery with Score Matching. NeurIPS 2022 (2022). at <https://openreview.net/forum?id=v56PHv_W2A>
 For interpolating kernel machines, the minimum norm ERM solution is the most stable. (2020).
 CBMM_Memo_108.pdf (1015.14 KB)
 Better bound (without inequalities!) (1.03 MB)
 Beating SGD Saturation with Tail-Averaging and Minibatching. Neural Information Processing Systems (NeurIPS 2019) (2019).
 9422-beating-sgd-saturation-with-tail-averaging-and-minibatching.pdf (389.35 KB)
 Dynamics & Generalization in Deep Networks - Minimizing the Norm. NAS Sackler Colloquium on Science of Deep Learning (2019).
 Implicit Regularization of Accelerated Methods in Hilbert Spaces. Neural Information Processing Systems (NeurIPS 2019) (2019).
 9591-implicit-regularization-of-accelerated-methods-in-hilbert-spaces.pdf (451.14 KB)
 Theory III: Dynamics and Generalization in Deep Networks. (2018).
 Original and intermediate versions are available upon request (2.67 MB)
 CBMM Memo 90 v12.pdf (4.74 MB)
 Theory_III_ver44.pdf (updated Hessian discussion) (4.12 MB)
 Theory_III_ver48 (Updated discussion of convergence to max margin) (2.56 MB)
 fixing errors and sharpening some proofs (2.45 MB)
 Computational and Cognitive Neuroscience of Vision 85-104 (Springer, 2017).
 Symmetry Regularization. (2017).
 CBMM-Memo-063.pdf (6.1 MB)
 Theory of Deep Learning III: explaining the non-overfitting puzzle. (2017).
 CBMM-Memo-073.pdf (2.65 MB)
 CBMM Memo 073 v2 (revised 1/15/2018) (2.81 MB)
 CBMM Memo 073 v3 (revised 1/30/2018) (2.72 MB)
 CBMM Memo 073 v4 (revised 12/30/2018) (575.72 KB)
 Why and when can deep-but not shallow-networks avoid the curse of dimensionality: A review. International Journal of Automation and Computing 1-17 (2017). doi:10.1007/s11633-017-1054-2
 art:10.1007/s11633-017-1054-2.pdf (1.68 MB)
 Holographic Embeddings of Knowledge Graphs. Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16) (2016).
 1510.04935v2.pdf (360.65 KB)
 On invariance and selectivity in representation learning. Information and Inference: A Journal of the IMA iaw009 (2016). doi:10.1093/imaiai/iaw009
 imaiai.iaw009.full.pdf (267.87 KB)
 Theory I: Why and When Can Deep Networks Avoid the Curse of Dimensionality? (2016).
 CBMM-Memo-058v1.pdf (2.42 MB)
 CBMM-Memo-058v5.pdf (2.45 MB)
 CBMM-Memo-058-v6.pdf (2.74 MB)
 Proposition 4 has been deleted (2.75 MB)
 Deep Convolutional Networks are Hierarchical Kernel Machines. (2015).
 CBMM Memo 035_rev5.pdf (975.65 KB)
 Discriminative Template Learning in Group-Convolutional Networks for Invariant Speech Representations. INTERSPEECH-2015 (International Speech Communication Association (ISCA), 2015). at <http://www.isca-speech.org/archive/interspeech_2015/i15_3229.html>
 Holographic Embeddings of Knowledge Graphs. (2015).
 holographic-embeddings.pdf (677.87 KB)
 On Invariance and Selectivity in Representation Learning. (2015).
 CBMM Memo No. 029 (812.07 KB)
 I-theory on depth vs width: hierarchical function composition. (2015).
 cbmm_memo_041.pdf (1.18 MB)