Papers | Yifan Jiang

2024

The Anytime Convergence of Stochastic Gradient Descent with Momentum: From a Continuous-Time Perspective

Yasong Feng, Yifan Jiang, Tianyu Wang, and Zhiliang Ying

Feb 2024

Abs arXiv Bib

In this paper, we study the stochastic optimization problem from a continuous-time perspective. We propose a stochastic first-order algorithm, called Stochastic Gradient Descent with Momentum (SGDM), and show that the trajectory of SGDM, despite its stochastic nature, converges to a deterministic second-order Ordinary Differential Equation (ODE) in \(L_2\)-norm, as the stepsize goes to zero. The connection between the ODE and the algorithm results in delightful patterns in the discrete-time convergence analysis. More specifically, we develop convergence results for the ODE through a Lyapunov function, and translate the whole argument to the discrete-time case. This approach yields a novel anytime convergence guarantee for stochastic gradient methods. Precisely, we prove that, for any \(β\), there exists \(k_0\)such that the sequence \({x_k}\)governed by running SGDM on a smooth convex function \(f\)satisfies \[ \bbP\Biggl(\mathop⋂_{k=k_0}^∞\biggl\{f (x_k) - f^* ≤\frac{C\log k \log(2/β)}{\sqrt{k}}\biggr\}\Biggr)≥1-β,\]where \(f^*=\min_{x∈\bbR^n}f(x)\). Our contribution is significant in that it better captures the convergence behavior across the entire trajectory of the algorithm, rather than at a single iterate.
@misc{feng24Anytime, title = {The {{Anytime Convergence}} of {{Stochastic Gradient Descent}} with {{Momentum}}: {{From}} a {{Continuous-Time Perspective}}}, shorttitle = {The {{Anytime Convergence}} of {{Stochastic Gradient Descent}} with {{Momentum}}}, author = {Feng, Yasong and Jiang, Yifan and Wang, Tianyu and Ying, Zhiliang}, year = {2024}, month = feb, number = {arXiv:2310.19598}, eprint = {2310.19598}, publisher = {{arXiv}}, }
Duality of Causal Distributionally Robust Optimization: The Discrete-Time Case

Yifan Jiang

Jan 2024

Abs arXiv Bib

This paper studies distributionally robust optimization (DRO) in a dynamic context. We consider a general penalized DRO problem with a causal transport-type penalization. Such a penalization naturally captures the information flow generated by the dynamic model. We derive a tractable dynamic duality formula under mild conditions. Furthermore, we apply this duality formula to address distributionally robust version of average value-at-risk, stochastic control, and optimal stopping.
@misc{jiang24Duality, title = {Duality of Causal Distributionally Robust Optimization: The Discrete-Time Case}, shorttitle = {Duality of Causal Distributionally Robust Optimization}, author = {Jiang, Yifan}, year = {2024}, month = jan, number = {arXiv:2401.16556}, eprint = {2401.16556}, publisher = {{arXiv}}, }

2023

Wasserstein Distributional Robustness of Neural Networks

Xingjian Bai, Guangyi He, Yifan Jiang, and Jan Obłój

In Advances in Neural Information Processing Systems , Dec 2023

Abs arXiv Bib Code

Deep neural networks are known to be vulnerable to adversarial attacks (AA). For an image recognition task, this means that a small perturbation of the original can result in the image being misclassified. Design of such attacks as well as methods of adversarial training against them are subject of intense research. We re-cast the problem using techniques of Wasserstein distributionally robust optimization (DRO) and obtain novel contributions leveraging recent insights from DRO sensitivity analysis. We consider a set of distributional threat models. Unlike the traditional pointwise attacks, which assume a uniform bound on perturbation of each input data point, distributional threat models allow attackers to perturb inputs in a non-uniform way. We link these more general attacks with questions of out-of-sample performance and Knightian uncertainty. To evaluate the distributional robustness of neural networks, we propose a first-order AA algorithm and its multistep version. Our attack algorithms include Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD) as special cases. Furthermore, we provide a new asymptotic estimate of the adversarial accuracy against distributional threat models. The bound is fast to compute and first-order accurate, offering new insights even for the pointwise AA. It also naturally yields out-of-sample performance guarantees. We conduct numerical experiments on CIFAR-10, CIFAR-100, ImageNet datasets using DNNs on RobustBench to illustrate our theoretical results. Our code is available at here.
@inproceedings{BHJO23Wasserstein, title = {Wasserstein Distributional Robustness of Neural Networks}, author = {Bai, Xingjian and He, Guangyi and Jiang, Yifan and Obłój, Jan}, year = {2023}, month = dec, booktitle = {Advances in Neural Information Processing Systems}, publisher = {Curran Associates, Inc.}, }
Empirical Approximation to Invariant Measures for McKean–Vlasov Processes: Mean-field Interaction vs Self-interaction

Kai Du, Yifan Jiang, and Jinfeng Li

Bernoulli, Aug 2023

Abs arXiv Bib

This paper proves that, under a monotonicity condition, the invariant probability measure of a McKean–Vlasov process can be approximated by weighted empirical measures of some processes including itself. These processes are described by distribution dependent or empirical measure dependent stochastic differential equations constructed from the equation for the McKean–Vlasov process. Convergence of empirical measures is characterized by upper bound estimates for their Wasserstein distances to the invariant measure. Numerical simulations of the mean-field Ornstein–Uhlenbeck process are implemented to demonstrate the theoretical results.
@article{DJL23Empirical, title = {Empirical Approximation to Invariant Measures for {McKean--Vlasov} Processes: Mean-field Interaction vs Self-interaction}, author = {Du, Kai and Jiang, Yifan and Li, Jinfeng}, year = {2023}, month = aug, journal = {Bernoulli}, volume = {29}, number = {3}, pages = {2492--2518}, publisher = {Bernoulli Society for Mathematical Statistics and Probability}, }
Sequential Propagation of Chaos

Kai Du, Yifan Jiang, and Xiaochen Li

Jan 2023

Abs arXiv Bib

A new class of particle systems with sequential interaction is proposed to approximate the McKean-Vlasov process that originally arises as the limit of the mean-field interacting particle system. The weighted empirical measure of this particle system is proved to converge to the law of the McKean-Vlasov process as the system grows. Based on the Wasserstein metric, quantitative propagation of chaos results are obtained for two cases: the finite time estimates under the monotonicity condition and the uniform in time estimates under the dissipation and the non-degenerate conditions. Numerical experiments are implemented to demonstrate the theoretical results.
@misc{DJL23Sequential, title = {Sequential Propagation of Chaos}, author = {Du, Kai and Jiang, Yifan and Li, Xiaochen}, year = {2023}, month = jan, number = {arXiv:2301.09913}, eprint = {2301.09913}, publisher = {{arXiv}}, }

2022

Existence and Distributional Chaos of Points that are Recurrent but Not Banach Recurrent

Yifan Jiang, and Xueting Tian

Journal of Dynamics and Differential Equations, Apr 2022

Abs arXiv Bib

According to the recurrent frequency, many levels of recurrent points are found such as periodic points, almost periodic points, weakly almost periodic points, quasi-weakly almost periodic points and Banach recurrent points. In this paper, we consider symbolic dynamics and show the existence of six refined levels between Banach recurrence and general recurrence. Despite the fact that these refined levels are all null-measure under any invariant measure, we further show they carry strong topological complexity. Each refined level of recurrent points is dense in the whole space and contains an uncountable distributionally chaotic subset. For a wide range of dynamical systems such as expansive systems with the shadowing property, we also show the distributional chaos of the points that are recurrent but not Banach recurrent.
@article{JT22Existence, title = {Existence and Distributional Chaos of Points that are Recurrent but Not {Banach} Recurrent}, author = {Jiang, Yifan and Tian, Xueting}, year = {2022}, month = apr, journal = {Journal of Dynamics and Differential Equations}, issn = {1572-9222}, doi = {10.1007/s10884-022-10158-x}, }

2021

Convergence of the Deep BSDE Method for FBSDEs with Non-Lipschitz Coefficients

Yifan Jiang, and Jinfeng Li

Probability, Uncertainty and Quantitative Risk, Dec 2021

Abs arXiv Bib

This paper is dedicated to solving high-dimensional coupled FBSDEs with non-Lipschitz diffusion coefficients numerically. Under mild conditions, we provided a posterior estimate of the numerical solution that holds for any time duration. This posterior estimate validates the convergence of the recently proposed Deep BSDE method. In addition, we developed a numerical scheme based on the Deep BSDE method and presented numerical examples in financial markets to demonstrate the high performance.
@article{JL21Convergence, title = {Convergence of the {Deep BSDE} Method for {FBSDEs} with Non-{Lipschitz} Coefficients}, author = {Jiang, Yifan and Li, Jinfeng}, year = {2021}, month = dec, journal = {Probability, Uncertainty and Quantitative Risk}, volume = {6}, number = {4}, pages = {391--408}, publisher = {American Institute of Mathematical Sciences}, }