Thesis

Fundamental components of deep learning: a category-theoretic approach

Creator
Rights statement
Awarding institution
  • University of Strathclyde
Date of award
  • 2024
Thesis identifier
  • T16859
Person Identifier (Local)
  • 201982557
Qualification Level
Qualification Name
Department, School or Faculty
Abstract
  • Deep learning, despite its remarkable achievements, is still a young field. Like the early stages of many scientific disciplines, it is marked by the discovery of new phenomena, ad-hoc design decisions, and the lack of a uniform and compositional mathematical foundation. From the intricacies of the implementation of backpropagation, through a growing zoo of neural network architectures, to new and poorly understood phenomena such as double descent, scaling laws, and in-context learning, there are few unifying principles in deep learning. This thesis develops a novel mathematical foundation for deep learning based on the language of category theory. We develop a new framework that is a) end-to-end, b) uniform, and c) not merely descriptive but prescriptive, meaning it is amenable to direct implementation in programming languages with sufficient features. We also systematise many existing approaches, placing many constructions and concepts from the literature under the same umbrella. In Part I, the theory, we identify and model two main properties of deep learning systems: they are parametric and bidirectional. We expand on the previously defined construction of actegories and Para to study the former, and define weighted optics to study the latter. Combining them yields parametric weighted optics, a categorical model of artificial neural networks, and more: the constructions in Part I have close ties to many other kinds of bidirectional processes, such as Bayesian updating, value iteration, and game theory. Part II justifies the abstractions from Part I, applying them to model backpropagation, architectures, and supervised learning. We provide a lens-theoretic axiomatisation of differentiation, covering not just smooth spaces but also the discrete setting of Boolean circuits. We survey existing categorical models of neural network architectures and develop new ones. We formalise the notion of an optimiser and, lastly, combine all of these concepts, providing a uniform and compositional framework for supervised learning.
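  As an illustration of the two components the abstract highlights, the sketch below gives a minimal Haskell rendering of lenses (a simple kind of optic modelling bidirectional processes such as backpropagation) and the Para construction (parametric maps, such as neural network layers). All names and signatures here are hypothetical simplifications for intuition, not the thesis's actual definitions, which are stated in full categorical generality.

  -- Hypothetical simplifications, not the thesis's definitions.
  -- A lens from a to b: a forward pass, plus a backward pass that
  -- turns the original input and a change at the output into a
  -- change at the input (the shape of reverse-mode differentiation).
  data Lens a b = Lens
    { fwd :: a -> b
    , bwd :: a -> b -> a
    }

  -- Lenses compose: forwards left-to-right, backwards right-to-left.
  compose :: Lens b c -> Lens a b -> Lens a c
  compose (Lens f2 b2) (Lens f1 b1) = Lens
    { fwd = f2 . f1
    , bwd = \a dc -> b1 a (b2 (f1 a) dc)
    }

  -- A parametric morphism: a map with interface a -> b that also
  -- depends on a parameter p, as a neural network layer does.
  newtype Para p a b = Para { runPara :: (p, a) -> b }

  -- Composing parametric maps pairs up their parameter spaces.
  composePara :: Para q b c -> Para p a b -> Para (p, q) a c
  composePara (Para g) (Para f) = Para (\((p, q), a) -> g (q, f (p, a)))

  -- A toy one-dimensional layer: weight and bias as the parameter.
  affine :: Para (Double, Double) Double Double
  affine = Para (\((w, b), x) -> w * x + b)

  main :: IO ()
  main = print (runPara affine ((2.0, 1.0), 3.0))  -- 2*3 + 1 = 7.0

  In the thesis these two pieces are combined into parametric (weighted) optics, so that a layer carries both its parameter and its backward pass; composing layers then composes forward passes, backward passes, and parameter spaces in one step.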
Advisor / supervisor
  • Ghani, Neil
Resource Type
DOI

Relations

Items