social.tchncs.de is one of the many independent Mastodon servers you can use to participate in the fediverse.
A friendly server from Germany that tends to attract techy people but welcomes everybody. This is one of the oldest Mastodon instances.

#parametrization


Extending the Forward Forward Algorithm
arxiv.org/abs/2307.04205

The Forward Forward algorithm (Geoffrey Hinton, 2022-11) is an alternative to backpropagation for training neural networks (NNs).

Backpropagation, the most widely used and most successful optimization algorithm for training NNs, has three important limitations ...

Hinton's paper: cs.toronto.edu/~hinton/FFA13.p
Discussion: bdtechtalks.com/2022/12/19/for
...

arXiv.org: Extending the Forward Forward Algorithm
The Forward Forward algorithm, proposed by Geoffrey Hinton in November 2022, is a novel method for training neural networks as an alternative to backpropagation. In this project, we replicate Hinton's experiments on the MNIST dataset, and subsequently extend the scope of the method with two significant contributions. First, we establish a baseline performance for the Forward Forward network on the IMDb movie reviews dataset. As far as we know, our results on this sentiment analysis task mark the first instance of the algorithm's extension beyond computer vision. Second, we introduce a novel pyramidal optimization strategy for the loss threshold - a hyperparameter specific to the Forward Forward method. Our pyramidal approach shows that a good thresholding strategy causes a difference of up to 8% in test error. Lastly, we perform visualizations of the trained parameters and derive several significant insights, such as a notably larger (10-20x) mean and variance in the weights acquired by the Forward Forward network.
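In a nutshell, each layer is trained locally: a "goodness" score (e.g. the sum of squared activations) should exceed a threshold for positive (real) data and fall below it for negative data, so no end-to-end backward pass is needed. Below is a minimal, hypothetical PyTorch sketch of a single FF layer; the class name, threshold value, and training-loop details are illustrative assumptions, not Hinton's or the paper's reference code.

```python
import torch
import torch.nn as nn

class FFLayer(nn.Module):
    """One Forward Forward layer, trained locally without backprop through the stack."""
    def __init__(self, d_in, d_out, threshold=2.0, lr=0.03):
        super().__init__()
        self.linear = nn.Linear(d_in, d_out)
        self.threshold = threshold  # the loss-threshold hyperparameter discussed above
        self.opt = torch.optim.Adam(self.parameters(), lr=lr)

    def forward(self, x):
        # Normalize the input so only its direction (not the previous layer's
        # goodness) is passed on, then apply the nonlinearity.
        x = x / (x.norm(dim=1, keepdim=True) + 1e-8)
        return torch.relu(self.linear(x))

    def train_step(self, x_pos, x_neg):
        # Goodness = sum of squared activations; push it above the threshold
        # for positive samples and below it for negative samples.
        g_pos = self.forward(x_pos).pow(2).sum(dim=1)
        g_neg = self.forward(x_neg).pow(2).sum(dim=1)
        loss = torch.log1p(torch.exp(torch.cat([
            self.threshold - g_pos,   # penalize low goodness on positive data
            g_neg - self.threshold,   # penalize high goodness on negative data
        ]))).mean()
        self.opt.zero_grad()
        loss.backward()               # gradients stay inside this layer only
        self.opt.step()
        # Detach the outputs so the next layer can train on them without
        # propagating gradients back through this one.
        return self.forward(x_pos).detach(), self.forward(x_neg).detach()
```

Layers are then stacked greedily: each layer's detached positive and negative outputs are fed to the next layer, which repeats the same local update.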

Pruning vs Quantization: Which is Better?
arxiv.org/abs/2307.02973

* Pruning removes weights, reducing the memory footprint
* Quantization (e.g. 4-bit or 8-bit matrix multiplication) reduces the bit-width used for both the weights and the computation in neural networks, leading to predictable memory savings and reductions in the required compute (a toy comparison of the two is sketched below)

In most cases quantization outperforms pruning.

arXiv.org: Pruning vs Quantization: Which is Better?
Neural network pruning and quantization techniques are almost as old as neural networks themselves. However, to date only ad-hoc comparisons between the two have been published. In this paper, we set out to answer the question of which is better: neural network quantization or pruning? By answering this question, we hope to inform design decisions made on neural network hardware going forward. We provide an extensive comparison between the two techniques for compressing deep neural networks. First, we give an analytical comparison of expected quantization and pruning error for general data distributions. Then, we provide lower bounds for the per-layer pruning and quantization error in trained networks, and compare these to empirical error after optimization. Finally, we provide an extensive experimental comparison for training 8 large-scale models on 3 tasks. Our results show that in most cases quantization outperforms pruning. Only in some scenarios with a very high compression ratio might pruning be beneficial from an accuracy standpoint.
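To make the trade-off concrete, here is a hypothetical NumPy toy that compares the two on a single random weight matrix: magnitude pruning zeroes the smallest-magnitude weights, uniform quantization rounds every weight to a coarse grid, and both are scored by relative reconstruction error. The 75% sparsity and 4-bit settings are illustrative choices, not the paper's experimental protocol.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 256))          # stand-in for a trained weight matrix

def magnitude_prune(w, sparsity=0.75):
    """Zero out the smallest-magnitude fraction of the weights."""
    k = int(w.size * sparsity)
    thresh = np.partition(np.abs(w).ravel(), k)[k]
    return np.where(np.abs(w) >= thresh, w, 0.0)

def uniform_quantize(w, bits=4):
    """Symmetric uniform quantization to 2**bits levels, then dequantize."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    return np.clip(np.round(w / scale), -qmax - 1, qmax) * scale

for name, w_hat in [("pruned 75%", magnitude_prune(W)),
                    ("quantized 4-bit", uniform_quantize(W))]:
    err = np.linalg.norm(W - w_hat) / np.linalg.norm(W)
    print(f"{name}: relative weight error = {err:.3f}")
```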

Training Transformers with 4-bit Integers
arxiv.org/abs/2306.11987

... we propose a training method for transformers with matrix multiplications implemented with the INT4 arithmetic. Training with an ultra-low INT4 precision is challenging ... we carefully analyze the specific structures of activation & gradients in transformers to propose dedicated quantizers for them. For forward propagation, we identify ...

arXiv.org: Training Transformers with 4-bit Integers
Quantizing the activation, weight, and gradient to 4-bit is promising to accelerate neural network training. However, existing 4-bit training methods require custom numerical formats which are not supported by contemporary hardware. In this work, we propose a training method for transformers with all matrix multiplications implemented with the INT4 arithmetic. Training with an ultra-low INT4 precision is challenging. To achieve this, we carefully analyze the specific structures of activation and gradients in transformers to propose dedicated quantizers for them. For forward propagation, we identify the challenge of outliers and propose a Hadamard quantizer to suppress the outliers. For backpropagation, we leverage the structural sparsity of gradients by proposing bit splitting and leverage score sampling techniques to quantize gradients accurately. Our algorithm achieves competitive accuracy on a wide range of tasks including natural language understanding, machine translation, and image classification. Unlike previous 4-bit training methods, our algorithm can be implemented on the current generation of GPUs. Our prototypical linear operator implementation is up to 2.2 times faster than the FP16 counterparts and speeds up the training by up to 35.1%.
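As a rough picture of what "all matrix multiplications in INT4" means, here is a hypothetical NumPy sketch that fake-quantizes both operands of a matmul to 4-bit integers (16 levels) with a per-tensor scale, accumulates in integer arithmetic, and rescales at the end. The paper's actual contributions (the Hadamard quantizer for activation outliers, bit splitting and leverage score sampling for gradients) are not reproduced here.

```python
import numpy as np

def int4_quantize(x):
    """Symmetric per-tensor quantization to INT4: values in {-8, ..., 7} plus a scale."""
    qmax = 7
    scale = np.abs(x).max() / qmax + 1e-12
    q = np.clip(np.round(x / scale), -8, qmax).astype(np.int8)  # int8 stands in for INT4 storage
    return q, scale

def int4_matmul(a, b):
    """Multiply two matrices from INT4-quantized operands: integer accumulate, FP rescale."""
    qa, sa = int4_quantize(a)
    qb, sb = int4_quantize(b)
    return qa.astype(np.int32) @ qb.astype(np.int32) * (sa * sb)

rng = np.random.default_rng(0)
act = rng.normal(size=(8, 64))     # stand-in activations
w = rng.normal(size=(64, 32))      # stand-in weights
exact = act @ w
approx = int4_matmul(act, w)
print("relative error:", np.linalg.norm(exact - approx) / np.linalg.norm(exact))
```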

John von Neumann once claimed, "with 4 parameters, I can fit an elephant, and with 5, I can make him wiggle his trunk."
\[x(t)=\displaystyle\sum_{k=0}^\infty\left(A_k^x\cos(kt)+B_k^x\sin(kt) \right)\]
\[y(t)=\displaystyle\sum_{k=0}^\infty\left(A_k^y\cos(kt)+B_k^y\sin(kt) \right)\]
Here's a paper proving that von Neumann's claim is valid! 🔗 aapt.scitation.org/doi/10.1119
#Neumann #JohnVonNeumann #VonNeumann #FourierSeries #parameters #complexparameters #parametrization #mathematics #maths
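The construction behind that result is exactly the truncated Fourier series above, with each complex parameter packing one cosine and one sine coefficient. Here is a hypothetical Python sketch that evaluates such a curve for arbitrary made-up coefficients (these are not the elephant parameters from the linked paper):

```python
import numpy as np

def fourier_curve(t, ax, bx, ay, by):
    """Evaluate x(t) = sum_k (A_k^x cos(kt) + B_k^x sin(kt)) and likewise y(t)."""
    k = np.arange(len(ax))[:, None]               # harmonics k = 0, 1, 2, ...
    cos_kt, sin_kt = np.cos(k * t), np.sin(k * t)
    x = np.asarray(ax) @ cos_kt + np.asarray(bx) @ sin_kt
    y = np.asarray(ay) @ cos_kt + np.asarray(by) @ sin_kt
    return x, y

t = np.linspace(0.0, 2.0 * np.pi, 400)
# Arbitrary coefficients for illustration only; the paper encodes its elephant
# in four complex parameters (each carrying one A and one B coefficient).
x, y = fourier_curve(t,
                     ax=[0.0, 1.0, 0.2], bx=[0.0, 0.3, -0.4],
                     ay=[0.0, -0.5, 0.1], by=[0.0, 0.8, 0.25])
# matplotlib's plt.plot(x, y) would draw the resulting closed curve.
```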