Back to Basics: Artificial Neural Networks and Machine Learning
Oct 15, 2024

Artificial Neural Networks and Machine Learning
The heart of every AI application is the artificial neural network (ANN). ANNs attempt to mimic, in software, the way signals propagate in a human brain, from sensory input to some output. In this metaphor, software “nodes” (equivalent to neurons in the brain) have connections (“synapses”) to other nodes. Each connection has its own numerical weight (known as a “parameter”) associated with it. Each node “fires” (sends a signal to other nodes) when the weighted sum of the signals entering it exceeds a certain threshold.
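To make that concrete, here is a minimal sketch of a single node in plain Python; the input signals, weights, and threshold below are invented purely for illustration:

```python
# A single artificial "node": it fires when the weighted sum of its
# input signals exceeds a threshold. All values here are made up.

def node_fires(inputs, weights, threshold):
    """Return 1 if the weighted sum of inputs exceeds the threshold."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum > threshold else 0

# Three input signals, each with its own weight (parameter).
print(node_fires([0.5, 0.9, 0.1], [0.4, 0.7, -0.2], threshold=0.6))  # -> 1
```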
The most common ANN design arranges nodes in layers: at a minimum, an input layer and an output layer. Few practical networks have only these two layers; most have one or more “hidden” layers between the input and output layers, and networks with many hidden layers are what the term “deep learning” refers to. Nodes in each layer are connected only to nodes in adjacent layers. “Machine learning” (ML) is the process of setting a network’s weights automatically from example data rather than programming them by hand.
The number of layers, the number of nodes in each layer, the connections between nodes, and the weights on those connections together make up an ML “model.”
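As a rough sketch of that layered structure, here is a toy network with one hidden layer; the layer sizes are arbitrary, and the weights are random placeholders rather than trained values:

```python
import numpy as np

# A toy feedforward network: 3 input nodes, a hidden layer of 4 nodes,
# and 2 output nodes. In a real model the weights would be learned
# during training, not drawn at random.
rng = np.random.default_rng(0)
W_hidden = rng.normal(size=(3, 4))   # connections: input -> hidden
W_output = rng.normal(size=(4, 2))   # connections: hidden -> output

def forward(x):
    """Propagate an input signal through the layers."""
    hidden = np.maximum(0, x @ W_hidden)   # each hidden node fires only above 0
    return hidden @ W_output

print(forward(np.array([1.0, 0.5, -0.3])))
```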
Training a Machine-Learning Model
For an ML model to be useful, it must be trained. The basic function of an ML model is to recognize patterns in data (images, audio, video, text, and so on) and generate an output that depends on which patterns, if any, it recognizes in the input.
To take a simple example, suppose we want to train an ML model to distinguish images with cats in them from images without cats. If you show a toddler one photo of a cat, she will be able, from that day forward, to identify any image with a cat in it, even a poorly drawn cartoon cat. An ML model, by contrast, must be trained on thousands of images with and without cats, a time-consuming and energy-intensive process.
Remember the weights between nodes that we discussed earlier? The objective of training an ML model is to adjust the weight values in an iterative process until the model can reliably classify an input image as “cat” or “not cat.”
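As a rough illustration of that iterative adjustment, the sketch below runs gradient descent on a toy binary classifier. The two-number “images” and their “cat” labels are fabricated stand-ins for real pixel data; an actual cat detector would use a far larger network and dataset.

```python
import numpy as np

# Iterative weight adjustment (gradient descent) for a toy
# "cat" / "not cat" classifier on made-up data.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))                 # 100 toy two-number "images"
y = (X[:, 0] + X[:, 1] > 0).astype(float)     # fabricated "cat" labels

w = np.zeros(2)                               # the weights to be learned
b = 0.0
lr = 0.1                                      # learning rate

for step in range(500):
    p = 1 / (1 + np.exp(-(X @ w + b)))        # predicted probability of "cat"
    grad_w = X.T @ (p - y) / len(y)           # direction to nudge each weight
    grad_b = np.mean(p - y)
    w -= lr * grad_w                          # adjust the weights a little...
    b -= lr * grad_b                          # ...and repeat many times

accuracy = np.mean((p > 0.5) == y)
print(f"training accuracy: {accuracy:.0%}")
```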
Generative AI
The simple ML model described above is useful for only one purpose: to distinguish cat images from cat-free images. It’s useless for identifying dogs, chipmunks, giraffes, or anything else. The more you ask of a model, the larger it must be, and the more data is needed to train it.
At the opposite end of this spectrum, and the source of most of the recent public fascination with AI, are “generative AI” models, which take unstructured text inputs and generate text (be it prose, poetry, computer code, or what have you), images, video, or audio. You could, for instance, prompt an image-generation application to create a “blue cat with orange stripes in cubist style,” and it might generate a cat image resembling a Picasso painting.
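For a sense of what that prompt looks like in code, here is a minimal sketch using the open-source Hugging Face diffusers library. The Stable Diffusion checkpoint named below is one illustrative choice among many, and running it requires a CUDA-capable GPU.

```python
# Illustrative only: text-to-image generation with Hugging Face diffusers.
# Requires: pip install diffusers transformers accelerate torch
import torch
from diffusers import StableDiffusionPipeline

# Load a Stable Diffusion checkpoint (example model name; others work too).
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = pipe("blue cat with orange stripes in cubist style").images[0]
image.save("cubist_cat.png")
```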
ChatGPT is a generative AI application designed to “converse” with human users. The underlying model for ChatGPT is a special type of ML model called a “large language model” (LLM). As the name implies, LLMs can be quite large, with parameter counts in the tens or hundreds of billions or more.
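To get a feel for that scale, consider a back-of-the-envelope calculation: at 16-bit precision, each parameter occupies 2 bytes, so merely storing the weights of a 70-billion-parameter model takes roughly 140 GB, far more than any single consumer GPU holds.

```python
# Rough memory footprint of an LLM's weights (illustrative numbers).
params = 70e9            # a 70-billion-parameter model
bytes_per_param = 2      # 16-bit (half-precision) floating point
print(f"{params * bytes_per_param / 1e9:.0f} GB")  # -> 140 GB
```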
Limitations of ML Models
Despite their popularity, ML models in general, and generative AI applications in particular, suffer from a number of limitations:
- As mentioned earlier, training an AI model requires an immense amount of data. For some applications, the amount of training data needed does not exist or is difficult to obtain. For generative AI applications, training data is often pulled indiscriminately from the internet, and many copyright holders are not thrilled with this.
- No ML model is 100% reliable. Even many simple models with narrow purposes top out at 80–90% accuracy. Apps such as ChatGPT may be convincing, but they can “hallucinate,” making statements that are flatly wrong. No important decision should be made on the basis of an AI model’s output without double-checking it first.
- Training an ML model requires specialized computing hardware, specifically high-performance graphics processing units (GPUs), which can be in short supply and expensive to acquire, deploy, operate, and maintain.
TensorWave and ML Development
To address that last point, TensorWave has built the leading AI cloud platform based on AMD’s MI300X GPU. As a cloud-based service, TensorWave helps you avoid the high cost of GPU hardware while providing access to GPUs that are more readily available than, and outperform, NVIDIA’s flagship H100 GPU.
The result is a platform that scales to support projects of any size, backed by AMD’s open-source ROCm AI development software, for faster development cycles and a low total cost of ownership.
To learn more about TensorWave’s role in fulfilling your company’s AI aspirations, book a demo today.