Humans love a good pattern. After all, phenomena in the natural world tend to follow them: how stars and planets orbit in space, how birds follow certain migratory routes, or how fractals tend to show up in different creatures, fossils, and plants. There’s even a field of study entirely dedicated to the cyclic and seasonal patterns observed in nature: phenology! And who among us can say they haven’t worn a flannel?
Artificial intelligence, or AI, is one approach that can help in the age-old task of sorting through data and identifying meaningful relationships, or patterns, within it. Today, talk of AI is difficult to miss. Yet the actual applications and capabilities of it are often misunderstood. Will it solve every problem ever? (No.) Should AI write my paper? (Probably not?) What can it actually do? Let’s find out!
There are many nuances and considerations to the application of AI in our society, and this two-part guide aims to shed a little light on one particular AI method: neural networks. In Part II, we’ll apply what we’ve learned about neural networks to a case study: hydrology and HydroForecast.
Our abilities to parse through information, find meaningful relationships, identify anomalies, and discern new knowledge when presented with new information are pretty classic human displays of intelligence™. So what’s AI got to do with it?
The term AI is a broad, fuzzily defined catch-all that encompasses a range of definitions and forms. For our purposes, AI occupies the space where machines display some form of intelligence (itself loosely defined) comparable to that of humans.
One advancement can help explain the seemingly sudden proliferation of AI in recent years: computer processing power, a.k.a. powerful hardware. Rapid advances in computing power have facilitated greater collection, availability, and access to data of all kinds, and this data, in turn, can now also be processed via said powerful computing power! Nice.
Why does this matter? Think of it this way: it’d be impossible for any one person or team to process the torrent of information and data that modern society is now able to create and collect. Imagine the data that a single app like Instagram produces in a day or the information recorded by an AI assistant like Siri – that’s all data. Data, however, isn't quite the same as knowledge.
It’s all bits and pieces waiting to be sifted through and understood.
We’re ready to dive into the branch of AI this guide investigates: machine learning and neural networks.
Machine learning is one branch of artificial intelligence particularly useful to the sciences because of how it can process massive amounts of source data and “learn” meaningful relationships within that data. This process of identifying relationships makes machine learning models particularly well suited to predicting outcomes in environmental settings, where scientists have already been gathering data for decades, and that data works well in machine learning models.
<span class="term" data-def="‘Machine learning is a branch of artificial intelligence (AI) and computer science which focuses on the use of data and algorithms to imitate the way that humans learn, gradually improving its accuracy’ IBM">Machine learning</span> can be thought of as a set of <span class="term" data-def="‘a process or set of rules to be followed in calculations or other problem-solving operations, especially by a computer’ Oxford Dictionary">algorithms</span> (steps for a computer to follow) that will take sample data (input and, in most cases, desired output), analyze meaningful relationships between that data (learn, identify key <span class="term" data-def="Features are individual and independent variables that serve as input in your model">features</span>), all to eventually perform a task (produce an output).
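To make that definition concrete, here’s a toy sketch in Python (our own illustration, not from any particular ML library): an algorithm takes sample input/output pairs, learns the relationship between them, and can then perform its task on new inputs. The hidden relationship here is output = 2 × input + 1, recovered with simple least-squares, which captures the spirit, if not the scale, of real machine learning.

```python
# A toy "machine learning" algorithm: given example (input, output) pairs,
# learn the relationship between them, then predict outputs for new inputs.

def learn(pairs):
    """Fit a line output = w * input + b to the sample data (least squares)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    w = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / sum(
        (x - mean_x) ** 2 for x, _ in pairs)
    b = mean_y - w * mean_x
    return w, b

# Sample data: inputs paired with desired outputs.
training_data = [(0, 1), (1, 3), (2, 5), (3, 7)]
w, b = learn(training_data)

def predict(x):          # the learned "task"
    return w * x + b

print(predict(10))       # generalizes to unseen input: 2 * 10 + 1 = 21.0
```

A straight line is about the simplest “model” there is; the neural networks below do the same job with far more flexible relationships.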
Our focus is on neural networks, one approach to machine learning, which is itself one branch of AI.
With the computing power now available, machine learning is especially valuable when a dataset is too vast for people to parse unaided. This approach can uncover relationships in data that are too layered, complex, or laborious for us to uncover without assistance from computers.
Varying forms of machine learning are nearly ubiquitous these days: from Photoshop fixing a blemish on a photo, to large language models like ChatGPT, to the spam filter in your email inbox. One example of AI that is not machine learning? The AI in video games. There’s been decades of AI in video games without machine learning. Your Atari pong opponent? That’s (non-machine learning) AI!
<span class="term" data-def="‘A means to do machine learning, in which a computer learns to perform some task by analyzing training examples’ MIT">Artificial neural networks</span> (ANNs), or neural networks for short, are one particular family of machine learning algorithms. Taking it a step further, deep learning is a subset of neural networks, and that’s what we’re particularly interested in: deep neural networks.
So how do they work?
Neural networks get their name because they are modeled after how neurons signal to one another in the human brain. In fact, many machine learning models have been designed based on how scientists and researchers believe information is stored and decisions are made in the human brain. Biomimicry old friend, there ya are again!
Neural networks are made up of different layers. These layers fall into one of three categories: an input layer, one or more hidden layers, and an output layer.
Each layer within a neural network contains a set number of nodes, or artificial neurons. These nodes receive information from the previous layer and pass it to the subsequent layer. As information moves through the network, the nodes learn to focus on the information that is most important. The interconnected nature of the nodes allows them to learn patterns from individual sources as well as how different sources of input data interact with each other in meaningful ways.
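Here’s a minimal sketch of that flow in plain Python (the weights below are arbitrary numbers chosen for illustration): each node computes a weighted sum of the previous layer’s outputs, adds a bias, and applies an activation function before passing the result along.

```python
import math

def layer(inputs, weights, biases):
    """One layer: each node takes a weighted sum of the previous
    layer's outputs, adds a bias, and applies an activation (tanh)."""
    return [
        math.tanh(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

# A tiny network: 3 input nodes -> 2 hidden nodes -> 1 output node.
hidden_w = [[0.5, -0.2, 0.1],   # weights into hidden node 1
            [0.3,  0.8, -0.5]]  # weights into hidden node 2
hidden_b = [0.0, 0.1]
output_w = [[1.2, -0.7]]        # weights into the single output node
output_b = [0.05]

x = [0.9, 0.1, 0.4]                  # input layer: raw data goes in
h = layer(x, hidden_w, hidden_b)     # hidden layer: internal representation
y = layer(h, output_w, output_b)     # output layer: the prediction
print(y)
```

In a real network the weights aren’t hand-picked like this; they’re learned during training, which is exactly where we’re headed.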
Theoretically, a neural network can learn any arbitrarily complex relationship that might exist between a set of inputs and a desired output.
When a neural network is composed of three or more hidden layers, folks call it deep learning. As you might imagine, the greater the number of layers, the more complexity the model can learn, and the more data is needed to create a good model.
Let’s make this all a bit more concrete. Say there’s a scientist tasked with the very important job of building a model that can determine what kind of animal is in a photo. Too simple? Well, what’s easy for a 7 year-old human is deceptively tricky for a computer. The scientist would feed input data – photos of fluffy friends – into the <span class="term" data-def="This is the layer that takes in the initial information or input data, transforms it, and passes it onto the next layer.">input layer</span> of the model. The <span class="term" data-def="transforms internal representations into the desired output(s).">output layer</span> would be a label corresponding to the type of animal in each photo: e.g. dog, shark, capybara, etc. The <span class="term" data-def="Create internal representations of information relevant to the task at hand, transforming and refining it in each hidden layer. This is where the majority of the magic happens.">hidden layers</span> would be the complex set of relationships that ultimately allow the model to do its task: receive a photo (input layer), which is just a bunch of colored pixels, and figure out (hidden layer) which kind of animal is in the photo, and then pop out a label of what it is (output layer).
The 7 year-old human may initially have an easier time with this animal-identification task and, importantly, can learn from straightforward information like ‘snakes have no legs’. Machine learning models will often make silly mistakes, and their mistakes can be tricky to correct. But once they learn, they can make large numbers of predictions millions of times faster than any human.
People receive training of all kinds: how to pole vault, <span class="term" data-img-url="https://media.giphy.com/media/v1.Y2lkPTc5MGI3NjExZnZxZ2oyZjhvdnp3ZXJzejZsdTdwdnBjbjZuMm9leHI3ZHpudW44OSZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/uLMxqxVvVtuVO/giphy.gif">how to use Excel</span>, how to make sourdough bread.
Machine learning models are the same. They need to be trained in order to do their desired task.
When a neural network trains or learns, the model starts by making guesses, gets feedback on its guesses, takes that feedback and then updates the relationships in its hidden layers so that it can make slightly better guesses and get more feedback. This process is repeated over and over until it is able to reliably guess correctly. Fun fact: in the world of machine learning, we usually say ‘predict’ not ‘guess’.
During training, the weights and biases in all the layers of the neural network get updated many times as the model makes predictions and gets feedback on those predictions. Once training is done, the weights don’t change. At this point, you can feed the model some new input data, like a photo of an animal, and if you’ve trained your model effectively, it should give you the correct output: the name of the animal in the photo.
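That predict-feedback-update cycle can be sketched with the simplest possible “network”: a single weight. Everything here is a toy of our own construction (the real update rule in neural networks, backpropagation, follows the same predict/feedback/update rhythm at vastly larger scale):

```python
# Toy training loop: the "model" is a single weight w, and the task is to
# learn output = 3 * input. Each round: predict, measure the error
# (feedback), nudge the weight, repeat.

data = [(1, 3), (2, 6), (4, 12)]   # (input, desired output) pairs
w = 0.0                            # start with a bad guess
learning_rate = 0.01

for step in range(200):
    for x, target in data:
        prediction = w * x                 # the model's prediction
        error = prediction - target        # feedback on that prediction
        w -= learning_rate * error * x     # update to do better next time

# Training is done: w no longer changes. Now feed the model new input.
print(round(w, 3))      # the learned weight, close to 3.0
print(round(w * 7, 2))  # prediction for a brand-new input
```

Note how the two phases mirror the text: during the loop the weight keeps moving, and after training it’s frozen and used only to make predictions.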
Relative to classical statistical methods, neural network models have many more weights to learn: smaller models can have on the order of hundreds of thousands of weights, while larger models can reach hundreds of billions. This is why modern neural network models are so data-hungry and computation-hungry; the larger the model, the hungrier, or more data- and energy-consumptive, it is.
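For a feel of where those weight counts come from, here’s a quick back-of-the-envelope calculation in Python. In a fully connected network, a layer going from n inputs to m nodes needs n × m weights plus m biases; the layer sizes below are made up for illustration.

```python
# Count the learnable parameters in a fully connected network.
def count_parameters(layer_sizes):
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out   # weights + biases per layer
    return total

# A small image classifier: 784 pixel inputs, two hidden layers, 10 labels.
print(count_parameters([784, 256, 128, 10]))  # prints 235146
```

Even this modest four-layer toy has almost a quarter of a million weights to learn, which is why the data and compute appetite grows so fast with model size.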
The hidden layers of a neural network compress the input data into an abstract representation of the information space that is difficult to relate to physical conceptualizations. In the space of forecasting, weights are learned not only between inputs but also across how those inputs change over time.
In other words, the hidden layers are potentially learning thousands upon thousands of patterns that exist within the data.
We mentioned that a neural network can have thousands or millions of neurons in its hidden layers. That means it can store a lot of information. It could, theoretically, memorize what type of animal is in each of the 500,000 photos you give it without actually learning the relationships between pixels and animal types. But memorization is no more useful for machine learning models than it is for humans. If the model just memorized all the animals in the photos, it would be useless in telling you what kind of animal was in a photo that it had never seen before. Or to put it another way, it wouldn’t do well with new, or constantly changing, information.
So, like the teacher who gives their students pop quizzes to test learning, a model must be tested on data it has never seen before in order to make sure it’s really learning patterns, and not just regurgitating memorized answers.
Machine learning engineers often don’t let a model see all of the available data for training. They keep some data separate, to test the model to make sure it’s actually learning the relationships they want it to learn.
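That holdout practice is easy to sketch in Python (a simplified version of what ML engineers do; the photo filenames are stand-ins for real labeled data):

```python
import random

# Hold some data out for testing: the model never sees it during training,
# so good performance on it means real learning, not memorization.

def train_test_split(examples, test_fraction=0.2, seed=42):
    shuffled = examples[:]                      # don't mutate the original
    random.Random(seed).shuffle(shuffled)       # seeded, so it's repeatable
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)

photos = [f"photo_{i}.jpg" for i in range(500)]  # stand-in for labeled photos
train, test = train_test_split(photos)
print(len(train), len(test))  # prints: 400 100
```

The key property is that the two sets don’t overlap: every test example is a genuine pop quiz the model has never studied.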
How do we know when a model is done training and ready to be released into the world? When it shouts, I’m ready!
Kidding. This seemingly simple question is actually quite complex, and it starts to get at some of the tricky and especially interesting parts of machine learning. The parts that are, dare we say, as much art as they are science.
The short answer is that the model is done training when additional training doesn’t improve its performance.
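One common way to operationalize “additional training doesn’t improve performance” is early stopping: keep training while the error on held-out validation data keeps dropping, and stop once it hasn’t improved for a while. The sketch below is a generic illustration of that idea; `train_one_epoch` and `validation_error` are hypothetical placeholder hooks, not a real library’s API.

```python
# Early stopping: train until validation performance stops improving.
# `patience` is how many non-improving rounds we tolerate before quitting.

def train_until_done(train_one_epoch, validation_error,
                     patience=5, max_epochs=1000):
    best_error = float("inf")
    rounds_without_improvement = 0
    for epoch in range(max_epochs):
        train_one_epoch()                  # one pass of predict/feedback/update
        error = validation_error()         # performance on held-out data
        if error < best_error:
            best_error = error
            rounds_without_improvement = 0
        else:
            rounds_without_improvement += 1
        if rounds_without_improvement >= patience:
            break                          # additional training isn't helping
    return best_error
```

Choosing the patience window, the validation data, and the error metric is exactly the kind of judgment call that makes this as much art as science.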
Building a great machine learning model usually involves lots of iteration, and it’s up to the people building it to know when it’s ready. AI is ultimately as fallible as the data and people guiding it. Large datasets and powerful computers are necessary for solving some of our most entangled problems, but so are experts, i.e. people, who understand the nuances and complexities of a dataset, the problem to be solved, and the myriad ways there are to build and train a machine learning model.
In 1997, the world was stunned when Deep Blue, a computer, beat Garry Kasparov, the reigning world champion, in a game of chess. However, in a tournament in 2005, a new kind of chess champion was born: the centaur. A centaur chess player is a team of at least one computer and one human.
So far, the centaur is a better chess player than any one person or any one machine.
Want to solve tricky problems? Our money is on a team of <span class="term" data-def="Some folks call this ‘machine-in-the-loop’, or ‘human-in-the-loop’ depending on your perspective">smart people plus powerful computers</span>.
Machine learning and neural networks help humans process data to say something about the world. The difference? The sheer amount of data that these models can now ingest and learn from surpasses the capability and speed that any one individual can achieve.
AI and neural networks are likely here to stay, so it’s worth understanding their basic functions and processes. Though this guide only scratches the surface of how these tools operate and help us, this knowledge can inform how people apply AI to modern problems, like those in the science community.
To quickly recap, broadly speaking: machine learning is one branch of AI, neural networks are one machine learning method, and deep learning describes neural networks with many hidden layers. These models learn by making predictions, getting feedback, and updating their weights, and they must be tested on data they’ve never seen before.
Remove the word climate from climate change, and we’re left with the reality of it all: change. Change at planetary scale. Admittedly, change can be difficult to account for in any given scenario, but it’s a particularly daunting task when the interactions at play involve all of the wonderfully interwoven intricacies of a planet’s inner climate.
The plus side? Data already exists around these variables that impact our climate, and scientists have often been collecting this data for decades. This means that when it comes to solving for unprecedented change that directly affects long-standing knowledge built on steady patterns, machine learning can serve as a powerful tool that builds on existing scientific models and principles.
This is one reason why there are many in the scientific community who believe AI could be a tool capable of pushing significant discovery across the board. One such field? Well, the study of how water moves through our planet: hydrology.
Up next, let’s take what we’ve learned and add the context of a specific use case. If AI is a tool, what’s an applicable project? In Part II, we’ll investigate hydrology need-to-knows and tie it back together to AI with a look at one real world example: HydroForecast.