From as early as 500 BCE, humans have recognized that some things vary together in space. This is essentially correlation, but the spatial aspect sometimes adds a special twist. Also, correlation requires evaluation of quantitative data, while this concept is not limited to quantitative characteristics. For example, Diophanes of Bithynia observed that “you can judge whether land is fit for cultivation or not, either from the soil itself or from the vegetation growing on it.” Although used frequently in the history of science (e.g. Humboldtian science), the first naming of this principle that I have found appears in a book by F.D. Hole and J.B. Campbell, published in 1985. They referred to it as spatial association. Because I am not aware of another term that covers this concept, I will continue with their use of it. Unfortunately, in the 1990s some began to use this term to describe clustering. In order to be clear, I define spatial association as the degree to which phenomena are similarly arranged over space.
The first scientific application of spatial association to soil mapping that I know of was by E.W. Hilgard. In 1860, he published his report on the ‘geology and agriculture’ of the state of Mississippi, USA. Hilgard observed that geology and vegetation type were useful indicators for predicting soil type. In 1883, V.V. Dokuchaev added climate, relief, organisms (both plants and animals), and time to that list of useful spatial predictors. Because these spatial covariates are connected to processes, thinking about their geography enabled Dokuchaev to formulate ideas about soil formation. His descriptions of these factors of soil formation were key in the establishment of modern soil science.
Coinciding with the ‘quantitative revolution,’ H. Jenny wrote a landmark book entitled Factors of Soil Formation (1941). In this book, Jenny accomplished two main things. First, he coined an acronym for the soil formation factors: CLORPT (CL=climate, O=organisms, R=relief, P=parent material, and T=time). This easy-to-remember abbreviation popularized the concept and became the standard framework for teaching about soil formation. Second, Jenny proposed a system to experimentally control geographic variables so that a single variable could be better studied. He advocated for research to be designed so that soils that formed under similar factors, except for one, could be quantitatively compared. This way, differences between the soils compared could be directly attributed to the one factor that had changed. In practice, this is a bit harder than it sounds because the different factors influence one another, but this was a greatly improved strategy for advancing soil science.
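Jenny's strategy is often written as a pair of equations. The notation below is a sketch of how the state-factor idea is commonly presented, not a quotation from the book: a soil property s is treated as a function of the five factors, and a single-factor study holds the other four approximately constant (shown as subscripts).

```latex
% General state-factor relationship: a soil property s depends on all factors.
s = f(cl,\, o,\, r,\, p,\, t,\, \ldots)

% Single-factor study (here climate): the other factors are held
% approximately constant, so differences in s are attributed to cl.
s = f(cl)_{o,\, r,\, p,\, t,\, \ldots}
```

The same pattern applies to each of the other factors in turn, for example comparing soils that differ mainly in relief while climate, organisms, parent material, and time are similar.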
Before the factors of soil formation were assigned an acronym, soil mappers were regularly using them to design their maps. Notably, Hilgard’s application of geology and vegetation as predictors was primarily focused on producing a better spatial description of where different soils were. Dokuchaev’s work prior to and after writing the list of five factors was driven by the Russian government’s desire for better soil maps. Most of the soil maps made at that time were at the continental or national scale, and the limited information available led to a heavy reliance on large-scale climate. However, later work – particularly more detailed soil maps – began to utilize the other factors as predictors of soil variation. As T.M. Bushnell synthesized these concepts – along with G. Milne’s catena concept – in the 1940s, he applied them to what he could see in aerial photographs. Those images provided more spatial information about vegetation and relief than had been previously available.
Soil mapping in the 20th century continued to build on field experience to better understand the local variations of CLORPT. It was still difficult to quantify many of the indicators for soil formation factors, so soil mappers tended to develop unique mental models of the soil landscape. These models drew on their experience in a region to identify key indicators that marked shifts from one soil series to another, usually in connection with one of the soil formation factors. However, within those mental models, certain factors tended to become emphasized due to the limited spatial information available, map scale, the purpose of the map, and the particular conditions of the area.
Today in digital soil mapping, we still utilize these concepts. Because we use many more quantitative variables – still primarily related to CLORPT – we typically describe our method as spatial regression, or something related to that. However, the geographic principle for why spatial regression works remains rooted in the idea of spatial association.
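To make the idea concrete, here is a minimal sketch of regression on spatial covariates, using only the Python standard library. Everything in it is an assumption for illustration: the two covariates (a relief index and a climate index), the coefficients, the noise level, and the sample size are invented, and the soil values are synthetic. It fits a relationship at a handful of "sampled" cells and then applies it to every cell, which only works because the soil property is similarly arranged over space as the covariates. It does not model spatial autocorrelation, which many digital soil mapping workflows handle separately.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build an augmented matrix so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_least_squares(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) beta = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


random.seed(0)

# Hypothetical covariates for 400 map cells: (relief index, climate index).
cells = [(random.random(), random.random()) for _ in range(400)]

# Invented "true" relationship: intercept, relief effect, climate effect.
TRUE_COEFS = [2.0, -1.5, 3.0]


def synthetic_soil_value(relief, climate):
    """Made-up soil property: linear in the covariates plus small noise."""
    noise = random.gauss(0.0, 0.05)
    return TRUE_COEFS[0] + TRUE_COEFS[1] * relief + TRUE_COEFS[2] * climate + noise


# Pretend only 60 cells were visited in the field.
sampled = random.sample(cells, 60)
design = [[1.0, relief, climate] for relief, climate in sampled]
observed = [synthetic_soil_value(relief, climate) for relief, climate in sampled]

# Fit at the sampled cells, then predict the soil property for every cell.
coefs = fit_least_squares(design, observed)
predicted = [coefs[0] + coefs[1] * relief + coefs[2] * climate
             for relief, climate in cells]
```

The prediction step is the part that relies on spatial association: nothing is measured at the unsampled cells except the covariates, so the map is only as good as the assumption that soil and covariates vary together in space.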