Open source leaders take investment to commercialize a tensor data platform that accelerates AI/ML modeling, research, and innovation for Earth data
Earthmover PBC, a New York-based software company founded by climate scientists and open source pioneers that is defining the modern workflow for climate, weather, and other tensor data, announces its seed round of $7.2M to enable faster data exploration, AI/ML model development, and data product delivery for understanding, modeling, and predicting the physical world. The funding announcement comes on the heels of Earthmover's recent launch of the open-source data format Icechunk 1.0 and the geospatial API gateway product Flux, and of its ongoing collaboration with NASA, which provided a 100x performance improvement in NASA's retrieval of Earth observation data.
The seed round was led by Lowercarbon Capital and joined by Costanoa Ventures and Preston-Werner Ventures. "Earthmover's open-source leadership and deep domain expertise is the recipe for success in data infrastructure companies," said Tom Preston-Werner, Earthmover investor and co-founder of GitHub.
Climate change is accelerating rapidly, leading to increased volatility and uncertainty in energy, commodities, agriculture, insurance, and real-estate markets, among many others. Gallagher Re reported $402 billion in weather disaster damage from 27 extreme weather events in 2024, 20% higher than the 10-year inflation-adjusted average. In response, a growing number of organizations, from innovative AI startups to energy traders to massive government agencies, are seeking to use AI to upgrade their ability to model the physical world in real time and make better decisions, drawing on vast volumes of high-velocity Earth-system data. However, these efforts are stymied by outdated technology and inefficient data systems. Earthmover makes physical-world data AI-ready in the cloud, so teams can train and deploy models on petabyte-scale weather, climate, and geoscience datasets.
"The lack of purpose-built cloud data infrastructure for physical-world AI acts as a drag on our entire space, limiting the value of Earth System data and ultimately slowing climate change adaptation and mitigation efforts," said CEO and co-founder Ryan Abernathey. “We are helping people get answers from data in seconds rather than days.”
Earthmover's platform is built on the concept of multidimensional arrays (a.k.a. tensors), rather than tables, as the core data model. "Where other platforms see 'unstructured data', Earthmover brings a native understanding of the scientific data formats, data structures, and modeling techniques common in weather, climate, and geospatial data analytics," said CTO and co-founder Joe Hamman. "This unlocks incredible performance as well as cost savings for our customers."
A leading use case for Earthmover’s technology is AI weather forecasting for renewable energy supply and demand prediction. “We evaluated multiple datalake architectures, but none of the tabular solutions could meet the demands of our AI weather models,” said Galen Yacalis, Lead Scientist at the RWE AI Research Laboratory. “Earthmover’s array-native approach was the only one that scaled with our data and aligned with how we actually do science.”
The Earthmover platform is composed of three key components: Icechunk, a cloud-optimized array storage format; Arraylake, a data management layer for teams; and Flux, a set of highly performant API access methods. First, a customer's multidimensional array data is stored in the Icechunk format in cloud object storage (compatible with all major cloud platforms). Inspired by the popular Apache Iceberg table format, Icechunk is an open-source, cloud-native transactional storage engine for multidimensional array data, designed for high-performance analytics and AI workloads. Arraylake provides a version-controlled data catalog, management, and data governance layer for analytics and AI teams. And Earthmover’s latest product, Flux, provides performant geospatial query APIs, helping teams deliver data to downstream applications faster. Together, these components compress the time teams need to set up, run, and deliver their own ML or AI models from months to days while preserving data provenance and reducing compute costs.
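For readers who want a concrete picture of what "transactional" storage for chunked arrays can mean, the following is a minimal sketch in plain Python. It is a toy illustration only: `ToyArrayStore`, `commit`, and `read` are hypothetical names invented for this example and are not the Icechunk, Arraylake, or Flux API. The sketch assumes that an array is split into fixed-size chunks, that chunk payloads are never modified once written, and that each commit produces a new snapshot mapping chunk positions to payloads.

```python
class ToyArrayStore:
    """Hypothetical in-memory model of chunked, versioned array storage."""

    def __init__(self, chunk_shape, fill_value=0.0):
        self.chunk_shape = chunk_shape
        self.fill_value = fill_value
        self._chunks = []       # append-only list of chunk payloads (dicts)
        self._snapshots = [{}]  # snapshot 0 is empty; each maps chunk key -> payload position

    def _chunk_key(self, index):
        # Which chunk an element belongs to, e.g. (3, 3) -> (1, 1) for 2x2 chunks.
        return tuple(i // c for i, c in zip(index, self.chunk_shape))

    def commit(self, writes):
        """Apply all {index: value} writes as one new snapshot; return its id."""
        manifest = dict(self._snapshots[-1])  # copy, so older snapshots stay intact
        staged = {}
        for index, value in writes.items():
            key = self._chunk_key(index)
            if key not in staged:
                old = self._chunks[manifest[key]] if key in manifest else {}
                staged[key] = dict(old)  # copy-on-write: never mutate a stored chunk
            staged[key][index] = value
        for key, payload in staged.items():
            self._chunks.append(payload)
            manifest[key] = len(self._chunks) - 1
        self._snapshots.append(manifest)
        return len(self._snapshots) - 1

    def read(self, index, snapshot=-1):
        """Read one element from the latest snapshot, or from an earlier one."""
        manifest = self._snapshots[snapshot]
        key = self._chunk_key(index)
        if key not in manifest:
            return self.fill_value
        return self._chunks[manifest[key]].get(index, self.fill_value)


store = ToyArrayStore(chunk_shape=(2, 2))
v1 = store.commit({(0, 0): 1.5, (3, 3): 2.5})  # touches two chunks
v2 = store.commit({(0, 0): 9.0})               # rewrites only one chunk

print(store.read((0, 0)))               # value in the latest snapshot
print(store.read((0, 0), snapshot=v1))  # value as of the first commit
```

In this sketch the second commit writes a new copy of only the chunk it touches and reuses the other chunk unchanged, so earlier snapshots remain readable. That is one simple way to get version history without duplicating a whole dataset; a production engine would also need to handle concurrent writers, compression, and object storage.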
“Industry leaders and CTOs want cloud agnostic, maximally flexible infrastructure data platform choices, especially given the speed at which AI is forcing them to operate,” said Tony Liu, Partner at Costanoa Ventures.
Accelerating innovation at exabyte scale
"Ryan and Joe are the perfect founders to deliver on a next generation platform for n-dimensional array data, as evidenced by the support Earthmover has received from the Chan Zuckerberg Initiative for their work on open source tools critical to science, including potential applications in bioimaging, radar, and AI models of the physical world," said Tony Liu, Partner at Costanoa Ventures.
Earthmover's founders are leaders in the open source software and data ecosystem and leading contributors to the Python stack for scientific data. Before founding Earthmover, Abernathey worked as a professor at Columbia University, leading research on ocean circulation and climate, while Hamman worked as a scientist at the National Center for Atmospheric Research before co-founding CarbonPlan, a data-driven climate think tank. The founders met while working on Xarray, a foundational open-source software package for weather, climate, and geospatial data analytics that has been downloaded over 6 million times and is used by organizations including NASA, ESA, Google, NVIDIA, and Planet. The Earthmover team is the leading contributor to Zarr, an open source data format for compressed, chunked array storage, and launched the open source project Icechunk, a cloud-native transactional storage engine for tensor data.
Beyond Geospatial
Earthmover’s ambitions extend far beyond the geospatial category. The team’s years of experience in the Scientific Python community revealed how the same core data structures–tensors–appear across nearly all scientific and computational fields. (Zarr, for example, is used heavily in bioinformatics and microscopy, neuroscience, and fusion research.) As autonomous multi-modal sensor platforms, from self-driving cars to ocean-going robots, continue to gather more data about the physical world, the need for scalable cloud-native tensor data infrastructure will only grow.
"Earthmover provides the fastest, cheapest, and most advanced cloud-native infrastructure to manage climate and weather data, an increasingly essential asset for critical industries building physical AI," said Shawn Xu, Partner at Lowercarbon Capital.
About Earthmover
Earthmover PBC is building the cloud-native tensor data platform. The company's exabyte-scale data management platform helps organizations working with weather, climate, and geospatial data enhance collaboration, accelerate R&D, streamline operations, and reduce costs. Earthmover’s mission is to empower people to use scientific data to solve humanity’s greatest challenges.
View source version on businesswire.com: https://www.businesswire.com/news/home/20250919639500/en/
Contacts
Inquiries: hello@earthmover.io