TabFM

Name: TabFM
Availability: OnlineOnly
Author: Google Research

A scikit-learn compatible tabular foundation model from Google Research that performs zero-shot classification and regression on tabular datasets with mixed column types using in-context learning.

At a Glance

Pricing

Open Source

Free to use, modify, and distribute under the Apache License 2.0.

Available On

CLI

API

SDK

Google Research, Mountain View, CA. Est. 1998. $91B+ raised.

Listed Jul 2026

About TabFM

TabFM (Tabular Foundation Model) is an open-source library from Google Research that brings foundation model capabilities to tabular machine learning tasks. Released under the Apache 2.0 license, it provides a scikit-learn compatible interface for zero-shot classification and regression on datasets with mixed numerical and categorical column types. The repository notes explicitly that this is not an officially supported Google product.

What It Is

TabFM is a pre-trained tabular foundation model that removes the need to train model parameters on your specific dataset. Instead of fitting a model from scratch, it uses in-context learning: it reads your training data as "context" at inference time and predicts labels for new test samples without a separate training phase. This places it in the emerging category of tabular foundation models, analogous to how large language models generalize across text tasks without task-specific fine-tuning.

How In-Context Learning Works for Tabular Data

Rather than gradient-based training on a target dataset, TabFM ingests training rows as context and uses that context to predict labels for unseen test rows. The workflow is:

Call .fit(X_train, y_train): this prepares ordinal encoders and numerical scalers, not model weights
Call .predict(X_test) or .predict_proba(X_test): the model reads the training data as context and returns predictions directly
No epochs or hyperparameter tuning are involved, and inference can run on CPU without a GPU
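
The split between fit() and predict() described above can be sketched with a toy stand-in. This is not TabFM and does not use its code: the class name ToyInContextClassifier is hypothetical, and a nearest-neighbour vote stands in for the pre-trained model purely so the example runs with the standard library. What it illustrates is that fit() only builds encoders and scalers and stores the training rows, while all label inference happens at predict time against that stored context.

```python
import math
from collections import Counter


class ToyInContextClassifier:
    """Hypothetical stand-in, NOT TabFM: a nearest-neighbour vote plays the
    role of the pre-trained model so the fit/predict split can be run."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        # Preprocessing state only: no model weights are learned here.
        self.categories_ = {}  # column index -> {category: ordinal code}
        self.scales_ = {}      # column index -> (mean, std)
        for j in range(len(X[0])):
            col = [row[j] for row in X]
            if isinstance(col[0], str):
                self.categories_[j] = {c: i for i, c in enumerate(sorted(set(col)))}
            else:
                mean = sum(col) / len(col)
                std = math.sqrt(sum((v - mean) ** 2 for v in col) / len(col)) or 1.0
                self.scales_[j] = (mean, std)
        # The encoded training rows are kept as context for prediction time.
        self.context_X_ = [self._encode(row) for row in X]
        self.context_y_ = list(y)
        self.classes_ = sorted(set(y))
        return self

    def _encode(self, row):
        out = []
        for j, v in enumerate(row):
            if j in self.categories_:
                out.append(float(self.categories_[j].get(v, -1)))  # -1 = unseen category
            else:
                mean, std = self.scales_[j]
                out.append((v - mean) / std)
        return out

    def predict_proba(self, X):
        probs = []
        for row in X:
            q = self._encode(row)
            # Rank the stored context rows by distance to the query row.
            nearest = sorted(
                zip(self.context_X_, self.context_y_),
                key=lambda pair: math.dist(q, pair[0]),
            )[: self.k]
            votes = Counter(label for _, label in nearest)
            probs.append([votes[c] / len(nearest) for c in self.classes_])
        return probs

    def predict(self, X):
        return [self.classes_[p.index(max(p))] for p in self.predict_proba(X)]


# Mixed column types: one numerical column, one categorical column.
X_train = [[25, "red"], [30, "red"], [60, "blue"], [65, "blue"]]
y_train = ["young", "young", "old", "old"]

clf = ToyInContextClassifier(k=3).fit(X_train, y_train)
preds = clf.predict([[28, "red"], [62, "blue"]])
print(preds)
```

A real tabular foundation model replaces the distance-based vote with a forward pass of a pre-trained network over the context rows, but the calling pattern is the same.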

This makes TabFM particularly useful for rapid prototyping, small-data scenarios, and situations where training time is a constraint.

Backend and Installation Options

TabFM supports two compute backends, giving users flexibility based on their existing infrastructure:

JAX backend: supports CPU and GPU (CUDA) via pip install -e .[jax] or pip install -e .[jax,cuda]; requires JAX 0.10.1 and Flax 0.12.7, and uses the flax.nnx API
PyTorch backend: supports CPU and GPU via pip install -e .[pytorch]; requires PyTorch 2.12.1 or a compatible GPU build

Both backends load the same pre-trained weights from Hugging Face Hub automatically. Python 3.11 or higher is required.
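
Before choosing an install extra, it can help to check what the current environment already provides. The following is a minimal preflight sketch using only the standard library. The helper names available_backends and python_is_supported are hypothetical, and the module names jax, flax, and torch are assumed to be the usual import names for those packages; nothing here calls TabFM itself.

```python
import importlib.util
import sys


def available_backends():
    """Report which backend modules can be imported in this environment."""
    # Assumed import names for JAX, Flax, and PyTorch.
    modules = {"jax": ("jax", "flax"), "pytorch": ("torch",)}
    return {
        backend: all(importlib.util.find_spec(m) is not None for m in names)
        for backend, names in modules.items()
    }


def python_is_supported(minimum=(3, 11)):
    """The listing states that Python 3.11 or higher is required."""
    return sys.version_info[:2] >= minimum


status = available_backends()
print("Python OK:", python_is_supported())
print("Backends found:", [name for name, ok in status.items() if ok] or "none")
```

If neither backend is found, either of the pip extras listed above would be the next step.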

Update: TabFM v1.0.0

The repository ships pre-trained weights for TabFM v1.0.0, the version current at the library's initial public release. The GitHub repository was created in June 2026 and last updated in early July 2026, which suggests an active early-release phase. It includes runnable example scripts for classification and regression under the examples/ directory, unit tests that run under both unittest and Bazel, and evaluation results in the results/ directory.

Scikit-Learn Compatibility and Audience

TabFM exposes TabFMClassifier and TabFMRegressor classes that follow the standard scikit-learn estimator API: .fit(), .predict(), and, for classification, .predict_proba(). Data scientists already familiar with scikit-learn workflows can therefore adopt TabFM without learning a new interface. The library targets ML practitioners who work with structured/tabular data and want to use a pre-trained model without the overhead of a full training pipeline.
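
One practical consequence of this compatibility is that evaluation code written against the estimator interface does not need to know which model it is given. The sketch below assumes only that an object has fit() and predict(). The helper holdout_accuracy and the MajorityClassBaseline class are hypothetical and exist just to make the example runnable. In principle a TabFMClassifier instance could be passed in the same position, but its import path is not stated in this listing, so it is left out here.

```python
from collections import Counter


def holdout_accuracy(estimator, X, y, n_test):
    """Fit on the leading rows and score on the last n_test rows."""
    X_train, X_test = X[:-n_test], X[-n_test:]
    y_train, y_test = y[:-n_test], y[-n_test:]
    estimator.fit(X_train, y_train)
    predictions = estimator.predict(X_test)
    hits = sum(1 for p, t in zip(predictions, y_test) if p == t)
    return hits / len(y_test)


class MajorityClassBaseline:
    """Hypothetical placeholder estimator used only to exercise the helper."""

    def fit(self, X, y):
        # Remember the most frequent training label.
        self.majority_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.majority_ for _ in X]


X = [[1, "a"], [2, "a"], [3, "b"], [4, "a"], [5, "a"], [6, "b"]]
y = [0, 0, 1, 0, 0, 1]

score = holdout_accuracy(MajorityClassBaseline(), X, y, n_test=2)
print(score)
```

Swapping in a different estimator changes only the first argument, which is the point of following the shared interface.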

Pricing

Open Source

Free to use, modify, and distribute under the Apache License 2.0.

Zero-shot classification and regression
JAX and PyTorch backends
Pre-trained weights via Hugging Face Hub
scikit-learn compatible API
Full source code access

Capabilities

Key Features

Zero-shot classification on tabular datasets
Zero-shot regression on tabular datasets
In-context learning — no training on target dataset required
scikit-learn compatible API (TabFMClassifier, TabFMRegressor)
Mixed column type support (numerical and categorical)
JAX backend with CPU and GPU (CUDA) support
PyTorch backend with CPU and GPU support
Automatic pre-trained weight download from Hugging Face Hub
Ordinal encoding and numerical scaling built into fit()
predict_proba() support for classification
Runnable example scripts for classification and regression
Unit tests compatible with unittest and Bazel

Integrations

scikit-learn

JAX

Flax (flax.nnx)

PyTorch

Hugging Face Hub

pandas

numpy

Bazel
