# TabFM

> A scikit-learn compatible tabular foundation model from Google Research that performs zero-shot classification and regression on tabular datasets with mixed column types using in-context learning.

TabFM (Tabular Foundation Model) is an open-source library from Google Research that brings foundation model capabilities to tabular machine learning tasks. Released under the Apache 2.0 license, it provides a scikit-learn compatible interface for zero-shot classification and regression on datasets with mixed numerical and categorical column types. The repository notes explicitly that this is not an officially supported Google product.

## What It Is

TabFM is a pre-trained tabular foundation model that requires no parameter training on your specific dataset. Instead of fitting a model from scratch, it uses in-context learning: the training data is read as "context" at inference time, and predictions for new test samples are produced without a training loop. This approach places it in the emerging category of tabular foundation models, analogous to how large language models generalize across text tasks without task-specific fine-tuning.

## How In-Context Learning Works for Tabular Data

Rather than gradient-based training on a target dataset, TabFM ingests training rows as context and uses that context to predict labels for unseen test rows. The workflow is:

- Call `.fit(X_train, y_train)`: this prepares ordinal encoders and numerical scalers, not model weights
- Call `.predict(X_test)` or `.predict_proba(X_test)`: the model reads the training data as context and returns predictions without a separate training run
- No epochs, no hyperparameter tuning, and no GPU requirement, since inference can run on CPU

This makes TabFM particularly useful for rapid prototyping, small-data scenarios, and situations where training time is a constraint.
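The split between `.fit()` (preprocessing state only) and `.predict()` (reading the stored training rows as context) can be sketched with a toy stand-in. Everything in this block is hypothetical and written for illustration: `ToyInContextClassifier` is not part of TabFM, and a 1-nearest-neighbour lookup takes the place of the pre-trained network so that the example runs with the standard library alone.

```python
"""Toy illustration of the fit-then-predict-in-context pattern.

This is NOT TabFM's model: a 1-nearest-neighbour lookup stands in for the
pre-trained network so the example runs with the standard library only.
"""
from statistics import mean, pstdev


class ToyInContextClassifier:
    """Hypothetical stand-in: fit() builds preprocessing state, no weights."""

    def fit(self, X, y):
        n_cols = len(X[0])
        # Columns whose values are all strings are treated as categorical.
        self.cat_cols_ = [j for j in range(n_cols)
                          if all(isinstance(row[j], str) for row in X)]
        # Ordinal encoder: map each category to an integer code.
        self.codes_ = {
            j: {v: i for i, v in enumerate(sorted({row[j] for row in X}))}
            for j in self.cat_cols_
        }
        # Numerical scaler: per-column mean and standard deviation.
        self.stats_ = {}
        for j in range(n_cols):
            if j not in self.cat_cols_:
                col = [row[j] for row in X]
                self.stats_[j] = (mean(col), pstdev(col) or 1.0)
        # Keep the encoded training rows as context for predict().
        self.context_ = [self._encode(row) for row in X]
        self.labels_ = list(y)
        return self

    def _encode(self, row):
        out = []
        for j, v in enumerate(row):
            if j in self.codes_:
                # Unseen categories get the code -1.
                out.append(float(self.codes_[j].get(v, -1)))
            else:
                mu, sigma = self.stats_[j]
                out.append((v - mu) / sigma)
        return out

    def predict(self, X):
        preds = []
        for row in X:
            q = self._encode(row)
            # Read the stored context: pick the label of the closest row.
            dists = [sum((a - b) ** 2 for a, b in zip(q, c))
                     for c in self.context_]
            preds.append(self.labels_[dists.index(min(dists))])
        return preds


X_train = [[25, "red"], [30, "red"], [60, "blue"], [65, "blue"]]
y_train = ["young", "young", "old", "old"]
clf = ToyInContextClassifier().fit(X_train, y_train)
predictions = clf.predict([[28, "red"], [62, "blue"]])
```

Note that `fit()` here only records category codes, column statistics, and the encoded rows. All of the prediction work happens when `predict()` consults that stored context, which mirrors the division of labour described above, although the real model replaces the distance lookup with a pre-trained network.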

## Backend and Installation Options

TabFM supports two compute backends, so it can slot into an existing JAX or PyTorch environment. The `-e .` in the commands is pip's editable install of a local checkout, so run them from the root of a cloned copy of the repository:

- **JAX backend**: supports CPU and GPU (CUDA) via `pip install -e .[jax]` or `pip install -e .[jax,cuda]`; requires JAX 0.10.1 and Flax 0.12.7, and uses the `flax.nnx` API
- **PyTorch backend**: supports CPU and GPU via `pip install -e .[pytorch]`; requires PyTorch 2.12.1 or a compatible GPU build

Both backends load the same pre-trained weights from Hugging Face Hub automatically. Python 3.11 or higher is required.
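Before installing, it can help to check which backend packages are already importable and whether the interpreter meets the Python 3.11 requirement. The following is a minimal sketch using only the standard library. The helper names `available_backends` and `python_is_supported` are made up for this example and are not part of TabFM.

```python
import importlib.util
import sys


def available_backends():
    """Return which backend packages can be imported in this environment."""
    return {
        "jax": importlib.util.find_spec("jax") is not None,
        # The PyTorch distribution is imported under the module name "torch".
        "pytorch": importlib.util.find_spec("torch") is not None,
    }


def python_is_supported(version_info=sys.version_info):
    """True when the interpreter meets the stated Python 3.11+ requirement."""
    return tuple(version_info[:2]) >= (3, 11)


backends = available_backends()
print("Python 3.11+:", python_is_supported())
print("Importable backends:", [name for name, ok in backends.items() if ok])
```

This only reports whether the packages can be found. It does not check the specific JAX, Flax, or PyTorch versions listed above.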

## Update: TabFM v1.0.0

The repository ships pre-trained weights for the **TabFM v1.0.0** release, the current version at the time of the library's initial public release. The GitHub repository was created in June 2026 and last updated in early July 2026, which suggests the project is still in an early-release phase. The library includes runnable example scripts for both classification and regression tasks under the `examples/` directory, unit tests compatible with both `unittest` and Bazel, and evaluation results stored in the `results/` directory.

## Scikit-Learn Compatibility and Audience

TabFM exposes `TabFMClassifier` and `TabFMRegressor` classes that follow the standard scikit-learn estimator API (`.fit()`, `.predict()`, and, for classification, `.predict_proba()`). Data scientists already familiar with scikit-learn workflows can therefore adopt TabFM without learning a new interface. The library targets ML practitioners who work with structured/tabular data and want to use a pre-trained model without the overhead of a full model training pipeline.
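Because the interface is the familiar fit/predict contract, evaluation code can be written against that contract rather than against a specific model. The sketch below is illustrative only: `holdout_accuracy` and `MajorityClassStub` are hypothetical names, the stub stands in for a real estimator so the block runs without TabFM installed, and the commented-out import path and default constructor for `TabFMClassifier` are assumptions that should be checked against the repository.

```python
def holdout_accuracy(estimator, X_train, y_train, X_test, y_test):
    """Fit on the context rows, predict the held-out rows, return accuracy."""
    estimator.fit(X_train, y_train)
    predictions = estimator.predict(X_test)
    correct = sum(p == t for p, t in zip(predictions, y_test))
    return correct / len(y_test)


class MajorityClassStub:
    """Hypothetical placeholder following the fit/predict/predict_proba contract."""

    def fit(self, X, y):
        self.classes_ = sorted(set(y))
        counts = [list(y).count(c) for c in self.classes_]
        self.proba_ = [n / len(y) for n in counts]
        return self

    def predict_proba(self, X):
        # One row of class probabilities per input row.
        return [list(self.proba_) for _ in X]

    def predict(self, X):
        best = self.classes_[self.proba_.index(max(self.proba_))]
        return [best for _ in X]


X_train = [[1.0, "a"], [2.0, "b"], [3.0, "a"]]
y_train = [0, 1, 1]
X_test = [[1.5, "a"], [2.5, "b"]]
y_test = [1, 0]

# With TabFM installed, the stub could be swapped for the real estimator, e.g.
#   from tabfm import TabFMClassifier   # import path is an assumption
#   model = TabFMClassifier()           # default constructor is an assumption
model = MajorityClassStub()
accuracy = holdout_accuracy(model, X_train, y_train, X_test, y_test)
probabilities = model.predict_proba(X_test)
```

Writing the helper against the estimator contract means the same function can score a scikit-learn baseline and a TabFM estimator side by side, provided both accept the same input format.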

## Features
- Zero-shot classification on tabular datasets
- Zero-shot regression on tabular datasets
- In-context learning — no training on target dataset required
- scikit-learn compatible API (`TabFMClassifier`, `TabFMRegressor`)
- Mixed column type support (numerical and categorical)
- JAX backend with CPU and GPU (CUDA) support
- PyTorch backend with CPU and GPU support
- Automatic pre-trained weight download from Hugging Face Hub
- Ordinal encoding and numerical scaling built into `fit()`
- `predict_proba()` support for classification
- Runnable example scripts for classification and regression
- Unit tests compatible with unittest and Bazel

## Integrations
scikit-learn, JAX, Flax (flax.nnx), PyTorch, Hugging Face Hub, pandas, numpy, Bazel

## Platforms
CLI, API, DEVELOPER_SDK

## Pricing
Open Source

## Version
v1.0.0

## Links
- Website: https://github.com/google-research/tabfm
- Repository: https://github.com/google-research/tabfm
- EveryDev.ai: https://www.everydev.ai/tools/tabfm
