A scikit-learn compatible tabular foundation model from Google Research that performs zero-shot classification and regression on tabular datasets with mixed column types using in-context learning.
About TabFM
TabFM (Tabular Foundation Model) is an open-source library from Google Research that brings foundation model capabilities to tabular machine learning tasks. Released under the Apache 2.0 license, it provides a scikit-learn compatible interface for zero-shot classification and regression on datasets with mixed numerical and categorical column types. The repository notes explicitly that this is not an officially supported Google product.
What It Is
TabFM is a pre-trained tabular foundation model that removes the need to train model parameters on your specific dataset. Instead of fitting a model from scratch, it uses in-context learning: it reads your training data as "context" at inference time and predicts new test samples without a training run. This places it in the emerging category of tabular foundation models, analogous to how large language models generalize across text tasks without task-specific fine-tuning.
How In-Context Learning Works for Tabular Data
Rather than gradient-based training on a target dataset, TabFM ingests training rows as context and uses that context to predict labels for unseen test rows. The workflow is:
- Call .fit(X_train, y_train). This prepares ordinal encoders and numerical scalers, not model weights.
- Call .predict(X_test) or .predict_proba(X_test). The model reads the training data as context and returns predictions immediately.
- No epochs or hyperparameter tuning are involved, and inference can run on CPU without a GPU.
This makes TabFM particularly useful for rapid prototyping, small-data scenarios, and situations where training time is a constraint.
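The fit-then-predict pattern above can be sketched with a toy stand-in. The class below is hypothetical and is not TabFM's model: it swaps the pre-trained network for a simple nearest-neighbour vote, purely to show that fit() only prepares an ordinal encoder and a numerical scaler and stores the training rows, while predict() consults those rows as context.

```python
from collections import Counter
from math import sqrt


class ToyContextClassifier:
    """Hypothetical stand-in for the fit/predict pattern; NOT TabFM's model."""

    def fit(self, X, y):
        n_cols = len(X[0])
        # A column is numerical if every value is an int or float.
        self.numeric_ = [
            all(isinstance(row[j], (int, float)) for row in X) for j in range(n_cols)
        ]
        self.stats_ = {}  # column index -> (mean, std) for numerical scaling
        self.codes_ = {}  # column index -> {category: code} for ordinal encoding
        for j in range(n_cols):
            col = [row[j] for row in X]
            if self.numeric_[j]:
                mean = sum(col) / len(col)
                std = sqrt(sum((v - mean) ** 2 for v in col) / len(col)) or 1.0
                self.stats_[j] = (mean, std)
            else:
                self.codes_[j] = {c: i for i, c in enumerate(sorted(set(col)))}
        # No weights are learned: the encoded training rows are kept as context.
        self.context_X_ = [self._transform(row) for row in X]
        self.context_y_ = list(y)
        return self

    def _transform(self, row):
        out = []
        for j, v in enumerate(row):
            if self.numeric_[j]:
                mean, std = self.stats_[j]
                out.append((v - mean) / std)
            else:
                out.append(self.codes_[j].get(v, -1))  # unseen category -> -1
        return out

    def _distance(self, a, b):
        total = 0.0
        for j, (u, v) in enumerate(zip(a, b)):
            # Squared difference for numbers, 0/1 mismatch for categories.
            total += (u - v) ** 2 if self.numeric_[j] else float(u != v)
        return total

    def predict(self, X, k=3):
        preds = []
        for row in X:
            q = self._transform(row)
            order = sorted(
                range(len(self.context_X_)),
                key=lambda i: self._distance(q, self.context_X_[i]),
            )
            votes = Counter(self.context_y_[i] for i in order[:k])
            preds.append(votes.most_common(1)[0][0])
        return preds


X_train = [[1.0, "red"], [1.2, "red"], [0.9, "red"],
           [5.0, "blue"], [5.3, "blue"], [4.8, "blue"]]
y_train = ["a", "a", "a", "b", "b", "b"]
X_test = [[1.1, "red"], [5.1, "blue"]]

clf = ToyContextClassifier().fit(X_train, y_train)
predictions = clf.predict(X_test)
print(predictions)
```

A real tabular foundation model replaces the neighbour vote with a pre-trained network that attends over the context rows, but the division of labour between the two calls is the same as described above.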
Backend and Installation Options
TabFM supports two compute backends, giving users flexibility based on their existing infrastructure:
- JAX backend: supports CPU and GPU (CUDA) via pip install -e .[jax] or pip install -e .[jax,cuda]; requires JAX 0.10.1 and Flax 0.12.7, using the modern flax.nnx API
- PyTorch backend: supports CPU and GPU via pip install -e .[pytorch]; requires PyTorch 2.12.1 or a compatible GPU build
Both backends load the same pre-trained weights from Hugging Face Hub automatically. Python 3.11 or higher is required.
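Before installing, it can help to check which backend dependencies are already present. The sketch below uses only the Python standard library. The helper name available_backends and the BACKEND_MODULES mapping are illustrative. How TabFM itself selects a backend is not described here, so this only inspects the environment.

```python
import importlib.util
import sys

# Import names each backend depends on. Note that PyTorch is imported as "torch".
BACKEND_MODULES = {
    "jax": ("jax", "flax"),
    "pytorch": ("torch",),
}


def available_backends():
    """Return the backends whose required packages can be found (illustrative helper)."""
    found = []
    for backend, modules in BACKEND_MODULES.items():
        if all(importlib.util.find_spec(name) is not None for name in modules):
            found.append(backend)
    return found


python_ok = sys.version_info >= (3, 11)  # TabFM requires Python 3.11 or higher
backends = available_backends()

print(f"Python >= 3.11: {python_ok}")
print(f"Installed backends: {backends or 'none'}")
```

This check confirms only that the packages are importable, not that their versions match the pinned requirements listed above.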
Update: TabFM v1.0.0
The repository ships pre-trained weights for TabFM v1.0.0, the version current at the library's initial public release. The GitHub repository was created in June 2026 and last updated in early July 2026, which suggests an active early-release phase. The library includes runnable example scripts for classification and regression under the examples/ directory, unit tests that run under both unittest and Bazel, and evaluation results in the results/ directory.
Scikit-Learn Compatibility and Audience
TabFM exposes TabFMClassifier and TabFMRegressor classes that follow the standard scikit-learn estimator API: .fit() and .predict() on both, plus .predict_proba() for classification. Data scientists already familiar with scikit-learn workflows can therefore adopt TabFM without learning a new interface. The library targets ML practitioners who work with structured/tabular data and want to use pre-trained model capabilities without the overhead of full model training pipelines.
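Because the interface is the familiar estimator one, code written against fit/predict/predict_proba should accept a TabFM estimator wherever it accepts any other. The sketch below is a minimal illustration of that idea. The run_zero_shot helper and the MajorityClassifier placeholder are hypothetical, and the import path and constructor shown in the comment are assumptions, not confirmed details of the library.

```python
def run_zero_shot(estimator, X_train, y_train, X_test):
    """Run the fit -> predict workflow on any scikit-learn style estimator."""
    estimator.fit(X_train, y_train)
    labels = estimator.predict(X_test)
    # predict_proba is only expected on classifiers, so it is treated as optional.
    probs = estimator.predict_proba(X_test) if hasattr(estimator, "predict_proba") else None
    return labels, probs


class MajorityClassifier:
    """Placeholder estimator so the sketch runs without TabFM installed."""

    def fit(self, X, y):
        y = list(y)
        self.classes_ = sorted(set(y))
        self.priors_ = [y.count(c) / len(y) for c in self.classes_]
        return self

    def predict(self, X):
        best = self.classes_[self.priors_.index(max(self.priors_))]
        return [best for _ in X]

    def predict_proba(self, X):
        return [list(self.priors_) for _ in X]


# With TabFM installed, the placeholder would be swapped for the real class, e.g.
#   from tabfm import TabFMClassifier  # import path is an assumption
#   estimator = TabFMClassifier()      # default constructor is an assumption
estimator = MajorityClassifier()

X_train = [[1, "x"], [2, "y"], [3, "x"]]
y_train = [0, 1, 1]
X_test = [[2, "x"]]

labels, probs = run_zero_shot(estimator, X_train, y_train, X_test)
print(labels, probs)
```

Keeping the workflow in a helper that depends only on the estimator interface makes it straightforward to compare a TabFM estimator against existing baselines on the same split.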
Pricing
Open Source
Free to use, modify, and distribute under the Apache License 2.0.
- Zero-shot classification and regression
- JAX and PyTorch backends
- Pre-trained weights via Hugging Face Hub
- scikit-learn compatible API
- Full source code access
Capabilities
Key Features
- Zero-shot classification on tabular datasets
- Zero-shot regression on tabular datasets
- In-context learning — no training on target dataset required
- scikit-learn compatible API (TabFMClassifier, TabFMRegressor)
- Mixed column type support (numerical and categorical)
- JAX backend with CPU and GPU (CUDA) support
- PyTorch backend with CPU and GPU support
- Automatic pre-trained weight download from Hugging Face Hub
- Ordinal encoding and numerical scaling built into fit()
- predict_proba() support for classification
- Runnable example scripts for classification and regression
- Unit tests compatible with unittest and Bazel
