Overview
The SDK ships sensible defaults for every moving piece, so most users only supply a model. This guide covers the three classes you can override and how source extraction works.
Source Extraction — One Mechanism, Three Use Cases
For each class you pass (model=, executor=, persistor=), the SDK:
- Calls inspect.getsource(cls) to grab the class body.
- If that fails (happens in some notebooks where torch patches inspect), falls back to scanning IPython cell history.
- Also grabs every import statement from the same cell so those symbols are available on the worker.
- Writes the result to scripts/<filename>.py in your bucket.
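The steps above can be sketched roughly as follows. This illustrates the described behaviour and is not the SDK's actual code: find_in_cells and extract_source are hypothetical names, and a plain list of cell strings stands in for IPython history.

```python
import ast
import inspect


def find_in_cells(class_name, cells):
    """Search cell sources (newest first) for a class definition.

    Returns (class_source, import_lines) taken from the cell that defines
    the class, or None if no cell defines it.
    """
    for cell in reversed(cells):
        try:
            tree = ast.parse(cell)
        except SyntaxError:
            continue  # e.g. cells that contain IPython magics
        for node in tree.body:
            if isinstance(node, ast.ClassDef) and node.name == class_name:
                imports = [
                    ast.get_source_segment(cell, stmt)
                    for stmt in tree.body
                    if isinstance(stmt, (ast.Import, ast.ImportFrom))
                ]
                return ast.get_source_segment(cell, node), imports
    return None


def extract_source(cls, cells):
    """Prefer inspect.getsource; fall back to the cell history if it fails."""
    found = find_in_cells(cls.__name__, cells)
    try:
        class_source = inspect.getsource(cls)
    except (OSError, TypeError):
        if found is None:
            raise RuntimeError(
                f"Could not extract source of '{cls.__name__}'"
            ) from None
        class_source = found[0]
    # Only top-level imports from the defining cell are carried along.
    imports = found[1] if found else []
    return "\n".join([*imports, "", class_source])
```

Note that only imports in the cell that defines the class are collected in this sketch, which is why imports placed elsewhere never reach the worker.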
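This fails: the imports sit in an earlier cell and the class is defined in a later one, so the class source reaches the worker without them.
This works: the imports and the class are defined in the same cell.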
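A small simulation of the two cases, under stated assumptions: the strings below stand in for the extracted source that reaches the worker, OrderedDict plays the role that nn (from import torch.nn as nn) would play in a real notebook, and load_on_worker is a hypothetical helper, not an SDK function.

```python
# Extracted source when the import lived in an earlier cell: only the class
# cell is captured, so the base class name is unknown on the worker.
FAILS = """
class MyModel(OrderedDict):
    pass
"""

# Extracted source when the import and the class share one cell.
WORKS = """
from collections import OrderedDict

class MyModel(OrderedDict):
    pass
"""


def load_on_worker(source):
    """Execute extracted source in a fresh namespace, as a worker would."""
    namespace = {}
    exec(source, namespace)
    return namespace["MyModel"]


try:
    load_on_worker(FAILS)
except NameError as err:
    print(f"fails: {err}")

print(f"works: {load_on_worker(WORKS).__name__}")
```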
Model — the Only Required Class
Constraints
- Constructor takes num_classes: int.
- forward(self, x) accepts (B, 3, img_size, img_size) and returns logits (B, num_classes).
Extra constructor kwargs (e.g. pretrained=True, dropout=0.1) go through ModelConfig.model_args.
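A minimal sketch of how such kwargs might reach the constructor, assuming model_args is a plain dict of keyword arguments. TinyModel and build_model are hypothetical names used only for illustration, and a real model would subclass torch.nn.Module.

```python
class TinyModel:
    """Stand-in for a model class; a real one would subclass torch.nn.Module."""

    def __init__(self, num_classes, pretrained=True, dropout=0.1):
        self.num_classes = num_classes
        self.pretrained = pretrained
        self.dropout = dropout


def build_model(model_cls, num_classes, model_args):
    """Combine the separately supplied num_classes with the extra kwargs."""
    # Assumption: passing num_classes twice is treated as a user error here.
    if "num_classes" in model_args:
        raise ValueError("num_classes is supplied separately; omit it from model_args")
    return model_cls(num_classes=num_classes, **model_args)


model = build_model(
    TinyModel,
    num_classes=10,
    model_args={"pretrained": False, "dropout": 0.2},
)
```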
num_classes is injected separately from TrainingConfig.num_classes.
Custom Executor (optional)
By default the SDK ships a generic executor that reads TrainingConfig args and calls rt_train_model from the stitched adapter. Supply your own only if you need behaviour the default can't express (custom data loaders, non-standard metrics, multi-task heads, …).
The SDK:
- Extracts MyExecutor's source.
- Uses its class name in config_fed_client.json → executor.path.
- Writes it to scripts/custom_client_executor.py in your bucket.
You get every TrainingConfig field (fixed + extra) as kwargs.
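A minimal sketch of an executor constructor that receives those kwargs. The base class and the training hook are omitted because this guide does not name them, and epochs and warmup_steps are hypothetical field names; only num_classes comes from the text above.

```python
class MyExecutor:
    """Skeleton only: shows how TrainingConfig fields could arrive as kwargs."""

    def __init__(self, num_classes, **training_args):
        # Fields the executor needs by name are declared explicitly ...
        self.num_classes = num_classes
        # ... and everything else (fixed or extra) is kept for later use.
        self.training_args = training_args


# Hypothetical TrainingConfig contents, flattened to a dict of kwargs.
training_config = {"num_classes": 10, "epochs": 5, "warmup_steps": 100}
executor = MyExecutor(**training_config)
```

Accepting **training_args keeps the constructor from breaking when extra fields are added to the config later.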
Custom Persistor (optional)
The default custom_persistor.py wraps your model class with the standard ModelPersistor interface: loads initial weights, saves aggregated ones per round, emits the final checkpoint to model_out/. Override only if you need bespoke serialisation (quantised weights, partial-model updates, custom filename schemes).
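To make the three responsibilities concrete, here is a toy persistor with a custom filename scheme. It is a sketch under loud assumptions: the method names (load, save_round, save_final) are illustrative and may not match the real ModelPersistor interface, and weights are plain dicts stored as JSON instead of tensors.

```python
import json
from pathlib import Path


class JsonPersistor:
    """Toy persistor; method names are hypothetical, not the SDK's."""

    def __init__(self, initial_weights, workdir):
        self.initial_weights = initial_weights
        self.workdir = Path(workdir)

    def load(self):
        """Return a copy of the initial weights."""
        return dict(self.initial_weights)

    def save_round(self, round_num, weights):
        """Save the aggregated weights for one round as round_NNN.json."""
        self.workdir.mkdir(parents=True, exist_ok=True)
        path = self.workdir / f"round_{round_num:03d}.json"
        path.write_text(json.dumps(weights))
        return path

    def save_final(self, weights):
        """Write the final checkpoint under model_out/."""
        out_dir = self.workdir / "model_out"
        out_dir.mkdir(parents=True, exist_ok=True)
        path = out_dir / "final.json"
        path.write_text(json.dumps(weights))
        return path
```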
Preview Everything Before Submitting
Troubleshooting Source Extraction
"Could not extract source of 'X'"
You're in an environment where neither inspect.getsource nor IPython history gives usable source. Workaround: save the class to a .py file, import it normally, and pass its full dotted path via ModelConfig(model_class="my_module.MyResNet"). The SDK will skip source extraction and trust the import path.
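For context, a dotted path such as "my_module.MyResNet" is conventionally resolved with importlib, as sketched below. Whether the SDK resolves it exactly this way is an assumption; resolve_dotted_path is a hypothetical helper, and the example uses a stdlib class so it runs anywhere.

```python
import importlib


def resolve_dotted_path(path):
    """Import 'package.module.ClassName' and return the named attribute."""
    module_name, _, attr_name = path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.ClassName', got {path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


# Stdlib example; on a worker this would be e.g. "my_module.MyResNet".
cls = resolve_dotted_path("collections.OrderedDict")
```

The module must be importable wherever the path is resolved, so my_module.py has to be available on the worker as well as locally.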
"NameError: name 'nn' is not defined" on the worker
Imports live in a different cell from the class. Consolidate them into the same cell and resubmit. Applies to model=, executor=, persistor=.
Custom executor is silently ignored
Make sure you pass the class itself (executor=MyExecutor), not an instance (executor=MyExecutor()).
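A quick local check can catch this before submission. The helper below is a hypothetical convenience, not part of the SDK; it relies only on inspect.isclass from the standard library.

```python
import inspect


def require_class(obj, role):
    """Return obj if it is a class; raise a clear error if it is an instance."""
    if not inspect.isclass(obj):
        raise TypeError(
            f"{role}= expects a class, got an instance of "
            f"{type(obj).__name__}; drop the parentheses"
        )
    return obj


class MyExecutor:
    pass


require_class(MyExecutor, "executor")  # passes: this is the class itself

try:
    require_class(MyExecutor(), "executor")  # an instance: rejected
except TypeError as err:
    print(err)
```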
More failure modes in the troubleshooting reference.