Help
Use this page to decide whether you should call a ready-made endpoint, deploy a curated catalog model into your tenant, or build a model deployment from scratch.
Whichever path you choose:
you still get your own dedicated serverless instance, and InferX provides an OpenAI-compatible interface for easy integration with applications such as OpenCode, KiloCode, Dify, OpenWebUI, and more.
InferX is built for agent-native workloads, serverless GPU inference, and fast production integration with OpenAI-compatible APIs.
Start Here
I want the fastest path to a working API.
Use Endpoints. You do not need to deploy models yourself, no snapshot retention fee is charged, and your traffic still runs on your own dedicated instance.
I want a known model, but I don't find it's available in Endpoints or I need some customization.
Use Catalog. Start from a curated template, then deploy it into your own tenant.
I need a model or runtime setup that is not already offered from Catalog
Use Deploy From Scratch. Create a model directly and define the spec, runtime, and configuration yourself.
Which Path Should I Use?
Endpoints
Best when you want the fastest path to a working API without deploying models yourself.
- Use when an endpoint already matches your use case.
- Use when you do not want to deploy and manage the model yourself.
- Use when you want to avoid snapshot retention fees.
- Use when you still want your traffic to run on your own dedicated instance.
- Use when you just want to grab the endpoint and API key and integrate with your application.
Catalog
Best when you do not find what you need in Endpoints or when you need some customization but still want a curated starting point.
- Use when the model is available in Catalog but not in Endpoints.
- Use when you want to deploy the model into your own tenant.
- Use when you need some customization after starting from a curated template.
- Use when you want a safer starting point than building the full deployment from scratch.
Deploy From Scratch
Best when the model or runtime setup you need is not already offered in Catalog.
- Use when the model is not available in Endpoints or Catalog.
- Use when you need full control over model choice, runtime, commands, and deployment shape.
- Use when you need custom environment variables or runtime behavior.
- Use when you are comfortable owning the full deployment configuration yourself.
Quick Comparison
| Need | Best Fit | Why |
|---|---|---|
| Call an API right now | Endpoints | You do not need to deploy the model yourself, there is no snapshot retention fee, and your traffic still runs on your own dedicated instance. |
| I do not find it in Endpoints or I need some customization | Catalog | Catalog lets you start from a curated template and deploy it into your own tenant. |
| I need a model or runtime setup that Catalog does not already offer | Deploy From Scratch | Direct model creation gives you full control when neither Endpoints nor Catalog fits. |
Rule Of Thumb
- Start with Endpoints when you want the fastest integration path and do not want to deploy models yourself.
- Move to Catalog when Endpoints does not have what you need or when you need some customization in your own tenant.
- Use Deploy From Scratch only when Catalog still does not cover the model or runtime setup you need.