Getting Started
Before ModsysML, testing model quality and automating workloads was time-consuming. With ModsysML, you can simplify, accelerate, and backtest the entire process. This makes it easier to train classifiers, handle real-time changes, and make data-driven decisions.
Let's install the SDK first:
pip install modsys
Evaluating prompt quality
`modsys` produces table views that allow you to quickly review prompt outputs across many inputs. The goal: tune prompts systematically across all relevant test cases, instead of testing prompts by trial and error.
Usage (command line)
Support for a user interface is coming soon.
It works on the command line, and you can output to JSON, CSV, or YAML:
To get started, run the following command:
modsys init
This will create some templates in your current directory: `prompts.txt`, `vars.csv`, and `config.json`.
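For illustration, `prompts.txt` holds one or more prompt templates with `{{variable}}` placeholders, and `vars.csv` supplies one row of values per test case. The contents below are hypothetical and simply mirror the `{{body}}` example used later in this guide; the generated templates may look different:

`prompts.txt`:

```
Rephrase this in French: {{body}}
```

`vars.csv`:

```
body
Hello world
```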
After editing the prompts and variables to your desired state, run the `modsys` command to kick off a prompt evaluation test:
modsys -p ./prompts.txt -v ./vars.csv -r openai:completion
If you're looking to customize your usage, you have a wide set of parameters at your disposal. See the Configuration docs for more detail:
Option | Description |
---|---|
`-p, --prompts <paths...>` | Paths to prompt files, directory, or glob |
`-r, --providers <name or path...>` | One of: `openai:chat`, `openai:completion`, `openai:model-name`, `hive:hate`, `google:safety`, etc. See AI Providers |
`-o, --output <path>` | Path to output file (csv, json, yaml, html) |
`-v, --vars <path>` | Path to file with prompt variables (csv, json, yaml) |
`-c, --config <path>` | Path to configuration file. `config.json` is automatically loaded if present |
`-j, --max-concurrency <number>` (coming soon) | Maximum number of concurrent API calls |
`--table-cell-max-length <number>` (coming soon) | Truncate console table cells to this length |
`--grader` (coming soon) | Provider that will grade outputs |
Examples
Prompt quality
In this example, we evaluate whether adding adjectives to the personality of a chat bot affects the responses:
modsys -p prompts.txt -v vars.csv -r openai:completion
This command will evaluate the prompts in `prompts.txt`, substituting the variable values from `vars.csv`, and output results in your terminal.
Have a look at the setup and full output in another format:
modsys -p prompts.txt -v vars.csv -r openai:completion -o ./output.json
You can also output a nice spreadsheet, JSON, or YAML file:
{
"results": [
{
"prompt": {
"raw": "Rephrase this in French: Hello world",
"display": "Rephrase this in French: {{body}}"
},
"vars": {
"body": "Hello world"
},
"response": {
"output": "Bonjour le monde",
"tokenUsage": {
"total": 19,
"prompt": 16,
"completion": 3
}
}
}
// ...
],
"stats": {
"successes": 4,
"failures": 0,
"tokenUsage": {
"total": 120,
"prompt": 72,
"completion": 48
}
}
}
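Because the JSON output follows the structure above, it is easy to post-process. Here's a minimal sketch, assuming a results file was written with `-o ./output.json` as in the earlier command (the field names come straight from the sample output):

```python
import json

# Load the evaluation results written by `modsys ... -o ./output.json`
with open("output.json") as f:
    results = json.load(f)

# Overall stats, as reported by the tool
stats = results["stats"]
print(f"successes: {stats['successes']}, failures: {stats['failures']}")
print(f"total tokens used: {stats['tokenUsage']['total']}")

# Inspect each prompt / variables / response triple
for row in results["results"]:
    print(row["prompt"]["display"], "->", row["response"]["output"])
```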
Here's an example of a side-by-side comparison of multiple prompts and inputs:
Model quality
You can evaluate the difference between safety outputs for a specific context:
Model quality tests and the Python package for model testing are a beta feature at the moment; open an issue and tag us to get set up.
modsys -p prompts.txt -r hiveai:hate google:safety -o output.json
Configuration
- Setting up a model test: Learn more about how to set up prompt files, vars files, output, etc.
Building Automated Pipelines
View full documentation »
Let's set up your first integration!
It will pull from your local database (and keep it in sync).
# import the package
from modsys.client import Modsys
# sync data from your database instance
# (we currently support Supabase, or PostgreSQL via a URI in this format)
Modsys.connect("postgres://username:password@hostname:port/database_name")
# If you want to test out operations on your external connection
Modsys.fetch_tables()
Modsys.query("desc", "table", "column")
...and create a workflow with a simple command:
Note: you can use our sandbox API and skip providing a token, or obtain an auth token here. Sign up today on our Site!
# import the package
from modsys.client import Modsys
# Use any provider
Modsys.use("google_perspective:<model name>", secret="YOUR_API_TOKEN_HERE")
# Let's check to see if a phrase contains threats
Modsys.detectText(prompt="Phrase1", content_id="content-id", community_id="user-id")
Example response:
{
"attributeScores": {
"THREAT": {
"spanScores": [
{
"begin": 0,
"end": 12,
"score": { "value": 0.008090926, "type": "PROBABILITY" }
}
],
"summaryScore": { "value": 0.008090926, "type": "PROBABILITY" }
},
"INSULT": {
"spanScores": [
{
"begin": 0,
"end": 12,
"score": { "value": 0.008804884, "type": "PROBABILITY" }
}
],
"summaryScore": { "value": 0.008804884, "type": "PROBABILITY" }
},
"SPAM" // ...
},
"languages": ["en"],
"clientToken": "content_123",
"detectedLanguages": ["en", "fil"]
}
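The response mirrors the attribute scores shown above. As a minimal sketch, assuming `detectText` returns this structure as a Python dict (the 0.8 threshold below is purely illustrative and not part of the API):

```python
result = Modsys.detectText(
    prompt="Phrase1", content_id="content-id", community_id="user-id"
)

# Pull the summary probability for the THREAT attribute
threat = result["attributeScores"]["THREAT"]["summaryScore"]["value"]

# Illustrative threshold; tune it to your own moderation policy
if threat >= 0.8:
    print("Flag for human review, threat score:", threat)
else:
    print("No threat detected, score:", threat)
```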
Experimental inputs:
# Create custom rules, which create tasks!
Modsys.rule('Phrase1', '>=', '0.8')
Modsys.detectImage('Image1', 'contains', 'VERY_LIKELY') # Image Analysis/OCR
Modsys.detectSpeech('Audio1', 'contains', 'UNLIKELY') # Audio Processing
Modsys.detectVideo('Video1', 'contains', 'POSSIBLE') # Video Analysis
Modsys.detectText('Phrase1', 'contains', 'UNKNOWN') # Text Analysis
Modsys.test('prompt', 'expected_output') # ML Validation
That's all it takes!
In practice, you probably want to use one of our native SDKs to interact with Modsys's API, or use the Apollo ModsysML Console so you don't have to write code. If so, sign up at Apollo API!